The standard library functions declared in stdio.h provide useful facilities for handling files. There are, inevitably, some host system dependencies in file handling; these are mentioned where relevant.
All file handling is done via objects of type pointer to FILE. This compound data type is defined in stdio.h. The following file handling functions are provided in the standard library.
| Function | Action |
|---|---|
| fopen() | Open a file |
| fclose() | Close a file |
| fprintf() | Formatted write to a file |
| fscanf() | Formatted read from a file |
| fputc() | Write a character to a file |
| fgetc() | Read a character from a file |
| fputs() | Write a string to a file |
| fgets() | Read a string from a file |
| putc() | Write a character to a file (macro) |
| getc() | Read a character from a file (macro) |
| ungetc() | Un-read a character from a file |
| fread() | Unformatted read from a file |
| fwrite() | Unformatted write to a file |
| fgetpos() | Save current position in a file (as an fpos_t) |
| fseek() | Adjust current position in a file |
| fsetpos() | Restore a position saved by fgetpos() |
| ftell() | Determine current position in a file (as a long) |
| rewind() | Set current position to start of file |
| feof() | Tests whether end-of-file has been seen |
| ferror() | Tests whether a file error has occurred |
| clearerr() | Clears file error indicator |
| remove() | Delete a file |
| rename() | Rename a file |
| tmpfile() | Create a temporary file |
| tmpnam() | Create a unique file name |
| fflush() | Force writing of data from buffer to file |
| freopen() | Opens a file using a specific FILE object |
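As a rough illustration of how a few of these functions fit together, the sketch below writes some integers to a temporary file with fprintf(), goes back to the start with rewind() and reads them back with fscanf(). The function name sum_via_tmpfile() and its interface are invented for this example and are not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: writes count integers to a temporary file,
   reads them back and stores their total in *sum.
   Returns 0 on success, -1 on any file error. */
int sum_via_tmpfile(const int *values, int count, long *sum)
{
    FILE *fp = tmpfile();    /* opened for update, removed when closed */
    int i, v;

    if (fp == NULL)
        return -1;
    for (i = 0; i < count; i++)
    {
        if (fprintf(fp, "%d\n", values[i]) < 0)
        {
            fclose(fp);
            return -1;
        }
    }
    rewind(fp);              /* back to the start before switching to reading */
    *sum = 0;
    while (fscanf(fp, "%d", &v) == 1)
        *sum += v;
    if (ferror(fp))          /* distinguish a read error from end of file */
    {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

A call such as sum_via_tmpfile(vals, 4, &total) leaves the total of the four values in total. The call to rewind() matters: on a file opened for update, a write may not be followed directly by a read without an intervening positioning call or fflush().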
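There are also a number of predefined macros in stdio.h. The following are most likely to be useful.
| Name | Meaning |
|---|---|
| FOPEN_MAX | Minimum number of files guaranteed to be open simultaneously |
| FILENAME_MAX | Size of array needed to hold the longest supported file name |
| SEEK_CUR | Used with fseek(): offset is relative to the current position |
| SEEK_END | Used with fseek(): offset is relative to the end of the file |
| SEEK_SET | Used with fseek(): offset is relative to the start of the file |
| stderr | Pre-opened file: standard error |
| stdout | Pre-opened file: standard output |
| stdin | Pre-opened file: standard input |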
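One common use of the SEEK_ constants is finding the size of a file by seeking to its end and asking for the current position. The following is a minimal sketch; the name file_size() is invented for this example, and it assumes a host where seeking to SEEK_END on a binary stream is meaningful (the standard does not strictly require this, although common systems support it).

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of the named file in bytes,
   or -1L if the file cannot be opened or positioned. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");    /* binary mode: no newline translation */
    long size = -1L;

    if (fp == NULL)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) == 0)    /* move to the end of the file */
        size = ftell(fp);                /* offset from the start, -1L on error */
    fclose(fp);
    return size;
}
```

Because ftell() returns a long, this approach only works for files whose size fits in a long on the host system.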
The standard file handling functions view a file as a sequence of bytes. Some of the functions are line oriented, regarding a line as a sequence of bytes terminated by a new-line character (in the Unix tradition). If the host operating system is not Unix, the file handling functions attempt to make text files look like Unix files, which can cause some portability problems.
A simple example program designed to copy a file called data1 to a file called data2 is listed below.
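#include <stdio.h>
#include <stdlib.h>    /* exit() */

int main(void)
{
    FILE *ifp, *ofp;
    int c;    /* int, not char, so that EOF can be told apart from data */

    if ((ifp = fopen("data1", "r")) == NULL)
    {
        printf("Couldn't open \"data1\" for input\n");
        exit(1);
    }
    if ((ofp = fopen("data2", "w")) == NULL)
    {
        printf("Couldn't open \"data2\" for output\n");
        exit(1);
    }
    /* copy one character at a time until end of file */
    while ((c = getc(ifp)) != EOF) putc(c, ofp);
    return 0;
}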
There are several points of interest in this program. The objects ifp and ofp hold pointers to objects of type FILE (note the case); they are no different from any other pointer data type. The library function fopen() actually opens the file. It takes two parameters, both of which are pointers to characters. The first parameter is the name of the file to open. In this simple example the file names are "hard-wired" into the program, but they could have been obtained interactively or from the command line. The second parameter is the file opening mode. This is also a string of characters even though, in this case, it holds only a single character. ANSI C recognises the following opening modes.
| mode | meaning |
|---|---|
| r | Open text file for reading |
| w | Truncate to zero size or create text file for writing |
| a | Append - open or create text file for writing at end of file |
These basic modes may be modified by two extra characters. A "+" means that the file is to be opened for both reading and writing (updating). A "b" means that the file is opened in binary mode: the library functions present the underlying data to the program without any attempt to make the file look Unix-like. Under Unix it has no effect.
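To make the append mode concrete, here is a small sketch of a function that adds one line to the end of a text file, creating the file if it does not yet exist. The name append_line() and its return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: appends text followed by a newline to the named
   file. Returns 0 on success, -1 on failure. */
int append_line(const char *path, const char *text)
{
    FILE *fp = fopen(path, "a");    /* "a": create if missing, write at end */
    int ok;

    if (fp == NULL)
        return -1;
    ok = (fputs(text, fp) != EOF) && (fputc('\n', fp) != EOF);
    if (fclose(fp) != 0)            /* buffered data is written out here */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling append_line("log.txt", "first") and then append_line("log.txt", "second") should leave a file containing the two lines in that order, since each open in "a" mode positions writes at the current end of the file. The file name log.txt is only a placeholder.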
The return value from fopen() is either a pointer to an object of type FILE or the value NULL if it was not possible to open the named file. The macros getc() and putc() should be noted: the entire action of copying the file is handled by the single while loop. The variable c is declared as an int rather than a char so that the value EOF can be distinguished from every valid character. There is no need to close the files explicitly; when the program returns to the host environment, all open files are automatically closed as part of the return mechanism. Calling fclose() explicitly is still worthwhile in larger programs, since its return value reports whether the buffered output was written successfully.
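The copy program above can also be written with explicit error checking and block transfers. The following sketch uses binary mode with fread() and fwrite() so that the bytes are copied unchanged on any host; the function name copy_file() and the 4096-byte buffer size are arbitrary choices for this example.

```c
#include <stdio.h>

/* Hypothetical helper: copies src to dst byte for byte.
   Returns 0 on success, -1 on failure. */
int copy_file(const char *src, const char *dst)
{
    FILE *in, *out;
    char buf[4096];
    size_t n;
    int ok = 1;

    if ((in = fopen(src, "rb")) == NULL)
        return -1;
    if ((out = fopen(dst, "wb")) == NULL)
    {
        fclose(in);
        return -1;
    }
    /* fread() returns the number of bytes actually read, 0 at end of file */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
    {
        if (fwrite(buf, 1, n, out) != n)
        {
            ok = 0;
            break;
        }
    }
    if (ferror(in))             /* the loop may have ended on a read error */
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)       /* a failed flush shows up here */
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the return value of fclose() on the output file is the design point here: data may still be sitting in the buffer when the loop finishes, so a full disk might only be reported when the file is closed.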
The library function fgets() reads a line at a time. It is the file handling equivalent of gets() and may be thought of as a function that reads the next record from the file, equating a record with a sequence of characters terminated by a newline, which is the normal Unix convention. Unlike gets(), which has no way of knowing the input buffer size (and was removed from the language in C11 for that reason), fgets() has a parameter that specifies the maximum number of characters to transfer to the input buffer. The prototype of fgets() is
char *fgets(char *, int, FILE *);
The first parameter is the start address of the input buffer. The second parameter is the buffer size; the actual number of characters read is, at most, one less than the buffer size, allowing a string terminating NUL to be placed at the end of the buffer. The final parameter identifies the input file. fgets() returns its first parameter on success and NULL at end of file or on error.
If the input record is not too big for the input buffer then it is copied to the input buffer complete with the input line terminating newline character. If the record is too big then there will be no newline character in the input buffer and the next call to fgets() will carry on where the last one left off, getting the next portion of the record.
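The presence or absence of the newline can be used to tell a complete line from a partial one. The sketch below wraps fgets() in a function that strips the newline and reports which case occurred; the name read_line() and its return values are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper built on fgets().
   Returns 1 if a complete line was read (its newline is removed),
           0 if only part of a line fitted in the buffer,
           EOF at end of file or on error.
   A final line with no newline at the very end of the file also
   gives 0, since it cannot be told apart from a partial line. */
int read_line(FILE *f, char *buf, int size)
{
    size_t len;

    if (fgets(buf, size, f) == NULL)
        return EOF;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
    {
        buf[len - 1] = '\0';    /* overwrite the newline */
        return 1;
    }
    return 0;
}
```

With a 5-character buffer and an input line of "abcdefghij", successive calls would give "abcd" (partial), "efgh" (partial) and then "ij" (complete), because each call transfers at most four characters.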
The use of fgets() is illustrated by the following example, which counts the lines in a file and reports their average, minimum and maximum lengths.
#include <stdio.h>
#include <stdlib.h>    /* exit() */
#include <string.h>    /* strlen() */

/* Program to report the number of records, their
   average size & the smallest and largest record sizes.
*/
int getrec(FILE *);

int main(int argc, char *argv[])
{
    int n;
    int recno = 0;      /* number of records */
    int minrec = 0;     /* smallest */
    int maxrec = 0;     /* largest */
    long cumrec = 0;    /* cumulative size */
    FILE *dfp;          /* file to analyse */
    char started = 0;   /* set once the first record has been seen */

    if (argc != 2)
    {
        printf("Usage : fil2 f\n");
        exit(1);
    }
    if ((dfp = fopen(argv[1], "r")) == NULL)
    {
        printf("error opening %s\n",
               argv[1]);
        exit(1);
    }
    while (1)
    {
        if ((n = getrec(dfp)) == EOF) break;
        recno++;
        cumrec += n;
        if (!started)
        {
            minrec = maxrec = n;
            started = 1;
        }
        if (n < minrec) minrec = n;
        if (n > maxrec) maxrec = n;
    }
    printf("%4d records\n", recno);
    if (recno == 0) return 0;    /* empty file: no average to report */
    printf("average size %5.1f\n",
           (double)cumrec / recno);
    printf("smallest %2d\nlargest %4d\n",
           minrec, maxrec);
    return 0;
}

int getrec(FILE *f)
/* function to read a record and return (as
   functional value) the record size. If the
   end of file is encountered before any
   characters are read then EOF is returned.
*/
{
    int rs = 0;
    int length;
    char buff[25];

    while (1)
    {
        if (fgets(buff, 25, f) == NULL)
            return rs > 0 ? rs : EOF;    /* last record may lack a newline */
        length = (int)strlen(buff);
        rs += length;
        if (buff[length - 1] == '\n')
            return rs;
    }
}
With a suitably large file named on the command line, the program produced the following output.
8788 records
average size  99.2
smallest 25
largest  282
There are several interesting points about this program. The function getrec() reads the file in chunks of at most 24 bytes, which is ridiculously small for real use. To determine how much data has actually been read, the function strlen() is applied to the string in the input buffer, remembering that the count returned by strlen() excludes the string terminating NUL. It is particularly important to look at the character before the string terminating NUL in the input buffer. If this is a newline, the repeated calls to fgets() have reached the end of the current record and the accumulated record size can be returned.
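If only the record length is wanted, the buffer can be avoided altogether by counting characters with getc(). The following is an alternative sketch, not the method used in the program above; the name record_length() is invented for this example.

```c
#include <stdio.h>

/* Hypothetical alternative to getrec(): counts the characters in the
   next record, including its newline. A final record with no newline
   is still counted. Returns EOF when no characters remain. */
long record_length(FILE *f)
{
    long n = 0;
    int c;

    while ((c = getc(f)) != EOF)
    {
        n++;
        if (c == '\n')
            return n;
    }
    return n > 0 ? n : EOF;
}
```

Because the standard library already buffers the file internally, reading a character at a time in this way is usually not as slow as it might appear.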