Simple File Input
C++ programs can read and write files in many ways. For the sake of simplicity
and uniformity, this page reads and writes files using the same stream methods
already introduced for keyboard input.
For this page, we first need a test data file with known contents. Please
compile and run the following program:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
    ofstream ofs("data.txt");
    for(int i = 1; i <= 10; i++) {
        ofs << "This is line " << i << endl;
    }
    return 0;
}
If you want to see what this program's output is, simply examine the contents
of the file "data.txt" that this program creates when run.
Not visible in the output file are some "control characters." A "control
character" is a character that, instead of printing a symbol, causes an action
to take place, like moving down to the next line on the display.
A regular character is simply printed. A control character causes an action.
Here are some common control characters, their symbols, and what they do:
Name     | Symbol | Effect
Linefeed | '\n'   | Causes the printing position to move to a new line
Tab      | '\t'   | Causes the printing position to advance to a fixed column
Bell     | '\a'   | Causes a bell to ring (most platforms)
These special symbols can be used alone or in quoted strings to format the
display:
cout << "This is a test line\n\n";
This example prints a line followed by two linefeeds, which ensures that one
blank line appears before the next line is printed.
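As a small illustration of the control characters listed above, the sketch below builds one formatted row in a string. The function name formatRow is made up for this example. Note that each escape sequence, such as '\t', is stored as a single character.

```cpp
#include <string>

// Builds one tab-separated row that ends with a linefeed.
// formatRow is a hypothetical helper name used only in this sketch.
std::string formatRow(const std::string& label, int value)
{
    // '\t' and '\n' are each a single character in the result
    return label + '\t' + std::to_string(value) + '\n';
}
```

Writing the result of formatRow("apples", 3) to cout prints the label, advances to the next tab position, prints the number, and moves to a new line.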
Q: If I can add "\n" to the text of my printed lines, why use the special
manipulator "endl" as in the example above?
The manipulator "endl" does two things. It (1) writes a newline, and it (2)
flushes the stream's buffer, which causes the output to appear immediately.
In C++, input and output streams are "buffered." This means characters are read
and written in groups, for the sake of efficiency. When keyboard input is being
accepted, an entire line is read at once, which is why the user must press
"Enter" to move along. When program data is being written, it is normally
emitted in chunks.
To force output to be emitted at a particular time, either use "endl" as in the
above example, or do this:
cout << "This will appear right away." << flush;
The manipulator "flush" causes immediate output, which means these two lines are
equivalent:
cout << endl;
cout << "\n" << flush;
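Because every "endl" forces a flush, one possible variation is to end each line with '\n' and flush once after the whole batch. The sketch below assumes that pattern. The names writeLines and linesAsText are invented for this example, and the second function exists only to capture the output in memory so it can be inspected.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Writes `count` numbered lines to any output stream.
// Each line ends with '\n' (no flush); one flush follows the whole batch.
void writeLines(std::ostream& out, int count)
{
    for (int i = 1; i <= count; i++) {
        out << "This is line " << i << '\n';
    }
    out << std::flush;
}

// Captures the same output in a string instead of a file or the display.
std::string linesAsText(int count)
{
    std::ostringstream buffer;
    writeLines(buffer, count);
    return buffer.str();
}
```

The same writeLines function accepts cout or an ofstream, because both are output streams. The text produced is identical to what "endl" would produce; only the number of flushes differs.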
Now let's read our data file in an obvious way — line by line.
The following program contains the single most common student error in file
reading.
See if you can spot what the error is, what mistake it creates, and why. To
reduce any chance for confusion, the error is marked with a comment:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
// this program contains an error
int main()
{
    ifstream ifs("data.txt");
    string line;
    // error in stream test
    while(!ifs.eof()) {
        getline(ifs,line);
        cout << "[ " << line << " ]" << endl;
    }
    return 0;
}
When you run this program, you will see an extra, empty "[  ]" line printed
after the last valid data line. This is caused by the program error.
The error is attempting to test the stream for "end-of-file" without also
trying to read it:
while(!ifs.eof()) {
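Remember: a C++ stream can have any origin, such as a file, a network
connection, a keyboard, or any other source.
Therefore the stream cannot detect that the data has ended until a read attempt
fails. In our test file, after the tenth line has been read, eof() still returns
false because no read has yet run past the end of the data. The loop runs once
more, getline() fails, and the program prints an empty string as though it were
data.
Because of this, a program that tests for end-of-file without reading, then
reads without testing, cannot be relied on.
A successful program always tests and reads at once.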
Here is a corrected version of the program:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
    ifstream ifs("data.txt");
    string line;
    while(getline(ifs,line)) {
        cout << "[ " << line << " ]" << endl;
    }
    return 0;
}
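To see the difference between the two loops without a data file, the following sketch runs both patterns over text held in memory and counts the loop passes. The function names are invented for this example, and the in-memory stream is an assumption made to keep the sketch self-contained.

```cpp
#include <sstream>
#include <string>

// Counts loop passes using the faulty "test, then read" pattern.
int countWithEofTest(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    int passes = 0;
    while (!in.eof()) {          // tests without reading
        std::getline(in, line);  // reads without testing
        passes++;
    }
    return passes;
}

// Counts loop passes using the "test and read at once" pattern.
int countWithGetlineTest(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    int passes = 0;
    while (std::getline(in, line)) {  // the read itself is the test
        passes++;
    }
    return passes;
}
```

For the two-line text "a\nb\n", the first function makes three passes and the second makes two. The extra pass is the failed read that the eof() test could not anticipate.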
In the next example, we will use the stream extraction operator ">>" to
read our file. Compile and run this program (it also has an error):
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
    ifstream ifs("data.txt");
    string word1, word2, word3;
    int num;
    while(ifs >> word1 >> word2 >> word3 >> num) {
        cout << "[ "
             << word1
             << word2
             << word3
             << num
             << " ]"
             << endl;
    }
    return 0;
}
Why does the error cause the output to look all squeezed together? To answer
this question, we need to look at how the stream extraction operator ">>" works.
The extraction operator is actually a very sophisticated tool. It knows what
kind of variable is receiving the data, and it conducts itself accordingly.
It works like this:
- Phase 1 (search):
  - Read characters.
  - If a character is "whitespace" (a space, tab, linefeed, or similar), discard it.
  - If a non-whitespace character appears that is not appropriate to the target variable, stop, indicate an error, and "break" the stream (later reads fail until the error state is cleared).
  - If a character is appropriate to the target variable, begin phase 2. Appropriate characters are:
    - For integer variables, any of "+-0123456789".
    - For float/double variables, any of "+-.0123456789eE" in a prescribed order.
    - For string variables, any non-whitespace characters.
- Phase 2 (read):
  - Read and accept characters that are appropriate to the target variable.
  - If a character appears that is not appropriate to the target variable, either whitespace or some other character, don't discard it, stop reading, no error.
If you can commit this sequence of events to memory, it will greatly aid your
stream programming.
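The two phases can be observed with a short experiment on text held in memory. This is only a sketch: the function names restAfterInt and startsWithInt are invented for it, and an in-memory stream stands in for a file.

```cpp
#include <sstream>
#include <string>

// Extracts one int from `text`, then returns whatever the stream still holds
// on the current line.
std::string restAfterInt(const std::string& text, int& value)
{
    std::istringstream in(text);
    in >> value;              // phase 1 skips whitespace, phase 2 reads the digits
    std::string rest;
    std::getline(in, rest);   // collect what phase 2 left behind
    return rest;
}

// Reports whether an int can be extracted from the start of `text`.
bool startsWithInt(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    return static_cast<bool>(in >> value);  // false if phase 1 hit an error
}
```

With the text "  42 apples", the leading spaces are discarded, 42 is read, and " apples" remains in the stream with its leading space intact. With the text "line 15", phase 1 meets the letter "l", which is not appropriate to an integer, so the extraction fails.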
Using the extraction operator is called "formatted reading." It is called
"formatted" because the input is expected to have a particular format — groups
of characters meant to be received by particular variable types, separated by
whitespace. This kind of reading is ideal for text files that contain different
kinds of data, data that is separated by whitespace.
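As an illustration of formatted reading with mixed data types, the sketch below reads records made of a word, an integer, and a floating-point number. The record layout and the names Item, readItems, and readItemsFromText are hypothetical, chosen only for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One record of a hypothetical inventory format: "<name> <count> <price>".
struct Item {
    std::string name;
    int count;
    double price;
};

// Reads whitespace-separated records until an extraction fails.
std::vector<Item> readItems(std::istream& in)
{
    std::vector<Item> items;
    Item item;
    while (in >> item.name >> item.count >> item.price) {
        items.push_back(item);
    }
    return items;
}

// Convenience wrapper for text that is already in memory.
std::vector<Item> readItemsFromText(const std::string& text)
{
    std::istringstream in(text);
    return readItems(in);
}
```

Because the extraction operator discards any whitespace before each value, the fields may be separated by spaces, tabs, or linefeeds in any mixture. The loop ends at the first record that cannot be read completely.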
Here is another common student error — mixing the extraction operator and
getline() in the same program, without appropriate safeguards. Compile and run
this program:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
    ifstream ifs("data.txt");
    string word1, word2, word3, line;
    int num;
    // read a line using the extraction operator
    if(ifs >> word1 >> word2 >> word3 >> num) {
        cout << "[ "
             << word1 << ' '
             << word2 << ' '
             << word3 << ' '
             << num << " ]"
             << endl;
    }
    // read a line using getline
    if(getline(ifs,line)) {
        cout << "[" << line << "]" << endl;
    }
    return 0;
}
Why does this program fail? Why can't it read the file's second line? Think:
- In Phase 2, the extraction operator reads characters until it encounters whitespace, then it stops without discarding any of the whitespace.
- getline() only reads until it encounters a linefeed.
Unfortunately for our test program, the whitespace character that is left
behind by the extraction operator is a linefeed. getline() reads this single
linefeed and stops without reading the line that follows it.
The solution is to remove the linefeed that was left behind by the extraction
operator. Here is the corrected program:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
    ifstream ifs("data.txt");
    string word1, word2, word3, line;
    int num;
    // read a line using the extraction operator
    if(ifs >> word1 >> word2 >> word3 >> num) {
        cout << "[ "
             << word1 << ' '
             << word2 << ' '
             << word3 << ' '
             << num << " ]"
             << endl;
    }
    // discard the rest of the current line, including its linefeed
    ifs.ignore(10000,'\n');
    // read a line using getline
    if(getline(ifs,line)) {
        cout << "[ " << line << " ]" << endl;
    }
    return 0;
}
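The limit of 10000 characters works for our test file, but a line could in principle be longer. One common variation, sketched here, passes the largest possible stream size to ignore() so that the rest of the line is discarded whatever its length. The function names are invented for this example, and an in-memory stream stands in for the file.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Reads an int with >>, discards the rest of that line, then returns the
// next complete line read with getline().
std::string lineAfterNumber(std::istream& in, int& number)
{
    in >> number;
    // Discard everything up to and including the next linefeed.
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::string line;
    std::getline(in, line);
    return line;
}

// Convenience wrapper for text that is already in memory.
std::string lineAfterNumberInText(const std::string& text, int& number)
{
    std::istringstream in(text);
    return lineAfterNumber(in, number);
}
```

Given the text "1\nThis is line 2\n", the function stores 1 in the number and returns "This is line 2". Without the call to ignore(), getline() would return an empty string, exactly as in the failing program.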
As is true throughout all of programming, if you know how the various parts of
your program work, you will also know why they don't.