Stream I/O

INTRODUCTION TO STREAM I/O PART 1 - OVERLOADING <<

In this issue we will begin discussing the stream I/O package that comes with C++. The first four sections of this issue are related and present several aspects of stream I/O along with some related topics.

If you've used C++ at all, you've probably seen a simple example of how to do output:

cout << "Hello, world" << "\n";

instead of:

printf("Hello, world\n");

cout is an output stream, kind of like stdout in C. The C example could be written as:

fprintf(stdout, "Hello, world\n");

which makes this correspondence a bit clearer.

Once you get beyond simple input/output usage, what is the stream I/O package good for? One quite useful thing it can do is to allow the programmer to take control of I/O for particular C++ types such as classes. This end is achieved by the use of operator overloading.

Suppose that we have a Date class:

class Date {
    int month;
    int day;
    int year;
public:
    Date(char*);
    Date(int, int, int);
};

with an internal representation of a Date using three integers for month, day, and year, and a couple of constructors to create a Date object. How would we output the value of a Date object?

One way would be to devise a member function:

void out();

implemented as:

void Date::out()
{
    printf("%d/%d/%d", month, day, year);
}

This function would operate on an object instance of a Date and would access the month/day/year members and display them. This approach will certainly work and may be suitable in some kinds of applications.

But this scheme doesn't integrate very well with stream I/O. For example, I cannot say:

Date d(9, 25, 1956);
cout << "Today's date is " << d;

but must say:

printf("Today's date is ");
d.out();

For this purpose it is necessary to overload the << operator. We can add a friend function to Date:

friend ostream& operator<<(ostream& os, const Date& d);

with definition:

ostream& operator<<(ostream& os, const Date& d)
{
    return os << d.month << "/" << d.day << "/" << d.year;
}

With this definition, it is possible to say:

Date d(9, 25, 1956);
cout << "Today's date is " << d << "\n";
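Putting the pieces together, a self-contained version might look like the sketch below. It assumes a standard-conforming compiler, so it uses the <iostream>-style headers and the std:: prefix rather than <iostream.h>. The char* constructor is left out, and the helper date_to_string() is hypothetical, added only so that the result is easy to inspect.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Date {
    int month;
    int day;
    int year;
public:
    Date(int m, int d, int y) : month(m), day(d), year(y) {}
    // Declared as a friend so that it can read the private members.
    friend std::ostream& operator<<(std::ostream& os, const Date& d);
};

std::ostream& operator<<(std::ostream& os, const Date& d)
{
    // Return the stream so that further << operations can be chained.
    return os << d.month << "/" << d.day << "/" << d.year;
}

// Hypothetical helper: format a Date into a string via the overloaded <<.
std::string date_to_string(const Date& d)
{
    std::ostringstream oss;
    oss << "Today's date is " << d;
    return oss.str();
}
```

Because the operator takes any ostream, the same definition serves std::cout, file streams, and string streams alike.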

Several aspects of this example need explanation. An overloaded operator in C++ is an operator like "+" or "<<" that is given a special meaning for certain kinds of arguments, and turns into a function. Wherever the operator is used with these arguments, a function is called. So, for example, an output statement:

cout << "xxx";

is actually:

cout.operator<<("xxx");

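which is a valid function call in C++ if you're using stream I/O, at least in the library used here, where this version of operator<< is a member of the stream class. In the standard C++ library the version for character strings is a non-member function instead, so the explicit call there is operator<<(cout, "xxx").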

cout is an instance of class ostream (at least conceptually; the actual hierarchy is a bit complicated). When we wrote the actual statement to output a formatted Date:

return os << d.month << "/" << d.day << "/" << d.year;

we returned the ostream reference so as to allow << usage to be chained. Because << operators group left to right, a sequence like:

cout << "x" << "y";

actually means:

cout.operator<<("x").operator<<("y");
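The equivalence between the operator form and the explicit call form can be checked with a small sketch. It assumes a standard-conforming library, where the inserter for int is a member of the stream class (the one for character strings is a non-member there), so integers are used instead of strings. Both function names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Writes two integers using ordinary chained operator syntax.
std::string chained_with_operators(int a, int b)
{
    std::ostringstream oss;
    oss << a << b;
    return oss.str();
}

// The same thing spelled out as explicit member function calls.
// Each call returns the stream, which the next call is applied to.
std::string chained_with_calls(int a, int b)
{
    std::ostringstream oss;
    oss.operator<<(a).operator<<(b);
    return oss.str();
}
```

Both functions produce the same text, which is the point: chaining works only because each call hands back a reference to the stream.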

Finally, the reason that we declared operator<<(ostream&, const Date&) as a friend and not a member is that a member function that is a binary operator has an implicit convention on argument usage, namely, that for some operator @:

x @ y

means:

x.operator@(y);

that is, the left operand of the operator must be an instance of the class of which the overloaded operator is a member.
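To see what would go wrong with a member function, consider the following sketch, which assumes a standard-conforming compiler. Point is a hypothetical class that defines << as a member, and the only way to use it is with the operands reversed.

```cpp
#include <sstream>
#include <string>

// Hypothetical class whose << is a MEMBER function.
class Point {
    int x;
    int y;
public:
    Point(int x_, int y_) : x(x_), y(y_) {}
    // As a member, the left operand is always the Point itself.
    std::ostream& operator<<(std::ostream& os) const
    {
        return os << x << "," << y;
    }
};

std::string show(const Point& p)
{
    std::ostringstream oss;
    p << oss;   // operands reversed: this means p.operator<<(oss)
    return oss.str();
}
```

Writing oss << p with this class would not compile, because the stream class has no member that accepts a Point. A friend (non-member) function avoids the problem by taking the stream as its first parameter.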

INTRODUCTION TO STREAM I/O PART 2 - FORMATTING AND MANIPULATORS

In this issue we will talk further about stream I/O. An excellent book on the subject is Steve Teale's "C++ IOStreams Handbook" (Addison-Wesley, $40). It has 350 pages with many examples and thoroughly covers I/O streams. It should be noted that I/O streams are undergoing revision as part of the ANSI/ISO standardization process. The examples we present are based on C++ headers and libraries in common use.

One obvious question about stream I/O is how to do formatting. A simple operation like:

int n = 37; cout << n;

is equivalent to:

printf("%d", n);

that is, no special formatting is done.

But what if you want to say:

printf("%08d", n);

displaying n in a field 8 wide with leading 0s? Such an operation would be performed by saying:

#include <iostream.h>
#include <iomanip.h>
/* stuff */
cout << setfill('0') << setw(8) << n;

setfill() and setw() are examples of I/O stream manipulators. A manipulator is a data object of a type known to I/O streams, that allows a user to change the state of a stream. We'll see how manipulators are implemented in a moment.

The operation illustrated here first sets the fill character to '0' and then the width of the field to 8 and then outputs the number. Some I/O stream settings like the fill character and left/right justification persist, but the width is reset after each output item.
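The difference between persistent and one-shot settings can be seen in a short sketch. It assumes a standard-conforming compiler (<iomanip> and the std:: prefix rather than <iomanip.h>), and width_demo() is a hypothetical name.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

std::string width_demo()
{
    std::ostringstream oss;
    // Fill character '0', width 8: the number is padded on the left.
    oss << std::setfill('0') << std::setw(8) << 37;
    // The width has been reset, so this 5 is not padded.
    oss << "|" << 5;
    // The fill character is still '0', so only the width must be set again.
    oss << "|" << std::setw(3) << 5;
    return oss.str();
}
```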

A similar way of changing stream state is to use regular member function calls. For example,

cout.setf(ios::left);
cout << setfill('0') << setw(8) << n;

produces output:

37000000

with left justification.
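A version of the same idea that can be checked without a terminal is sketched below, assuming a standard-conforming compiler. It uses the two-argument form of setf(), which clears the other adjustment flags before setting ios::left. left_demo() is a hypothetical name.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

std::string left_demo(int n)
{
    std::ostringstream oss;
    // Select left justification within the adjustment field group.
    oss.setf(std::ios::left, std::ios::adjustfield);
    // Padding now goes to the right of the number.
    oss << std::setfill('0') << std::setw(8) << n;
    return oss.str();
}
```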

In this example, you see "ios::left" mentioned. What is ios? ios is the base class of the streams class hierarchy. In the implementation used here, the hierarchy is:

class ios { /* stuff */ };
class ostream : public ios { /* stuff */ };
class ostream_withassign : public ostream { /* stuff */ };

and cout is an object instance of ostream_withassign. That is, there is a base class (ios), and the output streams class (ostream) derives or inherits from it, and ostream_withassign derives from ostream. Chapter #10 of Teale's book mentioned above discusses the rationale for the ostream_withassign class.

A statement like:

cout.setf(ios::left);

calls the member function setf() inherited from the ios class, to set flags for the stream. ios::left is an enumerator representing a particular flag value.

How can you design your own manipulators? A simple example is as follows:

#include <iostream.h>
#include <iomanip.h>
ostream& dash(ostream& os)
{
    return os << " -- ";
}
int main()
{
    cout << "xxx" << dash << "yyy" << endl;
    return 0;
}

We define a manipulator called "dash" that inserts a dash into an output stream. This is followed by the output of more text and then a builtin manipulator ("endl") is called. endl inserts a newline character and flushes the output buffer. We will say more about endl later in the newsletter.

Manipulators such as dash and endl, which take no arguments, are in fact pointers to functions, and they are implemented via a couple of hooks in iostream.h:

ostream& operator<<(ostream& (*)(ostream&));
ostream& operator<<(ios& (*)(ios&));

These operators are member functions of class ostream. They will accept either a pointer to function that takes an ostream& or a pointer to function that takes an ios&. The former would be used for actual output, the latter for setting ios flags as discussed above.
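A manipulator of the second kind, one that only sets flags, might look like the sketch below. It assumes a standard-conforming library, where the flag-setting hook accepts a function taking std::ios_base& (a hook for std::ios& exists as well). upper_hex and in_upper_hex() are hypothetical names.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical manipulator: changes formatting flags, writes nothing.
std::ios_base& upper_hex(std::ios_base& s)
{
    s.setf(std::ios_base::hex, std::ios_base::basefield);  // base 16
    s.setf(std::ios_base::uppercase);                      // A-F, not a-f
    return s;
}

std::string in_upper_hex(int n)
{
    std::ostringstream oss;
    // The function name converts to a pointer, which << then calls.
    oss << upper_hex << n;
    return oss.str();
}
```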

INTRODUCTION TO STREAM I/O PART 3 - COPYING FILES

Suppose that you're writing a program to copy one file to another, with the two file names given on the command line. A common way of doing this in C is to say:

#include <stdio.h>
#include <assert.h>
int main(int argc, char* argv[])
{
    FILE* fpin;
    FILE* fpout;
    int c;
    assert(argc == 3);
    fpin = fopen(argv[1], "r");
    fpout = fopen(argv[2], "w");
    assert(fpin && fpout);
    while ((c = getc(fpin)) != EOF)
        putc(c, fpout);
    fclose(fpin);
    fclose(fpout);
    return 0;
}

EOF is a marker used to signify the end of file; its value typically is -1. In most commonly-used operating systems there is no actual character in a file to signify end of file.

This approach works on text files. Unfortunately, however, for binary files, an attempt to copy a 10406-byte file resulted in output of only 383 bytes. Why? Because EOF is itself a valid character that can occur in a binary file. If set to -1, then this is equivalent to 255 or 0377 or 0xff, a perfectly legal byte in a file. (This explanation is not quite right; see the correction at the end of this part.) So we would need to say:

#include <stdio.h>
#include <assert.h>
int main(int argc, char* argv[])
{
    FILE* fpin;
    FILE* fpout;
    int c;
    assert(argc == 3);
    fpin = fopen(argv[1], "rb");
    fpout = fopen(argv[2], "wb");
    assert(fpin && fpout);
    for (;;) {
        c = getc(fpin);
        if (feof(fpin))
            break;
        fputc(c, fpout);
    }
    fclose(fpin);
    fclose(fpout);
    return 0;
}

feof() is a function (often also implemented as a macro) that tells whether the previous operation, in this case getc(), hit end of file. Note also that we open the files in binary mode.

How would we do the equivalent in C++? One way would be to say:

#include <fstream.h>
#include <assert.h>
int main(int argc, char* argv[])
{
    assert(argc == 3);
    ifstream ifs(argv[1], ios::in | ios::binary);
    ofstream ofs(argv[2], ios::out | ios::binary);
    assert(ifs && ofs);
    char c;
    while (ifs.get(c))
        ofs.put(c);
    return 0;
}

ifstream and ofstream are input and output file streams, taking a single char* argument and a set of flags.

These classes are derived from ios, which has an operator conversion function (from a stream object to void*). If a statement like:

assert(ifs && ofs);

is specified, then this conversion function is called. It returns 0 if there's something wrong with the stream. In other words, an object like "ifs" is converted to a void* automatically, and the value of the void* pointer tells the stream status (non-zero for a good state, zero for bad).
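The same kind of test works on any stream, not just file streams. The sketch below assumes a standard-conforming compiler (where, in later versions of the standard, the conversion is to bool rather than void*, with the same effect in a condition). can_read_int() is a hypothetical name.

```cpp
#include <sstream>
#include <string>

// Returns true if an int could be read from the start of the text.
bool can_read_int(const std::string& text)
{
    std::istringstream iss(text);
    int n;
    iss >> n;
    // The stream object itself is tested, via its conversion function.
    if (iss)
        return true;    // good state: the read succeeded
    return false;       // failed state: nothing usable was read
}
```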

The actual copying is straightforward, using the get() member function. It accepts a reference to a character, so there's no need to use the return value to pass back the character that was read.

A somewhat terser approach would be to say:

#include <fstream.h>
#include <assert.h>
int main(int argc, char* argv[])
{
    assert(argc == 3);
    ifstream ifs(argv[1], ios::in | ios::binary);
    ofstream ofs(argv[2], ios::out | ios::binary);
    assert(ifs && ofs);
    ofs << ifs.rdbuf();
    return 0;
}

with no loop involved. The expression:

ifs.rdbuf()

returns a filebuf*, a pointer to an object that actually represents the low-level buffering for the file. filebuf is derived from a class streambuf, and ofstream is derived from ostream, and ostream has an operator<< defined for streambufs. So the looping over the input file occurs within operator<<. We are "outputting" a filebuf/streambuf.
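Since the operator is defined for streambufs in general, the same technique works for streams that are not files. The following sketch assumes a standard-conforming compiler and uses string streams so that it can be run without any files. copy_all() and copy_text() are hypothetical names.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copy everything remaining in 'in' to 'out' with no explicit loop.
void copy_all(std::istream& in, std::ostream& out)
{
    out << in.rdbuf();
}

// Round-trip a string through copy_all() so the result can be inspected.
std::string copy_text(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    copy_all(in, out);
    return out.str();
}
```

Whitespace and arbitrary byte values pass through unchanged, because no formatting is applied along the way. One caveat in the standard library: if the input is empty, so that no characters are inserted, the output stream is put into a failed state.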

Finally, how about code for copying standard input to output:

#include <iostream.h>
int main()
{
    char c;
    while (cin >> c)
        cout << c;
    return 0;
}

If you run this program on text input, you will notice that the output's pretty jumbled. This is because by default whitespace is skipped on input. To fix this problem, you can say:

#include <iostream.h>
int main()
{
    char c;
    cin.unsetf(ios::skipws);
    while (cin >> c)
        cout << c;
    return 0;
}

to disable the skipws flag. This program does not, however, work with binary files. To make it work gets into a tricky issue; the binary mode is specified when opening a file, and in this example standard input and output are already open. This ties in with low-level buffering and reading the first chunk of a file when it's opened. By contrast, skipping whitespace is a higher-level operation in the stream I/O library.
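The effect of the skipws flag is easy to observe with string streams. This is a minimal sketch, assuming a standard-conforming compiler; copy_chars() is a hypothetical name.

```cpp
#include <sstream>
#include <string>

// Copy text a character at a time with >> and <<, optionally
// leaving the default whitespace skipping switched on.
std::string copy_chars(const std::string& text, bool skip_whitespace)
{
    std::istringstream in(text);
    std::ostringstream out;
    if (!skip_whitespace)
        in.unsetf(std::ios::skipws);
    char c;
    while (in >> c)
        out << c;
    return out.str();
}
```

With skipping left on, blanks and newlines never reach the loop body, which is what jumbles the output.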

(correction)

In issue #008 we talked about copying files and said this about one of the examples of copying files using C:

This approach works on text files. Unfortunately, however, for binary files, an attempt to copy a 10406-byte file resulted in output of only 383 bytes. Why? Because EOF is itself a valid character that can occur in a binary file. If set to -1, then this is equivalent to 255 or 0377 or 0xff, a perfectly legal byte in a file.

This isn't quite the case. A common mistake when copying files in C is to use a char instead of an int with getc() and putc(). If a char is used, then the explanation above is correct, because with a binary file EOF interpreted as a character is one of the 256 valid bit patterns that a char can hold.

But with an int this is not a problem. getc(), and its functional equivalent fgetc(), return an unsigned char converted to an int. So the int can represent all character values 0-255, along with the EOF marker (typically -1).
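The difference between the two variable types can be demonstrated with the sketch below. It uses the get() member of an input stream, which like getc() returns the character as an int, and it assumes that EOF is -1 and that converting 255 to signed char yields -1, as on typical machines. Both function names are hypothetical.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Correct: keep the result in an int, which can hold 0-255 and EOF.
int count_with_int(const std::string& data)
{
    std::istringstream in(data);
    int n = 0;
    int c;
    while ((c = in.get()) != EOF)
        ++n;
    return n;
}

// Mistake: a signed char cannot tell the byte 0xff apart from EOF,
// so the loop stops early when it meets that byte.
int count_with_char(const std::string& data)
{
    std::istringstream in(data);
    int n = 0;
    signed char c;
    while ((c = static_cast<signed char>(in.get())) != EOF)
        ++n;
    return n;
}
```

Given five bytes of input with 0xff in the middle, the int version counts all five, while the char version stops after the first two.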

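It turns out that the example failed because of a ^Z (Control-Z) in the file. ^Z used to be used as an end-of-file marker for DOS files used on PCs, and a file opened in text mode, as in the first example, is treated as ending at that character. Opening the files in binary mode, as the second example does, avoids the problem.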

Thanks to David Nelson for mentioning this.

INTRODUCTION TO STREAM I/O PART 4 - TIE()

In issue #008 we talked about copying files using a variety of methods. One example that was presented was this one:

#include <iostream.h>
int main()
{
    char c;
    cin.unsetf(ios::skipws);
    while (cin >> c)
        cout << c;
    return 0;
}

Jerry Schwarz suggested that it might be worth discussing the tie() function and its effect on the performance of this code. Specifically, if we slightly change the above code to:

#include <iostream.h>
int main()
{
    char c;
    cin.tie(0);
    cin.unsetf(ios::skipws);
    while (cin >> c)
        cout << c;
    return 0;
}

it runs about 8X faster with one popular C++ compiler, and about 18X with another.

The difference has to do with buffering and flushing of streams. When input is requested, for example with:

cin >> c

there may be output pending in the buffer for the output stream. The input stream is therefore tied to the output stream such that a request for input will cause pending output to be flushed. Flushing output is expensive, typically triggering a flush() call and a write() system call (on UNIX systems). Disabling the linkage between the input and output streams gets rid of this overhead.
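The flushing can be made visible by counting it. The sketch below assumes a standard-conforming compiler with C++11 features, and a library that flushes the tied stream on every input request (a library is permitted to skip the flush when nothing is pending, so the exact count may vary). CountingBuf and count_flushes() are hypothetical names.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: an unbuffered output buffer that records
// how many times it has been asked to synchronize (flush).
class CountingBuf : public std::streambuf {
public:
    int syncs = 0;
    std::string data;
protected:
    int_type overflow(int_type ch) override
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            data.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }
    int sync() override
    {
        ++syncs;
        return 0;
    }
};

// Copy text a character at a time and report how many flushes
// of the output stream were triggered along the way.
int count_flushes(const std::string& text, bool tied)
{
    CountingBuf buf;
    std::ostream out(&buf);
    std::istringstream in(text);
    in.tie(tied ? &out : nullptr);
    in.unsetf(std::ios::skipws);
    char c;
    while (in >> c)
        out << c;
    return buf.syncs;
}
```

With the streams tied, each input request costs a flush of the output stream; untied, there are none, which is where the speedup comes from.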

To further illustrate this point, consider another example:

#include <iostream.h>
int main()
{
    char buf[100];
    //cin.tie(0);
    cin.unsetf(ios::skipws);
    cout.unsetf(ios::unitbuf);
    cout << "What is your name? ";
    cin >> buf;
    return 0;
}

It's common for output to be completely unbuffered (unit buffering) if going to a terminal (screen or window). So setting cin.tie(0) will not necessarily change observable behavior, because output will be flushed immediately in all cases.

To affect behavior in this example, one also needs to disable unit buffering for the stream, achieved by saying:

cout.unsetf(ios::unitbuf);

Once this is done, cin.tie(0) will change behavior in a visible way. If the input stream is untied, then the prompt in the example above will not come out before input is requested from the user, leading to confusion.

Note also that current libraries vary in their behavior. The above example works for one library that was tried, but for another, there appears to be no way to disable unit buffering under any circumstances, when output is to a terminal. The draft ANSI/ISO C++ standard calls for unit buffering to be set for error output ("cerr").

If tie() is called with no argument, it returns the stream currently tied to. For example:

cout << (void*)cin.tie() << "\n";
cout << (void*)(&cout) << "\n";

give identical results if cin is currently tied to cout.
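Because tie() with an argument returns the stream previously tied to, the old setting can be saved and put back later. The following is a minimal sketch of that idea, assuming a standard-conforming compiler with C++11 features. UntieGuard and tie_is_restored() are hypothetical names.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: unties a stream for the lifetime of the guard,
// then restores whatever it was tied to before.
class UntieGuard {
    std::istream& in;
    std::ostream* saved;
public:
    explicit UntieGuard(std::istream& s) : in(s), saved(s.tie(nullptr)) {}
    ~UntieGuard() { in.tie(saved); }
};

bool tie_is_restored()
{
    std::istringstream in("x");
    std::ostringstream out;
    in.tie(&out);                       // tie the input to the output
    bool untied_inside = false;
    {
        UntieGuard guard(in);
        untied_inside = (in.tie() == nullptr);
    }                                   // guard destroyed: tie restored
    return untied_inside && in.tie() == &out;
}
```

A guard like this could be wrapped around a bulk copy loop, so that interactive prompts elsewhere in the program still appear before input is requested.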

Copying files a character at a time has other pitfalls. One has to be careful in assessing the buffering and function call overhead for anything done on a per-character basis. There is yet another way of copying files by character, using streambufs, that we'll present in a future issue.

INTRODUCTION TO STREAM I/O PART 5 - STREAMBUF

In previous issues we talked about various ways of copying files using stream I/O, some of the ways of affecting I/O operations by specifying unit buffering or not and tying one stream to another, and so on.

Another way of copying input to output using stream I/O is to say:

#include <iostream.h>
int main()
{
    int c;
    while ((c = cin.rdbuf()->sbumpc()) != EOF)
        cout.rdbuf()->sputc(c);
    return 0;
}

This scheme uses what are known as streambufs, underlying buffers used in the stream I/O package. An expression:

cin.rdbuf()->sbumpc()

says "obtain the streambuf pointer for the standard input stream (cin) and grab the next character from it and then advance the internal pointer within the buffer". Similarly,

cout.rdbuf()->sputc(c)

adds a character to the output buffer.
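The same loop can be written as a function that works on any pair of streams. This is a sketch assuming a standard-conforming compiler, with the end-of-file value taken from the buffer's traits type; copy_streambuf() and copy_bytes() are hypothetical names.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Copy from one stream's buffer to another's, a character at a time,
// bypassing the formatting layer entirely.
void copy_streambuf(std::istream& in, std::ostream& out)
{
    typedef std::streambuf::traits_type traits;
    std::streambuf* src = in.rdbuf();
    std::streambuf* dst = out.rdbuf();
    traits::int_type c;
    while ((c = src->sbumpc()) != traits::eof())
        dst->sputc(traits::to_char_type(c));
}

// Round-trip a string through copy_streambuf() for inspection.
std::string copy_bytes(const std::string& data)
{
    std::istringstream in(data);
    std::ostringstream out;
    copy_streambuf(in, out);
    return out.str();
}
```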

Doing I/O in this way is lower-level than some other approaches, but correspondingly faster. If we summarize the four file-copying methods we've studied (see issues #008 and #009 for code examples of them), from slowest to fastest, they might be as follows.

Copy a character at a time with >> and <<:

cin.tie(0);
cin.unsetf(ios::skipws);
while (cin >> c)
    cout << c;

Copy using get() and put():

ifstream ifs(argv[1], ios::in | ios::binary);
ofstream ofs(argv[2], ios::out | ios::binary);
while (ifs.get(c))
    ofs.put(c);

Copy with streambufs (above):

while ((c = cin.rdbuf()->sbumpc()) != EOF)
    cout.rdbuf()->sputc(c);

Copy with streambufs but explicit copying buried:

ifstream ifs(argv[1], ios::in | ios::binary);
ofstream ofs(argv[2], ios::out | ios::binary);
ofs << ifs.rdbuf();

A table of relative times, for one popular C++ compiler, comes out like so:

>>, <<              100
get/put              72
streambuf            62
streambuf hidden     43

Actual times will vary for a given library. Perhaps the most critical factor is whether functions that are used in a given case are inlined or not. Note also that if you are copying binary files you need to be careful with the way copying is done.

Why the time differences? All of these methods use streambufs in some form. But the slowest method, using >> and <<, also does additional processing. For example, it calls internal functions like ipfx() and opfx() to handle unit buffering, elision of whitespace on input, and so on. get/put also call these functions.

The fastest two approaches do not worry about such processing, but simply allow one to manipulate the underlying buffer directly. They offer fewer services but are correspondingly faster.

INTRODUCTION TO STREAM I/O PART 6 - SEEKING IN FILES

In earlier issues we talked about streambufs, the underlying buffer used in I/O operations. One of the things that you can do with a buffer is position its pointer at various places. For example:

#include <fstream.h>
int main()
{
    ofstream ofs("xxx");
    if (!ofs)
        ; // give error
    ofs << ' ';
    ofs << "abc";
    streampos pos = ofs.tellp();
    ofs.seekp(0);
    ofs << 'x';
    ofs.seekp(pos);
    ofs << "def\n";
    return 0;
}

Here we have an output file stream attached to a file "xxx". We open this file and write a single blank character at the beginning of it. In this particular application this character is a status character of some sort that we will update from time to time.

After writing the status character, we write some characters to the file, at which point we wish to update the status character. To do this, we save the current position of the file using tellp(), seek to the beginning, write the character, and then seek back to where we were, at which point we can write some more characters.

Note that "streampos" is a defined type of some kind rather than simply a fixed fundamental type like "long". You should not assume particular types when working with file offsets and positions, but instead save the value returned by tellp() and then use it later.
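The save-and-restore pattern can be tried without creating a file by using a string stream, which supports the same positioning calls. This is a minimal sketch, assuming a standard-conforming compiler; status_demo() is a hypothetical name.

```cpp
#include <sstream>
#include <string>

std::string status_demo()
{
    std::ostringstream os;
    os << ' ';                          // placeholder status character
    os << "abc";
    std::streampos pos = os.tellp();    // remember where we are
    os.seekp(0);                        // back to the beginning
    os << 'x';                          // update the status character
    os.seekp(pos);                      // return to the saved position
    os << "def\n";
    return os.str();
}
```

Note that the saved position is kept in a streampos and handed back to seekp() unchanged, with no arithmetic done on it.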

In a similar way, it's tricky to use absolute file offsets other than 0 when seeking in files. For example, there are issues with binary files and with CR/LF translation. You may be assuming that a newline takes two characters when it only takes one, or vice versa.

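seekp() also has a two-parameter version, which takes an offset (rather than a saved position) together with the point that the offset is measured from:

ofs.seekp(off, ios::beg);    // from beginning
ofs.seekp(off, ios::cur);    // from current position
ofs.seekp(off, ios::end);    // from end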
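As an illustration of seeking relative to the end, the sketch below overwrites the last character that was written. It assumes a standard-conforming compiler and a non-empty text, and it uses a string stream so that no file is needed. replace_last() is a hypothetical name.

```cpp
#include <sstream>
#include <string>

// Replace the final character of the text by seeking one character
// back from the end and writing over it.
std::string replace_last(const std::string& text, char ch)
{
    std::ostringstream os;
    os << text;
    os.seekp(-1, std::ios::end);    // one character before the end
    os << ch;
    return os.str();
}
```

A negative offset from ios::end, as here, is the usual case. The same cautions about text files and CR/LF translation apply to offsets other than 0.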

 
