using sstream to read a vector<string>

arnuld

This works fine. I welcome any views, advice, or coding practices :)


/* C++ Primer - 4/e
*
* Exercise 8.9
* STATEMENT:
* write a program to store each line from a file into a
* vector<string>. Now, use istringstream to read each line
* from the vector a word at a time.
*
*/

#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <sstream>


void read_file( std::ifstream& infile, std::vector<std::string>& svec )
{
    std::string a_line;
    while( std::getline( infile, a_line ) )
    {
        svec.push_back( a_line );
    }
}


/* This program works in roughly 3 steps:

   step 1 : we take the path of the file and open it.
   step 2 : we read the file into the vector<string>
            and close it.
   step 3 : we use the istringstream to read each line,
            a word at a time.
*/
int main()
{
    /* step 1 */
    std::vector<std::string> svec;

    std::cout << "Enter full path of the file: ";
    std::string path_to_file;
    std::getline( std::cin, path_to_file );

    std::ifstream infile( path_to_file.c_str() );

    /* step 2 */
    /* check whether file was even opened or not */
    if( infile.good() )
    {
        read_file( infile, svec );
    }

    /* reading finished, don't forget to close the file */
    infile.close();

    /* step 3 */
    for( std::vector<std::string>::const_iterator iter = svec.begin();
         iter != svec.end(); ++iter)
    {
        std::string a_word;
        /* bind that line, to the istringstream */
        std::istringstream vector_line( *iter );

        while( vector_line >> a_word )
        {
            std::cout << a_word << " ";
        }
    }

    std::cout << std::endl;

    return 0;
}

--------- OUTPUT ------------
[arnuld@Arch64 c++]$ g++ -ansi -pedantic -Wall -Wextra ex_08-16.cpp
[arnuld@Arch64 c++]$ ./a.out
Enter full path of the file: /home/arnuld/programming/scratch.txt
;; This buffer is for notes you don't want to save, and for Lisp
evaluation. ;; If you want to create a file, visit that file with C-x
C-f, ;; then enter the text in that file's own buffer.
[arnuld@Arch64 c++]$
 
Victor Bazarov

arnuld said:
This works fine. I welcome any views, advice, or coding practices :)


/* C++ Primer - 4/e
*
* Exercise 8.9
* STATEMENT:
* write a program to store each line from a file into a
* vector<string>. Now, use istringstream to read each line
* from the vector a word at a time.
*
*/

#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <sstream>


void read_file( std::ifstream& infile, std::vector<std::string>&
svec )
{
std::string a_line;
while( std::getline( infile, a_line ) )
{
svec.push_back( a_line );
}
}

Why limit this to 'ifstream'? Why not 'istream'? Then you could
re-use it to read from the standard input as well.
/* This program works nearly in 3 steps:

step 1 : we take the path of the file and open it.
step 2 : we read the file into the vector<string>
and close it.
step 3 : we use the istringstream to read each line,
a word at a time.
*/
int main()
{
/* step 1 */
std::vector<std::string> svec;

std::cout << "Enter full path of the file: ";
std::string path_to_file;
std::getline( std::cin, path_to_file );

You should consider asking this question only if 'argv[1]' was not
provided. If it is provided, try assigning it to 'path_to_file':

if (argc > 1)
    path_to_file = argv[1];
else {
    std::cout << "Enter file name: " << std::flush;
    std::getline(std::cin, path_to_file);
}

And check the error condition on 'cin' here -- what if I press
Ctrl-D? You should exit then.
std::ifstream infile( path_to_file.c_str() );

/* step 2 */
/* check whether file was even opened or not */
if( infile.good() )
{
read_file( infile, svec );

There is no check that you actually read anything, or that no error
occurred in the process (not sure if you want/need that, but it might
be useful to tell the user if reading failed).
}

/* reading finished, don't forget to close the file */
infile.close();

/* step 3 */
for( std::vector<std::string>::const_iterator iter = svec.begin();
iter != svec.end(); ++iter)
{
std::string a_word;
/* bind that line, to the istringstream */
std::istringstream vector_line( *iter );

while( vector_line >> a_word )
{
std::cout << a_word << " ";
}
}

std::cout << std::endl;

return 0;
}

--------- OUTPUT ------------
[arnuld@Arch64 c++]$ g++ -ansi -pedantic -Wall -Wextra ex_08-16.cpp
[arnuld@Arch64 c++]$ ./a.out
Enter full path of the file: /home/arnuld/programming/scratch.txt
;; This buffer is for notes you don't want to save, and for Lisp
evaluation. ;; If you want to create a file, visit that file with C-x
C-f, ;; then enter the text in that file's own buffer.
[arnuld@Arch64 c++]$


V
 
pavel.turbin

You should probably support wstring as well: depending on a command
line parameter, read the stream into either a
vector<string> or a vector<wstring>.
 
Jerry Coffin

[ ... ]
/* step 3 */
for( std::vector<std::string>::const_iterator iter = svec.begin();
iter != svec.end(); ++iter)
{
std::string a_word;
/* bind that line, to the istringstream */
std::istringstream vector_line( *iter );

while( vector_line >> a_word )
{
std::cout << a_word << " ";
}
}

I'd probably code this section something like this:

std::copy(svec.begin(), svec.end(),
          std::ostream_iterator<std::string>(std::cout, " "));

I know it was required in an earlier exercise, but I can't say I'm
terribly enthused about the read_file function either. For reading data
a line at a time, I prefer a proxy to use with std::copy. Using that,
your program can be written something like this:

#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <sstream>
#include <iostream>
#include <fstream>

// string proxy that reads a line at a time
class line {
    std::string data;
public:
    std::istream &read(std::istream &is) {
        return std::getline(is, data);
    }

    operator std::string() const { return data; }
};

std::istream &operator>>(std::istream &is, line &d) {
    return d.read(is);
}

// Write the words in a string out as a series of space-separated words.
class write_words {
    std::ostream &os_;
public:
    write_words(std::ostream &os) : os_(os) {}

    void operator()(std::string const &line) {
        std::stringstream words(line);

        std::copy(std::istream_iterator<std::string>(words),
                  std::istream_iterator<std::string>(),
                  std::ostream_iterator<std::string>(os_, " "));
    }
};

// To keep things simple, I'm assuming the filename is passed on the
// command line
int main(int argc, char **argv) {
    std::vector<std::string> svec;

    // The following block is vaguely equivalent to read_file
    {
        std::ifstream in(argv[1]);

        std::copy(std::istream_iterator<line>(in),
                  std::istream_iterator<line>(),
                  std::back_inserter(svec));
    }

    std::for_each(svec.begin(), svec.end(), write_words(std::cout));
    return 0;
}

Personally, I find this quite a bit more readable. Instead of the
"steps" being encoded as comments, the code itself expresses each step
fairly directly. The one fact that may not be immediately obvious is
that since 'in' is a local variable, it is automagically closed upon
leaving the block in which it was created. I _strongly_ prefer this to
explicitly calling in.close() -- it's much easier to produce exception-
safe code this way. That may not be much of an issue for you yet, but
you're progressing quickly enough that I'd guess it will be soon, and
it's a lot easier if you don't have to re-learn everything when you do.
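
To make the scope point concrete, here is a minimal sketch. count_lines is a hypothetical helper, not part of the program above; it only illustrates that the stream's destructor does the closing:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Counts the lines in a file. 'in' is a local object, so its destructor
// closes the file when the function returns, and also when an exception
// propagates out of the loop. No explicit in.close() is needed.
std::size_t count_lines( const char* path )
{
    std::ifstream in( path );
    std::size_t n = 0;
    std::string a_line;
    while( std::getline( in, a_line ) )
    {
        ++n;
    }
    return n;
}   // in.~ifstream() runs here and closes the file
```

If the file cannot be opened the loop never runs and the function returns 0; a real program would probably want to distinguish that case from an empty file.
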

As an aside, I suppose I should add that for a real program, more error-
checking is clearly needed. You should clearly check that argc>1 before
trying to use argv[1] -- though as Victor (I think it was Victor anyway)
suggested, you might want to use the file specified on the command line
if there was one, and prompt for an input file name otherwise. If you
decide to do that, I'd move the code to do so into a function instead of
leaving it directly in main though, something like:

char *file_name = argv[1];
if (file_name == NULL) // argv[argc] == NULL
    file_name = get_file_name(std::cin);
 
Jerry Coffin

[ ... ]

I've done a bit more playing around with the code I posted earlier. I
wasn't entirely happy about using 'std::for_each' to do what was really
a copy operation. Admittedly, by using 'write_words(std::cout)' in the
for_each, it was fairly apparent what was happening. Nonetheless, I
prefer to copy things with std::copy. The proxy-based approach can be
extended to deal with output just as well as input, so I decided to try
that out:

#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <sstream>
#include <iostream>
#include <fstream>

class line {
    std::string data;
    static const std::string empty;
public:
    std::istream &read(std::istream &is) {
        return std::getline(is, data);
    }

    line(std::string const &init) :data(init) {}
    line(){}

    std::ostream &write(std::ostream &os) const {
        std::stringstream x(data);
        std::copy(std::istream_iterator<std::string>(x),
                  std::istream_iterator<std::string>(),
                  std::ostream_iterator<std::string>(os, " "));
        return os;
    }

    operator std::string() const { return data; }
};

std::istream &operator>>(std::istream &is, line &d) {
    return d.read(is);
}

std::ostream &operator<<(std::ostream &os, line const &d) {
    return d.write(os);
}

int main(int argc, char **argv) {
    std::vector<std::string> svec;
    std::ifstream in(argv[1]);

    std::copy(std::istream_iterator<line>(in),
              std::istream_iterator<line>(),
              std::back_inserter(svec));

    std::copy(svec.begin(),
              svec.end(),
              std::ostream_iterator<line>(std::cout, " "));

    return 0;
}

Class line is now something of a misnomer. In reality, 'line' should
probably be left as it was before, and a new class (word_writer or
something like that) created for the output. I didn't bother for now,
mostly because it makes the code longer, and I doubt it makes much
difference in understanding what I've done. I suspect many people
probably find the previous version more understandable than this anyway.
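
For anyone curious what that split might look like, here is a rough sketch of a separate output proxy. The class name word_writer comes from the remark above, but its contents are only my guess at a reasonable shape:

```cpp
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical output-only proxy: wraps a string and, when streamed,
// writes it out as a series of space-separated words.
class word_writer {
    std::string data;
public:
    word_writer(std::string const &init) : data(init) {}

    std::ostream &write(std::ostream &os) const {
        std::istringstream words(data);
        std::copy(std::istream_iterator<std::string>(words),
                  std::istream_iterator<std::string>(),
                  std::ostream_iterator<std::string>(os, " "));
        return os;
    }
};

std::ostream &operator<<(std::ostream &os, word_writer const &w) {
    return w.write(os);
}
```

With this in place 'line' can go back to being input-only, and the output step would use std::ostream_iterator<word_writer>(std::cout) as the destination of the second std::copy. The converting constructor is what lets each std::string in svec be turned into a word_writer on the way out.
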
 
Jerry Coffin

I really don't understand what exactly is going on here :(

Sorry 'bout that. I undoubtedly should have included quite a bit more
commentary about the code when I posted it.

As I mentioned previously, our first class is a proxy. It's mostly a
stand-in for std::string, but it changes a few specific things about how
a string acts.

The first thing we change is how we read our special string type.
Instead of reading a word at a time, we read an entire line at a time:

std::istream &read(std::istream &is) {
    return std::getline(is, data);
}

That's all this does: make it so line.read(some_istream) reads a whole
line of data into a string.

line(std::string const &init) :data(init) {}

This just lets us initialize a line from a string. It just copies the
string into the line's data member (which is a string, so it doesn't
really change anything).

line(){}

This default ctor would have been there by default, but since we added
the ctor that creates a line from a string, we have to explicitly add it
back in or it won't be there -- and we generally need a default ctor for
anything we're going to put into any standard container.

std::ostream &write(std::ostream &os) const {

std::stringstream x(data);

This much takes the line that we read in, and puts it into a
stringstream.


std::copy(std::istream_iterator<std::string>(x),
          std::istream_iterator<std::string>(),
          std::ostream_iterator<std::string>(os, " "));

Now we take our stringstream, read out one word at a time, and write it
out to a stream. It's roughly equivalent to code like:

std::string temp;
while (x >> temp) {
    os << temp;
    os << " ";
}

The big difference is that std::copy can take essentially any pair of
iterators as input, and write those items to essentially any iterator
for the output.
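
As a small illustration of that flexibility, here is a sketch that keeps the same input range (the words in a stringstream) but swaps the destination for a vector. split_words is a hypothetical name used only for this example:

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Same source as in line::write (the words in a string stream), but the
// destination is a vector, reached through std::back_inserter, rather
// than an output stream.
std::vector<std::string> split_words(std::string const &text) {
    std::istringstream x(text);
    std::vector<std::string> words;
    std::copy(std::istream_iterator<std::string>(x),
              std::istream_iterator<std::string>(),
              std::back_inserter(words));
    return words;
}
```

Only the third argument to std::copy changed; the reading side is untouched.
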

operator std::string() const { return data; }

This is a crucial part of allowing our 'line' class to act like a
string: we allow it to be converted to a string at any time. This
becomes crucial in how we use it below.

std::istream &operator>>(std::istream &is, line &d) {
    return d.read(is);
}

std::ostream &operator<<(std::ostream &os, line const &d) {
    return d.write(os);
}

These two functions just overload operator>> and << to use our member
functions. They let you read a line with code like:

std::cin >> some_line;

and it reads an entire line of data instead of just one word.



int main(int argc, char **argv) {
std::vector<std::string> svec;

std::ifstream in(argv[1]);

This just takes the first name that was passed on the command line and
opens it as a file.

std::copy(std::istream_iterator<line>(in),
          std::istream_iterator<line>(),
          std::back_inserter(svec));

Now, here's where we get a little bit tricky: even though we have a
vector of string, we use an istream_iterator<line>. That means the
istream_iterator uses the operator>> we defined above (which, in turn,
calls line::read) to read in each item from the stream. std::copy then
assigns that item through the back_inserter, which calls push_back to
add it to svec. This, in turn, uses the operator std::string we defined
above, so it ends up storing the entire string that we read in (i.e.
the entire line we read with std::getline in line::read).

std::copy(svec.begin(), svec.end(),
          std::ostream_iterator<line>(std::cout, " "));

Here we reverse things: we copy the contents of svec out to an
ostream_iterator for std::cout -- but it's an ostream_iterator<line>, so
when it writes each complete line out to std::cout, it does so using
line.write. That, in turn, creates a stringstream and writes out each
word in the line individually instead of writing out the whole line.

Our proxy (line) is pretty close to what most people initially think
they will do with inheritance -- create a new class that acts like the
existing class (string, in this case) except for changing a couple of
specific operations (input and output, in this case). The ctor and cast
operator we provided in line allow us to freely convert between this
type and string whenever we want/need to. In this case, we read in
lines, convert and store them as strings, then convert them back to
lines as we write them out.

Given that we never do anything _else_ with the strings, we could
eliminate all the conversions by creating svec as a vector<line> instead
of a vector<string>. This, however, would have a substantial
disadvantage if we did much of anything else with our strings -- we'd
have to explicitly provide access to any other string operations, and as
we all know, there are a LOT of string operations. Providing the
conversions allows us to support all the string operations with very
little work.
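
For comparison, a sketch of the vector<line> variant. This is a cut-down 'line' with the conversions removed; the size() member and the copy_words function are hypothetical additions, there only to show the cost and to make the example testable:

```cpp
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Cut-down 'line' with no conversions to or from std::string.
class line {
    std::string data;
public:
    std::istream &read(std::istream &is) { return std::getline(is, data); }

    std::ostream &write(std::ostream &os) const {
        std::istringstream x(data);
        std::copy(std::istream_iterator<std::string>(x),
                  std::istream_iterator<std::string>(),
                  std::ostream_iterator<std::string>(os, " "));
        return os;
    }

    // Every std::string operation we want has to be forwarded by hand:
    std::string::size_type size() const { return data.size(); }
};

std::istream &operator>>(std::istream &is, line &d) { return d.read(is); }
std::ostream &operator<<(std::ostream &os, line const &d) { return d.write(os); }

// Reads lines from 'in' and writes their words to 'out', with no
// line <-> string conversion in between.
void copy_words(std::istream &in, std::ostream &out) {
    std::vector<line> lvec;
    std::copy(std::istream_iterator<line>(in),
              std::istream_iterator<line>(),
              std::back_inserter(lvec));
    std::copy(lvec.begin(), lvec.end(), std::ostream_iterator<line>(out));
}
```

The two std::copy calls are simpler, but anything beyond size() (find, substr, comparison and so on) would need its own forwarding member, which is the disadvantage described above.
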
 
arnuld

Sorry 'bout that. I undoubtedly should have included quite a bit more
commentary about the code when I posted it.
....[SNIP].....


Jerry, just by looking at your code I realised that you created a new
type that is very much like a string but differs in a few aspects. The
new type solves our problem in an effective way.

Though I have never written any real-life code, if your code is what
we call real-life code then I understood most of it :) Some things were not
clear to me, but I will not go into that as that is not my point here. My
point is that I *can* understand what you wrote (on the contrary, I always
thought I could never do that).

Thanks a lot for writing such long and clear code and documentation for me
:). You are cool :cool:

BTW, you have chiseled one habit into my code-writing style: std::copy. I
don't think I will ever forget it again :)
 
Jerry Coffin

[ ... ]
Though I have never written any real-life code, if your code is what
we call real-life code then I understood most of it :) Some things were not
clear to me, but I will not go into that as that is not my point here. My
point is that I *can* understand what you wrote (on the contrary, I always
thought I could never do that).

That definitely was not real-life code. Real code (at least IME) tends
to become hard to understand less because it's complex per se, than
because some parts become sacred (i.e. nobody wants to touch them) and
other parts get hacked to work around shortcomings in the sacred parts.
Likewise, there tends to be a lot more error checking that simply
obscures the mainstream of the code, sometimes to the point that it's
almost impossible to figure out what really IS the mainstream of the
code anymore.

In this particular case, writing the same thing as real code would
generally be a bit simpler. This particular task was over-specified from
a viewpoint of getting the job done correctly. In reality, we could
achieve the same actual output with code like:

std::copy(std::istream_iterator<std::string>(std::cin),
          std::istream_iterator<std::string>(),
          std::ostream_iterator<std::string>(std::cout, " "));

Since we write the output out a word at a time anyway, reading a word at
a time is no problem -- except, of course that it bypasses the point of
the exercise. :)
Thanks a lot for writing such long and clear code and documentation for me
:). You are cool :cool:

BTW, you have chiseled one habit into my code-writing style: std::copy. I
don't think I will ever forget it again :)

It is handy, but make no mistake: it's not a panacea or anything like
that. OTOH, I would say it's probably one of the most under-used
algorithms in the library (and based on postings here, I'd say for_each
is the most over-used).
 
