Tracking information in input stream

Adam H. Peterson

I would like to make a stream or streambuf that tracks the number of
lines that have been read and stuff like that (so that, for example, when
I hit an error, I can ask the stream for the line number and put it in
the error message). I tried to create a streambuf class that held an
instance of another streambuf (through parameterized inheritance) and
would ferry calls from the one to the other, tracking data as it was
read, but I wasn't able to find a hook at the point where the data gets
passed to the calling application.

Has this problem already been solved? Is there a place I can look for a
ready-made solution?

Anyway, thanks for reading.

Adam H. Peterson
 
Mike Wahler

Adam H. Peterson said:
I would like to make a stream or streambuf that tracks the number of
lines that have been read

You don't need to create your own stream or
stream buffer type to do this.

Just count the newline characters when you read.

and stuff like that


"stuff like that" is such a vague phrase, I won't even
try to address it.
(so, for example, when I
get an error message, I can ask the stream for the line number and put
it in error messages.

Output the value of a counter which you increment when
encountering a newline character.
I tried to create a streambuf class that held an
instance of another streambuf (through parameterized inheritance) and
would ferry calls from the one to the other, tracking data as it was
read, but I wasn't able to get a hook that found where the data gets
passed to the calling application.

I think you're making things far more complex than necessary.
Has this problem already been solved? Is there a place I can look for a
ready-made solution?

Just do your i/o in a loop (typical way to do it anyway).
Exactly how you count newlines depends upon the functions
you use for your i/o. E.g. if you use cin.get(), check its
return value for '\n'. If you use e.g. 'std::getline', when
it returns, you know you've either encountered a newline, or
EOF (or an error occurred). You can sort out what happened
with e.g. 'eof()' and 'fail()'
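
A minimal sketch of the getline approach described above, assuming the input is meant to hold one integer per line. The name first_bad_line and the one-integer-per-line rule are illustrative assumptions, not something from this thread; the point is only that the loop owns the line counter and can report it when parsing fails.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads `in` line by line and returns the 1-based
// number of the first line that is not exactly one integer, or 0 if
// every line parses.
std::size_t first_bad_line(std::istream& in) {
    std::size_t line_no = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++line_no;  // one successful getline == one more line consumed
        std::istringstream fields(line);
        int value;
        char extra;
        // Fail if no integer is present, or if anything follows it.
        if (!(fields >> value) || (fields >> extra))
            return line_no;
    }
    return 0;
}
```

This only works when the caller controls the read loop; it does not help when the extraction happens inside someone else's operator>>.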

-Mike
 
Adam H. Peterson

Mike said:
You don't need to create your own stream or
stream buffer type to do this.

Just count the newline characters when you read.

This isn't really useful to me, though. If I have code like:

void f(istream &i) {
    Object o;
    i >> o;
}

I don't really want to poke around inside Object's stream extraction
operator, and I may be unable to. It's pretty common to have opaque I/O
routines that read and write objects or data structures of various
types. Often I can't instrument them, and even if I can, it usually
results in pretty messy looking code when compared with the original.
So I don't always have the option of keeping track of stuff as I read it
--- that is, unless I can get the stream to do it for me.
I think you're making things far more complex than necessary.

Possible. If I am, I would be grateful to know a simpler solution. (I
mean simple to code, rather than simple to explain.)
Just do your i/o in a loop (typical way to do it anyway).
Exactly how you count newlines depends upon the functions
you use for your i/o. E.g. if you use cin.get(), check its
return value for '\n'. If you use e.g. 'std::getline', when
it returns, you know you've either encountered a newline, or
EOF (or an error occurred). You can sort out what happened
with e.g. 'eof()' and 'fail()'

Well, that's fine if I'm reading primitive values like characters or
lines. But even reading integers using >> will get me into trouble
because it will consume whitespace without an opportunity for me to
inspect it. And I'm thinking in general of the read-opaque-object
situation.

Instrumenting all the I/O routines I use reduces their encapsulation and
makes them harder to read, and I believe it would be error prone. If
the stream object can be made to track that information for me
internally, it would (I believe) just work.
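
For illustration, here is a standard-library-only sketch of the kind of stream buffer described above. It is an assumption-laden example, not a ready-made solution from any library: the class name line_counting_buf is hypothetical, it keeps no buffer of its own, and it does not support putback/unget. Because it overrides uflow(), the newline is counted at the moment a character is actually consumed, whoever is doing the extracting.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>

// Hypothetical class: forwards reads to another streambuf one character
// at a time and counts newlines as they are consumed.
class line_counting_buf : public std::streambuf {
public:
    explicit line_counting_buf(std::streambuf* src) : src_(src), lines_(0) {}

    std::size_t lines() const { return lines_; }

protected:
    // Peek: report the next character without consuming it.
    int_type underflow() override { return src_->sgetc(); }

    // Consume: take one character from the source; count it if it is '\n'.
    int_type uflow() override {
        int_type c = src_->sbumpc();
        if (traits_type::eq_int_type(c, traits_type::to_int_type('\n')))
            ++lines_;
        return c;
    }

private:
    std::streambuf* src_;   // not owned; must outlive this object
    std::size_t lines_;
};
```

To use it, construct a std::istream on top of the buffer, e.g. `line_counting_buf buf(std::cin.rdbuf()); std::istream in(&buf);`, pass `in` to the opaque extraction code, and query `buf.lines()` afterwards. Forwarding every character through a virtual call is slow compared with a buffered design, which is the trade-off for an exact count.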
 
Jonathan Turkanis

Adam said:
I would like to make a stream or streambuf that tracks the number of
lines that have been read and stuff like that (so, for example, when I
get an error message, I can ask the stream for the line number and put
it in error messages.
Has this problem already been solved? Is there a place I can look
for a ready-made solution?

The Boost Iostreams library, which will be included with the next release of
Boost (1.33), makes it easy to attach chains of filters to existing streams and
stream buffers. These filters can modify data, or simply observe it, as in your
case. I'm planning to add a filter which counts newlines, but haven't done so
yet.

Here's a simple filter for counting lines of input (untested):

#include <boost/iostreams/concepts.hpp> // input_filter
#include <boost/iostreams/operations.hpp> // get

using namespace std;
using namespace boost::io;

class line_counter : public input_filter {
public:
    line_counter() : lines_(0) { }
    int lines() const { return lines_; }

    template<typename Source>
    int get(Source& src)
    {
        // Pass characters through unchanged, counting newlines on the way.
        int c = boost::io::get(src);
        if (c == '\n')
            ++lines_;
        return c;
    }
private:
    int lines_;
};

(This is just an illustration; the official version will be more efficient.) It
could be used as follows:

#include <iostream>
#include <boost/ref.hpp>
#include <boost/iostreams/filtering_istream.hpp>

using namespace std;
using namespace boost::io;

int main()
{
    line_counter counter;
    filtering_istream in;
    in.push(boost::ref(counter)); // Push a reference to the counter
    in.push(cin);

    // Read from 'in' instead of 'cin'. counter will count the lines.
    ...

    std::cout << "lines read from std::cin = " << counter.lines() << "\n";
}

It's also fairly easy to write a class extending filtering_istream which
keeps track of the line count automatically.

The most recent version of the library is here:

http://home.comcast.net/~jturkanis/iostreams/

I plan to update it soon.

Best Regards,
Jonathan
 
Adam H. Peterson

Great.

I love Boost and use it all the time, so I'll surely look into it.

Are there any facilities in Boost 1.32 that I might be able to use to
this effect?

And/or do you have a good guess as to when the next Boost will be out?

Thanks for the response,
Adam H. Peterson
 
Jonathan Turkanis

Adam said:
Great.

I love Boost and use it all the time, so I'll surely look into it.

Are there any facilities in Boost 1.32 that I might be able to use to
this effect?

I'm afraid not.
And/or do you have a good guess as to when the next Boost will be out?

As far as I know 1.33 hasn't been scheduled. Since 1.32 was just released, it's
sure to be a few months off.

Nearly 10 months elapsed between release 1.31 and 1.32, but this was unusual.
I'm hoping the next release will be in three or four months.

Jonathan
 
Mike Wahler

Adam H. Peterson said:
Instrumenting all the I/O routines I use reduces their encapsulation and
makes them harder to read, and I believe it would be error prone. If
the stream object can be made to track that information for me
internally, it would (I believe) just work.

Now that you've elaborated upon what you want, I see
my advice is probably not applicable. See Jonathan T.'s reply.

-Mike
 
Jeff Flinn

Adam said:
Great.

I love Boost and use it all the time, so I'll surely look into it.

Are there any facilities in Boost 1.32 that I might be able to use to
this effect?

And/or do you have a good guess as to when the next Boost will be out?

I'm successfully using Jonathan's IOStream library with boost 1.32 now (and
have used it with 1.31 also). It's just a matter of copying to the
appropriate directories.

Jeff
 
Adam Peterson

Jeff said:
I'm successfully using Jonathan's IOStream library with boost 1.32 now (and
have used it with 1.31 also). It's just a matter of copying to the
appropriate directories.

My copy of boost is installed through RPM and I'd rather not mess with
the package management system. Can I use the library without installing
it in the same directory tree as boost?
 
Jeff Flinn

Adam said:
My copy of boost is installed through RPM and I'd rather not mess with

I've no idea what RPM is.
the package management system. Can I use the library without
installing it in the same directory tree as boost?

Someone with more boost.build knowledge would have to answer that.

Sorry, Jeff
 
Jonathan Turkanis

Adam said:
My copy of boost is installed through RPM and I'd rather not mess with
the package management system. Can I use the library without
installing it in the same directory tree as boost?

Yes, you can. I'm sorry I didn't say this explicitly before.

Unless you need to use memory-mapped files, streams based on file descriptors
or filters using zlib or libbz2, all you need to do is put boost-1.32.0 and the
iostreams root directory in your include path. You reference the iostreams
headers the same way you would ordinary boost headers, e.g.:

#include <boost/iostreams/operations.hpp>

If you want to use the components mentioned above, which have separate .cpp
files, just add those .cpp files to your project (if you are using an IDE) or
to your makefile, and define the preprocessor symbol BOOST_IOSTREAMS_NO_LIB.

Best Regards,
Jonathan
 
Adam H. Peterson

Jonathan said:
(This is just an illustration; the official version will be more efficient.)

I would be interested to know when you have an official version
available, by the way.

It
could be used as follows:

#include <iostream>
#include <boost/ref.hpp>
#include <boost/iostreams/filtering_istream.hpp>

I think this last should be "#include
<boost/iostreams/filtering_stream.hpp>", no?

I'm using the library and the code, but I'm running into a buffering
problem. I'll query the line count and it will be actually too high
because the next several lines have already been processed but not
handed back to the application. Is there a way to work around this glitch?

Thanks for the help so far,
Adam Peterson
 
Jonathan Turkanis

Adam H. Peterson said:
I would be interested to know when you have an official version
available, by the way.

I'll email you when the next version is available. It won't be "official" until
Boost 1.33 is released, but I will be adding it to Boost CVS soon.
I think this last should be "#include
<boost/iostreams/filtering_stream.hpp>", no?

Right.

I'm using the library and the code, but I'm running into a buffering
problem. I'll query the line count and it will be actually too high
because the next several lines have already been processed but not
handed back to the application. Is there a way to work around this glitch?

I believe if you set the buffer size to zero when you add the filter, you should
get the actual number of lines read:

filtering_istream in;
in.push(line_counter(), 0);
in.push(cin);

Please let me know if this works, including whether it slows things down too
much.
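
To make the buffering effect concrete, here is a standard-library-only analogy. It is not the Boost code, and chunked_counting_buf is a hypothetical name. This buffer pulls up to buffer_size characters from its source at once and counts newlines when the chunk is fetched, so with a large buffer the count runs ahead of what the application has consumed. A plain std::streambuf needs at least one character of get area, so this sketch treats a requested size of 0 as 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical class: reads from `src` in chunks and counts newlines at
// fetch time, not at the time the application consumes them.
class chunked_counting_buf : public std::streambuf {
public:
    chunked_counting_buf(std::streambuf* src, std::size_t buffer_size)
        : src_(src), buf_(buffer_size > 0 ? buffer_size : 1), lines_(0) {}

    std::size_t lines() const { return lines_; }

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        // Refill: fetch the next chunk from the underlying source.
        std::streamsize n = src_->sgetn(
            buf_.data(), static_cast<std::streamsize>(buf_.size()));
        if (n <= 0)
            return traits_type::eof();
        // Every newline in the chunk is counted now, even if unread.
        lines_ += static_cast<std::size_t>(
            std::count(buf_.data(), buf_.data() + n, '\n'));
        setg(buf_.data(), buf_.data(), buf_.data() + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::streambuf* src_;     // not owned; must outlive this object
    std::vector<char> buf_;
    std::size_t lines_;
};
```

With the input "a\nb\nc\nd\n" and a 128-character buffer, a single std::getline leaves lines() at 4; with a one-character buffer it leaves lines() at 1. Even the one-character version counts a newline as soon as it is peeked, so it can still be ahead by that one character.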

Jonathan
 
Adam Peterson

Jonathan said:
I believe if you set the buffer size to zero when you add the filter, you should
get the actual number of lines read:

filtering_istream in;
in.push(line_counter(), 0);
in.push(cin);

Please let me know if this works, including whether it slows things down too
much.

That appears to work fine. I'm not noticing any performance difference,
and I wasn't really expecting one either. I think buffering only makes
a difference when interfacing with block I/O devices, so by the time you
get past the original stream source it probably wouldn't help you any.

In any event, I think any performance loss for my purpose(s) would be
par for composing an alternate solution.

FYI, I get some warnings when I compile:
/.../boost/iostreams/categories.hpp:102: warning: virtual base
`boost::io::input_seekable' inaccessible in `boost::io::iostream_tag'
due to ambiguity
/.../boost/iostreams/categories.hpp:102: warning: virtual base
`boost::io::device_tag' inaccessible in `boost::io::iostream_tag' due to
ambiguity
/.../boost/iostreams/categories.hpp:102: warning: virtual base
`boost::io::output_seekable' inaccessible in `boost::io::iostream_tag'
due to ambiguity

They look like design warnings for choices not likely to be made, but I
thought I'd pass them along anyway in case you care.

Thanks,
Adam Peterson
 
Jonathan Turkanis

Adam said:
That appears to work fine. I'm not noticing any performance
difference, and I wasn't really expecting one either. I think
buffering only makes a difference when interfacing with block I/O
devices, so by the time you get past the original stream source it
probably wouldn't help you any.

I expect buffering to help a bit with filters, since instead of calling a
virtual stream buffer function for every character it happens only once every 128
characters or so. You're right that the gain can be much more significant when
you're accessing a device. Anyway, I'm very glad to see you haven't experienced
any performance loss.
In any event, I think any performance loss for my purpose(s) would be
par for composing an alternate solution.

FYI, I get some warnings when I compile:
/.../boost/iostreams/categories.hpp:102: warning: virtual base
`boost::io::input_seekable' inaccessible in `boost::io::iostream_tag'
due to ambiguity
/.../boost/iostreams/categories.hpp:102: warning: virtual base
`boost::io::device_tag' inaccessible in `boost::io::iostream_tag' due
to ambiguity
/.../boost/iostreams/categories.hpp:102: warning: virtual base
`boost::io::output_seekable' inaccessible in `boost::io::iostream_tag'
due to ambiguity

They look like design warnings for choices not likely to be made, but
I thought I'd pass them along anyway in case you care.

I think this can be fixed by adding a few more 'virtual's. Thanks.

Best Regards,
Jonathan
 
