How/Where is a Stream Stored?

Randy Kramer

Background: In order to do the parsing I've talked about in another thread, in
many circumstances I need to know the number of spaces before and after the
current token. I'm trying to think about efficient ways to do that. One
might be to do a preprocessing pass through the text to figure out how many
spaces separate the various tokens, then store the tokens and the spaces
between them in a temporary in-memory data structure. Otherwise, I'll need a
way to backtrack from the found position of some token to find how many
spaces separate it from the previous token.

I'm thinking that maybe a stream ("on" the input file?) might be a way to do
the backtracking (by moving the pos back from the current position, either
one character at a time or several (and then read forward to the first
space)).

I'm wondering how efficient an operation that is. Are such stream operations
performed on the disk file itself, or is the stream somehow buffered in
memory and the operations performed there? (Or am I hopelessly
confused? ;-)

Randy Kramer

Aside: In another thread I'm going to ask about efficient storage for the
other alternative.
 
Robert Klemme

Randy Kramer said:
I'm wondering how efficient an operation that is--are such stream operations
performed on the disk file itself, or is the stream somehow buffered in
memory and the operations performed there. (Or, am I hopelessly
confused? ;-)

:)

If your files aren't big, then I guess the most efficient way is to slurp
them into memory and use String#scan on that, especially since the
substrings share the same string buffer underneath, so you get essentially
just one copy of the file in memory, AFAIK.
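
A minimal sketch of the slurp-and-scan idea, assuming the whole input fits in
memory. The Token struct, the helper name tokens_with_spacing, and the regular
expression are made up for illustration; only String#scan itself is standard
Ruby. Each token is recorded together with the number of spaces directly
before and after it on the same line.

```ruby
# Hypothetical record type: a token plus the spaces around it.
Token = Struct.new(:text, :spaces_before, :spaces_after)

# One pass over the text with String#scan.
#   ( *)        spaces directly before the token
#   (\S+)       the token itself (a run of non-whitespace characters)
#   (?=( *))    lookahead: spaces directly after it, captured but not
#               consumed, so they also count as "before" for the next token
def tokens_with_spacing(text)
  text.scan(/( *)(\S+)(?=( *))/).map do |before, word, after|
    Token.new(word, before.length, after.length)
  end
end

# In practice the text would come from something like File.read(path).
text = "foo  bar   baz\n qux"
tokens_with_spacing(text).each do |t|
  puts "#{t.spaces_before} #{t.text.inspect} #{t.spaces_after}"
end
```

Only literal space characters are counted here; tabs and newlines simply end
a run of spaces, which may or may not be what the parser needs.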

Kind regards

robert
 
Guillaume Marcais

I'm thinking that maybe a stream ("on" the input file?) might be a way to do
the backtracking (by moving the pos back from the current position, either
one character at a time or several (and then read forward to the first
space)).

It depends on the type of stream. You can backtrack easily with a file,
but you can't with a non-seekable stream (like stdin, a network socket,
etc.). So using IO#seek would prevent your program from working as a UNIX
filter (reading from stdin, writing to stdout). That may or may not be a
big deal to you; your call...

I'm wondering how efficient an operation that is--are such stream operations
performed on the disk file itself, or is the stream somehow buffered in
memory and the operations performed there. (Or, am I hopelessly
confused? ;-)

All disk operations on recent OSes are cached. Backtracking a small amount
is unlikely to generate any disk activity.
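
A rough sketch of what the backtracking could look like, with a check for
streams that cannot seek. The helpers spaces_before and seekable? are
hypothetical names, not part of Ruby; IO#pos, IO#pos=, IO#read, IO#seek and
Tempfile.create are standard. It assumes the token's byte offset is already
known and that spaces are single bytes.

```ruby
require "tempfile"

# Count the spaces immediately before byte offset token_pos by moving the
# stream position back one byte at a time. Requires a seekable stream.
def spaces_before(io, token_pos)
  saved = io.pos
  count = 0
  pos = token_pos
  while pos > 0
    io.pos = pos - 1                 # step back one byte
    break unless io.read(1) == " "   # stop at the first non-space
    count += 1
    pos -= 1
  end
  io.pos = saved                     # restore the caller's position
  count
end

# Probe whether the stream supports seeking. On pipes and sockets the
# underlying call fails (typically with Errno::ESPIPE).
def seekable?(io)
  io.seek(0, IO::SEEK_CUR)
  true
rescue SystemCallError
  false
end

Tempfile.create("stream_demo") do |f|
  f.write("alpha   beta")
  f.flush
  f.rewind
  p spaces_before(f, 8)   # "beta" starts at byte 8, so this prints 3
  p seekable?(f)
end

reader, _writer = IO.pipe
p seekable?(reader)       # a pipe, so seeking is expected to fail
```

A program that should also work as a filter could test seekable? first and
fall back to reading the whole input into a string when it returns false.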

Guillaume.
 
Randy Kramer

Guillaume Marcais said (in reply to Randy Kramer's post of Thu, 2005-03-03
at 05:26 +0900):
It depends on the type of stream. You can backtrack easily with a file,
but you can't with a non-seekable stream (like stdin, a network socket,
etc.). So using IO#seek would prevent your program from working as a UNIX
filter (reading from stdin, writing to stdout). That may or may not be a
big deal to you; your call...
...

All disk operations on recent OSes are cached. Backtracking a small amount
is unlikely to generate any disk activity.

Guillaume,

Thanks!

Randy Kramer
 
