filebuf

davidrubin

Suppose I have a filebuf attached to a file of size 2K in write-only,
binary mode. The filebuf contains a 1K buffer. If I pubseekpos to
position 1024, what is in the buffer? The next 1024 bytes? The previous
1024 bytes? Something else? Thanks. /david
 
Shezan Baig

It doesn't contain anything. pubseekpos just sets the pointer. At
least you cannot depend on the buffer containing anything. When you
start *reading* from the buffer, then it will start filling the *next*
1024 bytes.

Check out
http://www.cplusplus.com/ref/iostream/streambuf/pubseekpos.html

-shez-
 
davidrubin

Shezan said:
It doesn't contain anything. pubseekpos just sets the pointer. At
least you cannot depend on the buffer containing anything. When you
start *reading* from the buffer, then it will start filling the *next*
1024 bytes.


Well, if the filebuf is empty, the read will cause an underflow. At
this point the filebuf has to fetch data from the file before it can
service the read call (e.g., sgetc, sgetn). Presumably, it will fill
the buffer on underflow.

The question is really whether the buffer is pre-fetched when open is
called, or whether the filebuf is lazy and underflows the first read
attempt. /david
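
One way to settle the pre-fetch question for a particular library is to measure it. The following is a minimal sketch, assuming a hypothetical ProbeBuf class that derives from std::filebuf only to expose the size of the get area (egptr() - gptr()). The names ProbeBuf, ProbeResult, makeFile and probe are made up for illustration, and the numbers reported are specific to whichever implementation runs the code, not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: a filebuf that reports how many characters
// currently sit in its get area.
class ProbeBuf : public std::filebuf {
public:
    std::ptrdiff_t buffered() const {
        return gptr() ? egptr() - gptr() : 0;
    }
};

// Get-area sizes observed at three points; -1 means "not measured".
struct ProbeResult {
    std::ptrdiff_t afterOpen;
    std::ptrdiff_t afterSeek;
    std::ptrdiff_t afterRead;
};

// Writes `size` bytes to `path` so there is something to read.
inline bool makeFile(const std::string& path, std::size_t size) {
    assert(size > 0);
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    out << std::string(size, 'x');
    out.close();
    return !out.fail();
}

// Opens `path` for reading, seeks to `seekTo`, peeks one character,
// and records the get-area size after each step.
inline ProbeResult probe(const std::string& path, std::streamoff seekTo) {
    ProbeResult r = {-1, -1, -1};
    ProbeBuf fb;
    if (!fb.open(path.c_str(), std::ios::in | std::ios::binary))
        return r;
    r.afterOpen = fb.buffered();   // non-zero would indicate a pre-fetch on open
    fb.pubseekpos(std::streampos(seekTo), std::ios::in);
    r.afterSeek = fb.buffered();   // non-zero would indicate a pre-fetch on seek
    fb.sgetc();                    // peek: forces underflow if the get area is empty
    r.afterRead = fb.buffered();
    return r;
}
```

Calling makeFile("probe_test.bin", 2048) and then probe("probe_test.bin", 1024) and printing the three fields shows whether the library at hand fills the buffer on open, on seek, or only on the first read.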
 
Shezan Baig

I don't think it's defined. It shouldn't really matter - why are you
accessing the buffer directly anyway?

-shez-
 
davidrubin

Shezan Baig wrote:

[snip]
I don't think it's defined. It shouldn't really matter - why are you
accessing the buffer directly anyway?

I'm accessing the filebuf. The problem is this: I have a
PersistentQueue implemented over a file. The queue has the following
format on disk:

+--------------------+
| nlr, lrl, npr, lpr |  header (num logical records, length)
+--------------------+
| record 1           |  first processed record
+--------------------+
| record 2           |
+--------------------+
| ...                |
+--------------------+
| record i           |  last processed record
+--------------------+
| record i + 1       |  next pending record (npr = front)
+--------------------+
| ...                |
+--------------------+
| record n           |  last pending record (lpr = back)
+--------------------+

Now, suppose I want to provide iterators:

QueueIter beginProcessed();
QueueIter endProcessed();

QueueIter beginPending();
QueueIter endPending();

Each iterator needs to have its own file representation (e.g., filebuf)
because they are independently incremented and dereferenced (returning
a copy of the current record). If every filebuf pre-fetches 2K of data,
this implementation can be very inefficient. /david
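
The per-iterator cost can be bounded regardless of what the library does on open by deferring the open until the first dereference. The following is a rough sketch, not the actual QueueIter: the class name RecordCursor, the fixed record length, and the headerBytes and recordLen parameters are all assumptions made for illustration, since the real on-disk layout is only outlined above.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical cursor over fixed-length records. Each cursor owns its
// own filebuf, but opens it only when a record is actually read.
class RecordCursor {
public:
    RecordCursor(const std::string& path, std::size_t headerBytes,
                 std::size_t recordLen, std::size_t index)
        : path_(path), headerBytes_(headerBytes),
          recordLen_(recordLen), index_(index) {
        assert(recordLen_ > 0);
    }

    // Copying a cursor copies the position only; the copy gets a
    // fresh, unopened filebuf (std::filebuf itself is not copyable).
    RecordCursor(const RecordCursor& o)
        : path_(o.path_), headerBytes_(o.headerBytes_),
          recordLen_(o.recordLen_), index_(o.index_) {}

    RecordCursor& operator++() { ++index_; return *this; }
    bool operator==(const RecordCursor& o) const { return index_ == o.index_; }
    bool operator!=(const RecordCursor& o) const { return !(*this == o); }

    // Returns a copy of the current record, or an empty string on failure.
    std::string operator*() {
        if (!file_.is_open() &&
            !file_.open(path_.c_str(), std::ios::in | std::ios::binary))
            return std::string();
        const std::streamoff off =
            static_cast<std::streamoff>(headerBytes_ + index_ * recordLen_);
        if (file_.pubseekpos(std::streampos(off), std::ios::in) ==
            std::streampos(std::streamoff(-1)))
            return std::string();
        std::string rec(recordLen_, '\0');
        const std::streamsize n =
            file_.sgetn(&rec[0], static_cast<std::streamsize>(recordLen_));
        if (n != static_cast<std::streamsize>(recordLen_))
            return std::string();   // short read: past the last record
        return rec;
    }

    bool hasOpenFile() const { return file_.is_open(); }

private:
    std::string path_;
    std::size_t headerBytes_;
    std::size_t recordLen_;
    std::size_t index_;
    std::filebuf file_;
};
```

With this shape, an end iterator that is only ever compared never touches the file at all, so any buffer fill is paid only by cursors that are dereferenced.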
 
Shezan Baig

I don't think 'pubseekpos' will "pre-fetch" any data. It just sets the
position of the file pointer (something like 'fseek' in C).
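
To illustrate the 'fseek'-like behaviour, here is a small sketch that seeks and then reads the character at the new position. It opens the file for reading rather than in the write-only mode of the original question, because a read is the simplest way to observe the position. The helper names writeBlocks and charAt are made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>

// Creates a file made of `blocks` 1K blocks; block 0 is all 'a',
// block 1 is all 'b', and so on.
inline void writeBlocks(const std::string& path, int blocks) {
    assert(blocks > 0 && blocks <= 26);
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    for (int b = 0; b < blocks; ++b)
        out << std::string(1024, static_cast<char>('a' + b));
}

// Seeks to `offset` and returns the character found there,
// or -1 if the open, the seek, or the read fails.
inline int charAt(const std::string& path, std::streamoff offset) {
    typedef std::filebuf::traits_type traits;
    std::filebuf fb;
    if (!fb.open(path.c_str(), std::ios::in | std::ios::binary))
        return -1;
    if (fb.pubseekpos(std::streampos(offset), std::ios::in) ==
        std::streampos(std::streamoff(-1)))
        return -1;
    const traits::int_type c = fb.sbumpc();   // the first read after the seek
    return traits::eq_int_type(c, traits::eof()) ? -1 : c;
}
```

After writeBlocks("blocks.bin", 2), charAt("blocks.bin", 1024) returns 'b', the first byte of the second block: the read that follows the seek starts at the new position and delivers what comes next in the file, not the preceding 1024 bytes.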
 
davidrubin

Shezan said:
I don't think 'pubseekpos' will "pre-fetch" any data. It just sets the
position of the file pointer (something like 'fseek' in C).

What about 'open'? Like you said earlier, this behavior is not defined
in the standard. However, it makes a big difference from a practical
perspective. But I think there is a reasonable solution: if it turns
out that, say, most implementations read 2K bytes into the filebuf on
'open', you can replace the filebuf buffer (via 'pubsetbuf') after
construction, but before calling 'open', and then replace it again when
you really want to read from the file. /david
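
The first half of that idea can be sketched as follows. The standard guarantees only that pubsetbuf(0, 0) called before any I/O makes the stream unbuffered; installing a caller-supplied buffer, or calling pubsetbuf again once the file is open, has implementation-defined results, so the second helper and the "replace it again later" step are assumptions to verify against your library. The helper names openUnbuffered and openWithBuffer are made up for this example.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>

// Opens `path` for binary reading with buffering switched off, so an
// implementation has no buffer to pre-fetch into.
inline bool openUnbuffered(std::filebuf& fb, const std::string& path) {
    assert(!fb.is_open());      // pubsetbuf must come before open()
    fb.pubsetbuf(nullptr, 0);
    return fb.open(path.c_str(), std::ios::in | std::ios::binary) != nullptr;
}

// Opens `path` with a caller-owned buffer of `size` bytes. Whether the
// buffer is actually used is implementation-defined. `buf` must outlive
// `fb` or the next pubsetbuf call.
inline bool openWithBuffer(std::filebuf& fb, const std::string& path,
                           char* buf, std::streamsize size) {
    assert(!fb.is_open());
    assert(buf != nullptr && size > 0);
    fb.pubsetbuf(buf, size);
    return fb.open(path.c_str(), std::ios::in | std::ios::binary) != nullptr;
}
```

Reads through an unbuffered filebuf still return the correct data; they simply go to the file more often, which is the trade-off to weigh against any one-time pre-fetch.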
 
Shezan Baig

I really think this is overkill and you will probably not gain anything
from it. Firstly, if it is not defined in the standard whether the
buffer will be filled on 'open' (not defined here implies "not
required"), then it is unlikely that an implementation will do the
extra work (since it is not required). Secondly, when you 'open' a
file, it has to do so much other work that even if it *did* pre-fetch
the 2K buffer (which I doubt it will), the overhead of the fetch would
be negligible.
 
davidrubin

Shezan said:
I really think this is overkill and you will probably not gain anything
from it. Firstly, if it is not defined in the standard whether the
buffer will be filled on 'open' (not defined here implies "not
required"), then it is unlikely that an implementation will do the
extra work (since it is not required).

That is pure speculation. It certainly makes sense to fill the buffer
(up to EOF) when you open the file. Or you could be lazy and fill it on
the first read. In some sense the former is a better choice because the
latency is localized to where the file is opened.

Shezan said:
Secondly, when you 'open' a
file, it has to do a lot of other stuff that even if it *did* pre-fetch
the 2K buffer (which I doubt it will), the overhead of the fetch would
be negligible.

I disagree. Opening the file requires only that you allocate some file
structure and access the filesystem table (superblock, etc). Reading
the file, especially 2K of data, may require several disk hits (for example,
files may be partitioned in 512B blocks), plus there is potential for
memory swapping, etc.

/david
 
