fseek

  • Thread starter Christopher Benson-Manica

Alan Balmer

Entirely irrelevant, for both kinds of streams. If you attach a binary
stream to a text file, all the bets are off: you may not see a single
newline character in the whole file (implementations storing each line
of text in a variable size record typically don't bother to store the
newline character at all: it is implied, after the last character of the
record).
Which was exactly the point. There's more involved in seeking a text
stream than simply counting characters. In other words, the OP's
implication that the different treatment of text streams in the
standard is simply an "inconvenience" to implementors is incorrect.

Not irrelevant at all
 

Alan Balmer

Quite right. It does not say that. If the writers had wanted to say
that, I suspect they would have.
"Whereas a binary file can be treated as an ordered sequence of bytes
counting from zero, a text file need not map one-to-one to its
internal representation (see 7.19.2). Thus, only seeks to an earlier
reported position are permitted for text files. [...]"

And neither does this. However, this does say explicitly what I was
trying to convey to the OP. Thank you.

Depends on what you mean by "work." I would expect it to "work" on
any implementation. I wouldn't, however, try to predict the
relationship between a given offset in a binary file and the same
offset in a text file.

I don't have a long list of applications for the procedure, either,
but I don't see that it's prohibited by the standard.

Remember that the difference between text files and binary files (if
any) is a characteristic of the implementation. Text files can always
be treated as binary files, but the reverse is not generally true, and
even if you know it's a text file, interpreting it as text from a
binary stream is not portable.
 

Irrwahn Grausewitz

Alan Balmer said:
Quite right. It does not say that. If the writers had wanted to say
that, I suspect they would have.

Maybe someone would like to generate a DR about this.
"Whereas a binary file can be treated as an ordered sequence of bytes
counting from zero, a text file need not map one-to-one to its
internal representation (see 7.19.2). Thus, only seeks to an earlier
reported position are permitted for text files. [...]"

And neither does this.

Not explicitly, but IMHO implicitly. Otherwise the "Thus, ..." part
makes no sense. However, it's quoted from the Rationale, not the
Standard.

However, this does say explicitly what I was
trying to convey to the OP. Thank you.
Depends on what you mean by "work." I would expect it to "work" on
any implementation. I wouldn't, however, try to predict the
relationship between a given offset in a binary file and the same
offset in a text file.

That's an, err, interesting definition of "work". Anyway, if you are
aware of these facts, why did you suggest upthread:

AB> However, you can open the same file as binary, then use the
AB> results of fseek and ftell to position the text file.

as if this procedure would do anything useful?

IOW: you at least forgot to add a disclaimer. ;-)

Regards
 

Alan Balmer

That's an, err, interesting definition of "work".

What's yours?
You can write the code, any compiler should compile it, and you can
execute it. It will work, whether the results are useful or not. I
wish that I could claim that all the work I've ever done turned out to
be useful ;-)

Anyway, if you are
aware of these facts, why did you suggest upthread:

AB> However, you can open the same file as binary, then use the
AB> results of fseek and ftell to position the text file.

as if this procedure would do anything useful?

IOW: you at least forgot to add a disclaimer. ;-)

It wasn't a suggestion, but a comment. After all, the OP wasn't
looking for suggestions, but commenting on how he thought things
*should* work, as opposed to how they *do* work.

In the course of the thread, someone brought up the fact that offsets
are counted as characters for binary files, but nothing in particular
for text files. While commenting on that, I mentioned that a file can
be opened and positioned as binary, and that the standard apparently
allows that position to be used in fseek for the same file opened as
text. I find that interesting, and could probably even invent a
legitimate use for it (investigating the actual structure of a text
file, perhaps.)

I suggest that you reread the thread from the beginning, remembering
that my prose generation is not always the best, and I may sometimes
mistakenly assume that the reader is thinking in the same twisted
direction that I am :)
 

Irrwahn Grausewitz

I suggest that you reread the thread from the beginning, remembering
that my prose generation is not always the best, and I may sometimes
mistakenly assume that the reader is thinking in the same twisted
direction that I am :)

The root of misinterpretation that may be, yes :)
 

Dan Pop

In said:
7.19.9.2

"For a text stream, either offset shall be zero, or offset shall be a
value returned by an _earlier successful call to the ftell function on
a stream associated with the same file_ and whence shall be SEEK_SET."

Emphasis added.

The other stream *must* be a text stream too. Connect a binary stream to
a text file and all the bets are off.

Dan
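
The case that 7.19.9.2 does cover can be sketched in a few lines: an offset obtained from ftell on a text stream, passed back to fseek with SEEK_SET on that same stream. The helper name `reread_from_mark` and its internal buffer size are illustrative assumptions, not anything from the thread or the standard.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: skips one line, marks the position with ftell,
 * reads the next line, seeks back to the mark and reads that line again.
 * Returns 1 if both reads match, 0 if they differ, -1 on any I/O error.
 * Assumes each line, including its newline, fits in the buffers used. */
int reread_from_mark(FILE *text_stream, char *line, int size)
{
    char again[256];
    long mark;

    if (size > (int)sizeof again)
        size = (int)sizeof again;
    if (size < 2)
        return -1;

    if (fgets(line, size, text_stream) == NULL)   /* skip the first line */
        return -1;
    mark = ftell(text_stream);                    /* opaque position token */
    if (mark == -1L)
        return -1;
    if (fgets(line, size, text_stream) == NULL)   /* read the second line */
        return -1;
    /* Permitted on a text stream: an earlier ftell value with SEEK_SET. */
    if (fseek(text_stream, mark, SEEK_SET) != 0)
        return -1;
    if (fgets(again, size, text_stream) == NULL)  /* read it once more */
        return -1;
    return strcmp(line, again) == 0;
}
```

The value in `mark` is never inspected or adjusted; it is only handed back to fseek, which is all the quoted wording promises for text streams.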
 

Dan Pop

Alan Balmer said:
Which was exactly the point. There's more involved in seeking a text
stream than simply counting characters.

And yet, you claim that you can use character offsets as arguments to a
fseek call on a text stream.

Dan
 

Alan Balmer

And yet, you claim that you can use character offsets as arguments to a
fseek call on a text stream.

Of course you can. You can also use random numbers. I don't claim to
be able to portably relate the results to anything in particular.
 

Dan Pop

Alan Balmer said:
Chapter and verse, please.

It's an obvious bug in the standard. Ask in comp.std.c if you don't
believe me.

Binary files and text files can have completely different internal
representations and the standard doesn't provide any guarantee about
what happens when you open a text file in binary mode or vice versa.
It only addresses the cases when a binary stream is attached to a binary
file and a text stream to a text file.

Dan
 

Dan Pop

Alan Balmer said:
Of course you can. You can also use random numbers.

The idea was to be able to seek to well defined positions inside the file.

I don't claim to
be able to portably relate the results to anything in particular.

In general, the results of ftell on a binary stream are
useless/meaningless to any text stream on the same implementation.

The thing you don't get is that the encoding of the file position returned
by ftell on a text stream need not be a plain byte offset. It could be
a record number and a byte offset inside the record.

Dan
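
To make the "record number plus byte offset" idea concrete, here is a minimal sketch of how an implementation might pack both into the long that ftell returns. The layout (65536 offsets per record) and the function names are purely hypothetical; no real C library is being described.

```c
#include <assert.h>

/* Hypothetical layout: position = record * OFFSETS_PER_RECORD + offset. */
#define OFFSETS_PER_RECORD 65536L
#define MAX_RECORD         32767L   /* keeps the result within a 32-bit long */

/* Packs a record number and a byte offset inside that record. */
long encode_position(long record, long offset)
{
    assert(record >= 0 && record <= MAX_RECORD);
    assert(offset >= 0 && offset < OFFSETS_PER_RECORD);
    return record * OFFSETS_PER_RECORD + offset;
}

/* Recovers the record number from a packed position. */
long position_record(long pos)
{
    return pos / OFFSETS_PER_RECORD;
}

/* Recovers the byte offset within the record from a packed position. */
long position_offset(long pos)
{
    return pos % OFFSETS_PER_RECORD;
}
```

Under this assumed layout, byte 17 of record 3 is reported as 196625, a number that has nothing to do with how many bytes precede that character in the file, which is why a byte count taken from a binary stream would be meaningless to such a text stream.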
 

Eric Sosman

Dan Pop said:
Binary files and text files can have completely different internal
representations and the standard doesn't provide any guarantee about
what happens when you open a text file in binary mode or vice versa.
It only addresses the cases when a binary stream is attached to a binary
file and a text stream to a text file.

I don't think the notions of "text file" and "binary
file" as disjoint categories will withstand scrutiny. At
least, I've never seen any attempted definitions that were
beyond reproach.

The Standard says very little about "text files" and
"binary files," and that seems a wise choice. About all
we can infer is that a "text file" is an entity suitable
for access via a text stream, while a "binary file" is
suited to binary streams. Some kinds of files may work
with both kinds of streams, and some kinds of files may
work with neither -- and the implementation's whim rules
the day.
 

Alan Balmer

The idea was to be able to seek to well defined positions inside the file.


In general, the results of ftell on a binary stream are
useless/meaningless to any text stream on the same implementation.

The thing you don't get is that the encoding of the file position returned
by ftell on a text stream need not be a plain byte offset. It could be
a record number and a byte offset inside the record.

Dan

Sorry to dispel your illusion of omniscience (do you fancy yourself a
telepath?), but you have no idea what I "get" and "don't get." I not
only "get" that possibility, I have designed storage systems using
similar methodology. What *you* don't "get" (or at least want to argue
about) is that any file may be opened in binary mode, and a character
offset generated by fseek/ftell. The standard does not preclude using
that offset in fseek on a text file. Obviously, as I stated
previously, the standard says nothing about the usefulness of doing
this. You say this is an "obvious bug" in the standard. I suggest you
submit it for correction.
 

Dan Pop

Eric Sosman said:
I don't think the notions of "text file" and "binary
file" as disjoint categories will withstand scrutiny. At
least, I've never seen any attempted definitions that were
beyond reproach.

In the context of the C programming language, a text file is a file
created via a text stream and a binary file is a file created via a
binary stream. No guarantees for files created by other means than
correct C programs.

Dan
 

Glen Herrmannsfeldt

Dan Pop said:
(snip)


In the context of the C programming language, a text file is a file
created via a text stream and a binary file is a file created via a
binary stream. No guarantees for files created by other means than
correct C programs.

That may be true, but I would be disappointed to find a system where a C
text stream could not read the text files commonly available on that system,
such as those produced by the system's normal text editor(s).

For binary files, though, one may find that the only way to create a binary
file that C programs can read is one written by a C program. In many
languages, this is normal for binary files. In C it is often expected that
one can read any binary file, written by any program in any language, though
that may not be true on all systems.

Any system that normally stores text files using less than eight bits per
character would make it hard for text and binary files to be compatible.

-- glen
 

lawrence.jones

Alan Balmer said:
What *you* don't "get" (or at least want to argue
about) is that any file may be opened in binary mode

That is true on many systems, but it is not guaranteed by the standard.

-Larry Jones

At times like these, all Mom can think of is how long she was in
labor with me. -- Calvin
 

lawrence.jones

Alan Balmer said:
Chapter and verse, please.

See 7.19.5.3 (fopen) -- the standard does not provide any way to connect
a text stream to a binary file or vice versa. Attempting to open a
binary file in text mode or vice versa results in undefined behavior.

-Larry Jones

In my opinion, we don't devote nearly enough scientific research
to finding a cure for jerks. -- Calvin
 

Irrwahn Grausewitz

See 7.19.5.3 (fopen) -- the standard does not provide any way to connect
a text stream to a binary file or vice versa. Attempting to open a
binary file in text mode or vice versa results in undefined behavior.

Seems to be the appropriate section, yes; but re-reading it I
stumbled over something I've overlooked till now:

7.19.5.3p6
[...]
Opening (or creating) a text file with update mode may instead
open (or create) a binary stream in some implementations.

I'm not sure if this can be used to form an argument in the debate
about the relationship of binary and text files, and if so, pro or
contra which side; I'm just puzzled that such a strange behaviour
is sanctioned by the standard...

Regards
 

Alan Balmer

See 7.19.5.3 (fopen) -- the standard does not provide any way to connect
a text stream to a binary file or vice versa. Attempting to open a
binary file in text mode or vice versa results in undefined behavior.

This paragraph only describes the mode that the stream is opened in,
saying nothing about the files themselves. It does not say that
opening a file in the "wrong" mode results in undefined behavior. The
mode is important to the user, since in text mode it tells the I/O
implementation that it must look for, and possibly map, newline
characters. In binary mode, newline characters are treated the same as
any other character.
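
A small sketch of that difference, under the assumption that the implementation lets the same file be opened both ways (as discussed in this thread, the standard does not promise that). The function names and the idea of comparing the two counts are illustrative only.

```c
#include <stdio.h>

/* Counts the characters delivered by a stream opened with `mode`.
 * Returns -1 if the file cannot be opened or a read error occurs. */
long count_chars(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    long count = 0;

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        count++;
    if (ferror(fp))
        count = -1;
    fclose(fp);
    return count;
}

/* Writes `text` through a text stream so that any newline mapping the
 * implementation performs is applied. Returns 0 on success, -1 on error. */
int write_as_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

After `write_as_text(path, "one\ntwo\n")`, `count_chars(path, "r")` should report the 8 characters that were written, while `count_chars(path, "rb")` reports whatever the implementation actually stored: typically the same 8 on Unix-like systems, and typically more where a newline is stored as two bytes.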

Opening a text file in binary mode is perfectly legitimate - in fact
the standard provides no way to distinguish between a binary file and
a text file. Refer to 7.19.2, where the two types of streams are
defined. Now, consider a "text" file containing one "line." The thing
that makes it a text file is that each "line" has a terminating
newline character. But the standard says that the last line need not
have the terminating newline (it's implementation-defined.) How can
this file be distinguished from a binary file? What will the
implementation do to me if I open it in binary mode?

On the other hand, presume that I have a binary file, say an
executable program. This file may contain numerous instances of a
character with the value 0xA, which happens to be the newline
character on the system I'm using now. Does that make it a text file?
Obviously not, but it may meet all the criteria for opening as a text
stream.

Perhaps Dan is right, and the writers made a mistake. I'm in no
position to make a judgment on that.
 
