Portable EOL?

bwaichu

I have written the program below to just create and populate an html
file. I am running into a problem when viewing the created file in vi.
I am told by vi that the file does not have an end of line.

Here's the code:

#include <err.h>      /* errx (BSD) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strlcpy, strlcat (BSD) */
#include <strings.h>  /* bzero */

int
main(int argc, char **argv) {

    FILE *fd;
    int x;

    char buff[8196];
    x = 0;

    bzero(buff, sizeof(buff));

    (void)strlcpy(buff,"<HTML>", sizeof(buff));

    while (x < 1024) {
        (void)strlcat(buff,"A",sizeof(buff));
        x++;
    }
    (void)strlcat(buff, "</HTML>", sizeof(buff));

    fd = fopen("test5.html", "w+");
    if (fd == NULL)
        errx(-1, "failed to open");
    (void)fprintf(fd, "%s", buff);
    (void)fclose(fd);
    exit(EXIT_SUCCESS);
}

Now, vi will not warn about noeol if I change this line:

(void)fprintf(fd, "%s", buff);

to

(void)fprintf(fd, "%s\n", buff);

Where I am confused is that I thought streams automatically terminated
the file with an CR (0a) in a unix environment. What bothers me is
that the above would in theory make the code less portable. Am I
overlooking something?
 
Ben Pfaff

Now, vi will not warn about noeol if I change this line:

(void)fprintf(fd, "%s", buff);

to

(void)fprintf(fd, "%s\n", buff);

Where I am confused is that I thought streams automatically terminated
the file with an CR (0a) in a unix environment.

No, that's just wrong. The last line in a text stream needs to
be explicitly terminated with a new-line character.
 
Barry Schwarz

I have written the program below to just create and populate an html
file. I am running into a problem when viewing the created file in vi.
I am told by vi that the file does not have an end of line.

Here's the code:

int
main(int argc, char **argv) {

FILE *fd;
int x;

char buff[8196];
x = 0;

bzero(buff, sizeof(buff));

(void)strlcpy(buff,"<HTML>", sizeof(buff));

Why a non-standard function? What does it do that you could not do
with strcpy or strncpy? Since the source string is only seven
characters, do you really want to process 8,196?

while (x < 1024) {
(void)strlcat(buff,"A",sizeof(buff));
x++;
}
(void)strlcat(buff, "</HTML>", sizeof(buff));

At this point, buff contains a six character header, 1024 copies of
the letter A, a seven character trailer, a string-terminating '\0', and
some 7000+ characters of no interest. At no point did you ever place
a '\n' in this array.

fd = fopen("test5.html", "w+");
if (fd == NULL)
errx(-1, "failed to open");
(void)fprintf(fd, "%s", buff);

Your file now contains the characters described above, up to but not
including the '\0'.
(void)fclose(fd);
exit(EXIT_SUCCESS);
}

Now, vi will not warn about noeol if I change this line:

(void)fprintf(fd, "%s", buff);

to

(void)fprintf(fd, "%s\n", buff);

Your format string now includes an additional character which will get
written to the file immediately after your data.
Where I am confused is that I thought streams automatically terminated
the file with an CR (0a) in a unix environment. What bothers me is

If that were true, then multiple calls to fprintf could not be used to
build up a line from separate pieces.
that the above would in theory make the code less portable. Am I
overlooking something?

Less portable than what? The \n is C's portable way of indicating
that the output should contain an end of line indicator at this point.
The run time library will perform the magic necessary for your system.
This may be 0a for Unix or 0d 0a (CR LF) for Windows. It
will be something completely different for my IBM mainframe depending
on whether the RECFM is U, V, or F. The point is it is portable. The
code doesn't have to care and the compiler doesn't care much, if at
all.
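
To make both points concrete, here is a minimal sketch. The function names and the sample text are made up for illustration and are not from the original program. The first function builds a single line from three fprintf calls, with only the last one supplying the '\n'. The second counts the raw bytes by reopening the file in binary mode, which is where any translation done by the run time library becomes visible.

```c
#include <stdio.h>

/*
 * Build one line from three separate fprintf calls. Only the last call
 * supplies the '\n'. Returns 0 on success, -1 on failure.
 */
int write_line_in_pieces(const char *path)
{
    FILE *fp = fopen(path, "w");    /* text mode */
    if (fp == NULL)
        return -1;

    int ok = fprintf(fp, "<HTML>") >= 0
          && fprintf(fp, "AAAA") >= 0
          && fprintf(fp, "</HTML>\n") >= 0;

    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Count the raw bytes in a file by reading it in binary mode.
 * Returns -1 if the file cannot be opened.
 */
long count_raw_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long n = 0;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

The program wrote 17 characters plus one '\n'. On a Unix system I would expect count_raw_bytes to report 18, and on a system that stores the end of line as two bytes, 19. The source code is the same either way.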


 
bwaichu

Ben said:
No, that's just wrong. The last line in a text stream needs to
be explicitly terminated with a new-line character.

Thanks. That is where I am wrong.

Are there any good reasons to use streams versus unix I/O besides
portability? I realize that if I use unix I/O I have to do my own
buffering, but that's not a big deal.

Oh, the 8196 buffer was left over from re-writing this from unix I/O to
streams. And strlcat is vastly superior to strcat and strncat.
 
Malcolm

I have written the program below to just create and populate an html
file. I am running into a problem when viewing the created file in vi.
I am told by vi that the file does not have an end of line.

Here's the code:

int
main(int argc, char **argv) {

FILE *fd;
int x;

char buff[8196];
x = 0;

bzero(buff, sizeof(buff));

(void)strlcpy(buff,"<HTML>", sizeof(buff));

while (x < 1024) {
(void)strlcat(buff,"A",sizeof(buff));
x++;
}
(void)strlcat(buff, "</HTML>", sizeof(buff));

fd = fopen("test5.html", "w+");
if (fd == NULL)
errx(-1, "failed to open");
(void)fprintf(fd, "%s", buff);
(void)fclose(fd);
exit(EXIT_SUCCESS);
}

Now, vi will not warn about noeol if I change this line:

(void)fprintf(fd, "%s", buff);

to

(void)fprintf(fd, "%s\n", buff);

Where I am confused is that I thought streams automatically terminated
the file with an CR (0a) in a unix environment. What bothers me is
that the above would in theory make the code less portable. Am I
overlooking something?

Some OSes might be a bit hazy about text files that don't end with a newline
character.
The C standard doesn't specify that fclose() will automatically append a
newline if missing, so the only sensible thing to do is to add it yourself.
All file systems will handle

fp = fopen("temp.txt", "w");
fprintf(fp, "Hello world\n");
fclose(fp);

in a sensible way. If you don't add the newline you may or may not cause
problems.
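
A slightly fuller sketch of "add it yourself", using only standard C. The function name is hypothetical. It writes the text and appends a '\n' only when the text does not already end with one, so the last line is always terminated.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write text to path in text mode, making sure the last line ends with
 * a newline. Empty text produces an empty file.
 * Returns 0 on success, -1 on failure.
 */
int write_text_with_final_newline(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    size_t len = strlen(text);
    int ok = fputs(text, fp) != EOF;

    /* Add the terminating newline only if the caller left it out. */
    if (ok && len > 0 && text[len - 1] != '\n')
        ok = fputc('\n', fp) != EOF;

    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling it with "Hello world" or with "Hello world\n" should produce the same file.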
 
Keith Thompson

Ben Pfaff said:
No, that's just wrong. The last line in a text stream needs to
be explicitly terminated with a new-line character.

More precisely, "Whether the last line requires a terminating new-line
character is implementation-defined." (C99 7.19.2p2)

<OT>
In Unix, there's nothing *inherently* wrong with having a text file
without a terminating new-line, but it's rarely what you want. vi
will most likely complain about it, emacs can be configured to behave
in any of several ways, and other utilities might or might not
misbehave in various ways.
</OT>
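
If you want to check a file yourself rather than rely on an editor's warning, something along these lines should do. It is only a sketch: the function name is made up, and treating an empty file as "complete" is my own choice, not something the standard dictates.

```c
#include <stdio.h>

/*
 * Returns 1 if the file is non-empty and its last character is not '\n',
 * 0 if it is empty or ends with '\n', and -1 if it cannot be opened.
 */
int has_incomplete_last_line(const char *path)
{
    FILE *fp = fopen(path, "r");    /* text mode: line ends read as '\n' */
    int c;
    int last = '\n';                /* treat an empty file as complete */

    if (fp == NULL)
        return -1;
    while ((c = fgetc(fp)) != EOF)
        last = c;
    fclose(fp);
    return last != '\n';
}
```

For the test5.html written with "%s" this should return 1, and 0 once the format is changed to "%s\n".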
 
Keith Thompson

Thanks. That is where I am wrong.

Are there any good reasons to use streams versus unix I/O besides
portability? I realize that if I use unix I/O I have to do my own
buffering, but that's not a big deal.

The question is, are there any good reasons to use Unix I/O rather
than standard C streams?

One good reason *not* to use Unix-specific I/O is that it's less
portable; there are some extra features, but I don't think you're
using any of them. Another good reason is that we don't discuss
system-specific features here.
 
Jack Klein

More precisely, "Whether the last line requires a terminating new-line
character is implementation-defined." (C99 7.19.2p2)

Actually, in this particular case, this is not really relevant. It
has nothing to do with an "ordinary" text file, and everything to do
with the format of the particular file type he is writing.

The file he produces might or might not be a valid text file on his
platform. Even if it is, it is not a valid HTML file because it
violates the HTML standard. That standard, like the C standard for C
source files, requires that the last line of a file have a terminating
newline.
<OT>
In Unix, there's nothing *inherently* wrong with having a text file
without a terminating new-line, but it's rarely what you want. vi
will most likely complain about it, emacs can be configured to behave
in any of several ways, and other utilities might or might not
misbehave in various ways.
</OT>

Right, it is merely some HTML validation tool pointing out a violation
of the HTML standard.
 
bwaichu

Jack said:
Actually, in this particular case, this is not really relevant. It
has nothing to do with an "ordinary" text file, and everything to do
with the format of the particular file type he is writing.

The file he produces might or might not be a valid text file on his
platform. Even if it is, it is not a valid HTML file because it
violates the HTML standard. That standard, like the C standard for C
source files, requires that the last line of a file have a terminating
newline.

My thought that streams automatically terminate with CR's followed by a
NULL value was incorrect. The file definitely does not follow the
requirements for the HTTP Protocol.

Now, since the string is a C string, I would follow the terminating new
line with \0, right?

So following the C standard, I would have:

<text><\n><\0>

where the C standard just requires the string to be terminated with a
NULL value, right?
 
Keith Thompson

My thought that streams automatically terminate with CR's followed by a
NULL value was incorrect. The file definitely does not follow the
requirements for the HTTP Protocol.

Now, since the string is a C string, I would follow the terminating new
line with \0, right?

So following the C standard, I would have:

<text><\n><\0>

where the C standard just requires the string to be terminated with a
NULL value, right?

Um, no.

First of all, you're misusing the word NULL. NULL is a macro, defined
in <stddef.h>, that expands to a null pointer constant. It is *not*
the same thing as a null character, '\0' (sometimes called NUL).

A C string, stored in memory, is terminated by a trailing '\0'
character. A text file consists of a sequence of lines, where each
line is terminated by an end-of-line marker that appears as '\n' when
you read or write it in a C program. Text files normally do not
contain null characters.

For example:

char s[] = "hello, world";
/*
* The compiler implicitly adds a '\0' to the end of s, making it
* a valid string.
*/
fprintf(some_file, "%s\n", s);
/*
* This writes the characters of s, *not* including the trailing
* '\0', to some_file. The added '\n' makes it a line.
*/
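
To check this rather than take it on faith, here is a small sketch with two hypothetical helpers. The first writes a string as a line and returns what fprintf reports. The second rewinds the stream and counts any null characters that ended up in it.

```c
#include <stdio.h>

/*
 * Write s followed by a newline. Returns fprintf's result: the number of
 * characters transmitted, or a negative value on error.
 */
int write_string_as_line(FILE *fp, const char *s)
{
    return fprintf(fp, "%s\n", s);
}

/* Rewind fp and count the null characters stored in it. */
long count_null_chars(FILE *fp)
{
    long nulls = 0;
    int c;

    rewind(fp);
    while ((c = fgetc(fp)) != EOF)
        if (c == '\0')
            nulls++;
    return nulls;
}
```

With a stream from tmpfile() and the string "hello, world", the first call should return 13 (twelve characters plus the '\n') and the second should return 0, since the string's terminating '\0' is never written.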
 
spibou

Where I am confused is that I thought streams automatically terminated
the file with an CR (0a) in a unix environment.

CR short for carriage return is 0c in hexadecimal. NL short
for newline or LF short for linefeed is 0a. And as others have
pointed out neither is guaranteed to be the last byte in a Unix
file. Try typing
cat > some-file
type a few characters and press Control-D twice. You'll get a
file without a newline in the end.

By the way, all of the above is off-topic here.
 
Joe Wright

CR short for carriage return is 0c in hexadecimal. NL short
for newline or LF short for linefeed is 0a. And as others have
pointed out neither is guaranteed to be the last byte in a Unix
file. Try typing
cat > some-file
type a few characters and press Control-D twice. You'll get a
file without a newline in the end.

By the way, all of the above is off-topic here.

Both spibou and bwaichu seem unable to read a simple ASCII code chart.
The CR character is 0D (13). The Unix newline char NL is (10). It is the
formfeed character FF which is 0C or (12).
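
For anyone who would rather ask the compiler than a chart, a sketch follows. The function name is made up. It formats the values the implementation uses for the '\r', '\n' and '\f' escapes. C itself does not require ASCII, so the values in the note below assume an ASCII-based system.

```c
#include <stdio.h>

/*
 * Format the numeric values of the '\r', '\n' and '\f' escapes into buf
 * as two-digit hexadecimal. Returns snprintf's result.
 */
int format_control_codes(char *buf, size_t size)
{
    return snprintf(buf, size, "CR=%02X LF=%02X FF=%02X",
                    (unsigned)'\r', (unsigned)'\n', (unsigned)'\f');
}
```

On an ASCII system this should fill buf with "CR=0D LF=0A FF=0C".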
 
bwaichu

Joe said:
Both spibou and bwaichu seem unable to read a simple ASCII code chart.
The CR character is 0D (13). The Unix newline char NL is (10). It is the
formfeed character FF which is 0C or (12).

What's the deal with the insults? If we had all the answers, we wouldn't
be asking questions.
And NL is 0a.

Here's the ASCII table:

http://www.lookuptables.com/

I made a typo. Just like I wrote NULL instead of NUL. Typos happen.
That is why editors have jobs.

Jeez.
 
Joe Wright

What's the deal with the insults? If we had all the answers, we wouldn't
be asking questions.
And NL is 0a.

Here's the ASCII table:

http://www.lookuptables.com/

I made a typo. Just like I wrote NULL instead of NUL. Typos happen.
That is why editors have jobs.

I didn't mean it as a personal insult. I was correcting errors. I don't
like to correct careless errors, typographical or otherwise. I expect
you to proof-read your article and correct your own errors as best you
can before posting it. Too much to expect?

I wrote NL is (10) and you correct it to 0a? What's that about? As to
correctness, 10 is ten. 0x0a is ten. 012 is ten.
 
