Old question?

mdh

Hi Group,
At the risk of being very repetitive, may I ask this. K&R 1-23 has been
extensively covered in the archives. I have read all these and the
answer offered by Tondo/Gimpel.
In seeking to eliminate comments from a valid C program, special
attention is directed at the double-forward slash (//). My question is
when would this arise? If it appears within double-quotes or
single-quotes it should be handled there, and if it appears thus:
/* stuff here */ /* more stuff here */
then it should be handled as 2 different comments?

I may have missed it in some of the previous discussions, and if so, I
will happily be directed there.

Thank you.
 
Keith Thompson

mdh said:
At the risk of being very repetitive, may I ask this. K&R 1-23 has been
extensively covered in the archives. I have read all these and the
answer offered by Tondo/Gimpel.
In seeking to eliminate comments from a valid C program, special
attention is directed at the double-forward slash (//). My question is
when would this arise? If it appears within double-quotes or
single-quotes it should be handled there, and if it appears thus:
/* stuff here */ /* more stuff here */
then it should be handled as 2 different comments?

I may have missed it in some of the previous discussions, and if so, I
will happily be directed there.

What is K&R 1-23? My copies of K&R (both editions) are several miles
away at the moment.

C99 (which neither edition of K&R covers) adds C++-style (or
BCPL-style for software archeologists) comments, introduced by // and
terminated by the end of the line. There are some very obscure cases
where the introduction of // comments causes a valid C90 program to
become a valid C99 program with a different meaning. I don't remember
the exact details, but it involves a division operator immediately
followed by an old-style /* ... */ comment, with the following line
carefully constructed to make the program legal whether the first '/'
is a division operator or the first character of a // comment
delimiter.

If a /* or // appears within a string or character literal, it doesn't
introduce a comment. Your example:
/* stuff here */ /* more stuff here */
is obviously two comments, as is this:
/* stuff here *//* more stuff here */

Beyond that, I'm not sure what you're asking.
 
mdh

Keith said:
What is K&R 1-23?


Write a program to remove all comments from a C program. Don't forget
to handle quoted strings and character constants properly. C comments
do not nest.
There are some very obscure cases
............. I don't remember the exact details.....involves a division operator immediately
followed by an old-style /* ... */ comment, with the following line
carefully constructed to make the program legal whether the first '/'
is a division operator or the first character of a // comment
delimiter.



Hi Keith,

Well, that makes sense. I will play around with it. Thank you.
 
Keith Thompson

mdh said:
Write a program to remove all comments from a C program. Don't forget
to handle quoted strings and character constants properly. C comments
do not nest.

The exercise refers to a version of the C language that doesn't
support // comments -- though of course you can handle them in your
own program if you wish. You might even consider having a
command-line option to tell your program whether to handle // comments.
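
A minimal sketch of that suggestion, assuming the whole source text is already in memory as a string. The function name strip_comments, its in-memory interface, and the slash_slash flag (which a command-line option could set) are all made up for illustration; none of them comes from K&R or the answer book.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: copy src to dst with comments removed.
 * Each old-style comment becomes one space. If slash_slash is nonzero,
 * // comments are removed too. dst needs room for strlen(src) + 1 bytes.
 * Returns the number of characters written, not counting the '\0'. */
size_t strip_comments(const char *src, char *dst, int slash_slash)
{
    enum { CODE, BLOCK, LINE, STR, CHR } state = CODE;
    size_t i = 0, o = 0;

    assert(src != NULL && dst != NULL);
    while (src[i] != '\0') {
        char c = src[i], n = src[i + 1];

        switch (state) {
        case CODE:
            if (c == '/' && n == '*') {
                state = BLOCK; dst[o++] = ' '; i += 2; continue;
            }
            if (c == '/' && n == '/' && slash_slash) {
                state = LINE; i += 2; continue;
            }
            if (c == '"') state = STR;
            else if (c == '\'') state = CHR;
            dst[o++] = c;
            break;
        case BLOCK:
            if (c == '*' && n == '/') { state = CODE; i += 2; continue; }
            break;
        case LINE:
            /* backslash-newline splices lines, so the comment goes on */
            if (c == '\\' && n == '\n') { i += 2; continue; }
            if (c == '\n') { state = CODE; dst[o++] = c; }
            break;
        case STR:
        case CHR:
            dst[o++] = c;
            if (c == '\\' && n != '\0') {   /* copy the escaped character */
                dst[o++] = n; i += 2; continue;
            }
            if ((state == STR && c == '"') || (state == CHR && c == '\''))
                state = CODE;
            break;
        }
        i++;
    }
    dst[o] = '\0';
    return o;
}
```

With the flag set to 0 the text "x/*c*/y" comes out as "x y" and any "//" is copied through untouched; with the flag set to 1 everything from "//" to the end of the line is dropped. Comment-like text inside a string or character literal is copied unchanged in both modes.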
 
mdh

Keith said:
The exercise refers to a version of the C language that doesn't
support // comments --


Yes...I finally got that after reading all the archived stuff. But, the
C answer book specifically handles the double forward slash. I think
your earlier answer about some obscure situation where a divisor token
is valid code is what I was trying to understand.
Having just started C, I can say that I feel fortunate to have stumbled
onto K&R's book early on...they really stretch one to one's limit of
understanding, or is it misunderstanding :)
 
CBFalconer

mdh said:
Yes...I finally got that after reading all the archived stuff.
But, the C answer book specifically handles the double forward
slash. I think your earlier answer about some obscure situation
where a divisor token is valid code is what I was trying to
understand. Having just started C, I can say that I feel
fortunate to have stumbled onto K&R's book early on...they
really stretch one to one's limit of understanding, or is it
misunderstanding :)

You can hardly go wrong with K&R II.

Here is my solution to that exercise:

/* File uncmntc.c - demo of a text filter
Strips C comments. Tested to strip itself
by C.B. Falconer. 2002-08-15
Public Domain. Attribution appreciated
report bugs to <mailto:[email protected]>
*/

/* With gcc3.1, must omit -ansi to compile eol comments */

#include <stdio.h>
#include <stdlib.h>

static int ch, lastch;

/* ---------------- */

static void putlast(void)
{
    if (0 != lastch) fputc(lastch, stdout);
    lastch = ch;
    ch = 0;
} /* putlast */

/* ---------------- */

/* gobble chars until star slash appears */
static int stdcomment(void)
{
    int ch, lastch;

    ch = 0;
    do {
        lastch = ch;
        if (EOF == (ch = fgetc(stdin))) return EOF;
    } while (!(('*' == lastch) && ('/' == ch)));
    return ch;
} /* stdcomment */

/* ---------------- */

/* gobble chars until EOLine or EOF. i.e. // comments */
static int eolcomment(void)
{
    int ch, lastch;

    ch = '\0';
    do {
        lastch = ch;
        if (EOF == (ch = fgetc(stdin))) return EOF;
    } while (!(('\n' == ch) && ('\\' != lastch)));
    return ch;
} /* eolcomment */

/* ---------------- */

/* echo chars until '"' or EOF */
static int echostring(void)
{
    putlast();
    if (EOF == (ch = fgetc(stdin))) return EOF;
    do {
        putlast();
        if (EOF == (ch = fgetc(stdin))) return EOF;
    } while (!(('"' == ch) && ('\\' != lastch)));
    return ch;
} /* echostring */

/* ---------------- */

int main(void)
{
    lastch = '\0';
    while (EOF != (ch = fgetc(stdin))) {
        if ('/' == lastch)
            if (ch == '*') {
                lastch = '\0';
                if (EOF == stdcomment()) break;
                ch = ' ';
                putlast();
            }
            else if (ch == '/') {
                lastch = '\0';
                if (EOF == eolcomment()) break;
                ch = '\n';
                putlast(); // Eolcomment here
                // Eolcomment line \
                with continuation line.
            }
            else {
                putlast();
            }
        else if (('"' == ch) && ('\\' != lastch)
                             && ('\'' != lastch)) {
            if ('"' != (ch = echostring())) {
                fputs("\"Unterminated\" string\n", stderr);
                fputs("checking for\
continuation line string\n", stderr);
                fputs("checking for" "concat string\n", stderr);
                return EXIT_FAILURE;
            }
            putlast();
        }
        else {
            putlast();
        }
    } /* while */
    putlast(/* embedded comment */);
    return 0;
} /* main */
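
One input that the last-character test in echostring appears not to cover is a string literal ending in an escaped backslash, such as "a\\": the closing quote follows a backslash, so it is not seen as the end of the string. A sketch of one way around this, which skips whatever character follows a backslash instead of looking one character back. The helper string_literal_end is hypothetical and works on an in-memory string; it is not part of the program above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: s[0] must be the opening double quote.
 * Returns the index just past the closing quote, or 0 if the
 * literal is not terminated before the end of the text. */
size_t string_literal_end(const char *s)
{
    size_t i = 1;

    assert(s != NULL && s[0] == '"');
    while (s[i] != '\0') {
        if (s[i] == '\\' && s[i + 1] != '\0')
            i += 2;          /* skip the escaped character, whatever it is */
        else if (s[i] == '"')
            return i + 1;    /* unescaped quote closes the literal */
        else
            i++;
    }
    return 0;
}
```

Because every backslash consumes the character after it, the pair \\ is skipped as a unit and the quote that follows is correctly treated as the terminator, while \" inside the literal is still skipped.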

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 
Skarmander

Keith said:
What is K&R 1-23? My copies of K&R (both editions) are several miles
away at the moment.

C99 (which neither edition of K&R covers) adds C++-style (or
BCPL-style for software archeologists) comments, introduced by // and
terminated by the end of the line. There are some very obscure cases
where the introduction of // comments causes a valid C90 program to
become a valid C99 program with a different meaning. I don't remember
the exact details, but it involves a division operator immediately
followed by an old-style /* ... */ comment, with the following line
carefully constructed to make the program legal whether the first '/'
is a division operator or the first character of a // comment
delimiter.

This has also been used as an illustration of valid C code that is also
valid C++ code with a different meaning, if deliberately convoluted:

int f(int a, int b)
{
return a //* pretty unlikely */ b
; /* unrealistic: semicolon on separate line to avoid syntax
error */
}
--Bjarne Stroustrup, "The C++ Programming Language".

S.
 
mdh

Skarmander said:
This has also been used as an illustration of valid C code that is also
valid C++ code with a different meaning, if deliberately convoluted:

int f(int a, int b)
{
return a //* pretty unlikely */ b
; /* unrealistic: semicolon on separate line to avoid syntax
error */
}


Thanks...
Does the semicolon really have to be on a separate line?

I thought to the compiler, it's a line till it finds the ";"?
So, if I understand this correctly, something like .....


void f(void)
{

int c;
c = 6 //* extremely silly comment */ 3 ;
}

.....is legal, and thus has to be handled.

:)
 
Andrew Poelstra

mdh said:
Thanks...
Does the semicolon really have to be on a separate line?

I thought to the compiler, it's a line till it finds the ";"?
So, if I understand this correctly, something like .....


void f(void)
{

int c;
c = 6 //* extremely silly comment */ 3 ;
}

....is legal, and thus has to be handled.

:)

In C++ or C99, the // starts a comment that runs to the end of the line,
so the "3 ;" is ignored by the compiler and the missing semicolon causes a
syntax error.
 
Keith Thompson

mdh said:
Thanks...
Does the semicolon really have to be on a separate line?

It does in order for the code to be valid C90 and valid C99/C++ with
different meanings.

I thought to the compiler, it's a line till it finds the ";"?

A line is a line. A statement or declaration is terminated by a ";".
A statement or declaration can legally span multiple lines, and/or a
line can contain multiple statements or declarations (though either is
usually bad style).

In the example, if "//" comments are *not* recognized, the first "/"
is a division operator, "/* pretty unlikely */" is a comment, and "b"
is part of the expression. If "//" comments *are* recognized, the
"//" introduces a comment, the rest of the line is ignored because
it's part of the comment, and the ";" terminates the statement.

If the ";" were on the same line, it would be part of the "//"
comment, so the code would be legal in C90 but illegal in C99 and C++.
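
A small sketch that makes the difference observable at run time, assuming only that the same source can be compiled both with and without "//" comments recognized. The function names are made up for this illustration.

```c
#include <assert.h>

/* Same shape as the example above. Returns a / b when // is not a
 * comment introducer, and plain a when it is. */
int a_or_a_over_b(int a, int b)
{
    (void)b;   /* b is otherwise unused when // starts a comment */
    return a //* pretty unlikely */ b
    ;
}

/* Nonzero if this file was compiled with // comments recognized. */
int slash_slash_comments_recognized(void)
{
    int r = a_or_a_over_b(10, 2);

    assert(r == 10 || r == 5);   /* the only two possible readings */
    return r == 10;
}
```

Calling a_or_a_over_b(10, 2) gives 5 under C90 rules and 10 under C99 or C++ rules, so the second function reports which reading the compiler chose.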
 
Thad Smith

Skarmander said:
This has also been used as an illustration of valid C code that is also
valid C++ code with a different meaning, if deliberately convoluted:

int f(int a, int b)
{
return a //* pretty unlikely */ b
; /* unrealistic: semicolon on separate line to avoid
syntax error */
}

It doesn't have to be quite that unrealistic:

double cuyds (
double height, /* height in inches */
double width, /* width in inches */
double length /* length in yards */
) {
return height//* convert to yds */36
*width //* convert to yds */36
*length;/* already in yds */
}
 
mdh

To all who have replied...a grateful thank you. It is really gratifying
to be able to ask what many might perceive as "silly" questions and
receive such a response. It makes learning it all the more worthwhile.
 
jaysome

CBFalconer wrote:
[snip]
#include <stdio.h>
#include <stdlib.h>

static int ch, lastch;

/* ---------------- */

static void putlast(void)
{
if (0 != lastch) fputc(lastch, stdout);
lastch = ch;
ch = 0;
} /* putlast */

/* ---------------- */

/* gobble chars until star slash appears */
static int stdcomment(void)
{
int ch, lastch;

Declaration of symbol 'ch' hides symbol 'ch'
Declaration of symbol 'lastch' hides symbol 'lastch'

[snip]
/* gobble chars until EOLine or EOF. i.e. // comments */
static int eolcomment(void)
{
int ch, lastch;

Declaration of symbol 'ch' hides symbol 'ch'
Declaration of symbol 'lastch' hides symbol 'lastch'
 
CBFalconer

jaysome said:
CBFalconer wrote:
[snip]
#include <stdio.h>
#include <stdlib.h>

static int ch, lastch;

/* ---------------- */

static void putlast(void)
{
if (0 != lastch) fputc(lastch, stdout);
lastch = ch;
ch = 0;
} /* putlast */

/* ---------------- */

/* gobble chars until star slash appears */
static int stdcomment(void)
{
int ch, lastch;

Declaration of symbol 'ch' hides symbol 'ch'
Declaration of symbol 'lastch' hides symbol 'lastch'

[snip]
/* gobble chars until EOLine or EOF. i.e. // comments */
static int eolcomment(void)
{
int ch, lastch;

Declaration of symbol 'ch' hides symbol 'ch'
Declaration of symbol 'lastch' hides symbol 'lastch'

So? Those routines want local versions. Perfectly legitimate.
The hiding is a GOOD THING because it prevents accidentally
disturbing the globals.
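
The hiding can be seen in isolation in a sketch like the following, which is separate from the program above. All names here are hypothetical; the file-scope ch merely stands in for the one in the comment stripper.

```c
#include <assert.h>

static int ch = 'G';   /* file-scope variable, visible to every function */

/* The local ch hides the file-scope one for the whole function body,
 * so nothing done here can change the file-scope variable. */
int scan_with_local(void)
{
    int ch = 'L';

    ch++;
    return ch;
}

/* No local declaration, so this assignment changes the file-scope ch. */
int scan_without_local(void)
{
    ch = 'X';
    return ch;
}

int file_scope_ch(void)
{
    return ch;
}

void shadowing_demo(void)
{
    assert(scan_with_local() == 'L' + 1);
    assert(file_scope_ch() == 'G');      /* untouched by the local copy */
    assert(scan_without_local() == 'X');
    assert(file_scope_ch() == 'X');      /* changed this time */
}
```

A checker that reports "Declaration of symbol 'ch' hides symbol 'ch'" is pointing at the declaration inside scan_with_local; whether that is a hazard or a safeguard is the style question being argued in this thread.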

 
Irene

Keith said:
It does in order for the code to be valid C90 and valid C99/C++ with
different meanings.


A line is a line. A statement or declaration is terminated by a ";".
A statement or declaration can legally span multiple lines, and/or a
line can contain multiple statements or declarations (though either is
usually bad style).

In the example, if "//" comments are *not* recognized, the first "/"
is a division operator, "/* pretty unlikely */" is a comment, and "b"
is part of the expression. If "//" comments *are* recognized, the
"//" introduces a comment, the rest of the line is ignored because
it's part of the comment, and the ";" terminates the statement.

Also, the way the compiler handles the C++ comments depends on the
flags used during the compilation. Using gcc I noticed that without
flags, the // comments are recognized (and thus the function returns
a), whereas if we use the -ansi flag, the // comments are not
recognized and the function returns a/b. It's funny that you get
different results from the same code.
 
stathis gotsis

Irene said:
Also, the way the compiler handles the C++ comments depends on the
flags used during the compilation. Using gcc I noticed that without
flags, the // comments are recognized (and thus the function returns
a), whereas if we use the -ansi flag, the // comments are not
recognized and the function returns a/b. It's funny that you get
different results from the same code.

The "//" comments are valid C comments under C99. The -ansi flag forces
compliance with C90, which does not recognize them. Some C90 compilers
support them as an extension.
 
Keith Thompson

Irene said:
Also, the way the compiler handles the C++ comments depends on the
flags used during the compilation. Using gcc I noticed that without
flags, the // comments are recognized (and thus the function returns
a), whereas if we use the -ansi flag, the // comments are not
recognized and the function returns a/b. It's funny that you get
different results from the same code.

The way // comments are handled depends on the compiler.

A conforming C90 compiler does not recognize // comments.
A conforming C99 <OT>or C++</OT> compiler does.

gcc in its default mode does not conform to any language standard
(other than the informal GNU C "standard"), so it can behave any way
it likes. It happens to support // comments as an extension.

Strictly speaking, the behavior of specific compilers is off-topic
here, except as an illustration of the language rules.
 
August Karlstrom

Thad said:
It doesn't have to be quite that unrealistic:

double cuyds (
double height, /* height in inches */
double width, /* width in inches */
double length /* length in yards */
) {
return height//* convert to yds */36
*width //* convert to yds */36
*length;/* already in yds */
}

This program deserves to fail anyway. ;-)


August
 
jaysome

CBFalconer said:
jaysome said:
CBFalconer wrote:
[snip]

#include <stdio.h>
#include <stdlib.h>

static int ch, lastch;

/* ---------------- */

static void putlast(void)
{
if (0 != lastch) fputc(lastch, stdout);
lastch = ch;
ch = 0;
} /* putlast */

/* ---------------- */

/* gobble chars until star slash appears */
static int stdcomment(void)
{
int ch, lastch;

Declaration of symbol 'ch' hides symbol 'ch'
Declaration of symbol 'lastch' hides symbol 'lastch'

[snip]

/* gobble chars until EOLine or EOF. i.e. // comments */
static int eolcomment(void)
{
int ch, lastch;

Declaration of symbol 'ch' hides symbol 'ch'
Declaration of symbol 'lastch' hides symbol 'lastch'


So? Those routines want local versions. Perfectly legitimate.
The hiding is a GOOD THING because it prevents accidentally
disturbing the globals.

If it prevents accidentally disturbing the globals, it's not obviously
apparent from the code. Furthermore, it could be dangerous (see below).
It would be better to name a global with a prefix that uniquely
designates it as global so that your intentions are clear, e.g.:

static int G_ch, G_lastch;

From http://www.gimpel.com/html/pub80/msg.txt

578 Declaration of symbol 'Symbol' hides symbol 'Symbol' (Location)
-- A local symbol has the identical name as a global symbol ( or
possibly another local symbol). This could be dangerous. Was
this deliberate? It is usually best to rename the local symbol.
 
CBFalconer

jaysome said:
If it prevents accidentally disturbing the globals, it's not
obviously apparent from the code. Furthermore, it could be
dangerous (see below). It would be better to name a global with
a prefix that uniquely designates it as global so that your
intentions are clear, e.g.:

I'm not getting into a style war about it. It is legitimate, and
it has a purpose.

 
