How to remove // comments

Bart · Oct 20, 2006

jacob said:
????
Well, URLs in #include directives...

I don't remember seeing anything that forbids it.

If they are not allowed...

That was just an example to show that your little program may entirely
change the meaning of an #error message. What if you had:

#error This is never supposed to happen (possible cause: // comments).

Regards,
Bart.

Bart · Oct 20, 2006

Bart said:
I don't remember seeing anything that forbids it.

That was just an example to show that your little program may entirely
change the meaning of an #error message. What if you had:

#error This is never supposed to happen (possible cause: // comments).

Or since we're already talking about URLs, the more likely:

#error Please see http://domain.com/xyz for more information about this
error.

Regards,
Bart.

CBFalconer · Oct 20, 2006

jacob said:
Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about
this in comp.std.c.

I guess you have never seen a system without the following chars in
its char set. From N869:

5.2.1.1 Trigraph sequences

[#1] All occurrences in a source file of the following
sequences of three characters (called trigraph sequences)
are replaced with the corresponding single character.

    ??=  #    ??)  ]    ??!  |
    ??(  [    ??'  ^    ??>  }
    ??/  \    ??<  {    ??-  ~
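
To make the table above concrete, here is a minimal sketch of the replacement step, assuming the whole source is already in memory as a NUL-terminated string. The names trigraph_char and replace_trigraphs are made up for illustration and are not taken from any compiler; a real translator does this in the first translation phase, before line splicing and tokenizing.

```c
#include <stddef.h>

/* Map the third character of a "? ? x" sequence to its replacement,
   or return 0 if the sequence is not one of the nine trigraphs. */
static char trigraph_char(char third)
{
    switch (third) {
    case '=':  return '#';
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Copy src to dst, replacing every trigraph with its single character.
   dst must have room for strlen(src) + 1 bytes (the output never grows).
   Returns the number of characters written, not counting the NUL. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        char r;
        /* src[2] is only read when src[0] and src[1] are both '?',
           so the read never goes past the terminating NUL. */
        if (src[0] == '?' && src[1] == '?' &&
            (r = trigraph_char(src[2])) != 0) {
            dst[n++] = r;
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Fed the text int a??(8??); this produces int a[8]; and any other pair of question marks is copied through untouched.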

Jalapeno · Oct 20, 2006

Walter said:
A trigraph case:

char* d = "??/""; // "

but of course I've never seen trigraphs outside of a test suite.

Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Jalapeno · Oct 20, 2006

Jalapeno said:
Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Just for kicks I created a terminal emulator macro that put the '[' and
']' into a source file and the resultant int aÝ8¨; is more
difficult to read than int a??(8??); (at least to me).

The code compiles exactly the same.

Walter Bright · Oct 20, 2006

Jalapeno said:
Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

No, I haven't. Nor has anyone I've worked with.

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

I suppose one can get used to anything <g>.

Do you need to run non-trigraph C code through a source translator to
get it on to your z/OS system?

Walter Bright · Oct 20, 2006

Bart said:
That was just an example to show that your little program may entirely
change the meaning of an #error message. What if you had:

#error This is never supposed to happen (possible cause: // comments).

I don't think that's a reasonable test case, since presumably C code
that uses // comments would not reasonably expect that #error line to work.

Andrey Koptyaev · Oct 20, 2006

try this:

#include <stdio.h>
#include <stdlib.h>   /* malloc, calloc, free */
#include <string.h>   /* strstr, strlen, strcat */
#define BSIZE 200

int main (int argc,char *argv[]){
    char *buf;
    FILE *in,*out;
    void comm(char *);
    char *str1="//";
    char *str2="\x22\x2f\x2f\x22";   /* the four characters "//" */
    char *buf1,*substr;
    size_t i;

    if (argc<3){
        printf("too few parameters\n");
        return 1;
    }
    in=fopen(argv[1],"rb");
    if (in==NULL){
        printf("cannot open file %s\n",argv[1]);
        return 1;
    }
    out=fopen(argv[2],"wb");
    buf=malloc(BSIZE);
    while(fgets(buf,BSIZE,in) != NULL){
        /* lines containing "//" (with the quotes) are copied unchanged */
        if (!(substr=strstr(buf,str2))){
            if ((substr=strstr(buf,str1)) != NULL){
                buf1=calloc(strlen(buf)+3,1);
                /* copy the text before the // */
                for (i=0;i<(strlen(buf)-strlen(substr));i++)
                    buf1[i]=buf[i];
                buf1=strcat(buf1,"/*");
                /* copy the comment text, minus the trailing CR LF */
                for (i=strlen(buf)-strlen(substr)+2;i<(strlen(buf)-2);i++)
                    buf1[i]=buf[i];
                buf1=strcat(buf1,"*/");
                buf1=strcat(buf1,"\x0d\x0a");
                fputs(buf1,out);
                free(buf1);
            }
            else
                fputs(buf,out);
        }
        else
            fputs(buf,out);
    }
    fclose(in);
    fclose(out);
    free(buf);
    return 0;
}

Jalapeno · Oct 20, 2006

Walter said:
No, I haven't. Nor has anyone I've worked with.

I suppose one can get used to anything <g>.

Do you need to run non-trigraph C code through a source translator to
get it on to your z/OS system?

Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

A standard ASCII to EBCDIC conversion utility (like one used in a
typical terminal emulator) that uploads source text from a PC to the
mainframe will see the '[' as 0x5B and the ']' as 0x5D and will
translate them to the EBCDIC '[' as 0xAD and EBCDIC ']' as 0xBD.

so the ASCII text statement:

char x[8]; which in ASCII is

0x63 0x68 0x61 0x72 0x20 0x78 0x5B 0x38 0x5D 0x3B

will be translated in a "typical" terminal emulator utility to:

0x83 0x88 0x81 0x99 0x40 0xA7 0xAD 0xF8 0xBD 0x5E

but on the screen that looks like:

char xÝ8¨; and not char x[8];

this compiles but looks horrible on the screen and you can't type those
characters when you edit, you have to copy and paste those characters
(or create a macro). Even though the '[' and ']' exist in EBCDIC the
3270 family of terminals do not have those characters to type in or to
display.

If I manually change the characters Ý and ¨ using the terminal
emulator keyboard to '[' and ']', which the Windows keyboard has, the
encoding becomes 0xBA for '[' and 0xBB for ']' and you have

0x83 0x88 0x81 0x99 0x40 0xA7 0xBA 0xF8 0xBB 0x5E

which becomes a syntax error and won't compile.
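
The byte-for-byte translation described above can be sketched with a small lookup table. This is only an illustration: the table holds just the ten characters of char x[8]; with the values quoted in this post, the names byte_pair and ascii_to_ebcdic are invented here, and the codes for '[' and ']' differ between EBCDIC code pages (which is the whole problem), so a real converter needs a full 256-entry table for the code page in use.

```c
#include <stddef.h>

/* One ASCII byte and the EBCDIC byte the emulator maps it to. */
struct byte_pair {
    unsigned char ascii;
    unsigned char ebcdic;
};

/* Partial table: only the characters of "char x[8];",
   using the byte values quoted in the post. */
static const struct byte_pair table[] = {
    {0x63, 0x83},   /* c */
    {0x68, 0x88},   /* h */
    {0x61, 0x81},   /* a */
    {0x72, 0x99},   /* r */
    {0x20, 0x40},   /* space */
    {0x78, 0xA7},   /* x */
    {0x5B, 0xAD},   /* [ */
    {0x38, 0xF8},   /* 8 */
    {0x5D, 0xBD},   /* ] */
    {0x3B, 0x5E}    /* ; */
};

/* Translate n bytes from in to out.
   Returns 0 on success, -1 if a byte is not in the partial table. */
int ascii_to_ebcdic(const unsigned char *in, unsigned char *out, size_t n)
{
    size_t i, j;
    const size_t entries = sizeof table / sizeof table[0];

    for (i = 0; i < n; i++) {
        for (j = 0; j < entries; j++) {
            if (table[j].ascii == in[i])
                break;
        }
        if (j == entries)
            return -1;          /* unknown character */
        out[i] = table[j].ebcdic;
    }
    return 0;
}
```

Swapping the two bracket entries for 0xBA and 0xBB gives the second byte sequence shown above, the one that does not compile here.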

Our code base apparently contains "vendor" supplied source in the char
xÝ8¨; format and "home grown" (and IBM supplied sample) source in the
char x??(8??); format. We don't normally modify the vendor source so
there isn't any need to replace the ugly "screen" characters with
trigraphs but the "home grown" code is edited much more frequently and
I've become used to dealing with trigraphs.

Walter Bright · Oct 20, 2006

Jalapeno said:
Not so much to _get_ the source text to the mainframe but for it to be
usable it'll need to be in EBCDIC.

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

jacob navia · Oct 20, 2006

Walter said:
That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters must
be translated anyway, there's not much reason to support trigraphs in
the C language standard itself.

EXACTLY.

Why should the language specs be cluttered with such details?
Why should *I* bother about that?

jacob

Jalapeno · Oct 20, 2006

Walter said:
That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

Character translation is only necessary if the text originates on an
ASCII system. Since all the "home grown" code here (and that supplied
by IBM) originates on EBCDIC systems, absolutely no translations are
necessary and trigraphs are useful. All the world is not a PC. The
standard acknowledges that. I also understand that you don't find much
reason to have trigraphs supported. Some people use them, a lot. IBM's
Mainframes haven't disappeared, they've just been renamed "Servers" ;o).

jacob navia · Oct 20, 2006

Andrey said:
try this:

#include <stdio.h>
#include <stdlib.h>   /* malloc, calloc, free */
#include <string.h>   /* strstr, strlen, strcat */
#define BSIZE 200

int main (int argc,char *argv[]){
    char *buf;
    FILE *in,*out;
    void comm(char *);
    char *str1="//";
    char *str2="\x22\x2f\x2f\x22";   /* the four characters "//" */
    char *buf1,*substr;
    size_t i;

    if (argc<3){
        printf("too few parameters\n");
        return 1;
    }
    in=fopen(argv[1],"rb");
    if (in==NULL){
        printf("cannot open file %s\n",argv[1]);
        return 1;
    }
    out=fopen(argv[2],"wb");
    buf=malloc(BSIZE);
    while(fgets(buf,BSIZE,in) != NULL){
        /* lines containing "//" (with the quotes) are copied unchanged */
        if (!(substr=strstr(buf,str2))){
            if ((substr=strstr(buf,str1)) != NULL){
                buf1=calloc(strlen(buf)+3,1);
                /* copy the text before the // */
                for (i=0;i<(strlen(buf)-strlen(substr));i++)
                    buf1[i]=buf[i];
                buf1=strcat(buf1,"/*");
                /* copy the comment text, minus the trailing CR LF */
                for (i=strlen(buf)-strlen(substr)+2;i<(strlen(buf)-2);i++)
                    buf1[i]=buf[i];
                buf1=strcat(buf1,"*/");
                buf1=strcat(buf1,"\x0d\x0a");
                fputs(buf1,out);
                free(buf1);
            }
            else
                fputs(buf,out);
        }
        else
            fputs(buf,out);
    }
    fclose(in);
    fclose(out);
    free(buf);
    return 0;
}

Excuse me, but this will blindly search for a // sequence anywhere in
the line it gets. Even within character strings:

char *a = "cpp comment is // isn't it?";
and there you go, you destroy the source.
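
For contrast, here is a rough sketch of a converter that keeps track of whether it is inside a string literal, a character constant or an existing /* */ comment, and only rewrites a // that appears in plain code. The function name convert_line_comments and the in-memory interface are invented for illustration. It deliberately does not handle trigraphs, backslash-newline line splicing, a */ inside a // comment, or #error and #include lines, all of which came up earlier in this thread.

```c
#include <stdlib.h>
#include <string.h>

enum state { CODE, STRING, CHARLIT, LINE_COMMENT, BLOCK_COMMENT };

/* Return a malloc'd copy of src in which each // comment has been
   rewritten as a block comment. The caller frees the result.
   Returns NULL if memory cannot be allocated. */
char *convert_line_comments(const char *src)
{
    size_t len = strlen(src);
    /* Each // comment grows by two characters (the closing marker),
       so 2 * len + 3 bytes is a safe upper bound. */
    char *dst = malloc(2 * len + 3);
    size_t n = 0, i;
    enum state st = CODE;

    if (dst == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        char c = src[i];
        char next = src[i + 1];     /* src[len] is the NUL, so this is safe */

        switch (st) {
        case CODE:
            if (c == '/' && (next == '/' || next == '*')) {
                dst[n++] = '/';
                dst[n++] = '*';
                st = (next == '/') ? LINE_COMMENT : BLOCK_COMMENT;
                i++;                /* skip the second character */
            } else {
                if (c == '"')
                    st = STRING;
                else if (c == '\'')
                    st = CHARLIT;
                dst[n++] = c;
            }
            break;
        case STRING:
        case CHARLIT:
            dst[n++] = c;
            if (c == '\\' && next != '\0') {
                dst[n++] = next;    /* copy the escaped character as-is */
                i++;
            } else if ((st == STRING && c == '"') ||
                       (st == CHARLIT && c == '\'')) {
                st = CODE;
            }
            break;
        case LINE_COMMENT:
            if (c == '\r' || c == '\n') {
                dst[n++] = '*';     /* close the comment before the line end */
                dst[n++] = '/';
                st = CODE;
            }
            dst[n++] = c;
            break;
        case BLOCK_COMMENT:
            dst[n++] = c;
            if (c == '*' && next == '/') {
                dst[n++] = '/';
                i++;
                st = CODE;
            }
            break;
        }
    }
    if (st == LINE_COMMENT) {       /* comment on a last line with no newline */
        dst[n++] = '*';
        dst[n++] = '/';
    }
    dst[n] = '\0';
    return dst;
}
```

With this, the string literal in the example line above comes through unchanged, while a trailing // real comment on the same line becomes /* real*/.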

You ignore all the discussion, and you put this program...

C'mon...

You can't do this in such a BRUTE force fashion...

If I write
char *a = "//";
it will replace it with

Yevgen Muntyan · Oct 20, 2006

jacob said:
EXACTLY.

Why should the language specs be cluttered with such details?
Why should *I* bother about that?

You should do whatever you like; but note that you are fooling people
if you are saying you are producing a C compiler, since people do not
expect a C compiler to intentionally ignore some parts of C standard.

You are not saying "a compiler system adding some sugar to C language
and removing some standard parts from it" on your web site, are you?
Web site says "lcc-win32 C compiler system".

Regards,
Yevgen

jacob navia · Oct 20, 2006

Yevgen said:
You should do whatever you like; but note that you are fooling people
if you are saying you are producing a C compiler, since people do not
expect a C compiler to intentionally ignore some parts of C standard.

You are not saying "a compiler system adding some sugar to C language
and removing some standard parts from it" on your web site, are you?
Web site says "lcc-win32 C compiler system".

Regards,
Yevgen

Who is talking about the C compiler?
We are talking about this utility to eliminate // comments, which is
the subject of this thread!!!

lcc-win32, by the way, will warn you about any trigraphs it sees by
default. If you want to use trigraphs you have to set the option
-ansic.

This is NONSENSE for all users that are NOT EBCDIC and do NOT work in
mainframes. By the way, the venerable 3270 is DEAD SINCE CONCEPTION
and one of the nice things about the microcomputers that appeared in the
eighties was these wonderful KEYBOARDS where we could type any character
we wish... Nice isn't it?

Yevgen Muntyan · Oct 20, 2006

jacob said:
Who is talking about the C compiler?

Below is what made me think your compiler does not support trigraphs
(this is your reply elsethread). If you meant "my compiler supports
trigraphs but I do not support them" (I'm not sure what you actually
meant, then), I apologize.

Walter said:
> Peter Nilsson wrote:
>
> A trigraph case:
>
> char* d = "??/""; // "
>
> but of course I've never seen trigraphs outside of a test suite.

Me neither. But I do not support trigraphs anyway. They are an
unnecessary feature. We had several lengthy discussions about this in
comp.std.c.

Keith Thompson · Oct 20, 2006

Jalapeno said:
Haven't worked in a z/OS shop before, huh? (or a Sys 370 one either)

It only takes an hour or two of working with int a??(8??); to get used
to them (and they become second nature quickly when you see them all
day long).

Fascinating. There have been raging arguments about trigraphs both
here and in comp.std.c for years. I think you're the first person
I've seen who actually *uses* them. Maybe mainframe users just don't
post to Usenet very often?

In my own experience, and that of most people here, trigraphs have
caused far more problems than they solve; if a trigraph appears in a C
source file, it's far more likely to be accidental than intentional
(unless the code is deliberately obfuscated). For example:

fprintf(stderr, "Unexpected error, what happened??!\n");
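
One common way to keep a message like that intact, whatever the compiler's trigraph setting, is to escape one of the question marks: \? is a standard escape sequence for '?', so the two question marks are never adjacent in the source. A minimal sketch, with safe_message and report_unexpected_error as made-up names:

```c
#include <stdio.h>

/* Written with an escaped second question mark, so no trigraph
   sequence appears in the source text of the literal. */
static const char safe_message[] = "Unexpected error, what happened?\?!\n";

/* Write the message to the given stream. */
void report_unexpected_error(FILE *stream)
{
    fputs(safe_message, stream);
}
```

Splitting the literal into two adjacent strings works too, since literals are concatenated long after trigraph replacement has taken place.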

Since there is currently no active effort to publish a new C standard,
it looks like we're stuck with the current situation for the
foreseeable future, but some of us are still trying to come up with a
better solution. For example, I've proposed *disabling* trigraphs by
default, but enabling them if there's some unique marker at the top of
the file.

For any change like this, there's a danger of breaking existing code,
but for those of us outside the IBM mainframe world, it would probably
accidentally *fix* more code than it would break.

Also, why do you use trigraphs rather than digraphs? They were added
in a 1995 update to the standard (I think that's right); you could
write a[8] as a<:8:> rather than as a??(8??).
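
For anyone who hasn't seen digraphs in use, here is a small sketch, assuming a compiler that accepts them (any implementation conforming to the 1995 amendment or later should). The function name sum_first_three is just an illustration. Unlike trigraphs, digraphs are ordinary tokens, so they are not replaced inside string literals or character constants.

```c
/* <: and :> stand for [ and ], <% and %> stand for { and }.
   (%: stands for #, but the directives are left out here.) */
int sum_first_three(void)
<%
    int a<:3:> = <% 1, 2, 3 %>;

    return a<:0:> + a<:1:> + a<:2:>;
%>
```

This compiles to exactly the same thing as the version written with brackets and braces.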

Any thoughts?

Walter Roberson · Oct 20, 2006

jacob navia said:
This is NONSENSE for all users that are NOT EBCDIC and do NOT work in
mainframes. By the way, the venerable 3270 is DEAD SINCE CONCEPTION

It was? You only had to wait 3 years for DEC to introduce the VT52,
whose 9600 bps serial interface wasn't up to the task of
connecting 17500 terminals to a single 16 megabyte computer.

and one of the nice things about the microcomputers that appeared in the
eighties was these wonderful KEYBOARDS where we could type any character
we wish... Nice isn't it?

"In the eighties" was literally a decade after the introduction
of the "dead since conception" 3270. And it took another decade (at least)
before all the codepages were in place.

Richard Heathfield · Oct 20, 2006

Walter Bright said:

Trigraphs are a worthless feature.

This "worthless feature" is sometimes the only way you can get C code to
compile on a particular implementation, because the native character set of
the implementation doesn't contain such fancy characters as { or [ - so to
dismiss it as worthless is to display mere parochialism. I've worked on a
system that had no end of trouble with [ and ] but was quite at home with
??( and ??)

Richard Heathfield · Oct 20, 2006

Walter Bright said:

That's what I expected. That pretty much means that trigraphs are a
reasonable solution for such systems, but that since the characters
must be translated anyway, there's not much reason to support trigraphs
in the C language standard itself.

If trigraphs were *not* supported in the Standard, you'd have a heck of a
job getting the same source base to run on, say, MS-DOS (or, nowadays,
Windows) and MVS. Just because you don't use 'em yourself, that doesn't
mean they're not useful.
