token pasting problem in K&R preprocessor

Henry Townsend

I hope this is on-topic in c.l.c - it's about the C preprocessor more
than the language per se, more generally about the K&R behavior, and
most specifically about the Sun cpp which is why I've cross posted there.

The test case below is taken from an Imake setup (yes, old, I know).
There's a Concat() macro which uses old-style /**/ for token pasting (we
are not allowed to assume ANSI so ## isn't allowed). I'm trying to paste
AND expand the two macros X and Y. I cannot figure out why it doesn't
work in this scenario; the literal /**/ works but the macro doesn't.

% cat /tmp/X.c
#define X xxx
#define Y yyy
#define Concat(a,b)a/**/b
X/**/Y
Concat(X,Y)

% /usr/ccs/lib/cpp /tmp/X.c
# 1 "/tmp/X.c"
[blank space elided]
xxxyyy
XY

Thanks,
HT
 
Kenneth Brody

Henry Townsend wrote:
[...]
% cat /tmp/X.c
#define X xxx
#define Y yyy
#define Concat(a,b)a/**/b
X/**/Y
Concat(X,Y)

% /usr/ccs/lib/cpp /tmp/X.c
# 1 "/tmp/X.c"
[blank space elided]
xxxyyy
XY

Using MSVC6 (cl /P usenet.c), I get:
==========
xxxyyy
xxx yyy
==========

Using gcc version "egcs-2.91.66" (gcc -E usenet.c), I get:

==========
# 1 "usenet.c"


xxx yyy
xxx yyy
==========

I can't imagine how you get "XY" for the second line. Doesn't the
preprocessor have to expand the X and Y into their #define'd values?

Can you run the C compiler, asking it to only preprocess the file,
rather than run cpp directly? What happens then?

Now, as to why MSVC gives "xxxyyy" for the first line, and gcc gives
"xxx yyy", I don't know. Can someone here tell me which is "right"
and which is "wrong"? (Or are they both "right"?)
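On the ANSI side of that question, a minimal sketch, assuming a C89-or-later compiler: translation phase 3 replaces each comment with one space, so `X/**/Y` is two separate tokens. How a preprocessor spaces those tokens in its textual output (`cl /P`, `gcc -E`) is not something the standard pins down, which may be why the two outputs differ. `STR` and `STR_` are hypothetical helper names, not part of the original test case.

```c
#include <assert.h>
#include <string.h>

#define X xxx
#define Y yyy

/* Hypothetical helpers: STR expands its argument first, STR_ stringizes it. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* The comment is turned into whitespace before macro expansion, so the
   argument is the two tokens X and Y; stringizing keeps one space between
   the expanded tokens. */
const char *comment_between_macros(void)
{
    return STR(X/**/Y);
}

/* Call from main() to confirm the behaviour on an ANSI preprocessor. */
void check_comment_is_whitespace(void)
{
    assert(strcmp(comment_between_macros(), "xxx yyy") == 0);
}
```

This only describes a Standard preprocessor; a K&R cpp such as the one in the original post is free to drop the comment entirely.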

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
Michael Mair

Henry said:
I hope this is on-topic in c.l.c - it's about the C preprocessor more
than the language per se, more generally about the K&R behavior, and
most specifically about the Sun cpp which is why I've cross posted there.

The test case below is taken from an Imake setup (yes, old, I know).
There's a Concat() macro which uses old-style /**/ for token pasting (we
are not allowed to assume ANSI so ## isn't allowed). I'm trying to paste
AND expand the two macros X and Y. I cannot figure out why it doesn't
work in this scenario; the literal /**/ works but the macro doesn't.

% cat /tmp/X.c
#define X xxx
#define Y yyy
#define Concat(a,b)a/**/b
X/**/Y
Concat(X,Y)

% /usr/ccs/lib/cpp /tmp/X.c
# 1 "/tmp/X.c"
[blank space elided]
xxxyyy
XY

I don't have a K&R compiler, so I am only guessing.

Note that you have the same problem for ##. There, you solve
it by an additional "level of indirection", i.e. you force
expansion of the macro arguments by wrapping "Concat" once:
#define X xxx
#define Y yyy
#define con_cat(a,b) a##b
#define CONCAT(a,b) con_cat(a,b)
con_cat(X,Y)
CONCAT(X,Y)
leads to
XY
xxxyyy

I'd try the same for K&R, too.
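The ANSI half of this can be checked from inside a program rather than by reading preprocessor output. This is only a sketch for a Standard compiler; `STR` and `STR_` are hypothetical helpers added to make the pasted name visible as a string, and it says nothing about what a K&R cpp would do.

```c
#include <assert.h>
#include <string.h>

#define X xxx
#define Y yyy
#define con_cat(a,b) a##b          /* operands of ## are not expanded */
#define CONCAT(a,b) con_cat(a,b)   /* arguments are expanded on the way in */

/* Hypothetical helpers: expand the argument, then stringize the result. */
#define STR_(x) #x
#define STR(x) STR_(x)

const char *pasted_directly(void)    { return STR(con_cat(X,Y)); }
const char *pasted_via_wrapper(void) { return STR(CONCAT(X,Y)); }

/* Call from main() to confirm both results. */
void check_indirection(void)
{
    assert(strcmp(pasted_directly(), "XY") == 0);
    assert(strcmp(pasted_via_wrapper(), "xxxyyy") == 0);
}
```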

HTH
Michael
 
Eric Sosman

Henry Townsend wrote on 06/19/06 14:52:
[...]
There's a Concat() macro which uses old-style /**/ for token pasting (we
are not allowed to assume ANSI so ## isn't allowed). [...]

ANSI C has been around for a little less than seventeen
years, so perhaps your caution in adopting it is justified.
Just as a point of idle interest, how many decades do you
think must elapse before it is safe to adopt a new standard?
How are you doing with the plans to move from FORTRAN II to
FORTRAN IV?

Challenge: Without reference to a newspaper, almanac,
Wikipedia, or other such source, name three people who were
heads of state of members of the UN Security Council at the
time ANSI C was adopted.

Challenge: Is an `int' large enough to count the number
of seconds since the adoption of ANSI C? The number of hours?
Of days?

Challenge: In twenty-five words or fewer, compare and contrast
your organization's rate of standards adoption with the rate of
proton decay.

(All right, all right -- I'm having some fun at your expense.
But in all seriousness, I urge you to consider moving forward to
within a decade of the leading edge. There were once good reasons
to accommodate pre-ANSI implementations, but their goodness has
diminished with the passage of time and is now approximately equal
to that of a funerary meal from a Pharaoh's tomb. You really,
really ought to take a hard look at your reasons for adhering to
outdated technologies. Observe: The thread at hand demonstrates
that this antiquarian romanticism is making trouble and hence
costing you money ...)
 
CBFalconer

Henry said:
.... snip ...

The test case below is taken from an Imake setup (yes, old, I
know). There's a Concat() macro which uses old-style /**/ for
token pasting (we are not allowed to assume ANSI so ## isn't
allowed). I'm trying to paste AND expand the two macros X and Y.
I cannot figure out why it doesn't work in this scenario; the
literal /**/ works but the macro doesn't.

Then you just don't paste tokens. Even old K&R specified that any
comment was replaced by at least one blank. Some compilers were
too stupid to do this.
 
Henry Townsend

CBFalconer said:
Henry Townsend wrote:
... snip ...

Then you just don't paste tokens. Even old K&R specified that any
comment was replaced by at least one blank. Some compilers were
too stupid to do this.

A) The example was not typed in, it was pasted verbatim from a shell
session. It's only five lines - feel free to try it yourself if you want
to see that there's no blank in the /**/ case.

B) This is an imake-based build system which has been in use for about
18 years (yes, before ANSI C was ratified), and has been running on
every commercially viable Unix platform during that time. Developers and
users of this very-well-known product will be surprised to hear that it
can't be built.

C) The Concat() macro was taken directly from the X11 imake system, where
it survives to this day.

So regardless of what K&R may have _specified_, empirical evidence is
that no blank is inserted.
 
Eric Sosman

Henry said:
A) The example was not typed in, it was pasted verbatim from a shell
session. It's only five lines - feel free to try it yourself if you want
to see that there's no blank in the /**/ case.

B) This is an imake-based build system which has been in use for about
18 years (yes, before ANSI C was ratified), and has been running on
every commercially viable Unix platform during that time. Developers and
users of this very-well-known product will be surprised to hear that it
can't be built.

C) The Concat() macro was taken directly from the X11 imake system, where
it survives to this day.

So regardless of what K&R may have _specified_, empirical evidence is
that no blank is inserted.

The biggest problem tackled by the ANSI Standard was not
inventing the token-pasting operator, or void, or prototypes,
or any of the other "new features" in the language. Larger
than all of these -- than all of these put together -- was
the problem of reconciling the multiple divergent versions of
C that had arisen.

Does sprintf() return a count or a pointer?

Does unsigned short promote to int or to unsigned int?

Does an extern declaration inside a block have block
scope or file scope?

If a macro definition contains a string literal that in
turn contains a substring identical to one of the macro's
arguments, does substitution occur?

... and, of course: What happens when a macro definition
expands two of its arguments with only a comment separating
them?

Actual, real implementations of pre-Standard C differed
on all of these points (and more). The whole reason for the
standardization effort in the first place was that there was
no consensus on these matters. There was no single definition,
either formal or de facto, about how to paste tokens.
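Two of the divergences listed above can be observed directly. The sketch below shows only what a Standard compiler does; the function and macro names are hypothetical, and a pre-Standard implementation could legitimately give the other answer in each case.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Under ANSI rules a parameter name inside a string literal is left alone;
   some pre-Standard preprocessors substituted the argument there. */
#define SHOW(x) "x"

const char *param_in_string(void)
{
    return SHOW(hello);
}

/* ANSI chose "value-preserving" promotion: unsigned short becomes int
   whenever int can represent every unsigned short value, so 1 - 2 is -1.
   Under "unsigned-preserving" rules the result would be a large unsigned
   value and the comparison below would be false. */
int ushort_promotes_to_int(void)
{
    unsigned short us = 1;
    return (us - 2) < 0;
}

/* Call from main() to confirm both on a Standard compiler. */
void check_ansi_choices(void)
{
    assert(strcmp(param_in_string(), "x") == 0);
    assert(ushort_promotes_to_int() == (USHRT_MAX <= INT_MAX));
}
```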

Now: You say you are "not allowed to assume ANSI C," a
stance I've commented on elsethread. However, you mention
that all this token-pasting is done through a Concat() macro.
Can you not make a preprocessor test for __STDC__ and define
Concat() accordingly? If you are worried about pre-Standard
(or anti-Standard) compilers that define __STDC__ but don't
do token-pasting, I think you're worried about trifles.
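A minimal sketch of that `__STDC__` test, assuming the old `/**/` form is kept as the fallback. `Concat_`, `NAME_STR` and `NAME_STR_` are hypothetical helper names; the extra `Concat_` level is there so that macro arguments such as X and Y are expanded before `##` sees them. The checking code uses `#` and `const`, so it is only meaningful on the Standard branch.

```c
#include <assert.h>
#include <string.h>

#define X xxx
#define Y yyy

#if defined(__STDC__) && __STDC__
/* Standard preprocessor: paste with ##, via one level of indirection. */
#define Concat_(a,b) a##b
#define Concat(a,b) Concat_(a,b)
#else
/* Pre-Standard fallback: relies on cpp dropping the comment entirely. */
#define Concat(a,b)a/**/b
#endif

/* Hypothetical helpers to make the pasted name visible (ANSI only). */
#define NAME_STR_(x) #x
#define NAME_STR(x) NAME_STR_(x)

const char *concat_result(void)
{
    return NAME_STR(Concat(X,Y));
}

/* Call from main() on a Standard compiler. */
void check_concat(void)
{
    assert(strcmp(concat_result(), "xxxyyy") == 0);
}
```

Whether the fallback branch also expands X and Y is exactly the per-cpp behaviour this thread is about, so it would still need testing on each old preprocessor.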
 
Henry Townsend

Eric said:
Now: You say you are "not allowed to assume ANSI C," a
stance I've commented on elsethread. However, you mention
that all this token-pasting is done through a Concat() macro.
Can you not make a preprocessor test for __STDC__ and define
Concat() accordingly? If you are worried about pre-Standard
(or anti-Standard) compilers that define __STDC__ but don't
do token-pasting, I think you're worried about trifles.

No, if you go back to the original post I was trying to understand
specifically how to make a certain kind of pasting work with a certain
K&R cpp, with the idea that the answer would probably apply to the 6-8
other old K&R preprocessors I'll be encountering soon on other
platforms. But what I think I've learned here is that each cpp may
behave differently. Mind you, though, this Concat() macro has worked for
every platform X11 has been ported to over the last 20 years, which is
quite a lot.

As for whether to upgrade to ANSI, I personally am a big fan of getting
with the times but it's not necessarily part of my charter to do so in
this case. I may be able to get it on the table but at the moment the
referenced problem must be dealt with in the context of K&R. Luckily
it's not a huge difficulty; I can special-case this particular value. I
was just hoping for a solution short of ANSI C.

HT
 
Richard B. Gilbert

Henry said:
No, if you go back to the original post I was trying to understand
specifically how to make a certain kind of pasting work with a certain
K&R cpp, with the idea that the answer would probably apply to the 6-8
other old K&R preprocessors I'll be encountering soon on other
platforms. But what I think I've learned here is that each cpp may
behave differently. Mind you, though, this Concat() macro has worked for
every platform X11 has been ported to over the last 20 years, which is
quite a lot.

As for whether to upgrade to ANSI, I personally am a big fan of getting
with the times but it's not necessarily part of my charter to do so in
this case. I may be able to get it on the table but at the moment the
referenced problem must be dealt with in the context of K&R. Luckily
it's not a huge difficulty; I can special-case this particular value. I
was just hoping for a solution short of ANSI C.

HT

There are other advantages to coding in ANSI C. The nicest one is that
the compilers no longer complain about the sleazy coding practices that
were so prevalent in K&R. The other nice thing is that it tends to
uncover bugs that may have been making you itch for years. Doing it is
generally tedious but trivial.
 
Old Wolf

Eric said:
Challenge: Without reference to a newspaper, almanac,
Wikipedia, or other such source, name three people who were
heads of state of members of the UN Security Council at the
time ANSI C was adopted.

I'll bite: Elizabeth II, Mitterrand, Deng Xiaoping, Gorbachev, Reagan

When was the standard actually ratified? I've heard rumblings
that even though the "89" tag is attached, it wasn't actually
ratified by ANSI until sometime in 1990 (in which case I
may have failed your challenge)

Eric said:
Challenge: Is an `int' large enough to count the number
of seconds since the adoption of ANSI C? The number of hours?
Of days?

Well, a 32-bit int is going to get us from 1 Jan 1970 to sometime
in 2039.
 
Keith Thompson

Old Wolf said:
I'll bite: Elizabeth II, Mitterrand, Deng Xiaoping, Gorbachev, Reagan

When was the standard actually ratified? I've heard rumblings
that even though the "89" tag is attached, it wasn't actually
ratified by ANSI until sometime in 1990 (in which case I
may have failed your challenge)

The ANSI C standard was dated 1989. The ISO C standard, equivalent to
the ANSI standard but with the sections renumbered and some other
cosmetic changes, was dated 1990. ANSI then officially adopted the
1990 ISO standard (I don't know just when that happened).

Old Wolf said:
Well, a 32-bit int is going to get us from 1 Jan 1970 to sometime
in 2039.

Almost. Tue 2038-01-19 03:14:07 UTC.
 
Joe Wright

Keith said:
The ANSI C standard was dated 1989. The ISO C standard, equivalent to
the ANSI standard but with the sections renumbered and some other
cosmetic changes, was dated 1990. ANSI then officially adopted the
1990 ISO standard (I don't know just when that happened).


Almost. Tue 2038-01-19 03:14:07 UTC.

Regard..

With 'time' = 'INT_MAX' or 31 one bits..
2147483647 Tue Jan 19 03:14:07 2038

With 'time' = 'UINT_MAX' or 32 one bits..
4294967295 Sun Feb 7 06:28:15 2106
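Those two dates can be reproduced with a few lines of C. This is a sketch under stated assumptions: `time_t` counts seconds since 1970-01-01 UTC (as on POSIX systems), the program runs in the default "C" locale, and `long long` (C99) is available. `format_epoch_utc` is a hypothetical helper name; it returns NULL where `time_t` is too narrow for the value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Format `secs` seconds after the epoch as UTC text in buf.
   Returns buf, or NULL if the value does not fit this platform's time_t
   or the conversion fails. */
const char *format_epoch_utc(long long secs, char *buf, size_t len)
{
    time_t t = (time_t)secs;
    struct tm *utc;

    if ((long long)t != secs)
        return NULL;
    utc = gmtime(&t);
    if (utc == NULL || strftime(buf, len, "%a %Y-%m-%d %H:%M:%S", utc) == 0)
        return NULL;
    return buf;
}

/* Call from main() to compare against the dates quoted above. */
void check_rollover_dates(void)
{
    char buf[32];
    const char *s;

    s = format_epoch_utc(2147483647LL, buf, sizeof buf);   /* 31 one bits */
    assert(s != NULL && strcmp(s, "Tue 2038-01-19 03:14:07") == 0);

    s = format_epoch_utc(4294967295LL, buf, sizeof buf);   /* 32 one bits */
    if (s != NULL)   /* NULL only where time_t is too narrow for the value */
        assert(strcmp(s, "Sun 2106-02-07 06:28:15") == 0);
}
```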
 
lawrence.jones

Old Wolf said:
When was the standard actually ratified?

According to the Foreword:

This document was approved as an American National Standard by the
American National Standards Institute (ANSI) on December 14, 1989.

Old Wolf said:
I've heard rumblings
that even though the "89" tag is attached, it wasn't actually
ratified by ANSI until sometime in 1990 (in which case I
may have failed your challenge)

The delay came before the tag was attached -- the committee voted the
document out in 1988 but a procedural snafu at ANSI kept it from being
approved until 1989.

-Larry Jones

These things just seem to happen. -- Calvin
 
ena8t8si

Eric said:
The biggest problem tackled by the ANSI Standard was not
inventing the token-pasting operator, or void, or prototypes,
or any of the other "new features" in the language. Larger
than all of these -- than all of these put together -- was
the problem of reconciling the multiple divergent versions of
C that had arisen. ...

Does unsigned short promote to int or to unsigned int?


And they got that one wrong.
 
