Type of a string literal

Old Wolf

C99 6.4.5/5 seems to say that string literals have type char[].

However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location

This happens with "-ansi" and with "-std=c99", with or without the
other usual warning switches.

Is this a GCC bug or am I misinterpreting the standard?


#include <stdio.h>

int main(void)
{
    if ( 0 )
        "abc"[0] = 1;

    return 0;
}
 
Ben Bacarisse

Old Wolf said:
C99 6.4.5/5 seems to say that string literals have type char[] .

... or wchar_t[], for wide string literals.

However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location

This happens with "-ansi" and with "-std=c99" , with or without the
other usual warning switches.

Is this a GCC bug or am I misinterpreting the standard?

6.4.5/6 says "If the program attempts to modify such an array, the
behavior is undefined" so the issue is whether your program attempts to
modify the array. It certainly doesn't look like it:
#include <stdio.h>

int main(void)
{
    if ( 0 )
        "abc"[0] = 1;

    return 0;
}

I get the same error if I move the assignment into a static function
which is never called. gcc then both complains about the assignment and
tells me that the function is never called.

It looks to me like gcc is overstepping the mark. clang is happy with
it.
 
glen herrmannsfeldt

Old Wolf said:
C99 6.4.5/5 seems to say that string literals have type char[] .
However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location
This happens with "-ansi" and with "-std=c99" , with or without the
other usual warning switches.
Is this a GCC bug or am I misinterpreting the standard?
#include <stdio.h>
int main(void)
{
if ( 0 )
"abc"[0] = 1;
return 0;
}

As I understand it, in K&R C string constants were writable.

While you won't normally do it that way, you might have a pointer to
one, and write over the constant using that.

ANSI removed that ability, but compilers should still support it given
the appropriate option.

Or maybe you are suggesting that the paragraph quoted should have
included the 'const' qualifier.

On the other hand, inside an if(0) it seems a little mean of the
compiler to disallow it. Could be a warning, though.

-- glen
 
James Kuyper

C99 6.4.5/5 seems to say that string literals have type char[] .

It actually doesn't specify the type of string literals directly. It
does specify that "The multibyte character sequence is then used to
initialize an array of static storage duration and length just
sufficient to contain the sequence.", so I would say it's more
appropriate for the type to be char[N] for the appropriate value of
N. It does specify the type of the elements of the array: it's char if
there's no prefix or a u8 prefix, and wchar_t, char16_t or char32_t
for the L, u and U prefixes respectively.

In most contexts, an lvalue of array type decays into a pointer to the
appropriate element type, so the array length doesn't matter. However,
there are three contexts where that's not the case: sizeof("Hello") should
be 6, &"world!" should have the type char(*)[7], and the declaration:

char greeting[] = "Hello world!";

is equivalent to

char greeting[13] = {'H', 'e', 'l', 'l', 'o', ' ',
'w', 'o', 'r', 'l', 'd', '!', '\0'};
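To make those three non-decaying contexts concrete, here is a minimal sketch. The function names (hello_literal_size, greeting_forms_match, world_array_length) are invented for illustration and come from neither the standard nor this thread. The code only reads the literals and never writes to them.

```c
#include <stddef.h>
#include <string.h>

/* Size in bytes of the array behind "Hello": five letters plus '\0'. */
size_t hello_literal_size(void)
{
    return sizeof("Hello");
}

/* Returns 1 if the literal form and the brace-list form of the
   initializer produce arrays of the same size and contents. */
int greeting_forms_match(void)
{
    char from_literal[] = "Hello world!";
    char from_list[13] = {'H', 'e', 'l', 'l', 'o', ' ',
                          'w', 'o', 'r', 'l', 'd', '!', '\0'};

    return sizeof from_literal == sizeof from_list
        && memcmp(from_literal, from_list, sizeof from_list) == 0;
}

/* Number of elements in the array that &"world!" points to. */
size_t world_array_length(void)
{
    char (*p)[7] = &"world!";   /* pointer to the whole array, not to char */
    return sizeof *p / sizeof (*p)[0];
}
```

Note that from_literal and from_list are ordinary automatic arrays, so unlike the literal itself they may be modified freely.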
However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location

"If the program attempts to modify such an array, the behavior is
undefined" (C99 6.4.5p6). If "const" had been part of the original C
language, string literals should have had a const-qualified type.
However, by the time "const" was added, too much legacy code had been
written which would have broken if that change were made.
 
Eric Sosman

C99 6.4.5/5 seems to say that string literals have type char[] .

However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location

This happens with "-ansi" and with "-std=c99" , with or without the
other usual warning switches.

Is this a GCC bug or am I misinterpreting the standard?

Not a bug, because a compiler is always allowed to issue
any diagnostics it likes. That's why gcc can issue a warning
for `if (x = 0) ...' even though it's perfectly legal C.

#include <stdio.h>

int main(void)
{
    if ( 0 )
        "abc"[0] = 1;

If the assignment were attempted, the behavior would be
undefined. The anonymous array created by the string literal
is of type `char[4]' rather than `const char[4]', but that's
a historical accident: `const' was a latecomer to C. Despite
the array's non-`const'-ness, attempting to modify it yields
undefined behavior.

Perhaps one could argue that the compiler should have seen
that the assignment would never be executed, optimized the whole
business away, and suppressed the warning -- but that's an argument
to have with the compiler developers, not with the language.

    return 0;
}

More on the "historical accident": Before `const' there was no
way for a function to advertise that it didn't intend to write through
a pointer argument. A function operating on a string looked like:

int oldfunc(string)
char *string;
{ ... }

It looked this way whether it wanted read-only or read-write access;
the function definition was the same either way. Along came `const'
and it became possible to state the difference:

int readwrite(string)
char *string;
{ ... }

int readonly(string)
const char *string;
{ ... }

or with prototypes (which came along at the same time):

int readwrite(char *string) { ... }

int readonly(const char *string) { ... }

At this point the Committee *could* have `const'-ified the string
literal's array, but then what would have happened to oldfunc() --
to all those oldfunc()'s in their myriad thousands in C code that
had been written in the two decades preceding the ANSI Standard?
Every attempt to call them with literal arguments would suddenly
become an "error by fiat" -- with a diagnostic required, no less.
How eager would folks have been to adopt a brand-new Standard whose
immediate effect was to delegitimize a huge amount of pre-existing
code? So the Committee took the "impure" but intensely practical
stance that the literal arrays would be "non-`const' but please
don't write them." That's the situation that still prevails.
 
Keith Thompson

James Kuyper said:
C99 6.4.5/5 seems to say that string literals have type char[] .

It actually doesn't specify the type of string literals directly. It
does specify that "The multibyte character sequence is then used to
initialize an array of static storage duration and length just
sufficient to contain the sequence.", so I would say it's more
appropriate for the type to be char[N] for the appropriate value of
N. It does specify the type of the elements of the array: it's char if
there's no prefix or a u8 prefix, and wchar_t, char16_t or char32_t
for the L, u and U prefixes respectively.

N1570 6.5.1p4 (in the section on primary expressions) says:

A string literal is a primary expression. It is an lvalue with type
as detailed in 6.4.5.

It would be better IMHO if 6.4.5 specified the type of the literal, as
6.4.4 does for constants.

As far as I can tell, there's no explicit statement about the *value* of
a string literal. It's obvious that it's the value of the static array
object described in 6.4.5, but the standard doesn't actually say so.
 
Keith Thompson

glen herrmannsfeldt said:
As I understand it, in K&R C string constants were writable.

While you won't normally do it that way, you might have a pointer to
one, and write over the constant using that.

ANSI removed that ability, but compilers should still support it given
the appropriate option.

I'll have to check, but I don't think K&R1 *required* string literals to
be writable.

As of C89 (and C90, and C99, and C11), string literals are not const,
but attempting to modify them (more precisely, the static arrays
associated with them) has undefined behavior.

There's no particular reason for a modern compiler to support
modifying string literals, except perhaps to support old (bad) code.
There's certainly no requirement in the standard to support it.

Or maybe you are suggesting that the paragraph quoted should have
included the 'const' qualifier.

Making string literals const (as C++ did) would have broken existing
code. Prior to the 1989 ANSI C standard, the "const" qualifier didn't
exist. A code snippet like this:

int func(char *s) { /* ... */ }
...
func("hello");

would be illegal in C89/C90 if string literals were const; the solution
would be to change the parameter to "const char *s" (which is a good
idea anyway), but that wasn't possible in pre-ANSI C. The existing
rule is a necessary compromise.
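A minimal sketch of how that split looks in practice. The helpers count_char and upcase_first are hypothetical names invented here. The first only reads through its pointer and so can safely take a literal; the second writes through it and so must be given a modifiable array.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Read-only access: const documents that s is never written through,
   so passing a string literal is safe. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* Read-write access: the caller must pass a modifiable array.
   Passing a string literal would compile but have undefined behavior. */
void upcase_first(char *s)
{
    if (s[0] != '\0')
        s[0] = (char)toupper((unsigned char)s[0]);
}
```

A caller that needs to modify the text can copy the literal into an array first, for example `char buf[] = "hello"; upcase_first(buf);`, whereas `count_char("hello", 'l')` is fine as written.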

[...]
 
Keith Thompson

Old Wolf said:
C99 6.4.5/5 seems to say that string literals have type char[] .

It doesn't *explicitly* say that, but combined with 6.5.1p4 we can
determine that the type of "hello" is char[6].
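One way to ask a compiler what it thinks the type is, sketched with C11's _Generic. This assumes a C11 (or later) compiler and no option that changes the type of literals (GCC's -Wwrite-strings, for instance, makes them const-qualified). The function names are made up for this example.

```c
/* Requires C11 or later for _Generic. */

/* Returns 1 if &"hello" has type char (*)[6], 0 otherwise. */
int address_of_hello_is_ptr_to_char6(void)
{
    return _Generic(&"hello", char (*)[6]: 1, default: 0);
}

/* Returns 1 if the elements are plain char, 2 if they are const char,
   0 for anything else. */
int hello_element_kind(void)
{
    return _Generic(&"hello"[0], char *: 1, const char *: 2, default: 0);
}
```

Neither function evaluates or modifies the literal; _Generic only inspects the type of its controlling expression.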
However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location

This happens with "-ansi" and with "-std=c99" , with or without the
other usual warning switches.

Is this a GCC bug or am I misinterpreting the standard?

It's a gcc bug, corrected in a later version.

With gcc 4.1.2, I see the same error message you do.

With gcc 4.7.1, I get:

foo.c: In function 'main':
foo.c:6:17: warning: assignment of read-only location '"abc"[0]' [enabled by default]

which is a reasonable warning, but not a diagnostic required by the
standard.

#include <stdio.h>

int main(void)
{
    if ( 0 )
        "abc"[0] = 1;

    return 0;
}

A conforming hosted C implementation may not reject this program, since
it doesn't violate any syntax rules or constraints. (Though I suppose a
sufficiently perverse compiler could claim that it exceeds some capacity
limit.)
 
Keith Thompson

Eric Sosman said:
C99 6.4.5/5 seems to say that string literals have type char[] .
However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location

This happens with "-ansi" and with "-std=c99" , with or without the
other usual warning switches.

Is this a GCC bug or am I misinterpreting the standard?

Not a bug, because a compiler is always allowed to issue
any diagnostics it likes. That's why gcc can issue a warning
for `if (x = 0) ...' even though it's perfectly legal C.

Yes, it's a bug, because it's a fatal error, not a warning.
The compiler rejects the program. The bug was corrected in a later
version of gcc.

[...]
 
osmium

Keith Thompson said:
I'll have to check, but I don't think K&R1 *required* string literals to
be writable.

That was one of the big criticisms of Schildt's books. He treated them as
writable and the standard said (or implied) otherwise. The compilers I used
always allowed me to write to them, but I didn't.
 
glen herrmannsfeldt

Eric Sosman said:
C99 6.4.5/5 seems to say that string literals have type char[] .
However, the version of GCC I have installed fails to compile the following
code, giving the error
foo.c:6: error: assignment of read-only location
This happens with "-ansi" and with "-std=c99" , with or without the
other usual warning switches.
Is this a GCC bug or am I misinterpreting the standard?
Not a bug, because a compiler is always allowed to issue
any diagnostics it likes. That's why gcc can issue a warning
for `if (x = 0) ...' even though it's perfectly legal C.

I suppose, and I think I wouldn't complain if it issued a
warning for this one, but it issued an error instead.

(snip)
If the assignment were attempted, the behavior would be
undefined. The anonymous array created by the string literal
is of type `char[4]' rather than `const char[4]', but that's
a historical accident: `const' was a latecomer to C. Despite
the array's non-`const'-ness, attempting to modify it yields
undefined behavior.
Perhaps one could argue that the compiler should have seen
that the assignment would never be executed, optimized the whole
business away, and suppressed the warning -- but that's an argument
to have with the compiler developers, not with the language.

But it was an error, not a warning.

Even without the if(0) the compiler doesn't know that it will
ever be executed.

-- glen
 
Eric Sosman

Eric Sosman said:
[...]
Perhaps one could argue that the compiler should have seen
that the assignment would never be executed, optimized the whole
business away, and suppressed the warning -- but that's an argument
to have with the compiler developers, not with the language.

But it was an error, not a warning.

Thanks for the correction (and to Keith Thompson, too).

Even without the if(0) the compiler doesn't know that it will
ever be executed.

The argument (for warning or for error) goes the other
way around: The compiler needn't prove an execution *will*
be attempted, but should in a case as simple as this one be
able to prove that it *won't* be. I'm sure gcc can do such
proofs at suitable optimization levels, but the larger
question of whether to complain about "optimized out" code
remains open. It seems to me the compiler does the developer
a service by pointing out problems even in optimized out
sections, since the reasons for optimizing out today may not
obtain tomorrow (when a different and less helpful compiler
may be in use). Replace `if(0)' with `if(CHAR_BIT!=8)' to
get a code block that will be optimized away on nearly every
platform, but should still be inspected and perhaps warned
about.
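A sketch of the kind of branch under discussion: the condition is a compile-time constant, so an optimizer can drop whichever arm it rules out on a given platform, yet both arms still have to be parsed and checked. The function name low_octet is invented for this example.

```c
#include <limits.h>

/* Returns the low 8 bits of value. */
unsigned low_octet(unsigned value)
{
    if (CHAR_BIT == 8)
        return value & UCHAR_MAX;   /* a byte is exactly one octet here */

    /* Reached only where bytes are wider than 8 bits. On the usual
       8-bit-byte platforms this line is dead code, but it must still
       compile, and a compiler may still diagnose problems in it. */
    return value & 0xFFu;
}
```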
 
Tim Rentsch

Keith Thompson said:
Old Wolf said:
[concerning an error caused by trying to assign into an element
of a string literal.]

#include <stdio.h>

int main(void)
{
    if ( 0 )
        "abc"[0] = 1;

    return 0;
}

A conforming hosted C implementation may not reject this program,
since it doesn't violate any syntax rules or constraints.

More precisely, because the program is strictly conforming. A
conforming implementation is required to accept any strictly
conforming program, but not any other programs. Not violating
any syntax rule or constraint is one aspect of being strictly
conforming, but not the only aspect. However this program
qualifies on those other aspects also.

(Though I suppose a sufficiently perverse compiler could claim
that it exceeds some capacity limit.)

Exceeding a capacity limit provides a basis for not being able to
execute a program successfully, but not for rejecting (ie, not
accepting) a program. Any strictly conforming program must be
accepted by a conforming implementation, even if the resultant
executable cannot be run successfully.
 
Keith Thompson

Tim Rentsch said:
More precisely, because the program is strictly conforming. A
conforming implementation is required to accept any strictly
conforming program, but not any other programs. Not violating
any syntax rule or constraint is one aspect of being strictly
conforming, but not the only aspect. However this program
qualifies on those other aspects also.

What about 4p3?

A program that is correct in all other aspects, operating on correct
data, containing unspecified behavior shall be a correct program and
act in accordance with 5.1.2.3.

This program:

#include <stdio.h>
#include <limits.h>
int main(void) {
    printf("%d\n", INT_MAX);
}

is not strictly conforming, since its behavior is
implementation-defined, but I don't believe a compiler is permitted
to reject it because of that.

Exceeding a capacity limit provides a basis for not being able to
execute a program successfully, but not for rejecting (ie, not
accepting) a program. Any strictly conforming program must be
accepted by a conforming implementation, even if the resultant
executable cannot be run successfully.

Practically speaking, any compiler with finite resources (more briefly,
"any compiler") will have some programs that are just too big for it to
process. A likely example:

int main(void)
{
[6.02e23 lines of "{" omitted]
[6.02e23 lines of "}" omitted]
}

This obviously exceeds the minimal translation limits specified in
5.2.4.1, and very likely exceeds the actual translation limits of any
given compiler. 5.2.4.1 requires a conforming implementation to
"translate and execute" a program that hits all the minimal limits (127
nesting levels of blocks, etc.), which I believe implies that it's not
required to translate *or* execute programs that exceed those limits.
 
