Wide string initializer syntax

Derrick Coetzee

Looking through the C90 standard, it occurred to me that the possible
syntaxes for initializers, particularly of wchar_t arrays, are really
bizarre. Consider the following:

wchar_t s1[] = { L"abcdef" };
wchar_t* s2[] = { L"abcdef" };
wchar_t s3[][6] = { L"abcdef" };
wchar_t* s4[][6] = { L"abcdef" };

That's four different types initialized with exactly the same
initializer syntax, but it means four different things. In the first
case, a mutable buffer is being initialized, and the standard lets you
wrap the string initializing the buffer in braces for no apparent reason.
In the second case, an array containing one pointer to a string literal
is declared. In the third case, an array containing one initialized
mutable buffer is declared. In the fourth case, a 1 by 6 two-dimensional
array of pointers is declared, with s4[0][0] pointing to a string literal
and s4[0][1] through s4[0][5] set to null pointers. I could continue with
higher-dimensional arrays right up to the environment limits.
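
The four readings can be checked mechanically. What follows is only a sketch, assuming a hosted C implementation; the helper name check_initializer_shapes is made up for illustration and is not part of the original post. It asserts the shape each declaration ends up with.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t, NULL */

/* The four declarations from the post, unchanged, at file scope. */
wchar_t  s1[]    = { L"abcdef" };
wchar_t *s2[]    = { L"abcdef" };
wchar_t  s3[][6] = { L"abcdef" };
wchar_t *s4[][6] = { L"abcdef" };

/* Hypothetical helper: asserts the shape each declaration ends up with. */
void check_initializer_shapes(void)
{
    /* s1: six characters plus the terminating null character. */
    assert(sizeof s1 / sizeof s1[0] == 7);
    assert(s1[6] == L'\0');

    /* s2: a single pointer to the string literal. */
    assert(sizeof s2 / sizeof s2[0] == 1);
    assert(s2[0][0] == L'a');

    /* s3: one row of exactly six characters; there is no room for
       the terminator, so the row is not a null-terminated string. */
    assert(sizeof s3 / sizeof s3[0] == 1);
    assert(s3[0][5] == L'f');

    /* s4: one row of six pointers; only the first is given a value,
       the remaining five are null pointers. */
    assert(sizeof s4 / sizeof s4[0] == 1);
    assert(s4[0][0][0] == L'a');
    assert(s4[0][5] == NULL);
}
```

Calling check_initializer_shapes() from main should run through without an assertion failing. Note that s3 only works because C allows a string literal to fill a character array exactly, dropping the terminator.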

Thoughts?
 
Nicolas Pavlidis

Derrick said:
Looking through the C90 standard, it occurred to me that the possible
syntaxes for initializers, particularly of wchar_t arrays, are really
bizarre. Consider the following:

Are you sure that wchar_t is a built-in type for C? I don't know about
C99, but in C90 there is definitely no built-in wchar_t type!

Kind regards,
Nicolas
 
Derrick Coetzee

Nicolas said:
Are you sure that wchar_t is a built-in type for C? I don't know about
C99, but in C90 there is definitely no built-in wchar_t type!

The wchar_t type is not built-in, but is required to be defined in the
standard header stddef.h. Wide string literals are always arrays of
whatever wchar_t is defined to be, even if the type's definition is not
available. The standard mentions wchar_t in several places.
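
As a rough illustration of that point, the sketch below takes the size of a wide string literal before any header has been included, then includes stddef.h and compares against wchar_t. The function names wide_literal_bytes and check_wide_literal are hypothetical, chosen only for this example.

```c
/* No header has been included yet, so the name wchar_t is not declared
   here. The wide literal still has a complete array type, and sizeof
   can be applied to it. */
unsigned long wide_literal_bytes(void)
{
    /* Three characters plus the terminating null: four elements. */
    return (unsigned long) sizeof L"abc";
}

#include <assert.h>
#include <stddef.h>   /* from here on, the name wchar_t is available */

/* Hypothetical helper: the literal's element type matches wchar_t. */
void check_wide_literal(void)
{
    wchar_t buf[] = L"abc";

    assert(sizeof buf == 4 * sizeof(wchar_t));
    assert(sizeof buf == wide_literal_bytes());
}
```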
 
Chris Torek

Derrick said:
Looking through the C90 standard, it occurred to me that the possible
syntaxes for initializers, particularly of wchar_t arrays, are really
bizarre. Consider the following:

wchar_t s1[] = { L"abcdef" };
wchar_t* s2[] = { L"abcdef" };
wchar_t s3[][6] = { L"abcdef" };
wchar_t* s4[][6] = { L"abcdef" };

That's four different types initialized with exactly the same
initializer syntax, but it means four different things. ...

Indeed, this is all correct and true, but it is not special to wide
characters. Replace "wchar_t" with "char", and remove the uppercase
L's, and it is still all correct and true.

(Versions of gcc helpfully warn about incomplete/inconsistent
brace-bracketing of the fourth line, given the appropriate options.)
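
Here is a sketch of the plain char versions, plus a fully braced spelling of the fourth declaration. The names c1 through c4, c4_braced and check_char_initializers are invented for this example. The assumption that -Wmissing-braces is the GCC option in question is mine, not from the post.

```c
#include <assert.h>
#include <string.h>   /* strcmp, memcmp, NULL */

/* Same four declarations with char and narrow string literals. */
char  c1[]    = { "abcdef" };
char *c2[]    = { "abcdef" };
char  c3[][6] = { "abcdef" };
char *c4[][6] = { "abcdef" };

/* The fourth case with the inner braces written out; this is the
   form a brace-consistency warning is asking for. */
char *c4_braced[][6] = { { "abcdef" } };

/* Hypothetical helper: the char versions behave like the wide ones. */
void check_char_initializers(void)
{
    assert(sizeof c1 == 7);                      /* includes the terminator */
    assert(strcmp(c2[0], "abcdef") == 0);        /* one pointer to a literal */
    assert(sizeof c3 == 6);                      /* one row, no terminator */
    assert(memcmp(c3[0], "abcdef", 6) == 0);
    assert(sizeof c4 == sizeof c4_braced);       /* same type either way */
    assert(strcmp(c4[0][0], c4_braced[0][0]) == 0);
    assert(c4[0][1] == NULL && c4_braced[0][5] == NULL);
}
```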
 
Derrick Coetzee

Chris said:
wchar_t s1[] = { L"abcdef" };
wchar_t* s2[] = { L"abcdef" };
wchar_t s3[][6] = { L"abcdef" };
wchar_t* s4[][6] = { L"abcdef" };

Indeed, this is all correct and true, but it is not special to wide
characters. Replace "wchar_t" with "char", and remove the uppercase
L's, and it is still all correct and true.

Ah, you're right. It was the first one I was unsure of, but:

"An array of character type may be initialized by a character string
literal, optionally enclosed in braces."
"An array with element type compatible with wchar_t may be initialized
by a wide string literal, optionally enclosed in braces."
- C90, 6.5.7

I can't figure out what these optional braces are for. I suppose yet
another concession to existing implementations.
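
To make the quoted wording concrete, the sketch below initializes the same arrays with and without the optional braces and checks that the results are identical. The helper name check_optional_braces is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t */
#include <string.h>   /* memcmp */

/* Hypothetical helper: braces around a string literal change nothing. */
void check_optional_braces(void)
{
    char    plain[]       = "abcdef";
    char    braced[]      = { "abcdef" };
    wchar_t wide_plain[]  = L"abcdef";
    wchar_t wide_braced[] = { L"abcdef" };

    /* Same size, terminator included in both forms. */
    assert(sizeof plain == 7);
    assert(sizeof plain == sizeof braced);
    assert(sizeof wide_plain == sizeof wide_braced);

    /* Same contents, element for element. */
    assert(memcmp(plain, braced, sizeof plain) == 0);
    assert(memcmp(wide_plain, wide_braced, sizeof wide_plain) == 0);
}
```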
 
J. J. Farrell

Derrick Coetzee said:
"An array of character type may be initialized by a character string
literal, optionally enclosed in braces."
"An array with element type compatible with wchar_t may be initialized
by a wide string literal, optionally enclosed in braces."
- C90, 6.5.7

I can't figure out what these optional braces are for. I suppose yet
another concession to existing implementations.

Consistency. In general, initializers for aggregate type are enclosed
in braces.
 
Michael Wojcik

J. J. Farrell said:
Consistency. In general, initializers for aggregate type are enclosed
in braces.

The braces are also optional for initializers for scalar types.

This consistency simplifies things for source-code generators, and
means that {0} is a valid initializer for any object type or any
array of unknown size (in a declaration where initialization is
permitted).
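
A small sketch of the {0} point, applied to a scalar, a pointer, a structure, a two-dimensional array and an array of unknown size. The struct point type and the helper name check_zero_initializer are invented for this example.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t, NULL */

/* Hypothetical aggregate, used only to show {0} on a structure. */
struct point {
    int    x;
    double y;
    char   tag[4];
};

/* Hypothetical helper: the same initializer works for every object. */
void check_zero_initializer(void)
{
    int          scalar     = { 0 };  /* braces are optional on scalars */
    char        *pointer    = { 0 };  /* 0 converts to a null pointer */
    struct point pt         = { 0 };  /* remaining members are zeroed too */
    int          grid[2][3] = { 0 };  /* all six elements become zero */
    wchar_t      text[]     = { 0 };  /* unknown size: one element */

    assert(scalar == 0);
    assert(pointer == NULL);
    assert(pt.x == 0 && pt.y == 0.0 && pt.tag[3] == '\0');
    assert(grid[1][2] == 0);
    assert(sizeof text == sizeof(wchar_t) && text[0] == 0);
}
```

Some compilers may warn about the missing inner braces on grid and pt, but the initializer itself is valid.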
 
