an array of pointers, causing segmentation fault

Jess

Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
    char* x[] = {"a","b","c"};

    for(size_t j = 0; j != 3; j++)
        cout << *(x[j]) << endl;

    // Since string "a" is an array of chars, and *(x[0]) is the location
    // in memory that stores the first char in the array, i.e. 'a', this
    // line of code tries to change it to 'A'.
    *(x[0]) = 'A';

    for(size_t j = 0; j != 3; j++)
        cout << *(x[j]) << endl;

    return 0;
}

The output is:
a
b
c
Segmentation fault
 
Victor Bazarov

Jess said:
Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};

This is an unfortunate language feature [that ought to have been
abolished years ago!!!] which led you to believe you are allowed to
change the values of the characters to which elements of 'x' point.
You have initialised 'x' with addresses of arrays of _constant_
characters. Then you try to change them, and that's undefined behaviour.

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

*(x[0]) = 'A'; //since string "a" is an array of chars, and *(x[0])
is the location in memory that stores the first char in the array, ie
'a'. this line of code tries to change it to 'A'

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

return 0;
}

The output is:
a
b
c
Segmentation fault

"- Doctor, when I try changing the contents of the constant memory,
my program segfaults!
- Well, don't try changing the contents of the constant memory."

V
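
A minimal sketch of the const-correct alternative, assuming the goal is only to end up with 'A' as the first character printed. The function name first_chars_after_fix is made up for illustration and is not from the original post. With the elements declared as const char*, the compiler rejects the write through the pointer, so the only legal change is to point the element at a different literal:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not from the original post: collects the first
// character of each element after "changing" the first one legally.
std::string first_chars_after_fix()
{
    // Elements point at const chars, matching the real type of a literal.
    const char* x[] = {"a", "b", "c"};

    // *(x[0]) = 'A';   // would not compile: assignment to a const char
    x[0] = "A";         // fine: the pointer itself is not const

    std::string out;
    for (std::size_t j = 0; j != 3; ++j)
        out += *(x[j]); // first character of each literal
    return out;
}
```

Printing the returned string should show the three first characters with the first one replaced, and no write to a literal ever happens.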
 
Lionel B

Jess said:
Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};

This is an unfortunate language feature [that ought to be abolished
years ago!!!] which led you to believe you are allowed to change the
values of the characters to which elements of 'x' point. You have
initialised 'x' with addresses of arrays of _constant_ characters. Then
you try to change them, and that's undefined behaviour.

Unfortunate indeed; however seeing as we're stuck with it I'm wondering
whether it's not too much to expect your compiler to pick up on it (with
at least a warning), since it will most likely complain if you try to
assign to any other const expression. Or would that contravene the
standard somewhere?
 
Jeroen

Lionel B schreef:
Jess said:
Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};
This is an unfortunate language feature [that ought to be abolished
years ago!!!] which led you to believe you are allowed to change the
values of the characters to which elements of 'x' point. You have
initialised 'x' with addresses of arrays of _constant_ characters. Then
you try to change them, and that's undefined behaviour.

Unfortunate indeed; however seeing as we're stuck with it I'm wondering
whether it's not too much to expect your compiler to pick up on it (with
at least a warning), since it will most likely complain if you try to
assign to any other const expression. Or would that contravene the
standard somewhere?

Not sure if the following is correct:

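void foo(char *p); // declared before use so that main() compiles

int main()
{
    char* x[] = {"a","b","c"};

    foo(x[0]);
}

void foo(char *p)
{
    *p = 'A'; // writes through a pointer to a string literal
}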

My idea is that it may require a lot of tracking and tracing for a
compiler in order to detect all possible situations where the segfault
can occur.

Jeroen
 
Victor Bazarov

Lionel said:
Jess said:
Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};

This is an unfortunate language feature [that ought to be abolished
years ago!!!] which led you to believe you are allowed to change the
values of the characters to which elements of 'x' point. You have
initialised 'x' with addresses of arrays of _constant_ characters.
Then you try to change them, and that's undefined behaviour.

Unfortunate indeed; however seeing as we're stuck with it I'm
wondering whether it's not too much to expect your compiler to pick
up on it (with at least a warning), since it will most likely
complain if you try to assign to any other const expression. Or would
that contravene the standard somewhere?

No, a warning never contravenes the Standard. Indeed, many compilers
give plenty of warnings if you ask them. I am not sure, however, that
initialising a pointer to non-const char with the address of the first
element of an array of const char (string literal) is one of the
commonly available warnings.

Perhaps PC-lint might help...

V
 
Victor Bazarov

Jeroen said:
Lionel B schreef:
Jess wrote:
Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};
This is an unfortunate language feature [that ought to be abolished
years ago!!!] which led you to believe you are allowed to change the
values of the characters to which elements of 'x' point. You have
initialised 'x' with addresses of arrays of _constant_ characters.
Then you try to change them, and that's undefined behaviour.

Unfortunate indeed; however seeing as we're stuck with it I'm
wondering whether it's not too much to expect your compiler to pick
up on it (with at least a warning), since it will most likely complain if
you try to assign to any other const expression. Or
would that contravene the standard somewhere?

Not sure if the following is correct:

int main()
{
char* x[] = {"a","b","c"};

foo(x[0]);
}

void foo(char *p)
{
*p = 'A';
}

My idea is that it may require a lot of tracking and tracing for a
compiler in order to detect all possible situations where the segfault
can occur.

I don't think Lionel suggested tracking that. The mere initialisation
of 'x' with string literals should be the red flag. Once it's fixed,
then other problems can pop up and the whole thing will need to be
dealt with.

V
 
Lionel B

Jeroen said:
Lionel B schreef:
On Thu, 26 Apr 2007 08:44:29 -0400, Victor Bazarov wrote:

Jess wrote:
Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};
This is an unfortunate language feature [that ought to be abolished
years ago!!!] which led you to believe you are allowed to change the
values of the characters to which elements of 'x' point. You have
initialised 'x' with addresses of arrays of _constant_ characters.
Then you try to change them, and that's undefined behaviour.

Unfortunate indeed; however seeing as we're stuck with it I'm
wondering whether it's not too much to expect your compiler to pick up
on it (with at least a warning), since it will most likely complain if
you try to assign to any other const expression. Or would that
contravene the standard somewhere?
Not sure if the following is correct:

int main()
{
char* x[] = {"a","b","c"};

foo(x[0]);
}

void foo(char *p)
{
*p = 'A';
}

My idea is that it may require a lot of tracking and tracing for a
compiler in order to detect all possible situations where the segfault
can occur.

I don't think Lionel suggested tracking that.

Hmm, I think I might have been, actually...

The mere initialisation
of 'x' with string literals should be the red flag. Once it's fixed,
then other problems can pop up and the whole thing will need to be dealt
with.

I presume by "fixing" it, you mean:

char* const x = "abc"; // compile error

const char* const x = "abc"; // ok

I suspect that might break a lot of legacy code, though. I was thinking
of:

char* const x = "abc"; // warning
...
*x = 'A'; // REALLY SCARY warning

The second warning could, I suppose, be an error (insofar as any code it
breaks was already badly broken...) but either way it might entail some
work on the compiler's part, as suggested by the previous poster.
 
Mumia W.

Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

*(x[0]) = 'A'; //since string "a" is an array of chars, and *(x[0])
is the location in memory that stores the first char in the array, ie
'a'. this line of code tries to change it to 'A'

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

return 0;
}

The output is:
a
b
c
Segmentation fault

String literals in C++ are not generally writable; however, it's
perfectly acceptable to define arrays of characters that you can change:

char x[][2] = {"a", "b", "c"};

If you want to keep your old definition for "x", older versions of gcc
(g++) accepted the option "-fwritable-strings" as a compiler extension,
but it was removed in GCC 4.0. Other compilers may have a similar feature.
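
A rough sketch of the array-of-arrays approach, assuming every string is a single character so that each row needs two chars (the letter plus the terminating '\0'). The function name first_chars_writable is invented for this example. Each row is initialised by copying the literal, so the write goes to the program's own storage rather than to the literal:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: each row of x is its own writable 2-char array,
// initialised with a copy of the corresponding literal.
std::string first_chars_writable()
{
    char x[][2] = {"a", "b", "c"};

    x[0][0] = 'A';      // modifies our copy, not the literal

    std::string out;
    for (std::size_t j = 0; j != sizeof x / sizeof x[0]; ++j)
        out += x[j][0]; // first character of each row
    return out;
}
```

The row size has to be large enough for the longest string plus its terminator, which is the main cost of this layout compared with an array of pointers.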
 
dragoncoder

Hello, the following code failed with "segmentation fault". Could
someone tell me why it failed please? Thanks!

#include<iostream>
#include<string>
#include<cstring>
#include<cstddef>

using namespace std;

int main(){
char* x[] = {"a","b","c"};

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

*(x[0]) = 'A'; //since string "a" is an array of chars, and *(x[0])
is the location in memory that stores the first char in the array, ie
'a'. this line of code tries to change it to 'A'

Correct, and "a" is a constant string literal which may be stored in
read-only memory; modifying it causes undefined behavior.

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

return 0;

}

The output is:
a
b
c
Segmentation fault

You are lucky to have got a seg fault; UB may manifest itself in
many different ways, e.g. setting your PC on fire.

/P
 
Howard

Jess said:
*(x[0]) = 'A'; //since string "a" is an array of chars, and *(x[0])
is the location in memory that stores the first char in the array, ie
'a'. this line of code tries to change it to 'A'

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

I have my own follow-up question:

Is the above correct? Can you dereference an array member as if it were a
pointer? I would think the normal way to refer to the first character of
the first char array would be x[0][0], not *(x[0]). What he's doing is the
same as:

char xxx[10];
*xxx = 'a';

And that just doesn't seem right, to me. Is it?

-Howard
 
Victor Bazarov

Howard said:
Jess said:
*(x[0]) = 'A'; //since string "a" is an array of chars, and *(x[0])
is the location in memory that stores the first char in the array, ie
'a'. this line of code tries to change it to 'A'

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

I have my own follow-up question:

Is the above correct? Can you dereference an array member as if it
were a pointer?

In the original post members of 'x' array _are_ pointers.
I would think the normal way to refer to the first
character of the first char array would be x[0][0], not *(x[0]). What he's
doing is the same as:

char xxx[10];
*xxx = 'a';

And that just doesn't seem right, to me. Is it?

It is. 'xxx' (the name of the array), when used in an expression,
decays to a pointer to the first element. In fact, the indexing
expression (xxx[blah]) is not indexing at all, it's dereferencing:
*(xxx + blah) , and of course it's just the same thing with the
name decaying to a pointer.

V
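
A small sketch of the equivalence described above, under the assumption that the index stays within the array. The helper name same_element is hypothetical. It checks that indexing and explicit pointer arithmetic name the same element, both through the array name and through a pointer obtained by decay:

```cpp
#include <cstddef>

// Hypothetical helper: true if the different spellings refer to the
// same element. The caller must pass i < 10.
bool same_element(std::size_t i)
{
    char xxx[10] = "abcdefghi";   // 9 chars plus '\0'
    char* p = xxx;                // array name decays to &xxx[0]

    *xxx = 'a';                   // same as xxx[0] = 'a'
    return &xxx[i] == xxx + i     // indexing is pointer arithmetic
        && &p[i] == &*(p + i)     // same rule applied to a plain pointer
        && *xxx == xxx[0];        // dereferencing the name gives element 0
}
```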
 
Howard

Victor Bazarov said:
Howard said:
Jess said:
*(x[0]) = 'A'; //since string "a" is an array of chars, and *(x[0])
is the location in memory that stores the first char in the array, ie
'a'. this line of code tries to change it to 'A'

for(size_t j = 0; j != 3; j++)
cout << *(x[j]) << endl;

I have my own follow-up question:

Is the above correct? Can you dereference an array member as if it
were a pointer?

In the original post members of 'x' array _are_ pointers.
I would think the normal way to refer to the first
character of the first char array would be x[0][0], not *(x[0]). What
he's doing is the same as:

char xxx[10];
*xxx = 'a';

And that just doesn't seem right, to me. Is it?

It is. 'xxx' (the name of the array), when used in an expression,
decays to a pointer to the first element. In fact, the indexing
expression (xxx[blah]) is not indexing at all, it's dereferencing:
*(xxx + blah) , and of course it's just the same thing with the
name decaying to a pointer.

Ah, ok. Thanks, Victor.
-Howard
 
Default User

Victor Bazarov wrote:

No, a warning never contravenes the Standard. Indeed, many compilers
give plenty of warnings if you ask them. I am not sure, however, that
initialising a pointer to non-const char with the address of the first
element of an array of const char (string literal) is one of the
commonly available warnings.

I glanced through the options for g++, but couldn't hit on one. There's
one to make them writable, which avoids the UB, but it's not a portable
solution.




Brian
 
Default User

Default User wrote:

I glanced through the options for g++, but couldn't hit on one.
There's one to make them writable, which avoids the UB, but not a
portable solution.

Eh, doesn't avoid the UB of course.




Brian
 
James Kanze

[...]
Not sure if the following is correct:
int main()
{
char* x[] = {"a","b","c"};
foo(x[0]);
}
void foo(char *p)
{
*p = 'A';
}

The code has undefined behavior.

Hmm, I think I might have been, actually...

It's not reasonable. foo might be in another translation unit.

I presume by "fixing" it, you mean:
char* const x = "abc"; // compile error
const char* const x = "abc"; // ok

Not an error, but at least a warning.

In C++, the type of a string literal is char const[], which
converts implicitly to char const*. Normally, a string literal
could not be used to initialize a char*. To avoid breaking
legacy code, however, there is a special implicit conversion
that only applies to a char const* that is the direct result of
a string literal. When the committee made string literals char
const[], and introduced this conversion, it was their expressed
intent that compilers generate warnings when it was used. A
compiler *should* warn.

I suspect that might break a lot of legacy code, though. I was thinking
of:
char* const x = "abc"; // warning
...
*x = 'A'; // REALLY SCARY warning
The second warning could, I suppose, be an error (insofar as any code it
breaks was already badly broken...) but either way it might entail some
work on the compiler's part, as suggested by the previous poster.

The second could only be an error if the compiler can determine
that the code will actually be executed, for all possible input.
A warning on the first would be sufficient.
 
James Kanze

It is. 'xxx' (the name of the array), when used in an expression,
decays to a pointer to the first element.

Not always, of course. But it's easier to remember the
exceptions (argument of sizeof, binding to a reference) than the
cases where the implicit conversion occurs. (Formally, it's
just another implicit conversion, which only occurs if
necessary. Practically, however, there are very few contexts in
an expression where an array type is legal. As Victor points
out, C++ doesn't support indexing into an array, as such.)
In fact, the indexing expression (xxx[blah]) is not indexing
at all, it's dereferencing: *(xxx + blah), and of course it's
just the same thing with the name decaying to a pointer.

Correct. And because it involves addition, it is commutative.
All that is required is that one of the operands be a pointer
(not an array!), and the other have an integral or enumeration
type. So things like "abcd"[ i ], or even i[ "abcd" ] are legal.

(This isn't quite true. It only holds for pointers and
arithmetic types; if you overload operator+ and the unary
operator*, for example, you still can't use [], and you can
overload [] to have completely different semantics than *(a+b).
And while someone designing a smart pointer into an array class
should definitely try to respect the conventions, and make []
work exactly like *(a+b), he won't be able to make []
commutative, since it has to be a member.)
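
A short sketch of the two points above, using made-up helper names (third_letter and array_bytes are not from the thread). The first shows the built-in [] being commutative on a string literal; the second shows one of the contexts where the array does not decay, namely as the operand of sizeof:

```cpp
#include <cstddef>

// Built-in [] is commutative: a[i] and i[a] both mean *(a + i).
char third_letter()
{
    std::size_t i = 2;
    return i["abcd"];             // same element as "abcd"[i]
}

// No decay under sizeof: the operand keeps its array type.
std::size_t array_bytes()
{
    char xxx[10];
    return sizeof xxx;            // size of the whole array, not of a char*
}
```

The reversed spelling is legal but rarely helps readability, so it is best treated as a curiosity that explains the rule rather than as a style to adopt.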
 
Lionel B

Jeroen wrote:
Lionel B schreef:
[...]
Not sure if the following is correct:
int main()
{
char* x[] = {"a","b","c"};
foo(x[0]);
}
void foo(char *p)
{
*p = 'A';
}

The code has undefined behavior.
Hmm, I think I might have been, actually...

It's not reasonable. foo might be in another translation unit.
I presume by "fixing" it, you mean:
char* const x = "abc"; // compile error
const char* const x = "abc"; // ok

Not an error, but at least a warning.

In C++, the type of a string literal is char const[]. Which converts
implicitly to char const*. Normally, a string literal could not be used
to initialize a char*. To avoid breaking legacy code, there is a
special implicit conversion, however, that only applies to char const*
which are the direct result of a string literal. When the committee
made string literals char const[], and introduced this conversion, it
was their expressed intent that compilers generate warnings when it was
used. A compiler *should* warn.

Right, thanks for the clarification. I see now that gcc has a
-Wwrite-strings warning option which does exactly this, emitting a:

deprecated conversion from string constant to ‘char*’

warning. The documentation states:

"When compiling C, give string constants the type const char[length] so
that copying the address of one into a non-const char * pointer will get
a warning; when compiling C++, warn about the deprecated conversion from
string constants to char *. These warnings will help you find at compile
time code that can try to write into a string constant, but only if you
have been very careful about using const in declarations and prototypes.
Otherwise, it will just be a nuisance; this is why we did not make -Wall
request these warnings."

... and why I did not notice it.

[...]
 
Jess

This is an unfortunate language feature [that ought to be abolished
years ago!!!] which led you to believe you are allowed to change the
values of the characters to which elements of 'x' point. You have
initialised 'x' with addresses of arrays of _constant_ characters.
Then you try to change them, and that's undefined behaviour.

Is this because a string literal is always an array of const chars?
What if I have an array of string objects, rather than string
literals?
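
On the second question, a minimal sketch, assuming "string objects" means std::string. The function name first_chars_of_strings is hypothetical. Each std::string copies the characters of its initialiser into storage the object owns, so writing to an element changes that copy and leaves the literal alone:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: an array of std::string objects, each holding
// its own modifiable copy of the literal it was initialised from.
std::string first_chars_of_strings()
{
    std::string x[] = {"a", "b", "c"};

    x[0][0] = 'A';      // modifies the string's own buffer

    std::string out;
    for (std::size_t j = 0; j != 3; ++j)
        out += x[j][0]; // first character of each string
    return out;
}
```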
 
