Understanding char*

Norm

Could someone explain for me what is happening here?

char *string;
string = new char[5];
string = "This is the string";
cout << string << std::endl;

The output contains (This is the string) even though I only allocated 5
characters in memory with new.
 
Ron Natalie

Norm said:
Could someone explain for me what is happening here?

char* is not a string type. It is a pointer to a single character.
At this point in your career you should stay clear of it. Use the std::string type instead.
char *string;
string = new char[5];

Allocates an array of 5 char and puts the pointer to the first one in string.
string = "This is the string";

You are not writing over the array of 5 you allocated above. You are assigning
to the pointer the address of the first character of a string literal (a statically
allocated array of as many characters as needed).

You've essentially lost (leaked) the array you allocated in the previous line
by writing over the remembered address with this one.
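
To make the point about the lost address concrete, here is a minimal sketch. The function name is made up for illustration, and a second pointer is kept only so the array can be inspected and freed afterwards; the original code has no such second pointer, which is exactly why it leaks.

```cpp
#include <cstring>

// Hypothetical helper: returns true when assigning a string literal to the
// pointer changed only the pointer and left the allocated array alone.
bool assignment_only_repoints() {
    char* buffer = new char[5];        // the 5 chars we allocated
    std::memset(buffer, 'x', 5);       // fill them so we can check them later

    const char* string = buffer;       // const char* so it may also point at a literal
    string = "This is the string";     // repoints string; copies nothing

    bool repointed = (string != buffer);
    bool untouched = (buffer[0] == 'x' && buffer[4] == 'x');

    delete[] buffer;                   // possible only because we kept the address
    return repointed && untouched;
}
```

Calling assignment_only_repoints() should return true: the five chars still hold their old values, and only the pointer moved.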

And if you were to write:
strcpy(string, "This is the string");
which does copy the string literal over the characters pointed at by string, you'd
be in big trouble because it's undefined behavior when you access off the end of your
allocated memory.
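
If you do have to stay with char arrays, one way to avoid running off the end is to check the length before copying. This is only a sketch; copy_if_fits is a hypothetical name, not a standard function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest only if it fits, terminator
// included. Returns false and writes nothing otherwise.
bool copy_if_fits(char* dest, std::size_t dest_size, const char* src) {
    std::size_t needed = std::strlen(src) + 1;   // +1 for the terminating '\0'
    if (needed > dest_size) {
        return false;                            // would overflow: refuse
    }
    std::memcpy(dest, src, needed);
    return true;
}
```

With a 5-char buffer, "This is the string" is refused, while "This" (4 chars plus the terminator) just fits.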

If you wanted to write the first 5 chars of a string, this is how to do it:

#include <iostream>
#include <string>
std::string str("This is a string");
std::cout << str.substr(0, 5) << "\n"; // prints "This "
 
Josh Lessard

Could someone explain for me what is happening here?

A memory leak.
char *string;
string = new char[5];

This dynamically allocates a chunk of memory big enough to hold 5 chars,
and returns a pointer to it (which gets stored in 'string').
string = "This is the string";

This places a pointer to the character string "This is the string" in your
variable 'string', which clobbers the old pointer value, rendering your
previously allocated memory inaccessible.
The output contains (This is the string) even though I only allocated 5
characters in memory with new.

Yes, but you lost your pointer to that memory when you replaced it with a
pointer to "This is the string". Remember, 'string' doesn't hold the
allocated memory, it only holds a pointer to it.

*****************************************************
Josh Lessard
Master's Student
School of Computer Science
Faculty of Mathematics
University of Waterloo
(519)888-4567 x3400
http://www.cs.uwaterloo.ca
*****************************************************
 
Gianni Mariani

Norm said:
Could someone explain for me what is happening here?

char *string;
string = new char[5];
string = "This is the string";
cout << string << std::endl;

The output contains (This is the string) even though I only allocated 5
characters in memory with new.

"char * s" is a "pointer to char". On 32-bit x86 this is a 32-bit number
which could theoretically address anything in memory.


new char[5] says - go find me an unused chunk of memory in my address
space big enough for 5 chars and "allocate it to me" and construct an
array of 5 chars - return the pointer to the first char.

now

s = new char[5]

takes the returned address and places it in s.

"This is the string" says to the compiler: create a const char array
and fill it with the contents "This is the string" plus a terminating ASCII NUL.
The value of "This is the string" is the address of the first
char in the array.

so

s = "This is the string";

REPLACES the address returned by new char[5] in the earlier statement.

Theoretically you have now leaked the memory from new char[5] because it
can no longer be deleted because you no longer have the address.

std::cout << s

applies "ostream & operator << ( ostream &, const char * )" and passes the
address in s, which now points at "This is the string".
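
A small sketch of what that overload does, using std::ostringstream in place of std::cout so the result can be inspected. The two function names are invented for the example.

```cpp
#include <sstream>
#include <string>

// Streams the pointer as a C string: operator<< reads chars until the '\0'.
std::string as_text(const char* s) {
    std::ostringstream out;
    out << s;
    return out.str();
}

// Streams the same pointer as a plain address by converting to const void*,
// which selects the pointer-printing overload instead.
std::string as_address(const char* s) {
    std::ostringstream out;
    out << static_cast<const void*>(s);
    return out.str();
}
```

as_text gives back the characters, while as_address gives some implementation-defined rendering of the address itself. That is why printing s shows the whole literal no matter how much memory was allocated earlier.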

.....

Does this explain enough ?
 
Norm

Ron Natalie said:
char* is not a string type. It is a pointer to a single character.
At this point in your career you should stay clear of it. Use the
std::string type instead.

I know I should be using std::string instead; I am just trying to better
understand C-strings and pointer concepts.
char *string;
string = new char[5];

Allocates an array of 5 char and puts the pointer to the first one in string.
string = "This is the string";

You are not writing over the array of 5 you allocated above. You are assigning
the pointer to the first character string literal (a statically allocated array of as many characters as
needed) .

You've essentially lost (leaked memory) the string you allocated in the previous line
by writing over the remembered address with this one.

And if you were to write:
strcpy(string, "This is the string");
which does copy the string literal over the characters pointed at by string, you'd
be in big trouble because it's undefined behavior when you access off the end of your
allocated memory.

Just so I understand if I were to write this:

char *string;
string = new char[5];
strcpy(string, "This is the string");

It does copy the string to the chars allocated by new, correct? But
in this example since the string does not fit I am getting an undefined
behavior? In my simple example what is being done with the extra char's that
don't fit? Is my compiler just seeing this and trying to fix the problem?
Because when I do (std::cout << string) the output is the entire string
literal.

Here is another question to clarify this concept for me. If I were to do
this:

char *string;
string = new char[50];
strcpy(string, "This is the string");
std::cout << string;
strcpy(string, "This is another string");
std::cout << string;

I am assuming the second strcpy is starting over from the beginning and
overwriting everything from the first strcpy. Would it have been better to
delete and reallocate memory for the new string?

char *string;
string = new char[50];
strcpy(string, "This is the string");
std::cout << string;
delete string;
string = new char[50];
strcpy(string, "This is another string");
std::cout << string;

Thanks for the help everyone!
 
Ron Natalie

Norm said:
It does copy the string to the chars allocated by new, correct?

Yes, strcpy is logically:
char* strcpy(char* to, const char* from) {
    char* ret = to;
    // copy each char, including the terminating '\0', then stop
    while ((*to++ = *from++) != '\0') {}
    return ret;
}

But
in this example since the string does not fit I am getting an undefined
behavior?

Yes, as soon as strcpy attempts to write to string[5], it is undefined
behavior.
In my simple example what is being done with the extra char's that
don't fit?

It attempts to write them in string[5], string[6]... but that's not a defined
operation since those locations have not been allocated.
Is my compiler just seeing this and trying to fix the problem?
Because when I do (std::cout << string) the output is the entire string
literal.

The compiler doesn't know or care. That's the problem with undefined
behavior. It may or may not work at run time. If it printed it, you just
got lucky and the extra characters you wrote didn't hit anything.

All the str* functions and anything that treats char* as if it were a string
just starts marching through memory from the given pointer looking
for the first null character.
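
As an illustration of that "marching", here is roughly what a length function has to do. my_strlen is a made-up stand-in for std::strlen, not the library's actual implementation.

```cpp
#include <cstddef>

// Hypothetical stand-in for std::strlen, for illustration only.
std::size_t my_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {   // keeps going until it finds a null character
        ++n;
    }
    return n;
}
```

Nothing in the loop knows how large the underlying array is. If the terminator is missing, the loop simply keeps reading whatever memory comes next.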

Here is another question to clarify this concept for me. If I were to do
this:

char *string;
string = new char[50];
strcpy(string, "This is the string");
std::cout << string;
strcpy(string, "This is another string");
std::cout << string;

I am assuming the second strcpy is starting over from the beginning and
overwriting everything from the first strcpy. Would it have been better to
delete and reallocate memory for the new string?

No.... As long as you don't run off the 50 characters.
You don't even need to new anything; C++ is not Java.
char string[50];
would have worked just fine and you never have to remember to
delete it.

If you had used std::string, you wouldn't have to worry about running off
the end, and assignment would work as you expect.
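
For comparison, a minimal sketch of the same sequence with std::string (the function name is hypothetical):

```cpp
#include <string>

// Assignment replaces the contents; storage grows as needed.
std::string reassign_demo() {
    std::string s = "This";            // starts short
    s = "This is the string";          // longer value: std::string makes room itself
    s = "This is another string";      // old contents are released automatically
    return s;
}
```

There is no size to choose up front and nothing to delete; the string manages its own memory.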
delete string;

BZZT. More undefined behavior. Anything you allocate with the array form of new
you need to deallocate with the array form of delete.

delete [] string;
string = new char[50];
strcpy(string, "This is another string");
std::cout << string;

Don't forget that you need to delete [] string here if you are done with it.
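
A sketch of the matching pairs, with invented function names. The second version assumes a C++11 compiler and uses std::unique_ptr<char[]>, which nobody in this thread mentioned; it is shown only as one way to avoid having to remember the delete [].

```cpp
#include <cstring>
#include <memory>
#include <string>

// new[] must be paired with delete[].
std::string manual_pairing() {
    char* string = new char[50];
    std::strcpy(string, "This is another string");
    std::string copy(string);     // take a copy before releasing the buffer
    delete[] string;              // array form, matching new char[50]
    return copy;
}

// std::unique_ptr<char[]> calls delete[] when it goes out of scope.
std::string automatic_pairing() {
    std::unique_ptr<char[]> string(new char[50]);
    std::strcpy(string.get(), "This is another string");
    return std::string(string.get());
}
```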
 
Artie Gold

Norm said:
char* is not a string type. It is a pointer to a single character.
At this point in your career you should stay clear of it. Use the
std::string type instead.

I know I should be using std::string instead; I am just trying to better
understand C-strings and pointer concepts.

char *string;
string = new char[5];

Allocates an array of 5 char and puts the pointer to the first one in
string.
string = "This is the string";

You are not writing over the array of 5 you allocated above. You are
assigning to the pointer the address of the first character of a string
literal (a statically allocated array of as many characters as needed).

You've essentially lost (leaked) the array you allocated in the
previous line by writing over the remembered address with this one.

And if you were to write:
strcpy(string, "This is the string");
which does copy the string literal over the characters pointed at by
string, you'd be in big trouble because it's undefined behavior when you
access off the end of your allocated memory.


Just so I understand if I were to write this:

char *string;
string = new char[5];
strcpy(string, "This is the string");

It does copy the string to the chars allocated by new, correct? But
in this example since the string does not fit I am getting an undefined
behavior? In my simple example what is being done with the extra char's that
don't fit? Is my compiler just seeing this and trying to fix the problem?
Because when I do (std::cout << string) the output is the entire string
literal.

Right. You're getting undefined behavior. In this case, you just
happened to get unlucky, and it *seemed* to work correctly.[1]
Here is another question to clarify this concept for me. If I were to do
this:

char *string;
string = new char[50];
strcpy(string, "This is the string");
std::cout << string;
strcpy(string, "This is another string");
std::cout << string;

I am assuming the second strcpy is starting over from the beginning and
overwriting everything from the first strcpy. Would it have been better to
delete and reallocate memory for the new string?

Not at all. You allocated space for 50 chars. It's yours. You may
do what you want with it (but be sure to delete it when you no
longer need it).
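
Here is a small sketch of reusing the same 50 chars. The function name is made up, and a short second string is used on purpose so that the leftover characters are visible.

```cpp
#include <cstring>

// Hypothetical check: the second strcpy starts again at string[0], and the
// new '\0' hides whatever is left over from the first copy.
bool second_copy_starts_at_front() {
    char* string = new char[50];
    std::strcpy(string, "This is the string");
    std::strcpy(string, "Hi");                 // writes 'H', 'i', '\0' over "Thi"

    bool reads_as_hi = (std::strcmp(string, "Hi") == 0);
    bool old_tail_remains = (string[3] == 's');  // the 's' of "This" is still there

    delete[] string;                           // one new[], one delete[]
    return reads_as_hi && old_tail_remains;
}
```

The old characters past the new terminator are harmless: everything that treats the buffer as a string stops at the first '\0'.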
char *string;
string = new char[50];
strcpy(string, "This is the string");
std::cout << string;
delete string;
string = new char[50];
strcpy(string, "This is another string");
std::cout << string;

Not necessary.
HTH,
--ag

[1] The reason *seeming* to work is an unlucky occurrence is that,
particularly with buffer overrun kinds of problems, you're likely to
have overwritten a piece of memory that will result in a crash in
some other part of your program. Situations like that tend to be
devilishly difficult to track down.
 
Ying Yang

Could someone explain for me what is happening here?

char *string;

You have declared a pointer named string to a char type. This pointer is
allocated on the stack.
string = new char[5];

You have made this pointer point to the first element in a constant array of
type char that has 5 elements. This constant array is allocated on the heap
(using the new operator). Note, there is a close association between
pointers and arrays.
string = "This is the string";
cout << string << std::endl;
The output contains (This is the string) even though I only allocated 5
characters in memory with new.

The elements in the constant array is filled with the characters in the
string literal. Two important things to notice here is that (1) the null
character is automatically appended at the end of the array (2) it is
possible to over-write next memory locations as C++ does not do run-time
checks for array indexes that go out of bounds.

Hope this helps.

weeeeeeeeeeee
 
Steven C.

This is a very good explanation.


Norm said:
Could someone explain for me what is happening here?

char *string;
string = new char[5];
string = "This is the string";
cout << string << std::endl;

The output contains (This is the string) even though I only allocated 5
characters in memory with new.

"char * s" is a "pointer to char". On a x86 this is a 32 bit number
which could theoretically address anything in memory.


new char[5] says - go find me an unused chunk of memory in my address
space big enout for 5 chars and "allocate it to me" and construct an
array of 5 chars - return the pointer to the first char.

now

s = new char[5]

takes the returned address and places it in s.

"This is the string" says to the compiler, create an const char array
and place the contents "This is the string" + a terminating ascii nul.
The value of "This is the string" is the address of the first unsigned
char in the array.

so

s = "This is the string";

REPLACES the address returned by new char[5] in the earlier statement.

Theoretically you have now leaked the memory from new char[5] because it
can no longer be deleted because you no longer have the address.

std::cout << s

applies "ostream & operator << ( ostream &, char * )" and passes the
address in s which happens to be "This is the string".

.....

Does this explain enough ?
 
Big Brian

Could someone explain for me what is happening here?
char* is not a string type. It is a pointer to a single character.
At this point in your career you should stay clear of it. Use the std::string type instead.

Everybody needs to understand c style strings as well. I've worked on
projects where you couldn't use std::string. And you never know when
you're going to have to maintain old code which uses them.
 
Jakob Bieling

string = new char[5];

You have made this pointer point to the first element in a constant array of
type char that has 5 elements. This constant array is allocated on the heap
(using the new operator). Note, there is a close association between
pointers and arrays.

Why are you saying 'constant array'? The array of chars that is
allocated by the statement above is not constant at all.
The elements in the constant array is filled with the characters in the
string literal.

No, not at all. The elements of the array you allocated with 'new[]' are
untouched and the pointer to the first element of that array is lost. See
Gianni's post for an in-depth explanation.
Two important things to notice here is that (1) the null
character is automatically appended at the end of the array

Using the above statement, no. This applies when using any of the str*
functions, like strcpy, strcat and whatnot.

regards
 
John Carson

Norm said:
Just so I understand if I were to write this:

char *string;
string = new char[5];
strcpy(string, "This is the string");

It does copy the string to the chars allocated by new,
correct? But in this example since the string does not fit I am
getting an undefined behavior? In my simple example what is being
done with the extra char's that don't fit? Is my compiler just seeing
this and trying to fix the problem? Because when I do (std::cout <<
string) the output is the entire string literal.

The point is that there are two memory allocation mechanisms available and
you are assuming that there is only one, namely the one provided by new.

You could write:

char *string;
string = "This is the string";
cout << string << std::endl;

without any use of new and it would be fine. The compiler will put "This is
the string" in statically allocated memory, i.e., memory reserved when the
program first starts up. It then makes the string variable point to the
start of this statically allocated memory. Statically allocated memory is
the same as what you get if, say, you write

int x;

at global scope.
Here is another question to clarify this concept for me. If I were to
do this:

char *string;
string = new char[50];
strcpy(string, "This is the string");
std::cout << string;
strcpy(string, "This is another string");
std::cout << string;

I am assuming the second strcpy is starting over from the beginning and
overwriting everything from the first strcpy. Would it have been
better to delete and reallocate memory for the new string?

No, that just makes extra work for no gain.
 
Jakob Bieling

John Carson said:
Norm said:
Just so I understand if I were to write this:

char *string;
string = new char[5];
strcpy(string, "This is the string");

It does copy the string to the chars allocated by new,
correct? But in this example since the string does not fit I am
getting an undefined behavior? In my simple example what is being
done with the extra char's that don't fit? Is my compiler just seeing
this and trying to fix the problem? Because when I do (std::cout <<
string) the output is the entire string literal.

The point is that there are two memory allocation mechanisms available and
you are assuming that there is only one, namely the one provided by new.

You could write:

char *string;
string = "This is the string";
cout << string << std::endl;

without any use of new and it would be fine. The compiler will put "This is
the string" in statically allocated memory, i.e., memory reserved when the
program first starts up. It then makes the string variable point to the
start of this statically allocated memory.

You should note that memory allocated like this is constant, meaning
that you may not change it. I do not remember why the following compiles:

char* tyt = "hello world";
tyt [0] = 'H';

but when running it, it will give you an access violation. IMO, the
string literal should be of type 'char const*', but for some reason it seems
not to be.

regards
 
Kevin Goodsell

Jakob said:
You should note that memory allocated like this is constant, meaning
that you may not change it. I do not remember why the following compiles:

char* tyt = "hello world";
tyt [0] = 'H';

Basically it's for historical reasons. C allowed literals to be modified
way back in the day. Implicit conversion of a string literal to a
(non-const) char * is deprecated in C++. You should always use const
char * for this purpose.
but when running it, it will give you an access violation. IMO, the
string literal should be of type 'char const*', but for some reason it seems
not to be.

Actually, the type of a string literal is const char[N] where N is the
number of characters including the null terminator. So it *is* const,
but it's a special case with that implicit conversion.
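
The type can be checked at compile time. This sketch assumes a C++11 compiler for decltype and static_assert, which postdate this thread.

```cpp
#include <type_traits>

// A string literal is an lvalue of type const char[N], so decltype yields a
// reference to that array type. "hello world" is 11 characters plus '\0'.
static_assert(std::is_same<decltype("hello world"), const char (&)[12]>::value,
              "literal type is const char[12]");
static_assert(sizeof("hello world") == 12, "size counts the terminator");

// The const-correct way to point at a literal.
const char* tyt = "hello world";
```

With const char * the compiler rejects tyt[0] = 'H' outright, instead of leaving it to fail (or not) at run time.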

-Kevin
 
Gianni Mariani

You should note that memory allocated like this is constant, meaning
that you may not change it. I do not remember why the following compiles:

char* tyt = "hello world";
tyt [0] = 'H';

but when running it, it will give you an access violation. IMO, the
string literal should be of type 'char const*', but for some reason it seems
not to be.

It may or may not fault - it's undefined.

The reason why tyt does not need to be a "const char *" is there is a
special C++ rule for conversion of a string literal to a "char *" to
deal with all the bad code out there.
 
Kevin Goodsell

Big said:
Everybody needs to understand c style strings as well.

Yes, but they should avoid them. Beginners and experts alike should
prefer std::string.
I've worked on
projects where you couldn't use std::string.

So, projects where you weren't allowed to use C++? (At least not all of
it.) I wonder why that restriction would be placed on a project written
in C++.
And you never know when
you're going to have to maintain old code which uses them.

Yeah. When maintaining that kind of code, I recommend converting it to
use std::string.

The advantages of std::string over C-style strings are clear. Of course
a good programmer should know how to use messy things like C-style
strings, but he should also know better than to use them (at least most
of the time).

-Kevin
 
A

string = new char[5];

You have made this pointer point to the first element in a constant
array
of
type char that has 5 elements. This constant array is allocated on the heap
(using the new operator). Note, there is a close association between
pointers and arrays.

Why are you saying 'constant array'? The array of chars that is
allocated by the statement above is not constant at all.

Array identifiers are constant so you cannot assign a value to array
identifiers - you can only assign values to the elements in the array.
The elements in the constant array is filled with the characters in the
string literal.

No, not at all. The elements of the array you allocated with 'new[]' are
untouched and the pointer to the first element of that array is lost. See
Gianni's post for an in-depth explanation.
Incorrect.
Two important things to notice here is that (1) the null
character is automatically appended at the end of the array

Using the above statement, no. This applies when using any of the str*
functions, like strcpy, strcat and whatnot.

Incorrect.

char myString[] = "Hello World";

The null character is automatically included here.
 
Kevin Goodsell

A said:
Array identifiers are constant so you cannot assign a value to array
identifiers - you can only assign values to the elements in the array.

But there were no arrays in the example, other than the dynamically
allocated one (which doesn't have an identifier). Neither the pointer
nor the thing pointed to were const in the example.
No, not at all. The elements of the array you allocated with 'new[]' are
untouched and the pointer to the first element of that array is lost. See
Gianni's post for an in-depth explanation.


Incorrect.

He was precisely correct. You seem to have a lot to learn about C++.
There is absolutely no copying of array elements happening here. All
that happens is a pointer being modified to point to a different
address, thus losing the address of the dynamic array - in other words,
leaking the memory.
Incorrect.

He's right. You are not.
char myString[] = "Hello World";

The null character is automatically included here.

Yes, but this example has nothing to do with your (blatantly incorrect)
statements, and it does nothing to disprove Jakob's (correct) statements.

(As a side note, here we have another example of the difference between
assignment and initialization. The above initialization automatically
copies the elements of the string literal into the myString array. This
cannot be duplicated with an assignment.)
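
A sketch of that difference, with invented function names:

```cpp
#include <cstddef>
#include <cstring>

// Initialization copies the literal, terminator included, into the array.
std::size_t initialized_size() {
    char myString[] = "Hello World";      // 11 characters + '\0' = 12 chars
    return sizeof myString;
}

// An array cannot be assigned afterwards; its elements must be copied.
bool copy_after_declaration() {
    char myString[12];
    // myString = "Hello World";          // would not compile: arrays are not assignable
    std::strcpy(myString, "Hello World"); // fits exactly, terminator included
    return std::strcmp(myString, "Hello World") == 0;
}
```

The commented-out line is the array counterpart of the pointer assignment in the original question: with a pointer it compiles and merely repoints, with an array it does not compile at all.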

-Kevin
 
Jack Klein

Yes, but they should avoid them. Beginners and experts alike should
prefer std::string.


So, projects where you weren't allowed to use C++? (At least not all of
it.) I wonder why that restriction would be placed on a project written
in C++.

There are many existing applications that were written in a dialect of
C++ before there was a std::string or an ISO/IEC standard for C++. If
these are viable, they often need to be maintained and extended,
sometimes for very many years.

Not every project includes the luxury of starting with a clean sheet
of paper.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Kevin Goodsell

Jack said:
There are many existing applications that were written in a dialect of
C++ before there was a std::string or an ISO/IEC standard for C++. If
these are viable, they often need to be maintained and extended,
sometimes for very many years.

Not every project includes the luxury of starting with a clean sheet
of paper.

Yeah, I know. But I think it's unfortunate when projects that *could*
use modern C++ don't, and do things the hard way instead. I question the
validity of the reasons for this. Is there really a good reason to
forbid, e.g., templates in a C++ program in this day and age?

-Kevin
 
