So, as the FAQ says:
- A string literal can be:
  - An array initializer
  - An unnamed static array of chars
Right. When it is not an array initializer, it always produces
an anonymous array. This array has "static duration" -- meaning
it is valid during the entire execution of the program -- and may
or may not reside in physically-read-only storage. If it is
physically read-only, attempts to change it will fail:
char *p = "mellow";
*p = 'y';
may leave *p unchanged, and/or may trap at runtime. The effect is
formally undefined, and a compiler may well get confused and *think*
it changed even if it did not. It might even present you with
conflicting evidence, showing that it has both changed and not
changed:
printf("*p is '%c'; p is '%s'\n", *p, p);
might print:
*p is 'y'; p is 'mellow'
Here the compiler "knows" it just set *p to 'y', so it can replace
*p with 'y' in the call to printf(); but printf's %s format reads
what is really still there in read-only memory, which is an 'm'.
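For contrast, here is a minimal sketch of the array-initializer case, where the write is well defined because the literal's characters are copied into an ordinary modifiable array. The function name make_yellow and its caller-supplied buffer are made up for illustration.

```c
#include <string.h>

/* Hypothetical helper: builds "yellow" from "mellow" in a modifiable
   array and copies it to out, which must hold at least 7 chars. */
void make_yellow(char *out)
{
    char a[] = "mellow";       /* array initializer: a is our own 7-char array */
    const char *p = "mellow";  /* anonymous array: const makes misuse a diagnostic */

    a[0] = 'y';                /* fine: modifies the copy held in a */
    /* p[0] = 'y';                a compiler must diagnose this line */

    (void)p;                   /* p exists only to show the declaration */
    strcpy(out, a);
}
```

Declaring the pointer as const char * costs nothing and lets the compiler catch the problematic write, rather than leaving it as undefined behavior at runtime.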
So my question is:
If I declare:
char *myString="Whatever";
and later in the code:
myString="Another Static Array of chars";
will the pointer value be the same?
No -- in this case, the two values stored in myString *must* be
distinct and compare not equal. More specifically, after:
char *p1 = "one such array";
char *p2 = "something different";
it must definitely be the case that p1 != p2. However, in:
char *p3 = "we will see this again";
char *p4 = "we will see this again";
the compiler is allowed, but not required, to use a single array
to hold both strings, so that p3==p4 and p3!=p4 are *both* allowed.
Aside from the fact that the anonymous array is (at least in
principle) read-only -- and of course anonymous (unnamed) -- these
four string literals are just shorthand for:
static char __compiler_string_number_000001[] = "one such array";
static char __compiler_string_number_000002[] = "something different";
static char __compiler_string_number_000003[] = "we will see this again";
char *p1 = __compiler_string_number_000001;
char *p2 = __compiler_string_number_000002;
char *p3 = __compiler_string_number_000003;
Whether p4 is __compiler_string_number_000004 or
__compiler_string_number_000003 again is up to the compiler. Asking
whether p3==p4 is the same, in this case, as asking whether
__compiler_string_number_000003 was re-used.
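A small sketch of how one might observe this on a particular compiler. The two function names are hypothetical, and the result of the first one can differ between compilers or even between optimization levels.

```c
/* Hypothetical helper: returns 1 if this compiler used one array for
   both identical literals, 0 if it used two. Either answer is allowed. */
int literals_shared(void)
{
    const char *p3 = "we will see this again";
    const char *p4 = "we will see this again";

    return p3 == p4;
}

/* Hypothetical helper: literals with different contents, neither being
   the tail of the other, cannot share storage, so this returns 1. */
int literals_distinct(void)
{
    const char *p1 = "one such array";
    const char *p2 = "something different";

    return p1 != p2;
}
```

Portable code should not depend on the value of literals_shared(); comparing contents with strcmp() is the reliable test.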
It would make more sense if these were:
static const char __compiler_string_number_000001[] = ...
but the 1989 C standard left the type unqualified (non-"const")
for backwards compatibility with all those C compilers that came
before it, where there was no "const" keyword.
Finally, in the original example, we had, in effect:
p1 = "some string that ends with a specific word";
followed by:
p2 = "specific word";
In this case, a compiler is allowed -- but again not required -- to
act more or less as if we had written:
static char __compiler_string_number_000042[] =
  "some string that ends with a specific word";
/*           1         2         3         4   */
/* 0123456789012345678901234567890123456789012 */
p1 = __compiler_string_number_000042;
p2 = __compiler_string_number_000042 + 29;
As you can see from the comment (assuming you are viewing this in
a fixed-width font), the text "specific word" occurs at offset 29
within the string "some string that ends with a specific word".
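The offset can also be checked mechanically. This is only a sketch; offset_of is a made-up helper built on the standard strstr() function.

```c
#include <string.h>

/* Hypothetical helper: offset of the first occurrence of needle
   within hay, or -1 if needle does not occur. */
long offset_of(const char *hay, const char *needle)
{
    const char *hit = strstr(hay, needle);

    return hit ? (long)(hit - hay) : -1L;
}
```

Calling offset_of("some string that ends with a specific word", "specific word") gives 29, matching the comment. Whether p2 actually equals p1 + 29 in a given program is still up to the compiler.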
(This optimization turns out to be relatively easy to make -- one
just needs to collect all strings, group them by size, and then
"compare backwards" to see if any one string is an exact match for
the end of any equal-length-or-longer string. This does not catch
all possible matches due to the ability to embed '\0' characters
in a string literal, but the algorithm can be augmented if desired.
Note that counted-length strings, which are being discussed in a
separate ongoing thread in comp.lang.c now, do not lend themselves
as easily to this string-sharing technique. There are two main
ways to do counted-length strings, one of which prohibits sharing
entirely and the other of which affords even more opportunities for
sharing, so that a simple tail-match algorithm is insufficient.)
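The tail-match idea can be sketched as follows. This is a simple quadratic version under stated assumptions, not what any particular compiler does: it skips the grouping-by-size step, uses strlen() and so ignores embedded '\0' characters, and the names is_tail_of and find_hosts are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if tail matches the end of whole, comparing from the
   last character backwards; returns 0 otherwise. */
int is_tail_of(const char *tail, const char *whole)
{
    size_t tl = strlen(tail), wl = strlen(whole);

    if (tl > wl)
        return 0;
    while (tl > 0) {
        if (tail[--tl] != whole[--wl])
            return 0;
    }
    return 1;
}

/* For each strs[i], sets host[i] to the index of an equal-length-or-longer
   string whose tail can supply strs[i], or to i if it needs its own storage. */
void find_hosts(const char *const strs[], size_t n, size_t host[])
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        size_t li = strlen(strs[i]);

        host[i] = i;
        for (j = 0; j < n; j++) {
            size_t lj = strlen(strs[j]);

            if (j == i)
                continue;
            /* Among equal-length candidates only a lower index may host,
               so two identical strings never end up hosting each other. */
            if (lj < li || (lj == li && j > i))
                continue;
            if (is_tail_of(strs[i], strs[j])) {
                host[i] = j;
                break;
            }
        }
    }
}
```

A string whose host is itself gets real storage; every other string becomes a pointer into its host at offset strlen(host) - strlen(string). A host may itself be hosted by a longer string, so a real implementation would follow the chain to its end.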
Another question:
If I write:
char *MyString1="Dog";
char *MyString2="Cat";
MyString1=MyString2;
What happens to the memory initially referenced by MyString1 ("Dog")? Does it
remain in memory (wasted memory), or is it cleared in some way?
It remains in memory, because a static-duration array exists until
the program exits (and maybe even after that; the C standard
necessarily says nothing about what happens before and after a
program runs).
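One practical consequence of static duration, shown as a small sketch: a pointer to a string literal stays valid after the function that produced it returns. The function animal_name is hypothetical.

```c
/* Hypothetical helper: returns a pointer to a string literal. The
   pointer remains valid after the return because the anonymous
   array has static duration. */
const char *animal_name(int is_dog)
{
    const char *name = "Dog";

    if (!is_dog)
        name = "Cat";  /* the "Dog" array still exists, just unreferenced here */
    return name;
}
```

Returning a pointer to a local array such as char buf[] = "Dog"; would not be safe in the same way, since that array ceases to exist when the function returns.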
As for whether this memory is "wasted", ask yourself this question:
if the memory containing the string were to be used for something
else, how would it (a) get set up initially, and (b) get "restored"
if the function in question were re-entered? For instance:
#include <stdio.h>  /* for puts() */
void f(void) {
    char *s;
    s = "hello";
    puts(s);
    s = "world";
    puts(s);
}
void g(void) {
    f();
    f();
}
Each time f() is called, the strings "hello" and "world" must be
copied to stdout (and a newline added after each, as puts() does).
If the six bytes that hold {'h', 'e', 'l', 'l', 'o', '\0'} were
ever replaced with another six bytes, who or what would put back
the original h e l l o \0 sequence? Would that take code and/or
data space in the program? Would it take *more* code and/or data
space than you might save by overwriting the original string?