arnuld said:
I thought ASCII values were the same on all platforms.
The ASCII values are the same, but not all platforms use ASCII values
for characters.
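As a concrete illustration (a minimal sketch; 193 is the EBCDIC code
for 'A', as used natively on platforms such as z/OS):

    #include <stdio.h>

    int main(void)
    {
        /* 'A' has whatever value the execution character set
           assigns it: 65 under ASCII, 193 under EBCDIC. */
        printf("'A' == %d\n", 'A');
        return 0;
    }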
I don't get it. Why does ASCII exist?
To standardize the method used in the United States to represent
English text. There's also a completely different system called
EBCDIC which is still in use on many systems. ASCII had to be extended
significantly to be of any use in most of Europe, where most languages
have various special characters that don't occur in English. ISO
created 15 extended versions of ASCII called ISO 8859-1 through ISO
8859-16 (ISO 8859-12 was abandoned). However, extending ASCII is a
totally inadequate strategy for most East Asian languages, where the
number of different characters can run into the thousands. There are
hundreds of different character encodings in use somewhere in the
world, and dozens that are pretty common.
Unicode is a pretty popular standard that was created with the goal of
cleaning up this mess. It assigns unique code points to each of more
than 100,000 characters from a very large and diverse set of
languages, and it has lots of unused code points that have been
reserved for additional characters, if needed. UTF-8 is a popular
variable-length encoding for Unicode code points: some characters
require only a single byte, others require multiple bytes; all of the
characters that can be represented in ASCII are represented by a
single byte with the same value it has in ASCII. There's also a
16-bit variable-length encoding (UTF-16), a 32-bit fixed-width
encoding (UTF-32), and a few other alternatives as well.
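A quick sketch of that, assuming a compiler whose execution character
set is UTF-8 (the default for gcc and clang); each string below holds
one character, but a different number of bytes (the \u notation is
explained just below):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* One character each, but 1, 2, and 3 bytes under UTF-8: */
        printf("%zu\n", strlen("A"));      /* U+0041 LATIN CAPITAL A */
        printf("%zu\n", strlen("\u00E9")); /* U+00E9 e with acute    */
        printf("%zu\n", strlen("\u20AC")); /* U+20AC euro sign       */
        return 0;
    }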
C has a certain limited amount of support for Unicode. Section 6.4.3
describes Universal Character Names (UCNs), such as \u0B4D (the \u
form takes four hex digits; \U takes eight). According to Annex D,
\u0B4D represents a character from the Oriya language. UCNs can
appear in identifiers, character constants, and string literals. The
intent was that editors would be created that could display some or
all of the Unicode characters that are not in the basic C character
set: any UCN corresponding to a character the editor knew how to
display would be shown as that character, and any character it could
not display would be left visible as a UCN. I have no idea whether
such editors were ever actually written; I've never had a need to use
a UCN.
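A minimal sketch of a UCN in use (this assumes a compiler whose
execution character set is UTF-8; under that assumption the string
ends up holding the three-byte UTF-8 encoding of the character):

    #include <stdio.h>

    int main(void)
    {
        /* \u0B4D names the character by its Unicode code point;
           the compiler translates it into the execution charset. */
        const char *s = "\u0B4D";
        printf("%s\n", s);
        return 0;
    }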
....
I always thought that everything inside a C program is saved as
characters and those characters are converted to their ASCII
equivalents during compilation.
The encoding used for a C source code file is not required to be
ASCII. In general, characters from the input file do not get copied as
such to the output file, except when they occur inside a character
constant, a string literal, or an identifier with external linkage.
Even then, many such characters are transformed in various ways during
translation of the program. For instance, inside the string literal
"Ding\07!", the three characters '\', '0', and '7' will (generally)
cause a single byte with a value of 7 to be stored in the executable.
Escape sequences like '\n' and UCNs cause similar transformations to
occur. What finally gets written to the output file will use the
encoding for the execution character set, which is also not required
to be ASCII, and which might not be the same as the encoding used in
the source file.
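To make the "Ding\07!" example concrete: eight characters appear
between the quotes in the source file, but they produce only six
array elements (plus the terminating null), because '\', '0', and '7'
collapse into a single byte with the value 7:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *s = "Ding\07!";
        printf("length: %zu\n", strlen(s)); /* prints 6 */
        printf("s[4] == %d\n", (int)s[4]);  /* prints 7 */
        return 0;
    }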