Difference between '\0' and 0

August Karlstrom

Emmanuel said:
Robert Gamble wrote on 29/05/05 :



'nil pointer' is a generic term used in Computer Science. Its
C implementation is the null pointer constant, or NULL.

Thanks for the backup, this is exactly what I was trying to say.

-- August
 
Peter Nilsson

Robert said:
You said that literal 0 can denote the "nil pointer". There is no nil
pointer in C, but there is a NULL pointer constant

There are null pointer constants which become null pointers when
converted to a pointer. NULL is an implementation defined null pointer
constant available when you include certain headers.
which can be expressed as a literal 0 in a pointer context.

Any integer constant with zero value is a null pointer constant, by
definition.

People, there's a whole section of the FAQ devoted to this! ;)
 
August Karlstrom

Emmanuel said:
August Karlstrom wrote on 29/05/05 :



It's true that an int is not a char. Why in the world are you using a
char? The cases where a single char is necessary are extremely rare
(only scanf("%c", &c) comes to mind, and scanf() is not a recommended
function...)

You're right, this issue is more of theoretical interest.
That said, Lint seems to consider '\0' a char, which is wrong in
C. (Or maybe Lint is checking in C++ mode; in that case 0 is an int and
'\0' is a char.) Your extension seems to be .c, which is correct for a
C program. Check the Lint configuration.

Splint is a C only checker.

-- August
 
CBFalconer

Emmanuel said:
Dik T. Winter wrote on 29/05/05 :


Arf! Who has checked the checker ?

You are missing the point. Splint doesn't have to handle character
constants correctly, or anything else for that matter. It is NOT a
compiler. It is a tool, whose purpose is to point out questionable
or problematic constructs. Further action is up to the programmer
and his compiler.

A better comparison is a spell checker, which probably will
highlight things that smell suspiciously like a mis-spelling. It
isn't expected to be right all the time.

--
Some informative links:
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html
 
Chris Croughton

A digression: In my opinion, C depends too heavily on implicit
conversions. If I were designing the language from scratch, most
conversions that are implicit in C would probably be explicit -- but
most of them would be expressed by some mechanism less heavyweight
than a cast. The resulting code would probably be more verbose than
typical C. If you like extreme terseness, be glad that I don't
actually design programming languages.

I would have no problem with you designing such a language, since that
language would not be C and so I wouldn't be using it <g>.

But given the actual design of
C, it almost always makes sense to use implicit conversions rather
than explicit casts, which tend to swat flies with sledgehammers.
Implicit conversions are bad from a language design point of view, but
good (or at least far better than the alternative) from a C
programming point of view.

If I want a strongly typed language I know where to find Ada <g>...

Chris C
 
August Karlstrom

Chris said:
I would have no problem with you designing such a language, since that



If I want a strongly typed language I know where to find Ada <g>...

Or why not Oberon-2.

-- August
 
Chris Croughton

Or why not Oberon-2.

Not powerful enough. For instance, one of the features I like in Ada is
declaring variables with specific ranges (-3..7 etc.), so that they
can't go out of bounds without throwing an exception. Oberon-2 was, I
gather, created as a language for teaching, like Pascal (but later, so it
learnt from some of the pitfalls of early Pascal); Ada was designed for
Real World(tm) tasks.

(That's not a criticism of Oberon-2, which is fine for its purpose, it's
just not as good for my purposes...)

Chris C
 
Peter Nilsson

Old said:
Is that correct?

In strict terms, no.
I know that '\xFF' is implementation-defined
as to whether it is -1 or 255 ..

The value is the value c would have if it were evaluated as
in...

unsigned char x = 255;
int c = * (char *) &x;

In other words, the byte value of the constant is interpreted
as a (plain) char, and subsequently converted to an int.

It is implementation defined whether plain char is signed or
unsigned.

On implementations where CHAR_BIT > 8, and on implementations
where plain char is unsigned, the value is 255.

On 8-bit, signed plain char, ones complement or signed
magnitude machines, the value is effectively unspecified
under C90. C99 limits the representation of negative
integers more thoroughly, so the value is either -1, -0 or
-127 for 2c, 1c and sm representations respectively.

It is the view of many clc regulars that 1c and sm machines
must have unsigned plain char, if the implementation is
to have any QoI. However, there is nothing in the standard
which precludes low QoI implementations.
 
August Karlstrom

Chris said:
Not powerful enough. For instance, one of the features I like in Ada is
declaring variables with specific ranges (-3..7 etc.), so that they
can't go out of bounds without throwing an exception. Oberon-2 was, I
gather, created as a language for teaching, like Pascal (but later, so it
learnt from some of the pitfalls of early Pascal); Ada was designed for
Real World(tm) tasks.

(That's not a criticism of Oberon-2, which is fine for its purpose, it's
just not as good for my purposes...)

No, this is a quite common misconception that is probably one of the
reasons the language isn't more widespread than it is. Oberon/Oberon-2
is not merely a teaching language. It was used to build an entire
operating system, namely Oberon. In Oberon-2 you isolate the potentially
dangerous low level code in so called SYSTEM modules. If you import the
SYSTEM pseudo module you can do *anything*. I recommend reading Stefan
Metzeler's article at
http://www.amadeus-3.com/main_files/oberon2vsCPP.php (although he is
biased in favour of the product he sells, he has some good points).

-- August
 
Malcolm

Chris Croughton said:
I would have no problem with you designing such a language, since that
language would not be C and so I wouldn't be using it <g>.
Here's what happens in Java.

We want to attach an integer to a generic structure (let's say a tree).

To retrieve our integer:

Object obj = tree.getMyObject();
if (obj instanceof Integer) {
    int x = ((Integer) obj).intValue();
}


in C

void *ptr = tree_getmyobject(tree);
if(ptr)
x = *(int*) ptr;

Java can offer somewhat better safety, but this is largely illusory. If
there is no object both C and Java can return null, whilst if the tree has
been corrupted in some way so that it no longer holds integers, the Java
program will treat it the same as a null (unless you add more lines of code)
whilst the C program will be corrupted and most probably crash a few lines
later. Neither is ideal because, once your program has encountered an error,
there is no ideal way of dealing with the situation.
 
Dik T. Winter

>
> Is that correct? I know that '\xFF' is implementation-defined
> as to whether it is -1 or 255 ..

Indeed, I see now that "the value is the one that results when an object
with type char whose value is that of the single character or escape
sequence is converted to type int". So there is a good reason SPlint
does not see the difference between '\377' and EOF on some systems.
 
Joe Wright

Old said:
Is that correct? I know that '\xFF' is implementation-defined
as to whether it is -1 or 255 ..

Fascinating. 'Splain me this Batman.

C:\work\c\clc>cat ow.c
#include <stdio.h>
int main(void) {
printf("%d %d %d %d\n", '\xff', 0xff, 0377, '\377');
return 0;
}

C:\work\c\clc>ow
-1 255 255 -1

I thought they'd print the same thing. Is this implementation-defined as
well?
 
Chris Croughton

(Note followups to poster...)


Which Oberon-2 doesn't support (and C, C++ and most other languages
don't support either, of course).

Other things:

Strings can't contain quote marks (no 'escape' like backslash). It
isn't clear from the specification whether they can contain
non-printing characters like tab and newline.

No unsigned types (implicit 2's-complement arithmetic?). No bit
operators except (sort of) in SYSTEM.

Even less specification of sizes of integer types than in C90 (as far
as I can see everything from LONGREAL to SHORTINT could be a single
bit!).

Files? I/O in general?
No, this is a quite common misconception that is probably one of the
reasons the language isn't more widespread than it is. Oberon/Oberon-2
is not merely a teaching language. It was used to build an entire
operating system, namely Oberon.

I wonder how, I see nothing in the specification:

http://www.zel.org/aos/o2report.htm

which even mentions input or output in any standard way. SYSTEM defines
procedures for accessing 'memory' directly, but nothing to access I/O
(not all I/O is memory-mapped!). Presumably all of the "nasty bits" are
written in C or assembler and then linked in (in some unspecified way).

(All other copies of the specification seem to be variants of the one I
quoted.)
In Oberon-2 you isolate the potentially
dangerous low level code in so called SYSTEM modules. If you import the
SYSTEM pseudo module you can do *anything*. I recommend reading Stefan
Metzeler's article at
http://www.amadeus-3.com/main_files/oberon2vsCPP.php (although he is
biased in favour of the product he sells, he has some good points).

He's very biased against C and C++, certainly. His Deutsche Bank
example is just stupid, why they bothered to use C++ at all I don't
understand and that sort of draconian limitation is certainly not
necessary to writing safe and maintainable C or C++ programs. Similarly
with his Microsoft example, the fact that MS write unsafe code is
nothing to do with using C++, it's because (a) they have to try to keep
backwards compatibility with systems which were never designed for
security (right back to MSDOS!) and (b) because they have a design goal
of "more features" (and especially flashy ones).

His rant against C++ boils down to "I don't like C++" and "I like
Oberon-2". Some of his statements are just plain wrong (his section on
"local procedures", with the idea of putting a for loop into a local
procedure to declare the loop variable -- huh? Either he means a local
block, which C and C++ have, or he's really confused).

And what on earth is this term 'accolades' used for braces (squiggly
brackets, {})? The dictionary definition of 'accolade' is "An
expression of approval; praise. A ceremonial embrace, as of greeting or
salutation. Ceremonial bestowal of knighthood." (American Heritage
Dictionary, via http://dictionary.reference.com/). He uses the (spit!)
K&R style as an example of how difficult it is to line them up, ignoring
styles which make it easy to do so.

Chris C
 
Dietmar Schindler

Keith said:
... But since even a reference
to a char object:
char c = '\0';
char x = c;
is promoted to int before being converted back to char, ...

ISO/IEC 9899 (Committee Draft -- January 18, 1999):
6.5.16 Assignment operators

[#3] ... The type of an assignment expression is
the type of the left operand unless the left operand has
qualified type, in which case it is the unqualified version
of the type of the left operand. ...

6.5.16.1 Simple assignment

[#2] In simple assignment (=), the value of the right
operand is converted to the type of the assignment
expression and replaces the value stored in the object
designated by the left operand.
 
Dietmar Schindler

Joe said:
Fascinating. 'Splain me this Batman.

C:\work\c\clc>cat ow.c
#include <stdio.h>
int main(void) {
printf("%d %d %d %d\n", '\xff', 0xff, 0377, '\377');
return 0;
}

C:\work\c\clc>ow
-1 255 255 -1

I thought they'd print the same thing. Is this implementation-defined as
well?

I'm not Batman, but I'll try regardless.
ISO/IEC 9899 (Committee Draft -- January 18, 1999):
6.4.4.4 Character constants

[#10] ... If an integer character constant
contains a single character or escape sequence, its value is
the one that results when an object with type char whose
value is that of the single character or escape sequence is
converted to type int.

[#13] EXAMPLE 2 Consider implementations that use two's-
complement representation for integers and eight bits for
objects that have type char. In an implementation in which
type char has the same range of values as signed char, the
integer character constant '\xFF' has the value -1; if type
char has the same range of values as unsigned char, the
character constant '\xFF' has the value +255 .
 
Tim Rentsch

Keith Thompson said:
I wouldn't say that Splint (the particular lint implementation that
generates the message) is wrong to issue a warning. It's not required
to limit itself to complaining about violations of the standard. It's
diagnosing (what its authors see as) a style issue -- which is part
of what lint is supposed to do. I happen to agree with it in this
case; initializing a char object with '\0' is clearer than using 0,
even though it's semantically identical.



In fact, types int and char are compatible for assignment (but not for
some other purposes).

Just a minor nit. The word 'compatible' has a specific meaning in the
C standard document (section 6.2.7). In that sense of the word, the
types here are not compatible. So the wording of the message seems
technically correct, even if it may be misleading.

I basically agree with everything else you said. It would be nice if
there were standard language for when types are "assignable"; even
nicer would be if the standard language were consistent with common
usage. Using "compatible" for a circumstance that really means
something more like "equivalent" seems an unfortunate choice, at least
in retrospect.

(And this doesn't even address another kind of "compatibility", like
pointers to different structure types, where types are guaranteed to
have the same representation and alignment requirements, yet are
neither compatible nor assignable.)
 
Keith Thompson

Dietmar Schindler said:
Keith said:
... But since even a reference
to a char object:
char c = '\0';
char x = c;
is promoted to int before being converted back to char, ...

ISO/IEC 9899 (Committee Draft -- January 18, 1999):
6.5.16 Assignment operators

[#3] ... The type of an assignment expression is
the type of the left operand unless the left operand has
qualified type, in which case it is the unqualified version
of the type of the left operand. ...

6.5.16.1 Simple assignment

[#2] In simple assignment (=), the value of the right
operand is converted to the type of the assignment
expression and replaces the value stored in the object
designated by the left operand.

C99 6.3.1.1 p2, p3 says:

The following may be used in an expression wherever an int or
unsigned int may be used:

-- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.

-- A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise, it is converted to an unsigned
int. These are called the integer promotions.48) All other types
are unchanged by the integer promotions.

It's not clear to me whether the integer promotions are applied to the
RHS of an assignment operator. If they are, then "the value of the
right operand" would refer to the value after conversion. If not, it
refers to a value of type char.
 
Tim Rentsch

Keith Thompson said:
Dietmar Schindler said:
Keith said:
... But since even a reference
to a char object:
char c = '\0';
char x = c;
is promoted to int before being converted back to char, ...

ISO/IEC 9899 (Committee Draft -- January 18, 1999):
6.5.16 Assignment operators

[#3] ... The type of an assignment expression is
the type of the left operand unless the left operand has
qualified type, in which case it is the unqualified version
of the type of the left operand. ...

6.5.16.1 Simple assignment

[#2] In simple assignment (=), the value of the right
operand is converted to the type of the assignment
expression and replaces the value stored in the object
designated by the left operand.

C99 6.3.1.1 p2, p3 says:

The following may be used in an expression wherever an int or
unsigned int may be used:

-- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.

-- A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise, it is converted to an unsigned
int. These are called the integer promotions.48) All other types
are unchanged by the integer promotions.

It's not clear to me whether the integer promotions are applied to the
RHS of an assignment operator. If they are, then "the value of the
right operand" would refer to the value after conversion. If not, it
refers to a value of type char.

It seems as though the integer promotions are not applied to the RHS
of an assignment operator. Footnote 48 says

"The integer promotions are applied only: as part of the usual
arithmetic conversions, to certain argument expressions, to the
operands of the unary +, -, and ~ operators, and to both operands of
the shift operators, as specified by their respective subclauses."

The descriptions of other operators explicitly cite the usual
arithmetic conversions (eg, 6.5.6 p4). Furthermore, looking up "usual
arithmetic conversions" in the index, it lists: 6.3.1.8, 6.5.5,
6.5.6, 6.5.8, 6.5.9, 6.5.10, 6.5.11, 6.5.12, 6.5.15. No mention of
assignment (6.5.16).
 
