Unsigned types are DANGEROUS??

MikeP

MikeP said:
You didn't read what I wrote: I said in a lot of places (maybe "a
lot" is too strong).


Think about it correctly instead of in terms of overflow. Currently
unsigned does not give the required semantics you have been arguing for,
hence something is lacking: the complement integer type to signed.
It's a void being filled by what is available, and neither of those
alternative types is behaviorally correct for the required usage.

And loop counters shouldn't be modular. Nor overflow-checked probably.
 
Paul

Peter Remmers said:
On 17.03.2011 12:22, Paul wrote:

You don't care. You don't tell what it simply is. You tell everyone what
is your distorted view of the matter. You tell everyone who does not share
your view they are idiots and they should **** off and STFU. You resort to
name calling once you run out of arguments. You tell everyone that they
are confused, they are trolls, they are narrow-minded, ignorant, arrogant,
n00bs, losers, and whatnot. You boast that you have been right about those
things 20 years ago. You act like you are the only sane person in this
newsgroup, like you are far superior, even though you don't have a single
clue who you are talking to - you just are superior by definition.

Your claim that you already knew everything better 20 years ago implies
that you are at least 30 years old or something. Which is all the more a
shame because you act like a little child, sticking your fingers in your
ears and singing "lalala, I don't hear you!". You are not able to have a
technical conversation without name calling, high adrenaline levels,
accusations, and good manners worthy of an adult who deserves respect.

Well perhaps you should take a close look at the idiotic concept you seem to
be supporting....

an array is not an array because it's a pointer.


When you realise how idiotic that is you will maybe understand some of my
reactions.
 
James Kanze

Is libraryFunction() meaningful for number > INT_MAX?
If yes, why restrict it to signed?

And if it's meaningful for a number > UINT_MAX? changing int to
unsigned doesn't increase the range that much. (It was important
on 16 bit machines, but not today.)

Still, the "range" argument is a red herring. If I'm
representing a die, the only legal values are in the range [1,6].
Neither int nor unsigned (nor any other built-in type in C++) will
give me any realistic and useful range restrictions.
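
To illustrate the die example: since no built-in type carries the range [1,6], the restriction has to be enforced by a user-defined type. The following is only a minimal sketch; the names DieFace and rejected are hypothetical and do not come from this thread.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical illustration type: a die face restricted to [1, 6].
// Neither int nor unsigned can express this range on its own.
class DieFace {
public:
    explicit DieFace(int value) : value_(value) {
        if (value < 1 || value > 6) {
            throw std::out_of_range("die face must be in [1, 6]");
        }
    }

    int value() const {
        assert(value_ >= 1 && value_ <= 6);  // class invariant
        return value_;
    }

private:
    int value_;
};

// Hypothetical helper: true when constructing a DieFace from `value` is rejected.
bool rejected(int value) {
    try {
        DieFace face(value);
        (void)face;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With this sketch, DieFace(3).value() yields 3, while 0 and 7 are rejected at construction. Switching the member to unsigned would not remove the need for the check.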
 
Peter Remmers

On 17.03.2011 23:21, Paul wrote:
You don't have anything else to say other than go on in exactly the way
I have just described?
Well perhaps you should take a close look at the idiotic concept you seem to
be supporting....

an array is not an array because it's a pointer.

"X is not X because it is Y."

Except that it's more like

"a is not an element of B, because it is an element of C, and B and C
are disjoint."

int *arr = new int[10];

"arr is not an element of (arrays) because it is an element of
(pointers), and (arrays) and (pointers) are disjoint."

It only sounds to you like "X is not X..." because in your mind it's
like this:

"arr is an element of (pointers) and an element of (arrays)."

And you can't accept the fact that the intersection of (arrays) and
(pointers) is empty.
When you realise how idiotic that is you will maybe understand some of my
reactions.

When you realise that arrays and pointers are two different things you
will maybe understand some of my reactions.
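
The "disjoint sets" point can be checked by the compiler itself. A small sketch, assuming a C++11 compiler; the names fixed_array and third_via_pointer are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

int fixed_array[10];  // an object of array type int[10]

// The two type categories never overlap.
static_assert(std::is_array<decltype(fixed_array)>::value, "int[10] is an array type");
static_assert(!std::is_pointer<decltype(fixed_array)>::value, "int[10] is not a pointer type");
static_assert(std::is_pointer<int*>::value, "int* is a pointer type");
static_assert(!std::is_array<int*>::value, "int* is not an array type");

// sizeof sees all ten elements of the array object.
static_assert(sizeof(fixed_array) == 10 * sizeof(int), "size of the whole array");

// A pointer can still be used to reach the elements of an array it points into.
int third_via_pointer() {
    int* arr = new int[10]();      // arr is a pointer to the first of ten ints
    arr[3] = 3;
    assert(&arr[3] == arr + 3);    // arr[i] is defined as *(arr + i)
    int result = arr[3];
    delete[] arr;
    return result;
}
```

Here arr has type int*, so the type traits classify it as a pointer even though it designates the first element of a dynamically allocated array.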

Peter
 
James Kanze

[...]
I was a contractor for many, many years; I've
worked in a lot of different places (in different domains---I'm
a telecom's specialist, but I currently work in an investment
bank). And I can't think of a single case where the rule wasn't
int, unless there are strong technical reasons otherwise. That
seems to be the general attitude. In most cases, I'm sure,
without ever having considered the technical aspects. One
learns C++ from Stroustrup, and Stroustrup uses int everywhere
(without ever giving a reason, as far as I know---I suspect that
he does so simply because the people he learned C
from---Kernighan, in particular---did so).
I postulated that also: I think they haven't thought of trying the
alternative or dismissed it too soon because of being too close to the
technology. Then again, I switched over when I was doing C and didn't
know all the ramifications (still don't, but close enough) but found the
change worthwhile and didn't really consider going back to preferring
signed. So I'm kind of the opposite.

When I first learned C, it was int everywhere. (I'm not even
sure that unsigned existed back then.) At some point, I started
using unsigned for things that couldn't be negative. That one
extra bit of range was important on the 16 bit machines at the
time. I switched back some years later, when I moved to 32 bit
machines---using unsigned didn't have any real advantages, and
was just one more thing to keep in mind. Plus, my colleagues
always expected bit operations when they saw unsigned.

[...]
Well there's more than that, such as loop counter rollover.

If you insist on using a for loop:). When working down, I'd
normally write:

    int n = top;
    while (n > 0) {
        --n;
        // ...
    }

Which, of course, works equally well with unsigned.
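
As a sketch of why the test-then-decrement loop is safe with unsigned while the usual for loop is not (the function names count_down and decrement_zero are hypothetical):

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Visits the indices top-1 down to 0 with the test-then-decrement pattern.
// It is safe with unsigned because n is only decremented while it is above zero.
std::vector<unsigned> count_down(unsigned top) {
    std::vector<unsigned> visited;
    unsigned n = top;
    while (n > 0) {
        --n;
        assert(n < top);  // n stays inside [0, top)
        visited.push_back(n);
    }
    return visited;
}

// For contrast: `for (unsigned i = top - 1; i >= 0; --i)` never terminates,
// because i >= 0 is always true and decrementing 0 wraps around.
unsigned decrement_zero() {
    unsigned i = 0;
    --i;  // well defined: wraps modulo 2^N, giving UINT_MAX
    return i;
}
```

count_down(3) visits 2, 1, 0 and count_down(0) visits nothing; the same body works unchanged if unsigned is replaced by int.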
Nothing wrappers wouldn't solve. I'm not sure how many cases
get missed by compilers that give warnings for signed/unsigned
conversions.

Wrappers have their own problems. For starters, someone new who
reads the code will (again) suppose something special.
You make it sound like unsigned was a compromise. How so?

Requiring size_t to be unsigned was a compromise. The fact that
you need both ptrdiff_t and size_t was recognized as a serious
problem---almost a design flaw. But the alternatives seemed
worse (back then, when the most frequently used machines were 16
bits).
Well only for those who hold BS as infallible.

You're misreading my argument. I'm not saying that it's better
because BS does it. I'm saying that BS has a large influence,
and because he does it, a lot of other people do it; it is the
"expected" idiom for most programmers. If there were really
strong arguments for something else, then do so. But the burden
of proof is on the other side: most programmers will have their
expectations set by BS. Similar reasoning has made me drop my
insistence on not using .h for C++ headers, for example. In that
case, there are fairly strong technical arguments against it.
But not enough to justify bucking the expectations of the
everyday programmer.
I've seen the cfront code, and that's a mess, so I hope he has
improved since then (surely C++ has helped him a lot).

I've seen Unix kernel source code, and it's worse:). On the
other hand, I've seen a lot of code written elsewhere, at that
time, which makes both look really nice.
You were doing great until that last "go with the crowd" statement!
(Unless you meant within an existing project rather than a new one, where
either alternative can be chosen. i.e., Consistency is of course
important). Also, I'm not sure whether you consider richness of semantics
a technical thing: using signed, you have one thing while using unsigned
you have "divided and conquered".

Yes, but the division is mainly with regards to what the reader
understands. If you have a project with detailed coding
guidelines, that everyone really understands, then you can choose
an arbitrary semantic meaning for the distinction
signed/unsigned. But in general, most programmers will
understand the distinction to be bit-wise operations or not. So
that's the distinction you go with.
 
MikeP

William said:
I don't have an opinion on whether there should or shouldn't be. But
there aren't; not without going outside the standard. I do have
qualms about exceptions in general, since they can obfuscate code
flow; especially where exceptions are mandatory rather than optional.
(I learned years ago at another job--in the context of Perl and
die(), though--that exceptions were too often abused by people to
pass off error handling to the next sucker.) But exceptions wouldn't
be strictly necessary, I guess.

I suppose I don't think the issue is that important to bother with
fancier solutions. From a security standpoint overflow handling is
exceptionally important, especially since that area has received
heightened scrutiny from exploit authors; but from a feature
standpoint it's trivial and straight-forward, not something to over
complicate.

It's not just overflow. The whole modular integer thing gives rise to
loop control problems because modular integers are not appropriate in
such a context, but neither are signed integers. There is no semantically
correct integer type to use for a loop counter in C++. It's a
deficiency, perhaps small, perhaps not, but a deficiency nonetheless.
 
James Kanze

MikeP wrote:
For me, one of the most important differences is that range overflow
doesn't exist in unsigned types; arithmetic on them is defined to be
modulo arithmetic. With signed types, range overflow is UB.
Therefore, a (standard-conformant) compiler may produce code to catch
range overflow with signed types but not with unsigned types. And some
actually do. IMO this helps create more robust applications.

I like that point of view, but you'll have to admit that the
argument would be a lot less vacuous if such implementations
actually existed. (I think one does, but it's certainly not
widely used.)
To me, this characteristic of unsigned types makes it pretty
obvious that they were designed for modulo arithmetic -- which
has its places and uses, but in most cases where unsigned types
are used is not what is intended. And most programmers simply
forget that signed and unsigned types are different in this
(IMO important) aspect.

The major characteristic of unsigned is that they have different
behavior than signed. In many ways. The real question is which
different behavior a reader will assume you meant to signal by
using unsigned. And the issue isn't obvious: I'm less radical
about it than Stroustrup, and I'll use unsigned rather than mix
the two when faced with an interface I can't change.
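
The difference in overflow behavior discussed here can be sketched as follows. Unsigned arithmetic wraps by definition, whereas signed overflow is undefined, so portable code has to test before adding. The helper names wrap_add_one and add_would_overflow are assumptions for illustration only.

```cpp
#include <cassert>
#include <climits>

// Unsigned arithmetic is defined modulo 2^N: UINT_MAX + 1 is 0, never an error.
unsigned wrap_add_one(unsigned value) {
    return value + 1u;
}

// Signed overflow is undefined behavior, so the check has to happen
// before the addition, using only operations that cannot overflow.
bool add_would_overflow(int a, int b) {
    if (b > 0) {
        return a > INT_MAX - b;
    }
    if (b < 0) {
        return a < INT_MIN - b;
    }
    return false;
}
```

Because the signed case is undefined, an implementation is free to trap on it; the unsigned wrap in wrap_add_one must always produce 0 for UINT_MAX.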
When James sounded as if it was a compromise to use unsigned, it's
probably because it was. In many architectures there is no signed type
that is able to enumerate the full address space with its positive
range, so an unsigned type has to be used where this is required. (Focus
on "has to be used", which already sounds like "can't do better, even if
I wanted to" :)

Let's not forget that when the decision was made, 16 bit machines
dominated (except for mainframes, and they didn't support C).
The choice was 1) to force such machines to limit the size of
objects even more, or use a larger type, which generally required
a function call for such basic operations as ++, 2) to leave the
signedness of size_t implementation defined, with all the
problems that entails for writing portable code, or 3) to require
size_t to be unsigned (and provide a separate signed type,
ptrdiff_t, for most of the everyday uses). Fundamentally, none
of the alternatives was considered acceptable, but one had to be
chosen.
 
James Kanze

Spell it out for me please: How do you conclude that they must
have seen strong arguments towards using a signed integer?

Taking the difference between two addresses is a reasonable
operation. And it gives a signed result. Ditto taking the
difference between two indexes. No one liked the fact that we
needed both size_t and ptrdiff_t: a single type for both would
have been preferable.
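
A short sketch of the two types in question, assuming a C++11 compiler. The array sample and the two functions are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>
#include <utility>

// Subtracting two pointers yields the signed type std::ptrdiff_t.
static_assert(std::is_same<decltype(std::declval<const int*>() - std::declval<const int*>()),
                           std::ptrdiff_t>::value,
              "pointer difference is ptrdiff_t");
static_assert(std::is_signed<std::ptrdiff_t>::value, "ptrdiff_t is signed");
static_assert(std::is_unsigned<std::size_t>::value, "size_t is unsigned");

int sample[8] = {};  // hypothetical array, only used to obtain valid addresses

// Distance from `from` to `to` inside one array; negative when `to` comes first.
std::ptrdiff_t distance_between(const int* from, const int* to) {
    return to - from;
}

// Subtracting two size_t indexes cannot go negative; it wraps modulo 2^N instead.
std::size_t index_difference(std::size_t a, std::size_t b) {
    return a - b;
}
```

distance_between(sample + 5, sample + 2) is -3, while index_difference(2, 5) is a very large positive value, which is the practical cost of having two separate types.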
While 64 bits of address space is "more than enough", Intel limits the
accessible space to something less than 64 bits (48 bits?).

What appears on the external address bus is irrelevant. Or
rather, the fact that it is smaller than the maximum of a signed
type punctures a hole in the major motivation for making size_t
unsigned. If most machines had been in that situation when C was
standardized, there wouldn't be a size_t, and sizeof would return
an int (as it did in K&R C).
 
MikeP

James said:
What appears on the external address bus is irrelevant. Or
rather, the fact that it is smaller than the maximum of a signed
type punctures a hole in the major motivation for making size_t
unsigned.

No it doesn't. It does only if you take the stance that range is the
reason for wanting it to be unsigned, and that is not the reason. It's
not mine anyway.
 
MikeP

James said:
Taking the difference between two addresses is a reasonable
operation. And it gives a signed result. Ditto taking the
difference between two indexes. No one liked the fact that we
needed both size_t and ptrdiff_t: a single type for both would
have been preferable.

The preference was for fewer types over semantic correctness/clarity? I'd
expect that from language lawyers! ;) The "make it simple as possible,
but no simpler" principle does not seem to hold any weight with the
committee.
 
MikeP

James said:
The major characteristic of unsigned is that they have different
behavior than signed. In many ways.

But when unsigned was being invented for the first time and the decision
was made to make unsigned integers modular (is that the correct way to
say it: "unsigned ints are modular"?), was ...
OK, I'm not going to ask that, because I think the answer to that will be
about 2's complement and converting between signed and unsigned.
Nevermind.
 
Paul

Peter Remmers said:
On 17.03.2011 23:21, Paul wrote:

You don't have anything else to say other than go on in exactly the way I
have just described?
Well perhaps you should take a close look at the idiotic concept you seem
to
be supporting....

an array is not an array because it's a pointer.

"X is not X because it is Y."

Except that it's more like

"a is not an element of B, because it is an element of C, and B and C are
disjoint."

int *arr = new int[10];

"arr is not an element of (arrays) because it is an element of (pointers),
and (arrays) and (pointers) are disjoint."

It only sounds to you like "X is not X..." because in your mind it's like
this:

"arr is an element of (pointers) and an element of (arrays)."

And you can't accept the fact that the intersection of (arrays) and
(pointers) is empty.
When you realise how idiotic that is you will maybe understand some of my
reactions.

When you realise that arrays and pointers are two different things you
will maybe understand some of my reactions.
A pointer is used to access an array, they are not two different things
entirely.
 
Peter Remmers

On 18.03.2011 02:10, Paul wrote:
Peter Remmers said:
On 17.03.2011 23:21, Paul wrote:
On 17.03.2011 12:22, Paul wrote:
I am not even reading your long-winded explanation of what E1 is above.
I'm simply "telling" you it's an array, I don't care what you think.
I'm telling you what it simply is.

A pointer is used to access an array

Right.

, they are not two different things
entirely.

Wrong. They are not completely unrelated, but they are certainly not the
same.

As long as you refuse to see this, this won't lead any further. And I
will leave you alone now with your pet belief.

Peter
 
MikeP

James said:
James said:
James Kanze wrote:
[...]
I was a contractor for many, many years; I've
worked in a lot of different places (in different domains---I'm
a telecom's specialist, but I currently work in an investment
bank). And I can't think of a single case where the rule wasn't
int, unless there are strong technical reasons otherwise. That
seems to be the general attitude. In most cases, I'm sure,
without ever having considered the technical aspects. One
learns C++ from Stroustrup, and Stroustrup uses int everywhere
(without ever giving a reason, as far as I know---I suspect that
he does so simply because the people he learned C
from---Kernighan, in particular---did so).
I postulated that also: I think they haven't thought of trying the
alternative or dismissed it too soon because of being too close to
the technology. Then again, I switched over when I was doing C and
didn't know all the ramifications (still don't, but close enough)
but found the change worthwhile and didn't really consider going
back to preferring signed. So I'm kind of the opposite.

When I first learned C, it was int everywhere. (I'm not even
sure that unsigned existed back then.) At some point, I started
using unsigned for things that couldn't be negative. That one
extra bit of range was important on the 16 bit machines at the
time. I switched back some years later, when I moved to 32 bit
machines---using unsigned didn't have any real advantages, and
was just one more thing to keep in mind. Plus, my colleagues
always expected bit operations when they saw unsigned.

[...]

You have to look at all widths. Having unsigned is more important toward
the left (less width) than toward the right. Focusing on just one will
give the "wrong answer" or at least the wrong impression.
If you insist on using a for loop:). When working down, I'd
normally write:

    int n = top;
    while (n > 0) {
        --n;
        // ...
    }

Which, of course, works equally well with unsigned.

Most of that gets subsumed by iterators now anyway, so even less of a
point is the loop control thing.
Wrappers have their own problems.

Of course. I have some though that I use during debug which turn into
primitives in release.
For starters, someone new who
reads the code will (again) suppose something special.



Requiring size_t to be unsigned was a compromise. The fact that
you need both ptrdiff_t and size_t was recognized as a serious
problem---almost a design flaw. But the alternatives seemed
worse (back then, when the most frequently used machines were 16
bits).

Deja vu. I think I've replied to this post already! Oh well, we'll see
how consistent my thoughts are.
You're misreading my argument. I'm not saying that it's better
because BS does it. I'm saying that BS has a large influence,
and because he does it, a lot of other people do it; it is the
"expected" idiom for most programmers.

Well that's what I thought you meant.
If there were really
strong arguments for something else, then do so. But the burden
of proof is on the other side: most programmers will have their
expectations set by BS.

I don't think it's a provable thing, no matter who says it. Programmers
will have to think for themselves on this one.
Similar reasoning has made me drop my
insistence on not using .h for C++ headers, for example. In that
case, there are fairly strong technical arguments against it.
But not enough to justify bucking the expectations of the
everyday programmer.


I've seen Unix kernel source code, and it's worse:). On the
other hand, I've seen a lot of code written elsewhere, at that
time, which makes both look really nice.



Yes, but the division is mainly with regards to what the reader
understands.

No small potatoes, for simplifying maintenance is worth more than
"simplifying" original coding. The semantic correctness and syntactic
richness ("semantical" and "syntactical"?) is what is compelling to me
about unsigned.
If you have a project with detailed coding
guidelines, that everyone really understands, then you can choose
an arbitrary semantic meaning for the distinction
signed/unsigned. But in general, most programmers will
understand the distinction to be bit-wise operations or not. So
that's the distinction you go with.

How does the curriculum go? Do they teach unsigned as a bit-based thing
or as the complementary integer type to signed? Surely without such
instruction, the novice will assume unsigned is "the set of positive
numbers that can be represented with the given width".
 
MikeP

Leigh said:
What about char? char is a distinct type *and* it is up to the
implementation as to whether char is signed or unsigned yet char, I am
sure you will agree, is used to represent characters more than for
anything else including values which require bit manipulation.

Yes, the C++ type system is a mess.
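
For reference, the char situation Leigh describes can be sketched like this (C++11; the names plain_char_is_signed and char_min_matches_signedness are hypothetical):

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// char, signed char and unsigned char are three distinct types...
static_assert(!std::is_same<char, signed char>::value, "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is not unsigned char");

// ...yet whether plain char behaves as signed is implementation-defined.
constexpr bool plain_char_is_signed = std::is_signed<char>::value;

// CHAR_MIN is negative exactly when plain char is signed on this implementation.
bool char_min_matches_signedness() {
    return (CHAR_MIN < 0) == plain_char_is_signed;
}
```

The value of plain_char_is_signed differs between platforms, which is why code that does arithmetic on plain char should not depend on it.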
 
Paul

Peter Remmers said:
On 18.03.2011 02:10, Paul wrote:
Wrong. They are not completely unrelated, but they are certainly not the
same.
Nobody has said they are the same
int* p = new int[12];
p[3]=3;

The identifier p refers to both the pointer and the array depending on
whether or not it's dereferenced.
Yes it's a pointer, but it's a pointer to an array and can be used like an
array. People use the terminology that it's an array because of the way
arrays are always passed by reference, and everyone knows an array is just a
pointer under the hood.
As long as you refuse to see this, this won't lead any further. And I will
leave you alone now with your pet belief.
I refuse to see what? It's quite the opposite here: you refuse to accept
my argument. You haven't proposed anything for me to refuse to see, except
the obvious, that an array entity is not a pointer.
You could just as well say that a memory location is not a wooden ornament
on my fireplace; it goes without saying.


I'm quite content here on my own tyvm :)
 
Joshua Maurice

On 18.03.2011 02:10, Paul wrote:

Wrong. They are not completely unrelated, but they are certainly not the
same.

Nobody has said they are the same
int* p = new int[12];
p[3]=3;

The identifier p refers to both the pointer and the array depending on
whether or not it's dereferenced.
Yes it's a pointer, but it's a pointer to an array and can be used like an
array. People use the terminology that it's an array because of the way
arrays are always passed by reference, and everyone knows an array is just a
pointer under the hood.


As long as you refuse to see this, this won't lead any further. And I will
leave you alone now with your pet belief.

I refuse to see what? It's quite the opposite here: you refuse to accept
my argument. You haven't proposed anything for me to refuse to see, except
the obvious, that an array entity is not a pointer.
You could just as well say that a memory location is not a wooden ornament
on my fireplace; it goes without saying.

I'm quite content here on my own tyvm :)

This post especially is a classic case of Not Even Wrong.

http://rationalwiki.org/wiki/Not_even_wrong

Not even wrong (or the full version "That's not right - that's not
even wrong") refers to any statement, argument or explanation that is
not only incorrect but also fails to meet criteria by which
correctness and incorrectness are determined.

The phrase implies that not only is someone not making a valid point
in a discussion, but they don't even seem to understand the nature of
the discussion itself, or the things that need to be understood in
order to participate.

Because he's not right, and not even wrong, it's impossible to argue
with him. In other words, he refuses to use the same language of
discourse as the rest of us and make intelligible assertions. In fact,
he openly admits to refusing to use the same language of discourse as
the rest of us, and instead he prefers his own muddled definitions
from "An English Dictionary".

You're welcome to continue arguing with him. I am of the opinion that
some people will always feed the trolls, and thus getting trolls to
leave by refusing to feed them is a waste of time. In the end, the
sane ones blacklist said trolls, and all is well.

I haven't blacklisted him yet mostly because it's amusing to read this
in my off time and to see how many different logical fallacies and
"patterns" I can apply to his posts, such as "Not Even Wrong".
 
James Kanze

But when unsigned was being invented for the first time and the decision
was made to make unsigned integers modular (is that the correct way to
say it: "unsigned ints are modular"?), was ...

The original unsigned weren't modular. At least not in the
specification---both signed and unsigned are modular on most
implementations. (Also, in the original implementations, when
you compared signed with unsigned, the unsigned was converted to
signed. But I don't think the specifications at the time really
made this clear. It was just what the compilers did.)
OK, I'm not going to ask that, because I think the answer to
that will be about 2's complement and converting between
signed and unsigned.

The problem is that when the C committee started to standardize,
there was already a lot of existing practice. Often
contradictory. They did what they could, trying to break as few
existing programs as possible, and still make the language as
clean as possible, while still allowing as much performance as
possible. Almost every decision was a compromise.
 
James Kanze

James said:
James Kanze wrote:
James Kanze wrote:
[...]
Well there's more than that, such as loop counter rollover.
If you insist on using a for loop:). When working down, I'd
normally write:
    int n = top;
    while (n > 0) {
        --n;
        // ...
    }
Which, of course, works equally well with unsigned.
Most of that gets subsumed by iterators now anyway, so even less of a
point is the loop control thing.

And of course, with iterators, you have to write the loop as
above (or use reverse iterators). You can't decrement once
you've encountered begin.
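
A minimal sketch of both options with std::vector; the function names are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Walks the vector back to front with ordinary iterators:
// compare against begin() first, decrement second, so begin() is never passed.
std::vector<int> reversed_with_iterators(const std::vector<int>& values) {
    std::vector<int> out;
    std::vector<int>::const_iterator it = values.end();
    while (it != values.begin()) {
        --it;
        out.push_back(*it);
    }
    assert(out.size() == values.size());
    return out;
}

// The same traversal expressed with reverse iterators.
std::vector<int> reversed_with_rbegin(const std::vector<int>& values) {
    return std::vector<int>(values.rbegin(), values.rend());
}
```

Both functions return the elements in reverse order and handle an empty vector without ever decrementing begin().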

[...]
I don't think it's a provable thing, no matter who says it.
Programmers will have to think for themselves on this one.

Except that in the absence of any killer argument, you have to
go with what the majority of programmers expect. In fact, even
when there is a strong argument for something else, you may end
up having to go with what the majority of programmers expect.

[...]
No small potatoes, for simplifying maintenance is worth more
than "simplifying" original coding. The semantic correctness
and syntactic richness ("semantical" and "syntactical"?) is
what is compelling to me about unsigned.

The semantic correctness is in the eye of the beholder. You can
argue for any meaning you want, but in the end, the only meaning
that counts is what the reader understands. At least if your
goal is communicating.
How does the curriculum go? Do they teach unsigned as a
bit-based thing or as the complementary integer type to
signed?

They never mention it except in conjunction with the bitwise
operators.
Surely without such instruction, the novice will assume
unsigned is "the set of positive numbers that can be
represented with the given width".

Without any instruction, I suspect that the novice will assume
that unsigned behaves like a cardinal. I.e. that all of the
values of an unsigned will fit in an int (integer), and that
subtraction of two unsigned will result in an int (and is
guaranteed to fit in an int).

Of course, without any instruction, the novice will assume that
division of two integers results in a rational number. Or at
least its closest approximation, a float or a double.
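
What the language actually does in these two cases can be sketched as follows (C++11; the function names are hypothetical):

```cpp
#include <cassert>
#include <type_traits>

// unsigned - unsigned stays unsigned; the result is never widened to int.
static_assert(std::is_same<decltype(2u - 3u), unsigned>::value, "difference is unsigned");
// int / int stays int; no rational or floating-point result is produced.
static_assert(std::is_same<decltype(7 / 2), int>::value, "quotient is int");

// Integer division truncates toward zero.
int int_quotient(int a, int b) {
    assert(b != 0);  // division by zero is undefined
    return a / b;
}

// Converting one operand to double makes the division floating point.
double real_quotient(int a, int b) {
    return static_cast<double>(a) / b;
}

// a - b wraps to a large positive value whenever a < b, so this is
// true for every pair except a == b.
bool unsigned_difference_is_positive(unsigned a, unsigned b) {
    return a - b > 0u;
}
```

So int_quotient(7, 2) is 3, real_quotient(7, 2) is 3.5, and unsigned_difference_is_positive(2u, 3u) is true, none of which matches the uninstructed expectation described above.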

And even more "of course": without any instruction, a novice
won't even know that unsigned exists. What he thinks of
unsigned is 100% conditioned by his instruction.

And in my experience, something like 95% of the programmers seem
to have learned it along the lines of what Stroustrup presents.
 
