Restricted unsigned integer range would have been better?

Ansel

Talking-back to myself: But you shouldn't be doing that anyway probably.
Wow. What a big advantage.

It seems to be a "hack-thing" (in a bad way) to be doing anyway.
Now what about assigning the positive value of any unsigned int to an
int? Because that doesn't work.

Why wouldn't it?
But actually, you f***ed it up completely.
Operations with unsigned int are performed modulo (UINT_MAX + 1).

That is *one* type of integer, yes. Maybe "my" idea cannot be evaluated or
had in isolation, but rather must be analyzed in the context of the type
system as a whole. I have a feeling that the whole discussion is about to
become moot -- in favor of nixing arithmetic unsigneds altogether.
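
A minimal demonstration of the modulo-(UINT_MAX + 1) behaviour quoted above,
valid on any conforming C implementation:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int u = 0;
    u = u - 1;                 /* wraps: unsigned arithmetic is modulo UINT_MAX + 1 */
    printf("%u\n", u);         /* prints the value of UINT_MAX */
    printf("%u\n", UINT_MAX);  /* same value */
    return 0;
}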
 
Keith Thompson

Ansel said:
Talking-back to myself: But you shouldn't be doing that anyway probably.

It seems to be a "hack-thing" (in a bad way) to be doing anyway.


Why wouldn't it?

Under your proposal, you might have signed int with a range of
-32768 to +32767, and unsigned int with a range of 0 to +32768.
You wouldn't be able to assign an unsigned int value of +32768
to a signed int object without changing the mathematical value.
(Adjust values for 32-bit ints.)

I'm skeptical that your proposal, or anything similar to it, would
be an improvement over what we have now.
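
For reference, what today's C actually does with the conversion in question:
an out-of-range unsigned-to-signed conversion yields an implementation-defined
result (or raises an implementation-defined signal) rather than being
rejected. A short sketch:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    int i = u;           /* value doesn't fit in int: result is implementation-defined */
    printf("%d\n", i);   /* commonly -1 on two's complement machines, but not guaranteed */
    return 0;
}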
 
Keith Thompson

Ansel said:
You are picking nits -- you *know* that is not what I meant in the OP.

Except that that's exactly what you said.

| If C's unsigned integer types were designed so that their maximum value
| was the same as the absolute value of the same width signed integer
| type's minimum value, wouldn't that automagically eliminate a bunch of
| erroneous code and simplify programming in C?
 
Keith Thompson

Ansel said:
You can still have the full-width unsigned types, they just wouldn't be
*arithmetic* types. I don't know if that would work out for embedded
systems. It should be adequate though for, say, representation of character
types. That noted, still "unworkable"?

Probably.

What do you mean when you say they wouldn't be arithmetic types? Would
it not be possible to perform arithmetic on them?
 
Ansel

Keith Thompson said:
Under your proposal, you might have signed int with a range of
-32768 to +32767, and unsigned int with a range of 0 to +32768.

-32767 to +32767
 
Ansel

Keith Thompson said:
Probably.

What do you mean when you say they wouldn't be arithmetic types? Would
it not be possible to perform arithmetic on them?

Would it be possible to perform arithmetic on them? Answer: no. (Or
*extremely* limited operations, say, increment and decrement).
(I didn't like the way you asked it. Too much potential for confusion).
 
Keith Thompson

Ansel said:
Keith Thompson said:
Ansel said:
[...]
Also, a range is usually a subset of the allowed int range; yours
would be a superset! Because the top value would be one more than the
top of a signed int's normal range.

You are picking nits -- you *know* that is not what I meant in the OP.

Except that that's exactly what you said.

And it is still picking nits.

How so? We have no way of knowing what you meant other than reading
what you actually wrote.

Based on your recent followup to one of my followups, I *think* you
intended signed integer types to have symmetric ranges, so a signed
integer might have a range of -32767..+32767, and the corresponding
unsigned integer would have a range of 0..+32767. That was not at all
clear from what you originally wrote.
 
Keith Thompson

Ansel said:
Would it be possible to perform arithmetic on them? Answer: no. (Or
*extremely* limited operations, say, increment and decrement).
(I didn't like the way you asked it. Too much potential for confusion).

Then yes, that would be unworkable. There's far too much existing
code that depends on the ability to perform arithmetic on full-width
unsigned types.

(On the other hand, C implementations are already allowed to do as you
suggest, by giving unsigned types a single padding bit. But hardly any
implementations do so.)
 
Ansel

Keith Thompson said:
Ansel said:
Keith Thompson said:
[...]
Also, a range is usually a subset of the allowed int range; yours
would be a superset! Because the top value would be one more than the
top of a signed int's normal range.

You are picking nits -- you *know* that is not what I meant in the OP.

Except that that's exactly what you said.

And it is still picking nits.

How so? We have no way of knowing what you meant other than reading
what you actually wrote.

Based on your recent followup to one of my followups, I *think* you
intended signed integer types to have symmetric ranges, so a signed
integer might have a range of -32767..+32767, and the corresponding
unsigned integer would have a range of 0..+32767. That was not at all
clear from what you originally wrote.

Yes, I haphazardly expressed the thought originally.
 
Ansel

Keith Thompson said:
Ansel said:
Keith Thompson said:
[...]
You can still have the full-width unsigned types, they just wouldn't be
*arithmetic* types. I don't know if that would work out for embedded
systems. It should be adequate though for, say, representation of character
types. That noted, still "unworkable"?

Probably.

What do you mean when you say they wouldn't be arithmetic types? Would
it not be possible to perform arithmetic on them?

Would it be possible to perform arithmetic on them? Answer: no. (Or
*extremely* limited operations, say, increment and decrement).
(I didn't like the way you asked it. Too much potential for confusion).

Then yes, that would be unworkable. There's far too much existing
code that depends on the ability to perform arithmetic on full-width
unsigned types.

But I was asking if it would have been better right from the beginning
rather than now.
(On the other hand, C implementations are already allowed to do as you
suggest, by giving unsigned types a single padding bit. But hardly any
implementations do so.)

Good info.
 
Ansel

David Brown said:
Yes, it would be unworkable.

There are many reasons why unsigned types are common in embedded systems.
For some small processors, they are faster (some cpu's do not have the
necessary condition code flags to do signed comparisons, and
multiplication and division of unsigneds is often faster). You are often
mapping hardware registers, which rarely have any concept of sign. Most of
the numbers you encounter are unsigned. They give greater range than
signed numbers - and in embedded systems, every bit counts. (On a 32-bit
system, 1 bit is only 3% of your information - with 8-bit data, it is
12.5% of your information. That's a lot to waste.) And of course these
types are also related to pointers - you sometimes need to treat pointers
as integer types, and you wouldn't want to lose half your memory space.

It doesn't really make sense to say "wouldn't be *arithmetic* types" in
terms of C as we know it. All integer types are arithmetic. And anyway,
you certainly need to be able to do arithmetic on these types in embedded
systems.

Well, there could be an additional type like the one I pondered about for
those domains that don't work at the bare-metal level (yeah, I know, most of
what C is used for, but one could easily extend this discussion to C++, which
is used en masse at much higher levels). But that would be facetiously
ignoring what has been established in the rest of the thread already, so OK:
perhaps "unworkable" for embedded domains, but pretty much useless elsewhere,
because most unsigned quantities are going to be ones where arithmetic ops
should not be available on that type anyway. The rest of your post "summed
it up" nicely: use Ada instead ;).
Of course, the disadvantage here is that when converting between signed
and unsigned types, or mixing them in expressions, you must be even more
careful about overflows. I usually have compiler warnings enabled to tell
me if I've done such mixing - then I will use explicit casts to leave no
doubts.

[Snipped the rest of the very good answer which goes on to describe the type
system of Ada. ;)]
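
The kind of mixing being warned about above bites most often in comparisons.
A minimal sketch (gcc and clang diagnose it with -Wsign-compare, which
-Wextra turns on):

#include <stdio.h>

int main(void)
{
    unsigned int len = 3;
    int i = -1;
    /* The usual arithmetic conversions turn i into an unsigned int,
       i.e. a huge value, so the test is false even though -1 < 3
       mathematically. An explicit cast would at least document intent. */
    if (i < len)
        printf("smaller\n");
    else
        printf("not smaller\n");   /* this branch runs */
    return 0;
}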
 
BartC

David Brown said:
On 25/07/2012 14:28, Ansel wrote:
[Snipped the rest of the very good answer which goes on to describe the type
system of Ada. ;)]

The ideas are perhaps stolen from Ada (which in turn stole them from
Pascal and/or Modula 2). But Ada adds a whole lot more overhead to code -
both source code (it is a far more verbose language) and run-time (it's
not bad, but it is less efficient than C, and has a few more "surprises"
such as run-time range checking code).

Half-hearted attempts at extending type systems don't really work. There
will always be something else to be added, or ambiguous aspects to be tidied
up, until you end up with something like Ada.
My suggested stronger types could be added to C without /any/ extra
overhead - nothing more at run-time, and only more source code typing if
you actually use the features.

You will need to assign an int value to your range type at some point. Then
it will need a runtime range-check. If you allow arithmetic on your
range-types (even just ++ and --), then they will need overflow checking,
something not necessary at the moment as any result will be a legal value
for that type.

In fact there will be all sorts of issues; these things are never quite that
simple.
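
To make the cost concrete, roughly what a checked assignment to a
hypothetical 1..10 range type would have to expand to (the type and helper
names here are illustrative only, not an existing feature):

#include <assert.h>

typedef int int_1_to_10_t;       /* hypothetical: plain int standing in for a 1..10 range type */

static int_1_to_10_t check_1_to_10(int v)
{
    assert(v >= 1 && v <= 10);   /* the runtime range check in question */
    return v;
}

/* usage: int_1_to_10_t x = check_1_to_10(foo()); */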
 
Keith Thompson

Ansel said:
Keith Thompson said:
Ansel said:
[...]
You can still have the full-width unsigned types, they just wouldn't be
*arithmetic* types. I don't know if that would work out for embedded
systems. It should be adequate though for, say, representation of character
types. That noted, still "unworkable"?

Probably.

What do you mean when you say they wouldn't be arithmetic types? Would
it not be possible to perform arithmetic on them?

Would it be possible to perform arithmetic on them? Answer: no. (Or
*extremely* limited operations, say, increment and decrement).
(I didn't like the way you asked it. Too much potential for confusion).

Then yes, that would be unworkable. There's far too much existing
code that depends on the ability to perform arithmetic on full-width
unsigned types.

But I was asking if it would have been better right from the beginning
rather than now.

If you'll clearly restate your proposal in a single article, it
will be easier to judge. Currently we have your original article
with miscellaneous clarifications scattered throughout the thread,
which has created a lot of misunderstanding.
 
Ansel

David Brown said:
I still don't understand why you think arithmetic operations are not
useful on unsigned types, nor why you are so keen on signed types. It is
/rare/ in any real-world program (embedded or otherwise) for integer data
to be negative - the majority of "int" variables will never be less than
zero. (The exception is the somewhat artificial convention in C for error
codes to be negative.) So why not use unsigned types and get a little
more useful range, as well as well-defined overflow and wrapping
semantics?

In an effort to simplify, if something is just a tad "useful" instead of
necessary, maybe it would be better to go without. It's not a "keenness" on
signed types, it's that if there can be one abstraction instead of two, then
maybe a little bit of usefulness can be given up for greater simplicity. The
simplification is, of course, not having to deal with mixed-type arithmetic,
which my OP was seeking to simplify somewhat by range-restricting the
unsigned types. I don't know exactly what my train of thought was when I
posted, but right now I wonder why I would want to range-restrict unsigneds
instead of just using signeds! I'm not sure -- my mind is elsewhere.
Of course, the disadvantage here is that when converting between signed
and unsigned types, or mixing them in expressions, you must be even more
careful about overflows. I usually have compiler warnings enabled to tell
me if I've done such mixing - then I will use explicit casts to leave no
doubts.

[Snipped the rest of the very good answer which goes on to describe the type
system of Ada. ;)]

The ideas are perhaps stolen from Ada (which in turn stole them from
Pascal and/or Modula 2). But Ada adds a whole lot more overhead to code -
both source code (it is a far more verbose language) and run-time (it's
not bad, but it is less efficient than C, and has a few more "surprises"
such as run-time range checking code). My suggested stronger types could
be added to C without /any/ extra overhead - nothing more at run-time, and
only more source code typing if you actually use the features. In fact,
range types could give the compiler scope for additional optimisations
since it has greater knowledge.

Well, there's a spec for saturating arithmetic already (the Embedded C
technical report, ISO/IEC TR 18037, with its _Sat fixed-point types) -- that
may be a start in the right direction. I'm all for various types of integers
with different semantics. In your prior post, you noted the need for a
typedef of some kind to keep it under control, and I too think some kind of
construct like that is necessary to make it work. Realize what that actually
is, though: some kind of "incompatible type" builder. After you think about
that for a while, you'll start to want a "compatible type" builder. Then a
"semi-compatible type" builder. Then you realize you can have all of that
much more easily and elegantly if you switch programming languages. "The
right tool for the job".
 
Ansel

Keith Thompson said:
Ansel said:
Keith Thompson said:
[...]
You can still have the full-width unsigned types, they just wouldn't be
*arithmetic* types. I don't know if that would work out for embedded
systems. It should be adequate though for, say, representation of character
types. That noted, still "unworkable"?

Probably.

What do you mean when you say they wouldn't be arithmetic types? Would
it not be possible to perform arithmetic on them?

Would it be possible to perform arithmetic on them? Answer: no. (Or
*extremely* limited operations, say, increment and decrement).
(I didn't like the way you asked it. Too much potential for confusion).

Then yes, that would be unworkable. There's far too much existing
code that depends on the ability to perform arithmetic on full-width
unsigned types.

But I was asking if it would have been better right from the beginning
rather than now.

If you'll clearly restate your proposal in a single article, it
will be easier to judge. Currently we have your original article
with miscellaneous clarifications scattered throughout the thread,
which has created a lot of misunderstanding.

Will you please review my latest replies to D. Brown in this thread and see
if that answers your questions or interject there? I really don't want to
restart from the beginning because so much has changed (perspective) since
then.
 
Malcolm McLean

On Thursday, 26 July 2012 07:09:26 UTC+1, Ansel wrote:
It's not a "keenness" on signed types, it's that if there can be one
abstraction instead of two, then
maybe a little bit of usefulness can be given up for greater simplicity. The
simplification is, of course, not having to deal with mixed-type arithmetic,
which my OP was seeking to simplify somewhat by range-restricting the
unsigned types. I don't know exactly what my train of thought was when I
posted, but right now I wonder why I would want to range-restrict unsigneds
instead of just using signeds! I'm not sure -- my mind is elsewhere.
Conisder this. It's a real problem in my current program.

I've got a buffer z, with length N. I'm passed in an index and a window length. I want to take a window of the buffer.

Now of course this is trivial to code, because ints are 32 bits and the buffer's never going to go above a few hundred samples. Just don't use unsigned arithmetic and the code is clean and simple. If anyone does pass in a maliciously large buffer and window, you hope that the system will detect it and stop the program due to arithmetical overflow.
But imagine we're using unsigned because N and the window centre, ci, are naturally unsigned, and buffer _ window length is likely to be over the range of an unsigned. Now we've got real problems. Do you or do you not alleviate the coding difficulties by saying that unsigned cannot go beyond the range of a signed?

We don't need to change any compilers. Just insert a line into the standard saying that "the result of assigning a value greater than INT_MAX to an unsigned int is undefined".
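
To see the hazard concretely: with unsigned index arithmetic, the left edge
of a window can wrap to a huge value instead of going negative. A sketch
using the names from the description above (N, ci, and a window length w,
all assumed unsigned; window_start is a hypothetical helper):

#include <stddef.h>

/* Compute the clamped start index of a window of length w centred at ci
   in a buffer of length N. With unsigned arithmetic, ci - w/2 wraps when
   ci < w/2, so the subtraction must be guarded explicitly. */
size_t window_start(size_t ci, size_t w, size_t N)
{
    size_t start = (ci >= w / 2) ? ci - w / 2 : 0;  /* guard against wraparound */
    if (start > N)                                  /* clamp to the buffer */
        start = N;
    return start;
}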
 
Keith Thompson

Ansel said:
Will you please review my latest replies to D. Brown in this thread and see
if that answers your questions or interject there? I really don't want to
restart from the beginning because so much has changed (perspective) since
then.

Sorry, but I'm not going to take the time to reconstruct what you're
proposing by looking through multiple articles. I *think* I found the
reply you're referring to, but it doesn't seem all that specific.

Just one point, though.

C's integer types traditionally map very closely to what's supported by
the hardware. Your proposal would break that tradition. For that
reason alone, I don't think it's feasible.

Something like what you're suggesting might be sensible for a new
language -- but C should still be there for programmers who want to use
the features that this new language is missing.
 
BartC

David Brown said:
On 25/07/2012 18:17, BartC wrote:

No, it won't need a runtime range check - at least, not anything generated
automatically by the compiler. The compiler should be able to assume that
the programmer knows what he is doing.

So if you have a type "int_1_to_10_t", and you write:

extern int foo(void);
int_1_to_10_t x = (int_1_to_10_t) foo();

then the compiler can assume that the value for x is in range. If the
compiler can do any compile-time checking, then it should - but when a
programmer makes an explicit cast like this, he is also making a clear
statement that it is a valid cast.

There is already a kind of restricted int in C, called a _Bool, which can
only contain (I believe) 0 or 1.

When you assign an int to it, a runtime conversion seems to be performed (to
convert not-zero to 1). But when I tried a (_Bool) cast in gcc just now, it
still performed this conversion! It didn't trust the programmer telling it
the int already contained 0 or 1.

But then, there is a slight ambiguity in the meaning of a cast anyway; does
it mean 'assume the value is already of the type expected, so don't
convert', or 'please convert'!

Your (int_1_to_10_t) cast can be interpreted either way.
Note that your code might well contain explicit checks for validity - but
you would have these anyway, even if the type did not have a range.


Again, no more so than you have for existing types (after all, types like
"int" have a limited range already).

It's not quite the same. If you do ++ on an int, you *know* (for unsigned
anyway) the result will always be a valid int, although it might wrap
eventually.

Do ++ on a 1..10 int, then it's quite likely you will get 11 at some point.
11 is an illegal value for this type. And it doesn't wrap, doesn't saturate,
and doesn't give a warning, yet you've told the compiler the value will
*never* be more than 10.

(And performing ++ on a _Bool, you will always get 1, not sometimes 2.
Create a new type 0..1, and ++ can give you 1, 2 ... or anything else
depending on how many ++s you've done! But again, C doesn't trust the
programmer to only increment a _Bool which contains 0. Admittedly this
checking is trivial for a _Bool type.)
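
The _Bool behaviour described above is easy to verify:

#include <stdio.h>

int main(void)
{
    _Bool b = 5;          /* assignment converts: any nonzero value becomes 1 */
    printf("%d\n", b);    /* prints 1 */
    b = (_Bool)2;         /* the cast performs the same conversion: 1, never 2 */
    printf("%d\n", b);    /* prints 1 */
    return 0;
}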
 
James Kuyper

[...]
There is already a kind of restricted int in C, called a _Bool, which can
only contain (I believe) 0 or 1.

It can only be assigned a value of 0 or 1. By type punning it could be
given bit patterns that represent other values; the behavior when you
attempt to read the value of the object after such type punning is
undefined, of course.
When you assign an int to it, a runtime conversion seems to be performed (to
convert not-zero to 1). But when I tried a (_Bool) cast in gcc just now, it
still performed this conversion!

Good, that's precisely what's required by the standard.
... It didn't trust the programmer telling it
the int already contained 0 or 1.

But then, there is a slight ambiguity in the meaning of a cast anyway; does
it mean 'assume the value is already of the type expected, so don't
convert', or 'please convert'!

A cast expression always means "unconditionally convert" - inserting
"please" implies a degree of freedom that conforming implementations
don't have. The only time a cast implies "no conversion" is when the
compiler knows that no conversion is needed (for example, when
converting to the same type, or a type with the same representation, or
a literal value known to have the same representation in both types).
 
