How to elegantly get the enum code from its string type

Jonathan Lee

No.
From the theoretical perspective, overflow with a signed type is Undefined
Behavior, not well defined.

Really? Messed. But.. I don't see how that's an
improvement... Personally I'll chalk that up as a point in
favor of unsigned values...
There's a difference between a boundary 2 cm in front of your
nose, which you're likely to collide with, and one out in the
Andromeda galaxy.

Apparently the difference isn't the same for everybody.
In summary, the particular mathist viewpoint expressed above
brings nothing useful, but does rather confuse things.

I dunno. I think it points out that people approach the matter
differently. You guys keep arguing like it's going to come to
some satisfactory conclusion. It's not. You all just don't see
things the same way.

--Jonathan
 
Ian Collins

So who uses size_t on a modern machine?

Back when C was being standardized, 16 bit machines were still
legion, and the extra bit was necessary. Today, realistically,
size_t is an anachronism, and not really necessary. (It's worth
noting that the STL was originally developed using Borland C++,
on a 16 bit machine. Which possibly accounts for its use of
unsigned size_type as well.)

Specialised types like size_t do have a use on modern machines. It can
be used to index large data sets on 32 and 64 bit systems. On all
commonly used 64 bit models, int is only 32 bits. You could use long,
but that's still only 32 bits on 64 bit Windows. Or you could use long
long, but that has unnecessary overhead on some 32 bit machines.

Also size_t has another important use in both C and C++: it is the
result type of the sizeof operator.
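
For what it's worth, a minimal sketch of both points; the sizes in the
comments are typical values for common data models, not guarantees:

#include <cstddef>
#include <iostream>

int main()
{
    std::cout << "int:    " << sizeof(int)         << '\n'; // typically 4 on 32 and 64 bit systems
    std::cout << "long:   " << sizeof(long)        << '\n'; // 4 on 64 bit Windows, 8 on LP64 Unix
    std::cout << "size_t: " << sizeof(std::size_t) << '\n'; // 8 on the common 64 bit models

    int x = 0;
    std::size_t n = sizeof x; // sizeof yields a size_t, so the type cannot be avoided entirely
    std::cout << "sizeof x = " << n << '\n';
}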
 
Ian Collins

Nonsense or not, it's what the standard says.


It indicates that, on certain, very old machines, you need that
extra bit.

So my core i7 and AMD 64 systems are very old machines? Can you
recommend a new one?
Or that the STL was developed on 16 bit machines, where that
extra bit might have been relevant.

It still is today; try

size_t m = 1024*1024*1024;

m *= 3; // would overflow if size_t were a signed 32-bit type

char* p = static_cast<char*>( malloc( m ) );

in a 32 bit application where size_t is signed.
Once code has been tainted by an unsigned type, you more or less
have to go with the flow.

Why "tainted"? Are the standard library and POSIX wrong to use size_t
wherever a size is required? Does the use of the sizeof operator
"taint" code?
 
Vladimir Jovic

Keith said:
As opposed to that "negative" being silently converted to a
huge "positive" number?? Have you actually /thought/ about
this or bothered to read any of the lengthy discussions of
the topic (and I don't mean this thread)?

I had a discussion about this with colleagues last month, and they took
your position.

I must admit I haven't searched enough, because most of the Google hits on
this subject are garbage and not right (they talk about completely
different things, but my search criteria might be wrong).
Therefore, what I say is based on my own personal experience (5 years of
C/C++ learning in schools, 10 years of experience in various C or C++ projects).


A similar example to the one Alf posted:

<code>

#include <iostream>
#include <vector>
#include <cassert>
#include <string>

std::vector< std::string > a( 50, "abc" );

std::string GetString( const unsigned int idx )
{
    assert( idx < a.size() );
    return a[idx];
}

int main()
{
    std::cout << GetString( -3 );  // -3 converts to a huge unsigned value: triggers the assert
    std::cout << GetString( 3 );   // OK
    std::cout << GetString( 148 ); // will trigger the assert
}

</code>




I've already posted a search link once in the thread. Go find
it and/or google search comp.lang.c++.moderated for "unsigned"
and begin your studies.

Your link gives 8000 hits.
I went through maybe 200, and I found the stuff Alf explained, but I
also saw people using unsigned integers.
For example, like this:

for ( unsigned int i = 0; i < GetSomeNumber; ++i )
{
    // do something
}



Do you have something specific that explains why the unsigned numbers
are bad?
However, unlike Leigh, don't stop immediately as soon as you
find a post that agrees with your preconceptions. Instead study
the entirety of the arguments.

By the way, if one stops searching as soon as they find results
they agree with, then one is guilty of what is called "selection
bias". It allows one to remain ignorant for quite some time.

Of course.

But for every (bad) example of why unsigned is bad, you can find another
(bad) example of why signed is bad.
My conclusion is : think what type you really need, and avoid conversions.
 
Vladimir Jovic

Leigh said:
Vladimir Jovic said:
My conclusion is : think what type you really need, and avoid
conversions.

Your conclusion is valid yes although I would qualify "avoid
conversions" with "unless the situation calls for it".

The following is correct:

std::vector<int> v;
typedef std::vector<int>::size_type index;
...
for (index i = 0; i != v.size(); ++i)
{
    /* do stuff with v */
}

Whilst the following is incorrect:

std::vector<int> v;
...
for (int i = 0; i != v.size(); ++i)
{
    /* do stuff with v */
}


In the second for loop, you will get a warning (with properly set compiler
parameters) that you are trying to compare unsigned and signed integers.
 
Ian Collins

If the full address space of a program is 4G, one really cannot hope to
succeed in allocating a 3G chunk in it; memory fragmentation and OS
restrictions kick in long before. Even if there is a rare app needing to
do such things, then it is so platform-specific anyway that there is no
need to support it by the language standard.

The example works happily as a 32 bit application on my 64 bit
OpenSolaris boxes with plenty of RAM. Granted it is an edge case, but
one I have run into.
I would suggest a counter-example instead: with unsigned size_t on a 32-
bit app:

size_t m = 1024*1024*1024;

m *= 4; // wraps around to 0: well-defined for unsigned

char* p = static_cast<char*>(malloc( m )); // this "succeeds": the request is now for 0 bytes

p[42] = 100; // boom!

Boom indeed!

I'm on the fence in this argument. Coming from a hardware and driver
background, I would hate to have to work without unsigned types. Their
use probably should be confined to representing physical entities, bit
patterns and absolute sizes.
Using size_t is OK by itself. It's not OK that size_t is hard to use in
expressions and that wrapover in size_t is well-defined and thus cannot
be marked as an error by the compiler or run-time. The simplest fix for
those problems in C++ would be to use a signed type for size_t. A cleaner
fix would be to change the behavior of unsigned types and add bitmask_t
and wrapover_t types for those needing the special features of current
unsigned types.

The fact that neither of those fixes can really be applied today does
not make the current definition of size_t any better.

That's a fair summary. The only "completely safe" way to use unsigned
types is to assign them to a bigger signed type, which isn't always
available.
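
A small sketch of that "assign to a bigger signed type" approach, assuming
size_t is narrower than long long (true for a 32 bit size_t, not for 64 bit):

#include <cstddef>
#include <iostream>

int main()
{
    std::size_t a = 3, b = 5;

    // Widen to a bigger signed type before subtracting, so the result can be negative.
    long long d = static_cast<long long>(a) - static_cast<long long>(b);
    std::cout << d << '\n';   // -2

    // a - b directly would instead wrap to a huge unsigned value.
}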
 
Keith H Duggar

I'm on the fence in this argument. Coming from a hardware and driver
background, I would hate to have to work without unsigned types. Their
use probably should be confined to representing physical entities, bit
patterns and absolute sizes.

I don't think anyone is advocating never using them. Most are just
trying to get people to stop confusing /un/signed with /positive/.
It's right there in the name /un/signed meaning "without sign". It
is neither positive nor negative. That is why we use them for raw
bit manipulation where we don't want /any sign/ (as in "there is
no sign") to get in the way and do unexpected things.

/un/signed is a finite modular group having no sign. Kai-Uwe Bux's
nit notwithstanding, ie that their canonical C++ /representation/
is "positive" in a trivial sense. Once one stops wrongly thinking
of them as "positive" one also loses the temptation to use them
for "positive only" values.
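
A minimal sketch of the modular behaviour in question; the wrap is
guaranteed for unsigned types, while the commented-out signed counterpart
would be undefined behaviour:

#include <climits>
#include <iostream>

int main()
{
    unsigned u = 0;
    --u;                                  // well-defined: wraps modulo 2^n
    std::cout << (u == UINT_MAX) << '\n'; // 1

    // int i = INT_MIN;
    // --i;                               // signed overflow: undefined behaviour
}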

Other languages do have proper range types, natural, etc but core
C++ does not and we shouldn't pretend it does.

KHD
 
Kai-Uwe Bux

Keith said:
I don't think anyone is advocating never using them. Most are just
trying to get people to stop confusing /un/signed with /positive/.
It's right there in the name /un/signed meaning "without sign". It
is neither positive nor negative. That is why we use them for raw
bit manipulation where we don't want /any sign/ (as in "there is
no sign") to get in the way and do unexpected things.

/un/signed is a finite modular group having no sign. Kai-UweBux's
nit not withstanding ie that their canonical C++ /representation/
is "positive" in a trivial sense. Once one stops wrongly thinking
of them as "positive" one also loses the temptation to use them
for "positive only" values.

This is too strong a claim. The interpretation of unsigned types as modeling
a finite modular group works well for the three operators +, -, and *. It
does not work for the operators /, %, <, and >. The operators &, |, <<, and
>> present a somewhat different point of view entirely. Stressing the
modular arithmetic aspect of unsigned types may be appropriate for many
contexts, but that point of view is presented by the rules of C++ only as
one among several options. With regard to /, %, <, and >, the unsigned
types are best interpreted as non-negative integer values.

Keith said:
Other languages do have proper range types, natural, etc but core
C++ does not and we shouldn't pretend it does.

C++ has types that cater to different needs. I agree that the multi-faceted
nature of the unsigned types creates pitfalls and makes it more tricky to
mix them with signed types. However, I don't agree that the mantra "unsigned
means modular" is firmly rooted in the language rules nor am I convinced
that it provides a good cure for the problems.


Best

Kai-Uwe Bux
 
Keith H Duggar

It is not wrong at all to think of them as being "positive".

Stated with no supporting argument of course. Typical Leigh.
std::size_t is an unsigned integral type, this will not change.
std::size_t and other unsigned integral types can be used to represent sizes
and array indexes.
std::array<T>::size_type will be a typedef of std::size_t and is used
for array size() (always positive) and array indexes (always positive).

All of which have /nothing/ to do with them being /un/signed as
opposed to positive/negative ie "signed".
You have to accept that unsigned integral types pervade the language and
standard library so it is impossible to avoid them when writing new code.

Which again has /nothing/ to do with /un/signed != positive.
You are simply being a "use int everywhere" troll.

Where have I ever recommended one "use int everywhere"? (The
answer is of course "nowhere" but Leigh has a pitiable reading
comprehension problem.)

KHD
 
Ian Collins

std::size_t has nothing to do with bit manipulation and modular
arithmetic. std::size_t pervades the language and standard library. The
correct use of unsigned integral types cannot be avoided in C++.

Many hapless developers have managed to successfully avoid the correct
use of unsigned integral types and shot themselves in the foot!
 
Keith H Duggar

This is too strong a claim. The interpretation of unsigned
types as modeling a finite modular group works well for the
three operators +, -, and *. It does not work for the operators
/, %, <, and >.

I don't see how you come to that conclusion regarding / % < >?
Given that for unsigned p and d, division and modulus, p / d and
p % d, are both defined by p = d * q + r, ie in terms of * and +,
and that you claim my view is appropriate for both * and +, it
seems odd to claim it "doesn't work" for / and %.

As for < and > are you saying that a finite modular group
cannot be ordered? Or that /sign/ is necessary for ordering?
I don't think so. Please demonstrate these claims.

Anyhow, it seems to me that, in the words of Leigh, / % < > all
work "just fine" for an /un/signed interpretation that is "having
no sign". Not only that, but the defining equation for / and % is
even simpler for unsigned than it is for integer viz:

unsigned
--------
Let p and d be unsigned
Let p = d * q + r for unsigned * and + and unsigned q and r
such that r < d
Then p = d * q + r has a unique solution with q called the
"quotient" and r the "remainder" and
Define p / d = q
Define p % d = r

integer
-------
Let p and d be integers
Let p = d * q + r for integer * and + and integers q and r
such that |r| < |d| and sgn(r) = sgn(p)
Then p = d * q + r has a unique solution with q called the
"quotient" and r the "remainder" and
Define p / d = q
Define p % d = r

note that the constraint on r is simpler for unsigned than it is for
integer; otherwise the defining equations are the same and they
are "just fine" for unsigned as "having no sign", ie there is not
a single reference to sign (nor subtraction) in them (as there
must be for integer).
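
For what it's worth, a trivial check (just a sketch) that the built-in
unsigned / and % of C++ satisfy exactly that defining equation:

#include <cassert>
#include <iostream>

int main()
{
    unsigned p = 19, d = 6;
    unsigned q = p / d;       // 3
    unsigned r = p % d;       // 1
    assert( r < d );          // the only constraint needed in the unsigned version
    assert( p == d * q + r ); // the defining equation holds as an exact equality
    std::cout << q << ' ' << r << '\n';
}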

So can you please demonstrate your claim that /un/signed "does
not work for the operators /, %, <, and >"? Thanks!
The operators &, |, <<, and >> present a somewhat different
point of view entirely.

Which is irrelevant to the /arithmetic/ context. But it is what
I had in mind, of course, when I wrote "That is why we use them
for raw bit manipulation where we don't want /any sign/".
Stressing the modular arithmetic aspect of unsigned types may be
appropriate for many contexts, but that point of view is presented
by the rules of C++ only as one among several options. With regard
to /, %, <, and >, the unsigned types are best interpreted as
non-negative integer values.

See above. Please show how a non-negative integer interpretation
is "best" compared to non-signed finite modular group.
C++ has types that cater to different needs. I agree that the
multi-faceted nature of the unsigned types creates pitfalls and
makes it more tricky to mix them with signed types. However, I
don't agree that the mantra "unsigned means modular" is firmly
rooted in the language rules

It's written in black-and-white in the standard. I don't know how
much more firmly rooted it could be. What would convince you?
nor am I convinced that it provides a good cure for the problems.

I can only speak from personal experience as, frankly, this issue
just hasn't come up within any of the groups I've worked. (I mean
to say we've never debated the concepts.)

So I can only say that once I agreed with Leigh's position that
it was "just divine fine" to use unsigned when the semantic was
"always positive" that is "having a sign that happens to always
be positive". However, in anything but the simplest scenarios
(like the toy for loops Leigh keeps regurgitating) I ended up
regretting the decision because invariably I needed to perform
/arithmetic/ on those "always positive" types. And that in turn
often required annoying casts that uglified the code.

Once I changed my perspective to view unsigned as a modular type
literally as /un/signed ie "having no sign", such nastiness just
evaporated (except where we are forced to interact with unsigned
types in /arithmetic/ scenarios).

Anyhow, thanks for the enjoyable math discussion Kai-Uwe Bux and
I'm looking forward to your reply.

KHD

PS. One scenario I specifically recall where my initial unsigned
index choice (because indexes into vectors are "always positive"
right ;-) later became annoying was in implementing a convolution
function optimized for small kernels operating on large vectors.
 
Vladimir Jovic

First of all, thank you for such detailed explanation.

* Vladimir Jovic:

Unsigned types are problematic in arithmetic and comparisons, because

* unsigned arithmetic is modulo 2^n, where n is the number of value
representation bits (i.e. you have wrap-around behavior,
guaranteed), and

But you get a wrap around with signed as well.
* with like size types U and I in the same expression you get promotion
to U (you also get promotion to U if U is of greater size than I).

Yes, but with compilation options set, you will get a warning about a
comparison between signed and unsigned types. At least that's what you
can do in g++, I am assuming other compilers do it as well.

The first version of the program is a very common novice error, failing
to understand the issues of unsigned representation and unsigned
arithmetic: for the default signed 'char' type the value of 'æ' has
been wrapped and is therefore outside the required range for 'islower'.

Even professional programmers tend to make such mistakes, for as Gosling
observed "almost no C developers actually understand what goes on with
unsigned", and that applies also to C++ -- it's the same.

This doesn't mean that using 'unsigned char' is a good solution. It
means that mixing unsigned and signed, as the 'islower' standard lib
function does, is a recipe for disaster. It's just too darned easy to
get the /usage/ wrong.

Here's an expression example:

a < b + n

With signed integer types and common not-overly-large values for a and
b, this can also be expressed as

a - n < b

However, if the type involved is an unsigned one, then instead of a - n
possibly producing a negative value it will in that case wrap around,
with the result that when the former expression is 'true', the latter
expression is 'false'...

I.e., with unsigned arithmetic the usual math rules don't apply for
ordinary not-overly-large values -- which are the values most often
occurring.
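
A minimal sketch of that with small, perfectly ordinary values:

#include <iostream>

int main()
{
    unsigned a = 2, b = 5, n = 4;
    std::cout << (a < b + n) << '\n'; // 1: 2 < 9, as expected
    std::cout << (a - n < b) << '\n'; // 0: a - n wraps to a huge value instead of -2
}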

But such bugs can be introduced by not using a type that has enough
precision. For example, using signed short int instead of signed int:

#include <iostream>

// Both initializers below are outside the range of a 16-bit short,
// and the product overflows it as well.
signed short int foo( const signed short int a, signed short int b )
{
    return a * b;
}

int main()
{
    const signed short int a = 146163;
    const signed short int b = -37613;
    std::cout << a << " * " << b << " = " << foo(a,b) << std::endl;
}
Most programmers are aware of this when writing e.g. a loop that counts
down, but keeping it in mind for expressions like the above is much, much
harder. IIRC one almost grotesque example in recent years was when
someone noticed that Microsoft's code for rendering a picture had such a
bug, which with a suitably crafted picture made it possible to place
arbitrary bytes in memory. This then in turn allowed malware infection
via Internet Explorer by simply presenting a picture on your web site;
uh oh, infected by a JPEG, oh dear.

I haven't heard of this. Very good. hehe you made my day :)

-- Implicit promotion.

As a concrete example of implicit promotion, consider


<code>
#include <iostream>
#include <string>

std::string rightAdjusted( int n, std::string const& s )
{
    // s.length() is unsigned, so n is implicitly converted: a negative n
    // becomes a huge value here, and the subtraction below wraps as well.
    if( s.length() >= n )
    {
        return s;
    }
    else
    {
        return std::string( n - s.length(), ' ' ) + s;
    }
}

int main()
{
    using namespace std;
    for( int x = -5; x <= 5; ++x )
    {
        cout << rightAdjusted( x*x - 4, "o" ) << endl;
    }
}
</code>

<result>
o
o
o
o

This application has requested the Runtime to terminate it in an unusual
way.
Please contact the application's support team for more information.
</result>


Evidently there's something wrong. And e.g. the g++ compiler warns about
that, "warning: comparison between signed and unsigned integer
expressions".

But if you deliberately choose to ignore warnings, you should get what
you deserve ;)

I.e. this happens not only with comparisons but also in simple
arithmetic expressions, where the compiler will usually /not/ warn you,
as g++ didn't for the above expression. It is a real pitfall, a very
common error. But it's very simple to avoid, like just putting on a
condom: simply don't introduce unsigned values in the first place, e.g.,
define & use a signed type 'size' function.
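
A minimal sketch of what such a helper might look like; the name v_size and
the choice of std::ptrdiff_t are just illustrative, not anything fixed:

#include <cstddef>
#include <vector>

// Hypothetical signed-size helper: returns the size as a signed type so that
// expressions like v_size(v) - n follow ordinary integer arithmetic.
template< class Container >
std::ptrdiff_t v_size( Container const& c )
{
    return static_cast<std::ptrdiff_t>( c.size() );
}

int main()
{
    std::vector<int> v( 3 );
    std::ptrdiff_t d = v_size( v ) - 5;   // -2, as ordinary arithmetic suggests
    // With v.size() - 5 the result would instead have wrapped to a huge unsigned value.
    (void)d;
}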

But even with signed values you have to take care not to get overflow.
And the only way to prevent it is to test as much as possible.
 
Alf P. Steinbach

* Vladimir Jovic:
First of all, thank you for such detailed explanation.





But you get a wrap around with signed as well.

Sorry, no, that's not guaranteed.

I think it should be (in order to get rid of UB), but it isn't. The only one in
this group who can remember a computer using anything other than two's
complement is, as far as I know, James Kanze. Some years ago he mentioned a
still extant such computer which, unbelievably, had/has a C++ compiler, but I
say, let that antique go the way of the dinosaurs, no point in supporting it.

Anyway, it's the guaranteed wraparound for unsigned, and the *placement* of that
wraparound, namely at 0 smack in front of your nose, that's problematic.
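
The classic illustration of that placement problem is a loop that counts
down over an unsigned index (just a sketch):

#include <cstddef>
#include <iostream>

int main()
{
    std::size_t n = 3;

    // for ( std::size_t i = n - 1; i >= 0; --i )   // never terminates: i >= 0 is
    //     ...                                      // always true, and --i wraps at 0

    for ( std::size_t i = n; i-- > 0; )              // one common workaround
    {
        std::cout << i << '\n';                      // prints 2 1 0
    }
}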

Yes, but with compilation options set, you will get a warning about a
comparison between signed and unsigned types. At least that's what you
can do in g++, I am assuming other compilers do it as well.

Most compilers do not warn about promotion in arithmetic expressions.

In particular since you're using g++, the g++ compiler doesn't.


Cheers & hth.,

- Alf
 
Keith H Duggar

Your position seems to be that unsigned integers should only be
used for bit manipulation and modular arithmetic, this is an
incorrect position to take.

My position is that they ARE MODULAR ARITHMETIC types ie that
they OBEY MODULAR ARITHMETIC. I am sorry for your unfortunate
reading comprehension problem. If you want to try again here
is an excerpt from the standard:

3.9.1.4:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic
modulo 2^n where n is the number of bits in the value representation of
that particular size of integer. [footnote 17]

Perhaps with the aid of a dictionary and tutor you may succeed
in comprehending that simple passage. Good luck! Don't give up,
it will come with enough hard work.
People who hold this position I refer to as "use int everywhere" trolls,

Ok. So it's just an arbitrary sound-bite you invented to dismiss
those whose arguments you cannot counter rationally. Got it.
i.e. trolls who use int instead of the appropriate type and
stubbornly put forward this view in public forums.
...
std::size_t has nothing to do with bit manipulation and modular
arithmetic. std::size_t pervades the language and standard library.
The correct use of unsigned integral types cannot be avoided in C++.

Hehe .. "appropriate", "correct" ... Do you know the fallacy of
"begging the question"? Don't bother answering, that was just a
rhetorical question, I already know the answer. Go wikipeducate
yourself, study a few dozen examples of the fallacy and see if
you can learn to recognize and avoid it in your "arguments".

Anyhow, size_t DOES OBEY MODULAR ARITHMETIC. No matter how much
Leigh personally dislikes that FACT, no matter how he squirms,
weasels, flails, QQs, or otherwise cries, the C++ standard is
crystal clear:

3.9.1.4:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic
modulo 2^n where n is the number of bits in the value representation of
that particular size of integer. [footnote 17]

In C++ unsigned types OBEY THE LAWS OF MODULAR ARITHMETIC.

KHD
 
Kai-Uwe Bux

Keith said:
I don't see how you come to that conclusion regarding / % < >?
Given that for unsigned p and q division and modulus, p / d and
p % d, are both defined by p = d * q + r ie in terms of * and +,
and that you claim my view is appropriate for both * and +, it
seems odd to claim it "doesn't work" for / and %.

As for < and > are you saying that a finite modular group
cannot be ordered? Or that /sign/ is necessary for ordering?
I don't think so. Please demonstrate these claims.

Anyhow, it seems to me that, in the words of Leigh, / % < > all
work "just fine" for an /un/signed interpretation that is "having
no sign". Not only that, but the defining equation for / and % is
even simpler for unsigned than it is for integer viz:

unsigned
--------
Let p and d be unsigned
Let p = d * q + r for unsigned * and + and unsigned q and r
such that r < d
Then p = d * q + r has a unique solution with q called the
"quotient" and r the "remainder" and
Define p / d = q
Define p % d = r

integer
-------
Let p and d be integers
Let p = d * q + r for integer * and + and integers q and r
such that |r| < |d| and sgn(r) = sgn(p)
Then p = d * q + r has a unique solution with q called the
"quotient" and r the "remainder" and
Define p / d = q
Define p % d = r

note that constraint on r is simpler for unsigned than it is for
integer, otherwise the defining equations are the same and they
are "just fine" for unsigned as "having no sign" ie there is not
a single reference to sign (nor subtraction) in them (as there
must be for integer).

So can you please demonstrate your claim that /un/signed "does
not work for the operators /, %, <, and >"? Thanks!

Ok, I'll try.

First of all, the claim "unsigned means no-sign" is different from the claim
I was commenting on:

And I have trouble more with the first part of the claim than the second.
The second can be interpreted as saying that operations on unsigned types
can be explained without reference to signs. This is true (as far as I see)
but not very strong a claim since with those explanations we tend to
implicitly assume "positive" values. Nonetheless, please allow me to focus
in my discussion on the "finite modular group" aspect of the unsigned types.
What I shall argue is that this point of view does not do justice to the
operators /, %, and <.


First, let me discuss the issue of order. I agree that < _can_ be defined
for unsigned values (after all it is:). However, the notion of < does not
interact nicely with the additive structure of the modular arithmetic. One
very basic and important property of < for integers (or real numbers for
that matter) is _translation invariance_:

(*) for integers a, b, and c: a < b if and only if a+c < b+c

This does not hold _mod N_. In fact, a finite group cannot be totally
ordered so as to satisfy (*). Taking N = 32 as a small example, we have

8 < 15

but

8 + 20 = 28
15 + 20 = 35 = 3 mod 32

thus

15+20 < 8+20

So, even though comparing unsigned values in C++ is meaningful, the meaning
is not really well-aligned with the interpretation of unsigned values as
representing values in a modular group.
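
In C++ terms the same failure of translation invariance shows up directly
with unsigned int (a small sketch):

#include <iostream>
#include <limits>

int main()
{
    unsigned a = 8, b = 15;                                  // a < b
    unsigned c = std::numeric_limits<unsigned>::max() - 10;  // large enough to force a wrap

    std::cout << (a < b)         << '\n';  // 1
    std::cout << (a + c < b + c) << '\n';  // 0: b + c has wrapped past zero
}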


Now, let us turn to division and mod. As you point out, / and % can be
defined in terms of *, + and < by

(-) p = d * q + r

subject to the requirement 0 <= r < d. At least, this is what we would
_want_ to obtain: the value for q should be p/d and r should be p%d.

First, note that this definition hinges upon <, which in turn is not really
a concept of modular arithmetic. This triggers some suspicion that / and %
may also not be all that cool with regard to modular arithmetic.
Nonetheless, let us just use the C++ meaning of *, +, < for unsigned types
and see whether (-) can serve as a good definition of / and %.

The crucial property of (-) is that it should determine q and r _uniquely_.
Otherwise, we would not know which of the possible values to pick. Now, in
modular arithmetic as realized by the unsigned types, this uniqueness fails.
E.g., let us consider arithmetic mod 32 (so that we can do the computations
easily):

p = 19
d = 6

19 = 6*3 + 1 (mod 32) // this is the _desired_ solution since 1 < 6
19 = 6*8 + 3 (mod 32) // oops, this is a second solution since 3 < 6

So, we could have 19/6 = 8 and 19%6 = 3 and still satisfy (-) _mod 32_. In
other words, when you _interpret_ (-) as a statement in modular arithmetic,
then (-) fails to determine / and %.

That is why I'd say that / and % realize operations that are not best viewed
as modular arithmetic.

Which is irrelevant to the /arithmetic/ context. But it is what
I had in mind, of course, when I wrote "That is why we use them
for raw bit manipulation where we don't want /any sign/".


See above. Please show how a non-negative integer interpretation
is "best" compared to non-signed finite modular group.

Well, this is precisely the matter of (-). Once you interpret the symbols of
(-) over non-negative integers, the condition indeed does determine unique
values for q and r.
It's written in black-and-white in the standard. I don't know how
much more firmly rooted it could be. What would convince you?

I don't deny what is in the standard. What I deny is that the catch phrase
"unsigned means modular" captures the essence of unsigned types. It
overemphasizes a particular aspect and does not do justice to _other_
provisions in the standard that are hard to reconcile with modular
arithmetic.
I can only speak from personal experience as, frankly, this issue
just hasn't come up within any of the groups I've worked. (I mean
to say we've never debated the concepts.)

So I can only say that once I agreed with Leigh's position that
it was "just divine fine" to use unsigned when the semantic was
"always positive" that is "having a sign that happens to always
be positive". However, in anything but the simplest scenarios
(like the toy for loops Leigh keeps regurgitating) I ended up
regretting the decision because invariably I needed to perform
/arithmetic/ on those "always positive" types. And that in turn
often required annoying casts that uglified the code.

Once I changed my perspective to view unsigned as a modular type
literally as /un/signed ie "having no sign", such nastiness just
evaporated (except where we are forced to interact with unsigned
types in /arithmetic/ scenarios).

Now, I cannot dispute personal experience. So, I am prepared to accept that
once programmers adopt the view of unsigned types as (mainly) realizing
modular arithmetic, they might change their use of unsigned types and run
into fewer problems because of that shift.

What I do agree on is that "unsigned means non-negative" is a misguided
mantra, which has (also) no foundation in the standard. I also can see that
following this slogan does lead to problems. I would suspect (although I am
open to correction from experience) that the main benefit of the phrase
"unsigned means modular" is to act as an antidote to "unsigned means non-
negative".

[...]
PS. One scenario I specifically recall where my initial unsigned
index choice (because indexes into vectors are "always positive"
right ;-) later became annoying was in implementing a convolution
function optimized for small kernels operating on large vectors.

Hm, that sounds very interesting. I only did a toy implementation of DFT
once (more or less to see what it is) and implemented convolution on top of
that. I don't recall heavy index arithmetic, rather the algorithms would
just work on ranges given by iterators. But again, I did not optimize in any
way or form. Here, it appears, the devil is hiding in the details, ready to
jump at you at any moment.


Best

Kai-Uwe Bux
 
Keith H Duggar

Ok, I'll try.

First of all, the claim "unsigned means no-sign" is different from the claim
I was commenting on:


And I have trouble more with the first part of the claim than the second.
The second can be interpreted as saying that operations on unsigned types
can be explained without reference to signs. This is true (as far as I see)
but not very strong a claim since with those explanations we tend to
implicitly assume "positive" values. Nonetheless, please allow me to focus

Except I'm saying that implicit assumption is a /problem/ when
dealing with unsigned. But, I agree let's stick to the math for
now. It's more fun and more easily "proved" LOL.
in my discussion on the "finite modular group" aspect of the unsigned types.
What I shall argue is that this point of view does not do justice to the
operators /, %, and <.

First, let me discuss the issue of order. I agree that < _can_ be defined
for unsigned values (after all it is:). However, the notion of < does not
interact nicely with the additive structure of the modular arithmetic. One
very basic and important property of < for integers (or real numbers for
that matter) is _translation invariance_:

(*) for integers a, b, and c: a < b if and only if a+c < b+c

This does not hold _mod N_. In fact, a finite group cannot be totally
ordered as to satisfy (*). Taking N = 32 as a small example, we have

8 < 15

but

8 + 20 = 28
15 + 20 = 35 = 3 mod 32

thus

15+20 < 8+20

I agree. However, it is common for different groups to exhibit
different properties for operators. And translation invariance
of < is not a requirement, as you point out, of finite groups.
So, even though comparing unsigned values in C++ is meaningful, the meaning
is not really well-aligned with the interpretation of unsigned values as
representing values in a modular group.

Now, let us turn to division and mod. As you point out, / and % can be
defined in terms of *, + and < by

(-) p = d * q + r

subject to the requirement 0 <= r < d. At least, this is what we would
_want_ to obtain: the value for q should be p/d and r should be p%d.

First, note that this definition hinges upon <, which in turn is not really
a concept of modular arithmetic. This triggers some suspicion that / and %
may also not be all that cool with regard to modular arithmetic.
Nonetheless, let us just use the C++ meaning of *, +, < for unsigned types
and see whether (-) can serve as a good definition of / and %.

The crucial property of (-) is that it should determine q and r _uniquely_.
Otherwise, we would not know, which of the possible values to pick. Now, in
modular arithmetic as realized by the unsigned types, this uniqueness fails.
E.g., let us consider arithmetic mod 32 (so that we can do the computations
easily):

p = 19
d = 6

19 = 6*3 + 1 (mod 32) // this is the _desired_ solution since 1 < 6
19 = 6*8 + 3 (mod 32) // oops, this is a second solution since 3 < 6

So, we could have 19/6 = 8 and 19%6 = 3 and still satisfy (-) _mod 32_. In
other words, when you _interpret_ (-) as a statement in modular arithmetic,
then (-) fails to determine / and %.

That is why I'd say that / and % realize operations that are not best viewed
as modular arithmetic.

Now hold on. The definition of division above is an /equality/
not a /congruence/ (2) and that applies for both the modular and
"regular" versions of the definition. So there is no (mod 32)
in /this/ definition of division (1). So the unique solution is
(3,1). Or am I missing something fundamental?

(1) There is a another common definition of "division" which /is/
based on congruence instead of equality and defines division by d
as multiplication by d^-1 the multiplicative inverse of d where

d * d^-1 =m= 1

with =m= representing congruence modulo m.

However, this definition is limited to d which are coprime to the
modulus and is more difficult to calculate so it would be rather
inconvenient for a programming language. Also the result merges
the quotient and remainder in an interesting way into a single
result even if the remainder is not 0. Finally, and importantly
for this context, the multiplicative inverse definition does
not apply at all to "regular" arithmetic on integers.
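
For the curious, a small sketch of that alternative definition using the
extended Euclidean algorithm (just an illustration; note that 6 has no
inverse mod 32, which ties back to the example above):

#include <iostream>

// Extended Euclid: returns gcd(a, b) and fills x, y with a*x + b*y == gcd(a, b).
long long egcd( long long a, long long b, long long& x, long long& y )
{
    if ( b == 0 ) { x = 1; y = 0; return a; }
    long long x1, y1;
    long long g = egcd( b, a % b, x1, y1 );
    x = y1;
    y = x1 - ( a / b ) * y1;
    return g;
}

// Multiplicative inverse of d modulo m, or -1 if d is not coprime to m.
long long modinv( long long d, long long m )
{
    long long x, y;
    if ( egcd( d, m, x, y ) != 1 ) return -1;
    return ( ( x % m ) + m ) % m;
}

int main()
{
    std::cout << modinv( 6, 32 ) << '\n';  // -1: 6 and 32 share a factor, no inverse
    std::cout << modinv( 5, 32 ) << '\n';  // 13, since 5 * 13 = 65 = 2*32 + 1
}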

(2) Meh while I was going back to respond to
First, note that this definition hinges upon <, which in turn is not really
a concept of modular arithmetic. This triggers some suspicion that / and %
may also not be all that cool with regard to modular arithmetic.

I realized an unfortunate and careless s/integer/unsigned/g led
to this phrase "for unsigned * and +" in "Let p = d * q + r for
unsigned * and + and unsigned q and r". That is not intended to
imply the = is a congruence (it is not) nor that the intermediate
expressions in the definition are modulo something. "unsigned
p, d, q, r" just means they are members of the group.

Anyhow, the point is, in response to both the snip above and
the bad substitution, definitions frequently involve "meta" sets,
functions, etc. So the pedantic full blown definition of modular
division would involve /integers/ proper along with /integer/
addition, multiplication, etc. So < does not trigger suspicion
for me because in the back of my mind I know there is the larger
precise definition that uses integer < if necessary.
Well, this is precisely the matter of (-). Once you interpret the symbols of
(-) over non-negative integers, the condition indeed does determine unique
values for q and r.

Or once you apply /equality/ instead of the congruence relation.
I don't deny what is in the standard. What I deny is that the catch phrase
"unsigned means modular" captures the essence of unsigned types. It
overemphasizes a particular aspect and does not do justice to _other_
provisions in the standard that are hard to reconcile with modular
arithmetic.

I think we have to resolve the central issue of (-) above first.
I await your reply. Thanks!

KHD
 
Vladimir Jovic

Keith said:
Now hold on. The definition of division above is an /equality/
not a /congruence/ (2) and that applies for both the modular and
"regular" versions of the definition. So there is no (mod 32)
in /this/ definition of division (1). So the unique solution is
(3,1). Or am I missing something fundamental?

Maybe. Just calculate:

6 * 3 + 1 = 19, and 19 % 32 = 19
6 * 8 + 3 = 51, and 51 % 32 = 19
6 * 13 + 5 = 83, and 83 % 32 = 19
6 * 18 + 7 = 115, and 115 % 32 = 19

Therefore the solution (3,1) is not unique. The results are (3,1),
(8,3), (13,5), (18,7).
 
Vladimir Jovic

Alf said:
* Vladimir Jovic:


Most compilers do not warn about promotion in arithmetic expressions.

In particular since you're using g++, the g++ compiler doesn't.

But how are you getting rid of the warning about comparison between
signed and unsigned? Simple example:

#include <vector>
int main()
{
    std::vector< int > v( 100, 200 );
    for ( int i = 0; i < v.size(); ++i )
    {
        // do something
    }
}
 
Alf P. Steinbach

* Vladimir Jovic:
But how are you getting rid of the warning about comparison between
signed and unsigned? Simple example:

#include <vector>
int main()
{
    std::vector< int > v( 100, 200 );
    for ( int i = 0; i < v.size(); ++i )
    {
        // do something
    }
}

I recommend writing that as

#include <vector>
#include <blahblah.h> // Index, size()

int main()
{
    std::vector< int > v( 100, 22222 );
    for( Index i = 0; i < size( v ); ++i )
    {
        // Do something.
    }
}

Using plain "int" instead of "Index" may be a Bad Habit for the 64-bit world.

Check out the thread "Automatic function result type adaption depending on arg?"
for discussion of how to define functions like size(); there's also real code.


Cheers & hth.,

- Alf
 
Paul Bibbings

Alf P. Steinbach said:
I recommend writing that as

#include <vector>
#include <blahblah.h> // Index, size()

int main()
{
std::vector< int > v( 100, 22222 );
for( Index i = 0; i < size( v ); ++i )
{
// Do something.
}
}

Using plain "int" instead of "Index" may be a Bad Habit for the 64-bit world.

Check out the thread "Automatic function result type adaption
depending on arg?" for discussion of how to define functions like
size(); there's also real code.


In the (near) future (C++0x), would the following achieve the same?

#include <vector>

int main()
{
    std::vector<int> v(100, 22222);
    decltype(v.size()) i;
    for (i = 0; i < v.size(); ++i)
    {
        // Do something
    }
}

Regards

Paul Bibbings
 
