Are these the best set of bitset macros in the world or what!?

L

lawrence.jones

Ben Bacarisse said:
This sounded wrong and since I happen to have a copy of "IBM System/370
Principles of Operation" here I was able to check. The MSB is always
numbered 0 in that manual.

I stand corrected. I must have been thinking of some other big endian
system I worked on in the dark ages.
 
B

Brian

Nick said:
you didn't give any expansion on this

It's been said and said and said again so many times that I don't feel it
needs further explanation. Safe to say that I am one of the ones who
likes the part that says "use any part of it or all of it as your little
heart desires".
I meant "one builds on the standard stuff..."

I know. I was just emphasizing.
it's also a common thing to reinvent (unnecessary) wheels.

Well no need to try and convince me, I've heard all the flames overs the
years and make decisions appropriate for my projects.
Sometimes
it's a good idea. Some people have built better string libraries.


well we'll have to disagree here
OK.

if it's standard it'll tend to be portable. If you move it from
Windows to Linux and you've avoided too many platform specific APIs
(or encapsulated them well) then porting will be easier

That is "emphasized" way too much and it wastes a lot of time and makes
code unmaintainable. First assess the real needs, not some
way-out-there-chance, and build what is appropriate. The "portability"
rant is pure dogma or the like.
for instance?

My bitset macros! :)
 
B

Brian

Seebs said:
Because he thinks he can do better. As a result of which, he has been
forced to accept a few minor issues:

* He can never move to a different operating system or upgrade his
compiler.
* He can never write code other people can use.
* He can never use code other people have written.

None of the above is, of course, true to any degree. Indeed, my
foundational libraries are designed to abstract-away platforms. I wonder
if he feels childish for posting what he did above, for a programmer is
supposed to be open-minded, practical and able to think abstractly (a few
nice-to-have traits). (More on SW dev "personalities" later, for sure).
 
M

Marcin Grzegorczyk

Seebs said:
Well I don't drink morning coffee (and even if I did, its effects would
have long worn out by now), but I think you have, except that the first
one should be
#define BIT_N(u, n) (((u) * 0 + 1u) << (n))
so the type is always unsigned as long as the type of 'u' is unsigned.

Not so! [snip quotation of 6.3.1.8]
So if you are on a system where there is a signed type which can represent
every possible unsigned int, an object of that type put into:

((u) * 0 + 1u)

will still result in a signed object of that type, with the value 1.

I said "as long as the type of 'u' is unsigned"; obviously your macros
will not be safe if you pass, say, signed long as the first argument.

But if the first argument has an unsigned type, I cannot see how it
could fail. Either its integer conversion rank will be greater or equal
than that of int, and then both the literal 0 (type = int) and 1u (type
= unsigned int) will be converted to that type; or it will be promoted
to (signed or unsigned) int, and then the unsigned int introduced by 1u
must win.

Of course, the macros do suffer from the usual problem of evaluating
their first argument more than once; but it appears that as long as the
usual precautions are taken, the type of 'u' is unsigned, and 'n' is
nonnegative and less than the width of 'u', there are no nasal demons
lurking.

If anyone thinks I'm missing something, please let me know :)
 
B

Brian

Shao said:
Brian said:
Seebs wrote:

[snip]

Your entire post and every "point" in it was a strawman. (Seems to
be a pattern in this ng, for I've found 2 in just reading very few
posts. I hope you and the other guy are just squeeky wheels and not
representative of the whole group).

I think that bit manipulation macros have a use, in general. I don't
know that yours are the very best. If they included documentation and
attempted to avoid identifier collisions, I think they'd be even
better. A personal aesthetic preference of mine is that macro
identifiers use all upper-case, but that's not a universal
preference. It's possible that you could incorporate some of the
feedback in the thread thus far into the macros and offer a version
#2 (or #3, since I think you did already note some improvements based
on feedback), and change from "Are these..." to "Ok, are THESE..." :)

Yeah. On rewrite, before I posted them, I "over-thought" the n-1 thing
and should have left them as they were. I did post them with just n
anyway, not n-1. Why I had defines instead of typedefs, god only knows,
probably just a holderover from the original writing many years ago. The
lack of the u qualifier hasn't ever bit me: maybe there was a ui32 there
as the ui64 is in the 64-bit macro and that got lost on the rewrite.
I believe that people who publicly filter ("plonk") you are attempting
to protect themselves against wasting their time in discussion they
mightn't be interested in and/or attempting to publicly advertise
their opinion that you are not worth C discussion.

More on "bittwits" later.
But if you carry on posting to this newsgroup, you might be surprised
to find that a plonker replies to replies of your posts and might even
continue to portray you in negative ways, regardless of the brevity of
your previous, pre-plonk interaction... It's bullying. You might
find yourself portrayed as a person with characteristics that you do
not agree with.

No need to explain. I've rejected many a "bittwit" on my projects over
the years and am well aware of all the personalities.
Have a look at any of your plonkers' signatures. See if you can find
a Wikipedia link. What does its inclusion suggest to you? Just
something to keep in mind.

I don't ban anyone (accept on real projects!) but if I get too annoyed I
just "put them in their place", and a few explitives usually helps them
to self-destruct also.
Fortunately, there appears to be a lot of intelligence in the forum,

I know it. I'll bet some of the most valuable intelligence is suppressed
by the bittwits (I may define the term later, as it seems to be necessary
to do the "roles and responsibilities" thing formally).
and perhaps more forgiveness with some subscribers than with others.
Interestingly, the regular posters do not resemble a "gang" which
automatically plonks when one regular poster plonks; a demonstration
of autonomy and a measure of insurance that even if someone, perhaps
yourself, is disagreeable in discussion about C, anything you offer
that might be valuable might be noticed, discussed, clarified,
refined, referenced, dismissed, dealt with, or whatever.

I have found that too. I'm "used to" a more cordial discourse (remember
compuserve? BIX?). The internet is highly volatile and quick to get
there, I hope it improves (not that I'm an angel... I bite when
cornered?), but that would be like hoping public schools weren't a
similar sadening statement.
 
M

Marcin Grzegorczyk

Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.

As a matter of nitpick, it actually exhibits UB if n >= the *precision*
of int.
 
M

Marcin Grzegorczyk

Niklas said:
Marcin said:
[...]
Yes, but do you know any CPU architecture where bits are consistently
numbered that way? I know quite a few big-endian architectures, and on
all of them bit 0 is the LSb. (Motorola M68030 actually has 2 sets of
bit manipulation instructions, one of which numbers bits starting from
the LSb and the other from MSb!)

The MIL-STD-1750 architecture is big-endian (in a 32-bit integer, the
more significant 16-bit word comes first) and numbers bits starting from
the most significant bit, which is bit number 0.
Once upon a time the IBM 360 and 370 weren't considered exotic! :)
There 0 was always MSB. (Some say IBM deliberately did everything
opposite to the norm.) That machine of my youth made an impression
on me.

OK, I knew there *had* to be such architectures, I just couldn't name
any. Thank you both for enlightening me :)
Anyway, if bytes are numbered left-to-right, so should be bits
intuitively.

Yes and no. If you have machine instructions to manipulate individual
bits of data of more than one width (as both M68k and IA-32
architectures do), it can be quite confusing. (For example, which bit
of a register is the bit 0 depends on whether you treat that register as
16-bit or 32-bit. Yuck. I guess that's why that way of numbering bits
never became popular, as it appears.)

IMHO this is the only substantial argument in favor of little-endian
(not that I'm a fan of little-endian). But as you (i.e., James) said,
that's a topic for another forum :)
 
S

Seebs

Seebs wrote:

I misread this, and now I understand it. You're right.
I said "as long as the type of 'u' is unsigned"; obviously your macros
will not be safe if you pass, say, signed long as the first argument.

Well, the interesting thing is, I think they'll be "safe" -- they'll
always produce a value of the type passed in. It's up to you whether
you need it to be unsigned.
Of course, the macros do suffer from the usual problem of evaluating
their first argument more than once; but it appears that as long as the
usual precautions are taken, the type of 'u' is unsigned, and 'n' is
nonnegative and less than the width of 'u', there are no nasal demons
lurking.

Yup. (Well, as someone already pointed out, "width" isn't quite the
right word, but I know what you mean.)

-s
 
M

Marcin Grzegorczyk

Seebs said:
Seebs said:
#define BIT_N(u, n) (((u) * 0 + 1) << (n))
#define BIT_SET(u, n) ((u) | BIT_N((u), (n)))
#define BIT_CLEAR(u, n) ((u) & ~BIT_N((u), (n)))
[snip]
I said "as long as the type of 'u' is unsigned"; obviously your macros
will not be safe if you pass, say, signed long as the first argument.

Well, the interesting thing is, I think they'll be "safe" -- they'll
always produce a value of the type passed in. It's up to you whether
you need it to be unsigned.

You're right, I think. I've re-read 6.5.3.3 and 6.5.10 to be sure, and
it appears that as long as both 'u' and 'n' are nonnegative and 'n' is
less than the precision of 'u', the result is perfectly defined even
though the result of ~BIT_N((u), (n)) may be unspecified. Interesting.
 
M

Marcin Grzegorczyk

Marcin said:
Seebs said:
Seebs wrote:

#define BIT_N(u, n) (((u) * 0 + 1) << (n))
#define BIT_SET(u, n) ((u) | BIT_N((u), (n)))
#define BIT_CLEAR(u, n) ((u) & ~BIT_N((u), (n))) [snip]
I said "as long as the type of 'u' is unsigned"; obviously your macros
will not be safe if you pass, say, signed long as the first argument.

Well, the interesting thing is, I think they'll be "safe" -- they'll
always produce a value of the type passed in. It's up to you whether
you need it to be unsigned.

You're right, I think. I've re-read 6.5.3.3 and 6.5.10 to be sure, and
it appears that as long as both 'u' and 'n' are nonnegative and 'n' is
less than the precision of 'u', the result is perfectly defined even
though the result of ~BIT_N((u), (n)) may be unspecified. Interesting.

(Correcting myself: implementation-defined, not unspecified. I think.)
 
S

Shao Miller

Marcin said:
Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.

As a matter of nitpick, it actually exhibits UB if n >= the *precision*
of int.

Here is 'n1256.pdf', section 6.5.7, paragraph 3:

"The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the value of
the right operand is negative or is greater than or equal to the width
of the promoted left operand, the behavior is undefined."

That bit includes "greater than or equal to the width". Why are you
suggesting that it's the "precision"? If a sign-and-magnitude signed
representation applies and negative zero is a trap representation, then
I agree. Is there something else you are thinking of?
 
M

Marcin Grzegorczyk

Shao said:
Marcin said:
Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.

As a matter of nitpick, it actually exhibits UB if n >= the
*precision* of int.

Here is 'n1256.pdf', section 6.5.7, paragraph 3:

"The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the value of
the right operand is negative or is greater than or equal to the width
of the promoted left operand, the behavior is undefined."

That bit includes "greater than or equal to the width". Why are you
suggesting that it's the "precision"?

The next paragraph says about the result of E1 << E2, /inter alia/:

# If E1 has a signed type and nonnegative value, and E1*(2**E2)
# is representable in the result type, then that is the resulting value;
# otherwise, the behavior is undefined

Now, the value of 1 << n (mathematically equal to 2**n) will be
representable in int if and only if n is less than the precision of int.
 
K

Keith Thompson

Marcin Grzegorczyk said:
Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.

As a matter of nitpick, it actually exhibits UB if n >= the *precision*
of int.

Hmm. I think you're right (specifically when the left operand is 1).

For concreteness, let's assume 16-bit int with a range of
-32768..+32767 (and no padding bits, though I don't think that
matters here). So the precision of int is 15, and the width is 16.

C99 6.5.7p3:

The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If the
value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior
is undefined.

p4:

[...] If E1 has a signed type and nonnegative value, and
E1 * 2^E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined.

Neither (1 << 15) nor (1 << 16) violates p3, since both 15 and 16 are
less than or equal to the width. But they both violate p4, since the
mathematical results are 32768 and 65536, respectively, neither of which
is representable in int.

(1 << 15) --> 32768, violates p4, UB
(1 << 16) --> 65536, violates p4, UB
(1 << 17) --> 131072, violates both p3 and p4, UB

(0 << 15) --> 0, ok
(0 << 16) --> 0, ok
(0 << 17) --> 0, but violates p3, UB

(1u << 15) --> 32768, ok
(1u << 16) --> 65536, reduced to 0, ok
(1u << 17) --> 131072, but violates p3, UB

(I've lost count of the number of off-by-one errors I've committed
while revising this post.)
 
E

Ersek, Laszlo

(A bit late, sorry.)

I believe another way to do what ((u) * 0 + 1) is meant to do here would
be

(0 ? (u) : 1)

except it doesn't evaluate "u" at all (6.5.15p4-5).

(The idea is not mine; I seem to remember somebody posting a link to some
C++ (specifically, boost) template magic here, and that string of spells
relied on both the non-evaluation and the type inference.)

lacos
 
M

Marcin Grzegorczyk

(A bit late, sorry.)

I believe another way to do what ((u) * 0 + 1) is meant to do here would be

(0 ? (u) : 1)

except it doesn't evaluate "u" at all (6.5.15p4-5).

I believe it's as close to perfection as one can get :)
Thank you for the tip!
 
S

Shao Miller

Marcin said:
Shao said:
Marcin said:
Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.

As a matter of nitpick, it actually exhibits UB if n >= the
*precision* of int.

Here is 'n1256.pdf', section 6.5.7, paragraph 3:

"The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the value of
the right operand is negative or is greater than or equal to the width
of the promoted left operand, the behavior is undefined."

That bit includes "greater than or equal to the width". Why are you
suggesting that it's the "precision"?

The next paragraph says about the result of E1 << E2, /inter alia/:

# If E1 has a signed type and nonnegative value, and E1*(2**E2)
# is representable in the result type, then that is the resulting value;
# otherwise, the behavior is undefined

Now, the value of 1 << n (mathematically equal to 2**n) will be
representable in int if and only if n is less than the precision of int.

Ah yes, indeed! Thank you! 2^n is truly not representable with a
precision of n. Excellent nit-pick. :)

We don't even get as far as producing a negative zero before invoking
undefined behaviour; the sign bit is not defined to be affected by '1 <<
n', where 'n' is the precision of 'int'. Good to keep in mind!
 
S

Shao Miller

Keith said:
Marcin Grzegorczyk said:
Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.
As a matter of nitpick, it actually exhibits UB if n >= the *precision*
of int.

Hmm. I think you're right (specifically when the left operand is 1).

For concreteness, let's assume 16-bit int with a range of
-32768..+32767 (and no padding bits, though I don't think that
matters here). So the precision of int is 15, and the width is 16.

... ... ...

Neither (1 << 15) nor (1 << 16) violates p3, since both 15 and 16 are
less than or equal to the width. But they both violate p4, since the
mathematical results are 32768 and 65536, respectively, neither of which
is representable in int.

(1 << 15) --> 32768, violates p4, UB
(1 << 16) --> 65536, violates p4, UB

Does this one just above not violate p3, as well? 16 >= 16 (the width)?
(1 << 17) --> 131072, violates both p3 and p4, UB

(0 << 15) --> 0, ok
(0 << 16) --> 0, ok

Same just above?
(0 << 17) --> 0, but violates p3, UB

(1u << 15) --> 32768, ok
(1u << 16) --> 65536, reduced to 0, ok

Hmmm... For this one just above: Does p3 or p4 take precedence? Once
again, 16 (right operand) >= 16 (width) which p3 states is undefined.
But the "...reduced modulo..." in p4 confuses me.
(1u << 17) --> 131072, but violates p3, UB

(I've lost count of the number of off-by-one errors I've committed
while revising this post.)

Eep.
 
K

Keith Thompson

Shao Miller said:
Keith said:
Marcin Grzegorczyk said:
Keith Thompson wrote:
[snip]
Yes, in (1 << n), the type of the result is int, and it exhibits
UB if n >= the width of int.
As a matter of nitpick, it actually exhibits UB if n >= the *precision*
of int.

Hmm. I think you're right (specifically when the left operand is 1).

For concreteness, let's assume 16-bit int with a range of
-32768..+32767 (and no padding bits, though I don't think that
matters here). So the precision of int is 15, and the width is 16.

... ... ...

Neither (1 << 15) nor (1 << 16) violates p3, since both 15 and 16 are
less than or equal to the width. But they both violate p4, since the
mathematical results are 32768 and 65536, respectively, neither of which
is representable in int.

(1 << 15) --> 32768, violates p4, UB
(1 << 16) --> 65536, violates p4, UB

Does this one just above not violate p3, as well? 16 >= 16 (the width)?

Yes, it does; thanks for catching that.
Same just above?
Indeed.

Again, I was mistaken; (1U << 16) violates p3, thus UB.
Hmmm... For this one just above: Does p3 or p4 take precedence? Once
again, 16 (right operand) >= 16 (width) which p3 states is undefined.
But the "...reduced modulo..." in p4 confuses me.

p3 says it's undefined; that takes precedence. The standard doesn't
quite say so, but it wouldn't make much sense otherwise; the UB in p3
would never apply if the left operand is unsigned or has the value 0.

And at least two of them survived. *sigh*
 
E

Ersek, Laszlo

I believe it's as close to perfection as one can get :)
Thank you for the tip!

I like very much what Seebs wrote: multiplying by the additive identity,
then adding the multiplicative identity :)

This leads me now to some further attempts (none of which are solutions
due to "holes" in their domains, and/or multiple evaluations):

u / u /* doesn't work for 0 */

u % (u - 1) /* only for u > 2 */

u - u + 1 /* variation of u*0 */

(Extra parens omitted.)

lacos
 
B

Brian

Keith said:
Brian said:
pete said:
Brian wrote:

Keith Thompson wrote:
Eric Sosman wrote:
[...]
- What is the type of `1'?

I would assume it would be "coerced" to the type of n, yes? [...]

The type of 1 is int.

No matter what the type of n is?

Correct.

In ((a) << (n)),
the types of (a) and (n) are unrelated.

I find that non-regular and surprising and a good reason for hiding
it behind some macros.

Only because you fail to understand the underlying principles.
No, because it is. And there are many faults in C. My macros correct some
of them. Next person who starts with the pedantic shit gets a size 12
boot up their ass. Just curb it bitheads, ok? ok, good.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,083
Messages
2,570,591
Members
47,212
Latest member
RobynWiley

Latest Threads

Top