Lolling at programmers, how many ways are there to create a bitmask ? ;) :)

Skybuck Flying · May 27, 2010

Seebs said:
On many systems.

No, it isn't. Don't do something incoherent.

I have no idea what you think you are talking about. There is no reason
you
need to use shifts to write to a bitstream. Perhaps more importantly,
what
on earth do you think you're shifting? If you have a 32-bit value, and
you
shift it by 32, you have shifted ALL of the data out of it. Why bother?
If you're not going to be using any remaining bits at all, why are you
performing an operation?

I think to get rid of a branch (branches slow down cpu's) and thereby speed
up the code.

Would you rather write:

// 1.
Z := X shl Y;

// or

// 2.
if Y < 32 then
begin
Z := X shl Y;
end else
begin
Z := X;
end;

?

Bye,
Skybuck.

Seebs · May 27, 2010

A quick look at Skybuck Flying's posting history in other newsgroups
might be illuminating.

I have enough information to convince me that he's a waste of valuable
electrons. I think he's probably purely trolling, but it's the kind
of trolling where genuine stupidity shines through.

-s

James Harris · May 27, 2010

I have enough information to convince me that he's a waste of valuable
electrons. I think he's probably purely trolling, but it's the kind
of trolling where genuine stupidity shines through.

I'm not going to try and defend him but having seen his posts for some
time I don't think he's trolling. His interests are or have been video
processing. He puts a lot of effort into getting the best x86
instruction sequences from his Delphi compiler. The rest is mainly
communication style.

James

James Harris · May 27, 2010

I think to get rid of a branch (branches slow down cpu's) and thereby speed
up the code.

Would you rather write:

// 1.
Z := X shl Y;

// or

// 2.
if Y < 32 then
begin
Z := X shl Y;
end else
begin
Z := X;
end;

If you are talking about x86 don't be afraid of branches. Instead, be
afraid of unpredictable branches. Further, from examples I've seen in
the past the processors can make a surprisingly good job of predicting
sequences we might consider random.

It does help if you can present it stable sequences: longish runs of Y
< 32, longish runs of Y >= 32 in your example.

James

Seebs · May 27, 2010

I'm not going to try and defend him but having seen his posts for some
time I don't think he's trolling.

Could be.

His interests are or have been video
processing. He puts a lot of effort into getting the best x86
instruction sequences from his Delphi compiler. The rest is mainly
communication style.

Could be, but it's a communications style which seems rude to me, and
the outrage at the idea that a processor might require you to give it
only semantically valid instructions strikes me as a bad sign.

-s

Seebs · May 27, 2010

If you need a branch for this, you've done it wrong.

If you are talking about x86 don't be afraid of branches. Instead, be
afraid of unpredictable branches. Further, from examples I've seen in
the past the processors can make a surprisingly good job of predicting
sequences we might consider random.

For that matter, why are you writing something this low-level to begin
with? And what on earth is it doing cross-posted to comp.lang.c, when it's
obviously nothing to do with C?

It does help if you can present it stable sequences: longish runs of Y
< 32, longish runs of Y >= 32 in your example.

Yes.

It also helps if you think about the algorithm in advance, because it is
patently obvious that shifting a word by more than the word size is
meaningless, and means that the entire operation you're coming in from was
nonsense.

The problem with taking a long walk off a short pier isn't that walking
is mistakenly producing undefined behavior. It's that the instructions are
either wrong, or intentionally producing that behavior.

-s

Robert Myers · May 27, 2010

Could be.

Could be, but it's a communications style which seems rude to me, and
the outrage at the idea that a processor might require you to give it
only semantically valid instructions strikes me as a bad sign.

Rudeness on Usenet? What *is* the world coming to?

Main Entry: 1loll
Pronunciation: \ËˆlÃ¤l\
Function: verb
Etymology: Middle English
Date: 14th century
intransitive verb
1 : to hang loosely or laxly : droop
2 : to act or move in a lax, lazy, or indolent manner : lounge
transitive verb
: to let droop or dangle
synonyms see idle
â€” lollÂ·er \ËˆlÃ¤-lÉ™r\ noun

In any case, computers know what programmers really mean. If not,
they're just being difficult, like a recalcitrant child. A very
stubborn and defiant recalcitrant child.

Robert.

Skybuck Flying · May 28, 2010

Seebs said:
If you need a branch for this, you've done it wrong.

Perhaps... I was starting to wonder if there is a better way than using a
lookup table.

Today is a new today with a fresh look at things

Now that the basic algorithm is done it's easier to see where problems might
be.

There are two or three different problems which are slightly different from
each other.

1. The first problem is calculating a mask to represents all one's for the
lowest bits up to bitcount.

So if bitcount is 5, bits 0 to 4 most be set.

Bit count can go up to 32 for a longword.
Bit count could even go up to 64 for a int64. But this case can be ignored
for now.

So the limit of bit count 32 is good enough for now.

Since shr 32 and shl 32 is not possible a solution is needed. A fast one too
for that matter.

Noone has yet provided one in there examples, perhaps because the examples
where limited to 16 bit words.

Anyway since there are only 5 bits available for shl and shr there is a
logical problem for intel:

"Do we support range 0 to 31 or do we support range 1 to 32 ?"

Intel choose to support range 0 to 31.

Logically/interestingly shr 0 and shl 0 do absolutely nothing !

Yet I have heard nobody complain about this apperently lack of
functionality, which is just as useless as doing shr 32 and shl 32...
The last could have even been more usefull but ok. Depends on the situation
perhaps.

The conclusion therefore can be/most be that shr 0 and shl 0 are usefull
after all !

If "we" can wrap around the bitcount from 32 back to 0 than at least we will
prevent "garbage".

Now the question is, is it usefull to wrap around ? This will probably
depend on the code and the situation.

So let's examine it:

The technique used so far to calculate a mask is to set the mask initially
to all one's as follows:

1111111-32-one's-11111111

Depending on the bit count this masks needs to be shifted left. If bitcount
is 1 the mask should have all one's except one zero as follows:

1111111-31 one's-11111110

The zero will represent the content bit later on. It's clear to me that with
the current masking formula only 31 bits can be shifted out.

Example: (1 shl 31)-1 = 0111 1111 1111 1111 1111 1111 1111 1111
(I just found a nice button on the windows calculator it's called Lsh (Left
shift I guess) it can be used to do these calculations... nice)
The problem is clear: the last bit is missing.

I was thinking, maybe a "bitcount mod 32" might help so let's see what
happens:

32 mod 32 would become zero.

The formula will be:

not ( -1 shl 0 );

-1 shl 0 = -1

-1 = all bits set.

not (-1) = must therefore be logically all bits cleared.

This is not what we want... we want all bits set for a bitcount of 32.

Therefore there is clearly a problem.

So we need a different solution.

The algorithm exits on bitcount 0 so the mask of all one's is therefore
useless.

Perhaps the mask can be changed as follows:

not ( -2 shl ? );

This would assume bitcount is always 1 or greater.... therefore we can
subtract 1 from the bitcount.

the formula would then be:

not ( -2 shl (BitCount-1) )

Now let's see if this works for BitCount 1, 5 and 32 as verifications.

Case 1: not ( -2 shl (1-1) ) = not ( - 2 shl 0 ) = not(-2)

Logically all bits are set except the last one. (Assuming two complements
computers, have a nice time figuring out how to support it in "portable C"

)
Therefore it gets changed into: etc0000000000001

Which is indeed what we want.

Now let's continue with the 32.

Case 32: not( -2 shl (32-1) ) = not ( -2 shl 31 )

Logically all bits will be pushed out except the first one...

So I would expect it to be all zero's... notting all zero's gives a mask of
all one's.

Which is what we want.

For now the problem with "masking" seems to be solved.

However the problem for shifting "content" still needs to be looked. I am
not yet sure if it can be solved.

Bye,
Skybuck.

James Dow Allen · May 28, 2010

I was thinking, maybe a "bitcount mod 32" might help ...

From the (useless?) factoids department:

When Intel x86 shifts a 32-bit register with a shift-count
coming from reg, all but the low-order 5 bits are ignored;
i.e. shift count is mod 32.

*However* the Motorola 68xxx uses 6 bits; i.e. shift
count is mod 64. A large count will set the 32-bit
reg to all 0's (or all 1's).

I wrote a Huffman decoder once. I didn't want it to
be just another FAST decoder; I wanted it to be the
fastest decoder possible! Config tested shifts and
took a short-cut when possible.

I am *not* here to defend anal-retentive coding.
("Hello, my name is James and I indulge in silly
pointless micro-optimizations.") But surely everyone
will admit that micro-optimizations are a cheaper
and safer *recreation* than many alternatives.

James

Skybuck Flying · May 28, 2010

Ok,

I no longer believe the lookup table is best...

Correct formula's / calculations can probably solve the problem.

As long as the algorithm exists on BitCount = 0.

Good testing still has to be done but it seems ok.

The time went down from 1.32 seconds or so to 1.12 seconds or so...

So that's at least 15% more speed or so, that's nice.

Bye,
Skybuck.

Skybuck Flying · May 28, 2010

However the problem for shifting "content" still needs to be looked. I am

not yet sure if it can be solved.

There is now one remaining problem with the algorithm:

mSomething := Content shr (BitCount - Shift);

The BitCount can again range from 0 to 32.

The Shift can range from 0 to 31.

Thus BitCount 32 - 0 is 32

So shr 32 is a problem.

mSomething should become zero when shr 32 is done.

Shr 0 will leave the content intact which would be wrong.

Any solutions ?

For now I only see a branch as a solution.

Bye,
Skybuck.

Skybuck Flying · May 28, 2010

Skybuck Flying said:
There is now one remaining problem with the algorithm:

mSomething := Content shr (BitCount - Shift);

The BitCount can again range from 0 to 32.

The Shift can range from 0 to 31.

Thus BitCount 32 - 0 is 32

So shr 32 is a problem.

mSomething should become zero when shr 32 is done.

Shr 0 will leave the content intact which would be wrong.

Any solutions ?

For now I only see a branch as a solution.

It's kinda tricky too... to try and get a branch working.

Just checking if BitCount is 32 would not be enough....

Since BitShift might be 5 and then it needs to shift and so forth.

So it actually needs to branches probably something like:

if (mBitCount=32) and (mShift = 0) then
begin
mSomething := 0;
end;

Bye,
Skybuck.

Skybuck Flying · May 28, 2010

Skybuck Flying said:
It's kinda tricky too... to try and get a branch working.

Just checking if BitCount is 32 would not be enough....

Since BitShift might be 5 and then it needs to shift and so forth.

So it actually needs to branches probably something like:

if (mBitCount=32) and (mShift = 0) then
begin
mSomething := 0;
end;

A better way could be to store the result in a variable first:

vShiftRight := (BitCount - Shift);

if vShiftRight = 32 then
begin
mSomething := 0;
end else
begin
mSomething := Content shr vShiftRight;
end;

At least this brings the branches back to 1.

Bye,
Skybuck.

Robert Myers · May 28, 2010

James Dow Allen wrote:

It is never pointless: I learn something every time.

Absolutely.

I've been helping a guy who crossposted an optimization problem over a
bunch of programming groups (I read it in comp.lang.asm.x86).

His original program took 26 minutes (albeit a number of years ago), he
has a third-party written version which runs in ~5 ms, but sometimes
fails to work (i.e. doesn't find all solutions).

My current code runs in 0.75 ms, and he's still complaining because I
wrote it in C(++) instead of his preferred Pascal.

Just saw another orienteering competition in progress, Terje. Very
safe, except possibly for a sprain or an abrasion from a slip and fall
on the rocky terrain. I once managed to get my foot infected from
something in my shoe, even though I wasn't orienteering.

Most important, if someone is overly-clever or overreaching, the
consequences for anyone else will be strictly limited. The national
guard and police helicopters are unlikely to be engaged in even the
most dire conceivable mishap.

I'm not sure you can say the same for compulsively-optimizing hotshot
programmers. Maybe you could get more of them interested in
orienteering, or some similar mentally and physically challenging
activity that will leave less time for obscure cleverness.

As to learning something from almost anything that requires thought,
who could disagree? ;-)

Robert.

Nathan · May 28, 2010

I think at this point I'm plonking you. You're extremely verbose without
saying much of anything, you overuse exclamation marks, you're overly
exciteable, and apparently you don't have any clue how to frame a problem,
think about what you need, and develop a plan for approaching it. Maybe
you should consider decaf.

Hmm... "overly excitable" might be indicative of bipolar-ism. The
rest might be attributable to a "need for attention" which overpowers
his desire to learn.

One item in his favor is a persistent willingness to play with code.

Nathan.

Robert Myers · May 28, 2010

One item in his favor is a persistent willingness to play with code.

Forgive my idle chatter.

Playing with any kind of puzzle or challenging activity can always be
productive.

One of the things that's attractive about applied mathematics as I
have experienced it is that, if something goes well, "as if by magic,"
it's usually an indication that you have done something right, and
perhaps very right. There are exceptions, but the fact that something
"works out" is almost always a trail of bread crumbs worth following.

That has not been my experience with programming, either my own, or
the programming of others. Too many things in programming will work
out "as if by magic" under some circumstances and not others. That's
one of the roots of evil in the mess that is called computer science.

Robert.

Robert Myers · May 28, 2010

Not quite true:

Several of my most cherished programs/functions/algorithms, those I
consider really elegant, have exactly the ssame feeling of rightness
about them: All the parts fit together, all the corner cases simply go
away, and the performance is close to optimal as well.

I suspect that the key difference is the depth of understanding of the
programmer.

The context of the discussion was playing with code.

If you have some notion of elegance and can accurately identify it
when you see it, then I would agree with you.

Anyone can play with code, whether they can recognize correctness,
never mind elegance, or not.

Robert.

Skybuck Flying · May 29, 2010

Anyway,

I have now completely illiminated the need for a mask... at least in the
write routine... but other routines might/will probably still require it.

The formula was:

Mask := not ( -2 shl (BitCount-1) )

As far as I can see it probably uses 3 instructions, which is still quite a
lot.

Since a subtraction must now occur anyway the following formula will
probably be faster, unless shifting many bits takes longer... but I would
not expect that on modern hardware, the AMD optimization manual says "note
3, clock count ???" What is that ?

Anyway here is a potentially faster formula which works from the opposite
side:

Mask := -1 shr (32-BitCount);

From the looks of it, just 2 instructions.

Yeah it's a little bit faster

Bye,
Skybuck.

Rudy Velthuis · Jun 29, 2010

Seebs said:
Tell you what. Try posting this coherently with a reasonable amount
of punctuation, and I'll totally point out that the word "undefined"
is completely unambiguous, since apparently this is not obvious
enough.

-s

Fortunately, I can't se SkyBuck's posts anymore, but I see he doesn't
even understand what happens after a shl 32? LOL!

In his case, any result is garbage, regardless the processor used. <g>

--
Rudy Velthuis http://rvelthuis.de

"Nothing is wrong with California that a rise in the ocean level
wouldn't cure."
-- Ross MacDonald (1915-1983)

Rudy Velthuis · Jun 29, 2010

Seebs said:
I think at this point I'm plonking you. You're extremely verbose
without saying much of anything, you overuse exclamation marks,
you're overly exciteable, and apparently you don't have any clue how
to frame a problem, think about what you need, and develop a plan for
approaching it. Maybe you should consider decaf.

He's been like that as long as I know him. He has lots of mental
problems, and some are that he doesn't read nor listen to advice, has a
lot of fantastic ideas that usually appear to be complete rubbish
(which he woulöd know if he had read up on the subject). He also thinks
he is God's gift to humanity and that he actually has an immensely
sharp brain, and can get pretty rude if people don't agree. <g>

FWIW, I can't read his posts, so I don't exactly know what he posted.
Apparently he is clueless about shifts. <g>

--
Rudy Velthuis http://rvelthuis.de

"A time will come when a politician who has willfully made war
and promoted international dissension will be ... surer of the
noose than a private homicide."
-- H. G. Wells

extension_pack	0	Jan 6, 2006
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	15	Apr 1, 2006
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	1	Feb 1, 2004

Lolling at programmers, how many ways are there to create a bitmask ? ;) :)

Skybuck Flying

Seebs

James Harris

James Harris

Seebs

Seebs

Robert Myers

Skybuck Flying

James Dow Allen

Skybuck Flying

Skybuck Flying

Skybuck Flying

Skybuck Flying

Robert Myers

Nathan

Robert Myers

Robert Myers

Skybuck Flying

Rudy Velthuis

Rudy Velthuis

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads