move bits stream through an array

vib · May 28, 2005

Hi there,

union UC32 {
unsigned int L;
unsigned short S[2];
unsigned char C[4];
} UC32;

with the above structure, I am able to shift bits of C[0], into C[1],
and C[1] into C[2] so on and so forth till C[3] as they are "union"ed
with unsigned int L.

However, the problem comes when I need more than the primitive datatype
can handle, let's say as 10 bytes.

my attempt is:

unsigned char ar[10];
unsigned temp;

signed char i, j;

for (i = 9; i > 0; i --)
{

if ( ar[i-1] & 0x80)
{
ar = ar << 1 | 0x01;
}
else
ar = ar << 1;

}

ar[0] = ar[0] << 1;

i don't quite like this approach. All comments and suggests are
welcome.

Thanks
vib

Walter Roberson · May 28, 2005

vib said:
my attempt is:

unsigned char ar[10];
unsigned temp;

signed char i, j;

for (i = 9; i > 0; i --)
{

if ( ar[i-1] & 0x80)
{
ar = ar << 1 | 0x01;
}
else
ar = ar << 1;

}

ar[0] = ar[0] << 1;

Click to expand...

This can be shortened a bit: also, you are shifting the wrong way
and shifting in the wrong bit into the wrong place.
The loop you show is much closer to shifting in the other direction
(but still wrong for that.. for a left shift, you have to start your
loop at 0 and you have to look at ar[i+1] not at ar[i-1] .)

The loop can be:

for (i = sizeof(ar) - 1; i > 0; i-- )
ar = (ar >> 1) | ((ar[i-1] & 1) << (CHAR_BIT-1));
ar[0] >>= 1;

It can also be optimized a bit:

#define ARagTYPE long /* could be long long if your compiler supports that */
#define ARSIZE 10
#define ARLSIZE ((ARSIZE + sizeof ARagTYPE - 1) / sizeof ARagTYPE)
typedef union {
unsigned char ar_c[ARSIZE];
unsigned ARagTYPE ar_l[ARLSIZE];
} ar_u;

ar_u ar;

unsigned char i;

initialize_ar(&ar); /* initialize ar to something */

for (i = ARLSIZE - 1; i > 0; i-- )
ar_l = (ar_l >> 1) | ((ar_l[i-1] & 1) << (CHAR_BIT-1));
ar_l[0] >> 1;

Now the question you have to ask yourself is: when you shift as
a long, are you certain you are shifting the right bits into the
right place? Suppose the byte order is 3412 within a long...

Malcolm · May 28, 2005

vib said:
union UC32 {
unsigned int L;
unsigned short S[2];
unsigned char C[4];
} UC32;

with the above structure, I am able to shift bits of C[0], into C[1],
and C[1] into C[2] so on and so forth till C[3] as they are "union"ed
with unsigned int L.

Don't use unions for playing games with bit representations. It isn't
portable and may not work as you expect if the compiler uses a funny
strategy for allocating members. It is also strictly speaking undefined
behaviour to write data to one union member and read it from another.

Try this strategy (untested code)

typedef struct
{
unsigned char *data;
int pos;
int mask;
} BITSTREAM;

BITSTREAM *bitstream(void *ptr)
{
BITSTREAM *answer = malloc(sizeof(BITSTREAM));
if(answer)
{
answer->data = ptr;
answer->pos = 0;
answer->mask = 1;
}
return answer;
}

int getbit(BITSTREAM *bs)
{
int answer = (bs->data[bs->pos] & bs->mask) ? 1 : 0;
bs->mask <<= 1;
if(bs->mask == 0x0100)
{
bs->mask = 1;
bs->pos++;
}
return answer;
}

/*
this one can be optimised by accessing several bits at a time.
May not be worth while.
*/
unsigned long getbits(BISTREAM *bs, int N)
{
unsigned long answer = 0;
while(N--)
{
answer <<= 1;
answer |= getbit(bs);
}

return answer;
}

vib · May 29, 2005

Thanks Walter,

The new data is obtained and append at the ar[0], By shifting to the
left , and having a pattern matching operation at the specific
locations in the array, ar[], I am able to scan the incoming bits
stream easily. It is like watching and matching a pattern at a running
bits from right to left.

Tested your code, it is great. However as i am using it in embedded
system which resources are always a concern. I decide not to use OR
with (CHAR_BIT-1). Instead, I use if to determine the most signficant
bit of the byte on the right is nonzero or zero and act accordingly.

Also ur on second suggestion, it is great too. I was thinking along
that as well. However, having anything larger than byte has to deal
with the byte order of the datatype. It is too much a headache, as well
as for the performance sensitive embedded system. Hence, I chose to
stay by using the byte array.

Many thanks.

vib · May 29, 2005

Hi Walter,
On second thought, I think right-shifting makes more sense as it would
be easier when doing memcmp.

Thanks

vib · May 29, 2005

Hi Malcom,

What is brilliant approach for replacing union. To make it advances to
next byte, I 've make the following changes:

if(bs->mask == 0x0100)
{
bs->mask = 1;
bs->pos++;
bs->data++;
}

Thanks

Keith Thompson · May 29, 2005

vib said:
Tested your code, it is great. However as i am using it in embedded
system which resources are always a concern. I decide not to use OR
with (CHAR_BIT-1). Instead, I use if to determine the most signficant
bit of the byte on the right is nonzero or zero and act accordingly.

[...]

I haven't been following this discussion closely, but based on your
description, I'm skeptical that what you describe is going to be more
efficient.

vib · May 29, 2005

Are you refering to the statement, "I decide not to use OR with
(CHAR_BIT-1). "? Just think that shifting (CHAR_BIT -1) bits to the
left is less efficient comparing to an if instruction.

Keith Thompson · May 29, 2005

vib said:
Are you refering to the statement, "I decide not to use OR with
(CHAR_BIT-1). "? Just think that shifting (CHAR_BIT -1) bits to the
left is less efficient comparing to an if instruction.

That depends on the system. The C language has nothing to say about
the relative efficiency of various constructs. On some systems,
conditonal branches can mess up pipelining; a shift might be executed
in a single clock cycle. The only way to be sure is to measure.

Remember the Rules of Optimization:

Rule 1: Don't do it.
Rule 2: Don't do it yet (experts only).

The best approach is usually to code as clearly as possible and use
good algorithms (don't bother fine-tuning a bubblesort). Once you've
done that, profile your code and find out where the bottlenecks are.

vib · May 29, 2005

well noted. Thanks.

Walter Roberson · May 29, 2005

Are you refering to the statement, "I decide not to use OR with
(CHAR_BIT-1). "? Just think that shifting (CHAR_BIT -1) bits to the
left is less efficient comparing to an if instruction.

You can precalculate the value 1 << (CHAR_BIT-1)
and multiply (ar[i-1] & 1) by that precalulated value.

That would transform the code from a "shift by a known factor of
an unknown bit value" into a "multiply an unknown bit by a constant
value".

Your code needs CHAR_BIT if it is to be portable. You should not
be assuming 8 bits per unsigned char.

Another approach would be to | in (ar[i-1] & 1) ? 1 << (CHAR_BIT-1) : 0
The 1 << (CHAR_BIT-1) will be computed at compile time .
This trades the multiply (or shift) into a conditional computation.
On some systems a ?: construct with constants for both potential
results can be done -quite- efficiently, but on other systems it
requires a full conditional branch that messes up pipelining and
so on.

On systems (e.g., MIPS R10000 chips) that have the
"move conditional" instruction, the ?: approach would be most efficient.

On many systems, most efficient would be the shifting of the bit
by (CHAR_BIT-1) -- but there are some important CPUs that have no
"barrel shifter" that can do it in one step, and there are some
CPUs on which there is an effecient shift in one direction but not the
other, so the efficiency of shifts should not be taken for granted.

On systems without an efficient shift instruction, the version
using the integer multiply by a constant might be the most efficient,
but this is not certain either... some systems do integer multiplies
quite efficiently, others only do floating point multiplies efficiently.

On a fair number of CPUs, if/then/else is the least efficient choice.
But even that is quite variable, even within the same machine:
if the instructions are all in primary cache the if/then/else
might be pretty efficient, but if the instructions are not in
primary cache then the if/then/else might come out as the slowest...
Still, it's a good rule of thumb that a conditional
branch ( ?: or if/then/else ) will -likely- be slower than
using short code that can compute the value without using a branch.

Keith Thompson · May 29, 2005

vib said:
well noted. Thanks.

What is? Please provide some context when posting a followup.

If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.

vib · May 30, 2005

Thanks Walter. Your explanations are indeed helpful.

vib · May 30, 2005

If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.

thanks, this really solves my problems.

Keith Thompson · May 30, 2005

vib said:
thanks, this really solves my problems.

Glad to hear it. One more thing: please don't snip attributions (such
as the
line above). I think Google adds attribution lines if you use the
proper "Reply" button.

Adding adressing of IPv6 to program	1	Feb 16, 2023
Fibonacci	0	May 13, 2023
Please help with C programming to save GPS reception data in Raspberry Pi.	0	Dec 8, 2022
Comparison of Integer and Pointer (that's supposed to be an Integer). Where did I go wrong?	0	Nov 19, 2022
Array of structs function pointer	10	Jul 16, 2023
herding the bits of a float into an integer array	35	Dec 5, 2009
Shifting bits	31	Nov 29, 2011
Survey details won't go through using php, ajax, Mysql	2	Oct 26, 2023

move bits stream through an array

vib

Walter Roberson

Malcolm

vib

vib

vib

Keith Thompson

vib

Keith Thompson

vib

Walter Roberson

Keith Thompson

vib

vib

Keith Thompson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads