An unusual use of case/switch

Ian Collins · Sep 23, 2013

glen said:
As I understand it, in the original Duff's case *to is an
I/O port, so writing to it is not pointless.

The probably means it should be volatile, such that the stores
don't get optimized away. That might stop some optimizations
that the compiler might otherwise be able to do.

But not loop unrolling.

Siri Cruise · Sep 23, 2013

Ian Collins said:
Why?

I thought the convoluted option would just make it harder for the
compiler to recognise opportunities for optimisations, particularly loop
unrolling. So I tried compiling both and sure enough, Sun's cc
generated the same code as gcc for the switch case, but it went on to
generate unrolled versions for the simple loop.

The most likely way to optimise the above array move is
memcpy(to, from, count*sizeof(*to))
or
memmove(to, from, count*sizeof(*to))
and let the compiler choose the best implementation.

The more usual loop would be something like
for (k=0; k<n; k++) a[k] = f(b[k], c[k]);

Depending on the underlying hardware that might be optimised to
r0 := b[0]; r2 := b[1];
r3 := b[2]; r4 := b[3];
r1 := c[0]; r3 := c[1];
r5 := c[2]; r7 := c[3];
while (k+:=4)<=n
r8 := f(r0,r1); r9 := f(r2,r3);
r10 := f(r4,r5); r11 := f(r6,r7);
r0 := b[k+0]; r2 := b[k+1];
r3 := b[k+2]; r4 := b[k+3];
r1 := c[k+0]; r3 := c[k+1];
r5 := c[k+2]; r7 := c[k+3];
a[k-4] := r8; a[k-3] := r9;
a[k-2] := r10; a[k-1] := r11;
if k-3<n, a[k-4] := f(r0, r1);
if k-2<n, a[k-3] := f(r2, r3);
if k-1<n, a[k-2] := f(r4, 5);
k := n;
Because loads and stores can be very expensive but overlap with other
operations, and it's faster to use the entire cache line at once. These kinds of
considerations are very machine specific and not obvious to people not familar
with assembly level code.

Ben Bacarisse · Sep 23, 2013

Ian Collins said:
I thought the convoluted option would just make it harder for the
compiler to recognise opportunities for optimisations, particularly
loop unrolling. So I tried compiling both and sure enough, Sun's cc
generated the same code as gcc for the switch case, but it went on to
generate unrolled versions for the simple loop.

The original point of the device was to express the effect of loop
unrolling directly in C, so it only made sense in an environment where
the compiler won't unroll the original, much simpler, loop.

I don't think you'd ever want a compiler to unroll the loop in Duff's
device. I can't see how the result could be better than the compiler's
unrolling of the original. If, for some reason, it isn't, you'd
probably get better results by adding more cases.

Keith Thompson · Sep 23, 2013

glen herrmannsfeldt said:
As I understand it, in the original Duff's case *to is an
I/O port, so writing to it is not pointless.

The probably means it should be volatile, such that the stores
don't get optimized away. That might stop some optimizations
that the compiler might otherwise be able to do.

If written in modern C, yes, it should be volatile. But the volatile
keyword didn't exist (as far as I know) when Duff invented his
device in 1983.

Joe Pfeiffer · Sep 23, 2013

One of those things I've forgotten, but how is:

more optimised than

JGH

If I understand the code (I had no idea you could embed a do-while
around the cases of a switch statement!), the unrolled version ends up
doing eight assignments for each iteration of the do-while, while the
unoptimized only does one. The result is that the unoptimized has to
evaluate eight times as many conditionals.

glen herrmannsfeldt · Sep 23, 2013

(snip, I wrote)

If written in modern C, yes, it should be volatile. But the volatile
keyword didn't exist (as far as I know) when Duff invented his
device in 1983.

Oh, yes. But also compilers didn't do as much of the optimizations
that would depend on that, especially for embedded systems.

For embedded systems with memory-mapped I/O, it is common to assign
an address (int) to a pointer and use it as an I/O port. Compilers
for those systems should know not to optimize them away.

-- glen

Keith Thompson · Sep 23, 2013

glen herrmannsfeldt said:
Oh, yes. But also compilers didn't do as much of the optimizations
that would depend on that, especially for embedded systems.

For embedded systems with memory-mapped I/O, it is common to assign
an address (int) to a pointer and use it as an I/O port. Compilers
for those systems should know not to optimize them away.

To be pedantic, I presume you mean assigning an integer value
*converted* to some pointer type to a pointer object, such as:

unsigned char *ptr = (unsigned char*)0xdeadbeef;

The phrase "an address (int)" implies that addresses are integers,
and that discussion has already been had.

glen herrmannsfeldt · Sep 23, 2013

(snip, someone wrote)

To be pedantic, I presume you mean assigning an integer value
*converted* to some pointer type to a pointer object, such as:

unsigned char *ptr = (unsigned char*)0xdeadbeef;

The phrase "an address (int)" implies that addresses are integers,
and that discussion has already been had.

I suppose, but in the case of embedded systems with memory-mapped
I/O, it is pretty close to true that addresses are int.

So far, I have not seen any machines with (float) or (double)
addresses. Maybe some with (short), though.

In most cases, embedded programmers are using C as an approximation
to an assembler. They know, almost exactly, the instructions that
will be compiled. Many processors have few enough registers that
you know exactly which register a value will be in.

Some allow one to switch, line by line, between C code and assembler
instructions. Any program that assigns I/O port addresses to pointer
variables is pretty much not portable.

-- glen

Keith Thompson · Sep 23, 2013

glen herrmannsfeldt said:
I suppose, but in the case of embedded systems with memory-mapped
I/O, it is pretty close to true that addresses are int.

So far, I have not seen any machines with (float) or (double)
addresses. Maybe some with (short), though.

In C terms, addresses are pointer values, not integers.

In lower-level terms (assembly or machine language), there is no
such thing as "int", unless the system happens to use that term
for something; "int" is a C-specific type name.

A machine address may very well have the same representation as a C
"int" (though not necessarily; for example, on many 64-bit systems
addresses are 64 bits, but int is only 32).

And of course an assembly language might permit things that look very
much like C integer constants to be used as addresses. But in C, if you
write:

volatile char *ptr = 0xdeadbeef;

you're going to get a diagnostic from the compiler -- unless the
compiler is running in some non-conforming mode.

In most cases, embedded programmers are using C as an approximation
to an assembler. They know, almost exactly, the instructions that
will be compiled. Many processors have few enough registers that
you know exactly which register a value will be in.

It depends. I work on embedded systems myself, but there's a full OS.
I understand it's different for smaller systems.

Some allow one to switch, line by line, between C code and assembler
instructions. Any program that assigns I/O port addresses to pointer
variables is pretty much not portable.

All true, but not particularly relevant to my point.

glen herrmannsfeldt · Sep 23, 2013

(snip)
(snip, then I wrote)

In C terms, addresses are pointer values, not integers.

In lower-level terms (assembly or machine language), there is no
such thing as "int", unless the system happens to use that term
for something; "int" is a C-specific type name.

A machine address may very well have the same representation as a C
"int" (though not necessarily; for example, on many 64-bit systems
addresses are 64 bits, but int is only 32).

Last embedded project I did was on a Rabbit 3000, which is pretty
much a descendant of the Z80, but with enough extra register that
you can do bank-select access to a 24 bit address space.

And of course an assembly language might permit things that look very
much like C integer constants to be used as addresses. But in C, if you
write:

volatile char *ptr = 0xdeadbeef;

you're going to get a diagnostic from the compiler -- unless the
compiler is running in some non-conforming mode.

I don't remember if the C compiler for Rabbit allows it, but I
wouldn't be surprised. Make it easier for the programmers!

It depends. I work on embedded systems myself, but there's a full OS.
I understand it's different for smaller systems.

All true, but not particularly relevant to my point.

-- glen

Keith Thompson · Sep 24, 2013

David Brown said:
It is true to say that addresses are effectively an integer type - but
not "int". Addresses (at least in all the embedded systems I have used,
which is quite a number) are universally "unsigned", and they are not
necessarily the same size as a plain "unsigned int". In particular,
there are systems with 16-bit "int" and 32-bit addresses, and systems
with 32-bit "int" and 16-bit addresses. Most 8-bit cpu's have 16-bit
addresses, and though they have 16-bit "int" in standard C mode, some
compilers support flags for non-standard modes with 8-bit "int" (which
is not allowed by the C standards, but allows the programmer to write
smaller and faster code).

If we're talking about C (meaning a language reasonably close to
the language defined by the ISO C standard), no, addresses are
not integers. They can *behave* like integers in many ways, but an
"address" is a value of pointer type. (Unless you're using the word
"address" in a way other than the way the C standard uses the word.)

You can add, multiply, divide two integers and get a meaningful
result. You can't do that with two addresses.

You can increment an integer object, and the result, even if you
look at the bits, will be one greater than the original value.
If you increment an address, it advances by the size of whatever
it points to.

It can be useful, especially in embedded systems I suppose,
to think of an address as something very similar to an unsigned
integer of the same size, but as far as C is concerned, integers
and addresses/pointers are quite distinct, sharing in common
their representation as a sequence of bits and a limited subset of
arithmetic operations (with different semantics).

[...]

Tim Rentsch · Sep 25, 2013

Ian Collins said:
*to++ = *from++;

This bag of tricks was what I was referring to above as the compiler
recognising and exploiting opportunities for optimisation. Compiler
writers have had decades to tune their code to optimise idiomatic
code.

I'll grant you that, but on the other hand Duff's device has been
around 30 years, so it too may be given idiomatic recognition in
some compilers.

If you have a test case, I'll run the numbers.

Any simple test case will do, as for example (please excuse
typing mistakes):

void
simple_loop( char *p, char *q, int n ){
while( n-- > 0 ) *p = *q++;
}

void
duff_loop( char *p, char *q, int n ){
int k = n/8;
switch( n%8 ) do {
*p = *q++; case 7:
*p = *q++; case 6:
*p = *q++; case 5:
*p = *q++; case 4:
*p = *q++; case 3:
*p = *q++; case 2:
*p = *q++; case 1:
*p = *q++; case 0: ;
} while( k-- > 0 );
}

/* and separately compiled */

extern void simple_loop(), duff_loop();
char a[1234567], b[1234567];

int
main( int argc, char *argv[] ){
void (*f)() = argc < 2 ? simple_loop : duff_loop;
int loop_count = 10000; /* or suitable other */
while( loop_count-- > 0 ) f( a, b, (int) sizeof b );
return 0;
}

and time two invocations (with appropriate number of arguments to
get the two different behaviors).

Ian Collins · Sep 25, 2013

Tim said:
If you have a test case, I'll run the numbers.

Click to expand...

Any simple test case will do, as for example (please excuse
typing mistakes):

void
simple_loop( char *p, char *q, int n ){
while( n-- > 0 ) *p = *q++;
}

void
duff_loop( char *p, char *q, int n ){
int k = n/8;
switch( n%8 ) do {
*p = *q++; case 7:
*p = *q++; case 6:
*p = *q++; case 5:
*p = *q++; case 4:
*p = *q++; case 3:
*p = *q++; case 2:
*p = *q++; case 1:
*p = *q++; case 0: ;
} while( k-- > 0 );
}

/* and separately compiled */

extern void simple_loop(), duff_loop();
char a[1234567], b[1234567];

int
main( int argc, char *argv[] ){
void (*f)() = argc < 2 ? simple_loop : duff_loop;
int loop_count = 10000; /* or suitable other */
while( loop_count-- > 0 ) f( a, b, (int) sizeof b );
return 0;
}

and time two invocations (with appropriate number of arguments to
get the two different behaviors).

I added an extra zero to the loop count.

The difference was dramatic:

c99 -fast a.c b.c
time ./a.out
real 0m4.237s

time ./a.out 2
real 0m53.272s

gcc -O3 a.c b.c
time ./a.out
real 1m8.108s

time ./a.out 2

real 0m34.600s

Phil Carmody · Sep 26, 2013

Keith Thompson said:
If we're talking about C

we're not.

did you miss the "in ..." clauses nearby the word "address" above?

(meaning a language reasonably close to
the language defined by the ISO C standard), no, addresses are
not integers. They can *behave* like integers in many ways, but an
"address" is a value of pointer type. (Unless you're using the word
"address" in a way other than the way the C standard uses the word.)

See above. Addresses are addresses. They are defined by the hardware,
your silly C language pretends to model them closeley, and often
suceeds. As bit-strings (not necessarily atomic), they are integers.

I'm a bit surprised that you have an embedded background given your
above "everything revolves around C" attitude.

You can add, multiply, divide two integers and get a meaningful
result. You can't do that with two addresses.

let's comapare an contrast:

char foo[10]={/*whatever*/};
int index = 7;
index *= index; // KT approved, it's an int!
bar(foo[index]); // no regrets

against:

const u32 FIFO1_LWM_INT = 0xC0084001; // address of hw register
const u32 FIFO1_HWM_INT = 0xC0084002; // address of hw register
u32 whocareswhat= FIFO1_LWM_INT*FIFO1_HWM_INT;

which is the more bogus code?

I know you'ved approved the former,, and think that the latter is inherently
evil, but I would ask you to reconsider.

Phil

Eric Sosman · Sep 26, 2013

we're not.

did you miss the "in ..." clauses nearby the word "address" above?

Did you miss the "Newsgroups" header above?

See above. Addresses are addresses. They are defined by the hardware,
your silly C language pretends to model them closeley, and often
suceeds. As bit-strings (not necessarily atomic), they are integers.

As bit-strings, complex long doubles are integers.

Phil Carmody · Sep 26, 2013

David Brown said:
Another example is "protothreads" at <http://dunkels.com/adam/pt/> .
This basically wraps your code in a switch statement with a hidden state
variable (implemented with some macros that are messy to define, but
easy to use).

Pffft. See Tatham's earlier coroutines article - a much more
thorough analysis and solution. (Which the above seems to be
mostly a reinvention of, despite the "invented by" claim.)

Phil

James Kuyper · Sep 26, 2013

we're not.

did you miss the "in ..." clauses nearby the word "address" above?

All uses of "int" or "unsigned int" in the above text refer to terms
from the C language, with meanings governed by the relevant clauses of
the C standard, and the same is also true of David's use of the phrase
"standard C". Those "in ..." clauses don't do anything to change those
facts.
Talking about addresses while comparing them to C data types such as
"int" is pretty much pointless unless you're using the C standard's
meaning for the term "address". It's fairly clear that David is NOT
using that meaning.

See above. Addresses are addresses. They are defined by the hardware,
your silly C language pretends to model them closeley, and often
suceeds. As bit-strings (not necessarily atomic), they are integers.

I'm a bit surprised that you have an embedded background given your
above "everything revolves around C" attitude.

David and Glen's comments about C data types such as "int" and "unsigned
int" are what makes Keith's argument "revolve around C". If they had
referred to "integer" rather than "int", Keith's response, if any, would
not have had any reason to even refer to the C standard. On the other
hand, if they had phrased their comments that way, their comments
wouldn't have been relevant to the conclusions they wanted to reach.

Keith Thompson · Sep 27, 2013

Phil Carmody said:
we're not.

Newsgroups: comp.lang.c

did you miss the "in ..." clauses nearby the word "address" above?

What does the word "int" mean if it doesn't refer to the C type of
that name?

See above. Addresses are addresses. They are defined by the hardware,
your silly C language pretends to model them closeley, and often
suceeds. As bit-strings (not necessarily atomic), they are integers.

*My* "silly C language"? It's no more mine that it is yours; we're both
posting to a newsgroup that discusses it.

Bit-strings can represent things other than integers: graphical images,
floating-point numbers, addresses, and so forth.

I'm a bit surprised that you have an embedded background given your
above "everything revolves around C" attitude.

When have I ever said that everything revolves around C? In this
newsgroup, I generally discuss C. In other contexts, I discuss other
things.

You can add, multiply, divide two integers and get a meaningful
result. You can't do that with two addresses.

Click to expand...

let's comapare an contrast:

char foo[10]={/*whatever*/};
int index = 7;
index *= index; // KT approved, it's an int!
bar(foo[index]); // no regrets

against:

const u32 FIFO1_LWM_INT = 0xC0084001; // address of hw register
const u32 FIFO1_HWM_INT = 0xC0084002; // address of hw register
u32 whocareswhat= FIFO1_LWM_INT*FIFO1_HWM_INT;

which is the more bogus code?

A strawman argument unworthy of response.

Phil Carmody · Sep 27, 2013

Eric Sosman said:
Did you miss the "Newsgroups" header above?

Headers aren't C. Like you I shall refuse to address anything
written that isn't about C, so shall ignore your reference to
headers.

If you are unable to distinguish discussion about the thing being
modelled by C, and how C models that thing, then this discussion
can go nowhere useful.

Phil

Phil Carmody · Sep 27, 2013

Keith Thompson said:
[SNIP]
A strawman argument unworthy of response.

Yes, I should have spotted that, and not responded.

Phil

Command Line Arguments	0	Mar 7, 2023
Swich case problem???	2	Feb 2, 2022
Why can't use fgetc() in SWITCH with CASE?	15	Nov 17, 2013
Exit the infinity while loop by pressing the button and continue with the switch element.	2	Apr 21, 2024
Using Do-While loop statement for a selection Menu	3	May 1, 2023
Help in this program.	2	May 14, 2022
switch case question	9	Oct 6, 2009
How can I view / open / render / display a pdf file with c code?	0	Sep 23, 2023

An unusual use of case/switch

Ian Collins

Siri Cruise

Ben Bacarisse

Keith Thompson

Joe Pfeiffer

glen herrmannsfeldt

Keith Thompson

glen herrmannsfeldt

Keith Thompson

glen herrmannsfeldt

Keith Thompson

Tim Rentsch

Ian Collins

Phil Carmody

Eric Sosman

Phil Carmody

James Kuyper

Keith Thompson

Phil Carmody

Phil Carmody

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads