Is % slow?

M

Michael B Allen

I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

Thanks,
Mike
 
E

Emmanuel Delahaye

Michael B Allen a écrit :
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

Use ldiv(), it is optimized for your architecture and you have the
result of the division and the remain for the same price.
 
S

Skarmander

Michael said:
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?
What operation *cannot* be slow on some architectures? I vividly recall
expecting that
a = a * 10 + b - 48;
would be slower than
a = a << 8 | b;
And maybe I'm right for most platforms. But on the Pentium 4, the first
is in fact faster:
leal (%ebx,%ebx,4), %eax # eax = ebx + ebx * 4
leal -48(%ecx,%eax,2), %ebx # ebx = ecx + eax * 2 - 48
beats
sall $8, %ebx # ebx = ebx << 8
orl %eax, %ebx # ebx = ebx | eax
since Intel processors have some funky instructions and the Pentium 4
sucks at shifting. Moral: don't assume too quickly that you know what's
fast and what isn't.

(No, the expressions don't compute the same value. Different algorithms
for the same purpose are being compared.)
My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.
Could or is? Profile. This will also immediately tell you which
alternative is faster. Armchair optimization generally doesn't work; it
falls under "too clever by half". Better algorithms, yes; optimizing
individual expressions, no.

In particular, many processors have a division instruction that
simultaneously generates the remainder. If you write / and %, the
compiler could recognize this and benefit. It's unlikely it will do so
when faced with a subtraction and multiplication.

In general, say what you mean. You mean modulo, use modulo. Should a
particular combination of compiler and platform prove unable to generate
optimal code *and* you need the speed, then optimize. Until then, leave
it alone.

S.
 
J

Jack Klein

I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

The C language itself does not address speed or efficiency of any
operation, only what the result should be if it is used in a strictly
conforming way.

The simplest way to determine the efficiency of a particular operation
on your particular platform under your conditions is to code up some
timing tests.
My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

You could consider the div() or ldiv() functions, depending on the
type of timeout.tv, which return both the quotient and the remainder
in a single structure.
struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

On the other hand, if this were going to actually be called every
millisecond, I would never code it this way. I would maintain my own
count of nanoseconds and adjust it each time with an addition.
 
S

Scott \Poncho\ Fluhrer

Michael B Allen said:
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

In my experience, worrying about such microoptimizations is generally a
mistake. You should write code that is easy to understand and maintain, and
worry about optimizing it only if profiling showed that the code is too
slow.
 
W

Walter Bright

Michael B Allen said:
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

Let's have some fun and compile the two snippets of code with the Digital
Mars C compiler, and see what the resulting assembler is (compile with -o
and disassemble the result with obj2asm). The first observation is that a
DIV instruction will commonly produce both the dividend and the remainder,
and a halfway decent compiler should keep track of that for you. The second
is that modern compilers do a pretty good job of low level optimization, and
by trying to wrest this away from the compiler to do it yourself will often
defeat it and make things worse. The third is that obj2asm or equivalent is
indispensible to doing tuning. And lastly, profiling the result is necessary
to see if one is gaining or losing ground.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers
www.digitalmars.com/techtips/timing_code.html how to measure code
performance
------------------------------------------------
struct timespec
{
int tv_sec;
int tv_nsec;
};

struct timespec timeout;

void foo(int millis)
{
timeout.tv_sec = millis / 1000;
timeout.tv_nsec = (millis % 1000) * 1000000;
mov EAX,4[ESP]
mov ECX,03E8h
cdq
idiv ECX
mov ?timeout@@3Utimespec@@A,EAX
imul EDX,EDX,0F4240h
mov ?timeout@@3Utimespec@@A[04h],EDX
ret
}

void bar(int millis)
{
timeout.tv_sec = millis / 1000;
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;
push EBX
mov EAX,8[ESP]
mov ECX,03E8h
push ESI
cdq
idiv ECX
mov EDX,0Ch[ESP]
imul EBX,EAX,03E8h
sub EDX,EBX
mov ?timeout@@3Utimespec@@A,EAX
imul ESI,EDX,0F4240h
mov ?timeout@@3Utimespec@@A[04h],ESI
pop ESI
pop EBX
ret
}
--------------------------------------------------------------
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,170
Messages
2,570,921
Members
47,464
Latest member
Bobbylenly

Latest Threads

Top