Is % slow?

Michael B Allen · Oct 16, 2005

I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

Thanks,
Mike

Emmanuel Delahaye · Oct 16, 2005

Michael B Allen a écrit :

I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

Use ldiv(), it is optimized for your architecture and you have the
result of the division and the remain for the same price.

Skarmander · Oct 16, 2005

Michael said:
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

What operation *cannot* be slow on some architectures? I vividly recall
expecting that
a = a * 10 + b - 48;
would be slower than
a = a << 8 | b;
And maybe I'm right for most platforms. But on the Pentium 4, the first
is in fact faster:
leal (%ebx,%ebx,4), %eax # eax = ebx + ebx * 4
leal -48(%ecx,%eax,2), %ebx # ebx = ecx + eax * 2 - 48
beats
sall $8, %ebx # ebx = ebx << 8
orl %eax, %ebx # ebx = ebx | eax
since Intel processors have some funky instructions and the Pentium 4
sucks at shifting. Moral: don't assume too quickly that you know what's
fast and what isn't.

(No, the expressions don't compute the same value. Different algorithms
for the same purpose are being compared.)

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

Could or is? Profile. This will also immediately tell you which
alternative is faster. Armchair optimization generally doesn't work; it
falls under "too clever by half". Better algorithms, yes; optimizing
individual expressions, no.

In particular, many processors have a division instruction that
simultaneously generates the remainder. If you write / and %, the
compiler could recognize this and benefit. It's unlikely it will do so
when faced with a subtraction and multiplication.

In general, say what you mean. You mean modulo, use modulo. Should a
particular combination of compiler and platform prove unable to generate
optimal code *and* you need the speed, then optimize. Until then, leave
it alone.

S.

Jack Klein · Oct 16, 2005

I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

The C language itself does not address speed or efficiency of any
operation, only what the result should be if it is used in a strictly
conforming way.

The simplest way to determine the efficiency of a particular operation
on your particular platform under your conditions is to code up some
timing tests.

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

You could consider the div() or ldiv() functions, depending on the
type of timeout.tv, which return both the quotient and the remainder
in a single structure.

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

On the other hand, if this were going to actually be called every
millisecond, I would never code it this way. I would maintain my own
count of nanoseconds and adjust it each time with an addition.

Scott \Poncho\ Fluhrer · Oct 17, 2005

Michael B Allen said:
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

My specific problem is how to convert from milliseconds to seconds and
nanoseconds (POSIX timespec value):

struct timespec timeout;

timeout.tv_sec = millis / 1000; /* seconds */
timeout.tv_nsec = (millis % 1000) * 1000000; /* nanoseconds */

v.s.

timeout.tv_sec = millis / 1000; /* seconds */
/* nanoseconds */
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;

This routine could be called quite frequently (e.g. 1000x/sec) so an
optimization might be worth it here.

In my experience, worrying about such microoptimizations is generally a
mistake. You should write code that is easy to understand and maintain, and
worry about optimizing it only if profiling showed that the code is too
slow.

Walter Bright · Oct 18, 2005

Michael B Allen said:
I think I recall the modulo operator can be slow on some architectures. Or
will even trivial compiler optimization factor it out?

Let's have some fun and compile the two snippets of code with the Digital
Mars C compiler, and see what the resulting assembler is (compile with -o
and disassemble the result with obj2asm). The first observation is that a
DIV instruction will commonly produce both the dividend and the remainder,
and a halfway decent compiler should keep track of that for you. The second
is that modern compilers do a pretty good job of low level optimization, and
by trying to wrest this away from the compiler to do it yourself will often
defeat it and make things worse. The third is that obj2asm or equivalent is
indispensible to doing tuning. And lastly, profiling the result is necessary
to see if one is gaining or losing ground.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers
www.digitalmars.com/techtips/timing_code.html how to measure code
performance
------------------------------------------------
struct timespec
{
int tv_sec;
int tv_nsec;
};

struct timespec timeout;

void foo(int millis)
{
timeout.tv_sec = millis / 1000;
timeout.tv_nsec = (millis % 1000) * 1000000;
mov EAX,4[ESP]
mov ECX,03E8h
cdq
idiv ECX
mov ?timeout@@3Utimespec@@A,EAX
imul EDX,EDX,0F4240h
mov ?timeout@@3Utimespec@@A[04h],EDX
ret
}

void bar(int millis)
{
timeout.tv_sec = millis / 1000;
timeout.tv_nsec = (millis - timeout.tv_sec * 1000) * 1000000;
push EBX
mov EAX,8[ESP]
mov ECX,03E8h
push ESI
cdq
idiv ECX
mov EDX,0Ch[ESP]
imul EBX,EAX,03E8h
sub EDX,EBX
mov ?timeout@@3Utimespec@@A,EAX
imul ESI,EDX,0F4240h
mov ?timeout@@3Utimespec@@A[04h],ESI
pop ESI
pop EBX
ret
}
--------------------------------------------------------------

file seek is slow	11	Mar 9, 2010
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	15	Apr 1, 2006
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	1	Feb 1, 2004
How bad is $'? (Was: "Get substring of line")	4	Jan 18, 2005
Point on Line Segment in 2D. Which code is faster ? Can you improve it ?	65	Sep 16, 2005

Is % slow?

Michael B Allen

Emmanuel Delahaye

Skarmander

Jack Klein

Scott \Poncho\ Fluhrer

Walter Bright

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads