Vladimir Jovic said:
Hello,
I have a piece of code that I am trying to optimize. It is nothing
special - just some calculations, including trigonometric functions (a bunch
of multiplications with sin and cos here and there). There is some duplication
in my code, but that is OK.
The questions are:
How complex are the sin and cos functions? Are they simple lookup tables, or
are they some super-complex calculations?
If translated to multiplication/addition, how many
multiplications/additions would one sin or cos involve?
I am well aware this is implementation dependent, but it is probably done
in a very similar fashion everywhere.
On x86, and probably other major architectures, these functions are
essentially built right into the processor, and are typically fairly fast.
Most of the 'slowness' one may experience with them (in some cases) is
actually due to things the compiler is doing, rather than the operations
themselves. For example, MSVC defaults to fairly naive/inefficient
implementations of the math functions, and its "optimization" is simply to
use less naive implementations (typically excluding some special-case
handling, usually for NaN, Inf, denormals, ...).
(One can also get a major speedup by writing one's own ASM functions which
simply send the values directly to the processor and return the result,
though possibly slightly less "safe" at edge cases.)
But, at least in a general-purpose floating-point sense, it is unlikely one
will be able to write anything much faster than the built-in functions.
If using fixed-point integers, though, and willing to settle for slightly
reduced accuracy, it is possible to replace them with a lookup table, which is
typically a little faster - but this is more because conversions between
floats and integers tend to be expensive.
This usually involves either using a modified angle scheme (256 degrees in a
circle, ...), or doing a fixed-point multiply to get the angle into the
correct base, and masking it off to get the array index.
Say, a 20.12 fixed-point scheme in radians (val):
i=(val*10430)>>8; //scale to a 256 degree system (0.8 fixed-multiply,
result still 20.12; 10430 ~= 65536/(2*pi))
i=(i>>8)&4095; //convert into a 12 bit table index
ret=foo_sintab[i];
However, one usually avoids fixed point unless it is really needed, since on
most modern computers it is generally slower (as well as less accurate) than
using floats, except in a few edge cases, such as processing integer input
data into integer output data, as would take place in a video or audio codec,
where calculations such as the DCT or MDCT are often done with fixed point
and lookup tables.
The main reason for its use in codecs is that otherwise a lot of time would
be spent on conversions between floating-point and integer values.
or such...