fixed-point

pout

What are the purposes of fixed-point? When should it be used?

I read:
#define Int2Fixed(x) (((long)(short)x) << 16)

and that the fixed-point value is in 16.16 format. Does the 16 in the macro
refer to the integer part or the fractional part? For example, in 8.24,
should the macro be:

#define Int2Fixed(x) (((long)(short)x) << 24)?

Another question is about the casting here. What actually happens in a
cast like (long)(short)x? Could someone elaborate on this?

Grateful for your help!
 
Malcolm

pout said:
What are the purposes of fixed-point? When should it be used?
You use fixed point when you know that all your floating point variables are
within a certain range, and when integer arithmetic is substantially faster
than floating point arithmetic and speed is important.
A typical use is for model vertices in 3d graphics. All values are likely to
be within two orders of magnitude, and speed is very important.
I read:
#define Int2Fixed(x) (((long)(short)x) << 16)

and the fixed-point in 16.16 format. Does the 16 in the MACRO refer to
integer or decimal part? For example, if in 8.24, should the
macro be:

#define Int2Fixed(x) (((long)(short)x) << 24)?
You've got it right: the shift count is the number of fraction bits. Normally
a long is 32 bits, so in 16.16 the integer part is also 16 bits, and in 8.24
it is 8 bits. The point is a binary point, BTW, not a decimal one.
Another question is about the casting here. What is actually
happening when doing casting like : (long)(short)x? Could someone
elaborate this?
I don't know exactly what he hopes to achieve. Casting to a short will
almost certainly make the integer exactly 16 bits, and the result will then
be expanded to a long. However I don't see why simply casting to a long
isn't OK.
 
pout

Malcolm said:
You use fixed point when you know that all your floating point variables are
within a certain range, and when integer arithmetic is substantially faster
than floating point arithmetic and speed is important.
A typical use is for model vertices in 3d graphics. All values are likely to
be within two orders of magnitude, and speed is very important.
When you say "a certain range", do you mean the range that can be
represented by the fixed-point format? For example, is the range of 16.16
0000.0000 ~ FFFF.FFFF?
You've got it right. Normally a long is 32 bits, so the integer part would
also be 16 bits. The point is a binary point, BTW, not a decimal one.

Thanks for your help!
Shouldn't the integer part be 8 bits instead of 16 in:
#define Int2Fixed(x) (((long)(short)x) << 24)?
What exactly is accomplished by left-shifting 24 bits?
How do I apply "Int2Fixed" and "Fixed2Int" in a real calculation? For example,

int a, b;

How do I do addition, subtraction, multiplication and division with a and b
using the "fixed-point" type?

My guess is that a * b would be done like this:

int c = Fixed2Int(Int2Fixed(a) * Int2Fixed(b));

and in case of a/b,

int d = Fixed2Int(Int2Fixed(a) / Int2Fixed(b));

But I don't see why it is necessary to convert back and forth between
"fixed-point" and "int".
Does it really speed things up?
 
Malcolm

pout said:
When you say "a certain range", do you mean the range that can be
represented by the format of fixed-point, like 16.16, its range is
0000.0000 ~ FFFF.FFFF?
That's right. Fixed point can be signed or unsigned. If all your values are
in the range +- 30000 and you don't need greater precision than 0.0001 (1 in
10,000) then you can consider fixed point with 16:16 bits.
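To make the range and precision remark concrete, here is a minimal sketch of converting between double and signed 16.16. The names fixed_t, FIXED_ONE, double_to_fixed and fixed_to_double are made up for illustration and do not come from the code being discussed.

```c
#include <stdint.h>

typedef int32_t fixed_t;            /* hypothetical name for a signed 16.16 value */

#define FRAC_BITS 16
#define FIXED_ONE 65536.0           /* 2^16: the fixed point encoding of 1.0 */

/* Convert a double to 16.16, truncating toward zero.
   Assumption: the caller keeps x within roughly +-32767. */
static fixed_t double_to_fixed(double x)
{
    return (fixed_t)(x * FIXED_ONE);
}

/* Convert 16.16 back to double. The smallest step is 1/65536,
   about 0.000015, which is finer than the 0.0001 mentioned above. */
static double fixed_to_double(fixed_t f)
{
    return (double)f / FIXED_ONE;
}
```

With these, 1.5 is stored as 0x00018000: the integer part 1 sits in the upper 16 bits and the fraction 0.5 in the lower 16.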
What exactly is accomplished by left-shifting 24 bits?
If we have the integer value 15 (0x0F), shifting left by 24 bits gives
0x0F000000 in the 8.24 fixed point representation. (In 16.16 it would be
0x000F0000.)
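The effect of the shift can be checked with a small sketch. It uses unsigned fixed-width types so the shifts are well defined; the function names are hypothetical.

```c
#include <stdint.h>

/* 16.16: 16 integer bits above 16 fraction bits. */
static uint32_t int_to_fixed_16_16(uint16_t whole)
{
    return (uint32_t)whole << 16;
}

/* 8.24: 8 integer bits above 24 fraction bits. */
static uint32_t int_to_fixed_8_24(uint8_t whole)
{
    return (uint32_t)whole << 24;
}
```

In both formats the shift count equals the number of fraction bits, and the parameter type is only as wide as the integer part.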
How to apply "Int2Fixed" and "Fixed2Int" in real calculation? For example,

int a, b;

How to do addition, subtraction, multiplication and division with a
and b that involves "fixed-point" type?
To add or subtract two fixed point numbers, simply add or subtract them.
Multiplication is the tricky part. Both operands are scaled up by 16 bits, so
the product is scaled up by 32 bits and you have to correct it:

(a * b) >> 16;

(or as you did it, calling the Fixed2Int() macro).
The problem is that on many compilers you will get overflow, since the
product of two 32 bit numbers is a 64 bit number. This can be solved by a
smidgeon of assembly, or sometimes by a judicious cast to long long.
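Where a 64-bit integer type is available, the cast approach might look like the sketch below. fixed_t and fixed_mul are illustrative names, not part of the original code.

```c
#include <stdint.h>

typedef int32_t fixed_t;   /* hypothetical signed 16.16 type */

/* Multiply two 16.16 values. The product is formed in 64 bits, where it
   is scaled by 2^32, then shifted back down to the 2^16 scale.
   Note: >> on a negative value is implementation-defined in C; this
   sketch assumes the usual arithmetic shift. */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fixed_t)(wide >> 16);
}
```

For example, 1.5 (0x18000) times 2.0 (0x20000) comes out as 3.0 (0x30000). The result can still overflow if the true product does not fit in 16 integer bits.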
and in case of a/b,

int d = Fixed2Int(Int2Fixed(a) / Int2Fixed(b));
There are two answers here.
1) You want the result as an integer. Simply divide two fixed point numbers.
2) You want the result in fixed point format. You have to left shift the
numerator before you do the divide

x = (num << 16) / denom;

You run into the same problem that (num << 16) will very likely overflow.
Either you need to cast to long long or, on many compilers, again resort to
the inline assembler.
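A possible version of the fixed point divide using a 64-bit intermediate is sketched here. The names are hypothetical, and the numerator is multiplied by 65536 instead of shifted so that negative values stay well defined.

```c
#include <stdint.h>

typedef int32_t fixed_t;   /* hypothetical signed 16.16 type */

/* Divide two 16.16 values and return a 16.16 result.
   The numerator is pre-scaled by 2^16 in 64 bits so that the scale
   factor of the denominator cancels only one of the two factors.
   Assumption: denom is not zero. */
static fixed_t fixed_div(fixed_t num, fixed_t denom)
{
    int64_t wide = (int64_t)num * 65536;
    return (fixed_t)(wide / denom);
}
```

For example, 3.0 (0x30000) divided by 2.0 (0x20000) gives 1.5 (0x18000).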
But I don't see why this is necessary to convert back and fro
between "fixed-point" and "int"
Do it really speed things up?
If you just want to multiply two integers it is a waste of time converting
to fixed point, multiplying, and converting back. However, if you want to
multiply 1.4 by 1.3, then fixed point is likely to be substantially faster
than floating point on machines without fast floating point hardware. That
does not hold on modern machines, because their floating point units are
often so good that they are actually faster than integer multiplies.
 
Rob Thorpe

Malcolm said:
If you just want to multiply two integers it is a waste of time converting
to fixed point, multiplying, and converting back. However if you want to
multiply 1.4 by 1.3 then fixed point is likely to be substantially faster
than floating point, but not on modern machines because the floating point
units are often so good that they are actually faster than integer
multiplies.

That is certainly true of modern *desktop* hardware. But in other
environments the situation is different.
 
pout

Malcolm said:
That's right. Fixed point can be signed or unsigned. If all your values are
in the range +- 30000 and you don't need greater precision than 0.0001 (1 in
10,000) then you can consider fixed point with 16:16 bits.
If we have the integer value 15 (0x0F), shifting left by 24 bits gives
0x0F000000 in the 8.24 fixed point representation. (In 16.16 it would be
0x000F0000.)
To add or subtract two fixed point numbers, simply add or subtract them.
Multiplication is the tricky part. All values are scaled up by 16 bits, so
you have to correct

(a * b) >> 16;

(or as you did it, calling the Fixed2Int() macro).
The problem is that on many compilers you will get overflow, since the
product of two 32 bit numbers is a 64 bit number. This can be solved by a
smidgeon of assembly, or sometimes by a judicious cast to long long.
There are two answers here.
1) You want the result as an integer. Simply divide two fixed point numbers.
2) You want the result in fixed point format. You have to left shift the
numerator before you do the divide

x = (num << 16) / denom;

This is much clearer to me now.

According to what you said, whenever two numbers are divided and the result
should be in fixed point, I change either the numerator or the denominator
to fixed point format. Right?

After I have the result in fixed point, will changing it back to int make
the fixed point number lose its precision? Or is converting to fixed point
and back to int the same as a division of two ints?

Thanks!
 
Malcolm

pout said:
According to what you said, whenever two numbers in division, to get a
result of fixed point, change either numerator or denominator to fixed
point format. Right?
Fixed point is an integer multiplied by a constant.

for an integer result
(a * 65536) / (b * 65536) = a / b

for a fixed point result
((a * 65536) * 65536) / (b * 65536) = (a/b) * 65536
After having the result in fixed point, will changing the result back to
int make the number of fixed point lose its precision?
If you convert to integer you throw away the fraction part of your fixed
point value.
Or make the conversion to fixed point and back to int the same
as a division of two ints?
If you convert an integer to fixed point and back to integer you will get
your original integer back, except that you will lose the 16 most
significant bits. So the integer has to fit in 16 bits to begin with.
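Both effects can be seen in a short sketch. It uses unsigned 32-bit values so that the bits falling off the top are well defined; int_to_fixed and fixed_to_int are illustrative stand-ins for the Int2Fixed and Fixed2Int macros, not their actual definitions.

```c
#include <stdint.h>

/* Scale an integer up to 16.16. Any bits of n above the low 16 are
   shifted out of the 32-bit result and lost. */
static uint32_t int_to_fixed(uint32_t n)
{
    return n << 16;
}

/* Drop the 16 fraction bits, keeping only the integer part. */
static uint32_t fixed_to_int(uint32_t f)
{
    return f >> 16;
}
```

Converting 1.75 (0x0001C000) to integer gives 1, so the fraction is thrown away. A round trip of 1234 gives 1234 back, but a round trip of 70000 gives 4464, because 70000 needs 17 bits and the top one is lost.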
 
kal

What are the purposes of fixed-point? When should it be used?

First ask yourself "what is the purpose of a floating point number?"

A floating point number enables one to store a wider range of
values in the same storage space compared to integral numbers,
albeit with loss of accuracy.

The problem with using floating point numbers in computer
calculations is that, whatever your processor, a floating point
operation involves more steps than the same operation on integers.
This has less to do with the hardware than with the algorithm used,
although a good floating point unit can hide much of the cost.

e.g. Assuming decimal digits. To add 0.0100 E+2 and 0.0001 E+20, steps
similar to the following are executed in the processor.

1. Remove insignificant digits. Those are the zeroes next to the "."
The two numbers now become 0.1000 E+1 and 0.1000 E+17

2. Adjust the numbers so that both numbers have the same exponent
values, which should be that of the higher exponent value.
The two numbers now become 0.0000 E+17 and 0.1000 E+17

3. Add the mantissas. The sum is the mantissa of the result, and the
common exponent is the exponent of the result.

There is more to it, such as overflow handling. But I hope you get some
idea of the complexity involved.
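The alignment in step 2 is also why the smaller operand can vanish completely. The sketch below shows the same effect with C's binary float instead of the 4-digit decimal format used above (an assumption made so it can be compiled); the function name is made up, and the volatile qualifiers are only there to keep the compiler from evaluating the sum in a wider type.

```c
/* Returns 1 if adding the small value leaves the large value unchanged. */
static int small_term_is_absorbed(void)
{
    volatile float small = 1.0f;      /* 0.0100 E+2 in the example above  */
    volatile float large = 1.0e16f;   /* 0.0001 E+20 in the example above */
    volatile float sum = small + large;

    /* After alignment to the larger exponent, no digits of the small
       value remain within the mantissa, so the sum equals large. */
    return sum == large;
}
```

A float holds about 7 significant decimal digits, so a term 16 orders of magnitude smaller contributes nothing to the sum.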

A fixed point number is like the values of the items on your grocery
receipt. There is always a fixed number of positions after the decimal
point. That is to say, the exponent value is always the same.

One can of course store these values as floating point numbers and
operate on them. But things can be sped up a bit if one can
make use of the fact that the number of digits after the decimal point
is always the same.

e.g. Instead of considering dollars just store all values in cents.
So, 1.23 becomes 123, 56.75 becomes 5675 etc. Then all the operations
are on integers, which is much faster.
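The cents idea is decimal fixed point with a scale factor of 100. A minimal sketch, with made-up names and non-negative amounts assumed:

```c
#include <stdint.h>

typedef int64_t cents_t;   /* hypothetical type: an amount of money in cents */

/* Build an amount from whole dollars and cents (0..99). */
static cents_t make_cents(int64_t dollars, int64_t cents)
{
    return dollars * 100 + cents;
}

/* Adding two amounts is plain integer addition. */
static cents_t add_cents(cents_t a, cents_t b)
{
    return a + b;
}
```

Here 1.23 plus 56.75 is computed as 123 + 5675 = 5798, i.e. 57.98, with no rounding anywhere.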
I read:
#define Int2Fixed(x) (((long)(short)x) << 16)

Looks to be nice code. If the rest of the program is like this then
the program code might be worth studying.
and the fixed-point in 16.16 format. Does the 16 in the MACRO refer to
integer or decimal part?

The 16 in the macro refers to the multiplication factor. Note that
shifting left by 16 bits is the same as multiplying by 2**16.
For example, if in 8.24, should the macro be:
#define Int2Fixed(x) (((long)(short)x) << 24)?

Yes, almost. It should be (((long)(signed char)x) << 24), the reason being
that the integer part can hold only 8 bits, i.e. a byte (C has no byte
type, so a char type stands in for it). See below.

But note that the concept of fixed point exists only in the mind
of the programmer. The numbers are all integers. On input to the
program, the numbers are multiplied by a constant value (2**24)
chosen so that the result is integral for all input values.
Another question is about the casting here. What is actually happening when
doing casting like : (long)(short)x? Could someone elaborate this?

There are two type conversions: first x is converted to (short) and
then it is converted to (long). The question is why the conversion to
(short) first? Why not just ((long)x) ?

The reason is that programmers use macros like functions. That is, when
a programmer codes as follows:

y = Int2Fixed(x)

he is treating the macro like the following function:

long Int2Fixed (short x);

So, he is liable to pass a (long) or even a (float) or (double) value for x,
thinking that the function call will convert the value to the appropriate
type. As you know, this is a macro invocation and not a function call.

So, the (short) in the "(long)(short)x" makes the macro act like a
function call in so far as the parameter conversion is concerned.
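As a rough sketch of that point, here is the macro next to the function it imitates. Two things differ from the original and are my own additions: the argument is parenthesized, so that Int2Fixed(a + b) converts the whole sum instead of only a, and the scaling is written as a multiplication by 65536 so that negative values stay well defined. int_to_fixed_fn is a hypothetical name.

```c
/* Macro form: the (short) cast narrows the argument the same way a
   function parameter of type short would. */
#define Int2Fixed(x) ((long)(short)(x) * 65536L)

/* Function form: the compiler converts the argument to short at the call. */
static long int_to_fixed_fn(short x)
{
    return (long)x * 65536L;
}
```

Both forms accept Int2Fixed(3.7) style arguments and treat them as 3. Values that do not fit in a short are converted in an implementation-defined way in either form, so the caller still has to stay in range.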

THIS IS GOOD CODING!

Hope I haven't confused you more than warranted.
 
pout

kal said:
There are two type conversions: first x is converted to (short) and
then it is converted to (long). The question is why the conversion to
(short) first? Why not just ((long)x) ?

The reason is that programmers use macros like functions. That is, when
a programmer codes as follows:

y = Int2Fixed(x)

he is treating the macro like the following function:

long Int2Fixed (short x);

So, he is liable to pass a (long) or even (float) or(double) value for x
thinking that the function call will convert the values to the appropriate
type. As you know this is a macro invocation and not a function call.

So, the (short) in the "(long)(short)x" makes the macro act like a
function call in so far as the parameter conversion is concerned.

THIS IS GOOD CODING!

Hope I haven't confused you more than warranted.

Thanks, kal!

Your reply is very understandable to me and helpful. The only part which is
above me is your explanation of (long)(short)x; specifically, why does
"(short)x" imply that "a (long) or even (float) or(double) value for x" is
passed into the macro? Just because the conversion of x happening?
 
kal

Your reply is very understandable to me and helpful. The only part which is
above me is your explanation of (long)(short)x; specifically, why does
"(short)x" imply that "a (long) or even (float) or(double) value for x" is
passed into the macro? Just because the conversion of x happening?

N.B. The code here is for the purposes of illustration only. In particular
it does not handle the sign of numbers properly.

In this instance, the (short) typecasting is unnecessary. However, it is
a good coding style.

Let us consider the following function which takes two short values, one
representing the integral part and the other representing the fractional
part of a number, and combines them into a single long (fixed point).

long MakeFixed (short integral, short decimal)
{
    /* combine the two halves with bitwise OR */
    return (((long)integral) << 16) | ((long)decimal);
}

The following is a possible instance of a call to this function.

long scan_integral;
long scan_decimal;
long scan_fixed;

scan_fixed = MakeFixed(scan_integral, scan_decimal);

The actual parameters passed are of type long. But the compiler
generates code to convert them to type short before passing them
to the function. So, the function always gets short types. The
effect of this is that EVEN IF THE HIGHER 16 BITS OF THE PARAMETER
scan_decimal HAD NON ZERO VALUES THE FUNCTION CALL WILL SUCCEED.

Now, consider a macro definition to implement the same function.

#define MakeFixed(x,y) ((((long)(x)) << 16) | ((long)(y)))

And it being used as follows:

scan_fixed = MakeFixed(scan_integral, scan_decimal);

This will fail if the higher 16 bits of scan_decimal contain
non zero values, because those bits are ORed into the integral part.

However, the following macro definition rectifies this problem.

#define MakeFixed(x,y) ((((long)(x)) << 16) | ((long)(short)(y)))

You will be justified in asking as to why the programmer passed
a long when what was expected was short. A reply to such a question
will be long and tedious.

In general it is good coding practice to define macros so that they
cater to as many situations as possible. In particular, if the
expected parameter is of a certain type then first cast that parameter
to that type before using it in an expression.
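The difference between the two macro definitions can be tried out with the sketch below. To sidestep the sign problem mentioned in the N.B., it narrows with an unsigned 16-bit cast instead of (short), which is my substitution and not the original code; the macro names are hypothetical.

```c
#include <stdint.h>

/* Naive macro: y is widened but never narrowed, so stray high bits
   of y leak into the integral part. */
#define MAKE_FIXED_NAIVE(x, y) \
    ((((uint32_t)(x)) << 16) | ((uint32_t)(y)))

/* Narrow each argument to 16 bits first, as a function with 16-bit
   parameters would, and only then widen and combine. */
#define MAKE_FIXED(x, y) \
    ((((uint32_t)(uint16_t)(x)) << 16) | ((uint32_t)(uint16_t)(y)))
```

With an integral part of 3 and a fractional argument of 0x00051234, the naive macro yields 0x00071234 (the integral part is corrupted), while the narrowing macro yields 0x00031234.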

C++ notes:
----------

C++ implements strict type checking in function calls so problems
due to mismatched parameter types are avoided. However, this is
not available if one uses 'defines'.

One benefit of 'define' vis-a-vis 'function' is code compaction.
When the function body is small compared to code needed for
function call then 'defines' will produce smaller code and faster
execution.

It is to provide this functionality and at the same time keep the
strict type checking facility that C++ implements what are called
"inline" functions.
 
