Byte to float conversion problem - PLEASE HELP

cpptutor2000

Could some Java guru please help ? I am trying to analyze some audio
data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
bit, little endian and signed. The resulting data is put in an array
of bytes. I am generating tones at 1000.00 Hz, with a tone generator.
However, when I convert the bytes to float values, I do not see the
periodic sinusoidal data, as expected, (sample output below)
18770.0
38724.0
16727.0
28006.0
16.0
1.0
2000.0
4000.0
2.0
24932.0
38688.0
0.0
0.0
0.0
0.0

I understand that with 16 bit resolution, I can get numbers in the
range -2^16 - 1 to 2^16 - 1.

I believe that I am not converting the data correctly. To achieve the
conversion, I am taking 4 bytes at a time, and converting them. That
is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

Any hints, suggestions would be greatly appreciated. Thanks in advance
for your help.
 
Mark Space

Could some Java guru please help ? I am trying to analyze some audio
data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
bit, little endian and signed. The resulting data is put in an array
I believe that I am not converting the data correctly. To achieve the
conversion, I am taking 4 bytes at a time, and converting them. That
is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

4 bytes = 32 bits. Why are you taking your data four bytes at a time if
each sample is two bytes?
 
Roedy Green

Any hints, suggestions would be greatly appreciated. Thanks in advance
for your help.

see endian.html

If the data are not IEEE, you will have to find out the format and do
some fancy bit fiddling.
 
Patricia Shanahan

Mark said:
4 bytes = 32 bits. Why are you taking your data four bytes at a time if
your data is two bytes?

Also make sure the conversion deals with the data being little-endian.

Rather than going straight to float, I suggest first turning the data
into shorts, and making sure that is working. It may be easier to check.

If the first wave of ideas do not solve the problem, try posting a
sample of the input data in hex.
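
For anyone wanting to produce such a hex sample, here is a minimal sketch. The class and method names (HexDump, toHex) are made up for illustration, and the byte values in main are arbitrary, not taken from the original recording.

```java
public class HexDump {
    // Formats up to `count` leading bytes of `data` as two-digit hex values
    // separated by single spaces.
    static String toHex(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(count, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so that negative bytes print as 80..ff
            // instead of being sign-extended to ffffff80..ffffffff.
            sb.append(String.format("%02x", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] example = { 0x10, 0x00, (byte) 0xF0, (byte) 0xFF };
        System.out.println(toHex(example, 16)); // prints "10 00 f0 ff"
    }
}
```

Dumping the first few dozen bytes of the recorded array this way makes it much easier to see where each 2-byte sample starts.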

Patricia
 
Logan Shaw

Could some Java guru please help ? I am trying to analyze some audio
data on a PC. I am recording the sound in PCM format at 2000 Hz, 16
bit, little endian and signed. The resulting data is put in an array
of bytes. I am generating tones at 1000.00 Hz, with a tone generator.

You *must* set your tone generator to a lower frequency! At
that frequency, even if your software is perfect, you're still
going to see garbage data!

Nyquist's sampling theorem says that when you sample at 2000 Hz,
the highest possible frequency you can represent at *all* (without
completely mangling it) is 1000 Hz. There must be at least two
samples per wavelength.

And that 1000 Hz is in an ideal world. A real-world A-to-D
converter has a low-pass filter that will filter out everything
above the Nyquist frequency (in this case 1000 Hz), and the slope
of that filter is usually sharp, but it is not infinite. That
means in practice the highest frequency that the A-to-D converter
will even see is something less than 1000 Hz.

I would try setting your frequency generator to something like
100 Hz, or set your sampling rate higher.

However, when I convert the bytes to float values, I do not see the
periodic sinusoidal data, as expected, (sample output below)
18770.0
38724.0
16727.0
28006.0
16.0
1.0
2000.0
4000.0
2.0
24932.0
38688.0
0.0
0.0
0.0
0.0

I understand that with 16 bit resolution, I can get numbers in the
range -2^16 - 1 to 2^16 - 1.

No, that would be a total of 2^17 + 1 distinct values. With a
16-bit number, you can only have 2^16 distinct values.

The usual format for signed numbers is two's complement. In
that format, the values range from -2^15 to 2^15-1, which is
another way of saying from -32768 to +32767.
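
As a quick sanity check of that range, Java's own short type has exactly these limits. This is only a sketch; the class name ShortRange and the helper distinctValues are invented for the example.

```java
public class ShortRange {
    // Number of distinct values a signed 16-bit two's-complement type can hold.
    static int distinctValues() {
        return Short.MAX_VALUE - Short.MIN_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Short.MIN_VALUE);  // prints -32768, i.e. -2^15
        System.out.println(Short.MAX_VALUE);  // prints 32767, i.e. 2^15 - 1
        System.out.println(distinctValues()); // prints 65536, i.e. 2^16
    }
}
```

Any decoded sample outside -32768..32767 (such as the 38724.0 in the posted output) is a sign that the bytes are not being grouped or interpreted as signed 16-bit values.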
I believe that I am not converting the data correctly. To achieve the
conversion, I am taking 4 bytes at a time, and converting them. That
is, first bytes 0 - 3, then bytes 4 - 7 and so on. Is this correct ?

Well, you haven't said whether the data in your input file is
monophonic, stereophonic, or something else. If it's stereo,
you're going to have pairs of samples. Since each sample is
16 bits, which is 2 bytes, each pair of samples will be 4 bytes.
But I would avoid that at the early stages and try to start with
an input file that is monophonic in order to keep things simple.

Assuming you have a monophonic input file, you need to read
only 2 bytes per sample.
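
If the data comes from the javax.sound.sampled API, the bytes-per-frame arithmetic above can be cross-checked with AudioFormat. This is a sketch that assumes the format described in the question (2000 Hz, 16-bit, signed, little-endian); the class name FrameSizeDemo and the helper frameSize are made up.

```java
import javax.sound.sampled.AudioFormat;

public class FrameSizeDemo {
    // Returns the number of bytes in one frame (one sample for every channel)
    // for 2000 Hz, 16-bit, signed, little-endian PCM.
    static int frameSize(int channels) {
        // Arguments: sampleRate, sampleSizeInBits, channels, signed, bigEndian
        AudioFormat format = new AudioFormat(2000f, 16, channels, true, false);
        return format.getFrameSize();
    }

    public static void main(String[] args) {
        System.out.println(frameSize(1)); // mono: prints 2
        System.out.println(frameSize(2)); // stereo: prints 4
    }
}
```
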
Any hints, suggestions would be greatly appreciated. Thanks in advance
for your help.

Let's assume you have read some bytes of the input file into
some array. Converting that into samples is going to look
something like this:

byte[] rawBytes = getBlockOfSamples();

if (rawBytes.length % 2 != 0) {
    throw new IllegalArgumentException("Can't handle samples spanning blocks");
}

short[] samples = new short[rawBytes.length / 2];
int inputOffset = 0;
int outputOffset = 0;

while (inputOffset < rawBytes.length) {
    // read in both bytes of the current sample;
    // put them in 16-bit types since they'll
    // be converted to that size soon anyway.
    short lowOrder = rawBytes[inputOffset];
    short highOrder = rawBytes[inputOffset+1];
    inputOffset += 2;

    // the low-order byte is meant to be
    // unsigned since the sign bit is in the
    // high-order byte. But the java type
    // wraps around after 127, so some of
    // our positive numbers will have gotten
    // converted to negatives. so fix that.
    // since we have already converted to short,
    // we can already handle the larger range.
    if (lowOrder < 0) {
        lowOrder += 256;
    }

    // shift the high-order byte into position
    // and combine them. | and << produce an int,
    // so cast the result back to short.
    samples[outputOffset] = (short) (lowOrder | (highOrder << 8));
    outputOffset++;
}

There is probably some tricky way to avoid that conditional I
used to correct for the negative values, but let's forget about
performance for now.
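
One common way to avoid the conditional is to mask the low-order byte with 0xFF, which has the same effect as adding 256 to negative values. A minimal sketch, with the class name MaskDecode and the helper toSample invented for the example:

```java
public class MaskDecode {
    // Combines a little-endian byte pair into one signed 16-bit sample.
    static short toSample(byte low, byte high) {
        // (low & 0xFF) promotes to int and clears the sign extension, giving 0..255.
        // The high byte keeps its sign, so the combined value is signed.
        return (short) ((low & 0xFF) | (high << 8));
    }

    public static void main(String[] args) {
        System.out.println(toSample((byte) 0xFF, (byte) 0x7F)); // prints 32767
        System.out.println(toSample((byte) 0x00, (byte) 0x80)); // prints -32768
        System.out.println(toSample((byte) 0xFE, (byte) 0xFF)); // prints -2
    }
}
```
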

- Logan
 
Lew

Logan said:
You *must* set your tone generator to a lower frequency! At
that frequency, even if your software is perfect, you're still
going to see garbage data!

Nyquist's sampling theorem says that when you sample at 2000 Hz,
the highest possible frequency you can represent at *all* (without
completely mangling it) is 1000 Hz. There must be at least two
samples per wavelength.

Doesn't that apply to analog sampling? Digital sampling reduces the accuracy
of the reproduction still further, doesn't it?

I have always wondered if the Nyquist frequency really applied to digital
sampling. Every time I've looked it up the formulas use real numbers, not
floating-point approximations. Do you have insight on this?
I would try setting your frequency generator to something like
100 Hz, or set your sampling rate higher.
The usual format for signed numbers is two's complement. In
that format, the values range from -2^15 to 2^15-1, which is
another way of saying from -32768 to +32767.

This is Java, where this is the only format for signed numbers. Java's
signed 16-bit integral type is short; its only unsigned 16-bit type is char.

Endianness should be much easier to handle with the built-in facilities of
java.nio.ByteBuffer, saving you all that looping through
short lowOrder = rawBytes[inputOffset];
short highOrder = rawBytes[inputOffset+1];
etc.
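
A sketch of what the ByteBuffer approach could look like, assuming the whole block of raw bytes is already in memory and holds mono 16-bit little-endian samples. The class name NioDecode and the helper toSamples are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class NioDecode {
    // Decodes little-endian 16-bit PCM bytes into signed samples.
    // A trailing odd byte, if any, is ignored.
    static short[] toSamples(byte[] rawBytes) {
        short[] samples = new short[rawBytes.length / 2];
        ByteBuffer.wrap(rawBytes)
                  .order(ByteOrder.LITTLE_ENDIAN) // the default order is big-endian
                  .asShortBuffer()
                  .get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = { 0x01, 0x00, (byte) 0xFF, (byte) 0xFF, 0x00, (byte) 0x80 };
        for (short s : toSamples(raw)) {
            System.out.println(s); // prints 1, then -1, then -32768
        }
    }
}
```
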
 
Mark Space

Logan said:
Nyquist's sampling theorem says that when you sample at 2000 Hz,
the highest possible frequency you can represent at *all* (without
completely mangling it) is 1000 Hz. There must be at least two
I would try setting your frequency generator to something like
100 Hz, or set your sampling rate higher.


Good point, I completely missed that in the OP's post. It never occurred
to me that he was actually using an external tone generator. I thought
he was talking about something in software.

Considering that humans can hear up to 15kHz to 20kHz or so, should we
be using at least 150k samples per second? That's the general rule I
remember -- 10x oversample or risk distortion.
 
cpptutor2000

Thank you very much for your very helpful hints and insight into the
problem. Initially, I had set the sampling frequency at 8000 Hz, with
PCM at 16 bits, signed, little-endian, channel mono. However, with
this sampling frequency, I started getting Java OutOfMemoryError.
So, I shifted to 2000 Hz. Also, I am using a software tone generator.

 
Logan Shaw

Doesn't that apply to analog sampling? Digital sampling reduces the
accuracy of the reproduction still further, doesn't it?

Yes, but as far as I know, that's a different issue.

I have always wondered if the Nyquist frequency really applied to
digital sampling. Every time I've looked it up the formulas use real
numbers, not floating-point approximations. Do you have insight on this?

It's an interesting question. I think about it in terms of the magnitude
of the error. The size of the error will always be smaller than the
magnitude of the least significant bit in the digital sample; otherwise,
you'd be choosing a different digital value. So you could think of the
result of playing back the series of digital samples as being equivalent
to playing back the original analog values (before quantization) plus
very small values that are the difference of the analog and the digital.
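
To put a number on that error, here is a small sketch that rounds an ideal full-scale sine to 16-bit values and reports the largest difference. It assumes round-to-nearest quantization, which a real converter may or may not use; the class name QuantizationError and the helper maxErrorInLsb are invented for the example.

```java
public class QuantizationError {
    // Largest |analog - quantized| over n points of one sine period,
    // measured in units of one least significant bit.
    static double maxErrorInLsb(int n) {
        double max = 0.0;
        for (int i = 0; i < n; i++) {
            // Ideal "analog" value, scaled to the 16-bit range.
            double analog = 32767.0 * Math.sin(2 * Math.PI * i / n);
            // Nearest representable 16-bit value.
            short digital = (short) Math.round(analog);
            max = Math.max(max, Math.abs(analog - digital));
        }
        return max;
    }

    public static void main(String[] args) {
        // With round-to-nearest the error never exceeds half an LSB.
        System.out.println(maxErrorInLsb(1000));
    }
}
```
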

When you think of it in these terms, it's like you're adding a small
noise signal to the original. But the noise signal's magnitude is so
small that it's barely noticeable (if your samples are high-enough
resolution). In fact, it may easily be dwarfed by some noise source
that was in the original analog signal anyway. In the digital domain, there
is obviously no such thing as a noise-free reproduction of a signal,
because that would require an infinite amount of information. But the
thing not to miss is that there is no such thing as a noise-free
reproduction of a signal in the analog domain either. Even a simple
pair of wires adds distortion: it has capacitance and so acts as a
low-pass filter. I'm not a professional audio guy or anything, but
from what I understand, in the audio world, any A-to-D with a sample
size larger than 24 bits is viewed as partly a joke, because many people
are convinced nobody sells any equipment with analog circuits that are
quiet enough so that the last few bits are anything other than a
digital representation of analog noise, even at 24 bits. (Of course,
larger sample sizes are useful when processing the audio.)

This is Java, where this is the only format for signed numbers.

Yes, it might not have been all that clear, but I was referring to
the format of the samples in the input file. They probably are
two's-complement, but in theory they could be something else. They
could have their bits completely reversed or they could be a gray
code or one's-complement or something. Unlikely, but you never know.
There are some strange file formats out there.

Endianness should be much easier to handle with the built-in facilities
of java.nio.ByteBuffer.

That's an interesting idea. I haven't had much occasion to use nio,
and I didn't know it had that stuff in it.

- Logan
 
Logan Shaw

Mark said:
Considering that humans can hear up to 15kHz to 20kHz or so, should we
be using at least 150k samples per second? That's the general rule I
remember -- 10x oversample or risk distortion.

Yes, I believe that's what actually happens in practice. You effectively
sample at one rate and then downconvert to a rate that you can use to
work with the data and transmit the data. I'm fairly sure that in most
commercially-available audio hardware, the oversampling and downconversion
happens in the hardware, so that the software and the end-user don't need
to be aware of it.

Actually, that's not even true now that I think of it. A lot of A-to-D
converters (maybe even the vast majority now?) are "one-bit", which means
they use delta-sigma coding, which involves some analog circuitry to
modulate the signal as they take one-bit samples at a very high rate.
That then gets converted into PCM data at a lower rate.

- Logan
 
Logan Shaw

Thank you very much for your very helpful hints and insight into the
problem. Initially, I had set the sampling frequency at 8000 Hz, with
PCM at 16 bits, signed, little-endian, channel mono. However, with
this sampling frequency, I started getting Java OutOfMemoryError.
So, I shifted to 2000 Hz. Also, I am using a software tone generator.

Aha, well if you're using a software tone generator (which I had not
considered), you may actually be able to see the maximum theoretical
frequency. But don't expect it to come across as a sine wave. Since
2000 is an exact multiple of 1000, you will be hitting the sine wave
at the same exact two points in its phase every time, and so the sample
data you see will just be a series of two alternating sample values.
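
This is easy to check numerically. The sketch below samples an ideal 1000 Hz sine at 2000 Hz; the starting phase of pi/4 is an arbitrary assumption (with a phase of exactly 0 every sample would land on a zero crossing), and the names NyquistSamples and sample are invented.

```java
public class NyquistSamples {
    // Samples sin(2*pi*toneHz*t + phase) at t = n / rateHz for n = 0..count-1.
    static double[] sample(double toneHz, double rateHz, double phase, int count) {
        double[] out = new double[count];
        for (int n = 0; n < count; n++) {
            out[n] = Math.sin(2 * Math.PI * toneHz * n / rateHz + phase);
        }
        return out;
    }

    public static void main(String[] args) {
        // The printed values alternate between roughly +0.7071 and -0.7071.
        for (double v : sample(1000.0, 2000.0, Math.PI / 4, 6)) {
            System.out.printf("%.4f%n", v);
        }
    }
}
```
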

That makes me think a little further: I would have expected a sequence
of bytes that repeated every two samples. But the output data you
posted showed variation even after the first four bytes. This makes
me wonder if you are not loading a file format that has a header in
front of the sample data. If you think that might be happening, you
could generate about 10 or more seconds worth of data, then skip over
the first few kilobytes of the input file to get past the header.
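
One way to test the header theory is to look for the RIFF/WAVE magic bytes at the start of the data. It is at least suggestive that the first value in the posted output, 18770, is exactly what the ASCII characters 'R' and 'I' give when read as a little-endian 16-bit number. The sketch below is only a check, not a parser; the names WavHeaderCheck, looksLikeWav and firstShort are made up.

```java
import java.nio.charset.StandardCharsets;

public class WavHeaderCheck {
    // True if the data starts with "RIFF", a 4-byte size, then "WAVE".
    static boolean looksLikeWav(byte[] data) {
        if (data.length < 12) {
            return false;
        }
        String riff = new String(data, 0, 4, StandardCharsets.US_ASCII);
        String wave = new String(data, 8, 4, StandardCharsets.US_ASCII);
        return riff.equals("RIFF") && wave.equals("WAVE");
    }

    // Reads the first two bytes as a little-endian signed 16-bit value.
    static short firstShort(byte[] data) {
        return (short) ((data[0] & 0xFF) | (data[1] << 8));
    }

    public static void main(String[] args) {
        // "RIFF", a dummy 4-byte size field, then "WAVE".
        byte[] header = { 'R', 'I', 'F', 'F', 0, 0, 0, 0, 'W', 'A', 'V', 'E' };
        System.out.println(looksLikeWav(header)); // prints true
        System.out.println(firstShort(header));   // prints 18770
    }
}
```

If the check succeeds, the sample data starts after the header chunks rather than at byte 0. Letting javax.sound.sampled.AudioSystem.getAudioInputStream read the file is another option, since it parses the header itself.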

- Logan
 
