Wednesday, December 2, 2009

Int->Float->Int: It's a jungle out there!

It turns out that the simple operation of converting from integer to float and back is not so simple. When it comes to audio, this conversion should be done with care, and most programmers do, in fact, put a lot of thought into it. The issue most programmers notice is that audio stored (or processed) as integers usually uses "two's complement" notation, which always has one more negative value than positive: a 16-bit sample, for example, ranges from -32768 to +32767. Floating point audio, by contrast, uses a nominal range of -1 to +1.

The fact that there are more negative numbers than positive numbers has caused some confusion amongst programmers, and a number of different conversion methods have been proposed. Here is my survey of how a number of existing software and hardware packages handle this conversion. In these examples, I show conversions for 16-bit integers, but they all extend in the obvious way to other bit depths. It is important to consider how these methods extend to larger integers, especially 24-bit integers, so I've tested bit transparency for these methods up to 24-bit using single precision floating point intermediaries, correcting for the fact that IEEE 754 allows intermediate computations to be carried out at extended precision. Endianness is irrelevant here: everything works the same on big and little endian systems.
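A bit-transparency check of the kind described above can be sketched roughly as follows. This is only an illustration, not the test code used for this survey: the names `f32`, `is_transparent` and `power_of_two_pair` are made up for the example, and the conversion pair shown is the one that divides and multiplies by 0x8000 (or its 24-bit equivalent). Forcing every intermediate result through `f32` stands in for "correcting for extended precision".

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def is_transparent(to_float, to_int, bits, step=1):
    """True if every tested `bits`-wide integer survives int -> float -> int."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    candidates = list(range(lo, hi + 1, step)) + [hi]  # always include the top value
    return all(to_int(to_float(i)) == i for i in candidates)

def power_of_two_pair(bits):
    """Divide and multiply by the same power of two (0x8000 for 16-bit)."""
    scale = float(1 << (bits - 1))

    def to_float(i):
        return f32(i / scale)         # force the result into single precision

    def to_int(f):
        return round(f32(f * scale))  # round to nearest rather than truncate

    return to_float, to_int

to_f16, to_i16 = power_of_two_pair(16)
to_f24, to_i24 = power_of_two_pair(24)
print(is_transparent(to_f16, to_i16, 16))            # exhaustive over all 16-bit values
print(is_transparent(to_f24, to_i24, 24, step=251))  # sampled to keep the run short
```

The 24-bit run is sampled only because an exhaustive loop is slow in pure Python; setting `step=1` tests every value.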

Transparency is only required or possible when the data has not been created synthetically or altered via DSP (including such simple operations as volume changes, mixing, etc). In cases where transparency is not possible, dither must be applied when converting to integer or reducing the resolution. In many software packages it is up to the end-user to make this determination and manually switch dither on or off. In my next post I will discuss dithering and linearity.
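As a rough illustration of what applying dither at the float to integer step can look like, the sketch below adds triangular (TPDF) noise of about one LSB either side before rounding. The choice of TPDF, the 0x8000 scale factor, and the function name `float_to_int16_tpdf` are all assumptions made for this example; nothing here is taken from any of the packages surveyed.

```python
import random

def float_to_int16_tpdf(sample, rng=random):
    """Convert a float in the nominal -1..+1 range to 16-bit with TPDF dither."""
    scaled = sample * 0x8000
    # The sum of two uniform values in [-0.5, 0.5) has a triangular distribution
    # spanning roughly -1 to +1 LSB.
    noise = (rng.random() - 0.5) + (rng.random() - 0.5)
    value = round(scaled + noise)
    # Clip to the two's complement range so full-scale input cannot overflow.
    return max(-0x8000, min(0x7FFF, value))

rng = random.Random(0)  # seeded only so the example is repeatable
outputs = [float_to_int16_tpdf(100.5 / 0x8000, rng) for _ in range(10)]
print(outputs)
```

A value that falls between two integer codes, such as 100.5 here, comes out as a mix of the neighbouring codes whose average approaches the original value, instead of always landing on the same one.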

| Method | Int to Float | Float to Int* | Transparency | Used By |
|---|---|---|---|---|
| 0 | (integer + .5) / (0x7FFF + .5) | | Up to at least 24-bit | DC DAC Modeled |
| 1 | integer / 0x8000 | float * 0x8000 | Up to at least 24-bit | Apple (Core Audio) [1], ALSA [2], MatLab [2], sndlib [2] |
| 2 | integer / 0x7FFF | float * 0x7FFF | Up to at least 24-bit | Pulse Audio [2] |
| 3 | integer / 0x8000 | float * 0x7FFF | | PortAudio [1,2], Jack [2], libsndfile [1,3] |
| 4 | | | Up to at least 24-bit | At least one high end DSP and A/D/A manufacturer [2,4]; XO Wave 1.0.3 |
* Obviously, rounding or dithering may be required here.
Note that in the case of IO APIs, drivers are often responsible for conversions. The conversions listed here are provided by the API.
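To make the differences between the divide/multiply pairs concrete, here is a small sketch that round-trips every 16-bit integer through single precision for three of them: 0x8000 both ways, 0x7FFF both ways, and the mixed case (divide by 0x8000, multiply by 0x7FFF). The helper names are invented for this example, and Python's `round` (round to nearest, ties to even) is assumed as the "sensible rounding"; real implementations may round differently.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# (divisor for int -> float, multiplier for float -> int), 16-bit case
PAIRS = {
    "0x8000 / 0x8000": (0x8000, 0x8000),
    "0x7FFF / 0x7FFF": (0x7FFF, 0x7FFF),
    "0x8000 / 0x7FFF": (0x8000, 0x7FFF),
}

def int_to_float(i, divisor):
    return f32(i / divisor)

def float_to_int(f, multiplier, rounder=round):
    # `round` rounds to nearest; pass `int` to truncate toward zero instead.
    return rounder(f32(f * multiplier))

def changed_values(divisor, multiplier):
    """16-bit integers that do not survive the int -> float -> int round trip."""
    return [i for i in range(-0x8000, 0x8000)
            if float_to_int(int_to_float(i, divisor), multiplier) != i]

for name, (divisor, multiplier) in PAIRS.items():
    print(name, "changed:", len(changed_values(divisor, multiplier)))

# Mixing the constants shrinks full-scale samples: 32767 comes back as 32766.
print(float_to_int(int_to_float(0x7FFF, 0x8000), 0x7FFF))

# Truncating instead of rounding sends small negative values to 0 rather than -1.
small = f32(-0.00002)
print(float_to_int(small, 0x8000), float_to_int(small, 0x8000, rounder=int))
```

In this sketch the two matched pairs leave every 16-bit value unchanged, while the mixed pair alters the larger-magnitude samples. The last line shows why the choice of rounding matters: a plain integer cast truncates toward zero, so values just below and just above zero both collapse to 0.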

Method 0 is one possible method for preserving the DC accuracy of a DAC, and is included here for reference.
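For illustration, Method 0's integer to float mapping can be sketched as below. The float to integer direction shown is simply the algebraic inverse of that formula, which is an assumption on my part for this example, and the function names are hypothetical. The sketch uses ordinary Python doubles, not single precision.

```python
HALF_RANGE = 0x7FFF + 0.5  # 32767.5

def int_to_float_dc(i):
    """Method 0, int to float: (integer + .5) / (0x7FFF + .5)."""
    return (i + 0.5) / HALF_RANGE

def float_to_int_dc(f):
    # Assumed inverse of the formula above, rounded to the nearest integer.
    return round(f * HALF_RANGE - 0.5)

print(int_to_float_dc(-0x8000))                  # most negative code
print(int_to_float_dc(0x7FFF))                   # most positive code
print(int_to_float_dc(-1), int_to_float_dc(0))   # the two codes nearest zero
```

With the half-step offset, the most negative and most positive codes land on exactly -1.0 and +1.0, and the codes -1 and 0 sit symmetrically either side of 0.0, so no integer code maps to exactly zero.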

Edited December 6, 2009: Fixed Method 3 (0x8000 and 0x7FFF were backwards).

[1] Mailing list.
[2] Perusing the source code (this, of course, is subject to mistakes due to following old, conditional or optional code).
[3] The libsndfile FAQ goes into detail about this.
[4] Personal communication.


  1. The libsndfile FAQ seems to show

    Int to Float: integer/0x8000

    Float to Int: float*0x7FFF

    Float to Int in libsndfile also seems to use the lrintf() function rather than just an integer cast or a truncation.

  2. Thanks for the correction, brbrofsl. I made the fix about libsndfile. Indeed, I had my constants reversed. As for rounding, libsndfile and many of these techniques assume some kind of sensible rounding for bit-transparency to work, as noted at the bottom of my table.

  3. Update/Correction: JACK has used symmetrical ("transparent") conversions for more than a year now (using 0x7fffff).

  4. @Domain: Sorry I missed your comment. Can you send me a link to verify this?
