IEEE single-precision floating-point format
Real values are stored in 4 bytes. Those 4 bytes (32 bits) are divided
into three bit fields: the sign bit, the (binary) exponent, and the (binary)
mantissa.
I will refer to the raw contents of those bit fields as the binary
values. I will refer to the sign, exponent, and mantissa as one
normally thinks of them in a programming language or in scientific
notation as the mathematical values or decimal values.
For example, -3.14e-05 has binary sign bit 1,
exponent bits 0x70, and mantissa bits 0x03b37e. Yet
of course one thinks mathematically of -3.14e-05 as having
sign -1, exponent -5, and mantissa 3.14.
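The binary fields quoted above can be checked with a short sketch. It assumes Python's standard struct module and the 1/8/23-bit layout; the helper name float_fields is made up for illustration.

```python
import struct

def float_fields(x):
    """Split x, rounded to single precision, into its three bit fields."""
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned 32-bit integer so the bits can be masked and shifted.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits
    mantissa = bits & 0x7FFFFF       # low 23 bits
    return sign, exponent, mantissa

sign, exponent, mantissa = float_fields(-3.14e-05)
print(sign, hex(exponent), hex(mantissa))   # 1 0x70 0x3b37e
```

Python's own floats are double precision, so packing with the "f" format is what rounds the value to single precision before the bits are inspected.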
Bit fields
An IEEE single-precision float has 1 sign bit, 8 exponent bits, and 23 mantissa bits:
- IEEE sign bit is 0 for positive numbers, 1 for negative.
- IEEE exponent is 127 more than the mathematical (power-of-two) exponent and is unsigned.
This is called excess-127 exponent notation.
Since there are 8 bits for the exponent, the lowest value is 0x0 and
the highest value is 0xff.
- IEEE mantissa has an implicit leading bit which is not stored.
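As a rough illustration of how these three fields combine, the sketch below rebuilds a value from its bit fields. It is limited to the ordinary case where the implicit leading bit is a one (exponent fields 0x01 through 0xfe), and the function name decode_normal is hypothetical.

```python
def decode_normal(sign, exponent, mantissa):
    """Value of a single-precision number whose implicit leading bit is 1."""
    # Assumption of this sketch: exponent fields 0x00 and 0xff are
    # special cases and are deliberately not handled here.
    assert 0x01 <= exponent <= 0xFE, "only exponent fields 0x01..0xfe are handled"
    significand = 1.0 + mantissa / 2**23           # restore the unstored leading 1
    value = significand * 2.0 ** (exponent - 127)  # undo the excess-127 bias
    return -value if sign else value

print(decode_normal(0, 0x7F, 0x000000))   # 1.0
print(decode_normal(1, 0x80, 0x400000))   # -3.0
```

In the second call the mantissa field 0x400000 sets only the top stored bit, giving a significand of 1.5, and the exponent field 0x80 is 128, giving 2^1.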
Representable values
The purpose of the sign bit is self-explanatory. Here are the values
that can be represented with various values in the exponent and mantissa fields:
- If the exponent bits are 0x0:
  - If the mantissa bits are 0x0: Decimal value represented is plus or minus
    0.0, depending on the sign bit.
    (Note that due to the presence of an explicit sign bit, you can
    have 0.0 as well as -0.0. This is not the case
    with C/C++/Fortran's integral types, such as int, which typically use
    two's complement rather than a separate sign bit and so have only one zero.)
  - If the mantissa bits are not 0x0: Decimal value represented is called a
    subnormal number. The mantissa's implicit leading bit is a zero.
    Exponent 0x0 corresponds to 2^-126, the same scale as exponent 0x1.
    So, the smallest subnormal number (the nonzero number closest to zero) has
    mantissa bits all zeroes but the last bit; the largest subnormal number has
    mantissa bits all ones. This gives a range from 1.401298e-45 to
    1.175494e-38. (The upper end, about 1.17549421e-38, lies just below the
    smallest normal number, about 1.17549435e-38.)
- If the exponent bits are 0x1 to 0xfe:
  - Mantissa bits having any value: A normal number.
    The mantissa's implicit leading bit is a one,
    so the mantissa in decimal is greater than or
    equal to 1.0, and less than 2.0.
    - Exponent 0x1 corresponds to 2^-126.
    - Exponent 0x7f corresponds to 2^0.
    - Exponent 0xfe corresponds to 2^127.
    The smallest normal number has exponent bits 0x1 and mantissa all zeroes;
    the largest normal number has exponent bits 0xfe and mantissa all ones.
    This gives a range from 1.175494e-38 to 3.402823e+38.
- If the exponent bits are 0xff: