In general, refer to the IEEE 754 standard itself for the strict conversion rules (including the rounding behaviour) from a real number to its equivalent binary32 representation.

The following outline shows how to convert a base-10 real number into IEEE 754 binary32 format:

  • consider a real number with an integer part and a fraction part, such as 12.375
  • convert the integer part into binary
  • convert the fraction part into binary using the technique shown below
  • add the two results and normalize the sum to produce the final conversion

Conversion of the fractional part:

Consider 0.375, the fractional part of 12.375.

To convert it into a binary fraction, multiply the fraction by 2, take the integer part as the next binary digit, and multiply the remaining fraction by 2 again. Repeat until the fraction becomes zero or until the precision limit is reached, which is 23 fraction digits for the IEEE 754 binary32 format.

0.375 x 2 = 0.750 = 0 + 0.750 => b−1 = 0 (the integer part is the next binary fraction digit; multiply the remaining 0.750 by 2 to continue)

0.750 x 2 = 1.500 = 1 + 0.500 => b−2 = 1

0.500 x 2 = 1.000 = 1 + 0.000 => b−3 = 1, fraction = 0.000, terminate

We see that (0.375)10 can be represented exactly in binary as (0.011)2. Not all decimal fractions can be represented by a finite binary fraction. For example, decimal 0.1 has no exact binary representation, so it can only be approximated.
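The repeated-doubling procedure above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `fraction_to_binary_digits` and its `max_digits` parameter are assumptions made for this example, and `fractions.Fraction` is used so that the decimal input is held exactly rather than as an already-rounded float.

```python
from fractions import Fraction

def fraction_to_binary_digits(decimal_text, max_digits=23):
    """Convert a decimal fraction in [0, 1) to binary digits by repeated doubling.

    Returns (digits, exact), where exact is True if the fraction reached zero
    before max_digits was exhausted. (Hypothetical helper for illustration.)
    """
    frac = Fraction(decimal_text)  # exact rational value of the decimal text
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= 2
        bit = int(frac)            # integer part is the next binary digit
        digits.append(str(bit))
        frac -= bit                # keep only the remaining fraction
    return "".join(digits), frac == 0

print(fraction_to_binary_digits("0.375"))  # ('011', True)
print(fraction_to_binary_digits("0.1"))    # ('00011001100110011001100', False)
```

For 0.1 the loop stops at the 23-digit limit with a non-zero remainder, which is why the value can only be approximated.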

Therefore (12.375)10 = (12)10 + (0.375)10 = (1100)2 + (0.011)2 = (1100.011)2

The IEEE 754 binary32 format also requires real values to be represented in the form (1.b1b2...b23) x 2^e (see Normalized number, Denormalized number), so the binary point of 1100.011 is moved 3 places to the left, giving

(1.100011) x 2^3

Finally, we can see that (12.375)10 = (1.100011)2 x 2^3
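The normalization step can be sketched as follows. The helper name `normalize_binary` is hypothetical, and the sketch assumes a positive, non-zero binary string written with at most one binary point; it returns the exponent together with the digits to the right of the leading 1.

```python
def normalize_binary(binary_text):
    """Normalize a positive binary string such as '1100.011' to 1.f x 2^e.

    Returns (e, f), where f is the string of digits after the leading 1.
    Raises ValueError if the input contains no 1 bit (i.e. it is zero).
    (Hypothetical helper for illustration.)
    """
    int_part, _, frac_part = binary_text.partition(".")
    digits = int_part + frac_part
    first_one = digits.index("1")              # position of the leading 1
    exponent = len(int_part) - 1 - first_one   # places the binary point moves left
    fraction = digits[first_one + 1:].rstrip("0")
    return exponent, fraction

print(normalize_binary("1100.011"))  # (3, '100011')
```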

From which we deduce:

  • The exponent is 3 (and in the biased form it is therefore 3 + 127 = 130 = 1000 0010)
  • The fraction is 100011 (the digits to the right of the binary point), padded on the right with zeros to 23 bits

From these we can form the resulting 32-bit IEEE 754 binary32 representation of 12.375 as: 0-10000010-10001100000000000000000 = 41460000H
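As a cross-check, the three fields can be assembled in Python and compared against the encoding produced by the standard `struct` module. The function name `encode_binary32_fields` is an assumption for this sketch, which handles only normal numbers (unbiased exponent from −126 to 127) with at most 23 fraction bits supplied.

```python
import struct

def encode_binary32_fields(sign, exponent, fraction_bits):
    """Pack sign, unbiased exponent and fraction digits into a 32-bit word.

    Assumes a normal number and at most 23 fraction digits.
    (Hypothetical helper for illustration.)
    """
    biased = exponent + 127                  # binary32 exponent bias
    fraction = fraction_bits.ljust(23, "0")  # pad on the right with zeros
    return int(f"{sign}{biased:08b}{fraction}", 2)

word = encode_binary32_fields(0, 3, "100011")
print(f"{word:08X}")  # 41460000

# Compare with the platform's own binary32 encoding of 12.375.
reference = struct.unpack(">I", struct.pack(">f", 12.375))[0]
print(word == reference)  # True
```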

 

Note: consider converting 68.123 into IEEE 754 binary32 format. Using the above procedure, which simply stops after 23 fraction bits, you would expect to get 42883EF9H, with the last 4 bits being 1001. However, due to the default rounding behaviour of IEEE 754 (round to nearest, ties to even), the result is 42883EFAH, whose last 4 bits are 1010.
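The difference between truncation and the default rounding can be reproduced with a short sketch. Both function names are hypothetical; `truncated_binary32_word` assumes a positive normal value and simply drops every bit beyond the 23rd, while `binary32_word` relies on `struct.pack` to round the Python float to the nearest binary32 value.

```python
import struct
from fractions import Fraction

def binary32_word(x):
    """Return the 32-bit pattern struct produces for x (rounded to nearest)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def truncated_binary32_word(decimal_text):
    """Encode a positive normal value by truncating after 23 fraction bits.

    (Hypothetical helper for illustration; no rounding is performed.)
    """
    value = Fraction(decimal_text)
    exponent = 0
    while value >= 2:                      # scale into the range [1, 2)
        value /= 2
        exponent += 1
    while value < 1:
        value *= 2
        exponent -= 1
    fraction = int((value - 1) * 2**23)    # floor: discard the remaining bits
    return ((exponent + 127) << 23) | fraction

print(f"{truncated_binary32_word('68.123'):08X}")  # 42883EF9
print(f"{binary32_word(68.123):08X}")              # 42883EFA
```

The discarded tail of 68.123 is larger than half a unit in the last place, so rounding to nearest increments the final bit pattern from 1001 to 1010.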