Arithmetic Operations, Floating Point

Instructions: Enter a floating point number in the upper left textfield labeled "number 1:" and a floating point number in the textfield below that is labeled "number 2:". Then select an operation (+,-,/,*) from the bottom button group. The binary representation of the two numbers shows up in the four textfields to the right of each number. Floating point numbers are typically represented in a computer as a sequence of four bytes (doubles are represented by eight bytes). The result of the selected operation shows up in the bottom left textfield labeled "result:". The binary value of the result is shown in the four bytes to its right. Try assigning number 1 the value 23, assigning number 2 the value 34 and multiplying. Any surprises on the binary side?
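For readers without the applet at hand, the four bytes it displays can be reproduced with a short Python sketch. The helper name `float_bytes` is made up for this illustration, and the byte order shown (most significant byte first) is an assumption about how the applet lays out its textfields.

```python
import struct

def float_bytes(x):
    """Return the four bytes of x, stored as a single-precision float,
    as 8-bit binary strings, most significant byte first."""
    packed = struct.pack(">f", x)  # ">f" = big-endian 32-bit float
    return [format(b, "08b") for b in packed]

# The example suggested above: 23, 34 and their product.
for value in (23.0, 34.0, 23.0 * 34.0):
    print(value, " ".join(float_bytes(value)))
```

Comparing the three printed rows with the ordinary binary integers 10111, 100010 and 1100001110 shows how differently the same values look once stored as floating point numbers.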

Floating Point Representation:
 The problem with integers is that the highest one representable in 32 bits is 2^32 - 1, which is 4,294,967,295. This is not terribly high. The solution to representing very large numbers (and very small numbers too) is to treat each number as though it were written in scientific notation, with a mantissa between 1 and 2 and an exponent. For example, the number 0.15625 in binary is 0.00101. This is 1.01 (the mantissa) times 2^-3; the number -3 is called the exponent (it is stored as ...1100, as described below).

The computer representation of the number 0.15625 as a floating point number uses the rightmost 23 bits to represent the mantissa, the leftmost bit to represent the sign of the number (0 means positive, 1 means negative) and the remaining 8 bits to represent the exponent, as shown here (stolen from Wikipedia, sorry):

[Figure: the 32 bits of 0.15625: sign 0, exponent 01111100, fraction 01000000000000000000000]

where the bits are numbered from 0 to 31, with bit 0 the rightmost. Bits 0 to 22 represent the mantissa, bits 23 to 30 the exponent, and bit 31 the sign. The leading 1 of the mantissa is missing because any number between 1 and 2 always has a leading 1, so it can be removed, leaving only the fractional part of the mantissa to worry about (that is, the mantissa is treated like a 24 bit number). Hereafter we will say fraction instead of mantissa when referring to the fractional part of the mantissa that is stored.

There are some special cases to worry about, though. By convention, the representation of 0 has the exponent equal to 000...000 (all zero) and the fraction equal to 000...000 (all zero), regardless of the sign bit. If the exponent is 111...111 (all ones) and the fraction is 000...000 (all zero), then the number is +∞ if the sign bit is 0 and -∞ if the sign bit is 1. If the exponent is 111...111 (all ones) and the fraction contains at least one 1, then the number is undefined (also called NaN for "Not a Number"). That leaves two cases.
If the exponent is non-zero (and not all ones), then the number is positive if the sign bit is 0 and negative if it is 1. The exponent is stored in biased form: 127 is added to the true exponent and the sum is stored as an unsigned 8 bit integer (thus in the figure the exponent -3 is represented as 124, which is 01111100). This resembles twos complement form with the most significant bit reversed, but that scheme would correspond to a bias of 128 rather than 127. The fraction is stored as a 23 bit integer holding the bits to the right of the radix point (in the figure it is 01000..., for the mantissa 1.01). If the exponent is 0 and there is at least one 1 in the fraction, then the number represented is 2^-126 * F, where F is just the fractional part, read as 0.fraction (in other words, we forget about the implied 1 to the left of the radix point in the mantissa). Thus, the following:
```
0 00000000 10000000000000000000000
```
represents the number 2^-127, since F is 0.1 in binary, which is 1/2. The reason for the above convention is that arithmetic operations, particularly multiplication, can be done efficiently.
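The cases described above can be collected into a small decoder. What follows is a minimal Python sketch, not the applet's own code: the function names `fields` and `decode` are invented for this illustration, and the exponent bias of 127 is the value used by the IEEE 754 single-precision format.

```python
import math
import struct

def fields(x):
    """Split x, stored as a 32-bit float, into (sign, exponent, fraction) bit strings."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return s[0], s[1:9], s[9:]

def decode(bit_string):
    """Decode a 32-bit pattern written as 'sign exponent fraction' (spaces optional)."""
    s = bit_string.replace(" ", "")
    if len(s) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = -1.0 if s[0] == "1" else 1.0
    exponent = int(s[1:9], 2)   # stored (biased) exponent, 0..255
    fraction = int(s[9:], 2)    # 23 stored fraction bits as an integer
    if exponent == 255:
        # All ones: infinity if the fraction is zero, otherwise NaN.
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:
        # Zero or a very small number: no implied leading 1.
        return sign * fraction / 2**23 * 2.0**-126
    # Ordinary case: implied leading 1, exponent biased by 127.
    return sign * (1 + fraction / 2**23) * 2.0**(exponent - 127)

print(fields(0.15625))
print(decode("0 01111100 01000000000000000000000"))
print(decode("0 00000000 10000000000000000000000") == 2.0**-127)
```

Python's own floats are doubles, so every single-precision value, including the smallest ones, is computed exactly by `decode`.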

The bottom row above shows the result represented as a double (8 bytes). In that case, the exponent is 11 bits and the fraction is 52 bits (with the sign bit this is 64 bits or 8 bytes).
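The same splitting can be sketched for doubles. The helper name `double_fields` is hypothetical. The bias of 1023 mentioned in the comments is not stated in the text above; it is the value used by the IEEE 754 double-precision format.

```python
import struct

def double_fields(x):
    """Split x, stored as a 64-bit double, into (sign, exponent, fraction) bit strings."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = format(bits, "064b")
    # 1 sign bit, 11 exponent bits, 52 fraction bits
    return s[0], s[1:12], s[12:]

sign, exponent, fraction = double_fields(0.15625)
print(sign, exponent, fraction)
# The stored exponent is the true exponent plus 1023, so subtracting recovers -3.
print(int(exponent, 2) - 1023)
```

The fraction begins with the same bits 01 as in the four-byte case, only followed by more zeros.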