# Floating Point Format

• Dec 13th 2010, 11:19 AM
magrogi
Floating Point Format

Quote:

Several different representations of real numbers have been proposed, but by far the most widely used is the floating-point representation. Floating-point representations have a base $\beta$ (which is always assumed to be even) and a precision p. If $\beta$ = 10 and p = 3, then the number 0.1 is represented as $1.00 \times 10^{-1}$. If $\beta$ = 2 and p = 24, then the decimal number 0.1 cannot be represented exactly, but is approximately $1.10011001100110011001101 \times 2^{-4}$.

In general, a floating-point number will be represented as $\pm d.dd \ldots d \times \beta^e$, where $d.dd \ldots d$ is called the significand and has p digits. More precisely, $\pm d_0 . d_1 d_2 \ldots d_{p-1} \times \beta^e$ represents the number $\pm\left(d_0 + d_1 \beta^{-1} + \cdots + d_{p-1} \beta^{-(p-1)}\right) \beta^e$, where $0 \le d_i < \beta$.

The term floating-point number will be used to mean a real number that can be exactly represented in the format under discussion. Two other parameters associated with floating-point representations are the largest and smallest allowable exponents, $e_{max}$ and $e_{min}$. Since there are $\beta^p$ possible significands, and $e_{max} - e_{min} + 1$ possible exponents, a floating-point number can be encoded in
$\lceil \log_2(e_{max} - e_{min} + 1) \rceil + \lceil \log_2(\beta^p) \rceil + 1$ bits, where the final +1 is for the sign bit.
What I don't understand is the last part:

$\lceil \log_2(e_{max} - e_{min} + 1) \rceil + \lceil \log_2(\beta^p) \rceil + 1$

How the whole encoding can be expressed by that formula is bugging me, and I would like to understand it if anyone can explain it.

Any reasonable input would be much appreciated.

Thank you
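As a side check on the quoted passage, the positional formula for the significand can be evaluated exactly with rational arithmetic. This is only a sketch: the function name `significand_value` and the list-of-digits input are my own choices for illustration, not something from the article.

```python
from fractions import Fraction

def significand_value(digits, beta, e):
    """Evaluate (d0 + d1*beta^-1 + ... + d_{p-1}*beta^-(p-1)) * beta^e exactly.

    `digits` is the list [d0, d1, ..., d_{p-1}] (hypothetical input format).
    """
    total = Fraction(0)
    for i, d in enumerate(digits):
        if not 0 <= d < beta:
            raise ValueError("each digit must satisfy 0 <= d < beta")
        total += Fraction(d, beta ** i)
    return total * Fraction(beta) ** e

# beta = 10, p = 3: 1.00 x 10^-1 is exactly one tenth.
decimal = significand_value([1, 0, 0], 10, -1)

# beta = 2, p = 24: 1.10011001100110011001101 x 2^-4 is close to 0.1 but not equal.
binary = significand_value([int(c) for c in "110011001100110011001101"], 2, -4)

print(decimal)                     # 1/10
print(float(binary))               # close to 0.1
print(binary == Fraction(1, 10))   # False
```

Using `Fraction` keeps the check exact, so the small gap between the binary value and 0.1 is real and not a side effect of the check itself.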
• Dec 13th 2010, 11:38 AM
snowtea
This is much simpler than it looks.

How many bits are needed to encode n discrete pieces of information? $\lceil \log_2(n) \rceil$
For example: the minimum number of bits needed to encode 3 values, say x, y, and z, is ceil(lg(3)) = 2. E.g. let x = 00, y = 01, z = 11.
Another example: the number of bits needed to encode all integers from 100 to 1123 inclusive is ceil(lg(1123 - 100 + 1)) = ceil(lg(1024)) = 10, so we need 10 bits.
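If it helps to see the two examples computed, here is a minimal Python sketch. The helper name `bits_needed` is made up for illustration, and it uses the integer method `int.bit_length()` instead of a floating-point logarithm so the result is exact.

```python
def bits_needed(n: int) -> int:
    """Smallest y with 2**y >= n, i.e. ceil(log2(n)), using integers only."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # n values can be numbered 0 .. n-1; the largest label n-1 fits in this many bits.
    return (n - 1).bit_length()

print(bits_needed(3))               # 2
print(bits_needed(1123 - 100 + 1))  # 10
```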

To explain the formula:

The first part, $\lceil \log_2(e_{max} - e_{min} + 1) \rceil$, is the number of bits needed to encode the $e_{max} - e_{min} + 1$ possible exponents.
The second part, $\lceil \log_2(\beta^p) \rceil$, is the number of bits needed to encode all $\beta^p$ significands that can be written with p digits in base $\beta$.
The last + 1 is for the sign, as mentioned.
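The three parts can be added up in a few lines of Python. This is only a sketch: the function names are hypothetical, and the exponent ranges in the examples are my own assumptions (the quoted text does not give any), with the first chosen to match IEEE single precision.

```python
def bits_needed(n: int) -> int:
    """ceil(log2(n)) for n >= 1, computed with integers only."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1).bit_length()

def encoding_bits(beta: int, p: int, e_min: int, e_max: int) -> int:
    """Bits for the exponent, plus bits for the significand, plus one sign bit."""
    exponent_bits = bits_needed(e_max - e_min + 1)
    significand_bits = bits_needed(beta ** p)
    return exponent_bits + significand_bits + 1

# Assumed exponent range -126..127 with beta = 2, p = 24: 8 + 24 + 1
print(encoding_bits(2, 24, -126, 127))  # 33

# Assumed exponent range -9..9 with beta = 10, p = 3: 5 + 10 + 1
print(encoding_bits(10, 3, -9, 9))      # 16
```

The first result is 33 rather than 32 because this formula budgets for all p significand digits, whereas IEEE single precision leaves the leading bit implicit and does not store it.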
• Dec 13th 2010, 12:19 PM
magrogi
First I would like to thank you for your wonderful reply; it answered my question straight away.

My next question is: how do you work out ceil(lg(3)) = 2 or ceil(lg(1123 - 100 + 1)) = 10?

• Dec 13th 2010, 12:25 PM
snowtea
I'm using lg for log base 2, and ceil rounds up to the next integer (a value that is already an integer stays as it is).

Altogether, ceil(lg(x)) asks: what is the smallest integer y s.t. 2^y >= x?

Let's work out ceil(lg(1000)) as an example.

2^9 = 512
2^10 = 1024

so 10 is the smallest integer s.t. 2^10 >= 1000
so ceil(lg(1000)) = 10

To do it on a calculator, use the fact that lg(x) = log(x) / log(2) = ln(x) / ln(2).
So one way is to type ln(1000)/ln(2), which gives about 9.97, and round up to 10.
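Both routes to ceil(lg(x)) described above can be tried in Python. This is a rough sketch with made-up function names: one function searches for the smallest y with 2^y >= x, the other mimics the calculator method with `math.log` and `math.ceil`.

```python
import math

def ceil_lg_by_search(x: int) -> int:
    """Smallest integer y such that 2**y >= x, found by trying y = 0, 1, 2, ..."""
    y = 0
    while 2 ** y < x:
        y += 1
    return y

def ceil_lg_by_calculator(x: float) -> int:
    """ln(x) / ln(2), rounded up, as you would do on a calculator."""
    return math.ceil(math.log(x) / math.log(2))

print(ceil_lg_by_search(1000))      # 10
print(ceil_lg_by_calculator(1000))  # 10
```

The search version uses only integers, so it is not exposed to rounding in the logarithm; the calculator version is quicker to type but relies on floating-point division.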
• Dec 13th 2010, 12:56 PM
magrogi
Junior member, thank you so much; you sure solved this.

But I would like to leave this thread open, since the article I am reading has a lot of theorems and proofs, and since my maths is a bit rusty, I shall come in often for explanations.

But Junior, thank you.
• Dec 13th 2010, 07:27 PM
CaptainBlack
Quote:

Originally Posted by magrogi
Junior member, thank you so much; you sure solved this.

But I would like to leave this thread open, since the article I am reading has a lot of theorems and proofs, and since my maths is a bit rusty, I shall come in often for explanations.

But Junior, thank you.

Post new questions in new threads; do not tag them on to this thread, because:

1. It is against MHF rules