Amu_Ke_Fundye
Computer Arithmetic
Fixed Point Representation
Because of computer hardware limitations, everything, including the sign of a number, has to be represented by 0s and 1s. So for a positive number the leftmost bit (the sign bit) is always 0, and for a negative number the sign bit is 1.
Floating Point Representation
A floating point number is represented using two parts. The first is called the mantissa (m) and the other the exponent (e). Thus, in a number system with base r, a floating point number with mantissa m and exponent e is represented as m × r^e.
The value of m may be a fraction or an integer. Thus, the number (2.25)₁₀ can be represented as 0.225 × 10^1.
Here, m = 0.225, e = 1 and r = 10.
For an n-bit register, the MSB is the sign bit and the remaining (n – 1) bits hold the magnitude.
So, the largest positive number that can be stored is 2^(n–1) – 1 and the lowest negative number is –(2^(n–1) – 1).
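The sign-magnitude range above is easy to compute directly. A minimal Python sketch (the function name is illustrative, not standard):

```python
# Largest positive and most negative values representable in an
# n-bit sign-magnitude register: 1 sign bit, (n - 1) magnitude bits.
def sign_magnitude_range(n):
    largest = 2 ** (n - 1) - 1
    return -largest, largest

# For an 8-bit register: -(2^7 - 1) = -127 up to 2^7 - 1 = 127.
print(sign_magnitude_range(8))   # (-127, 127)
print(sign_magnitude_range(16))  # (-32767, 32767)
```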
Actual Number Finding Technique
The exponent is always stored as a positive number, in biased form (a biased number is also called an excess number): the bias is added to the actual exponent of the given number before it is stored. The actual number can be recovered from the contents of the registers using the following formula:
Actual number = (–1)^S × (1 + m) × 2^(e – Bias)
where
S = sign bit
m = mantissa value of the register
e = exponent value of the register
Bias = bias number; if n bits are used to represent the exponent, then
Bias = 2^(n–1) – 1
Range of the actual exponent = –(2^(n–1) – 1) to 2^(n–1).
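The decoding formula above can be checked with a small Python sketch (the helper names `bias` and `decode` are illustrative, not standard):

```python
# Decode a floating point register from its fields: sign bit s,
# stored (biased) exponent e, and fractional mantissa m, using
#   actual = (-1)^s * (1 + m) * 2^(e - Bias),  Bias = 2^(n-1) - 1
def bias(n_exp_bits):
    return 2 ** (n_exp_bits - 1) - 1

def decode(s, e, m, n_exp_bits=8):
    return (-1) ** s * (1 + m) * 2 ** (e - bias(n_exp_bits))

# With an 8-bit exponent the bias is 127, as in IEEE single precision.
print(bias(8))              # 127
# s = 1, e = 126, m = 0.5 gives (-1) * 1.5 * 2^(126 - 127) = -0.75.
print(decode(1, 126, 0.5))  # -0.75
```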
IEEE Floating Point Representation
It provides a 32-bit format for single-precision values, and a 64-bit format for double-precision values.
The double-precision format contains a mantissa field that is more than twice as long as the mantissa field of the single-precision format, permitting greater accuracy.
The mantissa field assumes an implicit leading bit of 1, and the exponent field adopts the excess system with a bias value of 127 for the single-precision format and a bias of 1023 for the double-precision format.
Bit patterns are reserved for special values such as zero, infinity, NaN (not-a-number) and denormalized values.
Ranges of Normalized numbers using single precision
A normalized number is represented in the format:
(–1)^S × M × 2^E, where 1.0 ≤ M < 2.0 and –126 ≤ E ≤ 127.
The smallest positive number is 1.0 × 2^–126, which is equivalent to about 1.2 × 10^–38.
The largest positive number is (2 – 2^–23) × 2^127, minutely less than 2 × 2^127 = 2^128, which is equivalent to about 3.4 × 10^38.
The range for positive normalized numbers in this format is therefore about 1.2 × 10^–38 to 3.4 × 10^38.
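These extremes can be reconstructed in Python from their well-known bit patterns, 0x00800000 (smallest normalized) and 0x7F7FFFFF (largest finite), using the standard library `struct` module:

```python
import struct

# 0x00800000: sign 0, exponent 0000 0001 (E = 1 - 127 = -126), mantissa 0.
smallest = struct.unpack('>f', (0x00800000).to_bytes(4, 'big'))[0]
# 0x7F7FFFFF: sign 0, exponent 1111 1110 (E = 254 - 127 = 127), mantissa all 1s.
largest = struct.unpack('>f', (0x7F7FFFFF).to_bytes(4, 'big'))[0]

print(smallest)  # 1.1754943508222875e-38  (about 1.2 x 10^-38)
print(largest)   # 3.4028234663852886e+38  (about 3.4 x 10^38)
```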
Normalization using Single Precision Floating Point Representation
Step 1: Determine the sign bit. Save this for later.
Step 2: Convert the absolute value of the number to normalized form.
Step 3: Determine the eight–bit exponent field.
Step 4: Determine the 23-bit significand. There are shortcuts here.
Step 5: Arrange the fields in order.
Step 6: Rearrange the bits, grouping by fours from the left.
Step 7: Write the number as eight hexadecimal digits.
Example: The Negative Number – 0.750
Step 1: The number is negative. The sign bit is S = 1.
Step 2: 0.750 = 1.5 × 0.50 = 1.5 × 2^–1. The exponent is P = –1.
Step 3: P + 127 = – 1 + 127 = 126. As an eight–bit number, this is 0111 1110.
Step 4: Convert 1.5 to binary: 1.5 = 1 + ½ = 1.1₂. The significand is 1000 … 0.
To get the significand, drop the leading “1.” from the number.
Note that we do not write the significand out to its full 23 bits, but only place a few zeroes after the last 1 in the string.
Step 5: Arrange the bits: Sign | Exponent | Significand
Sign Exponent Significand
1 0111 1110 1000 … 00
Step 6: Rearrange the bits
1011 1111 0100 0000 … etc.
Step 7: Write as 0xBF40. Extend to eight hex digits: 0xBF40 0000.
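The seven steps above can be verified in Python with the standard library `struct` module, which packs a value into its IEEE 754 single-precision bit pattern:

```python
import struct

# Pack -0.750 as a big-endian single-precision float, then reinterpret
# the 4 bytes as an unsigned 32-bit integer to see the bit pattern.
bits = struct.unpack('>I', struct.pack('>f', -0.750))[0]
print(f'0x{bits:08X}')  # 0xBF400000

# Field breakdown matches the worked example:
print(bits >> 31)                   # sign        -> 1
print((bits >> 23) & 0xFF)          # exponent    -> 126 (0111 1110)
print(bits & 0x7FFFFF == 1 << 22)   # significand -> 1000...0, so True
```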
Regards
Amrut Jagdish Gupta