
First Semester in Numerical Analysis with Julia, 2020a


CHAPTER 1. INTRODUCTION

represent numbers less than 1! That's why we use the shifted expression $e - 1023$, called the biased exponent, in the representation (1.2). Note that the bounds for the biased exponent are $-1023 \le e - 1023 \le 1024$.

Here is a schema that illustrates how the physical bits of a computer correspond to the representation above. Each cell in the table below, numbered 1 through 64, corresponds to a physical bit in the computer memory.

| 1 | 2 | 3 | ... | 12 | 13 | ... | 64 |

• The first bit is the sign bit: it stores the value for $s$, 0 or 1.

• The blue bits 2 through 12 store the exponent $e$ (not $e - 1023$). Using 11 bits, one can generate the integers from 0 to $2^{11} - 1 = 2047$. Here is how you get the smallest and largest values for $e$:

$$e = (00\ldots0)_2 = 0$$

$$e = (11\ldots1)_2 = 2^0 + 2^1 + \cdots + 2^{10} = \frac{2^{11} - 1}{2 - 1} = 2047.$$

• The red bits 13 through 64, and there are 52 of them, store the digits $a_2$ through $a_{53}$.
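The bit layout described in the list above can be inspected directly in Julia. The following is a minimal sketch, assuming Julia's built-in `bitstring` function, which returns the 64 bits of a `Float64` as a string. The helper name `fields` and the field names `sign`, `exponent`, and `fraction` are illustrative choices, not notation from the text.

```julia
# Hypothetical helper (not from the text): split the 64 bits of a Float64
# into three fields, using the 1-based bit positions from the schema
# (bit 1, bits 2 through 12, bits 13 through 64).
function fields(x::Float64)
    bits = bitstring(x)  # 64-character string of '0' and '1'
    return (sign = bits[1:1], exponent = bits[2:12], fraction = bits[13:64])
end

# Largest integer that 11 bits can hold: 2^0 + 2^1 + ... + 2^10.
println(sum(2^k for k in 0:10) == 2^11 - 1)

# Stored exponent e and shifted exponent e - 1023 for a few sample values.
for x in (0.375, 1.0, 2.0)
    f = fields(x)
    e = parse(Int, f.exponent; base = 2)
    println(x, ": s = ", f.sign, ", e = ", e, ", e - 1023 = ", e - 1023)
end
```

For 0.375, a number less than 1, the stored exponent $e$ is below 1023, so the shifted exponent $e - 1023$ is negative.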

Example 9. Find the floating-point representation of 10.375.

Solution. You can check that $10 = (1010)_2$ and $0.375 = (.011)_2$ by computing

$$10 = 0 \times 2^0 + 1 \times 2^1 + 0 \times 2^2 + 1 \times 2^3$$

$$0.375 = 0 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}.$$

Then

$$10.375 = (1010.011)_2 = (1.010011)_2 \times 2^3$$

where $(1.010011)_2 \times 2^3$ is the normalized floating-point representation of the number. Now we rewrite this in terms of the representation (1.2):

$$10.375 = (-1)^0 (1.010011)_2 \times 2^{1026 - 1023}.$$

Since $1026 = (10000000010)_2$, the bit by bit representation is:

0 | 1 0 0 0 0 0 0 0 0 1 0 | 0 1 0 0 1 1 0 ... 0
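The result of Example 9 can be checked numerically. The sketch below assumes Julia's built-in `bitstring` and `string(n, base = 2)` functions. The variable names are illustrative and do not come from the text. It first confirms the two binary expansions, then reads the three bit fields of 10.375 and rebuilds the number from representation (1.2).

```julia
# Binary expansions used in the solution.
println(string(10, base = 2))                    # binary digits of the integer part
println(0 * 2.0^-1 + 1 * 2.0^-2 + 1 * 2.0^-3)    # value of (.011) in base 2

# Read the three fields of the stored number (1-based bit positions).
bits = bitstring(10.375)
sign_bit = bits[1:1]         # bit 1
exponent_bits = bits[2:12]   # bits 2 through 12
fraction_bits = bits[13:64]  # bits 13 through 64, the digits a_2 ... a_53
println(sign_bit, " | ", exponent_bits, " | ", fraction_bits)

# Rebuild the number: (-1)^s * (1.a_2 a_3 ... a_53)_2 * 2^(e - 1023).
s = parse(Int, sign_bit)
e = parse(Int, exponent_bits; base = 2)
significand = 1 + sum(parse(Int, c) * 2.0^(-k) for (k, c) in enumerate(fraction_bits))
value = (-1)^s * significand * 2.0^(e - 1023)
println("e = ", e, ", value = ", value)
```

The digit at position `k` of `fraction_bits` is $a_{k+1}$ and carries the weight $2^{-k}$, which is why the sum starts from the implicit leading 1.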
