06.09.2021 Views

First Semester in Numerical Analysis with Julia, 2020a

First Semester in Numerical Analysis with Julia, 2020a

First Semester in Numerical Analysis with Julia, 2020a

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

CHAPTER 1. INTRODUCTION 33<br />

and the largest is<br />

x =(−1) 0 (1.11 ...1) 2 × 2 1023 =<br />

(<br />

1+ 1 2 + 1 2 2 + ... + 1<br />

2 52 )<br />

× 2 1023<br />

=(2− 2 −52 )2 1023<br />

≈ 0.18 × 10 309 .<br />

Dur<strong>in</strong>g a calculation, if a number less than the smallest float<strong>in</strong>g-po<strong>in</strong>t number is obta<strong>in</strong>ed,<br />

then we obta<strong>in</strong> the underflow error. A number greater than the largest gives overflow<br />

error.<br />

Exercise 1.3-1: Consider the follow<strong>in</strong>g toy model for a normalized float<strong>in</strong>g-po<strong>in</strong>t representation<br />

<strong>in</strong> base 2: x =(−1) s (1.a 2 a 3 ) 2 × 2 e where −1 ≤ e ≤ 1. F<strong>in</strong>d all positive mach<strong>in</strong>e<br />

numbers (there are 12 of them) that can be represented <strong>in</strong> this model. Convert the numbers<br />

to base 10, and then carefully plot them on the number l<strong>in</strong>e, by hand, and comment on how<br />

the numbers are spaced.<br />

Representation of <strong>in</strong>tegers<br />

In the previous section, we discussed represent<strong>in</strong>g real numbers <strong>in</strong> a computer. Here we will<br />

give a brief discussion of represent<strong>in</strong>g <strong>in</strong>tegers. How does a computer represent an <strong>in</strong>teger n?<br />

As <strong>in</strong> real numbers, we start <strong>with</strong> writ<strong>in</strong>g n <strong>in</strong> base 2. We have 64 bits to represent its digits<br />

and sign. As <strong>in</strong> the float<strong>in</strong>g-po<strong>in</strong>t representation, we can allocate one bit for the sign, and<br />

use the rest, 63 bits, for the digits. This approach has some disadvantages when we start<br />

add<strong>in</strong>g <strong>in</strong>tegers. Another approach, known as the two’s complement, is more commonly<br />

used, <strong>in</strong>clud<strong>in</strong>g <strong>in</strong> <strong>Julia</strong>.<br />

For an example, assume we have 8 bits <strong>in</strong> our computer. To represent 12 <strong>in</strong> two’s<br />

complement (or any positive <strong>in</strong>teger), we simply write it <strong>in</strong> its base 2 expansion: (00001100) 2 .<br />

To represent −12, we do the follow<strong>in</strong>g: flip all digits, replac<strong>in</strong>g 1 by 0, and 0 by 1, and<br />

then add 1 to the result. When we flip digits for 12, we get (11110011) 2 , and add<strong>in</strong>g 1 (<strong>in</strong><br />

b<strong>in</strong>ary), gives (11110100) 2 . Therefore −12 is represented as (11110100) 2 <strong>in</strong> two’s complement<br />

approach. It may seem mysterious to go through all this trouble to represent −12, until you<br />

add the representations of 12 and −12,<br />

(00001100) 2 + (11110100) 2 =(100000000) 2

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!