25.11.2014 Views

Algorithms and Data Structures


N. Wirth. Algorithms and Data Structures. Oberon version (page 37)

as possible from further searches, no matter what the outcome of the comparison is. The optimal solution is to choose the middle element, because this eliminates half of the array in any case. As a result, the maximum number of steps is log_2 N, rounded up to the nearest integer. Hence, this algorithm offers a drastic improvement over linear search, where the expected number of comparisons is N/2.
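
The halving idea can be sketched as follows. Python is used here purely for illustration (the book's own notation is Oberon), and the function name `binary_search_early_exit` and the step counter are assumptions that do not appear in the text. This is a minimal sketch of a search that stops at the first match it meets, not necessarily the book's own formulation of that variant.

```python
def binary_search_early_exit(a, x):
    """Search sorted list a for x.

    Returns (index, steps): the index of some element equal to x
    (or -1 if there is none) and the number of loop iterations used.
    """
    lo, hi = 0, len(a) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        m = (lo + hi) // 2          # middle element of the remaining section
        if a[m] == x:
            return m, steps         # stop as soon as a match is found
        if a[m] < x:
            lo = m + 1              # discard the lower half
        else:
            hi = m - 1              # discard the upper half
    return -1, steps
```

For a sorted array of 1000 elements the loop body runs at most 10 times, whereas a linear search would inspect about 500 elements on average.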

The efficiency can be somewhat improved by interchanging the two if-clauses. Equality should be tested second, because it occurs only once and causes termination. But more relevant is the question of whether, as in the case of linear search, a solution could be found that allows a simpler condition for termination. We indeed find such a faster algorithm if we abandon the naive wish to terminate the search as soon as a match is established. This seems unwise at first glance, but on closer inspection we realize that the gain in efficiency at every step is greater than the loss incurred in comparing a few extra elements. Remember that the number of steps is at most log_2 N, rounded up.

The faster solution is based on the following invariant:

(Ak: 0 ≤ k < L : a_k < x) & (Ak: R ≤ k < N : a_k ≥ x)

and the search is continued until the two sections span the entire array.

L := 0; R := N; (* ADenS18_Search *)
WHILE L < R DO
  m := (L+R) DIV 2;
  IF a[m] < x THEN L := m+1 ELSE R := m END
END

The terminating condition is L ≥ R. Is it guaranteed to be reached? In order to establish this guarantee, we must show that under all circumstances the difference R-L is diminished in each step. L < R holds at the beginning of each step. The arithmetic mean m then satisfies L ≤ m < R. Hence, the difference is indeed diminished by either assigning m+1 to L (increasing L) or m to R (decreasing R), and the repetition terminates with L = R.
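
The termination argument can be checked mechanically. The following Python sketch (an illustration only; the name `search_checked` and the assertions are assumptions added here, not part of the book's program) runs the same loop and asserts, at every step, that L ≤ m < R, that R-L shrinks, and that both halves of the invariant hold.

```python
def search_checked(a, x):
    """Run the loop from the text on sorted list a, checking each claim.

    Returns the final value of R (equal to L on exit).
    """
    n = len(a)
    L, R = 0, n
    while L < R:
        gap = R - L
        m = (L + R) // 2
        assert L <= m < R                  # the mean lies in the open section
        if a[m] < x:
            L = m + 1
        else:
            R = m
        assert R - L < gap                 # the difference is diminished
        assert all(a[k] < x for k in range(0, L))    # left part of invariant
        assert all(a[k] >= x for k in range(R, n))   # right part of invariant
    assert L == R                          # the loop never overshoots
    return R
```

The checks make each call slower than the plain loop, so the function is meant for experimentation rather than for use as a search routine.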

However, the invariant and L = R do not yet establish a match. Certainly, if R = N, no match exists. Otherwise we must take into consideration that the element a[R] has never been tested for equality with x. Hence, an additional test for equality a[R] = x is necessary. In contrast to the first solution, this algorithm, like linear search, finds the matching element with the least index.
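
Putting the loop and the final equality test together gives a complete search. The Python sketch below is an illustration under stated assumptions: the function name `binary_search_least` and the convention of returning -1 for "no match" are choices made here, not taken from the text.

```python
def binary_search_least(a, x):
    """Return the least index i with a[i] == x in sorted list a, or -1."""
    L, R = 0, len(a)
    while L < R:
        m = (L + R) // 2
        if a[m] < x:
            L = m + 1        # everything up to m is smaller than x
        else:
            R = m            # everything from m on is >= x
    # Here L == R. If R == len(a), all elements are smaller than x.
    if R < len(a) and a[R] == x:
        return R             # the additional test for equality
    return -1
```

For example, `binary_search_least([1, 3, 3, 3, 7], 3)` returns 1, the first of the three matching positions.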

1.8.3 Table Search

A search through an array is sometimes also called a table search, particularly if the keys are themselves structured objects, such as arrays of numbers or characters. The latter is a frequently encountered case; the character arrays are called strings or words. Let us define a type String as

String = ARRAY M OF CHAR

and let order on strings x and y be defined as follows:

(x = y) ≡ (Aj: 0 ≤ j < M : x_j = y_j)
(x < y) ≡ Ei: 0 ≤ i < M : ((Aj: 0 ≤ j < i : x_j = y_j) & (x_i < y_i))

In order to establish a match, we evidently must find all characters of the comparands to be equal. Such a comparison of structured operands therefore turns out to be a search for an unequal pair of comparands, i.e. a search for inequality. If no unequal pair exists, equality is established. Assuming that the length of the words is quite small, say less than 30, we shall use a linear search in the following solution.
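
The comparison of two strings as a linear search for inequality can be sketched as follows. This Python illustration assumes both comparands have the same fixed length M, as in the type String above; the function name `compare` and the return convention (-1, 0, 1) are assumptions introduced here.

```python
def compare(x, y):
    """Compare two sequences of equal length M.

    Returns 0 if x = y, -1 if x < y and 1 if x > y, following the
    definitions of equality and order given above.
    """
    assert len(x) == len(y)
    M = len(x)
    i = 0
    # Linear search for the first position where the comparands differ.
    while i < M and x[i] == y[i]:
        i += 1
    if i == M:
        return 0             # no unequal pair exists: equality is established
    return -1 if x[i] < y[i] else 1
```

With this convention, `compare("abc", "abd")` yields -1 because the first unequal pair is found at index 2, where "c" precedes "d".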

In most practical applications, one wishes to consider strings as having a variable length. This is accomplished by associating a length indication with each individual string value. Using the type declared above, this length must not exceed the maximum length M. This scheme allows for sufficient flexibility for many cases, yet avoids the complexities of dynamic storage allocation. Two representations of string
