Algorithms and Data Structures
N. Wirth. Algorithms and Data Structures. Oberon version (p. 37)
as possible from further searches, no matter what the outcome of the comparison is. The optimal solution is
to choose the middle element, because this eliminates half of the array in any case. As a result, the
maximum number of steps is log₂ N, rounded up to the nearest integer. Hence, this algorithm offers a drastic
improvement over linear search, where the expected number of comparisons is N/2.
The efficiency can be somewhat improved by interchanging the two if-clauses. Equality should be tested
second, because it occurs only once and causes termination. But more relevant is the question whether,
as in the case of linear search, a solution could be found that allows a simpler condition for termination.
We indeed find such a faster algorithm if we abandon the naive wish to terminate the search as soon as a
match is established. This seems unwise at first glance, but on closer inspection we realize that the gain in
efficiency at every step is greater than the loss incurred in comparing a few extra elements. Remember that
the number of steps is at most log N.
The faster solution is based on the following invariant:
(Ak: 0 ≤ k < L : a_k < x) & (Ak: R ≤ k < N : a_k ≥ x)
and the search is continued until the two sections span the entire array.
L := 0; R := N; (* ADenS18_Search *)
WHILE L < R DO
  m := (L+R) DIV 2;
  IF a[m] < x THEN L := m+1 ELSE R := m END
END
The terminating condition is L ≥ R. Is it guaranteed to be reached? In order to establish this guarantee,
we must show that under all circumstances the difference R-L is diminished in each step. L < R holds at the
beginning of each step. The arithmetic mean m then satisfies L ≤ m < R. Hence, the difference is indeed
diminished by either assigning m+1 to L (increasing L) or m to R (decreasing R), and the repetition
terminates with L = R.
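The termination argument and the invariant can be checked mechanically. The following is a minimal sketch in Python rather than Oberon; the function name search_boundary and the assertions are illustrative additions, not part of the original program. It assumes the array a is sorted in ascending order.

```python
def search_boundary(a, x):
    """Run the loop on sorted list a and return the point where L = R.

    The assertions mirror the argument in the text; they are illustrative
    checks added for this sketch, not part of the original algorithm.
    """
    L, R = 0, len(a)
    while L < R:
        gap = R - L
        m = (L + R) // 2            # corresponds to (L+R) DIV 2
        assert L <= m < R           # the mean lies within the open section
        if a[m] < x:
            L = m + 1               # increases L
        else:
            R = m                   # decreases R
        assert R - L < gap          # the difference R-L shrinks in every step
        # invariant: all elements left of L are < x, all from R onward are >= x
        assert all(v < x for v in a[:L])
        assert all(v >= x for v in a[R:])
    assert L == R                   # the loop ends with L = R, never L > R
    return L
```

For example, search_boundary([1, 3, 3, 5, 8], 3) stops at index 1, the first position whose element is not less than 3.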
However, the invariant and L = R do not yet establish a match. Certainly, if R = N, no match exists.
Otherwise we must take into consideration that the element a[R] has never been tested for equality. Hence, an
additional test for equality a[R] = x is necessary. In contrast to the first solution, this algorithm, like
linear search, finds the matching element with the least index.
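The loop together with the final equality test can be sketched as follows. This is a Python transcription under stated assumptions: the function name binary_search and the convention of returning -1 when no match exists are choices made for this sketch and do not come from the text.

```python
def binary_search(a, x):
    """Return the least index i with a[i] == x in sorted list a, or -1.

    The -1 result for "no match" is an assumption of this sketch.
    """
    L, R = 0, len(a)
    while L < R:
        m = (L + R) // 2
        if a[m] < x:
            L = m + 1
        else:
            R = m
    # Here L == R. If R == len(a), every element is < x and no match exists.
    # Otherwise a[R] >= x is known, but equality has not been tested yet.
    if R < len(a) and a[R] == x:
        return R
    return -1
```

With duplicates present, as in binary_search([1, 3, 3, 3, 8], 3), the result is 1, the least matching index, because the right boundary keeps moving left as long as a[m] is not less than x.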
1.8.3 Table Search<br />
A search through an array is sometimes also called a table search, particularly if the keys are<br />
themselves structured objects, such as arrays of numbers or characters. The latter is a frequently<br />
encountered case; the character arrays are called strings or words. Let us define a type String as<br />
String = ARRAY M OF CHAR
and let the order on strings x and y be defined as follows:
(x = y) ≡ (Aj: 0 ≤ j < M : x_j = y_j)
(x < y) ≡ Ei: 0 ≤ i < M : ((Aj: 0 ≤ j < i : x_j = y_j) & (x_i < y_i))
In order to establish a match, we evidently must find all characters of the comparands to be equal. Such a
comparison of structured operands therefore turns out to be a search for an unequal pair of comparands,
i.e. a search for inequality. If no unequal pair exists, equality is established. Assuming that the length of the
words is quite small, say less than 30, we shall use a linear search in the following solution.
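Comparison as a search for inequality can be sketched as follows, again in Python rather than Oberon. The value M = 8, the helper make_string, and the padding of short words with NUL characters are assumptions made only to obtain fixed-length character arrays for the illustration.

```python
M = 8  # assumed fixed string length, standing in for the M of the text


def make_string(s):
    """Pad s with NUL characters to exactly M characters (illustrative helper)."""
    assert len(s) <= M
    return list(s) + ["\0"] * (M - len(s))


def first_difference(x, y):
    """Linear search for the first index at which x and y differ; M if none."""
    i = 0
    while i < M and x[i] == y[i]:
        i += 1
    return i


def equal(x, y):
    # no unequal pair exists, hence equality is established
    return first_difference(x, y) == M


def less(x, y):
    # all characters before i agree, and the pair at i decides the order
    i = first_difference(x, y)
    return i < M and x[i] < y[i]
```

Both relations rest on the same linear search: equal holds when the search runs off the end of the arrays, and less holds when it stops at an index i with x[i] < y[i], which matches the two definitions given above.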
In most practical applications, one wishes to consider strings as having a variable length. This is<br />
accomplished by associating a length indication with each individual string value. Using the type declared<br />
above, this length must not exceed the maximum length M. This scheme allows for sufficient flexibility for<br />
many cases, yet avoids the complexities of dynamic storage allocation. Two representations of string