10.07.2015 Views

Data Processing Techniques - All about the IBM 1130 Computing ...

Section 75, Subsections 30 10, Page 03

[Figure: worked sorting example on the keys 02, 08, 13, 21, 56, 69, with panels labeled "Input and Pass 1", "Pass 2", "Pass 3", and "Output".]

The size of the file is of great importance, since the total number of comparisons and interchanges increases roughly with the square of the number of records in the file.

Merging

Merging is the process of combining several sequences of records to form a single specified sequence. The same rules by which sequences are combined may also be used to form sequences (of two or more items). Thus, the merging process has, essentially, a dual nature: it can be used for creating sequences (usually in an internal sort), and it is also capable of reducing previously created sequences to one (usually in an external sort). This dual capability contrasts with the selection and exchange techniques described thus far, which are useful primarily for internal sorting of relatively short files of records. The versatility, speed, and simplicity of merging make it one of the most widely used sorting techniques.

There are two basic methods of merge sorting: (1) straight or standard merging, with fixed-length sequences, and (2) natural merging, with variable-length sequences, or strings. (The words "sequence" and "string" are often used interchangeably in merging terminology.)

In straight merging, the input file is distributed initially into two or more work areas, depending upon the number of sequences to be combined during each merge (that is, the order of merge). For example, in a method of two-way straight merging, the first merge pass alternates between two storage areas to form strings of two records, one from each area. Subsequent passes double the length of the strings each time (for example, 4, 8, 16, etc.), until the last pass produces a single sequence of all the records. The length of the strings during each pass and the number of passes are fixed.

The natural merge sort takes advantage of "natural" sequences in the original file, which occur with a certain "probable" frequency. The length of the strings on each pass is no longer fixed, but depends upon the existing sequences. The total number of passes required to sort a given file, then, also depends on the number of natural sequences in the original file. For a file that is in correct sequence, only a single pass is required, to verify that sequence. In the worst case, the number of passes is the same as for straight merging.
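
The difference between the two methods can be illustrated with a minimal in-memory sketch in Python. It is not the manual's procedure: the function names (`merge`, `straight_merge_sort`, `split_into_runs`, `natural_merge_sort`) are hypothetical, a single list stands in for the two work areas, and a "pass" is counted as one sweep over the whole file. The natural version also assumes that strings are identified once and then merged pairwise, which can only overstate the number of passes a real natural merge would need.

```python
def merge(left, right):
    """Combine two ascending sequences into one ascending sequence."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])   # at most one of these two tails is non-empty
    out.extend(right[j:])
    return out


def straight_merge_sort(records):
    """Two-way straight merge: string length is fixed at 1, 2, 4, 8, ... per pass."""
    data = list(records)
    width = 1      # length of the strings being merged on this pass
    passes = 0
    while width < len(data):
        merged = []
        for start in range(0, len(data), 2 * width):
            left = data[start:start + width]
            right = data[start + width:start + 2 * width]
            merged.extend(merge(left, right))
        data = merged
        width *= 2
        passes += 1
    return data, passes


def split_into_runs(records):
    """Split a file into its natural (already ascending) strings."""
    runs = []
    for key in records:
        if runs and runs[-1][-1] <= key:
            runs[-1].append(key)   # key extends the current string
        else:
            runs.append([key])     # a step down starts a new string
    return runs


def natural_merge_sort(records):
    """Two-way natural merge: merge neighbouring natural strings until one remains."""
    runs = split_into_runs(records)
    passes = 0
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            pair = runs[k:k + 2]
            if len(pair) == 2:
                next_runs.append(merge(pair[0], pair[1]))
            else:
                next_runs.append(pair[0])  # odd string out is carried forward
        runs = next_runs
        passes += 1
    # A file already in sequence still needs one pass to verify it.
    return (runs[0] if runs else []), max(passes, 1)
```

With the illustrative keys `[13, 69, 56, 2, 8, 21]`, `straight_merge_sort` takes 3 passes (strings of 2, 4, then all 6 records), while `natural_merge_sort` finds three natural strings and finishes in 2 passes. A file in descending order has only strings of length one, so both functions take the same number of passes; a file already in sequence takes 1 pass under the natural method.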
