10.07.2015 Views

Data Processing Techniques - All about the IBM 1130 Computing ...


Section 75, Subsections 30 10, Page 05

Address Calculation

When the approximate distribution of the key values is known, it becomes possible to sort a file internally by estimating the eventual (sorted) position of each key. This method is called "address calculation" or "pigeonhole sorting".

Briefly, it consists of calculating the correct record number of each item within the file by a predetermined linear formula of the form y = a + bx. If the location at that record number is empty, the item (record or key) is placed there; if it is full, a search is made to find the closest empty space in the vicinity of the calculated record number. The item at the calculated record number and the adjacent items are then moved so that the new item can be inserted in its proper place in the sequence.

Address calculation is similar to the insertion method in that each item is placed directly in its proper relative position within the file, and the entire file is in order just after the last item has been inserted. The method differs from insertion, however, in that some foreknowledge of the range and distribution of the keys is required to estimate the relative location for each item. When this is available, address calculation is a relatively simple and rapid method for sorting a medium-size file (several hundred to a few thousand items) of small to medium-length records. The major disadvantage of the method is the need for a fairly large storage area, about two or three times the size of the area needed for the original file. If only a relatively small working storage area is available, or if the distribution within the file is not as forecast, a great deal of processing time will be spent in redistributing the records.

To illustrate this method, let us consider a hypothetical case: Many years ago, the ABC Company set up a man-number system based on a three-digit number. Since they had about 150 employees, each man was assigned, in alphabetic order, a number evenly divisible by 5 (005, 010, 015, 020, 025, ..., 995). However, there are now about 240 employees, and the system is not quite as neat as it once was.

Some of the men (50 of them) have been assigned numbers out of the normal pattern (for example, 862 in between 860 and 865). They are still in alphabetic order, though.

The address calculation sort could be used to place this employee file onto the disk in alphabetic (man-number) sequence in the following way:

1. Set up a file containing 500 records.
2. As each man-number is encountered, divide it by 2.5, and convert the result to an integer (call it N).
3. Check record number N to see whether there is already an employee there.
4. If there isn't, put the man just processed into that record.
5. If there is someone there already, move the adjacent records up (or down) until there is room to insert the new man.

This will be quite fast, provided the "moving around" (step 5) is not required too frequently. If it is, the file could be increased to 600 records, and the man-number divided by 2. This, however, would waste a considerable amount of space on the disk.
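
The five steps above can be sketched in Python as follows. This is only an illustrative sketch, not the manual's own program: the function names `nearest_empty` and `address_calculation_sort` are hypothetical, a Python list stands in for the disk file, slots are numbered from 0 rather than from record 1, and only the keys (man-numbers) are stored instead of full employee records. The parameters `a` and `b` are the coefficients of the linear formula y = a + bx, so dividing by 2.5 corresponds to `a=0, b=1/2.5`.

```python
def nearest_empty(slots, n):
    """Return the index of the empty slot closest to n (hypothetical helper)."""
    for d in range(len(slots)):
        for pos in (n + d, n - d):
            if 0 <= pos < len(slots) and slots[pos] is None:
                return pos
    raise OverflowError("the file has no empty record left")


def address_calculation_sort(keys, size, a, b):
    """Sort keys by estimating each key's slot as int(a + b * key).

    Assumes b > 0, so that larger keys never map to earlier slots.
    """
    slots = [None] * size  # None marks an empty record
    for key in keys:
        # Steps 2-3: estimated slot, clamped so it always lies inside the file.
        n = min(max(int(a + b * key), 0), size - 1)
        # Step 4 (slot n free) or step 5 (take the closest free slot instead).
        pos = nearest_empty(slots, n)
        slots[pos] = key
        # Step 5: move neighbouring records until the new key is in sequence.
        while pos > 0 and slots[pos - 1] is not None and slots[pos - 1] > key:
            slots[pos - 1], slots[pos] = slots[pos], slots[pos - 1]
            pos -= 1
        while (pos < size - 1 and slots[pos + 1] is not None
               and slots[pos + 1] < key):
            slots[pos + 1], slots[pos] = slots[pos], slots[pos + 1]
            pos += 1
    # Reading the file front to back, skipping empty records, gives the order.
    return [k for k in slots if k is not None]


# A few man-numbers, including the out-of-pattern 862 (collides with 860).
man_numbers = [865, 10, 862, 860, 5, 995, 15]
print(address_calculation_sort(man_numbers, size=500, a=0, b=1 / 2.5))
# prints [5, 10, 15, 860, 862, 865, 995]
```

With 500 slots and a divisor of 2.5, the largest man-number (995) maps to slot 398, so most of the file stays empty and collisions such as 860 and 862 are resolved by moving a single neighbour. A smaller `size` with a correspondingly smaller `b` would save space at the cost of longer runs of records to move.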
