581257 Information Retrieval Methods Autumn 2010 Exercise 2 ...
4. (Course book's exercise 2.7) Consider a postings intersection between this postings list, with skip
pointers:
[3,5,9,15,24,39,60,68,75,81,84,89,92,96,97,100,115]
(skip pointers: 3→24, 24→75, 75→92, 92→115)
and the following intermediate result postings list (which hence has no skip pointers):
[3,5,89,95,97,99,100,101]
Trace through the postings intersection algorithm in Figure 2.10 (page 37).
a) How often is a skip pointer followed (i.e., p1 is advanced to skip(p1))?
Solution: The skip pointer is followed once (from 24 to 75).
b) How many postings comparisons will be made by this algorithm while intersecting the two
lists?
Solution: 19 comparisons are made. Let (x,y) denote a posting comparison. The
comparisons are: (3,3), (5,5), (9,89), (15,89), (24,89), (75,89), (75,89), (92,89), (81,89),
(84,89), (89,89), (92,95), (115,95), (96,95), (96,97), (97,97), (100,99), (100,100), (115,101).
c) How many postings comparisons would be made if the postings lists are intersected without
the use of skip pointers?
Solution: 19 comparisons are made. The comparisons are: (3,3), (5,5), (9,89), (15,89),
(24,89), (39,89), (60,89), (68,89), (75,89), (81,89), (84,89), (89,89), (92,95), (96,95), (96,97),
(97,97), (100,99), (100,100), (115,101).
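The count in part c) can be checked with a short sketch of the standard merge. The function name `intersect_counting` and the convention of one docID comparison per loop iteration are assumptions made for illustration; the two lists are read off the comparison pairs listed in the solution to part c).

```python
def intersect_counting(p1, p2):
    """Intersect two sorted docID lists without skip pointers.

    Returns the common docIDs and the list of (x, y) comparisons,
    counting one comparison per loop iteration (an assumed convention).
    """
    answer, comparisons = [], []
    i = j = 0
    while i < len(p1) and j < len(p2):
        x, y = p1[i], p2[j]
        comparisons.append((x, y))
        if x == y:          # match: record it and advance both pointers
            answer.append(x)
            i += 1
            j += 1
        elif x < y:         # advance the pointer at the smaller docID
            i += 1
        else:
            j += 1
    return answer, comparisons


long_list = [3, 5, 9, 15, 24, 39, 60, 68, 75, 81, 84, 89, 92, 96, 97, 100, 115]
intermediate = [3, 5, 89, 95, 97, 99, 100, 101]
answer, comparisons = intersect_counting(long_list, intermediate)
print(answer)            # [3, 5, 89, 97, 100]
print(len(comparisons))  # 19
```

The logged pairs come out in the same order as the solution to part c), ending with (115,101), after which the intermediate list is exhausted.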
5. (Course book's exercise 2.6) We have a two-word query. For one term the postings list consists of
the following 16 entries:
[4,6,10,12,14,16,18,20,22,32,47,81,120,122,157,180]
and for the other it is the one entry postings list:
[47]
Work out how many comparisons would be done to intersect the two postings lists with the following
two strategies. Briefly justify your answers:
a) Using standard postings lists
Solution: Applying MERGE to the standard postings lists, comparisons are made until
either postings list ends. Here that means advancing through the upper list until we reach 47:
the ten smaller entries (4 through 32) each cost one comparison, and the eleventh
comparison, (47,47), is the match. Both pointers then advance, the lower list ends, and no
more processing needs to be done. Therefore, the number of comparisons made is 11.
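Because the second list has a single entry, the count can also be sketched in closed form: one comparison for every entry of the long list that is smaller than 47, plus one for the entry that stops the merge. The helper name `comparisons_against_single` is hypothetical, and the formula assumes one comparison per merge step.

```python
from bisect import bisect_left


def comparisons_against_single(postings, doc_id):
    """Comparisons made by a plain merge of `postings` with [doc_id]."""
    smaller = bisect_left(postings, doc_id)   # entries strictly below doc_id
    # One extra comparison is needed for the first entry >= doc_id, if any;
    # if every entry is smaller, the merge stops when `postings` runs out.
    return min(smaller + 1, len(postings))


postings = [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]
print(comparisons_against_single(postings, 47))  # 11
```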
b) Using postings lists stored with skip pointers, with a skip length of √P, as suggested in
Section 2.3
Solution: Using skip pointers of length 4 for the longer list (√16 = 4) and of length 1 for the
shorter list, the following comparisons will be made: (4,47), (14,47), (22,47), (120,47), (22,47),
(120,47), (32,47), (47,47). Therefore, the number of comparisons made in this case is 8.
This is not, however, an optimal approach, because comparisons (22,47) and (120,47)
are made twice. The solution would be more efficient if we added the command
“p1 ← next(p1)” directly after the while loop. Then the number of comparisons would
decrease to 6.
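Both counts can be reproduced with a minimal sketch of a skip-pointer intersection. It is not a transcription of Figure 2.10: the function name, the `advance_after_skip` flag, and the placement of skip pointers at every `skip_len`-th index are assumptions, and the guard of the if and the first test of the while loop are counted as a single comparison, which is the convention the listing above appears to use. One further assumption: with the flag set, the skip test is made strict, because advancing after a skip that lands exactly on the other list's docID would step over a match.

```python
def intersect_with_skips(skip_list, other, skip_len, advance_after_skip=False):
    """Intersect sorted docID lists where only `skip_list` has skip pointers.

    Skip pointers are assumed at indices 0, skip_len, 2*skip_len, ...
    Each docID comparison and each skip-target test is logged once.
    """
    answer, comparisons = [], []
    i = j = 0
    while i < len(skip_list) and j < len(other):
        x, y = skip_list[i], other[j]
        comparisons.append((x, y))               # main docID comparison
        if x == y:
            answer.append(x)
            i += 1
            j += 1
        elif x < y:
            skipped = False
            while i % skip_len == 0 and i + skip_len < len(skip_list):
                target = skip_list[i + skip_len]
                comparisons.append((target, y))  # test of the skip target
                # The variant stops on equality too, so that the extra
                # advance below can never step over a matching docID.
                if target > y or (advance_after_skip and target == y):
                    break
                i += skip_len
                skipped = True
            if not skipped or advance_after_skip:
                i += 1
        else:
            j += 1
    return answer, comparisons


postings = [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]
_, base = intersect_with_skips(postings, [47], skip_len=4)
_, variant = intersect_with_skips(postings, [47], skip_len=4,
                                  advance_after_skip=True)
print(len(base), len(variant))  # 8 6
```

With the flag off, the log repeats (22,47) and (120,47), as in the solution; with it on, the pointer moves straight from 22 to 32 and those two comparisons disappear.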