22.01.2013 Views

581257 Information Retrieval Methods Autumn 2010 Exercise 2 ...



4. (Course book's exercise 2.7) Consider a postings intersection between this postings list, with skip pointers:

and the following intermediate result postings list (which hence has no skip pointers):

Trace through the postings intersection algorithm in Figure 2.10 (page 37).

a) How often is a skip pointer followed (i.e., p1 is advanced to skip(p1))?

Solution: The skip pointer is followed once (from 24 to 75).

b) How many postings comparisons will be made by this algorithm while intersecting the two lists?

Solution: 19 comparisons are made. Let (x,y) denote a posting comparison. The comparisons are: (3,3), (5,5), (9,89), (15,89), (24,89), (75,89), (75,89), (92,89), (81,89), (84,89), (89,89), (92,95), (115,95), (96,95), (96,97), (97,97), (100,99), (100,100), (115,101).

c) How many postings comparisons would be made if the postings lists are intersected without the use of skip pointers?

Solution: 19 comparisons are made. The comparisons are: (3,3), (5,5), (9,89), (15,89), (24,89), (39,89), (60,89), (68,89), (75,89), (81,89), (84,89), (89,89), (92,95), (96,95), (96,97), (97,97), (100,99), (100,100), (115,101).
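The trace in part c) can be reproduced with a short sketch of a plain merge that records every comparison. The figures showing the two postings lists are not reproduced in this text, so the lists below are an assumption: they are reconstructed from the comparison pairs listed in the solution to part c). The names merge_intersect, list_with_skips and intermediate are illustrative only.

```python
def merge_intersect(p1, p2):
    """Intersect two sorted docID lists without skip pointers.

    Returns (matches, trace), where trace holds one (x, y) pair per
    loop iteration, i.e. one posting comparison per step.
    """
    i = j = 0
    matches, trace = [], []
    while i < len(p1) and j < len(p2):
        trace.append((p1[i], p2[j]))
        if p1[i] == p2[j]:
            matches.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return matches, trace


# ASSUMPTION: lists reconstructed from the comparison pairs in solution c);
# the original figures are missing from this text.
list_with_skips = [3, 5, 9, 15, 24, 39, 60, 68, 75, 81, 84, 89, 92, 96, 97, 100, 115]
intermediate = [3, 5, 89, 95, 97, 99, 100, 101]

matches, trace = merge_intersect(list_with_skips, intermediate)
print(matches)
print(len(trace), trace)
```

Under these assumptions the recorded trace has the same 19 pairs as the solution to part c).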

5. (Course book's exercise 2.6) We have a two-word query. For one term the postings list consists of the following 16 entries:

[4,6,10,12,14,16,18,20,22,32,47,81,120,122,157,180]<br />

and for the other it is the one entry postings list:<br />

[47]<br />

Work out how many comparisons would be done to intersect the two postings lists with the following two strategies. Briefly justify your answers:

a) Using standard postings lists<br />

Solution: Applying MERGE to the standard postings lists, comparisons are made until either postings list ends, i.e., until we reach 47 in the upper postings list (its 11th entry), after which the lower list ends and no more processing needs to be done. Therefore, the number of comparisons made is 11.
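The count of 11 can be checked with a small calculation. The sketch below assumes a plain merge against a one-entry list: every posting smaller than that entry costs one comparison, and one more comparison is made at the first posting that is not smaller. The function name merge_comparisons_single is hypothetical.

```python
from bisect import bisect_left


def merge_comparisons_single(postings, doc_id):
    """Comparisons a plain merge makes against the one-entry list [doc_id]."""
    smaller = bisect_left(postings, doc_id)  # postings with docID < doc_id
    # One comparison per smaller posting, plus one for the first posting
    # >= doc_id, if the list has such a posting.
    return smaller + (1 if smaller < len(postings) else 0)


long_list = [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]
print(merge_comparisons_single(long_list, 47))  # 11
```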

b) Using postings lists stored with skip pointers, with a skip length of √P, as suggested in Section 2.3

Solution: Using skip pointers of length 4 for the longer list and of length 1 for the shorter list, the following comparisons will be made: (4,47), (14,47), (22,47), (120,47), (22,47), (120,47), (32,47), (47,47). Therefore, the number of comparisons made in this case is 8. This is not, however, an optimal approach, because the comparisons (22,47) and (120,47) are each made twice. The solution would be more efficient if we added the command “p1 ← next(p1)” directly after the while loop. Then the number of comparisons would decrease to 6.
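Both counts in this solution can be traced with a minimal sketch of skip-pointer intersection. It rests on several assumptions that are not spelled out in the text: skip pointers sit at every fourth position of the 16-entry list (step = floor of √P), a skip check and the first test of the inner while loop are counted as one comparison, and the flag advance_after_skip (a hypothetical name) models the extra “p1 ← next(p1)” after the while loop. That extra step is safe for this example, but in general it would step past a match whenever a skip target equals the other list's current docID, so it is shown for illustration only.

```python
import math


def intersect_with_skips(p1, p2, advance_after_skip=False):
    """Intersect sorted docID lists; return (matches, comparison trace)."""
    trace = []    # every docID comparison as (docID from p1, docID from p2)
    matches = []

    def advance(postings, pos, other_doc, first_list):
        # ASSUMPTION: a skip pointer sits at every step-th position,
        # with step = floor(sqrt(list length)).
        step = max(1, math.isqrt(len(postings)))
        skipped = False
        while pos % step == 0 and pos + step < len(postings):
            target = postings[pos + step]
            trace.append((target, other_doc) if first_list else (other_doc, target))
            if target > other_doc:
                break          # the skip would overshoot
            pos += step        # follow the skip pointer
            skipped = True
        if not skipped or advance_after_skip:
            pos += 1           # ordinary next()
        return pos

    i = j = 0
    while i < len(p1) and j < len(p2):
        trace.append((p1[i], p2[j]))
        if p1[i] == p2[j]:
            matches.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i = advance(p1, i, p2[j], True)
        else:
            j = advance(p2, j, p1[i], False)
    return matches, trace


long_list = [4, 6, 10, 12, 14, 16, 18, 20, 22, 32, 47, 81, 120, 122, 157, 180]

_, original = intersect_with_skips(long_list, [47])
_, modified = intersect_with_skips(long_list, [47], advance_after_skip=True)
print(len(original), original)
print(len(modified), modified)
```

With the flag off, the trace contains the 8 pairs listed in the solution; with it on, the repeated (22,47) and (120,47) disappear and 6 pairs remain.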
