6.830 Problem Set 2 (2009) - MIT Database Group
6.830 Problem Set 2 Solutions

1. [5 points]: Query 1:

SELECT r1.name
FROM researchers AS r1, researchers AS r2, grants, grant_researchers AS gr
WHERE grants.pi = r2.id
AND grants.id = gr.grantid
AND gr.researcherid = r1.id
AND r1.org = 10
AND r1.org != r2.org;

Answer:

a. Query plan (top-most operator first; each operator's inputs are indented beneath it):

hash
    index nested loops
        hash
            seq scan gr
            filter org=10
                seq scan r1
        index scan grants.id
    seq scan r2

b. Working from the bottom up, the query scans and filters r1 because using the index on r1.id would require lots of random seeks into r1 to test org=10. There’s no predicate on grant_researchers, so the only choice is to sequentially scan it.

Hash join is in general the best choice when there isn’t a need for output in sorted order or an obvious index-based plan.

It is somewhat unclear why it chooses to do an index-nested-loops join between the joined r1/gr table and grants: it estimates that it will do 1929 index lookups, which sounds expensive relative to building a hash table on the grants table. This appears to be a bad plan choice.

Again, it chooses a hash join for the top-most join, because that’s faster than the random seeks that would be required to use r2’s index, and a large fraction of r2’s pages will be examined.

c. 1586

d. 28

e. The counts are only wrong in the top-most join (i.e., in the check of r1.org != r2.org).

f. The plan looks reasonable, except for the choice of index-nested-loops for the middle join.
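To make the trade-off in part b concrete, here is a minimal Python sketch of the two join strategies discussed for the middle join. It is only an illustration under stated assumptions: the rows are made-up toy data, not the problem set's tables, and a Python dict stands in for the index on grants.id, so a "lookup" here is a cheap in-memory probe rather than the random I/O a real index lookup may cost. The function names `hash_join` and `index_nested_loops_join` are hypothetical helpers, not part of any database API.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Build a hash table on build_rows once, then probe it per probe row."""
    table = defaultdict(list)
    for b in build_rows:
        table[b[build_key]].append(b)
    # .get() avoids inserting empty buckets for keys with no match
    return [(p, b) for p in probe_rows for b in table.get(p[probe_key], [])]


def index_nested_loops_join(outer_rows, index, outer_key):
    """For each outer row, do one lookup in an existing index on the inner table."""
    out = []
    lookups = 0
    for o in outer_rows:
        lookups += 1  # one index lookup per outer row
        for inner in index.get(o[outer_key], []):
            out.append((o, inner))
    return out, lookups


# Toy data (assumed for illustration only).
grants = [{"id": g, "pi": g % 3} for g in range(5)]
gr = [{"grantid": r % 5, "researcherid": r} for r in range(8)]

# Stand-in for the index on grants.id.
grants_index = {g["id"]: [g] for g in grants}

inl_rows, lookups = index_nested_loops_join(gr, grants_index, "grantid")
hj_rows = hash_join(grants, gr, "id", "grantid")
```

Both strategies produce the same joined rows; what differs is the cost. The index-nested-loops version pays one index lookup per outer row (1929 estimated lookups in the plan above), while the hash join pays one sequential pass over grants to build the table and then probes in memory.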