Spotlight on Spotlight - Carol Smith Home Page
Query #3
Search String: Virginia or VA
Results Set: This search returned 5 documents, 4 of which were relevant.
Precision [1]: 4/5, or 80% of all retrieved documents are relevant to the query.
Recall: 4/32, or 12.5% of all relevant documents were retrieved by the query.
Harmonic Mean: $\frac{2}{\frac{1}{0.125} + \frac{1}{0.8}} = \frac{2}{9.25} \approx 0.216$
Observations:
Recall was extremely low because the system did not interpret the
search string as the user anticipated. Spotlight does not recognize
“or” as a valid Boolean operator. The documents that were retrieved
happened to refer to the state of Virginia as both ‘Virginia’ and
‘VA’, and included at least one word containing ‘or’ as a letter
sequence (e.g., “memorial”, “born”, “Dora”). Precision was high, but
because the search string was not interpreted as the user expected,
the high precision cannot be attributed to a well-formulated query.
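The behavior described in these observations can be illustrated with a minimal sketch. It assumes, as one plausible reading of the results, that the system treats every word of the search string (including "or") as a case-insensitive substring that must appear in the document. This is not Spotlight's actual implementation. The function names and the four sample documents are hypothetical and are not drawn from the study's data set.

```python
import re

# Hypothetical sample documents (NOT from the study's 144-document data set).
DOCS = {
    "a.txt": "Dora was born in Richmond, Virginia (VA).",
    "b.txt": "The family moved to Virginia in 1850.",
    "c.txt": "Buried in Norfolk, VA.",
    "d.txt": "Moved to Ohio in 1860.",
}

QUERY = "Virginia or VA"


def substring_and_match(query: str, text: str) -> bool:
    """Assumed literal reading: every query word, including 'or',
    must occur somewhere in the text as a case-insensitive substring."""
    haystack = text.lower()
    return all(term.lower() in haystack for term in query.split())


def boolean_or_match(query: str, text: str) -> bool:
    """Reading the user intended: 'A or B' matches when either operand
    occurs in the text as a whole word."""
    operands = re.split(r"\s+or\s+", query, flags=re.IGNORECASE)
    return any(
        re.search(rf"\b{re.escape(op)}\b", text, flags=re.IGNORECASE)
        for op in operands
    )


literal_hits = sorted(n for n, t in DOCS.items() if substring_and_match(QUERY, t))
intended_hits = sorted(n for n, t in DOCS.items() if boolean_or_match(QUERY, t))

print("literal reading: ", literal_hits)
print("intended reading:", intended_hits)
```

Under these assumptions the literal reading retrieves only the document that contains all three letter sequences, while the intended Boolean reading retrieves every document that mentions the state in either form, which mirrors the gap between the 4 relevant documents retrieved and the 32 relevant documents in the collection.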
[1] Spotlight does not employ a ranking algorithm, and returns documents in lexicographic order by document
name. For this reason, precision cannot be presented at intermediate levels of recall.
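The figures reported for Query #3 can be checked with a short calculation. The sketch below uses only the counts given above (5 documents retrieved, 4 of them relevant, 32 relevant documents in the collection). The function names are illustrative choices, not part of the original analysis.

```python
def precision(relevant_retrieved: int, retrieved: int) -> float:
    """Fraction of retrieved documents that are relevant."""
    return relevant_retrieved / retrieved


def recall(relevant_retrieved: int, relevant_total: int) -> float:
    """Fraction of all relevant documents that were retrieved."""
    return relevant_retrieved / relevant_total


def harmonic_mean(p: float, r: float) -> float:
    """Harmonic mean of precision and recall: 2 / (1/r + 1/p).
    Defined here as 0.0 when either value is zero."""
    if p == 0 or r == 0:
        return 0.0
    return 2 / (1 / r + 1 / p)


# Counts reported for Query #3.
p = precision(relevant_retrieved=4, retrieved=5)
r = recall(relevant_retrieved=4, relevant_total=32)
hm = harmonic_mean(p, r)

print(f"precision     = {p:.3f}")
print(f"recall        = {r:.3f}")
print(f"harmonic mean = {hm:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two values, the low recall pulls the combined score far below the 80% precision.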
Additional observations:
- As can be seen in the histograms on the following page, precision and recall display
  an inverse relationship for all three queries.
- Query 3 has the highest precision but also the lowest recall, whereas query 2 has the
  lowest precision but the highest recall of the three queries.
- Harmonic mean is poorest for query 3, reflecting the large difference between its
  precision and recall measures.
- The data set is fairly small (144 documents), and this analysis may not properly
  reflect the poor recall performance associated with searching large document
  collections (Blair & Maron, 1985).