11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


By looking at the start and end positions for each term, we can see that the only thing this field type does is tokenize text on whitespace. Notice in this image that the term "Running" has a start position of 0 and an end position of 7, while "an" has a start position of 8 and an end position of 10, and "Analyzer" starts at 11 and ends at 19. If the whitespace between the terms were also included, the count would be 21; since it is 19, we know that whitespace has been removed from this query.
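To make the offsets concrete, here is a minimal Python sketch of whitespace tokenization that records a start and end offset for each term. It is an illustration only, not Solr's implementation; the function name `whitespace_tokenize` is an assumption made for this example.

```python
import re

def whitespace_tokenize(text):
    """Return (term, start, end) for each run of non-whitespace characters."""
    # re.finditer yields match objects that carry character offsets,
    # which play the role of the start/end positions on the analysis screen.
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]

tokens = whitespace_tokenize("Running an Analyzer")
for term, start, end in tokens:
    print(term, start, end)
```

For the input "Running an Analyzer" this yields the same offsets quoted above: 0 to 7, 8 to 10, and 11 to 19.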

Note also that the indexed terms and the query terms are still very different. "Running" doesn't match "run", "Analyzer" doesn't match "analyzer" (to a computer), and obviously "an" and "my" are totally different words. If our objective is to allow queries like "run my analyzer" to match indexed text like "Running an Analyzer", then we will need to pick a different field type, one whose index-time and query-time text analysis does more processing of the inputs.

In particular, we will want:

Case insensitivity, so "Analyzer" and "analyzer" match.

Stemming, so words like "Run" and "Running" are considered equivalent terms.

Stop word pruning, so small words like "an" and "my" don't affect the query.
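The three requirements above can be sketched as a small analysis chain in Python. This is a toy under stated assumptions, not what any Solr field type actually runs: the `STOP_WORDS` set is a hypothetical list chosen for this example, and `toy_stem` is a deliberately crude suffix rule standing in for a real stemmer.

```python
import re

# Hypothetical stop list for illustration; Solr reads its own from a file.
STOP_WORDS = {"a", "an", "my", "the"}

def toy_stem(term):
    """Strip a trailing "ing" and undo a doubled final consonant.

    A toy rule for this example only, not a real stemming algorithm.
    """
    if term.endswith("ing") and len(term) > 5:
        term = term[:-3]
        if len(term) >= 2 and term[-1] == term[-2] and term[-1] not in "aeiou":
            term = term[:-1]
    return term

def analyze(text):
    tokens = re.findall(r"\S+", text)                    # tokenize on whitespace
    tokens = [t.lower() for t in tokens]                 # case insensitivity
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word pruning
    return [toy_stem(t) for t in tokens]                 # stemming

indexed = analyze("Running an Analyzer")
query = analyze("run my analyzer")
print(indexed, query, indexed == query)
```

With all three steps applied to both sides, the indexed text and the query reduce to the same terms, which is the property the list above asks for.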

For our next attempt, let's try the "text_general" field type:

With the verbose output enabled, we can see how each stage of our new analyzers modifies the tokens it receives before passing them on to the next stage. As we scroll down to the final output, we can see that we do start to get a match on "analyzer" from each input string, thanks to the "LCF" stage, which, if you hover over it with your mouse, you'll see is the "LowerCaseFilter":
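As a rough illustration of why lower-casing alone produces the "analyzer" match but nothing more, the following Python sketch compares the two term lists before and after a lower-casing step. The helper `lower_case_filter` is a hypothetical stand-in written for this example; it only mimics the effect described above and ignores the other stages of the field type.

```python
indexed_terms = ["Running", "an", "Analyzer"]
query_terms = ["run", "my", "analyzer"]

def lower_case_filter(tokens):
    """Lower-case every token, leaving order and count unchanged."""
    return [t.lower() for t in tokens]

# Terms shared by both sides without any filtering.
exact = set(indexed_terms) & set(query_terms)

# Terms shared by both sides once each has been lower-cased.
lowered = set(lower_case_filter(indexed_terms)) & set(lower_case_filter(query_terms))

print(exact, lowered)
```

Only "analyzer" is shared after lower-casing; "running" and "run" still differ, which is why stemming remains on the list of requirements.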
