11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


enablePositionIncrements: if luceneMatchVersion is 4.4 or earlier and enablePositionIncrements="false", no position holes will be left by this filter when it removes tokens. This argument is invalid if luceneMatchVersion is 5.0 or later.

Example:

Case-sensitive matching, capitalized words not stopped. Token positions skip stopped words.

In: "To be or what?"

Tokenizer to Filter: "To"(1), "be"(2), "or"(3), "what"(4)

Out: "To"(1), "what"(4)
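An analyzer definition consistent with the example above might look like the following sketch. The file name stopwords.txt is an assumption, and the file is assumed to list "to", "be" and "or" in lowercase. Because ignoreCase is left at its default of false, the capitalized "To" does not match the lowercase entry and is kept.

```xml
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- stopwords.txt is an assumed file name; it is taken to contain "to", "be" and "or" -->
  <!-- ignoreCase is not set, so matching is case-sensitive -->
  <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
</analyzer>
```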

Example:

In: "To be or what?"

Tokenizer to Filter: "To"(1), "be"(2), "or"(3), "what"(4)

Out: "what"(4)
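Here "To" is removed as well, which is what case-insensitive matching would produce. A sketch of an analyzer that behaves this way, again assuming a stop words file named stopwords.txt that lists "to", "be" and "or":

```xml
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- stopwords.txt is an assumed file name; it is taken to contain "to", "be" and "or" -->
  <!-- ignoreCase="true" lets the capitalized "To" match the lowercase entry "to" -->
  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
</analyzer>
```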

Suggest Stop Filter

Like Stop Filter, this filter discards, or stops analysis of, tokens that are on the given stop words list. Suggest Stop Filter differs from Stop Filter in that it will not remove the last token unless it is followed by a token separator. For example, a query "find the" would preserve the "the" since it was not followed by a space, punctuation, etc., and mark it as a KEYWORD so that following filters will not change or remove it. By contrast, a query like "find the popsicle" would remove "the" as a stopword, since it is followed by a space. When using one of the analyzing suggesters, you would normally use the ordinary StopFilterFactory in your index analyzer and then SuggestStopFilter in your query analyzer.

Factory class: solr.SuggestStopFilterFactory

Arguments:

words: (optional; default: StopAnalyzer#ENGLISH_STOP_WORDS_SET) The name of a stopwords file to parse.

format: (optional; default: wordset) Defines how the words file will be parsed. If words is not specified, then format must not be specified. The valid values for the format option are:

wordset: This is the default format, which supports one word per line (including any intra-word whitespace) and allows whole-line comments beginning with the "#" character. Blank lines are ignored.

snowball: This format allows multiple words to be specified on each line, and trailing comments may be specified using the vertical line ("|"). Blank lines are ignored.

ignoreCase: (optional; default: false) If true, matching is case-insensitive.
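The index/query split described above can be sketched as a field type. This is an illustration under stated assumptions, not a recommended schema: the field type name text_suggest and the file name stopwords.txt are hypothetical, and the choice of tokenizer and lowercase filter is only one plausible arrangement.

```xml
<!-- "text_suggest" is a hypothetical field type name -->
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: the ordinary stop filter removes every stop word -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stopwords.txt is an assumed file name -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
  <!-- Query time: a trailing stop word with no separator after it is kept and marked as a keyword -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SuggestStopFilterFactory" ignoreCase="true" words="stopwords.txt" format="wordset"/>
  </analyzer>
</fieldType>
```

With a definition along these lines, a partially typed query such as "find the" keeps its final "the" at query time, so a suggester can still complete it, while the indexed text has its stop words removed as usual.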

