11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


Example:

In: "The The"

Tokenizer to Filter: "the"(1), "the"(2)

Out: "the"(2)

Synonym Filter

This filter does synonym mapping. Each token is looked up in the list of synonyms, and if a match is found, the synonym is emitted in place of the token. The position values of the new tokens are set so that they all occur at the same position as the original token.

Factory class: solr.SynonymFilterFactory

Arguments:

synonyms: (required) The path of a file that contains a list of synonyms, one per line. In the default solr format (see the format argument below for alternatives), blank lines and lines that begin with "#" are ignored. This may be an absolute path, or a path relative to the Solr config directory. There are two ways to specify synonym mappings:

A comma-separated list of words. If the token matches any of the words, then all the words in the list are substituted, which will include the original token.

Two comma-separated lists of words with the symbol "=>" between them. If the token matches any word on the left, then the list on the right is substituted. The original token will not be included unless it is also in the list on the right.

ignoreCase: (optional; default: false) If true, synonyms will be matched case-insensitively.

expand: (optional; default: true) If true, a synonym will be expanded to all equivalent synonyms. If false, all equivalent synonyms will be reduced to the first in the list.

format: (optional; default: solr) Controls how the synonyms will be parsed. The short names solr (for SolrSynonymParser) and wordnet (for WordnetSynonymParser) are supported, or you may alternatively supply the name of your own SynonymMap.Builder subclass.

tokenizerFactory: (optional; default: WhitespaceTokenizerFactory) The name of the tokenizer factory to use when parsing the synonyms file. Arguments with the name prefix "tokenizerFactory." will be supplied as init params to the specified tokenizer factory. Any arguments not consumed by the synonym filter factory, including those without the "tokenizerFactory." prefix, will also be supplied as init params to the tokenizer factory. If tokenizerFactory is specified, then analyzer may not be, and vice versa.

analyzer: (optional; default: WhitespaceTokenizerFactory) The name of the analyzer class to use when parsing the synonyms file. If analyzer is specified, then tokenizerFactory may not be, and vice versa.
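
To make the mapping rules for the synonyms, ignoreCase, and expand arguments concrete, the following Python sketch mimics them outside of Solr. It is an illustration only and not Solr's implementation: the function names parse_rules and apply_synonyms and the sample rules are hypothetical, and the sketch assumes single-word synonyms in the solr format.

```python
# Illustrative sketch only: mimics the documented solr-format rules.
# parse_rules, apply_synonyms, and SAMPLE_RULES are hypothetical names,
# not part of Solr. Multi-word synonyms are not handled.

def parse_rules(lines, expand=True, ignore_case=False):
    """Build a {token: [replacement terms]} map from solr-format lines."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and "#" comment lines are ignored
        if ignore_case:
            line = line.lower()
        if "=>" in line:
            # Explicit mapping: any word on the left becomes the right list.
            left, right = line.split("=>", 1)
            sources = [w.strip() for w in left.split(",") if w.strip()]
            targets = [w.strip() for w in right.split(",") if w.strip()]
        else:
            # Equivalent words: expand to all of them, or reduce to the first.
            words = [w.strip() for w in line.split(",") if w.strip()]
            sources = words
            targets = words if expand else words[:1]
        for src in sources:
            bucket = mapping.setdefault(src, [])
            for term in targets:
                if term not in bucket:
                    bucket.append(term)
    return mapping


def apply_synonyms(tokens, mapping, ignore_case=False):
    """Return (term, position) pairs; synonyms share the original position."""
    out = []
    for position, token in enumerate(tokens, start=1):
        key = token.lower() if ignore_case else token
        for term in mapping.get(key, [token]):
            out.append((term, position))
    return out


SAMPLE_RULES = [
    "# hypothetical rules for illustration",
    "",
    "car, automobile",
    "tv => television, telly",
]

mapping = parse_rules(SAMPLE_RULES)
print(apply_synonyms(["my", "tv", "car"], mapping))
# [('my', 1), ('television', 2), ('telly', 2), ('car', 3), ('automobile', 3)]
```

In this sketch, "tv" is replaced because it does not appear on the right-hand side of its rule, while "car" is kept alongside "automobile" because a plain comma-separated list includes the original token. Calling parse_rules(SAMPLE_RULES, expand=False) would instead reduce both "car" and "automobile" to "car".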

For the following examples, assume a synonyms file named mysynonyms.txt:
