11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


In: "tirgiem tirgus"

Tokenizer to Filter: "tirgiem", "tirgus"

Out: "tirg", "tirg"

Norwegian

Solr includes two classes for stemming Norwegian, NorwegianLightStemFilterFactory and NorwegianMinimalStemFilterFactory. Lucene includes an example stopword list.

Another option is to use the Snowball Porter Stemmer with an argument of language="Norwegian".

Also relevant are the Scandinavian normalization filters.
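The Snowball alternative and the Scandinavian normalization step mentioned above might be combined roughly as follows. This is a minimal sketch rather than a recommended chain: the choice of tokenizer, the lowercase filter, and the decision to normalize before stemming are all assumptions made for illustration.

```xml
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Assumption: normalize interchangeable Scandinavian characters before stemming -->
  <filter class="solr.ScandinavianNormalizationFilterFactory"/>
  <!-- Snowball Porter stemmer selected through the language argument -->
  <filter class="solr.SnowballPorterFilterFactory" language="Norwegian"/>
</analyzer>
```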

Norwegian Light Stemmer

The NorwegianLightStemFilterFactory requires a "two-pass" sort for the -dom and -het endings. This means that in the first pass the word "kristendom" is stemmed to "kristen", and then all the general rules apply so it will be further stemmed to "krist". The effect of this is that "kristen," "kristendom," "kristendommen," and "kristendommens" will all be stemmed to "krist."

The second pass is to pick up -dom and -het endings. Consider this example:

One pass                    Two passes
Before          After       Before          After
forlegen        forleg      forlegen        forleg
forlegenhet     forlegen    forlegenhet     forleg
forlegenheten   forlegen    forlegenheten   forleg
forlegenhetens  forlegen    forlegenhetens  forleg
firkantet       firkant     firkantet       firkant
firkantethet    firkantet   firkantethet    firkant
firkantetheten  firkantet   firkantetheten  firkant

Factory class: solr.NorwegianLightStemFilterFactory

Arguments: variant: Choose the Norwegian language variant to use. Valid values are:

nb: Bokmål (default)

nn: Nynorsk

no: both

Example:
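A minimal sketch of how this filter might sit in a field type's analyzer chain, assuming a Standard tokenizer followed by lowercasing and stopword removal. The field type name text_no and the stopword file path lang/stopwords_no.txt are illustrative assumptions; only the factory class and the variant argument come from the description above.

```xml
<!-- Hypothetical field type name; adjust to your schema -->
<fieldType name="text_no" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumed stopword file location and format -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_no.txt" format="snowball"/>
    <!-- variant may be nb (the default), nn, or no -->
    <filter class="solr.NorwegianLightStemFilterFactory" variant="nb"/>
  </analyzer>
</fieldType>
```

Setting variant="nb" explicitly is equivalent to omitting the argument, since Bokmål is the default; it is spelled out here only to show where the argument goes.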
