11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


The table below presents examples of regex-based pattern replacement:

| Input | pattern | replacement | Output | Description |
|---|---|---|---|---|
| see-ing looking | (\w+)(ing) | $1 | see-ing look | Removes "ing" from the end of word. |
| see-ing looking | (\w+)ing | $1 | see-ing look | Same as above. 2nd parentheses can be omitted. |
| No.1 NO. no. 543 | [nN][oO]\.\s*(\d+) | #$1 | #1 NO. #543 | Replace some string literals |
| abc=1234=5678 | (\w+)=(\d+)=(\d+) | $3=$1=$2 | 5678=abc=1234 | Change the order of the groups. |
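The rows of the table can be checked with a small standalone program. This is only a sketch: it assumes the patterns follow `java.util.regex` semantics, and the class name `PatternReplaceSketch` and its `replace` helper are hypothetical names chosen for illustration, not part of Solr.

```java
import java.util.regex.Pattern;

public class PatternReplaceSketch {
    // Hypothetical helper: applies one pattern/replacement pair to the whole input,
    // replacing every match (assumed to mirror the behavior shown in the table).
    static String replace(String input, String pattern, String replacement) {
        return Pattern.compile(pattern).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Each call corresponds to one row of the table: input, pattern, replacement.
        System.out.println(replace("see-ing looking", "(\\w+)(ing)", "$1"));
        System.out.println(replace("see-ing looking", "(\\w+)ing", "$1"));
        System.out.println(replace("No.1 NO. no. 543", "[nN][oO]\\.\\s*(\\d+)", "#$1"));
        System.out.println(replace("abc=1234=5678", "(\\w+)=(\\d+)=(\\d+)", "$3=$1=$2"));
    }
}
```

Note that in the third row "NO." is left untouched because no digits follow it, so the pattern does not match there.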

Related Topics

CharFilterFactories

Language Analysis

This section contains information about tokenizers and filters related to character set conversion or for use with specific languages. For the European languages, tokenization is fairly straightforward. Tokens are delimited by white space and/or a relatively small set of punctuation characters. In other languages the tokenization rules are often not so simple. Some European languages may require special tokenization rules as well, such as rules for decompounding German words.

For information about language detection at index time, see Detecting Languages During Indexing.

Topics discussed in this section:

KeywordMarkerFilterFactory
KeywordRepeatFilterFactory
StemmerOverrideFilterFactory
Dictionary Compound Word Token Filter
Unicode Collation
ASCII & Decimal Folding Filters
Language-Specific Factories
Related Topics

KeywordMarkerFilterFactory

Protects words from being modified by stemmers. A customized protected word list may be specified with the "protected" attribute in the schema. Words in the protected word list are not modified by any stemmer in Solr.

A sample Solr protwords.txt with comments can be found in the sample_techproducts_configs config set directory.
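The effect of a protected word list can be illustrated with a toy program. This is a sketch of the idea only, not Solr's or Lucene's implementation: `toyStem` is a deliberately naive, hypothetical stemmer, and `analyze` is a hypothetical helper that skips stemming for any token found in the protected set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ProtectedWordsSketch {
    // Hypothetical toy stemmer (NOT a real Solr stemmer): strips a trailing "ing" or "s".
    static String toyStem(String token) {
        if (token.length() > 4 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Tokens listed in protectedWords pass through unchanged; all others are stemmed.
    static List<String> analyze(List<String> tokens, Set<String> protectedWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(protectedWords.contains(token) ? token : toyStem(token));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a protected word list (assumed, one word here).
        Set<String> protectedWords = new HashSet<>(Arrays.asList("looking"));
        List<String> tokens = Arrays.asList("looking", "cats");
        System.out.println(analyze(tokens, protectedWords));
        System.out.println(analyze(tokens, new HashSet<String>()));
    }
}
```

With "looking" protected, only "cats" is stemmed; with an empty protected set, both tokens are stemmed.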
