11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


To use this encoding in your analyzer, see Daitch-Mokotoff Soundex Filter in the Filter Descriptions section.

The Daitch-Mokotoff Soundex algorithm is a refinement of the Russell and American Soundex algorithms, yielding greater accuracy in matching especially Slavic and Yiddish surnames with similar pronunciation but differences in spelling.

The main differences compared to the other Soundex variants are:

coded names are 6 digits long

the initial character of the name is coded

rules exist to encode multi-character n-grams

multiple encodings are possible for the same name (branching)

Note: the implementation used by Solr (commons-codec's DaitchMokotoffSoundex) has additional branching rules compared to the original description of the algorithm.

For more information, see http://en.wikipedia.org/wiki/Daitch%E2%80%93Mokotoff_Soundex and http://www.avotaynu.com/soundex.htm
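As a rough illustration of wiring this encoding into an analyzer, the schema fragment below is a minimal sketch. The field type name `text_dm_soundex` and the choice of tokenizer are hypothetical. The factory class `solr.DaitchMokotoffSoundexFilterFactory` and its `inject` attribute should be checked against the Filter Descriptions section for your Solr version.

```xml
<!-- Hypothetical field type name; the tokenizer is chosen only for illustration. -->
<fieldType name="text_dm_soundex" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- inject="true" keeps the original token and adds its phonetic codes
         at the same position; inject="false" replaces the token with its codes. -->
    <filter class="solr.DaitchMokotoffSoundexFilterFactory" inject="true"/>
  </analyzer>
</fieldType>
```

Because of branching, a single input token may produce more than one 6-digit code, so a field analyzed this way can match on any of them.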

Double Metaphone

To use this encoding in your analyzer, see Double Metaphone Filter in the Filter Descriptions section. Alternatively, you may specify encoder="DoubleMetaphone" with the Phonetic Filter, but note that the Phonetic Filter version will not provide the second ("alternate") encoding that is generated by the Double Metaphone Filter for some tokens.

Encodes tokens using the double metaphone algorithm by Lawrence Philips. See the original article at http://www.drdobbs.com/the-double-metaphone-search-algorithm/184401251?pgno=2
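The two ways of enabling Double Metaphone described above might be configured roughly as follows. This is a sketch, not a recommended setup: the field type names and the tokenizer are hypothetical, and the factory attribute on the Phonetic Filter is assumed to be `encoder`, as listed in the Filter Descriptions section.

```xml
<!-- Option 1: dedicated Double Metaphone Filter; can emit the primary
     and, for some tokens, the alternate encoding. -->
<fieldType name="text_dmetaphone" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
  </analyzer>
</fieldType>

<!-- Option 2: generic Phonetic Filter; no alternate encoding is produced. -->
<fieldType name="text_phonetic_dm" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
  </analyzer>
</fieldType>
```

If matching on alternate pronunciations matters for your data, Option 1 is the one that provides them.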

Metaphone

To use this encoding in your analyzer, specify encoder="Metaphone" with the Phonetic Filter.

Encodes tokens using the Metaphone algorithm by Lawrence Philips, described in "Hanging on the Metaphone" in Computer Language, Dec. 1990.

See http://en.wikipedia.org/wiki/Metaphone
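A minimal sketch of a Metaphone-based analyzer follows. The field type name, the tokenizer, and the `maxCodeLength` value of 4 are illustrative assumptions. The `encoder`, `inject`, and `maxCodeLength` attributes are those of the Phonetic Filter as listed in the Filter Descriptions section.

```xml
<!-- Hypothetical field type name; values are for illustration only. -->
<fieldType name="text_metaphone" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- maxCodeLength caps the length of the generated Metaphone code. -->
    <filter class="solr.PhoneticFilterFactory"
            encoder="Metaphone"
            inject="true"
            maxCodeLength="4"/>
  </analyzer>
</fieldType>
```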

Soundex

To use this encoding in your analyzer, specify encoder="Soundex" with the Phonetic Filter.

Encodes tokens using the Soundex algorithm, which is used to relate similar names, but can also be used as a general-purpose scheme to find words with similar phonemes.

See http://en.wikipedia.org/wiki/Soundex

Refined Soundex

To use this encoding in your analyzer, specify encoder="RefinedSoundex" with the Phonetic Filter.

Encodes tokens using an improved version of the Soundex algorithm.

See http://en.wikipedia.org/wiki/Soundex
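Soundex and Refined Soundex are both selected through the same Phonetic Filter, so switching between them is a one-attribute change. The fragment below is a sketch under that assumption; the field type names and the tokenizer are hypothetical.

```xml
<!-- Classic Soundex codes. -->
<fieldType name="text_soundex" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="Soundex" inject="false"/>
  </analyzer>
</fieldType>

<!-- Refined Soundex codes; only the encoder value differs. -->
<fieldType name="text_refined_soundex" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="RefinedSoundex" inject="false"/>
  </analyzer>
</fieldType>
```

Defining both side by side makes it easy to compare their output for the same input on the Analysis screen before committing to one.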
