11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0



In: "Visit http://accarol.com/contact.htm?from=external&a=10 or e-mail bob.cratchet@accarol.com"<br />

Out: "Visit", "http://accarol.com/contact.htm?from=external&a=10", "or", "e", "mail", "bob.cratchet@accarol.com"<br />

White Space Tokenizer<br />

Simple tokenizer that splits the text stream on whitespace and returns sequences of non-whitespace characters<br />

as tokens. Note that any punctuation will be included in the tokens.<br />

Factory class: solr.WhitespaceTokenizerFactory<br />

Arguments:<br />

rule: Specifies how to define whitespace for the purpose of tokenization. Valid values:<br />

java: (Default) Uses Character.isWhitespace(int)<br />

unicode: Uses Unicode's WHITESPACE property<br />

Example:<br />

In: "To be, or what?"<br />

Out: "To", "be,", "or", "what?"<br />
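The In/Out pair above could be produced by an analyzer configured roughly as follows. This is a minimal sketch, assuming the analyzer sits inside a field type in schema.xml; the factory class and the rule argument come from the description above, and everything else is illustrative.

```xml
<!-- Sketch only: assumed to be nested inside a fieldType element in schema.xml -->
<analyzer>
  <!-- rule="java" is the documented default; it is written out here for clarity -->
  <tokenizer class="solr.WhitespaceTokenizerFactory" rule="java"/>
</analyzer>
```

Because the tokenizer splits only on whitespace, the comma in "be," and the question mark in "what?" stay attached to their tokens.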

Related Topics<br />

TokenizerFactories<br />

Filter Descriptions<br />

You configure each filter with a filter element in schema.xml as a child of the analyzer element, following the tokenizer element. Filter definitions should follow a tokenizer or another filter definition because they take a TokenStream as input. For example:<br />

<br />

<br />

<br />

...<br />

<br />

<br />
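A filter definition of the kind described here might look like the sketch below. The field type name "text" and the choice of StandardTokenizerFactory and LowerCaseFilterFactory are assumptions made for illustration, not a recovered copy of the original listing.

```xml
<!-- Sketch only: the name "text" and the specific factories are illustrative assumptions -->
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <!-- The tokenizer comes first and produces the initial TokenStream -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- The filter follows the tokenizer and consumes its TokenStream -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```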

The class attribute names a factory class that will instantiate a filter object as needed. Filter factory classes must<br />

extend the org.apache.lucene.analysis.util.TokenFilterFactory class. Like tokenizers, filters are<br />

also instances of TokenStream and thus are producers of tokens. Unlike tokenizers, filters also consume tokens<br />

from a TokenStream. This allows you to mix and match filters, in any order you prefer, downstream of a<br />

tokenizer.<br />
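To illustrate chaining, here is a hedged sketch of an analyzer with several filters downstream of one tokenizer. The particular filters and the stopwords.txt file name are assumptions chosen for the example. Filters run in the order listed, each one consuming the TokenStream produced by the element before it.

```xml
<!-- Sketch only: filter choice and the stopwords.txt file name are illustrative assumptions -->
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- 1. Lowercase first, so later filters see normalized tokens -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- 2. Remove stop words listed in a file in the configuration directory -->
  <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
  <!-- 3. Stem whatever tokens remain -->
  <filter class="solr.PorterStemFilterFactory"/>
</analyzer>
```

Order matters in a chain like this: if the stop filter ran before the lowercase filter, capitalized stop words such as "The" would pass through unless the stop filter were also given ignoreCase="true".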

Arguments may be passed to tokenizer factories to modify their behavior by setting attributes on the e<br />
