11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


Order of Operations

Here is the order in which the Solr Cell framework, using the Extracting Request Handler and Tika, processes its input.

1. Tika generates fields or passes them in as literals specified by literal.<fieldname>=<value>. If literalsOverride=false, literals will be appended as multi-value to the Tika-generated field.
2. If lowernames=true, Tika maps fields to lowercase.
3. Tika applies the mapping rules specified by fmap.source=target parameters.
4. If uprefix is specified, any unknown field names are prefixed with that value, else if defaultField is specified, any unknown fields are copied to the default field.
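The four steps above can be sketched as a small Python function. This is a toy model for following the order of operations, not Solr's implementation: the function name resolve_fields and the schema_fields argument (standing in for "fields the schema knows about") are hypothetical, and step 2 only lowercases names, as the list describes.

```python
def resolve_fields(tika_fields, schema_fields, literals=None,
                   literals_override=True, lowernames=False,
                   fmap=None, uprefix=None, default_field=None):
    """Toy model of the four processing steps; names are illustrative only."""
    doc = {}

    # Step 1: take Tika-generated fields, then merge in literal.* values.
    for name, values in tika_fields.items():
        doc.setdefault(name, []).extend(values)
    for name, value in (literals or {}).items():
        if literals_override:
            doc[name] = [value]                      # literal replaces Tika's value
        else:
            doc.setdefault(name, []).append(value)   # appended as an extra value

    # Step 2: lowernames=true maps field names to lowercase.
    if lowernames:
        lowered = {}
        for name, values in doc.items():
            lowered.setdefault(name.lower(), []).extend(values)
        doc = lowered

    # Step 3: apply fmap.source=target renames.
    mapped = {}
    for name, values in doc.items():
        target = (fmap or {}).get(name, name)
        mapped.setdefault(target, []).extend(values)

    # Step 4: unknown fields get the uprefix, else go to the default field.
    result = {}
    for name, values in mapped.items():
        if name in schema_fields:
            target = name
        elif uprefix is not None:
            target = uprefix + name
        elif default_field is not None:
            target = default_field
        else:
            target = name                            # left as-is in this sketch
        result.setdefault(target, []).extend(values)
    return result


example = resolve_fields(
    {"Last-Modified": ["2016-05-11"], "Author": ["A"], "X-Custom": ["z"]},
    schema_fields={"last_modified", "author"},
    literals={"Author": "B"},
    literals_override=False,
    fmap={"Last-Modified": "last_modified", "Author": "author"},
    uprefix="ignored_",
)
print(example)
```

In this example the literal "B" is appended to the Tika-generated Author value because literals_override is False, and X-Custom, which the hypothetical schema does not declare, ends up under the ignored_ prefix.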

Configuring the Solr ExtractingRequestHandler

If you are not working with the supplied sample_techproducts_configs or data_driven_schema_configs config sets, you must configure your own solrconfig.xml to know about the JARs containing the ExtractingRequestHandler and its dependencies:

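<lib dir="${solr.install.dir:../../..}/contrib/extraction/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-cell-\d.*\.jar" />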

You can then configure the ExtractingRequestHandler in solrconfig.xml.

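<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.Last-Modified">last_modified</str>
    <str name="uprefix">ignored_</str>
  </lst>
  <!-- Optional: path to a Tika configuration file -->
  <str name="tika.config">/my/path/to/tika.config</str>
  <!-- Optional: one or more date formats to parse -->
  <lst name="date.formats">
    <str>yyyy-MM-dd</str>
  </lst>
</requestHandler>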

In the defaults section, we are mapping Tika's Last-Modified metadata attribute to a field named last_modified. We are also telling it to ignore undeclared fields. Because these are defaults, they can all be overridden by request parameters.

The tika.config entry points to a file containing a Tika configuration. The date.formats entry allows you to specify one or more java.text.SimpleDateFormat patterns used to transform extracted input into a Date. Solr comes configured with the following date formats (see the DateUtil class in Solr):

yyyy-MM-dd'T'HH:mm:ss'Z'
yyyy-MM-dd'T'HH:mm:ss
yyyy-MM-dd
yyyy-MM-dd hh:mm:ss
yyyy-MM-dd HH:mm:ss
EEE MMM d hh:mm:ss z yyyy
EEE, dd MMM yyyy HH:mm:ss zzz
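One common way to use a list of candidate formats like this is to try each pattern in turn and keep the first one that parses. The sketch below illustrates that idea in Python and is not Solr's DateUtil code. The strptime patterns are hand-translated approximations of the first five SimpleDateFormat patterns above (an assumption of this example), and parse_date is a hypothetical helper name. The two patterns containing time zone names are left out because time zone parsing in Python's strptime is platform-dependent.

```python
from datetime import datetime

# Hand-translated approximations of the SimpleDateFormat patterns listed above.
# hh (12-hour clock) becomes %I and HH (24-hour clock) becomes %H.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%SZ",   # yyyy-MM-dd'T'HH:mm:ss'Z'
    "%Y-%m-%dT%H:%M:%S",    # yyyy-MM-dd'T'HH:mm:ss
    "%Y-%m-%d",             # yyyy-MM-dd
    "%Y-%m-%d %I:%M:%S",    # yyyy-MM-dd hh:mm:ss
    "%Y-%m-%d %H:%M:%S",    # yyyy-MM-dd HH:mm:ss
]


def parse_date(text, formats=CANDIDATE_FORMATS):
    """Return a datetime for the first format that matches, or None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this pattern did not match; try the next one
    return None


for sample in ["2016-05-11T08:15:00Z", "2016-05-11", "2016-05-11 14:30:00", "not a date"]:
    print(sample, "->", parse_date(sample))
```

Order matters in a scheme like this: a value such as 2016-05-11 14:30:00 fails the 12-hour pattern in Python and falls through to the 24-hour one, so the more permissive pattern is placed later in the list.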
