11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


stream_content_type

The content type of the stream, if available.

We recommend that you try using the extractOnly option to discover which values Solr is setting for these metadata elements.

Examples of Uploads Using the Extracting Request Handler

Capture and Mapping

The command below captures div tags separately, and then maps all instances of that field to a dynamic field named foo_t.

bin/post -c techproducts example/exampledocs/sample.html -params "literal.id=doc2&captureAttr=true&defaultField=_text_&fmap.div=foo_t&capture=div"

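To make the role of each parameter in the command above easier to see, the string passed to -params can be assembled one piece at a time. This is only a sketch: join_params is a hypothetical helper, not part of bin/post, and the per-parameter comments are brief summaries rather than full definitions.

```shell
# join_params is a hypothetical helper (not part of Solr): it joins its
# arguments with '&' to form the value passed to bin/post -params.
# The function body runs in a subshell, so the IFS change stays local.
join_params() (
  IFS='&'
  printf '%s\n' "$*"
)

# literal.id=doc2      sets the document's id field to doc2
# captureAttr=true     indexes element attributes into separate fields
# defaultField=_text_  field used when no other field can be determined
# fmap.div=foo_t       maps the captured div field to foo_t
# capture=div          captures div elements as a field of their own
params=$(join_params \
  "literal.id=doc2" \
  "captureAttr=true" \
  "defaultField=_text_" \
  "fmap.div=foo_t" \
  "capture=div")

# Print the command instead of running it, so the sketch works without Solr.
printf 'bin/post -c techproducts example/exampledocs/sample.html -params "%s"\n' "$params"
```

Keeping one parameter per line makes it easier to add or drop a single option when experimenting.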
Capture, Mapping, and Boosting

The command below captures div tags separately, maps the field to a dynamic field named foo_t, then boosts foo_t by 3.

bin/post -c techproducts example/exampledocs/sample.html -params "literal.id=doc3&captureAttr=true&defaultField=_text_&capture=div&fmap.div=foo_t&boost.foo_t=3"

Using Literals to Define Your Own Metadata

To add in your own metadata, pass in the literal parameter along with the file:

bin/post -c techproducts -params "literal.id=doc4&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&boost.foo_t=3&literal.blah_s=Bah" example/exampledocs/sample.html
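When several metadata fields are attached, each one is passed as its own literal.fieldname=value pair. A minimal sketch of generating those pairs, assuming a hypothetical helper named literal_params (not part of Solr) and reusing only the two literal fields from the command above:

```shell
# literal_params is a hypothetical helper (not part of Solr): it prefixes each
# field=value argument with "literal." and joins the results with '&'.
literal_params() (
  out=""
  for pair in "$@"; do
    # ${out:+$out&} adds the separator only when out is already non-empty.
    out="${out:+$out&}literal.$pair"
  done
  printf '%s\n' "$out"
)

literals=$(literal_params "id=doc4" "blah_s=Bah")

# Print the command instead of running it, so the sketch works without Solr.
printf 'bin/post -c techproducts -params "%s" example/exampledocs/sample.html\n' "$literals"
```

The sketch does not percent-encode anything, so a value containing & or = would need to be encoded before being placed in the string.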

XPath

The example below passes in an XPath expression to restrict the XHTML returned by Tika:

bin/post -c techproducts -params "literal.id=doc5&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_t&boost.foo_t=3&xpath=/xhtml:html/xhtml:body/xhtml:div//node()" example/exampledocs/sample.html
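The XPath expression contains characters the shell treats specially (the parentheses, and the & separators around it), so it can help to keep the expression in a single-quoted variable and quote the assembled string. A minimal sketch; the variable names are illustrative, and the parameter list is deliberately shortened from the full command:

```shell
# Single quotes keep the shell from interpreting ( ) in the expression.
xpath='/xhtml:html/xhtml:body/xhtml:div//node()'

# Double quotes keep '&' from being treated as a background operator
# while still expanding $xpath.
params="literal.id=doc5&capture=div&fmap.div=foo_t&xpath=$xpath"

# Print the command instead of running it, so the sketch works without Solr.
printf 'bin/post -c techproducts -params "%s" example/exampledocs/sample.html\n' "$params"
```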

Extracting Data without Indexing It

Solr allows you to extract data without indexing. You might want to do this if you're using Solr solely as an extraction server or if you're interested in testing Solr extraction.

The example below sets the extractOnly=true parameter to extract data without indexing it.
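A request of this kind might look like the following sketch. It assumes Solr is listening on localhost:8983, that the extracting handler is registered at /update/extract, and that curl is available; none of these are stated in the text above, so adjust them to your installation.

```shell
# Assumptions, not taken from the text above: host/port and handler path.
solr_host="localhost:8983"
core="techproducts"
handler="/update/extract"

# extractOnly=true asks Solr to return the extracted content instead of indexing it.
url="http://$solr_host/solr/$core$handler?extractOnly=true"

# Print the curl command instead of running it, so the sketch works without Solr.
printf "curl '%s' --data-binary @example/exampledocs/sample.html -H 'Content-type:text/html'\n" "$url"
```

Running the printed command against a live instance should return the extracted content and metadata in the response rather than adding a document to the index.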
