11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


defaultField

If the uprefix parameter (see below) is not specified and a field cannot be determined, the default field will be used.

extractOnly

Default is false. If true, returns the extracted content from Tika without indexing the document. The response literally includes the extracted XHTML as a string. When viewing the output manually, it may be useful to use a response format other than XML so that the embedded XHTML tags are easier to read. For an example, see http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput.

extractFormat

Default is "xml"; the other option is "text". Controls the serialization format of the extracted content. The xml format is actually XHTML, the same format that results from passing the -x option to the Tika command line application, while the text format is like that produced by Tika's -t option. This parameter is valid only if extractOnly is set to true.
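As an illustration of how extractOnly and extractFormat combine, the following minimal Python sketch builds a request URL. The host, port, core name (mycore) and the /update/extract handler path are assumptions made for the example, as is the use of wt=json as the non-XML response format; adjust them to match your installation.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port, core name and handler path are illustrative only.
EXTRACT_URL = "http://localhost:8983/solr/mycore/update/extract"


def extract_only_url(extract_format="xml", base_url=EXTRACT_URL):
    """Build a URL that asks for Tika's extracted content without indexing."""
    if extract_format not in ("xml", "text"):
        raise ValueError("extractFormat must be 'xml' or 'text'")
    params = {
        "extractOnly": "true",            # return the content, do not index it
        "extractFormat": extract_format,  # only valid when extractOnly is true
        "wt": "json",                     # assumed choice of a non-XML response format
    }
    return base_url + "?" + urlencode(params)


print(extract_only_url("text"))
```

The document itself would still need to be sent as the body of the request; the sketch covers only the parameters described above.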

fmap.<source_field>

Maps (moves) one field name to another. The source_field must be a field in incoming documents, and the value is the Solr field to map to. Example: fmap.content=text causes the data in the content field generated by Tika to be moved to Solr's text field.
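The "move" semantics of fmap can be pictured with a small local simulation. This Python sketch is not Solr's implementation; the function name apply_fmap and the sample field values are hypothetical, and it only mimics the renaming described above on an in-memory dictionary.

```python
def apply_fmap(fields, params):
    """Move values from source fields to target fields according to fmap.* parameters.

    fields: dict mapping field name -> list of values (as if produced by Tika)
    params: dict of request parameters, e.g. {"fmap.content": "text"}
    """
    prefix = "fmap."
    mapping = {
        key[len(prefix):]: target
        for key, target in params.items()
        if key.startswith(prefix)
    }
    result = {}
    for name, values in fields.items():
        target = mapping.get(name, name)  # unmapped fields keep their name
        result.setdefault(target, []).extend(values)
    return result


# Hypothetical extracted fields, for illustration only.
tika_fields = {"content": ["Hello world"], "title": ["Greeting"]}
print(apply_fmap(tika_fields, {"fmap.content": "text"}))
```

After the call, the data that was under content appears under text, and the content field no longer exists in the result.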

ignoreTikaException

If true, exceptions found during processing will be skipped. Any metadata available, however, will be indexed.

literal.<fieldname>

Populates the named field with the specified value for each document. The data can be multivalued if the field is multivalued.

literalsOverride

If true (the default), literal field values will override other values with the same field name. If false, literal values defined with literal.<fieldname> will be appended to data already in the fields extracted from Tika. If setting literalsOverride to "false", the field must be multivalued.

lowernames

Values are "true" or "false". If true, all field names will be mapped to lowercase with underscores, if needed. For example, "Content-Type" would be mapped to "content_type".

multipartUploadLimitInKB

Useful when uploading very large documents. Defines the maximum size, in KB, of documents to allow.

passwordsFile

Defines the path and name of a file that maps file names to passwords.

resource.name

Specifies the optional name of the file. Tika can use it as a hint for detecting a file's MIME type.

resource.password

Defines a password to use for a password-protected PDF or OOXML file.

tika.config

Defines the path and name of a customized Tika configuration file. This is only required if you have customized your Tika implementation.

uprefix

Prefixes all fields that are not defined in the schema with the given prefix. This is very useful when combined with dynamic field definitions. Example: uprefix=ignored_ would effectively ignore all unknown fields generated by Tika, given that the example schema contains a matching dynamic field definition.

xpath

When extracting, only return Tika XHTML content that satisfies the given XPath expression. See http://tika.apache.org/1.7/index.html for details on the format of Tika XHTML. See also http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput.
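Several of the parameters above are typically sent together when indexing a document. The following Python sketch assembles such a query string. The helper name build_index_params and the field names id and category are hypothetical; only the parameter names themselves (literal.<fieldname>, literalsOverride, uprefix, lowernames) come from the descriptions above.

```python
from urllib.parse import urlencode, parse_qs


def build_index_params(literals, literals_override=True, uprefix=None, lowernames=False):
    """Return a URL-encoded query string for an extraction request.

    literals: dict of field name -> value or list of values; a list produces
              repeated literal.<fieldname> parameters (field must be multivalued).
    """
    params = []
    for field, values in literals.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            params.append(("literal." + field, value))
    params.append(("literalsOverride", "true" if literals_override else "false"))
    if uprefix:
        params.append(("uprefix", uprefix))  # prefix for fields not in the schema
    if lowernames:
        params.append(("lowernames", "true"))
    return urlencode(params)


# Hypothetical field names and values, for illustration only.
query = build_index_params(
    {"id": "doc1", "category": ["reports", "2016"]},
    uprefix="ignored_",
    lowernames=True,
)
print(query)
print(parse_qs(query))
```

Building the parameters as a list of pairs rather than a dictionary keeps repeated literal.<fieldname> entries intact, which matters when the target field is multivalued.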