11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


onError

By default, the TikaEntityProcessor stops processing documents if it finds one that generates an error. If you set onError to "skip", the TikaEntityProcessor instead skips documents that fail processing and logs a message that the document was skipped.
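As an illustration of the setting described above, a minimal configuration sketch might look like the following. The file path, the entity name "tika", and the target field name "content" are assumptions made for this example and do not come from the guide; adapt them to your own schema.

```xml
<dataConfig>
  <!-- Binary data source, since Tika reads the raw file bytes -->
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- onError="skip": a document that fails parsing is skipped and logged
         instead of stopping the import. Path and names are hypothetical. -->
    <entity name="tika" processor="TikaEntityProcessor"
            url="/path/to/document.pdf" format="text"
            onError="skip">
      <field column="text" name="content"/>
    </entity>
  </document>
</dataConfig>
```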

The FileListEntityProcessor

This processor is essentially a wrapper: it generates a list of files that satisfy the conditions specified in its attributes, which can then be passed to another processor, such as the XPathEntityProcessor. The entity for that other processor is nested within the FileListEntityProcessor entity. It generates five implicit fields: fileAbsolutePath, fileDir, fileSize, fileLastModified, and file, which can be used in the nested processor. This processor does not use a data source.

The attributes specific to this processor are described in the table below:

Attribute     Use
fileName      Required. A regular expression pattern to identify files to be included.
baseDir       Required. The base directory (absolute path).
recursive     Whether to search directories recursively. Default is 'false'.
excludes      A regular expression pattern to identify files which will be excluded.
newerThan     A date in the format yyyy-MM-dd HH:mm:ss or a date math expression (such as NOW-2YEARS).
olderThan     A date, using the same formats as newerThan.
rootEntity    This should be set to false. This ensures that each row (filepath) emitted by this processor is considered to be a document.
dataSource    Must be set to null.

The example below shows the combination of the FileListEntityProcessor with another processor, which generates a set of fields from each file found.
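A minimal sketch of such a combination, built from the attributes in the table, could look like the following. The directory path, the file pattern, the entity names "f" and "nested", and the XPath expressions are assumptions chosen for illustration; the date math value is wrapped in single quotes on the assumption that the processor expects that form.

```xml
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <!-- Outer entity: lists the files that satisfy the given conditions.
         It uses no data source and is not the root entity. -->
    <entity name="f" processor="FileListEntityProcessor"
            fileName=".*xml"
            baseDir="/my/document/directory"
            recursive="true"
            newerThan="'NOW-30DAYS'"
            rootEntity="false"
            dataSource="null">
      <!-- Nested entity: reads each listed file through the implicit
           fileAbsolutePath field and extracts fields with XPath. -->
      <entity name="nested" processor="XPathEntityProcessor"
              forEach="/rootelement"
              url="${f.fileAbsolutePath}">
        <field column="name" xpath="/rootelement/name"/>
        <field column="number" xpath="/rootelement/number"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Here the outer entity only enumerates files, so the rows produced by the nested XPathEntityProcessor are what end up as indexed documents.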
