11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0


The RegexTransformer

The regex transformer helps in extracting or manipulating values from fields (from the source) using regular expressions. The actual class name is org.apache.solr.handler.dataimport.RegexTransformer, but because it belongs to the default package, the package name can be omitted.

The table below describes the attributes recognized by the regex transformer.

Attribute: Description

regex: The regular expression that is used to match against the column or sourceColName's value(s). If replaceWith is absent, each regex group is taken as a value and a list of values is returned.

sourceColName: The column on which the regex is to be applied. If not present, then the source and target are identical.

splitBy: Used to split a string. It returns a list of values. Note: this is a regular expression, so special characters in it may need to be escaped (e.g. via backslashes).

groupNames: A comma-separated list of field column names, used where the regex contains groups and each group is to be saved to a different field. If some groups are not to be named, leave a space between commas.

replaceWith: Used along with regex. It is equivalent to the method new String(<sourceColVal>).replaceAll(<regex>, <replaceWith>).

Here is an example of configuring the regex transformer:
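As a minimal sketch of such a configuration, the entity below applies the transformer to a single full_name column. The entity name foo, the query, and the two regex patterns (which assume values shaped like "Firstname: John Lastname: Doe") are illustrative assumptions; only the attribute names and the columns full_name, firstName, and lastName come from the text.

```xml
<!-- Illustrative sketch: the entity name, query, and regex patterns are assumptions. -->
<entity name="foo" transformer="RegexTransformer"
        query="select full_name from foo">
  <!-- The only column actually returned by the query. -->
  <field column="full_name"/>
  <!-- Derived fields: each regex is applied to full_name via sourceColName. -->
  <field column="firstName" regex="Firstname: (\w+)" sourceColName="full_name"/>
  <field column="lastName" regex="Lastname: (\w+)" sourceColName="full_name"/>
</entity>
```

A fragment like this would sit inside the document element of a Data Import Handler configuration file, alongside a data source definition that is not shown here.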

In this example, regex and sourceColName are custom attributes used by the transformer. The transformer reads the field full_name from the result set and transforms it into two new target fields, firstName and lastName. Even though the query returned only one column, full_name, in the result set, the Solr document gets two extra "derived" fields, firstName and lastName. These new fields are only created if the regex matches.
