11.05.2016 Views

Apache Solr Reference Guide Covering Apache Solr 6.0

For simple queries, the clustering time will usually dominate the fetch time. If the document content is very long, the retrieval of stored content can become a bottleneck. The performance impact of clustering can be lowered in several ways:

feed less content to the clustering algorithm by enabling the carrot.produceSummary attribute,

perform clustering on selected fields (titles only) to make the input smaller,

use a faster algorithm (STC instead of Lingo, Lingo3G instead of STC),

tune the performance attributes related directly to a specific algorithm.
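As a rough illustration of the first three techniques, the sketch below assembles request parameters for a lighter-weight clustering request. The field name "name" and the engine name "stc" are assumptions for the example; the engine must match one declared in your solrconfig.xml, and whether these attributes are set per request or in the handler defaults depends on your configuration.

```python
from urllib.parse import urlencode


def clustering_params(query, title_field="name", engine="stc"):
    """Build request parameters that keep the clustering input small."""
    return {
        "q": query,
        "clustering": "true",
        # Cluster on a single title-like field only (hypothetical field name).
        "carrot.title": title_field,
        # Feed highlighter-produced summaries instead of full field content.
        "carrot.produceSummary": "true",
        # Hypothetical engine name; assumed to be mapped to STC in solrconfig.xml.
        "clustering.engine": engine,
    }


params = clustering_params("electronics")
query_string = urlencode(params)
print(query_string)
```

The resulting query string can be appended to the URL of whichever request handler has the clustering component enabled.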

Some of these techniques are described in the Apache Solr and Carrot2 integration strategies document, available at http://carrot2.github.io/solr-integration-strategies. Improving performance is also covered in the Carrot2 manual at http://doc.carrot2.org/#section.advanced-topics.fine-tuning.performance.

Additional Resources

The following resources provide additional information about the clustering component in Solr and its potential applications.

Apache Solr and Carrot2 integration strategies: http://carrot2.github.io/solr-integration-strategies

Apache Solr Wiki (covers previous Solr versions, may be inaccurate): http://carrot2.github.io/solr-integration-strategies

Clustering and Visualization of Solr search results (video from Berlin BuzzWords conference, 2011): http://vimeo.com/26616444

Spatial Search

Solr supports location data for use in spatial/geospatial searches. Using spatial search, you can:

Index points or other shapes

Filter search results by a bounding box or circle or by other shapes

Sort or boost scoring by distance between points, or relative area between rectangles

Generate a 2D grid of facet count numbers for heatmap generation or point-plotting

There are three main field types available for spatial search:

LatLonType and its non-geodetic twin PointType

SpatialRecursivePrefixTreeFieldType (RPT for short), including RptWithGeometrySpatialField, a derivative

BBoxField

RPT offers more features than LatLonType and fast filter performance, although LatLonType is more appropriate when efficient distance sorting/boosting is desired. Both can be used simultaneously for what each does best: LatLonType for sorting/boosting, RPT for filtering. If you need to index shapes other than points (e.g. a circle or polygon), use RPT.
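To make the "use both" advice concrete, here is a minimal sketch of request parameters that filter on an RPT field and sort by distance on a LatLonType field. The field names geo_rpt and geo_latlon are hypothetical, and the example assumes the standard geofilt query parser and geodist() function, with the filter's sfield given as a local parameter so that it overrides the global sfield used for sorting.

```python
def filter_and_sort_params(lat, lon, radius_km,
                           rpt_field="geo_rpt", latlon_field="geo_latlon"):
    """Filter with an RPT field, sort by distance with a LatLonType field."""
    point = f"{lat},{lon}"  # latitude first, then longitude
    return [
        ("q", "*:*"),
        # The RPT field does the filtering (local sfield overrides the global one).
        ("fq", f"{{!geofilt sfield={rpt_field}}}"),
        # The LatLonType field is the global sfield, used by geodist() for sorting.
        ("sfield", latlon_field),
        ("pt", point),
        ("d", str(radius_km)),
        ("sort", "geodist() asc"),
    ]


params = filter_and_sort_params(45.15, -93.85, 5)
for key, value in params:
    print(f"{key}={value}")
```

This assumes the same location is indexed into both fields, for example through a copyField rule in the schema.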

BBoxField is for indexing bounding boxes, querying by a box, specifying a search predicate (Intersects, Within, Contains, Disjoint, Equals), and applying a relevancy sort/boost such as overlapRatio or simply the area.
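A query against a BBoxField combines a predicate with a query box. The sketch below builds such a query string under a few assumptions: the field is named bbox (hypothetical), the query is issued through the field query parser, and the box is written as ENVELOPE(minX, maxX, maxY, minY). The helper function itself is illustrative and not part of Solr.

```python
PREDICATES = ("Intersects", "Within", "Contains", "Disjoint", "Equals")


def bbox_query(field, predicate, min_x, max_x, max_y, min_y, score=None):
    """Build a query string for a BBoxField (hypothetical helper)."""
    if predicate not in PREDICATES:
        raise ValueError(f"unknown predicate: {predicate}")
    local = f"!field f={field}"
    if score:
        # Optional relevancy mode, e.g. overlapRatio or area.
        local += f" score={score}"
    # Note the ENVELOPE argument order: minX, maxX, maxY, minY.
    envelope = f"ENVELOPE({min_x}, {max_x}, {max_y}, {min_y})"
    return f"{{{local}}}{predicate}({envelope})"


plain = bbox_query("bbox", "Intersects", -10, 20, 15, 10)
scored = bbox_query("bbox", "Intersects", -10, 20, 15, 10, score="overlapRatio")
print(plain)
print(scored)
```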

Some details that are not in this guide can be found at http://wiki.apache.org/solr/SpatialSearch.

Indexing and Configuration

For indexing geodetic points (latitude and longitude), supply the pair of numbers as a string with a comma separating them, in latitude then longitude order. For non-geodetic points, the order is x,y for PointType; for RPT you must use a space instead of a comma, or use WKT.
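The formatting rules above can be sketched as a few small helpers that produce the string value to put in a document field. The function names are made up for this example, and the range checks are an added convenience rather than something Solr requires from the client.

```python
def geodetic_point(lat, lon):
    """Geodetic point: "lat,lon", latitude first."""
    if not -90 <= lat <= 90:
        raise ValueError("latitude out of range")
    if not -180 <= lon <= 180:
        raise ValueError("longitude out of range")
    return f"{lat},{lon}"


def pointtype_xy(x, y):
    """Non-geodetic PointType value: "x,y"."""
    return f"{x},{y}"


def rpt_xy(x, y):
    """Non-geodetic RPT value: "x y" (space, not comma)."""
    return f"{x} {y}"


def wkt_point(x, y):
    """WKT alternative for RPT: POINT(x y)."""
    return f"POINT({x} {y})"


print(geodetic_point(45.15, -93.85))
print(pointtype_xy(3, 4))
print(rpt_xy(3, 4))
print(wkt_point(3, 4))
```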
