15.07.2016 Views

MARKLOGIC SERVER

Inside-MarkLogic-Server




value but a latitude and longitude pair. Picture a long array of structures holding lat/long values and associated document IDs, sorted by latitude major and longitude minor, held in a memory-mapped file. (Latitude major and longitude minor means they're sorted first by latitude, then by longitude for points with the same latitude.)

Point queries can be easily resolved by finding the matching points within the pre-sorted index and extracting the corresponding document ID or IDs. Box queries (looking for matches between two latitude values and two longitude values) can be resolved by first finding the subsection of the geospatial index within the latitude bounds, then finding the sections within that range that also reside within the longitude bounds. 11
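
To make the layout concrete, here is a minimal Python sketch of a latitude-major, longitude-minor index. It assumes an in-memory sorted list in place of the memory-mapped file and integer document IDs. The names `build_index`, `point_query`, and `box_query` are illustrative only and are not MarkLogic APIs. The box query narrows to the latitude band with a binary search, then filters that band by longitude with a plain scan; the server's real longitude step may be more refined.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")


def build_index(points):
    """Sort (lat, lon, doc_id) tuples latitude-major, longitude-minor.

    Tuple ordering compares lat first, then lon, then doc_id, which is
    exactly the ordering described in the text.
    """
    return sorted(points)


def point_query(index, lat, lon):
    """Return the doc IDs stored at exactly (lat, lon)."""
    # (lat, lon) sorts just before every (lat, lon, doc_id) entry.
    lo = bisect_left(index, (lat, lon))
    # (lat, lon, INF) sorts just after every (lat, lon, doc_id) entry.
    hi = bisect_right(index, (lat, lon, INF))
    return [doc_id for _, _, doc_id in index[lo:hi]]


def box_query(index, south, north, west, east):
    """Return doc IDs with south <= lat <= north and west <= lon <= east."""
    # Step 1: binary search for the slice inside the latitude bounds.
    lo = bisect_left(index, (south,))
    hi = bisect_right(index, (north, INF))
    # Step 2 (simplified assumption): scan that slice for longitude matches.
    return [doc_id for _, lon, doc_id in index[lo:hi] if west <= lon <= east]


index = build_index([
    (10.0, 20.0, 1),
    (10.0, 20.0, 2),
    (10.0, 30.0, 3),
    (15.0, -5.0, 4),
    (40.0, 20.0, 5),
])
print(point_query(index, 10.0, 20.0))
print(box_query(index, 9.0, 16.0, 15.0, 25.0))
```

In this sketch the longitude filter touches every entry in the latitude band, which suggests why a tall, narrow box is the unfavorable case mentioned in footnote 11: the latitude bounds eliminate very little before the scan begins.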

For circle and polygon constraints, MarkLogic employs a high-speed comparator to determine if a given point in the range index resides inside or outside the circle or polygon constraint. The geospatial indexes use this comparator where a string-based range index would use a string collation comparator. The comparator can compare 1 million to 10 million points per second per core, allowing for a fast scan through the range index. The trick is to look northward or southward from any particular point, counting arc intersections with the bounding shape: an even number of intersections means the point is outside, odd means it's inside.
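
The even/odd counting rule can be illustrated with a short Python sketch. This is a planar simplification, not the server's comparator: it treats latitude and longitude as flat coordinates and polygon edges as straight segments, whereas the text speaks of arcs on the globe. The function name `point_in_polygon` is hypothetical.

```python
def point_in_polygon(lat, lon, polygon):
    """Even-odd test: look due north from (lat, lon) and count edge crossings.

    polygon is a list of (lat, lon) vertices in order; the last vertex is
    implicitly joined back to the first. Planar approximation only.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # The edge can only cross the point's meridian if its endpoints lie
        # on opposite sides of it. The strict/non-strict pairing avoids
        # counting a shared vertex twice.
        if (lon1 > lon) != (lon2 > lon):
            # Latitude at which the edge crosses the point's meridian.
            cross_lat = lat1 + (lon - lon1) * (lat2 - lat1) / (lon2 - lon1)
            if cross_lat > lat:
                # Crossing lies north of the point: flip the parity.
                inside = not inside
    return inside


square = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
print(point_in_polygon(5.0, 5.0, square))   # one crossing to the north
print(point_in_polygon(-5.0, 5.0, square))  # two crossings to the north
```

A point inside the square sees one edge to its north (odd, so inside), while a point south of the square sees two (even, so outside).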

As an accelerator for circles and polygons, MarkLogic uses a set of first-pass bounding boxes (a box or series of boxes that fully contain the circle or polygon) to limit the number of points that have to be run through the detailed comparator. A circle constraint thus doesn't require comparing every point, only those within the bounding box around the circle.
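
As a rough illustration of the first-pass idea, the Python sketch below computes a single bounding box around a circle, discards points outside the box cheaply, and runs the more expensive distance check only on the survivors. It assumes a spherical Earth with a mean radius of 6371 km, a haversine distance as the "detailed" check, and a circle that touches neither a pole nor the anti-meridian. All names here are hypothetical, and the candidate filter is a plain scan rather than an index lookup.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius (spherical-Earth simplification)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def circle_bounding_box(lat, lon, radius_km):
    """Return (south, north, west, east) fully containing the circle.

    Assumes the circle stays clear of the poles and the anti-meridian.
    """
    ang = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(ang)
    # Longitude half-width grows with latitude because meridians converge.
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def circle_query(points, lat, lon, radius_km):
    """Return (matching doc IDs, number of points given the detailed check)."""
    south, north, west, east = circle_bounding_box(lat, lon, radius_km)
    # First pass: cheap box test.
    candidates = [p for p in points if south <= p[0] <= north and west <= p[1] <= east]
    # Second pass: detailed distance test on the survivors only.
    hits = [doc_id for plat, plon, doc_id in candidates
            if haversine_km(lat, lon, plat, plon) <= radius_km]
    return hits, len(candidates)


points = [
    (48.8566, 2.3522, 1),   # circle center
    (48.9000, 2.4000, 2),   # a few km away
    (51.5074, -0.1278, 3),  # far outside the box
    (49.2000, 2.3500, 4),   # inside the circle
    (49.2766, 3.0022, 5),   # in the box's corner, outside the circle
]
hits, checked = circle_query(points, 48.8566, 2.3522, 50.0)
print(hits, checked)
```

The far-away point never reaches the distance calculation, and the corner point shows why the second pass is still needed: it passes the box test but lies outside the circle.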

Searches on certain parts of the globe complicate matters. The poles represent singularities where all longitude lines converge, so MarkLogic uses special trigonometry to help resolve searches there. For geospatial shapes that cross the anti-meridian (where longitude values switch from negative to positive), the server generates multiple smaller regions that don't cross this special boundary and unions the results.
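
The splitting step can be sketched for the simplest shape, a box. The Python below assumes a convention in which a box whose west bound is numerically greater than its east bound is understood to wrap across the anti-meridian; that convention and the function names are assumptions made for illustration, not a description of the server's internals.

```python
def split_at_antimeridian(south, north, west, east):
    """Split a box into pieces that never cross longitude +/-180.

    Assumed convention: west > east means the box wraps the anti-meridian.
    """
    if west <= east:
        return [(south, north, west, east)]
    return [
        (south, north, west, 180.0),   # piece on the positive-longitude side
        (south, north, -180.0, east),  # piece on the negative-longitude side
    ]


def box_query_wrapping(points, south, north, west, east):
    """Query each non-wrapping piece and union the matching doc IDs."""
    hits = set()
    for s, n, w, e in split_at_antimeridian(south, north, west, east):
        hits.update(
            doc_id for lat, lon, doc_id in points
            if s <= lat <= n and w <= lon <= e
        )
    return hits


points = [
    (-17.7, 178.0, 1),   # just west of the anti-meridian
    (-13.8, -172.1, 2),  # just east of the anti-meridian
    (-33.9, 151.2, 3),   # well outside the box
]
print(box_query_wrapping(points, -20.0, -10.0, 175.0, -170.0))
```

Each piece is an ordinary box, so it can be resolved with the same logic as any other box query before the results are combined.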

This subject is described in more detail in Geospatial Search Applications.

11 The worst-case performance on bounding boxes? A thin vertical slice.

