value but a latitude and longitude pair. Picture a long array of structures holding lat/<br />
long values and associated document IDs, sorted by latitude major and longitude minor,<br />
held in a memory-mapped file. (Latitude major and longitude minor means they're<br />
sorted first by latitude, then by longitude for points with the same latitude.)<br />
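The ordering described above can be sketched in a few lines of Python. This is only an illustration: the `GeoEntry` record, the `build_index` helper, and the in-memory list are assumptions standing in for the memory-mapped array of structures, not MarkLogic's actual layout.

```python
from typing import NamedTuple


class GeoEntry(NamedTuple):
    """Hypothetical index record: one point plus the document that holds it."""
    lat: float
    lon: float
    doc_id: int


def build_index(entries):
    # Sorting on the (lat, lon) pair compares latitude first and falls back
    # to longitude only on ties: latitude major, longitude minor.
    return sorted(entries, key=lambda e: (e.lat, e.lon))


index = build_index([
    GeoEntry(37.77, -122.42, 3),
    GeoEntry(37.77, -123.00, 1),
    GeoEntry(34.05, -118.24, 2),
])

for entry in index:
    print(entry)
```

The two entries that share latitude 37.77 end up adjacent and ordered by longitude, which is the property the point and box lookups rely on.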
Point queries can be easily resolved by finding the matching points within the pre-sorted<br />
index and extracting the corresponding document ID or IDs. Box queries (looking for<br />
matches between two latitude values and two longitude values) can be resolved by first<br />
finding the subsection of the geospatial index within the latitude bounds, then finding<br />
the sections within that range that also reside within the longitude bounds.<sup>11</sup><br />
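As a rough sketch of how both query types could run against such a sorted array, the following uses binary search from Python's standard `bisect` module. The `INDEX` list, the `(lat, lon, doc_id)` tuple layout, and the function names are assumptions made for illustration; the server's real structures are not shown in this text.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index: (lat, lon, doc_id) tuples, latitude major, longitude minor.
INDEX = sorted([
    (10.0, 20.0, 1),
    (10.0, 25.0, 2),
    (12.0, 5.0, 3),
    (12.0, 22.0, 4),
    (15.0, 21.0, 5),
])

NEG_INF, POS_INF = float("-inf"), float("inf")


def point_query(index, lat, lon):
    # Binary search for the run of entries that match the point exactly.
    lo = bisect_left(index, (lat, lon, NEG_INF))
    hi = bisect_right(index, (lat, lon, POS_INF))
    return [doc for _, _, doc in index[lo:hi]]


def box_query(index, south, north, west, east):
    # Step 1: narrow to the slice of the index inside the latitude bounds.
    lo = bisect_left(index, (south, NEG_INF, NEG_INF))
    hi = bisect_right(index, (north, POS_INF, POS_INF))
    results = []
    i = lo
    while i < hi:
        lat = index[i][0]
        # Entries sharing this latitude form a run that is sorted by longitude.
        run_end = bisect_right(index, (lat, POS_INF, POS_INF), i, hi)
        # Step 2: within the run, keep the section inside the longitude bounds.
        a = bisect_left(index, (lat, west, NEG_INF), i, run_end)
        b = bisect_right(index, (lat, east, POS_INF), a, run_end)
        results.extend(doc for _, _, doc in index[a:b])
        i = run_end
    return results


print(point_query(INDEX, 10.0, 25.0))
print(box_query(INDEX, 10.0, 12.0, 19.0, 23.0))
```

In this sketch the cost of a box query grows with the number of distinct latitudes inside the band, so a tall, narrow box visits many latitude runs while keeping few points from each.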
For circle and polygon constraints, MarkLogic employs a high-speed comparator to<br />
determine if a given point in the range index resides inside or outside the circle or<br />
polygon constraint. The geospatial indexes use this comparator where a string-based<br />
range index would use a string collation comparator. The comparator can compare 1<br />
million to 10 million points per second per core, allowing for a fast scan through the<br />
range index. The trick is to look northward or southward from any particular point,<br />
counting arc intersections with the bounding shape: an even number of intersections<br />
means the point is outside, odd means it's inside.<br />
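The even/odd counting rule can be illustrated with a simplified planar version. This sketch assumes the polygon edges are straight lines in latitude/longitude space rather than arcs on the globe, and the function name `point_in_polygon` is hypothetical; it shows the parity idea only, not the server's comparator.

```python
def point_in_polygon(lat, lon, polygon):
    """Even-odd test: look north from the point and count edge crossings.

    polygon is a list of (lat, lon) vertices. Edges are treated as straight
    segments in the lat/lon plane, which is an assumption of this sketch.
    """
    crossings = 0
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # The edge can only cross the point's meridian if its endpoints lie
        # on opposite sides of it. The half-open comparison keeps a shared
        # vertex from being counted twice and guarantees lon1 != lon2 below.
        if (lon1 <= lon) != (lon2 <= lon):
            t = (lon - lon1) / (lon2 - lon1)
            cross_lat = lat1 + t * (lat2 - lat1)
            if cross_lat > lat:  # the crossing lies north of the point
                crossings += 1
    # Odd number of crossings: inside. Even: outside.
    return crossings % 2 == 1


square = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
print(point_in_polygon(5.0, 5.0, square))
print(point_in_polygon(15.0, 5.0, square))
```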
As an accelerator for circles and polygons, MarkLogic uses a set of first-pass bounding<br />
boxes (a box or series of boxes that fully contain the circle or polygon) to limit the<br />
number of points that have to be run through the detailed comparator. A circle<br />
constraint thus doesn't require comparing every point, only those within the bounding<br />
box around the circle.<br />
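A minimal sketch of the bounding-box first pass for a circle follows. It assumes a spherical Earth, uses the haversine formula as a stand-in for the detailed comparator, and assumes the circle reaches neither a pole nor the anti-meridian; `POINTS` and all function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is assumed here


def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance, used here in place of the detailed comparator.
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def circle_bounding_box(lat, lon, radius_km):
    # Smallest lat/lon box that fully contains the circle, assuming the
    # circle touches neither a pole nor the anti-meridian.
    d = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(d)
    dlon = math.degrees(math.asin(math.sin(d) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def circle_query(points, lat, lon, radius_km):
    south, north, west, east = circle_bounding_box(lat, lon, radius_km)
    hits = []
    for p_lat, p_lon, doc_id in points:
        # Cheap first pass: discard anything outside the bounding box.
        if not (south <= p_lat <= north and west <= p_lon <= east):
            continue
        # Detailed check only for the points that survived the first pass.
        if haversine_km(lat, lon, p_lat, p_lon) <= radius_km:
            hits.append(doc_id)
    return hits


POINTS = [
    (48.8566, 2.3522, 1),   # the circle's center
    (48.8000, 2.1300, 2),   # inside the circle
    (49.0566, 2.6522, 3),   # inside the box corner, outside the circle
    (51.5074, -0.1278, 4),  # far outside the box
]
print(circle_query(POINTS, 48.8566, 2.3522, 25.0))
```

Point 3 shows why the second pass is still needed: it passes the box test but lies beyond the radius.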
Searches on certain parts of the globe complicate matters. The poles represent<br />
singularities where all longitude lines converge, so MarkLogic uses special trigonometry<br />
to help resolve searches there. For geospatial shapes that cross the anti-meridian (where<br />
longitude values switch from negative to positive), the server generates multiple smaller<br />
regions that don't cross this special boundary and unions the results.<br />
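The split-and-union idea for the anti-meridian can be sketched for the simple case of a box. The convention that a box with `west > east` wraps across ±180 degrees is an assumption of this example, as are the function names and the linear scan over `points`.

```python
def split_at_antimeridian(south, north, west, east):
    """Return one or two boxes, none of which crosses the anti-meridian.

    Assumed convention: west > east means the box wraps across +/-180.
    """
    if west <= east:
        return [(south, north, west, east)]
    return [
        (south, north, west, 180.0),   # from the western edge up to +180
        (south, north, -180.0, east),  # from -180 on to the eastern edge
    ]


def wrapped_box_query(points, south, north, west, east):
    hits = set()
    for s, n, w, e in split_at_antimeridian(south, north, west, east):
        # Each sub-box is an ordinary query; the results are unioned.
        hits |= {doc for lat, lon, doc in points
                 if s <= lat <= n and w <= lon <= e}
    return sorted(hits)


points = [
    (-17.0, 178.0, 1),   # just west of the anti-meridian
    (-17.5, -179.0, 2),  # just east of it
    (-17.2, 0.0, 3),     # on the other side of the globe
]
print(wrapped_box_query(points, -20.0, -15.0, 170.0, -170.0))
```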
This subject is described in more detail in Geospatial Search Applications.
11 The worst-case performance on bounding boxes? A thin vertical slice.<br />