15.07.2016 Views

MARKLOGIC SERVER

Inside-MarkLogic-Server

Inside-MarkLogic-Server

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

concurrently), spreading the load on the target with a load balancer (so more E-nodes<br />

can participate), and buying a bigger network pipe between clusters (speeding<br />

the delivery).<br />

For more information about Flexible Replication, see the Flexible Replication Guide<br />

and the flexrep functions documentation.<br />

QUERY-BASED FLEXIBLE REPLICATION<br />

Flexible Replication can also be combined with MarkLogic's alerting if you need<br />

fine-grained control over what documents get replicated to what nodes as well as who<br />

gets to see what based on permissions. This feature is known as Query-Based Flexible<br />

Replication (QBFR). QBFR is useful in the case of heterogeneous networks where large<br />

central servers are mixed with smaller systems, such as laptop computers out in the<br />

field. All of the systems run MarkLogic server, but the laptops only require a subset of<br />

the documents that are stored on the main servers. Use cases for QBFR include military<br />

deployments, where field units only need information specific to their location, and<br />

scientific applications, where remote researchers only need data related to their field<br />

of study.<br />

As described previously, alerting lets you define queries that specify which documents in<br />

a data set a user is interested in. When used in QBFR, the queries define the documents<br />

to be replicated from the master clusters to the replica clusters (which could be simple<br />

one-node clusters as in the laptop example). For example, queries in the military<br />

example might define geospatial coordinates, while queries in the scientific example<br />

might define technical keywords. Each replica cluster in the field has a query associated<br />

with it; when a document is inserted or updated on the central master cluster, a reverse<br />

query is run to determine which of the stored queries match the document. The<br />

matching queries determine to which clusters the document is replicated. Each stored<br />

query is also associated with a user, allowing role-based control of replication.<br />

Pull-based replication is typically the preferred method of operation with QBFR. This<br />

allows field clusters that might have times of limited or no connectivity to retrieve<br />

documents when they have network access. Replication can also be bi-directional.<br />

A field cluster can act as a master when it connects and transfer any new data it has<br />

collected back to the home base. For example, military field units might send back<br />

reconnaissance data, with the data's distribution controlled by additional queries.<br />

REBALANCING<br />

Rebalancing in MarkLogic redistributes content in a database so that the forests that<br />

make up the database each have a similar number of documents. Spreading documents<br />

evenly across forests lets you take better advantage of concurrency among hosts. A<br />

110

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!