31.07.2013 Views

MySQL Cluster Tutorial - cdn.oreillystatic.com


Scaling and Performance

MySQL Nodes

MySQL Cluster is not designed for single-threaded, single-query performance. In fact, you can be almost certain that it will perform worse in this scenario than any other storage engine. It is designed for many simultaneous queries.

By default, a mysqld server has only a single connection to the cluster for processing queries, which can become a bottleneck. There are two ways around this. First, you could add more mysqld nodes and have your application load-balance between them.
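Application-side load balancing can be as simple as rotating through the available mysqld nodes. The following is a minimal sketch, not a production connection pool: the host names and the `RoundRobinPool` class are hypothetical, and a real application would hand the chosen host to its MySQL client library and handle failed nodes.

```python
from itertools import cycle

# Hypothetical host names for the mysqld (SQL) nodes; replace with real ones.
MYSQLD_HOSTS = ["sql1.example.com", "sql2.example.com", "sql3.example.com"]


class RoundRobinPool:
    """Hands out mysqld hosts in turn so queries spread across all SQL nodes."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one mysqld host is required")
        self._hosts = cycle(hosts)

    def next_host(self):
        # Each call returns the next host, wrapping around at the end of the list.
        return next(self._hosts)


pool = RoundRobinPool(MYSQLD_HOSTS)
picked = [pool.next_host() for _ in range(4)]
print(picked)
```

The fourth pick wraps back to the first host, so over time each mysqld node receives roughly the same share of connections.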

Alternatively, you could use the --ndb-cluster-connection-pool setting to add more connections between mysqld and the cluster. Note that to do this you will need to add more [mysqld] slots to the cluster's config.ini to account for each connection in the pool.
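As a rough illustration, a pool of four connections might be configured as follows. This is a sketch under assumptions: the pool size of 4 and the management host name are made up, and it assumes the pool size is set in the SQL node's my.cnf while the matching slots are declared in the cluster configuration file (config.ini) read by the management node.

```ini
# --- my.cnf on the SQL node ---
[mysqld]
ndbcluster
# Hypothetical management host; use your own connect string.
ndb-connectstring=mgmhost.example.com
# Open four connections to the cluster instead of one.
ndb-cluster-connection-pool=4

# --- config.ini on the management node ---
# One [mysqld] slot per pooled connection (four in total for this SQL node).
[mysqld]
[mysqld]
[mysqld]
[mysqld]
```

Each pooled connection occupies its own node ID, so a cluster with several pooled mysqld servers needs a slot for every connection of every server.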

NDBAPI

Of course, using NDBAPI directly is going to give much better performance than using SQL. Not only do you eliminate the SQL overhead, but some things available in NDBAPI cannot be expressed in SQL.

Data Nodes

MySQL Cluster was originally designed for primary key equality lookups, and it excels at this. When running an SQL query with this kind of lookup, mysqld will try to go directly to the node holding the data, and the data will be retrieved very quickly.

Ordered index lookups are, in general, a lot slower. This is because the query has to be sent to all nodes to be processed simultaneously, and the results are then collected and sent back. Adding more nodes means more network traffic and more round trips, so performance can degrade as nodes are added when ordered indexes are used.
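The difference between the two access paths can be pictured with a toy routing model. This is only an illustrative sketch: the function names are hypothetical, the MD5-modulo mapping stands in for the cluster's real distribution function, and node groups and replicas are ignored.

```python
import hashlib


def node_for_key(primary_key, num_nodes):
    """Map a primary key to one node by hashing it (simplified stand-in)."""
    digest = hashlib.md5(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_nodes


def nodes_for_pk_lookup(primary_key, num_nodes):
    # A primary key equality lookup can be routed to the single owning node.
    return [node_for_key(primary_key, num_nodes)]


def nodes_for_ordered_index_scan(num_nodes):
    # An ordered index scan cannot be routed by hash, so every node is asked.
    return list(range(num_nodes))


for n in (2, 4, 8):
    pk_nodes = nodes_for_pk_lookup(42, n)
    scan_nodes = nodes_for_ordered_index_scan(n)
    print(n, "nodes: pk lookup contacts", len(pk_nodes),
          "| ordered index scan contacts", len(scan_nodes))
```

In this model the primary key lookup always contacts one node regardless of cluster size, while the number of nodes contacted by the scan grows with the cluster.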

Other Issues

Blobs

Blobs (text columns are blob columns too) in MySQL Cluster are stored using a hidden table.

Each blob is split into many rows in this table (how many depends on the size of the blob). This not only causes performance problems but can cause locking problems too, because MySQL needs to keep the blob table consistent with the main table.

Where possible, a large VARCHAR or VARBINARY column may be a better option.
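To get a feel for how many hidden-table rows a blob can occupy, the arithmetic can be sketched as below. The figures are assumptions rather than values taken from this tutorial: the sketch assumes the first 256 bytes are stored inline in the main table and the remainder is split into 2000-byte parts. Check the documentation for your version and column type before relying on them.

```python
import math


def blob_part_rows(blob_bytes, inline_bytes=256, part_bytes=2000):
    """Estimate the hidden-table rows needed for one blob value.

    inline_bytes and part_bytes are assumed defaults, not measured values.
    """
    if blob_bytes < 0:
        raise ValueError("blob_bytes must not be negative")
    # Only the bytes that do not fit inline spill into the hidden table.
    overflow = max(0, blob_bytes - inline_bytes)
    return math.ceil(overflow / part_bytes)


for size in (100, 257, 10_000, 1_000_000):
    print(size, "bytes ->", blob_part_rows(size), "part rows")
```

Under these assumptions a value that fits inline needs no extra rows, while a value of around a megabyte is spread over several hundred rows, each of which has to be written and read alongside the main row.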

Joins

At the moment, joins require a lot of overhead to process. The second table in the join is effectively queried once for every matching row in the first table, which adds up to a lot of network overhead.
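The cost pattern described above can be simulated with a small nested-loop join that counts how often the second table is queried. This is a toy model with made-up table contents and a hypothetical `nested_loop_join` helper; each dictionary lookup stands in for one request sent from mysqld to the data nodes.

```python
def nested_loop_join(outer_rows, inner_table, key):
    """Join outer_rows to inner_table, counting one lookup per outer row."""
    lookups = 0
    joined = []
    for row in outer_rows:
        lookups += 1  # one simulated round trip to the data nodes
        match = inner_table.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined, lookups


# Made-up sample data: three orders referring to two customers.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 11},
    {"order_id": 3, "customer_id": 10},
]
customers = {10: {"name": "Ann"}, 11: {"name": "Bob"}}

joined, lookups = nested_loop_join(orders, customers, "customer_id")
print(len(joined), "joined rows using", lookups, "lookups on the second table")
```

The number of lookups equals the number of matching rows in the first table, so a join over many rows turns into the same number of separate requests.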

There is a work-in-progress solution to this called pushed-down joins, in which the joins are processed directly in the data nodes. This is currently an in-development feature with no scheduled release date (or any guarantee that it will ever be released).

Copyright © 2010, Oracle and/or its affiliates. All rights reserved. 71/81
