31.07.2013 Views

MySQL Cluster Tutorial - cdn.oreillystatic.com

Fault tolerance

MySQL Cluster is designed to be very fault tolerant, although there are issues you should be aware of. This section covers how the cluster is fault tolerant and how your application should work with this.

MySQL Server

When using MySQL in general, temporary errors can happen, that is to say errors that may not occur if the transaction is retried. In MySQL Cluster this can happen more often than with other storage engines, for a variety of reasons such as timeouts, node failures or redo log problems. When this happens in MySQL Cluster, the transaction is implicitly rolled back and the error is returned to the client. It is up to the client application to retry the transaction.
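A client-side retry loop for such temporary errors might look like the following minimal sketch. It is written in Python for illustration only: the `DatabaseError` class is a hypothetical stand-in for whatever exception your driver raises, `run_with_retry` is a made-up helper name, and treating codes 1205 and 1297 as retryable is an assumption based on the errors discussed in this section, not an exhaustive list.

```python
import time

# Error codes that appear in this section; treating them as retryable is an
# assumption of this sketch, not a complete list of temporary errors.
RETRYABLE_CODES = {1205, 1297}


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error code."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=3, delay_seconds=0.1, sleep=time.sleep):
    """Call transaction() until it succeeds or the attempts are exhausted.

    The callable must reissue all of its statements on every call, because
    the failed attempt has already been rolled back by the cluster.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Give up on non-temporary errors or once the last attempt fails.
            if exc.errno not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            sleep(delay_seconds * attempt)  # simple linear back-off
```

The whole transaction is passed in as one callable so that a retry replays every statement from the start rather than only the one that failed.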

Sometimes the error message returned does not give enough detail to find the true cause of the failure, so it is a good idea to run 'SHOW WARNINGS' after an error to see other messages from the cluster. For example, if another transaction holds an exclusive lock on some rows, this could happen:

mysql> select * from t1;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

This is somewhat useful, but in MySQL Cluster five different things can cause this error. So:

mysql> show warnings\G
*************************** 1. row ***************************
Level: Error
Code: 1297
Message: Got temporary error 274 'Time-out in NDB, probably caused by deadlock' from NDB
*************************** 2. row ***************************
Level: Error
Code: 1297
Message: Got temporary error 274 'Time-out in NDB, probably caused by deadlock' from NDB
*************************** 3. row ***************************
Level: Error
Code: 1205
Message: Lock wait timeout exceeded; try restarting transaction
*************************** 4. row ***************************
Level: Error
Code: 1622
Message: Storage engine NDB does not support rollback for this statement. Transaction rolled back and must be restarted
4 rows in set (0.00 sec)

So the underlying error here is NDB error code 274. With that code it is now possible for the MySQL Cluster team to find the cause (274 is triggered when TransactionDeadLockDetectionTimeOut is exceeded during a scan operation).
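An application can pull the NDB error code out of the warning messages and log it alongside the MySQL error. The following is a minimal sketch in Python: the helper name `ndb_error_codes` is hypothetical, the rows are assumed to arrive as (level, code, message) tuples, and the pattern is based only on the message wording that appears in this section, which may differ between server versions.

```python
import re

# Pattern for the message text of error 1297 as it appears in this section.
NDB_ERROR_PATTERN = re.compile(r"Got temporary error (\d+) '(.*)' from NDB")


def ndb_error_codes(warnings):
    """Return the distinct NDB error codes found in SHOW WARNINGS rows.

    Each row is assumed to be a (level, code, message) tuple.
    """
    codes = []
    for _level, _code, message in warnings:
        match = NDB_ERROR_PATTERN.search(message)
        if match:
            ndb_code = int(match.group(1))
            if ndb_code not in codes:  # keep first occurrence only
                codes.append(ndb_code)
    return codes
```

Recording the NDB code next to the generic 1205 error keeps the detail that separates the different possible causes of a lock wait timeout.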

Heartbeats

The data nodes use a system of heartbeats in a ring to make sure that a node's neighbour is still up and running. The heartbeat interval is set using the HeartbeatIntervalDBDB setting, which is 1500 ms by default. A cluster node will be declared dead if it is not heard from

Copyright © 2010, Oracle and/or its affiliates. All rights reserved. 34/81
