MySQL Cluster Tutorial - cdn.oreillystatic.com
Fault tolerance<br />
<strong>MySQL</strong> <strong>Cluster</strong> is designed to be very fault tolerant, although there are issues you should<br />
be aware of. This section covers how the cluster achieves fault tolerance and how your application<br />
should work with it.<br />
<strong>MySQL</strong> Server<br />
With <strong>MySQL</strong> in general, temporary errors can happen, that is to say errors that may<br />
not occur if the transaction is retried. In <strong>MySQL</strong> <strong>Cluster</strong> this can happen more often than<br />
with other storage engines for a variety of reasons, such as timeouts, node failures or redo log<br />
problems. When this happens in <strong>MySQL</strong> <strong>Cluster</strong>, the transaction is implicitly<br />
rolled back and the error is returned to the client. It is up to the client application to<br />
retry the transaction.<br />
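The retry responsibility described above can be sketched on the client side as follows. This is a minimal sketch, not a definitive implementation: TemporaryError and run_with_retry are hypothetical names, and a real application would map its own driver's error codes (for example 1205 or 1297) onto such an exception.<br />

```python
import time


class TemporaryError(Exception):
    """Hypothetical stand-in for a driver error that is marked as temporary."""


def run_with_retry(transaction, max_attempts=3, delay_seconds=0.0):
    """Call transaction() until it succeeds or max_attempts is reached.

    transaction is a callable that performs the whole unit of work
    (begin, statements, commit) and raises TemporaryError on a
    retryable failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TemporaryError:
            # The cluster has already rolled the transaction back,
            # so the whole unit of work is run again from the start.
            if attempt == max_attempts:
                raise
            # Wait a little longer after each failed attempt.
            time.sleep(delay_seconds * attempt)
```

The whole transaction is passed in as one callable because the implicit rollback discards every statement in it, not only the one that failed.<br />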
Sometimes the error message returned will not give enough detail to find the true cause of<br />
the failure, so it is a good idea to run 'SHOW WARNINGS' after an error to see other messages<br />
from the cluster. For example, if another transaction holds an exclusive lock on some rows, this could<br />
happen:<br />
mysql> select * from t1;<br />
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction<br />
This is somewhat useful, but in <strong>MySQL</strong> <strong>Cluster</strong> five different conditions can cause this error.<br />
So:<br />
mysql> show warnings\G<br />
*************************** 1. row ***************************<br />
Level: Error<br />
Code: 1297<br />
Message: Got temporary error 274 'Time-out in NDB, probably caused by deadlock' from NDB<br />
*************************** 2. row ***************************<br />
Level: Error<br />
Code: 1297<br />
Message: Got temporary error 274 'Time-out in NDB, probably caused by deadlock' from NDB<br />
*************************** 3. row ***************************<br />
Level: Error<br />
Code: 1205<br />
Message: Lock wait timeout exceeded; try restarting transaction<br />
*************************** 4. row ***************************<br />
Level: Error<br />
Code: 1622<br />
Message: Storage engine NDB does not support rollback for this statement. Transaction rolled back and must be restarted<br />
4 rows in set (0.00 sec)<br />
The underlying error here is therefore NDB error code 274. With this code the <strong>MySQL</strong> Cluster team can<br />
find the cause (274 is raised when TransactionDeadlockDetectionTimeout<br />
is exceeded during a scan operation).<br />
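An application can pull the NDB error code out of the 'SHOW WARNINGS' rows in the same way. The sketch below assumes the rows are available as (level, code, message) tuples and that the message follows the wording shown in the output above; ndb_errors is a hypothetical helper name, not part of any MySQL client library.<br />

```python
import re

# Matches messages such as:
#   Got temporary error 274 'Time-out in NDB, ...' from NDB
# The pattern is an assumption based on the message text shown above.
NDB_MESSAGE = re.compile(r"Got (temporary )?error (\d+) '(.*)' from NDB")


def ndb_errors(warnings):
    """Return unique (ndb_code, text, is_temporary) tuples, in order.

    warnings is an iterable of (level, code, message) tuples, one per
    row of SHOW WARNINGS.
    """
    found = []
    for _level, _code, message in warnings:
        # Collapse line breaks in case the message was wrapped.
        match = NDB_MESSAGE.search(" ".join(message.split()))
        if match is None:
            continue
        entry = (int(match.group(2)), match.group(3), match.group(1) is not None)
        if entry not in found:
            found.append(entry)
    return found
```

Duplicate rows are collapsed because, as in the output above, the same NDB error can be reported more than once for a single statement.<br />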
Heartbeats<br />
The data nodes are arranged in a ring and exchange heartbeats, so that each node can check that its neighbour is<br />
still up and running. The heartbeat interval is set using the HeartbeatIntervalDbDb setting<br />
which is 1500ms by default. A cluster node will be declared dead if it is not heard from<br />
Copyright © 2010, Oracle and/or its affiliates. All rights reserved. 34/81