Erlang and OTP in Action.pdf - Synrc

processes within a certain period of time before it ought to give up, and more. All you need to do is to provide some parameters and hooks.

But a system should not be structured as just a single-level hierarchy of supervisors and workers. In any complex system, you will want a supervision tree, with multiple layers, that allows subsystems to be restarted at different levels.

1.2.3 – Layering processes for fault tolerance

Layering brings related subsystems together under a common supervisor. More importantly, it defines different levels of working base states that we can revert to. In the diagram below, you can see that there are two distinct groups of worker processes, A and B, supervised separately from one another. These two groups and their supervisors together form a larger group C, under yet another supervisor higher up in the tree.

Figure illustrating a layered system of supervisors and workers
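
The layering described above can be sketched with the standard `supervisor` behaviour. This is only a minimal illustration, not code from the book: the module name `layered_sup`, the registered names `group_a_sup`, `group_b_sup` and `group_c_sup`, the worker ids, and the restart limits are all assumptions made for the example, and the map-based child specifications assume Erlang/OTP 18 or later. The workers are placeholder processes that do nothing but wait for messages.

```erlang
-module(layered_sup).
-behaviour(supervisor).

-export([start_link/0, start_worker/0]).
-export([init/1]).

%% Start the top-level supervisor (group C), which in turn starts the
%% supervisors for groups A and B.
start_link() ->
    supervisor:start_link({local, group_c_sup}, ?MODULE, group_c).

%% One callback module serves all three supervisors (an assumption made to
%% keep the sketch in one file); the init argument selects the layer.
init(group_c) ->
    %% Hypothetical limit: give up after more than 1 restart in 5 seconds.
    Flags = #{strategy => one_for_one, intensity => 1, period => 5},
    {ok, {Flags, [group_spec(group_a_sup, group_a),
                  group_spec(group_b_sup, group_b)]}};
init(Group) when Group =:= group_a; Group =:= group_b ->
    %% Hypothetical limit: give up after more than 3 restarts in 10 seconds.
    Flags = #{strategy => one_for_one, intensity => 3, period => 10},
    {ok, {Flags, [worker_spec(worker_1), worker_spec(worker_2)]}}.

%% Child specification for a group supervisor living under group C.
group_spec(Name, Group) ->
    #{id => Name,
      start => {supervisor, start_link, [{local, Name}, ?MODULE, Group]},
      type => supervisor}.

%% Child specification for a placeholder worker.
worker_spec(Id) ->
    #{id => Id, start => {?MODULE, start_worker, []}, type => worker}.

%% A supervisor expects the start function to return {ok, Pid} for a
%% process that is linked to it.
start_worker() ->
    {ok, proc_lib:spawn_link(fun worker_loop/0)}.

%% Placeholder worker: ignores every message and keeps waiting.
worker_loop() ->
    receive _Any -> worker_loop() end.
```

After `layered_sup:start_link()`, calling `supervisor:which_children(group_a_sup)` lists the workers of group A. Killing one of them with `exit(Pid, kill)` causes only that worker to be restarted; the processes in group B keep their original pids.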

Let’s assume that the processes in group A work together to produce a stream of data that group B consumes. Group B, however, is not required for group A to function. Just to make things concrete, let’s say group A is processing and encoding multimedia data, while group B presents it. Let’s further suppose that a small percentage of the data entering group A is corrupt in some way not predicted at the time the application was written.

This malformed data causes a process within group A to malfunction. Following the “let it crash” philosophy, that process dies immediately without so much as trying to untangle the mess, and because processes are isolated, none of the other processes are affected by the bad input. The supervisor, detecting that a process has died, restores the base state we prescribed for group A, and the system picks up from a known point. The beauty of this is that group B, the presentation system, has no idea that this is going on, and really does not care. So long as group A pushes enough good data to group B for the latter to display something of acceptable quality to the user, we have a successful system.
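
To make the “let it crash” scenario concrete, here is a small sketch under stated assumptions; it is not taken from the book. The module name `crash_demo`, the registered names `encoder` and `crash_demo_sup`, and the frame format (one length byte followed by exactly that many payload bytes) are all invented for illustration, and the map-based specifications assume Erlang/OTP 18 or later. The worker contains no defensive code: a corrupt frame fails a pattern match, the process dies, and its supervisor starts a fresh one.

```erlang
-module(crash_demo).
-behaviour(supervisor).

-export([start_link/0, encode/1, start_encoder/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, crash_demo_sup}, ?MODULE, []).

init([]) ->
    %% Hypothetical limit: give up after more than 3 restarts in 10 seconds.
    Flags = #{strategy => one_for_one, intensity => 3, period => 10},
    Encoder = #{id => encoder, start => {?MODULE, start_encoder, []}},
    {ok, {Flags, [Encoder]}}.

%% Runs in the supervisor on every (re)start: spawn a fresh, linked worker
%% and register it under a well-known name.
start_encoder() ->
    Pid = proc_lib:spawn_link(fun encoder_loop/0),
    true = register(encoder, Pid),
    {ok, Pid}.

encoder_loop() ->
    receive
        {encode, From, Frame} ->
            %% A frame whose length byte does not match its payload fails
            %% this match, and the process crashes with a badmatch error.
            <<Len, Payload:Len/binary>> = Frame,
            From ! {encoded, Payload},
            encoder_loop()
    end.

%% Client side: send a frame and wait briefly for the result.
encode(Frame) ->
    encoder ! {encode, self(), Frame},
    receive
        {encoded, Payload} -> {ok, Payload}
    after 500 ->
        {error, timeout}
    end.
```

With the supervisor running, `crash_demo:encode(<<3, "abc">>)` returns `{ok, <<"abc">>}`. A corrupt frame such as `<<9, "abc">>` kills the worker, so the caller gets `{error, timeout}` and a crash report is logged, but later calls succeed because the supervisor has already put a new `encoder` process in place.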

By isolating independent parts of our system and organizing them into a supervision tree, we can create little subsystems that can be individually restarted, in fractions of a second, to keep our system chugging along even in the face of unpredicted errors. If group A fails to restart properly, its supervisor might eventually give up and escalate the problem to the supervisor of the entire group C, which might then in a case like this decide to shut down B as well and call it a day. If you imagine that our system is in fact running hundreds of

©Manning Publications Co. Please post comments or corrections to the Author Online forum:
http://www.manning-sandbox.com/forum.jspa?forumID=454
