
WWW/Internet - Portal do Software Público Brasileiro



IADIS International Conference WWW/Internet 2010

algorithmic solutions, oriented to the efficient extraction of server status information from many servers.

In our experience, scalability at the network traffic level is not an issue comparable to the aforementioned problems. In order to keep the control enforcement process scalable, each management task must orchestrate the control using as few resources as possible. This implies the monitoring and control of only a limited subset of the entire system (this holds particularly true for short-time tasks). We choose not to store permanently each raw performance measure obtained from the subset, since this would be infeasible given the hundreds of time series that would need to be monitored. Instead, we treat each monitor as a stream of data which is filtered in a basic fashion and made directly available to the orchestrator. This operation is straightforward, since low-level system monitors such as vmstat and sar can be easily instrumented to pipe their output to different processes. For some proprietary systems, it is possible to exploit the SNMP querying capabilities offered by more sophisticated tools such as Cacti, Nagios, and Zenoss to extract the same information from several system components.

Since longer tasks often use the results of shorter tasks, we instrument the orchestrator to extract proper subsets of the filtered data (typically, the last n performance measures coming from the most important monitor probes) and to store them in a lightweight database. In this way, we considerably reduce the set of monitored data available to the mid-term and long-term tasks, and we provide the persistent data storage that is necessary to hold data spanning longer time ranges. A RAM database such as Firebird would fit well, since it provides excellent performance over a moderately high volume of stored data.
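The stream-and-store idea above can be illustrated with a minimal sketch. It assumes vmstat-style text output and uses Python's standard `sqlite3` module purely as a stand-in for the lightweight database (the text suggests Firebird). The names `parse_vmstat`, `LastNStore`, and the probe label used in the usage note are hypothetical and not taken from the paper.

```python
import sqlite3
from typing import Iterable, Iterator


def parse_vmstat(lines: Iterable[str]) -> Iterator[dict]:
    """Basic filter: yield one {column: int} dict per well-formed sample line."""
    header = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if "us" in fields and "id" in fields:
            header = fields              # column-name row, e.g. "r b swpd ... id wa st"
            continue
        if header is None or len(fields) != len(header):
            continue                     # banner row or malformed line: drop it
        try:
            yield dict(zip(header, map(int, fields)))
        except ValueError:
            continue                     # non-numeric sample: drop it


class LastNStore:
    """Keeps only the last n measures per probe (assumption: SQLite as the store)."""

    def __init__(self, conn: sqlite3.Connection, n: int) -> None:
        self.conn = conn
        self.n = n
        conn.execute(
            "CREATE TABLE IF NOT EXISTS measures ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, probe TEXT, value REAL)"
        )

    def add(self, probe: str, value: float) -> None:
        self.conn.execute(
            "INSERT INTO measures (probe, value) VALUES (?, ?)", (probe, value)
        )
        # Trim everything older than the newest n rows of this probe.
        self.conn.execute(
            "DELETE FROM measures WHERE probe = ? AND seq NOT IN ("
            "SELECT seq FROM measures WHERE probe = ? ORDER BY seq DESC LIMIT ?)",
            (probe, probe, self.n),
        )

    def last(self, probe: str) -> list:
        rows = self.conn.execute(
            "SELECT value FROM measures WHERE probe = ? ORDER BY seq", (probe,)
        ).fetchall()
        return [row[0] for row in rows]
```

In a live setup, `lines` could be the standard output of `subprocess.Popen(["vmstat", "1"], stdout=subprocess.PIPE, text=True)`, and a busy-CPU measure could be derived from each sample as `100 - sample["id"]` before calling `add("cpu_busy", ...)`.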
Our experience shows that this approach makes it possible to monitor tens of hardware and software resources per node, in a subsystem consisting of a few hundred nodes.

The mid-term management tasks take the filtered data, aggregate it, detect (and possibly correct) anomalies, and produce a meaningful representation of a subsystem's internal state, which is used to enforce control decisions aimed at improving the performance and optimizing the present behavior of the relevant subsystems. These operations can be implemented through any standard math environment (Matlab, R, Octave) or through mathematical libraries available for the most popular general-purpose languages (SciPy and NumPy in Python). With respect to the whole set of time series available from the monitors, the representations produced by mid-term management tasks are much more compact in size and are computed less often; thus, they can be stored in a DBMS such as MySQL, PostgreSQL, or Oracle.

Finally, the long-term management tasks perform more sophisticated tasks oriented to improving the performance and optimizing the behavior of the whole system in the future. To this purpose, they retrieve the state representations computed by the mid-term management tasks and perform sophisticated computations involving long-term predictions, what-if analysis, and capacity planning. The resulting models drive the decisions of the orchestration modules.

Our architecture is designed in such a way that the control actions can be implemented through off-the-shelf hardware and software components. For example, request dispatching and load balancing can be operated through standard setups based on Apache2 (through the mod_rewrite and mod_proxy modules), Tomcat (through AJP connectors in a clustered setup) and, more recently, nginx. There are several viable alternatives for the virtualization of services; the most popular are Xen (version 3.0 and above) and KVM (the hypervisor officially adopted by the Linux community).
All these solutions support (live) migration, dynamic resource re-allocation, ballooning, and paravirtualized device drivers for close-to-native performance. Another alternative is the use of resource containers (OpenVZ), which provide native performance at the expense of executing a shared operating system kernel for all running services.
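The mid-term management step described earlier (aggregate the filtered data, detect and possibly correct anomalies, produce a compact state representation) can be sketched as follows. The paper does not specify a detection method; this example assumes a median-based modified z-score with the commonly used 3.5 cutoff, and it uses only the standard `statistics` module so that it stays self-contained (SciPy or NumPy, as mentioned in the text, would serve equally well). The function names and the fields of the summary are hypothetical.

```python
from statistics import mean, median


def correct_anomalies(values, threshold=3.5):
    """Replace outliers with the window median (modified z-score; an assumption)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)   # median absolute deviation
    if mad == 0:
        return list(values)                      # no spread: nothing to flag
    cleaned = []
    for v in values:
        score = 0.6745 * abs(v - med) / mad
        cleaned.append(med if score > threshold else v)
    return cleaned


def summarize(values):
    """Reduce a window of measures to a compact state representation."""
    cleaned = correct_anomalies(values)
    return {
        "mean": mean(cleaned),
        "min": min(cleaned),
        "max": max(cleaned),
        "anomalies": sum(1 for raw, fixed in zip(values, cleaned) if raw != fixed),
    }
```

A window such as `[20, 22, 21, 19, 95, 20, 21]` would have its single spike replaced by the median before the summary is computed. The resulting dictionary is small enough to be written as one row in a relational DBMS, which matches the storage pattern the text describes for mid-term results.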
