01.12.2012 Views

Architecture of Computing Systems (Lecture Notes in Computer ...


164 M.F. Dolz et al.

The module queries the SGE queue system to collect information on the current jobs, nodes and queues (qstat, qmod and qhost commands). This information is then used to compile the necessary statistics and apply the power saving policy defined by the system administrator. The module also runs several daemons implemented in Python [16]. These daemons maintain a MySQL database that contains all the information and statistics; they also query the “cluster” database used by Rocks® to extract information about the nodes (e.g., their MAC addresses) in order to power them on remotely using WakeOnLAN (WOL) [17].
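To illustrate the remote power-on step, the following is a minimal sketch of how a WOL magic packet could be built and broadcast from Python. It is not taken from the module itself: the function names, the broadcast address, and the UDP port 9 are assumptions chosen for illustration. Only the packet layout (six 0xFF bytes followed by sixteen copies of the target MAC address) is the standard magic packet format.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the Wake-on-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"unexpected MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes followed by sixteen repetitions of the 6-byte MAC address.
    return b"\xff" * 6 + mac_bytes * 16


def wake_node(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

In such a sketch, a daemon would look up the MAC address of a powered-off node in the “cluster” database and pass it to `wake_node`.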

The nodes of the cluster have their BIOS configured with WOL (WakeUp events). Systems that support the PCI 2.2 standard in conjunction with a compatible PCI network card usually do not require a WOL cable because energy is provided through the PCI bus.

3 Implementation of the Energy Saving Roll

In this section, we describe the energy saving module in detail; see Figure 1. The module includes the following major components:

– Three daemons in charge of managing the database, collecting statistics, and executing the commands that power on and shut down the nodes.

– The database that stores all information necessary to make decisions.

– The website interface to configure and administer users’ groups as well as set the threshold triggers that define the power saving policy.

We have chosen a modular design, mapping the main functions of the system to daemons (controlling the queue system, collecting statistics, and applying the policies for activating and deactivating nodes). Moreover, we have decided to employ a database to ease data mining via SQL. The user interface is web-oriented, integrates seamlessly with Rocks®, and facilitates remote access and administration.
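As an illustration of the kind of data mining that SQL makes easy, the sketch below aggregates per-queue averages with a single query. It is only a sketch under stated assumptions: the table `jobs` and its columns (`owner`, `queue`, `wait_s`, `run_s`) are hypothetical, since the actual schema is not given here, and Python's built-in `sqlite3` module stands in for MySQL so that the example is self-contained.

```python
import sqlite3


def queue_averages(conn: sqlite3.Connection) -> dict:
    """Return {queue: (avg_wait_s, avg_run_s)} from the hypothetical 'jobs' table."""
    rows = conn.execute(
        "SELECT queue, AVG(wait_s), AVG(run_s) FROM jobs GROUP BY queue"
    )
    return {queue: (avg_wait, avg_run) for queue, avg_wait, avg_run in rows}


# In-memory database with made-up rows, used only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (owner TEXT, queue TEXT, wait_s REAL, run_s REAL)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?)",
    [
        ("alice", "short.q", 10.0, 100.0),
        ("bob", "short.q", 30.0, 300.0),
        ("alice", "long.q", 60.0, 3600.0),
    ],
)
stats = queue_averages(conn)
```

Keeping the aggregation in the database means that a daemon or the web interface can obtain such statistics without re-parsing the queue system's output.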

3.1 Daemons<br />

Daemon for epilogue requests. A node of the cluster runs an epilogue script provided by the SGE queue system when a job completes its execution and therefore leaves the queue. This script receives parameters from the SGE executor daemon which are essential for monitoring the cluster and, therefore, for implementing the energy saving policy. As the database is located in the front-end node, it is necessary to send this set of parameters through the network. For this purpose, the node that executes the epilogue script opens a connection via a TCP socket with the epilogue daemon that runs on the front-end node to pass the necessary information.
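The exchange between the epilogue script and the epilogue daemon could look roughly like the following sketch. The wire format (one JSON object per line, answered with `OK`), the parameter names `job_id`, `owner` and `queue`, and the helper names are all assumptions made for illustration; the text only states that the parameters travel over a TCP socket to the front-end node.

```python
import json
import socket
import socketserver
import threading

received = []  # stands in for the database updates done by the real daemon


class EpilogueHandler(socketserver.StreamRequestHandler):
    """Front-end side: read one JSON line of job parameters per connection."""

    def handle(self):
        line = self.rfile.readline()
        received.append(json.loads(line))
        self.wfile.write(b"OK\n")


def send_epilogue(host: str, port: int, params: dict) -> str:
    """Compute-node side: send the job parameters and return the daemon's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(json.dumps(params).encode() + b"\n")
        with sock.makefile("r") as reply:
            return reply.readline().strip()


# Local demonstration: port 0 lets the operating system pick a free port.
server = socketserver.TCPServer(("127.0.0.1", 0), EpilogueHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
reply = send_epilogue(
    "127.0.0.1",
    server.server_address[1],
    {"job_id": 42, "owner": "alice", "queue": "short.q"},
)
server.shutdown()
server.server_close()
```

In a deployment along these lines, the handler would run permanently on the front-end node, and `send_epilogue` would be called from the epilogue script on each compute node.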

The epilogue daemon employs this information to perform a series of updates in the energy saving module database, extracting data from the accounting file maintained by the queue system. The updated data comprise the number of jobs for the user responsible for this job, this user’s average execution time, and the queue’s average waiting and execution times.
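One simple way to keep such averages up to date, without storing every past job, is an incremental mean. The sketch below shows this for the per-user job count and average execution time. The record type `UserStats` and its fields are hypothetical, and the text does not say that the module computes its averages this way.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    """Hypothetical per-user record; the real schema is not given in the text."""
    jobs: int = 0
    avg_exec_s: float = 0.0


def record_job(stats: UserStats, exec_s: float) -> UserStats:
    """Fold one finished job into the job count and the running mean."""
    jobs = stats.jobs + 1
    # Incremental mean: new_avg = old_avg + (x - old_avg) / n
    avg = stats.avg_exec_s + (exec_s - stats.avg_exec_s) / jobs
    return UserStats(jobs=jobs, avg_exec_s=avg)


# Three made-up execution times, in seconds.
stats = UserStats()
for seconds in (100.0, 200.0, 300.0):
    stats = record_job(stats, seconds)
```

The same update rule would apply to the queue's average waiting and execution times, with one record per queue instead of per user.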
