

Daemon Monitoring

Nuts and Bolts

Figure 1: After starting, the script outputs the log at the console: availability, error, restart, database running.

Almost all known databases include a client program for the shell, for example, mysql for MySQL or psql for PostgreSQL. Alternatively, you can use ODBC to access the database in your scripted monitoring, for instance with the isql tool provided by the unixODBC project.

For ease of access, you might need to set up a (non-privileged) user, a database, and a table for the test query on the database server. If you choose the ODBC option, you also need a .odbc.ini file with the right access credentials.

The psql shell client for the Postgres database also poses the problem of non-standard exit codes: 1 stands for an error in the query, although the connection attempt was successful; 2 indicates a connection error.

A connection test with psql would look like this:

psql -U User -d Database -c "select * from test_table;"

For ODBC access, you would need to pipe the SQL query to the client:

echo "select * from test_table;" | isql ODBC_data_source user
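The psql exit codes described above can be folded into a small helper, so that a monitoring script can tell a failed query from a failed connection. The following is only a minimal sketch under that assumption; the function names psql_state and check_postgres are hypothetical and do not come from the article's listings.

```shell
#!/bin/sh
# Sketch: translate the psql exit status into a log-friendly state.
# Meanings as described in the text: 1 = query error, 2 = connection error.
psql_state() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "query failed" ;;
        2) echo "connection failed" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# check_postgres USER DATABASE (hypothetical helper):
# runs the test query quietly and prints the resulting state.
check_postgres() {
    psql -U "$1" -d "$2" -c "select * from test_table;" >/dev/null 2>&1
    psql_state "$?"
}
```

A script could then call check_postgres with the non-privileged test user and database and compare the printed state against "ok".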

For the CUPS printer daemon, lpq gives you a simple method of checking whether the daemon is alive. If you need to check access to individual printers, you additionally need to provide the print queue name and then evaluate the exit code. To obtain an exit code that reflects the printer state, grep checks for the output that you receive if the printer is active:

lpq -Pprinter | grep -q "printer is ready"

To match the output from lpq, you need to modify the search string for grep.
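Because the search string depends on the queue name and on the wording lpq uses on your system, it can help to keep both as parameters. This is a small sketch, assuming the "is ready" wording shown above; the function name printer_ready is hypothetical.

```shell
#!/bin/sh
# printer_ready QUEUE [READY_TEXT] (hypothetical helper)
# Succeeds if the lpq output for QUEUE contains READY_TEXT.
# READY_TEXT defaults to "QUEUE is ready"; adjust it to match
# what lpq prints on your system.
printer_ready() {
    queue=$1
    ready_text=${2:-"$queue is ready"}
    lpq -P"$queue" | grep -q "$ready_text"
}
```

In a script, a line such as `printer_ready lp || cupsenable lp` would then re-enable the queue only when the ready message is missing.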

The ping command checks network connections. The exit codes that signal an error differ, depending on your operating system: The FreeBSD ping uses 2, the Linux ping uses 1.

The -c packets option restricts the number of test packets; this shortens the script run time and avoids unnecessary network traffic. If you use the IP address as the target, you avoid the risk of false alarms caused by faulty name resolution.

ping -c1 ip_address
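Since the error code differs between FreeBSD and Linux, a portable script can simply treat every non-zero exit status as a failure instead of testing for a specific number. The sketch below assumes this approach; the function names host_reachable and report_host are hypothetical.

```shell
#!/bin/sh
# host_reachable IP (hypothetical helper)
# Sends a single test packet; any non-zero exit status counts as
# "unreachable", which covers both the FreeBSD and the Linux code.
host_reachable() {
    ping -c1 "$1" >/dev/null 2>&1
}

# report_host IP: print a one-line result for the log.
report_host() {
    if host_reachable "$1"; then
        echo "$1 reachable"
    else
        echo "$1 unreachable"
    fi
}
```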

Sensor scripts can obviously be extended to cover many other system parameters, such as disk space usage (df), logged in users (who), and much, much more.
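As an illustration of such a sensor, the following sketch reads the usage percentage of a filesystem from df and compares it with a limit. It assumes the POSIX output format of df -P, in which the fifth column holds the capacity; the function names and the limit of 90 percent are hypothetical.

```shell
#!/bin/sh
# disk_usage_percent MOUNTPOINT (hypothetical helper)
# Prints the usage of the filesystem as a bare number, e.g. 91.
# Relies on df -P, where line 2, column 5 is the capacity ("91%").
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_over_limit MOUNTPOINT LIMIT
# Succeeds if the usage exceeds LIMIT percent.
disk_over_limit() {
    [ "$(disk_usage_percent "$1")" -gt "$2" ]
}
```

A call such as `disk_over_limit / 90` would then serve as the trigger for a warning message.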

If an error occurs or a threshold is exceeded, the script can use this information to generate a message and notify the system administrator. The message text should include the hostname, date, and time. Messages can be stored in a file to which the administrator has permanent access. To keep an eye on it, you simply display the logfile in a terminal with tail -f, but other forms of communication are also possible, such as texting.
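A message line with hostname, date, and time can be produced with standard tools. The following is a minimal sketch; the function names, the LOGFILE variable, and the default path /tmp/daemon-monitor.log are assumptions made for illustration.

```shell
#!/bin/sh
# format_message TEXT (hypothetical helper)
# Prints "hostname date time text" on a single line.
format_message() {
    printf '%s %s %s\n' "$(uname -n)" "$(date '+%Y-%m-%d %H:%M:%S')" "$1"
}

# notify TEXT: append the formatted message to the logfile.
# LOGFILE is an assumed variable; the default path is only an example.
notify() {
    format_message "$1" >> "${LOGFILE:-/tmp/daemon-monitor.log}"
}
```

The administrator can then follow the file with tail -f, as described above.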

If the shell script has the correct privileges, it can intervene and restart a daemon, remove lock files, or even reboot the whole system. Because you should avoid running this kind of script as root, you can instead set up special users and groups to own the script and the process (which is the case with many daemons).

Database Restart

The sample script in Listing 1 monitors an active database instance and notifies the administrator if the database fails and is then successfully restarted (Figure 1). If it can't start the daemon, it waits for the administrator to step in and handle the situation.
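The check, restart, and notify sequence described here can be sketched as a single monitoring pass. This is not the article's Listing 1, only an illustration of the flow; monitor_once is a hypothetical function that expects the names of two commands or shell functions you supply yourself, one for the availability check and one for the restart.

```shell
#!/bin/sh
# monitor_once CHECK RESTART (hypothetical helper)
# CHECK and RESTART are names of commands or shell functions.
# Prints the states seen during one pass and returns non-zero
# if the restart did not bring the service back.
monitor_once() {
    check=$1
    restart=$2
    if "$check"; then
        echo "available"
        return 0
    fi
    echo "error"
    # Try the restart, then verify with the same check.
    if "$restart" && "$check"; then
        echo "restarted"
        return 0
    fi
    echo "restart failed, waiting for administrator"
    return 1
}
```

A surrounding loop could call monitor_once at regular intervals and stop restarting once it returns a non-zero status.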

Printer Restart

The second sample script relates to the printing service. The one shown here is taken from a production example in which the cupsd server had an unknown problem with a network printer. The printer was disabled time and time again, causing no end of frustration to users and unnecessary work for the system admins. The shell script shown in Listing 2 doesn't output messages; instead, it simply re-enables the printer. Run these scripts either manually (for a temporary fix or quick check) or as RC scripts.

Conclusions

Administrators don't need a complex monitoring framework that covers every aspect of the environment and has a multi-week learning curve. With some scripting know-how, you can easily create your own shell scripts to monitor server daemon processes and restart them autonomously if so desired. The use of shell scripts to monitor daemons and other system functions is by no means restricted to small embedded systems. With scripts tailored to match your requirements, you can establish your own troubleshooting arsenal.


The Author

Harald Zisler has worked with Unix-flavored operating systems since the early 1990s.

Listing 2: CUPS Monitoring

#!/bin/sh
# Re-enable the print queue "lp" whenever lpq stops reporting it as ready.

while true
do
    # grep -q returns non-zero if the "ready" message is missing
    if ! lpq -Plp | grep -q "lp is ready"
    then
        cupsenable lp
    fi

    # wait 15 seconds before the next check
    sleep 15
done

www.admin-magazine.com

Admin 01
