ADMIN
Backup Software
Features
data. One company I worked for had
so much data it took more than a
day to back up all of the machines.
Thus, full backups were done over
the weekend, with incrementals in
between. Also, in a business environment
with dozens of machines,
trying to figure out exactly where the
specific version of the data resides
increases the recovery time considerably.
Finally, you must also consider the
cost. Although you might be tempted
to get a larger single drive because
it is less expensive than two drives
that are only half as big, being able to
switch between two drives (or more)
adds an extra level of safety if one
fails. Furthermore, you could potentially
take one home every night. If
you are writing to tape, an extra tape
drive also increases safety; it can also
speed up backups and recovery.
Which Tape?
Some companies remove all of the
tapes after the backup is completed
and store them in a fireproof safe or
somewhere off-site. This means that
when doing incremental backups,
the most recent copy of a specific
file might be on any one of a dozen
tapes. Naturally, the question becomes,
“Which tape?” (see also the
“Whose Data” box). To solve this
problem, the backup software must
be able to keep track of which version of
Incremental vs. Differential<br />
Because of the amount of data, businesses<br />
frequently have a two-tiered backup scheme.<br />
Once a week, a full backup is done (of every<br />
single file); on subsequent days, backups are<br />
done of only those files that have changed.<br />
This approach is referred to as an incremental<br />
backup. Although it saves media, it<br />
potentially takes more time to recover. With<br />
this method, you first need to restore the<br />
full backup and, depending on which files<br />
have changed, you might need to access<br />
every single incremental backup.<br />
One alternative is a differential backup,
which stores all files that have been
changed since the last full backup. This has
the advantage of saving recovery time compared
with an incremental backup, because you need to
restore from, at most, two backups.
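
To make the difference in recovery effort concrete, here is a minimal Python sketch that works out which backups you would need to read to restore a given day. The `Backup` class, the `restore_chain` function, and the six-day schedules are my own illustration and are not taken from any particular backup product; the sketch also assumes a schedule uses either incrementals or differentials between full backups, not a mix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day index within the schedule, 0 = day of the full backup
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups, target_day):
    """Return the backups to read, in order, to restore the state of target_day."""
    history = sorted((b for b in backups if b.day <= target_day),
                     key=lambda b: b.day)
    fulls = [i for i, b in enumerate(history) if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the target day")

    # Every restore starts with the most recent full backup.
    chain = [history[fulls[-1]]]
    later = history[fulls[-1] + 1:]

    # A differential contains everything changed since the full backup,
    # so only the newest one is needed.
    differentials = [b for b in later if b.kind == "differential"]
    if differentials:
        chain.append(differentials[-1])

    # Each incremental contains only one day's changes, so all are needed.
    chain.extend(b for b in later if b.kind == "incremental")
    return chain


# Hypothetical schedules: full backup on day 0, then five daily backups.
weekly_incremental = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 6)]
weekly_differential = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 6)]

print(len(restore_chain(weekly_incremental, 5)))   # 6 backups to read
print(len(restore_chain(weekly_differential, 5)))  # 2 backups to read
```

The trade-off is on the backup side: each differential grows as the week goes on, whereas each incremental stays as small as one day's changes.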
Backup Alternatives<br />
If you are running Linux and your software repositories are configured properly, a number of
backup applications are available through your respective installation tool (e.g., YaST, Synaptic).
In fact, I found more than two dozen products that have defined themselves in one way or another
as a backup tool (not counting those explicitly for backing up databases).
Here are a few important questions to ask about your backup software:<br />
- Is your hardware supported?
- How does the software deal with database backups?
- Can you do a directed recovery (i.e., to a different directory)?
- Can the software verify the data after a backup and restore?
- Can the software write to multiple volumes?
- Do you really need all of the features?
- Can the software do a backup of a remote system?
How is the backup information<br />
stored? Does the backup software<br />
have its own internal format or does<br />
it use a database such as MySQL?
The more systems you back up, the<br />
more you need a product that indexes<br />
which files are saved and where they<br />
are saved as well. Unless you are<br />
simply doing a complete backup every<br />
night to one destination for one<br />
machine (i.e., one tape or remote di

which file is stored where (i.e., which
tape or disk).
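
As a rough illustration of what such tracking involves, the following Python sketch keeps a small in-memory index of which version of which file went to which volume. The `BackupCatalog` class, the host name, and the tape labels are hypothetical; a real product would keep this information in its own on-disk format or in a database.

```python
from collections import defaultdict
from datetime import datetime


class BackupCatalog:
    """Toy index: (host, path) -> list of (backup time, volume label)."""

    def __init__(self):
        self._versions = defaultdict(list)

    def record(self, host, path, backed_up_at, volume):
        """Note that a copy of host:path was written to the given volume."""
        self._versions[(host, path)].append((backed_up_at, volume))

    def locate(self, host, path, as_of=None):
        """Return (backup time, volume) of the newest copy, optionally as of a date."""
        versions = self._versions.get((host, path), [])
        candidates = [v for v in versions if as_of is None or v[0] <= as_of]
        if not candidates:
            return None
        return max(candidates, key=lambda v: v[0])


# Hypothetical entries: the same file was saved to two different tapes.
catalog = BackupCatalog()
conf = "/etc/httpd/httpd.conf"
catalog.record("web01", conf, datetime(2010, 5, 1), "TAPE-003")
catalog.record("web01", conf, datetime(2010, 5, 4), "TAPE-007")

print(catalog.locate("web01", conf))                              # newest copy
print(catalog.locate("web01", conf, as_of=datetime(2010, 5, 2)))  # older version
```

With an index like this, answering "Which tape?" becomes a lookup instead of a search through a dozen tapes.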
Once a software product has reached<br />
this level, it will typically also be<br />
able to manage multiple versions of<br />
a given file. Sometimes you will need<br />
to make monthly or even yearly backups,<br />
which are then stored for longer<br />
periods of time. (This setup is common<br />
when you have sensitive data<br />
like credit card or bank information.)<br />
To prevent the software from overwriting<br />
tapes that it shouldn’t, you<br />
should be able to define a “recycle<br />
time” that specifies the minimum<br />
amount of time before the media can<br />
be reused.<br />
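
The recycle-time rule boils down to a simple date comparison. The sketch below assumes the software records when each medium was last written; the function name `can_reuse` and the one-year retention period are illustrative only.

```python
from datetime import datetime, timedelta


def can_reuse(last_written, recycle_time, now):
    """True once at least recycle_time has passed since the medium was written."""
    return now - last_written >= recycle_time


# Hypothetical yearly backup that must be kept for at least 365 days.
yearly_tape_written = datetime(2010, 1, 1)
yearly_recycle_time = timedelta(days=365)

print(can_reuse(yearly_tape_written, yearly_recycle_time, datetime(2010, 6, 1)))  # False
print(can_reuse(yearly_tape_written, yearly_recycle_time, datetime(2011, 1, 1)))  # True
```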
Because not all backups are the same<br />
and not all companies are the same,<br />
you should consider the ability of<br />
the software to be configured to your<br />
needs. If you have enough time and<br />
space, software that can only do a<br />
full backup might be sufficient. On<br />
the other hand, you might want to be<br />
able to pick and choose just specific<br />
directories, even when doing a “full”<br />
backup.<br />
Support<br />
One consideration that is often overlooked<br />
is the amount of support available for your<br />
product. Commercial support might be necessary<br />
if implementing the backup solution<br />
for a company. However, the amount of free<br />
support (forums, mailing lists) can be an issue.<br />
When considering open source software<br />
of any kind for a business, I always suggest<br />
taking a good look at the product’s website.<br />
If the product has not been updated in three<br />
years, you might want to look elsewhere. If<br />
forums have few posts and most are unanswered,<br />
you likely won’t get your questions<br />
answered either.<br />
Many of the products I looked at have<br />
the ability to define “profiles” (or<br />
use a similar term). For example, you<br />
define a Linux MySQL profile, assign<br />
it to a subset of your machines, and<br />
the backup software automatically<br />
knows which directories to include<br />
and which to ignore. The Apache profile,<br />
for example, has a different set<br />
of directories. This might also include<br />
a pre-command that is run immediately
before the backup, then a post-command
that is run immediately
afterward.<br />
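
The idea of a profile can be sketched as a small data structure plus a function that expands it into per-machine steps. Everything here is an assumption for illustration: the `Profile` fields, the directory paths, the host names, and the dump commands do not come from any of the products I looked at, and the function only builds a plan without executing anything.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Profile:
    name: str
    include: List[str]                       # directories to back up
    exclude: List[str] = field(default_factory=list)
    pre_command: Optional[str] = None        # run immediately before the backup
    post_command: Optional[str] = None       # run immediately after the backup


def plan_backup(profile, hosts):
    """Expand a profile into an ordered list of (host, step, detail) tuples."""
    steps = []
    for host in hosts:
        if profile.pre_command:
            steps.append((host, "pre", profile.pre_command))
        steps.append((host, "backup", (tuple(profile.include), tuple(profile.exclude))))
        if profile.post_command:
            steps.append((host, "post", profile.post_command))
    return steps


# Hypothetical profile: dump the databases first, back up the dump, then clean up.
mysql_profile = Profile(
    name="Linux MySQL",
    include=["/etc/mysql", "/var/backups/mysql"],
    exclude=["/var/lib/mysql"],
    pre_command="mysqldump --all-databases > /var/backups/mysql/all.sql",
    post_command="rm -f /var/backups/mysql/all.sql",
)

for step in plan_backup(mysql_profile, ["db01", "db02"]):
    print(step)
```

Assigning the profile to more machines then only means extending the host list; the include and exclude rules stay in one place.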
Storage<br />
Whose Data?<br />
One important aspect is the ability to write<br />
data from different sources to specific<br />
media. For example, where I work, each<br />
customer is assigned specific tapes (often<br />
referred to as a “pool”). With the use of<br />
labels written to the tape, the software can<br />
tell which tape belongs to which pool, so<br />
that data from different environments is<br />
not mixed. This scheme is very useful if, for<br />
example, one customer wants weekly backups<br />
stored off-site and another customer<br />
frequently requests the backup tapes to load<br />
them into a local test system.<br />
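
A pool check of this kind could look roughly like the following Python sketch. The tape records, labels, and pool names are made up for the example, and the dictionary layout is an assumption rather than the label format of any real product.

```python
def pick_tape(tapes, pool):
    """Return the label of the first tape in the pool that still has room, or None."""
    for tape in tapes:
        if tape["pool"] == pool and not tape["full"]:
            return tape["label"]
    return None


# Hypothetical inventory, as the software might read it from the tape labels.
tapes = [
    {"label": "A-001", "pool": "customer_a", "full": True},
    {"label": "A-002", "pool": "customer_a", "full": False},
    {"label": "B-001", "pool": "customer_b", "full": False},
]

print(pick_tape(tapes, "customer_a"))  # A-002
print(pick_tape(tapes, "customer_c"))  # None
```

Because a tape is selected only when its pool matches, data from one customer never lands on another customer's media, even if that means the backup has to wait for a free tape.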
www.admin-magazine.com<br />
Admin 01<br />