
<strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong> Deduplication <strong>Sample</strong> <strong>Architecture</strong><br />

<strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong> <strong>Sample</strong> <strong>Architecture</strong><br />

Disk to Disk Backup Using TSM Deduplication on AIX<br />

1.0<br />

Author:<br />

Jason Basler<br />

Document: <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong> Deduplication <strong>Sample</strong> <strong>Architecture</strong> Date: 09/28/2012<br />

Version: 1.0<br />

Page 1 of 21



Document History<br />

Document Location<br />

This is a snapshot of an on-line document. Paper copies are valid only on the day they are printed.<br />

Revision History<br />

Revision number 1.0, revision date 09/28/12. Summary of changes: Initial publication<br />

About this document<br />

This document provides a sample architecture describing a deployment of <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong>. This<br />

information is provided for reference purposes to provide guidance during the planning of other deployments<br />

of <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong>. The information provided is based upon configurations used and test results<br />

derived from implementations in the test labs used by <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong> development. The<br />

architectures are not intended to be a suitable configuration for all situations, and do not replace the need to<br />

carefully plan and design the elements of your own implementation of <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong>.<br />

Disclaimer<br />

This document contains measurements of <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong> performance which are intended to be<br />

used for reference when planning implementations of <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong>. Performance results reported<br />

in this document were measured by <strong>IBM</strong> under controlled test conditions. Performance results measured in<br />

other environments may vary from those reported herein, depending on factors such as system configuration,<br />

workload characteristics, and other environmental conditions. Accordingly, this data does not constitute a<br />

performance guarantee or warranty.<br />

The information contained in this document is distributed on an "as is" basis without any warranty either<br />

expressed or implied. This document has been made available as part of <strong>IBM</strong> developerWorks WIKI, and is<br />

hereby governed by the terms of use of the WIKI as defined at the following location:<br />

http://www.ibm.com/developerworks/tivoli/community/disclaimer.html<br />



Contents<br />

Document History<br />

Document Location<br />

Revision History<br />

About this document<br />

Disclaimer<br />

Contents<br />

1. Overview of the architecture<br />

1.1 Deduplication technology<br />

1.2 Executive summary<br />

1.3 <strong>Architecture</strong> details<br />

1.3.1 Deduplicated primary pool<br />

1.3.2 Random disk storage pool<br />

1.3.3 Tape copy storage pool<br />

2. Observed performance<br />

2.1 Data ingestion<br />

2.2 Protected data<br />

2.3 Data movement<br />

2.4 Database disk IOPS measurements<br />

3. Hardware Details<br />

3.1 TSM server<br />

3.2 Disk storage<br />

3.2.1 Disk for TSM database<br />

3.2.2 Disk for TSM storage pools<br />

3.3 Tape storage<br />

3.4 Software stack<br />

3.5 Network overview<br />

4. Configuration<br />

4.1 Operating system tuning changes<br />

4.2 DB2 configuration changes<br />

4.3 <strong>Storage</strong> configuration<br />

4.3.1 Disk subsystem layout<br />

4.3.2 Volume manager and file system configuration<br />

4.4 TSM server configuration<br />

4.4.1 TSM processing options<br />

4.4.2 Device class creation<br />

4.4.3 <strong>Storage</strong> pool creation<br />

4.4.4 Policy settings<br />

4.4.5 Schedules and macro definitions to control server maintenance tasks<br />


1. Overview of the architecture<br />

This document provides a sample architecture for data protection using <strong>Tivoli</strong> <strong>Storage</strong> <strong>Manager</strong> (TSM) with a<br />

deduplicated disk-based storage pool on <strong>IBM</strong> Power hardware running AIX. The architecture also provides a<br />

second storage pool hierarchy for a subset of clients with a traditional backup ingestion to a random-disk<br />

staging pool with daily migration to a tape storage pool.<br />

The system is implemented with two different storage pool technologies (deduplicated disk versus traditional<br />

disk to tape) for the purpose of breadth of test coverage, but also demonstrates the flexibility of TSM in terms<br />

of establishing different storage policies for clients with different business requirements. A new<br />

implementation of TSM can use either one of the storage pool hierarchies individually.<br />

1.1 Deduplication technology<br />

Deduplication technology provides an effective method of reducing the amount of data which TSM needs to<br />

store in storage pools. Detailed information describing the deduplication technology in TSM is available in<br />

the paper Effective Planning and Use of <strong>Tivoli</strong> <strong>Storage</strong> Manager V6 Deduplication.

1.2 Executive summary<br />

The sample TSM architecture provides a disk-based backup solution that retains data in a deduplicated<br />

storage pool for its entire retention. Data protection is provided for hundreds of clients in the deduplicated<br />

storage pool. The following table summarizes some key points which are detailed in the remainder of the<br />

document:<br />

Key point: summary<br />

Protected clients<br />

• 500 clients are protected with a daily scheduled backup.<br />

• 300 of these clients use a mix of server-side and client-side deduplication with up to 100 clients using client-side deduplication.<br />

TSM database size<br />

• 4 TB of storage allocated for the database with 3 TB currently used by the database.<br />

• 2.8 billion objects are being protected (primary copy).<br />

• A secondary copy of every object is being protected in a copy storage pool.<br />

TSM log sizes<br />

• 130 GB for the active log (120 GB assigned).<br />

• 428 GB for the archive log.<br />

Deduplicated storage pool<br />

• Up to 2TB of new data is ingested into the server each day (size before deduplication).<br />

• 83 TB of disk storage is allocated for the deduplicated storage pool. Of this storage:<br />

o 74TB of data is protected (size before deduplication) using only 21TB of storage.<br />

Networks<br />

• 1Gb LAN for client backup ingestion.<br />

• 4Gb SAN for storage.<br />
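As a quick sanity check on the deduplicated storage pool figures in the summary, the arithmetic can be sketched in Python. The three input numbers are taken from the summary; the variable names are illustrative, and the derived percentages and ratio are computed here rather than quoted from the test lab.

```python
# Figures quoted in the executive summary (TB).
protected_tb = 74   # logical data protected, size before deduplication
stored_tb = 21      # disk space actually consumed after deduplication
allocated_tb = 83   # disk allocated to the deduplicated storage pool

# Derived values (not stated in the source document).
saved_tb = protected_tb - stored_tb
savings_pct = 100 * saved_tb / protected_tb
dedup_ratio = protected_tb / stored_tb
pool_used_pct = 100 * stored_tb / allocated_tb

print(f"saved {saved_tb} TB ({savings_pct:.1f}%), "
      f"ratio {dedup_ratio:.1f}:1, pool {pool_used_pct:.1f}% used")
```

The 53 TB saved agrees with the "Eliminated by deduplication" figure reported in section 2.2.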


1.3 <strong>Architecture</strong> details<br />

The diagram above illustrates the architecture that includes two primary storage pool hierarchies. The first is<br />

a deduplicated sequential-file storage pool. Data remains in this pool for its entire retention and is never<br />

allowed to migrate to another storage pool. The second storage pool hierarchy uses a random disk<br />

storage pool with a tape storage pool as the next pool. This represents the traditional TSM model of<br />

ingesting backups to disk and migrating to tapes. The second hierarchy is used for a subset of clients for<br />

which the use of deduplication is not appropriate. Tape storage is also used to provide a copy storage pool<br />

for creating storage pool backup copies of all of the data in both primary storage pools.<br />

The following summarizes key aspects of the architecture:<br />

• Client backups are ingested over the LAN to one of two primary storage pool destinations.<br />

1. A deduplicated primary file-based disk storage pool is used for a majority of clients with a mix<br />

of both client-side and server-side deduplication. Objects are stored in this pool for their<br />

entire retention.<br />

2. A random disk-based primary storage pool is used as temporary storage during backup<br />

ingestion for a subset of clients. The data is migrated each day to a primary tape storage<br />

pool for the remainder of its retention.<br />

• A tape copy storage pool is used for a storage pool backup copy of all data in the primary storage<br />

pools.<br />


• The best practices implementation of separating the activities of data ingestion and server data<br />

maintenance tasks into two distinct windows each day is used. This includes ordering the data<br />

maintenance tasks optimally to avoid resource contention.<br />

• Different disk storage is used for holding the TSM database and TSM storage pools. Faster disk is<br />

used for the database, and less-expensive and slower disk is used for the storage pools.<br />

1.3.1 Deduplicated primary pool<br />

The deduplicated primary storage pool retains backup objects for their entire retention. A majority of clients<br />

ingest backups directly into this storage pool. Of these clients, one-third use client-side deduplication during<br />

backup ingestion. The remaining clients back up without deduplication, allowing the data to be later<br />

deduplicated using server-side deduplication.<br />

The following considerations are used to determine which clients use this storage pool:<br />

• Fast restore times are desired without the delay associated with tape mounts, and data spread<br />

across multiple tapes.<br />

• The data responds well to deduplication in terms of the amount of reduction.<br />

• Objects backed up do not exceed 500GB in size.<br />

• Data is not encrypted on the client.<br />

1.3.2 Random disk storage pool<br />

The remaining clients ingest into a random disk storage pool which cannot use deduplication. This pool is a<br />

temporary staging pool from which data is moved to tape each day using storage pool migration. The staging<br />

pool provides an efficient mechanism for ingesting from a large number of clients without contention for tape<br />

mounts. The following factors determine which clients ingest into this storage pool:<br />

• Clients with large objects (greater than 500GB) which are not suitable for deduplication.<br />

• Lower-priority clients that do not require fast restore times, making lower-cost tape storage more<br />

desirable, or clients that require encryption.<br />
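The placement considerations in sections 1.3.1 and 1.3.2 can be read as a simple decision rule. The following is a minimal sketch of one possible interpretation, not the procedure used in the test lab: the `ClientProfile` fields, the pool labels, and the choice to treat every consideration as mandatory are assumptions made for illustration. Only the 500GB threshold comes from the text.

```python
from dataclasses import dataclass

LARGE_OBJECT_GB = 500  # object size threshold quoted in sections 1.3.1 and 1.3.2


@dataclass
class ClientProfile:
    """Hypothetical description of a client; field names are illustrative."""
    name: str
    largest_object_gb: float
    encrypts_on_client: bool
    needs_fast_restore: bool
    dedups_well: bool


def choose_pool(client: ClientProfile) -> str:
    """Pick a storage pool hierarchy for a client (illustrative labels)."""
    if client.largest_object_gb > LARGE_OBJECT_GB:
        return "RANDOM_DISK_TO_TAPE"   # large objects are not suitable for deduplication
    if client.encrypts_on_client:
        return "RANDOM_DISK_TO_TAPE"   # client-encrypted data goes to the tape hierarchy
    if client.needs_fast_restore and client.dedups_well:
        return "DEDUP_FILE"            # retained on deduplicated disk for its whole retention
    return "RANDOM_DISK_TO_TAPE"       # lower-priority clients use lower-cost tape


print(choose_pool(ClientProfile("fileserver01", 20, False, True, True)))
print(choose_pool(ClientProfile("imagehost01", 800, False, True, True)))
```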

1.3.3 Tape copy storage pool<br />

The tape copy storage pool is used to support taking storage pool backups of all primary storage pools. The<br />

copies are created using the BACKUP STGPOOL command that incrementally keeps a secondary copy of<br />

every backup object. The tape device is also used for storing backup copies of the TSM database using the<br />

BACKUP DATABASE command. The tapes from the copy storage pool and database backups can be taken<br />

to an off-site location to provide for disaster recovery.<br />


2. Observed performance<br />

Observed performance results are presented in this section based on measurements taken on the sample<br />

architecture in the test lab. This information is provided to serve as a reference when planning a TSM<br />

implementation, and is not intended to be a guarantee of the performance to be expected in a different<br />

environment.<br />

2.1 Data ingestion<br />

Data ingestion is performed over a seven-hour backup window.<br />

Item/metric (variation): limit/range. Notes.<br />

• Protected clients, number of active clients: 400-500. There is a variety of client types. 1<br />

• Protected clients, typical session peak: 300-500. Clients are divided into groups with start times staggered at 00:00, 02:00, and 03:00.<br />

• New data ingested, number of objects: 2-3 million new objects per day. Object sizes vary, but on average fall into the small - medium workload in the range of 1KB - 10MB.<br />

• New data ingested, volume of data: 500GB - 2TB. Data size represents the size before deduplication and/or compression reductions.<br />

• Break down of daily ingestion (data size represents the size before deduplication and/or compression reductions):<br />

o Without deduplication (random disk): 100-500GB<br />

o With deduplication (server): 250GB - 1TB<br />

o With deduplication (client): 250GB - 1TB<br />

• Objects inspected with incremental backup (active set of files queried during incremental): 600 million<br />

1 Client workloads primarily include backup-archive clients performing incremental backups. One hundred of<br />

the clients perform client-side deduplication, 200 are processed using server-side deduplication, and the<br />

remaining clients are stored in the random disk to tape hierarchy. The clients also include select TDP clients,<br />

TSM for Virtual Environments, and client image backups.<br />
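To relate the daily ingest volume to the seven-hour backup window, the average ingest rate can be estimated as follows. This is a back-of-the-envelope sketch only: it assumes binary units (1 GB = 1024 MB), spreads the data evenly over the window, and uses the pre-deduplication sizes from the table, so the actual rate on the wire for clients using client-side deduplication would be lower. The function name is illustrative.

```python
WINDOW_HOURS = 7      # backup window length stated in section 2.1
MB_PER_GB = 1024      # assumption: binary units


def average_ingest_mb_per_sec(daily_gb: float, window_hours: float = WINDOW_HOURS) -> float:
    """Average rate needed to ingest daily_gb evenly within the backup window."""
    return daily_gb * MB_PER_GB / (window_hours * 3600)


light_day = average_ingest_mb_per_sec(500)        # low end of the range: 500GB
peak_day = average_ingest_mb_per_sec(2 * 1024)    # high end of the range: 2TB

print(f"light day: {light_day:.1f} MB/sec, peak day: {peak_day:.1f} MB/sec")
```

Because client start times are staggered, the instantaneous rate at the session peak is likely to differ from this average.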



2.2 Protected data<br />

Item/metric (variation): limit/range. Notes.<br />

• TSM database (used): 3 TB<br />

• Number of stored objects (primary copy): 2.8 billion. There is also a storage pool backup copy of every object.<br />

• Total protected data (size before deduplication including all versions):<br />

o In non-deduplicated storage pools: 41 TB<br />

o In deduplicated storage pools: 74 TB. Stored: 21 TB. Eliminated by deduplication: 53 TB<br />

o Total: 115 TB<br />

2.3 Data movement<br />

Item/metric (variation): limit/range. Notes.<br />

• Objects deleted by expire inventory (daily): 1 - 6 million. Typical duration is 350-450 minutes with resource=8.<br />

• Throughput of database backup: 240 MB/sec. Full backup to LTO5 tapes using three streams. Typical completion time is 3.5 hours.<br />

• Throughput of storage pool backup:<br />

o Random disk: 108 MB/sec. 6 processes.<br />

o With deduplication: 55 MB/sec. 6 processes. Actual rate may be higher. 2<br />

• Throughput of reclamation:<br />

o Tape: 70 MB/sec. 3 processes.<br />

o File with deduplication: 29 MB/sec. 6 processes. Actual rate may be higher. 3<br />

• Throughput of duplicate identification: 46 MB/sec. 4 processes.<br />

• Throughput of migration, random disk to tape: 29 MB/sec. 5 processes.<br />

2 Query process output which is used to calculate this rate is based on the reduced size of objects after<br />

deduplication. The actual amount of data copied is higher than what is reported by query process due to<br />

rehydrating data. The actual copy rate is estimated to be 64 MB/sec.<br />

3 Query process output for reclamation of deduplicated data is also based on the reduced object size. The<br />

actual amount of data moved is more than reported which we can estimate to be an overall movement rate of<br />

47 MB/sec.<br />
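The throughput figures can be turned into rough duration estimates, and footnotes 2 and 3 imply a ratio between the estimated and reported rates. The sketch below illustrates both calculations under the assumption of binary units; the function name is illustrative and the derived values are not measurements from the test lab.

```python
def hours_to_move(size_gb: float, rate_mb_per_sec: float) -> float:
    """Estimated hours to move size_gb at a sustained rate (1 GB = 1024 MB assumed)."""
    return size_gb * 1024 / rate_mb_per_sec / 3600


# Full backup of the 3 TB database at the measured 240 MB/sec.
db_backup_hours = hours_to_move(3 * 1024, 240)

# Ratio of estimated actual rate to the rate reported by query process.
stgpool_backup_factor = 64 / 55   # footnote 2: storage pool backup with deduplication
reclamation_factor = 47 / 29      # footnote 3: reclamation of deduplicated data

print(f"database backup: about {db_backup_hours:.1f} hours")
print(f"storage pool backup factor: {stgpool_backup_factor:.2f}")
print(f"reclamation factor: {reclamation_factor:.2f}")
```

The database backup estimate is close to the typical completion time of 3.5 hours quoted in the data movement figures.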


2.4 Database disk IOPS measurements<br />

The following chart shows database disk IOPS measurements taken throughout one day with the general<br />

time ranges of the various TSM server activities labeled. The data is taken from the output of the iostat<br />

command using a 5 second sampling interval. The data points represent the sum of the reported “transfers<br />

per second” for all eight of the disks used for holding the TSM server database volumes.<br />
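The aggregation described above, summing the per-disk transfers per second for each sampling interval, can be sketched as follows. The input layout (tuples of sample index, disk name, and transfers per second) is an assumption for illustration and is not the raw iostat output format; the sample values are invented. The disk names hdisk39 through hdisk46 are the database disks used in the volume group commands in section 4.3.2.

```python
from collections import defaultdict

# Database disks, per the mkvg command for the tsmdb volume group.
DB_DISKS = {f"hdisk{n}" for n in range(39, 47)}


def db_iops_series(samples, db_disks=DB_DISKS):
    """Sum transfers per second across the database disks for each sample.

    samples: iterable of (sample_index, disk_name, tps) tuples (assumed layout).
    Returns one total per sample, ordered by sample index.
    """
    totals = defaultdict(float)
    for index, disk, tps in samples:
        if disk in db_disks:
            totals[index] += tps
    return [totals[i] for i in sorted(totals)]


# Invented data: two 5-second samples, eight database disks plus one other disk.
samples = []
for s in range(2):
    for k, n in enumerate(range(39, 47)):
        samples.append((s, f"hdisk{n}", 100.0 * (s + 1) + k))
    samples.append((s, "hdisk0", 50.0))  # not a database disk, ignored

series = db_iops_series(samples)
print(series)
```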


3. Hardware Details<br />

3.1 TSM server<br />

• <strong>IBM</strong> Power 520 (8203-E4A)<br />

• 4 - core 4.2 GHz POWER6 processors<br />

• 64 GB RAM<br />

I/O adapters<br />

Slot number, slot type, feature code of installed adapter: description<br />

• Slot 1, PCIe 8x, 577D: 8Gb PCI express dual-port Fibre Channel<br />

• Slot 2, PCIe 8x, 5767: 1Gb PCI express dual-port ethernet<br />

• Slot 3, PCIe 8x, 5774: 4Gb PCI express dual-port Fibre Channel<br />

• Slot 4, PCI-X 266MHz, 5759: 4Gb PCI-X dual-port Fibre Channel<br />

• Slot 5, PCI-X 266MHz, 5759: 4Gb PCI-X dual-port Fibre Channel<br />

The system also includes two on-board 1Gb Ethernet adapters.<br />

Additional system details for the p520 are available here:<br />

http://www.redbooks.ibm.com/redpapers/pdfs/redp4403.pdf<br />

3.2 Disk storage<br />

Different storage controllers are used for the TSM database and storage pools. Faster fibre-channel disk<br />

technology is used for the database, and slower SATA disks are used for the storage pools. In general, fast<br />

disk should be used for the TSM database.<br />

3.2.1 Disk for TSM database<br />

• <strong>IBM</strong> System <strong>Storage</strong> DS5020 (1814-20A)<br />

o 1 x <strong>IBM</strong> EXP520 storage enclosure (1814-52A)<br />

o 32 x 300GB 4Gbps FC drives, 15k RPM (16 in controller, 16 in expansion)<br />

o 4 x 8Gbps FC host interfaces 4<br />

o 4 x 4Gbps FC drive ports<br />

4 Although the controller host ports support 8Gbps connections, the test lab is using a 4Gbps fabric.<br />


3.2.2 Disk for TSM storage pools<br />

The storage pool space is spread across different disk subsystems. This is a result of availability of storage<br />

capacity in the test lab rather than being a recommendation.<br />

• <strong>IBM</strong> System <strong>Storage</strong> DS5020 (1814-20A)<br />

o 2 x <strong>IBM</strong> EXP520 storage enclosure (1814-52A)<br />

o 48 x 2TB SATA drives (16 in controller, 32 in expansions)<br />

o 4 x 8Gbps FC host interfaces<br />

o 4 x 4Gbps FC drive ports<br />

• <strong>IBM</strong> System <strong>Storage</strong> DS4200 (1814-7VH)<br />

o 2 x EXP 420 expansion drawers<br />

o 48 x 1TB SATA drives<br />

• <strong>IBM</strong> System <strong>Storage</strong> DS4000 (1812-81A)<br />

o 1 x EXP 810 enclosure<br />

o 32 x 1TB SATA drives<br />

3.3 Tape storage<br />

• <strong>IBM</strong> System <strong>Storage</strong> TS3500 Tape Library (3584-L53)<br />

o 10 x 8Gbps FC <strong>IBM</strong> Ultrium 5 (LTO 5) tape drives (3588-F5A) 5<br />

o 224 tape cartridges:<br />

• 151 x LTO5 tapes<br />

• 26 x LTO4 tapes<br />

• 47 x LTO3 tapes (left in a read-only state until older data expires)<br />

5 Although the drives support 8Gbps connections, the test lab is using a 4Gbps fabric.<br />

3.4 Software stack<br />

• AIX 6.1 (6100-07-05-1228), 64bit kernel<br />

• TSM server for AIX Version 6.3 (6.3.2.0)<br />

• Atape device driver 12.0.7.0<br />


3.5 Network overview<br />

Client backups to the TSM server are all LAN-based using 1Gb connections. Clients are distributed across<br />

three different network subnets. The TSM server has a 1Gb adapter connected to each of the subnets with a<br />

different IP address assigned to each network adapter. The clients on each subnet use the TSM server IP<br />

address for the subnet to which they are connected. This avoids traffic routed across subnets.<br />

The storage network is a 4Gb fibre channel switched fabric. The TSM server has a total of eight fibre<br />

channel ports which are zoned in the SAN as follows:<br />

• 2 ports for TSM database disk<br />

• 2 ports for TSM storage pool disk<br />

• 4 ports for library / tape drives.<br />
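The LAN arrangement described above, where each client uses the TSM server address on its own subnet, can be illustrated with a small lookup. The subnets and addresses below are invented placeholders, since the document does not list the lab addressing; the function name is also illustrative.

```python
import ipaddress

# Hypothetical subnets and TSM server addresses (not from the source document).
SERVER_ADDRESS_BY_SUBNET = {
    ipaddress.ip_network("10.10.1.0/24"): "10.10.1.5",
    ipaddress.ip_network("10.10.2.0/24"): "10.10.2.5",
    ipaddress.ip_network("10.10.3.0/24"): "10.10.3.5",
}


def server_address_for(client_ip: str) -> str:
    """Return the TSM server address on the same subnet as the client."""
    addr = ipaddress.ip_address(client_ip)
    for subnet, server in SERVER_ADDRESS_BY_SUBNET.items():
        if addr in subnet:
            return server
    raise ValueError(f"{client_ip} is not on a subnet served directly by the TSM server")


print(server_address_for("10.10.2.77"))
```

A client would then set this value as the server address in its options file, so that backup traffic stays on its local subnet and is not routed.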


4. Configuration<br />

The following sections summarize configuration changes that were made to software components after<br />

installation.<br />

4.1 Operating system tuning changes<br />

The following AIX operating system settings were changed on the test system:<br />

• Enable I/O completion ports<br />

chdev -l iocp0 -P<br />

• Tune hard disks used for TSM database, logs, and storage pool volumes. The commands below were<br />

repeated for all disks used for the TSM database and storage pools.<br />

chdev -l hdisk38 -a max_transfer=0x100000<br />

chdev -l hdisk38 -a queue_depth=32<br />

• Several configuration choices were made relative to the volume manager and file systems. The following<br />

options are reflected in commands which follow in a later section:<br />

o Volume groups are created as the Big-type.<br />

o JFS2 file systems are used.<br />

o File system logs are not created as in-line.<br />

o File systems are mounted with the release-behind (rbrw) option for file systems used by the<br />

archive log and file storage pools.<br />
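Because the two chdev tuning commands are repeated for every disk, it can be convenient to generate them rather than type them by hand. The following is a minimal sketch that only prints the commands for review and does not run them; the helper name and the disk list passed to it are illustrative, while the attribute values match those shown above.

```python
def chdev_tuning_commands(disks, max_transfer="0x100000", queue_depth=32):
    """Build the per-disk chdev commands shown above for a list of hdisk names."""
    commands = []
    for disk in disks:
        commands.append(f"chdev -l {disk} -a max_transfer={max_transfer}")
        commands.append(f"chdev -l {disk} -a queue_depth={queue_depth}")
    return commands


# Example disk list (an assumption): the log disk plus the eight database disks.
example_disks = ["hdisk38"] + [f"hdisk{n}" for n in range(39, 47)]

for command in chdev_tuning_commands(example_disks):
    print(command)
```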

4.2 DB2 configuration changes<br />

No additional configuration changes were made to DB2 beyond what the TSM installation and database<br />

formatting perform automatically. At the time of formatting, a TSM V6.1 server was used with DB2 v9.5. The<br />

following TSM command was used to format the database:<br />

dsmserv -o /tsminst1/dsmserv.opt -i /tsminst1 format<br />

dbdir=/tsmdb01,/tsmdb02,/tsmdb03,/tsmdb04,/tsmdb05,/tsmdb06,/tsmdb07,/tsmdb08<br />

activelogsize=122880 activelogdir=/tsmactlog archlogdir=/tsmarchlog<br />

archfailoverlogdir=/tsmarchfail<br />
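Two values in the format command can be cross-checked against the rest of the document. The sketch below, with illustrative variable names, converts the activelogsize value (assumed to be in MB) to GB and rebuilds the comma-separated dbdir list from the eight database directories.

```python
ACTIVE_LOG_MB = 122880          # activelogsize from the format command
ACTIVE_LOG_DRIVE_GB = 130       # logical drive size for the active log (section 4.3.1)

active_log_gb = ACTIVE_LOG_MB / 1024
headroom_gb = ACTIVE_LOG_DRIVE_GB - active_log_gb

# Eight database directories, /tsmdb01 through /tsmdb08.
dbdir = ",".join(f"/tsmdb{i:02d}" for i in range(1, 9))

print(f"active log: {active_log_gb:.0f} GB assigned, {headroom_gb:.0f} GB headroom")
print(f"dbdir={dbdir}")
```

The 120 GB result agrees with the "120 GB assigned" figure in the executive summary.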

The following DB2 level is currently installed:<br />

"DB2 v9.7.0.5", "special_28032", "IP23285_28032", and Fix Pack "5"<br />


4.3 <strong>Storage</strong> configuration<br />

4.3.1 Disk subsystem layout<br />

TSM database volumes (4.3 TB total usable capacity):<br />

• 8 x RAID 5 arrays created with three spindles per array.<br />

o One logical drive created consuming the entire array, with a total usable capacity of 558GB<br />

per logical drive.<br />

TSM database log volumes (558 GB total usable capacity):<br />

• 1 x RAID 5 array created with three spindles<br />

o One logical drive created with 130GB for the active log<br />

o One logical drive created with 428GB for the archive log<br />

TSM storage pool volumes (83TB total usable capacity):<br />

• 14 x RAID 5 arrays created using 2TB spindles with three per array.<br />

o One logical drive created consuming each array, with a total usable capacity of 3725GB per<br />

logical drive.<br />

• 18 x RAID 5 arrays created using 1TB spindles with three per array.<br />

o One logical drive created consuming each array, with a total usable capacity of 1818GB per<br />

logical drive.<br />
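The capacity totals in this layout can be reproduced from the per-array figures. The sketch below also shows a simple RAID 5 estimate, assuming drive sizes are marketed in decimal GB, usable capacity is reported in binary GB, and one spindle per array is consumed by parity. That estimate is an inference rather than something the document states; it matches the 558GB and 3725GB figures but gives a higher value than the 1818GB reported for the 1TB arrays.

```python
def raid5_usable_gb(spindles: int, drive_gb_decimal: float) -> float:
    """Approximate usable capacity in binary GB of a RAID 5 array (one parity spindle)."""
    return (spindles - 1) * drive_gb_decimal * 1e9 / 2**30


# Totals from the per-logical-drive capacities listed above (GB).
db_total_gb = 8 * 558
log_total_gb = 130 + 428
pool_total_gb = 14 * 3725 + 18 * 1818

print(f"database: {db_total_gb} GB ({db_total_gb / 1024:.2f} TB)")
print(f"logs: {log_total_gb} GB")
print(f"storage pools: {pool_total_gb} GB ({pool_total_gb / 1024:.1f} TB)")
print(f"3 x 300GB RAID 5: about {raid5_usable_gb(3, 300):.0f} GB")
print(f"3 x 2TB RAID 5: about {raid5_usable_gb(3, 2000):.0f} GB")
```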

4.3.2 Volume manager and file system configuration<br />

1. Create volume groups for the database, database logs, and storage pools (several additional volume groups are created for file storage pools, which are not shown).<br />

mkvg -B -y tsmdb hdisk39 hdisk40 hdisk41 hdisk42 hdisk43 hdisk44 hdisk45 hdisk46<br />

mkvg -B -y tsmlog hdisk38<br />

mkvg -B -y tsmarchlog hdisk36<br />

mkvg -B -y tsmdisk hdisk34 hdisk35<br />

mkvg -B -y tsmfile1 hdisk20 hdisk21 hdisk22 hdisk23 hdisk24 hdisk25 hdisk26 hdisk27 hdisk28<br />

2. Create logical volumes and file systems. In some cases repeated commands have been omitted.<br />

Note that the physical partition (PP) counts used in your environment will vary depending on the physical volume size. The rbrw mount option has been shown to improve performance for storage pool volumes, archive log volumes, and file systems used for TSM database backups to disk. After the file systems are created and mounted, be sure to assign ownership to the user ID which will be used as the TSM database instance ID.<br />

• DB volumes:<br />

mklv -y tsmdb01 -t jfs2 -u 1 -x 555 tsmdb 555 hdisk39<br />

crfs -v jfs2 -d tsmdb01 -p rw -a agblksize=4096 -m /tsmdb01 -A yes<br />

< … ><br />

mklv -y tsmdb08 -t jfs2 -u 1 -x 555 tsmdb 555 hdisk46<br />

crfs -v jfs2 -d tsmdb08 -p rw -a agblksize=4096 -m /tsmdb08 -A yes<br />

• Active log:<br />

mklv -y tsmlog -t jfs2 -u 1 -x 100 tsmlog 100 hdisk38<br />

crfs -v jfs2 -d tsmlog -p rw -a agblksize=4096 -m /tsmactlog -A yes<br />

• Archive log:<br />

mklv -y tsmarchlog -t jfs2 -u 1 -x 455 tsmarchlog 455 hdisk36<br />

crfs -v jfs2 -d tsmarchlog -p rw -a options=rbrw -a agblksize=4096 -m /tsmarchlog -A yes<br />

• Random disk storage pool:<br />

mklv -y tsmdisk -t jfs2 -u 1 -x 3700 tsmdisk 3700<br />

crfs -v jfs2 -d tsmdisk -p rw -a options=rbrw -a agblksize=4096 -m /tsmdisk -A yes<br />

• Sequential file storage pool (note: several of these may be needed depending on capacity)<br />

mklv -y tsmfile1 -t jfs2 -x 32768 tsmfile1 32768<br />

crfs -v jfs2 -d tsmfile1 -p rw -a options=rbrw -a agblksize=4096 -m /tsmfile1 -A yes<br />
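
The database volume commands above show only the first and last of eight repeated pairs. As a hedged illustration, the sketch below prints (it does not run) one mklv/crfs pair per database volume. It assumes that the omitted volumes tsmdb02 through tsmdb07 map to hdisk40 through hdisk45 in order, which the original implies but does not show. The function name gen_db_fs_cmds is hypothetical.

```shell
#!/bin/sh
# Print the mklv/crfs command pair for each TSM database volume.
# Nothing is executed against the system; review the output first.
# Assumption: tsmdb02..tsmdb07 use hdisk40..hdisk45, continuing the
# pattern of the first (hdisk39) and last (hdisk46) commands shown above.

gen_db_fs_cmds() {
    i=1
    while [ "$i" -le 8 ]; do
        name=$(printf 'tsmdb%02d' "$i")
        disk="hdisk$((38 + i))"
        echo "mklv -y $name -t jfs2 -u 1 -x 555 tsmdb 555 $disk"
        echo "crfs -v jfs2 -d $name -p rw -a agblksize=4096 -m /$name -A yes"
        i=$((i + 1))
    done
}

gen_db_fs_cmds
```

Printing the commands rather than running them keeps the sketch safe to try on any system; the PP counts would still need adjusting to the physical volume size in use.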

4.4 TSM server configuration<br />

4.4.1 TSM processing options<br />

The following table summarizes the TSM server options which are set in dsmserv.opt, as well as the server-wide options which have been set using various SET commands on the TSM server.<br />

Options (dsmserv.opt) | Setting | Explanation<br />
activelogsize | 122880 | Slightly below the maximum allowed active log size.<br />
allowreorgtable | Default: yes | Allows on-line DB2 table reorg to run.<br />
allowreorgindex | yes | Allows on-line DB2 index reorg to run.<br />
clientdeduptxnlimit | Default: 300 | Limits objects using client-side deduplication to 300GB.<br />
commtimeout | 10000 | Certain TDPs benefit from an increased setting for this timeout.<br />
dedupdeletionthreads | 8 | Increases the number of threads performing deletion of deduplicated chunks which are no longer referenced.<br />
deduprequiresbackup | Default: yes | No effect for nodes which use client-side deduplication. For those which use server-side deduplication, storage pool backup completes before deduplication is applied.<br />
deduptier2filesize | Default: 100 | Only files smaller than 100GB are processed in tier1.<br />
deduptier3filesize | Default: 400 | Files in the range 100 - 400GB will be processed in tier2. Files 400GB and larger will be processed in tier3.<br />
devconf | /tsminst1/DEVCONF.txt | Controls which file the server device configuration information is copied to.<br />
enableautodbbackup | no | A full database backup is taken only once a day, and these backups are scheduled.<br />
expinterval | 0 | Disables automatic inventory expiration. Instead, this task is scheduled to run at a specific time every day.<br />
idletimeout | 240 | Allows idle client sessions to remain through the expected backup window duration.<br />
maxsessions | 700 | Clients which use client-side deduplication use one additional session.<br />
numopenvolsallowed | 20 | Controls the number of volumes that a process such as a reclamation process or client restore session can hold open at the same time. A small increase to this option is recommended, and some trial and error may be needed. The device class mount limit parameter was also increased because of this change.<br />
serverdeduptxnlimit | 500 | Limits objects using server-side deduplication to 500GB.<br />
txngroupmax | Default: 4096 | Allows more client objects per transaction.<br />
volhist | /tsminst1/VOLHIST.txt | Controls which file the server volume history is written to.<br />

Set options | Command | Notes<br />
Activity log retention | set actlogreten 60 |<br />
Summary record retention | set summaryreten 45 |<br />
Max schedule session | set maxschedsess 90 | Increases the percentage of allowed sessions which can be used for scheduled backups.<br />

4.4.2 Device class creation<br />

The random disk storage pool uses the built-in disk device class. Device classes must be created for the sequential file deduplication pool and for the tape storage pools.<br />

• sequential file<br />

> define devclass largefile devt=file mountlimit=750 maxcap=102400M dir=/tsmfile1,/tsmfile2,/tsmfile3,/tsmfile4,/tsmfile5,/tsmfile6,/tsmfile7,/tsmfile8<br />

• tape<br />

> define library TS3500 libtype=scsi shared=yes autolabel=overwrite<br />

> define path scorpio2 TS3500 SRCTYPE=server DESTTYPE=library device=/dev/smc0 online=yes<br />

> define drive TS3500 drivea element=autodetect serial=autodetect<br />

… Repeat the define drive for other drives …<br />

> define drive TS3500 drivej element=autodetect serial=autodetect<br />

> define path scorpio2 drivea SRCTYPE=server DESTTYPE=drive LIBRARY=TS3500 device=/dev/rmt0<br />

… Repeat the define path for other drives …<br />

> define path scorpio2 drivej SRCTYPE=server DESTTYPE=drive LIBRARY=TS3500 device=/dev/rmt9<br />

> define devclass TS3500devc devtype=LTO library=TS3500<br />
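
The drive and path definitions above are repeated for ten drives. As a rough sketch, the shell loop below prints the full set of commands so they can be reviewed or saved as a macro. It assumes that the omitted drives are named driveb through drivei and map to /dev/rmt1 through /dev/rmt8 in order, which is inferred from the first and last commands and not shown in the original. The function name gen_drive_cmds is hypothetical.

```shell
#!/bin/sh
# Print the define drive / define path commands for drivea..drivej.
# Assumption: drive letters map to /dev/rmt0../dev/rmt9 in order, as
# suggested by the drivea (rmt0) and drivej (rmt9) examples above.

gen_drive_cmds() {
    n=0
    for letter in a b c d e f g h i j; do
        echo "define drive TS3500 drive$letter element=autodetect serial=autodetect"
        echo "define path scorpio2 drive$letter SRCTYPE=server DESTTYPE=drive LIBRARY=TS3500 device=/dev/rmt$n"
        n=$((n + 1))
    done
}

gen_drive_cmds
```

The device special file names should be confirmed against the devices actually configured on the server before any generated command is used.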

4.4.3 Storage pool creation<br />

Below are the commands used to create the four storage pools used in this sample architecture. The options<br />

controlling reclamation, migration, and duplicate identification are set to disable the automatic launching of<br />

these tasks. Instead, the tasks are scheduled to run at specified times as detailed in a later section.<br />

• tape<br />

> define stgpool tapepool ts3500devc maxscratch=30 reclaim=100 collocate=group reusedelay=3 reclaimprocess=3<br />

• random disk<br />

> update stgpool BACKUPPOOL nextstgpool=tapepool migprocess=5 highmig=100 lowmig=0<br />

> update stgpool ARCHIVEPOOL nextstgpool=tapepool migprocess=5 highmig=100 lowmig=0<br />

> define volume ARCHIVEPOOL /tsmdisk/arch01.dsm format=512000M<br />

> define volume BACKUPPOOL /tsmdisk/back01.dsm format=512000M<br />

> define volume BACKUPPOOL /tsmdisk/back02.dsm format=512000M<br />

> define volume BACKUPPOOL /tsmdisk/back03.dsm format=512000M<br />

> define volume BACKUPPOOL /tsmdisk/back04.dsm format=512000M<br />

> define volume BACKUPPOOL /tsmdisk/back05.dsm format=512000M<br />

> define volume BACKUPPOOL /tsmdisk/back06.dsm format=512000M<br />

• deduplicated sequential file<br />

> define stgpool filepool largefile maxscratch=820 deduplicate=yes identifyprocess=0 reclaim=100 reclaimprocess=6 collocate=group nextstgpool=""<br />

• tape copy storage pool<br />

> define stgpool copypool ts3500devc pooltype=copy maxscratch=200 reclaim=100 collocate=no reusedelay=3 reclaimprocess=3<br />
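
As a rough consistency check on the pool definitions above, the sketch below works out the maximum capacity implied by the settings: maxscratch multiplied by the device class maxcap for the file pool, and the number of preformatted volumes multiplied by their size for the random disk pools. The variable names are illustrative, and the conversion assumes 1 GB = 1024 MB.

```shell
#!/bin/sh
# Capacity implied by the storage pool settings in section 4.4.3.
# Assumption: 1 GB = 1024 MB.

maxscratch=820        # filepool maxscratch
maxcap_mb=102400      # largefile device class maxcap (102400M)
disk_volumes=7        # arch01 plus back01..back06
disk_volume_mb=512000 # format=512000M

filepool_gb=$((maxscratch * maxcap_mb / 1024))
diskpool_gb=$((disk_volumes * disk_volume_mb / 1024))

echo "filepool maximum:         ${filepool_gb} GB"
echo "random disk pool volumes: ${diskpool_gb} GB"
```

This gives 82000 GB for the file pool and 3500 GB for the random disk volumes, which can be compared with the usable capacity of the file systems that back them.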

4.4.4 Policy settings<br />

The tested architecture includes three different policy domains used to vary retention schemes and control<br />

which of the storage pool hierarchies is used for each client.<br />

• The gold domain provides 60 days of backup retention and uses the deduplicated storage pool target.<br />

Archive objects are retained for 1000 days.<br />

> define domain GOLD<br />

> define policy GOLD GOLD<br />

> define mgmtclass GOLD GOLD GOLD<br />

> assign defmgmt GOLD GOLD GOLD<br />

> define copygroup GOLD GOLD GOLD type=backup destination=FILEPOOL VERE=nolimit VERD=10 RETE=60 RETO=NOLIMIT<br />

> define copygroup GOLD GOLD GOLD type=archive destination=FILEPOOL RETV=1000<br />

> activate pol GOLD GOLD<br />

• The silver domain provides up to 40 days of backup retention and uses the random disk to tape storage<br />

pool hierarchy. Archive objects are retained for 365 days.<br />

> define domain SILVER<br />

> define policy SILVER SILVER<br />

> define mgmtclass SILVER SILVER SILVER<br />

> assign defmgmt SILVER SILVER SILVER<br />

> define copygroup SILVER SILVER SILVER type=backup destination=BACKUPPOOL VERE=40 VERD=2 RETE=40 RETO=60<br />

> define copygroup SILVER SILVER SILVER type=archive destination=ARCHIVEPOOL RETV=365<br />

> activate pol SILVER SILVER<br />

• The bronze domain provides up to 30 days of backup retention and uses the deduplicated storage pool<br />

target. Archive objects are retained for 150 days.<br />

> define domain BRONZE<br />

> define policy BRONZE BRONZE<br />

> define mgmtclass BRONZE BRONZE BRONZE<br />

> assign defmgmt BRONZE BRONZE BRONZE<br />

> define copygroup BRONZE BRONZE BRONZE type=backup destination=FILEPOOL VERE=30 VERD=1 RETE=30 RETO=30<br />

> define copygroup BRONZE BRONZE BRONZE type=archive destination=FILEPOOL RETV=150<br />

> activate pol BRONZE BRONZE<br />
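
The three domains above follow the same seven-command pattern and differ only in their destinations and retention values. The sketch below captures that pattern in a shell function that prints the commands for one domain. The function name gen_domain and its argument order are hypothetical; the command text mirrors the examples above.

```shell
#!/bin/sh
# Print the policy definition commands for one domain.
# Usage: gen_domain NAME BACKUP_DEST ARCHIVE_DEST VERE VERD RETE RETO RETV
# The domain, policy set, and management class all share NAME, as in
# the GOLD, SILVER, and BRONZE examples above.

gen_domain() {
    d=$1
    echo "define domain $d"
    echo "define policy $d $d"
    echo "define mgmtclass $d $d $d"
    echo "assign defmgmt $d $d $d"
    echo "define copygroup $d $d $d type=backup destination=$2 VERE=$4 VERD=$5 RETE=$6 RETO=$7"
    echo "define copygroup $d $d $d type=archive destination=$3 RETV=$8"
    echo "activate pol $d $d"
}

# Reproduce the bronze domain from the values given above.
gen_domain BRONZE FILEPOOL FILEPOOL 30 1 30 30 150
```

Keeping the retention values as arguments makes it easier to see at a glance how a new domain would differ from the three tested ones.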

4.4.5 Schedules and macro definitions to control server maintenance tasks<br />

The following section shows how to configure TSM to schedule data maintenance tasks so that they follow the best practice of separating backup ingestion from data maintenance. The recommended ordering is explained below, along with examples of commands to implement these tasks through scheduling.<br />

Here is the implemented sequence of tasks:<br />

1. 00:00 - 07:00. Client data ingestion.<br />

2. 08:00 - 12:00. Create the secondary disaster recovery (DR) copy using the BACKUP STGPOOL command.<br />

3. The following tasks can run in parallel:<br />

a. 11:00 - 16:00. Perform server-side duplicate identification by running the IDENTIFY DUPLICATES command. This processes data that was not already deduplicated on the clients.<br />

b. 12:00 - 15:30. Create a DR copy of the TSM database by running the BACKUP DATABASE command. Following the completion of the database backup, the DELETE VOLHISTORY command is used to remove older versions of database backups which are no longer required. At completion of the backup, the volume history and device configuration are backed up using the BACKUP VOLHISTORY and BACKUP DEVCONFIG commands.<br />

4. 14:00 - 15:30. Move data from the backuppool and archivepool random disk storage pools to the tape storage pool using the MIGRATE STGPOOL command.<br />

5. 18:30 - 23:00. Reclaim unused space from storage pool volumes that has been released through deduplication and inventory expiration using the RECLAIM STGPOOL command.<br />

6. 22:00 - 04:00. Remove objects that have exceeded their allowed retention using the EXPIRE INVENTORY command.<br />
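
The length of each window in the sequence above can be worked out with a few lines of shell, including the expiration window that wraps past midnight. This is an illustrative sketch only; the task labels and the function names to_minutes and window_minutes are not part of the original configuration.

```shell
#!/bin/sh
# Compute the length in minutes of each maintenance window listed above.

# Convert HH:MM to minutes since midnight. A single leading zero is
# stripped so that values such as 08 are not parsed as octal.
to_minutes() {
    h=${1%%:*}
    m=${1##*:}
    h=${h#0}
    m=${m#0}
    echo $(( $h * 60 + $m ))
}

# Length of a window from START to END, allowing END to fall on the next day.
window_minutes() {
    start=$(to_minutes "$1")
    end=$(to_minutes "$2")
    echo $(( ($end - $start + 1440) % 1440 ))
}

while read -r task start end; do
    echo "$task: $(window_minutes "$start" "$end") minutes"
done <<'EOF'
ingest 00:00 07:00
stgbackup 08:00 12:00
identify 11:00 16:00
dbbackup 12:00 15:30
migrate 14:00 15:30
reclaim 18:30 23:00
expire 22:00 04:00
EOF
```

Comparing these window lengths with the duration= limits used in the maintenance scripts is one way to check whether a task could run on into the next window.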

4.4.5.1 Define scripts which run each required maintenance task<br />

The following scripts are defined to perform the data maintenance tasks, and are invoked via scheduled administrative commands.<br />

def script STGBACKUP "/* Run stg pool backups */"<br />

upd script STGBACKUP "backup stg ARCHIVEPOOL copypool maxprocess=6 wait=yes" line=010<br />

upd script STGBACKUP "backup stg BACKUPPOOL copypool maxprocess=6 wait=yes" line=020<br />

upd script STGBACKUP "backup stg FILEPOOL copypool maxprocess=6 wait=yes" line=030<br />

def script DEDUP "/* Run identify duplicate processes */"<br />

upd script DEDUP "identify duplicates FILEPOOL numprocess=4 duration=360" line=010<br />

set dbrecovery TS3500DEVC numstreams=3<br />

define script DBBACKUP "/* Run DB backups */"<br />

update script DBBACKUP "backup db devclass=TS3500DEVC type=full numstreams=3 wait=yes" line=010<br />

update script DBBACKUP "backup volhistory" line=020<br />

update script DBBACKUP "backup devconfig" line=030<br />

update script DBBACKUP "delete volhistory type=dbbackup todate=today-7 totime=now" line=040<br />

define script MIGRATE "/* Run stg pool migration */"<br />

update script MIGRATE "migrate stgpool ARCHIVEPOOL wait=yes" line=010<br />

update script MIGRATE "migrate stgpool BACKUPPOOL wait=yes" line=020<br />

define script RECLAIM "/* Run stg pool reclamation */"<br />

update script RECLAIM "reclaim stgpool FILEPOOL threshold=40 duration=200 wait=yes" line=010<br />

update script RECLAIM "reclaim stgpool TAPEPOOL threshold=60 duration=60 wait=yes" line=020<br />

update script RECLAIM "reclaim stgpool COPYPOOL threshold=60 duration=60 wait=yes" line=030<br />

define script EXPIRE "/* Run expiration processes. */"<br />

update script EXPIRE "expire inventory resource=8 wait=yes" line=010<br />

4.4.5.2 Define schedules to run the data maintenance tasks<br />

The following commands define administrative schedules that run the scripts created in the previous section.<br />

define schedule STGBACKUP type=admin cmd="run STGBACKUP" active=yes \<br />

desc="Run all stg pool backups." startdate=today starttime=08:00:00 \<br />

duration=15 durunits=minutes period=1 perunits=day<br />

define schedule DEDUP type=admin cmd="run DEDUP" active=yes \<br />

desc="Run identify duplicates." startdate=today starttime=11:00:00 \<br />

duration=15 durunits=minutes period=1 perunits=day<br />

define schedule DBBACKUP type=admin cmd="run DBBACKUP" active=yes \<br />

desc="Run database backup." startdate=today starttime=12:00:00 \<br />

duration=15 durunits=minutes period=1 perunits=day<br />

define schedule MIGRATE type=admin cmd="run MIGRATE" active=yes \<br />

desc="Migrate data from random stg pools." startdate=today starttime=14:00 \<br />

duration=15 durunits=minutes period=1 perunits=day<br />

define schedule RECLAIM type=admin cmd="run RECLAIM" active=yes \<br />

desc="Reclaim space from storage pools." startdate=today starttime=18:30 \<br />

duration=15 durunits=minutes period=1 perunits=day<br />

define schedule EXPIRATION type=admin cmd="run expire" active=yes \<br />

desc="Run expiration." startdate=today starttime=22:00:00 \<br />

duration=15 durunits=minutes period=1 perunits=day<br />
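
The six schedule definitions above share every parameter except the schedule name, the script it runs, the start time, and the description. The sketch below prints one definition from those four values, which may help when adding a further maintenance task. The function name gen_schedule is hypothetical, and the fixed parameters are copied from the definitions above.

```shell
#!/bin/sh
# Print a define schedule command for an administrative schedule.
# Usage: gen_schedule SCHEDULE_NAME SCRIPT_NAME START_TIME DESCRIPTION
# The startup window (15 minutes) and daily period match the schedules above.

gen_schedule() {
    echo "define schedule $1 type=admin cmd=\"run $2\" active=yes desc=\"$4\" startdate=today starttime=$3 duration=15 durunits=minutes period=1 perunits=day"
}

# Reproduce the expiration schedule from the values given above.
gen_schedule EXPIRATION expire 22:00:00 "Run expiration."
```

Each command is printed on a single line, so no continuation characters are needed when the output is pasted into an administrative session or saved as a macro.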
