DATABASE TECHNOLOGY

The MSN CSP application uses SQL Server transactional replication to help ensure high availability because transactional replication enables a combination of low latency and transactional consistency guarantees. The CSP application is not designed to read or write to both the primary and the secondary databases concurrently for two reasons:

• Application design would be much more complex. It would be necessary to implement bidirectional replication between primary and secondary databases.
• The secondary copies coexist with primary copies of different data set partitions on the same nodes. Reading from the secondary database would take resources from those primary databases.

Transactional replication uses the transaction log to capture incremental changes that were made to data in a published table. Microsoft SQL Server monitors INSERT, UPDATE, and DELETE statements, or other modifications made to the data, and stores those changes in the distribution database, which is designed to act as a reliable queue. Changes are then propagated to subscribers and applied in the same order as they occurred by opening a connection to the subscriber database and issuing SQL commands over that connection. In applications with high write-to-read transaction ratios, replication may lag behind transaction processing. The common limiting factors are network latency, index overhead on the subscriber database, and the number of connections to the subscriber that is executing the commands. When the source system’s transactions per second exceed the replication capacity of the subscriber’s system, replication latency will keep climbing until the transaction load is reduced. If the primary system is lost, the transactions still waiting in the replication queue are lost with it.
The tolerable level of transaction loss depends on business needs and the desired end-user experience.

There are several ways to avoid the replication bottleneck:

• The data and workload can be spread across more physical servers, which reduces the workload per distributor, although this approach can lead to underutilized hardware. This is the option chosen for the CSP application.
• Since each server running SQL Server has only one distributor, having multiple SQL Server instances provides multiple distributors per server. By spreading the original workload across these instances, the replication load can be processed by more distributors per server. In this way, additional servers are not needed, but additional management is required for allocating hardware resources, such as CPU and memory, among instances.
• Microsoft SQL Server 2005 is designed to support parallel replication and to help ensure that transactions are applied at the subscriber in the same order as at the publisher. At press time, SQL Server 2005 had not been released, so this feature was not yet implemented on the CSP production site.

On an ongoing basis, the MSN CSP application development team runs a stress test lab at Microsoft to determine the stress-level threshold where replication queue build-up exceeds specified design levels. In live production, the MSN CSP operation team monitors the stress level of client requests per node. As the number of accounts and size of the accounts increase, so does the number of queries per node. When the stress level reaches the threshold, the CSP team adds another failover group to the system.

System failure detection and failover

The MSN scale-out management layer monitors the status of all nodes.
It detects the failure of a server or a database and promotes the secondary database for the failed partition by redirecting the workload traffic to the secondary database.

(Diagram. Before: the workload runs on Database server 1, which holds Partition 1 (Primary) and Partition 2 (Secondary), and Database server 2, which holds Partition 2 (Primary) and Partition 1 (Secondary). After: Database servers 3 and 4 are added, holding Partitions 3 and 4 in the same primary/secondary arrangement.)
Figure 5. Adding a failover group

Reprinted from Dell Power Solutions, August 2005. Copyright © 2005 Dell Inc. All rights reserved.
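The promotion step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the MSN scale-out management layer's actual code: the Partition class, the promote_secondaries function, and the in-memory routing table are hypothetical names introduced here, and the server and partition labels simply follow Figure 5.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Partition:
    """Placement of one data partition (hypothetical structure)."""
    name: str
    primary: Optional[str]    # server that currently receives the workload
    secondary: Optional[str]  # server holding the replicated standby copy


def promote_secondaries(partitions: List[Partition], failed_server: str) -> List[str]:
    """Redirect workload away from failed_server.

    Every partition whose primary was on the failed server is routed to
    its secondary. Returns the names of the partitions that were promoted.
    """
    promoted = []
    for p in partitions:
        if p.primary == failed_server:
            # The secondary takes over; no standby remains for now.
            p.primary, p.secondary = p.secondary, None
            promoted.append(p.name)
        elif p.secondary == failed_server:
            # The primary is unaffected, but its standby copy is gone.
            p.secondary = None
    return promoted


# One failover group as drawn in Figure 5: each server holds the primary
# of one partition and the secondary of the other.
group = [
    Partition("Partition 1", primary="Database server 1", secondary="Database server 2"),
    Partition("Partition 2", primary="Database server 2", secondary="Database server 1"),
]

promoted = promote_secondaries(group, "Database server 1")
routing = {p.name: p.primary for p in group}
```

After the call, both partitions are served by Database server 2, which is why a surviving node in a failover group has to be sized to carry its partner's workload as well as its own.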
System role (number of systems) | Model | CPU | Physical memory | Storage | OS version | Application
Database server (4) | Dell PowerEdge 6650 | 4 Intel® Xeon processors at 2 GHz | 8 GB | Dell/EMC CX600 array-based SAN | Windows Server 2003, Enterprise Edition | SQL Server 2005 Beta 2
Lookup partition server (2) | Dell PowerEdge 6650 | 4 Intel Xeon processors at 2 GHz | 8 GB | Direct attach SCSI disk array with five 146 GB drives | Windows Server 2003, Enterprise Edition | SQL Server 2005 Beta 2
Web server (3) | Dell PowerEdge 2650 | 2 Intel Xeon processors at 2.4 GHz | 4 GB | Local disk | Windows Server 2003, Enterprise Edition | IIS 6.0
Scale-out management layer server | Dell PowerEdge 2650 | 2 Intel Xeon processors at 2.4 GHz | 4 GB | Local disk | Windows Server 2003, Enterprise Edition | -
Web client (12) | Dell PowerEdge 1650 | 2 Intel Pentium® III processors at 1.4 GHz | 2 GB | Local disk | Windows Server 2003, Standard Edition | -
Figure 6. Configuration for pilot study deployment of Microsoft CSP running on SQL Server 2005 Beta 2

The Web server establishes connections to the back-end databases according to the information in the scale-out management layer configuration database. The Web server makes requests against the correct physical database instance during processing. Return codes from SQL Server indicate connection problems. Connection time-out is also treated as a failure. The Web server runs a client of the MSN scale-out management layer, which communicates the failure to the management layer.
The management layer “blacklists” the failed databases and is designed to redirect the Web server to their backups in a matter of seconds.

Systems maintenance

Maintenance presents a particular challenge for scale-out systems. For example, an online transaction processing (OLTP) application like the MSN CSP application requires routine systems maintenance tasks including OS and application patching, node replacements and additions, database backups, and index defragmentation. Such common tasks must be accomplished in a way that incurs no appreciable impact on data availability and workload performance. Some maintenance tasks can be performed without taking databases offline; others require taking a particular database offline, while still others require bringing down an entire server. Administrators can use the administrative interface of the scale-out management layer for common system maintenance tasks.

Pilot study of the scale-out approach

In January 2005, Microsoft engineers configured an example enterprise deployment of the large-scale MSN CSP application in the Microsoft SQL Server Product Group Scalability Lab. This pilot study was designed to demonstrate how a scaled-out, end-to-end OLTP database application that enables high availability and enhanced manageability can be built using the Microsoft SQL Server 2005 platform.¹

The test configuration included 12 Web clients, three Web servers, two lookup partition servers, four database servers, and one scale-out management layer server (see Figure 6). The three Web servers were connected to the network through a switch that distributes queries from clients to the Web servers in a round-robin pattern to enable load balancing. Data and log files for the database servers were stored across the same group of disks on a storage area network (SAN), which consisted of a Dell/EMC CX600 storage array with 146 GB 10,000 rpm drives.
Each database server was connected to the 2 Gbps switched Fibre Channel SAN with two Emulex LP9802 Peripheral Component Interconnect Extended (PCI-X)-based host bus adapters to enhance I/O capacity and I/O failover.

The test team configured the enterprise scenario described in this article using SAN storage instead of direct attach storage (DAS) because the SAN approach is designed to enable the following benefits compared to DAS:

• Centralized management: Helps IT organizations streamline administration.
• Flexible scale-out: Enables administrators to increase capacity and/or the number of spindles without adding servers.
• High availability: Allows administrators to upgrade and deploy storage resources without disrupting business operations running on individual servers.

Although the SAN approach can be significantly more expensive to deploy and maintain than DAS, SANs can be used to enable sophisticated high-availability data deployments. However, that discussion is beyond the scope of this article. Note: Best practices recommend careful planning when using a SAN to support high data availability, to help avoid configurations in which a SAN may become the single point of failure for an entire enterprise system.

¹ The Microsoft SQL Server 2005 Beta 2 release was used in the pilot study described in this article. Actual features of Microsoft SQL Server 2005 as it is released to market may differ slightly; actual performance depends on various factors, including but not limited to the specific hardware and software configurations on which Microsoft SQL Server 2005 is deployed.

Increasing throughput with additional nodes

Figure 7 shows that the total number of transactions per second processed by the four database servers increased proportionally to