
High Availability z/OS Solutions<br />

for WebSphere Business<br />

Integration Message Broker V5<br />

Develop a highly available WebSphere<br />

Business Integration Message Broker<br />

solution on z/OS<br />

Configure WebSphere MQ QSGs to<br />

support Message Broker in a<br />

Sysplex<br />

Example Message Broker<br />

high availability<br />

implementations<br />

Front cover<br />

Saida Davies<br />

Dean Barker<br />

Steve Kiernan<br />

Jon Mc Namara<br />

<strong>Redpaper</strong><br />

ibm.com/redbooks


International Technical Support Organization<br />

High Availability z/OS Solutions for WebSphere<br />

Business Integration Message Broker V5<br />

October 2004


Note: Before using this information and the product it supports, read the information in<br />

“Notices” on page v.<br />

First Edition (October 2004)<br />

This edition applies to Version 5, Release 01 of <strong>IBM</strong> WebSphere Business Integration Message<br />

Broker for z/OS (product number 5655-K60).<br />

© Copyright International Business Machines Corporation 2004. All rights reserved.<br />

Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP<br />

Schedule Contract with <strong>IBM</strong> Corp.


Contents<br />

Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v<br />

Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi<br />

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii<br />

The team that wrote this <strong>Redpaper</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii<br />

Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x<br />

Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x<br />

Chapter 1. Introduction and technical overview. . . . . . . . . . . . . . . . . . . . . . 1<br />

1.1 Project overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2<br />

1.1.1 Availability levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2<br />

1.2 Testing methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5<br />

1.2.1 HTTP Listener . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7<br />

1.3 Environment overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7<br />

Chapter 2. Design decisions that affect high availability . . . . . . . . . . . . . . 9<br />

2.1 Considerations when designing for high availability . . . . . . . . . . . . . . . . . 10<br />

2.2 High Availability with Message Broker . . . . . . . . . . . . . . . . . . . . . . . . . . . 11<br />

2.3 WebSphere MQ options in supporting HA with Message Broker . . . . . . . 12<br />

2.3.1 WebSphere MQ clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12<br />

2.3.2 WebSphere MQ shared queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14<br />

2.4 Message Broker flow design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17<br />

2.4.1 Affinities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17<br />

2.4.2 Error processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18<br />

2.5 Message Broker networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19<br />

2.5.1 Further considerations with Message Broker networks . . . . . . . . . . 19<br />

Chapter 3. Topology and system setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . 21<br />

3.1 High Availability configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />

3.1.1 Active-active . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />

3.1.2 Active-passive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />

3.2 Test environment topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23<br />

3.3 The z/OS LPARs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24<br />

3.4 The DB2 data sharing group configuration . . . . . . . . . . . . . . . . . . . . . . . . 24<br />

3.5 The WebSphere MQ queue sharing group configuration . . . . . . . . . . . . . 25<br />

3.5.1 Queue sharing group configuration considerations. . . . . . . . . . . . . . 25<br />

3.6 The WebSphere Business Integration Message Broker configuration . . . 26<br />

3.6.1 Message Broker configuration considerations . . . . . . . . . . . . . . . . . 27<br />

3.6.2 Additional Message Broker configuration hints . . . . . . . . . . . . . . . . . 29<br />



3.7 Automatic Restart Management configuration . . . . . . . . . . . . . . . . . . . . . 30<br />

3.8 The configuration manager platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30<br />

3.9 An overview of WebSphere Business Integration Message Broker<br />

SupportPac IP13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31<br />

Chapter 4. Failover scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33<br />

4.1 Test environment setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34<br />

4.1.1 SupportPac IP13 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34<br />

4.1.2 Message Broker configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36<br />

4.2 Scenario 1 - Initial state with all components active . . . . . . . . . . . . . . . . . 37<br />

4.3 Scenario 2 - Execution group failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38<br />

4.4 Scenario 3 - Message Broker failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40<br />

4.5 Scenario 4 - Queue manager failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42<br />

4.6 Scenario 5 - DB2 failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44<br />

4.7 Scenario 6 - z/OS system failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45<br />

4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47<br />

Appendix A. Sample code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49<br />

ARM policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50<br />

Broker customization input file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51<br />

Appendix B. Additional material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53<br />

Locating the Web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53<br />

Using the Web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54<br />

How to use the Web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54<br />

Abbreviations and acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55<br />

Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />

<strong>IBM</strong> <strong>Redbooks</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />

Other publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />

Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />

How to get <strong>IBM</strong> <strong>Redbooks</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58<br />

Help from <strong>IBM</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58<br />

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59<br />



Notices<br />

This information was developed for products and services offered in the U.S.A.<br />

<strong>IBM</strong> may not offer the products, services, or features discussed in this document in other countries. Consult<br />

your local <strong>IBM</strong> representative for information on the products and services currently available in your area.<br />

Any reference to an <strong>IBM</strong> product, program, or service is not intended to state or imply that only that <strong>IBM</strong><br />

product, program, or service may be used. Any functionally equivalent product, program, or service that<br />

does not infringe any <strong>IBM</strong> intellectual property right may be used instead. However, it is the user's<br />

responsibility to evaluate and verify the operation of any non-<strong>IBM</strong> product, program, or service.<br />

<strong>IBM</strong> may have patents or pending patent applications covering subject matter described in this document.<br />

The furnishing of this document does not give you any license to these patents. You can send license<br />

inquiries, in writing, to:<br />

<strong>IBM</strong> Director of Licensing, <strong>IBM</strong> Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.<br />

The following paragraph does not apply to the United Kingdom or any other country where such provisions<br />

are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES<br />

THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,<br />

INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,<br />

MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer<br />

of express or implied warranties in certain transactions, therefore, this statement may not apply to you.<br />

This information could include technical inaccuracies or typographical errors. Changes are periodically made<br />

to the information herein; these changes will be incorporated in new editions of the publication. <strong>IBM</strong> may<br />

make improvements and/or changes in the product(s) and/or the program(s) described in this publication at<br />

any time without notice.<br />

Any references in this information to non-<strong>IBM</strong> Web sites are provided for convenience only and do not in any<br />

manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the<br />

materials for this <strong>IBM</strong> product and use of those Web sites is at your own risk.<br />

<strong>IBM</strong> may use or distribute any of the information you supply in any way it believes appropriate without<br />

incurring any obligation to you.<br />

Information concerning non-<strong>IBM</strong> products was obtained from the suppliers of those products, their published<br />

announcements or other publicly available sources. <strong>IBM</strong> has not tested those products and cannot confirm<br />

the accuracy of performance, compatibility or any other claims related to non-<strong>IBM</strong> products. Questions on<br />

the capabilities of non-<strong>IBM</strong> products should be addressed to the suppliers of those products.<br />

This information contains examples of data and reports used in daily business operations. To illustrate them<br />

as completely as possible, the examples include the names of individuals, companies, brands, and products.<br />

All of these names are fictitious and any similarity to the names and addresses used by an actual business<br />

enterprise is entirely coincidental.<br />

COPYRIGHT LICENSE:<br />

This information contains sample application programs in source language, which illustrates programming<br />

techniques on various operating platforms. You may copy, modify, and distribute these sample programs in<br />

any form without payment to <strong>IBM</strong>, for the purposes of developing, using, marketing or distributing application<br />

programs conforming to the application programming interface for the operating platform for which the<br />

sample programs are written. These examples have not been thoroughly tested under all conditions. <strong>IBM</strong>,<br />

therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy,<br />

modify, and distribute these sample programs in any form without payment to <strong>IBM</strong> for the purposes of<br />

developing, using, marketing, or distributing application programs conforming to <strong>IBM</strong>'s application<br />

programming interfaces.<br />



Trademarks<br />

The following terms are trademarks of the International Business Machines Corporation in the United States,<br />

other countries, or both:<br />

Eserver®<br />

<strong>Redbooks</strong> (logo)<br />

ibm.com®<br />

z/OS®<br />

zSeries®<br />

DB2®<br />

<strong>IBM</strong>®<br />

Lotus®<br />

MQSeries®<br />

MVS<br />

NetView®<br />

OS/390®<br />

Parallel Sysplex®<br />

<strong>Redbooks</strong><br />

RACF®<br />

S/390®<br />

SupportPac<br />

ThinkPad®<br />

Tivoli®<br />

WebSphere®<br />

The following terms are trademarks of other companies:<br />

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the<br />

United States, other countries, or both.<br />

UNIX is a registered trademark of The Open Group in the United States and other countries.<br />

Other company, product, and service names may be trademarks or service marks of others.<br />



Preface<br />

When designing and implementing a production grade Message Broker solution<br />

on z/OS®, one of the most important factors to consider is high availability. This<br />

<strong>IBM</strong>® <strong>Redpaper</strong> examines the design considerations inherent in configuring a<br />

highly available Message Broker environment. Also demonstrated is the use of<br />

the coupling facility for WebSphere MQ queue sharing groups (QSG) and<br />

Automatic Restart Management (ARM) in order to support WebSphere Business<br />

Integration Message Broker HA in a sysplex environment. Finally, examples of<br />

the behavior of Message Broker during failover are provided, including<br />

transaction rate measurements and throughput statistics.<br />

The team that wrote this <strong>Redpaper</strong><br />

This paper was produced by a team of specialists from around the world, working<br />

with the International Technical Support Organization (ITSO), Raleigh, NC.<br />

Saida Davies is a Project Leader with the<br />

ITSO. She is a certified senior IT specialist and<br />

has 15 years of experience in IT. Saida has<br />

published several <strong>Redbooks</strong> on various<br />

business integration scenarios. She has<br />

experience in the architecture and design of<br />

WebSphere® MQ solutions, has extensive<br />

knowledge of <strong>IBM</strong>’s z/OS operating system,<br />

and a detailed working knowledge of both <strong>IBM</strong><br />

and Independent Software Vendors’ operating<br />

system software. In a customer-facing role<br />

with <strong>IBM</strong> Global Services, she developed<br />

services for WebSphere MQ within the z/OS and Windows®<br />

platform. This covered the architecture, scope, design, project management, and<br />

implementation of the software on stand-alone systems or on systems in a<br />

Parallel Sysplex® environment. She has a degree in Computer Science, and her<br />

background includes z/OS systems programming.<br />



From left: Dean, Steve, and Jon<br />

Dean Barker is an IT Specialist at the <strong>IBM</strong> Hursley Laboratories in the UK. He<br />

has over ten years' experience as an MVS Systems Programmer. He holds a<br />

degree in Chemical Engineering from the University of Manchester Institute of<br />

Science and Technology. He has an excellent knowledge of the z/OS operating<br />

system and sysplex environment. His other areas of expertise include UNIX<br />

System Services and WebSphere Application Server for z/OS. Dean assisted<br />

with the z/OS system setup prerequisites for this project, which enabled the<br />

team to accomplish the full scope of this <strong>Redpaper</strong>.<br />

Steve Kiernan is a Consulting IT Specialist on the New England and Upstate<br />

New York WebSphere technical team. Before joining <strong>IBM</strong> eight years ago, he<br />

spent fifteen years in the banking industry as a mainframe systems programmer.<br />

Since Steve joined <strong>IBM</strong>, he has worked with the entire WebSphere Business<br />

Integration platform and primarily with WebSphere MQ and WebSphere<br />

Business Integration MB on z/OS.<br />

Jon Mc Namara is an IT Specialist in the Hursley WebSphere Business<br />

Integration Services Team. He provides WebSphere Business Integration<br />

customers with a range of expert technical services. Jon’s areas of expertise<br />

include z/OS, WebSphere MQ, WebSphere MQ Integrator, WebSphere Business<br />

Integrator FN, WebSphere Business Integration Message Broker, and<br />

WebSphere Business Integration Event Broker. He is also a recognized expert in<br />

multicast technology.<br />



The <strong>Redpaper</strong> team would like to thank the following people, located in <strong>IBM</strong><br />

Hursley, UK, for their guidance, assistance, and contributions to this edition:<br />

► Gary Willoughby, Manager, WebSphere Business Integration, EMEA<br />

WebSphere Lab Services<br />

► Ralph Bateman, WebSphere MQ Brokers z/OS Change Team specialist<br />

► Karen Burgess, WebSphere MQ z/OS FV & Test<br />

► William Chorlton, IT Specialist, zSeries® DB/DC Support<br />

► Rob Convery, WebSphere MQ System Test<br />

► Peter Edwards, MQSeries® Test<br />

► Emir Garza, Consulting IT Specialist, Software Group/Application Integration<br />

Middleware<br />

► Luisa Lopez de Silanes Ruiz, Consulting IT Specialist, Software<br />

Group/Application Integration Middleware<br />

► Colin Paice, WebSphere MQ Scenarios - z/OS performance specialist, <strong>IBM</strong><br />

Hursley, UK<br />

► Alasdair Paton, WebSphere MQ Brokers - System Test team lead<br />

► Pete Siddall, Software Engineer<br />

► Vicente Suarez, IT Specialist, Software Group/Application Integration<br />

Middleware<br />



Become a published author<br />

Join us for a two- to six-week residency program! Help write an <strong>IBM</strong> redbook<br />

dealing with specific products or solutions, while getting hands-on experience<br />

with leading-edge technologies. You will team with <strong>IBM</strong> technical professionals,<br />

Business Partners, and customers.<br />

Your efforts will help increase product acceptance and customer satisfaction. As<br />

a bonus, you will develop a network of contacts in <strong>IBM</strong> development labs and<br />

increase your productivity and marketability.<br />

Find out more about the residency program, browse the residency index, and<br />

apply online at:<br />

ibm.com/redbooks/residencies.html<br />

Comments welcome<br />

Your comments are important to us!<br />

We want our papers to be as helpful as possible. Send us your comments about<br />

this <strong>Redpaper</strong> or other <strong>Redbooks</strong> in one of the following ways:<br />

► Use the online Contact us review redbook form found at:<br />

ibm.com/redbooks<br />

► Send your comments in an e-mail to:<br />

redbook@us.ibm.com<br />

► Mail your comments to:<br />

<strong>IBM</strong> Corporation, International Technical Support Organization<br />

Dept. HZ8 Building 662<br />

P.O. Box 12195<br />

Research Triangle Park, NC 27709-2195<br />



Chapter 1. Introduction and technical<br />

overview<br />


This paper is divided into four chapters:<br />

► This chapter, Chapter 1, “Introduction and technical overview” on page 1,<br />

describes the scope of this book and provides an overview of a high<br />

availability environment for WebSphere Business Integration Message Broker<br />

(Message Broker) on z/OS.<br />

► Chapter 2, “Design decisions that affect high availability” on page 9, provides<br />

considerations, hints, and tips for the Message Broker in a high availability<br />

environment and developing message flows and publish and subscribe<br />

applications in a high availability environment.<br />

► Chapter 3, “Topology and system setup” on page 21, describes how to<br />

achieve a high availability environment with Message Broker and contains<br />

specific configuration information for the software components used during<br />

failover testing.<br />

► Chapter 4, “Failover scenarios” on page 33, describes the procedures to<br />

create a testing environment and details the results of the failover testing.<br />



1.1 Project overview<br />

This paper explores the architecture and explains the tasks necessary to build a<br />

high availability environment for Message Broker on z/OS. Remember that the<br />

Message Broker high availability environment is necessary for the business<br />

applications that it serves and not for itself. Today’s business critical applications<br />

often demand a high degree of availability, and the z/OS system administrator<br />

must exploit Parallel Sysplex technology to provide continuous service levels<br />

during system outages, planned or otherwise.<br />

1.1.1 Availability levels<br />

There are several ways to describe the degree of availability that a system<br />

requires. The availability of a system can be expressed as a number of nines or<br />

by using different terms.<br />

The number of nines term represents the percentage of time for which the<br />

system is available. Three nines means that it is available 99.9% of the time; five<br />

nines means that it is available 99.999% of the time. These numbers become<br />

much more significant when you look at these figures in terms of downtime over<br />

a fixed period. For example, over a period of one year, a system with 99.9%<br />

availability would have 8.75 hours of downtime, while a system with 99.999%<br />

availability would be down for just 315 seconds.<br />
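The arithmetic behind these figures can be checked with a short script. This is a sketch only; it assumes a 365-day year, which is why it yields 8.76 hours where the text rounds to 8.75:

```python
# Downtime per year implied by an availability percentage.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000

def downtime_seconds(availability_pct: float) -> float:
    """Seconds of downtime per year at the given availability percentage."""
    return SECONDS_PER_YEAR * (1.0 - availability_pct / 100.0)

for pct in (99.9, 99.999):
    secs = downtime_seconds(pct)
    # Three nines allows hours of downtime a year; five nines only minutes.
    print(f"{pct}% available -> {secs / 3600:.2f} hours ({secs:.0f} seconds) down per year")
```

Running it shows how steeply each extra nine cuts the allowable outage window.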

The following terms are also used to describe availability:<br />

► Continuous availability: This term describes a system that experiences no<br />

discernible downtime, where neither scheduled nor unscheduled outages<br />

occur. A continuously available system detects the error and immediately<br />

provides an alternative component that is already ready to go. Also, this<br />

system should support the scheduling of planned maintenance by allowing<br />

workload to be transparently transferred away from the components or<br />

subsystems that are the subject of the maintenance activity. Although<br />

continuous availability seems difficult to achieve, it is possible to obtain such<br />

availability by combining hardware, software, and operational procedures that<br />

can mask outages from the user so that the user does not perceive that a<br />

system outage occurs.<br />

► Continuous operation: This term describes a system that experiences no<br />

discernible downtime due to scheduled outages. However, this system's<br />

availability will not be as high as it would be with a continuously available<br />

system because it may suffer unplanned outages.<br />

► High availability (HA): This term describes a system that can detect a single<br />

failure and react to it automatically within a matter of a few minutes at most.<br />



These kinds of systems will operate with some planned and<br />

unplanned outages. There are two significant aspects in this definition:<br />

– The system should survive a single failure but a second failure may result<br />

in a loss of service.<br />

– The detection of a fault and the triggering of an action to recover from it<br />

should be automatic, that is, require no manual intervention.<br />
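The detect-and-restart behavior that this definition requires can be sketched in a few lines. The `Component` class and `supervise` function below are hypothetical stand-ins for a real monitor such as ARM or NetView automation; they are not part of any product API:

```python
# A minimal sketch of automatic failure detection and restart.
# Component and supervise are invented for illustration only.

class Component:
    def __init__(self, name: str):
        self.name = name
        self.running = True

    def fail(self) -> None:
        self.running = False

def supervise(component: Component, max_restarts: int = 3) -> int:
    """Detect a failed component and restart it with no manual step.

    Returns the number of restarts performed. A real monitor would poll
    on a timer; a single pass is enough to show the principle.
    """
    restarts = 0
    while not component.running and restarts < max_restarts:
        component.running = True  # restart in place, automatically
        restarts += 1
    return restarts

broker = Component("MQ01BRK")
broker.fail()                     # a single, non-catastrophic failure
print(supervise(broker))          # recovered without intervention
```

Note how the sketch mirrors the two aspects above: a single failure is survived, and recovery is triggered without manual intervention.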

Figure 1-1 illustrates the relationship between the components of systems<br />

availability.<br />

Figure 1-1 High availability + Continuous Operation = Continuous Availability<br />

There are times when the term fault tolerance is used mistakenly when the terms<br />

high availability or continuous availability are meant. Fault tolerance describes<br />

systems which, in the event of a failure, can substitute a replacement component<br />

for the failed component in a matter of a few milliseconds. This kind of<br />

achievement is supported by components that have redundant sub-components,<br />

error checking and correction for data, retry capabilities for basic operations,<br />

alternate path for I/O requests, and so forth. However, there may also be a single<br />

point of failure which, despite the fault tolerance, can cause a component to fail.<br />

Similarly, if one important component in a system is not fault-tolerant, then the<br />

system is not fault-tolerant even though all other components are.<br />

Specifically, HA refers to a specific level of service that provides availability in the<br />

event of a single, non-catastrophic component failure. Transaction capacity may<br />

be compromised while the failing component recovers, but recovery should be<br />

automatic.<br />

This paper describes a resilient HA Message Broker architecture based on an<br />

Active-Active Message Broker configuration using a WebSphere MQ queue<br />

sharing group (QSG) and a DB2 data sharing group (DSG) across two Logical<br />

Partitions (LPAR). Testing revealed that this configuration provides for sustained<br />

Message Broker services through a variety of failover scenarios.<br />

For a complete overview of HA concepts, refer to the <strong>IBM</strong> Redbook Highly<br />

Available WebSphere Business Integration Solutions, SG24-6328.<br />

The individual component failures that we tested in the research for this paper<br />

include:<br />

► A Message Broker execution group failure<br />

► A Message Broker failure<br />

► A queue manager failure<br />

► A DB2 sub-system failure<br />

In our testing, we restarted each of these components in place on the LPAR on<br />

which they failed using Automatic Restart Management (ARM). We also tested a<br />

complete LPAR failure in which we restarted the queue manager and Message<br />

Broker from one LPAR on the active LPAR.<br />

There are countless other individual component failures that we could not test<br />

due to time constraints. We used ARM to restart all failing components.<br />

Appendix A, “Sample code” on page 49 provides the ARM policy that was in<br />

effect at the time of the failure. Some organizations may prefer NetView®<br />

automated recovery to ARM. You can achieve similar restart functionality using<br />

NetView.<br />

A coupling facility (CF) failure is categorized as a catastrophic failure. To achieve<br />

continuous availability on z/OS, a CF failover has to be accounted for, specifically,<br />

by duplexing the WebSphere MQ admin and list structures. The only single point<br />

of failure in our tested configuration was the CF. CF failover testing is outside the<br />

scope of this paper. For further information, refer to the following:<br />

► MVS Setting up a Sysplex:<br />

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/IEA2F132/<br />

CCONTENTS?SHELF=IEA2BK34&DN=SA22-7625-06&DT=20030423145429<br />

► The WebSphere MQ System Administration Guide - Part 5, “Recovery and<br />

Restart” at:<br />

http://publibfp.boulder.ibm.com/epubs/html/csqsaw02/csqsaw02tfrm.htm<br />



1.2 Testing methodology<br />

Because the goal of this paper is to evaluate Message Broker service availability,<br />

we obtained measurements from both the Message Broker and the application<br />

that was using the Message Broker. The WebSphere Business Integration<br />

Broker - Sniff test and performance on z/OS (<strong>IBM</strong> Category 2 SupportPac IP13)<br />

provided an elegant solution. For details on downloading the SupportPac, see:<br />

http://www-306.ibm.com/software/integration/support/supportpacs/<br />

Note: We recommend that you install the most current version of the<br />

SupportPac before using the techniques described in this paper.<br />

SupportPac IP13 consists of documentation, programs, and message flows<br />

designed to help you measure application throughput on Message Broker on<br />

z/OS. SupportPac IP13 provides statistics at the end of the job, including the<br />

total number of round trip messages that the SupportPac IP13 application<br />

processed. You can find the specific SupportPac IP13 parameters that we used<br />

for testing in Chapter 4, “Failover scenarios” on page 33. We ran the application<br />

on both LPARs during several of the tests to simulate a realistic production<br />

environment.<br />

Figure 1-2 on page 6 illustrates our testing environment topology.<br />



Figure 1-2 Test environment topology<br />

In our testing, we also obtained statistics from the Message Broker while archive<br />

accounting data collection was active. Again, you can find specific procedures for<br />

this in Chapter 4, “Failover scenarios” on page 33. By using SupportPac IP13 to<br />

generate a workload for the Message Broker to consume and by comparing<br />

those results with the statistics produced by the Message Broker, we were able<br />

to evaluate continuous throughput during component failover.<br />

Thus, the SupportPac IP13 batch application drove messages to a shared<br />
message queue. We configured a SupportPac IP13 message flow to consume<br />
those messages from the shared queue and then deployed the .bar file to both<br />
brokers. Reply messages were sent to a second shared queue and, in turn, were<br />
then consumed by the SupportPac IP13 batch application program in a<br />
request-reply model.<br />

6 High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5

This testing methodology was sufficient to evaluate component failover.<br />
However, the statistics supplied in Chapter 4, “Failover scenarios” on page 33 are<br />
insufficient to evaluate Message Broker performance or capacity for the following<br />
reasons:<br />
► The z/OS environment was not tuned. The main purpose of SupportPac IP13<br />
is to provide a way to begin to tune the Message Broker environment, but we<br />
did not use it for this purpose in this exercise.<br />
► The LPARs did not have identical hardware resources. For instance, LPAR0<br />
was configured with two logical processors, while LPAR2 was configured with<br />
one logical processor.<br />
► The loads that we placed on the configuration were designed to provide a<br />
continuous stream of messages for the duration of the individual tests, but the<br />
steady-state load did not stress available resources. A single instance of the<br />
SupportPac IP13 message flow was used for each broker.<br />

1.2.1 HTTP Listener<br />
Although HTTP Listener support in Message Broker is expected to make its<br />
debut on z/OS shortly (FixPack 4), the current version does not support it.<br />
Therefore, configuration and testing using the HTTP Listener were beyond the<br />
scope of this paper.<br />

It is essential for the system administrator to consider dynamic VIPAs for a high<br />
availability environment. WebSphere MQ and WebSphere Business Integration<br />
Message Broker rely on the TCP/IP stack to communicate with other systems.<br />
For further information on this subject, see the <strong>IBM</strong> white paper<br />
Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for higher<br />
availability, GM13-0026, at:<br />

http://www-1.ibm.com/servers/eserver/zseries/library/techpapers/<br />

gm130165.html<br />
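As a rough illustration only (the address, mask, and backup rank are invented, and the exact statements should be checked against your z/OS Communications Server level), a dynamic VIPA that can move between LPARs is defined in the TCP/IP profile along these lines:

```
; Hypothetical z/OS TCP/IP profile fragment
VIPADYNAMIC
  ; Define a moveable dynamic VIPA on this stack
  VIPADEFINE MOVEABLE IMMEDIATE 255.255.255.0 10.1.1.1
ENDVIPADYNAMIC

; On the backup stack in the other LPAR, a corresponding statement
; such as VIPABACKUP 100 10.1.1.1 lets the address move on failure
```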

1.3 Environment overview<br />

For our testing environment, we kept the software configuration as simple as<br />

possible to achieve an HA environment. The goal was to build a resilient Message<br />

Broker server capable of providing a very high degree of availability. Because<br />



Message Broker is a WebSphere MQ and DB2® application, these components<br />

must also be configured for high availability, specifically:<br />

► The two z/OS queue managers participated in a QSG to allow for shared<br />

application queues.<br />

► The queue managers were not clustered.<br />

► A pair of WebSphere MQ channels was used between each z/OS queue<br />

manager and the Windows Configuration Manager queue manager.<br />

► A DB2 DSG was created to support the QSG.<br />

The operating environments used for this book include:<br />

► The Message Broker installed on two z/OS 1.4 LPARs in a sysplex running on<br />

a 9672 model XZ7. The LPARs have the following software levels:<br />

– z/OS 1.4 with service up to RSU0406<br />

– DB2 V7R1M0 with service up to RSU0312<br />

– WebSphere MQ V5R3M1 at FP5<br />

– WebSphere Business Integration Message Broker V5R0M1 at Fix Pack 3<br />

► The Message Broker Configuration Manager installed on an <strong>IBM</strong> T40<br />

ThinkPad® with:<br />

– Windows XP SP1<br />

– WebSphere MQ V5.3 Fix Pack 5<br />

– WebSphere Business Integration Message Broker V5 Fix Pack 3<br />

– DB2 V8.1 FixPak 2<br />

A thorough description of the software configuration and setup that we used in<br />

our testing environment can be found in Chapter 3, “Topology and system setup”<br />

on page 21.<br />

Note: We recommend that you install the most current release of all Fix Packs<br />

before using the techniques described in this paper.<br />




Chapter 2. Design decisions that affect<br />

high availability<br />

This chapter examines design decisions that you need to address when<br />

configuring an HA system. It begins by taking a high-level look at what HA entails<br />

and continues by examining the details of designing an HA system on z/OS using<br />

WebSphere MQ and Message Broker.<br />

Remember that when you design a Message Broker configuration on z/OS that<br />
must satisfy stringent HA requirements, there are a number of factors you should<br />
consider. The main question is how to ensure that the availability targets for a<br />
Message Broker implementation are met.<br />

The most common implementation of Message Broker is as a central hub<br />

through which all messages in a messaging architecture are processed. Thus,<br />

the potential exists for Message Broker to become a bottleneck and single point<br />

of failure for the whole message processing network. Because of the business<br />

impact of the messaging hub failing, you should use a thorough approach when<br />

designing the queue manager and an HA implementation of Message Broker.<br />

© Copyright <strong>IBM</strong> Corp. 2004. All rights reserved. 9


2.1 Considerations when designing for high availability<br />

Many factors contribute to the HA of a Message Broker environment, including<br />

platform and operating system agnostic factors, such as the following:<br />

► Documented procedures<br />

► Practicing procedures<br />

► Reliable hardware<br />

► Failover<br />

► Dual networks<br />

Other platform and operating system specific factors can also affect the HA of a<br />

Message Broker environment, including:<br />

► Shared queues<br />

► Reliable operating system<br />

► WebSphere MQ clustering<br />

All of these factors are important when considering availability. Many of these<br />

factors are platform agnostic and are equally relevant across the range of<br />

supported platforms, including z/OS. These factors include contingency against<br />
failures of disks, processors, memory, the power supply or the power source itself,<br />
the network, and so forth.<br />

Because the entire range of components that would make up a highly available<br />

system is extremely varied and covers a number of separate disciplines, we<br />

concentrated our efforts on a distinct subset in our testing environment. This<br />

subset included methods by which the queue manager and Message Broker<br />

were configured to support a highly available system. This section also outlines<br />

some of the issues to take into account when using functionality such as shared<br />

queues, clustering, and cloned brokers.<br />

One approach to mitigating failures is to install multiple instances of the<br />
components that might fail. z/OS offers an architecture which lends itself<br />
well to providing a highly available system: the well-proven<br />
design of separate LPARs connected via the coupling facility. This<br />
method is collectively referred to as a Parallel Sysplex environment. A Parallel<br />
Sysplex environment comprises highly available disks, redundant power supplies,<br />

and the ability to communicate across LPARs by using the coupling facility. With<br />

this architecture, you can implement a highly available queue manager and<br />

Message Broker system by using, for example, a number of brokers operating on<br />

separate LPARs, all of which are supported by shared WebSphere MQ queues<br />

via the coupling facility or by WebSphere MQ clustering.<br />



2.2 High Availability with Message Broker<br />

When formulating a design for a highly available Message Broker system, it is<br />

worth considering the benefits of having a network topology consisting of a hub<br />

that contains a number of queue managers which serve brokers. This type of<br />

configuration allows the workload to be spread evenly across the brokers, while<br />

also providing redundancy in the event that a Message Broker or a queue<br />

manager fails.<br />

In addition, running brokers across multiple LPARs allows work to continue in<br />

case of an LPAR failure.<br />

At a lower level, there is a choice between deploying the same configuration to<br />

each broker in the hub or assigning a different configuration to each broker. If a<br />

particular message flow runs only on a single broker in the hub, then this<br />

environment does not present a high availability solution. If the same message<br />

flow is the chosen configuration for all brokers, any broker can process any<br />

message. However, the resource allocation cannot be tailored as finely.<br />

With a separate configuration for each broker, you can control the number of<br />

copies of a flow run on each broker in the hub. You can also tailor the priorities so<br />

that LPARs with more capacity can handle more of the work. However, this<br />

environment can introduce affinities to particular LPARs and compromise HA<br />

effectiveness. Whichever way you configure brokers, you must create appropriate<br />

.bar files and execution groups for each broker in the hub. Table 2-1 on page 12<br />

shows the advantages and disadvantages to each configuration.<br />



Table 2-1 Advantages and disadvantages of cloned and tailored execution groups<br />

Cloned execution groups<br />
Advantages:<br />
► Provides for the highest availability level because all brokers can process<br />
any message.<br />
► Gives a wide, even distribution of work across the system, reducing the<br />
potential for bottlenecks.<br />
Disadvantages:<br />
► Does not provide granular control over resource allocation.<br />
► Does not provide support for flows which may be required to turn around<br />
messages more quickly than others.<br />

Tailored execution groups<br />
Advantages:<br />
► Allows for the best use of the resources available.<br />
► Allows more instances of flows which are used more heavily or have a more<br />
demanding message processing need.<br />
Disadvantages:<br />
► May introduce affinities to particular brokers and LPARs.<br />
► Provides a lower availability because fewer brokers can process all<br />
messages.<br />

2.3 WebSphere MQ options in supporting HA with<br />
Message Broker<br />

There are advantages and disadvantages you should consider before embarking<br />
on a particular method of providing HA with Message Broker. However, once you<br />
have determined the best method for your configuration, there are a number of<br />
ways you can use WebSphere MQ to support your configuration, using either<br />
WebSphere MQ clustering or WebSphere MQ shared queues.<br />

This section discusses the WebSphere MQ options available in supporting a<br />
highly available Message Broker configuration.<br />

2.3.1 WebSphere MQ clustering<br />

To understand how clustering works, consider this example. A sending<br />

application that sends a message to a receiving application uses the queue<br />

manager to send the message. If the queue manager of the receiving application<br />

is offline, the clustering functionality would automatically reroute the message to<br />

another queue manager that hosts a clone of the receiving<br />
application. This rerouting is totally transparent to the applications. Additional<br />

information about clustering can be found in WebSphere MQ Queue Manager<br />

Clusters, SC34-6061.<br />
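As an abridged sketch of what this looks like in MQSC (the cluster, channel, and queue names here are invented, and a full setup also needs cluster-sender channels and repository queue managers), advertising the same queue from each hub queue manager is roughly:

```
* On each hub queue manager that should host an instance of the queue
* (names such as HUBCLUS and BROKER.INPUT.QUEUE are hypothetical)
DEFINE CHANNEL(TO.MQ1A) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) +
       CONNAME('mq1a.example.com(1414)') CLUSTER(HUBCLUS)
DEFINE QLOCAL(BROKER.INPUT.QUEUE) CLUSTER(HUBCLUS)
```

With an instance of the queue defined on each queue manager, the cluster workload algorithm routes new messages away from a queue manager that becomes unavailable.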



The ability to define these clustered queues on several of the queue managers in<br />

the cluster leads to increased system availability. Each queue manager runs<br />

equivalent instances of the applications that process the messages. If one of the<br />

queue managers fails, or the communication to it is suspended, that queue<br />

manager is temporarily excluded from the choice of destinations for the<br />

messages. This functionality provides a number of benefits, the main one being<br />

HA.<br />

There are a few points to consider when approaching WebSphere MQ clustering<br />

as the preferred choice for implementing HA. For example, if a queue manager<br />

fails, then although any subsequent messages sent to the cluster are not routed<br />

to the failed queue manager, any persistent messages already sent to the queue<br />

manager, but not yet processed by the broker, are marooned on this failed queue<br />

manager. Also, for WebSphere MQ clustering to effectively balance loads<br />

between queues on different queue managers, messages sent from outside the<br />

cluster need to be sent to a gateway queue manager. A gateway queue manager<br />

creates a single point of failure for the whole logical hub. Therefore, the gateway<br />

queue manager would need to have very high availability.<br />

Also, remember that in the event that the Message Broker falls over but the<br />

clustered queue manager is still operational, messages under a cluster queue<br />

environment are still sent to the queue manager. These messages are not<br />

processed until the Message Broker is once again functional. Unless there is a<br />

monitoring tool which registers that the Message Broker is inoperative,<br />

messages can build up unnoticed.<br />

A way of safeguarding against this situation is to employ a monitoring tool, which<br />

registers both the status of the Message Broker and the subsequent build up of<br />

messages on the broker input queue. If, for example, the normal buildup on the<br />

broker input queue is 50 messages, it might be wise to set an alert on the queue.<br />

If messages build to a number greater than 50, then an alert is sent to the<br />

monitoring tool, and an operator can determine if there is a problem.<br />
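One way to generate such an alert from WebSphere MQ itself (a sketch only; the queue name and thresholds are hypothetical) is to use queue depth events rather than a separate polling tool:

```
* Enable performance events on the queue manager
ALTER QMGR PERFMEV(ENABLED)
* Raise a queue depth high event when the queue passes 5% of
* MAXDEPTH (here, 5% of 1000 = 50 messages)
ALTER QLOCAL(BROKER.INPUT.QUEUE) +
      MAXDEPTH(1000) QDEPTHHI(5) QDPHIEV(ENABLED)
```

The resulting event message is put to the SYSTEM.ADMIN.PERFM.EVENT queue, where a monitoring tool can pick it up and notify an operator.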

Table 2-2 shows the advantages and disadvantages to using WebSphere MQ<br />

clustering.<br />

Table 2-2 Advantages and disadvantages to using WebSphere MQ clustering<br />

Advantages:<br />
► Provides resilient failover because WebSphere MQ can redirect messages<br />
away from a failed queue manager to be picked up by a functional one.<br />
► Uses the available resources because WebSphere MQ clustering can direct<br />
messages across all available LPARs.<br />
► Takes advantage of cloned applications and identically configured brokers by<br />
workload balancing, thus reducing the risk of bottlenecks.<br />
Disadvantages:<br />
► Persistent messages awaiting processing while on the queue of a failed<br />
queue manager remain there until the queue manager is operational.<br />
► Gateway queue managers can act as a single point of failure for the entire<br />
WebSphere MQ cluster.<br />
► Multi-part messages and transactions can cause affinities to particular queue<br />
managers, thus reducing the availability of the system.<br />
► Should Message Broker fail while the queue manager is still functional, the<br />
WebSphere MQ cluster continues to send messages to that queue manager<br />
even though they are not processed.<br />

2.3.2 WebSphere MQ shared queues<br />

Now that we have examined the issues in using WebSphere MQ clustering to<br />

support HA, this section discusses WebSphere MQ shared queues. Shared<br />

queues rely heavily on the coupling facility and a shared DB2 database to allow<br />

queue managers to form a queue sharing group (QSG). Once the queue<br />

managers have joined the QSG, they can create queues that are available<br />

to all the queue managers in the group.<br />

This functionality sounds very similar to WebSphere MQ clustering and, in many<br />

respects, there is a certain amount of overlap in the service shared queues<br />

provide. However, a shared queue is one instance of a queue which is “shared”<br />

among a number of queue managers. This method differs from WebSphere MQ<br />

clustering in that the clustered queues, despite having the same designation, are<br />

actually separate entities. In practice, once a shared queue has been configured<br />

allowing a number of queue managers to access it, all of the brokers associated<br />

with those queue managers (or any application) could have access to that queue,<br />

despite existing on separate LPARs.<br />

A simple scenario which uses shared queues might resemble the following. A<br />

sending application puts messages to a shared queue. These messages are<br />

then available to all queue managers in the QSG across all relevant LPARs.<br />

Associated with these queue managers are brokers, all of which are retrieving<br />

messages from the same shared queue. Should an LPAR, queue manager, or a<br />

broker fall over, the other brokers would continue retrieving and processing<br />

messages from the shared queue.<br />
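A minimal sketch of the definition behind such a scenario (the queue and coupling facility structure names are invented, and the CF structure must already be defined in the CFRM policy and to the QSG) might be:

```
* Issued once, from any queue manager in the QSG; the shared queue
* is then visible to every member of the group
DEFINE QLOCAL(BROKER.INPUT.QUEUE) +
       QSGDISP(SHARED) +
       CFSTRUCT(APPL1)
```

Every broker whose queue manager belongs to the QSG can then get messages from BROKER.INPUT.QUEUE, regardless of which LPAR it runs on.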

Because cloned applications and the brokers which serve them are able to<br />
connect to queue managers in a QSG and because all queue managers in a<br />

QSG can access shared queues, applications do not need to rely on the<br />



availability of any one queue manager. Should an LPAR, queue manager, or a<br />

Message Broker fall over, shared queues on the functioning system can continue<br />

to service cloned applications.<br />

There are distinct advantages for using shared queues as opposed to using<br />

WebSphere MQ clustering. For example, in the event that the Message Broker<br />

falls over but the queue manager is operational, messages under a cluster queue<br />

environment are still sent to the queue manager. These messages then wait on<br />

the queue until the Message Broker restarts. This situation can cause problems,<br />

especially if the business requires a quick turnaround response to the message.<br />

This situation does not occur when using shared queues, because it is the<br />

Message Broker that is retrieving the message from the shared queue directly,<br />

rather than the message being passed to a queue manager which only supports<br />

one Message Broker.<br />

Another advantage of using shared queues is that both applications and the<br />

broker can take advantage of serialization during shared queue peer recovery.<br />

As previously mentioned, if a queue manager should fail, the other queue<br />
managers in the QSG continue to process messages on the shared queue.<br />

However, they also finish the shared queue work for the incomplete units of work<br />

which were running on the failed queue manager. One potential issue here is that<br />

during the process of rolling back uncommitted messages it is possible that<br />

another queue manager may attempt to process one of the messages in the unit<br />

of work still on the queue. In this case, the messages in the unit of work would<br />

then be out of sequence. By using the serialization mechanism, no other queue<br />

manager in the QSG can access any of the messages in the unit of work until a<br />

full roll back has been completed. This method ensures that the messages are<br />

processed in the correct order.<br />

There are some restrictions to using shared queues on z/OS when it comes to HA.<br />

Possibly the most important restriction which should be taken into account is the<br />

maximum message size. For the current version, WebSphere MQ 5.3, the<br />

largest message which can be placed upon a shared queue is 63 KB. Thus, the<br />

largest message that can be transported via the highly available shared queue<br />

system is also 63 KB. In addition, there is currently a restriction of eight million<br />

messages which can be stored on a queue.<br />

In WebSphere MQ 5.2, shared queues did not support persistent<br />
messaging. However, non-persistent messages on a shared queue do<br />
survive a queue manager restart, so this method does have a form of<br />
resiliency (although technically, this is not persistence). With WebSphere MQ<br />
5.3, persistent messages are supported on shared queues. Remember that<br />
non-persistent messages still survive a queue manager restart. This situation<br />

can be advantageous unless an application specifically depends upon the<br />

non-persistent messages being deleted on the event of a queue manager restart.<br />



In this instance, you should consider incorporating a policy for “cleaning up”<br />

these messages.<br />

One aspect of the availability of messages on shared queues which you should<br />

consider is the effect of using a two-phase commit. For example, consider a unit<br />

of work which retrieves a message from a shared queue, updates a DB2 table<br />

based on the contents of the message, and returns a response to a shared<br />

queue. A two-phase commit protocol is used to ensure that either all or none of the processing<br />

happens, typically coordinated by Resource Recovery Services (RRS). If the<br />

queue manager where this unit of work is running were to fail during the<br />

two-phase commit, it is possible that the unit of work would be left indoubt in<br />

WebSphere MQ.<br />

In this case, the correct resolution of the unit of work cannot be determined until<br />

the queue manager is restarted and can reconnect with RRS. It is therefore not<br />

possible for the other queue managers to perform peer recovery for this unit of<br />

work (that is, the input message cannot be rolled back by a peer for<br />
processing via a different queue manager, which has an impact on the availability<br />
of the messages consumed and produced by that indoubt unit of work). The<br />

availability of other messages on the shared queue is not impacted unless<br />

serialization tokens are being used to ensure an ordering of processing<br />

messages on this queue. This is further explained in WebSphere MQ for z/OS<br />

System Administration Guide V5.3.1, SC34-6053-01.<br />

Note: When messages are put to a shared queue, the data is logged on a<br />

particular queue manager, but that process does not cause any kind of<br />

message affinity to a queue manager. The affinity is between a unit of work<br />

and a queue manager.<br />

Table 2-3 outlines the issues you should consider when looking at shared queues<br />

to support a highly available environment.<br />

Table 2-3 Advantages and disadvantages of using WebSphere MQ shared queues<br />

Advantages:<br />
► Resilient support of HA over multiple LPARs. Should a queue manager or a<br />
Message Broker fail, other brokers continue to retrieve messages.<br />
► Messages are pulled from the shared queue, rather than pushed onto<br />
another queue manager's clustered queue. Thus, if a Message Broker<br />
becomes inoperative, messages do not build up on the queue manager<br />
serving the broker. Instead, the messages remain on the shared queue to be<br />
picked up by a functional broker.<br />
► The ability to fully utilize the available resources, because shared queues<br />
allow messages to be shared across available LPARs.<br />
► The ability to take advantage of cloned applications and identically<br />
configured brokers by allowing less busy applications and brokers to retrieve<br />
messages from the shared queue at their own pace, thus reducing the risk of<br />
bottlenecks.<br />
► The ability to take advantage of the serialization mechanism to ensure<br />
backed-out units of work are completed in the correct order.<br />
Disadvantages:<br />
► Maximum message size of 63 KB. Even if a current system does not have<br />
messages of that size or above, it places a restriction on the natural growth of<br />
the system. However, this restriction is increased to 100 MB in the next<br />
release of WebSphere MQ by further use of DB2.<br />
► A limitation of coupling facility storage means that no more than eight million<br />
messages can be stored on a queue.<br />
► Non-persistent messages which survive queue manager restarts can be an<br />
issue for the application. If an application depends on non-persistent<br />
messages not surviving a queue manager restart, it is possible you will need<br />
to add functionality to deal with this issue.<br />
► Depending on how the coupling facility is being used, the storage taken up<br />
by shared queues could well be relatively expensive.<br />

2.4 Message Broker flow design<br />

There are a number of points to keep in mind with HA in the design of message<br />
flows.<br />

2.4.1 Affinities<br />

Message affinities can arise when multiple messages are required to make up a<br />

single business transaction or when messages must be processed in a specific<br />

order. This situation is also true when a Message Broker is forced to keep state,<br />

as is the case when using the aggregation node. This type of situation can often<br />

mean that a particular queue manager or Message Broker is required to process<br />

these messages. In terms of HA, this affinity can lead to vulnerabilities. If a<br />

particular queue manager or Message Broker is required to process a long string<br />



of messages making up a transaction, then that manager or Message Broker can<br />

become a single point of failure for that transaction. This situation creates a<br />

potentially vulnerable point in the system, because the queue manager or<br />
Message Broker requires messages to be processed through the same thread,<br />

broker, and so on.<br />

However, your business needs may make it necessary to introduce affinities into<br />

the Message Broker infrastructure. In such a situation, you should be aware of<br />

the issues that introducing affinities may cause. One way to alleviate some of<br />

these issues is to run transactions under transaction coordination, so that in the<br />

event that the queue manager or the Message Broker falls over, the transaction<br />

is rolled back. Doing so means that the transaction can then be re-processed by<br />

another instance of the Message Broker.<br />

Publish and Subscribe functionality is an example of where a user application<br />

can develop an affinity to a specific broker. Unless cloning is used, once a<br />

subscription has been registered at a specific broker, only that broker can deliver<br />

publications to the subscriber. This situation applies specifically to RealTime,<br />

Multicast, and Telemetry transports because these applications establish a direct<br />

TCP/IP connection to the client, which is broken if the server goes down.<br />

2.4.2 Error processing<br />

To ensure that the system is as highly available as possible, it is worthwhile to<br />

understand how Message Broker handles error processing. Message Broker has<br />

very defined procedures for dealing with invalid messages. Not all of them,<br />

however, are conducive to HA.<br />

While processing messages, the normal procedure for the broker when<br />

encountering an invalid message is to place the message back on the input<br />

queue. The broker then retries the message until the retry count is reached. At<br />

this point, the broker simply places the message back on the queue and ceases<br />

processing it. When it comes to supporting HA, this behavior causes a problem.<br />

Once an invalid message is placed back on the input queue, other potentially<br />

valid messages are building up on the queue behind it. This building up of<br />

messages continues until the invalid message is removed. This situation is not<br />

ideal in the HA environment. To work around this type of situation, you will need<br />

to incorporate some tailored error processing. The error processing can be as<br />

simple or as complex as you prefer, from simply putting the invalid message onto<br />

an error queue, to creating an error processing subflow supported with Try Catch<br />

functionality. The important factor is to remove the invalid message from the input<br />

queue, thereby preventing valid messages from building up behind the invalid<br />

message.<br />
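One common, simple form of this is to use the WebSphere MQ backout attributes on the broker input queue (a sketch only: the queue names and threshold are illustrative, and the flow's input node must be configured to honor them):

```
* After BOTHRESH failed attempts, the message can be requeued to the
* backout queue instead of blocking the input queue
DEFINE QLOCAL(BROKER.ERROR.QUEUE)
ALTER  QLOCAL(BROKER.INPUT.QUEUE) +
       BOTHRESH(3) BOQNAME(BROKER.ERROR.QUEUE)
```

The MQInput node can then move a message that repeatedly backs out to BROKER.ERROR.QUEUE, leaving the input queue free for valid messages.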



2.5 Message Broker networks<br />

A popular feature of Message Broker is that it supports the Publish and<br />

Subscribe functionality. A subscriber can connect to a broker in the Publish and<br />

Subscribe topology and receive publications made on that broker or others in the<br />

network. In the default case, the subscriber has an affinity to the broker with<br />

which it has registered. There are various ways to reduce this affinity. For<br />

WebSphere MQ subscribers, a solution is to use the Cloned Broker feature,<br />

which allows a subscriber to receive its publications directly from several different<br />

brokers, thus reducing the affinity.<br />

Note: Cloned brokers cannot be used with other broker topologies, such as<br />

hierarchies and collectives. For subscribers using the RealTime transport, this<br />

option is not available because a connection is established directly with a<br />

particular broker and the subscription is removed if the connection is broken.<br />

This paper does not discuss Publish and Subscribe applications in detail, but<br />

there is more information about HA for Publish and Subscribe applications in the<br />

Redbook WebSphere Business Integration Pub/Sub Solutions, SG24-6088.<br />

2.5.1 Further considerations with Message Broker networks<br />

When working with Message Broker networks, you should also consider the User<br />

Name Server’s function. This function provides a level of security for the Publish<br />

and Subscribe function on Message Broker. Overall, the User Name Server<br />

provides an excellent level of service across the WebSphere MQ supported<br />

platforms. However, an examination of the User Name Server with regard to HA<br />

has not been outlined in this document because the User Name Server is not<br />

particularly well suited to the large scale, high volume type of application typically<br />

found on z/OS. On z/OS, the User Name Server periodically accesses RACF® to<br />

match access control privileges against the user IDs which require access to<br />

topics, resulting in a rebuild of the cache.<br />

One point to consider is that while this is not a significant overhead when<br />

dealing with moderate numbers of user IDs, it can become more significant once<br />

the number of user IDs begins to grow. With the nature and scale of applications<br />

which are based on z/OS and the subsequent number of user IDs and RACF<br />

definitions, the User Name Server may not be seen to scale particularly well in<br />

this environment. This situation may potentially cause performance problems and<br />

be expensive in the amount of resources required to frequently check RACF<br />

definitions against a large number of user IDs.<br />



20 High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5


Chapter 3. Topology and system setup<br />

This chapter describes the topology and system setup of HA Message Broker<br />

environments on z/OS, and more specifically, the environment we used to create<br />

the failover scenarios tested in Chapter 4, “Failover scenarios” on page 33.<br />

In this chapter, you can find information about:<br />

► High Availability configurations.<br />

► Test environment topology.<br />

► The z/OS LPARs.<br />

► The DB2 data sharing group configuration.<br />

► The WebSphere MQ queue sharing group configuration.<br />

► The WebSphere Business Integration Message Broker configuration.<br />

► Automatic Restart Management configuration.<br />

► The configuration manager platform.<br />

► An overview of WebSphere Business Integration Message Broker<br />

SupportPac IP13.<br />


© Copyright <strong>IBM</strong> Corp. 2004. All rights reserved. 21


3.1 High Availability configurations<br />

A high availability environment with WebSphere Business Integration Message<br />

Broker on z/OS requires the use of WebSphere MQ queue sharing groups as<br />

described in Chapter 2, “Design decisions that affect high availability” on page 9.<br />

That setup in turn implies the need for a coupling facility and at least two z/OS<br />

systems in a sysplex that host a DB2 DSG.<br />

These components, employing two z/OS images for simplicity, form the basis of<br />

the following HA configuration descriptions and, in the next section, the test<br />

environment that we used for this paper. The choice between the two<br />

configurations described below depends on your business needs and the<br />

available capacity of your environment.<br />

3.1.1 Active-active<br />

An active-active setup describes a business environment where both z/OS<br />

images run WebSphere Business Integration Message Broker to process<br />

messages from the shared queue. Both systems process the business<br />

application load. In the event of a failure on one image, the other is still available<br />

to carry on processing alone for the duration of the recovery. While this setup<br />

provides HA and fully utilizes the two machines in daily operation, it may cause<br />

performance degradation during recovery.<br />

3.1.2 Active-passive<br />

An active-passive setup describes a business environment where only one of<br />

the z/OS images runs the daily broker business. In the event of a failure on this<br />

image, the second z/OS image is available on hot standby to pick up the load<br />

from the shared queue. While this provides for HA and also maintains throughput<br />

during a failure, in normal operation only one machine is fully utilized.<br />



3.2 Test environment topology<br />

The HA environment that we used for this paper employed two z/OS images to<br />

process work in an “active-active” configuration. Figure 3-1 illustrates the<br />

topology of the system components listed in Chapter 4, “Failover scenarios” on<br />

page 33.<br />

Figure 3-1 Queue sharing group topology<br />

Two z/OS systems in a sysplex host a DB2 data sharing group and a WebSphere<br />

MQ queue sharing group. A WebSphere Business Integration Message Broker<br />

runs on each system, connected to the respective queue manager. The brokers<br />

are administered by the configuration manager installed on a ThinkPad running<br />

Windows. The application workload to test the configuration is supplied by<br />

SupportPac IP13.<br />



3.3 The z/OS LPARs<br />

There are three LPARs in the sysplex running on a 9672 model XZ7 with an<br />

internal coupling facility. For simplicity, only two of the LPARs are used for the<br />

active-active configuration, as depicted in Figure 3-1 on page 23. One of the<br />

LPARs, MVSM0, has two logical processors, while the second, MVSM2, has just<br />

one. MVSM0 has 1536 MB of real storage configured, while MVSM2 has<br />

1024 MB.<br />

Otherwise, the MVS images have similar configurations with RRS, DB2,<br />

WebSphere MQ, WebSphere Business Integration Message Broker, and TCP/IP<br />

all active. UNIX® System Services is running with shared Hierarchical File<br />

System (HFS). Both images are running z/OS 1.4.<br />

3.4 The DB2 data sharing group configuration<br />

The DB2 subsystems are data sharing and accessible as an Open Database<br />

Connectivity (ODBC) data source via a Distributed Data Facility (DDF) that is<br />

using TCP/IP. The DB2 subsystems in data sharing group DSN710PM are DFM0<br />

and DFM2, running on MVSM0 and MVSM2 respectively. Figure 3-2 displays the<br />

data sharing group.<br />

Figure 3-2 The DB2 data sharing group<br />



3.5 The WebSphere MQ queue sharing group<br />

configuration<br />

A queue manager is set up on each system and defined to a QSG. To create the<br />

QSG, we defined some new structures to the coupling facility. The procedure for<br />

this is described in the WebSphere MQ z/OS System Setup Guide, which can be<br />

found at:<br />

http://www-306.ibm.com/software/integration/mqfamily/library/manualsa/manuals/<br />

platspecific.html#zos<br />

The queue managers in queue sharing group MB01 are WMQ0 and WMQ2,<br />

running on MVSM0 and MVSM2 respectively. Figure 3-3 shows the QSG.<br />

Figure 3-3 The queue sharing group<br />

3.5.1 Queue sharing group configuration considerations<br />

Some things you should consider when configuring the QSG are:<br />

► If you are using the z/OS Automatic Restart Management (ARM) to restart<br />

queue managers on different z/OS images, then:<br />

– Define every queue manager with a sysplex-wide, unique four character<br />

subsystem name that uses a command prefix string (CPF) scope of S.<br />

– Configure each queue manager with a different channel listener port. In<br />

the event of a system failure, this configuration is important if there is a<br />

requirement to restart the failing queue manager on another system in the<br />

sysplex that is already running a queue manager.<br />

► Use the INITSIZE value of 10 MB as provided in the sample job CSQ4CFRM<br />

when you define the admin structure. Specify a larger amount than this for the<br />

SIZE value so that it can expand. You should also consider adding the<br />

ALLOWAUTOALT(YES) parameter to allow system-initiated alters<br />

(automatic-alter) for this structure.<br />
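As a sketch, a CFRM policy definition for the admin structure of queue sharing<br />

group MB01 along these lines could be added to the IXCMIAPU policy input (the<br />

structure name follows the qsg-nameCSQ_ADMIN convention, sizes are in KB, and<br />

the coupling facility name in PREFLIST is an assumption for this environment):<br />

STRUCTURE NAME(MB01CSQ_ADMIN)<br />

INITSIZE(10240)<br />

SIZE(20480)<br />

ALLOWAUTOALT(YES)<br />

PREFLIST(CF01)<br />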



► Review actual structure sizes regularly and make sure the coupling facility<br />

Resource Manager (CFRM) policy is updated to reflect actual usage.<br />

Application structure allocations grow with use.<br />

► Note that the admin structure is a single point of failure. If a second<br />

coupling facility is available, consider duplexing the structure.<br />

► Define the queue manager logs with SHAREOPTIONS(2 3) as in the Job<br />

Control Language (JCL) of sample job CSQ4BSDS, because shared queue<br />

recovery requires that a queue manager can access the logs of peers within<br />

the QSG.<br />

► Define application CFSTRUCT with CFLEVEL(3) and RECOVER YES when using<br />

persistent messages (the default is level 2 and NO).<br />
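For example, an application structure supporting persistent messages on shared<br />

queues might be defined with MQSC as follows (the structure name APP1 is a<br />

hypothetical example; the queue manager prefixes it with the QSG name):<br />

DEFINE CFSTRUCT(APP1) CFLEVEL(3) RECOVER(YES)<br />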

► If using persistent messages, review log sizes prior to migration to shared<br />

queues. Data logged is slightly larger, and BACKUP CFSTRUCT copies<br />

shared queue messages to the log. Non-persistent messages on a private<br />

queue may be logged under some circumstances (for instance, if the<br />

message stays on the queue for an extended length of time), but this situation<br />

does not occur for non-persistent messages that reside on a shared queue.<br />

► Prevent Hierarchical Storage Manager (HSM) from migrating the queue<br />

manager DB2 tables. If the tables are migrated, you may experience startup<br />

problems.<br />

► Do not attempt to manually change QSG DB2 tables unless directed by <strong>IBM</strong><br />

support. Even apparently innocuous changes may leave DB2 table<br />

information out of sync with the coupling facility or with other tables.<br />

► A QSG allows the use of a group listener (using VIPA) and shared channels,<br />

which may be useful in providing HA for WebSphere MQ applications.<br />
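As an illustration of the last point, a shared inbound channel and a group<br />

listener could be set up along these lines (the channel name and port are<br />

hypothetical; the listener is started on each queue manager in the QSG):<br />

DEFINE CHANNEL(TO.MB01) CHLTYPE(RCVR) TRPTYPE(TCP) QSGDISP(GROUP)<br />

START LISTENER TRPTYPE(TCP) PORT(1501) INDISP(GROUP)<br />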

3.6 The WebSphere Business Integration Message<br />

Broker configuration<br />

A message broker is created for each of the queue managers described<br />

previously. The message broker names are WMQ0BRK and WMQ2BRK, running<br />

on MVSM0 and MVSM2 respectively.<br />

You can find instructions for creating the broker at:<br />

http://publib.boulder.ibm.com/infocenter/wbihelp/index.jsp<br />



3.6.1 Message Broker configuration considerations<br />

Some things to consider when configuring the Message Broker are:<br />

► Each Message Broker connected to the QSG should run under a different<br />

user ID. One of the reasons for this configuration is that the broker tables<br />

must be created with unique names in the DB2 data sharing group. The user<br />

ID, which is used for variable DB2_TABLE_OWNER in the mqsicompcif file, is<br />

prepended to the broker table names to form the fully qualified unique table<br />

names.<br />
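For example, in each broker’s mqsicompcif file the table owner might be set to<br />

that broker’s started-task user ID, giving each broker its own table names in<br />

the data sharing group (the user ID shown is hypothetical):<br />

DB2_TABLE_OWNER='MQ0BRK'<br />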

► If the desired response to a system failure is to use ARM to restart a broker,<br />

with its queue manager on another z/OS system in the sysplex, then consider<br />

the following:<br />

– The UNIX System Services (USS) environment across the sysplex should<br />

be configured to use shared HFS.<br />

– The broker root directories must not be created under the /var directory.<br />

This is because the /var directory resolves to &SYSNAME/var and thus is<br />

system specific. System specific HFSs are unmounted when the owning<br />

system goes down and are not available to the rest of the sysplex while<br />

that system remains down.<br />

Instead, create a new directory under the sysplex root and create the<br />

broker root directories under this same directory.<br />

For example:<br />

mkdir /wbimb<br />

mkdir /wbimb/WMQ0BRK<br />

The root directory for any given broker is now visible to each z/OS system<br />

in the sysplex and is not unmounted if its host system goes down.<br />

Depending on the number of brokers created, you may also want to create<br />

additional HFSs to be mounted at each broker root directory, or one larger<br />

HFS at the higher directory, in order to prevent filling up the sysplex root<br />

HFS mounted at ‘/’.<br />

– When editing the mqsicompcif file, the DB2 group attach name (in this<br />

case DFPM) should be used for the DB2_SUBSYSTEM variable rather<br />

than specifying a specific DB2 subsystem. This configuration means that<br />

the broker can use another DB2 subsystem in the data sharing group to<br />

access its tables in the event of failure.<br />

– Use of BP0 as the DB2 buffer pool chosen for the broker database is not<br />

recommended. Furthermore, to enable the broker to restart on another<br />

system in the sysplex the buffer pool selected needs to be active on that<br />

system. Use the alter bufferpool command to activate it. If the buffer<br />



pool is not available, errors occur on the system to which the broker is<br />

moved. Figure 3-4 lists those errors.<br />

Figure 3-4 Buffer Pool 2 is not available to DB2 subsystem DFM0<br />

– Having defined the local broker DB2 buffer pools on the relevant systems,<br />

you must then allocate a global buffer pool to allow data to be shared<br />

between the DB2 subsystems. This configuration requires defining a new<br />

structure to the coupling facility, for example called:<br />

DSN710PM_GBP2<br />



Figure 3-5 Global Buffer Pool 2 not defined<br />

If the global buffer pool is not available, an error is written to the system<br />

log of the system to which the broker has been moved. Figure 3-5 illustrates<br />

this error.<br />
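Activating the broker buffer pool on the standby system can be sketched with<br />

the DB2 alter bufferpool command (buffer pool BP2 and the size are assumptions<br />

for illustration, and -DFM2 is assumed to be the command prefix of the DB2<br />

subsystem on which the broker may be restarted):<br />

-DFM2 ALTER BUFFERPOOL(BP2) VPSIZE(1000)<br />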

3.6.2 Additional Message Broker configuration hints<br />

The following are additional Message Broker configuration hints:<br />

► Before running the mqsicreatebroker command make sure you have<br />

completed the actions in the section Setting up your OMVS user ID with<br />

instructions on setting the USS environment variables PATH and NLSPATH.<br />

Otherwise, the command and the corresponding messages are not found.<br />

You can find Setting up your OMVS user ID in the Message Broker section of<br />

the <strong>IBM</strong> WebSphere Business Integration Information Center z/OS section,<br />

under the chapter Configuring the Broker Domain.<br />

The Information Center is online at:<br />

http://publib.boulder.ibm.com/infocenter/wbihelp/index.jsp<br />

► The SYSTEM.BROKER.* queues must be private queues defined to the<br />

broker’s queue manager. They cannot be shared.<br />

► To enable the broker to use ARM to restart it, customize the ARM section of<br />

the mqsicompcif file and run the mqsicustomize program. The following are<br />

example lines from mqsicompcif:<br />

USE_ARM=’YES’<br />

ARM_ELEMENTNAME=’WMQ0BRK’<br />

ARM_ELEMENTTYPE=’SYSWMQI’<br />

You can find a downloadable sample ZIP file in Appendix B, “Additional<br />

material” on page 53.<br />



3.7 Automatic Restart Management configuration<br />

On z/OS in a sysplex environment, a program can enhance its recovery potential<br />

by registering as an element of ARM. You can reduce the impact of an<br />

unexpected error to an element by using ARM, because MVS can restart it<br />

automatically without operator intervention. Program recovery via ARM is<br />

provided by activating an ARM policy using the SETXCF START command.<br />
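For example, a policy named MBHAPOL (a hypothetical name; the actual policy for<br />

this environment is shown in Appendix A) would be activated from the MVS<br />

console with:<br />

SETXCF START,POLICY,TYPE=ARM,POLNAME=MBHAPOL<br />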

In this environment, the active ARM policy is set to restart the following:<br />

► DB2<br />

► WebSphere MQ<br />

► WebSphere Business Integration Message Broker<br />

If any of these restarts fail, MVS restarts them in place. However, in the event of<br />

a system failure, these elements are restarted on another system in the sysplex.<br />

In this case, DB2 is only restarted in ‘light’ mode on another system. Restart light<br />

enables DB2 to restart with minimal storage footprint to quickly release retained<br />

locks and then terminate normally. Additionally, to enable Internal Resource Lock<br />

Manager (IRLM) to obtain the full benefits of a restart light, the ARM policy for the<br />

IRLM element should specify PC=YES.<br />

The JCL used to create the ARM policy for our testing environment is illustrated<br />

in Appendix A, “Sample code” on page 49. A ZIP file containing this JCL is<br />

available for download in Appendix B, “Additional material” on page 53.<br />

3.8 The configuration manager platform<br />

All components of WebSphere Business Integration Message Broker V5 Fix<br />

Pack 3 were installed on a T40 ThinkPad along with WebSphere MQ V5.3 FP5<br />

and DB2 V8.1 FP2. A set of WebSphere MQ channels was built between the<br />

mobile computer queue manager and each z/OS queue manager.<br />

The SupportPac IP13 message flows were imported into the workspace as per<br />

installation instructions.<br />

A .bar file was built for deployment of the DB2U message flow. The .bar file was<br />

deployed to both brokers.<br />



3.9 An overview of WebSphere Business Integration<br />

Message Broker SupportPac IP13<br />

<strong>IBM</strong> provides SupportPac IP13, which can be used to check the setup of a z/OS<br />

system and its WebSphere MQ and WebSphere Business Integration Message<br />

Broker configuration. SupportPac IP13 includes example flows and programs<br />

with documentation that enables quick performance and health checks on your<br />

z/OS system.<br />

The SupportPac IP13 is available online at the following Web address:<br />

http://www-1.ibm.com/support/docview.wss?rs=203&uid=swg24006892&loc=en_US&cs=utf-8&lang=en<br />

SupportPac IP13 is used in the testing environment for this paper to provide work<br />

for the brokers and to measure transaction rates. Broker statistics are used in<br />

conjunction with SupportPac IP13 transaction rate data to illustrate the effects of<br />

the various scenarios tested in Chapter 4, “Failover scenarios” on page 33.<br />



Chapter 4. Failover scenarios<br />

This chapter describes the high availability failover scenarios that we tested. The<br />

resultant application and broker statistics are shown along with our conclusions<br />

and explanations.<br />

This chapter describes the following failover scenarios:<br />

► Scenario 1 - Initial state with all components active.<br />

► Scenario 2 - Execution group failover.<br />

► Scenario 3 - Message Broker failover.<br />

► Scenario 4 - Queue manager failover.<br />

► Scenario 5 - DB2 failover.<br />

► Scenario 6 - z/OS system failover.<br />



4.1 Test environment setup<br />

As previously mentioned, the SupportPac IP13 batch job (OEMPUTX) was used<br />

to drive a message load to a shared queue.<br />

Note: For further information about all components of the <strong>IBM</strong> Category 2<br />

SupportPac IP13 refer to the documentation available from:<br />

http://www-306.ibm.com/software/integration/support/supportpacs/<br />

The SupportPac IP13 message flow (DB2U) running on both brokers consumed<br />

these messages, then placed reply messages on a second shared queue. The<br />

reply messages were subsequently picked up by OEMPUTX, completing the<br />

request-reply loop. Statistics generated by OEMPUTX were compared to<br />

statistics generated by the Message Broker and the results evaluated. The DB2U<br />

message flow also updates a DB2 database called SHAREPRICES.<br />

We executed all test scenarios at least three times to provide reasonable<br />

accuracy in reporting statistics.<br />

Transaction rates for the applications (running on the MVSM0 and MVSM2<br />

LPARs) and the brokers (WMQ0BRK, WMQ2BRK) can be compared only within<br />

a given test scenario. Comparison of these rates between test scenarios is<br />

meaningless.<br />

4.1.1 SupportPac IP13 setup<br />

This section provides details on the SupportPac IP13 setup.<br />

DB2 configuration<br />

JCL is supplied with SupportPac IP13 to both set up the application environment<br />

and run the tests. Job JDB2DEFS creates a DB2 SHAREPRICES table and<br />

inserts some initial data.<br />

Since this table is unique, the JDB2DEFS job should only be run once. However,<br />

the JCL in this job would normally create the table by taking the user ID of one<br />

broker as the schema name. When the other broker later tries to access the<br />

table, it refers to it by taking a different user ID, under which it runs, as the<br />

schema. This is because in each broker’s dsnaoini file the CURRENTSQLID is<br />

set to the user ID of that broker. So, the second broker is not able to access the<br />

table.<br />

To avoid this problem in the environment for this paper, broker 2 was given<br />

authority to set its CURRENTSQLID to the user ID of broker 0. However, a better<br />



solution for setting up SupportPac IP13 for running in a sysplex is to follow the<br />

steps below:<br />

1. Create a new RACF group that will be the schema name for the unique<br />

SupportPac IP13 DB2 table.<br />

2. Connect the user IDs under which the broker started tasks run to the<br />

previously created RACF group.<br />

3. Create the SupportPac IP13 SHAREPRICES table using job JDB2DEFS,<br />

inserting the new RACF group name as the schema.<br />

4. Set the CURRENTSQLID in the ESQL of the message flow. See DB2U message<br />

flow configuration below for details.<br />
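Steps 1 and 2 might look like the following RACF commands, using the group<br />

name SYSDSP from the ESQL example below (the broker user IDs MQ0BRK and<br />

MQ2BRK are hypothetical):<br />

ADDGROUP SYSDSP<br />

CONNECT MQ0BRK GROUP(SYSDSP)<br />

CONNECT MQ2BRK GROUP(SYSDSP)<br />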

OEMPUTX batch job configuration<br />

There are several parameters that can be passed to OEMPUTX in order to<br />

cause different behavior of the application. The parameters we chose for our<br />

testing included:<br />

-n25000 Total number of messages to put, in this case 25000.<br />

-m4 Causes the program to run for four minutes. In the test<br />

environment, no single broker was capable of processing 25000<br />

messages within four minutes, so these first two parameters<br />

provided a steady-state message flow rate.<br />

-gm Causes OEMPUTX to use the same MQMD.MsgId for all MQPUTs in<br />

the loop, and MQGET replies by this MsgId. With this option,<br />

there is no Message Broker affinity.<br />

-w45 Sets the MQGET MQMD.WaitInterval, in this case to 45 seconds.<br />

-c Commit every msgs_in_loop messages. msgs_in_loop was not<br />

set and the default is one message per batch.<br />

We also used non-persistent messages.<br />

DB2U message flow configuration<br />

Recall that in each broker’s dsnaoini file, the CURRENTSQLID is set to the user<br />

ID of that broker. This configuration would normally prevent brokers from<br />

accessing DB2 tables outside of their schema. To circumvent this behavior:<br />

► Add the following line of ESQL to the DB2U message flow to set the<br />

CURRENTSQLID to the RACF group name created for the SupportPac IP13<br />

SHAREPRICES table (see Chapter 3, “Topology and system setup” on<br />

page 21). This value has to have single quotes surrounding it for z/OS DB2 to<br />

process it correctly, as shown in the example below:<br />

PASSTHRU('SET CURRENT SQLID= ''SYSDSP''');<br />



► In a production environment, you may prefer to pass the CURRENTSQLID as<br />

part of the input message and perform the set via a variable. We did not test<br />

the following ESQL, but we have provided it as a sample:<br />

DECLARE ID CHARACTER;<br />

SET ID = InputBody.My.CurrentSQLID;<br />

PASSTHRU('{SET SQLID = (?)}', ID);<br />

4.1.2 Message Broker configuration<br />

In our testing, we used WebSphere Business Integration Message Broker<br />

without any tuning. We built a single execution group for each broker, and the<br />

same .bar file was deployed to the execution group. The final task necessary to<br />

run the tests was to turn message flow accounting (archive data) on. Statistical<br />

data is written to a message queue based on the collection interval defined for<br />

the broker. The first step is to build a subscription to the Message Broker to<br />

publish statistics. You can build the subscription as follows:<br />

► The subscription topic is: $SYS/Broker/+/StatisticsAccounting/#<br />

► This subscription must be put to the SYSTEM.BROKER.CONTROL.QUEUE<br />

► Specify a private queue to publish the statistics to, for example:<br />

STATS.IN.WMQ0BRK<br />
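The private statistics queue can be defined with MQSC, for example (queue<br />

name taken from the list above; no attributes beyond the defaults are required):<br />

DEFINE QLOCAL(STATS.IN.WMQ0BRK)<br />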

The IH03 SupportPac (RFHUTIL) was used to put the subscription and retrieve<br />

the XML statistics messages as they were produced. Once a statistics message<br />

is successfully read into RFHUTIL, select the Data tab. Then, select the XML<br />

radio button under Data Format. The XML tags of the statistics message are self<br />

explanatory.<br />

For easier viewing, you can import this file into Excel using the following<br />

procedure:<br />

1. From the Data tab panel in RFHUTIL, press Ctrl+A to select the XML<br />

message, then Ctrl+C to copy it.<br />

2. Paste the data into WordPad (not NotePad). Save the data to a file.<br />

3. This file can then be imported into Excel. In Excel, select File → Open, then<br />

navigate to the directory you just saved the WordPad file in. At the bottom of<br />

the navigation screen, select All File Types and select the file.<br />

4. The Text Import Wizard opens. Click Next.<br />

5. On the second panel of the Text Import Wizard in the Delimiter section, select<br />

Other and in the box to the right of Other type a double quote (“). Then select<br />

Finish. With a little re-sizing, you can easily read the statistics in the<br />

spreadsheet.<br />



In order for the Message Broker to publish statistics, the following command was<br />

issued at the MVS console:<br />

f WMQ0BRK,cs a=yes,g=yes,j=yes,n=basic,t=basic,o=xml,c=active<br />

4.2 Scenario 1 - Initial state with all components active<br />

The first scenario that was measured involved all components of the environment<br />

active in their normal functioning state. The SupportPac IP13 batch jobs<br />

submitted to both systems allowed statistics to be gathered for this control<br />

situation.<br />

Figure 4-1 displays the configuration.<br />

Figure 4-1 All components active<br />


The batch jobs provided work for the brokers for two minutes. The results from<br />

the SupportPac IP13 measurements and the Message Broker statistics recorded<br />

are displayed in Table 4-1.<br />

Table 4-1 SupportPac IP13 and Message Broker statistics<br />

IP13 Statistics MVSM0 MVSM2 Total<br />

Total Transactions 9884 9568 19450<br />

Elapsed Time (seconds) 119.942 119.615<br />

Application CPU Time (seconds) 9.419 11.479<br />

Transaction Rate (trans/sec) 82.406 79.973<br />

Round trip per msg (ms) 12.134 12.504<br />

Average App CPU per msg (ms) 0.952 1.199<br />

Broker Statistics WMQ0BRK WMQ2BRK<br />

Total Number Input Messages 9234 10216 19450<br />

Total Elapsed Time (seconds) 96.401 98.762<br />

Total CPU Time (seconds) 31.590 33.167<br />

Conclusions<br />

From the results, we concluded the following:<br />

► The total number of transactions processed by the SupportPac IP13<br />

applications on both systems equals the total number of messages processed<br />

by the two brokers combined.<br />

► A greater number of SupportPac IP13 application transactions are processed<br />

by the MVSM0 LPAR, while a greater number of messages are processed by<br />

the broker on MVSM2. This slight imbalance is due to the different resources<br />

available to each system.<br />

4.3 Scenario 2 - Execution group failover<br />

In the second scenario the execution group for WMQ2BRK was made to fail by<br />

issuing a cancel command (C SAMPLE2) to the MVS console, as illustrated in<br />

Figure 4-2 on page 39. Message Broker execution groups are automatically<br />

recovered by the Message Broker, so no action is required by ARM. For this and<br />

all subsequent tests, Coordinated Transaction was selected in the deployment<br />

descriptor of the .bar file.<br />



Figure 4-2 Execution Group failover<br />

The batch jobs provided work for the brokers for four minutes. Table 4-2 records<br />

the results from the SupportPac IP13 measurements and the Message Broker<br />

statistics.<br />

Table 4-2 SupportPac IP13 and Message Broker statistics<br />

IP13 Statistics MVSM0 MVSM2 Total<br />

Total Transactions 15212 14584 29796<br />


Elapsed Time (seconds) 239.294 239.258<br />

Application CPU Time (seconds) 17.172 18.454<br />

Transaction Rate (trans/sec) 63.570 60.955<br />

Round trip per msg (ms) 15.730 16.405<br />

Average App CPU per msg (ms) 1.128 1.265<br />




Broker Statistics WMQ0BRK WMQ2BRK<br />

Total Number Input Messages 17051 12010 29061<br />

Total Elapsed Time (seconds) 181.542 124.319<br />

Total CPU Time (seconds) 61.153 42.033<br />

Conclusions<br />

From the results, we concluded the following:<br />

► At first glance, it would seem as though messages were lost under this test,<br />

but this is not the case for this or any other test scenario. When the execution<br />

group is cancelled, the statistics message is dumped by the broker. Upon<br />

successful execution group startup, a new statistics message is created, the<br />

interval time is reset, and accounting starts fresh. Since the execution group<br />

was cancelled very close to the beginning of the test and it recovered quickly,<br />

there are 735 messages that WMQ2BRK processed that are not accounted<br />

for in the above table.<br />

► The high number of messages consumed by WMQ2BRK underscores the<br />

speed of the execution group recovery.<br />

► The higher number of messages consumed by WMQ0BRK compared to<br />

transactions completed on MVSM0 illustrates how broker 0 processes<br />

messages from the SupportPac IP13 batch jobs on both systems, taking the<br />

extra load while the execution group is down on MVSM2.<br />

4.4 Scenario 3 - Message Broker failover<br />

The third scenario tests the failure of the WMQ2BRK Message Broker. The failure<br />

was simulated by issuing an MVS cancel command (C WMQ2BRK,ARMRESTART), as<br />

illustrated in Figure 4-3 on page 41. The ARM policy in effect ensures the broker<br />

restarts immediately. You can find the policy details in Appendix A, “Sample<br />

code” on page 49.<br />



Figure 4-3 Message Broker failover<br />

The batch jobs provided work for the brokers for four minutes. Table 4-3 records<br />

the results from the SupportPac IP13 measurements and the Message Broker<br />

statistics.<br />

Table 4-3 SupportPac IP13 and Message Broker statistics<br />

IP13 Statistics                     MVSM0      MVSM2      Total<br />
Total Transactions                  20564      9608       30172<br />
Elapsed Time (seconds)              239.539    239.721<br />
Application CPU Time (seconds)      21.363     9.934<br />
Transaction Rate (trans/sec)        85.847     41.912<br />
Round trip per msg (ms)             46572      88645<br />
Average App CPU per msg (ms)        1045       1034<br />


Chapter 4. Failover scenarios 41


Broker Statistics                   WMQ0BRK    WMQ2BRK    Total<br />
Total Number Input Messages         23098      unknown    23098<br />
Total Elapsed Time (seconds)        224.622    unknown<br />
Total CPU Time (seconds)            76.918     unknown<br />

Conclusions<br />

From the results, we concluded:<br />
► Again, the statistics message is discarded by the broker, so no statistics are available for WMQ2BRK. Cancelling the broker caused multiple SVC dumps to be taken, and CPU usage on MVSM2 was at 100% for some time; this explains why the transaction rate for MVSM2 is less than half that of MVSM0.<br />
► The higher number of messages consumed by WMQ0BRK, compared with the transactions completed on MVSM0, illustrates how broker 0 processes messages from the SupportPac IP13 batch jobs on both systems, taking on the extra load while WMQ2BRK is down.<br />
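As a cross-check on Table 4-3, the headline numbers are mutually consistent. The following sketch (values transcribed from the table; this is only a sanity check, not part of the original test harness) verifies the MVSM0 transaction rate and quantifies the extra load WMQ0BRK absorbed:

```python
# Figures transcribed from Table 4-3 (scenario 3, Message Broker failover).
mvsm0_transactions = 20564       # IP13 total transactions driven on MVSM0
mvsm0_elapsed_s = 239.539        # IP13 elapsed time on MVSM0
wmq0brk_input = 23098            # messages consumed by broker WMQ0BRK

# The reported transaction rate is simply total transactions / elapsed time.
rate = mvsm0_transactions / mvsm0_elapsed_s        # ~85.85 trans/sec

# WMQ0BRK consumed more messages than MVSM0's own batch jobs generated;
# the surplus is the work it picked up from MVSM2's batch jobs while the
# execution group on MVSM2 was down.
surplus = wmq0brk_input - mvsm0_transactions       # 2534 messages
```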

4.5 Scenario 4 - Queue manager failover<br />

Scenario 4 tests the failure of the WMQ2 queue manager. The failure was simulated by issuing a stop command from the MVS console (WMQ2 STOP QMGR MODE(RESTART)), as illustrated in Figure 4-4 on page 43. The ARM policy in effect ensures that the queue manager restarts immediately. Upon successful queue manager startup, the Message Broker dynamically reconnects to the queue manager and issues the following message to the MVS log:<br />

+BIP2091I WMQ2BRK 0 The broker has reconnected to WebSphere Business<br />

Integration successfully. : ImbAdminAgent(1095)<br />

The OEMPUTX batch job does not reconnect to the queue manager after a<br />

failure, so for this test the batch job was submitted only on MVSM0.<br />



Figure 4-4 Queue Manager failover (diagram: the same two-system sysplex topology as Figure 4-3, with queue manager WMQ2 as the failing component)<br />

The batch jobs provided work for the brokers for four minutes. Table 4-4 records<br />

the results from the SupportPac IP13 measurements and the Message Broker<br />

statistics.<br />

Table 4-4 SupportPac IP13 and Message Broker statistics<br />

IP13 Statistics                     MVSM0      MVSM2      Total<br />
Total Transactions                  37204      n.a.       37204<br />
Elapsed Time (seconds)              239.919    n.a.<br />
Application CPU Time (seconds)      35.269     n.a.<br />
Transaction Rate (trans/sec)        155.059    n.a.<br />
Round trip per msg (ms)             25774      n.a.<br />
Average App CPU per msg (ms)        947        n.a.<br />




Broker Statistics                   WMQ0BRK    WMQ2BRK    Total<br />
Total Number Input Messages         25460      11019      36479<br />
Total Elapsed Time (seconds)        224.761    119.003<br />
Total CPU Time (seconds)            84.247     36.560<br />

Conclusions<br />

From the results, we concluded that the statistics message is lost prior to the queue manager restart, since statistics gathering is a WebSphere MQ Pub/Sub function. The statistics shown for WMQ2BRK are from after the WMQ2 queue manager restart. WMQ2BRK processed only about 30% of the total messages; this is due to the length of time it takes the queue manager to restart and the broker to reconnect.<br />
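The "about 30%" figure can be read directly off Table 4-4. A quick sketch (values transcribed from the table; a sanity check only) makes the arithmetic explicit:

```python
# Figures transcribed from Table 4-4 (scenario 4, queue manager failover).
ip13_transactions = 37204        # all driven from MVSM0 in this test
ip13_elapsed_s = 239.919

wmq0brk_input = 25460            # broker input messages on WMQ0BRK
wmq2brk_input = 11019            # counted only after WMQ2 restarted

total_input = wmq0brk_input + wmq2brk_input        # 36479, as in the table
share = wmq2brk_input / total_input                # ~0.30: WMQ2BRK's share

# The MVSM0 transaction rate again matches transactions / elapsed time.
rate = ip13_transactions / ip13_elapsed_s          # ~155.1 trans/sec
```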

4.6 Scenario 5 - DB2 failover<br />

There is a known issue with the Message Broker: if DB2 fails while the broker is active, the broker does not dynamically reconnect as it does after a queue manager failure, as illustrated in Figure 4-5 on page 45. So, while ARM restarts DB2, the Message Broker requires a manual restart to resume work. The DB2 connection is not actively managed by the Message Broker; in fact, the broker would not notice that DB2 had failed until it next needed to access a database. This issue is being addressed in APAR PQ92596. As a result, no statistics were produced for this test.<br />
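The behavior described above, where a dead DB2 connection goes unnoticed until the broker next needs the database, is typical of passively held connections. The sketch below (Python, purely illustrative; the class and names are invented and bear no relation to the broker's actual internals) shows why the failure only surfaces on the next access:

```python
class StaleConnectionError(Exception):
    """Raised only when the dead connection is actually used."""

class PassiveConnection:
    """Stands in for the broker's ODBC connection to DB2: nothing
    monitors it, so a backend failure is invisible until use."""
    def __init__(self):
        self.alive = True

    def query(self, sql):
        if not self.alive:
            raise StaleConnectionError("lost contact with the database")
        return "ok"

conn = PassiveConnection()
assert conn.query("SELECT 1 FROM SYSIBM.SYSDUMMY1") == "ok"

conn.alive = False        # DB2 fails; no error is raised at this point,
                          # because the connection is not actively managed
detected = False
try:
    conn.query("SELECT 1 FROM SYSIBM.SYSDUMMY1")
except StaleConnectionError:
    detected = True       # the failure is noticed only on the next access
```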



Figure 4-5 DB2 failover (diagram: the same two-system sysplex topology, with DB2 subsystem DFM2 as the failing component)<br />

4.7 Scenario 6 - z/OS system failover<br />


The final scenario tests the failure of MVSM2. The failure was simulated by removing MVSM2 from the sysplex with a system reset from the hardware console, as illustrated in Figure 4-6 on page 46. ARM restarts the DFM2 DB2 subsystem in light mode on MVSM0 to release any retained locks. It then restarts queue manager WMQ2 and broker WMQ2BRK on the surviving system, MVSM0.<br />




Figure 4-6 LPAR failover (diagram: after the system reset, the DB2 member, queue manager, broker, execution groups, and message flows from the failed image run alongside those of the surviving z/OS image)<br />

Because the OEMPUTX batch job running on MVSM2 ended when the system<br />

went down, there are no statistics for this test. However, to test the operation of<br />

the moved broker, the job was resubmitted on MVSM0, and it was observed that<br />

both brokers functioned normally on the one surviving z/OS image.<br />

Once MVSM2 was brought back into the sysplex with its DB2 subsystem active,<br />

the MVSM2 queue manager and broker were shut down on MVSM0 and<br />

restarted on MVSM2. Once again, the OEMPUTX batch job was submitted and<br />

normal operation was verified.<br />




Conclusions<br />
Though collection of statistics was not appropriate for this scenario, the scenario demonstrated that:<br />
► With the ARM policy provided in Appendix A, “Sample code” on page 49, when the host z/OS system failed, the Message Broker and its prerequisite subsystems were automatically restarted on the chosen surviving z/OS system in the sysplex.<br />
► Successful operation of the moved Message Broker was verified.<br />
► On moving the Message Broker back to its original z/OS system, its successful operation was again verified.<br />

4.8 Summary<br />

We performed various failover tests to demonstrate high availability solutions for WebSphere Business Integration Message Broker on z/OS. Our goal was to verify that no messages were lost, and none were, despite the loss of statistics collection in some of the failover scenarios.<br />





Appendix A. Sample code<br />

This appendix provides sample code that was used for this paper.<br />


© Copyright IBM Corp. 2004. All rights reserved. 49<br />


ARM policy<br />

Example A-1 displays the ARM policy we created and activated for the sysplex<br />

on which the failover scenarios (in Chapter 4, “Failover scenarios” on page 33)<br />

were performed. The policy name is POLICY1. The elements to be restarted are<br />

in GROUP1. These elements consist of DB2, IRLM, WebSphere MQ, and<br />

WebSphere Business Integration Message Broker.<br />

Example A-1 ARM policy<br />

DATA TYPE(ARM)<br />

REPORT(YES)<br />

DEFINE POLICY NAME(POLICY1) REPLACE(YES)<br />

RESTART_ORDER<br />

LEVEL(1)<br />

ELEMENT_NAME(DSN710PMDFM0,DSN710PMDFM2,<br />

DFPMIRLMIFM0001,DFPMIRLMIFM2003)<br />

LEVEL(2)<br />

ELEMENT_NAME(SYSMQMGRWMQ0,SYSMQMGRWMQ2)<br />

LEVEL(3)<br />

ELEMENT_NAME(SYSWMQI_WMQ0BRK,SYSWMQI_WMQ2BRK)<br />

RESTART_GROUP(DEFAULT)<br />

ELEMENT(*)<br />

RESTART_ATTEMPTS(0) /* JOBS NOT TO BE RESTARTED BY ARM */<br />

RESTART_GROUP(GROUP1)<br />

TARGET_SYSTEM(MVM0,MVM2) /* Z/OS SYSTEM NAME(S) */<br />

RESTART_PACING(20)<br />

ELEMENT(DSN710PMDFM0)<br />

RESTART_METHOD(SYSTERM,STC,'#DFM0 STA DB2,LIGHT(YES)')<br />

ELEMENT(DSN710PMDFM2)<br />

RESTART_METHOD(SYSTERM,STC,'#DFM2 STA DB2,LIGHT(YES)')<br />

ELEMENT(DFPMIRLMIFM0001)<br />

RESTART_METHOD(SYSTERM,STC,'#DFM0 S DFM0IRLM,PC=YES')<br />

ELEMENT(DFPMIRLMIFM2003)<br />

RESTART_METHOD(SYSTERM,STC,'#DFM2 S DFM2IRLM,PC=YES')<br />

ELEMENT(SYSMQMGRWMQ0)<br />

RESTART_ATTEMPTS(3,300)<br />

RESTART_TIMEOUT(120)<br />

TERMTYPE(ALLTERM)<br />

RESTART_METHOD(BOTH,STC,'WMQ0 START QMGR')<br />

ELEMENT(SYSMQMGRWMQ2)<br />

RESTART_ATTEMPTS(3,300)<br />

RESTART_TIMEOUT(120)<br />

TERMTYPE(ALLTERM)<br />

RESTART_METHOD(BOTH,STC,'WMQ2 START QMGR')<br />

50 High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5


ELEMENT(SYSWMQI_WMQ0BRK)<br />

RESTART_ATTEMPTS(3,300)<br />

RESTART_TIMEOUT(120)<br />

TERMTYPE(ALLTERM)<br />

RESTART_METHOD(BOTH,STC,'S WMQ0BRK')<br />

ELEMENT(SYSWMQI_WMQ2BRK)<br />

RESTART_ATTEMPTS(3,300)<br />

RESTART_TIMEOUT(120)<br />

TERMTYPE(ALLTERM)<br />

RESTART_METHOD(BOTH,STC,'S WMQ2BRK')<br />
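To make the RESTART_ORDER section of the policy easier to follow, here is an illustrative sketch (deliberately naive Python; it handles only this sample, not the full ARM policy grammar) that extracts the restart levels, showing DB2 and IRLM restarting first, then the queue managers, then the brokers:

```python
import re

# The RESTART_ORDER section of POLICY1, transcribed from Example A-1.
restart_order = """
LEVEL(1)
ELEMENT_NAME(DSN710PMDFM0,DSN710PMDFM2,DFPMIRLMIFM0001,DFPMIRLMIFM2003)
LEVEL(2)
ELEMENT_NAME(SYSMQMGRWMQ0,SYSMQMGRWMQ2)
LEVEL(3)
ELEMENT_NAME(SYSWMQI_WMQ0BRK,SYSWMQI_WMQ2BRK)
"""

# Pair each LEVEL(n) with the ELEMENT_NAME list that follows it.
levels = {
    int(n): names.split(",")
    for n, names in re.findall(
        r"LEVEL\((\d+)\)\s*ELEMENT_NAME\(([^)]*)\)", restart_order
    )
}
# Lower levels restart first: DB2/IRLM before MQ, MQ before the brokers.
```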

This policy is also available as a downloadable ZIP file in Appendix B, “Additional<br />

material” on page 53.<br />

Once downloaded and unzipped, the file should be uploaded to the z/OS system using a binary FTP transfer. The resulting data set is in TSO TRANSMIT format. To extract the ARM policy JCL, issue a TSO command similar to the following:<br />

RECEIVE INDSNAME(ARMPOL.XMIT)<br />

Once created, the policy can be activated with the following system command:<br />

SETXCF START,POLICY,TYPE=ARM,POLNAME=POLICY1<br />

Broker customization input file<br />

An example of the broker customization input file, mqsicompcif, used for broker WMQ2BRK is available as a download from Appendix B, “Additional material” on page 53. Its size prevents it from being displayed in this appendix.<br />

Once downloaded and unzipped, the file should be uploaded to the z/OS system using a binary FTP transfer. The resulting data set is in TSO TRANSMIT format. To extract the mqsicompcif file, issue a TSO command similar to the following:<br />

RECEIVE INDSNAME(MQSI.COMPCIF.XMIT)<br />

Appendix A. Sample code 51




Appendix B. Additional material<br />

This paper refers to additional material that can be downloaded from the Internet<br />

as described below.<br />

Locating the Web material<br />

The Web material associated with this paper is available in softcopy on the Internet from the IBM Redbooks Web server. Point your Web browser to:<br />
ftp://www.redbooks.ibm.com/redbooks/REDP3894<br />
Alternatively, you can go to the IBM Redbooks Web site at:<br />
ibm.com/redbooks<br />
Select the Additional materials and open the directory that corresponds with the redbook form number, REDP3894.<br />



Using the Web material<br />

The additional Web material that accompanies this Redpaper includes the following files:<br />
File name              Description<br />
armpol.xmit.zip        ARM policy, zipped TSO TRANSMIT data set<br />
mqsicompcif.xmit.zip   Broker customization input file, zipped TSO TRANSMIT data set<br />

How to use the Web material<br />

Create a subdirectory (folder) on your workstation, and unzip the contents of the<br />

Web material ZIP file into this folder.<br />



Abbreviations and acronyms<br />
ARM    Automatic Restart Management<br />
CA     continuous availability<br />
CF     coupling facility<br />
CFRM   Coupling Facility Resource Manager<br />
CPF    command prefix string<br />
DDF    Distributed Data Facility<br />
DSG    DB2 data sharing group<br />
HA     high availability<br />
HFS    Hierarchical File System<br />
HSM    Hierarchical Storage Manager<br />
IBM    International Business Machines Corporation<br />
IRLM   Internal Resource Lock Manager<br />
ITSO   International Technical Support Organization<br />
JCL    Job Control Language<br />
LPAR   Logical Partition<br />
ODBC   Open Database Connectivity<br />
QSG    queue sharing group<br />
RRS    Resource Recovery Services<br />
USS    UNIX System Services<br />
VIPA   Virtual IP Address<br />
WMQ    WebSphere MQ<br />





Related publications<br />
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this Redpaper.<br />
IBM Redbooks<br />
For information about ordering these publications, see “How to get IBM Redbooks” on page 58. Note that some of the documents referenced here may be available in softcopy only.<br />
► WebSphere Business Integration Pub/Sub Solutions, SG24-6088<br />
► Highly Available WebSphere Business Integration Solutions, SG24-7006<br />
Other publications<br />
These publications are also relevant as further information sources:<br />
► Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for higher availability, GM13-0026<br />
Online resources<br />
These Web sites and URLs are also relevant as further information sources:<br />
► Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for higher availability, GM13-0026<br />
http://www-1.ibm.com/servers/eserver/zseries/pso/whitepaper.html<br />
http://www-1.ibm.com/servers/eserver/zseries/library/techpapers/gm130165.html<br />
► OS/390® and z/OS TCP/IP in the Parallel Sysplex Environment - Blurring the Boundaries<br />
http://www-1.ibm.com/servers/eserver/zseries/pso/whitepaper.html<br />
http://www-1.ibm.com/servers/eserver/zseries/library/techpapers/pdf/gm130026.pdf<br />



How to get IBM Redbooks<br />
You can search for, view, or download Redbooks, Redpapers, Hints and Tips, draft publications and Additional materials, as well as order hardcopy Redbooks or CD-ROMs, at this Web site:<br />
ibm.com/redbooks<br />
Help from IBM<br />
IBM Support and downloads<br />
ibm.com/support<br />
IBM Global Services<br />
ibm.com/services<br />



Index<br />

A<br />

active-active 22<br />

active-passive 22<br />

Automatic Restart Management (ARM) 4, 25, 30<br />

C<br />

cloned applications, advantages 14<br />

cloned brokers 10<br />

cloned execution groups, advantages 12<br />

cloned execution groups, disadvantages 12<br />

clustering 10<br />

command prefix string (CPF) 25<br />

Continuous availability 4<br />

coupling facility (CF)<br />

failover 4<br />

failure 4<br />

Resource Manager (CFRM) 4, 26<br />

CURRENTSQLID 34–35<br />

D<br />

DB2 data sharing group (DSG) 4<br />

DB2 database called SHAREPRICES 34<br />

DB2 SHAREPRICES table 34<br />

design of message flows 17<br />

Distributed Data Facility (DDF) 24<br />

F<br />

factors contribute to HA of a Message Broker environment 10<br />

H<br />

Hierarchical File System (HFS) 24<br />

Hierarchical Storage Manager (HSM) 26<br />

high availability (HA) 1, 21–22<br />

environment for WebSphere Business Integration<br />

Message Broker 2<br />

failover scenarios 33<br />

I<br />

IBM Category 2 SupportPac IP13 5, 34<br />

IH03 SupportPac (RFHUTIL) 36<br />

important factors when considering availability 10<br />

Internal Resource Lock Manager (IRLM) 30<br />

J<br />

JBDB2DEFS 35<br />

JDB2DEFS 34<br />

Job Control Language (JCL) 26<br />

Job JDB2DEFS 34<br />

L<br />

Logical Partition (LPAR) 4, 24<br />

M<br />

Message Broker, error processing 18<br />

N<br />

network topology, benefits 11<br />

O<br />

OEMPUTX, parameters that can be passed 35<br />

Open Database Connectivity (ODBC) 24<br />

P<br />

Parallel Sysplex 2<br />

Publish and Subscribe functionality 18–19<br />

Q<br />

queue sharing group (QSG) 4, 14<br />

R<br />

Redbooks Web site 58<br />

S<br />

SupportPac IP13<br />

batch job (OEMPUTX) 34<br />

batch jobs 37<br />

message flow (DB2U) 34<br />

SHAREPRICES table 35<br />



U<br />

UNIX System Services (USS) environment 27<br />

W<br />

WebSphere<br />

Business Integration Broker on z/OS 5<br />

Business Integration Message Broker 1, 9<br />

WebSphere MQ<br />

clustering 10, 12–15<br />

clustering, advantages 13<br />

clustering, disadvantages 13<br />

shared queues 12, 14<br />

shared queues, advantages 16<br />

shared queues, disadvantages 16<br />



Back cover<br />
High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5<br />
Develop a highly available WebSphere Business Integration Message Broker solution on z/OS<br />
Configure WebSphere MQ QSGs to support Message Broker in a Sysplex<br />
Example Message Broker high availability implementations<br />
When designing and implementing a production grade Message Broker solution on z/OS, one of the most important factors to consider is high availability.<br />
This IBM Redpaper examines the design considerations inherent in configuring a highly available Message Broker environment.<br />
Also demonstrated is the use of the coupling facility for WebSphere MQ queue sharing groups (QSG) and Automatic Restart Management (ARM) in order to support WebSphere Business Integration Message Broker HA in a sysplex environment.<br />
Finally, examples of the behavior of Message Broker during failover are provided, including transaction rate measurements and throughput statistics.<br />
INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION<br />
Redpaper<br />
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE<br />
IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.<br />
For more information:<br />
ibm.com/redbooks<br />
