Redpaper - IBM Redbooks
High Availability z/OS Solutions<br />
for WebSphere Business<br />
Integration Message Broker V5<br />
Develop a highly available WebSphere<br />
Business Integration Message Broker<br />
solution on z/OS<br />
Configure WebSphere MQ QSGs to<br />
support Message Broker in a<br />
Sysplex<br />
Example Message Broker<br />
high availability<br />
implementations<br />
Front cover<br />
Saida Davies<br />
Dean Barker<br />
Steve Kiernan<br />
Jon Mc Namara<br />
<strong>Redpaper</strong><br />
ibm.com/redbooks
International Technical Support Organization<br />
High Availability z/OS Solutions for WebSphere<br />
Business Integration Message Broker V5<br />
October 2004
Note: Before using this information and the product it supports, read the information in<br />
“Notices” on page v.<br />
First Edition (October 2004)<br />
This edition applies to Version 5, Release 01 of <strong>IBM</strong> WebSphere Business Integration Message<br />
Broker for z/OS (product number 5655-K60).<br />
© Copyright International Business Machines Corporation 2004. All rights reserved.<br />
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP<br />
Schedule Contract with <strong>IBM</strong> Corp.
Contents<br />
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v<br />
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi<br />
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii<br />
The team that wrote this <strong>Redpaper</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii<br />
Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x<br />
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x<br />
Chapter 1. Introduction and technical overview. . . . . . . . . . . . . . . . . . . . . . 1<br />
1.1 Project overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2<br />
1.1.1 Availability levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2<br />
1.2 Testing methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5<br />
1.2.1 HTTP Listener . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7<br />
1.3 Environment overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7<br />
Chapter 2. Design decisions that affect high availability . . . . . . . . . . . . . . 9<br />
2.1 Considerations when designing for high availability . . . . . . . . . . . . . . . . . 10<br />
2.2 High Availability with Message Broker . . . . . . . . . . . . . . . . . . . . . . . . . . . 11<br />
2.3 WebSphere MQ options in supporting HA with Message Broker . . . . . . . 12<br />
2.3.1 WebSphere MQ clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12<br />
2.3.2 WebSphere MQ shared queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14<br />
2.4 Message Broker flow design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17<br />
2.4.1 Affinities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17<br />
2.4.2 Error processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18<br />
2.5 Message Broker networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19<br />
2.5.1 Further considerations with Message Broker networks . . . . . . . . . . 19<br />
Chapter 3. Topology and system setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . 21<br />
3.1 High Availability configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />
3.1.1 Active-active . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />
3.1.2 Active-passive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />
3.2 Test environment topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23<br />
3.3 The z/OS LPARs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24<br />
3.4 The DB2 data sharing group configuration . . . . . . . . . . . . . . . . . . . . . . . . 24<br />
3.5 The WebSphere MQ queue sharing group configuration . . . . . . . . . . . . . 25<br />
3.5.1 Queue sharing group configuration considerations. . . . . . . . . . . . . . 25<br />
3.6 The WebSphere Business Integration Message Broker configuration . . . 26<br />
3.6.1 Message Broker configuration considerations . . . . . . . . . . . . . . . . . 27<br />
3.6.2 Additional Message Broker configuration hints . . . . . . . . . . . . . . . . . 29<br />
3.7 Automatic Restart Management configuration . . . . . . . . . . . . . . . . . . . . . 30<br />
3.8 The configuration manager platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30<br />
3.9 An overview of WebSphere Business Integration Message Broker<br />
SupportPac IP13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31<br />
Chapter 4. Failover scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33<br />
4.1 Test environment setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34<br />
4.1.1 SupportPac IP13 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34<br />
4.1.2 Message Broker configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36<br />
4.2 Scenario 1 - Initial state with all components active . . . . . . . . . . . . . . . . . 37<br />
4.3 Scenario 2 - Execution group failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38<br />
4.4 Scenario 3 - Message Broker failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40<br />
4.5 Scenario 4 - Queue manager failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42<br />
4.6 Scenario 5 - DB2 failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44<br />
4.7 Scenario 6 - z/OS system failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45<br />
4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47<br />
Appendix A. Sample code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49<br />
ARM policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50<br />
Broker customization input file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51<br />
Appendix B. Additional material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53<br />
Locating the Web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53<br />
Using the Web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54<br />
How to use the Web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54<br />
Abbreviations and acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55<br />
Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />
<strong>IBM</strong> <strong>Redbooks</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />
Other publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57<br />
How to get <strong>IBM</strong> <strong>Redbooks</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58<br />
Help from <strong>IBM</strong> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58<br />
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59<br />
iv High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5
Notices<br />
This information was developed for products and services offered in the U.S.A.<br />
<strong>IBM</strong> may not offer the products, services, or features discussed in this document in other countries. Consult<br />
your local <strong>IBM</strong> representative for information on the products and services currently available in your area.<br />
Any reference to an <strong>IBM</strong> product, program, or service is not intended to state or imply that only that <strong>IBM</strong><br />
product, program, or service may be used. Any functionally equivalent product, program, or service that<br />
does not infringe any <strong>IBM</strong> intellectual property right may be used instead. However, it is the user's<br />
responsibility to evaluate and verify the operation of any non-<strong>IBM</strong> product, program, or service.<br />
<strong>IBM</strong> may have patents or pending patent applications covering subject matter described in this document.<br />
The furnishing of this document does not give you any license to these patents. You can send license<br />
inquiries, in writing, to:<br />
<strong>IBM</strong> Director of Licensing, <strong>IBM</strong> Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.<br />
The following paragraph does not apply to the United Kingdom or any other country where such provisions<br />
are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES<br />
THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,<br />
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,<br />
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer<br />
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.<br />
This information could include technical inaccuracies or typographical errors. Changes are periodically made<br />
to the information herein; these changes will be incorporated in new editions of the publication. <strong>IBM</strong> may<br />
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at<br />
any time without notice.<br />
Any references in this information to non-<strong>IBM</strong> Web sites are provided for convenience only and do not in any<br />
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the<br />
materials for this <strong>IBM</strong> product and use of those Web sites is at your own risk.<br />
<strong>IBM</strong> may use or distribute any of the information you supply in any way it believes appropriate without<br />
incurring any obligation to you.<br />
Information concerning non-<strong>IBM</strong> products was obtained from the suppliers of those products, their published<br />
announcements or other publicly available sources. <strong>IBM</strong> has not tested those products and cannot confirm<br />
the accuracy of performance, compatibility or any other claims related to non-<strong>IBM</strong> products. Questions on<br />
the capabilities of non-<strong>IBM</strong> products should be addressed to the suppliers of those products.<br />
This information contains examples of data and reports used in daily business operations. To illustrate them<br />
as completely as possible, the examples include the names of individuals, companies, brands, and products.<br />
All of these names are fictitious and any similarity to the names and addresses used by an actual business<br />
enterprise is entirely coincidental.<br />
COPYRIGHT LICENSE:<br />
This information contains sample application programs in source language, which illustrates programming<br />
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in<br />
any form without payment to <strong>IBM</strong>, for the purposes of developing, using, marketing or distributing application<br />
programs conforming to the application programming interface for the operating platform for which the<br />
sample programs are written. These examples have not been thoroughly tested under all conditions. <strong>IBM</strong>,<br />
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy,<br />
modify, and distribute these sample programs in any form without payment to <strong>IBM</strong> for the purposes of<br />
developing, using, marketing, or distributing application programs conforming to <strong>IBM</strong>'s application<br />
programming interfaces.<br />
Trademarks<br />
The following terms are trademarks of the International Business Machines Corporation in the United States,<br />
other countries, or both:<br />
Eserver®<br />
<strong>Redbooks</strong> (logo)<br />
ibm.com®<br />
z/OS®<br />
zSeries®<br />
DB2®<br />
<strong>IBM</strong>®<br />
Lotus®<br />
MQSeries®<br />
MVS<br />
NetView®<br />
OS/390®<br />
Parallel Sysplex®<br />
<strong>Redbooks</strong><br />
The following terms are trademarks of other companies:<br />
RACF®<br />
S/390®<br />
SupportPac<br />
ThinkPad®<br />
Tivoli®<br />
WebSphere®<br />
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the<br />
United States, other countries, or both.<br />
UNIX is a registered trademark of The Open Group in the United States and other countries.<br />
Other company, product, and service names may be trademarks or service marks of others.<br />
Preface<br />
When designing and implementing a production grade Message Broker solution<br />
on z/OS®, one of the most important factors to consider is high availability. This<br />
<strong>IBM</strong>® <strong>Redpaper</strong> examines the design considerations inherent in configuring a<br />
highly available Message Broker environment. Also demonstrated is the use of<br />
the coupling facility for WebSphere MQ queue sharing groups (QSG) and<br />
Automatic Restart Management (ARM) in order to support WebSphere Business<br />
Integration Message Broker HA in a sysplex environment. Finally, examples of<br />
the behavior of Message Broker during failover are provided, including<br />
transaction rate measurements and throughput statistics.<br />
The team that wrote this <strong>Redpaper</strong><br />
This paper was produced by a team of specialists from around the world, working<br />
with the International Technical Support Organization (ITSO), Raleigh, NC.<br />
Saida Davies is a Project Leader with the<br />
ITSO. She is a certified senior IT specialist and<br />
has 15 years of experience in IT. Saida has<br />
published several <strong>Redbooks</strong> on various<br />
business integration scenarios. She has<br />
experience in the architecture and design of<br />
WebSphere® MQ solutions, has extensive<br />
knowledge of <strong>IBM</strong>’s z/OS operating system,<br />
and a detailed working knowledge of both <strong>IBM</strong><br />
and Independent Software Vendors’ operating<br />
system software. In a customer facing role<br />
with <strong>IBM</strong> Global Services, her role included the<br />
development of services for WebSphere MQ within the z/OS and Windows®<br />
platform. This covered the architecture, scope, design, project management, and<br />
implementation of the software on stand-alone systems or on systems in a<br />
Parallel Sysplex® environment. She has a degree in Computer Science, and her<br />
background includes z/OS systems programming.<br />
From left: Dean, Steve, and Jon<br />
Dean Barker is an IT Specialist at the <strong>IBM</strong> Hursley Laboratories in the UK. He<br />
has over ten years’ experience as an MVS Systems Programmer. He holds a<br />
degree in Chemical Engineering from the University of Manchester Institute of<br />
Science and Technology. He has an excellent knowledge of the z/OS operating<br />
system and sysplex environment. His other areas of expertise include UNIX<br />
System Services and WebSphere Application Server for z/OS. Dean assisted<br />
with the z/OS system setup prerequisites for this project, which enabled the<br />
team to accomplish the full scope of this <strong>Redpaper</strong>.<br />
Steve Kiernan is a Consulting IT Specialist on the New England and Upstate<br />
New York WebSphere technical team. Before joining <strong>IBM</strong> eight years ago, he<br />
spent fifteen years in the banking industry as a mainframe systems programmer.<br />
Since Steve joined <strong>IBM</strong>, he has worked with the entire WebSphere Business<br />
Integration platform and primarily with WebSphere MQ and WebSphere<br />
Business Integration MB on z/OS.<br />
Jon Mc Namara is an IT Specialist in the Hursley WebSphere Business<br />
Integration Services Team. He provides WebSphere Business Integration<br />
customers with a range of expert technical services. Jon’s areas of expertise<br />
include z/OS, WebSphere MQ, WebSphere MQ Integrator, WebSphere Business<br />
Integrator FN, WebSphere Business Integration Message Broker, and<br />
WebSphere Business Integration Event Broker. He is also a recognized expert in<br />
multicast technology.<br />
The <strong>Redpaper</strong> team would like to thank the following people, located in <strong>IBM</strong><br />
Hursley, UK, for their guidance, assistance, and contributions to this edition:<br />
► Gary Willoughby, Manager, WebSphere Business Integration, EMEA<br />
WebSphere Lab Services<br />
► Ralph Bateman, WebSphere MQ Brokers z/OS Change Team specialist<br />
► Karen Burgess, WebSphere MQ z/OS FV & Test<br />
► William Chorlton, IT Specialist, zSeries® DB/DC Support<br />
► Rob Convery, WebSphere MQ System Test<br />
► Peter Edwards, MQSeries® Test<br />
► Emir Garza, Consulting IT Specialist, Software Group/Application Integration<br />
Middleware<br />
► Luisa Lopez de Silanes Ruiz, Consulting IT Specialist, Software<br />
Group/Application Integration Middleware<br />
► Colin Paice, WebSphere MQ Scenarios - z/OS performance specialist, <strong>IBM</strong><br />
Hursley, UK<br />
► Alasdair Paton, WebSphere MQ Brokers - System Test team lead<br />
► Pete Siddall, Software Engineer<br />
► Vicente Suarez, IT Specialist, Software Group/Application Integration<br />
Middleware<br />
Become a published author<br />
Join us for a two- to six-week residency program! Help write an <strong>IBM</strong> redbook<br />
dealing with specific products or solutions, while getting hands-on experience<br />
with leading-edge technologies. You will team with <strong>IBM</strong> technical professionals,<br />
Business Partners, and customers.<br />
Your efforts will help increase product acceptance and customer satisfaction. As<br />
a bonus, you will develop a network of contacts in <strong>IBM</strong> development labs and<br />
increase your productivity and marketability.<br />
Find out more about the residency program, browse the residency index, and<br />
apply online at:<br />
ibm.com/redbooks/residencies.html<br />
Comments welcome<br />
Your comments are important to us!<br />
We want our papers to be as helpful as possible. Send us your comments about<br />
this <strong>Redpaper</strong> or other <strong>Redbooks</strong> in one of the following ways:<br />
► Use the online Contact us review redbook form found at:<br />
ibm.com/redbooks<br />
► Send your comments in an e-mail to:<br />
redbook@us.ibm.com<br />
► Mail your comments to:<br />
<strong>IBM</strong> Corporation, International Technical Support Organization<br />
Dept. HZ8 Building 662<br />
P.O. Box 12195<br />
Research Triangle Park, NC 27709-2195<br />
Chapter 1. Introduction and technical<br />
overview<br />
This paper is divided into four chapters:<br />
► This chapter, Chapter 1, “Introduction and technical overview” on page 1,<br />
describes the scope of this book and provides an overview of a high<br />
availability environment for WebSphere Business Integration Message Broker<br />
(Message Broker) on z/OS.<br />
► Chapter 2, “Design decisions that affect high availability” on page 9, provides<br />
considerations, hints, and tips for running Message Broker in a high availability<br />
environment and for developing message flows and publish and subscribe<br />
applications in such an environment.<br />
► Chapter 3, “Topology and system setup” on page 21, describes how to<br />
achieve a high availability environment with Message Broker and contains<br />
specific configuration information for the software components used during<br />
failover testing.<br />
► Chapter 4, “Failover scenarios” on page 33, describes the procedures to<br />
create a testing environment and details the results of the failover testing.<br />
1.1 Project overview<br />
This paper explores the architecture and explains the tasks necessary to build a<br />
high availability environment for Message Broker on z/OS. Remember that the<br />
Message Broker high availability environment exists for the benefit of the business<br />
applications that it serves, not for itself. Today’s business-critical applications<br />
often demand a high degree of availability, and the z/OS system administrator<br />
must exploit Parallel Sysplex technology to provide continuous service levels<br />
during system outages, planned or otherwise.<br />
1.1.1 Availability levels<br />
There are several ways to describe the degree of availability that a system<br />
requires. The availability of a system can be expressed as a number of nines or<br />
by using descriptive terms.<br />
The number of nines represents the percentage of time for which the system is<br />
available. Three nines means that it is available 99.9% of the time; five nines<br />
means that it is available 99.999% of the time. These numbers become much<br />
more significant when you look at them in terms of downtime over a fixed period.<br />
For example, over a period of one year, a system with 99.9% availability would<br />
have approximately 8.76 hours of downtime, while a system with 99.999%<br />
availability would be down for just 315 seconds.<br />
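The downtime figures above follow from simple arithmetic; the short Python sketch below (our own illustration, not part of any product described in this paper) reproduces them:

```python
def downtime_seconds(availability_percent: float, period_days: int = 365) -> float:
    """Return the expected downtime, in seconds, over the given period."""
    period_seconds = period_days * 24 * 60 * 60   # one 365-day year = 31,536,000 s
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return period_seconds * unavailable_fraction

# Three nines: roughly 8.76 hours of downtime per year
print(f"99.9%   -> {downtime_seconds(99.9) / 3600:.2f} hours")
# Five nines: roughly 315 seconds of downtime per year
print(f"99.999% -> {downtime_seconds(99.999):.0f} seconds")
```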
The following terms are also used to describe availability:<br />
► Continuous availability: This term describes a system that experiences no<br />
discernible downtime, where neither scheduled nor unscheduled outages<br />
occur. A continuously available system detects an error and immediately<br />
switches to an alternative component that is ready to take over. Such a system<br />
should also support the scheduling of planned maintenance by allowing<br />
workload to be transparently transferred away from the components or<br />
subsystems that are the subject of the maintenance activity. Although<br />
continuous availability seems difficult to achieve, it is possible to obtain it by<br />
combining hardware, software, and operational procedures that mask<br />
outages from the user so that the user does not perceive that a system<br />
outage has occurred.<br />
► Continuous operation: This term describes a system that experiences no<br />
discernible downtime due to scheduled outages. However, this system's<br />
availability will not be as high as it would be with a continuously available<br />
system because it may suffer unplanned outages.<br />
► High availability (HA): This term describes a system that can detect a single<br />
failure and react to it automatically within a matter of a few minutes at most.<br />
Such systems operate with some amount of planned and<br />
unplanned outage time. There are two significant aspects to this definition:<br />
– The system should survive a single failure but a second failure may result<br />
in a loss of service.<br />
– The detection of a fault and the triggering of an action to recover from it<br />
should be automatic, that is, require no manual intervention.<br />
Figure 1-1 illustrates the relationship between the components of systems<br />
availability.<br />
Figure 1-1 High availability + Continuous Operation = Continuous Availability<br />
(The figure shows continuous availability as high availability plus continuous<br />
operation, built on concurrency, redundancy, systems management, and reliable,<br />
robust, and resilient technologies.)<br />
The term fault tolerance is sometimes used mistakenly when high availability or<br />
continuous availability is meant. Fault tolerance describes systems that, in the<br />
event of a failure, can substitute a replacement component for the failed<br />
component in a matter of a few milliseconds. This kind of behavior is supported<br />
by components that have redundant sub-components, error checking and<br />
correction for data, retry capabilities for basic operations, alternate paths for I/O<br />
requests, and so forth. However, there may still be a single point of failure which,<br />
despite the fault tolerance, can cause a component to fail. Similarly, if one<br />
important component in a system is not fault-tolerant, then the system is not<br />
fault-tolerant even though all other components are.<br />
Specifically, HA refers to a level of service that provides availability in the<br />
event of a single, non-catastrophic component failure. Transaction capacity may<br />
be compromised while the failing component recovers, but recovery should be<br />
automatic.<br />
This paper describes a resilient HA Message Broker architecture based on an<br />
Active-Active Message Broker configuration using a WebSphere MQ queue<br />
sharing group (QSG) and a DB2 data sharing group (DSG) across two Logical<br />
Partitions (LPAR). Testing revealed that this configuration provides for sustained<br />
Message Broker services through a variety of failover scenarios.<br />
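As a hedged illustration of the queue sharing group concept (the actual definitions used in this project appear in Chapter 3), a shared queue might be defined with MQSC commands along these lines; the queue name and CF structure name here are invented for the example:

```
* Hypothetical example: define a shared local queue in the QSG.
* QSGDISP(SHARED) makes the queue available to every queue manager
* in the group; CFSTRUCT names the coupling facility application
* structure that holds its messages.
DEFINE QLOCAL('BROKER.REQUEST.QUEUE') +
       QSGDISP(SHARED) +
       CFSTRUCT(APPL1) +
       DEFPSIST(YES)
```

Because the queue lives in a CF structure rather than on a single queue manager's page sets, any queue manager in the QSG can serve it, which is what allows the Active-Active configuration to survive a queue manager failure.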
For a complete overview of HA concepts, refer to the <strong>IBM</strong> Redbook Highly<br />
Available WebSphere Business Integration Solutions, SG24-6328.<br />
The individual component failures that we tested in the research for this paper<br />
include:<br />
► A Message Broker execution group failure<br />
► A Message Broker failure<br />
► A queue manager failure<br />
► A DB2 sub-system failure<br />
In our testing, we restarted each of these components in place on the LPAR on<br />
which they failed using Automatic Restart Management (ARM). We also tested a<br />
complete LPAR failure in which we restarted the queue manager and Message<br />
Broker from one LPAR on the active LPAR.<br />
There are countless other individual component failures that we could not test<br />
due to time constraints. We used ARM to restart all failing components.<br />
Appendix A, “Sample code” on page 49 provides the ARM policy that was in<br />
effect at the time of the failure. Some organizations may prefer NetView®<br />
automated recovery to ARM. You can achieve similar restart functionality using<br />
NetView.<br />
A coupling facility (CF) failure is categorized as a catastrophic failure. To achieve<br />
continuous availability on z/OS, a CF failover has to be accounted for, specifically,<br />
by duplexing the WebSphere MQ admin and list structures. The only single point<br />
of failure in our tested configuration was the CF. CF failover testing is outside the<br />
scope of this paper. For further information, refer to the following resources:<br />
► MVS Setting up a Sysplex:<br />
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/IEA2F132/<br />
CCONTENTS?SHELF=IEA2BK34&DN=SA22-7625-06&DT=20030423145429<br />
► The WebSphere MQ System Administration Guide - Part 5, “Recovery and<br />
Restart” at:<br />
http://publibfp.boulder.ibm.com/epubs/html/csqsaw02/csqsaw02tfrm.htm<br />
1.2 Testing methodology<br />
Because the goal of this paper is to evaluate Message Broker service availability,<br />
we obtained measurements from both the Message Broker and the application<br />
that was using the Message Broker. The WebSphere Business Integration<br />
Broker - Sniff test and performance on z/OS (<strong>IBM</strong> Category 2 SupportPac IP13)<br />
provided an elegant solution. For details on downloading the SupportPac, see:<br />
http://www-306.ibm.com/software/integration/support/supportpacs/<br />
Note: We recommend that you install the most current version of the<br />
SupportPac before using the techniques described in this paper.<br />
SupportPac IP13 consists of documentation, programs, and message flows<br />
designed to help you measure application throughput on Message Broker on<br />
z/OS. SupportPac IP13 provides statistics at the end of the job, including the<br />
total number of round trip messages that the SupportPac IP13 application<br />
processed. You can find the specific SupportPac IP13 parameters that we used<br />
for testing in Chapter 4, “Failover scenarios” on page 33. We ran the application<br />
on both LPARs during several of the tests to simulate a realistic production<br />
environment.<br />
Figure 1-2 on page 6 illustrates our testing environment topology.<br />
Figure 1-2 Test environment topology<br />
In our testing, we also obtained statistics from the Message Broker while archive<br />
accounting data collection was active. Again, you can find the specific procedures<br />
for this in Chapter 4, “Failover scenarios” on page 33. By using SupportPac IP13 to<br />
generate a workload for the Message Broker to consume and by comparing<br />
those results with the statistics produced by the Message Broker, we were able<br />
to evaluate continuous throughput during component failover.<br />
The SupportPac IP13 batch application drove messages to a shared<br />
message queue. We configured a SupportPac IP13 message flow to consume<br />
those messages from the shared queue and deployed the .bar file to both<br />
brokers. Reply messages were sent to a second shared queue and, in turn, were<br />
then consumed by the SupportPac IP13 batch application program in a<br />
request-reply model.<br />
This testing methodology was sufficient to evaluate component failover.<br />
However, the statistics supplied in Chapter 4, “Failover scenarios” on page 33 are<br />
insufficient to evaluate Message Broker performance or capacity for the following<br />
reasons:<br />
► The z/OS environment was not tuned. The main purpose of SupportPac IP13<br />
is to provide a way to begin to tune the Message Broker environment, but we<br />
did not use it for this purpose in this exercise.<br />
► The LPARs did not have identical hardware resources. For instance, LPAR0<br />
was configured with two logical processors, while LPAR2 was configured with<br />
one logical processor.<br />
► The loads that we placed on the configuration were designed to provide a<br />
continuous stream of messages for the duration of the individual tests, but the<br />
steady-state load did not stress available resources. A single instance of the<br />
SupportPac IP13 message flow was used for each broker.<br />
1.2.1 HTTP Listener<br />
Although HTTP Listener support in Message Broker is expected to makes its<br />
debut on z/OS shortly (FixPack 4), the current version does not support it.<br />
Therefore, configuration and testing using HTTP Listener was beyond the scope<br />
of this paper.<br />
It is essential for the system administrator to consider dynamic VIPAs for a high<br />
availability environment. WebSphere MQ and WebSphere Business Integration<br />
Message Broker rely on the TCP/IP stack to communicate with other systems.<br />
For further information on this subject, see the <strong>IBM</strong> white paper entitled<br />
Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for higher<br />
availability, GM13-0026, at:<br />
http://www-1.ibm.com/servers/eserver/zseries/library/techpapers/<br />
gm130165.html<br />
1.3 Environment overview<br />
For our testing environment, we kept the software configuration as simple as<br />
possible while still achieving an HA environment. The goal was to build a resilient<br />
Message Broker server capable of providing a very high degree of availability. Because<br />
Message Broker is a WebSphere MQ and DB2® application, these components<br />
must also be configured for high availability, specifically:<br />
► The two z/OS queue managers participated in a QSG to allow for shared<br />
application queues.<br />
► The queue managers were not clustered.<br />
► A pair of WebSphere MQ channels was used between each z/OS queue<br />
manager and the Windows Configuration Manager queue manager.<br />
► A DB2 DSG was created to support the QSG.<br />
The operating environments used for this paper include:<br />
► The Message Broker installed on two z/OS 1.4 LPARs in a sysplex running on<br />
a 9672 model XZ7. The LPARs have the following software levels:<br />
– z/OS 1.4 with service up to RSU0406<br />
– DB2 V7R1M0 with service up to RSU0312<br />
– WebSphere MQ V5R3M1 at FP5<br />
– WebSphere Business Integration Message Broker V5R0M1 at Fix Pack 3<br />
► The Message Broker Configuration Manager installed on an <strong>IBM</strong> T40<br />
ThinkPad® with:<br />
– Windows XP SP1<br />
– WebSphere MQ V5.3 Fix Pack 5<br />
– WebSphere Business Integration Message Broker V5 Fix Pack 3<br />
– DB2 V8.1 FixPak 2<br />
A thorough description of the software configuration and setup that we used in<br />
our testing environment can be found in Chapter 3, “Topology and system setup”<br />
on page 21.<br />
Note: We recommend that you install the most current release of all Fix Packs<br />
before using the techniques described in this paper.<br />
Chapter 2. Design decisions that affect<br />
high availability<br />
This chapter examines design decisions that you need to address when<br />
configuring an HA system. It begins by taking a high-level look at what HA entails<br />
and continues by examining the details of designing an HA system on z/OS using<br />
WebSphere MQ and Message Broker.<br />
Remember that when you design a Message Broker configuration on z/OS that<br />
must satisfy stringent HA requirements, there are a number of factors you should<br />
consider. The main question is how to ensure that the availability targets for a<br />
Message Broker implementation are met.<br />
The most common implementation of Message Broker is as a central hub<br />
through which all messages in a messaging architecture are processed. Thus,<br />
the potential exists for Message Broker to become a bottleneck and single point<br />
of failure for the whole message processing network. Because of the business<br />
impact of the messaging hub failing, you should use a thorough approach when<br />
designing the queue manager and an HA implementation of Message Broker.<br />
© Copyright <strong>IBM</strong> Corp. 2004. All rights reserved. 9
2.1 Considerations when designing for high availability<br />
Many factors contribute to the HA of a Message Broker environment, including<br />
platform and operating system agnostic factors, such as the following:<br />
► Documented procedures<br />
► Practicing procedures<br />
► Reliable hardware<br />
► Failover<br />
► Dual networks<br />
Other platform and operating system specific factors can also affect the HA of a<br />
Message Broker environment, including:<br />
► Shared queues<br />
► Reliable operating system<br />
► WebSphere MQ clustering<br />
All of these factors are important when considering availability. Many of them are<br />
platform agnostic and equally relevant across the range of supported platforms,<br />
including z/OS. They include contingency against failures of disks, processors and<br />
memory, power supplies, the network, and so forth.<br />
Because the entire range of components that would make up a highly available<br />
system is extremely varied and covers a number of separate disciplines, we<br />
concentrated our efforts on a distinct subset in our testing environment. This<br />
subset included methods by which the queue manager and Message Broker<br />
were configured to support a highly available system. This section also outlines<br />
some of the issues to take into account when using functionality such as shared<br />
queues, clustering, and cloned brokers.<br />
One approach to mitigating failures is to install multiple instances of the<br />
components that might fail. z/OS offers an architecture which lends itself well to<br />
providing a highly available system: the well-proven design of separate LPARs<br />
connected via the coupling facility, collectively referred to as a Parallel Sysplex<br />
environment. A Parallel Sysplex environment comprises highly available disks<br />
and power supplies and the ability to communicate across LPARs by using the<br />
coupling facility. With this architecture, you can implement a highly available<br />
queue manager and Message Broker system by using, for example, a number of<br />
brokers operating on separate LPARs, all of which are supported by shared<br />
WebSphere MQ queues via the coupling facility or by WebSphere MQ<br />
clustering.<br />
2.2 High Availability with Message Broker<br />
When formulating a design for a highly available Message Broker system, it is<br />
worth considering the benefits of having a network topology consisting of a hub<br />
that contains a number of queue managers which serve brokers. This type of<br />
configuration allows the workload to be spread evenly across the brokers, while<br />
also providing redundancy in the event that a Message Broker or a queue<br />
manager fails.<br />
In addition, running brokers across multiple LPARs allows work to continue in<br />
case of an LPAR failure.<br />
At a lower level, there is a choice between deploying the same configuration to<br />
each broker in the hub or assigning a different configuration to each broker. If a<br />
particular message flow runs only on a single broker in the hub, that flow does<br />
not have a high availability solution. If the same message flow is deployed to all<br />
brokers, any broker can process any message; however, the resource allocation<br />
cannot be tailored as finely.<br />
With a separate configuration for each broker, you can control the number of<br />
copies of a flow run on each broker in the hub. You can also tailor the priorities so<br />
that LPARs with more capacity can handle more of the work. However, this<br />
environment can introduce affinities to particular LPARs and compromise HA<br />
effectiveness. Whichever way you configure brokers, you must create appropriate<br />
.bar files and execution groups for each broker in the hub. Table 2-1 on page 12<br />
shows the advantages and disadvantages to each configuration.<br />
Table 2-1 Advantages and disadvantages of cloned and tailored execution groups<br />
Cloned execution groups:<br />
► Advantage: Provides the highest availability level because all brokers can<br />
process any message.<br />
► Advantage: Distributes work evenly across the system, reducing the<br />
potential for bottlenecks.<br />
► Disadvantage: Does not provide granular control over resource allocation.<br />
► Disadvantage: Does not provide support for flows which may be required to<br />
turn around messages more quickly than others.<br />
Tailored execution groups:<br />
► Advantage: Allows for the best use of the resources available.<br />
► Advantage: Allows more instances of flows which are used more heavily or<br />
have a more demanding message processing need.<br />
► Disadvantage: May introduce affinities to particular brokers and LPARs.<br />
► Disadvantage: Provides lower availability because fewer brokers can<br />
process all messages.<br />
2.3 WebSphere MQ options in supporting HA with<br />
Message Broker<br />
There are advantages and disadvantages you should consider before embarking<br />
on a particular method of providing HA with Message Broker. However, once you<br />
have determined the best method for your configuration, there are a number of<br />
ways you can use WebSphere MQ to support your configuration, using either<br />
WebSphere MQ clustering or WebSphere MQ shared queues.<br />
This section discusses the WebSphere MQ options available in supporting a<br />
highly available Message Broker configuration.<br />
2.3.1 WebSphere MQ clustering<br />
To understand how clustering works, consider this example. A sending<br />
application uses its queue manager to send a message to a receiving<br />
application. If the queue manager of the receiving application is offline, the<br />
clustering functionality automatically reroutes the message to another queue<br />
manager that hosts a clone of the receiving application. This function is totally<br />
transparent to the applications. Additional information about clustering can be<br />
found in WebSphere MQ Queue Manager Clusters, SC34-6061.<br />
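As a rough illustration of the routing behavior described above, the following Python sketch models a cluster that skips an offline queue manager when choosing a destination. The class and method names here are hypothetical, invented for this sketch; they are not WebSphere MQ APIs.

```python
# Toy sketch of cluster workload routing (illustrative only; the
# Cluster/QueueManager classes are hypothetical, not WebSphere MQ objects).
class QueueManager:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.queue = []

class Cluster:
    def __init__(self, members):
        self.members = members
        self.next = 0

    def route(self, message):
        """Round-robin a message to the next available queue manager,
        temporarily excluding any that are offline."""
        candidates = [m for m in self.members if m.online]
        if not candidates:
            raise RuntimeError("no queue manager available")
        target = candidates[self.next % len(candidates)]
        self.next += 1
        target.queue.append(message)
        return target.name

qm1, qm2 = QueueManager("QM1"), QueueManager("QM2")
cluster = Cluster([qm1, qm2])
cluster.route("msg1")          # first message goes to QM1
qm2.online = False             # simulate a failed queue manager
dest = cluster.route("msg2")   # rerouted: only QM1 is eligible now
```

Note that, as the text goes on to explain, messages already delivered to a failed queue manager are not rerouted by this mechanism; only subsequent messages avoid it.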
The ability to define these clustered queues on several of the queue managers in<br />
the cluster leads to increased system availability. Each queue manager runs<br />
equivalent instances of the applications that process the messages. If one of the<br />
queue managers fails, or the communication to it is suspended, that queue<br />
manager is temporarily excluded from the choice of destinations for the<br />
messages. This functionality provides a number of benefits, the main one being<br />
HA.<br />
There are a few points to consider when approaching WebSphere MQ clustering<br />
as the preferred choice for implementing HA. For example, if a queue manager<br />
fails, then although any subsequent messages sent to the cluster are not routed<br />
to the failed queue manager, any persistent messages already sent to the queue<br />
manager, but not yet processed by the broker, are marooned on this failed queue<br />
manager. Also, for WebSphere MQ clustering to effectively balance loads<br />
between queues on different queue managers, messages sent from outside the<br />
cluster need to be sent to a gateway queue manager. A gateway queue manager<br />
creates a single point of failure for the whole logical hub. Therefore, the gateway<br />
queue manager would need to have very high availability.<br />
Also, remember that in the event that the Message Broker falls over but the<br />
clustered queue manager is still operational, messages under a cluster queue<br />
environment are still sent to the queue manager. These messages are not<br />
processed until the Message Broker is once again functional. Unless there is a<br />
monitoring tool which registers that the Message Broker is inoperative,<br />
messages can build up unnoticed.<br />
A way of safeguarding against this situation is to employ a monitoring tool, which<br />
registers both the status of the Message Broker and the subsequent build up of<br />
messages on the broker input queue. If, for example, the normal buildup on the<br />
broker input queue is 50 messages, it might be wise to set an alert on the queue.<br />
If messages build to a number greater than 50, then an alert is sent to the<br />
monitoring tool, and an operator can determine if there is a problem.<br />
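The threshold check described above can be sketched in a few lines of Python. This is a hypothetical monitoring hook, not part of any product; the threshold of 50 comes from the example in the text.

```python
# Hypothetical queue-depth alert check; a real monitoring tool would sample
# the broker input queue depth through its own interfaces.
DEPTH_THRESHOLD = 50  # normal buildup on the broker input queue (from the text)

def check_queue_depth(depth, threshold=DEPTH_THRESHOLD):
    """Return an alert message when the backlog exceeds the normal buildup,
    so an operator can check whether the broker has stopped consuming."""
    if depth > threshold:
        return f"ALERT: broker input queue depth {depth} exceeds {threshold}"
    return None
```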
Table 2-2 shows the advantages and disadvantages of using WebSphere MQ<br />
clustering.<br />
Table 2-2 Advantages and disadvantages of using WebSphere MQ clustering<br />
Advantages:<br />
► Provides resilient failover because WebSphere MQ can redirect messages<br />
away from a failed queue manager to be picked up by a functional one.<br />
► Uses the available resources because WebSphere MQ clustering can direct<br />
messages across all available LPARs.<br />
► Takes advantage of cloned applications and identically configured brokers by<br />
workload balancing, thus reducing the risk of bottlenecks.<br />
Disadvantages:<br />
► Persistent messages awaiting processing on the queue of a failed queue<br />
manager remain there until the queue manager is operational again.<br />
► Gateway queue managers can act as a single point of failure for the entire<br />
WebSphere MQ cluster.<br />
► Multi-part messages and transactions can cause affinities to particular queue<br />
managers, thus reducing the availability of the system.<br />
► Should Message Broker fail while the queue manager is still functional, the<br />
WebSphere MQ cluster continues to send messages to that queue manager<br />
even though they are not processed.<br />
2.3.2 WebSphere MQ shared queues<br />
Now that we have examined the issues in using WebSphere MQ clustering to<br />
support HA, this section discusses WebSphere MQ shared queues. Shared<br />
queues rely heavily on the coupling facility and a shared DB2 database to allow<br />
queue managers to form a queue sharing group (QSG). Once the queue<br />
managers have formed the QSG, they can create queues that are available<br />
to all the queue managers in the group.<br />
This functionality sounds very similar to WebSphere MQ clustering and, in many<br />
respects, there is a certain amount of overlap in the service shared queues<br />
provide. However, a shared queue is one instance of a queue which is “shared”<br />
among a number of queue managers. This method differs from WebSphere MQ<br />
clustering in that the clustered queues, despite having the same designation, are<br />
actually separate entities. In practice, once a shared queue has been configured<br />
allowing a number of queue managers to access it, all of the brokers associated<br />
with those queue managers (or any application) could have access to that queue,<br />
despite existing on separate LPARs.<br />
A simple scenario which uses shared queues might resemble the following. A<br />
sending application puts messages to a shared queue. These messages are<br />
then available to all queue managers in the QSG across all relevant LPARs.<br />
Associated with these queue managers are brokers, all of which are retrieving<br />
messages from the same shared queue. Should an LPAR, queue manager, or a<br />
broker fall over, the other brokers would continue retrieving and processing<br />
messages from the shared queue.<br />
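The scenario above can be sketched as a toy simulation: two brokers pull from one shared queue, and when one fails, the survivor drains the remaining messages. The classes are hypothetical stand-ins, not WebSphere MQ objects.

```python
# Toy simulation of brokers pulling from a shared queue; the names and
# classes are illustrative assumptions, not product APIs.
from collections import deque

shared_queue = deque(f"msg{i}" for i in range(6))

class Broker:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.processed = []

    def get_one(self, queue):
        # A live broker retrieves the next message from the shared queue.
        if self.alive and queue:
            self.processed.append(queue.popleft())

brokers = [Broker("BRK1"), Broker("BRK2")]
brokers[1].alive = False            # simulate an LPAR/broker failure
while shared_queue:                 # the surviving broker keeps consuming
    for b in brokers:
        b.get_one(shared_queue)
```

Because consumption is pull-based, the failure of one broker leaves the queue fully drained by the other, which is the availability property the text describes.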
Because cloned applications and the brokers which serve them are able to<br />
connect to queue managers in a QSG and because all queue managers in a<br />
QSG can access shared queues, applications do not need to rely on the<br />
availability of any one queue manager. Should an LPAR, queue manager, or a<br />
Message Broker fall over, shared queues on the functioning system can continue<br />
to service cloned applications.<br />
There are distinct advantages for using shared queues as opposed to using<br />
WebSphere MQ clustering. For example, in the event that the Message Broker<br />
falls over but the queue manager is operational, messages under a cluster queue<br />
environment are still sent to the queue manager. These messages then wait on<br />
the queue until the Message Broker restarts. This situation can cause problems,<br />
especially if the business requires a quick turnaround response to the message.<br />
This situation does not occur when using shared queues, because the<br />
Message Broker retrieves the message from the shared queue directly,<br />
rather than the message being passed to a queue manager which supports<br />
only one Message Broker.<br />
Another advantage of using shared queues is that both applications and the<br />
broker can take advantage of serialization during shared queue peer recovery.<br />
As previously mentioned, if a queue manager should fail, then all the queue<br />
managers in the QSG continue to process messages on the shared queue.<br />
However, they also finish the shared queue work for the incomplete units of work<br />
which were running on the failed queue manager. One potential issue here is that<br />
during the process of rolling back uncommitted messages it is possible that<br />
another queue manager may attempt to process one of the messages in the unit<br />
of work still on the queue. In this case, the messages in the unit of work would<br />
then be out of sequence. By using the serialization mechanism, no other queue<br />
manager in the QSG can access any of the messages in the unit of work until a<br />
full roll back has been completed. This method ensures that the messages are<br />
processed in the correct order.<br />
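A toy model of this behavior follows, using hypothetical names (the real mechanism is WebSphere MQ serialization, not this API): messages locked under a failed queue manager's token are skipped by peers until the rollback completes, preserving message order.

```python
# Illustrative sketch of serialized access during peer recovery; the
# SharedQueue class and its methods are invented for this example.
class SharedQueue:
    def __init__(self, messages):
        self.messages = list(messages)
        self.locked_by = {}          # message -> serialization token

    def lock_unit_of_work(self, token, msgs):
        # Mark messages as belonging to an in-flight unit of work.
        for m in msgs:
            self.locked_by[m] = token

    def unlock(self, token):
        # Rollback complete: release all messages held under this token.
        self.locked_by = {m: t for m, t in self.locked_by.items() if t != token}

    def get(self, token):
        """Return the first message not locked under a different token."""
        for m in self.messages:
            holder = self.locked_by.get(m)
            if holder is None or holder == token:
                self.messages.remove(m)
                return m
        return None

q = SharedQueue(["uow-1", "uow-2", "other"])
q.lock_unit_of_work("QM1-token", ["uow-1", "uow-2"])  # QM1's in-flight unit of work
peer_got = q.get("QM2-token")      # a peer skips the locked messages
q.unlock("QM1-token")              # rollback done, messages released
next_got = q.get("QM2-token")      # now the peer can process them in order
```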
There are some restrictions to using shared queues on z/OS when it comes to HA.<br />
Possibly the most important restriction which should be taken into account is the<br />
maximum message size. For the current version, WebSphere MQ 5.3, the<br />
largest message which can be placed upon a shared queue is 63 KB. Thus, the<br />
largest message that can be transported via the highly available shared queue<br />
system is also 63 KB. In addition, there is currently a restriction of eight million<br />
messages which can be stored on a queue.<br />
Also, for WebSphere MQ 5.2, the shared queues do not support persistent<br />
messaging. However, the non-persistent messages on a shared queue do<br />
survive a queue manager restart. Thus, this method does have a form of<br />
resiliency (although technically, this is not persistence). With WebSphere MQ<br />
5.3, persistent messages are supported with shared queues. Remember that<br />
non-persistent messages do survive the queue manager restart. This situation<br />
can be advantageous unless an application specifically depends upon the<br />
non-persistent messages being deleted on the event of a queue manager restart.<br />
Chapter 2. Design decisions that affect high availability 15
In this instance, you should consider incorporating a policy for “cleaning up”<br />
these messages.<br />
One aspect of the availability of messages on shared queues which you should<br />
consider is the effect of using a two-phase commit. For example, consider a unit<br />
of work which retrieves a message from a shared queue, updates a DB2 table<br />
based on the contents of the message, and returns a response to a shared<br />
queue. A two-phase commit (2PC) protocol is used to ensure that either all or none of the processing<br />
happens, typically coordinated by Resource Recovery Services (RRS). If the<br />
queue manager where this unit of work is running were to fail during the<br />
two-phase commit, it is possible that the unit of work would be left indoubt in<br />
WebSphere MQ.<br />
In this case, the correct resolution of the unit of work cannot be determined until<br />
the queue manager is restarted and can reconnect with RRS. It is therefore not<br />
possible for the other queue managers to perform peer recovery for this unit of<br />
work (that is, the input message cannot be rolled back by a peer for processing<br />
via a different queue manager, which has an impact on the availability of the<br />
messages consumed and produced by that indoubt unit of work). The<br />
availability of other messages on the shared queue is not impacted unless<br />
serialization tokens are being used to ensure an ordering of processing<br />
messages on this queue. This is further explained in WebSphere MQ for z/OS<br />
System Administration Guide V5.3.1, SC34-6053-01.<br />
Note: When messages are put to a shared queue, the data is logged on a<br />
particular queue manager, but that process does not cause any kind of<br />
message affinity to a queue manager. The affinity is between a unit of work<br />
and a queue manager.<br />
Table 2-3 outlines the issues you should consider when looking at shared queues<br />
to support a highly available environment.<br />
Table 2-3 Advantages and disadvantages of using WebSphere MQ shared queues<br />
Advantages:<br />
► Resilient support of HA over multiple LPARs. Should a queue manager or a<br />
Message Broker fail, other brokers continue to retrieve messages.<br />
► Messages are pulled from the shared queue rather than pushed onto another<br />
queue manager’s clustered queue. Thus, if a Message Broker becomes<br />
inoperative, messages do not build up on the queue manager serving the<br />
broker. Instead, the messages remain on the shared queue to be picked up<br />
by a functional broker.<br />
► The ability to fully utilize the available resources, because shared queues<br />
allow messages to be processed across all available LPARs.<br />
► The ability to take advantage of cloned applications and identically configured<br />
brokers by allowing less busy applications and brokers to retrieve messages<br />
from the shared queue at their own pace, thus reducing the risk of<br />
bottlenecks.<br />
► The ability to take advantage of the serialization mechanism to ensure that<br />
backed-out units of work are completed in the correct order.<br />
Disadvantages:<br />
► Maximum message size of 63 KB. Even if a current system does not have<br />
messages of that size or larger, this places a restriction on the natural growth<br />
of the system. However, this restriction is increased to 100 MB in the next<br />
release of WebSphere MQ by further use of DB2.<br />
► A limitation of the coupling facility storage means that no more than eight<br />
million messages can be stored on a queue.<br />
► Non-persistent messages which survive queue manager restarts can be an<br />
issue for the application. If an application depends on non-persistent<br />
messages not surviving a queue manager restart, it is possible you will need<br />
to add functionality to deal with this issue.<br />
► Depending on how the coupling facility is being used, the storage taken up by<br />
shared queues could well be relatively expensive.<br />
2.4 Message Broker flow design<br />
There are a number of points to keep in mind with HA in the design of message<br />
flows.<br />
2.4.1 Affinities<br />
Message affinities can arise when multiple messages are required to make up a<br />
single business transaction or when messages must be processed in a specific<br />
order. This situation is also true when a Message Broker is forced to keep state,<br />
as is the case when using the aggregation node. This type of situation can often<br />
mean that a particular queue manager or Message Broker is required to process<br />
these messages. In terms of HA, this affinity can lead to vulnerabilities. If a<br />
particular queue manager or Message Broker is required to process a long string<br />
of messages making up a transaction, then that queue manager or Message<br />
Broker can become a single point of failure for that transaction. This situation<br />
creates a potentially vulnerable point in the system, because the queue manager<br />
or Message Broker requires messages to be processed through the same<br />
thread, broker, and so on.<br />
However, your business needs may make it necessary to introduce affinities into<br />
the Message Broker infrastructure. In such a situation, you should be aware of<br />
the issues that introducing affinities may cause. One way to alleviate some of<br />
these issues is to run transactions under transaction coordination, so that in the<br />
event that the queue manager or the Message Broker falls over, the transaction<br />
is rolled back. Doing so means that the transaction can then be re-processed by<br />
another instance of the Message Broker.<br />
Publish and Subscribe functionality is an example of where a user application<br />
can develop an affinity to a specific broker. Unless cloning is used, once a<br />
subscription has been registered at a specific broker, only that broker can deliver<br />
publications to the subscriber. This situation applies specifically to RealTime,<br />
Multicast, and Telemetry transports because these applications establish a direct<br />
TCP/IP connection to the client, which is broken if the server goes down.<br />
2.4.2 Error processing<br />
To ensure that the system is as highly available as possible, it is worthwhile to<br />
understand how Message Broker handles error processing. Message Broker has<br />
well-defined procedures for dealing with invalid messages. Not all of them,<br />
however, are conducive to HA.<br />
While processing messages, the normal procedure for the broker when<br />
encountering an invalid message is to place the message back on the input<br />
queue. The broker then retries the message until the retry count is reached. At<br />
this point, the broker simply places the message back on the queue and ceases<br />
processing it. When it comes to supporting HA, this behavior causes a problem.<br />
Once an invalid message is placed back on the input queue, other potentially<br />
valid messages build up on the queue behind it. This buildup of messages<br />
continues until the invalid message is removed. This situation is not ideal in an<br />
HA environment. To work around this type of situation, you will need to<br />
incorporate some tailored error processing. The error processing can be as<br />
simple or as complex as you prefer, from simply putting the invalid message onto<br />
an error queue, to creating an error processing subflow supported with Try Catch<br />
functionality. The important factor is to remove the invalid message from the input<br />
queue, thereby preventing valid messages from building up behind the invalid<br />
message.<br />
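A minimal sketch of such tailored error processing follows, assuming a simple retry counter and an error queue (hypothetical names, not broker API calls): after the retry limit is reached, the invalid message is moved aside so that valid messages behind it can flow.

```python
# Illustrative retry-then-error-queue handling; RETRY_LIMIT and the helper
# functions are assumptions made for this sketch, not broker behavior APIs.
from collections import deque

RETRY_LIMIT = 3

def process(msg):
    # Stand-in for message flow processing; "bad" messages always fail.
    if msg.startswith("bad"):
        raise ValueError("invalid message")
    return f"handled {msg}"

def drain(input_queue, error_queue):
    handled = []
    retries = {}
    while input_queue:
        msg = input_queue[0]
        try:
            handled.append(process(msg))
            input_queue.popleft()
        except ValueError:
            retries[msg] = retries.get(msg, 0) + 1
            if retries[msg] >= RETRY_LIMIT:
                # Move the poison message aside to unblock the queue.
                error_queue.append(input_queue.popleft())
    return handled

inq = deque(["m1", "bad-m2", "m3"])
errq = deque()
results = drain(inq, errq)
```

The key point mirrors the text: once the invalid message lands on the error queue, the valid message behind it is processed instead of waiting indefinitely.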
2.5 Message Broker networks<br />
A popular feature of Message Broker is its support for Publish and<br />
Subscribe functionality. A subscriber can connect to a broker in the Publish and<br />
Subscribe topology and receive publications made on that broker or others in the<br />
network. In the default case, the subscriber has an affinity to the broker with<br />
which it has registered. There are various ways to reduce this affinity. For<br />
WebSphere MQ subscribers, a solution is to use the Cloned Broker feature,<br />
which allows a subscriber to receive its publications directly from several different<br />
brokers, thus reducing the affinity.<br />
Note: Cloned brokers cannot be used with other broker topologies, such as<br />
hierarchies and collectives. For subscribers using the RealTime transport, this<br />
option is not available because a connection is established directly with a<br />
particular broker and the subscription is removed if the connection is broken.<br />
This paper does not discuss Publish and Subscribe applications in detail, but<br />
there is more information about HA for Publish and Subscribe applications in the<br />
Redbook WebSphere Business Integration Pub/Sub Solutions, SG24-6088.<br />
2.5.1 Further considerations with Message Broker networks<br />
When working with Message Broker networks, you should also consider the User<br />
Name Server’s function. This function provides a level of security for the Publish<br />
and Subscribe function on Message Broker. Overall, the User Name Server<br />
provides an excellent level of service across the WebSphere MQ supported<br />
platforms. However, an examination of the User Name Server with regard to HA<br />
has not been outlined in this document because the User Name Server is not<br />
particularly well suited to the large scale, high volume type of application typically<br />
found on z/OS. On z/OS, the User Name Server periodically accesses RACF® to<br />
match access control privileges against the user IDs which require access to<br />
topics, resulting in a rebuild of the cache.<br />
While this is not a significant overhead when dealing with moderate numbers of<br />
user IDs, it can become more significant once the number of user IDs begins to<br />
grow. Given the nature and scale of applications which are based on z/OS and<br />
the subsequent number of user IDs and RACF definitions, the User Name<br />
Server may not scale particularly well in this environment. This situation may<br />
potentially cause performance problems and be expensive in the amount of<br />
resources required to frequently check RACF definitions against a large number<br />
of user IDs.<br />
Chapter 3. Topology and system setup<br />
This chapter describes the topology and system setup of HA Message Broker<br />
environments on z/OS, and more specifically, the environment we used to create<br />
the failover scenarios tested in Chapter 4, “Failover scenarios” on page 33.<br />
In this chapter, you can find information about:<br />
► High Availability configurations.<br />
► Test environment topology.<br />
► The z/OS LPARs.<br />
► The DB2 data sharing group configuration.<br />
► The WebSphere MQ queue sharing group configuration.<br />
► The WebSphere Business Integration Message Broker configuration.<br />
► Automatic Restart Management configuration.<br />
► The configuration manager platform.<br />
► An overview of WebSphere Business Integration Message Broker<br />
SupportPac IP13.<br />
3.1 High Availability configurations<br />
3.1.1 Active-active<br />
3.1.2 Active-passive<br />
A high availability environment with WebSphere Business Integration Message<br />
Broker on z/OS requires the use of WebSphere MQ queue sharing groups as<br />
described in Chapter 2, “Design decisions that affect high availability” on page 9.<br />
That setup in turn implies the need for a coupling facility and at least two<br />
z/OS systems in a sysplex that host a DB2 data sharing group (DSG).<br />
These components, employing two z/OS images for simplicity, form the basis of<br />
the following HA configuration descriptions and, in the next section, the test<br />
environment that we used for this paper. The selection of either one of the<br />
configurations described below depends on your business need and the<br />
available capacity of your environment.<br />
An active-active setup describes a business environment where both z/OS<br />
images run WebSphere Business Integration Message Broker to process<br />
messages from the shared queue. Both systems process the business<br />
application load. In the event of a failure on one image, the other is still available<br />
to carry on processing alone for the duration of the recovery. While this setup<br />
provides HA and fully utilizes the two machines in daily operation, it may cause<br />
performance degradation during recovery.<br />
An active-passive setup describes a business environment where only one of<br />
the z/OS images runs the daily broker business. In the event of a failure on this<br />
image, the second z/OS image is available on hot standby to pick up the load<br />
from the shared queue. While this provides for HA and also maintains throughput<br />
during a failure, in normal operation only one machine is being fully utilized.<br />
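The trade-off described above is ultimately a capacity-planning question: in an active-active pair, a surviving image must absorb both workloads for the duration of the recovery. The following back-of-envelope sketch illustrates the point; the utilization figures are illustrative only, not measurements from this environment.

```python
# Back-of-envelope check for active-active capacity planning: when one image
# fails, the survivor must absorb the whole workload during recovery.
def failover_headroom(util_per_image_pct):
    """Return the survivor's utilization after failover, and whether it fits."""
    survivor_util = 2 * util_per_image_pct
    return survivor_util, survivor_util <= 100

print(failover_headroom(40))  # (80, True): room to carry both loads
print(failover_headroom(60))  # (120, False): expect degradation during recovery
```

If each image normally runs above 50% utilization, some performance degradation during recovery is unavoidable; the active-passive setup avoids this at the cost of an idle standby.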
3.2 Test environment topology<br />
The HA environment that we used for this paper employed two z/OS images to<br />
process work in an “active-active” configuration. Figure 3-1 illustrates the<br />
topology of the system components listed in Chapter 4, “Failover scenarios” on<br />
page 33.<br />
Figure 3-1 Queue sharing group topology<br />
Two z/OS systems in a sysplex host a DB2 data sharing group and a WebSphere<br />
MQ queue sharing group. A WebSphere Business Integration Message Broker<br />
runs on each system, connected to the respective queue manager. The brokers<br />
are administered by the configuration manager installed on a ThinkPad running<br />
Windows. The application workload to test the configuration is supplied by<br />
SupportPac IP13.<br />
3.3 The z/OS LPARs<br />
There are three LPARs in the sysplex running on a 9672 model XZ7 with an<br />
internal coupling facility. For simplicity, only two of the LPARs are used for the<br />
active-active configuration, as depicted in Figure 3-1 on page 23. One of the<br />
LPARs, MVSM0, is connected to two logical processors while the second,<br />
MVSM2, to just one. MVSM0 has 1536 MB of real storage configured, while<br />
MVSM2 has 1024 MB.<br />
Otherwise, the MVS images have similar configurations with RRS, DB2,<br />
WebSphere MQ, WebSphere Business Integration Message Broker, and TCP/IP<br />
all active. UNIX® System Services is running with shared Hierarchical File<br />
System (HFS). Both images are running z/OS 1.4.<br />
3.4 The DB2 data sharing group configuration<br />
The DB2 subsystems are data sharing and accessible as an Open Database<br />
Connectivity (ODBC) data source via a Distributed Data Facility (DDF) that is<br />
using TCP/IP. The DB2 subsystems in data sharing group DSN710PM are DFM0<br />
and DFM2, running on MVSM0 and MVSM2 respectively. Figure 3-2 displays the<br />
data sharing group.<br />
Figure 3-2 The DB2 data sharing group<br />
3.5 The WebSphere MQ queue sharing group<br />
configuration<br />
A queue manager is set up on each system and defined to a QSG. To create the<br />
QSG, we defined some new structures to the coupling facility. The procedure for<br />
this is described in the WebSphere MQ z/OS System Setup Guide, which can be<br />
found at:<br />
http://www-306.ibm.com/software/integration/mqfamily/library/manualsa/manuals/platspecific.html#zos<br />
The queue managers in queue sharing group MB01 are WMQ0 and WMQ2,<br />
running on MVSM0 and MVSM2 respectively. Figure 3-3 shows the QSG.<br />
Figure 3-3 The queue sharing group<br />
3.5.1 Queue sharing group configuration considerations<br />
Some things you should consider when configuring the QSG are:<br />
► If you are using the z/OS Automatic Restart Management (ARM) to restart<br />
queue managers on different z/OS images, then:<br />
– Define every queue manager with a sysplex-wide, unique four character<br />
subsystem name that uses a command prefix string (CPF) scope of S.<br />
– Configure each queue manager with a different channel listener port. In<br />
the event of a system failure, this configuration is important if there is a<br />
requirement to restart the failing queue manager on another system in the<br />
sysplex that is already running a queue manager.<br />
► Use the INITSIZE value of 10 MB as provided in the sample job CSQ4CFRM<br />
when you define the admin structure. Specify a larger amount than this for the<br />
SIZE value so that it can expand. You should also consider adding the<br />
ALLOWAUTOALT(YES) parameter to allow system-initiated alters<br />
(automatic-alter) for this structure.<br />
► Review actual structure sizes regularly and make sure the coupling facility<br />
resource management (CFRM) policy is updated to reflect actual usage.<br />
Application structure allocations grow with use.<br />
► Note that the admin structure is a single point of failure. If a second<br />
coupling facility is available, duplexing of the structure should be<br />
considered.<br />
► Define the queue manager logs with SHAREOPTIONS(2 3) as in the Job<br />
Control Language (JCL) of sample job CSQ4BSDS, because shared queue<br />
recovery requires that a queue manager can access the logs of peers within<br />
the QSG.<br />
► Define the application CFSTRUCT with CFLEVEL(3) and RECOVER(YES) when using<br />
persistent messages (the defaults are CFLEVEL(2) and RECOVER(NO)).<br />
► If using persistent messages, review log sizes prior to migration to shared<br />
queues. Data logged is slightly larger, and BACKUP CFSTRUCT copies<br />
shared queue messages to the log. Non-persistent messages on a private<br />
queue may be logged under some circumstances (for instance, if the<br />
message stays on the queue for an extended length of time), but this situation<br />
does not occur for non-persistent messages that reside on a shared queue.<br />
► Prevent Hierarchical Storage Manager (HSM) from migrating the queue<br />
manager DB2 tables. If the tables are migrated, you may experience startup<br />
problems.<br />
► Do not attempt to manually change QSG DB2 tables unless directed by <strong>IBM</strong><br />
support. Even apparently innocuous changes may leave DB2 table<br />
information out of sync with the coupling facility or with other tables.<br />
► Allow a QSG for a group listener (using VIPA) and shared channels, which<br />
may be useful in providing HA for WebSphere MQ applications.<br />
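Several of the points above come together in the definition of a recoverable application structure. As a hedged sketch only, an MQSC definition might look like the following; the structure name APPL1 is illustrative, not one used in this environment:

```mqsc
* Illustrative only: application structure able to hold persistent
* messages, so that BACKUP/RECOVER CFSTRUCT can be used.
DEFINE CFSTRUCT(APPL1) +
       CFLEVEL(3) +
       RECOVER(YES)
```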
3.6 The WebSphere Business Integration Message<br />
Broker configuration<br />
A message broker is created for each of the queue managers described<br />
previously. The message broker names are WMQ0BRK and WMQ2BRK, running<br />
on MVSM0 and MVSM2 respectively.<br />
You can find instructions for creating the broker at:<br />
http://publib.boulder.ibm.com/infocenter/wbihelp/index.jsp<br />
3.6.1 Message Broker configuration considerations<br />
Some things to consider when configuring the Message Broker are:<br />
► Each Message broker connected to the QSG should run under a different<br />
user ID. One of the reasons for this configuration is that the broker tables<br />
must be created with unique names in the DB2 data sharing group. The user<br />
ID, which is used for variable DB2_TABLE_OWNER in the mqsicompcif file, is<br />
prepended to the broker table names to form the fully qualified unique table<br />
names.<br />
► If the desired response to a system failure is to use ARM to restart a broker,<br />
with its queue manager on another z/OS system in the sysplex, then consider<br />
the following:<br />
– The UNIX System Services (USS) environment across the sysplex should<br />
be configured to use shared HFS.<br />
– The broker root directories must not be created under the /var directory.<br />
This is because the /var directory resolves to &SYSNAME/var and thus is<br />
system specific. System specific HFSs are unmounted when the owning<br />
system goes down and are not available to the rest of the sysplex while<br />
that system remains down.<br />
Instead, create a new directory under the sysplex root and create the<br />
broker root directories under this same directory.<br />
For example:<br />
mkdir /wbimb<br />
mkdir /wbimb/WMQ0BRK<br />
The root directory for any given broker is now visible to each z/OS system<br />
in the sysplex and is not unmounted if its host system goes down.<br />
Depending on the number of brokers created, you may also want to create<br />
additional HFSs to be mounted at each broker root directory, or one larger<br />
HFS at the higher directory, in order to prevent filling up the sysplex root<br />
HFS mounted at '/'.<br />
– When editing the mqsicompcif file, the DB2 group attach name (in this<br />
case DFPM) should be used for the DB2_SUBSYSTEM variable rather<br />
than specifying a specific DB2 subsystem. This configuration means that<br />
the broker can use another DB2 subsystem in the data sharing group to<br />
access its tables in the event of failure.<br />
– Use of BP0 as the DB2 buffer pool chosen for the broker database is not<br />
recommended. Furthermore, to enable the broker to restart on another<br />
system in the sysplex the buffer pool selected needs to be active on that<br />
system. Use the alter bufferpool command to activate it. If the buffer pool is<br />
not available, errors occur on the system to which the broker is moved.<br />
Figure 3-4 lists those errors.<br />
Figure 3-4 Buffer Pool 2 is not available to DB2 subsystem DFM0<br />
Figure 3-4 Buffer Pool 2 is not available to DB2 subsystem DFM0<br />
– Having defined the local broker DB2 buffer pools on the relevant systems,<br />
you must then allocate a global buffer pool to allow data to be shared<br />
between the DB2 subsystems. This configuration requires defining a new<br />
structure to the coupling facility, for example called:<br />
DSN710PM_GBP2<br />
If the global buffer pool is not available, an error is written to the system<br />
log of the system to which the broker has been moved. Figure 3-5 illustrates<br />
this error.<br />
Figure 3-5 Global Buffer Pool 2 not defined<br />
3.6.2 Additional Message Broker configuration hints<br />
The following are additional Message Broker configuration hints:<br />
► Before running the mqsicreatebroker command make sure you have<br />
completed the actions in the section Setting up your OMVS user ID with<br />
instructions on setting the USS environment variables PATH and NLSPATH.<br />
Otherwise, the command and the corresponding messages are not found.<br />
You can find Setting up your OMVS user ID in the Message Broker section of<br />
the <strong>IBM</strong> WebSphere Business Integration Information Center z/OS section,<br />
under the chapter Configuring the Broker Domain.<br />
The Information Center is online at:<br />
http://publib.boulder.ibm.com/infocenter/wbihelp/index.jsp<br />
► The SYSTEM.BROKER.* queues must be private queues defined to the<br />
broker’s queue manager. They cannot be shared.<br />
► To enable the broker to use ARM to restart it, customize the ARM section of<br />
the mqsicompcif file and run the mqsicustomize program. The following are<br />
example lines from mqsicompcif:<br />
USE_ARM=’YES’<br />
ARM_ELEMENTNAME=’WMQ0BRK’<br />
ARM_ELEMENTTYPE=’SYSWMQI’<br />
You can find a downloadable sample ZIP file in Appendix B, “Additional<br />
material” on page 53.<br />
3.7 Automatic Restart Management configuration<br />
On z/OS in a sysplex environment, a program can enhance its recovery potential<br />
by registering as an element of ARM. ARM reduces the impact of an unexpected<br />
failure because MVS can restart the element automatically without operator<br />
intervention. Program recovery via ARM is enabled by activating an ARM policy<br />
using the SETXCF START command.<br />
In this environment, the active ARM policy is set to restart the following:<br />
► DB2<br />
► WebSphere MQ<br />
► WebSphere Business Integration Message Broker<br />
If any of these elements fails, MVS restarts it in place. However, in the event<br />
of a system failure, these elements are restarted on another system in the<br />
sysplex. In this case, DB2 is only restarted in 'light' mode on another system.<br />
Restart light<br />
enables DB2 to restart with minimal storage footprint to quickly release retained<br />
locks and then terminate normally. Additionally, to enable Internal Resource Lock<br />
Manager (IRLM) to obtain the full benefits of a restart light, the ARM policy for the<br />
IRLM element should specify PC=YES.<br />
The JCL used to create the ARM policy for our testing environment is illustrated<br />
in Appendix A, “Sample code” on page 49. A ZIP file containing this JCL is<br />
available for download in Appendix B, “Additional material” on page 53.<br />
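As a hedged sketch only, an ARM policy definition job has the following general shape; the policy, group, and element names here are illustrative, and the actual JCL used in this environment is the one in Appendix A, “Sample code” on page 49:

```jcl
//ARMPOL   JOB (ACCT),'DEFINE ARM POLICY'
//DEFINE   EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DATA TYPE(ARM)
  DEFINE POLICY NAME(WBIMBPOL) REPLACE(YES)
    RESTART_GROUP(WBIMBGRP)
      ELEMENT(WMQ0BRK)
        RESTART_ATTEMPTS(3)
        TERMTYPE(ALLTERM)
/*
```

The policy is then activated with a command of the form SETXCF START,POLICY,TYPE=ARM,POLNAME=WBIMBPOL.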
3.8 The configuration manager platform<br />
All components of WebSphere Business Integration Message Broker V5 Fix<br />
Pack 3 were installed on a T40 ThinkPad along with WebSphere MQ V5.3 FP5<br />
and DB2 V8.1 FP2. A set of WebSphere MQ channels was built between the<br />
mobile computer queue manager and each z/OS queue manager.<br />
The SupportPac IP13 message flows were imported into the workspace as per<br />
installation instructions.<br />
A .bar file was built for deployment of the DB2U message flow. The .bar file was<br />
deployed to both brokers.<br />
3.9 An overview of WebSphere Business Integration<br />
Message Broker SupportPac IP13<br />
<strong>IBM</strong> provides SupportPac IP13 that can be used to check the setup of a z/OS<br />
system and its WebSphere MQ and WebSphere Business Integration Message<br />
Broker configuration. SupportPac IP13 includes example flows and programs<br />
with documentation that facilitates the capability to do quick performance and<br />
health checks on your z/OS system.<br />
The SupportPac IP13 is available online at the following Web address:<br />
http://www-1.ibm.com/support/docview.wss?rs=203&uid=swg24006892&loc=en_US&cs=utf-8&lang=en<br />
SupportPac IP13 is used in the testing environment for this paper to provide work<br />
for the brokers and to measure transaction rates. Broker statistics are used in<br />
conjunction with SupportPac IP13 transaction rate data to illustrate the effects of<br />
the various scenarios tested in Chapter 4, “Failover scenarios” on page 33.<br />
Chapter 4. Failover scenarios<br />
This chapter describes the high availability failover scenarios that we tested. The<br />
resultant application and broker statistics are shown along with our conclusions<br />
and explanations.<br />
This chapter describes the following failover scenarios:<br />
► Scenario 1 - Initial state with all components active.<br />
► Scenario 2 - Execution group failover.<br />
► Scenario 3 - Message Broker failover.<br />
► Scenario 4 - Queue manager failover.<br />
► Scenario 5 - DB2 failover.<br />
► Scenario 6 - z/OS system failover.<br />
4.1 Test environment setup<br />
As previously mentioned, the SupportPac IP13 batch job (OEMPUTX) was used<br />
to drive a message load to a shared queue.<br />
Note: For further information about all components of the <strong>IBM</strong> Category 2<br />
SupportPac IP13 refer to the documentation available from:<br />
http://www-306.ibm.com/software/integration/support/supportpacs/<br />
The SupportPac IP13 message flow (DB2U) running on both brokers consumed<br />
these messages, then placed reply messages on a second shared queue. The<br />
reply messages were subsequently picked up by OEMPUTX, completing the<br />
request-reply loop. Statistics generated by OEMPUTX were compared to<br />
statistics generated by the Message Broker and the results evaluated. The DB2U<br />
message flow also updates a DB2 database called SHAREPRICES.<br />
We executed all test scenarios at least three times to provide reasonable<br />
accuracy in reporting statistics.<br />
Transaction rates for the applications (running on the MVSM0 and MVSM2<br />
LPARs) and the brokers (WMQ0BRK, WMQ2BRK) can be compared only within<br />
a given test scenario. Comparison of these rates between test scenarios is<br />
meaningless.<br />
4.1.1 SupportPac IP13 setup<br />
This section provides details on the SupportPac IP13 setup.<br />
DB2 configuration<br />
JCL is supplied with SupportPac IP13 to both set up the application environment<br />
and run the tests. Job JDB2DEFS creates a DB2 SHAREPRICES table and<br />
inserts some initial data.<br />
Since this table is unique, the JDB2DEFS job should be run only once. However,<br />
the JCL in this job would normally create the table using the user ID of one<br />
broker as the schema name. When the other broker later tries to access the<br />
table, it refers to it under a different schema: the user ID under which it<br />
runs. This is because each broker's dsnaoini file sets the CURRENTSQLID to the<br />
user ID of that broker. As a result, the second broker cannot access the table.<br />
To avoid this problem in the environment for this paper, broker 2 was given<br />
authority to set its CURRENTSQLID to the user ID of broker 0. However, a better<br />
solution for setting up SupportPac IP13 to run in a sysplex is to follow these<br />
steps:<br />
1. Create a new RACF group that will be the schema name for the unique<br />
SupportPac IP13 DB2 table.<br />
2. Connect the user IDs under which the broker started tasks run to the<br />
previously created RACF group.<br />
3. Create the SupportPac IP13 SHAREPRICES table using job JDB2DEFS,<br />
inserting the new RACF group name as the schema.<br />
4. SET CURRENTSQLID in the ESQL of the message flow. See DB2U message<br />
flow configuration below for details.<br />
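Steps 1 and 2 above might look like the following RACF commands; the group name IP13GRP and the broker started-task user IDs USRBRK0 and USRBRK2 are illustrative, not names from this environment:

```text
ADDGROUP IP13GRP
CONNECT  (USRBRK0 USRBRK2) GROUP(IP13GRP)
```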
OEMPUTX batch job configuration<br />
There are several parameters that can be passed to OEMPUTX in order to<br />
cause different behavior of the application. The parameters we chose for our<br />
testing included:<br />
-n25000 Total number of messages to put, in this case 25000.<br />
-m4 Causes the program to run for four minutes. In the test<br />
environment, no single broker was capable of processing 25000<br />
messages within four minutes, so these first two parameters<br />
provided a steady-state message flow rate.<br />
-gm Causes OEMPUTX to use the same MQMD.MsgId for all MQPUTs in<br />
the loop, and to MQGET replies by this MsgId. With this option,<br />
there is no Message Broker affinity.<br />
-w45 Sets the MQGET MQGMO.WaitInterval, in this case to 45 seconds.<br />
-c Commit every msgs_in_loop messages. msgs_in_loop was not<br />
set, and the default is one message per batch.<br />
We also used non-persistent messages.<br />
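Combined, the parameter string passed to OEMPUTX for these runs would therefore resemble the following; the exact way the string is supplied depends on the JCL shipped with the SupportPac:

```text
-n25000 -m4 -gm -w45 -c
```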
DB2U message flow configuration<br />
Recall that in each broker’s dsnaoini file, the CURRENTSQLID is set to the user<br />
ID of that broker. This configuration would normally prevent brokers from<br />
accessing DB2 tables outside of their schema. To circumvent this behavior:<br />
► Add the following line of ESQL to the DB2U message flow to set the<br />
CURRENTSQLID to the RACF group name created for the SupportPac IP13<br />
SHAREPRICES table (see Chapter 3, “Topology and system setup” on<br />
page 21). This value has to have single quotes surrounding it for z/OS DB2 to<br />
process it correctly, as shown in the example below:<br />
PASSTHRU('SET CURRENT SQLID= ''SYSDSP''');<br />
► In a production environment, you may prefer to pass the CURRENTSQLID as<br />
part of the input message and perform the set via a variable. We did not test<br />
the following ESQL, but we have provided it as a sample:<br />
DECLARE ID CHARACTER;<br />
SET ID = InputBody.My.CurrentSQLID;<br />
PASSTHRU('{SET SQLID = (?)}', ID);<br />
4.1.2 Message Broker configuration<br />
In our testing, we used WebSphere Business Integration Message Broker<br />
without any tuning. We built a single execution group for each broker, and the<br />
same .bar file was deployed to the execution group. The final task necessary to<br />
run the tests was to turn message flow accounting (archive data) on. Statistical<br />
data is written to a message queue based on the collection interval defined for<br />
the broker. The first step is to build a subscription to the Message Broker to<br />
publish statistics. You can build the subscription as follows:<br />
► The subscription topic is: $SYS/Broker/+/StatisticsAccounting/#<br />
► This subscription must be put to the SYSTEM.BROKER.CONTROL.QUEUE<br />
► Specify a private queue to publish the statistics to, for example:<br />
STATS.IN.WMQ0BRK<br />
The IH03 SupportPac (RFHUTIL) was used to put the subscription and retrieve<br />
the XML statistics messages as they were produced. Once a statistics message<br />
is successfully read into RFHUTIL, select the Data tab. Then, select the XML<br />
radio button under Data Format. The XML tags of the statistics message are self<br />
explanatory.<br />
For easier viewing, you can import this file into Excel using the following<br />
procedure:<br />
1. From the Data tab panel in RFHUTIL, press Ctrl+A to select the XML<br />
message, then Ctrl+C to copy it.<br />
2. Paste the data into WordPad (not NotePad). Save the data to a file.<br />
3. This file can then be imported into Excel. In Excel, select File → Open, then<br />
navigate to the directory you just saved the WordPad file in. At the bottom of<br />
the navigation screen, select All File Types and select the file.<br />
4. The Text Import Wizard opens. Click Next.<br />
5. On the second panel of the Text Import Wizard, in the Delimiter section,<br />
select Other, and in the box to the right of Other type a double quote (").<br />
Then select Finish. With a little re-sizing, you can easily read the<br />
statistics in the spreadsheet.<br />
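As an alternative to the spreadsheet route, the XML statistics message can be parsed programmatically. The sketch below is illustrative only: the element and attribute names do not reproduce the exact accounting message schema, which, as noted above, is self-explanatory in the real messages.

```python
import xml.etree.ElementTree as ET

# Toy statistics record; names are illustrative, not the real broker schema.
SAMPLE = (
    '<WMQIStatisticsAccounting RecordType="Archive">'
    '<MessageFlow MessageFlowName="DB2U" TotalInputMessages="9234" '
    'TotalElapsedTime="96.401" TotalCPUTime="31.590"/>'
    '</WMQIStatisticsAccounting>'
)

def summarize(xml_text):
    """Pull the headline figures out of one statistics record."""
    flow = ET.fromstring(xml_text).find("MessageFlow")
    return (
        flow.get("MessageFlowName"),
        int(flow.get("TotalInputMessages")),
        float(flow.get("TotalElapsedTime")),
    )

name, msgs, elapsed = summarize(SAMPLE)
print(name, msgs, elapsed)  # DB2U 9234 96.401
```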
In order for the Message Broker to publish statistics, the following command was<br />
issued at the MVS console:<br />
f WMQ0BRK,cs a=yes,g=yes,j=yes,n=basic,t=basic,o=xml,c=active<br />
4.2 Scenario 1 - Initial state with all components active<br />
The first scenario that was measured involved all components of the environment<br />
active in their normal functioning state. The SupportPac IP13 batch jobs<br />
submitted to both systems allowed statistics to be gathered for this control<br />
situation.<br />
Figure 4-1 displays the configuration.<br />
Figure 4-1 All components active<br />
The batch jobs provided work for the brokers for two minutes. The results from<br />
the SupportPac IP13 measurements and the Message Broker statistics recorded<br />
are displayed in Table 4-1.<br />
Table 4-1 SupportPac IP13 and Message Broker statistics<br />
IP13 Statistics MVSM0 MVSM2 Total<br />
Total Transactions 9884 9568 19450<br />
Elapsed Time (seconds) 119.942 119.615<br />
Application CPU Time (seconds) 9.419 11.479<br />
Transaction Rate (trans/sec) 82.406 79.973<br />
Round trip per msg (ms) 12.134 12.504<br />
Average App CPU per msg (ms) 0.952 1.199<br />
Broker Statistics WMQ0BRK WMQ2BRK<br />
Total Number Input Messages 9234 10216 19450<br />
Total Elapsed Time (seconds) 96.401 98.762<br />
Total CPU Time (seconds) 31.590 33.167<br />
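The derived columns in Table 4-1 follow directly from the raw counts (our reading of how the SupportPac reports them): the transaction rate is total transactions divided by elapsed time, and the average application CPU per message is CPU time divided by transactions. A quick check against the MVSM0 column:

```python
# MVSM0 figures from Table 4-1.
total_transactions = 9884
elapsed_seconds = 119.942
app_cpu_seconds = 9.419

rate = total_transactions / elapsed_seconds  # table reports 82.406 trans/sec
cpu_per_msg_ms = app_cpu_seconds / total_transactions * 1000  # table: 0.952 ms

print(round(rate, 2), round(cpu_per_msg_ms, 2))
```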
Conclusions<br />
From the results, we concluded the following:<br />
► The total number of transactions processed by the SupportPac IP13<br />
applications on both systems equals the total number of messages processed<br />
by each of the two brokers.<br />
► A greater number of SupportPac IP13 application transactions are processed<br />
by the MVSM0 LPAR, while a greater number of messages are processed by<br />
the broker on MVSM2. This slight imbalance is due to the different resources<br />
available to each system.<br />
4.3 Scenario 2 - Execution group failover<br />
In the second scenario, the execution group for WMQ2BRK was made to fail by<br />
issuing a cancel command (C SAMPLE2) at the MVS console, as illustrated in<br />
Figure 4-2 on page 39. Message Broker execution groups are automatically<br />
recovered by the Message Broker, so no action is required by ARM. For this and<br />
all subsequent tests, Coordinated Transaction was selected in the deployment<br />
descriptor of the .bar file.<br />
38 High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5
Figure 4-2 Execution Group failover<br />
The batch jobs provided work for the brokers for four minutes. Table 4-2 records<br />
the results from the SupportPac IP13 measurements and the Message Broker<br />
statistics.<br />
Table 4-2 SupportPac IP13 and Message Broker statistics<br />
IP13 Statistics MVSM0 MVSM2 Total<br />
Total Transactions 15212 14584 29796<br />
Elapsed Time (seconds) 239.294 239.258<br />
Application CPU Time (seconds) 17.172 18.454<br />
Transaction Rate (trans/sec) 63.570 60.955<br />
Round trip per msg (ms) 15.730 16.405<br />
Average App CPU per msg (ms) 1.128 1.265<br />
Broker Statistics WMQ0BRK WMQ2BRK<br />
Total Number Input Messages 17051 12010 29061<br />
Total Elapsed Time (seconds) 181.542 124.319<br />
Total CPU Time (seconds) 61.153 42.033<br />
Conclusions<br />
From the results, we concluded the following:<br />
► At first glance, it would seem as though messages were lost under this test,<br />
but this is not the case for this or any other test scenario. When the execution<br />
group is cancelled, the statistics message is dumped by the broker. Upon<br />
successful execution group startup, a new statistics message is created, the<br />
interval time is reset, and accounting starts fresh. Since the execution group<br />
was cancelled very close to the beginning of the test and it recovered quickly,<br />
there are 735 messages that WMQ2BRK processed that are not accounted<br />
for in the above table.<br />
► The high number of messages consumed by WMQ2BRK underscores the<br />
speed of the execution group recovery.<br />
► The higher number of messages consumed by WMQ0BRK compared to<br />
transactions completed on MVSM0 illustrates how broker 0 processes<br />
messages from the SupportPac IP13 batch jobs on both systems, taking the<br />
extra load while the execution group is down on MVSM2.<br />
4.4 Scenario 3 - Message Broker failover<br />
The third scenario tests the failure of the WMQ2BRK Message Broker. The failure<br />
was simulated by issuing an MVS cancel command (C WMQ2BRK,ARMRESTART), as<br />
illustrated in Figure 4-3 on page 41. The ARM policy in effect ensures the broker<br />
restarts immediately. You can find the policy details in Appendix A, “Sample<br />
code” on page 49.<br />
40 High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5
Figure 4-3 Message Broker failover<br />
The batch jobs provided work for the brokers for four minutes. Table 4-3 records<br />
the results from the SupportPac IP13 measurements and the Message Broker<br />
statistics.<br />
Table 4-3 SupportPac IP13 and Message Broker statistics<br />
IP13 Statistics MVSM0 MVSM2 Total<br />
Total Transactions 20564 9608 30172<br />
Elapsed Time (seconds) 239.539 239.721<br />
Application CPU Time (seconds) 21.363 9.934<br />
Transaction Rate (trans/sec) 85.847 41.912<br />
Round trip per msg (ms) 46.572 88.645<br />
Average App CPU per msg (ms) 1.045 1.034<br />
Broker Statistics WMQ0BRK WMQ2BRK<br />
Total Number Input Messages 23098 unknown 23098<br />
Total Elapsed Time (seconds) 224.622 unknown<br />
Total CPU Time (seconds) 76.918 unknown<br />
Conclusions<br />
From the results, we concluded:<br />
► Again, the statistics message is dumped by the broker so no statistics are<br />
available for WMQ2BRK. Cancelling the broker caused multiple SVC dumps<br />
to be taken and CPU usage was at 100% on MVSM2 for some time. This<br />
explains why the transaction rate for MVSM2 is less than half of MVSM0.<br />
► The higher number of messages consumed by WMQ0BRK compared to<br />
transactions completed on MVSM0 illustrates how broker 0 processes<br />
messages from the SupportPac IP13 batch jobs on both systems, taking the<br />
extra load while WMQ2BRK is down.<br />
4.5 Scenario 4 - Queue manager failover<br />
Scenario 4 tests the failure of the WMQ2 queue manager. The failure was simulated<br />
by issuing an MVS stop command (WMQ2 STOP QMGR MODE(RESTART)), as illustrated<br />
in Figure 4-4 on page 43. The ARM policy in effect ensures the queue manager<br />
restarts immediately. Upon successful queue manager startup, the Message<br />
Broker dynamically reconnects to the queue manager and issues the following<br />
message to the MVS log:<br />
+BIP2091I WMQ2BRK 0 The broker has reconnected to WebSphere Business<br />
Integration successfully. : ImbAdminAgent(1095)<br />
The OEMPUTX batch job does not reconnect to the queue manager after a<br />
failure, so for this test the batch job was submitted only on MVSM0.<br />
42 High Availability z/OS Solutions for WebSphere Business Integration Message Broker V5
Figure 4-4 Queue Manager failover (diagram: z/OS images MVSM0 and MVSM2, each<br />
running SupportPac IP13 batch jobs, a queue manager in QSG MB01, a Message<br />
Broker with its execution groups and message flows, and a DB2 member of data<br />
sharing group DSN710PM, connected through the coupling facility; the<br />
Configuration Manager is also shown)<br />
The batch jobs provided work for the brokers for four minutes. Table 4-4 records<br />
the results from the SupportPac IP13 measurements and the Message Broker<br />
statistics.<br />
Table 4-4 SupportPac IP13 and Message Broker statistics<br />
IP13 Statistics                     MVSM0      MVSM2      Total<br />
Total Transactions                  37204      n.a.       37204<br />
Elapsed Time (seconds)              239.919    n.a.<br />
Application CPU Time (seconds)      35.269     n.a.<br />
Transaction Rate (trans/sec)        155.059    n.a.<br />
Round trip per msg (ms)             25774      n.a.<br />
Average App CPU per msg (ms)        947        n.a.<br />
Broker Statistics                   WMQ0BRK    WMQ2BRK<br />
Total Number Input Messages         25460      11019      36479<br />
Total Elapsed Time (seconds)        224.761    119.003<br />
Total CPU Time (seconds)            84.247     36.560<br />
Conclusions<br />
From the results, we concluded that the statistics message is lost before the<br />
queue manager restarts, because statistics gathering is a WebSphere MQ<br />
publish/subscribe function. The statistics shown for WMQ2BRK were gathered<br />
after the WMQ2 queue manager restarted. WMQ2BRK processed only about 30% of<br />
the total messages, because of the time it takes the queue manager to restart<br />
and the broker to reconnect.<br />
4.6 Scenario 5 - DB2 failover<br />
There is a known issue with the Message Broker: if DB2 fails while the broker<br />
is active, the broker does not dynamically reconnect to DB2 as it does after a<br />
queue manager failure, as illustrated in Figure 4-5 on page 45. So, although<br />
ARM restarts DB2, the Message Broker requires a manual restart to resume work.<br />
The broker does not actively manage its DB2 connection and, in fact, would not<br />
detect that DB2 had failed until it needed to access a database. This issue is<br />
being addressed in APAR PQ92596. Consequently, no statistics were produced<br />
for this test.<br />
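Because the brokers run as started tasks in our configuration (see the ARM<br />
policy in Appendix A, “Sample code” on page 49), the manual restart can be<br />
performed with an operator start command such as:<br />
S WMQ2BRK<br />
The started task name WMQ2BRK matches the restart method defined for the<br />
broker element in the ARM policy.<br />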
Figure 4-5 DB2 failover (diagram: SupportPac IP13 jobs, queue managers WMQ0<br />
and WMQ2 in QSG MB01, brokers WMQ0BRK and WMQ2BRK with their execution groups<br />
and message flows, and DB2 members DFM0 and DFM2 of data sharing group<br />
DSN710PM on z/OS images MVSM0 and MVSM2, joined through the coupling facility)<br />
4.7 Scenario 6 - z/OS system failover<br />
The final scenario tests the failure of MVSM2. The failure was simulated by<br />
removing MVSM2 from the sysplex with a system reset from the hardware<br />
console, as illustrated in Figure 4-6 on page 46. ARM restarts the DFM2 DB2<br />
subsystem in restart-light mode on MVSM0 to release any retained locks. It<br />
then restarts the WMQ2 queue manager and the WMQ2BRK broker on the surviving<br />
system, MVSM0.<br />
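The cross-system restart is driven by the RESTART_METHOD entries in the ARM<br />
policy in Appendix A, “Sample code” on page 49; for DB2, the restart-light<br />
start takes the form:<br />
#DFM2 STA DB2,LIGHT(YES)<br />
where #DFM2 is the DB2 command prefix used in our configuration.<br />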
Figure 4-6 LPAR failover (diagram: after the failure of MVSM2, the DB2<br />
subsystem in the DSG, the queue manager in the QSG, and the Message Broker<br />
with its execution groups and message flows all run on the surviving z/OS<br />
image, together with the IP13 jobs, the coupling facility, and the<br />
Configuration Manager)<br />
Because the OEMPUTX batch job running on MVSM2 ended when the system<br />
went down, there are no statistics for this test. However, to test the operation of<br />
the moved broker, the job was resubmitted on MVSM0, and it was observed that<br />
both brokers functioned normally on the one surviving z/OS image.<br />
Once MVSM2 was brought back into the sysplex with its DB2 subsystem active,<br />
the MVSM2 queue manager and broker were shut down on MVSM0 and<br />
restarted on MVSM2. Once again, the OEMPUTX batch job was submitted and<br />
normal operation was verified.<br />
Conclusions<br />
Though collection of statistics was not appropriate for this scenario, it<br />
demonstrated that:<br />
► With the ARM policy provided in Appendix A, “Sample code” on page 49,<br />
when the host z/OS system failed, the Message Broker and its prerequisite<br />
subsystems were automatically restarted on the chosen surviving z/OS<br />
system in the sysplex.<br />
► Successful operation of the moved Message Broker was verified.<br />
► On moving the Message Broker back to its original z/OS system, its<br />
successful operation was again verified.<br />
4.8 Summary<br />
We performed various failover tests to demonstrate high availability solutions<br />
for WebSphere Business Integration Message Broker on z/OS. Our goal was to<br />
verify that no messages were lost, despite the loss of statistics in some of<br />
the failover scenarios.<br />
Appendix A. Sample code<br />
This appendix provides sample code that was used for this paper.<br />
© Copyright <strong>IBM</strong> Corp. 2004. All rights reserved. 49
ARM policy<br />
Example A-1 displays the ARM policy we created and activated for the sysplex<br />
on which the failover scenarios (in Chapter 4, “Failover scenarios” on page 33)<br />
were performed. The policy name is POLICY1. The elements to be restarted are<br />
in GROUP1. These elements consist of DB2, IRLM, WebSphere MQ, and<br />
WebSphere Business Integration Message Broker.<br />
Example: A-1 ARM Policy<br />
DATA TYPE(ARM)<br />
REPORT(YES)<br />
DEFINE POLICY NAME(POLICY1) REPLACE(YES)<br />
RESTART_ORDER<br />
LEVEL(1)<br />
ELEMENT_NAME(DSN710PMDFM0,DSN710PMDFM2,<br />
DFPMIRLMIFM0001,DFPMIRLMIFM2003)<br />
LEVEL(2)<br />
ELEMENT_NAME(SYSMQMGRWMQ0,SYSMQMGRWMQ2)<br />
LEVEL(3)<br />
ELEMENT_NAME(SYSWMQI_WMQ0BRK,SYSWMQI_WMQ2BRK)<br />
RESTART_GROUP(DEFAULT)<br />
ELEMENT(*)<br />
RESTART_ATTEMPTS(0) /* JOBS NOT TO BE RESTARTED BY ARM */<br />
RESTART_GROUP(GROUP1)<br />
TARGET_SYSTEM(MVSM0,MVSM2) /* Z/OS SYSTEM NAME(S) */<br />
RESTART_PACING(20)<br />
ELEMENT(DSN710PMDFM0)<br />
RESTART_METHOD(SYSTERM,STC,'#DFM0 STA DB2,LIGHT(YES)')<br />
ELEMENT(DSN710PMDFM2)<br />
RESTART_METHOD(SYSTERM,STC,'#DFM2 STA DB2,LIGHT(YES)')<br />
ELEMENT(DFPMIRLMIFM0001)<br />
RESTART_METHOD(SYSTERM,STC,'#DFM0 S DFM0IRLM,PC=YES')<br />
ELEMENT(DFPMIRLMIFM2003)<br />
RESTART_METHOD(SYSTERM,STC,'#DFM2 S DFM2IRLM,PC=YES')<br />
ELEMENT(SYSMQMGRWMQ0)<br />
RESTART_ATTEMPTS(3,300)<br />
RESTART_TIMEOUT(120)<br />
TERMTYPE(ALLTERM)<br />
RESTART_METHOD(BOTH,STC,'WMQ0 START QMGR')<br />
ELEMENT(SYSMQMGRWMQ2)<br />
RESTART_ATTEMPTS(3,300)<br />
RESTART_TIMEOUT(120)<br />
TERMTYPE(ALLTERM)<br />
RESTART_METHOD(BOTH,STC,'WMQ2 START QMGR')<br />
ELEMENT(SYSWMQI_WMQ0BRK)<br />
RESTART_ATTEMPTS(3,300)<br />
RESTART_TIMEOUT(120)<br />
TERMTYPE(ALLTERM)<br />
RESTART_METHOD(BOTH,STC,'S WMQ0BRK')<br />
ELEMENT(SYSWMQI_WMQ2BRK)<br />
RESTART_ATTEMPTS(3,300)<br />
RESTART_TIMEOUT(120)<br />
TERMTYPE(ALLTERM)<br />
RESTART_METHOD(BOTH,STC,'S WMQ2BRK')<br />
This policy is also available as a downloadable ZIP file in Appendix B, “Additional<br />
material” on page 53.<br />
Once downloaded and unzipped, the file should be uploaded to the z/OS system<br />
using binary FTP transfer. The resulting data set is in TSO TRANSMIT format. To<br />
extract the ARM policy JCL, issue a TSO command similar to the following:<br />
RECEIVE INDSNAME(ARMPOL.XMIT)<br />
Once created, the policy can be activated with the following system command:<br />
SETXCF START,POLICY,TYPE=ARM,POLNAME=POLICY1<br />
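After activation, you can confirm that the policy is in effect and that the<br />
elements have registered with ARM by using a standard z/OS display command,<br />
for example:<br />
D XCF,ARMSTATUS,DETAIL<br />
This lists each registered element, its restart group, and its current status.<br />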
Broker customization input file<br />
An example of the broker customization input file, mqsicompcif, used for<br />
broker WMQ2BRK can be downloaded from Appendix B, “Additional material” on<br />
page 53. It is too long to display in this appendix.<br />
Once downloaded and unzipped, the file should be uploaded to the z/OS system<br />
using binary FTP transfer. The resulting data set is in TSO TRANSMIT format. To<br />
extract the mqsicompcif file, issue a TSO command similar to the following:<br />
RECEIVE INDSNAME(MQSI.COMPCIF.XMIT)<br />
Appendix B. Additional material<br />
This paper refers to additional material that can be downloaded from the Internet<br />
as described below.<br />
Locating the Web material<br />
The Web material associated with this paper is available in softcopy on the<br />
Internet from the IBM Redbooks Web server. Point your Web browser to:<br />
ftp://www.redbooks.ibm.com/redbooks/REDP3894<br />
Alternatively, you can go to the IBM Redbooks Web site at:<br />
ibm.com/redbooks<br />
Select the Additional materials and open the directory that corresponds with<br />
the redbook form number, REDP3894.<br />
© Copyright <strong>IBM</strong> Corp. 2004. All rights reserved. 53
Using the Web material<br />
The additional Web material that accompanies this Redpaper includes the<br />
following files:<br />
File name             Description<br />
armpol.xmit.zip       ARM policy sample in TSO TRANSMIT format (zipped)<br />
mqsicompcif.xmit.zip  Broker customization input file sample in TSO TRANSMIT format (zipped)<br />
How to use the Web material<br />
Create a subdirectory (folder) on your workstation, and unzip the contents of the<br />
Web material ZIP file into this folder.<br />
Abbreviations and acronyms<br />
ARM    Automatic Restart Management<br />
CA     continuous availability<br />
CF     coupling facility<br />
CFRM   Coupling Facility Resource Manager<br />
CPF    command prefix string<br />
DDF    Distributed Data Facility<br />
DSG    DB2 data sharing group<br />
HA     high availability<br />
HFS    Hierarchical File System<br />
HSM    Hierarchical Storage Manager<br />
IBM    International Business Machines Corporation<br />
IRLM   Internal Resource Lock Manager<br />
ITSO   International Technical Support Organization<br />
JCL    Job Control Language<br />
LPAR   Logical Partition<br />
ODBC   Open Database Connectivity<br />
QSG    queue sharing group<br />
RRS    Resource Recovery Services<br />
USS    UNIX System Services<br />
VIPA   Virtual IP Address<br />
WMQ    WebSphere MQ<br />
Related publications<br />
IBM Redbooks<br />
The publications listed in this section are considered particularly suitable<br />
for a more detailed discussion of the topics covered in this Redpaper.<br />
For information about ordering these publications, see “How to get IBM<br />
Redbooks” on page 58. Note that some of the documents referenced here may<br />
be available in softcopy only.<br />
► WebSphere Business Integration Pub/Sub Solutions, SG24-6088<br />
► Highly Available WebSphere Business Integration Solutions, SG24-7006<br />
Other publications<br />
These publications are also relevant as further information sources:<br />
► Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for higher<br />
availability, GM13-0026<br />
Online resources<br />
These Web sites and URLs are also relevant as further information sources:<br />
► Leveraging z/OS TCP/IP Dynamic VIPAs and Sysplex Distributor for higher<br />
availability, GM13-0026<br />
http://www-1.ibm.com/servers/eserver/zseries/pso/whitepaper.html<br />
http://www-1.ibm.com/servers/eserver/zseries/library/techpapers/pdf/<br />
gm130026.pdf<br />
► OS/390® and z/OS TCP/IP in the Parallel Sysplex Environment - Blurring the<br />
Boundaries<br />
http://www-1.ibm.com/servers/eserver/zseries/pso/whitepaper.html<br />
http://www-1.ibm.com/servers/eserver/zseries/library/techpapers/<br />
gm130165.html<br />
How to get IBM Redbooks<br />
You can search for, view, or download Redbooks, Redpapers, Hints and Tips,<br />
draft publications and Additional materials, as well as order hardcopy<br />
Redbooks or CD-ROMs, at this Web site:<br />
ibm.com/redbooks<br />
Help from IBM<br />
IBM Support and downloads<br />
ibm.com/support<br />
IBM Global Services<br />
ibm.com/services<br />
Index<br />
A<br />
active-active 22<br />
active-passive 22<br />
Automatic Restart Management (ARM) 4, 25, 30<br />
C<br />
cloned applications, advantages 14<br />
cloned brokers 10<br />
cloned execution groups, advantages 12<br />
cloned execution groups, disadvantages 12<br />
clustering 10<br />
command prefix string (CPF) 25<br />
Continuous availability 4<br />
coupling facility (CF)<br />
failover 4<br />
failure 4<br />
Resource Manager (CFRM) 4, 26<br />
CURRENTSQLID 34–35<br />
D<br />
DB2 data sharing group (DSG) 4<br />
DB2 database called SHAREPRICES 34<br />
DB2 SHAREPRICES table 34<br />
design of message flows 17<br />
Distributed Data Facility (DDF) 24<br />
F<br />
factors contribute to HA of a Message Broker environment<br />
10<br />
H<br />
Hierarchical File System (HFS) 24<br />
Hierarchical Storage Manager (HSM) 26<br />
high availability (HA) 1, 21–22<br />
environment for WebSphere Business Integration<br />
Message Broker 2<br />
failover scenarios 33<br />
I<br />
IBM Category 2 SupportPac IP13 5, 34<br />
IH03 SupportPac (RFHUTIL) 36<br />
important factors when considering availability 10<br />
Internal Resource Lock Manager (IRLM) 30<br />
J<br />
JBDB2DEFS 35<br />
JDB2DEFS 34<br />
Job Control Language (JCL) 26<br />
Job JDB2DEFS 34<br />
L<br />
Logical Partition (LPAR) 4, 24<br />
M<br />
Message Broker, error processing 18<br />
N<br />
network topology, benefits 11<br />
O<br />
OEMPUTX, parameters that can be passed 35<br />
Open Database Connectivity (ODBC) 24<br />
P<br />
Parallel Sysplex 2<br />
Publish and Subscribe functionality 18–19<br />
Q<br />
queue sharing group (QSG) 4, 14<br />
R<br />
Redbooks Web site 58<br />
S<br />
SupportPac IP13<br />
batch job (OEMPUTX) 34<br />
batch jobs 37<br />
message flow (DB2U) 34<br />
SHAREPRICES table 35<br />
U<br />
UNIX System Services (USS) environment 27<br />
W<br />
WebSphere<br />
Business Integration Broker on z/OS 5<br />
Business Integration Message Broker 1, 9<br />
WebSphere MQ<br />
clustering 10, 12–15<br />
clustering, advantages 13<br />
clustering, disadvantages 13<br />
shared queues 12, 14<br />
shared queues, advantages 16<br />
shared queues, disadvantages 16<br />
Back cover<br />
When designing and implementing a production grade<br />
Message Broker solution on z/OS, one of the most important<br />
factors to consider is high availability.<br />
This IBM Redpaper examines the design considerations<br />
inherent in configuring a highly available Message Broker<br />
environment.<br />
Also demonstrated is the use of the coupling facility for<br />
WebSphere MQ queue sharing groups (QSG) and Automatic<br />
Restart Management (ARM) in order to support WebSphere<br />
Business Integration Message Broker HA in a sysplex<br />
environment.<br />
Finally, examples of the behavior of Message Broker during<br />
failover are provided, including transaction rate<br />
measurements and throughput statistics.<br />
BUILDING TECHNICAL<br />
INFORMATION BASED ON<br />
PRACTICAL EXPERIENCE<br />
IBM Redbooks are developed by<br />
the IBM International Technical<br />
Support Organization. Experts<br />
from IBM, Customers and<br />
Partners from around the world<br />
create timely technical<br />
information based on realistic<br />
scenarios. Specific<br />
recommendations are provided<br />
to help you implement IT<br />
solutions more effectively in<br />
your environment.<br />
For more information:<br />
ibm.com/redbooks