
Considerations for System Installation and Configuration

최평락 (prchoi@kr.ibm.com)
MTS, IBM Korea

© 2010 IBM Corporation


Contents

Considerations for System Installation and Configuration

I.  Considerations - HW
II.  Considerations - OS
III. Considerations - MW
IV.  Configuration Case Studies
V.  Issues & Resolutions


Contents: I. Considerations - HW

1.1 Configuration Schedule and Procedure
1.2 Single Point of Failure Considerations
1.3 HW Considerations
1.4 Website
1.5 HMC Connection
1.6 Storage
1.7 Firmware


1.1 Configuration Schedule and Procedure

Schedule
Procedure
Verification


1.2 Single Point of Failure Considerations

Provide redundancy for every HW component to increase availability of the service.
Several forms of redundancy (e.g., distribution across buses) must already be considered at the LPAR planning stage.

Component: guideline against a single point of failure (SPOF)

- Node/Application: configure HACMP so that the application service can run on another node when a node fails.
- Power: use dual or N+1 power supplies; connect dual power supplies to different power distribution units; duplicate the power feeds into the rack so availability is maintained even if one UPS fails.
- Network adapter: duplicate network adapters with EtherChannel or HACMP (a hedged creation sketch follows below).
- Network: connect multiple networks to the backbone server.
- Disk adapter: use Fibre Channel adapters in redundant pairs.
- Disk controller: spread the redundant adapters across different PCI bus groups for best performance and availability.
- Disk: use RAID or mirrored disks.
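For the network-adapter guideline above, here is a minimal, hedged sketch of how an EtherChannel with a backup adapter might be created on AIX. The adapter names (ent0 primary, ent1 backup) and the ping address are assumptions for illustration, not values from this deck.

# Minimal sketch (assumed adapter names): build an EtherChannel pseudo-adapter
# with ent0 as the primary and ent1 as the backup adapter.
# netaddr is an address pinged for link-failure detection (assumed gateway).
mkdev -c adapter -s pseudo -t ibm_ech \
    -a adapter_names=ent0 -a backup_adapter=ent1 \
    -a netaddr=10.10.10.254

# Check the new pseudo-adapter (e.g. ent2) before configuring an IP on it.
lsdev -Cc adapter | grep -i etherchannel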


1.3 HW Considerations

Space and floor loading

1.3 HW Considerations (cont.)

Space considerations


1.3 HW Considerations (cont.)

DB rack and power (3-phase 4-wire)

Workload: pre-production DB (DB1, DB2)

Rack
- Model: 7040-61R
- Number of racks: 2
- Height per server: 2025 mm
- Width per server: 785 mm (785 mm x 2)
- Depth per server: 1342 mm
- Servers per rack: 1
- PDUs per rack: 3-phase 4-wire 208V 100A - 2; 220V general power - 2 (monitor, HMC); desktop HMC

Power
- Power specification: 3-phase 4-wire 208V, 100A
- Distribution panels (total): 4


1.3 HW Considerations (cont.)

AP rack and power (single-phase 3-wire)

Workload: AP/CI (AP-1, AP-2, AP-3)

Rack
- Model: 7014-T42
- Number of racks: 2
- Height per server: 2015 mm
- Width per server: 623 mm (623 mm x 2)
- Depth per server: 1043 mm
- Servers per rack: 1-2
- PDUs per rack: 4/6 (single-phase 3-wire 208V 30A - 4; single-phase 3-wire 208V 30A - 6)

Power
- Power specification: single-phase 3-wire 200-240V
- Distribution panels (total): 10


1.4 Website

DS8000
http://publib.boulder.ibm.com/infocenter/dsichelp/ds8000sv/index.jsp


1.4 Website (cont.)

Server
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp


1.5 HMC

HMC redundancy
- Network cable
- Hub
- HMC IP
- Firewall
- DLPAR


1.6 Storage

DS8000
https://w3-03.sso.ibm.com/sales/support/information/stg/#tab23
http://www-01.ibm.com/support/docview.wss?rs=540&uid=ssg1S7001350

Planning questions: RAID type? LUN size? Host? FlashCopy? PPRC?

[Figure: RAID10 (mirrored stripes) versus RAID5 (striping with parity) array layouts]


1.7 Firmware

Code Matrix
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/supportedcodep7.html


1.7 Firmware (cont.)

Download
http://www-933.ibm.com/support/fixcentral/firmware/selectFixes
http://www-01.ibm.com/support/docview.wss?&rs=1114&uid=ssg1S1002949

Patch policy
PM cycle


Contents: II. Considerations - OS

2.1 Deciding the AIX Level
2.2 AIX Life Cycle
2.3 Compatibility


2.1 Deciding the AIX Level

Guideline
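As a small, hedged sketch of how the currently installed level is usually confirmed before choosing a target level (stock AIX commands; the sample level shown is an illustration, not a value from this deck):

# Installed AIX version, technology level and service pack, e.g. 5300-09-03-0846.
oslevel -s

# Confirm that all filesets for each maintenance/technology level are complete.
instfix -i | grep ML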


2.2 AIX Life Cycle

http://www-01.ibm.com/software/support/lifecycle/index_a_z.html


2.2 AIX Life Cycle (cont.)

AIX 5.3 EOS Plan

OS level / Original Release / End of PTF support / End of interim fix support / Strategy Type
- TL 5300-06 (53J): 06/08/2007 / 5/31/2009 / 5/31/2009 / 2 year strategy
- TL 5300-07 (53L): 11/09/2007 / approximately 11/31/2009 / approximately 11/31/2009 / 2 year strategy
- TL 5300-08 (53N): 04/30/2008 / approximately 04/30/2010 / approximately 04/30/2010 / 2 year strategy
- TL 5300-09 (53Q): 11/14/2008 / approximately 11/31/2010 / approximately 11/31/2010 / 2 year strategy
- TL 5300-10 (53S): 05/15/2009 / approximately 05/2011 / approximately 05/2011 / 2 year strategy
- TL 5300-11 (53V): TBD 10/16/2009 / 10/31/2011 / 10/31/2011 / 2 year strategy, End of Service
- TL 5300-12 (53X): TBD 04/2010 / 04/30/2012 / 04/30/2012 / End of Service
- 5300-EOS: 04/30/2012 End of Service / 04/30/2015 End of Fee-based Service / Release End of Service

Dates marked TBD or approximately above may change depending on IBM's circumstances.


2.2 AIX Life Cycle (cont.)

AIX 6.1 EOS Plan

Release / Original Release / End of PTF and interim fix support
- TL 6100-00 GOLD (610): 11/09/2007 / TBD 11/30/2009
- TL 6100-01 GOLD (61B): 5/30/2008 / TBD 5/31/2010
- TL 6100-02 GOLD (61D): 11/14/2008 / TBD 11/30/2010
- TL 6100-03 GOLD (61F): 5/15/2009 / TBD 5/31/2011
- TL 6100-04 GOLD (61H): TBD 10/2009 / TBD 10/31/2011
- TL 6100-05 GOLD (61J): TBD 4/2010 / TBD 4/30/2012
- TL 6100-06 GOLD (61L): TBD 10/2010 / TBD 10/31/2012
- TL 6100-07 GOLD (61N): TBD 04/2011 / TBD 04/30/2013
- TL 6100-08 GOLD (61Q): TBD 10/2011 / TBD 09/30/2013
- 6100-EOS: 09/30/2013 End of Service / 09/30/2016 End of Fee-based Service

Dates marked TBD or approximately above may change depending on IBM's circumstances.


2.3 Compatibility

Oracle
http://www.oracle.com/technetwork/middleware/ias/downloads/as-certification-r2-101202-095871.html

DB2
http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/index.jsp?topic=/com.ibm.netcool_OMNIbus.doc/probes/generic_odbc/generic_odbc/wip/reference/genodbc_driver_sol.html


2.3 Compatibility (cont.)

Java
http://www.ibm.com/developerworks/java/jdk/aix/service.html


2.3 Compatibility (cont.)

Compiler
http://support.bull.com/ols/product/system/aix/proglang/C


Contents: III. Considerations - MW

3.1 HACMP Version Compatibility Matrix
3.2 HACMP Concepts
3.3 GPFS


3.1 HACMP Version Compatibility Matrix

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347


3.2 HACMP Concepts

IBM PowerHA Network Considerations

- Be careful with network-level changes such as IP addresses, subnet masks, switch port settings, and VLANs.
  Failure detection is possible only within the same physical network/VLAN.
- Configure at least one non-IP network.
- Configuring EtherChannel under HACMP is useful for network availability.
  Include a backup adapter connected to a secondary switch in the configuration.
- HACMP treats an EtherChannel configuration as a single-adapter network. To help it diagnose adapter failures, configure a separate netmon.cf file; HACMP then sends ICMP echo requests (ping) to interfaces outside the cluster to decide whether the adapter has failed.
- Configure a persistent IP on each node; it is useful for remote administration and monitoring.


3.2 HACMP Concepts (cont.)

IBM PowerHA Topology Considerations

IPAT via Replacement vs. IPAT via Aliasing

Considerations:
- Maximum number of service IPs within an HACMP network
- Hardware Address Takeover (HWAT)
- Speed of takeover
- Firewall issues

Example (net_ether_0, IPAT via Replacement):
  Node A: en0 - 9.19.10.1 (boot), replaced by en0 - 9.19.10.28 (service IP); en1 - 192.168.11.1 (standby)
  Node B: en0 - 9.19.10.2 (boot); en1 - 192.168.11.2 (standby)


3.2 HACMP Concepts (cont.)

IBM PowerHA Topology Considerations

Contrast between Replacement and Aliasing

Considerations:
- Maximum number of service IPs within a PowerHA network
- Speed of swap
- Hardware Address Takeover (HWAT)
- Firewall issues

Example (net_ether_0, IPAT via Aliasing):
  Node A: en0 - 192.168.10.1 (base1), 9.19.10.28 (persistent a), 9.19.10.51 (service IP); en1 - 192.168.11.1 (base2), 9.19.10.50 (service IP)
  Node B: en0 - 192.168.10.2 (base1), 9.19.10.29 (persistent b); en1 - 192.168.11.2 (base2)


3.2 HACMP Concepts (cont.)

netmon.cf

- HACMP can have difficulty correctly judging the failure of a single adapter such as an EtherChannel,
  because RSCT Topology Services cannot force packets through a single adapter to confirm that it is working.
- In single-adapter network configurations such as EtherChannel, create a netmon.cf file.
  - The default gateway IP address is commonly used.
  - If heartbeat packets are no longer received from the other node, the node pings the configured gateway to decide whether the adapter has failed.
- /usr/sbin/cluster/netmon.cf
  Example:
  180.146.181.119
  steamer
  chowder
  180.146.181.121


3.2 HACMP Concepts (cont.)

Application Monitor

Three types of monitoring:
- Startup Monitors - run one time
- Process Monitors - check for the specified process instance in the process table
- Custom Monitors - run your specified script at a recurring interval

Resource monitors can be configured either to run a simple notify event when a problem occurs, or to fall the service over to the healthy peer node.

Don't stop at just the base configuration - with thorough testing these can be great tools to automate recovery and save admin time. A hedged sketch of a custom monitor script follows below.
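A minimal sketch of what a custom (user-defined) monitor script might look like; the process name and log path are hypothetical placeholders, not values from this deck. HACMP only interprets the exit code (0 = healthy, non-zero = failed).

#!/bin/ksh
# Hypothetical custom monitor for HACMP application monitoring.
# Exit 0 if the application is healthy; a non-zero exit triggers
# notify/restart/fallover according to the monitor configuration.

PROC="ora_pmon_PROD"            # assumed process name, adjust per application

if ps -ef | grep "$PROC" | grep -v grep > /dev/null 2>&1
then
    exit 0                      # process present: report healthy
else
    print "$(date): $PROC not found" >> /tmp/app_monitor.log
    exit 1                      # process missing: report failure
fi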


3.2 HACMP Concepts (cont.)

Application scripts

- Automation: no manual work by the administrator.
- Some applications are tightly tied to specific OS characteristics such as uname, serial number, or IP address (e.g., SAP).
- Check whether the application is already running.
  When an RG is in the unmanaged state and the default startup option is used, HACMP re-runs the application start script; if the application is found to be running, end the script with 'exit 0'.
- Check the state of the data. Is recovery needed?
- Correct coding (a hedged start-script skeleton follows below):
  - start by declaring a shell (e.g., #!/usr/bin/ksh)
  - exit with RC=0
  - include a step that verifies the application has really stopped
  - fuser
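A minimal, hedged skeleton of an HACMP application start script along the lines of the bullets above; the application name, start command, and log file are placeholders, not values from this deck.

#!/usr/bin/ksh
# Hypothetical HACMP application start script skeleton.
# HACMP treats a non-zero exit code as a failed start of the resource group.

APP_PROC="myapp_daemon"                 # assumed process name
APP_START="/opt/myapp/bin/start.sh"     # assumed start command
LOG=/tmp/myapp_start.log

# If the application is already running (e.g., the RG was unmanaged), return success.
if ps -ef | grep "$APP_PROC" | grep -v grep > /dev/null 2>&1
then
    print "$(date): $APP_PROC already running, nothing to do" >> $LOG
    exit 0
fi

# Otherwise start it and report the result back to HACMP.
$APP_START >> $LOG 2>&1
exit $?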


3.3 GPFS

Compatibility
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.doc%2Fgpfs_faqs%2Fgpfsclustersfaq.html

1. The following additional filesets are required by GPFS V3.2, V3.3 and V3.4:
   - xlC.aix50.rte (C Set ++ Runtime for AIX 5.0), version 8.0.0.0 or later
   - xlC.rte (C Set ++ Runtime), version 8.0.0.0 or later
2. Enhancements to the support of Network File System (NFS) V4 in GPFS are only available on AIX V5.3 systems with the minimum technology level of 5300-04 applied, or on AIX V6.1 with GPFS V3.2 and V3.3.


3.3 GPFS (cont.)

Limitations
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.doc%2Fgpfs_faqs%2Fgpfsclustersfaq.html

- What are the GPFS cluster size limits?
  - GPFS for Linux on Multiplatform: 2441 nodes
  - GPFS for AIX on POWER: 1530 nodes
- What are the current file system size limits?
  - GPFS 2.3 or later, file system architectural limit: 2^99 bytes
  - GPFS 2.2 file system architectural limit: 2^51 bytes (2 Petabytes)
  - Current tested limit: approximately 2 PB
- What is the current limit on the number of mounted file systems in a GPFS cluster?
  - GPFS V3.2.0.1 or later: 256
  - GPFS V3.1.0.5 or later: 64
  - GPFS V3.1.0.1 thru V3.1.0.4: 32
  - GPFS V2.3 all service levels: 32
- What is the architectural limit of the number of files in a file system?
  - For file systems created with GPFS V2.3 or later: 2,147,483,648
  - For file systems created with GPFS V3.4 or later: 4,000,000,000
  - For file systems created prior to GPFS V2.3: 268,435,456


3.3 GPFS (cont.)

Limitations

- What are the limitations on GPFS disk size?
  OS kernel / Maximum supported GPFS disk size
  - AIX, 64-bit kernel: >2TB, up to the device driver limit
  - AIX, 32-bit kernel: 1TB
  - Linux 2.6 64-bit kernels: >2TB, up to the device driver limit
  - Linux 32-bit kernels (built without CONFIG_LBD): 2TB
- What is the current limit on the number of nodes that may concurrently join a cluster?
  - GPFS V3.2 is limited to a maximum of 8192 nodes.
  - GPFS V3.1 and V2.3 are limited to a maximum of 4096 nodes.


Contents: IV. Configuration Case Studies

4.1 Case Study - Company A
4.2 Case Study - Company B
4.3 GPFS
4.4 Lessons Learned


4.1 Case Study - Company A

- AIX 5.3 TL09 SP03
- HACMP 5.4.1.5
- Oracle, SAP
- Active/Standby

[Diagram: ERP configuration. Power570 #1 (active: DB1 Act, DB2 Stby, DB3 Act plus domestic and overseas AP LPARs) and Power570 #2 (standby: DB1 Stby, DB2 Act, DB3 Stby plus domestic AP2 and subsidiary AP1) form an active/standby HACMP pair joined by HA links and EtherChannel, attached through SAN switches (B40) to a DS8000 (6.9 TB production, 3.1 TB FlashCopy replica). Power570 #3 hosts domestic/overseas development, domestic QA, an archiving server, and TSM with a TS3500 tape library on a backup LAN. The servers connect to HSRP-routed IDC backbone switches over separate active and standby 10/100/1000 networks, with link counts per connection; head office and subsidiaries reach the ERP backbone over the network.]


4.1 Case Study - Company A (cont.)

Considerations

1. Language install
   C, POSIX, DE_DE, DE_DE.UTF-8, EN_US.UTF-8, JA_JP, JA_JP.UTF-8, Ja_JP, KO_KR.UTF-8, ZH_CN, ZH_CN.UTF-8, ZH_HK, ZH_HK.UTF-8, ZH_SG, ZH_SG.UTF-8, ZH_TW

2. Data migration considerations
   - Movement to a remote site
   - Protection of the original data
   - Migration speed
   - Disk separation (DS8000/DS4000)

3. Data migration methods (a hedged example sequence follows below)
   - mirrorvg
   - splitvg
   - importvg
   - chlv -t copy tlvsap
   - cplv -f -e 'tlvsap' lvsap &
   - mount
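A hedged sketch of the cplv-based path listed above, assuming the source LV lvsap and target LV tlvsap named on the slide; the volume group name, LV size, and file system type are assumptions, and this is only one of the listed methods.

# Minimal sketch: copy an LV into another volume group with cplv.
mklv -y tlvsap -t copy newvg 512      # create the target LV with type "copy" (assumed VG/size)
cplv -f -e tlvsap lvsap               # copy the source LV lvsap onto the existing LV tlvsap
chlv -t jfs2 tlvsap                   # restore the LV type (assumed jfs2) after the copy
fsck -p /dev/tlvsap                   # verify the copied file system
# then add an /etc/filesystems stanza (or mount with explicit -V/-o options) and mount it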


4.2 Case Study - Company B

Configuration diagram

Both nodes:
- AIX 5.3 TL09 SP03
- GPFS V3.2.1.12
- HACMP 5.4.1.6
- Oracle RAC 10.2.0.4

[Diagram: nodes dbp00/ppd00 and dbp10/ppd10 share a public network (10.10.17x..), a private network for RAC (192.168..), and a private network for GPFS (192.168..), with concurrent access to Hitachi shared disk.]


4.2 Case Study - Company B (cont.)

MNDHB for the voting disks

VG_name / LV_name / Disk / Purpose
- dhbvg1 / dhblv1 / tbkdsk1 / voting1, OCR1 - Oracle RAC disk heartbeat
- dhbvg2 / dhblv2 / thdisk2 / voting2, OCR2 - Oracle RAC disk heartbeat
- dhbvg3 / dhblv3 / thdisk3 / voting3, OCR3 - Oracle RAC disk heartbeat

Resource groups
RG_name / Resources
- con_rg / dhbvg1, dhbvg2, dhbvg3
- dbp_rg / net_ether_01 (10.10.10.0/24, 192.168.99.0/24) - GPFS heartbeat


4.2 Case Study - Company B (cont.)

Server migration

4.2 Case Study - Company B (cont.)

Migration procedure

4.2 Case Study - Company B (cont.)

Landscape operation strategy

4.2 Case Study - Company B (cont.)

System configuration plan by build phase
- The phase-2 development server uses the existing development server resources through a VIO server.
- The phase-2 test server uses the existing QA DB server resources as a micro-partition in uncapped mode (see the lparstat sketch below).
- The phase-2 pre-production server uses a share of the existing production server resources as a dedicated LPAR; keeping it in the landscape after the phase-2 go-live is recommended.
- After the phase-2 go-live, keep it for a period, back up the required data, then restore the originally provided server resources and continue operation.
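A small, hedged sketch of how the LPAR mode and entitlement of such a micro-partition are usually verified from AIX (stock commands; the field names mentioned are what lparstat typically reports, not values from this deck):

# Partition characteristics: look for "Type", "Mode: Uncapped",
# "Entitled Capacity" and "Online Virtual CPUs".
lparstat -i

# Ongoing utilization against entitlement (2-second interval, 5 samples).
lparstat 2 5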


Contents: V. Issues & Resolutions

5.1 OS
5.2 HACMP
5.3 GPFS


5.1 OS

Changing dump options on AIX 5.3 TL05 and later

Recommendation / How to check and change

1. Does rootvg have a dedicated primary dump logical volume instead of the default paging space (/dev/hd6)?
   Check: sysdumpdev -l
   Change (to make hd7 the primary dump space): sysdumpdev -p /dev/hd7

2. The dump device must be larger than the estimated dump at peak memory usage plus roughly 30% headroom.
   Check: sysdumpdev -e (estimates the dump size as of the time the command is run)

3. If the system hangs without crashing and the console no longer responds, a forced dump can be taken and used for root-cause analysis. For this, "Always allow system dump" and "Forced Copy Flag" must be enabled in the dump device settings.
   Check: sysdumpdev -l

4. Disabling compression is recommended. (However, if installed memory is so large that an uncompressed dump would not fit, leave compression enabled.)
   Check: odmget SWservAt | grep -Ep "dump_compress"
   Change: sysdumpdev -c (turn off dump compression)

5. To minimize dump failures, do not use parallel dump (AIX 5.3 TL05 and later).
   Check: odmget SWservAt | grep -Ep "parallel_dump"
   Change: sysdumpdev -x (turn off parallel dump; reboot required)

6. To minimize dump failures, enable the dump_hang_prevent option (AIX 5.3 TL05 and later; from AIX 5.3 TL09 and AIX 6.1 it is enabled by default and no longer needs changing).
   Check: odmget SWservAt | grep -Ep "dump_hang_prevent"
   Change: sysdumpdev -H (turn on dump_hang_prevent)
   When changing this together with parallel dump, change parallel dump first.
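A hedged consolidation of the checks above into one script; it only runs the same commands already listed in the table, the wrapper is the only addition.

#!/usr/bin/ksh
# Collect the current dump configuration for review against the recommendations above.
sysdumpdev -l                                   # dump devices, forced copy flag, compression
sysdumpdev -e                                   # estimated dump size right now
odmget SWservAt | grep -Ep "dump_compress"      # compression attribute in the ODM
odmget SWservAt | grep -Ep "parallel_dump"      # parallel dump attribute
odmget SWservAt | grep -Ep "dump_hang_prevent"  # dump_hang_prevent attribute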


5.1 OS (cont.)

Disk Reservation Disable

Recommendation: in shared-disk environments such as HACMP and GPFS, disable disk reservation as shown below to prevent reservation errors.

In shared subsystem configurations like HACMP and GPFS on AIX, reservation conflict problems may occur in the following scenarios:
- an unexpected node halt
- a wrong SAN configuration (e.g., bad switch port, bad cable, zoning issue and so on)
- non-intelligent applications for DISK/SAN monitoring

To eliminate unexpected reservation errors, set reserve_policy to "no_reserve" for vpath, MPIO or HDLM devices, and reserve_lock to "no" for PowerPath devices, in the following configurations:
- GPFS clusters
- HACMP running enhanced concurrent mode
- Oracle RAC
- Oracle ASM

Multipath: SDD
If SDD is 1.7.0.X or earlier:
  #chdev -l hdiskX -a reserve_policy=no_reserve -P
  #shutdown -Fr (to make the change take effect)
If SDD is 1.7.1.0 or later:
  #chdev -l hdiskX -a reserve_policy=no_reserve -P
  #chdev -l vpathX -a reserve_policy=no_reserve -P
  #shutdown -Fr


5.1 OS (cont.)

Disk Reservation Disable

Multipath: HDLM
  In HDLM version 5.8.1 or earlier, the reservation control setting is called the reservation level and is specified by using the set operation with the -rsv on parameter.
  If you are using HDLM 5.8.1 or earlier, it is strongly recommended to upgrade to HDLM 5.9 or later.
  If you are already using HDLM 5.9 or later, refer to the following:
  #chdev -l hdiskX -a reserve_policy=no_reserve -P
  #shutdown -Fr

Multipath: PowerPath
  To disable SCSI reservation on AIX hosts, run the following commands for each affected device:
  #chdev -l hdiskX -a reserve_lock=no -P
  #chdev -l hdiskpowerX -a reserve_lock=no -P
  #shutdown -Fr

Multipath: MPIO (SDDPCM)
  If you are using the MPIO device module for IBM, Hitachi, EMC storage, etc.:
  #chdev -l hdiskX -a reserve_policy=no_reserve -P
  #shutdown -Fr

Note: If you are using a cascading resource group in HACMP, you should not disable reserve_policy on the target devices.
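A hedged verification sketch for after the change and reboot: it lists the reserve_policy attribute of every hdisk with standard AIX commands, matching the attribute name used in the chdev examples above.

# Confirm that no_reserve took effect on all hdisks.
for d in $(lsdev -Cc disk -F name); do
    print "$d: $(lsattr -El $d -a reserve_policy -F value 2>/dev/null)"
done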


5.2 HACMP

Application monitoring

1. Stabilization Interval value? Allow enough time for the application to start.
2. Process name: ps -el


5.2 HACMP (cont.)

HACMP long fallover time
https://w3.tap.ibm.com/w3ki02/display/HAGPFS/HA_MISC#HA_MISC-P000

- Long fallover issue when all IP networks are down on a node: configure an additional Ethernet network.
- The long fallover time comes from delays in the Group Services "voting" (i.e., the HACMP and RSCT interaction).
- The HACMP nodes must always coordinate themselves; for example, the second node must wait to acquire the RGs until the first node has released the RGs.
- This coordination between the nodes happens through the interaction between HACMP and RSCT (Topology and Group Services). Under normal circumstances this HACMP/RSCT interaction occurs over the IP network.
- When all IP networks between the nodes are broken, this interaction occurs over the non-IP link between the nodes (e.g., a disk heartbeat network or RS232 network). The non-IP heartbeat path is much, much slower than any IP Ethernet network, so HACMP/RSCT can then only rely on this slow non-IP network.
- The solution to avoid such a long delay after a local network_down is to configure a second, separate IP network in HACMP. This gives both nodes a working IP network for the HACMP/RSCT interaction when a node's enX has failed, avoiding reliance on the slow non-IP network.


5.2 HACMP (cont.)

Virtual Ethernet environments

- Always configure a netmon.cf file for single-adapter networks: /usr/es/sbin/cluster/netmon.cf

Typical file:
  9.12.4.11
  9.12.4.13

In virtualized environments:
  9.12.4.11
  !REQD en2 100.12.7.9
  9.12.4.13
  !REQD en2 100.12.7.10

Most adapters will use netmon in the traditional manner, pinging 9.12.4.11 and 9.12.4.13 along with other local adapters or known remote adapters, and will only care about the interface's inbound byte count for results. Interface en2 will only be considered up if it can ping either 100.12.7.9 or 100.12.7.10.

Note: there are additional !REQD formats that may be used within the netmon.cf file, outlined in the description of APAR IZ01332 (NEW NETMON FUNCTIONALITY TO SUPPORT HACMP ON VIO).


5.2 HACMP (cont.)

Boot and service IP in the same subnet on a single interface

The following configuration is supported:
- HACMP 5.4.1 PTF4 is a pre-requisite.
- It applies only when each node has exactly one Ethernet interface in the topology.
- IZ26020: ENH: IPAT VIA ALIASING WITH BOOT AND SERVICE ON SAME SUBNET
- http://www-01.ibm.com/support/docview.wss?rs=111&context=SWG10&dc=DB550&q1=iz26020&uid=isg1IZ26020&loc=en_US&cs=utf-8&lang=en


5.2 HACMP (cont.)

IBM PowerHA Topology Considerations

IPAT via Replacement will only host one IP on an interface at any given time, hence avoiding multiple IPs within the same subnet.

Node X (network set to use IPAT via Replacement, behind a firewall to the LAN):
  en0 - 9.19.51.1 boot, replaced by 9.19.51.2 service IP1
  en1 - 10.10.11.1 standby

When HACMP is activated, the base address will be replaced by the new service IP address that the clients use to connect.

Work around 2: if you only need to manage one service IP per HACMP network, consider using IPAT via Replacement to avoid having multiple IPs on the same interface.


5.2 HACMP (cont.)

Topology environments with a Firewall

If multiple IPs on the same subnet are configured on the same interface, AIX will use the first one configured for outbound traffic.

Node X (network set to use IPAT via Aliasing, behind a firewall to the LAN):
  en0 - 10.10.10.1 boot, 9.19.51.1 persistent IP, 9.19.51.2 service IP1
  en1 - 10.10.11.1 boot

Clients can talk to the cluster nodes, but node-initiated traffic from the 9.19.51.X network will look like it is coming from the persistent IP, not the service IP.

Tip: if you only need to manage one service IP per PowerHA network, consider using IPAT via Replacement to avoid having multiple IPs on the same interface.


5.2 HACMP (cont.)

Topology environments with a Firewall

Overriding the default AIX behavior sometimes requires some creativity.

Node X (network set to use IPAT via Aliasing, behind a firewall to the LAN):
  en0 - 10.10.10.1 boot, 9.19.51.2 service IP, 9.19.51.1 persistent IP
  en1 - 10.10.11.1 boot

Work around 1: perform an ifconfig down, ifconfig up on the interface (a hedged sketch follows below).
Work around 2: open the firewall for both the service IP and the persistent IP.
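A minimal sketch of work around 1 as stated on the slide, assuming the interface en0 from the example above; bouncing the interface briefly interrupts traffic, so it would normally be done in a maintenance window.

# Work around 1 from the slide: bounce the interface.
ifconfig en0 down
ifconfig en0 up

# Verify which addresses are configured on en0 and in what order,
# since AIX uses the first address in the subnet for outbound traffic.
ifconfig en0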


5.2 HACMP (cont.)

Checking for Fast Failure Detection (FFD)

# lssrc -ls topsvcs | grep Fast
Fast Failure Detection enabled

# odmget -q name=diskhb HACMPnim
HACMPnim:
        name = "diskhb"
        desc = "Disk Heartbeat Serial protocol"
        addrtype = 1
        path = "/usr/sbin/rsct/bin/hats_diskhb_nim"
        para = "FFD_ON"
        grace = 60
        hbrate = 3000000
        cycle = 8
        gratarp = 0
        entry_type = "adapter_type"
        next_generic_type = "transport"
        next_generic_name = ""
        src_routing = 0

To enable: smitty cm_config_networks.chg_cust.select -> diskhb -> Parameters -> FFD_ON

Reason to set this option: in the event of a crash it allows the takeover to start immediately instead of waiting for the Failure Detection Rate timeout to pass.

Note: new feature starting with HACMP 5.4.


5.3 GPFS

Install or Migration failures
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=%2Fcom.ibm.cluster.gpfs.doc%2Fgpfs_faqs%2Fgpfsclustersfaq.html

When installing or migrating GPFS, the minimum levels of service you must have applied are:
- GPFS V3.4: you must apply APAR IZ78460 (GPFS V3.4.0-1)
- GPFS V3.3: you must apply APAR IZ45231 (GPFS V3.3.0-1)
- GPFS V3.2: you must apply APAR IY99639 (GPFS V3.2.0-1)

If you do not apply these levels of service and you attempt to start GPFS, you will receive an error message similar to:
  mmstartup: Required service not applied. Install GPFS 3.2.0.1 or later
  mmstartup: Command failed Examine previous error messages to determine cause

In order for GPFS tracing to function properly on a system running AIX 6.1 with the 6100-06 Technology Level, you must either install the AIX 6100-06-02 Service Pack or open a PMR to obtain an iFix for APAR IZ84729 from IBM Service. If you are running GPFS on AIX 6.1 TL 6 without the 6100-06-02 Service Pack or the iFix and have AIX tracing enabled (such as by using the GPFS mmtracectl command), you will experience a GPFS memory fault (coredump) or node crash with kernel panic.
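A hedged sketch of how the installed GPFS and AIX levels are usually checked against the requirements above before an install or migration; the fileset name pattern is how GPFS is commonly packaged on AIX and is an assumption, not taken from this deck.

# Installed GPFS filesets and their levels (gpfs.base etc.).
lslpp -l "gpfs.*"

# AIX level, to compare against the TL/SP requirements above.
oslevel -s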


5.3 GPFS (cont.)

Syncing GPFS information for a DR configuration

[Diagram: production nodes dbp00/ppd00 and dbp10/ppd10 run GPFS and RAC over a public network (10.10.10x..) and private network (192.168..) with concurrent access to Hitachi shared disk; an external DR site runs dr_dbp00/dr_ppd00 under VIO with GPFS on its own Hitachi shared disk.]

Procedure on the DR node:
- Check the hostname
- cp /etc/hosts /etc/hosts.normal
- cp /etc/hosts.dr /etc/hosts
- rm -fr /var/mmfs
- mmcrcluster -n /tmp/ibm/`hostname`_allnodes -p `hostname` -C gpfs_`hostname`
- mmimportfs all -i /tmp/ibm/`hostname`_mmsdrfs
- mmchmgr /dev/test dr_server
- mmstartup -a
- sleep 3
- mmmount all


5.3 GPFS (cont.)

GPFS issue

[Environment]
- OS level: AIX 5.3 TL10 SP02
- GPFS level: GPFS 3.2.1.15
- M/T: 9119-FHA
- Memory: 630 GB

Data collection: on both nodes, offset by 30 minutes, collect the following data at one-hour intervals.
  mmfsadm saferdump malloc > /big_dir/$(uname -n).$(date +%m.%d.%H.%M.%S).gpfs.dump.malloc
  sleep 10
  mmfsadm saferdump instances | grep -c OpenInstance > /big_dir/$(uname -n).$(date +%m.%d.%H.%M.%S).gpfs.dump.instance

- mmchconfig sharedMemLimit=2047M (GPFS must be recycled)


5.3 GPFS (cont.)

Parameters

Item / Description / Default / Recommended
- pagepool: area used to store and cache user data and GPFS file system metadata / 64M / 8192M
- maxFilesToCache: number of inodes cached for concurrently open and recently used files / 1000 / 8000
- maxStatCache: area for caching the attributes of files that are not in the regular file cache / 4000 / 30000
- prefetchThreads: controls the maximum number of threads dedicated to prefetching data for files that are read sequentially, or to handling write-behinds (prefetchThreads + worker1Threads should be less than 550) / 72 / 100
- worker1Threads: controls the maximum number of concurrent file operations at any instant (prefetchThreads + worker1Threads should be less than 550) / 48 / 300
- minMissedPingTimeout: minimum time pinging without reply / 30 / 30
- maxMissedPingTimeout: maximum time pinging without reply (to declare a node down) / 60 / 60
- totalPingTimeout: total time pinging before giving up / 120 / 120
- sendTimeout: seconds before send/connect times out / 10 / 10
- leaseDuration: a kind of heartbeat check duration between nodes / 35 / 35
- leaseRecoveryWait: how long to wait to start log recovery after a failed node's lease has run out / 35 / 35
- maxMBpS: used in calculating the amount of I/O that can be done to effectively prefetch data for readers and write-behind data from writers / 150 / 10240
- sharedMemLimit: limit on GPFS shared memory / 256M / 2047M

GPFS file system i-node count: keep free i-nodes above 15% of the file system; if the count falls below that, warning messages appear and performance may degrade (default N/A; set toward the maximum).
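A hedged sketch of applying a few of the values above with mmchconfig; the parameter names come from the table, but whether a daemon restart is needed varies by parameter, so the mmshutdown/mmstartup step below is shown as the conservative path rather than a requirement.

# Apply selected tuning values cluster-wide (-N all); verify each parameter
# against the GPFS documentation for your level first.
mmchconfig pagepool=8192M,maxFilesToCache=8000,maxStatCache=30000 -N all
mmchconfig sharedMemLimit=2047M

# Conservative path: recycle GPFS so all changes take effect, then verify.
mmshutdown -a
mmstartup -a
mmlsconfig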


Thank You
