
Unix System Services

OMVS latch deadlock - detected on the second view

GSE z/OS Guide Lahnstein, 03/14/2012

Matthias Korn

z/OS Virtual Front End

IBM Deutschland GmbH

matthias.korn@de.ibm.com

© 2012 IBM Corporation


IBM Global Competency Center

The Dining Philosophers Problem

Source: Wikipedia


What is a Deadlock?

• Resource A is in use by Process / Task A
• Process / Task A requires Resource B
• Resource B is in use by Process / Task B
• Process / Task B requires Resource A
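The circular dependency on this slide can be expressed as a wait-for graph, where a deadlock shows up as a cycle. The following is a minimal Python sketch, not anything taken from z/OS: the function name `find_cycle` and the edge convention ("X waits for Y", with a resource pointing at the task that holds it) are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of a wait-for graph as a list of nodes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / done
    colour = {node: WHITE for node in waits_for}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:             # back edge: nxt is already on the path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_for):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Edges read "X waits for Y"; a resource "waits for" the task that uses it.
slide_graph = {
    "Task A": ["Resource B"],      # Task A requires Resource B
    "Resource B": ["Task B"],      # Resource B is in use by Task B
    "Task B": ["Resource A"],      # Task B requires Resource A
    "Resource A": ["Task A"],      # Resource A is in use by Task A
}
cycle = find_cycle(slide_graph)
print(" -> ".join(cycle) if cycle else "no deadlock")
```

If any one of the four edges is removed, the function returns `None`, which matches the intuition that a deadlock needs the full circle.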


Details from the SADUMP of MVSS:

IP ANALYZE RESOURCE

RESOURCE #0133:
  NAME=SYS.BPX.A000.FSLIT.FILESYS.LSN ASID=0010 Latch#=2
RESOURCE #0133 IS HELD BY:
  JOBNAME=OMVS ASID=0010 TCB=009A1050
  DATA=EXCLUSIVE RETADDR=A347EFD2
  REQUEST = 11/15/2011 17:36:15.057144
  GRANT = 11/15/2011 17:36:15.142437


Details from the SADUMP of MVSS (cont.):

RESOURCE #0133 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009DA188
  DATA=EXCLUSIVE RETADDR=A3796972
  REQUEST = 11/15/2011 17:36:15.144127

  JOBNAME=OMVS ASID=0010 TCB=009A1260
  DATA=EXCLUSIVE RETADDR=A36A6452
  REQUEST = 11/15/2011 17:46:00.160780

  JOBNAME=OMVS ASID=0010 TCB=009A1480
  DATA=EXCLUSIVE RETADDR=A347EFD2
  REQUEST = 11/15/2011 19:44:17.379913
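The REQUEST and GRANT timestamps in the ANALYZE RESOURCE report show how closely the first waiter followed the holder and how long the queue kept growing. A small Python sketch of that arithmetic follows; the timestamp values are copied from the report, while the helper name `parse_ts` and the dictionary layout are illustrative assumptions.

```python
from datetime import datetime

# Timestamp layout as printed by the report: MM/DD/YYYY HH:MM:SS.ffffff
FMT = "%m/%d/%Y %H:%M:%S.%f"


def parse_ts(text: str) -> datetime:
    """Parse one REQUEST/GRANT timestamp from the report."""
    return datetime.strptime(text, FMT)


# GRANT time of the holder (TCB 009A1050)
holder_grant = parse_ts("11/15/2011 17:36:15.142437")

# REQUEST times of the waiters, keyed by TCB address
waiter_requests = {
    "009DA188": parse_ts("11/15/2011 17:36:15.144127"),
    "009A1260": parse_ts("11/15/2011 17:46:00.160780"),
    "009A1480": parse_ts("11/15/2011 19:44:17.379913"),
}

# Seconds between the holder's GRANT and each waiter's REQUEST
gaps = {
    tcb: (requested - holder_grant).total_seconds()
    for tcb, requested in waiter_requests.items()
}

for tcb, seconds in gaps.items():
    print(f"TCB {tcb} queued {seconds:.6f} s after the latch was granted")
```

The first waiter (TCB 009DA188) asked for the latch less than two milliseconds after it was granted to TCB 009A1050, and the last listed request arrived more than two hours later while the latch was still held.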


Details from the SADUMP of MVSS (cont.):

IP OMVSDATA DETAIL ASID(16)

Thread Data: Active Threads
Thread ID: 23A2E000 00000000
Tcb Address: 009A1050
Space switch stack:
IP CBF 7A3F8000. ASID(X'0010') STR(BPXSTACK)
WorkerThread Data:
Requesting ASID: 0100
Requesting System: 5


Details from the SADUMP of MVSS (cont.):

IP CBF 7A3F8000. ASID(X'0010') STR(BPXSTACK)

BPXNXWRK -> BPXNXWRK(BPXNXWK1) -> BPXTXRXS ->
BPXFVNL -> BPXTAVNO -> BPXFSMNT -> BPXLKKW


Details from the SADUMP of MVSS (cont.):

IP WJSIP -> FILESYS

>>> latch 2 held Exclusive has waiters
ASID=0010 TCB=009A1050 Caller=2347EFD2 LQE=7E67D4F0 EP=BPXNXWRK
>Waiters:
Stok=00000040 TCB=009DA188 Caller=23796972 LQE=7E67D5C0 EP=BPXFSLIT
Stok=00000040 TCB=009D4200 Caller=234FB25E LQE=7F4B01B0 EP=BPXNXWRK
Stok=00000040 TCB=009D3528 Caller=2344F780 LQE=7E2956F8 EP=BPXNXWRK
Stok=00000040 TCB=009CF0A8 Caller=236E6096 LQE=7F4B06F8 EP=IGWLQDTT
Stok=00000040 TCB=009D43B0 Caller=23459608 LQE=7F4B09D0 EP=BPXNXWRK
Stok=00000040 TCB=009D0D78 Caller=236A6452 LQE=7E2FA148 EP=BPXNXWRK
Stok=00000040 TCB=009A1260 Caller=236A6452 LQE=7E67D148 EP=BPXNXWRK
Stok=00000040 TCB=009CFCF0 Caller=2344F780 LQE=7E2954F0 EP=BPXNXWRK

>>> latch 442 held Exclusive
ASID=0010 TCB=009A1050 Caller=23480E62 LQE=7E2FA078 EP=BPXNXWRK
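With many waiters on one latch, it helps to group them by entry point so that the odd ones stand out. The sketch below does that in Python for the waiter lines of latch 2 shown on this slide; the regular expression and the names `parse_waiters` and `by_ep` are assumptions for illustration, not part of any IPCS tooling.

```python
import re
from collections import Counter

# Waiter lines for latch 2, copied from the dump output on this slide
WAITERS = """\
Stok=00000040 TCB=009DA188 Caller=23796972 LQE=7E67D5C0 EP=BPXFSLIT
Stok=00000040 TCB=009D4200 Caller=234FB25E LQE=7F4B01B0 EP=BPXNXWRK
Stok=00000040 TCB=009D3528 Caller=2344F780 LQE=7E2956F8 EP=BPXNXWRK
Stok=00000040 TCB=009CF0A8 Caller=236E6096 LQE=7F4B06F8 EP=IGWLQDTT
Stok=00000040 TCB=009D43B0 Caller=23459608 LQE=7F4B09D0 EP=BPXNXWRK
Stok=00000040 TCB=009D0D78 Caller=236A6452 LQE=7E2FA148 EP=BPXNXWRK
Stok=00000040 TCB=009A1260 Caller=236A6452 LQE=7E67D148 EP=BPXNXWRK
Stok=00000040 TCB=009CFCF0 Caller=2344F780 LQE=7E2954F0 EP=BPXNXWRK
"""

# One waiter line: TCB address, caller address, LQE address, entry point name
LINE = re.compile(
    r"TCB=(?P<tcb>[0-9A-F]{8})\s+Caller=(?P<caller>[0-9A-F]{8})\s+"
    r"LQE=(?P<lqe>[0-9A-F]{8})\s+EP=(?P<ep>\S+)"
)


def parse_waiters(text: str) -> list:
    """Return one dict per waiter line found in the text."""
    return [match.groupdict() for match in LINE.finditer(text)]


waiters = parse_waiters(WAITERS)
by_ep = Counter(waiter["ep"] for waiter in waiters)

for ep, count in by_ep.most_common():
    print(f"{ep}: {count} waiter(s)")
```

Most waiters are BPXNXWRK worker threads; the single BPXFSLIT entry (TCB 009DA188) is the one that matters for the rest of the analysis.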


Details from the SADUMP of MVSS (cont.):

IP BPXWFLCB 442

*****************************************************
** **
** BPXZFLCB INDEX LOOK-UP **
** **
** ALL VALUES IN HEX UNLESS OTHERWISE STATED **
** (DEC SPECIFIES A DECIMAL VALUE) **
*****************************************************
RELEASE IS: HBB7770
LATCH NUMBER (DEC): 442
LATCH NUMBER (HEX): 1BA
THIS LATCH NUMBER REFERS TO EYE CATCHER: AVFS
FILE SYSTEM NAME: *AMD/u$/tu
EXECUTION FINISHED
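The dump output mixes decimal and hexadecimal values (ASID 16 versus X'0010', latch 442 versus 1BA), and the RETADDR values in the ANALYZE RESOURCE report differ from the Caller values in the latch display only in the high-order bit. The Python sketch below reproduces these conversions. It assumes that the high-order bit of a RETADDR is the 31-bit addressing-mode indicator rather than part of the address; the helper names are hypothetical.

```python
# Assumption: bit 0 of a 31-bit return address is the addressing-mode flag
AMODE31_BIT = 0x80000000


def to_hex(value: int, width: int = 0) -> str:
    """Format a decimal value as upper-case hex, zero-padded to width."""
    return format(value, "X").zfill(width)


def strip_amode_bit(addr_hex: str) -> str:
    """Clear the high-order bit of a 32-bit hex string."""
    address = int(addr_hex, 16) & ~AMODE31_BIT & 0xFFFFFFFF
    return format(address, "08X")


print(to_hex(442))                   # latch number from the look-up
print(to_hex(16, 4))                 # ASID(16) as a four-digit hex ASID
print(strip_amode_bit("A347EFD2"))   # RETADDR of the latch holder
print(strip_amode_bit("A3796972"))   # RETADDR of the first waiter
```

Under that assumption, RETADDR=A347EFD2 and Caller=2347EFD2 point at the same instruction, which is how the two reports can be matched by address as well as by TCB.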


Conclusions / Questions at this point

• Processing got stuck during (auto-)MOUNT
• No deadlock could be identified so far
• WAIT was entered in BPXFSMNT, the main mount entry point
• What delayed the (auto-)MOUNT?
• What is BPXFSMNT waiting for?
• Is it waiting for any resource?
• Is it waiting for any subtask to respond?


BPXFSMNT

• Entered the WAIT in a routine called GetSecLabel
• SECLABEL is a RACF class that allows security labels to be set on z/OS UNIX resources
• Creates and queues a work element and posts BPXFSLIT to obtain the security label for the file system to be mounted
• Waits for BPXFSLIT to return the security label

What is BPXFSLIT doing?


How to find the BPXFSLIT task

IP SUMMARY REGISTER ASID(16)
F BPXFSLIT

TCB: 009DA188
CMP...... 00000000 PKF...... 00 LMP...... FF
General purpose register values
0-3 80000314 7FFBB028 000A5A7D 08069288
4-7 00000000 00000000 00000000 00004200
8-11 7C63D350 A34C4EA6 010595F0 00FDBF00
12-15 00000000 7C63C028 01A208B6 00000000
NDSP..... 00000000 JSCB..... 009FC07C
PRB: 009DA100 LINK..... 009DA188
WLIC..... 00020023 OPSW..... 070C0000 813A8D42
GPR0-3... 00000000 7F6F1378 00000048 82559B70
GPR4-7... 2379EF08 2379EF30 02559C10 2375E140
GPR8-11.. 7F6F1378 7F6F1378 2375E138 2375E31F
GPR12-15. 2375D320 00007B38 00000001 7F6F1378
EP....... BPXFSLIT MAJOR.... BPXINPV2 ENTPT.... A37807D8


Back to the ANALYZE report ...

RESOURCE #0133 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009DA188
  DATA=EXCLUSIVE RETADDR=A3796972
  REQUEST = 11/15/2011 17:36:15.144127

  JOBNAME=OMVS ASID=0010 TCB=009A1260
  DATA=EXCLUSIVE RETADDR=A36A6452
  REQUEST = 11/15/2011 17:46:00.160780

  JOBNAME=OMVS ASID=0010 TCB=009A1480
  DATA=EXCLUSIVE RETADDR=A347EFD2
  REQUEST = 11/15/2011 19:44:17.379913


The (almost) complete situation

• BPXFSMNT (TCB 9A1050) wants to mount *AMD/u$/tu, now posts BPXFSLIT to get a response, and enters a wait.
• The File System Mount Latch is held by BPXFSMNT (TCB 9A1050).
• BPXFSLIT (TCB 9DA188) requires the File System Mount Latch (the reason is still open at this point).


What is BPXFSLIT doing?

• The WAIT bit in the LINK field is off: the task is *NOT* in a WAIT
• WLIC 23 (WTO) appears to be residual, because the OPSW points to IEAVEPS1 (pause the currently executing task)
• The linkage stack showed a PC to ISGLRTR in the GRS address space, made to obtain the mount latch
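The first two findings can be read directly from the WLIC and LINK words in the SUMMARY output for TCB 009DA188. The Python sketch below decodes them under two layout assumptions that are not stated on the slides: that the low halfword of WLIC is the interruption code with the instruction length code in the byte before it, and that the high-order byte of LINK is the wait count. The function names are hypothetical.

```python
def decode_wlic(wlic_hex: str) -> dict:
    """Split a WLIC word into ILC and interruption code (assumed layout)."""
    word = int(wlic_hex, 16)
    return {
        "ilc": (word >> 16) & 0xFF,          # instruction length code, bytes
        "interrupt_code": word & 0xFFFF,     # for an SVC: the SVC number
    }


def wait_count(link_hex: str) -> int:
    """Return the high-order byte of the LINK word (assumed wait count)."""
    return (int(link_hex, 16) >> 24) & 0xFF


wlic = decode_wlic("00020023")   # WLIC..... 00020023 from the PRB
count = wait_count("009DA188")   # LINK..... 009DA188 from the PRB

print(f"ILC={wlic['ilc']} interruption code=X'{wlic['interrupt_code']:02X}' "
      f"(decimal {wlic['interrupt_code']})")
print("task is waiting" if count else "task is not in a WAIT")
```

Under these assumptions the interruption code X'23' is decimal 35, consistent with the WTO noted on the slide, and a wait count of zero agrees with the statement that the task is not in a WAIT.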


What is BPXFSLIT doing (cont.)

IP SUMMARY FORMAT ASID(16);f ' tcb: 009DA188'

LINKAGE STACK ENTRY 01 LSED: 7F41E138
LSE: 7F41E018
GENERAL PURPOSE REGISTER VALUES
00-01.... 00000000 00000000 00000048 7F6F1378
02-03.... 00000000 00000048 00000000 82559B70
04-05.... 00000000 2379EF08 00000000 2379EF30
06-07.... 00000000 02559C10 00000000 2375E140
08-09.... 00000000 7F6F1378 00000000 7F6F1378
10-11.... 00000000 2375E138 00000000 2375E31F
12-13.... 00000000 2375D320 00000000 00007B38
14-15.... 00000000 80FDAD88 00000000 A37807D8
...
PSW...... 07040000 80000000 PSWE..... 00000000 00FDAD88
TARG..... 00000000 A378080C MSTA..... 24F30000 00000000


What is BPXFSLIT doing (cont.)

IP CBF 24F30000 STR(BPXSTACK) ASID(16)

Stack Entry 0
Stack Entry Address: 24F30200
Previous Entry Address: 00000000
Next Entry Address: 24F31210
Entry Point ID: 0503
Csect: BPXFSLIT at 237807D8
Entry Point: BPXFSLIT at 237807D8

Stack Entry 1 * Active *
Stack Entry Address: 24F31210
Previous Entry Address: 24F30200
Next Entry Address: 24F32EF8
Entry Point ID: 1288
Csect: BPXTXRIN at 23793468
Entry Point: BPXTXRIN at 23793468


What is BPXFSLIT doing (cont.)

• BPXFSLIT called BPXTXRIN
• BPXTXRIN is called by initialization, re-init and SETOMVS processing
• As per ANALYZE RESOURCE:

RESOURCE #0133 IS HELD BY:
  JOBNAME=OMVS ASID=0010 TCB=009A1050
  DATA=EXCLUSIVE RETADDR=A347EFD2
  REQUEST = 11/15/2011 17:36:15.057144
  GRANT = 11/15/2011 17:36:15.142437
RESOURCE #0133 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009DA188
  DATA=EXCLUSIVE RETADDR=A3796972
  REQUEST = 11/15/2011 17:36:15.144127


What is BPXFSLIT doing (cont.)

• The SYSLOG shows:
  MVSS 11319 18:36:14.88
  SETOMVS FILESYS,FILESYSTEM='OMVS.CPXMVS.DFS',SYSNAME=MVSS
• The DEADLOCK is between the MOUNT task and the SETOMVS task
• The MOUNT task holds the MOUNT LATCH and posts the SETOMVS task because SECLABEL is set
• The SETOMVS task is itself waiting for the MOUNT LATCH because of the SETOMVS FILESYS command
• Neither task can proceed: DEADLOCK!
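The pattern behind this deadlock is that one task holds a lock while it waits for a reply from a second task, and the second task needs that same lock before it can reply. Because only one side of the circle is a latch request, latch contention alone does not reveal it. The toy Python model below is a sketch of that pattern, not z/OS code: the names `mount_latch`, `label_ready` and `simulate` are hypothetical, and the timeouts exist only so that the demonstration ends instead of hanging.

```python
import threading


def simulate(mount_wait: float = 0.5, latch_wait: float = 0.2) -> dict:
    """Run the hold-and-wait pattern once and report what each side got."""
    mount_latch = threading.Lock()
    latch_held = threading.Event()    # set once the mount task owns the latch
    label_ready = threading.Event()   # the reply the mount task waits for
    outcome = {"mount_got_response": None, "worker_got_latch": None}

    def mount_task() -> None:
        # Holds the latch while waiting for the worker's reply.
        with mount_latch:
            latch_held.set()
            outcome["mount_got_response"] = label_ready.wait(timeout=mount_wait)

    def worker_task() -> None:
        # Stands in for the task busy with SETOMVS: it needs the latch
        # before it can get around to answering the mount task.
        latch_held.wait()
        got = mount_latch.acquire(timeout=latch_wait)
        outcome["worker_got_latch"] = got
        if got:
            mount_latch.release()
            label_ready.set()

    threads = [threading.Thread(target=mount_task),
               threading.Thread(target=worker_task)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcome


result = simulate()
print(result)
```

Both values come back `False`: the worker never obtains the latch, so the mount task never receives its reply. Without the timeouts, neither thread would ever return.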


OA38094 - deadlock on mount latch

• BPXFSMNT (TCB 9A1050) wants to mount *AMD/u$/tu, now posts BPXFSLIT to get a response, and enters a wait.
• The File System Mount Latch is held by BPXFSMNT (TCB 9A1050).
• BPXFSLIT (TCB 9DA188), busy with SETOMVS, requires the File System Mount Latch.


The Dining Philosophers Solution

Source: Wikipedia
