Rdb Continuous LogMiner and the JCC LogMiner Loader - Oracle
Rdb Continuous LogMiner<br />
and the JCC LogMiner<br />
Loader<br />
A presentation of MnSCU’s use of<br />
this technology
Topics<br />
• Overview of MnSCU<br />
• LogMining Uses<br />
• Loading Data<br />
• Bonus: Global Buffer implementation and<br />
resulting performance improvements
Overview of MnSCU
Overview of MnSCU<br />
• Minnesota State College <strong>and</strong> University System<br />
– Comprised of 32 Institutions<br />
• State Universities, Community <strong>and</strong> Technical Colleges<br />
• 53 Campuses in 46 Communities<br />
• Over 16,000 faculty <strong>and</strong> staff<br />
• More than 3,600 degree programs<br />
– Serves over 240,000 Students per year<br />
• Additional 130,000 Students in non-credit courses<br />
– About 30,000 graduates each year
ISRS<br />
• MnSCU’s Primary Application<br />
– ISRS: Integrated State-wide Record System<br />
• Written in Uniface (4GL), Cobol, C, JAVA<br />
– 2,000+ 3GL programs; 2,200+ 4GL forms<br />
– Over 2,900,000 lines of code
MnSCU’s Rdb Topology<br />
North Region: 9 Institution Dbs, 1 Regional Db<br />
South Region: 8 Institution Dbs, 1 Regional Db<br />
Central Region: 7 Institution Dbs, 1 Regional Db, 1 Central Db<br />
Metro Region: 13 Institution Dbs, 1 Regional Db<br />
Development: 20+ Dvlp/QC/Train Dbs<br />
• Each Institution Db has over 1,200 tables<br />
• Over 900,000,000 rows<br />
• Over 1 terabyte total disk space<br />
• Over 20% annual data growth
Production Users<br />
• Each regional center supports:<br />
– Between 500 <strong>and</strong> 1,000 on-line users (during<br />
<strong>the</strong> day)<br />
– Numerous batch reporting <strong>and</strong> update jobs<br />
daily <strong>and</strong> over-night<br />
– 25,000+ Web transactions each day 24x7<br />
• Registrations, Grades, Charges (fees),<br />
On-line Payments, etc…
LogMining
LogMining Modes<br />
• Static<br />
– The Rdb LogMiner runs, by default, as a<br />
stand-alone process against backup copies of<br />
the source database AIJ files<br />
• Continuous<br />
– The Rdb LogMiner can run against live<br />
database AIJs to produce a continuous output<br />
stream that captures transactions when they<br />
are committed
LogMining Uses<br />
• Hot Standby (replication) replacement<br />
• Mining a single Db to multiple targets<br />
• Mining to Non-<strong>Rdb</strong> Target(s)<br />
– XML, File, API, Tuxedo, Orrible<br />
• Mining multiple Dbs to a single target<br />
• Minimizing production Db maintenance<br />
downtime
LogMining at MnSCU<br />
The combination of the Rdb Continuous LogMiner<br />
and the JCC LogMiner Loader allows us to:<br />
• Distribute centrally controlled data to multiple<br />
local databases<br />
• Replicate production databases into multiple<br />
partitioned query databases<br />
• Roll up multiple production databases into a<br />
single data warehouse<br />
• Replicate production data into non-Rdb<br />
development databases to support development<br />
of database-independent applications
The Big Picture<br />
[Diagram: 37 production Rdb DBs feed Continuous LogMiner (CLM) sessions to several targets:<br />
• HS: 36 Dbs, giving 36 standby Rdb reporting Dbs<br />
• NWR db → NTC, FFC, TRF, BTC (CLM w/DBK, 4 sessions)<br />
• CENTRALDB → 4 REGIONALDBs (CLM w/PK, 4 sessions)<br />
• ISRS: 37 schemas, each with 238 tables (CLM w/DBK, 37 sessions)<br />
• MNSCUALL schema: 5 tables, combined (all 37) (CLM w/FilterMaps, DBK+RC_ID, 37 sessions)<br />
• CC_SUMMARY: combined (all 37) (CLM w/PK, 37 sessions)<br />
• Other labels: WAREHOUSE, REPL, MVs, MV/DJ, CORE tables combined (all 37), VAL tables (1 central copy), other data warehouse structures, MnOnline application tables (20+; read-only 24 x 7), future read-only ad-hoc users]
Replication Replacement<br />
• Historically we’ve used Rdb’s Hot Standby<br />
feature to maintain reporting Dbs for users<br />
(ODBC and some batch reports)<br />
• However, of the 1,200+ tables in<br />
production ISRS Dbs, we found users only<br />
use 260 tables for reporting from the<br />
standby Dbs<br />
• Batch reports were found to use another 222 tables<br />
• With LogMiner, we can maintain just this subset of tables<br />
(482) for reporting purposes<br />
[Diagram: 37 production Rdb DBs; NWR db mined to reporting Dbs NTC, FFC, TRF, BTC]
Mining Single Db to<br />
Multiple Targets<br />
• We’ve begun combining<br />
institutions in some of our<br />
production databases<br />
• Introduced a new column called<br />
RC_ID to over 700 tables for row-level<br />
security<br />
• Row security provided by views<br />
• However, the reporting databases still needed to<br />
be separate so users wouldn’t have to change<br />
hundreds of existing queries to include RC_ID<br />
[Diagram: 37 production Rdb DBs; NWR db mined to separate reporting Dbs NTC, FFC, TRF, BTC]
‘CORE’ Tables<br />
• ‘CORE’ tables contain common data for<br />
MnSCU<br />
• Tables like PERSON, ADDRESS, PHONE,<br />
etc. that can be combined across all<br />
institutions<br />
• These tables incorporate a new surrogate<br />
key called MNSCU_ID which is unique<br />
across all institutions<br />
• Users do not access ‘CORE’ tables directly<br />
[Diagram: 37 production Rdb DBs; NWR db]
‘CORE’ Table Views<br />
• ‘CORE’ data is presented to<br />
users via views that join RC_ID<br />
(an institution-specific value) and<br />
TECH_ID (the original surrogate<br />
key) to determine MNSCU_ID,<br />
via a third ‘REPORTING’ table<br />
• All ‘CORE’ tables (PERSON,<br />
ADDRESS, PHONE, …) are<br />
accessed via the views<br />
View PERSON columns: TECH_ID, RC_ID, rest of attributes<br />
CR_REPORTING columns: MNSCU_ID, TECH_ID, RC_ID<br />
CR_PERSON columns: MNSCU_ID, rest of attributes<br />
[Diagram: 37 production Rdb DBs; NWR db]
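A minimal sketch of the view arrangement described on this slide, using Python's built-in sqlite3 as a stand-in for Rdb. The table and column names PERSON, CR_PERSON, CR_REPORTING, MNSCU_ID, TECH_ID and RC_ID come from the slide; LAST_NAME, the sample rows and the helper function names are assumptions added for illustration, and the real ISRS view definitions are not shown in the deck.

```python
import sqlite3

def build_core_demo():
    """Build an in-memory stand-in for the CORE tables and the PERSON view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CR_PERSON (MNSCU_ID INTEGER PRIMARY KEY, LAST_NAME TEXT);
        CREATE TABLE CR_REPORTING (MNSCU_ID INTEGER, TECH_ID INTEGER, RC_ID TEXT);
        -- The view hides MNSCU_ID and exposes the original per-institution key.
        CREATE VIEW PERSON AS
            SELECT r.TECH_ID, r.RC_ID, p.LAST_NAME
            FROM CR_REPORTING r
            JOIN CR_PERSON p ON p.MNSCU_ID = r.MNSCU_ID;
    """)
    # Hypothetical data: one person (MNSCU_ID 1) known at two institutions.
    conn.execute("INSERT INTO CR_PERSON VALUES (1, 'Olson')")
    conn.executemany(
        "INSERT INTO CR_REPORTING VALUES (?, ?, ?)",
        [(1, 501, "0142"), (1, 77, "0263")],
    )
    return conn

def person_rows(conn, rc_id):
    """Query the view the way an institution's existing report would."""
    cur = conn.execute(
        "SELECT TECH_ID, RC_ID, LAST_NAME FROM PERSON "
        "WHERE RC_ID = ? ORDER BY TECH_ID",
        (rc_id,),
    )
    return cur.fetchall()
```

The point of the design is that existing queries keep using TECH_ID and RC_ID while the combined CR_PERSON row is stored once per MNSCU_ID.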
Query Performance<br />
• Row-level security in our production dbs<br />
(‘CORE’ table views) has increased the query<br />
tuning challenges<br />
• Improved reporting performance in the<br />
target db by using a filtering trigger to populate a table<br />
that users query directly, rather than using the same views<br />
as Production<br />
• Reporting Db has different indexes and uses<br />
row-caching to maximize query performance
‘CORE’ Triggers<br />
[Diagram: Production holds CR_REPORTING (MNSCU_ID, RC_ID, TECH_ID) and CR_PERSON (MNSCU_ID). Reporting holds CR_REPORTING, CR_PERSON and CR_RC_ID (RC_ID); an AFTER INSERT trigger checks "if exists by MNSCU_ID and exists by RC_ID", then inserts into PERSON]
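One possible reading of the trigger diagram, sketched with sqlite3 rather than Rdb SQL. It assumes the AFTER INSERT trigger sits on CR_PERSON in the reporting db and that CR_RC_ID lists the institutions that db serves; the diagram does not state either point explicitly. LAST_NAME, the trigger name and the function name are hypothetical.

```python
import sqlite3

def build_reporting_db(local_rc_ids):
    """Create a reporting db whose trigger fills PERSON for local RC_IDs only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CR_REPORTING (MNSCU_ID INTEGER, TECH_ID INTEGER, RC_ID TEXT);
        CREATE TABLE CR_PERSON (MNSCU_ID INTEGER PRIMARY KEY, LAST_NAME TEXT);
        CREATE TABLE CR_RC_ID (RC_ID TEXT PRIMARY KEY);
        CREATE TABLE PERSON (TECH_ID INTEGER, RC_ID TEXT, LAST_NAME TEXT);
        -- Assumed trigger: copy the new CORE row into the user-facing table
        -- only if a mapping exists by MNSCU_ID and its RC_ID is served here.
        CREATE TRIGGER CR_PERSON_AI AFTER INSERT ON CR_PERSON
        BEGIN
            INSERT INTO PERSON (TECH_ID, RC_ID, LAST_NAME)
            SELECT r.TECH_ID, r.RC_ID, NEW.LAST_NAME
            FROM CR_REPORTING r
            WHERE r.MNSCU_ID = NEW.MNSCU_ID
              AND EXISTS (SELECT 1 FROM CR_RC_ID c WHERE c.RC_ID = r.RC_ID);
        END;
    """)
    conn.executemany(
        "INSERT INTO CR_RC_ID VALUES (?)", [(rc,) for rc in local_rc_ids]
    )
    return conn
```

In this sketch the CR_REPORTING row must arrive before the matching CR_PERSON row; a production trigger set would also need to handle updates, deletes and the opposite arrival order.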
Non-‘CORE’ Tables to<br />
Multiple Targets<br />
[Diagram: Production Db NWR (institutions 0142, 0263, 0215, 0303) feeds four LogMiner sessions, each filtering on one RC_ID, into four reporting Dbs]<br />
• LogMiner w/ filtering for RC_ID=‘0142’ → Reporting Db 0142<br />
• LogMiner w/ filtering for RC_ID=‘0263’ → Reporting Db 0263<br />
• LogMiner w/ filtering for RC_ID=‘0215’ → Reporting Db 0215<br />
• LogMiner w/ filtering for RC_ID=‘0303’ → Reporting Db 0303
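The effect of the four filtered sessions can be illustrated with a few lines of Python. This is only a model of the routing idea; it is not how the Loader's filtering is configured. The function name, the change-record shape and the table names in the usage example are hypothetical.

```python
def route_by_rc_id(changes, rc_ids):
    """Split a stream of row changes into one list per institution (RC_ID)."""
    routed = {rc_id: [] for rc_id in rc_ids}
    for change in changes:
        target = routed.get(change["RC_ID"])
        if target is not None:  # rows for institutions not listed are skipped
            target.append(change)
    return routed

# Hypothetical change records mined from the combined NWR production db.
sample_changes = [
    {"table": "EXAMPLE_A", "RC_ID": "0142", "op": "modify"},
    {"table": "EXAMPLE_A", "RC_ID": "0263", "op": "modify"},
    {"table": "EXAMPLE_B", "RC_ID": "0142", "op": "delete"},
    {"table": "EXAMPLE_B", "RC_ID": "9999", "op": "modify"},
]
routed = route_by_rc_id(sample_changes, ["0142", "0263", "0215", "0303"])
```

Each reporting db then receives only its own institution's rows, so existing queries do not need an RC_ID predicate.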
Eliminating Remote<br />
Attaches<br />
• We also have some centralized<br />
data, like ‘codes tables’<br />
• Some jobs require reading this<br />
centralized data while processing against<br />
each Production ISRS database<br />
– Forces a remote attachment to the central db<br />
• With LogMiner, we can maintain regional copies<br />
of this data so these jobs do not have to use<br />
remote attaches to get data<br />
[Diagram: CENTRALDB mined to 4 REGIONALDBs (CLM w/PK, 4 sessions)]
Non-<strong>Rdb</strong> Target<br />
• Since we have 37 separate institutional<br />
databases (and reporting databases),<br />
getting combined data for system-wide<br />
reporting was difficult<br />
• With LogMiner we can combine data from<br />
each of the production ISRS databases<br />
into one target, in this case our Oracle<br />
warehouse ISRS<br />
[Diagram: 37 production Rdb DBs (including NWR db) → CLM w/FilterMaps, DBK+RC_ID, 37 sessions → MNSCUALL schema, 5 tables, combined (all 37)]
Non-<strong>Rdb</strong> Target<br />
• With LogMiner we can mine tables from<br />
each of our Production ISRS databases<br />
into individual Oracle Schemas<br />
• This allows us to test our application<br />
against a different DBMS<br />
– Structure and data are identical<br />
• Keeps data separate for institutional<br />
reporting<br />
[Diagram: 37 production Rdb DBs (including NWR db) → CLM w/DBK, 37 sessions → ISRS: 37 schemas, each with 238 tables]
MnSCU’s Oracle Topology<br />
ISRS Instance: 37 Institution Schemas (238 tables each), 1 Combined Schema (5 tables), 1 VAL Schema (Codes Tables)<br />
Other Reporting Structures: WAREHOUS Schema, Several Warehouse Schemas, Development Schema<br />
• Over 800,000,000 rows (and growing rapidly)<br />
• Over 500 GB total disk space<br />
• Various ETL
LogMining Scope at<br />
MnSCU<br />
• Sessions with Rdb Targets:<br />
– NWR: 482 tables from 1 source to 4 Rdb targets<br />
(4 sessions; example of mining 1 to many)<br />
• In these sessions we are separating data from a combined institutional<br />
database into separate reporting databases for each institution<br />
• Triggers used to populate base tables from production ‘CORE’ tables<br />
– CENTRLDB: 3 tables from 1 source db to 4 target Rdb dbs<br />
• In this session we are taking centralized data and placing copies of<br />
it on our regional servers (allows us to maintain these 3 tables<br />
centrally without changing our application, which reads the data<br />
locally)
LogMining Scope at<br />
MnSCU<br />
• Sessions with <strong>Oracle</strong> Targets:<br />
– ISRS: 238 tables from each of 37 source dbs to 37 <strong>Oracle</strong><br />
schemas<br />
(37 sessions; Non-<strong>Rdb</strong> target; example of mining many to many)<br />
• These sessions allow us to create exact copies of our databases in <strong>Oracle</strong><br />
• From this <strong>Oracle</strong> data materialized views are built to provide value-added<br />
reporting ‘data-marts’<br />
• Allows us to shift our reporting focus to <strong>Oracle</strong> while continuing to base<br />
production on <strong>Rdb</strong><br />
• Also allows us to work on converting our application to use <strong>Oracle</strong> without<br />
impacting production or having to do a cold-switch<br />
– MNSCUALL: 5 tables from each of 37 source dbs to 1 <strong>Oracle</strong><br />
schema<br />
(37 sessions; Example of mining Many sources to 1 target)<br />
• These sessions allow us to build a combined version of data<br />
• These are some of our larger tables, with over 250,000,000 rows<br />
• Total of 9,513 tables being mined by 119<br />
separate continuous <strong>LogMiner</strong> sessions!
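The totals can be reconciled with a short calculation. This is one reading that happens to reproduce both figures, not a breakdown given in the slides: it counts NWR and CENTRLDB tables once per source, and it assumes the 37 "CLM w/PK" sessions feeding CC_SUMMARY in the Big Picture diagram mine one table per source database.

```python
# (name, distinct source tables mined, sessions)
SESSION_GROUPS = [
    ("NWR", 482, 4),             # 482 tables from 1 source, counted once
    ("CENTRLDB", 3, 4),          # 3 tables from 1 source, counted once
    ("ISRS", 238 * 37, 37),      # 238 tables from each of 37 sources
    ("MNSCUALL", 5 * 37, 37),    # 5 tables from each of 37 sources
    ("CC_SUMMARY", 1 * 37, 37),  # ASSUMPTION: 1 table from each of 37 sources
]

total_tables = sum(tables for _, tables, _ in SESSION_GROUPS)
total_sessions = sum(sessions for _, _, sessions in SESSION_GROUPS)

print(f"{total_tables:,} tables mined by {total_sessions} sessions")
```

Under these assumptions the sums come to the 9,513 tables and 119 sessions quoted on the slide.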
Session Support<br />
• To support so many sessions we’ve developed a<br />
naming convention for sessions<br />
• Includes a specific directory structure<br />
• Built several tools to simplify <strong>the</strong> task of<br />
creating/recreating/reloading tables<br />
• Some of <strong>the</strong> tools are based on <strong>the</strong> naming<br />
convention<br />
• Currently our tools are all DCL, but better<br />
implementations could be made with 3GLs
Source Db Preparation<br />
• The LogMiner input to the Loader is created when the<br />
source database is enabled for LogMining<br />
• This is accomplished with an RMU command. The<br />
optional parameter ‘continuous’ is used to specify<br />
continuous operation:<br />
$ rmu/set logminer/enable[/continuous] <br />
$ rmu/backup/after/quiet ...<br />
• Many of <strong>the</strong> procedures included with <strong>the</strong> <strong>Loader</strong> kit rely<br />
on <strong>the</strong> procedure vms_functions.sql having been applied<br />
to <strong>the</strong> source database:<br />
SQL> attach 'filename ';<br />
SQL> @jcc_tool_sql:vms_functions.sql<br />
SQL> commit;
Target Db Preparation<br />
• Besides the task of creating the target db<br />
and tables themselves, there are many ‘details’ to<br />
attend to depending upon target db type<br />
• For Oracle targets, Rdb field and table<br />
name lengths (and names) can be an<br />
issue (and target tablespaces too)<br />
• One thing in common: the HighWater table<br />
– Used by the loader to keep track of what has<br />
been processed
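The general idea of a high-water mark can be sketched as follows. This is a generic illustration of the technique, not the JCC Loader's actual HighWater table layout or logic; the table names `target` and `highwater`, their columns, the sequence numbers and the function names are all hypothetical.

```python
import sqlite3

def open_target():
    """Create a hypothetical target db with a data table and a high-water table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (k TEXT PRIMARY KEY, v TEXT)")
    conn.execute("CREATE TABLE highwater (session TEXT PRIMARY KEY, last_seq INTEGER)")
    conn.commit()
    return conn

def apply_changes(conn, session, changes):
    """Apply (seq, key, value) changes newer than the stored high-water mark."""
    row = conn.execute(
        "SELECT last_seq FROM highwater WHERE session = ?", (session,)
    ).fetchone()
    last_seq = row[0] if row else 0
    applied = 0
    for seq, key, value in changes:
        if seq <= last_seq:
            continue  # already processed before a restart; skip the replay
        conn.execute("INSERT OR REPLACE INTO target VALUES (?, ?)", (key, value))
        conn.execute("INSERT OR REPLACE INTO highwater VALUES (?, ?)", (session, seq))
        conn.commit()  # data and mark are committed together
        last_seq = seq
        applied += 1
    return applied
```

Because the mark is committed in the same transaction as the data, replaying an input stream after a restart does not apply a change twice.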
Minimizing Production<br />
Downtime<br />
• Basic Steps:<br />
– Do an AIJ backup<br />
– Create copy of production Db<br />
– Perform restructuring / maintenance / etc. on Db copy<br />
• This could take many hours<br />
– Remove users from Production Db<br />
– Apply AIJ transactions to Db copy using LogMiner<br />
• This step requires minimal time<br />
– Switch applications to use Db copy. This is now the<br />
new production Db!
Loading Data
Loading Data<br />
• 3 Methods to accomplish this:<br />
– Direct Load (RMU, SQLLOADER)<br />
• Could impose data restrictions by using views<br />
• Can configure commit-interval<br />
– <strong>LogMiner</strong> Pump<br />
• Use a no-change update transaction on source<br />
• Allows for data restrictions<br />
• Commit-interval matches source transaction<br />
• Consumes AIJ space<br />
– <strong>JCC</strong> Data Pump<br />
• Configurable to do parent/child tables, data restrictions,<br />
commit-interval <strong>and</strong> delay-interval to minimize performance<br />
impact<br />
• Consumes AIJ space
Loading Data Example<br />
• Loaded a table with 19,986 rows with<br />
SQLLOADER<br />
– This took about 13 minutes, no AIJ blocks,<br />
some disk space for .UNL file<br />
• Used LogMiner to ‘pump’ the rows via a no-change<br />
update<br />
– This took about 10 minutes; 20,000 AIJ blocks
Pumping Data Example<br />
• Using <strong>the</strong> <strong>JCC</strong> Data Pump<br />
– About 22 seconds to update source data:<br />
– Only 3 minutes to update target data!!
<strong>Loader</strong> Performance<br />
• SCSU_REPISRS session<br />
– From CLM log:<br />
17-NOV-2004 16:15:42.26 20234166 CLM SCSU_REPISR Total : 36381<br />
records written (36048 modify, 333 delete)<br />
– The 3 processes themselves:<br />
• CTL: 14.6 CPU secs / 1,667 Direct IO<br />
• CLM: 1 min 11.4 CPU secs / 67,112 Direct IO<br />
• LML: 7 min 55.7 CPU secs / 67 Direct IO<br />
– Over 11:15 hours of connect time, this is about<br />
1.6 Direct IO per second on average<br />
– Processed about 0.9 records per second on average
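The averages on this slide can be rechecked from the figures shown. The sketch below assumes "11:15 hours" means 11 hours 15 minutes of connect time; the variable and function names are illustrative.

```python
def to_seconds(hours=0, minutes=0, seconds=0.0):
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds

connect_seconds = to_seconds(hours=11, minutes=15)

records = 36048 + 333  # modify + delete, from the CLM log line
direct_io = {"CTL": 1667, "CLM": 67112, "LML": 67}
cpu_seconds = {
    "CTL": 14.6,
    "CLM": to_seconds(minutes=1, seconds=11.4),
    "LML": to_seconds(minutes=7, seconds=55.7),
}

records_per_second = records / connect_seconds
io_per_second_all = sum(direct_io.values()) / connect_seconds
io_per_second_clm = direct_io["CLM"] / connect_seconds
cpu_share = sum(cpu_seconds.values()) / connect_seconds

print(f"records/sec:          {records_per_second:.2f}")
print(f"Direct IO/sec (all):  {io_per_second_all:.2f}")
print(f"Direct IO/sec (CLM):  {io_per_second_clm:.2f}")
print(f"CPU share of elapsed: {cpu_share:.1%}")
```

Summing all three processes gives roughly 1.7 Direct IO per second, while the CLM process alone gives roughly 1.66, so the slide's "about 1.6" is close to the CLM-only figure. Either way the session's load on the system is small.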
Heartbeat<br />
• Without Heartbeat a session can become<br />
‘stale’<br />
• AIJ backups can be blocked<br />
• With Heartbeat enabled this does not<br />
occur<br />
– Side effect is that ‘trailing’ messages are not<br />
displayed with heartbeat enabled
Bonus: Global Buffers<br />
• Despite great performance gains from Row<br />
Cache over the past couple of years, we still<br />
anticipated issues for our fall busy period<br />
• We turned on Global Buffers (GB) on many Dbs<br />
– On our busiest server, we enabled it on all dbs<br />
– On other servers, we have about 50% implementation<br />
• Previously ran with RDM$BIND_BUFFERS of 220<br />
• Estimated GB at max number of users @ 200 each
Global Buffers<br />
• Prior to implementing GB, our busiest<br />
server was running at a constant 6,000 to<br />
7,000 IO/sec<br />
• Other servers were running around 3,000<br />
but had spikes to 7,000 or more<br />
• Global buffers both lowered overall IO<br />
and eliminated spikes<br />
• Cost is in total Locks (Resources)<br />
– Increase LOCKIDTBL and RESHASHTBL
[Chart: Before GB, METE. Y-axis 0 to 7,000; x-axis 28-Jul, 2-Aug, 3-Aug (each labeled 1 w/Gb); series: 10-11 IO, 2-3 IO, 10-11 PRC, 2-3 PRC]
[Chart: After GB, METE. Y-axis 0 to 7,000; x-axis 28-Jul to 7-Oct, with labels rising from 1 w/Gb (28-Jul, 3-Aug) to 7 (5-Aug), 9 (9-Aug, 11-Aug) and 11 w/Gb (13-Aug through 21-Sep); series: 10-11 IO, 2-3 IO, 10-11 PRC, 2-3 PRC]
[Chart: After GB, MNSCU1. Y-axis 0 to 6,000; x-axis 28-Jul to 7-Oct, with labels rising from 0 w/Gb (28-Jul, 3-Aug) to 1 (5-Aug, 9-Aug), 2 (11-Aug) and 4 w/Gb (13-Aug through 21-Sep); series: 10-11 IO, 2-3 IO, 10-11 PRC, 2-3 PRC]
Resources
CPU<br />
• Since we are now using less IO, more<br />
CPU is available<br />
• Before:<br />
SDA> lck show lck /rep=5/int=10<br />
23-AUG-2004 14:28:48.80 Delta sec: 10.0 Ave Spin: 24005<br />
Ave Req: 39875 Req/sec: 15656.9 Busy: 62.4%<br />
23-AUG-2004 14:28:58.80 Delta sec: 10.0 Ave Spin: 19121<br />
Ave Req: 30166 Req/sec: 20289.7 Busy: 61.2%<br />
• After:<br />
24-AUG-2004 11:44:01.90 Delta sec: 10.0 Ave Spin: 12846<br />
Ave Req: 13439 Req/sec: 38043.3 Busy: 51.1%<br />
24-AUG-2004 11:44:11.90 Delta sec: 10.0 Ave Spin: 16629<br />
Ave Req: 15514 Req/sec: 31109.1 Busy: 48.3%
Lock Rates<br />
• Reducing these numbers to Lock<br />
Operations per 1% of CPU time yields:<br />
– Before: 299<br />
– After: 573
For More Information<br />
• Miles.Oustad@CSU.MNSCU.EDU<br />
• (218) 755-4614
Questions & Answers