Peter Berggren - OneSAF Public Site
<strong>Peter</strong> <strong>Berggren</strong><br />
<strong>OneSAF</strong> Production<br />
September, 2012<br />
Distribution A – Approved for <strong>Public</strong> Release – Distribution is Unlimited
Performance Analysis (Version 5.1.1)<br />
Introduction<br />
<strong>OneSAF</strong> V5.1<br />
Performance Improvement Strategy<br />
LR Compositions<br />
Threading<br />
HLA Interop Performance and Memory Usage<br />
General Memory Usage<br />
Fort Stewart Experiment<br />
September 2012<br />
<strong>OneSAF</strong> Technical Exchange<br />
2
Introduction<br />
<br />
<br />
Background<br />
Performance of <strong>OneSAF</strong> V5.1 Released in February 2011 was<br />
Significantly Improved from Previous Versions<br />
Yet Because the TEMO Assessment Planned for Later that Year Would<br />
be Conducted on Low-End Hardware, Performance was Still a<br />
Concern<br />
• (HP 8400, 4GB RAM)<br />
Thus, in Preparation for the TEMO Assessment, a Significant Effort Was<br />
Undertaken to Analyze and Improve <strong>OneSAF</strong> Performance<br />
Results Were Incorporated into <strong>OneSAF</strong> V5.1.1 Released in December<br />
2011<br />
Purpose<br />
The Purpose of this Briefing is to Describe the Performance Analysis<br />
and Improvement Effort Carried out Between March and June 2011<br />
V5.1 Performance Update<br />
Where We Were as of V5.0<br />
Changes to Line-On-Line Benchmark<br />
V5.1 Performance<br />
Details Between V5.0 and V5.1<br />
PTR 51570<br />
Where We Were As of V5.0<br />
Line-On-Line Benchmark<br />
Common Hardware Platform<br />
AcquireLC Model with Operational Sensor Ranges per <strong>OneSAF</strong> V3.0<br />
[Chart: Line-On-Line Benchmark, Entities by Version]<br />
V3.0: 320<br />
V4.0: 340<br />
V5.0: 400<br />
Benchmark Changes Post-V5.0<br />
PTR 53136<br />
Integrated Code Changes to MR Mobility Model for Benchmark;<br />
Controlled via Property<br />
CR 52879<br />
• [Keeps Entities from Getting Stuck]<br />
Added Property to Control Use of Operational Sensor Ranges<br />
CR 54531<br />
Created “temo” Extension and “temo-configuration” Script to Configure
with AcquireLC and Operational Sensor Ranges<br />
• [Recommended for Best Performance]<br />
V5.1 Performance (As of 2011-02-09)<br />
[Chart: Line-On-Line Benchmark, Entities by Version]<br />
V3.0: 320<br />
V4.0: 340<br />
V5.0: 400<br />
V5.1: 810<br />
<br />
Why the Big Improvement?<br />
Details Between V5.0 and V5.1<br />
[Chart: Line-On-Line Benchmark Entities, V5.0 Release Through V5.1 Release]<br />
V5.0 Release: 400<br />
trunk 09/07: 400<br />
trunk 10/01: 670<br />
trunk 11/17: 690<br />
trunk 12/08: 710<br />
trunk 12/22: 800<br />
V5.1 02/09: 810<br />
Chart Annotation: PTR 51570<br />
Big Improvement Due to PTR 51570<br />
<br />
Also Steady Improvement Throughout V5.1 Development<br />
PTR 51570<br />
<br />
SE Core: RWA Running Attack ends prematurely<br />
Integrated Into Trunk on 9/20<br />
<br />
<br />
Fixed error in Grid 3D implementation of area of interest (AOI) service<br />
In theory, benefit of this fix increases with number of entities in<br />
exercise<br />
Warrants additional study<br />
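The scaling claim above can be illustrated with a generic uniform-grid area-of-interest lookup. This is only a sketch of the general technique; the slides do not describe the internals of the OneSAF Grid 3D service, so the class name `GridAOI`, the cell size, and the method names are all assumptions made for illustration. A brute-force query checks every entity, while a grid query only visits the cells that overlap the query sphere, which is why the gap is expected to widen as the entity count grows.

```python
import math
from collections import defaultdict


class GridAOI:
    """Hypothetical uniform 3D grid index (not the OneSAF implementation)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # cell index -> ids of entities in that cell
        self.positions = {}            # entity id -> (x, y, z)

    def _cell(self, pos):
        # Map a position to the integer index of the cell that contains it.
        return tuple(int(math.floor(c / self.cell_size)) for c in pos)

    def update(self, entity_id, pos):
        # Move the entity out of its old cell (if any) and into the new one.
        old = self.positions.get(entity_id)
        if old is not None:
            self.cells[self._cell(old)].discard(entity_id)
        self.positions[entity_id] = pos
        self.cells[self._cell(pos)].add(entity_id)

    def query(self, center, radius):
        # Visit only the cells overlapping the bounding box of the query sphere,
        # then apply the exact distance test to the candidates found there.
        lo = self._cell(tuple(c - radius for c in center))
        hi = self._cell(tuple(c + radius for c in center))
        found = set()
        for ix in range(lo[0], hi[0] + 1):
            for iy in range(lo[1], hi[1] + 1):
                for iz in range(lo[2], hi[2] + 1):
                    for eid in self.cells.get((ix, iy, iz), ()):
                        if math.dist(self.positions[eid], center) <= radius:
                            found.add(eid)
        return found


def brute_force_query(positions, center, radius):
    # Reference implementation: test every entity, O(N) per query.
    return {eid for eid, pos in positions.items()
            if math.dist(pos, center) <= radius}
```

Both functions return the same set of entity ids; the grid version simply avoids touching entities in cells far from the query point.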
Performance Improvement Strategy<br />
<br />
<br />
Target Issues Relevant to TEMO Assessment<br />
Very Large Scale (~40k Entities)<br />
HLA Interop<br />
Tight Memory (4GB)<br />
Pursue Multiple, Complementary Approaches<br />
LR and ULR Compositions (Use Less CPU)<br />
Threading<br />
Memory Usage in Battlemaster, Simcore, and HLA Interop<br />
HLA Interop Internal Design<br />
Reduce Computation Cost of Actor<br />
Compositions for TEMO Use Case<br />
<br />
<br />
Tickets:<br />
PTR 56727 Performance: actor compositions contain incorrect or<br />
unnecessary components<br />
• Updated over 4000 <strong>OneSAF</strong> unit and entity compositions to:<br />
– Correct efficiency errors<br />
– Enable on-demand C2 modeling capability<br />
CR 57544 Need Model that uses C4I proximity-based sensor<br />
• Add low-cost sensor model to ULR actors<br />
CR 57102 Create an LR Driver component with OA and formation<br />
keeping<br />
• Upgrade mobility of LR entities in <strong>OneSAF</strong> to increase opportunities<br />
to replace MR actors with LR actors<br />
Results:<br />
LR and ULR Actors are Much More Capable Than in Previous Versions<br />
Vast Majority of Actors in Final TEMO Assessment Scenario Were LR<br />
or ULR<br />
Performance PTRs (Page 1 of 2)<br />
id Subject Integrated<br />
56534 Performance: <strong>OneSAF</strong> inefficient on platforms with large numbers of CPU cores 4/12/2011<br />
56658 Performance: <strong>OneSAF</strong> should have more than 75% of memory available for heap on 64 bit machines 4/19/2011<br />
56660 Performance: Compressed OOPS should be enabled by default 4/19/2011<br />
55973 Performance: UniqueIDs used as keys in ordered collections 4/20/2011<br />
56763 Performance: AOI queries use intermediate collections 4/21/2011<br />
56830 Performance: Memory usage in PVD 4/21/2011<br />
56849 Performance: OA wasting time filtering entities 4/25/2011<br />
56906 Performance: SnapToLinear locking up MCT 4/28/2011<br />
56871 Performance: Event delivery optimizations 4/29/2011<br />
56941 Performance: Thread contention & liveliness issues 5/2/2011<br />
56944 Performance: Snap To Linear always replans all segments 5/3/2011<br />
56909 Performance: Munition Selection process consumes too much CPU time 5/5/2011<br />
57141 Performance: Default <strong>OneSAF</strong> heap size is too large 5/10/2011<br />
57152 Performance: MatrixTable inefficient when units are removed from the distribution 5/11/2011<br />
56662 Performance: <strong>OneSAF</strong> should display units by default for large scenarios 5/12/2011<br />
56727 Performance: Actor compositions contain incorrect or unnecessary components 5/13/2011<br />
56914 Performance: Massive performance hit when editing Relationships in a large scenario 5/13/2011<br />
56996 Performance: ULR entities have command agent, can be assigned orders in MissionEditor 5/13/2011<br />
57329 Performance: Sim Event interest evaluation is slow 5/18/2011<br />
Performance PTRs (Page 2 of 2)<br />
id Subject Integrated<br />
56182 Performance: Unnecessary ODB updates generated in HLA adapter(s) 5/19/2011<br />
57340 Performance: Blackboard trigger delivery optimization 5/19/2011<br />
57127 Performance: ODB should filter unneeded property changes where possible 5/24/2011<br />
56026 SECore: Performance: ICs spend too much time checking whether they are in buildings 5/25/2011<br />
57359 Performance: Grid3D position manager is not efficient in large exercises 5/26/2011<br />
57186 Performance: Sensor Range Fans on PVD lock up node 5/27/2011<br />
57447 Performance: PositionManagerGrid3D memory optimizations for entry storage 5/27/2011<br />
57448 Performance: AOI Visitor Pattern optimizations 5/27/2011<br />
57486 Performance: AOI services called in redundant or incorrect cases 5/27/2011<br />
57391 Performance: BDE: Interop performance with large scenarios. 6/2/2011<br />
57169 Performance: DamagedComponent objects are redundant 6/7/2011<br />
57613 Performance: LR fuel consumption calculations are inefficient 6/8/2011<br />
57847 Performance: Stewart11: ODB StatusPanel is ticking pruneDeadObjects 6/28/2011<br />
57860 Performance: Stewart11: Performance improvements in HLA to prevent entity timeout during join 6/28/2011<br />
57633 Performance: False Positives in GASP Detection Algorithm 6/29/2011<br />
58042 Performance: Clean up proxy object handling of getUniqueID methods 7/1/2011<br />
58101 Performance: ODB addAll() method inefficient with new concurrent collections 7/6/2011<br />
58136 Performance: Algorithm for Pruning ODB Objects Clashes with CopyOnWrite Data Structure 7/13/2011<br />
Line-On-Line Benchmark Trend<br />
Default Configuration<br />
[Chart: MR Entities * (0 to 3,500) by Date (18-Mar to 16-Jul)]<br />
Series:<br />
• High end developer box<br />
• Older CHP HP 8400<br />
• Z800, 6-core (LAB 002)<br />
• Newest CHP Z800, 4-core (Estimated)<br />
Threading<br />
<br />
<br />
<br />
<br />
<br />
Used YourKit to Identify Points of Thread Contention<br />
Reduced Thread Contention By:<br />
Moving Variables into Thread-Local Storage<br />
Using Concurrent Collections<br />
Changing Logic to Avoid Need for Synchronization<br />
Line-On-Line Benchmark Shows Nearly 100% Utilization of a Four-Core<br />
Machine<br />
Utilization Tops out at Roughly 4-6 Cores<br />
Thus, To Fully Utilize Machines with >4 Cores, Run Multiple <strong>OneSAF</strong><br />
Processes<br />
See “Deploying <strong>OneSAF</strong> in Your Environment”
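The first contention-reduction step listed above, moving variables into thread-local storage, can be sketched as follows. This is a minimal illustration in Python rather than the Java used by the product; the event-counting workload and the names `record` and `run_workers` are hypothetical and are not taken from OneSAF. Each worker accumulates into a private counter without locking and takes a lock only once, to merge its totals. In Java the analogous tools would be `ThreadLocal` and the `java.util.concurrent` collections.

```python
import threading
from collections import Counter

# Each thread that touches _scratch sees its own, independent attributes.
_scratch = threading.local()


def _local_counter():
    # Lazily create this thread's private counter on first use.
    counter = getattr(_scratch, "counter", None)
    if counter is None:
        counter = _scratch.counter = Counter()
    return counter


def record(event_type):
    # No lock needed: the counter is visible only to the calling thread.
    _local_counter()[event_type] += 1


def run_workers(event_batches):
    """Count events across one worker thread per batch (hypothetical workload)."""
    totals = Counter()
    merge_lock = threading.Lock()

    def worker(batch):
        for event_type in batch:
            record(event_type)
        # One short critical section per thread instead of one per event.
        with merge_lock:
            totals.update(_local_counter())

    threads = [threading.Thread(target=worker, args=(batch,))
               for batch in event_batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

The sketch shows the structure of the change rather than a speedup: CPython's global interpreter lock limits parallelism here, whereas on the JVM the same restructuring removes a per-event synchronization point.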
Memory Improvements<br />
<br />
<br />
Identified (with YourKit) Areas of Excessive Memory Consumption<br />
PTRs (Subset of Those Listed Earlier):<br />
56658 Performance: <strong>OneSAF</strong> should have more than 75% of memory<br />
available for heap on 64 bit machines<br />
56660 Performance: Compressed OOPS should be enabled by default<br />
56830 Performance: Memory usage in PVD<br />
57359 Performance: Grid3D position manager is not efficient in large<br />
exercises<br />
57447 Performance: PositionManagerGrid3D memory optimizations for<br />
entry storage<br />
58042 Performance: Clean up proxy object handling of getUniqueID<br />
methods<br />
Memory Model<br />
<br />
<br />
Collected Memory Usage Data During Large-Scale Exercises with<br />
Varying:<br />
Number of Actors Internal / External<br />
Number of Simcores<br />
Used Memory Data to Construct a Predictive Model of Memory Usage<br />
Implemented as Excel Spreadsheet<br />
• https://dev.onesaf.net/devsite/Development/SystemArchitecture/trunk/Performance/large-scale-memory.xlsx<br />
Works with Battlemaster, Simcore, and Interop Nodes<br />
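One way such a predictive model could work is a straight-line fit of heap usage against entity count, inverted to estimate capacity under a heap limit. The sketch below is an assumption about the general approach, not a description of the spreadsheet: the function names are hypothetical, and the sample measurements in the usage block are invented placeholders, not OneSAF data.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def entity_capacity(slope, intercept, heap_limit_mb):
    """Largest entity count whose predicted heap usage stays within the limit."""
    return int((heap_limit_mb - intercept) / slope)


if __name__ == "__main__":
    # Placeholder (entities, heap MB) samples for illustration only.
    samples = [(0, 100.0), (16000, 600.0), (32000, 1100.0)]
    slope, intercept = fit_line(samples)
    print("MB per entity:", slope, "fixed overhead MB:", intercept)
    print("capacity at 1315.2 MB:", entity_capacity(slope, intercept, 1315.2))
```

A real model would also need terms for the number of simcores and for external entities, since the slides state that both were varied.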
Java Heap Calculations<br />
<br />
<br />
For a Given Quantity of Physical Memory on a Machine, How Much is<br />
Available for Use by the Java Heap in a <strong>OneSAF</strong> Process?<br />
As Per the <strong>OneSAF</strong> “runtimeloader” Script:<br />
• Multiply by 0.9 (This is Done to Allow Space for the OS and Other<br />
Processes on the Box)<br />
• Subtract 800 MB for ERC (Native) Memory<br />
• Subtract 256 MB for Java PERMGEN Memory<br />
Multiply by 0.5 to Allow For:<br />
• Spikes in Memory Usage<br />
• Garbage<br />
Example:<br />
((4,096 MB physical memory * 0.9) - 800 - 256) * 0.5 = 1,315 MB Max<br />
Safe Java Heap Usage<br />
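The calculation above can be written as a small function. The constants come from this slide; the function and parameter names are chosen here for illustration and are not taken from the “runtimeloader” script itself.

```python
def max_safe_heap_mb(physical_mb, os_fraction=0.9, erc_native_mb=800,
                     permgen_mb=256, headroom_fraction=0.5):
    """Estimate max safe Java heap usage (MB) for a given physical memory size."""
    available = physical_mb * os_fraction   # leave room for the OS and other processes
    available -= erc_native_mb              # ERC (native) memory
    available -= permgen_mb                 # Java PERMGEN memory
    return available * headroom_fraction    # allow for usage spikes and garbage


if __name__ == "__main__":
    # 4 GB matches the example on the slide and yields about 1,315 MB.
    print(round(max_safe_heap_mb(4096)))
```

Keeping the factors as parameters makes it easy to repeat the estimate for other machines, such as one with 8 GB of RAM.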
Memory Model - Battlemaster<br />
[Chart: Battlemaster Java Heap Usage; Java Heap Usage (MB, 0 to 3,000) vs. Entities (0 to 80,000)]<br />
Series: May, June, Linear (May), Linear (June)<br />
Chart Annotation: 4 GB Total Physical RAM<br />
Simcore Java Heap Usage at Final as a Function of Entities<br />
and Simcores<br />
[Chart: Java Heap Usage (MB, 0 to 2,500) vs. Entities (10,000 to 60,000)]<br />
Series: 10, 20, and 30 Simcores<br />
Chart Annotation: 4 GB Total Physical RAM<br />
Capacity Trend<br />
Total Entities by Role and Month Based<br />
Exclusively on 4 GB Memory Limit<br />
[Chart: Total Entities (0 to 70,000) by Role (Battlemaster, Simcore, Interop); Series: May, June]<br />
Data Labels: 33,106; 47,274; 42,366; 57,148; 53,189<br />
Assumes 4GB physical RAM, 30 simcores, and 10,000 external entities.<br />
HLA Interop Internal Design<br />
<br />
PTRs (Subset of Those Listed Earlier):<br />
57391 Performance: BDE: Interop performance with large scenarios.<br />
• Increased default number of heartbeat buckets when running with<br />
TEMO extension<br />
• Used simple (no argument) version of RTI tick() method<br />
57860 Performance: Stewart11: Performance improvements in HLA to<br />
prevent entity timeout during join<br />
• Multi-threading of RTIMessageManager tested in BDE lab the week<br />
before Ft. Stewart.<br />
• Fix to not timeout entities when TranslationManager queue has<br />
reached a certain threshold.<br />
• Put heartbeat events straight onto the RTI Queue instead of passing<br />
through the TranslationManager queue.<br />
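Two of the ideas above can be sketched generically: spreading heartbeats across buckets, and suppressing entity timeouts while a processing queue is backed up. The slides do not explain how OneSAF implements either, so everything below is an assumption: the interpretation of “heartbeat buckets” as round-robin groups, the class `HeartbeatScheduler`, the function `should_time_out`, and all parameter names are hypothetical.

```python
class HeartbeatScheduler:
    """Hypothetical bucketed heartbeat schedule (not the OneSAF implementation).

    Entities are split across num_buckets groups and one group is refreshed
    per tick, so each tick sends roughly N / num_buckets heartbeats instead
    of all N at once.
    """

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]
        self.next_bucket = 0

    def add(self, entity_id):
        # Integer ids are assumed; modulo spreads them evenly across buckets.
        self.buckets[entity_id % len(self.buckets)].append(entity_id)

    def tick(self):
        # Return the entities due for a heartbeat now, then advance the cursor.
        due = list(self.buckets[self.next_bucket])
        self.next_bucket = (self.next_bucket + 1) % len(self.buckets)
        return due


def should_time_out(last_heard, now, timeout, queue_depth, backlog_threshold):
    """Decide whether a remote entity should be timed out.

    When the incoming queue is backed up past the threshold, a missing update
    may simply be waiting in the queue, so the timeout is suppressed.
    """
    if queue_depth >= backlog_threshold:
        return False
    return (now - last_heard) > timeout
```

Raising the bucket count lowers the per-tick burst at the cost of a longer interval between heartbeats for any one entity, which is one plausible reading of why the default was increased for very large scenarios.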
Ft. Stewart “Expected” Scenario<br />
<br />
<br />
The “30k + 10k” Configuration Viewed from the <strong>OneSAF</strong> (30k) Federate<br />
After HLA Join<br />
Ghost Entities (From 10k Federate) Are Associated with<br />
“EXTERNAL…” Sides<br />
Ft. Stewart Configuration<br />
<br />
<br />
Machine Types<br />
8400 Client: HP 8400 Dual-Core with 4 GB RAM<br />
• Typical (JCATS) Client Machine at NSC<br />
8400 Server: HP 8400 Dual Dual-Core with 8 GB RAM<br />
• Typical (JCATS) Server Machine at NSC<br />
Monster: High-Powered Machines Borrowed from <strong>OneSAF</strong> Lab<br />
• Not on Critical Path, Excluded from Memory and CPU Reports<br />
Allocations<br />
Battlemaster: One (1) 8400 Server as Primary, One (1) Monster as<br />
Backup<br />
Simcore: Twenty-Eight (28) 8400 Client; Two (2) 8400 Client Held in<br />
Reserve<br />
HLA Interop Adapter: One (1) 8400 Server<br />
C2 Adapter: One (1) 8400 Client<br />
MCT: Twenty-Eight (28) 8400 Clients, Connected via Pub-Sub<br />
Ft. Stewart Representative Exercise<br />
Representative Exercise Carried Out Morning of Wednesday, 6/22/2011<br />
<br />
<br />
<br />
Operators Were Active for 3.5 Hours of Run Time<br />
All Machines Were Within Acceptable Limits for Memory and CPU<br />
Load Throughout the Exercise<br />
Memory Measurements Were Taken at the Following Points:<br />
07:00 : Initialized<br />
07:17 : Joined<br />
07:19 : Running<br />
08:49 : Elapsed 1.5 Hours Sim Time<br />
10:54 : Elapsed 3.5 Hours Sim Time<br />
[Chart: CPU Load Average (0 to 4) Over Time; Series: battlemaster, simcore, interop_ict, c2Core]<br />
[Chart: Java Heap Usage (MB, 400 to 1,400) Over Time (6:43 to 11:02); Series: battlemaster, simcore, interop, c2 adapter]<br />
Chart Annotations: Initialized, Joined and Running, Running 1.5 Hours, Running 3.5 Hours<br />
Back Up Slides