
Peter Berggren - OneSAF Public Site


Peter Berggren
OneSAF Production
September, 2012
Distribution A – Approved for Public Release – Distribution is Unlimited


Performance Analysis (Version 5.1.1)
• Introduction
• OneSAF V5.1
• Performance Improvement Strategy
  • LR Compositions
  • Threading
  • HLA Interop Performance and Memory Usage
  • General Memory Usage
• Fort Stewart Experiment

September 2012<br />

<strong>OneSAF</strong> Technical Exchange<br />

2


Introduction
• Background
  • Performance of OneSAF V5.1, Released in February 2011, Was Significantly Improved from Previous Versions
  • Yet Because the TEMO Assessment Planned for Later that Year Would be Conducted on Very Poor Hardware, Performance Was (Still) a Concern
    • (HP 8400, 4GB RAM)
  • Thus, in Preparation for the TEMO Assessment, a Significant Effort Was Undertaken to Analyze and Improve OneSAF Performance
  • Results Were Incorporated into OneSAF V5.1.1, Released in December 2011
• Purpose
  • The Purpose of this Briefing is to Describe the Performance Analysis and Improvement Effort Carried Out Between March and June 2011



V5.1 Performance Update
• Where We Were as of V5.0
• Changes to Line-On-Line Benchmark
• V5.1 Performance
• Details Between V5.0 and V5.1
• PTR 51570



Where We Were As of V5.0
• Line-On-Line Benchmark
• Common Hardware Platform
• AcquireLC Model with Operational Sensor Ranges per OneSAF V3.0

[Bar chart: Entities by Version]
Version | Entities
V3.0 | 320
V4.0 | 340
V5.0 | 400



Benchmark Changes Post-V5.0
• PTR 53136
  • Integrated Code Changes to MR Mobility Model for Benchmark; Controlled via Property
    • [Keeps Entities from Getting Stuck]
• CR 52879
  • Added Property to Control Use of Operational Sensor Ranges
• CR 54531
  • Created “temo” Extension and “temo-configuration” Script to Configure with AcquireLC and Operational Sensor Ranges
    • [Recommended for Best Performance]



V5.1 Performance (As of 2011-02-09)

[Bar chart: Entities by Version]
Version | Entities
V3.0 | 320
V4.0 | 340
V5.0 | 400
V5.1 | 810

• Why the Big Improvement?



Details Between V5.0 and V5.1

[Bar chart: Entities by Build, with PTR 51570 annotated]
Build | Entities
V5.0 Release | 400
trunk 09/07 | 400
trunk 10/01 | 670
trunk 11/17 | 690
trunk 12/08 | 710
trunk 12/22 | 800
V5.1 02/09 | 810

• Big Improvement Due to PTR 51570
• Also Steady Improvement Throughout V5.1 Development



PTR 51570
• SE Core: RWA Running Attack ends prematurely
• Integrated Into Trunk on 9/20
• Fixed error in Grid 3D implementation of area of interest (AOI) service
• In theory, benefit of this fix increases with number of entities in exercise
• Warrants additional study
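
To illustrate why a correctly working grid-based AOI service should matter more as entity counts grow, here is a minimal sketch of a uniform 3D grid index. It is not OneSAF code: the class name `Grid3D`, its methods, and the cell size are hypothetical, chosen only for illustration. A brute-force AOI query has to test every entity in the exercise, whereas a grid query inspects only the cells that overlap the query sphere.

```python
from collections import defaultdict
from math import floor


class Grid3D:
    """Hypothetical uniform-grid spatial index (illustration only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # cell index -> entity ids in that cell
        self.positions = {}             # entity id -> (x, y, z)

    def _cell(self, pos):
        # Map a position to the integer index of the cell that contains it.
        return tuple(floor(c / self.cell_size) for c in pos)

    def update(self, entity_id, pos):
        # Remove the entity from its old cell (if any), then file it under the new one.
        old = self.positions.get(entity_id)
        if old is not None:
            self.cells[self._cell(old)].discard(entity_id)
        self.positions[entity_id] = pos
        self.cells[self._cell(pos)].add(entity_id)

    def query(self, center, radius):
        # Visit only the cells overlapping the bounding box of the query sphere,
        # then apply an exact distance test to the candidates found there.
        lo = self._cell(tuple(c - radius for c in center))
        hi = self._cell(tuple(c + radius for c in center))
        r2 = radius * radius
        found = []
        for ix in range(lo[0], hi[0] + 1):
            for iy in range(lo[1], hi[1] + 1):
                for iz in range(lo[2], hi[2] + 1):
                    for eid in self.cells.get((ix, iy, iz), ()):
                        p = self.positions[eid]
                        if sum((a - b) ** 2 for a, b in zip(p, center)) <= r2:
                            found.append(eid)
        return sorted(found)
```

With a cell size close to typical query radii, the work per query depends on local entity density rather than on the total number of entities in the exercise.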



Performance Improvement Strategy
• Target Issues Relevant to TEMO Assessment
  • Very Large Scale (~40k Entities)
  • HLA Interop
  • Tight Memory (4GB)
• Pursue Multiple, Complementary Approaches
  • LR and ULR Compositions (Use Less CPU)
  • Threading
  • Memory Usage in Battlemaster, Simcore, and HLA Interop
  • HLA Interop Internal Design



Reduce Computation Cost of Actor Compositions for TEMO Use Case
• Tickets:
  • PTR 56727 Performance: actor compositions contain incorrect or unnecessary components
    • Updated over 4000 OneSAF unit and entity compositions to:
      – Correct efficiency errors
      – Enable on-demand C2 modeling capability
  • CR 57544 Need Model that uses C4I proximity-based sensor
    • Add low-cost sensor model to ULR actors
  • CR 57102 Create an LR Driver component with OA and formation keeping
    • Upgrade mobility of LR entities in OneSAF to increase opportunities to replace MR actors with LR actors
• Results:
  • LR and ULR Actors are Much More Capable Than in Previous Versions
  • Vast Majority of Actors in Final TEMO Assessment Scenario Were LR or ULR



Performance PTRs (Page 1 of 2)
ID | Subject | Integrated
56534 | Performance: OneSAF inefficient on platforms with large numbers of CPU cores | 4/12/2011
56658 | Performance: OneSAF should have more than 75% of memory available for heap on 64 bit machines | 4/19/2011
56660 | Performance: Compressed OOPS should be enabled by default | 4/19/2011
55973 | Performance: UniqueIDs used as keys in ordered collections | 4/20/2011
56763 | Performance: AOI queries use intermediate collections | 4/21/2011
56830 | Performance: Memory usage in PVD | 4/21/2011
56849 | Performance: OA wasting time filtering entities | 4/25/2011
56906 | Performance: SnapToLinear locking up MCT | 4/28/2011
56871 | Performance: Event delivery optimizations | 4/29/2011
56941 | Performance: Thread contention & liveliness issues | 5/2/2011
56944 | Performance: Snap To Linear always replans all segments | 5/3/2011
56909 | Performance: Munition Selection process consumes too much CPU time | 5/5/2011
57141 | Performance: Default OneSAF heap size is too large | 5/10/2011
57152 | Performance: MatrixTable inefficient when units are removed from the distribution | 5/11/2011
56662 | Performance: OneSAF should display units by default for large scenarios | 5/12/2011
56727 | Performance: Actor compositions contain incorrect or unnecessary components | 5/13/2011
56914 | Performance: Massive performance hit when editing Relationships in a large scenario | 5/13/2011
56996 | Performance: ULR entities have command agent, can be assigned orders in MissionEditor | 5/13/2011
57329 | Performance: Sim Event interest evaluation is slow | 5/18/2011



Performance PTRs (Page 2 of 2)
ID | Subject | Integrated
56182 | Performance: Unnecessary ODB updates generated in HLA adapter(s) | 5/19/2011
57340 | Performance: Blackboard trigger delivery optimization | 5/19/2011
57127 | Performance: ODB should filter unneeded property changes where possible | 5/24/2011
56026 | SECore: Performance: ICs spend too much time checking whether they are in buildings | 5/25/2011
57359 | Performance: Grid3D position manager is not efficient in large exercises | 5/26/2011
57186 | Performance: Sensor Range Fans on PVD lock up node | 5/27/2011
57447 | Performance: PositionManagerGrid3D memory optimizations for entry storage | 5/27/2011
57448 | Performance: AOI Visitor Pattern optimizations | 5/27/2011
57486 | Performance: AOI services called in redundant or incorrect cases | 5/27/2011
57391 | Performance: BDE: Interop performance with large scenarios. | 6/2/2011
57169 | Performance: DamagedComponent objects are redundant | 6/7/2011
57613 | Performance: LR fuel consumption calculations are inefficient | 6/8/2011
57847 | Performance: Stewart11: ODB StatusPanel is ticking pruneDeadObjects | 6/28/2011
57860 | Performance: Stewart11: Performance improvements in HLA to prevent entity timeout during join | 6/28/2011
57633 | Performance: False Positives in GASP Detection Algorithm | 6/29/2011
58042 | Performance: Clean up proxy object handling of getUniqueID methods | 7/1/2011
58101 | Performance: ODB addAll() method inefficient with new concurrent collections | 7/6/2011
58136 | Performance: Algorithm for Pruning ODB Objects Clashes with CopyOnWrite Data Structure | 7/13/2011



Line-On-Line Benchmark Trend
• Default Configuration

[Chart: MR Entities * (y-axis, 0 to 3500) by date (x-axis, 18-Mar to 16-Jul)]
Series:
• High end developer box
• Older CHP HP 8400
• Z800, 6-core (LAB 002)
• Newest CHP Z800, 4-core (Estimated)



Threading
• Used YourKit to Identify Points of Thread Contention
• Reduced Thread Contention By:
  • Moving Variables into Thread-Local Storage
  • Using Concurrent Collections
  • Changing Logic to Avoid Need for Synchronization
• Line-On-Line Benchmark Shows Nearly 100% Utilization of a Four-Core Machine
• Utilization Tops out at Roughly 4-6 Cores
• Thus, To Fully Utilize Machines with >4 Cores, Run Multiple OneSAF Processes
  • See “Deploying OneSAF in Your Environment”
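
As an illustration of the first contention-reduction technique named above (moving variables into thread-local storage), the following is a minimal sketch in Python, not OneSAF code. The function names and the scratch-buffer use case are hypothetical. Each worker thread keeps its own scratch buffer, so no lock is needed around it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-thread state: every thread sees its own attributes on this object.
_local = threading.local()


def _scratch():
    # Lazily create one scratch list per thread instead of sharing one under a lock.
    buf = getattr(_local, "buf", None)
    if buf is None:
        buf = _local.buf = []
    return buf


def sum_of_squares(values):
    buf = _scratch()
    buf.clear()                      # safe: no other thread can see this buffer
    buf.extend(v * v for v in values)
    return sum(buf)


def run(batches, workers=4):
    # Results come back in input order because pool.map preserves ordering.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sum_of_squares, batches))
```

The trade-off is memory for contention: each thread holds its own copy of the state, but threads never wait on each other to use it.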



Memory Improvements
• Identified (with YourKit) Areas of Excessive Memory Consumption
• PTRs (Subset of Those Listed Earlier):
  • 56658 Performance: OneSAF should have more than 75% of memory available for heap on 64 bit machines
  • 56660 Performance: Compressed OOPS should be enabled by default
  • 56830 Performance: Memory usage in PVD
  • 57359 Performance: Grid3D position manager is not efficient in large exercises
  • 57447 Performance: PositionManagerGrid3D memory optimizations for entry storage
  • 58042 Performance: Clean up proxy object handling of getUniqueID methods



Memory Model
• Collected Memory Usage Data During Large-Scale Exercises with Varying:
  • Number of Actors Internal / External
  • Number of Simcores
• Used Memory Data to Construct a Predictive Model of Memory Usage
  • Implemented as Excel Spreadsheet
    • https://dev.onesaf.net/devsite/Development/SystemArchitecture/trunk/Performance/large-scale-memory.xlsx
  • Works with Battlemaster, Simcore, and Interop Nodes
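
The spreadsheet itself is not reproduced in this briefing. As a rough sketch of how such a predictive model could work, the code below fits a straight line to (entities, heap MB) samples by ordinary least squares and inverts it to estimate capacity under a heap budget. A single-variable linear model is an assumption made here for illustration; the function names are hypothetical and the sample numbers in the usage note are synthetic, not measured OneSAF data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_heap_mb(slope, intercept, entities):
    # Predicted Java heap usage (MB) for a given entity count.
    return slope * entities + intercept


def max_entities(slope, intercept, heap_budget_mb):
    # Largest entity count whose predicted heap usage stays within the budget.
    return int((heap_budget_mb - intercept) / slope)


# Synthetic samples (NOT measured data): entity counts and heap usage in MB.
entities = [10000, 20000, 40000]
heap_mb = [400.0, 600.0, 1000.0]
slope, intercept = fit_line(entities, heap_mb)
```

Given a fitted line, `max_entities(slope, intercept, budget)` answers the capacity question for whatever heap budget applies to a node.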



Java Heap Calculations
• For a Given Quantity of Physical Memory on a Machine, How Much is Available for Use by the Java Heap in a OneSAF Process?
  • As Per the OneSAF “runtimeloader” Script:
    • Multiply by 0.9 (This is Done to Allow Space for the OS and Other Processes on the Box)
    • Subtract 800 MB for ERC (Native) Memory
    • Subtract 256 MB for Java PERMGEN Memory
  • Multiply by 0.5 to Allow For:
    • Spikes in Memory Usage
    • Garbage
• Example:
  • ((4,096 MB physical memory * 0.9) - 800 - 256) * 0.5 = 1,315 MB Max Safe Java Heap Usage
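
The calculation above can be written as a small function. This is a sketch that simply restates the arithmetic on this slide; it is not the “runtimeloader” script itself, and the function and parameter names are hypothetical.

```python
def max_safe_heap_mb(physical_mb,
                     os_fraction=0.9,      # leave room for the OS and other processes
                     erc_native_mb=800,    # ERC (native) memory
                     permgen_mb=256,       # Java PERMGEN memory
                     headroom=0.5):        # allow for usage spikes and garbage
    """Return the maximum safe Java heap usage in MB for a given physical memory size."""
    available = physical_mb * os_fraction - erc_native_mb - permgen_mb
    return available * headroom


# The slide's example: a 4 GB (4,096 MB) machine.
print(round(max_safe_heap_mb(4096)))  # 1315
```

Keeping each factor as a named parameter makes it easy to see how the result shifts if, for example, the native memory reservation changes.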



Memory Model - Battlemaster

[Chart: Battlemaster Java Heap Usage. Y-axis: Java Heap Usage (MB), 0 to 3000. X-axis: Entities, 0 to 80000. Series: May, June, Linear (May), Linear (June). Annotation: 4 GB Total Physical RAM]



Simcore Java Heap Usage at Final as a Function of Entities and Simcores

[Chart: Y-axis: Java Heap Usage (MB), 0 to 2500. X-axis: Entities, 10000 to 60000. Series by number of Simcores: 10, 20, 30. Annotation: 4 GB Total Physical RAM]



Capacity Trend

[Bar chart: Total Entities by Role and Month Based Exclusively on 4 GB Memory Limit. Y-axis: 0 to 70,000. Roles: Battlemaster, Simcore, Interop. Series: May, June. Data labels as extracted: 33,106; 47,274; 42,366; 57,148; 53,189]

Assumes 4GB physical RAM, 30 simcores, and 10,000 external entities.



HLA Interop Internal Design
• PTRs (Subset of Those Listed Earlier):
  • 57391 Performance: BDE: Interop performance with large scenarios.
    • Increased default number of heartbeat buckets when running with TEMO extension
    • Used simple (no argument) version of RTI tick() method
  • 57860 Performance: Stewart11: Performance improvements in HLA to prevent entity timeout during join
    • Multi-threading of RTIMessageManager tested in BDE lab the week before Ft. Stewart.
    • Fix to not timeout entities when TranslationManager queue has reached a certain threshold.
    • Put heartbeat events straight onto the RTI Queue instead of passing through the TranslationManager queue.
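
The timeout fix listed under PTR 57860 can be pictured with a minimal sketch. This is an interpretation of the bullet, not the actual OneSAF change: the function name, parameters, and units are hypothetical. The idea is that while the inbound queue is backed up, a missing update more likely means the update is still waiting in the queue than that the remote entity is gone, so timeouts are suppressed.

```python
def should_time_out(seconds_since_update, timeout_s, queue_depth, queue_threshold):
    """Decide whether a remote (ghost) entity should be timed out.

    Hypothetical rule: never time out while the backlog is at or above the
    threshold; otherwise time out once the entity has been silent too long.
    """
    if queue_depth >= queue_threshold:
        return False
    return seconds_since_update > timeout_s
```

For example, with a 60-second timeout and a threshold of 1,000 queued messages, an entity silent for 90 seconds is kept while 5,000 messages are pending and dropped once the queue has drained.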



Ft. Stewart “Expected” Scenario
• The “30k + 10k” Configuration Viewed from the OneSAF (30k) Federate After HLA Join
• Ghost Entities (From 10k Federate) Are Associated with “EXTERNAL…” Sides



Ft. Stewart Configuration
• Machine Types
  • 8400 Client: HP 8400 Dual-Core with 4 GB RAM
    • Typical (JCATS) Client Machine at NSC
  • 8400 Server: HP 8400 Dual Dual-Core with 8 GB RAM
    • Typical (JCATS) Server Machine at NSC
  • Monster: High-Powered Machines Borrowed from OneSAF Lab
    • Not on Critical Path, Excluded from Memory and CPU Reports
• Allocations
  • Battlemaster: One (1) 8400 Server as Primary, One (1) Monster as Backup
  • Simcore: Twenty-Eight (28) 8400 Client; Two (2) 8400 Client Held in Reserve
  • HLA Interop Adapter: One (1) 8400 Server
  • C2 Adapter: One (1) 8400 Client
  • MCT: Twenty-Eight (28) 8400 Clients, Connected via Pub-Sub



Ft. Stewart Representative Exercise
• Representative Exercise Carried Out Morning of Wednesday, 6/22/2011
• Operators Were Active for 3.5 Hours of Run Time
• All Machines Were Within Acceptable Limits for Memory and CPU Load Throughout the Exercise
• Memory Measurements Were Taken at the Following Points:
  • 07:00 : Initialized
  • 07:17 : Joined
  • 07:19 : Running
  • 08:49 : Elapsed 1.5 Hours Sim Time
  • 10:54 : Elapsed 3.5 Hours Sim Time



[Chart: CPU Load Average. Y-axis: 0 to 4. Series: battlemaster, simcore, interop_ict, c2Core]

[Chart: Java Heap Usage (MB). Y-axis: 400 to 1400. Series: battlemaster, simcore, interop, c2 adapter]

Time axis as extracted: 6:43 to 11:02. Annotations as extracted: Initialized, Joined and Running, Running 1.5 Hours, Running 3.5 Hours.



Back Up Slides
