10.01.2016 Views

Deliver High Quality High Performance HEVC via Intel® Media Server Studio

1mz2mWD

1mz2mWD

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong><br />

<strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong><br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

A white paper unlocking new video technology opportunities!<br />

<strong>HEVC</strong> is an exciting, cutting edge, highly efficient, new video compression technology enabling<br />

next generation of digital media applications, products and services. Intel is at the forefront of<br />

this development, leading the <strong>HEVC</strong> technology revolution. Intel <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 aims to<br />

offer industry leading, among the best in the class developer focused <strong>HEVC</strong> solutions with the<br />

best tradeoff of quality versus performance. This paper introduces the capabilities of Intel’s<br />

developer <strong>HEVC</strong> product offering, the Intel <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder and Decoder,<br />

currently in its R7 release which now packs support for coding of 4:2:0 8-bit, 4:2:0 10 bit, 4:2:2 8-<br />

bit, and 4:2:2 10-bit content. The paper also identifies the opportunities due to unleashing of a<br />

range of extremely powerful, higher performance <strong>HEVC</strong> solutions suited for different<br />

applications, services, eco-systems, and devices.<br />

Nov. 11 2015, v1.52<br />

Author: Atul Puri


LEGAL DISCLAIMER<br />

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,<br />

EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS<br />

GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR<br />

SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR<br />

IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR<br />

WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR<br />

INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.<br />

UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR<br />

INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A<br />

SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.<br />

Intel may make changes to specifications and product descriptions at any time, without notice. Designers<br />

must not rely on the absence or characteristics of any features or instructions marked "reserved" or<br />

"undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for<br />

conflicts or incompatibilities arising from future changes to them. The information here is subject to<br />

change without notice. Do not finalize a design with this information.<br />

The products described in this document may contain design defects or errors known as errata which may<br />

cause the product to de<strong>via</strong>te from published specifications. Current characterized errata are available on<br />

request.<br />

Contact your local Intel sales office or your distributor to obtain the latest specifications and before<br />

placing your product order.<br />

Copies of documents which have an order number and are referenced in this document, or other Intel<br />

literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.<br />

MPEG is an international standard for video compression/decompression promoted by ISO.<br />

Implementations of MPEG CODECs, or MPEG enabled platforms may require licenses from various<br />

entities, including Intel Corporation.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Intel, the Intel logo, Intel Core are trademarks or registered trademarks of Intel Corporation or its<br />

subsidiaries in the United States and other countries.<br />

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.<br />

Copyright © 2008-2013, Intel Corporation. All Rights reserved.<br />

1<br />

Optimization Notice<br />

*Other names and brands may be claimed as property of others.


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for<br />

optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and<br />

SSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or<br />

effectiveness of any optimization on microprocessors not manufactured by Intel.<br />

Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.<br />

Certain optimizations, not specific to Intel microarchitecture, are reserved for Intel microprocessors.<br />

Please refer to the applicable product User and Reference Guides for more information regarding the<br />

specific instruction sets covered by this notice.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Notice revision #20110804<br />

2


Table of Contents<br />

Introduction ........................................................................................................................................................ 5<br />

<strong>HEVC</strong> Compression Basics .................................................................................................................................. 6<br />

<strong>HEVC</strong> Data Hierarchy ........................................................................................................................................... 6<br />

<strong>HEVC</strong> Partitioning, Prediction and Coding Technologies ...................................................................................7<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 Overview ...................................................................................................... 13<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 SDK ................................................................................................................ 15<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong> Evaluation for 4:2:0 8-bit Content ............... 18<br />

<strong>Quality</strong> Evaluation Tests for 4:2:0 8-bit Content ............................................................................................... 18<br />

<strong>Quality</strong> Evaluation Results for 4:2:0 8-bit Content ........................................................................................... 20<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong> Tradeoffs for 4:2:0 8-bit Content ..... 28<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:0 8-bit Content ....................................................................... 29<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:0 8-bit Content ......................................................... 32<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Decoder <strong>Performance</strong> for 4:2:0 8-bit Content ....................... 34<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoding Computational Scalability Tests (on E5 <strong>Server</strong>) ... 36<br />

<strong>Quality</strong> Evaluation Results for Computational Scalability Tests for 4:2:0 8-bit .............................................. 36<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for Computational Scalability Tests for 4:2:0 8-bit ............ 37<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong> Evaluation for 4:2:0 10-bit Content ............. 42<br />

<strong>Quality</strong> Evaluation Tests for 4:2:0 10-bit Content ............................................................................................. 42<br />

<strong>Quality</strong> Evaluation Results for 4:2:0 10-bit Content ......................................................................................... 43<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong> Tradeoffs for 4:2:0 10-bit Content ... 47<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:0 10-bit Content ...................................................................... 47<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:0 10-bit Content ...................................................... 49<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong> Evaluation for 4:2:2 8-bit Content ............... 52<br />

<strong>Quality</strong> Evaluation Tests for 4:2:2 8-bit Content ............................................................................................... 52<br />

<strong>Quality</strong> Evaluation Results for 4:2:2 8-bit Content ............................................................................................53<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong> Tradeoffs for 4:2:2 8-bit Content ...... 57<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:2 8-bit Content ......................................................................... 57<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:2 8-bit Content ........................................................ 60<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong> Evaluation for 4:2:2 10-bit Content ............. 62<br />

<strong>Quality</strong> Evaluation Tests for 4:2:2 10-bit Content ............................................................................................. 62<br />

<strong>Quality</strong> Evaluation Results for 4:2:2 10-bit Content.......................................................................................... 62<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

3<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong> Tradeoffs for 4:2:2 10-bit Content .... 65<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:2 10-bit Content ....................................................................... 65<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:2 10-bit Content ....................................................... 67<br />

Summary ........................................................................................................................................................... 68<br />

References ........................................................................................................................................................ 68<br />

Appendix A......................................................................................................................................................... 69<br />

A1: Summary of 4:2:0 8-bit Coding <strong>Quality</strong> and <strong>Performance</strong> Tests ............................................................ 69<br />

A2: Summary of Computational Scalability Tests ............................................................................................71<br />

Appendix B .......................................................................................................................................................... 72<br />

B1: Summary of 4:2:0 10-bit Coding <strong>Quality</strong> and <strong>Performance</strong> Tests ............................................................. 72<br />

B2: Summary of 4:2:2 8-bit Coding <strong>Quality</strong> and <strong>Performance</strong> Tests .............................................................. 73<br />

B3: Summary of 4:2:2 10-bit Coding <strong>Quality</strong> and <strong>Performance</strong> Tests ............................................................. 74<br />

Appendix C .......................................................................................................................................................... 75<br />

C1: Example Encoding Directory and Subdirectories structure ..................................................................... 75<br />

C2: Scripts for 4:2:0 content Encoding with <strong>HEVC</strong> HM Encoder .................................................................... 76<br />

C3: Scripts for 4:2:0 content Encoding with <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder ............................ 80<br />

C4: Script for <strong>HEVC</strong> BD rate Computation ...................................................................................................... 84<br />

4


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong><br />

<strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong><br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

A white paper unlocking new video technology opportunities!<br />

Introduction<br />

<strong>HEVC</strong> (aka, H.265) [1-4] is a new, highly efficient, video compression standard from ISO MPEG that promises<br />

substantially higher compression over H.264 (aka, AVC) [4,6], its previous generation standard completed around<br />

10 years ago. In particular <strong>HEVC</strong> promises roughly a factor of 2 in compression over H.264, that had delivered a<br />

factor of 2 in compression over MPEG-2, its earlier generation standard. H.264 is currently dominant having<br />

supplemented or displaced MPEG-2 in nearly all digital video applications, services, products, and eco-systems.<br />

Over next few years the time seems ripe for <strong>HEVC</strong> due to its advantages, to supplement or displace H.264 in the<br />

same manner. Overall MPEG has an excellent history [4] of delivering on video standards that have a wide<br />

industry following.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK (formerly, <strong>Media</strong> SDK) is a well-known developer product that implements state<br />

of art standards based highly optimized decoders, corresponding efficient and highly optimized encoders,<br />

file/stream formatting, and pre- and postprocessing tools supporting efficient coding. <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

SDK implements many Codec and tools components initially in software, and later as hybrid (of software and<br />

hardware) or entirely in hardware. The reason for this multi-tier approach is faster time to market for software<br />

solutions, followed by hybrid solutions that contains partial hardware acceleration, and lastly blazingly fast<br />

hardware solutions that scale. <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK 2015 is available both for Windows and Linux. It<br />

supports <strong>Intel®</strong> 4 th and 5 th generation Core and Xeon ® Processor based platforms with Intel Iris Pro and Intel<br />

HD Graphics.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 is just being released and includes a number of significant tools, technologies,<br />

and enhancements including improved software implementation of <strong>HEVC</strong> Encoders and Decoders. Since not all<br />

<strong>HEVC</strong> implementations are created equal, this white paper attempts to quantify the quality and performance a<br />

developer should expect from <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec software implementation. Rest of the<br />

white paper is organized as per following sections.<br />

<strong>HEVC</strong> Compression Basics<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> Overview<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> R7 release (Codec <strong>Quality</strong>, Encoder <strong>Quality</strong> vs <strong>Performance</strong> Tradeoffs,<br />

and Decoder <strong>Performance</strong>) results presented in 4 parts as follows.<br />

o Part 1: 4:2:0 8-bit Coding <strong>Quality</strong>, <strong>Performance</strong>, and Computational Scalability Tests<br />

o Part 2: 4:2:0 10-bit Coding <strong>Quality</strong>, and <strong>Performance</strong> Tests<br />

o Part 3: 4:2:2 8-bit, and <strong>Performance</strong> Tests<br />

o Part 4: 4:2:2 10-bit, and <strong>Performance</strong> Tests<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

5<br />

*Other names and brands may be claimed as property of others.


Appendix A and B summarize quality & performance results. Appendix C discusses test scripts<br />

<strong>HEVC</strong> Compression Basics<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>HEVC</strong> builds on the well-known classical interframe coding framework of block motion compensated transform<br />

coding. However unlike previous MPEG/ITU-T standards including H.264 instead of using smaller, fixed size<br />

processing based on macroblocks and blocks for motion compensated prediction and small block transform<br />

coding, it uses larger, flexible structures that are partitioned for motion compensated prediction, and a range<br />

of large to small block sizes for transform coding. There are other significant difference as well.<br />

<strong>HEVC</strong> Data Hierarchy<br />

Figure 1 shows high level data structure hierarchy. The terms Video, GOP and Picture as shown are only<br />

conceptual while Slices and lower layers are actual layers employed by <strong>HEVC</strong>.<br />

Video<br />

Temporal Structure (GOP)<br />

[pictures grouped for coding]<br />

Picture<br />

[1 slice or a sequence of slices. Types I, P, B, or Generalized B]<br />

Slices<br />

[1 slice segment or a sequence of slice segments]<br />

Slice Segment (SS)<br />

[Dependent or Independent Slice Segments. A sequence of CTUs]<br />

Coding Tree Unit/Block (CTU/CTB)<br />

[Starting: 64x64 or 32x32 or 16x16. Recursively QT Split down to 8x8]<br />

Coding Unit (CU)<br />

[Sizes: 64x64 (inter only), 32x32, 16x16, 8x8. CU carries intra/inter/skip mode. Inter<br />

CU’s non-recursive split into 1-4 PUs per one of 8 modes. Intra CUs allow only square<br />

PUs and go down to 4x4. inter 8x8 CU allow only 8x4 or 4x8 splits but do not support<br />

bidir pred for these partition sizes. Residual CU recursive QT split up to 4x4 TU]<br />

6<br />

Prediction Unit (PU) – Intra partitions: QT – 35 pred dir. Transform Unit (TU)<br />

Inter partitions:<br />

[32x32, 16x16, 8x8: DCT, 4x4: DCT & DST]<br />

Figure 1 <strong>HEVC</strong> data hierarchy. Video, GOP & Pictures are conceptual, others are actual layers.


<strong>HEVC</strong> Partitioning, Prediction and Coding Technologies<br />

We now introduce the significant components of <strong>HEVC</strong> processing structures as well as discuss actual<br />

video coding algorithms. Due to significant amount of details only high level concepts are covered.<br />

Further, a 2 column visual presentation style is used that shows for each topic, a key concept shown in the<br />

first column and a related illustration in the second column showing. Since this section is a brief overview<br />

of the standard, the concepts are simplified and not necessarily covered in extreme detail.<br />

Coding Tree Unit/Block (CTU/CTB)<br />

Defined at a high level<br />

A CTU consists of 3 CTBs (1 luma plus<br />

2 chroma)<br />

Luma CTB starting size one of<br />

o 64x64<br />

o 32x32<br />

o 16x16<br />

Corresponding Chroma CTB, half in<br />

size horizontally and vertically<br />

Luma CTB split by recursive<br />

QuadTree down to 8x8<br />

Coding Unit (CU)<br />

Always Square<br />

o Largest CU (LCU) as big as<br />

size of luma CTB<br />

o As small as 8x8<br />

o Sizes: 64x64, 32x32, 16x16,<br />

8x8<br />

Traversed in Zig-zag order<br />

Types: Intra, Inter, Skip<br />

Intra CU<br />

o Largest size 32x32<br />

o Partitioned in to square<br />

Prediction Units (PU) up to<br />

4x4<br />

Inter CU<br />

o Largest size 64x64<br />

o 8x8 CU parttioned into 8x4,<br />

and 4x8 PUs only; no<br />

bidirectional pred<br />

Figure 2A Luma CTB starting size options<br />

Figure 2B Partitioning of a luma CTU into CUs<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

7<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Prediction Unit (PU)<br />

CU partitioning into Prediction<br />

partitions is nonrecursive<br />

Intra CUs partitioned into square<br />

Prediction partitions<br />

o 32x32<br />

o 16x16<br />

o 8x8<br />

o 4x4<br />

Inter CU (64x64, 32x32, 16x16, 8x8)<br />

partitioned into<br />

1 of 8 Prediction partition modes<br />

o Partitioned into 1, 2, or 4<br />

partitions<br />

o 8x8 PU is partitioned into<br />

8x4, 4x8 only;<br />

also no bidirectional<br />

prediction mode for 8x8 PUs<br />

Using PU partitions, a residual CU is<br />

constructed prior to coding<br />

Transform Unit (TU)<br />

Residual CU, QuadTree recursively<br />

split into TUs<br />

TUs of following sizes (no 64x64 TU)<br />

o 32x32<br />

o 16x16<br />

o 8x8<br />

o 4x4<br />

Chroma TU of 1/4 th size of luma TU<br />

but smallest<br />

TU for chroma is 4x4 (no 2x2 TUs for<br />

chroma)<br />

TU of size 4x4 flagged by coded/not<br />

coded<br />

DCT Transform on all TU sizes (32x32,<br />

16x16, 8x8, 4x4)<br />

DST Transform on size 4x4 TU<br />

Figure 2C Intra and Inter PU examples<br />

Figure 2D Partitioning of a CU into TUs<br />

8


Intra Coded PU<br />

Each Intra Coded PU<br />

o Pred mode for Luma<br />

o Pred mode for<br />

Chroma<br />

All TUs in a PU use the same<br />

mode<br />

For Luma candidate choices<br />

for prediction mode<br />

o Planar<br />

o DC<br />

o 33 Angular Pred<br />

Directions<br />

For Chroma candidate choices<br />

for prediction mode<br />

o Planar<br />

o DC<br />

o Hor<br />

o Vert<br />

o Luma pred mode copy<br />

Inter Coded PU<br />

Motion Pars specified explicitly or<br />

implicitly<br />

o Motion vector<br />

o Ref Picture Index<br />

o Picture List Usage Flag<br />

<br />

For inter coded CU with<br />

PredMode=Skip, CU coded with no<br />

transform coeff, or motion vector,<br />

and ref picture flag, and ref picture<br />

list usage obtained by motion merge.<br />

Figure 2E Intra Luma prediction directions<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<br />

For inter coded CU with<br />

PredMode=Inter, either use Motion<br />

merge or explicit motion pars<br />

Figure 2F Inter Coded PU Partitionings<br />

9<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Motion Merge<br />

Spatial Merge Candidates<br />

o 5 positions<br />

o Select 4 Candidates<br />

o Remove Partition<br />

Redundancy<br />

<br />

<br />

Temporal Merge Candidates<br />

o 2 positions<br />

o Select 1 candidate<br />

Merge Process<br />

o Remove duplicates from<br />

Spatial and Temporal<br />

Candidates<br />

o Add combined bi-predictive<br />

candidates<br />

o Add nonscaled bi-predictive<br />

candidates<br />

o Add zero merge candidates<br />

o Final merge candidates<br />

Transforms<br />

4x4 integer DST approx. Size 4 basis<br />

matrix shown on right.<br />

<br />

<br />

4x4 integer DCT approx. Size 4 basis<br />

matrix shown on right.<br />

8x8 integer DCT approx. Size 8 basis<br />

matrix shown on right.<br />

16x16 integer DCT approx. Size 16<br />

basis matrix shown on right.<br />

col_ref<br />

curr_ref<br />

tb<br />

td<br />

curr_pic<br />

curr_PU<br />

B 2<br />

A0<br />

(a) second PU of Nx2N<br />

col_pic<br />

col_PU<br />

current<br />

PU<br />

B1<br />

B0<br />

TL<br />

B2<br />

A1<br />

A0<br />

C0<br />

current PU<br />

(b)second PU of 2NxN<br />

current PU<br />

C3<br />

LCU boundary<br />

Figure 3A Spatial Merge candidates position,<br />

position of second PU of Nx2N, and 2NxN, MV<br />

scaling of temporal merge, coding of spatial merge,<br />

Temporal Merge candidates C3 and H<br />

BR<br />

H<br />

B0<br />

10<br />

32x32 ineger DCT approx. Size 32<br />

basis matrix not shown.<br />

Figure 3B Transform basis matrices


Interpolation Filter<br />

Luma<br />

o ¼ pel interpolation<br />

o 7/8 tap filter<br />

Figure 3C 4-tap DCT/IF Luma Filter, and 8 tap<br />

<br />

Chroma<br />

o 1/8 pel interpolation<br />

o 4 tap filter<br />

Deblock Filtering<br />

Overall Process<br />

<br />

<br />

<br />

Boundary strength calculation<br />

o Based on if P or Q is intra, P<br />

or Q has nonzero coef, Pand<br />

Q have different ref, P and Q<br />

have different num of MVs..<br />

o 3 levels of strength 0, 1, 2<br />

Threshold value β and Tc calculation<br />

from input Q<br />

Filter on/off Decision<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 3D Deblock filtering in <strong>HEVC</strong><br />

11<br />

*Other names and brands may be claimed as property of others.


Sample Adaptive Offset (SAO)<br />

Applied to reconstructed video<br />

<br />

SAO Types<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<br />

Details of how SAO Types work<br />

o 3 pel patterns for pixel<br />

classification in Edge<br />

Offset<br />

o Pixel Classification Rules<br />

for Edge Offset<br />

o Grouping 4 bands and<br />

Representation<br />

<strong>HEVC</strong> Encoder<br />

Figure 3E Sample Adaptive Offset Types, Edge Classification,<br />

and Grouping of Bands<br />

Fig. 4 shows high level block diagram of <strong>HEVC</strong> Encoder. Input video frames are partitioned recursively from<br />

CTB’s to CUs and then nonrecursively into PUs. The prediction partition PUs are then combined to generate<br />

Prediction CUs that are differenced from the original resulting in residual CU’s that are recursively QT split<br />

into TUs and coded with variable Block Size (VBS) transform of 4x4 (DST or DCT approx), or 8x8, 16x16, and<br />

32x32 (DCT approx only). CU/PU Partitioner partitions into CU/PU, and the TU partitioner partitions into<br />

TUs. An Encode Controller controls the degree of partitioning performed which depends on quantizer used<br />

in transform coding. The CU/PU Assembler and TU Assembler perform the reverse function of partitioner.<br />

The decoded (every DPCM encoder incorporates a decoder loop) intra/motion compensated difference<br />

partitions are assembled following inverse DST/DCT to which prediction PUs are added and reconstructed<br />

signal then Deblock, and SAO Filtered that corespondingly reduce appearance of artifacts and restore<br />

edges impacted by coding.<br />

12<br />

Figure 4 <strong>HEVC</strong> Encoder


<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> Overview<br />

The <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 suite provides high performance software development tools,<br />

libraries, and infrastructure needed to help create, develop, debug, test, and deploy next generation<br />

enterprise grade media solutions on <strong>Intel®</strong> server processors and graphics. The suite supports multiple<br />

platforms with cross platform support for Linux and Windows under a single SDK.<br />

The <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 suite is offered appropriately segmented into bundles in order to<br />

offer maximum flexibility, cost advantage, and convenience to software developers, data center<br />

solutions providers, and OEMs to enable faster time to market.<br />

Communications<br />

Infrastructure<br />

Developers<br />

Encode<br />

Decode<br />

Microsoft Windows®<br />

<strong>Server</strong> 2012<br />

Video Software<br />

Experts and<br />

Developers<br />

<strong>Intel®</strong> HD<br />

Graphics<br />

Video<br />

Processing<br />

Linux<br />

<strong>Intel®</strong> Quick<br />

Sync Video<br />

Enterprise Video<br />

Device<br />

Manufacturers<br />

Tune, Test,<br />

Debug<br />

<strong>Intel®</strong> Advanced<br />

Vector Extension2<br />

Figure 5A <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> stack<br />

Cloud Based<br />

<strong>Media</strong> Services<br />

Providers<br />

<strong>Intel®</strong> <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong><br />

Cross<br />

Platform<br />

Optimized for<br />

<strong>Intel®</strong> Processors<br />

The <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> Essentials/Community Edition of the suite refers to a basic bundle of<br />

features, while the <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> Professional Edition refers to an advanced bundle of<br />

features. Optional high value capabilities are offered <strong>via</strong> <strong>Intel®</strong> VideoPro Analyzer, and <strong>Intel®</strong> Stress<br />

Bitstreams and Encoder toolkits.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Side-by-side Features Comparison of the two editions of <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Components<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Professional Edition<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> Essentials Edition<br />

<strong>Intel®</strong><strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 -<br />

√ √<br />

SDK<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015<br />

√ √<br />

– Graphics Driver<br />

13<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 -<br />

√ √<br />

Samples<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015<br />

√<br />

– <strong>HEVC</strong> Decoder & Encoder<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015<br />

√<br />

– Audio Decoder & Encoder<br />

<strong>Intel®</strong> VTune Amplifier XE<br />

Video <strong>Quality</strong> Caliper Tool<br />

Premium Telecine and Interlace<br />

Reverser<br />

License<br />

Supported Features<br />

<br />

<br />

<br />

<br />

<br />

<br />

<br />

<br />

√<br />

√<br />

√<br />

Professional <strong>HEVC</strong> Encoder<br />

(up to 40 sockets)<br />

Essential<br />

Hardware: Intel Xeon® Processor E3-1200 v3, E4-128x v4 product family and 4 th Generation Intel<br />

Core Processor-based Platforms with Intel Iris Pro and Intel HD Graphics.<br />

Operating Systems: CentOS*7.1, Ubuntu* 12.04 LTS, SUSE Linux Enterprise <strong>Server</strong>* (SLES), Windows<br />

<strong>Server</strong> 2012.<br />

Video Compression Codecs: H.264 (AVC), <strong>HEVC</strong> (H.265) (8-bit software only, and with <strong>Intel®</strong> Graphics<br />

Acceleration, and 10-bit software only), MPEG-2, VC-1, MVC, MJPEG (Windows only).<br />

Video Processing: deitnerlacing, Resizing, Cropping, Composition, Color conversion, Denoising, Framerate<br />

conversions.<br />

Audio Compression Codecs: Encode and Decode AAC LC and HE AAC V1, decode AAC LTP, AAC PS, HE-<br />

AAC v2, and MPEG-Audio (including ‘MP3’)<br />

Available Samples: Samples for decode, encode, multi-transcode, and video processing.<br />

Analysis/Improvement Tools: Intel VTune Amplifier XE 2013, Video Pro Analyzer, Visual <strong>Quality</strong> Caliper<br />

Tool, Telecine and Interlace Reversal Tool (preview version).<br />

Stress Bitstreams: <strong>HEVC</strong> Main-8, VP9 8-bit bitstreams.<br />

<br />

Documentation: Detailed documentation, user forum, and technical support to get app up and<br />

running.<br />

Related Optional Packages<br />

14<br />

<strong>Intel®</strong> VideoPro Analyzer 2015 offers a comprehensive kit of video analysis software tools for <strong>HEVC</strong> and VP9<br />

video coding. It enables visual analysis of bitstreams and the decoding process including statistics<br />

extraction, debugging and more. It also enables monitoring of visual quality of decoded video.<br />

<strong>Intel®</strong> Stress Bitstreams and Encoder 2015 are available for either of the <strong>HEVC</strong> or VP9 compression<br />

formats. It contains carefully designed stress bitstreams to remove pain from the validation process,


minimizing the needed development time while maximizing the confidence due to redundant syntax. It<br />

also provides the capability of being able to develop custom new bitstreams.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> 2015 SDK<br />

The <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK processing stack for Windows consists of optimized media libraries<br />

that are built on top of Microsoft* DirectX*, DirectX Video Acceleration (DVXA) APIs, and platform<br />

graphics drivers. <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK exposes the hardware acceleration features of <strong>Intel®</strong><br />

Quick Sync Video built into 2 nd and all following generations of <strong>Intel®</strong> Core processors.<br />

While <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK is designed to be a flexible solution for many media workloads, it<br />

focuses only on media pipeline components which are commonly used and usually the most in need for<br />

acceleration, such as follows.<br />

<br />

<br />

<br />

Decoding from video elementary stream formats (H.264, MPEG-2, VC-1, and JPEG/Motion JPEG) to<br />

uncompressed frames.<br />

Selected video frame processing operations<br />

Encoding uncompressed frames to advanced elementary stream formats (MPEG-2, H.264, <strong>HEVC</strong>)<br />

Samples, Applications, External Libraries<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> –<br />

Optimized <strong>Media</strong> Library for CPU<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> API<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> –<br />

Optimized <strong>Media</strong> Library for<br />

<strong>Intel®</strong> Processor Graphics<br />

DXVA/DDI Extensions<br />

Graphics Drivers<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 5B <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK archtecture stack<br />

In the first quarter of 2014 the following features were added to the SDK.<br />

<br />

<br />

<br />

<strong>HEVC</strong>: <strong>High</strong> <strong>Quality</strong> <strong>HEVC</strong> Software Encoder and <strong>HEVC</strong> Software Decoder<br />

H.264: (1) Improved Encode quality and BRC, (2) Region of Interest Encoding,<br />

(3) Adaptive Picture types, Look ahead and others.<br />

VPP: (1) Frame composite API, (2) Deinterlacer choices: Bob or Advanced.<br />

15<br />

*Other names and brands may be claimed as property of others.


System: Splitter/Muxer support for MPEG-2 TS, and MPEG-4.<br />

Plug-ins/Tools.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

During portion of 2014 and all of 2015 <strong>HEVC</strong> Codec was improved significantly in quality and speed.<br />

- The <strong>HEVC</strong> codec implementation includes a multithreaded <strong>HEVC</strong> Encoder and <strong>HEVC</strong> Decoder<br />

capable of running typically up to 8 threads. The Encoder and Decoder are up to SSE4 and AVX2<br />

optimized for Intel Core, and Atom platforms.<br />

- The <strong>HEVC</strong> Software Decoder is capable of real time decode of <strong>HEVC</strong> 4K encoded streams on an<br />

<strong>Intel®</strong> Core Haswell CPU 2 - 3.5 GHz, 4 Cores.<br />

- The <strong>HEVC</strong> Software Encoder supports a range of coding modes with the slowest mode allowing<br />

quality visually identical to <strong>HEVC</strong> HM14 and the fastest mode allowing 30-40 frames/sec encoding<br />

of HD1080p on the aforementioned Haswell system.<br />

- The <strong>HEVC</strong> Software Decoder supports <strong>HEVC</strong> Main Profile 8 bits, Level 1.0 to 6.2.<br />

A complete list of various components and features of <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK is summarized in<br />

Fig. 5C.<br />

Figure 5C List of Features of the <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> SDK<br />

Additional flavors of <strong>HEVC</strong> codecs such as high bit-depth, higher chroma resolution formats and others and<br />

capabilities for controlling and optimum usage of these codecs, have allowed <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

SDK to continue to improve its offerings to stay ahead of the pack.<br />

16


<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> R7 results - Part 1<br />

4:2:0 8-bit Coding <strong>Quality</strong>, <strong>Performance</strong>, and<br />

(on E5) Computational Scalability tests<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

17<br />

*Other names and brands may be claimed as property of others.


<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong><br />

Evaluation for 4:2:0 8-bit Content<br />

In this section, we first describe a test methodology employed for evaluating quality and then report on<br />

relative quality measurements for the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec with respect to MPEG <strong>HEVC</strong> HM14<br />

Codec, well known as a high quality reference albeit impractically slow.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Quality</strong> Evaluation Tests for 4:2:0 8-bit Content<br />

For the purpose of quality evaluation, four test sets, one for each of the 4 main resolutions are selected<br />

with each test set consisting of 6 publicly available challenging video test sequences. Further for each test<br />

sequence, 4 well-spaced quantizer (QP) values are selected such that <strong>HEVC</strong> encoding for each sequence<br />

generates bit-rates in a wide but moderate range, avoiding extreme values.<br />

Wherever possible, test sequences that are available in multiple resolutions are selected for each test set<br />

such that the behavior of a codec over multiple resolutions can be tracked.<br />

Specifically, Table 0A shows selected Ultra Definition 4K (UHD4K) test set, Table 0B shows <strong>High</strong> Definition<br />

1080p (HD1080p) test set, Table 0C shows <strong>High</strong> Definition 720p (HD720p), and Table 0D shows<br />

Standard/Extended Standard Definition (SD/XD) test set.<br />

Table 0A UHD4K Test Set and Quantizers used for Codec RD characteristics measurement<br />

Quantizers used for RD char.<br />

UHD4K Test Sequence Resolution frm rate #frm Qp1 Qp2 Qp3 Qp4<br />

1 Park_joy 3840x2160 50 500 26 29 33 37<br />

2 Ducks_take_off 3840x2160 50 500 28 31 35 37<br />

3 CrowdRun 3840x2160 50 500 26 30 34 38<br />

4 PeopleOnStreet 3840x2160 30 150 22 27 32 37<br />

5 Traffic 3840x2048 30 300 18 22 26 30<br />

6 Nebuta_Festival 2560x1600 60 300 30 33 36 39<br />

Of the UHD4K test sequences referred to in Table 0A, Park_joy, Ducks_take_off and CrowdRun of<br />

3840x2160 resolution can be obtained from http://media.xiph.org/video/derf/ while PeopleOnStreet of<br />

3840x2160 resolution, Traffic of 3840x2048 resolution, and Nebuta_Festival of 2560x1600 resolution are<br />

MPEG <strong>HEVC</strong> test sequences that can be obtained from MPEG distribution sites such as<br />

ftp://hvc:US88Hula@ftp.tnt.uni-hannover.de/testsequences.<br />

18<br />

Table 0B HD1080p Test Set and Quantizers used for Codec RD characteristics measurement<br />

Quantizers used for RD char.<br />

HD1080p Test Sequence Resolution frm rate #frm Qp1 Qp2 Qp3 Qp4<br />

1 Park_joy 1920x1080 50 500 26 29 33 37<br />

2 Ducks_take_off 1920x1080 50 500 28 31 35 37<br />

3 CrowdRun 1920x1080 50 500 26 30 34 38<br />

4 TouchDownPass 1920x1080 30 570 23 26 30 34


5 BQTerrace 1920x1080 60 600 25 27 31 34<br />

6 ParkScene 1920x1080 24 240 23 26 29 32<br />

Of the HD1080p test sequences referred to in Table 0B, Park_Joy, Ducks_take_off, CrowdRun and<br />

TouchDownPass of 1920x1080 resolution can be obtained from http://media.xiph.org/video/derf/ while<br />

BQTerrace and ParkScene of 1920x1080 resolution are standard MPEG <strong>HEVC</strong> test sequences that can be<br />

obtained from MPEG distribution site such as ftp://hvc:US88Hula@ftp.tnt.unihannover.de/testsequences.<br />

Table 0C HD720p Test Set and Quantizers used for Codec RD characteristics measurement<br />

Quantizers used for RD char.<br />

HD720p Test Sequence Resolution frm rate #frm Qp1 Qp2 Qp3 Qp4<br />

1 Park_joy 1280x720 50 500 26 29 33 37<br />

2 Ducks_take_off 1280x720 50 500 28 31 35 37<br />

3 CrowdRun 1280x720 50 500 26 30 34 38<br />

4 City 1280x720 30 300 22 24 27 29<br />

5 Crew 1280x720 30 300 22 26 29 32<br />

6 Sailormen 1280x720 30 300 24 26 28 30<br />

Table 0C shows HD720p test sequences of which Park_Joy, Ducks_take_off, and CrowdRun of 1280x720<br />

resolution can be obtained from http://media.xiph.org/video/derf/ while City, Crew, and Sailormen of<br />

1280x720 resolution are earlier MPEG SVC test sequences with limited public distribution (but can be<br />

made available on request).<br />

Table 0D SD/XD Test Set and Quantizers used for Codec RD characteristics measurement<br />

Quantizers used for RD char.<br />

SD/XD Test Sequence Resolution frm rate #frm Qp1 Qp2 Qp3 Qp4<br />

1 BasketBallDrillText 832x480 50 500 21 24 27 30<br />

2 PartyScene 832x480 50 500 24 27 30 33<br />

3 RaceHorses 832x480 30 300 24 27 30 33<br />

4 City 704x576 30 300 22 24 26 28<br />

5 Crew 704x576 30 300 22 25 28 31<br />

6 Soccer 704x576 30 300 22 25 28 31<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 0D shows standard and extended definition test sequences such as BasketBallDrillText, PartyScene,<br />

and RaceHorses all of 832x480 resolution that are standard MPEG <strong>HEVC</strong> test sequences that can be<br />

obtained from ftp://hvc:US88Hula@ftp.tnt.uni-hannover.de/testsequences while City, Crew, and Soccer of<br />

704x576 resolution can be obtained from MPEG site ftp://ftp.tnt.unihannover.de/pub/svc/testsequences/.<br />

Now that we have discussed test sequences, we now introduce encoding configuration used.<br />

<strong>HEVC</strong> HM encoding is employed in high quality default high delay Random Access configuration but with<br />

only first frame Intra (other Intra’s can still happen but only due to scene change), pyramid configuration<br />

of size 8, and 4 Reference Pictures. For each encoding test, the reference quantizer used (which may be<br />

19<br />

*Other names and brands may be claimed as property of others.


internally modulated into multiple quantisers as needed for I, P or B (including reference and<br />

nonreference slices)) is specified in Table 0A – 0D.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

The <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> (MSS) <strong>HEVC</strong> software encoder supports several Target Usages (TU) that range<br />

from TU1 to TU7, and <strong>Intel®</strong> Graphics optimized version of TU4-TU7 (TU4gacc-TU7gacc) such that TU1<br />

offers the highest quality, and TU7gacc the highest speed. In our encoding tests, the main four TU modes<br />

– TU1 (MSS ‘<strong>Quality</strong>’ mode), TU4 (MSS ‘Balanced’ mode), TU7 (MSS ‘Speed’ mode), and TU7gacc are<br />

tested. Further for TU1, TU4, TU7, and TU7gacc modes, B-pyramid encoding with pyramid length set to 8<br />

is used. Likewise, different TU modes use different number of reference pictures such as 4, 3, 2, and 2<br />

reference pictures respectively for TU1, TU4, TU7, and TU7gacc modes.<br />

To measure quality of an <strong>HEVC</strong> codec while many techniques such as peak signal-to-noise ratio (PSNR),<br />

structural similarity index (SSIM), other objective quality metrics, or even full subjective quality tests can<br />

be performed, in our own competitive quality assessment tests we have found results of most such<br />

measures to be fairly consistent with each other as long as codecs are compared in a constant quantizer<br />

mode. This allows comparison of actual codecs independent of bit rate control (BRC) techniques.<br />

We now briefly describe a statistically tractable technique to compare video quality produced by the<br />

codec being tested as compared to a reference codec. Rate Distortion (RD) characteristics for both the<br />

codecs are computed using each codec’s 4-point PSNR/Bitrate measurements followed by use of MPEG’s<br />

new BD rate ([5]) curve fitting procedure that generates a continuous RD curve that tightly fits the<br />

measured points. A single measurement of ‘goodness’ of the codec being tested against the reference<br />

codec in the form of BD rate is then computed that reflects percentage difference between the codecs.<br />

The BD rate percentage difference if positive means that the codec being tested is worse in quality, that is<br />

it costs ‘x’ percentage more bits to generate the same PSNR quality as the reference. The BD rate<br />

difference measurement procedure allows a straightforward way of computing and independently<br />

verifying quality of codec with respect to a reference codec.<br />

<strong>Quality</strong> Evaluation Results for 4:2:0 8-bit Content<br />

Before discussing detailed BD rate differences of each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode wrt HM 14, we first<br />

establish the quality/bitrate tradeoff of HM 14 that will be used as reference.<br />

First, each video sequence of each test set (such as UHD, HD1080p, HD720p, and SD/XD) is encoded using<br />

MPEG <strong>HEVC</strong> HM14 Reference Encoder with each of 4 quantizers as specified in Tables 0A-0D; sample<br />

encoding scripts used are shown in Appendix C. The overall PSNR (averaged over frames) for each test<br />

sequence for each Qp for each of luma (Y), and associated chroma components (U and V) is collected<br />

along with the generated coding bitrate. For instance, Tables 1A-1D show the luma PSNR (chroma PSNR is<br />

also collected but is not shown to keep tables manageable in size) and total bitrate for each test<br />

sequence for each of 4 Qps is collected and is used in curve fitting to generate a continuous RD curve for<br />

HM14 encoding of that test sequence.<br />

20<br />

Next, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in each of its 4 main modes: TU1, TU4, TU7, and<br />

TU7gacc is utilized to perform encoding of each test sequence with each of 4 quantizers, and PSNRs of Y,<br />

U, and V, and total bitrates are collected; example encoding scripts are shown in Appendix C.


UHD4K Test<br />

Sequence<br />

Table 1A PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on UHD4K Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB Y kbps dB Y kbps<br />

1 Park_joy 35.46 78618.8 34.05 46580.4 32.11 23364.3 30.28 12681.8<br />

2 Ducks_take_off 31.43 90261.7 30.50 41146.9 29.49 19223.3 28.98 14004.0<br />

3 CrowdRun 35.85 54630.9 34.41 27522.1 32.74 15383.6 31.02 9137.1<br />

4 PeopleOnStreet 39.96 63581.6 37.00 28295.8 34.14 14476.9 31.43 8065.3<br />

5 Traffic 43.63 51403.5 41.59 22377.6 39.48 9774.2 37.47 4907.8<br />

6 Nebuta_Festival 30.78 48670.2 28.98 19513.1 28.12 8004.1 27.56 3671.2<br />

HD1080p Test<br />

Sequence<br />

Table 1B PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on HD1080p Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB, Y kbps dB, Y kbps<br />

1 Park_joy 34.33 35140.4 32.20 21705.7 29.64 10913.1 27.42 5552.4<br />

2 Ducks_take_off 32.23 28302.7 30.69 16346.3 28.81 7991.4 27.93 5587.9<br />

3 CrowdRun 35.11 25316.6 32.62 13868.8 30.24 7619.6 28.14 4281.4<br />

4 TouchdownPass 40.42 5743.1 39.35 2908.6 37.81 1444.7 36.14 790.7<br />

5 BQTerrace 35.91 12318.7 35.32 6264.8 34.19 2101.4 33.09 1130.2<br />

6 ParkScene 39.49 5300.4 37.89 3087.3 36.32 1886.6 34.70 1132.8<br />

HD720p Test<br />

Sequence<br />

Table 1C PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on HD720p Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB, Y kbps dB, Y kbps<br />

1 Park_joy 33.91 18166.8 31.65 11456.1 28.88 5779.1 26.52 2888.0<br />

2 Ducks_take_off 32.29 14525.9 30.42 8539.4 28.21 4117.4 27.24 2871.9<br />

3 CrowdRun 34.49 15194.3 31.62 8438.4 39.04 4599.1 26.83 2518.4<br />

4 City 38.89 5439.4 37.83 3171.9 36.49 1603.0 35.61 1105.0<br />

5 Crew 40.76 4477.8 38.98 1975.8 37.72 1206.9 36.37 751.1<br />

6 Sailormen 37.43 4656.3 36.43 2931.4 35.53 1909.2 34.68 1282.5<br />

XD/SD Test<br />

Sequence<br />

Table 1D PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on XD/SD Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB, Y kbps dB, Y kbps<br />

1 BasketBallDrillText 41.07 3753.2 39.13 2405.3 37.19 1562.7 35.35 1023.2<br />

2 PartyScene 36.74 4478.5 34.62 2738.5 32.65 1702.7 30.75 1038.6<br />

3 RaceHorses 36.67 3206.6 35.82 1909.6 34.08 1188.9 32.37 747.2<br />

4 City 38.47 2745.7 37.37 1624.2 36.42 1021.6 35.54 692.4<br />

5 Crew 39.71 3064.0 38.17 1672.7 36.78 1002.7 35.35 621.5<br />

6 Soccer 40.06 2754.8 38.30 1600.4 36.61 1001.9 34.87 624.6<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

21<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

For <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in each encoding mode, the 4 collected bitrates and corresponding 4<br />

PSNRs (each corresponding to one Qp) are used in curve fitting to generate a continuous RD curve for<br />

each encoding mode. Further, for each test sequence and for each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode, the<br />

corresponding <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> RD curve is compared to HM14 RD curve and a BD rate percentage is<br />

computed that reflects the difference in quality between specific <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode with<br />

respect to TU14 reference. The BD rate provides a measure of departure of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU<br />

mode’s quality in percentage bits with respect to the HM14 reference. For instance a BD rate percentage<br />

of say 4% for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode means that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> TU1 mode in order to<br />

provide the same objective quality as HM14 would require 4% additional bits. The <strong>HEVC</strong> provided macro<br />

for BD rate calculation is shown in Appendix C.<br />

Table 2A-2D show the measured BD rate percentage bitrate for each of luma and chroma components for<br />

each test sequence of each test set of Table 0A-0D for each of 4 TU modes being evaluated. Fig. 6A-6D<br />

compares the RD characteristics of two codecs for each test set for TU1 and TU4 modes where the<br />

difference in quality is the highest and the lowest wrt HM14 reference.<br />

For instance, Table 2A shows on UHD4K test set, the BD rate percentage difference of <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> <strong>HEVC</strong> in various TU modes wrt HM14 reference. As can be observed on UHD4K test set, the<br />

average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec over HM14 (an ideal<br />

reference), is -0.05%, 17.65%, 33.57%, and 35.22% higher respectively in TU1, TU4, TU7, and TU7gacc modes.<br />

This means that for UHD4K test set, the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> codec as compared to HM14, requires<br />

around 0% higher bitrate in TU1 mode, 18% higher bitrate in TU4 mode, 34% higher bitrate in TU7 mode,<br />

and 35% higher bitrate in TU7gacc mode to achieve the same luma PSNR quality as HM14. Further as a<br />

matter of reference, the HM14 codec is 100 to 3000 times slower (shown in the next section) as compared<br />

to <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec depending on the TU setting used.<br />

Table 2A Relative <strong>Quality</strong> Evaluation Results on UHD4K test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec<br />

in TU1, TU4, TU7, and TU7gacc modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

UHD4K Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age TU7gacc BD rate, %age<br />

Y U V Y U V Y U V Y U V<br />

1 Park_joy -0.18 1.92 -0.63 13.50 7.92 16.47 26.97 25.10 33.01 28.15 25.76 33.61<br />

2 Ducks_take_off -0.93 6.81 9.79 15.92 -16.84 6.27 28.53 -0.84 23.39 30.00 0.62 25.80<br />

3 CrowdRun 0.37 2.99 2.14 17.33 18.18 20.35 35.46 46.38 48.19 36.74 47.33 49.31<br />

4 PeopleOnStreet 0.16 -7.15 -6.21 20.65 21.06 25.66 40.38 52.82 59.84 43.26 55.77 62.65<br />

5 Traffic 0.31 -9.00 -15.63 20.86 16.15 23.76 36.52 41.70 42.65 37.96 42.62 42.98<br />

6 Nebuta_Festival 6.91 47.06 42.79 20.02 16.68 5.40 46.90 33.35 16.72 47.68 32.44 16.53<br />

Average -0.05 -0.89 -2.11 17.65 9.30 18.50 33.57 33.03 41.41 35.22 34.42 42.87<br />

22<br />

Further, Table 2A shows on UHD4K test set that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec TU1 mode with respect<br />

to HM14 reference, results in biggest positive luma BD rate percentage difference (its worst case) on<br />

Nebuta_Festival 1600p sequence and better than HM14 luma BD rate percentage difference (its best case)


on Ducks_take_off 4K sequence; the corresponding RD curves for luma (Y) for TU1 mode, and HM<br />

verifying this behaviour are shown in Fig. 6A1.<br />

32<br />

Nebuta_Festival Y<br />

32<br />

Ducks_take_off Y<br />

31<br />

PSNR (dB)<br />

30<br />

29<br />

28<br />

27<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

Figure 6A1 RD results of UHD4K scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

Table 2A also shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode with respect to HM14 reference,<br />

results in the biggest luma BD rate percentage difference on Traffic 4K sequence and the smallest luma<br />

BD rate percentage difference on Park_Joy 4K sequence; the corresponding RD comparison curves for<br />

TU4 mode for the two cases are shown in Fig. 6A2.<br />

PSNR (dB)<br />

44<br />

43<br />

42<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

Traffic Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

PSNR (dB)<br />

36<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

29<br />

31<br />

30<br />

29<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 50000 100000 150000<br />

BitRate (kbps)<br />

Park_Joy Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 50000 100000<br />

BitRate (kbps)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 6A2 RD results of UHD4K scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

23<br />

Next, Table 2B shows BD rate percentage differences of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> wrt HM14 on the<br />

HD1080p test set. The average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

*Other names and brands may be claimed as property of others.


over HM14 reference, is 1.71%, 17.30%, 33.90%, and 37.71% respectively in TU1, TU4, TU7, and TU7gacc<br />

modes.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 2B Relative <strong>Quality</strong> Evaluation Results on HD1080p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec<br />

in TU1, TU4, TU7, and TU7gacc modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

HD1080p Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age TU7gacc BD rate, %age<br />

Y U V Y U V Y U V Y U V<br />

1 Park_joy 1.13 5.42 1.63 13.40 16.97 27.40 26.40 32.87 42.50 28.41 34.47 44.18<br />

2 Ducks_take_off 0.55 2.78 7.08 10.49 -3.20 21.87 20.40 11.05 38.23 22.05 12.65 40.62<br />

3 CrowdRun 1.03 3.31 2.83 16.52 22.22 23.13 35.12 48.29 50.21 36.86 50.23 51.97<br />

4 TouchDownPass 0.83 -5.35 -4.83 19.31 17.52 17.78 38.34 53.74 50.80 46.23 57.41 54.14<br />

5 BQTerrace 4.01 -14.71 -22.15 26.65 24.46 32.33 49.85 41.01 61.07 55.94 38.78 61.87<br />

6 ParkScene 2.69 -6.10 -7.99 17.45 21.20 19.58 33.31 40.70 36.60 36.78 43.46 38.45<br />

Average 1.71 -2.44 -3.91 17.30 16.53 23.68 33.90 37.94 46.57 37.71 39.50 48.54<br />

Further, Table 2B shows on HD1080p test set that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1 mode with<br />

respect to HM14 reference, results in the biggest luma BD rate percentage difference on BQTerrace 1080p<br />

sequence and the smallest luma BD rate percentage difference on Ducks_take_off 1080p sequence; the<br />

corresponding RD comparison curves for TU1 mode for the two cases are shown in Fig. 6B1.<br />

PSNR (dB)<br />

37<br />

36<br />

35<br />

34<br />

33<br />

BQTerrace Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

Ducks_take_off Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 10000 20000 30000 40000<br />

BitRate (kbps)<br />

Figure 6B1 RD results of 1080p scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

24<br />

Table 2B also shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode with respect to HM14 reference,<br />

results in the biggest luma BD rate percentage difference on BQTerrace 1080p sequence and the smallest<br />

BD rate percentage difference for luma on Ducks_take_off 1080p sequence; the corresponding<br />

comparison curves for TU4 for the two cases are shown in Fig. 6B2.


PSNR (dB)<br />

37<br />

36<br />

35<br />

34<br />

33<br />

32<br />

BQTerrace Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 5000 10000 15000<br />

BitRate (kbps)<br />

Figure 6B2 RD results of 1080p scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

Next, Table 2C shows BD rate percentage differences of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> wrt HM14 on the<br />

HD720p test set. The average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

over HM14 reference is 1.56%, 16.66%, 34.94%, and 39.46% respectively for TU1, TU4, TU7, and TU7gacc<br />

modes.<br />

Table 2C Relative <strong>Quality</strong> Evaluation Results on HD720p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec<br />

in TU1, TU4, TU7, and TU7gacc modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

HD720p Test TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age TU7gacc BD rate, %age<br />

Sequence<br />

Y U V Y U V Y U V Y U V<br />

1 Park_joy 1.50 8.52 1.79 13.76 21.34 31.32 25.98 36.64 48.01 28.54 39.09 50.21<br />

2 Ducks_take_off 0.74 3.17 8.35 9.74 5.09 31.10 19.86 20.60 48.11 21.48 22.18 49.8<br />

3 CrowdRun 1.76 3.44 3.55 17.36 25.54 26.88 34.09 51.23 54.06 35.86 53.03 056.0<br />

4 City 0.47 -15.84 -15.61 20.49 14.98 18.35 47.09 54.84 63.71 54.20 57.71 568.8<br />

5 Crew 2.80 -2.16 -1.90 22.24 12.83 13.89 46.97 57.90 52.16 58.21 64.42 461.4<br />

6 Sailormen 2.11 -6.27 -11.14 16.38 21.36 27.72 35.62 43.83 43.90 38.49 44.33 045.4<br />

Average 1.56 -1.53 -2.50 16.66 16.86 24.88 34.94 44.17 51.66 39.46 46.79 55.29 4<br />

PSNR (dB)<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

Ducks_take_off Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 10000 20000 30000 40000<br />

BitRate (kbps)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Further, Table 2C shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1 mode results in the biggest BD rate<br />

percentage difference for luma on Crew 720p sequence and the smallest BD rate percentage difference in<br />

luma on Ducks_take_off 720p sequence with respect to HM14 reference; the corresponding RD<br />

comparison curves for TU1 mode for the two cases are shown in Fig. 6C1.<br />

25<br />

*Other names and brands may be claimed as property of others.


42<br />

Crew Y<br />

33<br />

Ducks_take_off Y<br />

41<br />

32<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

PSNR (dB)<br />

40<br />

39<br />

38<br />

37<br />

36<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 2000 4000 6000<br />

BitRate (kbps)<br />

Figure 6C1 RD results of 720p scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

Table 2C also shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode results in the biggest BD rate<br />

percentage difference for luma on Crew 720p sequence and the smallest BD rate percentage difference<br />

for luma on Ducks_take_off 720p sequence with respect to HM14 reference; the corresponding RD<br />

comparison curves for TU4 mode for the two cases are shown in Fig. 6C2.<br />

PSNR (dB)<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

35<br />

Crew Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 2000 4000 6000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

PSNR (dB)<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

31<br />

30<br />

29<br />

28<br />

27<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

Ducks_take_off Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

Figure 6C2 RD results of 720p scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

26<br />

Next, Table 2D shows BD rate percentage differences of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> wrt HM14 on the<br />

SD/XD test set. The average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec over<br />

HM14 reference is 0.07%, 19.44%, 39.33%, and 44.65% respectively for TU1, TU4, TU7, and TU7gacc modes.<br />

Table 2D Relative <strong>Quality</strong> Evaluation Results on SD/XD test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in<br />

TU1, TU4, TU7, and TU7gacc modes with respect to MPEG <strong>HEVC</strong> HM14 Codec


SD/XD Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age TU7gacc BD rate, %age<br />

Y U V Y U V Y U V Y U V<br />

1 BasketBallDrilText -5.28 -9.95 -8.02 23.65 30.36 33.23 42.58 61.73 65.86 46.58 64.30 68.15<br />

2 PartyScene 4.15 -1.27 -2.32 20.65 22.38 23.82 35.33 40.91 42.96 38.62 43.68 46.47<br />

3 RaceHorses 1.12 -1.18 -2.40 21.41 21.25 27.44 45.05 54.40 65.82 51.79 60.87 72.86<br />

4 City 0.49 -15.36 -16.88 18.37 11.15 15.15 41.57 38.53 48.10 46.92 41.37 50.18<br />

5 Crew 1.83 -2.89 -3.22 17.58 10.88 13.20 38.77 46.39 48.64 46.0 53.00 56.13<br />

6 Soccer -1.88 -6.68 -5.98 15.00 5.65 11.86 32.67 29.39 40.96 38.01 32.88 43.63<br />

Average 0.07 -6.22 -6.47 19.44 16.95 20.78 39.33 45.22 52.06 44.65 49.35 56.24<br />

Also, Table 2D shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1 mode results in the biggest BD rate<br />

percentage difference for luma on PartyScene XD sequence and the smallest BD rate percentage<br />

difference in luma on BasketballDrillText SD sequence with respect to HM14 reference; the corresponding<br />

RD comparison curves for TU1 mode for the two cases are shown in Fig. 6D1.<br />

PSNR (dB)<br />

38<br />

37<br />

36<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

PartyScene Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 2000 4000 6000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

Figure 6D1 RD results of SD/XD scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

Further, Table 2D shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode results in the biggest BD rate<br />

percentage difference for luma on BasketballDrillText XD sequence and the smallest BD rate percentage<br />

difference for luma on Soccer SD sequence with respect to HM14 reference; the corresponding RD<br />

comparison curves for TU4 mode for the two cases are shown in Fig. 6D2.<br />

43<br />

42<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

35<br />

BasketballDrillText Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 2000 4000 6000<br />

BitRate (kbps)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

27<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

PSNR (dB)<br />

Figure 6D2 RD results of SD/XD scenes with the biggest and the smallest quality difference wrt HM14 for<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

To summarize, the luma BD rate percentage bitrate difference of the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> in various<br />

modes with respect to an ideal (but slow) HM14 reference is as follows.<br />

<br />

<br />

<br />

<br />

42<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

35<br />

34<br />

BasketballDrillText Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 2000 4000 6000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

TU1 mode on average is only 0 - 2% worse in BD rate (while around 125 times faster, as shown later) as<br />

compared to HM14 on all test sets. This is highest quality mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

TU4 mode on average is only 16 - 19% worse in BD rate (while around 950 times faster, as shown later)<br />

as compared to HM14. This mode represents excellent tradeoff of quality vs speed.<br />

TU7 mode on average is 33 - 39% worse in BD rate (while around 2450 times faster, as shown later) as<br />

compared to HM14. This is the fastest software only mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

TU7gacc mode on average is also 35 - 44% worse in BD rate rate (while being around 3150 times faster,<br />

as shown later) as compared to HM14 on all test sets. This is the fastest mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

and combines TU7 software mode with <strong>Intel®</strong> Graphics Acceleration.<br />

Now that we have completed quality analysis of various TU modes of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>, the next<br />

obvious step is to perform analysis of encoding speed offered by each of these modes; this issue is<br />

discussed at length in the next section.<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

35<br />

34<br />

Soccer Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 1000 2000 3000 4000<br />

BitRate (kbps)<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong><br />

Tradeoffs for 4:2:0 8-bit Content<br />

28<br />

For measurement of encoding speed (fps) and speed vs quality tradeoffs, a recently released reference<br />

PC Platform (<strong>Intel®</strong> Core i7-4770K CPU @ 3.5 GHz – 4 Cores/8Threads) is employed.


<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:0 8-bit Content<br />

We first measure encoding speed (fps) of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in TU1 (highest<br />

quality) mode and MPEG <strong>HEVC</strong> HM14 on different resolution test sets. Results of these measurements<br />

comparing the two speeds are shown in Fig. 7A.<br />

Encoding Speed (fps)<br />

Figure 7A Average encoding speed comparison of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode with HM14<br />

From Figure 7A it can be seen that the encoding speed of the TU1 mode is 100 to 141 times (multithreaded<br />

on 4 cores) the speed of HM14 (single threaded/1 core) for the 4 resolution test sets. For 1080p content,<br />

TU1 is 141 times faster than HM14. Earlier we had shown that the quality of TU1 mode was nearly identical<br />

(only around 0-2% less in BD rate) with respect to HM14.<br />

Next in Table 3A-3D we show measurement of encoding speed of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software<br />

Encoder on each sequence of all four resolution test sets in each of four TU1, TU4, TU7, and TU7gacc<br />

modes.<br />

Table 3A Encoding Speed of UHD4K test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, TU7, and<br />

TU7gacc modes. For reference the average Speed of HM14 Encoder for this test set is .004 fps.<br />

UHD4K Test<br />

Sequence<br />

7.0<br />

6.0<br />

5.0<br />

4.0<br />

3.0<br />

2.0<br />

1.0<br />

0.0<br />

<strong>Performance</strong> Comparison of <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1 Mode and HM14.0<br />

5.78 (141x)<br />

3.04 (117x)<br />

2.09 (139x)<br />

0.40 (100x)<br />

0.041 0.026 0.015 0.004<br />

SD/XD HD720p HD1080p UHD4K<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>HEVC</strong> HM14.0<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed TU7gacc Enc Speed<br />

fps fps fps fps<br />

1 Park_joy 0.36 2.70 7.86 10.97<br />

2 Ducks_take_off 0.27 2.11 6.58 9.43<br />

3 CrowdRun 0.38 3.40 9.28 12.45<br />

4 PeopleOnStreet 0.33 3.03 7.59 9.73<br />

5 Traffic 0.65 4.42 11.02 14.49<br />

6 Nebuta_Festival* 1.05 7.66 19.29 24.30<br />

Average 0.40 3.13 8.47 11.41<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

29<br />

*Other names and brands may be claimed as property of others.


* For average encoder speed calculation, for all TU’s Nebuta_Festival is excluded as its size is 2560x1600 while others are of<br />

size 4K (3840x2160).<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 3B Encoding Speed of HD1080p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, TU7, and TU7gacc<br />

modes. For reference the average speed of HM14 Encoder for this test set is .015 fps.<br />

HD1080p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed TU7gacc Enc Speed<br />

fps fps fps fps<br />

1 Park_joy 1.50 10.02 25.06 34.59<br />

2 Ducks_take_off 1.18 8.65 24.40 34.80<br />

3 CrowdRun 1.33 10.53 27.34 38.20<br />

4 TouchDownPass 2.09 17.92 46.05 60.02<br />

5 BQTerrace 4.08 21.01 56.20 67.22<br />

6 ParkScene 2.36 16.59 42.51 54.18<br />

Average 2.09 14.12 36.93 48.17<br />

Table 3C Encoding Speed of HD720p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, TU7, and TU7gacc<br />

modes. For reference the average speed of HM14 Encoder for this test set is .026 fps.<br />

HD720p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed TU7gaccEnc Speed<br />

fps fps fps fps<br />

1 Park_joy 3.22 20.68 49.64 68.33<br />

2 Ducks_take_off 2.75 19.17 49.55 67.45<br />

3 CrowdRun 2.67 19.75 48.35 68.46<br />

4 City 3.92 30.87 69.16 89.06<br />

5 Crew 2.40 25.46 65.17 89.73<br />

6 Sailormen 3.30 24.75 67.05 86.73<br />

Average 3.04 23.45 58.15 78.29<br />

Table 3D Encoding Speed of SD/XD test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> software Codec in TU1, TU4, TU7, and TU7gacc<br />

modes. For reference the average speed of HM14 Encoder for this test set is .041 fps.<br />

SD/XD Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed TU7gaac Enc Speed<br />

fps fps fps fps<br />

1 BasketBallDrillText 6.42 50.41 132.21 153.24<br />

2 Party_Scene 7.52 54.83 133.19 148.70<br />

3 RaceHorses 4.72 40.10 103.49 126.06<br />

4 City 6.90 54.04 127.11 150.20<br />

5 Crew 3.82 38.80 103.84 133.84<br />

6 Soccer 5.29 45.45 120.86 149.45<br />

Average 5.78 47.27 120.12 143.58<br />

30<br />

As can be seen from Table 3A-3D the encoding speed of TU4 mode is in the range of 6.8 to 8.2 that of TU1<br />

mode, and that the encoding speed of TU7 mode is 2.5 to 2.7 times that of TU4 mode, and TU7gacc mode<br />

is around 32% faster, ie, 1.2 to 1.34 times the speed of TU7. In other words TU7gacc mode reflects an


overall speed of 24 to 29 times that of TU1 mode. Also, since TU1 for 4 core muli-threaded coding is<br />

typically on average 125 times the speed of HM (range of 100 to 141 times), TU7gacc reflects a typical<br />

speedup factor of 2850 to 3500 over HM14!<br />

A side-by-side comparison of encoding speed of different modes and for the 4 different resolution test<br />

sets is shown by the bar graphs of Fig. 7B. Table 3A-3D also show that the TU7gacc mode is able to reach<br />

encoding speed of 9.5 to 14.5 fps for UHD4K content, 35 to 67 fps for HD1080p content, around 67 to 90<br />

fps for HD720p, and 126 to 153 fps for SD/XD content.<br />

Encoding Speed (fps)<br />

140<br />

120<br />

100<br />

80<br />

60<br />

40<br />

20<br />

0<br />

Encoding <strong>Performance</strong><br />

SD/XD HD720p HD1080p UHD4K<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7gacc<br />

Figure 7B Average encoding speed of all test sets on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder<br />

Fig. 8 shows another benefit of TU7gacc as compared to TU7 in terms of its capability of reducing the load<br />

on the CPU. Thus TU7gaac is currently able to reduce the load on CPU by around 5% from around 95% to<br />

around 90% enabling CPU to be used by other applications. This is in addition to 32% encoding speedup<br />

that TU7gacc provides over TU7 while achieving identical video quality.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

31<br />

*Other names and brands may be claimed as property of others.


CPU load Comparison between <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7 and TU7gacc<br />

110<br />

100<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7gacc<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

CPU Load (%)<br />

90<br />

80<br />

70<br />

60<br />

50<br />

SD/XD HD720p HD1080p UHD4K<br />

Resolution<br />

Figure 8 CPU Load Difference between TU7 and TU7gacc modes<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:0 8-bit Content<br />

We now discuss results of Codec <strong>Quality</strong> vs Encoding <strong>Performance</strong> tradeoffs for all modes for <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder for all 4 resolutions tested.<br />

Fig. 9 shows a comparison of <strong>Quality</strong> (negative BD rate percentage difference wrt <strong>HEVC</strong> HM14) vs<br />

Encoding <strong>Performance</strong> (fps) for each of the four resolution test sets at each of the four TU1, TU4, TU7,<br />

and TU7gacc encoding modes. The y-axis basically shows the quality difference in terms of loss of BD rate<br />

percentage difference in the process of increasing speed up of the encoder in going from TU1 to TU4 to<br />

TU7 to TU7gacc operating points.<br />

To summarize, encoding performance-wise <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in different modes achieves<br />

the following speedup of <strong>HEVC</strong> encoding on 4 core Reference PC platform specified earlier.<br />

32


<strong>Quality</strong> vs <strong>Performance</strong> Y<br />

5<br />

0 20 40 60 80 100 120 140 160<br />

Figure 9 <strong>Quality</strong> vs Encoding Speed Tradeoffs of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoding modes<br />

<br />

<br />

BDRate %age dif wrt HM14.0<br />

0 TU1 TU1 TU1<br />

-5<br />

-10<br />

-15<br />

-20<br />

-25<br />

-30<br />

-35<br />

-40<br />

-45<br />

-50<br />

TU4<br />

TU7<br />

TU4<br />

TU7gacc<br />

TU4<br />

TU4<br />

TU7 TU7gacc<br />

TU7<br />

TU7gacc<br />

Encoding Speed (fps)<br />

For UHD4K test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, TU7 and TU7gacc, correspondingly on<br />

average provides 0.40 fps, 3.1 fps, 8.5 fps and 11.4 fps. This reflects corresponding speedup factors<br />

wrt HM14 (single threaded) of 100, 783, 2118, and 2850.<br />

For HD1080p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, TU7 and TU7gacc, correspondingly on<br />

average provides 2.1 fps, 14.1 fps, 37.0 fps and 48.2 fps. This reflects corresponding speedup factors<br />

wrt HM14 (single threaded) of 139, 941, 2462, and 3211.<br />

TU7<br />

SD/XD<br />

HD720p<br />

HD1080p<br />

UHD4K<br />

TU7gacc<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<br />

<br />

For HD720p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, TU7 and TU7gacc, correspondingly on<br />

average provides 3.0 fps, 23.5 fps, 58.2 fps and 78.3 fps. This reflects corresponding speedup factors<br />

wrt HM14 (single threaded) of 117, 902, 2237, and 3011.<br />

For SD/XD test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, TU7 and TU7gacc, correspondingly on<br />

average provides 5.8 fps, 47.3 fps, 120.1 fps and 143.6 fps. This reflects corresponding speedup factors<br />

wrt HM14 (single threaded) of 141, 1153, 2930, and 3501.<br />

33<br />

*Other names and brands may be claimed as property of others.


<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Decoder <strong>Performance</strong> for<br />

4:2:0 8-bit Content<br />

In this section we describe results of decoding speed measurement of <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Decoder. For measurement of decoding speed (fps), the same reference PC Platform (<strong>Intel®</strong> Core i7-<br />

4770K CPU @ 3.5 GHz – 4 Cores/8 Threads) used for encoding speed measurement is employed.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

34<br />

The <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Decoder is able to achieve very high threading throughput<br />

consuming over 90% of resources on the noted machine.<br />

For measurement of decoder performance, longer bitstreams of typically around 1000 or more frames are<br />

necessary to obtain a stable measurement. Thus, each of the video sequences of Table 0A-0D since they<br />

are relatively short were extended by palindromic repetition (so as not to introduce sudden scene<br />

changes that might introduce an unnatural behavior in the measurement) to 900 – 1200 frames long and<br />

compressed with <strong>HEVC</strong> using the same Qp quantizers as in Table 0A-0D. These longer compressed<br />

streams were then used for decoder performance measurment.<br />

Tables 4A-4D show average speed of decoding bitstreams of each sequence of each of the 4 test sets as<br />

well as an overall average deoding speed for each category.<br />

Table 4A Decoding Speed of UHD4K on <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1, TU4, TU7 modes<br />

UHD4K Test Sequence<br />

TU1 Dec Speed TU4 Dec Speed TU7 Dec Speed<br />

fps fps fps<br />

1 Park_joy 70.98 62.31 67.98<br />

2 Ducks_take_off 72.83 66.19 61.57<br />

3 CrowdRun 73.07 67.90 74.39<br />

4 PeopleOnStreet 65.06 59.67 65.50<br />

5 Traffic 54.70 46.48 49.75<br />

6 Nebuta_Festival* 172.78 144.69 159.11<br />

Average 67.33 60.51 63.84<br />

* For average calculation, Nebuta_Festival is excluded as its size is 2560x1600 while others are 3840x2160.<br />

Table 4B Decoding Speed of HD1080p on <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1, TU4, TU7 modes<br />

HD1080p Test Sequence<br />

TU1 Dec Speed TU4 Dec Speed TU7 Dec Speed<br />

fps fps fps<br />

1 Park_joy 183.16 205.53 232.34<br />

2 Ducks_take_off 198.84 242.18 233.12<br />

3 CrowdRun 135.32 212.90 244.08<br />

4 TouchDownPass 221.87 312.77 346.70<br />

5 BQTerrace 255.18 329.90 368.03<br />

6 ParkScene 203.97 291.99 326.28<br />

Average 199.72 265.88 291.76


Table 4C Decoding Speed of HD720p on <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1, TU4, TU7 modes<br />

HD720p Test Sequence<br />

TU1 Dec Speed TU4 Dec Speed TU7 Dec Speed<br />

fps fps fps<br />

1 Park_joy 273.47 412.55 477.47<br />

2 Ducks_take_off 300.95 461.09 476.58<br />

3 CrowdRun 220.06 383.81 463.36<br />

4 City 300.66 512.35 564.14<br />

5 Crew 267.24 494.03 538.53<br />

6 Sailormen 283.76 504.51 561.94<br />

Average 274.36 461.39 513.67<br />

Table 4D Decoding Speed of SD/XD on <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1, TU4, and TU7 modes<br />

SD/XD Test Sequence<br />

TU1 Dec Speed TU4 Dec Speed TU7 Dec Speed<br />

fps fps fps<br />

1 BasketBallDrillText 504.41 942.65 1067.21<br />

2 PartyScene 476.69 866.48 1031.10<br />

3 RaceHorses 408.82 687.30 908.16<br />

4 City 509.83 1011.16 1134.78<br />

5 Crew 402.83 804.19 905.82<br />

6 Soccer 475.78 964.00 1099.34<br />

Average 463.06 879.30 1024.40<br />

Decoding Speed (fps)<br />

1200<br />

1000<br />

800<br />

600<br />

400<br />

200<br />

Decoding <strong>Performance</strong><br />

TU1<br />

TU4<br />

TU7<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

0<br />

SD/XD HD720p HD1080p UHD4K<br />

Resolution<br />

Figure 10 <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Decoding speed for all resolutions<br />

As can be see from Table 4A-4D, as expected, the decoding speed is inversely proportional to the spatial<br />

resolution of video sequence being decoded. Average decoding speed for 1080p resolution content is<br />

35<br />

*Other names and brands may be claimed as property of others.


oughly 4 times that of 4K resolution content (Table 4B vs Table 4A), average decoding speed of 720p<br />

resolution content is almost a factor of 1.6 of 1080p content (Table 4C vs Table 4B), and that of SD/XD<br />

resolution content is close to a factor of 2 as compared to decoding speed of HD720p content (Table 4D<br />

vs Table 4C).<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

36<br />

Further, Fig. 10 shows a combined graph of variation of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Decoder<br />

performance (fps) across various resolutions test sets. On the Haswell platform, including the rendering<br />

overhead, one 60 fps UHD4K stream or two 30 fps UHD4K streams, or up to eight HD1080p streams can<br />

be easily decoded in realtime.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoding Computational<br />

Scalability Tests (on E5 <strong>Server</strong>)<br />

For measurement of computational scalability beyond 4 core i7-4770K system used in tests thus far, an E5<br />

<strong>Server</strong> (<strong>Intel®</strong> Xeon® Processor E5-2697v2, 12 Core, 2.7 GHz, 2 processors per board, total of 24 physical<br />

cores) is employed. Currently such servers do not inherently support <strong>Intel®</strong> graphics acceleration. The<br />

purpose of this test is to demonstrate threading scalability of the Intel <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

software encoder when many more cores are available. This also addresses an application where very<br />

high resolution video (4K and above) is needed to be <strong>HEVC</strong> encoded in realtime in software on many<br />

available cores.<br />

<strong>Quality</strong> Evaluation Results for Computational Scalability Tests for 4:2:0 8-bit<br />

Using the test set of 4:2:0 8-bit content as discussed earlier <strong>via</strong> Tables 0A-0B and test procedure establish<br />

earlier, we first evaluate video quality of the <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> for the three main TU modes, ie,<br />

TU1, TU4, and TU7, for the encoding configuration used to provide high degree of scalability.<br />

The results of quality tests are shown in Tables 5A and 5B for UHD4K and HD1080p test sets respectively.<br />

Table 5A Relative <strong>Quality</strong> Evaluation Results on UHD4K test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec<br />

in TU1, TU4, and TU7 modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

UHD4K Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age<br />

Y U V Y U V Y U V<br />

1 Park_joy -0.18 1.92 -0.63 13.50 7.92 16.47 26.97 25.10 33.01<br />

2 Ducks_take_off -0.93 6.81 9.79 15.92 -16.84 6.27 28.53 -0.84 23.39<br />

3 CrowdRun 0.37 2.99 2.14 17.33 18.18 20.35 35.46 46.38 48.19<br />

4 PeopleOnStreet 0.15 -7.15 -6.22 20.65 21.06 25.66 40.38 52.82 59.84<br />

5 Traffic 0.32 -9.01 -15.64 20.86 16.15 23.76 36.52 41.70 42.65<br />

6 Nebuta_Festival 6.91 47.07 42.79 20.02 16.68 5.40 46.90 33.35 16.72<br />

Average -0.05 -0.89 -2.11 17.65 9.30 18.50 33.57 33.03 41.41


Table 5B Relative <strong>Quality</strong> Evaluation Results on HD1080p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec<br />

in TU1, TU4, and TU7 modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

HD1080p Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age<br />

Y U V Y U V Y U V<br />

1 Park_joy 1.13 5.41 1.63 13.42 16.96 27.48 26.40 32.88 42.50<br />

2 Ducks_take_off 0.55 2.78 7.08 10.49 -3.20 21.87 20.40 11.05 38.23<br />

3 CrowdRun 1.03 3.32 2.83 16.51 22.24 23.07 35.12 48.29 50.21<br />

4 TouchDownPass 0.83 -5.35 -4.83 19.31 17.53 17.80 38.34 53.74 50.80<br />

5 BQTerrace 4.06 -14.83 -22.44 26.64 24.92 31.83 49.85 41.01 61.06<br />

6 ParkScene 2.69 -6.04 -7.99 17.45 21.24 19.67 33.31 40.70 36.60<br />

Average 1.71 -2.45 -3.96 17.30 16.61 23.62 33.90 37.94 46.57<br />

From Table 5A and 5B we can see that results on encoding server configuration used for E5 for all three TU’s<br />

provide identical high video quality to the configuration tested earlier for i7 processor 4770K, 4-core used for<br />

all other tests in this white paper.<br />

Encoder <strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for Computational<br />

Scalability Tests for 4:2:0 8-bit<br />

Next, we perform performance analysis tests on the aforementioned E5 server configuration. Fig. 11A shows<br />

results of this measurement which when compared to earlier tests on i7 4770k configuration for TU1 quality<br />

demonstrates a speedup factor of 3.5x for UHD content and 2.1x for HD content. This also shows the<br />

benefits of high throughput server configuration for higher resolution (UHD) content. The E5 server<br />

configuration also shows much reduced load on the cores, allowing additional similar encodes to run on the<br />

server - this item is discussed further a little later <strong>via</strong> Figure 12.<br />

Encoding Speed (fps)<br />

5.0<br />

4.5<br />

4.0<br />

3.5<br />

3.0<br />

2.5<br />

2.0<br />

1.5<br />

1.0<br />

0.5<br />

0.0<br />

<strong>Performance</strong> Comparison of <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1 Mode and HM14.0<br />

4.34 (289x)<br />

HD1080p<br />

1.39 (347x)<br />

0.015 0.004<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>HEVC</strong> HM14.0<br />

UHD4K<br />

Figure 11A Average encoding speed comparison of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode with HM14<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

37<br />

Next, Table 6A and 6B show encoding speed measurements for each sequence of UHD4K and HD1080p test sets.<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 6A Encoding Speed of UHD4K test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and<br />

TU7 modes. For reference the average Speed of HM14 Encoder for this test set is .004 fps.<br />

UHD4K Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed<br />

fps fps fps<br />

1 Park_joy 1.24 8.80 24.65<br />

2 Ducks_take_off 1.01 7.23 23.19<br />

3 CrowdRun 1.37 11.39 27.28<br />

4 PeopleOnStreet 1.38 11.68 22.72<br />

5 Traffic 1.93 13.00 27.87<br />

6 Nebuta_Festival* 2.83 24.23 56.40<br />

Average 1.39 10.42 25.14<br />

* For average encoder speed calculation, for all TU’s Nebuta_Festival is excluded as its size is 2560x1600 while<br />

others are of size 4K (3840x2160).<br />

Table 6B Encoding Speed of HD1080p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and<br />

TU7 modes. For reference the average speed of HM14 Encoder for this test set is .015 fps.<br />

HD1080p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed<br />

fps fps fps<br />

1 Park_joy 3.24 20.59 64.65<br />

2 Ducks_take_off 2.56 17.61 66.53<br />

3 CrowdRun 2.92 21.42 70.82<br />

4 TouchDownPass 4.71 43.83 106.82<br />

5 BQTerrace 7.52 48.70 118.64<br />

6 ParkScene 5.08 37.17 98.57<br />

Average 4.34 31.55 87.67<br />

In comparing these results on E5 compared to i7 4770k configuration for which measurements were<br />

shown earlier in Table 3A and 3B, we observe that for UHD4K test set, the speed gain factor of the E5<br />

configuration over the i7 4770k configuration is 3.5, 3.3, and 3.0 for TU1, TU4, and TU7 respectively. The<br />

same comparison for HD1080p test set shows that the speed gain factor of the E5 configuration over the<br />

i7 4770k configuration is 2.1, 2.2, and 2.4 for TU1, TU4, and TU7 respectively.<br />

A side-by-side comparison on E5 of actual encoding speed in fps of TU1, TU4, TU7, both for UHD4K and<br />

HD1080p test sets is shown in Figure 11B.<br />

38


Encoding <strong>Performance</strong><br />

Encoding Speed (fps)<br />

140<br />

120<br />

100<br />

80<br />

60<br />

40<br />

20<br />

0<br />

Figure 11B Average encoding speed of all test sets on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder<br />

Fig. 12 shows the actual CPU load on E5 configuration which appears to be in 45-70% range as compared to<br />

around 95% observed for the i7 4770k configuration as per Figure 8, discussed earlier.<br />

CPU Load (%)<br />

80<br />

70<br />

60<br />

50<br />

40<br />

30<br />

20<br />

10<br />

0<br />

HD1080p<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

UHD4K<br />

CPU load on E5 for <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

HD1080p<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

UHD4K<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 12 CPU load on E5 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU7 (fastest cpu only mode)<br />

39<br />

*Other names and brands may be claimed as property of others.


<strong>Quality</strong> vs <strong>Performance</strong> Y<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

BDRate %age dif wrt HM14.0<br />

5<br />

0<br />

-5<br />

-10<br />

-15<br />

-20<br />

-25<br />

-30<br />

-35<br />

-40<br />

0 10 20 30 40 50 60 70 80 90 100<br />

TU1<br />

TU4<br />

TU7<br />

TU4<br />

Encoding Speed (fps)<br />

HD1080p<br />

UHD4K<br />

Figure 13 <strong>Quality</strong> vs Encoding Speed Tradeoffs of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoding modes<br />

Finally, Figure 13 shows quality vs performance tradeoffs of the E5 configuration showing that it can almost<br />

reach real-time encoding of 4:2:0 8-bit UHD4K content.<br />

TU7<br />

40


<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> R7 results - Part 2<br />

4:2:0 10-bit Coding <strong>Quality</strong>, and <strong>Performance</strong><br />

tests<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

41<br />

*Other names and brands may be claimed as property of others.


<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong><br />

Evaluation for 4:2:0 10-bit Content<br />

In this section, we first describe a test methodology employed for evaluating quality and then report on<br />

relative quality measurements for the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec with respect to MPEG <strong>HEVC</strong> HM14<br />

Codec, well known as a high quality reference albeit impractically slow.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Quality</strong> Evaluation Tests for 4:2:0 10-bit Content<br />

For the purpose of quality evaluation, two test sets for 4:2:0 10-bit content, one for 1080p, and the other<br />

for 1600p were employed. Each test set is composed of 3 publicly available challenging video test<br />

sequences of 10-bit bit-depth. Further for each test sequence, 4 well-spaced quantizer (QP) values are<br />

selected such that <strong>HEVC</strong> encoding for each sequence generates bit-rates in a wide but moderate range,<br />

avoiding extreme values.<br />

Specifically, Table 7A shows selected Ultra Definition 1600p (UHD1600p) test set, and Table 7B shows<br />

<strong>High</strong> Definition 1080p (HD1080p) test set.<br />

Table 7A UHD1600p Test Set for 4:2:0 10-bit and Quantizers used for Codec RD characteristics measurement<br />

Quantizers used for RD char.<br />

UHD1600p Test Sequence Resolution frm rate #frm Qp1 Qp2 Qp3 Qp4<br />

1 Nebuta_Festival 2560x1600 60 300 22 27 32 37<br />

2 SteamLocomotiveTrain 2560x1600 60 300 22 27 32 37<br />

3 ReadySetGo 2560x1600 60 300 22 27 32 37<br />

From UHD1600p test sequences referred to in Table 7A, Nebuta_Festival, and SteamLocomotiveTrain are<br />

MPEG <strong>HEVC</strong> test sequences of 2560x1600 resolution that can be obtained from MPEG distribution sites<br />

such as ftp://hvc:US88Hula@ftp.tnt.uni-hannover.de/testsequences. Further, the last sequence,<br />

ReadySetGo, of this test set is also of size 2560x1600 but obtained by windowing into an original<br />

3840x2160 resolution video using an offset of (764, 560) for top left hand corner - this original video can<br />

be found at http://ultravideo.cs.tut.fi/#testsequences.<br />

Table 7B HD1080p Test Set for 4:2:0 10 bit and Quantizers used for Codec RD characteristics measurement<br />

HD1080p Test Sequence Resolution frm rate #frm<br />

Quantizers used for RD char.<br />

Qp1 Qp2 Qp3 Qp4<br />

1 CrowdRun 1920x1080 50 500 26 30 34 38<br />

2 ParkScene 1920x1080 24 240 23 26 29 32<br />

3 Seeking 1920x1080 50 500 22 27 32 37<br />

42<br />

From HD1080p test sequences referred to in Table 7B, all three sequences CrowdRun, ParkScene, and<br />

Seeking are of 1920x1080 resolution and can be obtained from http://media.xiph.org/video/derf/.


<strong>Quality</strong> Evaluation Results for 4:2:0 10-bit Content<br />

Before discussing detailed BD rate differences of each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode wrt HM14, we first<br />

establish the quality/bitrate tradeoff of HM 14 that will be used as reference.<br />

UHD1600p Test<br />

Sequence<br />

First, each video sequence of each test set (such as UHD1600p, HD1080p) is encoded using MPEG <strong>HEVC</strong><br />

HM14 Reference Encoder with each of 4 quantizers as specified in Tables 7A-7B. The overall PSNR<br />

(averaged over frames) for each test sequence for each Qp for each of luma (Y), and associated chroma<br />

components (U and V) is collected along with the generated coding bitrate. For instance, Tables 8A-8B<br />

show the luma PSNR (chroma PSNR is also collected but is not shown to keep tables manageable in size)<br />

and total bitrate for each test sequence for each of 4 Qps is collected and is used in curve fitting to<br />

generate a continuous RD curve for HM14 encoding of that test sequence.<br />

Next, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in each of its 3 main modes: TU1, TU4, and TU7 is<br />

utilized to perform encoding of each test sequence with each of 4 quantizers (of Table 7A-7B) used in HM<br />

14 encoding, and PSNRs of Y, U, and V, and total bitrates are collected for each test.<br />

Table 8A PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on 4:2:0 10-bit UHD1600p Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB Y kbps dB Y kbps<br />

1 Nebuta_Festival 30.77 48676.1 28.96 19640.4 28.11 8011.4 27.55 3678.2<br />

2 SteamLocomotive 41.53 22431.0 40.04 5523.4 38.84 2232.8 37.07 1066.4<br />

3 ReadySetGo 41.78 24443.7 40.04 11730.2 37.80 6343.4 35.31 3663.6<br />

HD1080p Test<br />

Sequence<br />

Table 8B PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on 4:2:0 10-bit HD1080p Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB, Y kbps dB, Y kbps<br />

1 CrowdRun 35.13 25614.2 32.62 13979.3 30.24 7652.9 28.13 4301.7<br />

2 ParkScene 39.59 6432.6 37.98 3467.1 36.37 2064.3 34.73 1195.5<br />

3 Seeking 37.29 40340.7 34.92 12963.8 32.60 5528.0 30.20 2556.7<br />

For <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in each encoding mode, the 4 collected bitrates and corresponding 4<br />

PSNRs (each corresponding to one Qp) are used in curve fitting to generate a continuous RD curve for<br />

each encoding mode. Further, for each test sequence and for each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode, the<br />

corresponding <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> RD curve is compared to HM14 RD curve and a BD rate percentage is<br />

computed that reflects the difference in quality between specific <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode with<br />

respect to HM14 reference. The BD rate provides a measure of departure of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU<br />

mode’s quality in percentage bits with respect to the HM14 reference.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 8A-8B show the measured BD rate percentage bitrate for each of luma and chroma components for<br />

each test sequence of each test set of Table 7A-7B for each of 3 TU modes being evaluated. Fig. 14A1-14B2<br />

compares the RD characteristics of two codecs for each test set for TU1 and TU4 modes where the<br />

difference in quality is the highest and the lowest wrt HM14 reference.<br />

43<br />

*Other names and brands may be claimed as property of others.


For instance, Table 9A shows on UHD1600p test set, the BD rate percentage difference of <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> <strong>HEVC</strong> in various TU modes wrt HM14 reference. As can be observed on UHD1600p test set, the<br />

average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec over HM14 (an ideal<br />

reference), is 5.38%, 23.29%, and 44.69%, higher respectively in TU1, TU4, and TU7 modes. This means that<br />

for UHD1600p test set, the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> codec as compared to HM14, requires around 5% higher<br />

bitrate in TU1 mode, 23% higher bitrate in TU4 mode, and 45% higher bitrate in TU7 mode to achieve the<br />

same luma PSNR quality as HM14.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 9A Relative <strong>Quality</strong> Evaluation Results on 4:2:0 10-bit UHD1600p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Software Codec in TU1, TU4, and TU7 modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

UHD1600p Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age<br />

Y U V Y U V Y U V<br />

1 Nebuta_Festival 8.02 23.01 22.22 22.61 6.06 -0.65 45.00 26.50 14.37<br />

2 SteamLocomotive 2.96 -17.32 -13.10 22.63 6.49 7.87 45.07 27.01 25.16<br />

3 ReadySetGo 5.17 3.70 2.91 24.65 17.14 19.46 43.98 47.26 49.43<br />

Average 5.38 3.13 4.01 23.29 9.90 8.89 44.69 33.59 29.65<br />

Further, Table 9A shows on UHD1600p test set that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec TU1 mode with<br />

respect to HM14 reference, results in the biggest luma BD rate percentage difference on Nebuta_Festival<br />

sequence and the smallest luma BD rate percentage difference on SteamLocomotive sequence; the<br />

corresponding RD comparison curves for TU1 mode for the two cases shown in Fig. 14A1.<br />

PSNR (dB)<br />

31<br />

30<br />

29<br />

28<br />

Nebuta_Festival Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

PSNR (dB)<br />

42<br />

41<br />

40<br />

39<br />

38<br />

Steam_Locomotive Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

27<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

37<br />

0 10000 20000 30000<br />

BitRate (kbps)<br />

Figure 14A1 RD results of 4:2:0 10-bit UHD1600p scenes with the biggest and the smallest quality<br />

difference wrt HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

44<br />

Table 9A also shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode with respect to HM14 reference,<br />

results in the biggest luma BD rate percentage difference on ReadySetGo sequence and the smallest luma


BD rate percentage difference on sequence Nebuta_Festival the corresponding RD comparison curves for<br />

TU4 mode for the two cases are shown in Fig. 14A2.<br />

PSNR (dB)<br />

43<br />

42<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

35<br />

34<br />

Figure 14A2 RD results of 4:2:0 10-bit UHD1600p scenes with the biggest and the smallest quality<br />

difference wrt HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

Next, Table 9B shows BD rate percentage differences of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> wrt HM14 on the<br />

HD1080p test set. The average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

over HM14 reference, is 2.9%, 9.8%, and 33.7% respectively in TU1, TU4, and TU7 modes.<br />

Table 9B Relative <strong>Quality</strong> Evaluation Results on 4:2:0 10-bit HD1080p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Software Codec in TU1, TU4, and TU7 modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

HD1080p Test<br />

Sequence<br />

ReadySetGo Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 10000 20000 30000<br />

BitRate (kbps)<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age<br />

Y U V Y U V Y U V<br />

1 CrowdRun 2.19 1.21 0.72 15.88 19.59 21.04 34.75 45.79 48.36<br />

2 ParkScene 3.08 1.80 8.67 15.88 20.53 30.33 30.80 41.00 47.29<br />

3 Seeking 3.26 -0.79 -1.38 -2.31 -1.49 -0.43 35.42 39.99 40.70<br />

Average 2.85 0.74 2.67 9.82 12.87 16.98 33.66 42.26 45.45<br />

PSNR (dB)<br />

31<br />

30<br />

29<br />

28<br />

27<br />

Nebuta_Festival Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Further, Table 9B shows on HD1080p test set that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1 mode with<br />

respect to HM14 reference, results in the biggest luma BD rate percentage difference on Seeking 1080p<br />

sequence and the smallest luma BD rate percentage difference on CrowdRun 1080p sequence; the<br />

corresponding RD comparison curves for TU1 mode for the two cases are shown in Fig. 14B1.<br />

45<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

PSNR (dB)<br />

38<br />

37<br />

36<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

Figure 14B1 RD results of 4:2:0 10-bit HD1080p scenes with the biggest and the smallest quality difference<br />

wrt HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

Table 9B also shows that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode with respect to HM14 reference,<br />

results in the biggest luma BD rate percentage difference on CrowdRun sequence and the smallest BD<br />

rate percentage difference for luma on Seeking sequence; the corresponding comparison curves for TU4<br />

for the two cases are shown in Fig. 14B2.<br />

PSNR (dB)<br />

36<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

Seeking Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

CrowdRun Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 10000 20000 30000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

PSNR (dB)<br />

36<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

38<br />

37<br />

36<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

29<br />

CrowdRun Y<br />

<strong>HEVC</strong> HM14.0<br />

0 10000 20000 30000<br />

BitRate (kbps)<br />

Seeking Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

Figure 14B2 RD results of 4:2:0 10-bit HD1080p scenes with the biggest and the smallest quality difference<br />

wrt HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

46<br />

To summarize, for 4:2:2 8-bit content the luma BD rate percentage bitrate difference of the <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> <strong>HEVC</strong> in various modes with respect to an ideal (but slow) HM14 reference is as follows.<br />

<br />

TU1 mode on average is only 2 to 4% worse in BD rate as compared to HM14 on all test sets. This is<br />

highest quality mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.


TU4 mode is -2 - 16% worse in BD rate as compared to HM14 0. As shown later, this mode provides an<br />

excellent tradeoff of quality vs speed.<br />

TU7 mode is 30 - 36% worse in BD rate as compared to HM14. As shown later, this is the fastest<br />

software only mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

Now that we have completed quality analysis of various TU modes of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>, the next<br />

obvious step is to perform analysis of encoding speed offered by each of these modes; this issue is<br />

discussed at length in the next section.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong><br />

Tradeoffs for 4:2:0 10-bit Content<br />

For measurement of encoding speed (fps) and speed vs quality tradeoffs, a recently released reference<br />

PC Platform (<strong>Intel®</strong> Core i7-4770K CPU @ 3.5 GHz – 4 Cores/8Threads) is employed.<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:0 10-bit Content<br />

We first measure encoding speed (fps) of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in TU1 (highest<br />

quality) mode and MPEG <strong>HEVC</strong> HM14 on different resolution test sets. Results of these measurements<br />

comparing the two speeds are shown in Fig. 15A.<br />

Encoding Speed (fps)<br />

1.6<br />

1.4<br />

1.2<br />

1.0<br />

0.8<br />

0.6<br />

0.4<br />

0.2<br />

0.0<br />

<strong>Performance</strong> Comparison of <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1 Mode and HM14.0<br />

1.38 (98x)<br />

HD1080p<br />

0.89 (81x)<br />

0.014 0.011<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>HEVC</strong> HM14.0<br />

UHD1600p<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 15A Average encoding speed comparison of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode with HM14<br />

From Fig. 15A it can be seen that the encoding speed of the TU1 mode is 80 to 100 times (multithreaded<br />

on 4 cores) the speed of HM14 (single threaded/1 core) for the two resolution test sets.<br />

47<br />

Next in Table 10A-10B we show measurement of encoding speed of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software<br />

Encoder on each sequence of the two resolution test sets in each of the three TU1, TU4, and TU7 modes.<br />

*Other names and brands may be claimed as property of others.


Table 10A Encoding Speed of 4:2:0 10-bit UHD1600p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and<br />

TU7 modes. For reference the average Speed of HM14 Encoder for this test set is .011 fps.<br />

UHD1600p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed<br />

fps fps fps<br />

1 Nebuta_Festival 0.85 5.97 14.91<br />

2 SteamLocomotive 1.18 7.68 20.75<br />

3 ReadySetGo 0.64 5.76 14.92<br />

Average 0.89 6.47 16.86<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 10B Encoding Speed of 4:2:0 10-bit HD1080p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and<br />

TU7 modes. For reference the average speed of HM14 Encoder for this test set is .014 fps.<br />

HD1080p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed<br />

fps fps fps<br />

1 CrowdRun 1.12 8.72 22.32<br />

2 ParkScene 1.91 12.14 31.32<br />

3 Seeking 1.10 8.70 23.00<br />

Average 1.38 9.85 25.55<br />

As can be seen from Table 10A-10B the encoding speed of TU4 mode is around 7.2x of that of TU1 mode,<br />

and that the encoding speed of TU7 mode is 2.6 times that of TU4 mode. In other words TU7 mode<br />

reflects an overall speed of around 19 times that of TU1 mode. Also, since TU1 (for 4 core multithreaded<br />

coding) is typically around 98 times the speed of HM, TU7 reflects a typical speedup factor of around 1700<br />

over HM14!<br />

A side-by-side comparison of encoding speed of different TU modes and for the 2 different resolution test<br />

sets is shown next by the bar graphs of Fig. 15B.<br />

48


Encoding Speed (fps)<br />

30<br />

25<br />

20<br />

15<br />

10<br />

5<br />

0<br />

HD1080p<br />

Encoding <strong>Performance</strong><br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

UHD1600p<br />

Figure 15B Average encoding speed of all 4:2:0 10-bit test sets on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder<br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:0 10-bit Content<br />

We now discuss results of Codec <strong>Quality</strong> vs Encoding <strong>Performance</strong> tradeoffs for all modes for <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder for all 4 resolutions tested.<br />

Figure 16 shows a comparison of <strong>Quality</strong> (negative BD rate percentage difference wrt <strong>HEVC</strong> HM14) vs<br />

Encoding <strong>Performance</strong> (fps) for each of the two resolution test sets at each of the three TU1, TU4, and<br />

TU7 encoding modes. The y-axis basically shows the quality difference in terms of loss of BD rate<br />

percentage difference in the process of increasing speed up of the encoder in going from TU1 to TU4 to<br />

TU7 operating points.<br />

To summarize, for 4:2:0 10-bit content, encoding performance-wise <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in<br />

different modes achieves the following speedup of <strong>HEVC</strong> encoding on 4 core Reference PC platform<br />

specified earlier.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

49<br />

*Other names and brands may be claimed as property of others.


<strong>Quality</strong> vs <strong>Performance</strong> Y<br />

0<br />

0 5 10 15 20 25 30<br />

TU1<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<br />

<br />

BDRate %age dif wrt HM14.0<br />

-5<br />

-10<br />

-15<br />

-20<br />

-25<br />

-30<br />

-35<br />

-40<br />

-45<br />

-50<br />

TU4<br />

TU4<br />

Figure 16 <strong>Quality</strong> vs Encoding Speed Tradeoffs for 4:2:0 10-bit content for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoding TU<br />

modes<br />

For 4:2:0 10-bit UHD1600p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, and TU7 modes,<br />

correspondingly on average provides 0.89 fps, 6.47 fps, and 16.86 fps. This reflects corresponding<br />

speedup factors wrt HM14 (single threaded) of 81, 588, and 1533.<br />

For 4:2:0 10-bit HD1080p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, and TU7 modes,<br />

correspondingly on average provides 1.38 fps, 9.85 fps, and 25.55 fps. This reflects corresponding<br />

speedup factors wrt HM14 (single threaded) of 98, 704, and 1825.<br />

TU7<br />

Encoding Speed (fps)<br />

HD1080p<br />

UHD1600p<br />

TU7<br />

50


<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> R7 results - Part 3<br />

4:2:2 8-bit Coding <strong>Quality</strong>, and <strong>Performance</strong><br />

tests<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

51<br />

*Other names and brands may be claimed as property of others.


<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong><br />

Evaluation for 4:2:2 8-bit Content<br />

In this section, we first describe a test methodology employed for evaluating quality and then report on<br />

relative quality measurements for the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec with respect to MPEG <strong>HEVC</strong> HM14<br />

Codec, well known as a high quality reference albeit impractically slow.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Quality</strong> Evaluation Tests for 4:2:2 8-bit Content<br />

For the purpose of quality evaluation, two test sets of 4:2:2 8-bit content, one for 1080p, and the other<br />

one for 720p resolution are used with each test set consisting of 3 publicly available challenging video test<br />

sequences. Further for each test sequence, 4 well-spaced quantizer (QP) values are selected such that<br />

<strong>HEVC</strong> encoding for each sequence generates bit-rates in a wide but moderate range, avoiding extreme<br />

values.<br />

Specifically, Table 11A shows selected <strong>High</strong> Definition (HD1080p) test set, and Table 11B shows<br />

Intermediate Definition 720p (720p) test set.<br />

Table 11A HD1080p Test Set and Quantizers used for Codec RD characteristics measurement<br />

HD1080p Test Sequence Resolution frm rate #frm<br />

Quantizers used for RD char.<br />

Qp1 Qp2 Qp3 Qp4<br />

1 RushFieldCuts 1920x1080 30 570 22 27 32 37<br />

2 Red_Kayak 1920x1080 30 570 22 27 32 37<br />

3 TouchDownPass 1920x1080 30 570 23 26 30 34<br />

From 4:2:2 8-bit 1080p test sequences referred to in Table 11A, all three sequences, ie, RushFieldCuts,<br />

Red_Kayak, and TouchDownPass can be obtained from http://media.xiph.org/video/derf/.<br />

Table 11B 720p Test Set and Quantizers used for Codec RD characteristics measurement<br />

720p Test Sequence Resolution frm rate #frm<br />

Quantizers used for RD char.<br />

Qp1 Qp2 Qp3 Qp4<br />

1 CrowdRun 1280x720 50 497 26 30 34 38<br />

2 Ducks_take_off 1280x720 50 500 28 31 35 37<br />

3 Park_joy 1280x720 50 500 26 29 33 37<br />

52<br />

From 4:2:2 8-bit 720p test sequences referred to in Table 11B, all three sequences CrowdRun,<br />

Ducks_take_off, and Park_joy can be obtained from http://media.xiph.org/video/derf/.


<strong>Quality</strong> Evaluation Results for 4:2:2 8-bit Content<br />

Before discussing detailed BD rate differences of each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode wrt HM14, we first<br />

establish the quality/bitrate tradeoff of HM 14 that will be used as reference.<br />

HD1080p Test<br />

Sequence<br />

First, each video sequence of each test set (such as HD1080p, 720p) is encoded using MPEG <strong>HEVC</strong> HM14<br />

Reference Encoder with each of 4 quantizers as specified in Tables 11A-11B. The overall PSNR (averaged<br />

over frames) for each test sequence for each Qp for each of luma (Y), and associated chroma<br />

components (U and V) is collected along with the generated coding bitrate. For instance, Tables 12A-12B<br />

show the luma PSNR (chroma PSNR is also collected but is not shown to keep tables manageable in size)<br />

and total bitrate for each test sequence for each of 4 Qps is collected and is used in curve fitting to<br />

generate a continuous RD curve for HM14 encoding of that test sequence.<br />

Next, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in each of its 4 main modes: TU1, TU4, TU7, and<br />

TU7gacc is utilized to perform encoding of each test sequence with each of 4 quantizers (of Table 11A-11B)<br />

used in HM 14 encoding, and PSNRs of Y, U, and V, and total bitrates are collected for each test.<br />

Table 12A PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on 4:2:2 8-bit HD1080p Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB Y kbps dB Y kbps<br />

1 RushFieldCuts 40.02 19450.9 37.25 7191.2 34.70 3209.9 32.16 1589.2<br />

2 Red_Kayak 41.89 16506.7 38.76 7987.6 35.62 3685.4 32.88 1653.7<br />

3 TouchDownPass 40.42 5995.0 39.34 2984.7 37.80 1464.8 36.13 791.4<br />

720p Test<br />

Sequence<br />

Table 12B PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on 4:2:2 8-bit 720p Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB, Y kbps dB, Y kbps<br />

1 CrowdRun 34.50 16213.2 31.63 8760.9 29.05 4684.5 26.84 2492.8<br />

2 Ducks_take_off 32.29 15641.9 30.42 8978.0 28.21 4205.4 27.24 2867.8<br />

3 Park_joy 33.90 19524.1 31.64 11946.9 28.87 5927.4 26.52 2861.1<br />

For <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in each encoding mode, the 4 collected bitrates and corresponding 4<br />

PSNRs (each corresponding to one Qp) are used in curve fitting to generate a continuous RD curve for<br />

each encoding mode. Further, for each test sequence and for each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode, the<br />

corresponding <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> RD curve is compared to HM14 RD curve and a BD rate percentage is<br />

computed that reflects the difference in quality between specific <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode with<br />

respect to HM14 reference. The BD rate provides a measure of departure of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU<br />

mode’s quality in percentage bits with respect to the HM14 reference.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 13A-13B show the measured BD rate percentage bitrate for each of luma and chroma components<br />

for each test sequence of each test set of Table 11A-11B for each of 4 TU modes being evaluated. Fig. 17A1-<br />

17B2 compares the RD characteristics of two codecs for each test set for TU1 and TU4 modes where the<br />

difference in quality is the highest and the lowest wrt HM14 reference.<br />

53<br />

*Other names and brands may be claimed as property of others.


For instance, Table 13A shows on HD1080p test set, the BD rate percentage difference of <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> <strong>HEVC</strong> in various TU modes wrt HM14 reference. As can be observed on HD1080p test set, the<br />

average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec over HM14 (an ideal<br />

reference), is 5.1%, 16.9%, 33.7%, and 37.8%, higher respectively in TU1, TU4, TU7, and TU7gacc modes. This<br />

means that for HD1080p test set, the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> codec as compared to HM14, requires around<br />

5% higher bitrate in TU1 mode, 17% higher bitrate in TU4 mode, 34% higher bitrate in TU7 mode, and 38%<br />

higher bitrate in TU7gacc mode to achieve the same luma PSNR quality as HM14.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 13A Relative <strong>Quality</strong> Evaluation Results on 4:2:2 8-bit HD1080p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Software Codec in TU1, TU4, TU7, and TU7gacc modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

HD1080p Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age TU7gacc BD rate, %age<br />

Y U V Y U V Y U V Y U V<br />

1 RushFieldCuts 5.97 -8.21 -7.79 17.62 13.59 17.81 35.61 41.99 51.07 38.21 43.48 52.77<br />

2 Red_Kayak 3.19 -17.06 -13.96 12.44 -9.92 3.45 25.06 8.92 30.58 26.46 11.50 31.31<br />

3 TouchDownPass 6.22 -4.51 -4.29 20.67 18.93 17.65 40.45 57.78 54.72 48.83 61.28 58.16<br />

Average 5.12 -9.92 -8.68 16.91 7.53 12.97 33.71 36.23 45.46 37.83 38.75 47.41<br />

Further, Table 13A shows on 4:2:2 8-bit HD1080p test set that <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec TU1 mode<br />

with respect to HM14 reference, results in the biggest luma BD rate percentage difference on<br />

TouchdownPass sequence and the smallest luma BD rate percentage difference on Red_Kayak sequence;<br />

the corresponding RD comparison curves for TU1 mode for<br />

the two cases shown in Fig. 17A1.<br />

PSNR (dB)<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

TouchdownPass Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

PSNR (dB)<br />

44<br />

42<br />

40<br />

38<br />

36<br />

34<br />

32<br />

Red_Kayak Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

35<br />

0 2000 4000 6000 8000<br />

BitRate (kbps)<br />

30<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

Figure 17A1 RD results of 4:2:2 8-bit HD1080p scenes with the biggest and the smallest quality difference<br />

wrt HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

54<br />

Table 13A also shows that on 4:2:2 8-bit HD1080p test set <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode<br />

with respect to HM14 reference, results in the biggest luma BD rate percentage difference on<br />

Touchdown_Pass sequence and the smallest luma BD rate percentage difference on Red_Kayak sequence;<br />

the corresponding RD comparison curves for TU4 mode for the two cases are shown in Fig. 17A2.


720p Test<br />

Sequence<br />

PSNR (dB)<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

35<br />

TouchdownPass Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 2000 4000 6000 8000<br />

BitRate (kbps)<br />

Figure 17A2 RD results of 4:2:2 8-bit 1080p scenes with the biggest and the smallest quality difference wrt<br />

HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

Next, Table 13B shows BD rate percentage differences of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> wrt HM14 on 4:2:2 8-<br />

bit 720p test set. The average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

over HM14 reference, is 4.7%, 14.6%, 28.7%, and 30.8% respectively in TU1, TU4, TU7, and TU7 gacc modes.<br />

Table 13B Relative <strong>Quality</strong> Evaluation Results on 4:2:2 8-bit 720p test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Software Codec in TU1, TU4, TU7, and TU7gacc modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age TU7gacc BD rate, %age<br />

Y U V Y U V Y U V Y U V<br />

1 CrowdRun 4.61 -7.55 -7.60 17.88 17.30 18.73 36.66 37.36 40.45 38.59 38.98 41.99<br />

2 Ducks_take_off 3.78 -0.97 -3.67 10.83 5.80 29.07 21.43 16.91 41.06 23.07 18.75 43.14<br />

3 Park_joy 5.80 -4.18 -8.04 14.93 15.20 25.00 28.10 27.30 39.22 30.87 29.72 41.60<br />

Average 4.73 -4.24 -6.43 14.55 12.77 24.27 28.73 27.19 40.24 30.84 29.15 42.24<br />

PSNR (dB)<br />

Further, Table 13B shows that on 4:2:2 8-bit 720p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU1 mode<br />

with respect to HM14 reference, results in the biggest luma BD rate percentage difference on Park_Joy<br />

sequence and the smallest luma BD rate percentage difference on Ducks_take_off sequence; the<br />

corresponding RD comparison curves for TU1 mode for the two cases are shown in Fig. 17B1.<br />

44<br />

42<br />

40<br />

38<br />

36<br />

34<br />

32<br />

30<br />

Red_Kayak Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

55<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

PSNR (dB)<br />

Figure 17B1 RD results of 4:2:2 8-bit HD720p scenes with the biggest and the smallest quality difference<br />

wrt HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

Table 13B also shows that on 4:2:2 8-bit 720p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode with<br />

respect to HM14 reference, results in the biggest luma BD rate percentage difference on CrowdRun<br />

sequence and the smallest BD rate percentage difference for luma on Ducks_take_off sequence; the<br />

corresponding comparison curves for TU4 for the two cases are shown in Fig. 17B2.<br />

PSNR (dB)<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

26<br />

35<br />

34<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

26<br />

park_joy Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 10000 20000 30000<br />

BitRate (kbps)<br />

CrowdRun Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

PSNR (dB)<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

33<br />

32<br />

31<br />

30<br />

29<br />

28<br />

27<br />

26<br />

ducks_Take_off Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

ducks_Take_off Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 5000 10000 15000 20000<br />

BitRate (kbps)<br />

Figure 17B2 RD results of 4:2:2 8-bit 720p scenes with the biggest and the smallest quality difference wrt<br />

HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

To summarize, for 4:2:2 8-bit content the luma BD rate percentage bitrate difference of the <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> <strong>HEVC</strong> in various modes with respect to an ideal (but slow) HM14 reference is as follows.<br />

56<br />

<br />

<br />

TU1 mode is only around 5% worse in BD rate as compared to HM14 on all test sets. This is highest<br />

quality mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

TU4 mode is 14.5 - 17% worse in BD rate as compared to HM14. As shown later, this mode provides an<br />

excellent tradeoff of quality vs speed.


TU7 mode is 21 - 37% worse in BD rate as compared to HM14. As shown later, this is the fastest<br />

software only mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

TU7gacc mode is 31 - 38% worse in BD rate as compared to HM14. As shown later, this is the fastest<br />

mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

Now that we have completed quality analysis of various TU modes of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>, the next<br />

obvious step is to perform analysis of encoding speed offered by each of these modes; this issue is<br />

discussed at length in the next section.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong><br />

Tradeoffs for 4:2:2 8-bit Content<br />

For measurement of encoding speed (fps) and speed vs quality tradeoffs, a recently released reference<br />

PC Platform (<strong>Intel®</strong> Core i7-4770K CPU @ 3.5 GHz – 4 Cores/8Threads) is employed.<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:2 8-bit Content<br />

We first measure encoding speed (fps) of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in TU1 (highest<br />

quality) mode and MPEG <strong>HEVC</strong> HM14 on different resolution test sets. Results of these measurements<br />

comparing the two speeds are shown in Fig. 18A.<br />

Encoding Speed (fps)<br />

3.0<br />

2.5<br />

2.0<br />

1.5<br />

1.0<br />

0.5<br />

0.0<br />

<strong>Performance</strong> Comparison of <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1 Mode and HM14.0<br />

2.56 (64x)<br />

HD720p<br />

1.25 (63x)<br />

0.040 0.020<br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>HEVC</strong> HM14.0<br />

HD1080p<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 18A Average encoding speed comparison of 4:2:2 8-bit content on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

with HM14<br />

From Fig. 18A it can be seen that the encoding speed of the TU1 mode is around 61 times (multithreaded<br />

on 4 cores) the speed of HM14 (single threaded/1 core) for the two resolution test sets.<br />

57<br />

*Other names and brands may be claimed as property of others.


Table 14A Encoding Speed of 4:2:2 8-bit HD1080p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and TU7<br />

modes. For reference the average Speed of HM14 Encoder for this test set is 0.020 fps.<br />

HD1080p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed TU7gacc Enc Speed<br />

fps fps fps fps<br />

1 RushFieldCuts 1.18 11.89 31.10 41.23<br />

2 Red_Kayak 0.83 10.20 27.36 38.73<br />

3 TouchDownPass 1.74 16.33 43.50 55.16<br />

Average 1.25 12.81 33.99 45.04<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 14B Encoding Speed of 4:2:2 8-bit 720p test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and TU7<br />

modes. For reference the average speed of HM14 Encoder for this test set is 0.040 fps.<br />

720p Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed TU7gaccEnc Speed<br />

fps fps fps fps<br />

1 CrowdRun 2.39 18.67 47.68 65.51<br />

2 Ducks_take_off 2.40 17.52 48.19 63.99<br />

3 Park_joy 2.87 19.37 47.56 64.31<br />

Average 2.56 18.52 47.81 64.60<br />

As can be seen from Table 14A-14B the encoding speed of TU4 mode is in around 8.7 times that of TU1<br />

mode, encoding speed of TU7 mode is 2.4 times that of TU4 mode, and encoding speed of TU7gacc mode<br />

is around 1.3 times that of TU7 mode. In other words TU7gacc mode reflects an overall speed of around<br />

28 times that of TU1 mode. Also, since TU1 for 4 core muli-threaded coding is typically around 63 times the<br />

speed of HM, TU7gacc reflects a typical speedup factor of around 1800 over HM14!<br />

A side-by-side comparison of encoding speed of different TU modes and for the 2 different resolution test<br />

sets is shown next by the bar graphs of Fig. 18B.<br />

58


Encoding Speed (fps)<br />

Figure 18B Average encoding speed of all 4:2:2 8-bit test sets on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software<br />

Encoder<br />

Fig. 19 shows results of TU7gacc as compared to TU7 in terms of its capability of reducing the load on the<br />

CPU. For 1080p content TU7gcac is currently able to reduce the load on CPU by a few percent from<br />

enabling CPU to be used by other applications. This is in addition to around 32% encoding speedup that<br />

TU7gacc provides over TU7 while achieving identical video quality.<br />

CPU Load (%)<br />

70<br />

60<br />

50<br />

40<br />

30<br />

20<br />

10<br />

0<br />

110<br />

100<br />

90<br />

80<br />

70<br />

HD720p<br />

Encoding <strong>Performance</strong><br />

Resolution<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7gacc<br />

HD1080p<br />

CPU load Comparison between <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7 and TU7gacc<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7gacc<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

60<br />

50<br />

HD720p<br />

Resolution<br />

HD1080p<br />

59<br />

Figure 19 CPU load difference between TU7 and TU7 gacc modes<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:2 8-bit Content<br />

We now discuss results of Codec <strong>Quality</strong> vs Encoding <strong>Performance</strong> tradeoffs for all modes for <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder for all 4 resolutions tested.<br />

Fig. 20 shows a comparison of <strong>Quality</strong> (negative BD rate percentage difference wrt <strong>HEVC</strong> HM14) vs<br />

Encoding <strong>Performance</strong> (fps) for each of the two resolution test sets at each of the four TU1, TU4, TU7,<br />

and TU7gacc encoding modes. The y-axis basically shows the quality difference in terms of loss of BD rate<br />

percentage difference in the process of increasing speed up of the encoder in going from TU1 to TU4 to<br />

TU7 to TU7 gacc operating points.<br />

To summarize, for 4:2:2 8-bit content, encoding performance-wise <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in<br />

different modes achieves the following speedup of <strong>HEVC</strong> encoding on 4 core Reference PC platform<br />

specified earlier.<br />

BDRate %age dif wrt HM14.0<br />

0<br />

-5<br />

-10<br />

-15<br />

-20<br />

-25<br />

-30<br />

-35<br />

-40<br />

TU1<br />

TU1<br />

TU4<br />

<strong>Quality</strong> vs <strong>Performance</strong> Y<br />

0 10 20 30 40 50 60 70<br />

TU4<br />

TU7<br />

Encoding Speed (fps)<br />

TU7gacc<br />

TU7<br />

HD720p<br />

HD1080p<br />

TU7gacc<br />

Figure 20 <strong>Quality</strong> vs Encoding Speed Tradeoffs for 4:2:2 8-bit content for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoding TU<br />

modes<br />

60<br />

<br />

<br />

For 4:2:2 8-bit 1080p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, TU7, and TU7 gacc modes,<br />

correspondingly on average provides 1.25 fps, 12.8 fps, 34 fps, and 45 fps. This reflects corresponding<br />

speedup factors wrt HM14 (single threaded) of around 62, 640, 1699, and 2252.<br />

For 4:2:2 8-bit 720p test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, and TU7 modes,<br />

correspondingly on average provides 2.55 fps, 18.5 fps, 47.8 fps, and 64 fps. This reflects<br />

corresponding speedup factors wrt HM14 (single threaded) of 64, 563, 1195, and 1615.


<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> R7 results - Part 4<br />

4:2:2 10-bit Coding <strong>Quality</strong>, and <strong>Performance</strong><br />

tests<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

61<br />

*Other names and brands may be claimed as property of others.


<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder <strong>Quality</strong><br />

Evaluation for 4:2:2 10-bit Content<br />

In this section, we first describe a test methodology employed for evaluating quality and then report on<br />

relative quality measurements for the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec with respect to MPEG <strong>HEVC</strong> HM14<br />

Codec, well known as a high quality reference albeit impractically slow.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

<strong>Quality</strong> Evaluation Tests for 4:2:2 10-bit Content<br />

For the purpose of quality evaluation, four test sets, one for each of the 4 main resolutions are selected<br />

with each test set consisting of 3 publicly available video test sequences. Further for each test sequence,<br />

4 well-spaced standard quantizer (QP) values used by MPEG for RD analysis for comparing <strong>HEVC</strong> encoders<br />

are used.<br />

Specifically, Table 15 shows the selected 4:2:2 10-bit Ultra <strong>High</strong> Definition 4K (UHD4K) test set.<br />

Table 15 4:2:2 10-bit UHD4K Test Set and Quantizers used for Codec RD characteristics measurement<br />

UHD4K Test Sequence Resolution frm rate #frm<br />

Quantizers used for RD char.<br />

Qp1 Qp2 Qp3 Qp4<br />

1 FastWalk 4096x2304 50 300 22 27 32 37<br />

2 Footbridge 4096x2304 24 300 22 27 32 37<br />

2 Venetian_crossing 4096x2304 60 300 22 27 32 37<br />

Since there is relatively little, public, free test content test for 4:2:2 10-bit UHD4K (or other resolution), the<br />

above noted (non-free) content was employed. The content is described at http://www.testvid.com/alltest-material/4k-media/84-test-4k-quad-hd-usa-europe-asia-video-t2v041<br />

and can be purchased from<br />

http://www.testvid.com/.<br />

<strong>Quality</strong> Evaluation Results for 4:2:2 10-bit Content<br />

Before discussing detailed BD rate differences of each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode wrt HM14, we first<br />

establish the quality/bitrate tradeoff of HM 14 that will be used as reference.<br />

62<br />

First, each video sequence of the 4:2:2 10-bit UHD4K is encoded using MPEG <strong>HEVC</strong> HM14 Reference<br />

Encoder with each of 4 quantizers as specified in Table 15. The overall PSNR (averaged over frames) for<br />

each test sequence for each Qp for each of luma (Y), and associated chroma components (U and V) is<br />

collected along with the generated coding bitrate. For instance, Tables 16 shows the luma PSNR (chroma<br />

PSNR is also collected but is not shown to keep tables manageable in size) and total bitrate for each test<br />

sequence for each of 4 Qps is collected and is used in curve fitting to generate a continuous RD curve for<br />

HM14 encoding of that test sequence.


Next, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in each of its 3 main modes: TU1, TU4, and TU7 is<br />

utilized to perform encoding of each test sequence with each of 4 quantizers (of Table 15) used in HM 14<br />

encoding, and PSNRs of Y, U, and V, and total bitrates are collected for each test.<br />

UHD4k Test<br />

Sequence<br />

Table 16 PSNR Y, and Total Bitrate for each of 4 QPs by HM 14 Codec on 4:2:2 10-bit UHD4K Test set<br />

Qp1 PSNR, Bitrate, Qp2 PSNR, Bitrate, Qp3 PSNR, Bitrate, Qp4 PSNR, Bitrate,<br />

dB Y kbps dB Y kbps dB Y kbps dB Y kbps<br />

1 FastWalk 48.11 38231.4 44.71 16403.4 41.61 7585.5 38.80 3938.2<br />

2 Footbridge 47.90 8496.9 44.86 3429.7 41.69 1542.9 38.65 740.6<br />

3 Venetian_Crossing 44.91 36959.6 42.25 10349.2 39.64 4177.3 36.78 2076.0<br />

For <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in each encoding mode, the 4 collected bitrates and corresponding 4<br />

PSNRs (each corresponding to one Qp) are used in curve fitting to generate a continuous RD curve for<br />

each encoding mode. Further, for each test sequence and for each <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode, the<br />

corresponding <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> RD curve is compared to HM14 RD curve and a BD rate percentage is<br />

computed that reflects the difference in quality between specific <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU mode with<br />

respect to HM14 reference. The BD rate provides a measure of departure of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU<br />

mode’s quality in percentage bits with respect to the HM14 reference.<br />

Table 17 shows the measured BD rate percentage bitrate for each of luma and chroma components for<br />

each test sequence of Table 16 for each of 3 TU modes being evaluated. Fig. 21A-21B compares the RD<br />

characteristics of two codecs on the UHD4K test set for TU1 and TU4 modes where the difference in<br />

quality is the highest and the lowest wrt HM14 reference.<br />

For instance, Table 17 shows on UHD4K test set, the BD rate percentage difference of <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> <strong>HEVC</strong> in various TU modes wrt HM14 reference. As can be observed on UHD4K test set, the<br />

average luma BD rate percentage difference of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec over HM14 (an ideal<br />

reference), is 8.3%, 18.2%, and 28.6%, higher respectively in TU1, TU4, and TU7 modes. This means that for<br />

UHD4K test set, the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> codec as compared to HM14, requires around 8% higher bitrate<br />

in TU1 mode, 18% higher bitrate in TU4 mode, and 29% higher bitrate in TU7 mode to achieve the same<br />

luma PSNR quality as HM14.<br />

Table 17 Relative <strong>Quality</strong> Evaluation Results on 4:2:2 10-bit UHD4K test set for <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Software Codec in TU1, TU4, and TU7 modes with respect to MPEG <strong>HEVC</strong> HM14 Codec<br />

UHD4K Test<br />

Sequence<br />

TU1 BD rate, %age TU4 BD rate, %age TU7 BD rate, %age<br />

Y U V Y U V Y U V<br />

1 FastWalk 9.32 -4.87 -2.44 20.31 1.97 3.97 30.58 32.59 29.85<br />

2 Footbridge 5.62 -2.87 -3.66 16.80 20.63 14.05 27.63 43.79 47.56<br />

3 Venetian_crossing 10.0 5.31 5.05 17.38 11.92 10.95 27.54 36.61 35.94<br />

Average 98.34 -0.81 -0.35 18.17 11.51 9.66 28.58 37.66 37.78<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

63<br />

*Other names and brands may be claimed as property of others.


Further, Table 17 shows that on 4:2:2 10-bit UHD4K test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec TU1 mode<br />

with respect to HM14 reference, results in the biggest luma BD rate percentage difference on<br />

Venetian_Crossing sequence and the smallest luma BD rate percentage difference on Footbridge<br />

sequence; the corresponding RD comparison curves for TU1 mode for the two cases are shown in Fig. 21A.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

PSNR (dB)<br />

46<br />

45<br />

44<br />

43<br />

42<br />

41<br />

40<br />

39<br />

38<br />

37<br />

36<br />

Venetian_Crossing Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

Figure 21A RD results of 4:2:2 10-bit UHD4K scenes with the biggest and the smallest quality difference wrt<br />

HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

Table 17 also shows that on 4:2:2 10-bit UHD4K test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec in TU4 mode<br />

with respect to HM14 reference, results in the biggest luma BD rate percentage difference on FastWalk<br />

sequence and the smallest luma BD rate percentage difference on Footbridge sequence; the<br />

corresponding RD comparison curves for TU4 mode for the two cases are shown in Fig. 21B.<br />

PSNR (dB)<br />

49<br />

47<br />

45<br />

43<br />

41<br />

39<br />

37<br />

FastWalk Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 20000 40000 60000<br />

BitRate (kbps)<br />

PSNR (dB)<br />

PSNR (dB)<br />

50<br />

48<br />

46<br />

44<br />

42<br />

40<br />

38<br />

49<br />

47<br />

45<br />

43<br />

41<br />

39<br />

37<br />

Footbridge Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

0 5000 10000<br />

BitRate (kbps)<br />

Footbridge Y<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

0 5000 10000<br />

BitRate (kbps)<br />

64<br />

Figure 21B RD results of 4:2:2 10-bit UHD4K scenes with the biggest and the smallest quality difference wrt<br />

HM14 for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU4 mode<br />

To summarize, the luma BD rate percentage bitrate difference of the <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> in various<br />

modes with respect to an ideal (but slow) HM14 reference is as follows.


TU1 mode on average is 8% worse in BD rate as compared to HM14. This is highest quality mode in<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

TU4 mode on average is 18% worse in BD rate as compared to HM14. As shown later, this mode<br />

provides an excellent tradeoff of quality vs speed.<br />

TU7 mode on average is 29% worse in BD rate as compared to HM14. As shown later, this is the fastest<br />

software only mode in <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>.<br />

Now that we have completed quality analysis of various TU modes of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>, the next<br />

obvious step is to perform analysis of encoding speed offered by each of these modes; this issue is<br />

discussed at length in the next section.<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder <strong>Quality</strong> vs <strong>Performance</strong><br />

Tradeoffs for 4:2:2 10-bit Content<br />

For measurement of encoding speed (fps) and speed vs quality tradeoffs, a recently released reference<br />

PC Platform (<strong>Intel®</strong> Core i7-4770K CPU @ 3.5 GHz – 4 Cores/8Threads) is employed.<br />

<strong>HEVC</strong> Software Encoder <strong>Performance</strong> for 4:2:2 10-bit Content<br />

We first measure encoding speed (fps) of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder in TU1 (highest<br />

quality) mode and MPEG <strong>HEVC</strong> HM14 on 4:2:2 10-bit UHD test set. Results of these measurements<br />

comparing the two speeds are shown in Fig. 22A.<br />

Encoding Speed (fps)<br />

0.6<br />

0.5<br />

0.4<br />

0.3<br />

0.2<br />

0.1<br />

0.0<br />

<strong>Performance</strong> Comparison of <strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1 Mode and HM14.0<br />

0.55 (138x)<br />

UHD4K<br />

Resolution<br />

0.004<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>HEVC</strong> HM14.0<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Figure 22A Average encoding speed comparison of 4:2:2 10-bit content on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> TU1 mode<br />

with HM14<br />

65<br />

*Other names and brands may be claimed as property of others.


From Fig. 22A it can be seen that the encoding speed of the TU1 mode is over 120 times (multithreaded on 4<br />

cores) the speed of HM14 (single threaded/1 core) for the 4:2:2 10-bit UHD4K test set.<br />

Next in Table 18 we show measurement of encoding speed of <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software<br />

Encoder on each sequence on each of the three TU1, TU4, and TU7 modes.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Table 18 Encoding Speed of 4:2:2 10-bit UHD4K test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec in TU1, TU4, and TU7<br />

modes. For reference the average Speed of HM14 Encoder for this test set is .004 fps.<br />

UHD4K Test<br />

Sequence<br />

TU1 Enc Speed TU4 Enc Speed TU7 Enc Speed<br />

fps fps fps<br />

1 FastWalk 0.24 2.71 7.14<br />

2 Footbridge 0.68 3.94 12.11<br />

3 Venetian_crossing 0.72 3.90 9.92<br />

Average 0.55 3.52 9.72<br />

As can be seen from Table 18 for the 4:2:2 10-bit UHD4K test set, the encoding speed of TU4 mode is 6.4<br />

times that of TU1 mode, and that the encoding speed of TU7 mode is 2.8 times that of TU4 mode. In<br />

other words TU7 mode reflects an overall speed of 17.7 times that of TU1 mode. Also, since TU1 for 4 core<br />

multithreaded coding is 137 times the speed of HM, TU7 reflects a typical speedup factor of 2430 over<br />

HM14!<br />

A side-by-side comparison of encoding speed of different TU modes and for the 4:2:2 10-bit UHD4K<br />

resolution test sets is shown next by the bar graphs of Fig. 22B.<br />

Encoding Speed (fps)<br />

12<br />

10<br />

8<br />

6<br />

4<br />

2<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU1<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU4<br />

<strong>Media</strong><strong>Server</strong><strong>Studio</strong> TU7<br />

Encoding <strong>Performance</strong><br />

0<br />

UHD4K<br />

Resolution<br />

66<br />

Figure 22B Average encoding speed of 4:2:2 10-bit UHD4K test set on <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software<br />

Encoder


<strong>Quality</strong> vs <strong>Performance</strong> Tradeoff for TU Modes for 4:2:2 10-bit Content<br />

We now discuss results of Codec <strong>Quality</strong> vs Encoding <strong>Performance</strong> tradeoffs for all modes for <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder for the UHD4K test set tested.<br />

Fig. 23 shows a comparison of <strong>Quality</strong> (negative BD rate percentage difference wrt <strong>HEVC</strong> HM14) vs<br />

Encoding <strong>Performance</strong> (fps) for the test set at each of the three TU1, TU4, and TU7 encoding modes. The<br />

y-axis basically shows the quality difference in terms of loss of BD rate percentage difference in the<br />

process of increasing speed up of the encoder in going from TU1 to TU4 to TU7 operating points.<br />

To summarize, for 4:2:2 10-bit content, encoding performance-wise <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> codec in<br />

different modes achieves the following speedup of <strong>HEVC</strong> encoding on 4 core Reference PC platform<br />

specified earlier.<br />

BDRate %age dif wrt HM14.0<br />

0<br />

-5<br />

-10<br />

-15<br />

-20<br />

-25<br />

-30<br />

<strong>Quality</strong> vs <strong>Performance</strong> Y<br />

0 2 4 6 8 10 12<br />

TU4<br />

TU7<br />

UHD4K<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

-35<br />

Encoding Speed (fps)<br />

Figure 23 <strong>Quality</strong> vs Encoding Speed Tradeoffs for 4:2:2 10-bit content for <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoding TU<br />

modes<br />

<br />

For 4:2:2 10-bit UHD4K test set, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> encoder in TU1, TU4, and TU7 modes,<br />

correspondingly on average provides 0.55 fps, 3.5 fps, and 9.7 fps. This reflects corresponding<br />

speedup factors wrt HM14 (single threaded) of 137, 880, and 2430.<br />

67<br />

*Other names and brands may be claimed as property of others.


Summary<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

In this white paper we first presented an overview of the new MPEG <strong>HEVC</strong> compression standard, and<br />

then introduced <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>, a developer product. Next we presented a test methodology<br />

for quality evaluation of <strong>HEVC</strong> Codecs and applied the methodology to evaluate quality of <strong>Intel®</strong> <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec. Encoder <strong>Quality</strong> versus performance tradeoffs of various modes of<br />

Intel <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Encoder were discussed next. This was then followed by<br />

evaluation of performance of <strong>HEVC</strong> Software Decoder of <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong>. Appendix A<br />

summarizes the quality, encode performance and decode performance achieved.<br />

As can be seen from results of thorough testing on variety of video resolutions, bit-depth, and chroma<br />

resolutions, <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Software Codec delivers impressive encoding and decoding<br />

performance while achieving excellent tradeoffs in quality to meet the needs of highly demanding<br />

communication infrastructure developers, video software experts and developers, enterprise video<br />

device manufacturers, cloud based media services providers, and others.<br />

References<br />

[1] G. Sullivan, J.-R. Ohm, W.-J. Han, T. Wiegand, “Overview of the <strong>High</strong> Efficiency Video Coding (<strong>HEVC</strong>)<br />

Standard,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, pp 1649-1668,<br />

Dec 2012.<br />

[2] JCT-VC Video Subgroup Editors, “ISO/IEC JTC1/SC29/WG11 MPEG 2012/N13154: HM9 <strong>High</strong> Efficiency<br />

Video Coding (<strong>HEVC</strong>) Test Model 9 Encoder Description,” Oct. 2012, Shanghai.<br />

[3] Joint Collaborative Team on Video Coding, “JCTVC-L1003_v34: <strong>High</strong> Efficiency Coding Video Coding<br />

(<strong>HEVC</strong>) Text specification draft 10,” Jan. 2013, Geneva.<br />

[4] Leonardo Chiariglione (Ed.), “The MPEG Representation of Digital <strong>Media</strong>,” Springer, ISBN 978-1-4419-<br />

6183-9, 2012.<br />

[5] J. Wang, X. Yu, D. He, “JCTVC-F270: On BD-Rate Calculation,” Jul 2011, Torino. Also based on ...J. Zhao,<br />

Y. Su, A. Segall, “ITU-T COM16-C-404: On the Calculation of PSNR and bit-rate differences for the SVT test<br />

data,” April 2008.<br />

[6] A. Puri, X. Chen, A. Luthra, “Video Coding using the H.264/MPEG-4 AVC Compression Standard,” Signal<br />

Processing Image Communication, pp. 793-849, 2004.<br />

68


Appendix A<br />

A1: Summary of 4:2:0 8-bit Coding <strong>Quality</strong> and <strong>Performance</strong><br />

Tests of <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

Resolution<br />

<strong>Quality</strong> difference with respect to <strong>HEVC</strong> HM14 Ref. Codec<br />

<strong>Media</strong> <strong>Server</strong> <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU4 <strong>Studio</strong> TU7<br />

Luma BD rate Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU1<br />

Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7gacc<br />

Luma BD rate<br />

Standard/Extended Definition (SD/XD) 0.07% 19.44% 39.33% 44.65%<br />

<strong>High</strong> Definition 720p (HD720p) 1.56% 16.66% 34.94% 39.46%<br />

<strong>High</strong> Definition 1080p (HD1080p) 1.71% 17.30% 33.90% 37.71%<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) -0.05% 17.65% 33.57% 35.22%<br />

Resolution<br />

Encode Speed (Speed factor wrt HM14 @ ) on Haswell α - 4 cores<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU1<br />

Speed, fps<br />

(Speed<br />

Factor)<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU4<br />

Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7<br />

Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7gacc<br />

Speed, fps<br />

(Speed Factor )<br />

Standard/Extended Definition (SD/XD) 5.78 (141x) 47.27 (1153x) 120.12 (2930x) 143.58 (3501x)<br />

<strong>High</strong> Definition 720p (HD720p) 3.04 ( 117x) 23.45 (902x) 58.15 (2237x) 78.29 (3011x)<br />

<strong>High</strong> Definition 1080p (HD1080p) 2.09 (139x) 14.12 (941x) 36.93 (2462x) 48.17 (3211x)<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) 0.4 (100x) 3.13 (783x) 8.47 (2118x) 11.41 (2850x)<br />

Resolution<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU1<br />

Speed, fps<br />

Decode Speed on Haswell α - 4 cores<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU4<br />

Speed, fps<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7<br />

Speed, fps<br />

Standard/Extended Definition (SD/XD) 463 879 1024<br />

<strong>High</strong> Definition 720p (HD720p) 274 461 514<br />

<strong>High</strong> Definition 1080p (HD1080p) 200 266 292<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) 67 61 79<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

α <strong>Intel®</strong> Core-i7 Processor 4770k: 4 Core, 3.5 GHz.<br />

@<br />

HM14 runs single threaded only.<br />

69<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

70


A2: Summary of Computational Scalability Tests (E5) of 4:2:0<br />

8-bit HD and UHD Coding by <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong><br />

Codec<br />

Resolution<br />

<strong>Quality</strong> difference with respect to <strong>HEVC</strong> HM14 Ref. Codec<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU1 Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU4 Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU7 Luma BD rate<br />

<strong>High</strong> Definition 1080p (HD1080p) 1.71% 17.30% 33.90%<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) -0.05% 17.65% 33.57%<br />

Resolution<br />

Encode Speed (Speed factor wrt HM14 @ ) on E5 # - 12 cores<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU1 Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU4 Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU7 Speed, fps<br />

(Speed Factor)<br />

<strong>High</strong> Definition 1080p (HD1080p) 4.34 (289x) 31.55 (2103x) 87.67 (5845x)<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) 1.39 (347x) 10.42 (2605x) 25.14 (6285x)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

71<br />

*Other names and brands may be claimed as property of others.


Appendix B<br />

B1: Summary of 4:2:0 10-bit Coding <strong>Quality</strong> and <strong>Performance</strong> of<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Resolution<br />

<strong>Quality</strong> difference with respect to <strong>HEVC</strong> HM14 Ref. Codec<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU1 Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU4 Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU7 Luma BD rate<br />

<strong>High</strong> Definition 1080p (HD1080p) 2.85% 9.82% 33.66%<br />

Ultra <strong>High</strong> Definition 1600p (UHD1600p) 5.38% 23.29% 44.69%<br />

Resolution<br />

Encode Speed (Speed factor wrt HM14 @ ) on Haswell α - 4 cores<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU1 Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU4 Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU7 Speed, fps<br />

(Speed Factor)<br />

<strong>High</strong> Definition 1080p (HD1080p) 1.38 (98x) 9.85 (704x) 25.55 (1825x)<br />

Ultra <strong>High</strong> Definition 1600p (UHD1600p) 0.89 (81x) 6.47 (588x) 16.86 (1533x)<br />

72


B2: Summary of 4:2:2 8-bit Coding <strong>Quality</strong> and <strong>Performance</strong> of<br />

<strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

Resolution<br />

<strong>Quality</strong> difference with respect to <strong>HEVC</strong> HM14 Ref. Codec<br />

<strong>Media</strong> <strong>Server</strong> <strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU4 <strong>Studio</strong> TU7<br />

Luma BD rate Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU1<br />

Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7gacc<br />

Luma BD rate<br />

<strong>High</strong> Definition 720p (HD720p) 4.73% 14.55% 28.73% 30.84%<br />

<strong>High</strong> Definition 1080p (HD1080p) 5.12% 16.91% 33.71% 37.83%<br />

Resolution<br />

Encode Speed (Speed factor wrt HM14 @ ) on Haswell α - 4 cores<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU1<br />

Speed, fps<br />

(Speed<br />

Factor)<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU4<br />

Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7<br />

Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong><br />

<strong>Studio</strong> TU7gacc<br />

Speed, fps<br />

(Speed Factor )<br />

<strong>High</strong> Definition 720p (HD720p) 2.56 (64x) 18.52 (563x) 47.81 (1195x) 64.6 (1615x)<br />

<strong>High</strong> Definition 1080p (HD1080p) 1.25 (62x) 12.81 (640x) 33.99 (1699x) 45.04 (2252x)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

73<br />

*Other names and brands may be claimed as property of others.


B3: Summary of 4:2:2 10-bit Coding <strong>Quality</strong> and <strong>Performance</strong><br />

Tests of <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Codec<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Resolution<br />

<strong>Quality</strong> difference with respect to <strong>HEVC</strong> HM14 Ref. Codec<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU1 Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU4 Luma BD rate<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU7 Luma BD rate<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) 8.34% 18.17% 28.58%<br />

Resolution<br />

Encode Speed (Speed factor wrt HM14 @ ) on Haswell α - 4 cores<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU1 Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU4 Speed, fps<br />

(Speed Factor)<br />

<strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

TU7 Speed, fps<br />

(Speed Factor)<br />

Ultra <strong>High</strong> Definition 4K (UHD4K) 0.55 (137x) 3.52 (880x) 9.72 (2430x)<br />

α <strong>Intel®</strong> Core-i7 Processor 4770k: 4 Core, 3.5 GHz.<br />

# <strong>Intel®</strong> Xeon® Processor E5-2697v2: 12 Core, 2.7 GHz. 2 processors per board, total of 24 physical cores.<br />

@<br />

HM14 runs single threaded only.<br />

74


Appendix C<br />

C1: Example Encoding Directory and Subdirectories structure<br />

Codec<br />

cfg<br />

encoder_random_access<br />

_main.cfg<br />

TAppEncoder.exe<br />

bin<br />

TAppDecoder.exe<br />

HM14<br />

MSS<br />

PSNR.exe<br />

script_par_rdpf_pd_hevc.bat<br />

script_qp_hevc.bat<br />

win_x64<br />

mss_enc_params.cmd<br />

mss_qp_hev.cmd<br />

bin<br />

2fca99749fdb49..<br />

1fdd936825ad47..<br />

libmfxhw64.dll<br />

mfxplugin64_he<br />

vce_sw.dll<br />

plugin.cfg<br />

mfxplugin64_he<br />

vcd_sw.dll<br />

plugin.cfg<br />

libmfxsw64.dll<br />

Park_joy_1920x1080_50.yuv<br />

..<br />

sample_decode.exe<br />

sample_encode.exe<br />

HM14_Bitstreams<br />

MSS_Bitstreams<br />

Content<br />

Output<br />

run_HM14.bat<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

75<br />

*Other names and brands may be claimed as property of others.


C2: Scripts for 4:2:0 content Encoding with <strong>HEVC</strong> HM<br />

Encoder<br />

We now provide example scripts (windows batch file and the other windows batch files it calls) for<br />

encoding using <strong>HEVC</strong> HM 14 encoder.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

76<br />

run_HM14.bat<br />

REM Script Dir Sequence FPS<br />

FRM Res qp1 qp2 qp3 qp4<br />

call Codec\HM14\script_qp_hevc.bat Content park_joy_3840x2160_50_8bit<br />

50 500 2160 26 29 33 37<br />

call Codec\HM14\script_qp_hevc.bat Content ducks_take_off_3840x2160_50_8bit<br />

50 500 2160 28 31 35 37<br />

call Codec\HM14\script_qp_hevc.bat Content crowdRun_3840x2160_50_8bit<br />

50 500 2160 26 30 34 38<br />

call Codec\HM14\script_qp_hevc.bat Content PeopleOnStreet_3840x2160_30_8bit<br />

30 150 2160 22 27 30 33<br />

call Codec\HM14\script_qp_hevc.bat Content Traffic_3840x2048_30_8bit<br />

30 300 2048 18 22 26 30<br />

call Codec\HM14\script_qp_hevc.bat Content Nebuta_Festival_2560x1600_60_8bit<br />

60 300 1600 30 33 36 39<br />

call Codec\HM14\script_qp_hevc.bat Content park_joy_1920x1080_50<br />

50 500 1080 26 29 33 37<br />

call Codec\HM14\script_qp_hevc.bat Content ducks_take_off_1920x1080_50<br />

50 500 1080 28 31 35 37<br />

call Codec\HM14\script_qp_hevc.bat Content crowdRun_1920x1080_50<br />

50 500 1080 26 30 34 38<br />

call Codec\HM14\script_qp_hevc.bat Content touchdown_pass_1920x1080_30<br />

30 570 1080 23 26 30 34<br />

call Codec\HM14\script_qp_hevc.bat Content BQTerrace_1920x1080_60<br />

60 600 1080 25 27 31 34<br />

call Codec\HM14\script_qp_hevc.bat Content ParkScene_1920x1080_24<br />

24 240 1080 23 26 29 32


call Codec\HM14\script_qp_hevc.bat Content park_joy_1280x720_50<br />

50 500 720 26 29 33 37<br />

call Codec\HM14\script_qp_hevc.bat Content ducks_take_off_1280x720_50<br />

50 500 720 28 31 35 37<br />

call Codec\HM14\script_qp_hevc.bat Content crowdRun_1280x720_50<br />

50 500 720 26 30 34 38<br />

call Codec\HM14\script_qp_hevc.bat Content City_1280x720_30<br />

30 300 720 22 24 27 29<br />

call Codec\HM14\script_qp_hevc.bat Content Crew_1280x720_30<br />

30 300 720 22 26 29 32<br />

call Codec\HM14\script_qp_hevc.bat Content Sailormen_1280x720_30<br />

30 300 720 24 26 28 30<br />

call Codec\HM14\script_qp_hevc.bat Content BasketballDrillText_832x480_50<br />

50 500 480 21 24 27 30<br />

call Codec\HM14\script_qp_hevc.bat Content PartyScene_832x480_50<br />

50 500 480 24 27 30 33<br />

call Codec\HM14\script_qp_hevc.bat Content RaceHorses_832x480_30<br />

30 300 480 24 27 30 33<br />

call Codec\HM14\script_qp_hevc.bat Content City_704x576_30<br />

30 300 576 22 24 26 28<br />

call Codec\HM14\script_qp_hevc.bat C Content Crew_704x576_30<br />

30 300 576 22 25 28 31<br />

call Codec\HM14\script_qp_hevc.bat Content Soccer_704x576_30<br />

30 300 576 22 25 28 31<br />

pause<br />

script_qp_hevc.bat<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

@echo off<br />

REM ENCODE_SCRIPT DIR FILE RES QP FPS FRM<br />

start Codec\HM14\script_par_rdpf_pd_hevc.bat %1 %2 %5 %9 %3 %4<br />

start Codec\HM14\script_par_rdpf_pd_hevc.bat %1 %2 %5 %8 %3 %4<br />

start Codec\HM14\script_par_rdpf_pd_hevc.bat %1 %2 %5 %7 %3 %4<br />

call Codec\HM14\script_par_rdpf_pd_hevc.bat %1 %2 %5 %6 %3 %4<br />

77<br />

*Other names and brands may be claimed as property of others.


script_par_rdpf_pd_hevc.bat<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

@echo on<br />

set KI=-1<br />

set INPUTDIR=%1<br />

set SEQUENCE=%2<br />

set RES=%3<br />

set QPVAL=%4<br />

set FRAMERATE=%5<br />

set FRAMES=%6<br />

call :ResolutionSetup %RES% W H<br />

if %KI% gtr -1 (<br />

set ENCMODE=RA<br />

) else (<br />

set ENCMODE=1i<br />

)<br />

set OUTPUTDIR=<strong>HEVC</strong>_%SEQUENCE%_%QPVAL%_%ENCMODE%<br />

if NOT EXIST Output\HM14_Bitstreams\%OUTPUTDIR% (<br />

mkdir Output\HM14_Bitstreams\%OUTPUTDIR%<br />

Codec\HM14\bin\TAppEncoder.exe -c Codec\HM14\bin\cfg\encoder_randomaccess_main.cfg -<br />

i %INPUTDIR%\%SEQUENCE%.yuv -wdt %W% -hgt %H% -q %QPVAL% -b<br />

Output\HM14_Bitstreams\%OUTPUTDIR%\<strong>HEVC</strong>_%SEQUENCE%_%QPVAL%.bin -o<br />

Output\HM14_Bitstreams\%OUTPUTDIR%\<strong>HEVC</strong>_%SEQUENCE%_%QPVAL%_recon.yuv -f %FRAMES% -<br />

fr %FRAMERATE% -ip %KI% ><br />

Output\HM14_Bitstreams\Logs\<strong>HEVC</strong>_%SEQUENCE%_%QPVAL%_%ENCMODE%.log<br />

)<br />

78<br />

:ResolutionSetup<br />

(<br />

IF %~1 EQU 1600 (<br />

set %~2=2560


)<br />

) else (<br />

)<br />

set %~3=1600<br />

IF %~1 EQU 2160 (<br />

set %~2=3840<br />

set %~3=2160<br />

) else (<br />

IF %~1 EQU 2048 (<br />

set %~2=3840<br />

set %~3=2048<br />

) else (<br />

IF %~1 EQU 1080 (<br />

set %~2=1920<br />

set %~3=1080<br />

) else (<br />

IF %~1 EQU 720 (<br />

set %~2=1280<br />

set %~3=720<br />

) else (<br />

IF %~1 EQU 576 (<br />

set %~2=704<br />

set %~3=576<br />

) else (<br />

IF %~1 EQU 480 (<br />

set %~2=832<br />

set %~3=480<br />

) else (<br />

echo Unknown resolution.<br />

exit /b<br />

)<br />

)<br />

)<br />

)<br />

)<br />

)<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

79<br />

*Other names and brands may be claimed as property of others.


C3: Scripts for 4:2:0 content Encoding with <strong>Intel®</strong> <strong>Media</strong><br />

<strong>Server</strong> <strong>Studio</strong> <strong>HEVC</strong> Encoder<br />

We now provide example scripts (windows batch file and the other windows batch files it calls) for<br />

encoding using the MSS <strong>HEVC</strong> encoder.<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

80<br />

run_MSS_<strong>HEVC</strong>.bat<br />

REM Script Sequence FPS qp1<br />

qp2 qp3 qp4 TU Resolution<br />

FOR %%G in (speed, balanced, quality) do (<br />

call Codec\MSS\mss_qp_hevc.cmd park_joy_3840x2160_50_8bit 50 26<br />

29 33 37 %%G 2160<br />

call Codec\MSS\mss_qp_hevc.cmd ducks_take_off_3840x2160_50_8bit 50 28<br />

31 35 37 %%G 2160<br />

call Codec\MSS\mss_qp_hevc.cmd crowdRun_3840x2160_50_8bit 50 26<br />

30 34 38 %%G 2160<br />

call Codec\MSS\mss_qp_hevc.cmd PeopleOnStreet_3840x2160_30_8bit 30 22<br />

27 30 33 %%G 2160<br />

call Codec\MSS\mss_qp_hevc.cmd Traffic_3840x2048_30_8bit 30 18<br />

22 26 30 %%G 2048<br />

call Codec\MSS\mss_qp_hevc.cmd Nebuta_Festival_2560x1600_60_8bit 60 30<br />

33 36 39 %%G 1600<br />

call Codec\MSS\mss_qp_hevc.cmd park_joy_1920x1080_50 50 26<br />

29 33 37 %%G 1080<br />

call Codec\MSS\mss_qp_hevc.cmd ducks_take_off_1920x1080_50 50 28<br />

31 35 37 %%G 1080<br />

call Codec\MSS\mss_qp_hevc.cmd crowdRun_1920x1080_50 50 26<br />

30 34 38 %%G 1080<br />

call Codec\MSS\mss_qp_hevc.cmd touchdown_pass_1920x1080_30 30 23<br />

26 30 34 %%G 1080<br />

call Codec\MSS\mss_qp_hevc.cmd BQTerrace_1920x1080_60 60 25<br />

27 31 34 %%G 1080<br />

call Codec\MSS\mss_qp_hevc.cmd ParkScene_1920x1080_24 24 23<br />

26 29 32 %%G 1080


call Codec\MSS\mss_qp_hevc.cmd park_joy_1280x720_50 50 26<br />

29 33 37 %%G 720<br />

call Codec\MSS\mss_qp_hevc.cmd ducks_take_off_1280x720_50 50 28<br />

31 35 37 %%G 720<br />

call Codec\MSS\mss_qp_hevc.cmd crowdRun_1280x720_50 50 26<br />

30 34 38 %%G 720<br />

call Codec\MSS\mss_qp_hevc.cmd City_1280x720_30 30 22<br />

24 27 29 %%G 720<br />

call Codec\MSS\mss_qp_hevc.cmd Crew_1280x720_30 30 22<br />

26 29 32 %%G 720<br />

call Codec\MSS\mss_qp_hevc.cmd Sailormen_1280x720_30 30 24<br />

26 28 30 %%G 720<br />

call Codec\MSS\mss_qp_hevc.cmd BasketballDrillText_832x480_50 50 21<br />

24 27 30 %%G 480<br />

call Codec\MSS\mss_qp_hevc.cmd PartyScene_832x480_50 50 24<br />

27 30 33 %%G 480<br />

call Codec\MSS\mss_qp_hevc.cmd RaceHorses_832x480_30 30 24<br />

27 30 33 %%G 480<br />

call Codec\MSS\mss_qp_hevc.cmd City_704x576_30 30 22<br />

24 26 28 %%G 576<br />

call Codec\MSS\mss_qp_hevc.cmd Crew_704x576_30 30 22<br />

25 28 31 %%G 576<br />

call Codec\MSS\mss_qp_hevc.cmd Soccer_704x576_30 30 22<br />

25 28 31 %%G 576<br />

)<br />

move summary.txt Output\MSS_Bitstreams<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

mss_qp_hevc.cmd<br />

@echo off<br />

REM ENCODE_SCRIPT SEQUENCE QP TU RESOL FPS<br />

call Codec\MSS\mss_enc_params.cmd %1 %3 %7 %8 %2<br />

call Codec\MSS\mss_enc_params.cmd %1 %4 %7 %8 %2<br />

call Codec\MSS\mss_enc_params.cmd %1 %5 %7 %8 %2<br />

call Codec\MSS\mss_enc_params.cmd %1 %6 %7 %8 %2<br />

81<br />

*Other names and brands may be claimed as property of others.


mss_enc_params.cmd<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

@echo off<br />

set SEQ=%1<br />

set QPI=%2<br />

set /a QPB=%QPI%+1<br />

set /a QPP=%QPI%+1<br />

set OP=%3<br />

set DIR=Content<br />

set ODIR=Output\MSS_Bitstreams<br />

set RES=%4<br />

set FRATE=%5<br />

call :ResolutionSetup %RES% W H<br />

set "startTime=%time%"<br />

Codec\MSS\win_x64\bin\sample_encode.exe h265 -sw -f %FRATE% -u %OP% -async 2 -cqp -qpi %QPI% -<br />

qpp %QPP% -qpb %QPB% -w %W% -h %H% -i %DIR%\%SEQ%.yuv -<br />

o %ODIR%\msdk<strong>HEVC</strong>_%SEQ%_%OP%_qp%QPI%.h265<br />

set "stopTime=%time%"<br />

call :timeDiff diff startTime stopTime<br />

Codec\MSS\win_x64\bin\sample_decode.exe h265 -i %ODIR%\msdk<strong>HEVC</strong>_%SEQ%_%OP%_qp%QPI%.h265 -<br />

o %ODIR%\msdk<strong>HEVC</strong>_%SEQ%_%OP%_qp%QPI%_dec.yuv -p 15dd936825ad475ea34e35f3f54217a6<br />

Codec\PSNR.exe %DIR%\%SEQ%.yuv %ODIR%\msdk<strong>HEVC</strong>_%SEQ%_%OP%_qp%QPI%_dec.yuv %W% %H%<br />

echo Encoding time: %diff% msec. >> summary.txt<br />

goto :eof<br />

82<br />

:timeDiff<br />

setlocal<br />

call :timeToMS time1 "%~2"<br />

call :timeToMS time2 "%~3"<br />

set /a diff=time2-time1


(<br />

ENDLOCAL<br />

set "%~1=%diff%"<br />

goto :eof<br />

)<br />

:timeToMS<br />

::### WARNING, enclose the time in " ", because it can contain comma separators<br />

SETLOCAL EnableDelayedExpansion<br />

FOR /F "tokens=1,2,3,4 delims=:,.^ " %%a IN ("!%~2!") DO (<br />

set /a "ms=(((30%%a%%100)*60+7%%b)*60+3%%c-42300)*1000+(1%%d0 %% 1000)"<br />

)<br />

(<br />

ENDLOCAL<br />

set %~1=%ms%<br />

goto :eof<br />

)<br />

:ResolutionSetup<br />

(<br />

IF %~1 EQU 1600 (<br />

set %~2=2560<br />

set %~3=1600<br />

) else (<br />

IF %~1 EQU 2160 (<br />

set %~2=3840<br />

set %~3=2160<br />

) else (<br />

IF %~1 EQU 2048 (<br />

set %~2=3840<br />

set %~3=2048<br />

) else (<br />

IF %~1 EQU 1080 (<br />

set %~2=1920<br />

set %~3=1080<br />

) else (<br />

IF %~1 EQU 720 (<br />

set %~2=1280<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

83<br />

*Other names and brands may be claimed as property of others.


<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

)<br />

)<br />

)<br />

)<br />

)<br />

) else (<br />

)<br />

set %~3=720<br />

IF %~1 EQU 576 (<br />

) else (<br />

)<br />

set %~2=704<br />

set %~3=576<br />

IF %~1 EQU 480 (<br />

) else (<br />

)<br />

set %~2=832<br />

set %~3=480<br />

echo Unknown resolution.<br />

exit /b<br />

C4: Script for <strong>HEVC</strong> BD rate Computation<br />

We now provide example script (Visual Basic macro) for BD rate calculation provided by MPEG <strong>HEVC</strong><br />

committee. This macro needs to be installed in an excel document, and upon entering the computed<br />

encoding Bitrates generated by HM14 and MSS <strong>HEVC</strong> Encoder’s, and PSNRs generated by comparing<br />

decoded yuv files with respect to original yuv files and entering them into the same excel document, BD<br />

rate difference between HM14 and MSS <strong>HEVC</strong> encoder can be computed.<br />

BDrate.bas<br />

84<br />

Attribute VB_Name = "Module1"<br />

Public Function pchipend(h1 As Double, h2 As Double, del1 As Double, del2 As Double) As Double<br />

Dim d As Double<br />

d = ((2 * h1 + h2) * del1 - h1 * del2) / (h1 + h2)


If (d * del1 < 0) Then<br />

d = 0<br />

ElseIf ((del1 * del2 < 0) And (Abs(d) > Abs(3 * del1))) Then<br />

d = 3 * del1<br />

End If<br />

pchipend = d<br />

End Function<br />

Public Function bdrint(rate As Range, dist As Range, low As Double, high As Double) As Double<br />

Dim log_rate(1 To 4) As Double<br />

Dim log_dist(1 To 4) As Double<br />

Dim i As Long<br />

For i = 1 To 4<br />

log_rate(i) = WorksheetFunction.Log(rate(5 - i), 10)<br />

log_dist(i) = dist(5 - i)<br />

Next i<br />

Dim H(1 To 3) As Double<br />

Dim delta(1 To 3) As Double<br />

For i = 1 To 3<br />

H(i) = log_dist(i + 1) - log_dist(i)<br />

delta(i) = (log_rate(i + 1) - log_rate(i)) / H(i)<br />

Next i<br />

Dim d(1 To 4) As Double<br />

d(1) = pchipend(H(1), H(2), delta(1), delta(2))<br />

For i = 2 To 3<br />

d(i) = (3 * H(i - 1) + 3 * H(i)) / ((2 * H(i) + H(i - 1)) / delta(i - 1) + (H(i) + 2 * H(i - 1)) / delta(i))<br />

Next i<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

d(4) = pchipend(H(3), H(2), delta(3), delta(2))<br />

Dim c(1 To 3) As Double<br />

Dim b(1 To 3) As Double<br />

For i = 1 To 3<br />

c(i) = (3 * delta(i) - 2 * d(i) - d(i + 1)) / H(i)<br />

b(i) = (d(i) - 2 * delta(i) + d(i + 1)) / (H(i) * H(i))<br />

85<br />

*Other names and brands may be claimed as property of others.


Next i<br />

' cubic function is rate(i) + s*(d(i) + s*(c(i) + s*(b(i))) where s = x - dist(i)<br />

' or rate(i) + s*d(i) + s*s*c(i) + s*s*s*b(i)<br />

' primitive is s*rate(i) + s*s*d(i)/2 + s*s*s*c(i)/3 + s*s*s*s*b(i)/4<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

Dim s0 As Double<br />

Dim s1 As Double<br />

Dim result As Double<br />

result = 0<br />

For i = 1 To 3<br />

s0 = log_dist(i)<br />

s1 = log_dist(i + 1)<br />

' clip s0 to valid range<br />

s0 = WorksheetFunction.Max(s0, low)<br />

s0 = WorksheetFunction.Min(s0, high)<br />

' clip s1 to valid range<br />

s1 = WorksheetFunction.Max(s1, low)<br />

s1 = WorksheetFunction.Min(s1, high)<br />

s0 = s0 - log_dist(i)<br />

s1 = s1 - log_dist(i)<br />

If (s1 > s0) Then<br />

result = result + (s1 - s0) * log_rate(i)<br />

result = result + (s1 * s1 - s0 * s0) * d(i) / 2<br />

result = result + (s1 * s1 * s1 - s0 * s0 * s0) * c(i) / 3<br />

result = result + (s1 * s1 * s1 * s1 - s0 * s0 * s0 * s0) * b(i) / 4<br />

End If<br />

Next i<br />

bdrint = result<br />

End Function<br />

86<br />

Public Function bdrate(rateA As Range, distA As Range, rateB As Range, distB As Range) As Double<br />

Dim minPSNR As Double<br />

Dim maxPSNR As Double


minPSNR = WorksheetFunction.Max(WorksheetFunction.Min(distA), WorksheetFunction.Min(distB))<br />

maxPSNR = WorksheetFunction.Min(WorksheetFunction.Max(distA), WorksheetFunction.Max(distB))<br />

Dim vA As Double<br />

Dim vB As Double<br />

vA = bdrint(rateA, distA, minPSNR, maxPSNR)<br />

vB = bdrint(rateB, distB, minPSNR, maxPSNR)<br />

Dim avg As Double<br />

avg = (vB - vA) / (maxPSNR - minPSNR)<br />

bdrate = WorksheetFunction.Power(10, avg) - 1<br />

End Function<br />

<strong>Deliver</strong> <strong>High</strong> <strong>Quality</strong>, <strong>High</strong> <strong>Performance</strong> <strong>HEVC</strong> <strong>via</strong> <strong>Intel®</strong> <strong>Media</strong> <strong>Server</strong> <strong>Studio</strong><br />

87<br />

*Other names and brands may be claimed as property of others.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!