HDR Insights Series Article 4 : Dolby Vision
https://www.veneratech.com/hdr-dolby-vision-meta-data-parameters-to-validate-content/ In the previous article, we discussed the HDR tone mapping and how it is used to produce an optimum viewer experience on a range of display devices. This article discusses the basics of Dolby Vision meta-data and the parameters that the user needs to validate before the content is delivered. What is HDR metadata? HDR Metadata is an aid for a display device to show the content in an optimal manner. It contains the HDR content and mastering device properties that are used by the display device to map the content according to its own color gamut and peak brightness. There are two types of metadata – Static and Dynamic.
https://www.veneratech.com/hdr-dolby-vision-meta-data-parameters-to-validate-content/
In the previous article, we discussed the HDR tone mapping and how it is used to produce an optimum viewer experience on a range of display devices. This article discusses the basics of Dolby Vision meta-data and the parameters that the user needs to validate before the content is delivered.
What is HDR metadata?
HDR Metadata is an aid for a display device to show the content in an optimal manner. It contains the HDR content and mastering device properties that are used by the display device to map the content according to its own color gamut and peak brightness. There are two types of metadata – Static and Dynamic.
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
HDR Insights Series Article 4 : Dolby Vision
In the previous article, we discussed the HDR tone mapping and how it is used to produce an
optimum viewer experience on a range of display devices. This article discusses the basics of
Dolby Vision meta-data and the parameters that the user needs to validate before the content is
delivered.
What is HDR metadata?
HDR Metadata is an aid for a display device to show the content in an optimal manner. It
contains the HDR content and mastering device properties that are used by the display device to
map the content according to its own color gamut and peak brightness. There are two types of
metadata – Static and Dynamic.
Static metadata
Static metadata contains metadata information that is applicable to the entire content. It is
standardized by SMPTE ST 2086. Key items of static metadata are as following:
1. Mastering display properties: Properties defining the device on which content was mastered.
RGB color primaries
White point
Brightness Range
2. Maximum content light level (MaxCLL): Light level of the brightest pixel in the entire video
stream.
3. Maximum Frame-Average Light Level (MaxFALL): Average Light level of the brightest frame in
the entire video stream.
In a typical content, the brightness and color range varies from shot to shot. The challenge with
static metadata is that if the tone mapping is performed based on the static metadata, it will be
based only on the brightest frame in the entire content. As a result, the majority of the content will
have greater compression of dynamic range and color gamut than needed. This will lead to poor
viewing experience on less capable HDR display devices.
Dynamic metadata
Dynamic metadata allows the tone mapping to be performed on a per scene basis. This leads to
a significantly better user viewing experience when the content is displayed on less capable HDR
display devices. Dynamic metadata has been standardized by SMPTE ST 2094, which defines
content-dependent metadata. Using Dynamic metadata along with Static metadata overcomes
the issues presented by the usage of only the static metadata for tone mapping.
Dolby Vision
Dolby Vision uses dynamic metadata and is in fact the most commonly used HDR technology
today. This is adopted by major OTT service providers such as Netflix and Amazon, as well as
major studios and a host of prominent television manufacturers. Dolby Vision is standardized in
SMPTE ST 2094-10. In addition to supporting for dynamic metadata, Dolby Vision also allows
description of multiple trims for specific devices which allows finer display on such devices.
Dolby has documented the details of its algorithm in what they refer to as Content Mapping (CM)
documents. The original CM algorithm is version (CMv2.9) which has been used since the
introduction of Dolby Vision. Dolby introduced the Dolby Vision Content Mapping version 4
(CMv4) in the fall of 2018. Both versions of the CM are still in use. The Dolby Vision Color
Grading Best Practices Guide provides more information.
Dolby Vision metadata is coded at various „levels‟, the description of which is mentioned below:
Metadata Level/Field
Description
LEVEL 0
GLOBAL METADATA (STATIC)
Mastering Display
Aspect Ratio
Frame Rate
Target Display
Color Encoding
Algorithm/Trim Version
Describes the characteristics of the mastering display used for
the project
Ratio of canvas and image (active area)
Frame Rate
Describes the characteristics of each target display used for
L2 trim metadata
Describes the image container deliverable
CM algorithm version and Trim version
LEVEL 1
ANALYSIS METADATA (DYNAMIC)
L1 Min, Mid, Max
Three floating point values that characterize the dynamic
range of the shot or frame
Shot-based L1 metadata is created by analyzing each frame contained
in a shot in LMS color space and combined to describe the entire shot
as L1Min, L1Mid, L1Max
Stored as LMS (CMv2.9) and L3 Offsets
LEVEL 2
BACKWARDS COMPATIBLE PER-TARGET TRIM
METADATA (DYNAMIC)
Reserved1, Reserved2,
Reserved3, Lift, Gain,
Gamma, Saturation,
Chroma and Tone Detail
Automatically computed from L1, L3 and L8 (lift, gain,
gamma, saturation, chroma, tone detail) metadata for
backwards compatibility with CMv2.9
LEVEL 3
OFFSETS TO L1 (DYNAMIC)
L1 Min, Mid, Max
Three floating point values that are offsets to L1 Analysis
metadata as L3Min, L3Mid, L3Max
L3Mid is a global user defined trim control
L1 is stored as CMv2.9 computed values, CMv4 reconstructs RGB
values with L1 + L3
LEVEL 5
PER-SHOT ASPECT RATIO (DYNAMIC)
Canvas, Image
Used for defining shots that have different aspect ratios than
the global L0 aspect ratio
LEVEL 6
OPTIONAL HDR10 METADATA (STATIC)
MaxFALL, MaxCLL
Metadata for HDR10
MaxCLL – Maximum Content Light Level MaxFALL – Maximum
Frame Average Light Level
LEVEL 8
PER-TARGET TRIM METADATA (DYNAMIC)
Lift, Gain, Gamma,
Saturation, Chroma, Tone
Detail, Mid Contrast Bias,
Highlight Clipping
6-vector (R,Y,G,C,B,M)
saturation and 6-vector
(R,Y,G,C,B,M) hue trims
User defined image controls to adjust the CMv4 algorithms
per target with secondary color controls
LEVEL 9
PER-SHOT SOURCE CONTENT PRIMARIES (DYNAMIC)
Rxy, Gxy, Bxy, WPxy
Stores the mastering display color primaries and white point
as per-shot metadata
Dolby Vision QC requirements
Netflix, Amazon, and other streaming services are continuously adding more and more HDR
titles to their library with the aim of improving the quality of experience for their viewers and
differentiating their service offerings. This requires that the content suppliers are equipped to
deliver good quality and compliant HDR content. Moreover, having the ability to verify quality
before delivery becomes more important.
Many of these OTT services support both the HDR-10 and Dolby Vision flavors of HDR.
However, more and more Netflix HDR titles are now based on Dolby Vision. Dolby Vision is a
new and complex technology, and therefore checking the content for correctness and
compliance is not always easy. Delivering non-compliant HDR content can affect your business
and therefore using a QC tool to assist in HDR QC can go a long way in maintaining a good
standing with these OTT services.
Here are some of the important aspects to verify for HDR-10 and Dolby Vision:
1. HDR metadata presence
HDR-10: Static metadata must be coded with the correct parameter values.
Dolby Vision: Static metadata must be present once and dynamic metadata must be present for
every shot in the content.
2. HDR metadata correctness. There are a number of issues that content providers need to check
for correctness in the metadata:
Only one mastering display should be referenced in metadata.
Correct mastering display properties – RGB primaries, white point and Luminance range.
MaxFALL and MaxCLL values.
All target displays must have unique IDs.
Correct algorithm version. Dolby supports two versions:
Metadata Version 2.0.5 XML for CMv2.9
Metadata Version 4.0.2 XML for CMv4
No frame gaps. All the shots, as well as frames, must be tightly aligned within the timeline and
there should not be any gap between frames and/or shots
No overlapping shots. The timeline must be accurately cut into individual shots; and analysis to
generate L1 metadata should be performed on a per-shot basis. If the timeline is not accurately
cut into shots, there will be issues with luminance consistency and may lead to flashing and
flickering artifacts during playback.
No negative duration for shots. Shot duration, as coded in “Duration” field, must not be negative
Single trim for a particular target display. There should be one and only one trim for a target
display.
Level 1 metadata must be present for all the shots.
Valid Canvas and Image aspect ratio. Cross check the canvas and image aspect ratio with the
baseband level verification of the actual content.
3. Validation of video essence properties. Essential properties such as Color matrix, Color
primaries, Transfer characteristics, bit depth etc. must be correctly coded.
Netflix requires the Dolby Vision metadata to be embedded in the video stream for the content
delivered to them. Reviewing the embedded meta-data in video stream can be tedious and
therefore an easy way to extract & review the entire metadata may be needed and
advantageous.
How can we help?
Venera‟s QC products (Pulsar – for on-premise & Quasar – for cloud) can help in identifying
these issues in an automated manner. We have worked extensively with various technology and
media groups to create features that can help the users with their validation needs. And we have
done so without introducing a lot of complexity for the users.
Depending on the volume of your content, you could consider one of our Perpetual license
editions (Pulsar Professional, or Pulsar Standard), or for low volume customers, we also have
a very unique option called Pulsar Pay-Per-Use (Pulsar PPU) as an on-premise usage-based
QC software where you pay a nominal per minute charge for content that is analyzed. And we, of
course, offer a free trial so you can test our software at no cost to you. You can also download
a copy of the Pulsar brochure here. And for more details on our pricing you can check here.
If your content workflow is in the cloud, then you can use our Quasar QC service, which is the
only Native Cloud QC service in the market. With advanced features like usage-based pricing,
dynamic scaling, regional resourcing, content security framework and REST API, the platform is
a good fit for content workflows requiring quality assurance. Quasar is currently supported for
AWS, Azure and Google cloud platforms and can also work with content stored on Backblaze B2
cloud storage. Read more about Quasar here.
Both Pulsar & Quasar come with a long list of „ready to use‟ QC templates for Netflix, based on
their latest published specifications (as well as some of the other popular platforms, like iTunes,
CableLabs, and DPP) which can help you run QC jobs right out of the box. You can also
enhance and modify any of these QC templates or build new ones! And we are happy to build
new QC templates for your specific needs.
This content originally appeared here on: https://www.veneratech.com/hdr-dolby-vision-metadata-parameters-to-validate-content/