Theory

Dolby Vision is a technology for supplying dynamic metadata containing additional toning information to better fit the Directors Intent. There are many profiles of Dolby Vision, Noteably Profile 5, 7, 8.1 and 8.4. All profiles suppy this metadata at NAL Unit 62 and is known as the RPU. The RPU shapes the image by defining the behavior of a collection of NLQ and MMR filters with respect to the displays primaries and limitations.

  • Profile 5 is typically used by streaming services, it’s distinguishing factor is the ICtCp Color Model in it’s base layer, making the file incompatible with players that don’t support it. This color format has been shown to perform far better with chroma reconstruction from a 4:2:0 source.

  • Profile 7 is used in bluray releases, typically shipped with 2 tracks, a base layer and an enhancement layer. The base layer being a 1000 nitt HDR10 10b 4:2:0 compatibility source. The Enhancement layer is a HDR10 4:2:0 1080p source optionally containing additional luma and chroma details to reconstruct a 4000 nitt HDR10 4:2:2 12b source, this is called FEL, MEL is when there is no additional data in the track.

  • Profile 8.1/8.4 is primarily used as a consumer format, containing a HDR10 (8.1) or HLG (8.4) base track for compatibility, and the RPU layer can be ignored by players that dont support it. Most content ripped from bluray is converted to this format as it has the greatest player support.

Playback Support

Coming soon

Implementation

I personally believe players should be able to act as player-led (LLDV) and sink-led dolby vision sources. So far no implementation has fully covered the full scope of the technology and I hope player integrators look through this list to get a better understanding of the dolby vision pipeline to be compliant with all profiles, utilizing all data available. To implement this you essentially need 2 filters, a nlq, and an mmr. It’s highly important that all corrections in the pipeline are applied otherwise shifted color and or lumanance is likely. You’ll need the monitors EDID information to attempt to recover display information such as peak brightness and display primaries for applying the generated luts.

Nonlinear Quantization (NLQ)

$R = {\sum_{i=0}^{N} p_{j,i} * (s / 2^b)^i}$

Where $p$ are prediction coeffs in the metadata, $s$ is the BL luma value, $b$ is the BL bitdepth, and $j$ is the current segment of the polynomial.

float code_coming_soon = 0.0;

Multivariate multiple regression (MMR)

$R = {m_{0} + \sum_{i=0}^{N} ({m_{1}[i] * S_{0}^{i} + m_{2}[i] * S_{1}^{i} + m_{3}[i] * S_{2}^{i} + m_{4}[i] * S_{0}^{i} * S_{1}^{i} \over + m_{5}[i] * S_{0}^{i} * S_{2}^{i} + m_{6}[i] * S_{1}^{i} * S_{2}^{i} + m_{7}[i] *S_{0}^{i} * S_{1}^{i} * S_{2}^{i}})}$

float code_coming_soon = 0.0;

Open Source Efforts

Github user quietvoid has made massive strides reverse engineering the streams to convert between profiles and assure playback compatibility. The main project is Dovi Tool which can be used to extract or modify dolby vision streams. Other tools in his belt include vs-nlq which can combine the EL and BL tracks into one 12b stream.

Some of his code ended up in libplacebo, and from there it’s been adopted into many players such as mpv, infuse, MPC VR, and madVR.

The first related patent I could find from Dolby in the delivery of multiple video frames. This can be helpful trying to understand how the data could be packed in the HEVC stream.

Adaptive inter-layer interpolation filters for multi-layered video delivery

It appears the original intent for this would be for encoding 3D video content. However that changed with this patent. Which is the first reference of an enhancement layer.

Systems and methods for multi-layered image and video delivery using reference processing signals

Some other fun things have been done with the EL, such as constructing progressive video from an interlaced compatibility source

Dual-Layer Backwards-Compatible Progressive Video Delivery

Finally the result of years of playing around they figured out a new use for the EL. Dolby Vision Profile 7.

Layered representation containing crc codes and delivery of high dynamic range video