Generating HDR Video from SDR Video
Supplementary Website
Best experienced in Google Chrome
We propose a framework for HDR video synthesis from casual SDR footage by leveraging large-scale generative video models. A Multi-Exposure Video Model (MEVM) predicts exposure-bracketed linear SDR video sequences from a single nonlinear SDR input, and a learnable Video Merging Model (VMM) merges these brackets into high-quality HDR, preserving detail in both shadows and highlights. The viewer below presents side-by-side method comparisons rendered from decoded HDR frames; scrub through frames, toggle between SDR and HDR display, swap methods, and hover to inspect details.
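The merging step above can be pictured through the classical hat-weighted exposure merge that our learned VMM is designed to replace. The sketch below is illustrative only (function and variable names are our own, not part of our pipeline): each linear bracket is normalized by its relative exposure and blended with weights that favor well-exposed mid-tones over clipped shadows and highlights.

```python
import numpy as np

def merge_brackets(brackets, exposures):
    """Illustrative hat-weighted merge of linear exposure brackets into HDR.

    brackets:  list of HxWx3 float arrays in [0, 1] (linear, exposure-scaled)
    exposures: matching relative exposure factors, e.g. [0.25, 1.0, 4.0]
    """
    num = np.zeros_like(brackets[0], dtype=np.float64)
    den = np.zeros_like(brackets[0], dtype=np.float64)
    for img, t in zip(brackets, exposures):
        # Hat weight: peaks at mid-gray, falls to zero at 0 and 1,
        # so clipped pixels contribute little to the merged radiance.
        w = 1.0 - np.abs(2.0 * img - 1.0)
        num += w * img / t  # exposure-normalized radiance estimate
        den += w
    return num / np.maximum(den, 1e-8)
```

A learned merger can improve on this fixed weighting by reasoning about noise, motion, and hallucinated content across brackets, which is the role the VMM plays in our pipeline.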
Scroll down to view results.
In-the-Wild Videos
Our method can even be applied to your own footage! Here we test it on our own video albums and on arbitrarily sourced Internet video. Our method recovers plausible content in clipped highlight regions and in noisy, quantized shadow regions.
Iconic Cinema Moments
We demonstrate our method on iconic cinema footage. Our pipeline produces temporally coherent HDR outputs that enhance detail in both bright and dark regions without amplifying compression or grain noise.
Text to HDR Video Generation
Our pipeline can be chained with an off-the-shelf SDR video generation model to form a fully generative HDR video pipeline. We prompt a text-to-video model with exposure-related keywords such as “very bright”, “sunny day”, “very dark”, and “no lights” to produce an SDR input, which is then lifted to HDR, producing physically plausible dynamic-range expansion in both highlights and shadows while preserving temporal coherence.
Benchmarking our Method
We compare against baselines on the Stuttgart and UBC HDR benchmarks. Ground-truth HDR is available for each scene, enabling direct comparison. Switch the input exposure to inspect how each method handles over- and under-exposed inputs.
Limitations
Our fixed three-bracket exposure strategy cannot always recover the full dynamic range of high-contrast scenes, leaving some outputs with residual clipping after generation. We include representative failure cases to show where the approach breaks down and guide future improvements.