Reviewing Production OpenEXR files on the Web for ML

Presenter: Max Grosse (Disney Research)
Duration: 8 minutes
Slides: HTML | PDF

Slide 1 of 11

Welcome to my presentation on Reviewing production OpenEXR files on the web for Machine Learning. I'm Max Grosse, Principal Software Engineer at DisneyResearch|Studios.

I will show you a little bit of the machine learning pipeline we employ here, how dealing with feature-film production assets poses a unique challenge, and how we address it using modern web technology.

Slide 2 of 11

As we are part of the Studio segment of the Walt Disney Company, we naturally deal with production assets.

When it comes to images, this typically means we deal with higher resolutions, 1080p at least, up to 4K and even more. Additionally, the imagery is typically of high dynamic range, stored in the OpenEXR file format as 16-bit or 32-bit floating-point values.

For many applications, we also need to inspect aspects of the images beyond the final composed color data. These can be feature buffers from rendering, such as depth, normals, or alpha masks, provided as separate inputs so that for each final color pixel you can quickly determine the depth or normal that goes along with it.

To inspect the training data that gets fed into our deep neural networks, to view error maps on the validation data, and generally to judge the visual quality of our results, or to enable artists to judge them properly, we require a little more than a cut-out thumbnail of a JPEG image within, for example, TensorBoard.

Slide 3 of 11

An example of our work is the Kernel-Predicting Convolutional Networks for Denoising Monte Carlo Renderings that my colleagues published in 2017 at ACM SIGGRAPH.

This introduces a deep learning approach for denoising Monte-Carlo rendered images that produces high-quality results suitable for production.

Slide 4 of 11

This is when work on JERI started, the JavaScript Extended-Range image viewer.

The idea is that we can have a remote, or local, HTTP server that serves not only the OpenEXR images we are interested in, but also an HTML5-based viewer, such that these OpenEXR images can be inspected directly within a modern browser.

In particular, this allows conveniently reviewing result images stored on our compute cluster without the need to explicitly copy the files or mount the storage locally.

This also guarantees that the client receives the original images, without any additional compression, providing full control over the exact pixel values that will be displayed. It functions like a very configurable <img> tag with a lot of additional features to dig deep into extended-range images.
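To make the serving side concrete, here is a minimal sketch of such a static HTTP server in Node.js/TypeScript; the directory, port, and handler are purely illustrative and not the actual setup used at DisneyResearch|Studios.

```typescript
// Minimal static file server sketch (Node.js + TypeScript).
// Serves both the HTML5 viewer files and the raw .exr files so they can be
// fetched and decoded in the browser. Illustrative only; paths and port are
// hypothetical.
import * as http from "http";
import * as fs from "fs";
import * as path from "path";

const ROOT = "/mnt/cluster/results"; // hypothetical results directory on the cluster
const TYPES: Record<string, string> = {
  ".html": "text/html",
  ".js": "application/javascript",
  ".json": "application/json",
  ".exr": "application/octet-stream", // EXRs are sent as raw bytes, no re-encoding
};

http.createServer((req, res) => {
  // Crude path sanitization for the sketch; a real server would be stricter.
  const rel = path.normalize(req.url ?? "/").replace(/^(\.\.[\/\\])+/, "");
  const file = path.join(ROOT, rel);
  fs.readFile(file, (err, data) => {
    if (err) { res.writeHead(404); res.end("not found"); return; }
    res.writeHead(200, { "Content-Type": TYPES[path.extname(file)] ?? "application/octet-stream" });
    res.end(data); // the original bytes reach the client unmodified
  });
}).listen(8000);
```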

Slide 5 of 11

For this, we have compiled the OpenEXR library to WebAssembly using Emscripten.

The Emscripten toolchain still lacks a bit in terms of quality of life, so getting this right was a somewhat tedious process; once built, however, the module can potentially be used in a variety of applications.

Decoding speed was not our primary concern, so we did not perform any extensive benchmarks on how fast it decodes EXR images compared to a native solution. Typically, even on a local network, there is already a small delay for downloading the EXR, so decoding speed has not proven to be an issue for our users so far.

In particular, with some local caching, switching between loaded images is usually instantaneous anyway.
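As a rough illustration of how such a WebAssembly decoder is driven from the browser, the sketch below fetches an EXR and hands it to an Emscripten-built module. The module factory `createExrModule` and the exported function `decode_exr` are hypothetical names; only the surrounding Emscripten plumbing (`_malloc`, `HEAPU8`, `ccall`) is the standard pattern.

```typescript
// Sketch of calling an Emscripten-compiled OpenEXR decoder in the browser.
// Exported names are illustrative, not the actual JERI bindings.
import createExrModule from "./openexr_wasm.js"; // assumed Emscripten MODULARIZE=1 output

async function loadExr(url: string): Promise<{ width: number; height: number; rgba: Float32Array }> {
  const bytes = new Uint8Array(await (await fetch(url)).arrayBuffer());
  const Module = await createExrModule();

  // Copy the encoded file into the WASM heap.
  const inPtr = Module._malloc(bytes.length);
  Module.HEAPU8.set(bytes, inPtr);

  // Hypothetical exported C function: decodes to RGBA float32 and writes
  // [pixelsPtr, width, height] into a small output struct.
  const outInfoPtr = Module._malloc(3 * 4);
  Module.ccall("decode_exr", null, ["number", "number", "number"], [inPtr, bytes.length, outInfoPtr]);
  const [pixelsPtr, width, height] = new Int32Array(Module.HEAP32.buffer, outInfoPtr, 3);

  // View the decoded pixels (4 floats per pixel) and copy them out of the heap.
  const rgba = new Float32Array(Module.HEAPF32.buffer, pixelsPtr, width * height * 4).slice();

  Module._free(inPtr);
  Module._free(outInfoPtr);
  // Note: the sketch leaks the decoded buffer; a real binding would free it too.
  return { width, height, rgba };
}
```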

Slide 6 of 11

Given the extended-range nature of the input imagery, we need a way to control things like gamma and exposure, also in case we want to drill into details in particularly dark or bright regions.

Deep learning is commonly driven by a loss function, which in our case is often a combination of image difference metrics. Visualizing this error is of great importance in helping develop the desired model.

For all these visualization aspects, we have opted to use WebGL. This was a very natural choice and provides a very efficient and convenient way to change how things are displayed without too much code and without modifying the original pixel values directly.

In particular, this offloads all the pixel operations to the GPU, keeping the user experience smooth and avoiding manipulating large arrays of pixels directly in a single JavaScript thread.
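The kind of per-pixel display transform this enables is easiest to see in a fragment shader. The snippet below is a minimal sketch of an exposure/gamma shader, not JERI's actual shader code; the float pixels stay untouched in the texture, and only the displayed values are transformed.

```typescript
// Minimal exposure/gamma fragment shader sketch (WebGL 1, GLSL ES 1.0),
// illustrative only. A difference/error-map view can be built the same way
// by sampling two textures and visualizing their per-pixel difference.
const fragmentShaderSource = `
  precision highp float;
  uniform sampler2D u_image;   // HDR pixels uploaded as a float texture
  uniform float u_exposure;    // exposure in stops
  uniform float u_gamma;       // display gamma
  varying vec2 v_texCoord;

  void main() {
    vec3 hdr = texture2D(u_image, v_texCoord).rgb;
    vec3 exposed = hdr * pow(2.0, u_exposure);                     // apply exposure
    vec3 display = pow(max(exposed, 0.0), vec3(1.0 / u_gamma));    // apply gamma for display
    gl_FragColor = vec4(display, 1.0);
  }
`;
```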

The basic viewer application was then written mostly in TypeScript with React.js, which optionally handles the UI aspects and helps integrate the viewer into other React.js projects.

Slide 7 of 11

The viewer itself is configured by providing a JSON file that describes which EXR images to load, the remote paths where to find them, which images to group together, as well as which images should form an error map (i.e., image differences).
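Purely as an illustration of the idea, such a configuration might look roughly like the sketch below, written as a TypeScript literal for clarity. The key names are hypothetical and do not reproduce JERI's documented schema; see the JERI documentation for the real format.

```typescript
// Hypothetical viewer configuration sketch; in practice this content would
// live in the JSON file the viewer loads. Key names are illustrative only.
const viewerConfig = {
  title: "Denoising run 0042",                      // hypothetical run name
  groups: [
    {
      title: "Kitchen scene",
      images: {
        Input:     "renders/kitchen_noisy.exr",     // paths relative to the HTTP server root
        Improved:  "results/kitchen_denoised.exr",
        Reference: "renders/kitchen_reference.exr",
      },
      // Pairs of the images above to be turned into on-the-fly error maps.
      differences: [
        { a: "Improved", b: "Reference" },
        { a: "Input",    b: "Reference" },
      ],
    },
  ],
};
```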

Slide 8 of 11

Here's a quick demonstration of JERI in action.

The examples are taken directly from the JERI website and are only intended to illustrate the idea; note that these are not production assets, and the results shown are far worse than our production results.

In this example, we have a noisy input that we might have gotten from our renderer.

We can now toggle to "Improved" to inspect the results from a simple denoiser.

Then we can zoom in and pan around to really compare, even at the pixel level.

Ultimately, we are interested in how we compare to a ground truth, or reference, which is also available here and we can simply toggle to that as well.

As you can see, for example, quite some detail is lost in the ice cubes.

Even better though is to have an error map between the output and the reference computed on the fly, as you can see here.

Here we can also adjust the exposure, so we can better visualize parts we could otherwise not see in that detail.

Of course, the exposure adjustment works for color images just the same.

We can repeat that for a different set of images, if we want to see how it performed on different input.

Slide 9 of 11

We have integrated JERI into our ML monitoring system that runs on our cluster.

Here you can see a more typical use case: the recorded training runs on the left, and many different sets of images and metrics displayed in the main pane, allowing you to quickly drill down and monitor your progress and results.

For example, we can look at different validation images, different channel sets, and different points in time during the training. At any time we could also fire up a TensorBoard to look at other recorded metrics or write a report on our results, so a web-based solution like JERI really fits in nicely.

Slide 10 of 11

We have released JERI as open-source software; you can try it out yourself and integrate it into your own solutions. See jeri.io for details.

Slide 11 of 11

That's all from my side. Thank you for your attention.
