Topic 1
A new loss function for Spatio-Temporal Denoising based on analyzing noise characteristics
Published literature shows that deep learning-based image denoising algorithms outperform traditional denoising algorithms. However, open issues remain when the goal is high image quality in terms of human perception. Both traditional approaches and ML-based models often fail to produce results that look natural and pleasing overall. In the movie-making industry in particular, quality expectations are extremely high, and most denoising results from the literature would not be accepted.
Video denoising is more challenging than image denoising because the human visual system readily detects temporal inconsistencies when the frames of a video sequence are not correctly correlated. Temporal consistency is therefore paramount for a video denoising algorithm.
Recently, at ARRI we worked on deep learning-based video denoising to understand the best performance we can expect from a deep learning model. A new loss function was tried out and the training infrastructure was set up. This thesis should cover an in-depth study of this new loss, optimizing it for image quality that is indistinguishable from real high-quality camera images. Additionally, the thesis will include model architecture optimization to achieve the target noise distribution defined by the loss.
Details of the scope can be adjusted to the kind of thesis (BT, MT, internship).
Topic 2
Depth Maps for motion picture: Combining real camera characteristics with foundation models
Recent advances in generating depth from 2D images show impressive results, e.g. (https://arxiv.org/pdf/2410.02073). Multiscale Vision Transformers are used to generate sharp depth maps while also estimating a focal length in pixels.
To obtain a reliable depth value, i.e. a distance in meters from the camera, additional camera data could be taken into account. This work investigates how the available models can be used as foundation models to take advantage of their extensive training on large datasets. In this work, you set up a training procedure that allows additional inputs from a known camera system to be fed in. These camera characteristics are extracted from available measurements and optical simulations.
The aim is a model that creates depth maps that not only look great but are also numerically accurate and could later be used for real movie-making applications.
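As a small illustration of how known camera data relates to the focal length in pixels mentioned above, the following sketch uses the standard pinhole relation between focal length in millimeters, sensor width, and image width. It is not part of the referenced model; the function names and the example numbers (36 mm sensor width, 18 mm lens, 3600 pixel image width) are assumptions chosen only for illustration.

```python
import math


def focal_length_mm_to_px(focal_length_mm: float,
                          sensor_width_mm: float,
                          image_width_px: int) -> float:
    """Pinhole relation: focal length in pixels from lens and sensor data."""
    return focal_length_mm * image_width_px / sensor_width_mm


def horizontal_fov_deg(focal_length_px: float, image_width_px: int) -> float:
    """Horizontal field of view implied by a focal length given in pixels."""
    return math.degrees(2.0 * math.atan(image_width_px / (2.0 * focal_length_px)))


def relative_error(estimated: float, reference: float) -> float:
    """Relative deviation of an estimated value from a known reference."""
    return abs(estimated - reference) / reference


# Hypothetical camera data, for illustration only.
known_focal_px = focal_length_mm_to_px(18.0, 36.0, 3600)
known_fov_deg = horizontal_fov_deg(known_focal_px, 3600)
```

A focal length estimated by a model could then be checked against the known value with `relative_error(estimated_focal_px, known_focal_px)`, where `estimated_focal_px` stands for whatever the model returns.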
Based on background and interests, the topic can be a thesis, an internship, or a combination of both.
Topic 3
ML-based image processing versus classical image processing: comparing image quality over computational effort
Recently, learning-based image processing algorithms have demonstrated superior image quality compared to classical image processing algorithms. However, the computational resources required for inference usually clearly exceed those needed to apply a classical algorithm. In real applications, resources are limited.
In this thesis, we investigate learning-based models with a small number of parameters and compare their results to those of available classical algorithms. Additionally, we study a range of model sizes alongside a classical algorithm with scalable computational effort, and examine how the ratio of image quality to computational effort compares between the classical approach and the ML-based approach.
In more detail, this project investigates different scalable setups that are feasible on a real camera system. Mainly two different directions should be compared:
- Learning/optimizing the parameters of two different available classical architectures with respect to an appropriate image quality metric
- Learning a network directly
The results will contribute to answering the question of which computational budgets neural-network-based models are suited for. We expect that for extremely small resources, classical approaches show higher quality than a very small network. For bigger models, we expect not only that the overall quality is better (this is known from the state of the art), but also that the ratio of quality to computational load is better.
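The comparison of quality over computational effort described above could be organized roughly as in the following sketch. Each approach is represented by a list of (effort, quality) pairs, and for a given budget the best feasible quality of each approach is compared. The function names, the units, and all numbers are assumptions for illustration; the numbers are placeholders, not measurements.

```python
from typing import List, Optional, Tuple

# (computational effort, quality score) pairs. Units are up to the study,
# e.g. operations per pixel and a quality metric where higher is better.
Point = Tuple[float, float]


def best_quality_within_budget(points: List[Point], budget: float) -> Optional[float]:
    """Highest quality among configurations whose effort fits the budget."""
    feasible = [quality for effort, quality in points if effort <= budget]
    return max(feasible) if feasible else None


def compare_at_budget(classical: List[Point], learned: List[Point], budget: float) -> str:
    """Name the approach with the better quality at the given budget."""
    c = best_quality_within_budget(classical, budget)
    n = best_quality_within_budget(learned, budget)
    if c is None and n is None:
        return "none"
    if n is None or (c is not None and c >= n):
        return "classical"
    return "learned"


# Placeholder numbers purely for illustration, NOT measured results.
classical_points = [(10.0, 30.0), (50.0, 31.0), (200.0, 31.5)]
learned_points = [(20.0, 29.0), (100.0, 32.0), (400.0, 34.0)]

for budget in (5.0, 30.0, 150.0):
    print(budget, compare_at_budget(classical_points, learned_points, budget))
```

Sweeping the budget over the range that is feasible on a camera system would show where, if anywhere, the preferred approach changes.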
Details of the thesis scope can be adjusted to the kind of thesis (BT, MT, internship).
Topic 4
Image Quality in a Motion Picture Workflow
Compression is a well-studied topic in image quality research; video tests evaluating quality versus data rate are well known and standardized. However, in digital cinema recordings used for high-end movie production, the setup differs crucially from consumer camera footage: after recording, the footage is usually subjected to heavy color grading, which amplifies any quality issues, or is used for VFX (e.g. greenscreen keying).
This internship/job/thesis should focus on how to include color grading (and, if time allows, other steps from movie post-production) in image quality tests, aiming for a method that can be used to compare different codecs for a digital cinema camera workflow. If time allows, lossy image processing steps can also be investigated in addition to codecs.
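One possible way to picture the idea of including grading in a quality test is sketched below: the same grade is applied to the reference and to the decoded footage before a metric is computed. PSNR and the simple gain/gamma grade are stand-ins chosen for illustration, not the method to be developed; the function names and pixel values are assumptions, with pixel values taken to lie in [0, 1].

```python
import math
from typing import Callable, List, Sequence


def psnr(reference: Sequence[float], test: Sequence[float], peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally long pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def simple_grade(pixels: Sequence[float], gain: float = 2.0, gamma: float = 0.8) -> List[float]:
    """Toy grade (assumption): gain, clip to [0, 1], then a gamma curve."""
    return [min(1.0, max(0.0, gain * p)) ** gamma for p in pixels]


def psnr_after_grade(reference: Sequence[float],
                     decoded: Sequence[float],
                     grade: Callable[[Sequence[float]], List[float]]) -> float:
    """Apply the same grade to both signals, then measure the difference."""
    return psnr(grade(reference), grade(decoded))


# Hypothetical pixel values standing in for a reference and a decoded frame.
reference = [0.1, 0.2, 0.3, 0.4]
decoded = [0.11, 0.19, 0.31, 0.39]

print(psnr(reference, decoded))
print(psnr_after_grade(reference, decoded, simple_grade))
```

With these toy values, the grade brightens the dark pixels and enlarges the coding error, so the PSNR after grading comes out lower than the PSNR measured on the ungraded signals.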
Based on background and interests, the topic can be a thesis, an internship, or a combination of both.
Contact: Dr. Tamara Seybold, tseybold(at)arri.de