Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network

Abstract

While fully-convolutional neural networks are very strong at modeling local features, they fail to aggregate global context due to their constrained receptive field. Modern methods typically address the lack of global context by introducing cascades, pooling, or by fitting a statistical model. In this work, we propose a new approach that introduces global context into a fully-convolutional neural network directly. The key concept is an implicit kernel convolution within the network. The kernel convolution blurs the output of a local-context subnet, which is then refined by a global-context subnet using dilated convolutions. The kernel convolution is crucial for the convergence of the network because it smoothens the gradients and reduces overfitting. In a postprocessing step, a simple PCA-based 2D shape model is fitted to the network output in order to filter outliers. Our experiments demonstrate the effectiveness of our approach, outperforming several state-of-the-art methods in facial landmark detection.
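The two building blocks named in the abstract, a kernel convolution that blurs a heatmap and a dilated convolution that enlarges the receptive field, can be illustrated with a minimal NumPy sketch. This is not the released network code: the helper names (`gaussian_kernel`, `conv2d_same`), the Gaussian kernel shape, and the values for kernel size, sigma, and dilation are assumptions chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Normalized 2D Gaussian kernel of shape (size, size); size must be odd."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def conv2d_same(image: np.ndarray, kernel: np.ndarray, dilation: int = 1) -> np.ndarray:
    """Zero-padded, same-size 2D cross-correlation with optional dilation."""
    kh, kw = kernel.shape
    ph, pw = dilation * (kh // 2), dilation * (kw // 2)
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap reads the input at a multiple of the dilation step.
            out += kernel[i, j] * padded[i * dilation:i * dilation + h,
                                         j * dilation:j * dilation + w]
    return out

# Stand-in for a local-context output: one sharp peak (illustrative only).
local = np.zeros((33, 33))
local[16, 16] = 1.0

# Kernel convolution: the peak is spread over a 9x9 neighborhood,
# so nearby pixels also carry signal.
blurred = conv2d_same(local, gaussian_kernel(9, sigma=2.0))

# A 3x3 kernel with dilation 4 touches 9 positions spread over a 9x9 window,
# i.e. a larger receptive field with the same number of weights.
taps = conv2d_same(local, np.ones((3, 3)), dilation=4)
```

Stacking several dilated layers with growing dilation is a common way to cover a whole face region without pooling; the actual layer configuration is described in the paper.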

Download

We provide our trained network model as well as the code used to train/evaluate our network. The PCA-based model fitting and results are included.
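For readers unfamiliar with PCA-based shape models, the general idea can be sketched as follows. This is a generic illustration, not the fitting procedure shipped in the download: the function names, the clipping to three standard deviations, and the synthetic five-point training shapes are all assumptions made for the example.

```python
import numpy as np

def fit_shape_model(shapes: np.ndarray, n_components: int):
    """Learn a linear shape model from shapes of size (n_samples, n_landmarks, 2)."""
    X = shapes.reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:n_components]                       # orthonormal rows
    std = s[:n_components] / np.sqrt(len(X) - 1)    # std. dev. per component
    return mean, basis, std

def regularize_shape(shape, mean, basis, std, max_std=3.0):
    """Project a detected shape onto the model and clip implausible coefficients."""
    coeffs = basis @ (shape.reshape(-1) - mean)
    coeffs = np.clip(coeffs, -max_std * std, max_std * std)
    return (mean + coeffs @ basis).reshape(shape.shape)

# Synthetic training data (illustrative): a pentagon with varying scale and x-shift.
angles = 2.0 * np.pi * np.arange(5) / 5
base = np.stack([np.cos(angles), np.sin(angles)], axis=1)
shift_x = np.array([1.0, 0.0])
training = np.array([s * base + t * shift_x
                     for s in np.linspace(0.8, 1.2, 5)
                     for t in np.linspace(-0.2, 0.2, 5)])
mean, basis, std = fit_shape_model(training, n_components=2)

# A detection with one outlier landmark is pulled back toward plausible shapes.
detected = base.copy()
detected[0] += [2.0, 2.0]
cleaned = regularize_shape(detected, mean, basis, std)
err_before = np.linalg.norm(detected - base)
err_after = np.linalg.norm(cleaned - base)
```

A plain least-squares projection like this one spreads part of the outlier's error over the other landmarks, so it reduces the total error without removing it.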

[Merget.zip] (case-sensitive password is: CVPR2018@MMK)

The network predictions (heatmaps) on 300W are comparatively large (~1.2 to 1.6 GB per *.zip) and are therefore offered as separate downloads.
To conserve bandwidth, please DO NOT download the files UNLESS you have read the README.TXT and made sure that you REALLY need them. Thank you!

[Merget_heatmap_paper_o2.zip] (case-sensitive password is: CVPR2018@MMK)
[Merget_heatmap_python_o2.zip] (case-sensitive password is: CVPR2018@MMK)
[Merget_heatmap_extra_o2.zip] (case-sensitive password is: CVPR2018@MMK)

For the curious, we also provide the output of the local-context subnet on 300W (~1.2 GB):

[Merget_heatmap_paper_o1.zip] (case-sensitive password is: CVPR2018@MMK)
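Once heatmaps are loaded into memory, landmark positions are typically read off as the per-channel maximum. The sketch below assumes an array of shape (n_landmarks, height, width); the storage format of the archives is documented in the README.TXT and is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def heatmaps_to_landmarks(heatmaps: np.ndarray) -> np.ndarray:
    """Return an (n_landmarks, 2) array of (x, y) peak positions.

    heatmaps: array of shape (n_landmarks, height, width), an assumed layout.
    """
    n, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(n, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=1)

# Tiny example with two landmarks on a 4x5 grid.
demo = np.zeros((2, 4, 5))
demo[0, 1, 3] = 1.0   # landmark 0 peaks at row 1, column 3
demo[1, 2, 0] = 1.0   # landmark 1 peaks at row 2, column 0
landmarks = heatmaps_to_landmarks(demo)
```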

Details can be found in: Daniel Merget, Matthias Rock, Gerhard Rigoll: "Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network". In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018. [cvpr2018.pdf]

If you use the provided material, please cite our paper!

Miscellaneous

Another group has ported our MATLAB code to Python and got it working on Google Colab. We cannot offer any support or warranty for their implementation, but you may want to check it out nevertheless:

https://github.com/ashxjain/Robust-Facial-Landmark

