Research statement - UCLA Department of Mathematics

Dr. Dominique Zosso — Summary of Research Interests

Abstract

Images and imaging are ubiquitous today. Nearly every mobile phone has an integrated camera, and many users are quasi-professional "photoshoppers". There is, however, more to image understanding than applying a canvas texture filter, removing red eyes, or correcting motion blur (although that last one is already a pretty hard problem). Images are more than just the output of cheap phone cameras or expensive high-end gear: they are also essential in remote sensing, medicine, biology, computer vision, and beyond. Moreover, the role of images is not only to picture a scene aesthetically, but to convey information. In medical imaging, for example, we extract information from images about a physical reality that is hardly accessible to inspection otherwise, such as intra-cranial MRI. Image understanding encompasses individual problems such as denoising, segmentation, classification, and decomposition, on which scientists have been working for decades. And still, many of the associated research questions have not yet been answered to full satisfaction. Mathematically speaking, most image processing tasks are instances of inverse problems, where one looks for an underlying, hidden layer of information given a set of derived measurements. New applications, new types of images, and new questions to be answered through imaging come up every day. My research interest is to ride this highly dynamic wave at the cutting edge of image understanding. I want to contribute new mathematical models and efficient schemes for their computation, to provide new and better ways of getting information out of images, and to create new possibilities of imaging in the first place. Taking images is easy. Understanding them is the challenge.

Motivation

Ever since the first cave paintings, images and imaging have played important roles in human civilization. Early "research" was driven by the urge to perfect the fidelity of those images. The first successful use of heliography by Joseph Nicéphore Niépce in 1826 ultimately marks the beginning of "automated painting" as an entirely physical imaging process, which shortcuts the need for human understanding of the depicted object altogether.

The discovery of X-rays, commonly attributed to Wilhelm Conrad Röntgen in 1895, sparked the extension of imaging beyond the boundaries of normal human visual perception. From then on, it became possible to "see" information about objects that was otherwise hidden and inaccessible. The spectrum of imaging modalities has not stopped broadening since, each modality picturing a different physical property of the underlying object. Modern-day imaging is now heavily challenged by the backward problem of gaining insight into the physical reality of an object, given one or several derived images. Think of clinical health care without the assistance of non-invasive imaging techniques such as computed tomography or magnetic resonance imaging, or of modern life-science research without appropriate imaging instrumentation.

If we continue the journey through "imaging history", we realize that nowadays, thanks to the exploding number of image acquisition devices and the availability of cheap (electronic) computing power, not only the process of imaging itself, but increasingly also the interpretation of acquired images, is to a large extent left to non-human automata. The computational problems related to this task are generally referred to as image processing and computer vision, and their ultimate goal is to achieve image understanding. Typical tasks include, but are not limited to: image restoration, segmentation, registration, classification, decomposition, and stereo and multi-view scene reconstruction. The range of applications is enormous: medical imaging, biological imaging, non-destructive testing, remote sensing, surveillance and monitoring, robotics, and even consumer electronics.



Recent Graduate Research

During my PhD thesis at EPFL, I worked on a number of image processing problems. I would like to summarize the most important ones.

Geometric image registration

In this project, I first developed a novel geometric framework called geodesic active fields (GAF) for image registration in general. In image registration, one looks for the underlying deformation field that best maps one image onto another; see figure 1. This is a classic ill-posed inverse problem, which is usually solved by adding a regularization term. Here, instead, I proposed a multiplicative coupling between the registration term and the regularization term, the Beltrami energy of the embedded deformation field; see figure 2. This approach turns out to be equivalent to embedding the deformation field in a weighted minimal surface problem. The deformation field is then driven by a minimization flow toward a harmonic map corresponding to the solution of the registration problem. The proposed approach shares close similarities with the well-known geodesic active contours model in image segmentation, where the segmentation term (the edge detector function) is likewise coupled with the regularization term (the length functional) via multiplication. In fact, our geometric model is the exact mathematical generalization to vector fields of the weighted length problem for curves and surfaces.

Compared to specialized state-of-the-art methods tailored for specific applications, our geometric framework makes several important contributions. Firstly, our general formulation for registration works on any parametrizable, smooth, and differentiable surface, including non-flat and multiscale images. In the latter case, multiscale images are registered at all scales simultaneously, and the relations between space and scale are intrinsically accounted for. Secondly, this method is, to the best of our knowledge, the first re-parametrization-invariant registration method introduced in the literature. Thirdly, the multiplicative coupling between the registration term, i.e. the local image discrepancy, and the regularization term naturally results in a data-dependent tuning of the regularization strength. Finally, by choosing the metric on the deformation field, one can freely interpolate between classic Gaussian and more interesting anisotropic, TV-like regularization.

I then provided an efficient numerical scheme that uses a splitting approach: the data and regularity terms are optimized over two distinct deformation fields that are constrained to be equal via an augmented Lagrangian approach. This fast scheme is called FastGAF.
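The splitting idea behind FastGAF can be sketched in a few lines. The following is a toy illustration only, not the actual scheme: it assumes a 1-D field, a quadratic data term pulling toward an observed field f, and a simple quadratic smoothness term in place of the real image discrepancy and weighted Beltrami energy; all parameter values are arbitrary.

```python
import numpy as np

# Sketch of the splitting idea: minimize data(u) + regularity(v) subject to
# u = v, via an augmented Lagrangian (ADMM-style) iteration. The data term
# here is |u - f|^2 and the regularity term is quadratic smoothing of v;
# the actual FastGAF method uses the image discrepancy and the weighted
# Beltrami energy instead (this simplification is an assumption).

def split_registration(f, mu=1.0, lam=0.1, iters=200):
    u = f.copy()            # field optimized against the data term
    v = f.copy()            # field optimized against the regularity term
    b = np.zeros_like(f)    # scaled Lagrange multiplier enforcing u = v
    for _ in range(iters):
        # u-subproblem: min_u |u - f|^2 + (mu/2)|u - v + b|^2  (closed form)
        u = (2 * f + mu * (v - b)) / (2 + mu)
        # v-subproblem: smoothness vs. agreement with u + b, solved here by
        # one Jacobi-style averaging step (an illustrative shortcut)
        avg = 0.5 * (np.roll(v, 1) + np.roll(v, -1))
        v = (mu * (u + b) + 2 * lam * avg) / (mu + 2 * lam)
        # multiplier update: penalize the remaining gap between u and v
        b = b + u - v
    return u, v

# toy 1-D "deformation field": smooth signal plus noise
rng = np.random.default_rng(0)
f = np.sin(np.linspace(0, 2 * np.pi, 100)) + 0.1 * rng.standard_normal(100)
u, v = split_registration(f)
print("max |u - v| residual:", float(np.max(np.abs(u - v))))
```

At convergence the two copies of the field agree, so the split problem is equivalent to the original coupled one; each subproblem on its own is simple and fast to solve.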

Registration of cortical feature maps and automatic parcellation

Much like a tree, the human brain exhibits a highly convoluted and irregular structure, with a great deal of complexity and variability: sulci and gyri vary a lot between subjects. On the other hand, high-level structures of the brain, the "big picture", are highly conserved, such as the two hemispheres, the lobes, and the main folds. A hierarchical representation of these structures is important, for example, in the context of intersubject registration: considering the complexity of the cortical surface, directly involving local small-scale features would mislead the registration into getting trapped in local minima. A robust method needs to rely on large-scale features describing the main landmarks of the cortex, such as the main gyri or sulci. Subsequently, the features are iteratively refined to drive the registration more locally and reach the desired precision.

Here, I proposed a scale-space that lends itself to a meaningful hierarchical representation of structures on cortical feature maps on the sphere. The scale-space is expected to produce "generic" brain images at coarse scales, adding more and more individual details at finer scales, as shown in figure 3. Finally, I provided an implementation of the FastGAF method on the sphere, so as to register the triangulated cortical feature maps. We built an automatic parcellation algorithm for the human cerebral cortex, which combines the delineations available on a set of atlas brains in a Bayesian approach to automatically delineate the corresponding regions on a subject brain given its feature map (figure 4).



Figure 1: Image registration. The human skull is registered to chimpanzee and baboon by finding the deformation fields u(x), s.t. human features match chimpanzee/baboon at x + u(x).

Figure 2: The Swiss view of the Beltrami framework. A flat map corresponds to a higher-dimensional reality. Here, the intensity information (false-color) of the rectangular topographic map of the Matterhorn region translates directly into a three-dimensional terrain model. The surface of the Matterhorn, as measured by the Beltrami energy of its map, is a direct measure of its "regularity". (geodata © swisstopo)

Figure 3: Scale-space of cortical feature maps. (a)–(b) The equalized spherical map and projection on the partially inflated surface, at scales k. (c) Median thresholding illustrates the simplification of structures. Main structures, e.g. the central sulcus (red), were well preserved at coarser scales. (d)–(e) Magnitude and orientation of the gradient of the map.

Figure 4: Automatic parcellation (panels: Ground Truth, Automatic, Error on GT, Error on Feature). After registering a subject to the brains in the atlas, we perform automatic labelling. The first and second rows show manual ("ground truth") and automatic parcellation results on the pial and inflated cerebro-cortical surface. The incorrect regions are marked in the third row.


Geometric image segmentation with Harmonic Active Contours

In the meantime, I also worked on using the Beltrami energy as the single term in a variational framework for image segmentation. As a further extension of the Beltrami framework, we propose a segmentation method based on the geometric representation of images as surfaces embedded in a higher-dimensional space, enabling us to work naturally with multichannel images. The segmentation is based on an active contour, described as the zero level set of the level set function, which is embedded in the image manifold along with a set of image features.

Hence, both the data-fidelity and regularity terms of the active contour are jointly optimized by minimizing a single, unweighted Beltrami energy representing the hyper-surface of this embedded manifold. The approach is purely geometric and does not require additional weighting of the energy functional to drive the segmentation toward the image contours. Indeed, the joint embedding of the image features and the level set both regularizes the level set and couples it to the image features. We have implemented both gradient-based and region-based criteria. The potential of such a geometric approach lies in the general definition of Riemannian manifolds, enabling the use of the proposed technique in scale-space methods, volumetric data, or catadioptric camera images. This constitutes a more general framework for image segmentation, defining the optimal segmentation function as the harmonic map minimizing the hyper-surface of the manifold. The proposed technique is therefore called Harmonic Active Contours (HAC).

Efficient algorithm for the level set method

The level set method, as also used in the aforementioned HAC framework, is a popular technique for tracking moving interfaces in several disciplines, including fluid dynamics and computer vision. However, despite its high flexibility, the level set method is limited by two important numerical issues. Firstly, the level set algorithm is slow, because the time step is limited by the standard CFL condition, which is essential to the numerical stability of the iterative scheme. Secondly, the level set method does not implicitly preserve the level set function as a distance function, which is necessary to ensure a good estimate of the contour normal and the curvature during the evolution. We therefore propose a new algorithm that overcomes these two fundamental problems of the level set method. The algorithm is based on recent advances in ℓ1 optimization techniques, some of which have already been employed in the fast algorithm for optimizing the weighted Beltrami energy, above. The main idea is to split the original hard problem into sub-problems that are well known and easy to solve, and to combine them using an augmented Lagrangian approach that guarantees equivalence with the original problem. This idea is borrowed from recent efficient ℓ1 minimization methods originally applied to compressed sensing.
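To make the first issue concrete, here is a minimal explicit level-set evolution: an interface expanding with unit normal speed, integrated with an upwind scheme. This is a sketch under simple assumptions (toy grid, arbitrary sizes and CFL factor), not the proposed algorithm; it exhibits both limitations named above: the stable time step shrinks with the grid spacing, and the evolved function drifts away from a signed distance function.

```python
import numpy as np

# Explicit level-set evolution phi_t + |grad phi| = 0 (outward unit speed),
# discretized with a Godunov upwind scheme. The grid size, radius, step
# count, and the CFL safety factor are arbitrary demo choices.

n = 64
h = 1.0 / n
x = np.linspace(-0.5, 0.5, n)
X, Y = np.meshgrid(x, x, indexing="ij")
phi = np.sqrt(X**2 + Y**2) - 0.2      # signed distance to a circle, r = 0.2

dt = 0.5 * h                          # explicit step is CFL-limited: dt ~ h
area0 = int((phi < 0).sum())          # cells inside the initial interface
for _ in range(25):
    dxm = (phi - np.roll(phi, 1, axis=0)) / h   # backward differences
    dxp = (np.roll(phi, -1, axis=0) - phi) / h  # forward differences
    dym = (phi - np.roll(phi, 1, axis=1)) / h
    dyp = (np.roll(phi, -1, axis=1) - phi) / h
    # Godunov upwind gradient magnitude for outward motion (speed F = +1)
    grad = np.sqrt(np.maximum(dxm, 0)**2 + np.minimum(dxp, 0)**2
                   + np.maximum(dym, 0)**2 + np.minimum(dyp, 0)**2)
    phi -= dt * grad                  # phi decreases: the interface expands

print("inside cells before/after:", area0, int((phi < 0).sum()))
```

Note that after the evolution phi is no longer a distance function (e.g. a kink persists at the center), which is precisely what forces classical implementations to re-initialize periodically.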

Current Postdoctoral Research

Currently, I am a researcher at the Department of Mathematics at UCLA, on a fellowship of the Swiss National Science Foundation (SNF). This fellowship gives me great independence in my research.

A Unifying Geometric Framework for Image Processing

My main focus, the topic for which my current fellowship was granted, is research on a "unifying geometric framework" for image processing. We believe that the Beltrami framework has great potential as a generalized regularizing functional, in particular if equipped with a weighting function that allows multiplicative coupling of the data term with the regularization term of the specific inverse problem. Indeed, the resulting energy functional has very interesting fundamental properties, such as geometric regularization interpolating between Gaussian and TV regularization, re-parametrization invariance, applicability to any Riemannian manifold, and intrinsic, automatic, data-dependent modulation of the local regularization strength.
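For concreteness, a minimal sketch of the functional in question, in the standard Beltrami form; the simple gray-level graph embedding and the constant aspect ratio β are illustrative assumptions (the general framework works with arbitrary embeddings into Riemannian manifolds):

```latex
% Weighted Beltrami energy, sketched for a gray-level image I(x,y)
% embedded as the graph surface X = (x, y, \beta I(x,y)); the weight f
% provides the multiplicative coupling with the data term, and f \equiv 1
% recovers the plain (area-type) Beltrami energy.
E[I] \;=\; \int_\Omega f\bigl(x, I(x)\bigr)\,\sqrt{\det g}\;\mathrm{d}x,
\qquad
\sqrt{\det g} \;=\; \sqrt{1 + \beta^2\,\lvert\nabla I\rvert^{2}}.
% Small \beta yields near-quadratic (Gaussian) smoothing, while large
% \beta approaches TV-like regularization: the interpolation noted above.
```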



Research in related fields, such as compressed sensing and optimization theory, has yielded very efficient optimization algorithms, which can be applied to the weighted Beltrami framework as well. We postulate that the weighted Beltrami framework represents an important step toward a unifying variational framework for geometric image processing, with a high degree of generality and a multitude of beneficial properties. We therefore investigate if and how other inverse problems in computer vision, image processing, and related domains can be generalized by the weighted Beltrami framework, and develop robust and fast numerical schemes to optimize them.

Beyond this, I have developed two different generalizations of the Beltrami energy that make the functional applicable to more interesting and more powerful regularization problems.

Beltrami diffusion in the space of patches

First, I am working on weighted Beltrami regularization in the space of patches. Recently, the use of patches has significantly gained importance in various image processing applications. Indeed, the intensity or color information contained in a single local pixel is often not sufficient to completely characterize that pixel. Neighborhood information is required in order to better differentiate between textural features and noise, and diffusion on the space of patches was proposed mainly by Tschumperlé. Patch-based embeddings of the "Beltrami kind" were proposed as texture-aware edge indicators and for denoising. Only very recently was a computationally more interesting minimization scheme proposed.

Here, I propose a novel model for image restoration based on anisotropic diffusion on the space of patches, using the Beltrami embedding. We derive a local multiplicative coupling from the standard additive scheme and show how this automatically introduces an edge-aware pre-conditioner for the diffusion. We propose a splitting scheme that naturally handles the patch overlap and the different non-linearities in a very elegant and efficient way.

Graph-based and non-local Beltrami energy

Beyond patch diffusion, (patch-based) non-local regularization currently produces very promising results. For example, the current denoising state of the art is achieved by sparsification in a patch-group transform domain (BM3D). Currently, I am working on rendering the Beltrami energy fully non-local, by extending its definition to non-local operators as introduced, e.g., by Osher and Gilboa. This extension makes the benefits of Beltrami regularization, such as the intrinsic inter-channel coupling in vectorial or color images, readily available for data defined on graphs. This applies to non-local regularization, where the graph edge weights are defined by patch distances, but we equally see important uses in color-image processing, where node distances are defined by various color distances instead. Moreover, the anisotropic, inherently multichannel Beltrami regularization thereby becomes available for any graph-based inverse problem, such as clustering or segmentation, with immediate applications in machine learning.
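As a toy illustration of the non-local operators involved (in the style of the Gilboa–Osher operators referenced above; the patch size, the scale h, and the periodic test signal are all assumptions), the following builds patch-distance graph weights for a 1-D signal and applies the corresponding non-local Laplacian:

```python
import numpy as np

# Non-local graph operators on a 1-D signal: edge weights from patch
# distances, then the graph Laplacian (Lx)_i = sum_j w_ij (x_i - x_j).
# On a self-similar signal, the non-local Laplacian stays near zero even
# across distant repetitions, unlike a local finite-difference operator.

def patch_weights(x, radius=2, h=0.5):
    # row i of P is the (wrap-around) patch of 2*radius+1 samples around i
    P = np.stack([np.roll(x, -s) for s in range(-radius, radius + 1)], axis=1)
    d2 = ((P[:, None, :] - P[None, :, :])**2).sum(axis=2)  # patch distances
    W = np.exp(-d2 / h**2)
    np.fill_diagonal(W, 0.0)        # no self-loops
    return W

def nonlocal_laplacian(x, W):
    return W.sum(axis=1) * x - W @ x

x = np.tile([0.0, 0.0, 1.0, 1.0], 8)   # periodic, self-similar signal
W = patch_weights(x)
lap = nonlocal_laplacian(x, W)
print("max |non-local Laplacian|:", float(np.max(np.abs(lap))))
```

Replacing the patch distance with a color distance between nodes gives the color-image variant mentioned above, and any graph with such weights can carry the same regularization.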

Non-local Retinex

Retinex is a theory of human visual perception, introduced in 1971 by Edwin Land. It was an attempt to explain how the human visual system, as a combination of processes in both the retina and the cortex, is capable of adaptively coping with illumination that varies spatially in both intensity and color. In image processing, the retinex theory has been implemented in various flavors, each particularly adapted to specific tasks, including color balancing, contrast enhancement, dynamic range compression, and shadow removal in consumer electronics and imaging, bias field correction in medical imaging, and even illumination normalization, e.g. for face detection.

In this project, I develop a unifying framework for retinex that is able to reproduce many of the existing retinex implementations within a single model, including all gradient-fidelity-based models, variational models, and kernel-based models. The fundamental assumption, shared with many retinex models, is that the observed image is a locally pointwise multiplication between the illumination and the true underlying reflectance of the object. Taking the logarithm splits the observed image into a sum of contributions from reflectance and illumination. Starting from Morel's 2010 PDE model for retinex, where the illumination is assumed to vary smoothly and the reflectance is thus recovered from a hard-thresholded Laplacian of the observed image via a Poisson equation, we define our retinex model in two similar but more general steps.

The first step looks for a filtered gradient that is the solution of an optimization problem consisting of two terms: the first term is a sparsity prior on the reflectance, such as the TV or H1 norm; the second term is a quadratic fidelity prior on the reflectance gradient with respect to the observed image gradients. Since this filtered gradient is almost certainly not a consistent image gradient, we then look for a reflectance whose actual gradient comes close, while possibly respecting other priors and constraints on the reflectance or illumination.
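The two steps can be illustrated on a 1-D toy example. This sketch replaces the general optimization of step one with simple hard thresholding of the gradient, in the spirit of the Morel-style PDE model, and step two with direct reintegration; the signal and the threshold value are assumptions.

```python
import numpy as np

# 1-D retinex toy: in the log domain the observation is s = r + l, with
# piecewise-constant log-reflectance r and smooth log-illumination l.
# Step 1 filters the gradient (hard thresholding: small gradients are
# attributed to illumination); step 2 reintegrates the filtered gradient
# to recover the reflectance up to an additive constant.

n = 200
t = np.linspace(0, 1, n)
r = np.where(t > 0.5, 1.0, 0.0)      # true log-reflectance: one sharp edge
l = 0.8 * t                          # smooth log-illumination ramp
s = r + l                            # observed log-image

ds = np.diff(s)                      # observed gradient
eps = 0.05                           # threshold (assumed value)
ds_f = np.where(np.abs(ds) > eps, ds, 0.0)         # step 1: filtered gradient

r_hat = np.concatenate([[0.0], np.cumsum(ds_f)])   # step 2: reintegration
r_hat -= r_hat.min()                 # fix the free additive constant

print("max reconstruction error:", float(np.max(np.abs(r_hat - r))))
```

In the full model, the thresholding is replaced by the sparsity-plus-fidelity optimization, the reintegration by a least-squares (Poisson-type) problem, and the finite differences by non-local operators.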

Our framework makes extensive use of non-local differential operators, of which the classical local finite-difference operators are just special cases. Moreover, the freedom to choose different sparsity and fidelity norms in the first and second steps, respectively, allows us to reproduce many of the existing retinex models, as illustrated in figure 5. As validation and illustration, we provide extensive comparisons of the existing models with their equivalents in the proposed unifying retinex framework.

More importantly, by using different and more interesting non-local weightings for the sparsity and fidelity priors, we are able to derive entirely novel retinex formulations. This allows us to define new retinex instances particularly suited for drop-shadow removal, texture-preserving shadow removal, cartoon-texture decomposition, color image enhancement, and even shadow removal in hyperspectral images, all within a single framework, just by selecting different norms and non-local weights.

Subvoxel segmentation in diffusion-weighted magnetic resonance images

Precise segmentation of diffusion-weighted MR images (DWI) is a difficult task due to their very limited resolution (currently around 2.2 × 2.2 × 3 mm³). In brain tractography and subsequent connectivity analysis, however, precise segmentation of the CSF-grey-matter and white-matter-grey-matter interfaces is an important task in order to achieve robust fiber-tracking performance and consistent parcellation of the cortical surface across subjects. The diffusion-weighted acquisitions often suffer from severe distortions due to increased local field inhomogeneities, in particular in the anterior part of the brain and along the phase-encoding direction. On the other hand, anatomical MR images (e.g. T1- and T2-weighted acquisitions) can be produced with substantially better resolution (1 mm³ or less), relatively low distortion, and well-understood tissue contrast. Also, many excellent state-of-the-art methods are readily available to perform high-quality segmentation of the intracranial structures, including the important CSF-grey-matter and white-matter-grey-matter interfaces.

Here, we work on the segmentation problem in the poorly defined diffusion space by exploiting the same-subject anatomy easily extracted from T1-weighted images as strong shape priors. We reformulate the segmentation problem as an inverse problem, where we look for an underlying deformation field (the distortion) mapping from T1 space into diffusion space, such that the structures identified with high confidence in the T1 image optimally align with structures in diffusion space.

Variational adaptive mode decomposition

Many real-world signals are superpositions of underlying components of different spectral "color". The different colors of light, radio waves, electrical currents, etc., are all examples of this. In many cases, one is interested in isolating a particular component out of this blend, like the modulated carrier signal of a desired radio station. In other cases, we wish to recover several or even all of the individual components, less the noise, that form the observed signal, with applications in many domains, including image decomposition. In the case of frequency-separated modes,



Figure 5: Unifying non-local retinex model (columns: Input, Reflectance, Illumination). Retinex is a model attempting to explain the capacity of the human visual system to normalize inhomogeneous illumination, and many particular, application-specific implementations exist. Our unifying model decomposes an input image into underlying reflectance and estimated illumination, and successfully reproduces basic retinex behavior (grey squares and checkerboard). The same model allows dynamic range compression and local contrast enhancement (radiography), shadow detection and removal (tiles), texture-cartoon decomposition (checkerboard), and illumination-invariant feature extraction for face detection. We are working on extensions to color and hyperspectral images.



filter banks or wavelet decompositions are widely used techniques. The drawback here, however, is that one computes coefficients for far more components than are actually present, and that the shape of the components is often overly restricted and predefined. Adaptive band decomposition techniques have therefore gained importance, such as the empirical mode decomposition (EMD) algorithm.

We are currently working on a variational model for adaptive mode decomposition, in which we estimate from the input signal a number of modes that are mostly (but not strictly) band-separated, and whose carrier frequency and bandwidth are determined online, adaptively. Indeed, we perform Wiener filtering on the demodulated input signal for each individual mode, and the resulting optimization problems are a mixture of adaptive Gabor-wavelet filtering and spectral Gaussian mixture models.
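The core update suggested by this description can be sketched as follows. This is a simplified, illustrative implementation, not the full variational model: a fixed number of modes, no noise slack or exact-reconstruction multiplier, and assumed values for the bandwidth parameter alpha and the iteration count.

```python
import numpy as np

# Sketch of adaptive mode decomposition: each mode's spectrum is updated by
# Wiener filtering the residual around its current center frequency, and the
# center frequency is re-estimated as the spectral center of mass of the
# mode, so carriers and bandwidths adapt to the signal online.

def mode_decomposition_sketch(x, K=2, alpha=2000.0, iters=50):
    n = x.size
    f = np.fft.rfftfreq(n)                 # normalized frequencies in [0, 0.5]
    X = np.fft.rfft(x)
    U = np.zeros((K, f.size), dtype=complex)    # mode spectra
    omega = 0.5 * (np.arange(K) + 1) / (K + 1)  # spread initial centers
    for _ in range(iters):
        for k in range(K):
            residual = X - U.sum(axis=0) + U[k]
            # Wiener filter of the residual around the current center omega_k
            U[k] = residual / (1.0 + alpha * (f - omega[k])**2)
            power = np.abs(U[k])**2
            # update the carrier: spectral center of mass of the mode
            omega[k] = (f * power).sum() / max(power.sum(), 1e-16)
    modes = np.fft.irfft(U, n=n)
    return modes, omega

# two well-separated tones at normalized frequencies 0.05 and 0.2
t = np.arange(1000)
x = np.cos(2 * np.pi * 0.05 * t) + 0.5 * np.cos(2 * np.pi * 0.2 * t)
modes, omega = mode_decomposition_sketch(x, K=2)
print("recovered center frequencies:", np.sort(omega))
```

The quadratic penalty around each center frequency is exactly a Wiener filter on the demodulated mode, which is what ties this sketch to the Gabor-wavelet and spectral Gaussian mixture interpretation above.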

Future Research: Vision and Objectives

My research is driven by the quest for better models and tools for image understanding. I want to continue working on a broad spectrum of image processing and computer vision problems that all have useful applications in science and society. Image segmentation, decomposition, denoising, restoration, illumination normalization: these tasks are all routinely performed in medical image scanners as well as in consumer electronics, but the research is far from over. As imaging modalities become more and more complex, so do the questions to be answered by images. And the computational complexity of the inverse problems involved keeps growing.

More challenging forms of images, such as omni-directional images or maps on more complicated surfaces such as the human cortex, are becoming more important, and most of the established image processing tools do not directly translate to these new domains. Generalizing even further, data points on graphs can be considered images, and many machine learning tasks, such as clustering or classification, have a structure very similar to image processing problems.

Answering upcoming medical imaging problems still requires a great deal of original research in applied mathematics, a great deal of mathematical modeling intuition, and the synthesis of many powerful existing concepts, such as convex optimization tools, compressed sensing, Beltrami regularization, and non-local operators, coupled with new insights from optimization, numerical methods, and scientific computing. I am strongly committed to further enlarging my toolbox in applied mathematics and to dedicating myself to research on new image understanding solutions.

Dec. 2012, Dominique Zosso<br />

