February 8, 2018
Thematic day organized by GDR ISIS, Theme A
The day will start at 10:00 and end before 17:30.
Program: see below.
Location: Telecom ParisTech, 46 rue Barrault, Paris 13ème, Amphi B310
Organizers: Gersende Fort (email: gersende.fort.at.math.univ-toulouse.fr) and François Septier (email: francois.septier@imt-lille-douai.fr)
Monte Carlo Sampling and Statistical Learning
Theme: Recent developments in signal and image processing lead to increasingly difficult estimation problems, owing to their complex form, the large number of variables to estimate, and the growing dimension of the data involved. The goal of this day is to present new Monte Carlo methodologies and their interactions with statistical learning, notably through Bayesian modeling and stochastic optimization.
There will be two tutorial talks, by Francis Bach (DIENS, INRIA) and Nicolas Chopin (CREST, ENSAE), on stochastic optimization for learning and on accelerated Monte Carlo methods, respectively. The day will also include invited and contributed talks.
Call for contributions: talk proposals should be sent to the organizers before January 15, 2018.
Speakers:
Francis Bach (DIENS, INRIA) - tutorial
Simon Barthelmé (GipsaLab, CNRS)
Julien Bect (L2S, CentraleSupelec)
Nicolas Chopin (CREST, ENSAE) - tutorial
Jean-François Giovannelli (IMS, Univ. de Bordeaux)
Manon Michel (CMAP, Ecole Polytechnique)
Adil Salim (LTCI, Telecom ParisTech)
Umut Simsekli (LTCI, Telecom ParisTech)
Michal Valko (Sequel, INRIA)
Herwig Wendt (IRIT, CNRS)
Program:
Stochastic gradient descent methods for machine learning - Francis Bach (DIENS, INRIA)
In this talk, I will review the
main recent advances in stochastic approximation algorithms for
large-scale machine learning, with an emphasis on constant-step size
stochastic gradient and variance-reduced techniques.
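As a minimal illustration of the constant-step-size regime the tutorial covers (a sketch with made-up data, not the tutorial's own material): SGD on a least-squares objective, one row per update, with Polyak-Ruppert averaging of the iterates.

```python
import numpy as np

def sgd_least_squares(A, b, step=0.05, n_epochs=50, seed=0):
    """Constant-step-size SGD on (1/2n)*||A x - b||^2, sampling one row
    per update, with a running (Polyak-Ruppert) average of the iterates."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    avg = np.zeros(d)
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            g = (A[i] @ x - b[i]) * A[i]   # unbiased stochastic gradient
            x -= step * g
            t += 1
            avg += (x - avg) / t           # running average of iterates
    return avg
```

With a constant step size the raw iterates keep oscillating around the optimum; averaging them is what recovers a convergent estimator.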
A Bayesian estimator for the multifractal analysis of multivariate
data - Herwig Wendt (IRIT, CNRS)
Multifractal analysis makes it possible to assess the fluctuations of the pointwise regularity of signals and images, and has been used successfully in a wide range of applications of very different natures. Yet, despite past successes,
the accurate estimation of multifractal parameters still remains a
challenging task and requires the availability of large sample
sizes. This work studies a Bayesian framework that makes use of the
jointly recorded data components available in multivariate data to
improve estimation of multifractal parameters. It relies on a
statistical model for the logarithm of wavelet leaders that
comprises a gamma Markov random field joint prior which acts as a
regularizer and counterbalances the statistical variability induced
by small sample size. The model is specifically designed to lead to
an efficient sampler for the estimation of multifractal parameters
and yields excellent results on data with potentially many
components.
Data sub-sampling with Determinantal Point Processes - Simon Barthelmé (GipsaLab, CNRS)
work with Pierre-Olivier Amblard and Nicolas Tremblay
Determinantal Point Processes
(DPPs) are a class of point processes that exhibit "repulsion". This
property can be leveraged to obtain high-diversity subsets, meaning
that DPPs can be used to sub-sample various objects (surfaces,
datasets, graphs, etc.) with relatively high fidelity.
This idea has been suggested by several authors and holds
tremendous theoretical appeal. However, many difficulties crop up in
the implementation, and our goal has been to lift some of them.
One aspect is that DPPs come in two variants: fixed sample size
(so-called k-DPPs) and varying sample size. DPPs with varying sample
sizes are more tractable, since their inclusion probabilities admit
a closed form. k-DPPs make more sense in many applications, but are
less tractable, since inclusion probabilities are much harder to
compute. We show that k-DPPs and DPPs are asymptotically equivalent
in the limit of large databases, which leads to tractable formulas
for inclusion probabilities.
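To make the varying-size variant concrete, here is a sketch of the standard spectral sampler for an L-ensemble DPP (the classical algorithm of Hough et al., as described by Kulesza and Taskar); the kernel used below is a placeholder, not the data-driven kernels of the talk.

```python
import numpy as np

def sample_dpp(L, rng):
    """Exact spectral sampler for a DPP with L-ensemble kernel L (PSD).
    Phase 1 selects eigenvectors; phase 2 draws items one at a time,
    projecting the selected eigenvectors after each draw."""
    lam, V = np.linalg.eigh(L)
    keep = rng.random(len(lam)) < lam / (1.0 + lam)   # phase 1
    V = V[:, keep]
    sample = []
    while V.shape[1] > 0:
        # marginal probability of each item given the remaining eigenvectors
        p = np.sum(V**2, axis=1) / V.shape[1]
        i = rng.choice(len(p), p=p)
        sample.append(int(i))
        # project the eigenvectors onto the complement of e_i
        j = np.argmax(np.abs(V[i, :]))
        Vj = V[:, j]
        V = V - np.outer(Vj / Vj[i], V[i, :])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)        # re-orthonormalize
    return sorted(sample)
```

The sample size here is random (it equals the number of eigenvectors kept in phase 1), which is exactly the tractability/fidelity trade-off between DPPs and k-DPPs discussed above.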
On the use of SMC methods in the sequential design of computer experiments - Julien Bect (L2S, CentraleSupelec)
Sequential design methods have received a lot of attention in the computer experiments community, to address design problems where the task is to estimate some "local" feature of an expensive-to-evaluate computer model. Typical examples of such features include: maximizers and/or maxima of the function (optimization), failure probabilities or quantiles (reliability analysis), and so on. A popular approach to constructing such sequential designs is to adopt a Bayesian point of view: the computer model is endowed with a prior distribution (typically, a Gaussian process prior) and a "sampling criterion" (a.k.a. acquisition function, or infill criterion) is defined, using the posterior distribution, to quantify the expected benefit of a new evaluation at a given location. This talk will provide an introduction to these methods, and then discuss how Sequential Monte Carlo (SMC) methods can be used to implement them efficiently. The Bayesian subset simulation (BSS) algorithm will be presented as an example.
Main references:
[1] Julien Bect, Ling Li and Emmanuel Vazquez (2017). Bayesian
subset simulation. SIAM Journal on Uncertainty Quantification,
5:1, 762-786.
[2] Paul Feliot, Julien Bect and Emmanuel Vazquez (2017). A
Bayesian approach to constrained single- and multi-objective
optimization. Journal of Global Optimization, 67:1, 97-133.
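The prior-plus-criterion loop can be sketched in a few lines. The sketch below (my own illustration, not the BSS algorithm) uses a one-dimensional Gaussian process with a squared-exponential kernel and the classical expected-improvement criterion as the sampling criterion.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, ell=0.3):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_grid."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_grid)
    mu = Ks.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, sigma, best):
    """Expected-improvement sampling criterion (for maximization)."""
    z = (mu - best) / sigma
    Phi = 0.5 * (1.0 + np.array([erf(v / sqrt(2.0)) for v in z]))
    phi = np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)
    return (mu - best) * Phi + sigma * phi
```

The next design point is then `x_grid[np.argmax(expected_improvement(mu, sd, y_obs.max()))]`, after which the model is refit and the loop repeats.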
Sequential Monte Carlo: some recent developments - Nicolas Chopin (CREST, ENSAE)
SMC
(Sequential Monte Carlo) is a class of Monte Carlo algorithms for
filtering and related sequential problems. This talk will cover
some recent research on how to connect SMC with quasi-Monte Carlo
sampling, and how to study formally certain advanced resampling
schemes.
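As a reminder of the basic mechanism these refinements build on, here is a bootstrap particle filter with multinomial resampling for a toy linear-Gaussian model (a sketch; the quasi-Monte Carlo and advanced resampling schemes of the talk replace exactly these propagate/reweight/resample steps).

```python
import numpy as np

def bootstrap_filter(y, sigma_x, sigma_y, n_particles, seed=0):
    """Bootstrap particle filter for the model
        x_t = x_{t-1} + sigma_x * eps_t,   y_t = x_t + sigma_y * eta_t,
    with multinomial resampling at every step.  Returns filtering means."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_particles)
    means = []
    for yt in y:
        x = x + sigma_x * rng.standard_normal(n_particles)   # propagate
        logw = -0.5 * ((yt - x) / sigma_y) ** 2              # reweight
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))                          # filtering mean
        x = x[rng.choice(n_particles, n_particles, p=w)]     # resample
    return np.array(means)
```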
Irreversible Markov chain Monte Carlo methods - Manon Michel (CMAP, Ecole Polytechnique)
work with A. Durmus, S. Sénécal, Y. Deng and X. Tan
Recently developed in physics, irreversible MCMC schemes are attracting growing attention from the Bayesian statistics community, as they are serious candidates for scalable MCMC methods. These methods limit the random-walk behavior of the underlying Markov chain by breaking free from the reversibility condition. Based on piecewise deterministic Markov processes, they are rejection-free, provide a continuum of valid samples, and do not require critical parameter fine-tuning. Accelerations of up to several orders of magnitude have been exhibited for different target distributions.
In this talk, I will provide a broad overview of these methods, starting from the first developments in physics to the recent applications in Bayesian statistics. I will then discuss recent progress on how the random-walk behavior can be further reduced by the new Forward event-chain Monte Carlo scheme, and how the factorization of the transition probabilities is the key to reducing the computational complexity of these methods, down to constant-time complexity.
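For intuition, here is a one-dimensional zig-zag sampler targeting N(0,1), a representative piecewise deterministic method (my own toy illustration, not the Forward event-chain scheme): between events the state moves ballistically, events arrive with rate lambda(t) = max(0, v*(x + v*t)), and for this target the event times can be drawn by exact inversion, so the sampler is rejection-free.

```python
import numpy as np

def zigzag_moments(n_events=50000, seed=0):
    """1-D zig-zag process targeting N(0, 1).  Between events x moves at
    velocity v = +/-1; at each event the velocity flips.  Returns the first
    two moments, computed by exact integration along the trajectory."""
    rng = np.random.default_rng(seed)
    x, v = 0.0, 1.0
    T = m1 = m2 = 0.0
    for _ in range(n_events):
        e = rng.exponential()
        a = v * x
        # invert the integrated rate: int_0^t max(0, a + s) ds = e
        t = -a + np.sqrt(a * a + 2.0 * e) if a >= 0 else -a + np.sqrt(2.0 * e)
        T += t
        m1 += x * t + v * t * t / 2.0                    # int_0^t (x + v s) ds
        m2 += x * x * t + x * v * t * t + t**3 / 3.0     # int_0^t (x + v s)^2 ds
        x += v * t
        v = -v
    return m1 / T, m2 / T
```

Because the whole continuous trajectory is valid, moments are estimated by time integrals along it rather than by averaging discrete samples.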
Segmentation-deconvolution of textured images: a hierarchical Bayesian approach and stochastic sampling - Jean-François Giovannelli (IMS, Univ. de Bordeaux)
work with C. Vacar
This work addresses joint deconvolution-segmentation of textured images. The images consist of regions containing textured patches, in particular oriented ones, drawn from a set of K given classes. Each class is described by a Gaussian field driven by a parametric spectral density whose parameters are unknown. The class labels are described by a Potts field whose parameter is also unknown. The method relies on a hierarchical model and a Bayesian strategy to jointly estimate the labels, the K textured images, and the hyperparameters: the noise and image levels, as well as the texture parameters and, notably, the Potts field parameter. The strategy yields estimators that are optimal with respect to a joint risk: the marginal maximum a posteriori for the labels and the posterior mean for the parameters. They are computed numerically from samples of the posterior distribution, themselves simulated by a block Gibbs algorithm. Two of the steps are delicate: (1) drawing the textured images, which are high-dimensional Gaussians, is carried out by a Perturbation-Optimization algorithm [a], and (2) drawing the parameters of the textured images is obtained through a Fisher Metropolis-Hastings step [b]. Several numerical illustrations will be given. The work is under publication [c].
[a] F. Orieux, O. Féron and J.-F. Giovannelli, "Sampling
high-dimensional Gaussian distributions for general linear inverse
problems", Signal Processing Letters, May 2012.
[b] C. Vacar, J.-F. Giovannelli, Y. Berthoumieu, "Langevin and
Hessian with Fisher approximation stochastic sampling for
parameter estimation of structured covariance" ICASSP 2011.
[b'] M. Girolami, B. Calderhead, "Riemannian manifold Hamiltonian
Monte Carlo", Journal of the Royal Statistical Society, 2011.
[c] C. Vacar, J.-F. Giovannelli, "Unsupervised joint deconvolution
and segmentation method for textured images: A Bayesian approach
and an advanced sampling algorithm", EURASIP Journal on Advances
in Signal Processing, special issue on 'Advanced Computational
Methods for Bayesian Signal Processing'.
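A small-dimensional sketch of the Perturbation-Optimization idea of [a] (the matrices below are placeholders): each Gaussian posterior sample is obtained by perturbing the data and the prior mean, then solving the resulting penalized least-squares problem.

```python
import numpy as np

def perturbation_optimization(H, y, sigma, lam, n_samples, seed=0):
    """Draw samples from N(m, Q^{-1}) with Q = H^T H / sigma^2 + lam*I and
    m = Q^{-1} H^T y / sigma^2, by perturbing the data and the prior and
    solving a penalized least-squares problem for each sample."""
    rng = np.random.default_rng(seed)
    n, d = H.shape
    Q = H.T @ H / sigma**2 + lam * np.eye(d)
    out = np.empty((n_samples, d))
    for k in range(n_samples):
        y_pert = y + sigma * rng.standard_normal(n)        # perturb the data
        z_pert = rng.standard_normal(d) / np.sqrt(lam)     # perturb the prior mean
        rhs = H.T @ y_pert / sigma**2 + lam * z_pert
        out[k] = np.linalg.solve(Q, rhs)                   # "optimization" step
    return out
```

In high dimension, the direct solve above is replaced by an iterative optimizer, which is what makes the approach affordable for large Gaussian fields.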
Pliable rejection sampling - Michal Valko (Sequel, INRIA)
Rejection sampling is a well-known technique for sampling from difficult distributions. However, its use is limited by a potentially high rejection rate. Common adaptive rejection sampling methods either work only for very specific distributions or come without performance guarantees. In this talk, we present pliable rejection sampling (PRS), a new approach to rejection sampling, where we adapt the sampling envelope using a kernel estimator. Since our method builds on rejection sampling, the samples obtained are i.i.d. and, with high probability, exactly distributed according to the target density f. Another benefit of PRS is that it comes with a guarantee on the number of accepted samples. We will also discuss extensions that followed these results.
Related paper: Akram Erraqabi, Michal Valko, Alexandra Carpentier,
Odalric-Ambrym Maillard: Pliable rejection sampling, in International
Conference on Machine Learning (ICML 2016)
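For reference, the classical rejection step that PRS builds on (the envelope here is a plain uniform density, not the kernel estimator of the paper):

```python
import numpy as np

def rejection_sample(f, sample_g, pdf_g, M, n, rng):
    """Classical rejection sampling: propose x ~ g and accept with
    probability f(x) / (M * g(x)); accepted draws are i.i.d. from f."""
    out = []
    while len(out) < n:
        x = sample_g(rng)
        if rng.random() * M * pdf_g(x) <= f(x):
            out.append(x)
    return np.array(out)
```

For example, f(x) = 1.5 x^2 on [-1, 1] can be sampled with the uniform envelope g = 1/2 and M = 3; the acceptance rate is 1/M, which is precisely what an adaptive envelope such as the one in PRS improves.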
Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs - Adil Salim (LTCI, Telecom ParisTech)
work with Pascal Bianchi (Télécom ParisTech) and Walid Hachem (UPEM)
A regularized optimization problem over a large unstructured graph is studied, where the regularization term is tied to the graph geometry. Typical regularization examples include the total variation and the Laplacian regularization over the graph. When applying the proximal gradient algorithm to solve this problem, quite affordable methods exist to implement the proximity operator (backward step) in the special case where the graph is a simple path without loops. In this work, an algorithm referred to as "Snake" is proposed to solve such regularized problems over general graphs by taking advantage of these fast methods. The algorithm consists in properly selecting random simple paths in the graph and performing the proximal gradient algorithm over these simple paths. This algorithm is an instance of a new general stochastic proximal gradient algorithm, whose convergence is proven.
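On a simple path the backward step is indeed cheap. The sketch below (a stand-in for the fast 1-D solvers that Snake relies on, not the Snake algorithm itself) uses Laplacian regularization, one of the two examples mentioned above; its prox reduces to a tridiagonal linear solve, plugged into a plain proximal gradient loop.

```python
import numpy as np

def prox_laplacian_path(v, lam):
    """Prox of x -> (lam/2) * sum_i (x[i+1] - x[i])^2 on a path graph:
    solve (I + lam * L) x = v, with L the (tridiagonal) path Laplacian."""
    n = len(v)
    D = np.diff(np.eye(n), axis=0)          # edge-incidence matrix of the path
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, v)

def proximal_gradient(grad_f, prox_g, x0, step=0.5, n_iter=200):
    """Forward-backward splitting: x <- prox_{step*g}(x - step * grad_f(x))."""
    x = x0.copy()
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x
```

For instance, minimizing 0.5*||x - y||^2 plus the path-Laplacian penalty amounts to calling `proximal_gradient` with `grad_f = lambda x: x - y` and `prox_g = lambda v, s: prox_laplacian_path(v, s * lam)`.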
Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for MCMC - Umut Simsekli (LTCI, Telecom ParisTech)
Along with the recent advances in scalable Markov chain Monte Carlo methods, sampling techniques based on Langevin diffusions have started receiving increasing attention. These so-called Langevin Monte Carlo (LMC) methods are based on diffusions driven by a Brownian motion, which gives rise to Gaussian proposal distributions in the resulting algorithms. Even though these
approaches have proven successful in many applications, their
performance can be limited by the light-tailed nature of the
Gaussian proposals. In this talk, I will present an extension to
the classical LMC and develop a novel Fractional LMC (FLMC)
framework that is based on a family of heavy-tailed distributions,
called alpha-stable Lévy distributions. As opposed to classical
approaches, the proposed approach can possess large jumps while
targeting the correct distribution, which would be beneficial for
efficient exploration of the state space. I will also present
novel computational methods that can scale up to large-scale
problems and provide formal convergence analysis of the proposed
scheme. Our experiments support our theory: FLMC can provide
superior performance in multi-modal settings, improved convergence
rates, and robustness to algorithm parameters.
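The Brownian-driven baseline that FLMC generalizes is the unadjusted Langevin algorithm (ULA); in FLMC the Gaussian increment below is replaced by a heavy-tailed alpha-stable draw. A sketch on a standard Gaussian target:

```python
import numpy as np

def ula(grad_log_p, x0, step, n_steps, seed=0):
    """Unadjusted Langevin algorithm:
        x <- x + step * grad log p(x) + sqrt(2 * step) * N(0, 1).
    FLMC replaces the Gaussian increment by an alpha-stable one,
    allowing large jumps while still targeting the right distribution."""
    rng = np.random.default_rng(seed)
    x = x0
    out = np.empty(n_steps)
    for i in range(n_steps):
        x = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * rng.standard_normal()
        out[i] = x
    return out
```

With light-tailed Gaussian increments the chain moves locally, which is exactly the limitation in multi-modal settings that the heavy-tailed proposals of FLMC are designed to overcome.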