Associate Professor

Institut de Mathématique de Toulouse / INSA Toulouse

About me

  • Mixture models
  • Unsupervised classification
  • Testing procedures
  • Statistical methods for genomic data analysis
  • Applications to microarray, RNA-seq, single-cell RNA-seq, and spatial transcriptomic data analysis
  • HDR in Applied Mathematics, 2022

    Université Paul Sabatier, Toulouse, France.

  • PhD in Applied Mathematics, 2008

    Université Paris-Sud 11, France.

  • MS in Mathematics, 2005

    Université Paris-Sud 11, France.



DEFIANT - An interdisciplinary approach to the design of effective nanoparticle-based antimicrobials


DDisc - Double-dipping in single-cell RNAseq (2021-2024).

Single Cell

Single Cell - TTIL 2018 INP-INSA-ISAE project (2018-2019).


Mixture-based procedures for statistical analysis of RNA-seq data (ANR JCJC)

RNA-Seq et Stat

BQR project RNA-Seq et Stat (2012-2013).


ASTEC-sc (Shiny app.)

ASTEC-sc stands for "A Shiny application To Explore Clusterings of single-cell RNA-seq data".
ASTEC-sc is an interactive single-cell RNA-seq application. Written with the R package Shiny, it lets you upload a SingleCellExperiment (SCE) object containing count, normcount and logcount data sets, different cell clusterings, coordinates from dimensionality reduction methods, … You can then visualize cells in the reduced-dimension spaces along with expressed genes, compare cell clusterings, and determine marker genes. It is maintained by Nicolas Enjalbert-Courrech.

R package maskmeans

This package implements an aggregation/splitting multi-view K-means algorithm that starts from an initial clustering partition or a matrix of posterior probabilities. The goal is to refine the clustering obtained on the first, primary view by using additional data views; views that contain only noise or partially concordant information are down-weighted by the algorithm.

The Bioconductor package coseq

This package is devoted to the co-expression analysis of sequencing data. It contains the Poisson mixture models developed in HTSCluster (see below), the strategy based on Gaussian mixture models on transformed profiles (see Rau and Maugis-Rabusseau, 2016 for more details), and the K-means algorithm applied to RNA-seq profiles after a centered log ratio (CLR) or log centered log ratio (logCLR) transformation (see Godichon-Baggioni et al., 2017).
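As a rough illustration of the centered log ratio (CLR) transformation mentioned above, here is a minimal Python sketch. It is not code from coseq: the function name `clr` and the `pseudocount` argument are hypothetical, and the pseudocount is only an assumption used here to keep zero counts finite. The logCLR variant is not shown.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log ratio of one expression profile.

    Each value is log-transformed and the mean of the logs is subtracted,
    so the result sums to zero. `pseudocount` is a hypothetical choice
    that avoids log(0); it is not taken from the coseq package.
    """
    shifted = [c + pseudocount for c in counts]
    logs = [math.log(v) for v in shifted]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Toy profile of one gene across four samples (made-up numbers).
profile = [10, 0, 30, 59]
clr_profile = clr(profile)
```

Because the mean of the logs is subtracted, rescaling a profile by a constant leaves its CLR unchanged when the pseudocount is zero, which is why the transformation is usually described on relative profiles rather than raw counts.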

R package SelvarMix

This package performs variable selection in model-based clustering and discriminant analysis using a regularization approach.

The R package HTSCluster

This package implements two parameterizations of a Poisson mixture model to cluster observations (e.g., genes) in high throughput sequencing data. Parameter estimation is performed using either the EM or CEM algorithm, and the BIC or ICL criteria are used for model selection (i.e., to choose the number of clusters).
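To give a feel for the general idea (EM estimation of a Poisson mixture followed by BIC-based choice of the number of clusters), here is a small Python sketch. It is a deliberately simplified assumption: a one-dimensional mixture with one rate per cluster, not either of the parameterizations implemented in HTSCluster, and it omits the CEM algorithm and the ICL criterion. All function names are hypothetical, the data are simulated, and the BIC convention used here (lower is better) may differ from the package's.

```python
import math
import random


def poisson_logpmf(x, lam):
    """Log-probability of count x under a Poisson distribution with rate lam."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def fit_poisson_mixture(data, k, n_iter=200):
    """EM for a k-component univariate Poisson mixture (simplified sketch)."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: rates at evenly spaced quantiles, shifted off zero.
    lams = [ordered[int((j + 0.5) * n / k)] + 0.5 for j in range(k)]
    pis = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each observation.
        resp = []
        for x in data:
            logw = [math.log(pis[j]) + poisson_logpmf(x, lams[j]) for j in range(k)]
            norm = log_sum_exp(logw)
            resp.append([math.exp(v - norm) for v in logw])
        # M-step: update proportions and rates (floored to stay positive).
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weighted = sum(r[j] * x for r, x in zip(resp, data))
            pis[j] = max(nj / n, 1e-12)
            lams[j] = max(weighted / max(nj, 1e-12), 1e-8)
    loglik = sum(
        log_sum_exp([math.log(pis[j]) + poisson_logpmf(x, lams[j]) for j in range(k)])
        for x in data
    )
    # k rates plus (k - 1) free proportions; lower BIC is better here.
    bic = -2.0 * loglik + (2 * k - 1) * math.log(n)
    return pis, lams, loglik, bic


def sample_poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


# Simulated counts from two well-separated groups (rates 3 and 30).
rng = random.Random(42)
data = [sample_poisson(rng, 3.0) for _ in range(100)]
data += [sample_poisson(rng, 30.0) for _ in range(100)]

# Fit several candidate numbers of clusters and keep the lowest BIC.
fits = {k: fit_poisson_mixture(data, k) for k in (1, 2, 3)}
best_k = min(fits, key=lambda k: fits[k][3])
```

The selection step is just a comparison of the penalized log-likelihoods across candidate values of `k`; a classification-based criterion such as ICL would add an entropy term computed from the posterior probabilities.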

R package Capushe

Capushe performs model selection through penalized criteria, with the penalty calibrated using the slope heuristics.

R package HTSDiff

This package implements a Poisson mixture model to identify differentially expressed genes from RNA-seq data.

SelvarClust - SelvarClustIndep - SelvarClustMV

Variable selection algorithm in model-based clustering corresponding to



  • +33 5 61 55 92 30 (INSA, office 116) / +33 5 61 55 86 48 (UPS-1R1, office 208)
  • INSA Toulouse
    Département Génie Mathématique et Modélisation (GMM)
    Bat 12
    135 avenue de Rangueil
    31007 Toulouse Cedex,