My research confronts statistical models and machine learning methods with massive heterogeneous biodiversity data to answer questions in macroecology and community ecology. I am interested in measuring and anticipating the effects of human pressure on ecosystems, in particular plant invasions, to inform conservation planning. My interest in citizen science data leads me to ask new questions that this type of data is specifically suited to answer. I therefore develop methods to exploit the rich information hidden in massive crowdsourced datasets.

I did my PhD at INRAE, in the UMR AMAP, Montpellier, France, where I studied statistical methods for species distribution models (SDM) based on large presence-only datasets coming from citizen science programs. Exploiting data from the Pl@ntNet project was a major motivation for my thesis, and I collaborated closely with this project. This work involved (i) evaluating the benefits of deep learning and convolutional neural network approaches for presence-only SDM, (ii) characterizing the biases that arise from the distribution of sampling effort, species niches and background points in presence-only SDM based on Poisson point processes, (iii) developing a new unbiased approach based on a joint model of sampling effort and species distributions, and (iv) measuring the sampling and taxonomic coverage of Pl@ntNet contributions, in order to compare it with that of national botanical conservatories. This PhD was funded by the national INRA-INRIA scholarship of 2016.

I also co-organised the 2018, 2019 and 2020 editions of GeoLifeCLEF, a part of the LifeCLEF evaluation campaign. It is a machine learning challenge that aims to predict the most likely species from a geolocation.

I first did a 17-month CNRS postdoc at the Laboratoire d'Ecologie Alpine, Grenoble, working for the EcoNet project. We explored the use of graph embedding methods to compare the architectures of ecological interaction networks across space, environment or time. In particular, I leveraged knowledge of trophic interactions among European terrestrial vertebrates, together with broad-scale crowdsourced data (GBIF), to study the changes in trophic network architecture associated with higher land use intensity.

I am currently a postdoc with David Richardson at the Center for Invasion Biology of Stellenbosch University, South Africa, where we are exploring the opportunities offered by iNaturalist data for studying tree invasions in South Africa at local and national scales. Use cases include the early detection of recent first introductions and of secondary introductions, and the monitoring of natural spread at large scale. We are also combining past datasets (e.g. the Southern African Plant Invaders Atlas) with recent crowdsourced records (iNaturalist/Pl@ntNet) to reconstruct the spatial dynamics of tree invasions over the last decades in South Africa and Europe and to understand the drivers of their invasion success.