Florent Delgrange

Doctoral Researcher in Artificial Intelligence

AI Lab, Vrije Universiteit Brussel

Biography

I am a PhD student in the AI Lab of the Vrije Universiteit Brussel (VUB), under the supervision of Ann Nowé (AI Lab, VUB) and Guillermo A. Pérez (University of Antwerp). My research interests lie in the fields of artificial intelligence and formal verification. More specifically, my PhD focuses on the formal verification of single- and multi-agent policies obtained through reinforcement learning. The end goal of my research is to provide end-users with reliable AI mechanisms.

News Feed

Our paper The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models has been accepted to ICLR 2024!
Check out our new preprint: Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies.

Interests

Reinforcement learning
Model checking and synthesis
Multi-objective decision making
Planning under uncertainty and partial observability
Deep generative modeling
Multi-agent systems

Education

PhD in Computer Science
Vrije Universiteit Brussel (VUB), Belgium
Master in Computer Science, 2018
University of Mons (UMONS), Belgium
Bachelor in Computer Science, 2016
UMONS, Belgium

Experience

Doctoral researcher

AI Lab, Vrije Universiteit Brussel

Dec 2019 – Present Brussels, Belgium

Formal verification of single- and multi-agent policies obtained through reinforcement learning.

Scientific Researcher

RWTH Aachen University and UMONS

Sep 2018 – Aug 2019 Aachen, Germany and Mons, Belgium

Many-sided synthesis in stochastic systems.

Data science intern

Nokia Bell Labs

Sep 2017 – Nov 2017 Antwerp, Belgium

Trained machine learning models to detect, identify, and troubleshoot several impairments impacting DSL lines.

Research intern

UMONS

Aug 2016 – Sep 2016 Mons, Belgium

Introductory research internship in the software engineering lab. Developed a software tool for generating state machine visualizations from UML statechart specifications.

Featured Publications

The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

Partially Observable Markov Decision Processes (POMDPs) are useful tools to model environments where the full state cannot be perceived by an agent. As such, the agent needs to reason by taking past observations and actions into account. However, simply remembering the full history is generally intractable due to the exponential growth of the history space. A probability distribution that models the belief over the true state can serve as a sufficient statistic of the history, but its computation requires access to the model of the environment and is also intractable. Current state-of-the-art algorithms use Recurrent Neural Networks (RNNs) to compress the observation-action history, aiming to learn a sufficient statistic, but they lack guarantees of success and can lead to suboptimal policies. To overcome this, we propose the Wasserstein-Belief-Updater (WBU), an RL algorithm that learns a latent model of the POMDP and an approximation of the belief update. Our approach comes with theoretical guarantees on the quality of our approximation, ensuring that the beliefs it outputs allow for learning the optimal value function.

Raphael Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers

Publications

Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo A. Pérez (2024). Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies. arXiv Preprint.

Raphael Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers (2023). The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models. The Twelfth International Conference on Learning Representations, ICLR 2024.

Florent Delgrange, Mathieu Reymond, Ann Nowé, Guillermo A. Pérez (2023). WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks. Proceedings of the Adaptive and Learning Agents Workshop (ALA 2023).

Florent Delgrange, Ann Nowé, Guillermo A. Pérez (2023). Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees. The Eleventh International Conference on Learning Representations, ICLR 2023.

Florent Delgrange, Ann Nowé, Guillermo A. Pérez (2022). Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 36 No. 6: AAAI-22 Technical Tracks 6, 6497-6505.

Mahmoud Elbarbari, Florent Delgrange, Ivo Vervlimmeren, Kyriakos Efthymiadis, Bram Vanderborght, Ann Nowé (2022). A Framework for Flexibly Guiding Learning Agents. Neural Computing and Applications, Special Issue on Adaptive and Learning Agents 2021.

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour (2020). Life is Random, Time is Not: Markov Decision Processes with Window Objectives. Logical Methods in Computer Science, December 14, 2020, Volume 16, Issue 4.

Florent Delgrange, Joost-Pieter Katoen, Tim Quatmann, Mickael Randour (2020). Simple Strategies in Multi-Objective MDPs. Tools and Algorithms for the Construction and Analysis of Systems - 26th International Conference, TACAS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Part I.

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour (2019). Life Is Random, Time Is Not: Markov Decision Processes with Window Objectives. 30th International Conference on Concurrency Theory, CONCUR 2019, August 27-30, 2019, Amsterdam, the Netherlands.
