Florent Delgrange

Post-doctoral Researcher in Computer Science

AI Lab, Vrije Universiteit Brussel

Biography

I am a post-doctoral researcher at the AI Lab of Vrije Universiteit Brussel (VUB). My research focuses on artificial intelligence and formal verification. Specifically, I work on theoretical aspects of reinforcement learning (RL), representation learning in RL, model checking and synthesis in stochastic systems, and decision-making under uncertainty and partial observability. The end goal of my research is to provide end-users with reliable AI mechanisms. I also teach the Theory of Computation course at the VUB.

Previously, I completed a joint PhD at the VUB and the University of Antwerp under the supervision of Ann Nowé and Guillermo A. Pérez. My thesis focused on enabling the formal verification of deep RL policies (you can find the dissertation here).

My curriculum vitae is available here.

News

Check out our new preprint: Deep SPI: Safe Policy Improvement via World Models.
I received the Best Poster Award at BeNeRL for my poster on our latest paper. Check out the poster here.
Our paper “Composing Reinforcement Learning Policies, with Formal Guarantees” has been accepted at AAMAS 2025! Check out the dedicated blogpost.

Interests

Reinforcement learning
Model checking and synthesis
Representation learning in RL
Multi-objective decision making
Decision-making under uncertainty and partial observability
Deep generative modeling

Education

Doctor of Science, Computer Science, 2024
Vrije Universiteit Brussel (VUB) and University of Antwerp, Belgium
Master in Computer Science, 2018
University of Mons (UMONS), Belgium
Bachelor in Computer Science, 2016
UMONS, Belgium

Posts

Composing Reinforcement Learning Policies, with Formal Guarantees

Synthesizing controllers in large domains from verified world models and reinforcement learning policy composition.

Florent Delgrange

Last updated on May 22, 2025 (13 min read)

Featured Publications

Florent Delgrange, Raphaël Avalos, Willem Röpke (2025). Deep SPI: Safe Policy Improvement via World Models. arXiv preprint.

Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo A. Pérez (2025). Composing Reinforcement Learning Policies, with Formal Guarantees. Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025), IFAAMAS.

Florent Delgrange (2024). Activating Formal Verification of Deep Reinforcement Learning Policies by Model Checking Bisimilar Latent Space Models. VUBPRESS, Brussels University Press.

Publications

Florent Delgrange, Raphaël Avalos, Willem Röpke (2025). Deep SPI: Safe Policy Improvement via World Models. arXiv preprint.

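Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo A. Pérez (2025). Composing Reinforcement Learning Policies, with Formal Guarantees. Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025), IFAAMAS.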

Florent Delgrange (2024). Activating Formal Verification of Deep Reinforcement Learning Policies by Model Checking Bisimilar Latent Space Models. VUBPRESS, Brussels University Press.

Raphaël Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers (2024). The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models. The Twelfth International Conference on Learning Representations, ICLR 2024.

Florent Delgrange, Mathieu Reymond, Ann Nowé, Guillermo A. Pérez (2023). WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks. Proceedings of the Adaptive and Learning Agents Workshop (ALA 2023).

Florent Delgrange, Ann Nowé, Guillermo A. Pérez (2023). Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees. The Eleventh International Conference on Learning Representations, ICLR 2023.

Florent Delgrange, Ann Nowé, Guillermo A. Pérez (2022). Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 36 No. 6: AAAI-22 Technical Tracks 6, 6497-6505.

Mahmoud Elbarbari, Florent Delgrange, Ivo Vervlimmeren, Kyriakos Efthymiadis, Bram Vanderborght, Ann Nowé (2022). A Framework for Flexibly Guiding Learning Agents. Neural Computing and Applications, Special Issue on Adaptive and Learning Agents 2021.

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour (2020). Life is Random, Time is Not: Markov Decision Processes with Window Objectives. Logical Methods in Computer Science, December 14, 2020, Volume 16, Issue 4.

Florent Delgrange, Joost-Pieter Katoen, Tim Quatmann, Mickael Randour (2020). Simple Strategies in Multi-Objective MDPs. Tools and Algorithms for the Construction and Analysis of Systems - 26th International Conference, TACAS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Part I.

Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour (2019). Life Is Random, Time Is Not: Markov Decision Processes with Window Objectives. 30th International Conference on Concurrency Theory, CONCUR 2019, August 27-30, 2019, Amsterdam, the Netherlands.
