Research
Multi-agent RL with an eye on production coordination
PhD at the University of Groningen. I study Dec-POMDPs and strategic world models, and connect that work to how production agents coordinate.
Research focus
- Strategic world models for multi-agent deep RL (PhD thesis, University of Groningen)
- Dec-POMDPs and decentralized control under partial observability
- SeqPPO: roughly 3× better sample efficiency than MAPPO, HATRPO, and HAPPO in our benchmarks
- Centralized training, decentralized execution (CTDE) with university collaborators
- Incentive alignment and robustness applied to production multi-agent design
- Sampled Policy Gradient extensions (MSc thesis): off-policy actor-critic with distributional RL
Work in preparation
ScrollSearch: Decentralised Control Made Learnable
In preparation
Event-Triggered Interference Pricing: Pareto-Efficient Communication for MARL Distributed Spectrum Access
In preparation
Published work
Open source
- Sampled Policy Gradient and variants (MSc thesis implementations)
- Realty hybrid RAG platform: natural-language queries over structured and vector data; cut broker turnaround from two days to minutes