Machine Learning Research Engineer — Meditron Core Team
Full‑stack medical LLMs: pre‑ and post‑training, RLHF/DPO/GRPO, evaluation, and scalable inference
Full‑stack AI engineer and researcher specializing in medical LLMs. 2+ years shipping end‑to‑end systems across training, evaluation, and deployment on 100+ GPU clusters. I co‑lead open‑weight Meditron releases and safety‑minded evaluation.
Recent work includes Meditron‑3 (8B/70B), trained with SFT, DPO, and GRPO; Apertus‑Meditron, adapted to medicine with roughly +10% on USMLE‑style benchmarks; and deployable clinical evaluation harnesses with structured outputs and safety checks. Models are validated by clinicians across 22+ organizations through the MOOVE project.
I care about efficiency, interpretability, and auditability. I also mentor MSc students and coordinate cross‑institution projects. Off‑duty: running, climbing, and cafés.
Direct medical‑feedback optimization from MOOVE evaluations: reward models and GRPO pipelines that convert clinician Likert ratings and pairwise model comparisons into learning signals. Deliverables: reward‑model reports, reproducible training scripts, and benchmarked gains over Meditron baselines.
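A reward model trained on such feedback typically reduces Likert ratings to pairwise preferences and fits a Bradley‑Terry objective. A minimal sketch of that conversion and loss, in pure Python; the function names and the tie‑skipping rule are illustrative assumptions, not the project's actual pipeline:

```python
import math
from itertools import combinations

def likert_to_pairs(ratings):
    """Convert per-response Likert ratings into (preferred, dispreferred)
    index pairs. Ties carry no ranking signal, so they are skipped.
    `ratings` maps response id -> clinician score (e.g. 1-5)."""
    pairs = []
    for (a, ra), (b, rb) in combinations(ratings.items(), 2):
        if ra > rb:
            pairs.append((a, b))
        elif rb > ra:
            pairs.append((b, a))
    return pairs

def bradley_terry_loss(score_chosen, score_rejected):
    """Negative log-likelihood that the reward model ranks the chosen
    response above the rejected one under the Bradley-Terry model:
    loss = -log(sigmoid(score_chosen - score_rejected))."""
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

In practice the scores come from a learned reward head over model outputs; a policy-optimization step such as GRPO can then use these rewards to weight grouped rollouts.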
A sequential‑diagnosis benchmark requiring multi‑step, clinically meaningful reasoning, with metrics that go beyond final‑answer accuracy. Includes baselines and leaderboard‑ready outputs.
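One example of a metric beyond accuracy for sequential diagnosis is the step at which a model first commits to the correct diagnosis and stays with it, rewarding early, stable conclusions. A sketch under that assumption; the function name and stability criterion are illustrative, not the benchmark's actual metric:

```python
def steps_to_diagnosis(predictions, gold):
    """Return the 1-indexed step at which the model first states the
    correct diagnosis and never changes it afterwards, or None if it
    never settles on the correct answer."""
    for i in range(len(predictions)):
        if all(p == gold for p in predictions[i:]):
            return i + 1
    return None
```

Averaging this over cases (with a penalty for `None`) yields a single leaderboard number that distinguishes models which reach the right answer quickly from those that only get there after many redundant steps.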
AAAI 2025 Workshop (GenAI4Health)
Open‑weight medical models and the Meditree clinical reasoning method.
Read paper →
ACL SIGHUM 2023
Fine‑tuning GPT‑2 on synthetic poems for consistent rhyming.
Read paper →
Master Thesis, 2025
Reports 13–30% gains on reasoning benchmarks and 5–9% on medical benchmarks; compares distillation setups.
Read thesis →
Dec 2024
Invited to discuss Llama‑based Meditron work and the MOOVE evaluation framework for reliable global health AI.
Oct 2024
Joined BIDS to continue Meditron research across training, evaluation, and publication.
Open to collaborations and conversations.