Xavier Theimer‑Lienhard

Machine Learning Research Engineer — Meditron Core Team
Full‑stack medical LLMs: pre/post‑training, RLHF/DPO/GRPO, eval, scalable inference

Get in touch

About

Full‑stack AI engineer and researcher specializing in medical LLMs. 2+ years shipping end‑to‑end systems across training, evaluation, and deployment on 100+ GPU clusters. I co‑lead open‑weight Meditron releases and safety‑minded evaluation.

Recent work includes Meditron‑3 (8B/70B) with SFT, DPO, and GRPO; Apertus‑Meditron for medicine with ~+10% on USMLE‑style benchmarks; and deployable clinical evaluation harnesses with structured outputs and safety checks. Models validated via the Moove project by clinicians in 22+ organizations.

I care about efficiency, interpretability, and auditability. I also mentor MSc students and coordinate cross‑institution projects. Off‑duty: running, climbing, and cafés.

Portrait of Xavier Theimer‑Lienhard

Current Projects

Auto‑Moove

Direct optimization of medical models from Moove evaluations. Reward models and GRPO pipelines convert clinician Likert ratings and pairwise model comparisons into learning signals. Deliverables: reward‑model reports, reproducible training scripts, and benchmarked gains over Meditron baselines.

  • Reward modeling from preferences and scalar signals
  • GRPO training with evaluation gating
  • Reproducible Axolotl/DeepSpeed training recipes
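The first two bullets can be sketched roughly as follows. This is a minimal illustration, not the Auto‑Moove pipeline itself: `likert_to_pairs` and `bradley_terry_loss` are hypothetical names, and the assumption that Likert ratings are reduced to pairwise preferences trained with a Bradley–Terry loss is mine.

```python
import math

def likert_to_pairs(ratings):
    """Turn per-response Likert ratings (e.g. 1-5) into pairwise
    preferences: (preferred, rejected) whenever scores differ.
    `ratings` maps a response id to its clinician score."""
    items = list(ratings.items())
    pairs = []
    for rid_a, score_a in items:
        for rid_b, score_b in items:
            if score_a > score_b:
                pairs.append((rid_a, rid_b))
    return pairs

def bradley_terry_loss(reward_chosen, reward_rejected):
    """Standard pairwise reward-model objective:
    -log sigmoid(r_chosen - r_rejected)."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

In practice the reward scores would come from a learned model head and the loss would drive gradient updates; the sketch only shows how scalar clinician ratings become a preference-learning signal.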

LiGHT‑SeqDx‑Bench

Sequential diagnosis benchmark with multi‑step, clinically meaningful reasoning and metrics beyond accuracy. Includes baselines and leaderboard‑ready outputs.

  • Eval harness with error taxonomy
  • Baseline suite and leaderboard outputs
  • Analysis of diagnostic failure modes
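As one illustration of a metric beyond accuracy for multi‑step diagnosis, here is a sketch of ordered partial credit over reasoning steps. `stepwise_credit` is a hypothetical name and the matching rule is an assumption, not the benchmark's actual metric:

```python
def stepwise_credit(predicted_steps, reference_steps):
    """Fraction of reference reasoning steps that appear in the
    model's trace in the correct order (in-order subsequence match),
    so a partially correct diagnostic chain earns partial credit."""
    matched = 0
    remaining = iter(predicted_steps)
    for ref in reference_steps:
        for pred in remaining:
            if pred == ref:
                matched += 1
                break
    return matched / len(reference_steps)
```

Unlike final-answer accuracy, this rewards a trace that reaches the right intermediate findings even when the last step is wrong, and penalizes out-of-order reasoning.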

Experience

Research Engineer — LiGHT Lab (EPFL), Meditron Core Team

Apr 2025 – Present
  • Trained Meditron‑3 8B/70B with SFT, DPO, and GRPO; launched runs on up to 128 nodes (512 GPUs) on SwissAI.
  • Finetuned Apertus‑Meditron for medicine; ~+10% on medical benchmarks (USMLE).
  • Deployed evaluation harnesses for clinical tasks with structured outputs and safety/format checks.
  • Models validated by clinicians in 22+ organizations via Moove; feedback shaped reward models and deployment.
  • Supervised 4 MSc projects across distillation, SFT, GRPO, and evaluation, guiding each to a production‑quality report.
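The structured‑output and safety/format checks mentioned above can be sketched as a simple gate that rejects malformed model responses before scoring. The field names and `check_structured_output` are illustrative assumptions, not the harness's actual schema:

```python
import json

# Hypothetical schema: each clinical response must carry these fields.
REQUIRED_FIELDS = {"diagnosis": str, "confidence": float, "red_flags": list}

def check_structured_output(raw):
    """Format gate for an eval harness: parse the model's raw output
    as JSON and verify required fields and types.
    Returns (ok, parsed_object_or_error_message)."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc}"
    for field, ftype in REQUIRED_FIELDS.items():
        if field not in parsed:
            return False, f"missing field: {field}"
        if not isinstance(parsed[field], ftype):
            return False, f"wrong type for field: {field}"
    return True, parsed
```

Gating on format before scoring keeps downstream metrics from silently counting unparseable or schema-violating answers.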

Visiting Researcher — Yale University (BIDS)

Sep 2024 – Apr 2025
  • Led Meditron‑Reasoning‑8B; +30% general reasoning and +9% medical accuracy vs baselines.
  • Built a synthetic‑data pipeline (RAG + multimodal inputs) generating 10k+ validated clinical notes with leakage and quality filters; revised system prompts and triage logic based on ER physician feedback.
  • Coordinated EPFL ↔ Yale ops; mentored 2 MSc projects on synthetic data and clinician evaluation.

Meditron Core Team — LiGHT Lab / MLO Lab (EPFL)

Sep 2023 – Sep 2024
  • Created Meditree, a clinical‑reasoning inference framework that added +5% on medical‑exam benchmarks; paper published.
  • Built structured guideline generator for differential‑diagnosis datasets; improved formatting and retrieval.

ML & Software Engineer — Irbis Consulting SA

Sep 2023 – Sep 2024 (part‑time)
  • Built an Electron app to automate bidding document creation with iterative user feedback.
  • Developed a PyTorch captcha solver and a retrieval‑enabled web scraper.

Publications & Talks

News

Meta summit photo

Invited Speaker — Meta Open‑Source AI Summit

Dec 2024

Invited to discuss Llama‑based Meditron work and the Moove evaluation framework for reliable global health AI.

Yale short‑term scholar photo

Short‑Term Scholar & Postgraduate Associate — Yale School of Medicine

Oct 2024

Joined BIDS to continue Meditron research across training, evaluation, and publication.

Affiliations

Education

Contact

Open to collaborations and conversations.