Xavier Theimer-Lienhard



Research Associate at EPFL. Previously Research Associate at Yale University. I’ve led research initiatives on LLM training and data generation. I work in EPFL’s LiGHT lab, whose goal is to save more lives with technology. I believe that AI will bring more knowledge to the world.

For the past two years I’ve been part of Meditron, an academic initiative creating SOTA medical LLMs. My contributions include:

  • Meditron reasoning, a family of medical models trained for reasoning.
  • Meditree, a novel inference technique that improves clinical reasoning.
  • A pipeline for generating safe and representative synthetic patient data using DSPy and RAG.

News

  • [Apr 2025] → Completed my thesis at Yale University.
  • [Dec 2024] → Meditron 3 paper accepted to the AAAI 2025 Gen-AI for Health workshop.
  • [Dec 2024] → Invited to Meta’s inaugural Open Source AI Summit (Palo Alto).
  • [Sep 2024] → Began my research fellowship at Yale University.

Experience

  • [2025-present] Research Associate at EPFL. Led the Meditron reasoning team.
  • [2024-2025] Research Associate at Yale University. Led the synthetic data team.
  • [2023-2025] Research Assistant. Developed Meditree.
  • [2022-2025] MSc in Computer Science & Data Science at EPFL.
  • [2023-2024] Part-time Software Engineer at Cougar Group.
  • [Summer 2023] Software Engineer intern at Cougar Group.
  • [2018-2023] BSc in Computer Science at EPFL.


    Publications / Projects

    Enhancing Meditron capabilities with synthetic and reasoning datasets
    X. Theimer-Lienhard, M. Jaggi, M.-A. Hartley.
    Master’s thesis
    PDF
    Details findings on finetuning open-source LLMs on medical reasoning datasets, achieving improvements of 13-30% on reasoning benchmarks and 5-9% on medical benchmarks. Compares distillation with and without reasoning traces, finding that results are better when finetuning without reasoning traces.


    Llama-3-Meditron: An Open-Weight Suite of Medical LLMs
    A. Sallinen, A. Solergibert, M. Zhang, G. Boyé, M. Dupont-Roc, X. Theimer-Lienhard, E. Boisson, B. Bernath, H. Hadhri, A. Tran, et al.
    AAAI 2025, Gen-AI for Health workshop
    OpenReview
    We finetune Llama-3.1 on the Meditron mixture using SFT and ORPO. Meditron outperforms its base model by 3% on medical benchmarks, and our Meditree inference method reaches GPT-4-Base performance of 80% accuracy on medical benchmarks, a 5% gain over Meditron alone.


    GPoeT: A Language Model Trained for Rhyme Generation on Synthetic Data
    A. Popescu-Belis, A. R. Atrio, B. Bernath, E. Boisson, T. Ferrari, X. Theimer-Lienhard, G. Vernikos.
    Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
    ACL Anthology
    We finetune GPT-2 on 142 MB of natural poems and 6 MB of rhyming poems, and find that the model produces rhymes 60% of the time versus an 11% baseline.