Xavier Theimer-Lienhard



Hello! I’m Xavier Theimer-Lienhard, an ML scientist and postgraduate researcher at Yale University. I work in EPFL’s LiGHT lab to advance large-scale medical language models.

For the past two years I’ve been part of Meditron, an academic initiative creating state-of-the-art medical LLMs. My contributions include:

  • Meditron reasoning, a family of medical models trained for reasoning.
  • Meditree, a novel inference technique that improves clinical reasoning.
  • Medipatient, an automated pipeline for generating realistic synthetic patient data.

News

  • [Apr 2025] → Completed my thesis “Enhancing Meditron capabilities with synthetic and reasoning datasets” at Yale University.
  • [Dec 2024] → “Llama-3-Meditron: An Open-Weight Suite of Medical LLMs” accepted to the AAAI 2025 Gen-AI for Health workshop.
  • [Dec 2024] → Invited to Meta’s inaugural Open Source AI Summit (Palo Alto).
  • [Sep 2024] → Began my postgraduate fellowship at Yale University.

Experience

  • [2024-2025] Postgraduate Associate at Yale University.
  • [2023-2025] Research at LiGHT, supervised by Prof. Annie Hartley and Prof. Martin Jaggi.
  • [2022-2025] M.Sc. degree in Data Science at EPFL.
  • [2023-2024] Part-time software developer at Cougar Group.
  • [2023] Software development intern at Cougar Group, supervised by Margot Clet.
  • [2018-2023] B.Sc. degree in Computer Science & Communication Systems at EPFL.


    Publications

    Enhancing Meditron capabilities with synthetic and reasoning datasets
    X. Theimer-Lienhard, M. Jaggi, M.-A. Hartley.
    Master's thesis
    Pdf
    We improved the Meditron models' step-by-step analytical processes by training on specialized reasoning datasets.


    Llama-3-Meditron: An Open-Weight Suite of Medical LLMs
    A. Sallinen, A. Solergibert, M. Zhang, G. Boyé, M. Dupont-Roc, X. Theimer-Lienhard, E. Boisson, B. Bernath, H. Hadhri, A. Tran, et al.
    AAAI 2025, Gen-AI for Health workshop
    OpenReview
    We introduce Llama-3-Meditron, a high-performing open-weight suite of medical large language models (LLMs) built on Llama-3.1 (8B and 70B).


    GPoeT: a language model trained for rhyme generation on synthetic data
    A. Popescu-Belis, A. R. Atrio, B. Bernath, E. Boisson, T. Ferrari, X. Theimer-Lienhard, G. Vernikos.
    Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for CH, SSH and Lit.
    ACL Anthology
    We propose a novel solution for learning to rhyme, based on synthetic data generated with a rule-based rhyming algorithm.