Sara Kangaslahti


I am a third-year PhD candidate in the ML foundations group at Harvard University, advised by David Alvarez-Melis. I am grateful to be supported by an NSF Graduate Research Fellowship. My research focuses on principled, data-centric approaches for adapting and understanding LLMs. Recently, I have been working on ways to compress and connect models across scales and tasks.

Previously, I completed my Bachelor’s in Computer Science at Caltech, where I worked with Anima Anandkumar and R. Michael Alvarez on scalable tensor-based topic modeling methods.

My email is sarakangaslahti (at) g (dot) harvard (dot) edu. Please feel free to reach out to discuss research!

news

Oct 06, 2025 Released the preprint for my work 🪃 Boomerang Distillation Enables Zero-Shot Model Size Interpolation 🪃. We uncover boomerang distillation, a surprising phenomenon that lets us create a full family of models at fine-grained sizes, with no additional training, by interpolating between a pretrained model and its distilled counterpart.
Sep 01, 2025 My paper Continuous Language Model Interpolation Yields Dynamic and Controllable Text Generation was published in TMLR!
Jun 23, 2025 New preprint out: Hidden Breakthroughs in Language Model Training. We propose POLCA, a method for decomposing changes in the loss along arbitrary bases of the low-rank training subspace, and show that it can uncover breakthroughs in training that are obscured when all variation is aggregated into a single scalar loss term.