Announcement 1
New preprint out: Hidden Breakthroughs in Language Model Training. We propose POLCA, a method for decomposing changes in the loss along arbitrary bases of the low-rank training subspace, and show that POLCA can find breakthroughs in training that are obscured when all variation is aggregated into a single scalar loss term.
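The core idea admits a simple first-order sketch. Assuming the decomposition attributes the per-step loss change ΔL ≈ ∇L·Δθ to directions of a chosen orthonormal basis (the function name `polca_decompose`, the toy quadratic loss, and the random basis below are illustrative assumptions, not the paper's actual implementation):

```python
# Minimal sketch of a POLCA-style decomposition of the first-order loss
# change along a basis of the training subspace. All names and the toy
# setup are illustrative assumptions, not the paper's formulation.
import numpy as np

def polca_decompose(grad, delta_theta, basis):
    """Attribute the first-order loss change g . d_theta to each basis
    direction: dL_i = g . (b_i b_i^T d_theta), for an orthonormal basis
    whose directions are the columns of `basis`."""
    coeffs = basis.T @ delta_theta           # projection coefficients b_i . d_theta
    per_direction = (grad @ basis) * coeffs  # (g . b_i) scaled by each coefficient
    residual = grad @ (delta_theta - basis @ coeffs)  # change outside the subspace
    return per_direction, residual

# Toy example: quadratic loss L(theta) = 0.5 * theta^T A theta.
rng = np.random.default_rng(0)
d, k = 50, 3                                 # parameter dim, basis size
A = np.diag(rng.uniform(0.1, 2.0, size=d))
theta = rng.normal(size=d)

grad = A @ theta                             # gradient of the toy loss
delta = -0.1 * grad                          # one SGD step
basis, _ = np.linalg.qr(rng.normal(size=(d, k)))  # random orthonormal basis

per_dir, resid = polca_decompose(grad, delta, basis)
total = grad @ delta                         # total first-order loss change
print("per-direction contributions:", per_dir)
print("sum check:", per_dir.sum() + resid, "vs", total)
```

The per-direction terms plus the residual sum back to the total first-order change, so the decomposition loses nothing; a scalar loss curve only ever shows that sum, which is how a sharp drop along one direction can be masked by slow movement along the others.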