Published on April 04, 2024 by r/singularity

A Twitter thread explaining a new paper from DeepMind: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"

AI