Google Releases Multi-Token Prediction Drafters for Gemma 4

date: 2026-05-05

draft: false

---

Google has released specialized Multi-Token Prediction (MTP) drafters that accelerate Gemma 4 inference through speculative decoding. In this approach, a lightweight drafter proposes several future tokens at once, and the main model checks them in a single parallel pass, keeping every proposal it agrees with. Because the main model still has the final say on each token, this delivers up to a 3x speedup on consumer hardware without compromising output quality.
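To make the draft-then-verify loop concrete, here is a minimal toy sketch of speculative decoding with greedy verification. It is not Google's implementation and does not use any Gemma API: `speculative_decode`, `greedy_decode`, `target_next`, and `draft_next` are hypothetical names, and the two "models" are simple arithmetic functions over integer tokens. Production systems typically verify sampled tokens with a rejection-sampling rule rather than the exact-match check used here.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def target_next(ctx: List[Token]) -> Token:
    """Toy stand-in for the large model: deterministic next token."""
    return (ctx[-1] * 3 + 1) % 17


def draft_next(ctx: List[Token]) -> Token:
    """Toy stand-in for the drafter: agrees with the target most of the time."""
    guess = target_next(ctx)
    # Deliberately wrong whenever the last token is a multiple of 5.
    return guess if ctx[-1] % 5 else (guess + 1) % 17


def greedy_decode(model: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[Token],
    n: int,
    k: int = 4,
) -> Tuple[List[Token], int]:
    """Generate n tokens; return them plus the number of verification rounds."""
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < n:
        # 1. The drafter proposes k tokens, one after another (cheap).
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks every proposed position. A real system scores
        #    all k positions in one batched forward pass; this toy version
        #    calls the target once per position but counts a single round.
        rounds += 1
        accepted: List[Token] = []
        for i in range(k):
            expected = target(tokens + proposal[:i])
            accepted.append(expected)
            if proposal[i] != expected:
                break  # first mismatch: keep the target's token, drop the rest
        else:
            # Every draft token matched, so the target's own next token is free.
            accepted.append(target(tokens + proposal))
        tokens.extend(accepted)
    return tokens[: len(prompt) + n], rounds


if __name__ == "__main__":
    out, rounds = speculative_decode(target_next, draft_next, [1], 12)
    print(out == greedy_decode(target_next, [1], 12), rounds)
```

In this sketch the output always matches plain greedy decoding with the target model, because every emitted token is one the target itself would have chosen. Any speedup comes from needing fewer verification rounds than generated tokens, so it depends on how often the drafter agrees with the target.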