Machine Learning
Frontier advances spanning LLMs, diffusion models, graph learning, and efficient architectures.
DeepSeek V3 / R1: Mixture-of-Experts and Multi-head Latent Attention
DeepSeek · 2025–2026
Demonstrated that a 671B-parameter MoE model with only 37B parameters activated per token can match dense models. Multi-head Latent Attention (MLA) compresses keys and values into a low-rank latent vector, sharply reducing the KV cache and enabling cost-efficient serving at scale. Sparked the "DeepSeek moment": proof that open-weight models can compete with closed-source frontier systems.
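To make the two ideas above concrete, here is a minimal sketch, not DeepSeek's implementation. It assumes a generic top-k softmax router (DeepSeek's real router adds details not covered here) and compares per-token cache sizes for standard multi-head attention against a latent cache. The toy experts, router scores, and the dimensions 128, 128, 512, and 64 are illustrative assumptions, not figures from the text above.

```python
import math

def top_k_route(scores, k):
    """Pick the k highest-scoring experts; softmax over the selected scores only."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, scores, k):
    """Evaluate only the routed experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(scores, k))

# Share of parameters active per token, from the 37B / 671B figures above.
active_fraction = 37e9 / 671e9  # roughly 5.5%

# Hypothetical toy experts: each multiplies its input by a constant.
experts = [lambda x, c=c: c * x for c in (1.0, 2.0, 3.0, 4.0)]
router_scores = [0.1, 2.0, 0.3, 2.0]  # made-up router logits for one token
y = moe_forward(1.0, experts, router_scores, k=2)  # experts 1 and 3, equal weight

def mha_cache_per_token(n_heads, head_dim):
    """Standard attention caches a key and a value vector for every head."""
    return 2 * n_heads * head_dim

def latent_cache_per_token(latent_dim, rope_dim):
    """A latent cache stores one shared compressed vector plus a small positional key."""
    return latent_dim + rope_dim

# Assumed dimensions for illustration only (per token, per layer).
cache_ratio = mha_cache_per_token(128, 128) / latent_cache_per_token(512, 64)
```

Under these assumed dimensions the latent cache is well over an order of magnitude smaller per token, which is the kind of saving the entry refers to; the exact ratio depends on the real model configuration.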
DiffusionGemma: Breaking Free of Left-to-Right Processing
Google DeepMind · June 2026
A 26B MoE model that generates 256-token blocks in parallel via iterative denoising, giving up to 4x faster inference (1,000 tokens/sec on an H100). Bidirectional attention excels at infilling, reasoning, and non-linear generation. Released under Apache 2.0. Signals a possible paradigm shift: diffusion may rival autoregressive decoding for text.
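The sketch below illustrates block-parallel iterative denoising in its simplest form. It is not DiffusionGemma's sampler, which the entry does not describe. It assumes a generic confidence-based unmasking schedule: every step scores all positions at once and commits the most confident ones. `toy_denoiser` is a hypothetical stand-in for a bidirectional model, and its scoring rule is made up.

```python
import math

MASK = None  # placeholder for a position that has not been decoded yet

def toy_denoiser(block):
    """Hypothetical model stand-in: propose (token, score) for every position.

    The score rises when a neighbour on either side is already decoded,
    mimicking a model that uses context from both directions.
    """
    n = len(block)
    proposals = []
    for i in range(n):
        left = i > 0 and block[i - 1] is not MASK
        right = i < n - 1 and block[i + 1] is not MASK
        score = ((i * 5) % 8) / 16 + 0.5 * left + 0.5 * right
        proposals.append((f"tok{i}", score))
    return proposals

def decode_block(block_len, steps, denoiser):
    """Fill a fully masked block, committing the top-scoring positions each step."""
    block = [MASK] * block_len
    per_step = math.ceil(block_len / steps)
    calls = 0
    order = []  # which positions were committed at each step
    while MASK in block:
        proposals = denoiser(block)  # one parallel pass over the whole block
        calls += 1
        masked = [i for i, tok in enumerate(block) if tok is MASK]
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:per_step]
        for i in best:
            block[i] = proposals[i][0]
        order.append(sorted(best))
    return block, calls, order

block, calls, order = decode_block(block_len=8, steps=4, denoiser=toy_denoiser)
```

In this toy run, 8 tokens are produced in 4 model calls instead of the 8 an autoregressive decoder would need, and the first positions committed are in the middle of the block rather than at the left edge.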
SiST-GNN: Simultaneous Spatial-Temporal Message Passing
arXiv:2605.25548 · May 2026
Fuses topology and temporal evolution into a single message-passing operation for dynamic graphs. Sets a new SOTA on link prediction, outperforming prior methods by 109–277%. Demonstrates that simultaneous spatial-temporal reasoning beats sequentially chaining a spatial step and a temporal step.
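As a rough illustration of what a single fused spatial-temporal update could look like, the sketch below aggregates neighbour features over timestamped edges with a recency weight in one pass. This is not the SiST-GNN operator, which the entry does not specify. The exponential decay weighting, the function name, and the toy graph are all assumptions.

```python
import math

def st_message_pass(h, edges, t_now, decay=1.0):
    """One fused update over a dynamic graph.

    Each timestamped edge (u, v, t) sends h[u] to v, weighted by
    exp(-decay * (t_now - t)), so graph structure and recency are
    combined in the same aggregation rather than in two chained stages.
    Nodes with no incoming edges keep their current feature.
    """
    agg = {v: 0.0 for v in h}
    norm = {v: 0.0 for v in h}
    for u, v, t in edges:
        w = math.exp(-decay * (t_now - t))
        agg[v] += w * h[u]
        norm[v] += w
    return {v: (agg[v] / norm[v] if norm[v] > 0 else h[v]) for v in h}

# Hypothetical toy graph with scalar node features.
h = {"a": 1.0, "b": 3.0, "c": 0.0}
edges = [("a", "c", 0.0), ("b", "c", 1.0)]  # (source, target, timestamp)
out = st_message_pass(h, edges, t_now=1.0, decay=math.log(2))
```

With a decay of ln 2, the older edge from "a" counts half as much as the fresh edge from "b", so the updated feature of "c" lands closer to 3.0 than a plain neighbour average would.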