Tags: #quantization

May 2, 2026

KV Cache Prefill, Decode, Eviction, and Quantization: From Cache Appends to Error Shapes

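Starting from historical KV, query prefill, step-by-step decode, KV eviction, and KV quantization, this post treats the cache as visible memory appended along the sequence, and distinguishes set approximation (eviction changes which entries are visible) from numerical approximation (quantization changes the stored values).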

22 min zh-CN
- llm
- transformer
- attention
- kv-cache
- quantization