Calculate KV cache memory requirements for transformer models. Supports MHA, GQA, and MLA attention mechanisms with fp16/bf16, fp8, and fp4 data types.
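As a rough sketch of what such a calculation involves (the function name and parameters here are illustrative, not the tool's actual API): for MHA and GQA the cache stores separate K and V tensors per layer per KV head, while MLA caches a single compressed latent vector per token per layer; the `latent_dim` parameter below is an assumption standing in for that compressed width.

```python
def kv_cache_bytes(num_layers, seq_len, batch_size, dtype_bytes,
                   attn="mha", num_kv_heads=None, head_dim=None,
                   latent_dim=None):
    """Approximate KV cache size in bytes (illustrative sketch).

    mha/gqa: 2 tensors (K and V) x num_kv_heads x head_dim per token
             per layer; GQA simply uses fewer KV heads than query heads.
    mla:     one compressed latent of width latent_dim per token per
             layer, covering both K and V (assumed simplification).
    dtype_bytes: 2 for fp16/bf16, 1 for fp8, 0.5 for fp4.
    """
    if attn in ("mha", "gqa"):
        per_token_per_layer = 2 * num_kv_heads * head_dim
    elif attn == "mla":
        per_token_per_layer = latent_dim
    else:
        raise ValueError(f"unknown attention type: {attn}")
    return int(num_layers * seq_len * batch_size
               * per_token_per_layer * dtype_bytes)

# Example: a GQA config with 32 layers, 8 KV heads of dim 128,
# an 8192-token context, batch 1, fp16 (2 bytes per element).
size = kv_cache_bytes(num_layers=32, seq_len=8192, batch_size=1,
                      dtype_bytes=2, attn="gqa",
                      num_kv_heads=8, head_dim=128)
print(size / 2**30, "GiB")  # → 1.0 GiB
```

Halving `dtype_bytes` (fp16 → fp8) or quartering it (fp16 → fp4) scales the result linearly, which is why quantized caches matter at long context lengths.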
💡 Tip: Type a model name such as 'llama', 'qwen', or 'mistral', then press Enter to search
Model Configuration