Boyang Yan

KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows

Jun 19, 2026 · 1 min read

hierarchical radix cache

workflow-aware eviction policy

overlapped KV prefetching mechanism

KV cache characteristics

KV Cache Size with Context Length
Prefill Latency
KV Cache Transmission (CPU to GPU)
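
The first and third items above can be made concrete with a back-of-the-envelope estimate. The sketch below is not taken from the KVFlow paper or repository. It uses the standard sizing rule for a decoder-only transformer (keys plus values, per layer, per KV head, per token) and a simple size-over-bandwidth estimate for the CPU-to-GPU copy. The function names, the model shape (32 layers, 8 KV heads, head dimension 128, fp16), and the 25 GB/s effective host-to-device bandwidth are all illustrative assumptions.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_elem=2):
    """Estimate KV cache size in bytes for one sequence.

    The leading factor of 2 counts both the key and the value tensor.
    bytes_per_elem=2 assumes fp16/bf16 storage (an assumption, not a KVFlow setting).
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem * num_tokens


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Rough lower bound on copy time: size divided by sustained bandwidth."""
    return num_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(num_layers=32, num_kv_heads=8, head_dim=128)
    assumed_bandwidth = 25e9  # bytes/s, an assumed effective CPU-to-GPU rate

    for context_len in (1024, 8192, 32768):
        size = kv_cache_bytes(context_len, **shape)
        seconds = transfer_seconds(size, assumed_bandwidth)
        print(f"{context_len:>6} tokens: {size / 2**20:8.1f} MiB, "
              f"~{seconds * 1e3:.1f} ms to copy")
```

Under these assumptions the cache grows linearly with context length (128 KiB per token for this shape), so the copy time grows linearly as well. Prefill latency is deliberately left out of the sketch because it depends on the model and hardware and is better measured than estimated.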

Reference List

https://arxiv.org/pdf/2507.07400
https://github.com/PanZaifeng/KVFlow
