
Cut costs and reduce latency when using Gemini with Vertex AI context caching

October 15, 2025 · 6 min read · Google

Vertex AI context caching reduces the cost of repeated content by letting customers save and reuse precomputed input tokens across requests. Cached tokens are billed at only 10% of the standard input token price for all supported Gemini 2.5 and later models.
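As a concrete illustration, here is a minimal sketch of creating and reusing an explicit cache with the Google Gen AI SDK on Vertex AI. The project ID, location, model choice, and document text are placeholder assumptions, not values from the article.

```python
# Minimal sketch: explicit context caching with the Google Gen AI SDK on
# Vertex AI. Project ID, location, and document text are placeholders.
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Cache a large prefix (e.g., a long report) once. Its input tokens are
# precomputed and billed at the discounted cached-token rate on later requests.
cache = client.caches.create(
    model="gemini-2.5-flash",
    config=types.CreateCachedContentConfig(
        system_instruction="You answer questions about the attached report.",
        contents=["<long report text that repeats across many requests>"],
        ttl="3600s",  # keep the cache alive for one hour
    ),
)

# Each follow-up request references the cache instead of resending the
# report, cutting both input-token cost and time-to-first-token.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the report's key findings.",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```

An explicit cache like this pays off when the same long prefix is reused across many requests; for one-off prompts, the cost of creating and storing the cache can outweigh the discount.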

