Prompt caching / cache hit %
A discount providers apply to input tokens in a repeated prompt prefix. Common in RAG and chatbot workloads.
Most major providers (OpenAI, Anthropic, Google) bill input tokens at a reduced rate when they match a prompt prefix that was processed recently, whether in the same conversation or in a separate request that starts with the same text. RAG and chatbot workloads commonly reach 50–80% cache hit rates because the system prompt and context are repeated across requests. Set this to your realistic hit rate to get an honest cost estimate.
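As a rough illustration of how the hit rate feeds into a cost estimate, here is a minimal Python sketch. The function name, the price of $3.00 per million input tokens, and the cached-price ratio of 0.1 are hypothetical values chosen for the example, not any provider's actual pricing. It also ignores cache-write surcharges and storage fees that some providers charge.

```python
def effective_input_cost(input_tokens, price_per_million, cache_hit_rate, cached_price_ratio):
    """Estimate input cost in dollars when a share of tokens is billed at a cached rate.

    cache_hit_rate: fraction of input tokens served from cache (0.0 to 1.0).
    cached_price_ratio: cached-token price as a fraction of the normal price
                        (hypothetical; check your provider's price sheet).
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0.0 and 1.0")
    if cached_price_ratio < 0.0:
        raise ValueError("cached_price_ratio must not be negative")

    cached_tokens = input_tokens * cache_hit_rate
    uncached_tokens = input_tokens - cached_tokens

    # Uncached tokens pay full price; cached tokens pay the discounted price.
    full_price_part = uncached_tokens * price_per_million
    discounted_part = cached_tokens * price_per_million * cached_price_ratio
    return (full_price_part + discounted_part) / 1_000_000


# Hypothetical workload: 10M input tokens at $3.00 per million.
no_cache = effective_input_cost(10_000_000, 3.00, 0.0, 0.1)
with_cache = effective_input_cost(10_000_000, 3.00, 0.6, 0.1)
print(f"no caching:   ${no_cache:.2f}")
print(f"60% hit rate: ${with_cache:.2f}")
```

With these assumed numbers, a 60% hit rate cuts the estimated input cost by roughly half, which is why an overly optimistic hit rate can understate real spend.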