MR
Mayur Rathi
@mayurrathi
⭐ 5 GitHub stars

Prompt Caching

Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augm...

mkdir -p ./skills/prompt-caching && curl -sfL https://raw.githubusercontent.com/mayurrathi/awesome-agent-skills/main/skills/prompt-caching/SKILL.md -o ./skills/prompt-caching/SKILL.md

Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).

Skill Content

# Prompt Caching


You're a caching specialist who has reduced LLM costs by 90% through strategic caching.

You've implemented systems that cache at multiple levels: prompt prefixes, full responses,

and semantic similarity matches.


You understand that LLM caching is different from traditional caching—prompts have

prefixes that can be cached, responses vary with temperature, and semantic similarity

often matters more than exact match.


Your core principles:

1. Cache at the right level—prefix, response, or both

2. K


Capabilities


- prompt-cache

- response-cache

- kv-cache

- cag-patterns

- cache-invalidation


Patterns


Anthropic Prompt Caching


Use Claude's native prompt caching for repeated prefixes


Response Caching


Cache full LLM responses for identical or similar queries


Cache Augmented Generation (CAG)


Pre-cache documents in prompt instead of RAG retrieval


Anti-Patterns


❌ Caching with High Temperature


❌ No Cache Invalidation


❌ Caching Everything


⚠️ Sharp Edges


| Issue | Severity | Solution |

|-------|----------|----------|

| Cache miss causes latency spike with additional overhead | high | // Optimize for cache misses, not just hits |

| Cached responses become incorrect over time | high | // Implement proper cache invalidation |

| Prompt caching doesn't work due to prefix changes | medium | // Structure prompts for optimal caching |


Related Skills


Works well with: `context-window-management`, `rag-implementation`, `conversation-memory`


When to Use

This skill is applicable to execute the workflow or actions described in the overview.

🎯 Best For

  • Claude users
  • Software engineers
  • Development teams
  • Tech leads

💡 Use Cases

  • Code quality improvement
  • Best practice enforcement

📖 How to Use This Skill

  1. 1

    Install the Skill

    Copy the install command from the Terminal tab and run it. The SKILL.md file downloads to your local skills directory.

  2. 2

    Load into Your AI Assistant

    Open Claude and reference the skill. Paste the SKILL.md content or use the system prompt tab.

  3. 3

    Apply Prompt Caching to Your Work

    Open your project in the AI assistant and ask it to apply the skill. Start with a small module to verify the output quality.

  4. 4

    Review and Refine

    Review AI suggestions before committing. Run tests, check for regressions, and iterate on the skill output.

❓ Frequently Asked Questions

Is Prompt Caching compatible with Cursor and VS Code?

Yes — this skill works with any AI coding assistant including Cursor, VS Code with Copilot, and JetBrains IDEs.

Do I need specific dependencies for Prompt Caching?

Check the install command and Works With section. Most code skills only require the AI assistant and your codebase.

How do I install Prompt Caching?

Copy the install command from the Terminal tab and run it. The skill downloads to ./skills/prompt-caching/SKILL.md, ready to use.

Can I customize this skill for my team?

Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.

⚠️ Common Mistakes to Avoid

Skipping validation

Always test AI-generated code changes, even for simple refactors.

Missing dependency updates

Check if the skill requires updated dependencies or new packages.

🔗 Related Skills