Mayur Rathi

⭐ 34.1k GitHub stars

Agent-Safety

Agent-Safety is an AI skill in the code category. It provides guidelines for building safe, governed AI agent systems, covering tool access controls, content safety, multi-agent policy composition, and audit logging.

Guidelines for building safe, governed AI agent systems. Apply when writing code that uses agent frameworks, tool-calling LLMs, or multi-agent orchestration to ensure proper safety boundaries, policy

Last verified on: 2026-07-14

Quick Facts

Category: code

Works With: Claude, GitHub Copilot

Source: github/awesome-copilot

Stars: ⭐ 34.1k

Last Verified: 2026-07-14

Risk Level: Low

mkdir -p ./skills/agent-safety && curl -sfL https://raw.githubusercontent.com/github/awesome-copilot/main/skills/agent-safety/SKILL.md -o ./skills/agent-safety/SKILL.md

Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).

Skill Content

# Agent Safety & Governance

## Core Principles

- **Fail closed**: If a governance check errors or is ambiguous, deny the action rather than allowing it

- **Policy as configuration**: Define governance rules in YAML/JSON files, not hardcoded in application logic

- **Least privilege**: Agents should have the minimum tool access needed for their task

- **Append-only audit**: Never modify or delete audit trail entries — immutability enables compliance
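The fail-closed principle can be sketched in a few lines. This is a minimal illustration, not part of the skill itself: `is_allowed` and the three example checks are hypothetical names, and the rule assumed here is that only an explicit `True` from the governance check permits the action.

```python
from typing import Callable


def is_allowed(check: Callable[[str], object], action: str) -> bool:
    """Return True only when the governance check explicitly returns True."""
    try:
        result = check(action)
    except Exception:
        # A crashing governance check must not open the gate.
        return False
    # None, "maybe", 1 and similar values are treated as ambiguous: deny.
    return result is True


# Hypothetical checks used only to exercise the three outcomes.
def strict_check(action: str) -> bool:
    return action == "search"


def broken_check(action: str) -> bool:
    raise RuntimeError("policy store unreachable")


def vague_check(action: str):
    return None  # no decision was reached
```

With these definitions, `is_allowed(strict_check, "search")` permits the action, while a check that raises or returns no decision results in a denial.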

## Tool Access Controls

- Always define an explicit allowlist of tools an agent can use — never give unrestricted tool access

- Separate tool registration from tool authorization — the framework knows what tools exist, the policy controls which are allowed

- Use blocklists for known-dangerous operations (shell execution, file deletion, database DDL)

- Require human-in-the-loop approval for high-impact tools (send email, deploy, delete records)

- Enforce rate limits on tool calls per request to prevent infinite loops and resource exhaustion
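One way the controls above could fit together is a single authorization gate that is consulted before every tool call. The sketch below is an assumption-laden illustration: `ToolGate`, `authorize`, and the decision strings are hypothetical and do not come from any framework.

```python
from dataclasses import dataclass, field


@dataclass
class ToolGate:
    """Hypothetical per-request gate: allowlist, blocklist, approval, rate limit."""

    allowed: frozenset
    blocked: frozenset = frozenset()
    needs_approval: frozenset = frozenset()
    max_calls: int = 25
    calls: int = field(default=0, init=False)

    def authorize(self, tool: str, approved_by_human: bool = False) -> str:
        # Every attempt counts toward the limit, including denied ones.
        self.calls += 1
        if self.calls > self.max_calls:
            return "deny:rate_limit"
        if tool in self.blocked:
            return "deny:blocked"
        if tool not in self.allowed:
            return "deny:not_allowlisted"
        if tool in self.needs_approval and not approved_by_human:
            return "deny:approval_required"
        return "allow"
```

The gate holds only authorization data. Which tools exist is left to the framework, which keeps registration and authorization separate as recommended above.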

## Content Safety

- Scan all user inputs for threat signals before passing to the agent (data exfiltration, prompt injection, privilege escalation)

- Filter agent arguments for sensitive patterns: API keys, credentials, PII, SQL injection

- Use regex pattern lists that can be updated without code changes

- Check both the user's original prompt AND the agent's generated tool arguments
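A minimal sketch of pattern-based scanning, applied to both the prompt and the generated tool arguments. The pattern names and the `scan` and `check_call` helpers are hypothetical; the patterns are deliberately simple and would normally be loaded from a configuration file rather than defined inline.

```python
import json
import re

# Hypothetical pattern list; in practice this would live in a config file
# so it can be updated without code changes.
PATTERNS = {
    "credential": r"(?i)(api_key|password)\s*[:=]",
    "sql_injection": r"(?i)\b(drop\s+table|union\s+select)\b",
    "prompt_injection": r"(?i)ignore (all )?previous instructions",
}
COMPILED = {name: re.compile(rx) for name, rx in PATTERNS.items()}


def scan(text: str) -> list:
    """Return the sorted names of all patterns that match the text."""
    return sorted(name for name, rx in COMPILED.items() if rx.search(text))


def check_call(prompt: str, tool_args: dict) -> list:
    """Scan the user's prompt and the agent's tool arguments together."""
    # Serializing the arguments lets nested values be scanned as well.
    findings = set(scan(prompt)) | set(scan(json.dumps(tool_args)))
    return sorted(findings)
```

A non-empty result from `check_call` would be treated as a governance violation and the tool call denied.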

## Multi-Agent Safety

- Each agent in a multi-agent system should have its own governance policy

- When agents delegate to other agents, apply the most restrictive policy from either

- Track trust scores for agent delegates — degrade trust on failures, require ongoing good behavior

- Never allow an inner agent to have broader permissions than the outer agent that called it
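Trust tracking for delegates could look like the sketch below. The skill does not prescribe a scoring formula, so everything here is an assumption: the `TrustScore` class, the exponential half-life decay, the `+0.05` reward, and the halving on failure are illustrative choices only.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrustScore:
    """Hypothetical trust score that decays over time and drops on failure."""

    score: float = 0.5
    half_life_s: float = 3600.0  # assumed decay half-life in seconds
    updated_at: float = field(default_factory=time.monotonic)

    def current(self, now: Optional[float] = None) -> float:
        """Score after exponential decay since the last update."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        return self.score * 0.5 ** (elapsed / self.half_life_s)

    def record(self, success: bool, now: Optional[float] = None) -> float:
        """Apply decay, then reward a success or penalize a failure."""
        now = time.monotonic() if now is None else now
        base = self.current(now)
        base = min(1.0, base + 0.05) if success else base * 0.5
        self.score, self.updated_at = base, now
        return base
```

Because the score decays toward zero when nothing is recorded, a delegate has to keep succeeding to stay above whatever threshold the caller requires.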

## Audit & Observability

- Log every tool call with: timestamp, agent ID, tool name, allow/deny decision, policy name

- Log every governance violation with the matched rule and evidence

- Export audit trails in JSON Lines format for integration with log aggregation systems

- Include session boundaries (start/end) in audit logs for correlation
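The logging fields listed above map naturally onto one JSON object per line. The following is a sketch under assumptions: `AuditLog` and its field names (`ts`, `session_id`, `event`) are hypothetical, and the sink is any text stream, for example a file opened in append mode.

```python
import json
from datetime import datetime, timezone
from typing import TextIO


class AuditLog:
    """Hypothetical append-only JSON Lines audit log with session boundaries."""

    def __init__(self, sink: TextIO, session_id: str) -> None:
        self._sink = sink
        self._session_id = session_id
        self._write({"event": "session_start"})

    def _write(self, record: dict) -> None:
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "session_id": self._session_id,
            **record,
        }
        # Entries are only ever appended, never rewritten or deleted.
        self._sink.write(json.dumps(entry) + "\n")

    def tool_call(self, agent_id: str, tool: str, decision: str, policy: str) -> None:
        # Decisions and metadata only; prompt text is deliberately not logged.
        self._write({
            "event": "tool_call",
            "agent_id": agent_id,
            "tool": tool,
            "decision": decision,
            "policy": policy,
        })

    def close(self) -> None:
        self._write({"event": "session_end"})
```

For a file-backed log, passing `open(path, "a", encoding="utf-8")` as the sink keeps writes append-only at the application level. Immutability guarantees beyond that would need storage-level controls.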

## Code Patterns

When writing agent tool functions:

```python
# Good: Governed tool with explicit policy
@govern(policy)
async def search(query: str) -> str:
    ...

# Bad: Unprotected tool with no governance
async def search(query: str) -> str:
    ...
```
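The skill does not define `govern` or `policy`. As a rough sketch of what such a decorator might do, the example below checks the tool name against an allowlist before awaiting the wrapped coroutine. `Policy`, `PolicyViolation`, and this particular `govern` implementation are all assumptions made for illustration.

```python
import asyncio
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy object holding only a name and a tool allowlist."""

    name: str
    allowed_tools: frozenset


class PolicyViolation(Exception):
    """Raised when a tool call is denied by the policy."""


def govern(policy: Policy):
    """Assumed decorator: deny the call unless the tool is allowlisted."""

    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            if fn.__name__ not in policy.allowed_tools:
                raise PolicyViolation(f"{fn.__name__} denied by policy {policy.name}")
            return await fn(*args, **kwargs)

        return wrapper

    return decorator


policy = Policy(name="my-agent", allowed_tools=frozenset({"search"}))


@govern(policy)
async def search(query: str) -> str:
    return f"results for {query}"


@govern(policy)
async def delete_records(table: str) -> str:
    return f"deleted {table}"
```

Here `asyncio.run(search("cats"))` completes normally, while `asyncio.run(delete_records("users"))` raises `PolicyViolation` before the tool body runs. A fuller version would also inspect the arguments, not just the tool name.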

When defining policies:

```yaml
# Good: Explicit allowlist, content filters, rate limit
name: my-agent
allowed_tools: [search, summarize]
blocked_patterns: ["(?i)(api_key|password)\\s*[:=]"]
max_calls_per_request: 25

---
# Bad: No restrictions
name: my-agent
allowed_tools: ["*"]
```
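Loading a policy like the one above can itself fail closed. The sketch below uses JSON rather than YAML only so that it runs on the standard library; `AgentPolicy` and `load_policy` are hypothetical names, and rejecting the `"*"` wildcard at load time is an assumed design choice.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical in-memory form of the policy file shown above."""

    name: str
    allowed_tools: frozenset
    blocked_patterns: tuple
    max_calls_per_request: int


def load_policy(text: str) -> AgentPolicy:
    """Parse and validate a JSON policy; any problem raises instead of defaulting."""
    raw = json.loads(text)
    tools = raw["allowed_tools"]
    if "*" in tools:
        raise ValueError("allowed_tools must be an explicit list, not a wildcard")
    limit = int(raw["max_calls_per_request"])
    if limit <= 0:
        raise ValueError("max_calls_per_request must be positive")
    # Compiling here surfaces invalid regexes at load time, not at call time.
    patterns = tuple(re.compile(p) for p in raw.get("blocked_patterns", []))
    return AgentPolicy(raw["name"], frozenset(tools), patterns, limit)
```

A missing required key raises `KeyError` and malformed JSON raises `ValueError`, so a caller that treats any exception as "no policy, deny everything" stays consistent with the fail-closed principle.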

When composing multi-agent policies:

```python
# Good: Most-restrictive-wins composition
final_policy = compose_policies(org_policy, team_policy, agent_policy)

# Bad: Only using agent-level policy, ignoring org constraints
final_policy = agent_policy
```
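`compose_policies` is not defined in the skill. One plausible reading of "most restrictive wins" is sketched below: intersect the allowlists, union the blocked patterns, and take the smallest rate limit. The `Policy` fields and the combined name format are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy shape mirroring the YAML fields used earlier."""

    name: str
    allowed_tools: frozenset
    blocked_patterns: frozenset
    max_calls_per_request: int


def compose_policies(*policies: Policy) -> Policy:
    """Combine policies so the result is never broader than any input."""
    if not policies:
        raise ValueError("at least one policy is required")
    # A tool must be allowed by every layer to remain allowed.
    allowed = frozenset.intersection(*(p.allowed_tools for p in policies))
    # A pattern blocked by any layer stays blocked.
    blocked = frozenset().union(*(p.blocked_patterns for p in policies))
    # The tightest rate limit applies.
    limit = min(p.max_calls_per_request for p in policies)
    name = "+".join(p.name for p in policies)
    return Policy(name, allowed, blocked, limit)
```

The same function covers delegation: composing the caller's policy with the delegate's policy guarantees the inner agent cannot end up with broader permissions than the outer one.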

## Framework-Specific Notes

- **PydanticAI**: Use `@agent.tool` with a governance decorator wrapper. PydanticAI's upcoming Traits feature is designed for this pattern.

- **CrewAI**: Apply governance at the Crew level to cover all agents. Use `before_kickoff` callbacks for policy validation.

- **OpenAI Agents SDK**: Wrap `@function_tool` with governance. Use handoff guards for multi-agent trust.

- **LangChain/LangGraph**: Use `RunnableBinding` or tool wrappers for governance. Apply at the graph edge level for flow control.

- **AutoGen**: Wrap tool functions with governance before registering them through `ConversableAgent.register_for_execution`.

## Common Mistakes

- Relying only on output guardrails (post-generation) instead of pre-execution governance

- Hardcoding policy rules instead of loading from configuration

- Allowing agents to self-modify their own governance policies

- Checking only tool *names* against policy and forgetting to check tool *arguments*

- Not decaying trust scores over time — stale trust is dangerous

- Logging prompts in audit trails — log decisions and metadata, not user content

🎯 Best For

Claude users
GitHub Copilot users
Software engineers

💡 Use Cases

Code quality improvement
Best practice enforcement

📖 How to Use This Skill

1. Install the Skill

Run the install command shown above. The SKILL.md file downloads to ./skills/agent-safety/SKILL.md.

2. Load into Your AI Assistant

Open Claude or GitHub Copilot and reference the skill, or paste the SKILL.md content into the assistant's system prompt or custom instructions.

3. Apply Agent-Safety to Your Work

Open your project in the AI assistant and ask it to apply the skill. Start with a small module to verify the output quality.

4. Review and Refine

Review AI suggestions before committing. Run tests, check for regressions, and iterate on the skill output.

❓ Frequently Asked Questions

Is Agent-Safety compatible with Cursor and VS Code?

Yes — this skill works with any AI coding assistant including Cursor, VS Code with Copilot, and JetBrains IDEs.

Do I need specific dependencies for Agent-Safety?

Check the install command and Works With section. Most code skills only require the AI assistant and your codebase.

How do I install Agent-Safety?

Run the install command shown near the top of this page. The skill downloads to ./skills/agent-safety/SKILL.md, ready to use.

Can I customize this skill for my team?

Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.

⚠️ Common Mistakes to Avoid

Skipping validation

Always test AI-generated code changes, even for simple refactors.

Missing dependency updates

Check if the skill requires updated dependencies or new packages.
