Incident Response Claude Skill Template

A Claude skill that guides engineers through structured incident response — from detection through root cause analysis, mitigation, and post-mortem generation.

Who this is for

On-call engineers, SRE teams, and platform engineering teams

What you can do with it

  • Walk through P0/P1 incident response steps
  • Generate a timeline from logs and events
  • Draft stakeholder communications
  • Produce a post-mortem with action items

SKILL.md Template

Save this file as .claude/skills/incident-response/SKILL.md in your project. Claude Code discovers skills in that directory automatically.

---
name: incident-response
description: Guides on-call engineers through structured incident response — detection, triage, mitigation, and post-mortem generation.
context: fork
allowed-tools:
  - Read
  - Bash
  - WebSearch
---

## Instructions

You are an experienced SRE co-pilot during an active incident.

### Trigger
Activate when the user says "incident", "outage", "pages are failing", or "P0/P1".

### Phase 1 — Triage (first 5 minutes)
Ask immediately:
1. What is failing? (service, endpoint, feature)
2. When did it start?
3. What changed in the last 2 hours? (deploy, config, migration)
4. Who is affected? (% of users, specific segment)

### Phase 2 — Investigate
Guide through:
- Check request rate, error rate, and latency (RED metrics), plus resource saturation
- Correlate with recent deploys or DB migrations
- Identify blast radius
- Suggest rollback vs. fix-forward decision

### Phase 3 — Mitigate
Recommend mitigation steps based on the leading root-cause hypothesis. Ask for explicit confirmation before suggesting any destructive action.

### Phase 4 — Post-Mortem
Generate structured post-mortem:
- Incident summary
- Timeline (user-provided events formatted chronologically)
- Root cause
- Contributing factors
- Impact
- Mitigation steps taken
- Action items with owners and due dates

### Constraints
- Never suggest deleting data without explicit confirmation
- Always ask for human sign-off before carrying out a rollback
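
The timeline item in Phase 4 can be pictured with a small script. The sketch below is an illustration only and is not part of the skill file: it assumes events arrive as (ISO 8601 timestamp, description) pairs already in UTC, and the function name `format_timeline` and the sample events are hypothetical.

```python
from datetime import datetime


def format_timeline(events):
    """Render (iso_timestamp, description) pairs as bullet lines, oldest first.

    Assumption: timestamps are ISO 8601 strings in UTC with no offset suffix.
    """
    parsed = [(datetime.fromisoformat(ts), text) for ts, text in events]
    parsed.sort(key=lambda pair: pair[0])  # chronological order
    return "\n".join(
        f"- {when:%Y-%m-%d %H:%M} UTC: {text}" for when, text in parsed
    )


# Hypothetical events, entered out of order as they might be during an incident.
sample_events = [
    ("2024-05-01T14:40:00", "Rollback started"),
    ("2024-05-01T14:32:00", "Alert fired for checkout errors"),
    ("2024-05-01T14:55:00", "Error rate back to baseline"),
]

print(format_timeline(sample_events))
```

Sorting on the parsed timestamps rather than the raw strings keeps the order correct even when events are supplied with mixed precision (for example, with and without seconds).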

How to deploy this skill

  1. Copy the SKILL.md above

    Use it as-is or customize the instructions for your stack.

  2. Place it in your project

    Save it as .claude/skills/incident-response/SKILL.md. Claude Code loads it automatically.

  3. Or generate a custom version

    Open SkillsWorkbench, describe your use case, and get a skill tailored to your exact stack and compliance requirements.

  4. Run eval sets before shipping

    Use the workbench to stress-test your skill against adversarial inputs before deploying to production.
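
Step 2 can also be scripted. The following is a minimal sketch, assuming the directory-per-skill layout that Claude Code uses for project skills (.claude/skills/<skill-name>/SKILL.md). The helper name `install_skill` is hypothetical, and the demo writes a shortened stand-in for the template into a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path


def install_skill(project_root, skill_text, name="incident-response"):
    """Write skill_text to <project_root>/.claude/skills/<name>/SKILL.md."""
    skill_dir = Path(project_root) / ".claude" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)  # create missing parent folders
    target = skill_dir / "SKILL.md"
    target.write_text(skill_text, encoding="utf-8")
    return target


# Demo against a throwaway directory; point project_root at your repo instead.
with tempfile.TemporaryDirectory() as project_root:
    placeholder = "---\nname: incident-response\n---\n"  # stand-in for the full template
    written = install_skill(project_root, placeholder)
    print(written.relative_to(project_root))
```

Keeping each skill in its own folder leaves room for supporting files, such as runbooks or scripts, next to SKILL.md.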

Build a skill tailored to your use case

This template is a starting point. SkillsWorkbench generates a custom version with your stack, compliance requirements, and eval test cases built in.