Aws-Resource-Health-Diagnose
Aws-Resource-Health-Diagnose is a data AI skill that analyzes AWS resource health, diagnoses issues from CloudWatch logs and metrics, and creates a remediation plan for identified problems. It
helps developers solve real-world problems in the data domain by
automating repetitive tasks and streamlining workflows.
mkdir -p ./skills/aws-resource-health-diagnose && curl -sfL https://raw.githubusercontent.com/github/awesome-copilot/main/skills/aws-resource-health-diagnose/SKILL.md -o ./skills/aws-resource-health-diagnose/SKILL.md
Run in terminal / PowerShell. Requires curl (Unix) or PowerShell 5+ (Windows).
Skill Content
# AWS Resource Health & Issue Diagnosis
This workflow analyzes a specific AWS resource to assess its health status, diagnose potential issues using CloudWatch logs and metrics, and develop a comprehensive remediation plan for any problems discovered.
Prerequisites
- AWS CLI configured and authenticated
- Target AWS resource identified (name, type, and optionally region/account)
- CloudWatch logging and metrics enabled on the target resource
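The first prerequisite can be checked before starting the workflow. The following is a minimal sketch, not part of the skill itself: `check_prereqs` is a hypothetical helper name, and it only confirms that the AWS CLI is installed and that credentials resolve to an account. It does not verify that CloudWatch logging is enabled on the target resource.

```shell
# Sketch: verify the CLI prerequisites before running the workflow.
# check_prereqs is a hypothetical helper name, not part of the skill.
check_prereqs() {
  # 1. The AWS CLI must be installed and on PATH.
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws CLI not found" >&2
    return 1
  fi
  # 2. Credentials must resolve to a caller identity.
  local account
  if ! account=$(aws sts get-caller-identity --query Account --output text 2>/dev/null); then
    echo "aws CLI is not authenticated" >&2
    return 2
  fi
  echo "authenticated as account ${account}"
}
```

A typical call would be `check_prereqs || exit 1` at the top of a diagnostic script.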
Workflow Steps
Step 1: Get AWS Diagnostic Best Practices
Fetch `https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/` for monitoring and troubleshooting guidance to inform the diagnostic approach.
Step 2: Resource Discovery & Identification
Locate the target resource using the appropriate AWS CLI command for its type:
# EC2
aws ec2 describe-instances --filters "Name=tag:Name,Values=<name>"
# Lambda
aws lambda get-function --function-name <name>
# RDS
aws rds describe-db-instances --db-instance-identifier <name>
# ECS
aws ecs describe-services --cluster <cluster> --services <name>
# ALB
aws elbv2 describe-load-balancers --names <name>
# DynamoDB
aws dynamodb describe-table --table-name <name>
# SQS
aws sqs get-queue-attributes --queue-url <url> --attribute-names All
# API Gateway
aws apigatewayv2 get-apis
If multiple matches are found, prompt the user to specify region/account.
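The per-type commands in this step could be wrapped in a single dispatcher. This is a sketch under assumptions: `discover_resource` and its type keywords (`ec2`, `lambda`, and so on) are hypothetical names, the third argument is only used as the ECS cluster, and for SQS the second argument is the queue URL. API Gateway is left out because `get-apis` lists every API rather than taking a name.

```shell
# Sketch: pick the Step 2 discovery command by resource type.
# discover_resource and its type keywords are hypothetical names.
discover_resource() {
  local type="$1" name="$2" cluster="${3:-}"
  case "$type" in
    ec2)      aws ec2 describe-instances --filters "Name=tag:Name,Values=${name}" ;;
    lambda)   aws lambda get-function --function-name "$name" ;;
    rds)      aws rds describe-db-instances --db-instance-identifier "$name" ;;
    ecs)      aws ecs describe-services --cluster "$cluster" --services "$name" ;;
    alb)      aws elbv2 describe-load-balancers --names "$name" ;;
    dynamodb) aws dynamodb describe-table --table-name "$name" ;;
    sqs)      aws sqs get-queue-attributes --queue-url "$name" --attribute-names All ;;
    *)        echo "unsupported resource type: ${type}" >&2; return 1 ;;
  esac
}
```

For example, `discover_resource ecs web prod` would run the ECS lookup for service `web` in cluster `prod`.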
Step 3: Health Status Assessment
Run service-specific health checks:
# EC2
aws ec2 describe-instance-status --instance-ids <id>
# RDS
aws rds describe-db-instances --db-instance-identifier <name> \
--query 'DBInstances[0].DBInstanceStatus'
# Lambda - error rate over 24h
aws cloudwatch get-metric-statistics --namespace AWS/Lambda \
--metric-name Errors --dimensions Name=FunctionName,Value=<name> \
--start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 3600 --statistics Sum
# ECS
aws ecs describe-services --cluster <cluster> --services <name> \
--query 'services[0].[status,runningCount,desiredCount,pendingCount]'
Key health indicators by service type:
- **Lambda**: Error rate, throttle rate, duration P99, concurrent executions
- **RDS**: CPU utilization, FreeStorageSpace, DatabaseConnections, ReadLatency/WriteLatency
- **ECS**: Running vs desired task count, task stop reason
- **ALB**: TargetResponseTime, HTTPCode_ELB_5XX_Count, UnHealthyHostCount
- **SQS**: ApproximateNumberOfMessagesNotVisible, ApproximateAgeOfOldestMessage
- **DynamoDB**: ConsumedReadCapacityUnits, ThrottledRequests, SuccessfulRequestLatency
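As an illustration of the ECS indicator (running vs desired task count), the counts can be reduced to a coarse label. This is a sketch only: the function names and the four labels are assumptions chosen for the example, not thresholds defined by the skill or by AWS.

```shell
# Sketch: turn ECS running/desired counts into a coarse health label.
# The labels (healthy, degraded, down, scaled-to-zero) are illustrative assumptions.
ecs_health_label() {
  local running="$1" desired="$2" n
  # Reject empty or non-numeric input (e.g. "None" when the service is not found).
  for n in "$running" "$desired"; do
    case "$n" in
      ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
  done
  if [ "$desired" -eq 0 ]; then
    echo "scaled-to-zero"
  elif [ "$running" -eq 0 ]; then
    echo "down"
  elif [ "$running" -lt "$desired" ]; then
    echo "degraded"
  else
    echo "healthy"
  fi
}

# Fetch the two counts with the describe-services call from this step.
ecs_health() {
  local cluster="$1" service="$2" counts
  counts=$(aws ecs describe-services --cluster "$cluster" --services "$service" \
    --query 'services[0].[runningCount,desiredCount]' --output text) || return 1
  # Text output is whitespace-separated, so word splitting is intentional here.
  set -- $counts
  ecs_health_label "${1:-}" "${2:-}"
}
```

A `degraded` or `down` label is a cue to look at the task stop reason next.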
Step 4: Log & Metrics Analysis
Find log groups and run CloudWatch Logs Insights queries:
# Find log groups
aws logs describe-log-groups --log-group-name-prefix /aws/<service>/<name>
# Start a query (last 24h errors)
aws logs start-query \
--log-group-name /aws/lambda/<name> \
--start-time $(date -u -d '24 hours ago' +%s) \
--end-time $(date -u +%s) \
--query-string 'filter @message like /ERROR/ | stats count(*) as errorCount by bin(1h)'
# Get results
aws logs get-query-results --query-id <id>
# Lambda cold starts
aws logs start-query \
--log-group-name /aws/lambda/<name> \
--start-time $(date -u -d '24 hours ago' +%s) \
--end-time $(date -u +%s) \
--query-string 'filter @type = "REPORT" | filter @initDuration > 0 | stats count() as coldStarts by bin(1h)'
# RDS Performance Insights (if enabled); --identifier takes the instance's DbiResourceId (starts with db-)
aws pi get-resource-metrics \
--service-type RDS --identifier <DbiResourceId> \
--metric-queries '[{"Metric":"db.load.avg"}]' \
--start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period-in-seconds 3600
Identify: recurring error patterns, correlation with deployments (CloudTrail), performance trends, dependency failures.
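Because `aws logs start-query` is asynchronous, `get-query-results` may report a `Scheduled` or `Running` status when called right away. The sketch below wraps the start-and-poll sequence from this step. The helper names, the 2-second default interval (`POLL_INTERVAL`), and the 30-attempt cap are assumptions for illustration. It also computes the time window with epoch arithmetic, which avoids the GNU-only `date -d` option and so may be easier to run on macOS.

```shell
# Sketch: run a Logs Insights query and wait for it to finish.
# Helper names, POLL_INTERVAL, and the 30-attempt cap are assumptions.

# Epoch seconds N hours ago, without GNU-only `date -d`.
epoch_hours_ago() {
  echo $(( $(date -u +%s) - $1 * 3600 ))
}

run_insights_query() {
  local log_group="$1" query="$2" hours="${3:-24}"
  local query_id status attempt=0
  query_id=$(aws logs start-query \
    --log-group-name "$log_group" \
    --start-time "$(epoch_hours_ago "$hours")" \
    --end-time "$(date -u +%s)" \
    --query-string "$query" \
    --query queryId --output text) || return 1

  # Poll until the query leaves the Scheduled/Running states.
  while [ "$attempt" -lt 30 ]; do
    status=$(aws logs get-query-results --query-id "$query_id" \
      --query status --output text) || return 1
    case "$status" in
      Complete) aws logs get-query-results --query-id "$query_id"; return 0 ;;
      Scheduled|Running) sleep "${POLL_INTERVAL:-2}" ;;
      *) echo "query ended with status: ${status}" >&2; return 1 ;;
    esac
    attempt=$((attempt + 1))
  done
  echo "timed out waiting for query ${query_id}" >&2
  return 1
}
```

For example, `run_insights_query /aws/lambda/<name> 'filter @message like /ERROR/ | stats count(*) as errorCount by bin(1h)'` would print the full result JSON once the query completes.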
Step 5: Issue Classification & Root Cause Analysis
**Severity**:
- **Critical**: Service unavailable, data loss, security incidents
- **High**: Performance de
🎯 Best For
- Data analysts
- Business intelligence teams
- Claude users
- GitHub Copilot users
- Data professionals
💡 Use Cases
- Finding patterns in customer data
- Creating automated dashboards
- Data pipeline auditing
- Query optimization
📖 How to Use This Skill
1. Install the Skill
   Copy the install command from the Terminal tab and run it. The SKILL.md file downloads to your local skills directory.
2. Load into Your AI Assistant
   Open Claude or GitHub Copilot and reference the skill. Paste the SKILL.md content or use the system prompt tab.
3. Apply Aws-Resource-Health-Diagnose to Your Work
   Provide context for your task: paste source material, describe your audience, or share existing work to guide the AI.
4. Review and Refine
   Edit the AI output for accuracy, tone, and completeness. Add human insight where the AI lacks context.
❓ Frequently Asked Questions
Can this connect to my database directly?
Most data skills accept CSV or JSON input. Database connectors are listed in the Works With section.
How do I install Aws-Resource-Health-Diagnose?
Copy the install command from the Terminal tab and run it. The skill downloads to ./skills/aws-resource-health-diagnose/SKILL.md, ready to use.
Can I customize this skill for my team?
Absolutely. Edit the SKILL.md file to add team-specific instructions, examples, or workflows.
⚠️ Common Mistakes to Avoid
Not validating data quality
AI analysis inherits every data quality issue in its input. Profile and clean your data before analysis.