Incident Response
TSC mapping: CC7.3 (Incident evaluation), CC7.4 (Incident response), CC7.5 (Recovery from incidents)
SOC 2 auditors do not expect zero incidents. They expect a documented, tested process for detecting, classifying, containing, and recovering from incidents — and evidence that the process was followed.
1. Detection Pipeline
The detection chain connects GuardDuty, Security Hub, and Config findings into an automated alert pipeline.
GuardDuty Finding     ┐
Security Hub Finding  ├→ EventBridge → SNS → PagerDuty / Slack / Email
Config Rule Violation ┘
Route HIGH/CRITICAL GuardDuty findings to SNS
# Create SNS topic for security alerts
aws sns create-topic --name security-incidents
# Subscribe the on-call engineer's email
aws sns subscribe \
--topic-arn arn:aws:sns:<region>:<account>:security-incidents \
--protocol email \
--notification-endpoint [email protected]
# Create EventBridge rule: GuardDuty severity >= 7 (HIGH)
aws events put-rule \
--name guardduty-high-severity \
--event-pattern '{
"source": ["aws.guardduty"],
"detail-type": ["GuardDuty Finding"],
"detail": {
"severity": [{"numeric": [">=", 7]}]
}
}'
aws events put-targets \
--rule guardduty-high-severity \
--targets '[{
"Id": "sns-security",
"Arn": "arn:aws:sns:<region>:<account>:security-incidents",
"InputTransformer": {
"InputPathsMap": {
"severity": "$.detail.severity",
"type": "$.detail.type",
"account": "$.detail.accountId",
"region": "$.region",
"description": "$.detail.description"
},
"InputTemplate": "\"SECURITY ALERT\\nType: <type>\\nSeverity: <severity>\\nAccount: <account>\\nRegion: <region>\\nDescription: <description>\""
}
}]'
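One way to verify this pipeline end to end is to generate sample findings and confirm the alert arrives. This is a sketch: it assumes a single GuardDuty detector already exists in the account, and the finding type shown is one example of a high-severity type.

```shell
# Look up the account's GuardDuty detector (assumes one detector per region)
DETECTOR_ID=$(aws guardduty list-detectors --query 'DetectorIds[0]' --output text)

# Generate a sample high-severity finding; it should flow through
# EventBridge -> SNS and land in the subscribed channels within minutes
aws guardduty create-sample-findings \
  --detector-id "$DETECTOR_ID" \
  --finding-types "Backdoor:EC2/C&CActivity.B!DNS"
```

Sample findings are clearly labeled as samples in the console, so they will not pollute real incident history.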
Route Security Hub CRITICAL findings
aws events put-rule \
--name securityhub-critical \
--event-pattern '{
"source": ["aws.securityhub"],
"detail-type": ["Security Hub Findings - Imported"],
"detail": {
"findings": {
"Severity": {"Label": ["CRITICAL"]},
"RecordState": ["ACTIVE"],
"Workflow": {"Status": ["NEW"]}
}
}
}'
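The rule above still needs a target before it delivers anything. A minimal version, reusing the same SNS topic as the GuardDuty pipeline:

```shell
# Attach the security-incidents SNS topic as the target for the Security Hub rule
aws events put-targets \
  --rule securityhub-critical \
  --targets '[{
    "Id": "sns-security",
    "Arn": "arn:aws:sns:<region>:<account>:security-incidents"
  }]'
```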
2. Incident Classification
Not every finding is an incident. Classify before escalating.
| Severity | GuardDuty score | Example | Response SLA |
|---|---|---|---|
| P1 — Critical | ≥ 8.0 | Cryptomining, data exfiltration, compromised credentials | 30 minutes |
| P2 — High | 7.0–7.9 | Unusual API calls, port scanning, brute force | 2 hours |
| P3 — Medium | 4.0–6.9 | Policy violations, misconfiguration findings | 24 hours |
| P4 — Low | < 4.0 | Informational, expected noise | Next business day |
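The thresholds in the table can be encoded in a small triage helper so on-call tooling classifies consistently. A sketch; the function name is ours and the cutoffs come straight from the table:

```shell
# Map a GuardDuty severity score to a P-level per the classification table
classify() {
  s=$1
  if awk "BEGIN{exit !($s>=8.0)}"; then echo "P1"
  elif awk "BEGIN{exit !($s>=7.0)}"; then echo "P2"
  elif awk "BEGIN{exit !($s>=4.0)}"; then echo "P3"
  else echo "P4"; fi
}

classify 8.2   # → P1
classify 5.5   # → P3
```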
3. Incident Response Runbook
Document and maintain this runbook. Auditors will ask to see it and may ask responders to walk through it.
Step 1 — Contain
# Isolate a compromised EC2 instance: attach a quarantine security group (no inbound, no outbound)
QSG_ID=$(aws ec2 create-security-group \
--group-name quarantine \
--description "Quarantine: no inbound or outbound traffic" \
--vpc-id vpc-xxxxxxxxxxxxxxxxx \
--query GroupId --output text)
# New security groups allow all outbound by default; revoke the default egress rule
aws ec2 revoke-security-group-egress \
--group-id "$QSG_ID" \
--ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'
# modify-instance-attribute requires the group ID, not the group name
aws ec2 modify-instance-attribute \
--instance-id i-xxxxxxxxxxxxxxxxx \
--groups "$QSG_ID"
# Disable a compromised IAM user's console access immediately
aws iam delete-login-profile --user-name <username>
aws iam put-user-policy \
--user-name <username> \
--policy-name DenyAll \
--policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Deny","Action":"*","Resource":"*"}]}'
# Deactivate compromised IAM access keys (issue replacement keys only after containment)
aws iam update-access-key \
--access-key-id <compromised-key-id> \
--status Inactive \
--user-name <username>
Step 2 — Investigate
# Query CloudTrail for API activity from a specific access key (last 24h)
# (BSD/macOS date flags shown; on GNU/Linux use: $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ))
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=AccessKeyId,AttributeValue=<key-id> \
--start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
--query 'Events[*].[EventTime,EventName,SourceIPAddress,Username]' \
--output table
# Query for all activity from a specific IP address
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=SourceIPAddress,AttributeValue=<ip> \
--query 'Events[*].[EventTime,EventName,Username]' \
--output table
# Check GuardDuty finding details
aws guardduty get-findings \
--detector-id <detector-id> \
--finding-ids <finding-id> \
--query 'Findings[0]' \
--output json
Step 3 — Eradicate
- Remove malicious resources (EC2 instances, IAM users, access keys, unauthorized roles).
- Revoke any active role sessions: attach an inline deny policy conditioned on aws:TokenIssueTime (the technique behind the IAM console's "Revoke sessions" button), or use IAM Identity Center session revocation.
- Patch or replace the affected system.
- Rotate all potentially exposed secrets in Secrets Manager.
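Session revocation for an assumed role can be sketched as follows. The role name is a placeholder; the policy denies every action for sessions whose credentials were issued before the moment the command runs:

```shell
# Deny all actions for role sessions issued before now. Existing temporary
# credentials become useless even though they have not technically expired.
aws iam put-role-policy \
  --role-name <compromised-role> \
  --policy-name AWSRevokeOlderSessions \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {"DateLessThan": {"aws:TokenIssueTime": "'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'"}}
    }]
  }'
```

Remove the inline policy once the role's trust and permissions have been re-verified, or legitimate workloads assuming the role will stay blocked.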
Step 4 — Recover
# Restore an RDS instance to a point in time
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier prod-db \
--target-db-instance-identifier prod-db-restored \
--restore-time 2024-01-15T03:00:00Z
# Restore from an AWS Backup recovery point
aws backup start-restore-job \
--recovery-point-arn arn:aws:backup:<region>:<account>:recovery-point:<id> \
--iam-role-arn arn:aws:iam::<account>:role/AWSBackupDefaultServiceRole \
--resource-type RDS \
--metadata TargetDBInstanceIdentifier=prod-db-restored
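Restores are asynchronous; for the incident timeline (and the post-incident review below) you will want the timestamps when recovery started and completed. A sketch for polling the job, assuming the job ID captured from the start-restore-job output:

```shell
# Monitor the restore job until Status is COMPLETED
aws backup describe-restore-job \
  --restore-job-id <restore-job-id> \
  --query '{Status: Status, PercentDone: PercentDone, CreationDate: CreationDate}'
```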
Step 5 — Post-incident review
Complete a post-incident review document within 5 business days. Retain for audit evidence.
Template:
Incident ID: INC-YYYY-NNN
Date/Time detected:
Date/Time resolved:
Severity: P1 / P2 / P3 / P4
Timeline:
HH:MM — Alert triggered (GuardDuty finding / manual detection)
HH:MM — On-call notified
HH:MM — Incident declared / severity assigned
HH:MM — Containment action taken
HH:MM — Root cause identified
HH:MM — System restored
HH:MM — Incident closed
Root cause:
Impact (systems affected, data involved, customers notified Y/N):
Containment actions taken:
Eradication actions taken:
Recovery actions taken:
Customer notification required? (Y/N)
If yes, date/method of notification:
Action items (owner, due date):
1.
2.
4. AWS Systems Manager Incident Manager
Incident Manager provides a structured, auditable incident lifecycle — declaration, escalation, runbook execution, and post-incident review — with a built-in evidence trail for auditors.
# Create a response plan
aws ssm-incidents create-response-plan \
--name soc2-p1-response \
--incident-template '{
"title": "P1 Security Incident",
"impact": 1,
"summary": "Critical security incident requiring immediate response"
}' \
--engagements '[
"arn:aws:ssm-contacts:<region>:<account>:contact/security-on-call"
]'
# Create a contact for on-call rotation
aws ssm-contacts create-contact \
--alias security-on-call \
--display-name "Security On-Call" \
--type PERSONAL \
--plan '{
"stages": [{
"durationInMinutes": 5,
"targets": [{
"channelTargetInfo": {
"contactChannelId": "arn:aws:ssm-contacts:<region>:<account>:contact-channel/...",
"retryIntervalInMinutes": 2
}
}]
}]
}'
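With the response plan in place, an incident can be declared manually (it can also be opened automatically from an EventBridge rule). A sketch that looks up the plan's ARN rather than hard-coding it, since Incident Manager ARN formats vary:

```shell
# Find the response plan created above, then declare an incident against it
PLAN_ARN=$(aws ssm-incidents list-response-plans \
  --query "responsePlanSummaries[?name=='soc2-p1-response'].arn | [0]" \
  --output text)

aws ssm-incidents start-incident \
  --response-plan-arn "$PLAN_ARN" \
  --title "P1: Compromised credentials in production"
```

Starting the incident triggers the plan's engagements (paging the security-on-call contact) and opens the timeline that later serves as audit evidence.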
Reference: AWS Systems Manager Incident Manager User Guide
5. Customer Notification Requirements
SOC 2 CC7.4 requires that affected customers are notified of security incidents within a timeframe documented in your security policy and customer agreements. Establish this before you need it:
- Define the notification SLA in your Terms of Service or DPA (commonly 72 hours for personal data incidents, 30 days for others).
- Identify who approves customer notifications (Legal, CISO, CEO).
- Maintain a customer contact list that is accessible during an incident.
- Prepare a notification template in advance.
SOC 2 Evidence for Incident Response
| Evidence item | Retention |
|---|---|
| Documented incident response policy | Permanent |
| Incident response runbook (version-controlled) | Permanent |
| Incident log / ticket history (all severities) | 3 years minimum |
| Post-incident review reports | 3 years minimum |
| Tabletop or live IR exercise records | Annual, retained 3 years |
| GuardDuty finding history | Available via aws guardduty list-findings |
| CloudTrail logs for incident period | 1 year minimum (7 years for regulated data) |
| Customer notification records (if applicable) | 3 years minimum |
| AWS Systems Manager Incident Manager timelines | Exported from console |
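One way to snapshot GuardDuty finding history for the audit period is a periodic export. A sketch, assuming a single detector and a 90-day window; note that list-findings returns only finding IDs, which can then be fed to get-findings for full detail:

```shell
# Epoch milliseconds for 90 days ago (GuardDuty finding criteria use millis)
SINCE_MS=$(( ( $(date +%s) - 90*86400 ) * 1000 ))

DETECTOR_ID=$(aws guardduty list-detectors --query 'DetectorIds[0]' --output text)

# Export IDs of all findings updated in the audit window
aws guardduty list-findings \
  --detector-id "$DETECTOR_ID" \
  --finding-criteria "{\"Criterion\":{\"updatedAt\":{\"Gte\":$SINCE_MS}}}" \
  --output json > guardduty-findings-90d.json
```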
Official references: