Incident Response

TSC mapping: CC7.3 (Incident evaluation), CC7.4 (Incident response), CC7.5 (Recovery from incidents)

SOC 2 auditors do not expect zero incidents. They expect a documented, tested process for detecting, classifying, containing, and recovering from incidents — and evidence that the process was followed.


1. Detection Pipeline

The detection chain connects GuardDuty, Security Hub, and Config findings into an automated alert pipeline.

GuardDuty Finding      ─┐
Security Hub Finding   ─┤→ EventBridge → SNS → PagerDuty / Slack / Email
Config Rule Violation  ─┘

Route HIGH/CRITICAL GuardDuty findings to SNS

# Create SNS topic for security alerts
aws sns create-topic --name security-incidents

# Subscribe the on-call engineer's email
aws sns subscribe \
  --topic-arn arn:aws:sns:<region>:<account>:security-incidents \
  --protocol email \
  --notification-endpoint [email protected]

# Create EventBridge rule: GuardDuty severity >= 7 (HIGH)
aws events put-rule \
  --name guardduty-high-severity \
  --event-pattern '{
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {
      "severity": [{"numeric": [">=", 7]}]
    }
  }'

aws events put-targets \
  --rule guardduty-high-severity \
  --targets '[{
    "Id": "sns-security",
    "Arn": "arn:aws:sns:<region>:<account>:security-incidents",
    "InputTransformer": {
      "InputPathsMap": {
        "severity": "$.detail.severity",
        "type": "$.detail.type",
        "account": "$.detail.accountId",
        "region": "$.region",
        "description": "$.detail.description"
      },
      "InputTemplate": "\"SECURITY ALERT\\nType: <type>\\nSeverity: <severity>\\nAccount: <account>\\nRegion: <region>\\nDescription: <description>\""
    }
  }]'

Route Security Hub CRITICAL findings

# Note: ASFF's WorkflowState field is deprecated; match on Workflow.Status instead
aws events put-rule \
  --name securityhub-critical \
  --event-pattern '{
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {
      "findings": {
        "Severity": {"Label": ["CRITICAL"]},
        "RecordState": ["ACTIVE"],
        "Workflow": {"Status": ["NEW"]}
      }
    }
  }'

# Attach the same SNS target with put-targets, as in the GuardDuty rule above

2. Incident Classification

Not every finding is an incident. Classify before escalating.

Severity       | GuardDuty score | Example                                                  | Response SLA
P1 — Critical  | ≥ 8.0           | Cryptomining, data exfiltration, compromised credentials | 30 minutes
P2 — High      | 7.0–7.9         | Unusual API calls, port scanning, brute force            | 2 hours
P3 — Medium    | 4.0–6.9         | Policy violations, misconfiguration findings             | 24 hours
P4 — Low       | < 4.0           | Informational, expected noise                            | Next business day
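The thresholds above can be turned into a small triage helper. A minimal sketch (the classify function is hypothetical; the cutoffs mirror the table):

```shell
# Hypothetical triage helper: map a GuardDuty severity score to the
# P-levels defined in the table above
classify() {
  awk -v s="$1" 'BEGIN {
    if (s >= 8.0)      print "P1"
    else if (s >= 7.0) print "P2"
    else if (s >= 4.0) print "P3"
    else               print "P4"
  }'
}

classify 8.3   # → P1
classify 5.5   # → P3
```

A helper like this is useful in an EventBridge-triggered Lambda or in the on-call bot, so severity assignment is consistent rather than judged per responder.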

3. Incident Response Runbook

Document and maintain this runbook. Auditors will ask to see it and may ask responders to walk through it.

Step 1 — Contain

# Isolate a compromised EC2 instance: attach a quarantine security group (no inbound, no outbound)
QUARANTINE_SG=$(aws ec2 create-security-group \
  --group-name quarantine \
  --description "Quarantine: no inbound or outbound traffic" \
  --vpc-id vpc-xxxxxxxxxxxxxxxxx \
  --query GroupId --output text)

# A new security group blocks inbound by default but allows all outbound;
# revoke the default egress rule so the group truly blocks both directions
aws ec2 revoke-security-group-egress \
  --group-id "$QUARANTINE_SG" \
  --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

aws ec2 modify-instance-attribute \
  --instance-id i-xxxxxxxxxxxxxxxxx \
  --groups "$QUARANTINE_SG"

# Disable a compromised IAM user immediately: remove console access, then deny all API actions
aws iam delete-login-profile --user-name <username>
aws iam put-user-policy \
  --user-name <username> \
  --policy-name DenyAll \
  --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Deny","Action":"*","Resource":"*"}]}'

# Rotate compromised IAM access keys: deactivate now, delete after the investigation
aws iam update-access-key \
  --access-key-id <compromised-key-id> \
  --status Inactive \
  --user-name <username>

Step 2 — Investigate

# Query CloudTrail for API activity from a specific access key (last 24h)
# (BSD/macOS date syntax; on GNU/Linux use: date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=<key-id> \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[*].[EventTime,EventName,SourceIPAddress,Username]' \
  --output table

# Query for all activity from a specific IP address. lookup-events does not
# support SourceIPAddress as a lookup attribute; query a CloudTrail Lake event
# data store (or Athena over the trail's S3 logs) instead
aws cloudtrail start-query \
  --query-statement "SELECT eventTime, eventName, userIdentity.userName FROM <event-data-store-id> WHERE sourceIPAddress = '<ip>'"

# Check GuardDuty finding details
aws guardduty get-findings \
  --detector-id <detector-id> \
  --finding-ids <finding-id> \
  --query 'Findings[0]' \
  --output json

Step 3 — Eradicate

  • Remove malicious resources (EC2 instances, IAM users, access keys, unauthorized roles).
  • Revoke active sessions: attach an inline deny policy conditioned on aws:TokenIssueTime (the IAM console's "Revoke active sessions" action), or use IAM Identity Center session revocation.
  • Patch or replace the affected system.
  • Rotate all potentially exposed secrets in Secrets Manager.
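Session revocation deserves a concrete sketch. The technique behind the IAM console's "Revoke active sessions" button is an inline deny policy conditioned on aws:TokenIssueTime; the role and policy names below are placeholders:

```shell
# Build the "revoke older sessions" deny policy; any session whose credentials
# were issued before REVOKE_TIME is denied all actions
REVOKE_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ)
POLICY=$(cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Action": "*",
    "Resource": "*",
    "Condition": {"DateLessThan": {"aws:TokenIssueTime": "$REVOKE_TIME"}}
  }]
}
EOF
)
echo "$POLICY"

# Attach it to the compromised role:
#   aws iam put-role-policy --role-name <role-name> \
#     --policy-name AWSRevokeOlderSessions --policy-document "$POLICY"
```

This invalidates already-issued temporary credentials without deleting the role, so legitimate workloads can resume by re-assuming the role after containment.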

Step 4 — Recover

# Restore an RDS instance to a point in time
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier prod-db \
  --target-db-instance-identifier prod-db-restored \
  --restore-time 2024-01-15T03:00:00Z

# Restore from an AWS Backup recovery point
aws backup start-restore-job \
  --recovery-point-arn arn:aws:backup:<region>:<account>:recovery-point:<id> \
  --iam-role-arn arn:aws:iam::<account>:role/AWSBackupDefaultServiceRole \
  --resource-type RDS \
  --metadata TargetDBInstanceIdentifier=prod-db-restored
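When picking --restore-time, restore to a point safely before the earliest malicious event found in the CloudTrail timeline. A sketch (GNU date; the incident-start timestamp is a placeholder):

```shell
# Restore to 15 minutes before the first malicious API call
INCIDENT_START="2024-01-15 03:10:00 UTC"
RESTORE_TIME=$(date -u -d "$INCIDENT_START - 15 minutes" +%Y-%m-%dT%H:%M:%SZ)
echo "$RESTORE_TIME"   # 2024-01-15T02:55:00Z

# Then wait for the restored instance before cutting traffic over:
#   aws rds wait db-instance-available --db-instance-identifier prod-db-restored
```

Record the chosen restore point in the incident timeline; auditors will want to see that the recovery decision was deliberate, not ad hoc.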

Step 5 — Post-incident review

Complete a post-incident review document within 5 business days. Retain for audit evidence.

Template:

Incident ID: INC-YYYY-NNN
Date/Time detected:
Date/Time resolved:
Severity: P1 / P2 / P3 / P4

Timeline:
HH:MM — Alert triggered (GuardDuty finding / manual detection)
HH:MM — On-call notified
HH:MM — Incident declared / severity assigned
HH:MM — Containment action taken
HH:MM — Root cause identified
HH:MM — System restored
HH:MM — Incident closed

Root cause:

Impact (systems affected, data involved, customers notified Y/N):

Containment actions taken:

Eradication actions taken:

Recovery actions taken:

Customer notification required? (Y/N)
If yes, date/method of notification:

Action items (owner, due date):
1.
2.
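The INC-YYYY-NNN identifier can be generated mechanically so numbering stays consistent across responders; a trivial sketch (the sequence number would come from your ticket tracker):

```shell
# Generate an incident ID in the INC-YYYY-NNN format used by the template above
next_incident_id() {
  printf 'INC-%s-%03d\n' "$(date -u +%Y)" "$1"
}

next_incident_id 7   # e.g. INC-2025-007
```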

4. AWS Systems Manager Incident Manager

Incident Manager provides a structured, auditable incident lifecycle — declaration, escalation, runbook execution, and post-incident review — with a built-in evidence trail for auditors.

# Create a response plan
aws ssm-incidents create-response-plan \
  --name soc2-p1-response \
  --incident-template '{
    "title": "P1 Security Incident",
    "impact": 1,
    "summary": "Critical security incident requiring immediate response"
  }' \
  --engagements '[
    "arn:aws:ssm-contacts:<region>:<account>:contact/security-on-call"
  ]'

# Create a contact for on-call rotation
aws ssm-contacts create-contact \
  --alias security-on-call \
  --display-name "Security On-Call" \
  --type PERSONAL \
  --plan '{
    "stages": [{
      "durationInMinutes": 5,
      "targets": [{
        "channelTargetInfo": {
          "contactChannelId": "arn:aws:ssm-contacts:<region>:<account>:contact-channel/...",
          "retryIntervalInMinutes": 2
        }
      }]
    }]
  }'

Reference: AWS Systems Manager Incident Manager documentation


5. Customer Notification Requirements

SOC 2 CC7.4 requires that affected customers are notified of security incidents within a timeframe documented in your security policy and customer agreements. Establish this before you need it:

  • Define the notification SLA in your Terms of Service or DPA (commonly 72 hours for personal data incidents, 30 days for others).
  • Identify who approves customer notifications (Legal, CISO, CEO).
  • Maintain a customer contact list that is accessible during an incident.
  • Prepare a notification template in advance.
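Assuming the 72-hour SLA mentioned above, the notification deadline can be computed directly from the detection timestamp (GNU date; the timestamp is a placeholder):

```shell
# Compute the customer-notification deadline from the detection time
DETECTED="2024-01-15 03:00:00 UTC"
DEADLINE=$(date -u -d "$DETECTED + 72 hours" +%Y-%m-%dT%H:%M:%SZ)
echo "Notify customers by: $DEADLINE"   # 2024-01-18T03:00:00Z
```

Putting the deadline into the incident ticket at declaration time keeps the clock visible to responders and leaves evidence that the SLA was tracked.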

SOC 2 Evidence for Incident Response

Evidence item                                   | Retention
Documented incident response policy             | Permanent
Incident response runbook (version-controlled)  | Permanent
Incident log / ticket history (all severities)  | 3 years minimum
Post-incident review reports                    | 3 years minimum
Tabletop or live IR exercise records            | Annual, retained 3 years
GuardDuty finding history                       | Available via aws guardduty list-findings
CloudTrail logs for incident period             | 1 year minimum (7 years for regulated data)
Customer notification records (if applicable)   | 3 years minimum
AWS Systems Manager Incident Manager timelines  | Exported from console
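One way to enforce the 3-year minimums above is S3 Object Lock on the evidence bucket, which prevents deletion before the retention period lapses. A sketch (bucket name is a placeholder; the bucket must be created with Object Lock enabled; 1095 days ≈ 3 years):

```shell
# Default retention rule: objects cannot be deleted for 3 years (1095 days)
cat > object-lock.json <<'EOF'
{
  "ObjectLockEnabled": "Enabled",
  "Rule": {
    "DefaultRetention": {"Mode": "GOVERNANCE", "Days": 1095}
  }
}
EOF

# Apply with:
#   aws s3api put-object-lock-configuration \
#     --bucket <evidence-bucket> --object-lock-configuration file://object-lock.json
```

GOVERNANCE mode still lets privileged principals override retention; COMPLIANCE mode makes the lock absolute, which some auditors prefer for evidence buckets.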

Official references: