Incident Response

TSC mapping: CC7.3 (Incident evaluation), CC7.4 (Incident response), CC7.5 (Recovery from incidents)

SOC 2 auditors do not expect zero incidents. They expect a documented, tested process for detecting, classifying, containing, and recovering from incidents — and evidence that the process was followed.


1. Detection Pipeline

The detection chain connects GuardDuty, Security Hub, and Config findings into an automated alert pipeline.

GuardDuty Finding      ─┐
Security Hub Finding   ─┤→ EventBridge → SNS → PagerDuty / Slack / Email
Config Rule Violation  ─┘

Route HIGH/CRITICAL GuardDuty findings to SNS

# Create SNS topic for security alerts
aws sns create-topic --name security-incidents

# Subscribe the on-call engineer's email
aws sns subscribe \
  --topic-arn arn:aws:sns:<region>:<account>:security-incidents \
  --protocol email \
  --notification-endpoint [email protected]

# Create EventBridge rule: GuardDuty severity >= 7 (HIGH)
aws events put-rule \
  --name guardduty-high-severity \
  --event-pattern '{
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {
      "severity": [{"numeric": [">=", 7]}]
    }
  }'

aws events put-targets \
  --rule guardduty-high-severity \
  --targets '[{
    "Id": "sns-security",
    "Arn": "arn:aws:sns:<region>:<account>:security-incidents",
    "InputTransformer": {
      "InputPathsMap": {
        "severity": "$.detail.severity",
        "type": "$.detail.type",
        "account": "$.detail.accountId",
        "region": "$.region",
        "description": "$.detail.description"
      },
      "InputTemplate": "\"SECURITY ALERT\\nType: <type>\\nSeverity: <severity>\\nAccount: <account>\\nRegion: <region>\\nDescription: <description>\""
    }
  }]'

Route Security Hub CRITICAL findings

# Note: ASFF's WorkflowState field is deprecated; match on Workflow.Status instead
aws events put-rule \
  --name securityhub-critical \
  --event-pattern '{
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {
      "findings": {
        "Severity": {"Label": ["CRITICAL"]},
        "RecordState": ["ACTIVE"],
        "Workflow": {"Status": ["NEW"]}
      }
    }
  }'

# Attach the same SNS target with put-targets, as in the GuardDuty rule above

2. Incident Classification

Not every finding is an incident. Classify before escalating.

Severity       | GuardDuty score | Example                                                  | Response SLA
P1 — Critical  | ≥ 8.0           | Cryptomining, data exfiltration, compromised credentials | 30 minutes
P2 — High      | 7.0–7.9         | Unusual API calls, port scanning, brute force            | 2 hours
P3 — Medium    | 4.0–6.9         | Policy violations, misconfiguration findings             | 24 hours
P4 — Low       | < 4.0           | Informational, expected noise                            | Next business day
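The thresholds above can be turned into a small triage helper. A minimal sketch (the classify function is hypothetical; the cutoffs mirror the table):

```shell
# Hypothetical triage helper: map a GuardDuty severity score to the
# P-levels defined in the table above
classify() {
  awk -v s="$1" 'BEGIN {
    if (s >= 8.0)      print "P1"
    else if (s >= 7.0) print "P2"
    else if (s >= 4.0) print "P3"
    else               print "P4"
  }'
}

classify 8.3   # → P1
classify 5.5   # → P3
```

A helper like this is useful in an EventBridge-triggered Lambda or in the on-call bot, so severity assignment is consistent rather than judged per responder.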

3. Incident Response Runbook

Document and maintain this runbook. Auditors will ask to see it and may ask responders to walk through it.

Step 1 — Contain

# Isolate a compromised EC2 instance: attach a quarantine security group (no inbound, no outbound)
QUARANTINE_SG=$(aws ec2 create-security-group \
  --group-name quarantine \
  --description "Quarantine: no inbound or outbound traffic" \
  --vpc-id vpc-xxxxxxxxxxxxxxxxx \
  --query GroupId --output text)

# A new security group blocks inbound by default but allows all outbound;
# revoke the default egress rule so the group truly blocks both directions
aws ec2 revoke-security-group-egress \
  --group-id "$QUARANTINE_SG" \
  --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

aws ec2 modify-instance-attribute \
  --instance-id i-xxxxxxxxxxxxxxxxx \
  --groups "$QUARANTINE_SG"

# Disable a compromised IAM user immediately: remove console access, then deny all API actions
aws iam delete-login-profile --user-name <username>
aws iam put-user-policy \
  --user-name <username> \
  --policy-name DenyAll \
  --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Deny","Action":"*","Resource":"*"}]}'

# Rotate compromised IAM access keys: deactivate now, delete after the investigation
aws iam update-access-key \
  --access-key-id <compromised-key-id> \
  --status Inactive \
  --user-name <username>

Step 2 — Investigate

# Query CloudTrail for API activity from a specific access key (last 24h)
# (BSD/macOS date syntax; on GNU/Linux use: date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=<key-id> \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[*].[EventTime,EventName,SourceIPAddress,Username]' \
  --output table

# Query for all activity from a specific IP address. lookup-events does not
# support SourceIPAddress as a lookup attribute; query a CloudTrail Lake event
# data store (or Athena over the trail's S3 logs) instead
aws cloudtrail start-query \
  --query-statement "SELECT eventTime, eventName, userIdentity.userName FROM <event-data-store-id> WHERE sourceIPAddress = '<ip>'"

# Check GuardDuty finding details
aws guardduty get-findings \
  --detector-id <detector-id> \
  --finding-ids <finding-id> \
  --query 'Findings[0]' \
  --output json

Step 3 — Eradicate

  • Remove malicious resources (EC2 instances, IAM users, access keys, unauthorized roles).
  • Revoke active sessions: attach an inline deny policy conditioned on aws:TokenIssueTime (the IAM console's "Revoke active sessions" action), or use IAM Identity Center session revocation.
  • Patch or replace the affected system.
  • Rotate all potentially exposed secrets in Secrets Manager.
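Session revocation deserves a concrete sketch. The technique behind the IAM console's "Revoke active sessions" button is an inline deny policy conditioned on aws:TokenIssueTime; the role and policy names below are placeholders:

```shell
# Build the "revoke older sessions" deny policy; any session whose credentials
# were issued before REVOKE_TIME is denied all actions
REVOKE_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ)
POLICY=$(cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Action": "*",
    "Resource": "*",
    "Condition": {"DateLessThan": {"aws:TokenIssueTime": "$REVOKE_TIME"}}
  }]
}
EOF
)
echo "$POLICY"

# Attach it to the compromised role:
#   aws iam put-role-policy --role-name <role-name> \
#     --policy-name AWSRevokeOlderSessions --policy-document "$POLICY"
```

This invalidates already-issued temporary credentials without deleting the role, so legitimate workloads can resume by re-assuming the role after containment.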

Step 4 — Recover

# Restore an RDS instance to a point in time
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier prod-db \
  --target-db-instance-identifier prod-db-restored \
  --restore-time 2024-01-15T03:00:00Z

# Restore from an AWS Backup recovery point
aws backup start-restore-job \
  --recovery-point-arn arn:aws:backup:<region>:<account>:recovery-point:<id> \
  --iam-role-arn arn:aws:iam::<account>:role/AWSBackupDefaultServiceRole \
  --resource-type RDS \
  --metadata TargetDBInstanceIdentifier=prod-db-restored
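When picking --restore-time, restore to a point safely before the earliest malicious event found in the CloudTrail timeline. A sketch (GNU date; the incident-start timestamp is a placeholder):

```shell
# Restore to 15 minutes before the first malicious API call
INCIDENT_START="2024-01-15 03:10:00 UTC"
RESTORE_TIME=$(date -u -d "$INCIDENT_START - 15 minutes" +%Y-%m-%dT%H:%M:%SZ)
echo "$RESTORE_TIME"   # 2024-01-15T02:55:00Z

# Then wait for the restored instance before cutting traffic over:
#   aws rds wait db-instance-available --db-instance-identifier prod-db-restored
```

Record the chosen restore point in the incident timeline; auditors will want to see that the recovery decision was deliberate, not ad hoc.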

Step 5 — Post-incident review

Complete a post-incident review document within 5 business days. Retain for audit evidence.

Template:

Incident ID: INC-YYYY-NNN
Date/Time detected:
Date/Time resolved:
Severity: P1 / P2 / P3 / P4

Timeline:
HH:MM — Alert triggered (GuardDuty finding / manual detection)
HH:MM — On-call notified
HH:MM — Incident declared / severity assigned
HH:MM — Containment action taken
HH:MM — Root cause identified
HH:MM — System restored
HH:MM — Incident closed

Root cause:

Impact (systems affected, data involved, customers notified Y/N):

Containment actions taken:

Eradication actions taken:

Recovery actions taken:

Customer notification required? (Y/N)
If yes, date/method of notification:

Action items (owner, due date):
1.
2.
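The INC-YYYY-NNN identifier can be generated mechanically so numbering stays consistent across responders; a trivial sketch (the sequence number would come from your ticket tracker):

```shell
# Generate an incident ID in the INC-YYYY-NNN format used by the template above
next_incident_id() {
  printf 'INC-%s-%03d\n' "$(date -u +%Y)" "$1"
}

next_incident_id 7   # e.g. INC-2025-007
```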

4. AWS Systems Manager Incident Manager

Incident Manager provides a structured, auditable incident lifecycle — declaration, escalation, runbook execution, and post-incident review — with a built-in evidence trail for auditors.

# Create a response plan
aws ssm-incidents create-response-plan \
  --name soc2-p1-response \
  --incident-template '{
    "title": "P1 Security Incident",
    "impact": 1,
    "summary": "Critical security incident requiring immediate response"
  }' \
  --engagements '[
    "arn:aws:ssm-contacts:<region>:<account>:contact/security-on-call"
  ]'

# Create a contact for on-call rotation
aws ssm-contacts create-contact \
  --alias security-on-call \
  --display-name "Security On-Call" \
  --type PERSONAL \
  --plan '{
    "stages": [{
      "durationInMinutes": 5,
      "targets": [{
        "channelTargetInfo": {
          "contactChannelId": "arn:aws:ssm-contacts:<region>:<account>:contact-channel/...",
          "retryIntervalInMinutes": 2
        }
      }]
    }]
  }'

Reference: AWS Systems Manager Incident Manager documentation


5. Customer Notification Requirements

SOC 2 CC7.4 requires that affected customers are notified of security incidents within a timeframe documented in your security policy and customer agreements. Establish this before you need it:

  • Define the notification SLA in your Terms of Service or DPA (commonly 72 hours for personal data incidents, 30 days for others).
  • Identify who approves customer notifications (Legal, CISO, CEO).
  • Maintain a customer contact list that is accessible during an incident.
  • Prepare a notification template in advance.
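Assuming the 72-hour SLA mentioned above, the notification deadline can be computed directly from the detection timestamp (GNU date; the timestamp is a placeholder):

```shell
# Compute the customer-notification deadline from the detection time
DETECTED="2024-01-15 03:00:00 UTC"
DEADLINE=$(date -u -d "$DETECTED + 72 hours" +%Y-%m-%dT%H:%M:%SZ)
echo "Notify customers by: $DEADLINE"   # 2024-01-18T03:00:00Z
```

Putting the deadline into the incident ticket at declaration time keeps the clock visible to responders and leaves evidence that the SLA was tracked.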

SOC 2 Evidence for Incident Response

Evidence item                                   | Retention
Documented incident response policy             | Permanent
Incident response runbook (version-controlled)  | Permanent
Incident log / ticket history (all severities)  | 3 years minimum
Post-incident review reports                    | 3 years minimum
Tabletop or live IR exercise records            | Annual, retained 3 years
GuardDuty finding history                       | Available via aws guardduty list-findings
CloudTrail logs for incident period             | 1 year minimum (7 years for regulated data)
Customer notification records (if applicable)   | 3 years minimum
AWS Systems Manager Incident Manager timelines  | Exported from console
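One way to enforce the 3-year minimums above is S3 Object Lock on the evidence bucket, which prevents deletion before the retention period lapses. A sketch (bucket name is a placeholder; the bucket must be created with Object Lock enabled; 1095 days ≈ 3 years):

```shell
# Default retention rule: objects cannot be deleted for 3 years (1095 days)
cat > object-lock.json <<'EOF'
{
  "ObjectLockEnabled": "Enabled",
  "Rule": {
    "DefaultRetention": {"Mode": "GOVERNANCE", "Days": 1095}
  }
}
EOF

# Apply with:
#   aws s3api put-object-lock-configuration \
#     --bucket <evidence-bucket> --object-lock-configuration file://object-lock.json
```

GOVERNANCE mode still lets privileged principals override retention; COMPLIANCE mode makes the lock absolute, which some auditors prefer for evidence buckets.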

Official references: