Incident Response Playbooks for Security Engineers

When an incident happens, preparation beats improvisation. This guide provides actionable playbooks for common security incidents that you can adapt to your environment.

Quick Reference

| Incident Type | Severity | Time to Contain | Key Actions |
|---|---|---|---|
| Ransomware | Critical | Hours | Isolate, preserve, assess |
| Data Breach | Critical | Hours | Scope, contain, notify |
| Compromised Account | High | Hours | Disable, investigate, reset |
| Malware Infection | Medium-High | Hours-Days | Isolate, analyze, remediate |
| Phishing | Medium | Days | Block, investigate, educate |

IR Framework: PICERL

┌─────────────────────────────────────────────────────────────┐
│                    PICERL Framework                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  P - Preparation                                            │
│      └── Before incidents: tools, training, procedures      │
│                                                             │
│  I - Identification                                         │
│      └── Detect and confirm: is this an incident?          │
│                                                             │
│  C - Containment                                            │
│      └── Stop the bleeding: short-term and long-term       │
│                                                             │
│  E - Eradication                                            │
│      └── Remove the threat: malware, access, vulnerabilities│
│                                                             │
│  R - Recovery                                               │
│      └── Restore operations: systems, data, confidence     │
│                                                             │
│  L - Lessons Learned                                        │
│      └── Post-incident: what happened, how to improve      │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Playbook 1: Ransomware Attack

Identification

Indicators:

Initial Triage (15 minutes):

□ Confirm ransomware (not encryption by legitimate tool)
□ Identify affected systems
□ Determine ransomware variant (ransom note, file extension)
□ Assess spread (still active or dormant?)
□ Declare incident severity: CRITICAL

Containment (0-4 hours)

IMMEDIATE (0-30 minutes):
□ DO NOT shut down affected systems (preserve memory)
□ Disconnect affected systems from network
□ Disable shared drives and network shares
□ Isolate backup systems (verify not compromised)
□ Block known malicious IPs/domains at firewall

SHORT-TERM (30 min - 4 hours):
□ Identify patient zero and initial infection vector
□ Implement network segmentation
□ Disable compromised accounts
□ Preserve forensic evidence (memory dumps, disk images)
□ Assess backup integrity

Network Isolation Commands:

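# Windows (PowerShell) - Disable all network adapters
# Note: this also cuts any remote session you are using to run it
Get-NetAdapter | Disable-NetAdapter -Confirm:$false

# Linux - Bring down an interface (eth0 is an example; list names with `ip link show`)
ip link set eth0 down

# Or at firewall/switch level (preferred)
# Block specific VLAN or segment so the host itself stays untouched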

Eradication

□ Identify all affected systems via EDR/logs
□ Determine initial access vector
□ Remove malware artifacts
□ Reset all potentially compromised credentials
□ Patch vulnerability used for initial access
□ Review and harden affected systems

Recovery

BEFORE RESTORATION:
□ Verify backup integrity (check for ransomware)
□ Confirm systems are clean before reconnecting
□ Test restoration on isolated network first

RESTORATION:
□ Restore from last known good backup
□ Reconnect systems in phases
□ Monitor for reinfection
□ Verify application functionality

Decision Framework: Pay or Not?

| Factor | Considerations |
|---|---|
| Backup availability | Good backups = don’t pay |
| Business impact | Mission-critical systems affected? |
| Decryptor availability | Check NoMoreRansom.org |
| Legal/regulatory | Some jurisdictions prohibit payment |
| Threat actor reputation | Some never provide keys |
| Insurance coverage | May cover payment (check policy) |

Recommendation: Generally don’t pay. Payment funds criminal operations and doesn’t guarantee decryption.


Playbook 2: Data Breach

Identification

Indicators:

Initial Assessment:

□ What data was accessed/exfiltrated?
□ How many records affected?
□ Data classification (PII, PHI, PCI, IP?)
□ Regulatory implications (GDPR, CCPA, HIPAA?)
□ Is exfiltration ongoing?

Containment

IMMEDIATE:
□ Block attacker access (revoke credentials, block IPs)
□ Preserve logs and evidence
□ Identify all affected systems
□ Document timeline of events

DATA ASSESSMENT:
□ Identify specific data accessed
□ Determine record count
□ Assess sensitivity level
□ Check for encryption (was data encrypted at rest?)

Evidence Collection:

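# Preserve logs
tar -czvf logs_backup_$(date +%Y%m%d).tar.gz /var/log/

# Database query logs (only populated if general_log=ON and log_output=TABLE)
# mysqldump does not dump the contents of mysql.general_log, so query the table directly
mysql -e "SELECT * FROM mysql.general_log" > query_logs.tsv

# AWS CloudTrail - lookup-events returns management events only
# S3 data events such as GetObject must be queried from the trail's log bucket (for example with Athena)
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=ConsoleLogin --start-time 2024-01-01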
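
Collected evidence is only useful later if you can show it has not changed since collection. The following is a minimal sketch, assuming Python 3 and only the standard library; the function names and the record fields are hypothetical, not part of any standard. It hashes an evidence file with SHA-256 and builds a simple custody record.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_evidence(path: Path, collected_by: str) -> dict:
    """Build a custody record (hypothetical field names) for one evidence file."""
    return {
        "file": str(path),
        "size_bytes": path.stat().st_size,
        "sha256": sha256_of_file(path),
        "collected_by": collected_by,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
    }
```

Calling `record_evidence(Path("logs_backup_20240101.tar.gz"), "analyst")` on the archive created above would give a record you can store alongside the incident ticket; re-hashing the file later and comparing digests shows whether it was modified.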

Notification Requirements

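| Data Type | Regulation | Notification Timeline |
|---|---|---|
| EU personal data | GDPR | Within 72 hours to regulator |
| US health data | HIPAA | Within 60 days to affected individuals |
| CA residents | CCPA / California breach notification law | "Most expedient time possible" |
| Payment cards | PCI-DSS | Immediately to card brands |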
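
The fixed windows in the table translate directly into deadlines once the discovery time is known. The sketch below assumes Python 3; the dictionary name, the function name, and the example discovery time are illustrative, and only the 72-hour and 60-day windows come from the table. Treat the output as a planning aid and confirm actual obligations with legal counsel.

```python
from datetime import datetime, timedelta, timezone

# Fixed windows from the table; the other rows have no numeric deadline.
NOTIFICATION_WINDOWS = {
    "GDPR (regulator)": timedelta(hours=72),
    "HIPAA (affected individuals)": timedelta(days=60),
}


def notification_deadlines(discovered_at: datetime) -> dict:
    """Map each regulation to the latest notification time after discovery."""
    return {
        name: discovered_at + window
        for name, window in NOTIFICATION_WINDOWS.items()
    }


# Example discovery time (an assumption for illustration).
discovered = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
for regulation, deadline in notification_deadlines(discovered).items():
    print(f"{regulation}: notify by {deadline.isoformat()}")
```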

Notification Template:

Subject: Important Security Notice

Dear [Name],

We are writing to inform you of a security incident that may have
affected your personal information.

What Happened:
[Brief, factual description]

What Information Was Involved:
[Specific data types - name, email, etc.]

What We Are Doing:
[Actions taken and ongoing]

What You Can Do:
[Specific recommendations - password reset, monitoring, etc.]

For More Information:
[Contact details, FAQ link]

We sincerely apologize for any concern this may cause.

Eradication and Recovery

□ Patch vulnerability/close attack vector
□ Reset all potentially compromised credentials
□ Review and enhance access controls
□ Implement additional monitoring
□ Conduct full security review

Playbook 3: Compromised Account

Identification

Indicators:

Triage Questions:

□ Which account is compromised?
□ Account type (user, service, admin)?
□ What access does this account have?
□ When did compromise likely occur?
□ Is attacker currently active?

Containment

IMMEDIATE (0-15 minutes):
□ Disable the account
□ Revoke all active sessions
□ Block associated IP addresses
□ Reset password (if keeping account)
□ Notify the account owner

INVESTIGATION:
□ Review authentication logs
□ Check for persistence (forwarding rules, OAuth apps)
□ Identify accessed resources
□ Check for lateral movement

Session Revocation:

# Azure AD - Revoke all sessions
# (AzureAD module; the Microsoft Graph PowerShell equivalent is Revoke-MgUserSignInSession)
Revoke-AzureADUserAllRefreshToken -ObjectId user@company.com

# Google Workspace
gam user user@company.com deprovision

# AWS IAM
aws iam delete-login-profile --user-name compromised-user
aws iam list-access-keys --user-name compromised-user
aws iam delete-access-key --user-name compromised-user --access-key-id AKIA...

Check for Persistence:

# Office 365 - Check for mail forwarding and redirect rules
Get-InboxRule -Mailbox user@company.com | Where-Object {$_.ForwardTo -or $_.ForwardAsAttachmentTo -or $_.RedirectTo}

# Check OAuth applications
Get-AzureADUserOAuth2PermissionGrant -ObjectId user@company.com

# Check for delegated access
Get-MailboxPermission user@company.com | Where-Object {$_.IsInherited -eq $false}

Post-Containment

□ Forensic review of account activity
□ Reset password with strong, unique credential
□ Re-enable MFA (require re-registration)
□ Review and remove unnecessary permissions
□ User security training
□ Monitor for signs of continued access

Playbook 4: Malware Infection

Identification

Indicators:

Triage:

□ What type of malware? (ransomware, RAT, cryptominer, etc.)
□ How many systems affected?
□ Is it spreading?
□ What's the business impact?
□ Is data at risk?

Containment

IMMEDIATE:
□ Isolate infected system(s)
□ Collect memory dump BEFORE shutdown
□ Block C2 IPs/domains
□ Identify potentially affected systems
□ Preserve evidence

DO NOT:
□ Reinstall immediately (destroys forensic evidence)
□ Run multiple AV scans (may alert the malware)
□ Power off without capturing memory

Memory Acquisition:

# Linux - Using LiME
insmod lime.ko "path=/evidence/memory.lime format=lime"

# Windows - Using winpmem
winpmem_mini_x64.exe memory.raw

# Volatility analysis
vol.py -f memory.raw windows.pstree
vol.py -f memory.raw windows.malfind
vol.py -f memory.raw windows.netscan

Analysis

Questions to Answer:

□ What is the malware family?
□ What is the initial infection vector?
□ What are the C2 servers?
□ What capabilities does it have?
□ Has it spread to other systems?
□ What data has it accessed/exfiltrated?

Basic Analysis:

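# Check file hash (Linux; on Windows use Get-FileHash suspicious_file.exe)
sha256sum suspicious_file.exe

# Look up the hash on VirusTotal (vt file takes a hash, not a path, so the sample is not uploaded)
vt file "$(sha256sum suspicious_file.exe | cut -d' ' -f1)"

# Network connections (Windows)
netstat -ano | findstr ESTABLISHED

# Process list (Windows)
tasklist /v

# Autostart locations (Windows, Sysinternals Autoruns)
autorunsc -accepteula -a * -c -h -s -v -vt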

Eradication

□ Identify all malware artifacts
□ Remove malware from all systems
□ Close infection vector
□ Reset compromised credentials
□ Update signatures/IOCs
□ Verify clean state

Recovery

□ Rebuild from clean image (preferred)
□ Or remove malware and verify clean
□ Restore user data from backup
□ Reconnect to network in phases
□ Enhanced monitoring for 30 days

Playbook 5: Phishing Incident

Identification

Reported By:

Triage:

□ What type of phishing? (credential, malware, BEC)
□ How many users received it?
□ How many clicked?
□ How many submitted credentials?
□ Was there malware involved?

Containment

IMMEDIATE:
□ Block sender email/domain
□ Delete email from all mailboxes
□ Block malicious URL at proxy
□ Identify all recipients

IF CREDENTIALS SUBMITTED:
□ Force password reset
□ Revoke sessions
□ Enable/verify MFA
□ Check for unauthorized access

Email Removal (Office 365):

# Search and delete phishing email
$search = New-ComplianceSearch -Name "PhishRemoval" -ExchangeLocation All -ContentMatchQuery 'from:attacker@evil.com AND subject:"Urgent"'
Start-ComplianceSearch -Identity "PhishRemoval"

# Review results
Get-ComplianceSearch -Identity "PhishRemoval" | FL

# Purge (hard delete)
New-ComplianceSearchAction -SearchName "PhishRemoval" -Purge -PurgeType HardDelete

Investigation

□ Analyze phishing email headers
□ Identify hosting infrastructure
□ Check for lookalike domains
□ Determine campaign scope
□ Report to abuse contacts
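
Checking for lookalike domains can be partly automated by measuring how close a sender domain is to your own. The sketch below assumes Python 3 and uses plain Levenshtein edit distance; the function names and the threshold of 2 are assumptions, and `example.com` stands in for your real domain. It will not catch Unicode homoglyphs or domains that add whole words, so treat it as a first-pass filter.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def is_lookalike(candidate: str, legitimate: str, max_distance: int = 2) -> bool:
    """Flag domains that are close to, but not identical to, the legitimate one."""
    candidate, legitimate = candidate.lower(), legitimate.lower()
    if candidate == legitimate:
        return False
    return edit_distance(candidate, legitimate) <= max_distance


print(is_lookalike("examp1e.com", "example.com"))
```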

Header Analysis:

Key Headers to Check:
- Return-Path (actual sender)
- Received (email route)
- X-Originating-IP
- Authentication-Results (SPF, DKIM, DMARC)
- Message-ID
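
The headers listed above can be pulled out of a raw message with Python's standard `email` package. This is a minimal sketch under stated assumptions: the sample message, its addresses, and the `summarize_headers` function and its output keys are all invented for illustration, and the regular expression only reads the simple `spf=`, `dkim=`, and `dmarc=` verdicts from Authentication-Results.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Invented sample message using documentation-reserved domains and IPs.
SAMPLE = """\
Return-Path: <bounce@mailer.example.net>
Received: from mail.example.net (mail.example.net [203.0.113.10])
    by mx.example.com; Mon, 1 Jan 2024 09:00:00 +0000
Authentication-Results: mx.example.com; spf=fail smtp.mailfrom=mailer.example.net; dkim=none; dmarc=fail
From: "IT Support" <support@example.com>
Message-ID: <abc123@mailer.example.net>
Subject: Urgent

Please reset your password.
"""


def summarize_headers(raw_message: str) -> dict:
    """Extract the headers most useful for phishing triage."""
    msg = message_from_string(raw_message)
    auth = " ".join(msg.get_all("Authentication-Results", []))
    verdicts = re.findall(r"\b(spf|dkim|dmarc)=(\w+)", auth, flags=re.IGNORECASE)
    return_path = parseaddr(msg.get("Return-Path", ""))[1]
    from_addr = parseaddr(msg.get("From", ""))[1]
    return {
        "return_path": return_path,
        "from": from_addr,
        # A mismatch is only a signal; legitimate bulk mail often differs too.
        "domain_mismatch": return_path.rpartition("@")[2].lower()
        != from_addr.rpartition("@")[2].lower(),
        "received_hops": len(msg.get_all("Received", [])),
        "auth": {name.lower(): result.lower() for name, result in verdicts},
        "message_id": msg.get("Message-ID"),
    }


summary = summarize_headers(SAMPLE)
print(summary["auth"], summary["domain_mismatch"])
```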

Post-Incident

□ User education (targeted training for clickers)
□ Update email filters
□ Consider simulated phishing exercise
□ Review security awareness program
□ Document lessons learned

Communication Templates

Internal Escalation

SUBJECT: [SEVERITY] Security Incident - [Brief Description]

SEVERITY: Critical/High/Medium/Low
STATUS: Active/Contained/Resolved
INCIDENT ID: INC-2024-001

SUMMARY:
[2-3 sentences describing what happened]

IMPACT:
- Systems affected: [list]
- Users affected: [count]
- Data at risk: [yes/no, type]
- Business impact: [description]

CURRENT STATUS:
- What we know: [facts]
- What we don't know: [gaps]
- Current actions: [what's being done]

NEXT STEPS:
1. [Action] - [Owner] - [Timeline]
2. [Action] - [Owner] - [Timeline]

NEXT UPDATE: [Time]

CONTACT: [IR Lead name and contact]

Executive Update

SUBJECT: Security Incident Update - [Time]

SITUATION:
[One paragraph summary suitable for executives]

IMPACT:
☐ Customer data at risk: [Yes/No]
☐ Business operations affected: [Yes/No]
☐ Regulatory notification required: [Yes/No]
☐ Media attention likely: [Yes/No]

KEY METRICS:
- Systems affected: X
- Estimated recovery time: X hours/days
- Estimated cost: $X

ACTIONS TAKEN:
• [Action 1]
• [Action 2]

DECISIONS NEEDED:
• [Decision point, if any]

NEXT UPDATE: [Time]

Customer/Public Notification

SUBJECT: Security Incident Notice

We detected unauthorized activity in our systems on [date].

WHAT HAPPENED:
[Clear, factual description without technical jargon]

WHAT INFORMATION WAS INVOLVED:
[Specific types of data]

WHAT WE'RE DOING:
• [Action 1]
• [Action 2]
• [Action 3]

WHAT YOU CAN DO:
• [Specific user action 1]
• [Specific user action 2]

FOR MORE INFORMATION:
[Dedicated page/hotline]

We apologize for any concern this may cause and are committed
to protecting your information.

Interview Deep Dive

Q: Walk me through how you’d handle a ransomware incident.

A: I follow the PICERL framework:

Identification (0-15 min):

Containment (15 min - 4 hours):

Eradication (4-24 hours):

Recovery:

Key decisions:

A: Timeline-based investigation:

1. Immediate Actions:

2. Determine Scope:

3. Investigate User’s Activity (3-day window):

Check for:
- Login anomalies (unusual locations, times)
- Email forwarding rules created
- OAuth apps authorized
- Files accessed/downloaded
- Emails sent (BEC pivot)
- Password changes on other sites (if password reuse)

4. Check for Lateral Movement:

5. Remediation:

Q: How do you prioritize incidents when multiple are happening simultaneously?

A: Prioritization framework:

| Factor | Weight | Considerations |
|---|---|---|
| Data at risk | High | PII, credentials, IP |
| Active threat | High | Ongoing vs past |
| Blast radius | Medium | How many systems/users |
| Business impact | Medium | Revenue, operations |
| Regulatory | Medium | Notification deadlines |

Example Scenario:

Incident A: Phishing - 50 users received, 3 clicked, no cred submission
Incident B: Ransomware on 2 systems in finance
Incident C: Compromised service account used for lateral movement

Priority: C > B > A

C is highest because:
- Active threat
- Service account = broad access
- Lateral movement = expanding blast radius

B is second because:
- Ransomware is destructive
- Finance = sensitive data
- Could spread

A is lowest because:
- No confirmed compromise
- Can contain quickly
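
One way to make this ranking repeatable is a weighted score. The sketch below assumes Python 3; mapping High to 3 and Medium to 2, the 0-2 rating scale, and the individual ratings given to incidents A, B, and C are all illustrative assumptions rather than values from the scenario. Under those assumptions it reproduces the C > B > A ordering.

```python
from dataclasses import dataclass

# Assumed numeric weights: High = 3, Medium = 2.
WEIGHTS = {
    "data_at_risk": 3,
    "active_threat": 3,
    "blast_radius": 2,
    "business_impact": 2,
    "regulatory": 2,
}


@dataclass
class Incident:
    name: str
    ratings: dict  # factor -> 0 (none), 1 (some), 2 (severe); analyst judgement

    def score(self) -> int:
        """Sum of weight x rating over all factors; unrated factors count as 0."""
        return sum(WEIGHTS[factor] * self.ratings.get(factor, 0) for factor in WEIGHTS)


incidents = [
    Incident("A: phishing, no credentials submitted", {"blast_radius": 1}),
    Incident("B: ransomware on 2 finance systems", {
        "data_at_risk": 1, "active_threat": 1, "blast_radius": 1,
        "business_impact": 2, "regulatory": 1,
    }),
    Incident("C: service account used for lateral movement", {
        "data_at_risk": 2, "active_threat": 2, "blast_radius": 2,
        "business_impact": 1,
    }),
]

ranked = sorted(incidents, key=Incident.score, reverse=True)
for incident in ranked:
    print(incident.score(), incident.name)
```

A score like this supports the analyst's judgement rather than replacing it; the ratings themselves still come from triage.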

Hands-on Lab Scenarios

Lab 1: Ransomware Response

Scenario: User reports all files have .encrypted extension and a ransom note on desktop.

Exercise:

  1. Document initial indicators
  2. Write containment plan (what specific actions?)
  3. List evidence to collect
  4. Create communication for management
  5. Develop recovery plan

Lab 2: Account Compromise

Scenario: Impossible travel alert - user logged in from New York at 9:00 AM and London at 9:15 AM.

Exercise:

  1. What questions do you need answered?
  2. What actions do you take immediately?
  3. What persistence mechanisms do you check?
  4. How do you verify the account is clean?
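
The logic behind an impossible travel alert can be sketched as a speed check between two logins. The example below assumes Python 3, approximate city-centre coordinates for New York and London, both timestamps expressed in the same timezone, and a hypothetical threshold of 1000 km/h (roughly airliner speed); real detections also have to account for VPNs and imprecise IP geolocation.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 1000.0  # assumed threshold


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(login_a, login_b, max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """Each login is (timestamp, latitude, longitude)."""
    (time_a, lat_a, lon_a), (time_b, lat_b, lon_b) = login_a, login_b
    distance = haversine_km(lat_a, lon_a, lat_b, lon_b)
    hours = abs((time_b - time_a).total_seconds()) / 3600
    if hours == 0:
        return distance > 0
    return distance / hours > max_kmh


new_york = (datetime(2024, 1, 1, 9, 0), 40.7128, -74.0060)
london = (datetime(2024, 1, 1, 9, 15), 51.5074, -0.1278)
print(is_impossible_travel(new_york, london))
```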

Lab 3: Data Breach Notification

Scenario: Database containing 50,000 customer records (name, email, phone, partial SSN) was exfiltrated.

Exercise:

  1. Determine notification requirements
  2. Draft customer notification email
  3. Create executive summary
  4. List remediation actions

IR Metrics

| Metric | Target | Formula |
|---|---|---|
| MTTD | <24 hours | Detection time - Compromise time |
| MTTC | <4 hours | Containment time - Detection time |
| MTTR | <24 hours | Resolution time - Detection time |
| False Positive Rate | <10% | FP incidents / Total incidents |
| Recurrence Rate | <5% | Repeat incidents / Total incidents |
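
The time-based formulas are averages over closed incidents. A minimal sketch, assuming Python 3 and that each incident record carries four timestamps; the field names and the two sample incidents are invented for illustration.

```python
from datetime import datetime
from statistics import mean


def mean_hours(incidents: list, start_field: str, end_field: str) -> float:
    """Average elapsed hours between two timestamp fields across incidents."""
    return mean(
        (incident[end_field] - incident[start_field]).total_seconds() / 3600
        for incident in incidents
    )


# Invented sample data.
closed_incidents = [
    {
        "compromised": datetime(2024, 1, 1, 0, 0),
        "detected": datetime(2024, 1, 1, 10, 0),
        "contained": datetime(2024, 1, 1, 12, 0),
        "resolved": datetime(2024, 1, 2, 6, 0),
    },
    {
        "compromised": datetime(2024, 2, 1, 0, 0),
        "detected": datetime(2024, 2, 1, 20, 0),
        "contained": datetime(2024, 2, 1, 23, 0),
        "resolved": datetime(2024, 2, 2, 12, 0),
    },
]

mttd = mean_hours(closed_incidents, "compromised", "detected")
mttc = mean_hours(closed_incidents, "detected", "contained")
mttr = mean_hours(closed_incidents, "detected", "resolved")
print(f"MTTD={mttd}h MTTC={mttc}h MTTR={mttr}h")
```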

Tools Reference

| Phase | Tool | Purpose |
|---|---|---|
| Detection | SIEM, EDR | Alert generation |
| Analysis | Volatility | Memory forensics |
| Analysis | Autoruns | Persistence review |
| Analysis | Wireshark | Network analysis |
| Containment | Firewall | Block IOCs |
| Eradication | AV/EDR | Malware removal |
| Documentation | TheHive | Case management |

Key Takeaways

  1. Preparation beats improvisation - Have playbooks ready before incidents
  2. Preserve evidence first - Don’t destroy forensic data in the rush to contain
  3. Communicate early and often - Stakeholders need updates
  4. Document everything - You’ll need it for the post-incident review and any legal follow-up
  5. Don’t skip lessons learned - Every incident is a learning opportunity
  6. Practice regularly - Run tabletop exercises quarterly
