Building an incident response playbook for your organisation

When a cybersecurity incident strikes - ransomware encrypting your file server, a data breach exposing customer records, or a critical system failure at 2 a.m. - the difference between a contained event and a full-blown crisis comes down to preparation.

An incident response (IR) playbook is a documented set of procedures that tells your team exactly what to do, who is responsible, and how to communicate during an incident. Without one, response efforts are improvised, slow, and prone to mistakes that make things worse.

This guide walks you through building a practical IR playbook suited to a South African business environment.

Why you need a playbook

Incidents are stressful. People under stress make poor decisions unless they have a framework to follow. A playbook provides that framework.

Beyond the immediate operational benefits, a documented IR process is increasingly expected by auditors, regulators, insurers, and clients. If your organisation is subject to POPIA, you are legally obliged to notify the Information Regulator and affected data subjects of a data breach as soon as reasonably possible after discovering it. A playbook helps ensure you can meet that obligation.

The six phases of incident response

A widely used model, drawing on the NIST and SANS incident-handling guidance, breaks incident response into six phases. Your playbook should address each one.

Phase 1: Preparation

Preparation is everything you do before an incident occurs. It is the largest section of your playbook and the one that determines whether the other five phases work.

Key elements:

  • Roles and responsibilities. Define an incident response team (IRT) with clear roles: incident commander, technical lead, communications lead, legal/compliance liaison, and executive sponsor. Name specific individuals and their alternates.
  • Contact lists. Maintain an up-to-date list of IRT members, key vendors, your cybersecurity and security operations provider, legal counsel, insurance broker, and regulatory contacts. Store this somewhere accessible even if your primary systems are offline - a printed copy and a secure cloud document.
  • Tools and access. Ensure the IRT has the tools and elevated access they need during an incident: forensic software, isolated investigation environments, admin credentials stored in a break-glass procedure, and communication channels that do not depend on your primary infrastructure (e.g. a dedicated WhatsApp or Signal group).
  • Training and drills. A playbook that no one has practised is a playbook that will not work under pressure. Conduct tabletop exercises at least twice a year, simulating realistic scenarios.
  • Asset inventory. You cannot protect what you do not know about. Maintain a current inventory of hardware, software, cloud services, data repositories, and network architecture diagrams.
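
The roles-and-alternates requirement above lends itself to an automated check. The following is a minimal sketch, assuming the roster is kept as structured data; the `RoleAssignment` type and `find_roster_gaps` function are hypothetical names introduced here for illustration, and the role list is copied from the bullet above.

```python
from dataclasses import dataclass

# Roles taken from the playbook text; adjust to your own team structure.
REQUIRED_ROLES = [
    "incident commander",
    "technical lead",
    "communications lead",
    "legal/compliance liaison",
    "executive sponsor",
]


@dataclass(frozen=True)
class RoleAssignment:
    role: str
    primary: str
    alternate: str


def find_roster_gaps(roster: list[RoleAssignment]) -> list[str]:
    """Return a list of human-readable problems found in the roster."""
    problems = []
    by_role = {entry.role: entry for entry in roster}
    for role in REQUIRED_ROLES:
        entry = by_role.get(role)
        if entry is None:
            problems.append(f"{role}: no one assigned")
        elif not entry.primary.strip() or not entry.alternate.strip():
            problems.append(f"{role}: primary or alternate is blank")
        elif entry.primary == entry.alternate:
            problems.append(f"{role}: alternate must be a different person")
    return problems
```

Running a check like this as part of the quarterly contact-list review is one way to catch roles left vacant when staff change.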

Phase 2: Detection and analysis

This phase covers how you identify that an incident is occurring and assess its scope and severity.

Detection sources:

  • Security information and event management (SIEM) alerts
  • Endpoint detection and response (EDR) notifications
  • User reports (phishing emails, unusual behaviour, locked accounts)
  • Automated monitoring from your managed IT provider
  • External notifications (a customer, vendor, or law enforcement agency informs you)

Triage process:

When a potential incident is detected, the on-call responder should:

  1. Confirm whether the event is a genuine incident or a false positive
  2. Classify the severity using a predefined scale (see below)
  3. Notify the incident commander
  4. Begin logging all actions in the incident log
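
Step 4 is easier to follow under pressure if the log format is decided in advance. The sketch below shows one possible shape for an append-only incident log with UTC timestamps; the class names and fields are assumptions made for illustration, not a prescribed format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    actor: str
    action: str


@dataclass
class IncidentLog:
    incident_id: str
    entries: list[LogEntry] = field(default_factory=list)

    def record(self, actor: str, action: str) -> LogEntry:
        """Append a UTC-timestamped entry to the end of the log."""
        entry = LogEntry(datetime.now(timezone.utc), actor, action)
        self.entries.append(entry)
        return entry

    def as_text(self) -> str:
        """Render the log as plain text for the post-incident review."""
        return "\n".join(
            f"{e.timestamp.isoformat(timespec='seconds')} [{e.actor}] {e.action}"
            for e in self.entries
        )
```

Recording timestamps in UTC avoids ambiguity when external responders in other time zones later reconstruct the timeline.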

Severity classification:

Level | Description | Example | Response
Critical | Business-wide impact, data breach, or ongoing attack | Ransomware spreading across network | Full IRT activation, executive notification
High | Significant impact to one department or system | Email server compromised | IRT activation, department heads notified
Medium | Limited impact, contained to a single system or user | Single workstation infected with malware | Technical lead and relevant analyst
Low | Minor event, no business impact | Blocked phishing attempt | Log and monitor
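
The severity scale above can also be expressed as a small lookup so that the on-call responder gets a consistent answer. This is a sketch only: the response text is copied from the scale, while the triage questions and the decision rules in `classify` are hypothetical and should be replaced with your own criteria.

```python
from enum import Enum


class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Response actions copied from the severity scale in this guide.
RESPONSE = {
    Severity.CRITICAL: "Full IRT activation, executive notification",
    Severity.HIGH: "IRT activation, department heads notified",
    Severity.MEDIUM: "Technical lead and relevant analyst",
    Severity.LOW: "Log and monitor",
}


def classify(business_wide: bool, data_breach: bool, ongoing_attack: bool,
             department_impact: bool, systems_affected: int) -> Severity:
    """Map triage answers to a severity level (hypothetical decision rules)."""
    if business_wide or data_breach or ongoing_attack:
        return Severity.CRITICAL
    if department_impact:
        return Severity.HIGH
    if systems_affected >= 1:
        return Severity.MEDIUM
    return Severity.LOW
```

For example, `RESPONSE[classify(False, False, False, True, 1)]` looks up the response for an incident affecting one department.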

Phase 3: Containment

The goal of containment is to stop the incident from spreading while preserving evidence for investigation.

Short-term containment (immediate):

  • Isolate affected systems from the network (disconnect cables, disable Wi-Fi, block at the firewall)
  • Disable compromised user accounts
  • Block malicious IP addresses or domains
  • Redirect traffic if needed
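
Blocking addresses in a hurry carries a risk of blocking your own hosts. As a hedged illustration, the sketch below uses Python's standard `ipaddress` module to split a list of candidate addresses into those that look safe to block and those that need a human decision. The function name and the review rules are assumptions; pushing the resulting list to a firewall is product-specific and deliberately left out.

```python
import ipaddress


def build_blocklist(candidates: list[str]) -> tuple[list[str], list[str]]:
    """Split indicator IPs into (safe_to_block, needs_review)."""
    to_block, needs_review = [], []
    for raw in candidates:
        text = raw.strip()
        try:
            address = ipaddress.ip_address(text)
        except ValueError:
            needs_review.append(text)  # not a valid IP address
            continue
        # Private, loopback, and link-local addresses are likely your own hosts,
        # so blocking them at the perimeter needs a human decision.
        if address.is_private or address.is_loopback or address.is_link_local:
            needs_review.append(text)
        elif str(address) not in to_block:
            to_block.append(str(address))  # de-duplicate, keep first-seen order
    return to_block, needs_review
```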

Long-term containment (stabilisation):

  • Apply temporary fixes that allow business operations to continue while the investigation proceeds
  • Build clean systems from known-good backups if necessary
  • Implement additional monitoring on potentially affected systems

Critical rule: Do not wipe or rebuild systems before forensic evidence has been captured. If you destroy evidence, you may never understand how the breach occurred or whether it has been fully resolved.
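
One common way to support this rule is to record a cryptographic hash of every evidence file at the moment it is captured, so that later copies can be shown to be unaltered. The sketch below assumes evidence files have already been collected into a single directory; `build_manifest` is a hypothetical helper name, and this is not a substitute for a proper forensic tool.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to evidence_dir) to its SHA-256 digest."""
    return {
        str(p.relative_to(evidence_dir)): sha256_of_file(p)
        for p in sorted(evidence_dir.rglob("*"))
        if p.is_file()
    }
```

Store the manifest separately from the evidence itself, and note in the incident log who generated it and when.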

Phase 4: Eradication

Once the incident is contained, remove the root cause:

  • Delete malware and malicious files
  • Close the vulnerability that was exploited (patch, reconfigure, remove)
  • Reset compromised credentials - all of them, not just the ones you are certain about
  • Verify that no backdoors or persistence mechanisms remain

This phase often requires specialist skills. If your team does not have deep forensic and threat-hunting capability, engage your cybersecurity provider or a dedicated incident response firm.

Phase 5: Recovery

Recovery is the process of bringing affected systems back into production safely.

Steps:

  1. Restore systems from clean backups or rebuild from known-good images
  2. Verify that restored systems are free of compromise before reconnecting them to the network
  3. Monitor restored systems intensively for signs of reinfection
  4. Gradually return to normal operations, restoring services in priority order
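
Restoring in priority order usually also means respecting dependencies: a high-priority application cannot come back before the services it relies on. The following sketch shows one way to compute such an order with the standard `graphlib` module. The service names, priority numbers, and the `restore_order` function are hypothetical, and the sketch assumes every dependency also appears in the priority mapping.

```python
import heapq
from graphlib import TopologicalSorter


def restore_order(priority: dict[str, int],
                  depends_on: dict[str, set[str]]) -> list[str]:
    """Order services so dependencies come first, then by priority (1 = first)."""
    sorter = TopologicalSorter(
        {name: depends_on.get(name, set()) for name in priority}
    )
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    ready: list[tuple[int, str]] = []
    order: list[str] = []
    while sorter.is_active():
        # Collect every service whose dependencies are already restored.
        for name in sorter.get_ready():
            heapq.heappush(ready, (priority[name], name))
        # Restore the most important of the services that are ready.
        _, name = heapq.heappop(ready)
        order.append(name)
        sorter.done(name)
    return order
```

With a directory service at priority 1 and everything else depending on it, the function places the directory service first regardless of how urgent the dependent systems are.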

Your business continuity and disaster recovery plan defines the technical recovery procedures. The IR playbook should reference these procedures and define who authorises the return to production.

Recovery timeline:

Set expectations with leadership. A minor incident may be resolved in hours. A significant ransomware event can take weeks to fully recover from, even with good backups.

Phase 6: Lessons learned

This is the most frequently skipped phase - and one of the most valuable.

Within two weeks of the incident, convene a post-incident review with the full IRT and relevant stakeholders. Cover:

  • Timeline. What happened, in what order, and when was each phase initiated?
  • What worked. Which parts of the playbook performed as expected?
  • What did not. Where were there delays, confusion, or gaps?
  • Root cause. What was the underlying vulnerability or failure that allowed the incident?
  • Recommendations. What specific changes to processes, technology, or training will prevent a recurrence?
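
The timeline item is more useful when it produces numbers you can compare between incidents, such as how long it took to move from detection to containment. A minimal sketch, assuming you have recorded a timestamp for each milestone; the milestone names used in the example are illustrative only.

```python
from datetime import datetime, timedelta


def phase_durations(timeline: dict[str, datetime]) -> dict[str, timedelta]:
    """Compute the elapsed time between consecutive milestones."""
    ordered = sorted(timeline.items(), key=lambda item: item[1])
    return {
        f"{earlier} -> {later}": later_time - earlier_time
        for (earlier, earlier_time), (later, later_time)
        in zip(ordered, ordered[1:])
    }


# Example with made-up timestamps.
example = phase_durations({
    "detected": datetime(2024, 3, 1, 2, 0),
    "contained": datetime(2024, 3, 1, 5, 30),
    "recovered": datetime(2024, 3, 3, 2, 0),
})
```

Tracking these durations across incidents and drills gives the review a concrete measure of whether the playbook is improving.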

Document the findings and update the playbook accordingly. Each incident should make your response capability stronger.

Communication templates

Your playbook should include pre-drafted communication templates for:

  • Internal notification to staff: “We are currently investigating a security event affecting [system/service]. Our IT team is working to resolve the issue. Please [specific instructions - e.g. do not click links, do not access specific systems]. We will provide updates every [timeframe].”
  • Executive briefing: A concise summary covering what happened, current status, business impact, estimated resolution time, and next steps.
  • Customer notification (if required): Factual, empathetic language explaining what occurred, what data may be affected, what you are doing about it, and what the customer should do. In South Africa, POPIA requires notification to the Information Regulator and affected data subjects in the event of a personal data breach.
  • Regulator notification: A formal communication to the Information Regulator that meets the requirements of Section 22 of POPIA.
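
Templates with bracketed placeholders can be stored in a form that is filled in programmatically, which reduces the chance of sending a message with a blank left in it. The sketch below uses Python's standard `string.Template` with the staff notification wording from this guide; the placeholder names and the `render_staff_notice` function are assumptions made for illustration.

```python
from string import Template

# Wording taken from the internal staff notification template in this guide.
STAFF_NOTICE = Template(
    "We are currently investigating a security event affecting $system. "
    "Our IT team is working to resolve the issue. Please $instructions. "
    "We will provide updates every $update_interval."
)


def render_staff_notice(system: str, instructions: str,
                        update_interval: str) -> str:
    """Fill every placeholder in the staff notice."""
    return STAFF_NOTICE.substitute(
        system=system,
        instructions=instructions,
        update_interval=update_interval,
    )
```

Because the function takes each field as a required argument, a notice cannot be rendered with a placeholder left unfilled.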

Pre-drafting these templates saves critical hours during an actual incident and ensures the tone and content are appropriate.

Escalation paths

Define clear escalation criteria:

  • Technical escalation: When the on-call responder needs additional technical expertise (e.g. escalating from helpdesk to security analyst to external forensic specialist)
  • Management escalation: When business impact exceeds a defined threshold or when decisions about business continuity, public communication, or legal response are required
  • External escalation: When to engage law enforcement (SAPS Cybercrime Unit), the Information Regulator, your cyber insurance provider, or external legal counsel

Include decision trees or flowcharts that make escalation decisions obvious, even under pressure.
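
A decision tree of this kind can be drafted as a handful of explicit rules before it is drawn as a flowchart. The sketch below is one possible encoding, not a recommended policy: the `IncidentFacts` fields, the rand threshold, and the wording of each escalation path are all hypothetical and need to be agreed with your own management and legal advisers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentFacts:
    needs_specialist_skills: bool
    business_impact_rand: int
    personal_data_exposed: bool
    suspected_crime: bool


# Hypothetical threshold; set this with your executive sponsor.
MANAGEMENT_IMPACT_THRESHOLD_RAND = 100_000


def escalations(facts: IncidentFacts) -> list[str]:
    """Return the escalation paths triggered by the known facts."""
    paths = []
    if facts.needs_specialist_skills:
        paths.append("technical: engage security analyst or external forensic specialist")
    if facts.business_impact_rand >= MANAGEMENT_IMPACT_THRESHOLD_RAND:
        paths.append("management: brief executive sponsor")
    if facts.personal_data_exposed:
        paths.append("external: notify Information Regulator and legal counsel")
    if facts.suspected_crime:
        paths.append("external: report to law enforcement")
    return paths
```

Writing the rules down this way forces each threshold to be stated explicitly, which is where escalation decisions most often stall during a real incident.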

Keeping the playbook alive

A playbook is not a document you write once and file away. Schedule these maintenance activities:

  • Quarterly: Review and update contact lists, verify tool access, confirm that backup and recovery procedures are current
  • Twice a year: Conduct a tabletop exercise using a scenario relevant to your current threat landscape
  • Annually: Perform a full review of the playbook, incorporating lessons from real incidents, industry developments, and changes to your environment
  • After every incident: Update the playbook based on findings from the lessons-learned review

Getting started

If you do not have an IR playbook today, do not try to build the perfect document on day one. Start with these steps:

  1. Define your IRT and their roles
  2. Create a contact list and store it somewhere accessible offline
  3. Draft a basic triage and severity classification process
  4. Write containment steps for your two most likely scenarios (ransomware and email compromise are good starting points)
  5. Schedule your first tabletop exercise within 60 days

From there, iterate. Each drill and each real incident will reveal gaps that you can fill.

Ready to build or strengthen your incident response capability? Contact our team to discuss how we can help you prepare, test, and refine a playbook tailored to your business.
