Incident Response Planning: A Must-Have for Every Israeli Startup

The worst time to design your incident response process is at 2am when an alert fires and nobody knows who owns the call.

Why startup IR plans fail

A lot of startups have something that looks like an incident response plan: a Google Doc titled “Security Incidents,” a Slack channel #sec-incidents, and a vague understanding that you should “escalate to leadership” for serious things. That’s not a plan. That’s a prayer.

When a real incident hits, the absence of a plan shows up as:

20 minutes of Slack messages trying to figure out who is responsible
Someone remediating before you’ve finished scoping, destroying forensic evidence in the process
Customers notified inconsistently because no one agreed on the communication template
A Privacy Protection Authority (PPA) notification missed because no one tracked the 72-hour clock

A functional IR plan has defined ownership, pre-built playbooks for your specific threat model, integrated tooling that reduces manual work during the incident, and a post-incident review process that actually changes how you operate.

The short version

What you need	Why it matters
Severity definitions	Stops arguments about urgency while the attacker is still in your systems
Source-specific playbooks	Your team follows steps instead of improvising
Pre-staged investigation queries	Detection and scoping happen in minutes, not hours
Evidence preservation procedure	Keeps forensic artifacts intact for legal and PPA purposes
Post-incident review process	Turns incidents into system improvements

The incident response framework

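Use NIST SP 800-61 revision 2 as your baseline. It defines four phases and groups containment, eradication, and recovery into one; splitting that phase out gives six: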

Phase	What happens
Preparation	Build the plan, tools, and playbooks before an incident occurs
Detection & Analysis	Identify the incident and scope its full impact
Containment	Isolate affected systems without destroying forensic evidence
Eradication	Remove the attacker’s access, persistence mechanisms, and artifacts
Recovery	Restore systems and verify clean state before returning to production
Post-Incident Activity	Document the timeline, identify gaps, assign corrective actions

Most startups fail at Preparation and Detection. The first time they think seriously about IR is during an active incident, which is the worst possible moment.

Phase 1: Preparation

Define severity levels

Write these down and get leadership to agree on them before any incident. The purpose is to remove judgment calls during high-stress moments.

Severity	Definition	Target response time	Who responds
P1 — Critical	Active breach confirmed, ransomware active, data exfiltration in progress, production down	< 15 minutes	On-call engineer, CTO, legal
P2 — High	Credential compromise suspected, privilege escalation detected, uncontained threat	< 1 hour	On-call engineer, security lead
P3 — Medium	Policy violation, anomalous access pattern, single failed attack with no impact	< 4 hours	On-call engineer
P4 — Low	Informational alert, no evidence of actual impact	Next business day	Assigned analyst
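
One way to keep these definitions out of debate during an incident is to encode them next to your alerting code. The sketch below is illustrative only: the values are copied from the table above, while the `Severity` class and `response_deadline` helper are hypothetical names, not part of any specific tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Severity:
    name: str
    response_target: Optional[timedelta]  # None means "next business day"
    responders: tuple


# Values copied from the severity table; the structure itself is an assumption.
SEVERITIES = {
    "P1": Severity("Critical", timedelta(minutes=15), ("on-call engineer", "CTO", "legal")),
    "P2": Severity("High", timedelta(hours=1), ("on-call engineer", "security lead")),
    "P3": Severity("Medium", timedelta(hours=4), ("on-call engineer",)),
    "P4": Severity("Low", None, ("assigned analyst",)),
}


def response_deadline(level: str, alert_time: datetime) -> Optional[datetime]:
    """Return the latest acceptable first-response time, or None for P4."""
    target = SEVERITIES[level].response_target
    return None if target is None else alert_time + target
```

A paging integration could call `response_deadline("P1", alert_time)` to decide when to escalate if nobody has acknowledged the alert.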

Build your contact list — offline

Document this somewhere accessible without internet. Slack, Notion, and your ticketing system may be unavailable or compromised during a serious incident.

Internal: On-call rotation, CTO, VP Engineering, legal counsel, CFO (for insurance claims), PR/comms contact
External: Cloud provider premium support, cyber insurance carrier incident hotline (different number than claims), forensics retainer if you have one
Regulatory: CERT-IL (1-800-611-911), PPA breach notification portal URL
Customer-facing: B2B customers have contractual breach notification requirements — know your point of contact at each account

Pre-stage your investigation tooling

During an incident, you want to be running queries, not installing software or searching for passwords. Pre-create:

A Slack channel template that auto-populates with the incident log template when triggered
Read-only access credentials for responders who don’t normally have production access
Saved SIEM queries for the five most common investigation scenarios (see Phase 2 below)
An immutable evidence storage location in a separate, isolated account — write-once, cannot be deleted or overwritten

The immutable evidence store configuration pattern:

Evidence bucket:
  → Versioning: enabled
  → Object Lock: COMPLIANCE mode, 730-day minimum retention
  → Access: write-only for incident responders, read for legal/forensics only
  → Hosted in a dedicated security account, separate from production
  → No delete permissions granted to any human principal
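
The pattern above can be expressed as data and checked automatically, so drift in the evidence store is caught before an incident rather than during one. This is a minimal sketch that assumes an S3-style object store; the dictionary shape mirrors the S3 Object Lock configuration format, and the function names are hypothetical. It does not call any cloud API, and it does not cover the access and account-separation rules.

```python
def evidence_bucket_settings(retention_days: int = 730) -> dict:
    """Build the versioning and retention settings described above."""
    return {
        "versioning": {"Status": "Enabled"},
        "object_lock": {
            "ObjectLockEnabled": "Enabled",
            "Rule": {
                "DefaultRetention": {"Mode": "COMPLIANCE", "Days": retention_days}
            },
        },
    }


def check_evidence_bucket(settings: dict, min_days: int = 730) -> list:
    """Return a list of problems; an empty list means the settings match."""
    problems = []
    if settings.get("versioning", {}).get("Status") != "Enabled":
        problems.append("versioning is not enabled")
    retention = (
        settings.get("object_lock", {}).get("Rule", {}).get("DefaultRetention", {})
    )
    if retention.get("Mode") != "COMPLIANCE":
        problems.append("object lock is not in COMPLIANCE mode")
    if retention.get("Days", 0) < min_days:
        problems.append(f"retention is below {min_days} days")
    return problems
```

Running `check_evidence_bucket` on a schedule against the settings your provider reports is one way to confirm the store still matches the pattern.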

Phase 2: Detection and Analysis

Detection is where your SIEM pays for itself. For each incident type you need pre-built alert rules that fire into your alerting pipeline, and pre-built investigation queries that tell you the scope.

The five incident types you’re most likely to face

Based on what we see across Israeli startup environments:

Account takeover — compromised identity provider or cloud console credentials
Privilege escalation — IAM role abuse, policy manipulation, assume-role chains
Data exfiltration — bulk object storage reads, unusual API query volumes
Insider threat or departing employee — access abuse in the off-boarding window
Cloud misconfiguration — public storage bucket, overly permissive firewall rule

Pre-built investigation queries (pseudo-code)

Save these as named queries in your SIEM. When an alert fires, the investigator runs the relevant query immediately rather than writing it from scratch under pressure.

Account takeover — impossible travel detection:

QUERY authentication_events
WHERE event_type = "login_success"
  AND timestamp >= NOW - 30 minutes
GROUP BY user_id
HAVING count_distinct(country) > 1
RETURN user_id, list(source_ip), list(country), list(timestamp)
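
If your SIEM cannot express this grouping directly, the same logic is easy to run over exported events. The sketch below assumes each event is a dictionary with `user_id`, `event_type`, `country`, and `timestamp` keys, matching the field names in the pseudo-query; your log schema will differ.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def impossible_travel(events, now, window=timedelta(minutes=30)):
    """Return {user_id: sorted countries} for users with successful logins
    from more than one country inside the window ending at `now`."""
    countries = defaultdict(set)
    for event in events:
        if event["event_type"] != "login_success":
            continue
        if now - window <= event["timestamp"] <= now:
            countries[event["user_id"]].add(event["country"])
    return {user: sorted(seen) for user, seen in countries.items() if len(seen) > 1}
```

Country-level grouping is coarse, so expect false positives from VPNs and mobile carriers; treat a hit as a prompt to investigate, not as proof of compromise.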

Privilege escalation — high-risk permission changes:

QUERY cloud_audit_events
WHERE event_name IN (
    "AttachRolePolicy", "CreateRole",
    "PutUserPolicy", "UpdateAssumeRolePolicy",
    "CreatePolicyVersion", "SetDefaultPolicyVersion"
  )   -- PassRole is omitted: on AWS it is a permission check, not a logged API event
  AND actor_type != "root"
  AND timestamp >= NOW - 24 hours
RETURN timestamp, actor_arn, event_name, source_ip, target_resource
ORDER BY timestamp DESC

Data exfiltration — bulk object access:

QUERY storage_access_events
WHERE operation = "GET_OBJECT"
  AND response_code = 200
  AND timestamp >= NOW - 1 hour
GROUP BY requester_identity
HAVING sum(bytes_sent) > 500_000_000   -- tune threshold per environment
RETURN requester_identity, count(requests), sum(bytes_sent), min(timestamp), max(timestamp)
ORDER BY sum(bytes_sent) DESC
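
The aggregation in that query can be prototyped offline to tune the threshold before you commit it to an alert rule. This sketch assumes events are dictionaries using the same field names as the pseudo-query; `bulk_readers` is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bulk_readers(events, now, threshold_bytes=500_000_000, window=timedelta(hours=1)):
    """Return (identity, request_count, total_bytes) rows for identities whose
    successful reads in the window exceed the threshold, largest first."""
    totals = defaultdict(lambda: {"requests": 0, "bytes_sent": 0})
    for event in events:
        if event["operation"] != "GET_OBJECT" or event["response_code"] != 200:
            continue
        if not (now - window <= event["timestamp"] <= now):
            continue
        row = totals[event["requester_identity"]]
        row["requests"] += 1
        row["bytes_sent"] += event["bytes_sent"]
    flagged = [
        (identity, row["requests"], row["bytes_sent"])
        for identity, row in totals.items()
        if row["bytes_sent"] > threshold_bytes
    ]
    return sorted(flagged, key=lambda r: r[2], reverse=True)
```

Replaying a normal week of storage logs through this function shows which service accounts legitimately exceed the default threshold and need their own limit.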

Departing employee access after off-boarding date:

QUERY authentication_events AS e
JOIN terminated_employees AS t ON e.user_id = t.user_id
WHERE e.timestamp > t.termination_date
  AND e.event_type = "login_success"
RETURN e.timestamp, e.user_id, e.source_ip, e.event_type
ORDER BY e.timestamp DESC
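
When HR data lives outside the SIEM, the join is often simpler to do in a script. The following sketch assumes a hypothetical `termination_dates` mapping of user ID to termination time, exported from your HR system, and authentication events shaped like the pseudo-query's fields.

```python
from datetime import datetime


def logins_after_termination(auth_events, termination_dates):
    """Return successful logins that occurred after the user's termination
    date, newest first."""
    hits = [
        event
        for event in auth_events
        if event["event_type"] == "login_success"
        and event["user_id"] in termination_dates
        and event["timestamp"] > termination_dates[event["user_id"]]
    ]
    return sorted(hits, key=lambda event: event["timestamp"], reverse=True)
```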

Scoping the incident

When a detection fires, answer five questions (who, what, when, where, and how) before taking any containment action. Premature containment, such as disabling a user account without checking for active sessions or lateral movement, can tip off an attacker, cause production impact, and destroy forensic evidence.

Question	Where to look
Who — which identity?	Identity provider logs, cloud IAM audit trail
What — what actions?	Cloud API audit events, application access logs
When — start time, still ongoing?	Earliest event timestamp, most recent event
Where — which resources?	Storage bucket names, compute instance IDs, database identifiers
How — what access vector?	Source IP, user agent, console vs. API, MFA state

Document your answers before moving to containment. The scoping document becomes the foundation of your PPA notification and post-incident report.
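
A small structure can enforce the "answer before you contain" rule in tooling rather than relying on discipline. This is a sketch with hypothetical names (`ScopingDocument`, `ready_for_containment`); the five fields correspond to the questions in the table above.

```python
from dataclasses import dataclass, fields


@dataclass
class ScopingDocument:
    incident_id: str
    who: str = ""    # which identity
    what: str = ""   # what actions
    when: str = ""   # start time, still ongoing?
    where: str = ""  # which resources
    how: str = ""    # access vector

    def unanswered(self) -> list:
        """Names of scoping questions that are still blank."""
        return [
            f.name
            for f in fields(self)
            if f.name != "incident_id" and not getattr(self, f.name).strip()
        ]

    def ready_for_containment(self) -> bool:
        """True only when every scoping question has an answer."""
        return not self.unanswered()
```

An incident bot could refuse to post the containment checklist until `ready_for_containment()` returns true, with an explicit override for P1 cases where waiting is the greater risk.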

Phase 3: Containment

Containment isolates the incident without destroying forensic evidence. The sequence is always: preserve first, then contain.

Preserve evidence before any remediation

Before disabling any account, terminating any instance, or modifying any access policy:

1. Export audit logs covering the incident window → write to immutable evidence store
   - Cloud API audit trail
   - Identity provider authentication logs
   - Storage access logs for affected buckets

2. For affected compute instances:
   - Take disk snapshots (label with incident ID and timestamp)
   - Capture instance metadata (running processes, network connections, scheduled tasks)
   - Do NOT terminate the instance yet

3. Record exact timestamps of:
   - When the alert fired
   - When the investigation started
   - When evidence was preserved
   - When containment began
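
The four timestamps in step 3 double as a guard on ordering: containment should never be recorded before evidence preservation. The sketch below is one hypothetical way to enforce that; `IncidentClock` and the milestone names are illustrative, not taken from any particular incident tool.

```python
from datetime import datetime, timezone

# Order matters: each milestone requires all earlier ones to be recorded.
MILESTONES = (
    "alert_fired",
    "investigation_started",
    "evidence_preserved",
    "containment_began",
)


class IncidentClock:
    def __init__(self, incident_id: str):
        self.incident_id = incident_id
        self.timestamps = {}

    def mark(self, milestone: str, at: datetime = None) -> datetime:
        """Record a milestone, refusing to skip any earlier milestone."""
        if milestone not in MILESTONES:
            raise ValueError(f"unknown milestone: {milestone}")
        index = MILESTONES.index(milestone)
        missing = [m for m in MILESTONES[:index] if m not in self.timestamps]
        if missing:
            raise RuntimeError(f"record {missing[0]} before {milestone}")
        self.timestamps[milestone] = at or datetime.now(timezone.utc)
        return self.timestamps[milestone]
```

Timestamps are stored in UTC so they line up with cloud audit logs when the timeline is reconstructed later.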

Contain credential compromise

Disable the affected identity:
  → Set account to disabled/suspended state in your identity provider
  → Do not delete — deletion removes audit history

Revoke active sessions:
  → Call identity provider API to invalidate all active sessions for the user
  → Revoke all API keys and tokens issued to that identity

Apply emergency deny policy:
  → Attach a Deny-All permission policy to the cloud IAM user/role
  → This blocks further API calls even if a token was already issued

Scope the blast radius:
  → Query audit logs for all actions taken by this identity in the past 30 days
  → Identify any roles assumed, users created, or permissions granted
  → Each of these is a potential persistence mechanism requiring separate remediation
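
For the emergency deny step, it helps to have the policy document ready before you need it. The sketch below generates an AWS-style IAM policy in which an explicit deny overrides any allow; it assumes AWS-style JSON policies, and the `Sid` value is an arbitrary label. Attaching the policy is done through your provider's console or API and is not shown here.

```python
import json


def deny_all_policy() -> str:
    """Return an AWS-style IAM policy document that denies every action."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "IncidentEmergencyDenyAll",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

Keeping the generated JSON in the offline runbook means responders can paste it in without writing policy syntax under pressure.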

Contain a compromised compute instance

1. Take disk snapshot → evidence store (before any changes)
2. Capture memory if malware analysis is required
3. If the instance is in an auto-scaling group, detach it from the group
   before isolating it, so the ASG does not terminate and replace it mid-analysis
4. Replace the instance's network security group with an isolation group
   → Isolation group: no inbound, no outbound, no exceptions
5. Do NOT terminate until forensic analysis is complete

Contain an exposed storage bucket

1. Enable access logging if not already active (for subsequent forensics)
2. Block all public access settings immediately
3. Remove any bucket policies granting public or cross-account access
4. Rotate any credentials scoped to this bucket, including the keys used to sign pre-signed URLs (outstanding URLs stop working once their signing credentials are revoked)
5. Query storage access logs to determine what was accessed, by whom, and when
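
Step 3 is easier when you can list the offending statements mechanically. The sketch below scans an AWS-style bucket policy for `Allow` statements with a wildcard principal. It is deliberately simplified: it ignores `Condition` blocks and cross-account principals, so treat its output as a starting list to review, not a complete audit. The function names are hypothetical.

```python
import json


def _is_wildcard(principal) -> bool:
    """True for Principal "*" or {"AWS": "*"} / {"AWS": ["*", ...]}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        values = [aws] if isinstance(aws, str) else aws
        return "*" in values
    return False


def public_statements(policy_json: str) -> list:
    """Return the Allow statements in a bucket policy that apply to anyone."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    return [
        s
        for s in statements
        if s.get("Effect") == "Allow" and _is_wildcard(s.get("Principal"))
    ]
```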

Phases 4 and 5: Eradication and Recovery

Before returning any system to production, verify clean state:

Confirm persistence is removed — check for IAM users created, backdoor functions deployed, scheduled tasks added, or modified startup scripts during the compromise window
Rotate all exposed credentials — API keys, service account credentials, database passwords, JWT signing secrets, and any secrets that may have been readable from compromised systems
Verify log integrity — check your cloud audit trail for events like DeleteTrail, StopLogging, DeleteLogGroup, or equivalent. An attacker who covered their tracks is more dangerous than one who didn’t.
Run standard detection rules for 24 hours against the recovered systems before declaring recovery complete — reinfection in the first 24 hours is common when root cause wasn’t fully identified
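
The log integrity check can be scripted against an audit log export. This sketch uses the three event names mentioned above and assumes each event is a dictionary with `event_name` and `timestamp` keys; pass provider-specific equivalents through the hypothetical `extra_event_names` parameter.

```python
from datetime import datetime

# Event names taken from the text above; extend per cloud provider.
TAMPERING_EVENTS = {"DeleteTrail", "StopLogging", "DeleteLogGroup"}


def log_tampering_events(
    audit_events,
    window_start: datetime,
    window_end: datetime,
    extra_event_names=(),
):
    """Return audit events inside the window that suggest logging was
    disabled or deleted, oldest first."""
    watched = TAMPERING_EVENTS | set(extra_event_names)
    hits = [
        event
        for event in audit_events
        if event["event_name"] in watched
        and window_start <= event["timestamp"] <= window_end
    ]
    return sorted(hits, key=lambda event: event["timestamp"])
```

Any hit means the audit trail for that period may be incomplete, so the scope assessment should say so explicitly.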

Legal notification obligations

For Israeli companies, several clocks start the moment you confirm a breach:

Obligation	Deadline	Who to notify
PPA notification (Amendment 13)	72 hours from discovery	Privacy Protection Authority breach portal
Individual notification (Amendment 13)	30 days	Affected data subjects
INCD notification (National Cybersecurity Law 2026)	12 hours (regulated sectors)	Israel National Cyber Directorate
Customer notification	Per contract — typically 24–72 hours	Designated customer contacts per your MSAs
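
Tracking these deadlines is simple arithmetic, which makes it a good candidate for automation in the incident channel. The sketch below uses the windows from the table above and assumes every clock starts at the discovery time; that starting point and the `notification_deadlines` helper are assumptions, and the exact windows should be confirmed with counsel for your situation.

```python
from datetime import datetime, timedelta, timezone


def notification_deadlines(
    discovered_at: datetime,
    regulated_sector: bool = False,
    customer_contract_hours: int = None,
) -> dict:
    """Return {obligation: deadline}, ordered by the earliest deadline first."""
    deadlines = {
        "PPA notification": discovered_at + timedelta(hours=72),
        "Individual notification": discovered_at + timedelta(days=30),
    }
    if regulated_sector:
        deadlines["INCD notification"] = discovered_at + timedelta(hours=12)
    if customer_contract_hours is not None:
        deadlines["Customer notification"] = discovered_at + timedelta(
            hours=customer_contract_hours
        )
    return dict(sorted(deadlines.items(), key=lambda item: item[1]))
```

Posting the result into the incident log at classification time gives legal and the incident commander the same view of what is due and when.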

Loop in legal counsel at the moment you classify a P1 or P2 incident where personal data may be involved. They need to track the notification clocks and review your scope assessment before you make any statements to customers about what was or wasn’t accessed.

Do not make promises to customers about the scope of a breach until your SIEM investigation is complete and legal has reviewed the output. Premature statements that later prove incorrect create significant liability.

Phase 6: Post-incident review

Within five business days of full recovery, run a post-incident review (PIR). The output is not a blame report — it’s a list of specific system improvements.

Structure the PIR around:

Timeline reconstruction — minute-by-minute from first indicator to full containment, sourced from log data (not from memory)
Detection gap — why did the alert fire when it did? What would have caught it sooner?
Response gaps — where did the team lose time? What wasn’t covered by the playbook?
Control failures — which security control, if in place, would have prevented or limited the incident?
Action items — specific, assigned, time-bounded. Track in your engineering backlog, not a separate document.

The timeline reconstruction should come from your SIEM, not from team recollection. A parameterized query that takes the incident’s known indicators and returns a time-ordered event sequence is the most reliable artifact for the PIR and for PPA submission:

QUERY all_event_sources
WHERE (source_ip = <attacker_ip> OR actor_id = <compromised_user>)
  AND timestamp BETWEEN <incident_start> AND <incident_end>
RETURN timestamp, source, event_type, actor_id, resource_id, result
ORDER BY timestamp ASC

Running that query and exporting the result takes minutes in a well-instrumented environment. Manual reconstruction from memory takes days and introduces errors.
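
If your log sources are not yet unified in one SIEM index, the same timeline can be assembled from per-source exports. This is a minimal sketch assuming a hypothetical `event_sources` mapping of source name to a list of event dictionaries that carry `timestamp`, `source_ip`, and `actor_id` keys.

```python
from datetime import datetime


def build_timeline(
    event_sources: dict,
    start: datetime,
    end: datetime,
    attacker_ip: str = None,
    compromised_user: str = None,
) -> list:
    """Merge events matching either indicator into one time-ordered list,
    tagging each event with the source it came from."""
    timeline = []
    for source, events in event_sources.items():
        for event in events:
            matches = (
                attacker_ip is not None and event.get("source_ip") == attacker_ip
            ) or (
                compromised_user is not None
                and event.get("actor_id") == compromised_user
            )
            if matches and start <= event["timestamp"] <= end:
                timeline.append({**event, "source": source})
    return sorted(timeline, key=lambda event: event["timestamp"])
```

Normalize all timestamps to UTC before merging; mixed time zones are a common cause of misordered timelines.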

Final thought

An IR plan is only useful if it exists before the incident. The playbooks, severity definitions, pre-staged queries, and evidence procedures described here take a few days to build and return years of operational benefit. Every incident run through a documented process improves the next one.

If you want help wiring detection rules into your alerting workflow, or want a second opinion on your IR plan against the NIST 800-61 framework, contact us.