Bulkhead
Description
A bulkhead is a structural barrier that partitions a system into isolated failure domains so that a failure in one compartment cannot exhaust resources or propagate to another. The diagnostic shape: every resource that could be shared across tenants, services, or components is instead partitioned per tenant, per service, or per component, with hard limits enforced. When tenant A consumes 100% of its connection pool, tenant B’s pool is untouched. When service X’s thread pool deadlocks, service Y’s threads are not in the same pool.
The structural lineage is literal: ship bulkheads. A ship’s hull is divided into watertight compartments so that a hull breach floods only the breached compartment and the ship stays afloat. The Titanic’s bulkheads went only partway up the hull, so water cascaded from compartment to compartment over the tops, a teaching example of insufficient bulkheading that is still cited in resilience-engineering literature a century later.
The pattern’s diagnostic question is “what’s the blast radius of any single component’s worst-case failure?” and the answer should be “exactly one compartment, no further.” The cost is duplication of resources (you can’t share a single thread pool across all callers); the benefit is bounded blast radius.
Triggers
User-initiated: User describes noisy-neighbor problems, blast-radius concerns, or wants to isolate a problematic tenant/service. Vocabulary cues: “bulkhead,” “isolation,” “failure domain,” “blast radius,” “tenant isolation,” “noisy neighbor,” “compartmentalize.”
Agent-initiated: Engine notices a multi-tenant or multi-component system with shared resource pools that could be exhausted by a single bad actor. Candidate inference: “this wants bulkheads: what’s the resource being shared, what’s the grain of isolation, and what’s the per-compartment limit?”
Situation-shape signals: Multi-tenant system; multi-dependency system with cascading-failure risk; observed noisy-neighbor pattern; want to bound the worst-case impact of any single component’s failure.
Exclusions
- Single-tenant systems with single dependencies: there is nothing to isolate from anything else.
- Resource cost is dominant: bulkheading multiplies resource requirements (per-tenant pools have a higher steady-state cost than a shared pool); for cost-constrained systems, the duplication may not pay for itself.
- Failure modes are uncorrelated with the bulkhead boundary: if the failure propagates through a shared substrate the bulkhead doesn’t cover (shared filesystem, shared OS kernel, shared physical host), the bulkhead is theatrical, not real.
- Strong cross-tenant queries needed: bulkheads make cross-tenant operations expensive or impossible by design; if you need them, you’ve broken the isolation contract.
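The shared-substrate exclusion can be checked mechanically. The following is a minimal sketch under a hypothetical model, not a definitive method: each compartment is listed with the substrates it depends on, and a worst-case failure is assumed to saturate every substrate the failing compartment touches. The function name `blast_radius` and the substrate names are illustrative assumptions.

```python
from collections import deque


def blast_radius(start: str, substrates: dict[str, set[str]]) -> set[str]:
    """Return every compartment a worst-case failure in `start` can reach.

    `substrates` maps a compartment name to the set of substrates it uses
    (pools, hosts, filesystems). Assumption: a failing compartment can take
    down any substrate it touches, and with it every other user of that
    substrate, transitively.
    """
    affected = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other, used in substrates.items():
            # Any overlap in substrates is a path for the failure to spread.
            if other not in affected and used & substrates[current]:
                affected.add(other)
                queue.append(other)
    return affected


# Per-tenant pools only: the failure stays in one compartment.
isolated = {"tenant-a": {"pool-a"}, "tenant-b": {"pool-b"}}

# Per-tenant pools, but two tenants share a physical host.
theatrical = {
    "tenant-a": {"pool-a", "host-1"},
    "tenant-b": {"pool-b", "host-1"},
    "tenant-c": {"pool-c", "host-2"},
}
```

Under this model, `blast_radius("tenant-a", isolated)` contains only `tenant-a`, while `blast_radius("tenant-a", theatrical)` also contains `tenant-b`: the pools are partitioned, but the shared host is not.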
Structure
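One way to picture the structure is a set of per-compartment slot pools with a hard limit on each. The sketch below is illustrative only, assuming a thread-based service and tenant names as compartment keys; the names `Bulkhead`, `BulkheadFull`, and `slot` are hypothetical and not taken from any particular library.

```python
import threading
from contextlib import contextmanager


class BulkheadFull(Exception):
    """Raised when a compartment has no free slots left."""


class Bulkhead:
    """Partitions capacity into per-key compartments with a hard limit each."""

    def __init__(self, limit_per_compartment: int):
        self._limit = limit_per_compartment
        self._slots: dict[str, threading.BoundedSemaphore] = {}
        self._lock = threading.Lock()

    def _compartment(self, key: str) -> threading.BoundedSemaphore:
        # Create each compartment lazily; the lock guards the dict only.
        with self._lock:
            if key not in self._slots:
                self._slots[key] = threading.BoundedSemaphore(self._limit)
            return self._slots[key]

    @contextmanager
    def slot(self, key: str):
        """Hold one slot in `key`'s compartment, or fail fast if it is full."""
        sem = self._compartment(key)
        if not sem.acquire(blocking=False):
            raise BulkheadFull(key)
        try:
            yield
        finally:
            sem.release()
```

A caller would wrap each unit of work in `with bulkhead.slot(tenant_id): ...`. When one tenant has taken every slot in its own compartment, further calls for that tenant raise `BulkheadFull`, while calls for other tenants still succeed. Rejecting immediately rather than queueing is a design choice made here for brevity; a bounded wait would also fit the pattern.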
Relationships
- container: bulkheads are containers with isolation as the load-bearing property.
- graceful-degradation: bulkheads are the substrate that makes degradation rather than total failure possible.
- circuit-breaker: breakers and bulkheads compose at the dependency boundary.
- rate-limiting: per-bulkhead rate limits are how the isolation is operationalized.
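The rate-limiting relationship can be sketched as one token bucket per compartment, so that a burst from one tenant drains only that tenant's budget. This is a hedged illustration, not a production limiter: the class names, the `clock` parameter, and the token-bucket choice are assumptions made for the example, and the code is not thread-safe as written.

```python
import time
from typing import Callable


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock: Callable[[], float]):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start full
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerCompartmentLimiter:
    """Keeps an independent bucket per compartment key."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, compartment: str) -> bool:
        bucket = self._buckets.get(compartment)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[compartment] = bucket
        return bucket.allow()
```

Because no bucket is shared, a tenant that exhausts its budget is refused on its next call while every other tenant's bucket is unaffected. The injectable `clock` exists only to make the behavior easy to test without sleeping.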
Examples
Ship compartments / Titanic case · engineering-and-technology
Organizational divisions · business
Blast doors in military bunkers · engineering-and-technology
Kubernetes namespaces with resource quotas · computer-science
Lewis, E. V. (Ed.) (1988). *Principles of Naval Architecture* (2nd rev. ed., Vol. 1: Stability and Strength). Society of Naval Architects and Marine Engineers (SNAME) — chapters on damaged stability, flooding, and watertight subdivision. · engineering-and-technology
Nygard, M., *Release It!* (2007), Chapter 5 — the canonical software-pattern essay (introduces "Bulkhead" as a software stability pattern, named by analogy to ship architecture); Hello Interview primer on resilience patterns. · computer-science
Per-process worker isolation (Gunicorn, Unicorn) · computer-science
Per-service AWS account boundaries · computer-science
Per-tenant database isolation · computer-science
Per-tenant thread pools in multi-tenant SaaS · computer-science
ship architecture (bulkheads in steel-hulled ships; SOLAS regulations); naval engineering literature · engineering-and-technology
International Convention for the Safety of Life at Sea (SOLAS), Chapter II-1 ("Construction — Subdivision and stability") — IMO; first adopted 1914 in direct response to the 1912 Titanic disaster, current text SOLAS 1974 as amended. · engineering-and-technology
Spacecraft / aircraft pressure vessels · engineering-and-technology