
Write ahead log

Description

Write-ahead-log is the structural primitive of “write the intent-of-change to a durable log before applying the change in place.” The log entry comes first; the in-place mutation comes second. On crash, the in-place state may be partially-applied or inconsistent, but the log is durable and ordered, so recovery is just “replay the log from the last known-good point.” The log is the source of truth; the in-place state is a materialized view derived from the log. The diagnostic shape: any operation that must survive a crash mid-execution must have its intent recorded in a place that won’t tear, before any in-place mutation begins. Databases use redo logs and undo logs; filesystems use journals; double-entry bookkeeping uses journals (literally the same word, same structural role); surgical teams use spoken-aloud “time-outs” before incision; contracts are signed before execution.

Triggers

User-initiated: The user describes a system that must survive crashes mid-operation, or proposes mutating shared state without a recovery story. Vocabulary cues: “WAL,” “redo log,” “journal,” “commit log,” “crash recovery,” “transaction log.”

Agent-initiated: The engine notices a system with multi-step mutations where partial completion would leave inconsistent state. Candidate inference: “this needs a WAL: what’s the log, the apply protocol, and the replay rule?”

Situation-shape signals: A multi-step mutation that must be atomic across crash boundaries; a durability requirement at a finer grain than the underlying storage offers; an audit-trail requirement.

Exclusions

  • Single-step atomic primitives — a CAS or single-disk-block-write doesn’t need a WAL; the hardware provides the atomicity.
  • Truly stateless systems — no state to recover means no WAL needed.
  • Acceptable to lose recent changes — if RPO is loose (best-effort durability is fine), a WAL is over-engineering; a periodic snapshot suffices.
  • High-frequency-low-value writes where WAL fsync dominates cost — sometimes the WAL’s durability cost is too high; the alternative is batched writes with explicit data-loss windows.
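
The last exclusion, batched writes with an explicit data-loss window, can be sketched as follows. This is a minimal illustration, not a production design: the class name BatchedLog, its batch_size parameter, and the unsynced counter are hypothetical names introduced here. The log calls fsync once per batch, so a crash can lose at most batch_size - 1 of the most recent records.

```python
import os
import tempfile


class BatchedLog:
    """Hypothetical append-only log that calls fsync once per batch of records."""

    def __init__(self, path, batch_size):
        self.file = open(path, "ab")
        self.batch_size = batch_size
        self.unsynced = 0  # records that a crash right now could lose

    def append(self, record: bytes):
        self.file.write(record + b"\n")
        self.unsynced += 1
        if self.unsynced >= self.batch_size:
            self.sync()

    def sync(self):
        # One flush + fsync covers every record appended since the last sync.
        self.file.flush()
        os.fsync(self.file.fileno())
        self.unsynced = 0

    def close(self):
        self.sync()
        self.file.close()


path = os.path.join(tempfile.mkdtemp(), "batched.log")
log = BatchedLog(path, batch_size=100)
for i in range(250):
    log.append(f"event {i}".encode())
window = log.unsynced  # 50 records are currently exposed to loss
log.close()
```

The design choice is to make the loss window a stated, bounded number rather than an accident of the operating system's write-back timing.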

Structure

Write-ahead-log = an append-only durable container + a pending-change descriptor + a deferred apply step. The bookends are the WAL entry (intent) and the apply (or commit-marker). Recovery is the replay protocol: any apply that didn’t finish gets re-attempted on restart; idempotency of the apply makes the replay safe.
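
A minimal sketch of the three slots and the replay protocol, assuming a JSON-lines file as the durable container and a Python dict as the in-place state. The class TinyWAL and the crash_before_apply flag are hypothetical names used only to simulate a crash between the log write and the apply.

```python
import json
import os
import tempfile


class TinyWAL:
    """Hypothetical minimal WAL guarding an in-memory dict (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.state = {}   # the in-place state, a view derived from the log
        self._replay()    # recovery runs on every open

    def _append(self, record):
        # Append one JSON line and force it to stable storage before returning.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def put(self, key, value, crash_before_apply=False):
        self._append({"key": key, "value": value})  # intent first
        if crash_before_apply:
            return                # simulate a crash between log and apply
        self.state[key] = value   # deferred apply step

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn tail record: that intent never became durable
                # Assigning a key is idempotent, so re-applying is safe.
                self.state[record["key"]] = record["value"]


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal = TinyWAL(log_path)
wal.put("a", 1)
wal.put("b", 2, crash_before_apply=True)  # logged, never applied
recovered = TinyWAL(log_path)             # restart: replay completes the orphan
print(recovered.state)                    # {'a': 1, 'b': 2}
```

Replay here re-applies every record, including ones that were already applied; that is only safe because assigning a key to a value is idempotent.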

Relationships

Relationship neighborhood of write-ahead-log: a graph of the concepts it connects to and the concepts it is a part of.
  • bookends — the WAL entry opens; the apply (or commit marker) closes; recovery completes any orphan opens.
  • event-sourcing — promoting the WAL from “internal recovery mechanism” to “the actual data model” is event-sourcing.
  • idempotency — the apply step must be idempotent for replay to be safe.
  • load-bearing — the WAL is structurally load-bearing in any system that survives crashes.

Examples

Git commits · computer-science

git commit writes to .git/objects (the durable log) before any branch-pointer update. The commit log is durable; branch refs are a derived view.
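
The ordering can be sketched as below. This is loosely modeled on the idea and is not Git's actual on-disk format (real Git compresses objects, prefixes them with a type header, and shards the objects directory). The helpers write_object, update_ref, and read_ref are hypothetical names for this illustration.

```python
import hashlib
import os
import tempfile


def write_object(repo, payload: bytes) -> str:
    """Store payload under its SHA-1 name; existing objects are never modified."""
    digest = hashlib.sha1(payload).hexdigest()
    path = os.path.join(repo, "objects", digest)
    if not os.path.exists(path):
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename into place
    return digest


def update_ref(repo, name, digest):
    """Point a ref at an object by atomically replacing one small file."""
    path = os.path.join(repo, "refs", name)
    tmp = path + ".lock"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(digest + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def read_ref(repo, name):
    with open(os.path.join(repo, "refs", name), encoding="utf-8") as f:
        return f.read().strip()


repo = tempfile.mkdtemp()
os.makedirs(os.path.join(repo, "objects"))
os.makedirs(os.path.join(repo, "refs"))

first = write_object(repo, b"first commit")
update_ref(repo, "main", first)

# Simulated crash: the second object is written, but the ref update never runs.
second = write_object(repo, b"second commit")
print(read_ref(repo, "main") == first)  # True
```

Because the object is durable before the pointer moves, a crash between the two steps leaves an unreferenced object and a ref that still names a complete, valid commit.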

Double-entry bookkeeping (Pacioli, 1494) · business

Double-entry bookkeeping, codified by Luca Pacioli in 1494 in Summa de Arithmetica, instantiates the write-ahead-log pattern centuries before databases. Every transaction is first recorded in the journal, a chronological narrative of intent (“on this date, paid X to Y for Z”), and only afterward posted to the relevant ledger accounts (cash, inventory, accounts payable). The journal is the durable, append-only record of intent; the ledger is the derived view that aggregates intents into per-account balances.

The structural parallels to a database WAL are exact. (1) Durability: the journal is written first, so even if posting to a ledger account is interrupted, the intent survives and can be replayed. (2) Auditability: every ledger balance can be reconstructed by replaying the journal; corrections never erase the original entry but post compensating entries (which is also why financial systems use event-sourcing discipline rather than mutating prior records). (3) Atomicity at the journal level: a single journal entry, with its debit and credit lines summing to zero, is the unit that either fully exists or does not.

Inference: The pattern is not bound to computers. Any system that needs a durable, replayable history of intent (surgical safety checklists, legal contracts, signed commits) instantiates the same WAL shape. When a digital system reaches for a journal, it is reaching for a 500-year-old solution to the same coordination-and-recovery problem.
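
The journal-then-ledger discipline can be sketched in a few lines. This is an illustrative model only: make_entry and post are hypothetical helper names, the dates and amounts are invented, and debits are represented as positive amounts and credits as negative ones so that a balanced entry sums to zero.

```python
from collections import defaultdict
from decimal import Decimal


def make_entry(date, memo, lines):
    """Build a journal entry; lines are (account, amount) pairs."""
    if sum(amount for _, amount in lines) != 0:
        raise ValueError("unbalanced entry: debits and credits must sum to zero")
    return {"date": date, "memo": memo, "lines": tuple(lines)}


def post(journal):
    """Derive per-account balances by replaying the whole journal in order."""
    ledger = defaultdict(Decimal)
    for entry in journal:
        for account, amount in entry["lines"]:
            ledger[account] += amount
    return dict(ledger)


journal = []
journal.append(make_entry(
    "1494-11-10", "bought cloth for cash",
    [("inventory", Decimal("120")), ("cash", Decimal("-120"))],
))
# The purchase was overstated by 20. The original entry is not edited;
# a compensating entry is appended instead.
journal.append(make_entry(
    "1494-11-12", "correct overstated purchase",
    [("cash", Decimal("20")), ("inventory", Decimal("-20"))],
))

ledger = post(journal)
print(ledger["inventory"], ledger["cash"])  # 100 -100
```

The ledger is never the record of truth in this sketch: it can be discarded and rebuilt at any time by calling post on the journal again.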
the topic is structurally a WAL; consumers are derived views; the log is the source of truth.
Modern journaling filesystems (ext4 on Linux, NTFS on Windows, journaled HFS+ on macOS, and others such as ext3, ReiserFS, and JFS) apply the write-ahead-log primitive at the operating-system layer. Before any metadata change (file creation, directory rename, inode-table mutation, free-space-bitmap update) is committed to the in-place on-disk structures, the filesystem first writes a description of the pending change to a dedicated journal region of the disk and flushes it to durable storage. Only after the journal entry is durable does the filesystem begin modifying the in-place metadata.

The crash-recovery protocol is exactly the WAL replay pattern. On mount after an unclean shutdown, the filesystem scans the journal: entries marked as committed but whose in-place application is incomplete are re-applied; entries not yet committed are discarded. Without the journal, recovery would require a full filesystem-consistency check (the historical fsck on traditional Unix filesystems), a scan of every inode and block bitmap that scales with disk size and could take hours on large volumes. With the journal, recovery time is bounded by the journal size, typically seconds.

Inference: Adopting a WAL converts crash-recovery from an O(disk-size) consistency check into an O(journal-size) replay. The same trade-off applies in any system with persistent state where the alternative is a full integrity scan: the WAL pays a steady cost on every write (the extra journal flush) in exchange for bounded recovery time after failure. The trade is structurally favorable whenever crashes are possible and recovery time matters.
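
The mount-time scan can be sketched as follows, under simplifying assumptions: the journal is modeled as a Python list of tuples, the in-place metadata as a dict, and every write replaces a whole value. The record shapes and the function name recover are hypothetical and do not correspond to any real filesystem's journal format.

```python
def recover(journal, metadata):
    """Replay committed journal transactions into metadata; drop the rest.

    journal: records in write order, each one of
        ("begin", txid), ("write", txid, key, value), ("commit", txid)
    metadata: dict standing in for the in-place on-disk structures
    """
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, value = rec
            metadata[key] = value  # whole-value writes are safe to repeat
    return metadata


journal = [
    ("begin", 1),
    ("write", 1, "dir:/home", "inode 12"),
    ("write", 1, "bitmap:12", "used"),
    ("commit", 1),
    ("begin", 2),
    ("write", 2, "dir:/tmp", "inode 13"),
    # crash: transaction 2 never reached its commit record
]
disk = {"dir:/home": "inode 12"}  # transaction 1 was only half applied

recover(journal, disk)
print(disk)  # {'dir:/home': 'inode 12', 'bitmap:12': 'used'}
```

The work done is proportional to the length of the journal, not to the size of the metadata it protects, which is the O(journal-size) bound described above.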
The Gray, Lorie, Putzolu, and Traiger 1976 paper on lock granularity and consistency degrees is part of the IBM Research foundation that established the modern transactional database. Their work, alongside Jim Gray’s broader ACID-properties formalization, codified the write-ahead-log as the mechanism by which a database guarantees atomicity and durability across crashes. Before a transaction can commit, all of its intended mutations must be recorded in the log; the log entries are durable; the in-place data pages can be written lazily because the log is the authoritative record of what the transaction did.

The WAL is one of the oldest stable abstractions in storage engineering, and its structural analogues predate computing by centuries. The ship’s log (entries written contemporaneously with events, preserved in a durable record kept apart from the day-to-day operation of the ship) solves the same atomicity-and-recovery problem at the maritime layer. Double-entry bookkeeping, formalized in Pacioli’s 1494 Summa de Arithmetica, requires every transaction to be entered in the journal before posting to the relevant ledger accounts; the journal is the source of truth, the ledger is a derived materialized view. The apothecary’s prescription book and the surgeon’s operative note play the same structural role in medical practice.

Inference: When a structural primitive is identifiable across domains that developed independently over centuries (accounting, navigation, medicine, computing), the underlying coordination problem is unlikely to be a domain artifact. The recurring presence of the WAL pattern wherever durable shared state must survive interruption suggests that “record intent before applying” is a near-universal solution to the partial-failure-recovery problem, and recognizing the pattern in a new domain accelerates designing the local instance.
Jim Gray’s “Notes on Data Base Operating Systems” (1978) is the text that turned transaction processing from implementation folklore into an engineering discipline, and it is where the write-ahead-log protocol is stated as a general rule rather than a one-off trick. Compiled from his work on IBM’s System R and delivered as course notes, the Notes formalize the transaction as an all-or-nothing unit of work and lay out the recovery machinery that makes that guarantee real. Gray’s statement of the WAL protocol is the canonical one: a log record describing an update must be written to stable storage before the corresponding change is written to the database itself. That ordering constraint is the whole concept in a sentence: the durable record of intent precedes the in-place mutation, so a crash can never leave a change applied that the log does not also record.

The roles the catalog names are all present and named here. The log is Gray’s stable-storage log of update records, written and forced before any data page is touched. The pending change is the transaction’s mutation. The apply step is the in-place write to the data pages, which Gray’s scheme allows to happen lazily precisely because the log is authoritative: on restart, recovery replays the log to redo committed changes that had not yet reached the data files and to undo the effects of transactions that never committed. This undo/redo logging, together with the WAL ordering rule, is the foundation every later durable storage engine builds on; Gray also bundled in the surrounding apparatus (degrees of consistency, two-phase locking, two-phase commit) that the same notes made standard.

Inference: The load-bearing invariant of any crash-survivable system is an ordering rule, not a data structure: record the intent durably before you act on it. Gray’s WAL protocol is the minimal statement of that rule, and its power is that it decouples durability from the cost of updating structured data: the log absorbs the fsync, and the data pages can be written lazily because the log can always reconstruct them. When designing a new persistence layer, the first question is which write must be forced first, because the answer to “what survives a crash?” is entirely determined by that ordering.
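
The restart behavior described above (redo what committed, undo what did not) can be sketched with before-images and after-images. This is a simplified illustration, not Gray's algorithm as written: the record shapes and the function name restart are hypothetical, and the sketch assumes locking guarantees that no other transaction overwrites a key an uncommitted transaction has written.

```python
def restart(log, pages):
    """Redo committed updates, then undo uncommitted ones.

    log records, in write order:
        ("update", txid, key, before, after)  or  ("commit", txid)
    pages: dict standing in for the data files as found after the crash
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo pass (forward): reinstall after-images of committed transactions.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, _before, after = rec
            pages[key] = after

    # Undo pass (backward): restore before-images of uncommitted transactions.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, key, before, _after = rec
            pages[key] = before

    return pages


log = [
    ("update", "T1", "balance:alice", 100, 70),
    ("update", "T1", "balance:bob", 50, 80),
    ("commit", "T1"),
    ("update", "T2", "balance:carol", 10, 999),  # T2 never committed
]
# State found on disk after the crash: T1's first update had not reached the
# data file yet, while T2's uncommitted update had.
pages = {"balance:alice": 100, "balance:bob": 80, "balance:carol": 999}

restart(log, pages)
print(pages)  # {'balance:alice': 70, 'balance:bob': 80, 'balance:carol': 10}
```

Both anomalies in the crashed state are possible only because data pages are written lazily; the log record for each update was forced first, so restart has what it needs to repair either one.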
Kleppmann’s Designing Data-Intensive Applications (2017), Chapter 3, is the modern reference treatment of write-ahead logging in storage-engine design. The chapter walks through the structural reason every durable database uses a WAL: writes to the primary data structures (B-trees, LSM-trees, in-memory indexes) are not atomic at the hardware level, so committing the intent of a change to an append-only log file before mutating the data structure is the discipline that survives a crash mid-write.

The treatment generalizes across implementations (PostgreSQL’s WAL, MySQL InnoDB’s redo log, SQLite’s WAL-mode journal, log-structured merge-trees) by surfacing the shared pattern: sequential append to a log file is cheap and crash-safe; in-place mutation of structured data is neither. The log absorbs the durability requirement so the on-disk data structures can be updated lazily, with recovery reconstructing any in-flight state by replaying the log forward from the last checkpoint.

Inference: When designing any persistence layer that must survive partial failure, the question is not “do I need a WAL?” but “where is my log, and what does my recovery procedure replay?” If the answer is unclear, the system is one crash away from inconsistent state. The pattern is the answer to a recurring structural problem, not an optimization.
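
Replaying forward from the last checkpoint can be sketched as below. This is an in-memory stand-in, not any particular engine's design: the class Store and its methods are hypothetical, and the sketch assumes the log and the checkpoint are durable while the data structure may be lost or damaged in a crash.

```python
import copy


class Store:
    """Hypothetical store: append-only log plus a lazily updated structure."""

    def __init__(self):
        self.log = []              # assumed durable, append-only
        self.data = {}             # may be lost or torn by a crash
        self.checkpoint = ({}, 0)  # (snapshot of data, log position it covers)

    def write(self, key, value):
        self.log.append((key, value))  # sequential append comes first
        self.data[key] = value         # structure update can lag behind

    def take_checkpoint(self):
        self.checkpoint = (copy.deepcopy(self.data), len(self.log))

    def recover(self):
        snapshot, position = self.checkpoint
        self.data = copy.deepcopy(snapshot)
        for key, value in self.log[position:]:
            self.data[key] = value
        return len(self.log) - position  # number of records replayed


store = Store()
store.write("a", 1)
store.write("b", 2)
store.take_checkpoint()
store.write("a", 3)

store.data = {"a": "garbage"}  # simulate a crash that tore the structure
replayed = store.recover()
print(replayed, store.data)    # 1 {'a': 3, 'b': 2}
```

The checkpoint is what keeps recovery cost proportional to recent activity: only the one record written after it is replayed, not the whole log.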
signatures (intent) precede execution; the signed contract is the durable record of intent, against which post-hoc disputes are adjudicated.
ARIES (Mohan et al., 1992) is the recovery algorithm that hardened the write-ahead-log from a protocol into a fully worked-out, crash-during-recovery-safe procedure, and it is the design most production databases descend from. Where Gray’s 1978 notes stated the WAL ordering rule, ARIES specifies exactly how to use the log to recover, in three phases: an analysis pass forward from the last checkpoint to find the dirty pages and the transactions in flight at the crash; a redo pass that “repeats history” by re-applying every logged change, even those of transactions that later aborted, to reconstruct the precise state at the moment of the crash; and an undo pass backward that rolls back the transactions that never committed. The redo-everything-then-undo ordering is the key conceptual move: rebuilding the exact pre-crash state first makes the undo logic uniform and lets the algorithm support fine-grained record-level locking and partial rollbacks that earlier schemes could not.

The mechanism that makes this safe is pure WAL-concept refinement. Every log record carries a monotonic Log Sequence Number (LSN), and every data page stores the LSN of the last update applied to it (the pageLSN); recovery redoes a logged change only if the page’s stored LSN is older, which makes replay idempotent: the apply step can be repeated safely after a crash mid-recovery. During undo, ARIES writes Compensation Log Records, each of which records a pointer to the next action still to be undone, so a second crash in the middle of rollback never re-undoes completed work or loops. In the catalog’s terms, ARIES is the log, the pending change, and the apply step engineered so that the replay-on-recovery is exactly-once in effect despite being at-least-once in execution.

Inference: A write-ahead log only delivers crash safety if recovery is itself crash-safe, and the way to get there is to make replay idempotent. ARIES’s twin devices generalize beyond databases: stamp each logged action with a sequence number, stamp each target with the highest action it has absorbed, and skip any action the target has already seen; and when compensating, log the compensation together with a pointer to the next action still to be undone, so an interrupted rollback resumes rather than restarts. Any system that replays a log to recover (message queues, event-sourced stores, state-machine replication) needs these properties, or it trades a crash mid-operation for a crash mid-recovery.
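
The first of the two devices, comparing each record's LSN with the page's stored LSN, can be sketched as follows. This covers only the redo check, not the analysis pass, the undo pass, or Compensation Log Records. The types Page and LogRecord and the function redo are hypothetical names, and the logged change is deliberately non-idempotent (an increment) to show why the check matters.

```python
from dataclasses import dataclass


@dataclass
class Page:
    value: int = 0
    page_lsn: int = 0  # LSN of the last update this page has absorbed


@dataclass(frozen=True)
class LogRecord:
    lsn: int
    page_id: str
    delta: int         # non-idempotent change: add delta to the page value


def redo(log, pages):
    """Apply each record at most once per page; return how many were applied."""
    applied = 0
    for rec in log:  # log is in ascending LSN order
        page = pages.setdefault(rec.page_id, Page())
        if page.page_lsn >= rec.lsn:
            continue  # the page has already absorbed this change
        page.value += rec.delta
        page.page_lsn = rec.lsn
        applied += 1
    return applied


log = [LogRecord(1, "p1", 10), LogRecord(2, "p1", 5), LogRecord(3, "p2", 7)]
# Before the crash, p1 was flushed to disk after absorbing LSN 1 only.
pages = {"p1": Page(value=10, page_lsn=1)}

first_pass = redo(log, pages)   # applies LSN 2 and LSN 3
second_pass = redo(log, pages)  # a repeated recovery applies nothing
print(first_pass, second_pass, pages["p1"].value, pages["p2"].value)  # 2 0 15 7
```

Without the LSN comparison, the second pass would add each delta again and the page values would drift with every interrupted recovery.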
Pacioli, Summa de Arithmetica (1494) — the journal as WAL, in accounting form, five centuries before databases.
patient identity, procedure, side stated aloud before any incision. The spoken-aloud declaration is the WAL entry; the incision is the apply step.
The WHO Surgical Safety Checklist, introduced in 2008 and validated in an eight-hospital international study published in the New England Journal of Medicine in 2009, applies the write-ahead-log primitive to surgical procedures. Before any incision, the operating team executes a Time Out: identity of patient verbally confirmed, surgical site marked and reconfirmed, antibiotics administered, anticipated blood loss noted, allergies announced, equipment available, with each item spoken aloud and acknowledged by the team. The intent of the procedure is recorded (literally, spoken into the team’s shared memory) before any irreversible action begins.

The validation study reported a roughly 40 percent reduction in major complications and a comparable reduction in mortality across diverse hospital settings, with effects observed in both high-income and low-income contexts. The intervention itself is essentially free in equipment or training cost; the structural change is the explicit intent-recording step before action. The pattern of the failure modes the checklist prevents (wrong-site surgery, missing instruments discovered mid-procedure, allergic reactions to medications already administered) is exactly the class of crash-recovery problems the WAL pattern addresses: the in-flight state is incomplete or ambiguous, and without a durable intent record, recovery is structurally impossible.

Inference: When evaluating any high-stakes irreversible procedure (surgical, financial, deployment-related, legal), the structural question is whether the team has executed a Time Out: an explicit, often verbally-rehearsed declaration of intent that is robust to in-progress confusion and that constitutes the durable record against which any in-flight decision can be checked. The cost of the Time Out is small; the cost of its absence, when something goes wrong mid-procedure, is potentially catastrophic.