Caching
Description
Caching means keeping a close-to-consumer copy of frequently accessed data so that reads don’t have to traverse the full distance to the authoritative source every time. The diagnostic shape: there’s a source whose access is slow or expensive; there’s a consumer whose access is hot; a copy lives between them, addressed by key, populated on miss (or eagerly), and invalidated by some policy. The structural payoff is reduced latency and reduced load on the source. The structural cost is staleness (the cache may not reflect the source’s current state) plus the operational cost of the invalidation policy.
The famous quip is exact: “There are only two hard things in Computer Science: cache invalidation and naming things” (Phil Karlton). Cache invalidation is hard because it lives at the seam between two systems with different change-propagation guarantees; the cache promises an abstraction (looks-like-the-source) that the world keeps leaking through.
Caching is structurally adjacent to replication. A read-replica IS a cache of the primary with eventual-consistency semantics; the engineering tradeoffs (staleness vs throughput, invalidation policy, TTL choice) are isomorphic. The vocabulary differs by tradition (databases say “replica,” web infrastructure says “cache,” CPUs say “L1/L2/L3”), but the structural primitive is one.
Triggers
User-initiated: User describes slow access to a source, repeated identical queries, or proposes adding a cache. Vocabulary cues: “cache,” “caching,” “CDN,” “memoization,” “TTL,” “cache miss.”
Agent-initiated: Engine notices repeated identical reads against a slow / expensive source. Candidate inference: “this wants caching: what’s the key, the TTL, and the invalidation policy?”
Situation-shape signals: Slow / expensive source with hot access pattern; tolerable staleness in the consumer; read-heavy workload; reducible latency dominates the user experience.
Exclusions
- Write-heavy workloads — caches help reads; writes still hit the source, and the cache’s invalidation overhead can dominate.
- Strict-consistency requirements — if staleness is unacceptable, a cache without strong synchronous invalidation is wrong; the alternative may be no-cache or compare-on-read.
- No locality of reference — if every key is accessed exactly once, the cache miss rate is 100% and the cache adds latency.
- Small dataset, small source — if the source is already fast, the cache adds operational complexity without latency benefit.
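The last two exclusions can be checked with simple arithmetic. The sketch below is a hedged illustration, not a measurement method: it assumes a look-aside cache where every read pays the cache lookup cost and a miss additionally pays the full source cost. The function names and the millisecond figures are hypothetical.

```python
def expected_read_latency(hit_ratio: float, cache_ms: float, source_ms: float) -> float:
    """Mean read latency when every read checks the cache first.

    A hit costs cache_ms; a miss costs cache_ms plus source_ms.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return cache_ms + (1.0 - hit_ratio) * source_ms


def cache_pays_off(hit_ratio: float, cache_ms: float, source_ms: float) -> bool:
    """True when the cached path is faster on average than going to the source directly."""
    return expected_read_latency(hit_ratio, cache_ms, source_ms) < source_ms


# Hot keys, slow source: 1 + 0.25 * 40 = 11 ms on average instead of 40 ms.
hot = expected_read_latency(hit_ratio=0.75, cache_ms=1.0, source_ms=40.0)

# No locality of reference: every read misses, so the cache only adds its own lookup cost.
cold = expected_read_latency(hit_ratio=0.0, cache_ms=1.0, source_ms=40.0)

# Source already fast: even a 40% hit ratio does not recover the lookup overhead.
fast_source_ok = cache_pays_off(hit_ratio=0.4, cache_ms=1.0, source_ms=2.0)
```

Under this model the break-even point is a hit ratio equal to cache_ms / source_ms; below that, the cache makes the average read slower.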
Structure
Relationships
- eager-vs-lazy — cache is the eager copy; invalidation makes it lazy at staleness boundary.
- leaky-abstraction — cache invalidation is the canonical leak.
- replication — read-replicas are caches by another name.
- grain — cache key IS the grain.
- load-balancing — caches reduce load on the source by absorbing hits at lower levels.
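The shape in the Description (a copy addressed by key, populated on miss, invalidated by some policy) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production cache: `TTLCache`, `load`, and the `user:1` key are hypothetical names, expiry is a fixed TTL, and there is no size bound or eviction.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Look-aside cache: populate on miss, expire by TTL, allow explicit invalidation."""

    def __init__(
        self,
        load: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load            # reads the authoritative source
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            self.hits += 1
            return entry[1]          # served from the copy; may be stale
        self.misses += 1
        value = self._load(key)      # miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop one entry so the next read goes back to the source."""
        self._entries.pop(key, None)


source = {"user:1": "Ada"}
cache = TTLCache(load=source.__getitem__, ttl_seconds=30.0)
first = cache.get("user:1")      # miss: read from the source
source["user:1"] = "Grace"       # the source changes behind the cache
stale = cache.get("user:1")      # hit: still "Ada" until expiry or invalidation
cache.invalidate("user:1")
fresh = cache.get("user:1")      # miss again: now "Grace"
```

The key chosen for `get` is the grain of the cache: one entry per key is the smallest unit that can be populated, go stale, or be invalidated.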
Examples
CPU caches (L1/L2/L3) · computer-science
Library reserve shelves · library-and-museum-studies
Biological L1-cache (charged tRNAs) · biology
Browser HTTP cache · computer-science
CDN edge caches (Cloudflare, Fastly, Akamai) · computer-science
Cache-Control, ETag, Vary, surrogate-control extensions) are the invalidation policy.
The structural shape is caching at planetary scale. Latency reduction is the headline benefit: a user in Singapore reading content hosted in Virginia experiences a round-trip dominated by the speed of light, not by application logic. The load-shedding benefit is equally important: the origin only handles cache misses, so a viral story with 99% of its reads served from edges sees its origin load capped at the marginal-content-update rate rather than the user-traffic rate. The invalidation problem (when did the origin change? how do we tell the edges?) is the canonical hard part: long TTLs maximize cache-hit ratios but extend the window during which stale content can be served; aggressive purges restore consistency but defeat the cache.
CDN vendors compete on the engineering details of this tradeoff: regional vs. multi-tier edge hierarchies, dynamic content compilation at the edge, programmable purge policies, ESI/edge-side composition, signed-URL invalidation. The structural primitive is one; the implementations vary by which dial each vendor optimizes hardest.
Database query cache · computer-science
DNS caches · computer-science
Hennessy, J. L., & Patterson, D. A. *Computer Architecture: A Quantitative Approach* (Morgan Kaufmann) — the standard graduate text on the memory hierarchy and CPU cache design. · computer-science
Human working memory · psychology
Phil Karlton (Netscape), attributed — "There are only two hard things in Computer Science: cache invalidation and naming things." Reported by Tim Bray; the "off-by-one" rider is a later addition by Leon Bambrick, not part of the original. · computer-science
Karlton, Phil (Netscape) — "There are only two hard things in Computer Science: cache invalidation and naming things"; Knuth, The Art of Computer Programming Vol. 3 — caching as memoization · computer-science
Kleppmann, *Designing Data-Intensive Applications* (2017), Chapters 1, 5 — modern framing; Hello Interview primer on caching; computer-architecture literature on L1/L2/L3 cache hierarchies. · computer-science
Knuth, D. E. *The Art of Computer Programming*, Vol. 3: *Sorting and Searching* — the algorithmic substrate (searching, hashing, table look-up) underlying caches and memo tables. ("Memoization" as a term was coined by Donald Michie, 1968.) · computer-science
Memcached / Redis as application cache · computer-science
RFC 7234 (HTTP Caching) — the canonical web-protocol-level specification. · computer-science
Cache-Control, Expires, ETag, Last-Modified, Vary) and the freshness/validation model that govern how clients, intermediaries (proxies, CDN edges), and origin servers cooperate to serve cached responses safely.
What makes the spec interesting as an instance of the caching primitive is that it explicitly codifies the staleness contract: the conditions under which a cached response may be served without revalidation, must be revalidated, or must be treated as expired. The protocol moves “trades staleness for speed” from an implicit engineering tradeoff into an interoperable, machine-readable agreement between all the caches in the path between origin and consumer.
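The staleness contract can be made concrete with a deliberately simplified decision function. This is a sketch under stated assumptions, not a conforming HTTP cache: it looks only at `no-store`, `no-cache`, and `max-age`, and it ignores `s-maxage`, `Expires`, `Vary`, heuristic freshness, and quoted directive arguments that contain commas. It also assumes header names arrive in canonical capitalization and that the caller already knows the response's age. The function names are hypothetical.

```python
from typing import Dict, Mapping, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def cache_decision(headers: Mapping[str, str], age_seconds: float) -> str:
    """Return 'do-not-store', 'serve', or 'revalidate' for a stored response."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc:
        return "do-not-store"
    if "no-cache" in cc:
        return "revalidate"          # stored, but never reused without checking
    max_age = cc.get("max-age")
    if max_age is not None and max_age.isdigit() and age_seconds < int(max_age):
        return "serve"               # still fresh: no contact with the origin
    return "revalidate"              # stale or no explicit lifetime: ask the origin


def revalidation_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Build conditional-request headers from the stored validators."""
    conditional: Dict[str, str] = {}
    if "ETag" in headers:
        conditional["If-None-Match"] = headers["ETag"]
    if "Last-Modified" in headers:
        conditional["If-Modified-Since"] = headers["Last-Modified"]
    return conditional


stored = {"Cache-Control": "public, max-age=60", "ETag": '"abc"'}
within_lifetime = cache_decision(stored, age_seconds=10)   # "serve"
past_lifetime = cache_decision(stored, age_seconds=60)     # "revalidate"
conditional = revalidation_headers(stored)                 # {"If-None-Match": '"abc"'}
```

When the decision is "revalidate", the cache sends the conditional headers to the origin; a 304 Not Modified response lets it keep serving the stored body without transferring it again. RFC 7234 has since been obsoleted by RFC 9111, which carries the same freshness and validation model forward.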