Calibration
Description
Calibration is the operation that makes an instrument’s or a judgment’s outputs trustworthy at face value by aligning them against a reference standard taken to be true. A measuring device produces raw indications; calibration establishes the relation between those indications and the true quantity values supplied by a standard, and quantifies the residual uncertainty that remains. The same shape appears wherever outputs must be trusted: a classifier’s predicted probabilities are calibrated when a “0.7” really does come true 70% of the time, and a forecaster is calibrated when their stated confidences match realized event frequencies over many trials. The diagnostic question (“against what trusted reference is this output aligned, and how big is the residual error?”) separates a calibrated source (trust the number) from an uncalibrated one (the number needs correction first). The reference is always external: internal consistency (a device that repeats the same reading) is precision, not calibration. A precise-but-uncalibrated instrument is reliably wrong.
Triggers
- User-initiated: User asks whether a measurement, score, model output, or confidence can be trusted as-is, or describes aligning a tool against a known standard (“calibrate the sensor,” “the model is overconfident,” “what’s our ground truth”).
- Agent-initiated: Agent notices outputs are being trusted at face value without a named reference, or that stated confidences don’t match observed outcomes.
- Candidate inference: “what reference is this aligned against, and when was it last checked?”
- Situation-shape signals: A device or judge emitting values consumed downstream as truth; a probability/confidence stream whose reliability can be checked against realized frequencies; a traceability chain back to a master standard.
Exclusions
- Drift — losing alignment over time is the failure calibration corrects, not calibration itself. The corrective-vs-failure-mode pair is the sharpest boundary; see drift.
- Error-correction — error-correction reconstructs corrupted content from in-band redundancy; calibration tunes the apparatus against an out-of-band reference.
- Mean-reversion — mean-reversion is a passive restoring force; calibration is an active performed alignment. No dynamic pulls the instrument back on its own.
- Accuracy without a reference — internal consistency (precision, repeatability) is not calibration. The comparison-to-an-external-trusted-standard is constitutive.
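The last exclusion can be made concrete with a minimal Python sketch. It assumes the simplest possible case: a constant offset, estimated from repeated readings of a single reference standard. The function names (`assess`, `calibrate`) and the readings are hypothetical and chosen only for illustration. The spread of the readings can be computed from the readings alone, but the bias cannot be computed without the external reference value.

```python
from statistics import mean, stdev

def assess(readings, reference_value):
    """Return (bias, spread) for repeated readings of one reference standard."""
    bias = mean(readings) - reference_value  # requires the external reference
    spread = stdev(readings)                 # internal consistency only
    return bias, spread

def calibrate(readings, reference_value):
    """Return a correction function that subtracts the estimated constant offset.

    Assumption: the error is a pure offset. Real instruments may also need
    a gain or nonlinearity correction, which this sketch ignores.
    """
    bias, _ = assess(readings, reference_value)
    return lambda indication: indication - bias

# Hypothetical data: a scale reading a 10.00 g reference mass five times.
readings = [10.52, 10.51, 10.53, 10.52, 10.52]
bias, spread = assess(readings, 10.0)
correct = calibrate(readings, 10.0)

print(f"bias={bias:.3f}  spread={spread:.4f}  corrected={correct(10.52):.3f}")
```

With these made-up readings the spread is under 0.01 while the bias is about 0.52: the scale is precise and reliably wrong until the correction is applied. The spread that remains after correction is one contribution to the residual uncertainty.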
Structure
Relationships
- drift — corrective and failure-mode. Drift is what calibration fixes; the interval between calibrations is set by how fast drift accumulates relative to tolerance.
- error-correction — both yield trustworthy values from imperfect ones; the axis is in-band-redundancy vs out-of-band-reference.
- similarity — calibration presupposes the reference is genuinely comparable to the measured quantity; a mismatched standard yields a meaningless alignment.
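The drift relationship above, where the interval between calibrations follows from how fast drift accumulates relative to tolerance, can be sketched as simple arithmetic. This is an illustration under a strong assumption that is not in the source: drift is linear at a known constant rate. The function name `calibration_interval` and all numbers are hypothetical.

```python
def calibration_interval(tolerance, residual_error, drift_rate):
    """Time until the error magnitude reaches tolerance, assuming linear drift.

    tolerance      -- largest acceptable error, in units of the quantity
    residual_error -- error remaining right after calibration, same units
    drift_rate     -- drift per unit time (must be positive)
    """
    if drift_rate <= 0:
        raise ValueError("drift_rate must be positive")
    headroom = tolerance - abs(residual_error)
    if headroom <= 0:
        return 0.0  # already out of tolerance: recalibrate now
    return headroom / drift_rate

# Hypothetical thermometer: 0.50 degC tolerance, 0.05 degC residual error
# after calibration, drifting 0.015 degC per month.
interval_months = calibration_interval(0.50, 0.05, 0.015)
print(f"recalibrate within about {interval_months:.0f} months")
```

Under these assumed numbers the headroom of 0.45 degC is used up in about 30 months. A faster drift rate or a tighter tolerance shortens the interval proportionally.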
Examples
JCGM 200:2012, "International Vocabulary of Metrology — Basic and General Concepts and Associated Terms (VIM)", 3rd edition, Joint Committee for Guides in Metrology / BIPM · engineering-and-technology
Niculescu-Mizil, A. & Caruana, R., "Predicting Good Probabilities with Supervised Learning", Proceedings of the 22nd International Conference on Machine Learning (ICML 2005), pp. 625–632 · computer-science
Lichtenstein, S., Fischhoff, B. & Phillips, L. D., "Calibration of Probabilities: The State of the Art to 1980", in Kahneman, Slovic & Tversky (eds.), Judgment Under Uncertainty: Heuristics and Biases (Cambridge University Press, 1982), pp. 306–334 · psychology