biology computer-science philosophy psychology visual-arts

Similarity

Description

Similarity, as a fundamental concept, names the structural claim that items sharing surface features are perceptually and cognitively grouped. The shared features are surface features (color, shape, voice, style, tag, brand mark), not necessarily underlying mechanism or relational structure. The grouping inference is automatic: see two things with matching surfaces, and treat them as the same kind of thing until told otherwise. This is distinct from structural sameness (isomorphism) and from analogical correspondence (the analogous_to edge).

Surface similarity may or may not coincide with deep relationship. When it does, similarity is a cheap and useful heuristic for inferring that relationship. When it doesn’t, similarity becomes the substrate of misclassification: the cargo-culter, the cosplay impostor, the Batesian mimic, and the red-herring decoy all exploit the gap between surface and structure. The diagnostic question (“do these items share structure, or only surface?”) tests whether similarity-based grouping is doing real epistemic work. The discipline of analogical reasoning (inference based on shared structure rather than shared surface) explicitly corrects similarity by demanding the structural account; the gestalt-similarity concept names the perceptual bias that the discipline corrects.

The concept earns its place in the catalog for two reasons. The surface-grouping bias is real across domains (visual perception, brand recognition, code style, organisational identity, species identification), and the catalog needs an explicit name for surface similarity to distinguish it from the deeper relationships that analogical reasoning targets. Calling the bias by name makes its scope explicit and its gap with structure visible.

Triggers

User-initiated: User is reasoning about grouping or categorization on the basis of shared features. Vocabulary cues: “looks like,” “resembles,” “same family,” “matching style,” “they share X,” “looks the same,” “kind of like.”

Agent-initiated: Agent notices the user is treating items as a coherent group on the basis of surface features without having examined whether the structural / mechanistic relationship holds. Candidate inference: “the grouping you’re working with is similarity-based; what structural relationship, if any, holds among these items? Is the similarity an indicator of the structure, or coincident with it, or misleading about it?”

Situation-shape signals:
  • Brand and design system work.
  • Code-style and convention decisions.
  • Categorization tasks in social cognition, biology, or ML.
  • Reasoning about cargo-culting, copying, imitation.
  • Any analytical task where group membership has been assumed on surface grounds and the structural account hasn’t been examined.

Exclusions

  • The shared features are structural, not surface — when items share a relational structure (both follow the same control-flow pattern, both have the same network topology, both have the same syntactic role), the relationship is isomorphism or analogous-to, not similarity. Calling it similarity demotes the structural claim to a perceptual one.
  • The features are unique to each item — if no surface features are shared, no similarity inference is available. The concept needs overlap in features to operate.
  • The grouping is by explicit declaration — when a set is defined by an explicit rule (“everyone who registered before Tuesday”) or by membership token (“members of Team Alpha”), the basis of the grouping is the declaration, not surface resemblance. Reading the group as similarity-based misclassifies it.
  • Pure quantitative-ranking contexts — when the question is “rank by metric X,” not “which of these are alike,” similarity-grouping isn’t the operation in play. Ranking presupposes a total order, not a grouping.
  • The features in question are mechanistically load-bearing rather than surface — when two engines share the same fuel system and that system causes their performance, the shared feature is structural-mechanism, not surface. Similarity-talk obscures the causal pathway.

Structure

Internal structure of similarity: a table of its component slots and the concepts that fill them.

Relationships

Relationship neighborhood of similarity: a graph of the concepts it connects to and the concepts it is a part of.
  • proximity — paired gestalt grouping principle. Items group by surface or by nearness, often by both.
  • isomorphism — structural opposite. Isomorphism is structure-preserving with potentially-different surface; similarity is surface-matching with potentially-different structure. The pair maps the endpoints of “what makes two things alike?”
  • cargo-cult — the failure mode of similarity-based reasoning when surface and structure have come apart.
  • red-herring — exploits similarity to the load-bearing element; the observer’s similarity-grouping bias is the lever the decoy uses.
  • analogous_to (the catalog’s edge kind, not a concept) — the structural-claim edge that asserts structural analogousness rather than surface similarity. Reading the pair clarifies the discipline analogical reasoning brings to similarity: similarity is the gestalt default, the analogous_to edge is the structural claim it must be checked against.
  • surface — similarity operates over surface features; the relationship between similarity and the surface concept is that similarity uses surface as its input and yields grouping as its output.

Examples

Brand systems and design systems · visual-arts

Apple products look like Apple products; Bauhaus chairs look like Bauhaus chairs. The shared surface vocabulary (typography, materials, proportion, color) lets observers identify category membership at a glance. The whole field of branding is built on managed surface similarity.

Cargo-cult engineering · computer-science

Cargo-cult engineering copies the surface practices of successful teams (standups, sprints, mission statements, microservices) without the underlying causal mechanism. The cargo-culter is operating on surface similarity; the failure mode is that surface and structure have come unstuck.

Tversky’s contrast model treats similarity as a function of common and distinctive features rather than as distance in a metric space. Crucially, the model weights distinctive features asymmetrically between the two compared items, which lets it predict empirical asymmetries: subjects rate “North Korea is similar to China” higher than “China is similar to North Korea.” The less salient item is more often the subject of the comparison; the more salient item is more often the referent.

The example instantiates similarity as a structured, asymmetric relation rather than a symmetric metric. It is the canonical psychometric counterpoint to the geometric (Euclidean or cosine) models of similarity that dominate machine-learning representations, and an empirical reminder that any cosine-embedding retrieval chooses a specific similarity model, one that flattens the asymmetries Tversky’s framework foregrounds.
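
The asymmetry can be made concrete with a minimal sketch of a contrast-style score over feature sets. Everything specific here is an assumption for illustration: salience is approximated by feature count, the weights `theta`, `alpha`, and `beta` are arbitrary, the feature sets are invented, and Jaccard overlap stands in for any symmetric measure.

```python
def contrast_similarity(subject, referent, theta=1.0, alpha=0.8, beta=0.2):
    """Contrast-style score of `subject` compared to `referent`.

    Both arguments are sets of features. Salience is approximated by
    feature count, and the default weights are illustrative assumptions.
    """
    common = len(subject & referent)
    subject_only = len(subject - referent)
    referent_only = len(referent - subject)
    return theta * common - alpha * subject_only - beta * referent_only


def jaccard(a, b):
    """Symmetric set-overlap similarity, shown for comparison."""
    return len(a & b) / len(a | b)


# Invented feature sets: the less salient item has few distinctive
# features, the more salient item has many.
less_salient = {"f1", "f2", "x1"}
more_salient = {"f1", "f2", "y1", "y2", "y3", "y4"}

forward = contrast_similarity(less_salient, more_salient)
backward = contrast_similarity(more_salient, less_salient)

# The less salient item scores higher as subject than as referent.
print(forward > backward)  # True
# The symmetric measure cannot express that difference.
print(jaccard(less_salient, more_salient) == jaccard(more_salient, less_salient))  # True
```

Because `alpha` exceeds `beta`, the subject's distinctive features count against the match more heavily than the referent's, so the direction of the comparison changes the score.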

Batesian mimicry: a harmless species evolves surface similarity to a defended one. Predators with similarity-grouping in their learning machinery transfer the avoidance they learned for the model species to the mimic. The mimic’s fitness depends on the predator’s similarity bias.

Code style: uniform indentation, naming conventions, and file organization. The surface uniformity isn’t load-bearing for the program’s behavior, but it is load-bearing for the reader, who uses surface similarity to infer “same author / same conventions / safe to read pattern-by-pattern.”

Rosch’s prototype theory argued that natural categories are organized around best-example prototypes rather than necessary-and-sufficient feature lists. Membership is graded: a robin is a more prototypical bird than a penguin, and subjects are reliably faster to verify category membership for typical exemplars than for atypical ones. The category boundary is fuzzy because similarity to the prototype varies continuously.

The example instantiates similarity as the basis of category assignment: to be “in” the category is to be sufficiently similar to its prototype. This is one of the cognitive-science origins of the graded-similarity intuition that underlies modern embedding-based retrieval: nearest-neighbor lookup against a prototype-like centroid is structurally what Rosch’s subjects were doing when they classified atypical exemplars.
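
Graded membership against a prototype-like centroid can be sketched as follows. This is a toy illustration, not a model of Rosch's experiments: the four binary features and every exemplar vector are hypothetical, and cosine similarity is only one possible choice of measure.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Hypothetical binary features: [flies, sings, small, swims]
known_birds = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
]
prototype = centroid(known_birds)

robin = [1, 1, 1, 0]
penguin = [0, 0, 0, 1]

robin_typicality = cosine(robin, prototype)
penguin_typicality = cosine(penguin, prototype)

# Both score above zero, but the robin sits much closer to the prototype.
print(robin_typicality > penguin_typicality > 0)  # True
```

A hard category boundary would require picking a threshold on this score, which is where the fuzziness described above enters.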

Bates described palatable Amazonian butterflies whose wing patterns closely resembled those of unpalatable, predator-deterring species. Predators that had learned to avoid the unpalatable model generalized that avoidance to the harmless mimic on the basis of surface similarity. The mimic gains protection by exploiting the predator’s similarity-based generalization, without itself being toxic.

The example instantiates similarity as the substrate of grouping in predator perception, with adaptive consequences. It also shows similarity as a structural attack surface: when a system routes behavior off surface features, agents can construct surfaces that exploit the routing. The same dynamic plays out in phishing emails imitating legitimate sender surfaces and in cargo-cult engineering imitating the surface of a successful system.

Wertheimer’s 1923 paper articulated the gestalt grouping laws of proximity, similarity (Ähnlichkeit), closure, common fate, and good continuation, which describe how the visual system organizes raw retinal input into perceived units. Similarity in this account is one grouping principle among several: items sharing color, shape, size, or orientation tend to be perceived together, against a background of items that differ on those features.

The example instantiates similarity as a low-level perceptual grouping operation, not just a cognitive judgment. It is the foundational reference for the catalog’s similarity primitive in the visual / gestalt cluster, and it shows the principle at work below the level of conscious categorization that Rosch and Tversky studied: the same structural move at a different cognitive layer.

Spam and phishing detection: classifiers (and human users) sort messages by surface cues such as sender domain, formatting, and link patterns. Sophisticated attackers manage their surface to defeat the detector by approximating a legitimate surface; the arms race plays out in the similarity-detection layer.

Social categorization and stereotyping: surface features (race, accent, dress, gender presentation) trigger automatic categorization with downstream attribution of dispositions, abilities, or risk. The phenomenon is the social-domain instance of similarity-based grouping operating below explicit awareness.

Tversky’s contrast model: a formal cognitive-science treatment of similarity as a function of common features minus distinctive features. The model captures the asymmetry of similarity judgments (North Korea is judged more similar to China than China is to North Korea) by weighting the distinctive features of the subject and of the referent differently.
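
Written out, the contrast model is usually stated in the form below. The notation is an assumption brought in from common presentations of the model rather than from this entry: A and B are the feature sets of the subject a and the referent b, f is a salience measure over feature sets, and θ, α, β are non-negative weights.

```latex
S(a, b) = \theta\, f(A \cap B) - \alpha\, f(A \setminus B) - \beta\, f(B \setminus A)

S(a, b) - S(b, a) = (\alpha - \beta)\,\bigl( f(B \setminus A) - f(A \setminus B) \bigr)
```

The second line follows by swapping the roles of a and b and subtracting. It shows that the judgment is symmetric only when α = β or when the two items' distinctive features are equally salient.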

Gestalt similarity grouping: in dot arrays, mixed-color dots are grouped by color regardless of position. Color similarity overrides spatial layout; the grouping is automatic and pre-attentive.

Family resemblance: Philosophical Investigations §66 famously notes that “game” cannot be defined by a single shared essence. No one feature is common to chess, tag, and solitaire, yet each pair of games shares some overlapping feature. The category is held together by similarity-chains, not by structural identity.
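
The shape of that argument (pairwise overlap with no feature common to all) can be checked mechanically. The feature assignments below are invented purely to illustrate the pattern and are not claims about the games themselves.

```python
from itertools import combinations

# Invented feature assignments, chosen only to show the pattern.
games = {
    "chess": {"opponents", "played_at_a_table"},
    "tag": {"opponents", "casual_pastime"},
    "solitaire": {"played_at_a_table", "casual_pastime"},
}

# Features present in every game: the candidate "shared essence".
shared_by_all = set.intersection(*games.values())

# True when every pair of games has at least one feature in common.
every_pair_overlaps = all(
    games[a] & games[b] for a, b in combinations(games, 2)
)

print(shared_by_all)        # set()
print(every_pair_overlaps)  # True
```

A definition by necessary-and-sufficient features would need `shared_by_all` to be non-empty; a similarity-chain category only needs the pairwise overlaps.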