What do we actually compare?

A while back I argued that humans learn by comparing — by putting two things side by side and finding what’s the same and what’s different. I still believe that’s the engine. But it leaves an obvious gap: to compare two things, you first have to decide what about them to compare. So: what do we actually compare?

When I put an ARC input and output grid side by side, I’m clearly not comparing raw pixels. I’m comparing features — size, the set of colors, the area each color covers, symmetry — and for each feature I note whether it’s the same (COMM) or changed (DIFF). In my solver I literally print this out as a little “comparison receipt”:

task 220 - pair 0
========================================
size
              rel    input  output  delta
  height      COMM   4      4       0
  width       DIFF   9      12      3
  area        DIFF   36     48      12
----------------------------------------
color
  count       DIFF   4      5       1
  4           COMM   4      4       .
  ...
----------------------------------------
symmetry
  hori        COMM   False  False
  ...
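
For readers who want to see the mechanics, here is a minimal sketch of how a receipt like this could be produced. It is not my solver’s actual code: the names `grid_features` and `compare`, the exact feature set, and the reading of "hori" as left-right mirror symmetry are all assumptions made for illustration, and the two grids are toy data rather than task 220.

```python
from collections import Counter

Grid = list[list[int]]


def grid_features(grid: Grid) -> dict[str, object]:
    """Flatten a grid into named features (this feature set is an assumption)."""
    height, width = len(grid), len(grid[0])
    areas = Counter(cell for row in grid for cell in row)
    features: dict[str, object] = {
        "height": height,
        "width": width,
        "area": height * width,
        "color_count": len(areas),
        # Assumed meaning of "hori": every row reads the same when reversed.
        "sym_hori": all(row == row[::-1] for row in grid),
    }
    for color, cells in areas.items():
        features[f"color_{color}"] = cells  # area covered by each color
    return features


def compare(inp: Grid, out: Grid) -> list[tuple[str, str, object, object]]:
    """Return one (feature, COMM|DIFF, input value, output value) row per feature."""
    a, b = grid_features(inp), grid_features(out)
    rows = []
    for name in sorted(a.keys() | b.keys()):
        # A color missing from one grid covers zero cells there.
        left, right = a.get(name, 0), b.get(name, 0)
        rows.append((name, "COMM" if left == right else "DIFF", left, right))
    return rows


# Toy grids, invented for this example.
toy_input = [[1, 2], [2, 1]]
toy_output = [[1, 2, 1], [2, 1, 2]]

for name, rel, left, right in compare(toy_input, toy_output):
    print(f"{name:<12} {rel:<5} {left!s:<6} {right!s:<6}")
```

The point of the sketch is only that every row is a named feature plus a same-or-changed verdict; which features go into `grid_features` is where the real decisions live.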

For a long time I treated this as just an engineering choice. Then I realized a cognitive scientist had written down almost exactly this, nearly fifty years ago.

In “Features of Similarity” (1977), Amos Tversky rejected the then-standard view that similarity is distance in some geometric space. Instead he modeled each object as a set of features, and similarity as a contrast of common and distinctive features:

\[S(A, B) = \theta\, f(A \cap B) \;-\; \alpha\, f(A - B) \;-\; \beta\, f(B - A)\]

Read that slowly and it’s almost unsettling how close it is to the receipt above. Here f measures the salience of a set of features, and θ, α, β ≥ 0 are weights. The common features A ∩ B are my COMM rows: they push similarity up (the θ term). The distinctive features A − B and B − A are my DIFF rows: they push it down (the α and β terms). My comparison receipt is, more or less, a Tversky contrast model computed over grid features.
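
To make the correspondence concrete, here is a rough sketch that scores the visible rows of the receipt with the contrast formula. Several things are assumptions on my part: each feature is encoded as a (name, value) pair, f is taken to be a plain count of features, and the weights θ = 1, α = β = 0.5 are arbitrary. The rows elided with "..." in the receipt are simply left out.

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5, f=len):
    """Tversky contrast: theta*f(A & B) - alpha*f(A - B) - beta*f(B - A).

    f defaults to a plain count, which is an assumption; Tversky only asks
    for some salience measure over feature sets.
    """
    return theta * f(a & b) - alpha * f(a - b) - beta * f(b - a)


# Only the rows visible in the receipt, encoded as (name, value) pairs.
input_features = {
    ("height", 4), ("width", 9), ("area", 36),
    ("color_count", 4), ("color_4_area", 4), ("sym_hori", False),
}
output_features = {
    ("height", 4), ("width", 12), ("area", 48),
    ("color_count", 5), ("color_4_area", 4), ("sym_hori", False),
}

common = input_features & output_features       # the COMM rows
only_input = input_features - output_features   # input side of the DIFF rows
only_output = output_features - input_features  # output side of the DIFF rows

score = contrast_similarity(input_features, output_features)
print(sorted(name for name, _ in common))
print(score)
```

With this encoding a COMM row lands in the intersection, while a DIFF row contributes one pair to each of the two difference sets. The score itself matters less than the fact that changing the weights, or f, changes which grids count as "similar".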

What I find most useful is what Tversky’s model predicts beyond the formula. Because similarity is weighted feature-matching, it can be asymmetric (whenever α ≠ β, comparing A to B need not give the same score as comparing B to A) and it is context-dependent (which features you weigh shifts with what else is present). For ARC that isn’t a footnote; it’s the whole game. The same two grids, compared on a different set of features, point to a different rule. So choosing the features to compare is not preparation for the reasoning; it is the reasoning.
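
Both effects are easy to see in a small sketch. Everything here is hypothetical: the feature names, the weights α = 0.75 and β = 0.25, and the use of set size for f are chosen only to make the arithmetic visible.

```python
def contrast(a, b, theta=1.0, alpha=0.75, beta=0.25):
    """Tversky contrast with f = set size; alpha != beta makes it directional."""
    return theta * len(a & b) - alpha * len(a - b) - beta * len(b - a)


# Hypothetical feature sets: the variant has a subset of the prototype's features.
variant = {"square", "blue"}
prototype = {"square", "blue", "border", "centered"}

# Asymmetry: the direction of the comparison changes the score.
variant_to_prototype = contrast(variant, prototype)  # 2 - 0.75*0 - 0.25*2
prototype_to_variant = contrast(prototype, variant)  # 2 - 0.75*2 - 0.25*0

# Context: attend only to shape and color, and the same two objects
# become indistinguishable in both directions.
attended = {"square", "blue"}
narrow_forward = contrast(variant & attended, prototype & attended)
narrow_backward = contrast(prototype & attended, variant & attended)

print(variant_to_prototype, prototype_to_variant)
print(narrow_forward, narrow_backward)
```

The objects never change between the two halves of the example. Only the weights and the attended features do, which is the sense in which picking the features is already the reasoning.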

This also tells me where comparison has to go next. Tversky compares objects on their features. But a lot of ARC isn’t about an object’s features — it’s about the relations between objects, and how those relations change. Comparing relational structure rather than flat features is precisely the step from Tversky toward Gentner’s structure-mapping — and toward comparing whole solution trees instead of single grids. Same move, one level up.