The whole comes first

Published: November 26, 2024

The whole is greater than the sum of its parts.

That line is the slogan of Gestalt psychology, and the more I work on perception in ARC, the more I keep returning to it.

Gestalt theory emerged in early-twentieth-century Austria and Germany (Wertheimer, 1923; Koffka, 1935; Köhler, 1929) as a rejection of the structuralist idea that perception is built up by adding together atomic sensory elements. The Gestaltists argued the opposite: we perceive whole patterns and configurations first, and the parts come after. Their grouping laws are still the cleanest description I know of how raw pixels become things:

Similarity — elements alike in color, shape, or size are seen as belonging together.
Proximity — elements close together are grouped together.
Closure — we fill gaps to perceive a complete figure.
Continuity — smooth, uninterrupted arrangements are read as one.
Figure–ground — we split a scene into a focal figure and its background.

When you look at an ARC grid, this is exactly what happens before any “reasoning” begins. You don’t scan pixels left to right. You see groups — this cluster is one object, this color is figure, black is ground — and only then do you start comparing.
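That first grouping step can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any particular solver: the grid is a list of lists of color codes, color 0 (black) is assumed to be ground, and "one object" is assumed to mean a 4-connected region of a single color. The name `find_objects` is hypothetical.

```python
from collections import deque

def find_objects(grid, background=0):
    """Group same-colored, 4-connected cells into objects.

    Assumption: `background` (0, i.e. black) is treated as ground and skipped.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] == background:
                continue
            color = grid[r][c]
            cells = []
            queue = deque([(r, c)])
            seen[r][c] = True
            # Breadth-first flood fill over neighbors of the same color.
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append({"color": color, "cells": sorted(cells)})
    return objects

# A made-up 3x4 grid: one L-shaped object of color 1, one bar of color 2.
grid = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 2],
]
objects = find_objects(grid)
```

Even this tiny sketch hard-codes two Gestalt decisions: which color counts as ground, and that adjacency plus identical color means "one thing". Change either assumption and the same pixels yield different objects.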

Why does this matter to me beyond being a nice description? Because Gestalt is where I feel the boundary of my own approach most honestly. I work in pure symbolic representations, and when I go law by law, some translate cleanly and some resist:

Similarity is symbol-friendly — I can measure it from explicit features (same color, same shape, same size). ✓
Proximity is half-friendly — I can measure distance, but “closer therefore belongs together” isn’t itself a crisp symbolic fact; it’s a soft tendency.
Closure and figure–ground feel genuinely hard to make symbolic. They seem to need a notion of the whole and a shifting focus — figure–ground is almost a mode you flip, depending on what you’re attending to, not a fixed property you can write down.
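The contrast between the first two laws shows up as soon as they are written down. The sketch below is illustrative only: objects are assumed to be dicts with a `color` and a list of `cells`, distance is assumed to be the Chebyshev gap between the nearest cells, and `same_color`, `gap`, and `proximity_groups` are hypothetical names. Similarity comes out as a yes/no predicate, while proximity only becomes a grouping once a threshold (`max_gap`) is chosen.

```python
from itertools import combinations

def same_color(a, b):
    # Similarity on an explicit feature: a crisp yes/no fact.
    return a["color"] == b["color"]

def gap(a, b):
    # Smallest Chebyshev distance between any cell of a and any cell of b.
    return min(max(abs(y1 - y2), abs(x1 - x2))
               for y1, x1 in a["cells"]
               for y2, x2 in b["cells"])

def proximity_groups(objects, max_gap):
    # Merge objects whose gap is at most max_gap, transitively.
    # The threshold is an assumption; nothing in the grid dictates it.
    groups = [{i} for i in range(len(objects))]
    for i, j in combinations(range(len(objects)), 2):
        if gap(objects[i], objects[j]) <= max_gap:
            gi = next(g for g in groups if i in g)
            gj = next(g for g in groups if j in g)
            if gi is not gj:
                gi |= gj
                groups = [g for g in groups if g is not gj]
    return sorted(sorted(g) for g in groups)

# Three made-up single-cell objects on one row.
a = {"color": 1, "cells": [(0, 0)]}
b = {"color": 1, "cells": [(0, 2)]}
c = {"color": 2, "cells": [(0, 5)]}

tight = proximity_groups([a, b, c], max_gap=2)
loose = proximity_groups([a, b, c], max_gap=3)
```

With `max_gap=2` the first two objects group and the third stands alone; with `max_gap=3` all three merge. The distances never changed, only the cutoff did, which is the sense in which proximity is a soft tendency and not a symbolic fact.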

So Gestalt sits right on the seam of my research: it’s evidence that some of human perception is wonderfully symbolic, and a warning that some of it strains against a static, discrete representation. That’s the same tension I wrote about in my post on why I took the harder, symbolic road.

[Figure: a diagram placing Gestaltism in the lineage of psychology that feeds into Cognitive Science and AI.]

Where Gestaltism sits, in my reading: a holism-leaning strand of psychology that, alongside information theory, linguistics, and computer science, flows into Cognitive Science and AI.

And the question is far from settled in modern AI, either. Do deep networks “see” Gestalt wholes? The evidence is genuinely mixed: convolutional nets show human-like sensitivity to grouping principles like closure and proximity, but often only at the output layer, which suggests they learn fundamentally different perceptual properties than we do (“Understanding Deep Convolutional Networks through Gestalt Theory”, 2018; “Mixed Evidence for Gestalt Grouping in Deep Neural Networks”, 2022). So the question of the right substrate for whole-first perception is still open, which is exactly why it’s worth working on.

I put together a short talk walking through this history of Gestaltism and where it meets AI; it’s on YouTube if you’d like the longer version.

Seokki Lee (Albert)
