Representation Is Compression

Every time you represent something — in a drawing, a word, a number, a file — you are making a decision about what to keep and what to throw away. That is compression.


The original thing (a landscape, a face, a song) contains effectively unlimited detail. Any representation of it is finite, so something always gets left out. The only question is which details survive the translation.

Lossy vs. Lossless

In computing, compression is either lossless (the original can be perfectly reconstructed — like a ZIP file) or lossy (some information is permanently discarded — like a JPEG or an MP3). Human memory is almost always lossy. So is most real-world data collection.
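To make the distinction concrete, here is a minimal sketch using Python's standard `zlib` module, which implements DEFLATE, the method most ZIP files use. The sample text and the `readings` list are made up for illustration, and rounding to the nearest 10 stands in for lossy compression in general; it is not how JPEG or MP3 work internally.

```python
import zlib

# Illustrative data, not from any real dataset.
text = b"the quick brown fox jumps over the lazy dog " * 20

# Lossless: the decompressed bytes are identical to the original.
packed = zlib.compress(text)
restored = zlib.decompress(packed)
lossless_ok = restored == text  # True

# Lossy: rounding to the nearest 10 throws away the low-order detail.
readings = [12.7, 48.2, 51.9, 97.4]
rounded = [round(r, -1) for r in readings]  # [10.0, 50.0, 50.0, 100.0]

# 48.2 and 51.9 both became 50.0, so no decoder can tell them apart again.
print(len(text), len(packed), lossless_ok, rounded)
```

The lossless round trip recovers every byte. After the lossy step, two different readings share one value, and that difference is gone for good.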

This isn't a flaw. It's the point. A map that contains every detail of a city at 1:1 scale is useless — it's just the city again. Compression is what makes information actionable. The data scientist's job is to choose compressions that preserve what matters for the task at hand.


Here's a concrete demonstration. You've seen the Starbucks logo hundreds of times. But how much of it actually made it into your memory?

Branded in Memory

Draw from Memory

Without looking it up, draw the Starbucks logo as accurately as you can. Use the tools below — take your time.

Draw the Starbucks logo from memory, then compare your compressed representation to the original — and to everyone else's.

Notice what happened. You and hundreds of other people encoded the same logo, but each representation was different. Everyone kept the dominant features (green circle, mermaid figure) and dropped the fine details. That's compression in action: high-frequency visual information (the exact crown shape, the star count, the precise arm position) got filtered out; low-frequency structure (color, rough shape, general subject) was retained.


This is much like what a JPEG does to a photograph. It stores high-frequency detail coarsely or drops it altogether, because that detail is expensive to store and most viewers won't notice its absence. It keeps the broad strokes. Your visual memory appears to make a similar trade-off.
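The idea of keeping low frequencies and dropping high ones can be sketched in a few lines. The example below is a deliberate simplification. It uses a one-dimensional discrete cosine transform on eight made-up samples and zeroes the high-frequency coefficients outright. Real JPEG works on 8×8 pixel blocks in two dimensions and quantizes coefficients rather than simply deleting them. The function names here are hypothetical helpers, not part of any library.

```python
import math

def dct(signal):
    """Unnormalized DCT-II: express the signal as a sum of cosines."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + (2 / n) * sum(
            coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k) for k in range(1, n)
        )
        for i in range(n)
    ]

def keep_low_frequencies(signal, keep):
    """Zero every coefficient past the first `keep`, then reconstruct."""
    coeffs = dct(signal)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)

# A step from "dark" to "bright" with a small wiggle on top (illustrative values).
signal = [10, 12, 11, 13, 30, 32, 31, 33]

exact = keep_low_frequencies(signal, keep=len(signal))  # nothing discarded
rough = keep_low_frequencies(signal, keep=3)            # broad strokes only

print([round(v, 1) for v in exact])
print([round(v, 1) for v in rough])
```

With every coefficient kept, the reconstruction matches the input up to floating-point error. With only three kept, the overall brightness and the dark-to-bright trend survive, but the wiggle and the sharp edge are smoothed away.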

Compression in Every Data Type

  • Text: A summary compresses a document. A word compresses a concept. An emoji compresses an emotion.
  • Images: JPEG compression discards high-frequency pixel variation. Downsampling throws away resolution.
  • Tabular data: Binning a continuous variable (age → "20–30") compresses by quantizing. Averaging discards variance.
  • Models: A trained model is itself a compression — it encodes statistical patterns from millions of training examples into a fixed number of parameters.
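The tabular case is easy to try by hand. The sketch below assumes decade-wide bins and made-up values; `bin_age` is a hypothetical helper, and the bin labels are one possible convention among many.

```python
from statistics import mean, pvariance

def bin_age(age):
    """Quantize an age into a decade-wide bin label (hypothetical scheme)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

ages = [21, 24, 29, 35, 38]
binned = [bin_age(a) for a in ages]
# ['20-29', '20-29', '20-29', '30-39', '30-39']
# 21, 24 and 29 are now indistinguishable.

# Averaging: two groups with the same mean but very different spread.
steady = [50, 50, 50]
spread = [0, 50, 100]
same_mean = mean(steady) == mean(spread)  # True: both average to 50

print(binned)
print(same_mean, pvariance(steady), pvariance(spread))
```

Five distinct ages collapse into two labels, and two very different groups collapse into the same average. Whether that loss matters depends on what the analysis needs.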

When you choose a representation for your data, you are choosing a compression scheme. The decision is never neutral. A feature you don't include can't influence your model. A resolution you discard can't be recovered. A category boundary you draw will shape every downstream result.
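A small sketch shows why discarded resolution cannot be recovered. The two signals are invented for illustration, and the `downsample` and `upsample` helpers are hypothetical. Repeating each sample is only one of many ways to upsample, but no method can know which original it started from.

```python
def downsample(signal, factor=2):
    """Keep every `factor`-th sample and drop the rest."""
    return signal[::factor]

def upsample(signal, factor=2):
    """Stretch back to the original length by repeating each sample."""
    return [value for value in signal for _ in range(factor)]

# Two different signals (illustrative values).
a = [1, 9, 2, 8, 3, 7]
b = [1, 0, 2, 0, 3, 0]

small_a = downsample(a)  # [1, 2, 3]
small_b = downsample(b)  # [1, 2, 3]

# Both originals map to the same compressed form, so the
# reconstruction cannot match both of them.
rebuilt = upsample(small_a)  # [1, 1, 2, 2, 3, 3]

print(small_a == small_b, rebuilt)
```

Once the odd-indexed samples are gone, `a` and `b` look identical. Any choice about what to put back is a guess.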


Understanding representation as compression reframes the question. It's not "how do I store this data?" It's "what do I need to preserve — and what am I willing to lose?"

Checkpoint

A data scientist bins a continuous "age" column into ranges like 0–18, 19–35, 36–60, 61+. What kind of compression is this?