
A Pixel Is Not A Little Square (1995) [pdf] - fanf2
http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf
======
raphlinus
I imagine this came up partly as a result of the recent alpha compositing
discussion.

We desperately need some research, based in user studies and using modern
display technology, to settle some basic questions:

* What reconstruction filter gives the best results? Is it the same for vector (text) and natural images? By "best" I do mean contrast (sharpness) and lack of visible artifacts.

* For rendering of very thin lines (relevant to text), what gamma curve gives the perception of equal line thickness across the range of subpixel phase? How does it vary with display dpi? (Hint: it's likely not linear luminance)

* What gamma curve yields the perception of equal width of black-on-white and white-on-black thin lines (also relevant for text)? (Hint: likely not linear luminance)
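
To make the thin-line questions concrete, here is a minimal sketch of the physical side only. It assumes the standard sRGB transfer function and a 1-pixel-wide white line on black with box coverage; `emitted_light` and its parameters are hypothetical names for illustration. It shows how much light leaves the display as the line slides across two pixels, not which choice looks best, which is exactly what the studies would have to settle.

```python
def srgb_to_linear(v):
    """Decode an sRGB-encoded value in [0, 1] to linear luminance."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lum):
    """Encode linear luminance in [0, 1] as an sRGB value."""
    if lum <= 0.0031308:
        return lum * 12.92
    return 1.055 * lum ** (1 / 2.4) - 0.055


def emitted_light(phase, coverage_is_linear):
    """Total linear light from a 1px white line on black, split over two pixels.

    phase is the subpixel offset in [0, 1]; box coverage is (1 - phase, phase).
    """
    coverages = (1.0 - phase, phase)
    if coverage_is_linear:
        # Treat coverage as linear luminance, then encode for the framebuffer.
        stored = [linear_to_srgb(c) for c in coverages]
    else:
        # Write coverage straight into the framebuffer ("ignoring gamma").
        stored = list(coverages)
    # The display decodes whatever is stored back into light.
    return sum(srgb_to_linear(s) for s in stored)


for phase in (0.0, 0.25, 0.5):
    print(phase, emitted_light(phase, True), emitted_light(phase, False))
```

With linear coverage the emitted light stays constant across phase. With gamma ignored, it drops to roughly 43% at phase 0.5, so the line physically dims as it straddles two pixels. Whether that dimming reads as "thinner" is the perceptual question.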

I've seen a number of discussions where people feel they are able to answer
these questions from first principles, and corresponding arguments that doing
these in a "correct" way gives results that are less visually appealing than
the common assumptions of treating a pixel as a little square (so doing a box
filter for reconstruction) and ignoring gamma so effectively using a
perceptual color space for the purposes of alpha compositing.

I posit that these questions _cannot_ be solved by argumentation. I think user
studies might not be very difficult to do; you could probably get "good
enough" results by doing online surveys, though this wouldn't pass standards
of academic rigor.

~~~
BubRoss
I can tell you right now that you will have a very difficult time beating a
normalized gauss filter with a diameter of around 2.2 pixels in a general
case.
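
A minimal 1D sketch of such a filter, for reference. The radius of 1.1 follows from the ~2.2 pixel diameter above; the `sigma` of 0.5 and the name `gaussian_weights` are assumptions for illustration, since no standard deviation is given here.

```python
import math


def gaussian_weights(center, radius=1.1, sigma=0.5):
    """Normalized, truncated Gaussian weights over integer pixel centers.

    radius=1.1 matches the ~2.2 pixel diameter quoted above.
    sigma=0.5 is an assumed value; the thread does not give one.
    """
    first = math.ceil(center - radius)
    last = math.floor(center + radius)
    raw = {
        i: math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(first, last + 1)
    }
    total = sum(raw.values())
    # Dividing by the sum makes the truncated weights add up to 1.
    return {i: w / total for i, w in raw.items()}


print(gaussian_weights(10.0))
```

Every weight is positive, so this filter cannot ring or overshoot. The price is the softness discussed in the replies.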

Color and luminance are a separate issue, orthogonal to filtering. I also
know that people get away with compositing without converting to linear space,
but I'm skeptical that any benefits they see aren't just a matter of getting
the color curve they want for free, as opposed to doing a similar correction
after something has been composited correctly.
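
A rough sketch of the difference, assuming a pure power-law curve with exponent 2.2 as a stand-in for the real display transfer function. All function names are hypothetical.

```python
GAMMA = 2.2  # assumed power-law display curve, a rough stand-in for sRGB


def decode(v):
    """Encoded framebuffer value -> linear light."""
    return v ** GAMMA


def encode(lum):
    """Linear light -> encoded framebuffer value."""
    return lum ** (1.0 / GAMMA)


def over(src, dst, alpha):
    """Plain 'over' blend of two scalar values."""
    return alpha * src + (1.0 - alpha) * dst


def over_encoded(src, dst, alpha):
    # Blend the stored values directly, without converting to linear space.
    return over(src, dst, alpha)


def over_linear(src, dst, alpha):
    # Decode, blend in linear light, then re-encode.
    return encode(over(decode(src), decode(dst), alpha))


print(over_encoded(1.0, 0.0, 0.5), over_linear(1.0, 0.0, 0.5))
```

For 50% white over black, the encoded blend stores 0.5 while the linear blend stores about 0.73. The gap between those two numbers is the "free" curve in question.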

~~~
dahart
I was working at PDI Dreamworks during one of the semiannual investigations
into which filter kernel was best. And I was completely blown away by how
quickly the lighting sup could identify and react to various filters. Gaussian
was voted down reliably and repeatedly for being too blurry.

Personally, I like the extra blur (and the extra safety and guarantees) you
get with Gaussian. Back in the NTSC days I discovered that vertically blurring
interlaced video made it noticeably more clear and visible, even though it was
softer.

But if you really do spend time with sharper filters, it is true that
Gaussian is softer and some pros really do want sharper images than Gaussian
can provide.

~~~
BubRoss
Were all the filters compared at the same width? At the same width Gaussian
will be softer; it is at a smaller pixel radius that you can compare aliasing
at the same visual sharpness as something like Catmull-Rom.

~~~
dahart
There were a variety of widths, it was one axis of the study. But I don’t
remember the details, it is certainly possible you’re talking about something
we didn’t test. Personally, at first I couldn’t even see the differences they
were discussing.

Having studied graphics and signal processing for a few years in graduate
school before that job, I thought I would be good at seeing the differences,
and I was a bit shocked how good they were at it, and how not that good I was.
:)

Truncate the Gaussian too closely, though, and it’s not exactly a Gaussian
anymore; you lose the best antialiasing properties. I can totally see how it
will be sharper and more comparable to other popular filters. (Normalized &
truncated at 1.1 radius is just slightly outside the 1 std dev line, right?)

Gaussian is my personal choice for large format prints of images with extreme
aliasing problems.

~~~
blevin
My experience has likewise been that film DPs have extremely impressive
visual acuity, memory for color, etc. Talented artists often have developed
whole sets of skills that the rest of us are unaware are even skills. In the
same way a programmer might have thought about cache line false sharing as it
affects memory hierarchy throughput, visual art often hides lots of expertise
you cannot directly perceive, even as you can sense the quality of the whole.

~~~
egypturnash
I like to describe “learning to draw” as “installing a 3d modeling and
rendering package on your brain, along with a decent collection of base models
to modify”. It’s a _complex_ skill set. If you start animating you get to add
in a physics simulation. And you become conscious of so many little things
that the layman only notices when you get it wrong.

------
bjourne
A pixel is a picture element. An element of a picture. Hence the name... It
turns out that thinking of them as little boxes arranged in rectangular grids
is very useful. Because that is how computers deal with them. Not as point
samples.

The article reminds me of the many mathematical texts I've read insisting
that vectors are not tuples of numbers, and that thinking of them as anything
other than directions with magnitudes is wrong. Technically, that might be
correct but vectors-as-numbers is much more useful when calculating with them.
When you get into more abstract mathematics, and your vectors contain other
kinds of algebraic objects, such as polynomials, you are already so accustomed
to them that you can think of them as flying burritos if you like.

When I teach graphics programming, I will continue to tell students that
pixels are like little boxes.

~~~
msla
> The article reminds me of the many mathematical text I've read insisting on
> that vectors are not tuples of numbers. That thinking of them as anything
> other than directions with magnitudes is wrong.

I break every mathematical object down into three things:

1. The intuition. Why do we have this concept to begin with? What underlying
idea are we trying to capture?

2. The definition. These are the axioms.

3. The implementation. This includes every way to _communicate_ the idea,
from natural language words to notation to source code.

Without the intuition, you have nothing but a symbol game. It's hollow.
Something with rules and notation but no deeper intuition is, arguably, chess.

Without the definitions, you can't think rigorously about your ideas and you
don't know if they lead to internal contradiction. You can dump the axioms
without losing the intuition; we did this with set theory at the turn of the
previous century, when Russell proved that the previous axioms were
inconsistent. We saved set theory without having to abandon the notion, the
intuition, of sets entirely.

Without implementation, it's just thought, and you can't communicate with
anyone. Moreover, without some intuition, the implementation is meaningless,
because you have no cognitive frame to use to interpret it.

So the tuple of numbers is one implementation of a vector. It allows you to
communicate some aspects of a vector, but without the underlying idea of what
a vector _means_ , what concept we're trying to get across, it's just a list
of numbers. They might as well be box scores or something.

~~~
mort96
There's also the case that there's a bunch of vectors you _can't_ represent as
tuples of numbers. A vector with orientation but a magnitude of 0 is a
completely valid vector afaik, and you could do things with it like
normalizing it to get a unit vector of the same orientation, and it's not
representable as a tuple.

The tuple model is extremely useful, but incomplete.

~~~
nonanonymous
Vectors with different orientations but zero magnitude are all the same vector
(the zero vector). Observe that if you add a zero-magnitude vector and a unit
vector, the result is just the same unit vector. It follows that the
normalization scheme you described can't be a function on vectors (since you
can "normalize" two equal vectors but get two different results.)
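
The same argument as a small sketch, using plain tuples. The `normalize` helper is a hypothetical illustration, not any library's API.

```python
import math


def normalize(v):
    """Scale v to unit length; undefined for the zero vector."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("the zero vector has no direction to normalize")
    return tuple(c / length for c in v)


# Shrink two differently oriented vectors down to zero magnitude.
zero_from_x = tuple(0.0 * c for c in (1.0, 0.0))
zero_from_y = tuple(0.0 * c for c in (0.0, 1.0))

# Both are the same vector, so a function on vectors cannot tell them apart.
print(zero_from_x == zero_from_y)
print(normalize((3.0, 4.0)))
```

Since the two zero vectors compare equal, any "normalize" that returned different unit vectors for them would not be a function of the vector alone. It would need the orientation as a separate input.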

------
dahart
This piece is a classic and a must-read for graphics people, but do remember
that this was written before LCD displays. Today’s pixels actually are little
squares to a much greater degree than CRTs in 1995. That doesn’t change the
theory or truth in Alvy Ray’s paper, but it does mean that the perfect
reconstruction isn’t the same now as it was then.

~~~
PorterDuff
Heck, if you've written color film recorder software, they're not even
particularly square

------
robbrown451
I wrote this on Quora several years ago, answering the question of why pixels
are square (and quickly clarified that I interpreted it as "why are they laid
out on a rectangular grid?"):
[https://www.quora.com/Why-are-computer-pixels-square/answer/...](https://www.quora.com/Why-are-computer-pixels-square/answer/Rob-Brown-13)

The more I've thought about it since, the more I think we should represent
images with hexagonal pixels (i.e., laid out on a hex grid), and that color
images should treat the center points of red, green, and blue subpixels as not
being right on top of each other. (the third image in the post shows how they
would be arranged, which is actually similar to how they are on some displays)

It would be a little harder to deal with for graphics programmers who are
working at a pixel level, or at the least, it would require a bit of
relearning. But it makes more sense in so many ways. Hex is just a better way
of "circle packing" (as you notice if you arrange a bunch of pennies on a
table), and of course real world displays tend to have the red, green and blue
subpixels offset from one another anyway. (are there any that don't?)

Obviously it isn't easy to change something like this at this point, but
still, I find the idea fascinating, and appealing to the OCD efficiency fan in
me.

~~~
osamagirl69
You might want to look into pentile displays. You are correct in that they
provide higher resolution than rectilinear displays (see the controversy tab
in [1]), but they have largely fallen out of style because it is impossible to
draw a straight line so everything looks blurry upon close inspection.

One cool trick is that you can sprinkle in more efficient white pixels to
allow the display to reduce its power consumption.

[1]
[https://en.wikipedia.org/wiki/PenTile_matrix_family](https://en.wikipedia.org/wiki/PenTile_matrix_family)

------
saboot
This is very relevant for image reconstruction for medical imaging systems.
When determining how much a beam of radiation is attenuated through an image,
it really matters which representation you use when translating a 2D matrix
of material parameter values into an image representation. Are they boxes,
trapezoids (bilinear interpolation), 2D gaussians, spheres? Each technique has
some drawbacks, but for getting down to millimeter precision in scans it
matters.
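
A minimal sketch of how two of those representations disagree at the same point. It assumes values sit at integer coordinates, with `grid[row][col]` located at `(x=col, y=row)`; the grid contents and function names are made up for illustration.

```python
import math


def sample_box(grid, x, y):
    """Box model: each value fills the unit square centered on its coordinate."""
    row = min(max(math.floor(y + 0.5), 0), len(grid) - 1)
    col = min(max(math.floor(x + 0.5), 0), len(grid[0]) - 1)
    return grid[row][col]


def sample_bilinear(grid, x, y):
    """Bilinear model: values are point samples, blended between neighbours.

    Assumes the grid is at least 2x2.
    """
    x0 = min(max(math.floor(x), 0), len(grid[0]) - 2)
    y0 = min(max(math.floor(y), 0), len(grid) - 2)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x0 + 1] * fx
    bottom = grid[y0 + 1][x0] * (1 - fx) + grid[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


# Hypothetical attenuation coefficients on a 2x2 grid.
grid = [[0.0, 1.0],
        [1.0, 2.0]]
print(sample_box(grid, 0.25, 0.25), sample_bilinear(grid, 0.25, 0.25))
```

At `(0.25, 0.25)` the box model reports 0.0 and the bilinear model reports 0.5. Integrated along a ray, differences like that add up.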

------
sctb
Some previous discussions:

[https://news.ycombinator.com/item?id=8614159](https://news.ycombinator.com/item?id=8614159)

[https://news.ycombinator.com/item?id=1472175](https://news.ycombinator.com/item?id=1472175)

------
want2know
The same is true for digital audio.

We are so used to seeing the visual presentation of samples that looks like a bar
diagram, that a lot of people think analog sounds better because the curves
are smoother.

Chris Montgomery has a great talk about this.

~~~
nothis
>We are so used to see the visual presentation of samples that look like a bar
diagram, that a lot of people think analog sounds better because the curves
are smoother.

Except for a philosophical debate about continuity, isn't that true?

~~~
arundelo
[http://productionadvice.co.uk/no-stair-steps-in-digital-audi...](http://productionadvice.co.uk/no-stair-steps-in-digital-audio/)

 _The “stair-steps” you see in your DAW when you zoom up on a digital
waveform_ only exist inside the computer. _[...] When digital audio is played
back in the Real World, the reconstruction filter doesn’t reproduce those
stair-steps – and the audio becomes truly analogue again._
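
A small sketch of why there are no stair-steps, using ideal Whittaker-Shannon (sinc) interpolation on a sampled sine. Real converters only approximate this filter; the frequency, sample count, and function names here are arbitrary choices for illustration.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, t):
    """Ideal reconstruction at time t, in sample periods; samples[n] is at t = n."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))


FREQ = 0.1  # cycles per sample, well below the Nyquist limit of 0.5
samples = [math.sin(2 * math.pi * FREQ * n) for n in range(401)]

t = 200.5  # halfway between two samples, far from the ends of the buffer
print(reconstruct(samples, t), math.sin(2 * math.pi * FREQ * t))
```

The reconstructed value between two samples lands on the original sine (up to a small error from using a finite buffer), not on either neighbouring sample value.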

~~~
tzakrajs
Yeah, it's strange people are thinking the display of a spectrum analyzer is
somehow 1:1 with the underlying thing which they are measuring. As if a
digital clock with only the hour and minutes displayed implies that seconds
don't exist.

------
shawnz
A great video which touches on this idea, but mostly in the context of audio:
[https://youtu.be/cIQ9IXSUzuM](https://youtu.be/cIQ9IXSUzuM) (relevant part
around 8 minutes)

~~~
raphlinus
Be careful generalizing the audio results to pixels. The central lesson of
xiph's work is that people simply cannot hear frequencies above, let's say
20kHz. Therefore, as long as your sampling rate is above the Nyquist limit
(and under the assumption the signal chain is linear), any reconstruction
filter that passes frequencies through 20kHz is effectively "perfect."

There are two ways this is not true for pixels. First, even for "retina"
displays the human visual system can make out spatial frequencies beyond the
Nyquist limit of the display (this will vary by viewing distance, so is more
of an issue for young people who can get close to their displays). Second,
even assuming perfect gamma, the display must clip at black and white because
of physical device limitations. Thus, especially for text rendering, only a
reconstruction filter with nonnegative weights is generally useful. Such a
reconstruction filter would be an extremely poor choice for audio.
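
A minimal sketch of the clipping problem, using the standard Catmull-Rom cubic as an example of a kernel with negative lobes. The edge data and function names are hypothetical.

```python
import math


def catmull_rom(x):
    """Catmull-Rom cubic kernel; it goes negative for 1 < |x| < 2."""
    x = abs(x)
    if x < 1.0:
        return 1.5 * x ** 3 - 2.5 * x ** 2 + 1.0
    if x < 2.0:
        return -0.5 * x ** 3 + 2.5 * x ** 2 - 4.0 * x + 2.0
    return 0.0


def resample(samples, x):
    """Evaluate the reconstructed signal at x, clamping indices at the ends."""
    base = math.floor(x)
    total = 0.0
    for i in range(base - 1, base + 3):
        clamped = min(max(i, 0), len(samples) - 1)
        total += samples[clamped] * catmull_rom(x - i)
    return total


edge = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # hard black-to-white edge
undershoot = resample(edge, 1.5)
overshoot = resample(edge, 3.5)
print(undershoot, overshoot)
```

The reconstruction dips below 0 on the dark side of the edge and rises above 1 on the bright side. A display has to clip both, which is the ringing a nonnegative kernel avoids.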

It is true that many of the underlying signal processing principles are the
same, and I encourage people to learn and understand those :)

~~~
shawnz
Thanks for the heads up. There is one specific section where he compares
pixels to lollipop graphs and that is mainly what I was referring to, I didn't
mean to suggest that all the principles in the video apply to graphics in the
same way that they apply to audio.

------
pornel
However, an image is not a wave. The sampling theorem applies beautifully to audio,
but only to a limited degree to images. Some filters make sense in the frequency
domain, eyes are sensitive to certain frequencies more than others, but it all
breaks down on hard edges, which don't behave like square waves.

The problem is that in images the Gibbs effect is visible and annoying
(ringing artifact). If the sampling theorem applied, people wouldn't be able to
see it, like they can't hear the difference between square waves shifted by a
half of a sample.

~~~
nullc
This is mostly only true because our display technology lacks the resolution
to reliably saturate the visual system in the same way that our audio
technology does.

------
jmull
Well, it's an overloaded term (like pretty much every term in software
development).

But there's good information in this article. In many contexts you should be
thinking in terms of point-samples, not squares or other areas.

I think it's not such a great idea to try to simplify the definition of pixel
in this article because it distracts from the useful info.

------
perrygeo
> [title repeats 3 times]

> rid the world of the misconception that a pixel is a little geometric
> square.

> The little square model is simply incorrect. It harms.

> I show why it is wrong in general.

Why the bluster?

He makes a decent case for this point in the domain of graphics processing;
what he calls "correct image (sprite) computing".

But the narrow focus on computer graphics undermines these broad
generalizations. There are many other domains that can be represented in
pixel-based data models. Climate, terrain, population and land cover mapping
are just a few domains where the use of a pixel as a "little geometric square"
is a perfectly viable approach.

Ultimately, if the message is "think about how your data model maps to
reality" - I agree. But why the hyperbole? Why shit on an entire model
because it doesn't fit for your very specific use case?

------
kazinator
Pixel and voxel are not commensurable. A pixel is a display hardware concept.
A voxel is more comparable to a texel.

[https://en.wikipedia.org/wiki/Texel_(graphics)](https://en.wikipedia.org/wiki/Texel_\(graphics\))

Texels and Voxels _can be_ square/cubic, such as in video game applications
where it is accepted and exploited as a fundamental esthetic: worlds are
textured with tiled mosaics which reveal their square unit when approached
closely, and ditto for worlds made of voxels.

A voxel as a sample of a solid, for instance from a computed tomography, where
the fidelity of the reconstruction matters, is subject to different
requirements.

~~~
jacobolus
Both the word “pixel” and the word “voxel” are commonly used in both of these
different senses. I believe the word “voxel” was created to be explicitly a 3D
analog of a 2D “pixel”.

For example, you can also have “pixel art”, drawings made up of little
squares, which only loosely have to do with pixels as samples for a raster
image or physical camera detector elements or physical display elements.
[https://en.wikipedia.org/wiki/Pixel_art](https://en.wikipedia.org/wiki/Pixel_art)

------
pjs_
A pixel is basically a little square dude, come on.

~~~
BubRoss
No, your pixels are displayed using little squares (sort of).

------
ctdonath
I remember the fanfare & hoopla around the introduction of the IBM PS/2
featuring square pixels. Made computing shapes much easier.

------
stretchwithme
Of course not. It's a little teacup.

------
jcelerier
> A Pixel Is Not a Little Square! Microsoft Tech Memo 6

Is this the reason they forced this horrible bilinear filter on the
Windows image viewer for so long (guess they still do)? It drove me crazy,
it's so ugly

~~~
BubRoss
No, this was written by Alvy Ray Smith 25 years ago in response to incorrect
filtering in computer graphics in general. This has nothing to do with
whatever you are seeing.

