It uses examples of how pixels on scanners, printers, and CRT monitors are not squares. Fair enough. Back when this was written, one could very reasonably argue that pixels were intended to be point samples with a convolution.
But modern pixels on most LCD screens certainly are exceedingly crisp squares (with subpixels), and processing RAW camera sensor data, antialiasing techniques, and image resampling all seem to "mostly" assume that pixels represent the average color over a square rather than a point sample. Not perfectly, but as a reasonable first-order approximation.
I understand that 25 years ago it might have been reasonable to argue that pixels aren't squares.
But today... aren't they, at least as a first-order approximation? (With some further adjustments to avoid antialiasing artifacts, desired sharpness, and so forth.)
The idea that pixels are primarily point samples seems far more misleading -- as if they were analogous to audio samples, which are point samples as a first-order approximation; but pixels are nothing like audio samples.
No, LCDs are not crisp squares. Pretty much every screen type is still RGB in some kind of array; RGB stripe or Pentile are most common. You can also think about things like VR or a projector, where your screen is transformed through a lens.
Everyone has to constantly compensate for this fundamental thing in computer graphics to this day.
Edit: As other things come to mind with how relevant this paper still is...
>antialiasing techniques, and image resampling seems to "mostly" assume that pixels do represent the average color over a square rather than a point sample. Not perfectly, but as a reasonable first-order approximation.
Not in the least. For example, in texturing (wrapping images onto meshes), mip maps and bi/trilinear filtering are very common. Assuming average color leads to a jagged (if you mean the average over one pixel) or blobby (if you mean the average over several) mess. Mip maps are essentially point samples of a texture taken at different sample resolutions that are then resampled at runtime to form the best approximation. Essentially no one uses "average color."
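To make that concrete, here's a minimal 1-D sketch (hypothetical helper names, not any real API) of the idea: mip levels are built once by repeated 2:1 reduction, and the runtime sample is a filtered blend between levels chosen from the screen-pixel footprint, not an average over a square at full resolution.

```python
import math

def build_mips(texels):
    """Build a 1-D mip chain; each level halves resolution.
    (A simple 2:1 box reduction here; real tools may use better filters.)"""
    mips = [list(texels)]
    while len(mips[-1]) > 1:
        prev = mips[-1]
        mips.append([(prev[2 * i] + prev[2 * i + 1]) / 2
                     for i in range(len(prev) // 2)])
    return mips

def sample_level(tex, u):
    """Linearly filter one level at texture coordinate u in [0, 1)."""
    x = u * len(tex) - 0.5                      # texel centers at integers
    i = math.floor(x)
    f = x - i
    a = tex[max(0, min(i, len(tex) - 1))]
    b = tex[max(0, min(i + 1, len(tex) - 1))]
    return a + (b - a) * f

def sample_trilinear(mips, u, footprint):
    """Pick a level pair from the screen-pixel footprint (in texels)
    and blend between them -- a 1-D analogue of trilinear filtering."""
    lod = max(0.0, math.log2(max(footprint, 1e-6)))
    lo = min(int(lod), len(mips) - 1)
    hi = min(lo + 1, len(mips) - 1)
    frac = lod - int(lod) if hi != lo else 0.0
    return (1 - frac) * sample_level(mips[lo], u) + frac * sample_level(mips[hi], u)

mips = build_mips([0, 100, 0, 100, 0, 100, 0, 100])
# A footprint covering the whole texture resolves to its overall average:
sample_trilinear(mips, 0.5, 8)   # -> 50.0
```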
I really doubt it though. I can't imagine a world where we chose to waste massive amounts of resolution in the normal case just so we don't need to sample as intelligently as we do today.
Even when upsampling, we do better than a simple square-pixel average, so if we're always upsampling we'll probably always need to be smart.
It is already happening, see e. g. -. And the more common HiDPI displays become, the worse support for low-res (say 96 DPI) displays will get. It is not uncommon to encounter the opinion that the tricks used for high-quality rendering on non-HiDPI displays are legacy cruft, but it is even more common to simply treat rendering issues that are not visible on HiDPI as low priority.
It is the same old mantra: programmer time is expensive, hardware is cheap, so let's not spend time creating efficient software and instead focus on making development fast and easy.
In image formats and image processing you can define pixel to be whatever you like. In many cases it's useful to define/interpret it as an infinitesimal point.
In other algorithms, it's not useful. For example, pixels of textures are filtered (mipmaps, AF, MSAA), because an actual ideal point sample of a texel being read as an ideal point sample of a pixel on the screen has awful aliasing artifacts.
Pixel samples are not the same as audio samples. A screen isn't a 2D wave. There's no Gibbs phenomenon for pixels on a screen. Treating samples as a wave works for resampling to a large degree, but it breaks down in a few places, like crisp edges (see font hinting, pixel art), and makes no sense for the alpha channel (wave-based sharpening filters create opaque halos).
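For illustration, here's a small Python sketch of that crisp-edge breakdown, using a Lanczos-windowed sinc (a typical "treat it as a wave" resampling kernel; weights are not renormalized in this sketch): sampled near a hard edge, the reconstruction undershoots below the signal's range, which is exactly the halo problem when the channel is alpha.

```python
import math

def lanczos(t, a=3):
    """Lanczos-windowed sinc, a typical wave-style resampling kernel."""
    if t == 0:
        return 1.0
    if abs(t) >= a:
        return 0.0
    return a * math.sin(math.pi * t) * math.sin(math.pi * t / a) / (math.pi * t) ** 2

def resample(samples, x, a=3):
    """Evaluate a continuous reconstruction of `samples` at position x.
    (Edge samples are simply dropped; weights are not renormalized.)"""
    first = math.floor(x) - a + 1
    return sum(samples[i] * lanczos(x - i, a)
               for i in range(first, first + 2 * a)
               if 0 <= i < len(samples))

edge = [0.0] * 8 + [1.0] * 8      # a crisp edge between samples 7 and 8
resample(edge, 6.5)               # < 0: the filter undershoots (ringing);
                                  # on an alpha channel this is a visible halo
```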
And when a pixel is connected to a physical device, it's clearly not a point. Camera sensors' pixels aren't points. LCD pixels aren't points. Printed pixels aren't points. Our eyes actually see the difference: pixel-grid-aligned lines are clearly sharper than the same lines shifted by half a pixel. If the sampling theorem applied to pixels, it would make no difference.
Alpha channels are generally problematic, since if alpha is small, big changes in the RGB channels result in small changes in the actual color. You can't linearly interpolate in this color space, just like you can't in a polar coordinate space like HSL.
Pre-multiplied alpha should fix most of these problems (though you still need to be careful with fancy filters, so as not to end up with values outside the valid range).
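A tiny Python illustration of the failure mode and the fix (assuming linear, non-gamma values; the names are made up for the example):

```python
def lerp(c0, c1, t):
    """Componentwise linear interpolation of two color tuples."""
    return tuple(a + (b - a) * t for a, b in zip(c0, c1))

opaque_red  = (1.0, 0.0, 0.0, 1.0)   # (r, g, b, a), straight alpha
clear_green = (0.0, 1.0, 0.0, 0.0)   # fully transparent, but rgb = green

# Straight alpha: the invisible green leaks into the result.
mid_straight = lerp(opaque_red, clear_green, 0.5)
# -> (0.5, 0.5, 0.0, 0.5): a half-transparent yellow, not red

def premultiply(c):
    """Scale the color channels by coverage before filtering."""
    r, g, b, a = c
    return (r * a, g * a, b * a, a)

# Pre-multiplied alpha: the transparent pixel contributes nothing.
mid_premul = lerp(premultiply(opaque_red), premultiply(clear_green), 0.5)
# -> (0.5, 0.0, 0.0, 0.5): half-covered red, as expected
```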
For instance, this is the Nexus One screen under a microscope:
Each of those lights is a subpixel.
The bottom half of this image shows some other common LCD pixel layouts:
As you can see, subpixels can be square but usually aren't, and pixels can't be square because they're made of subpixels.
Out of curiosity, I took a picture of my monitor, and this is what it looks like:
These look very different from the microscope pictures because the light sources are "glowing", which I think is probably more accurate to how we actually perceive pixels. But as you can see, "glowing" makes them not square, but round!
The fact that it is made of three rectangular subpixels in no way means "pixels can't be square".
And the "glowing" effect in your picture is an artifact of light bleed from your camera. The only time we perceive pixels as round rather than square is if we have eyesight problems that blur the image projected onto our retina.
That's what I mean. Because of the way subpixels combine into pixels, only some colors can be square.
It still depends on what you’re using them for. An image that will be rendered straight to a webpage? Yeah, it’s probably a fine first-order approximation. But that’s not the only context that gets rendered on a modern screen.
Say you’re using it for 3D scenes. Then you absolutely should pay attention to what the paper says about footprint, sampling, and filtering, because when you render onto the screen, all of the processing steps from pixels on a texture to pixels on the screen require awareness of pixel footprint.
I don't know, I still think the concept would appear in the context of image scaling and aliasing.
I suppose you're right that what I'm describing is just the lack of conscious mental model for what the pixel represents, versus when I'm writing a shader, where I'm actively thinking through the mental model of a pixel.
You posit an opposition between "[arrays of] point samples with a convolution" and "[arrays of] exceedingly crisp squares". This is incorrect. If the kernel you convolve your array of point samples with is an exceedingly crisp square, the result is an array of exceedingly crisp squares. Convolution with a square is just a special case. Smith explains this, perhaps not very clearly, in Fig. 3, p. 5.
Conceptualizing this as a convolution allows you not only to handle the special case you're saying is "a reasonable first-order approximation," but also to analyze what effects that display or sensor convolution is having on your image, and possibly to correct them. Moreover, it can handle some other kinds of image degradation as well that go beyond that first-order approximation, such as bokeh, diffraction, defocus, and nonuniformity within pixels, as long as it's the same for every pixel.
That's how you find out what those "further adjustments to avoid antialiasing artifacts, desired sharpness, and so forth" are.
And it's how you can go about analyzing what happens to the image when you, for example, resample to a different sampling grid, as Smith explains on p. 4. If you analyze different linear resampling algorithms using the theory of convolutions, you can predict precisely how good the results will be; if you do not use this theory, you will be puzzled and will not understand your results at all. Believe me, I know; I wandered in that desert for years. But you don't have to. Learn from my mistakes.
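As a tiny concrete instance of this framing, here is a Python sketch (illustrative names only): reconstruction is point samples convolved with a kernel, and "little squares" falls out as nothing more than the box-kernel special case.

```python
def reconstruct(samples, kernel, x):
    """Continuous reconstruction at position x: point samples (at integer
    positions) convolved with a reconstruction kernel."""
    return sum(s * kernel(x - i) for i, s in enumerate(samples))

def box(t):
    """Unit-width box kernel: reconstruction is exactly 'little squares'."""
    return 1.0 if -0.5 <= t < 0.5 else 0.0

def tent(t):
    """Triangle kernel: reconstruction is linear interpolation."""
    return max(0.0, 1.0 - abs(t))

samples = [0.0, 1.0, 0.0]
reconstruct(samples, box, 0.9)    # -> 1.0  (anywhere inside sample 1's "square")
reconstruct(samples, tent, 0.5)   # -> 0.5  (halfway between samples 0 and 1)
```

Swapping the kernel swaps the display/resampling model; the analysis machinery stays the same.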
Second, audio samples are frequently handled exactly the same way as pixels in this sense, using a "zero-order hold". If you oversample by taking 64 consecutive samples from your ADC and summing them, or set the PWM duty cycle on your Arduino to a new value every 125 μs and leave it there until the next sample, or shift a new sample into the shift register connected to your R-2R DAC and leave it there until the next sample, you're doing exactly the same thing as trying to draw a little square for each pixel—just in one temporal dimension instead of two spatial dimensions. A zero-order hold is a perfectly reasonable thing to do with audio samples for both input and output, although it's not the only option, and it can produce audible artifacts. The math for analyzing this—and correcting it, if that's what you want—is exactly the same as the analogous math for displays or camera sensors made out of little squares.
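The zero-order hold is trivial to write down, which is part of the point; a minimal Python sketch of it next to a first-order (linear) hold:

```python
def zero_order_hold(samples, factor):
    """Hold each sample for `factor` output samples -- the 1-D (temporal)
    analogue of drawing each pixel as a little square."""
    return [s for s in samples for _ in range(factor)]

def first_order_hold(samples, factor):
    """Linearly interpolate between consecutive samples instead."""
    out = [a + (b - a) * k / factor
           for a, b in zip(samples, samples[1:])
           for k in range(factor)]
    return out + [samples[-1]]

zero_order_hold([0, 1, 0], 4)
# -> [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]   (a staircase in time)
```

The staircase is the "little squares" of audio; the same convolution math predicts its artifacts in both domains.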
Far from being "far more misleading", the analogy between pixels and audio samples is deeply insightful and unlocks a wealth of mathematical tools that can be applied in both domains, as well as, for example, to linear control theory and GPS position estimation. The results this analogy leads you to are profound truths, not falsehoods as you seem to think. And some of them are very beautiful.
None of this has changed in 25 years (except the detail that now we have LCDs with little rectangles on them to complement the little squares of dye-sublimation printers that Smith mentions) and little of it has changed in 50 years.
You call convolution with a square a special case, I'm calling it a reasonable first-order approximation for modern displays.
And using a zero-order hold on audio samples isn't something most people frequently need to conceptualize. In fact, it's responsible for many people's great misconception that sampling rates above ~40 kHz result in noticeably higher-fidelity audio, because they think the DAC's output will be smoother. It's clear to me that you know better, but most people don't.
I'm in full agreement with you on all the math involved. It's simply that going around saying "pixels are point samples not squares" seems like it's going to be more misleading than helpful to 99% of people.
If someone wants to understand the basics of antialiasing, or reducing image resolution with bilinear interpolation, the conceptualization of square pixels works perfectly. More advanced antialiasing techniques are ultimately just subtle adjustments to that in terms of measurable output differences, which is why I stand by calling square pixels a reasonable "first-order approximation".
> If someone wants to understand the basics of antialiasing, or reducing image resolution with bilinear interpolation, the conceptualization of square pixels works perfectly
As Smith explains in the paper and I explained in the comment above, it's easy to get extremely conspicuous artifacts if you design a downsampling algorithm with the little-squares model, although if you're increasing image resolution you can frequently get away with it.
If you don't understand the formalization of the imaging problem that Smith is laying out—and it's clear that you don't—you aren't in a position to opine on how useful that formalization is. Applying that formalization will show you whether it's useful or not, and you can't try that until you understand it. My opinion from my experience is that I wasted a lot of time on the kind of seat-of-the-pants approach you're advocating, and now that I understand the rudiments of sampling theory, many problems that were intractable before have become easy. Learn from my mistakes instead of urging other people to repeat them.
> You call convolution with a square a special case, I'm calling it a reasonable first-order approximation for modern displays.
You are positing another opposition that simply doesn't exist. Perhaps you don't know what the phrase "a special case" means? A mathematical operation can simultaneously be a special case of a more general mathematical operation and useful for something, for example approximating a given display. And that is the case here. Being "a special case" of a more general case just means that everything that is true of the more general case is also true of the special case. It's not a way of deprecating the special case. It's a way of pointing you toward a deeper understanding of it.
(Again, though, it's a zero-order approximation. First-order approximations are bilinear sampling and similar things. I think you have a lot of work to do before you can be in full agreement with me on all the math involved.)
Pixels are always interpolated, just like audio.
The fact that 1080 image pixels exactly fit 1080 screen pixels doesn't give pixels a shape. And by the way, most modern display pixels are not squares. They come in all kinds of shapes -- triangles, for example.
Then you are ignorant about the history of pixel art.
Which I am not judging btw. Just saying.
Pixel art from the golden years doesn't look right when displayed as little squares. It needs to be shown on a CRT or with an overlay emulating this look to do its creator's intentions justice.
"The Bitmap Brothers: Universe" (fantastic book btw.) went to great lengths to replicate the original rendering of the depicted pixel art. See e.g. .
Look at other Nintendo pixel box art from the 1980s. Many of the earliest Nintendo games had this common box art style. It's all little squares:
I have an NES classic edition and it lets you select if you want the CRT effect applied or not. Sometimes I play with this effect applied (how I experienced it as a child) but other times I enjoy the "pixel perfect" mode.
I designed pixel fonts back in the 1990s and I was always unhappy about how CRTs fuzzed them. I loved the sharp little squares of LCDs much better.
It's true that if you were playing Super Mario on your TV, the normal experience was for it to be a bit blurry.
On the other hand, if you were playing Lemmings on an Apple monitor with Sony Trinitron technology, the pixels were exceptionally clear. Here's what Trinitron looked like:
I'd argue that playing Lemmings in an emulator, you want the pixels to be clear, without a CRT overlay -- that this would be closer to the artists' intentions.
You also see the graphic design for icons and graphics being done on square-gridded paper for many early games and graphical workstations.
Later graphics cards in the 1990s would sometimes emulate CGA 320×200 and mode 13h by using more scan lines, giving pixels that looked more like little squares. Or, well, rectangles.
Now we're veering back into technical details of its implementation. The question is: is "squares" (well, yes, rectangles) a useful model of what is visible to the viewer? To that end, the fact that each line was traced twice on the display is trivial. We have to concern ourselves with the visible result to answer that question.
I have a decent CRT VGA monitor here, connected to a VGA card in an old PC, and at lower resolutions the pixels are crisp and hard. The monitors are made for reading at a higher resolution, after all. If I press my face against the screen I can see that yes, the pixels appear on a much more fine-grained grid of apertures, and yes, this and other factors somewhat fuzz their boundaries.
But for normal use at an arm's length, rectangles are a very good approximate model for some display types that were in use while VGA was still a thing games used, one that a skilled artist can exploit in an entirely different way than they'd exploit the characteristics of NTSC/PAL displays or an RGB display with a coarser dot pitch.
You do have big flat pixels when you have a high-quality but low-physical-DPI output device. Say, a black-and-white Macintosh, or late-'90s computer monitors displaying '80s screen resolutions, or a panel of individual LEDs. On the other hand, halftones are also sharp and have low DPI, but have no square grid at all, nor do they correspond to the source data used in printing in any straightforward way.
To conclude, big flat rectangles are just a specific artistic choice, they are not universal at all.
It was depicted as made of blocks on most versions of the cover art:
And many Mario games were released for portable systems with sharp LCD screens, and they used the same pixel art techniques.
idk I still play on a 1990s CRT and it's pretty blocky
You have to know some things about your pixel arrangements in order to properly anti-alias, font render, or display camera images, otherwise the result can be pretty ugly.
The point is that that kind of information is not encoded in the buffer. It's just a point sample of the output of the previous stage in the pipeline.
(Granted, this is specifically at high frequencies: at low enough frequencies, the sample might as well be the waveform, within whatever constraints the word length puts on you.)
But I'm not sure what your complaint is, really. Sampling is a concept by itself.
> When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean—neither more nor less." "The question is," said Alice, "whether you can make words mean so many different things." "The question is," said Humpty Dumpty, "which is to be master—that's all."
The fact is that pixels don't have a well defined meaning. Some people use them as squares, some people use them as samples. Then you get into subpixel font rendering, where you quite explicitly acknowledge the screen as master, and you design your image for a desired result.
So yeah it's an ecosystem of people doing different things with different interpretations, and trying to mostly play nice with each other.
- pixels are point samples in 2D space
- their position is exact
- position of the top-left point is (0,0)
- position of the bottom-right point is (cols-1,rows-1)
This way all the math works (subsampling, affine or perspective warps, lens distortion, or even warping between an image and any neural network layer). Failure to do that will cause subtle issues that degrade your performance. These become quite important when working with pixel-accurate methods (3D object tracking, object detection at tiny resolutions, and 1-1 mapping between OpenGL and a neural network). So I have to agree: pixels are not tiny squares, they are dots in 2D space.
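A small Python sketch of the mapping that convention implies when resampling (what some frameworks call align_corners; the helper name here is made up): end samples map exactly onto end samples, and every destination pixel is an exact point in the source.

```python
def dst_to_src(x_dst, dst_cols, src_cols):
    """Map a destination pixel index to a source coordinate under the
    'pixels are points, corner samples align' convention above."""
    return x_dst * (src_cols - 1) / (dst_cols - 1)

# Upsampling a 3-pixel row to 5 pixels: the end points map exactly onto
# the end points, and everything in between is an exact point location.
[dst_to_src(x, 5, 3) for x in range(5)]
# -> [0.0, 0.5, 1.0, 1.5, 2.0]
```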
I often wondered how you would reproduce it digitally but apparently I was thinking about it wrong.
Halftone is the term.