
A Pixel Is Not A Little Square, A Pixel Is Not A Little Square, A Pixel Is Not A Little Square! (And a Voxel is Not a Little Cube) [1]

[1] http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf




This is a classic, and a good point, and comes up on HN every time we talk about voxels.

But I think it's worth considering that:

- "Pixel" is an overloaded term. "Pixel" is a little square, sometimes, because common law - it's being used that way.

- This is an article about art, aimed at artists, and if an artist intends to create a blocky image in two or three dimensions, then what they're doing is "correct" regardless of the terminology they use.

- The pictures are really pretty, IMO. I realize some people don't like the block look, but I'm not alone when I say I love it and it's inspiring. The explanations in the article were nicely done.

- The article does cover sampling. Just not filtering.

- When Alvy Ray wrote his paper, the world was made of blurry CRTs. Now we have really crisp digital LCDs everywhere, and more now than then, pixels on the screen are little squares.

- "pixels" even in the days of CRTs have always meant little squares to a lot of people. Video game art benefitted from a little blur, but tons of artists and programmers have always thought of pixels as squares, and for the most part it's a working analogy.

This is why it's okay to call a pixel a little square, and overly rigid to claim it's not one. (And I think Alvy Ray was being funny and making an important point about math and graphics, for a SIGGRAPH audience, not being pedantic about what words artists are using.)


I would extend this, and say that all terms are overloaded - so before taking issue with a definition it's always worth considering whether you're arguing that one definition is Wrong and the other is Right (which is reductionist and silly), or arguing that one definition is more useful for the context under discussion (which is often useful).


The fact that pixels are point samples and have no area is something I feel an inexplicable psychological aversion to, despite it making sense. However, Monty's "Digital Show and Tell" video[1] helps me feel more comfortable about it. His explanation of discrete audio samples being point samples[2] (and thus best drawn as such, rather than as a continuous stair step, which I find akin to pixels) was invaluable. I generally knew everything he was talking about, but he was able to explain and present it such that I felt like I understood it more deeply than I ever had before. It's definitely worth a watch.

[1]: https://xiph.org/video/vid2.shtml

[2]: https://youtu.be/cIQ9IXSUzuM?t=5m59s


This mantra was also the reason why MagicaVoxel, the "voxel" editor mentioned in the article, was received relatively poorly on HN about a year ago: https://news.ycombinator.com/item?id=10953918

Yes, so technically we should probably call the modern rendering technique of pixels and voxels block-based rendering. I would say though it's fair game to call the underlying data "pixels" and "voxels" - in most retro-style games these assets are indeed stored internally at that level of abstraction. The issue now becomes how to render that data on the screen, and blocks are still a way to make this data look good.

While the block is not a pixel or voxel in the strictest sense, I do think it's a reasonable linguistic shortcut to use that word in this context, and it's not unprecedented to use the description of an abstract thing interchangeably for its more concrete representation.

In my opinion, there is nothing to freak out about, and I don't see why this article specifically prompts such a needlessly absolutist response.


I personally find "block-based" rendering of 8-bit era pixel art quite ugly. There's way too much high-frequency content – too many edges – making it very visually busy.

As others have said, 8-bit art was made in a context where the "pixels" were viewed through an analogue filter, that being an NTSC display. My understanding is that to a first approximation, such a display (i.e. the electron gun + shadow/aperture mask) acts as a spatial lowpass filter with a resolution of 640x480 pixels, with maybe a 1-pixel-wide horizontal "smearing" effect due to the horizontal movement of the gun. Of course various console + display combinations deviate wildly from this ideal, but the lowpass filtering is the basis for understanding how 8-bit art was expected to be viewed.
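
To make that concrete, here is a rough sketch of previewing a sprite through such a filter. This is not an accurate NTSC/CRT model; the crt_preview() helper, the Gaussian stand-in, and the sigma values are all assumptions chosen only to illustrate the lowpass-plus-horizontal-smear idea:

    # Rough sketch only: approximate a CRT/NTSC view of pixel art with a blur.
    # The Gaussian kernel and the sigma values are arbitrary stand-ins.
    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def crt_preview(sprite, scale=8, v_sigma=1.0, h_sigma=2.0):
        # Blow the sprite up (nearest neighbour), then soften it more
        # horizontally than vertically to mimic the gun's horizontal smear.
        big = zoom(sprite.astype(float), scale, order=0)
        return gaussian_filter(big, sigma=(v_sigma, h_sigma))

    sprite = np.eye(8)                   # toy 8x8 "sprite": a diagonal line
    print(crt_preview(sprite).shape)     # (64, 64)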


There is no question about the heritage of 8bit-style art, and whether you find modern renditions ugly or not is not really a concern here. But I would like to point out, as someone who grew up in the eighties, that people referred to those little atomic picture cells rendered on the display as "pixels", not as "my monitor's representation of a pixel".

It's also worth mentioning that for most of us today, graphics tastes have changed quite a lot, so emulating the original blur and smear of an NTSC display is perceived as far less of a generally acceptable aesthetic solution than rendering rectangular blocks is. Of course, opinions differ, but you can totally vote with your wallet on that.

You're talking about how art is designed with an expectation about the manner in which it is later consumed. To the degree this is even a realistic expectation, you would have to make the same concession to today's pixel artists who are creating works loosely based on an 8bit heritage, but which are often clearly and intentionally distanced from that heritage. You might personally object to viewing original games from the 8bit era on modern systems at all on the grounds of authenticity and distortion of artistic intent, but that's not an objection you can raise against modern pixel artists and their working premises.

Overall, I object to this notion that, say, the average 80s CRT pixel representation is somehow the canonical rendering of the ur-pixel and that we should strive to somehow emulate that. I think we should instead do whatever works for the modern context. You might counter that blocks don't meet this goal for your tastes, or that we should stop using recognizably blocky pixels altogether, but again that's a personal preference.

The actual argument above was not about personal preference at all, it was about what can rightfully be referred to as a pixel and what can't (and I do maintain that in this context linguistic pedantry does not yield any meaningful benefits).


> I object to this notion that, say, the average 80s CRT pixel representation is somehow the _canonical_ rendering of the ur-pixel

You'd have to imagine it's closer to the original artist's end goals though.

The guy who drew Mario probably didn't intend for us to see a blocky representation of an Italian plumber, they just wanted us to see a plumber.

OTOH, Seurat was consciously making a point that you can represent a person with nothing but colored dots:

https://en.wikipedia.org/wiki/Pointillism#/media/File:Seurat...

> people referred to those little atomic picture cells rendered on the display as "_pixels_"

'These aren't the pixels you're looking for'. As the article points out, the pixels on a CRT do not have a 1-to-1 correspondence with the pixels in RAM. If you understand a pixel as a sample, then the mental picture becomes clearer:

RAM pixels -> video card DAC converts to continuous signal -> CRT resamples again into monitor pixels -> your eye/brain reconverts into a continuous image
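
If it helps, here is a 1D sketch of that "samples -> continuous signal" step, using ideal sinc reconstruction (real DACs and CRT beams use messier kernels; the reconstruct() helper below is just the textbook case, not what any particular hardware does):

    # Ideal reconstruction of unit-spaced point samples into a continuous signal.
    import numpy as np

    def reconstruct(samples, t):
        # Each sample contributes a sinc centred on its own grid position.
        n = np.arange(len(samples))
        return np.array([np.sum(samples * np.sinc(ti - n)) for ti in t])

    samples = np.array([0., 0., 1., 0., 0.])   # one bright "pixel"
    t = np.linspace(0, 4, 41)                  # evaluate between sample positions
    curve = reconstruct(samples, t)
    print(round(curve[20], 3))   # at t = 2.0, the sample position: 1.0
    print(round(curve[25], 3))   # halfway to the next sample: ~0.64, a smooth rolloff, not a hard edge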


I answered that in my comment above, but it was an edit so you probably didn't see it in time.

Expanding on that answer, I think you're making a case for DRM here. Mario wasn't supposed to be not-blurry so we shouldn't be allowed to see him like that. If that's your opinion, I disagree, but fair enough. It's an entirely different thing though to assert that modern artists should not be making new pixel/block art, because it seems to me that their intent has value, too.

So the question becomes whether you object to that art style, or just the commonly used name for that style. Personally, I think you're going to have a hard time with either one, especially given the fact that the common usage of the word "pixel" has always been blurry, and that you could simply avoid modern "pixel" games at no personal cost and still allow other people to make and enjoy them.


To me, that doesn't seem so much like support for DRM as support for something like the negation of "death of the author".

I think there is room to say both "this was the authorial/creatorial intent behind Mario's sprite, which should be acknowledged as being distinguished by being the intent" and to say "I aesthetically appreciate this deviation from the creator's intent, more so than the original intent."

The deviation in interpretation from the original intent is, I think, a new (derivative) creation, in a sense. (Though it may be an accident, and might not have an intent to create behind it.)


I think there's plenty of modern pixel/block art that's awesome, http://www.nuklearpower.com/2001/03/05/episode-002-why-is-he... as an early example.

But we should appreciate modern pixel artwork as a cool retro-themed anachronism, not an actual representation of what displays rendered in the past. The idea that a pixel is only a 2D square gets in the way of that -- ideally, people could understand that it refers to both a point sample as well as a box-filtered rendering of that sample, and those two senses imply different ways to view a set of pixels.


"There is no question about the heritage of 8bit-style art, and whether you find modern renditions ugly or not is not really a concern here."

That is exactly – and only – my concern. In the remainder of your post you are arguing against a strawman.


As someone pointed out below, pixel art has been made for LCD displays for a long time, starting with all the handheld gaming consoles and including all the pixel art made in the last 15 or so years since LCD displays became predominant. CRT computer monitors also have much sharper pixels than TVs. At this point there has probably been more pixel art made on and for displays with sharp pixels than for crappy TVs. It's time to let go.


Where did I mention modern pixel art? My point is that modern renderings of 8-bit-era (i.e. 1980s) art are anachronistic and violate the artists' intent. That art was mastered for consumption through NTSC and looks ugly if you try to reinterpret it through the modern pop idea of "pixel art". It's like taking "Gone With the Wind" and watching it with frame-rate interpolation on a handheld device. It'll look like crap because you're changing the medium.

I don't care what modern artists do. You say "let go" like I'm taking a stance on nouveau pixel art; I'm not.


None of the European or Japanese art was made through NTSC.


You're right about Europe, but I believe Japan used NTSC. Wikipedia agrees:

"NTSC...is the analog television system that was used in most of the Americas (except Brazil, Argentina, Paraguay, Uruguay, and French Guiana); Burma; South Korea; Taiwan; Japan; the Philippines; and some Pacific island nations and territories"

https://en.wikipedia.org/wiki/NTSC


They use NTSC-J, which is slightly different. I remember it as being different enough to be incompatible with US NTSC, but it doesn't look like it.

https://en.wikipedia.org/wiki/NTSC#NTSC-J

However, Japan uses both 60 Hz and 50 Hz power, so I wonder what that does to frame rate.


That, and the Game Boy, one of the best-selling consoles ever, had perfectly defined square pixels.


Yep, Game Boy was a different medium. And Nintendo's artists didn't just take the Mario graphics assets from the NES and plop them on a Game Boy; that would have looked like shit for a number of reasons. Yet that is what people do when they take the 8-bit-era Mario sprite and render it with giant squares.

Graphics assets are no more a picture than a recording of an artist's brush strokes is. They can only be interpreted in the context of a medium, be it CRT scanlines vs. LCD squares or oil-on-canvas vs. ink-on-papyrus.


PAL and NTSC-J exhibited the same analogue characteristics.


> block-based rendering.

I just refer to them as "boxels."


I know where you're coming from, but really this perspective mostly applies when you're rasterising some underlying ideal. When art is composed out of the rendered elements directly, we still need a word for the square thing we see on screen. Pixel is the chosen word. This means it has two meanings, despite your reference.


> When art is composed out of the rendered elements directly

I wonder though, would a lot of 'pixel art' actually look better with a different filter instead?


One increasingly-popular point of view is that console-game pixel art was intended to look like it does when sent through an analogue (e.g. composite) video cable to the average CRT television screen; and that artists relied on various "blending"-like effects that were inherent in this process, making the art assets look "off" when displayed crisply on an LCD. See http://www.tested.com/tech/gaming/2982-a-link-to-the-past-ho... for examples.


In the context of GP's article, it's fun reading through the comments on the page:

    I actually prefer the crisp pixels
    of emulated 8 and 16 bit games,
    I have no desire to emulate
    the crappy monitors we used to use
    to play said games. - lordofultima

    I actually enjoy gigantic, crystal clear pixels.
    - rateoforange
You'd have to imagine that actual graphics artists during the 8-bit era viewed low resolution as a limitation to overcome, not a purposeful feature of their artwork.


There's an analogous situation in pop music: it's mostly mastered to sound good coming out of your average car radio or PA system or $5 earbuds. This definitely is a "purposeful feature of the artwork." It might be a regrettable feature—a constraint that the artist wishes they didn't have to work under—but the work is definitely informed all through its creation by this constraint.


I kinda enjoy the anachronism too, but http://d2rormqr1qwzpz.cloudfront.net/uploads/0/5408/30085-ff... from the article really does look great.

All this reminds me of the performance practice of old music.


>You'd have to imagine that actual graphics artists during the 8-bit era viewed low resolution as a limitation to overcome, not a purposeful feature of their artwork.

Well yeah, but it's a feature nonetheless. People aren't doing anything wrong by preferring sharp pixels. It's an aesthetic all to itself.


Interesting question. I think you've identified one of the main issues here: the subjective nature of what "better" means in a medium that is intentionally limiting itself.

Pixel art and voxel art intend to evoke feelings of nostalgia, but with a modern flair. It's important to the work that the art is visibly blocky. It's a completely artificial limitation, applied completely on purpose for the sake of the art.

What does it mean to start filtering today's pixel/voxel art? What would that achieve for the art? If *xel artists wanted to make it look "better" by smoothing out the blockiness, then it opens the door to asking, "why use low resolution blocks at all, if what you want is more realism?".

Personally, I'd guess realism, e.g., better filters or higher resolution or anything else that tries to smooth out the blocks, is antithetical to the goal, which is in part to specifically emphasize the blocks.


This brings up a good reason why voxels often look so blocky and unappealing - 'square' pixels are fairly reasonable approximations to the underlying image, plus your pixel resolution can match the display exactly, which makes rasterization a no-op (and you don't even see the squares).

But once you're in 3 dimensions, triangles do a much better job of approximating arbitrary shapes (also, hardware deals with triangles), plus the voxel dimensions have no relationship to your display. You're essentially picking a very clunky filter that obscures what you're trying to represent, out of some misguided analogy to 2D pixels.

Just to be clear, cubic voxels can be great as a stylistic choice, but are inferior to triangles as a general way to describe 3D scenes.


To be pedantic, a 1px square does NOT properly rasterize as a single pixel. If you apply a low-pass filter (as you should, to avoid aliasing) – even an ideal one – those little squares blur and "ring" across several neighboring pixels.

Rather, it is only an infinitesimally small point (an "impulse" or Dirac delta) which, when grid-aligned and ideally filtered, can rasterize as a single pixel.
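
A quick 1D sanity check of this: low-pass at the pixel Nyquist frequency, then read off the values at integer pixel positions. The helper names below are made up for illustration; the closed forms just use the sine integral:

    # Ideal low-pass (cutoff at the pixel Nyquist rate) of an impulse vs. a
    # one-pixel-wide box, evaluated at integer pixel positions.
    import numpy as np
    from scipy.special import sici          # sici(x) returns (Si(x), Ci(x))

    def lowpassed_impulse(x):
        # Ideal low-pass of a unit impulse at 0 is sinc(x).
        return np.sinc(x)

    def lowpassed_box(x):
        # Ideal low-pass of a unit box on [-0.5, 0.5]: integral of sinc over
        # [x-0.5, x+0.5] = (Si(pi*(x+0.5)) - Si(pi*(x-0.5))) / pi
        return (sici(np.pi * (x + 0.5))[0] - sici(np.pi * (x - 0.5))[0]) / np.pi

    grid = np.arange(-4, 5).astype(float)   # integer pixel positions
    # The impulse comes back as exactly one nonzero pixel:
    print(np.round(lowpassed_impulse(grid), 3))
    # The box comes back blurred and ringing: ~0.87 at the centre, with
    # alternating-sign ripples on the neighbouring pixels.
    print(np.round(lowpassed_box(grid), 3))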


That just shows why you shouldn't be applying filters after rasterizing. If you want to avoid aliasing, you should be supersampling or calculating exact coverage before you reduce everything to the final resolution. In which case the tiny square survives perfectly.


I never said filter after rasterizing. Pixel-sized squares have the same issue if you filter before rasterizing.

The only reason pixel-sized squares survive supersampling intact is that supersampling uses an imperfect low-pass filter (block-based averaging). Here is a good example showing the shortcoming of straight supersampling: https://people.cs.clemson.edu/~tadavis/cs809/aa/aliasing5.pn...

You NEED to apply a true low-pass filter before rasterizing to completely eliminate Moiré patterns. Whether a round of supersampling occurs before this filtering is irrelevant. And pixel-sized squares don't survive low-pass filters intact, see https://en.wikipedia.org/wiki/Gibbs_phenomenon#Signal_proces...


Your goal is to take a pixel-sized square with sharp edges and render it on a screen composed of squares. If it displays wrong, you used the wrong method. You can't blame signal theory.

In the case of your second link, the problem is inappropriately applying a low-pass filter. This makes the edge of the square non-sharp, and adds distortion effects multiple pixels away.

This portion of signal theory only applies to a world made entirely out of frequencies. This causes problems when you try to apply it to realistic shapes. It's great for audio, not so great for rendering. The use of point samples does not automatically imply you should be using it.

If you don't like supersampling, that's fine. But you need to pick an antialiasing method that's compatible with the concept of 'edges'.

Edit:

You added "You NEED to apply a true low-pass filter before rasterizing to completely eliminate Moiré patterns."

I don't think that's right. It should look fine if you use brute force to calculate how much of every texel on every polygon shows up in every pixel. And it should look fine with much more efficient approximations of that. At no point should it be necessary to use filters that can cause ring effects multiple pixels away.
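
For what it's worth, "exact coverage" in the axis-aligned case is just area overlap per pixel cell. A quick sketch (the coverage() helper is hypothetical, purely for illustration):

    # Sketch of exact-coverage rasterisation for an axis-aligned rectangle:
    # each pixel gets the fraction of its 1x1 cell covered by the shape.
    import numpy as np

    def coverage(rect, width, height):
        x0, y0, x1, y1 = rect               # rectangle in pixel coordinates
        img = np.zeros((height, width))
        for py in range(height):
            for px in range(width):
                # Overlap of the cell [px, px+1] x [py, py+1] with the rectangle.
                ox = max(0.0, min(px + 1, x1) - max(px, x0))
                oy = max(0.0, min(py + 1, y1) - max(py, y0))
                img[py, px] = ox * oy
        return img

    # A grid-aligned 1x1 square survives as exactly one full pixel...
    print(coverage((2, 2, 3, 3), 6, 6)[2, 2])               # 1.0
    # ...and a half-pixel-offset one splits into four 0.25s, with no ringing anywhere.
    print(coverage((2.5, 2.5, 3.5, 3.5), 6, 6)[2:4, 2:4])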


> This portion of signal theory only applies to a world made entirely out of frequencies.

That's not precisely true: discrete blocks or edges can be approximated arbitrarily closely in Fourier frequency space -- you just need to admit higher-frequency components.
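
The textbook case is an ideal square wave, whose Fourier series is

    \operatorname{sq}(x) \;=\; \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{\sin\bigl((2k+1)\pi x\bigr)}{2k+1}

Truncate that at any finite harmonic - i.e. low-pass it - and you are left with an overshoot of roughly 9% of the jump next to each edge, no matter how many terms you keep. That is the Gibbs phenomenon linked elsewhere in this thread.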

'Pixel-sized square' is properly an oxymoron if you view pixels purely as samples -- a 'true' square can only be represented by an infinitely dense set of pixel samples.

But this is a feature, not a bug, because neither computer monitors, nor the human eye, can render nor see a 'true' square either, only successively better approximations to them.


> That's not precisely true: discrete blocks or edges can be approximated arbitrarily closely in Fourier frequency space -- you just need to admit higher-frequency components.

You can approximate those shapes, but that's an approximation. It's not impossible to do the math that way, but there are stumbling blocks to dodge. Like weird artifacts when you low-pass.

> 'Pixel-sized square' is properly an oxymoron if you view pixels purely as samples -- a 'true' square can only be represented by an infinitely dense set of pixel samples.

It's a square equal in size to the square-grid pixel spacing. It's not wrong, it's just being non-pedantic.

> But this is a feature, not a bug, because neither computer monitors, nor the human eye, can render nor see a 'true' square either, only successively better approximations to them.

A monitor does not quite deliver perfect squares, but what it displays is a lot closer to a square than to this shape: https://upload.wikimedia.org/wikipedia/commons/f/f8/Gibbs_ph... So calculating it more like that is not a feature.


> A monitor does not quite deliver perfect squares,

Monitors don't display colored squares, period:

https://upload.wikimedia.org/wikipedia/commons/4/4d/Pixel_ge...

Our cone cells also don't see colored squares, and in fact each color's distribution is random:

https://askabiologist.asu.edu/sites/default/files/resources/...

In fact, CRT phosphors didn't even correspond to logical computer pixels. If your CRT resolution was lower than the maximum phosphor grid resolution of the CRT mask, then a single pixel would indeed spread across more than one tri-color phosphor group.

Signal theory reminds us that 'square pixels' are just an arbitrary shortcut. We could equally well describe them as hexagons, ovals, or rectangles, or make the grid positions random, and monitors would work just as well. That's why blowing up a pixel and rendering it as a giant square with exact edges is a very misleading and arbitrary choice.


But at no point, either in the CRT or in the human eye, do the ringing artifacts actually show up. So any pipeline that renders those when you blow the pixel up is worse than rendering just a giant square.


They do. If you try to draw a white pixel in the middle of a black background on a CRT, the white bleeds around in a small circle, and if you look at a point light source in a totally dark room, you will see a small halo around it. These are both ringing artifacts.


You see faint light around the pixel/light source. You don't see the edges of the pixel/light source as brighter than the center. Do you? (Can't check right now but it's not how I remember seeing things)


You can just design your game art to avoid Moiré patterns. This is also what all TV productions generally do.


While true, this is a category of stances that I quite enjoy poking holes into. I usually bring this up when people deride folk wisdom or try to show how common sense is incorrect.

The basic problem with being petty is that, while you're right on one level, you're wrong on the global scale. In this case, whether pixels are square or not is irrelevant. Do you look at your monitor straight on? Do you sit side by side with a friend playing a game? In that case, squares become trapezoids from your POV anyway. I'm going to bet you don't mind it, because our brain is very good at pretending perspective distortions don't exist.

The same applies in many domains: color correction, for example. I used to work in printing, and designers would get hung up on proofs to verify colors. No matter that readers would view the final product under varying lights anyway, making precisely 'correct' colors moot.

In both cases, what is important is consistency. For prints, that means all runs being the same. For monitors, well, as long as you look at a single monitor at a time, you won't notice distortion.

(In a somewhat related anecdote, the piano my parents own is and always has been out of tune. I learned on it, and for me, all other pianos' tones were insufferable. What counts as the right pitch is mostly learned, even though musicians would like us to believe there is but a single harmonious scale.)


Well, a model is not reality, but a useful abstraction. Considering that, does it really hurt to wrongly think of pixels as squares? It hasn't caused me any trouble yet, at least.


This is true for computational science applications. A voxel in an MRI-scan is indeed not a little cube, but a data-point. Same for sampling 2D images (e.g. photography).

In modern pixel/voxel art, however, the whole point is that they are in fact little squares and cubes. Otherwise you don't get the aesthetic this style is aiming for.

It should be noted, for historical accuracy, that the original pixel art was displayed on relatively blurry CRT screens, and your pixels didn't actually look like little squares either, but more like glowy blobs. That's a different point than the one the cited paper is making, however, which is about the sampling theorem, where pixels and voxels are basically data points (that can be interpolated in various ways).

On the other hand, when I played 16 colour 320x200 games like Commander Keen and Duke Nukem (2D) in the 90s on a VGA CRT monitor, the pixels definitely looked like little squares, not blobs.

Edit: maybe I should mention oldschool ANSI art as well (BBS/demoscene), which was made with various coloured ASCII blocks, and there the "pixels" were in fact quite obviously little blocks (and sometimes other characters, but that's more ASCII art than ANSI).


Well, sort of.

There is a fair amount of graphics whose good looks are based around the fact that pixels are little squares. Like orthogonal lines.


Odd that it calls a reconstruction that doesn't blur "worst case" and "abominable". The smaller the details in an image, the less I want any filtering applied when it's sized up. (Or, you could say, the smaller I want filter regions to be.)


In this theory the reconstruction filters need to work as both downscaling and upscaling filters. A box filter looks good for integer-ratio upscaling, looks bad for non-integer upscaling, and looks terrible for downscaling.

What you want is a non-linear filter - it's pretty obvious that you can invent good upscaling filters (like Waifu2x / NNEDI3) that aren't based on FIR theory and can't be used to downscale.

> Or, you could say, the smaller I want filter regions to be.

And that's the opposite of what you get from linear filters. There's a resampling filter called sinc that's "perfect" - it reproduces every frequency in the input signal - but it looks terrible for images because it tries to reproduce pixels based on every other pixel in the image. So you get ringing all over the picture instead of sharpness.
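
As a toy 1D illustration of that ringing (a short made-up signal and a truncated sinc sum, so the exact numbers are artifacts of the toy setup, but the over/undershoot is the point):

    # Upscale a hard edge 4x: nearest neighbour vs. sinc interpolation.
    import numpy as np

    samples = np.array([0., 0., 0., 0., 1., 1., 1., 1.])   # a hard edge
    scale = 4
    t = np.arange(len(samples) * scale) / scale             # fine output positions
    n = np.arange(len(samples))

    nearest = samples[np.clip(np.round(t).astype(int), 0, len(samples) - 1)]
    sinc_up = np.array([np.sum(samples * np.sinc(ti - n)) for ti in t])

    print(nearest.min(), nearest.max())                      # 0.0 1.0 - stays in range
    print(round(sinc_up.min(), 2), round(sinc_up.max(), 2))  # about -0.11 and 1.19:
                                                             # ringing around the edge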


I am largely focusing on upscaling, because for downscaling it's easy to get close just by averaging pixels.

>What you want is a non-linear filter - it's pretty obvious that you can invent good upscaling filters (like Waifu2x / NNEDI3) that aren't based on FIR theory and can't be used to downscale.

When I'm looking at really tiny details even those can cause problems, and I just want pixel blobs.

>And that's the opposite of you get from linear filters.

Huh? I'm talking about the footprint of the filter, like the diagrams in the pdf. The box filter and Figure 1 filter sample one pixel, so they have the smallest region, and sinc is the largest possible. What's opposite?


Oh, I read that as "filtered regions". Like if you blow up an image with some noise in it, the noise is "infinitely small", so it should remain the same size and not get bigger with the image content.


Honest question: If a pixel is a point sample, why do all the down-sampling algorithms that look good involve some form of averaging the values of neighboring pixels? Wouldn't it be more accurate to pick one sample and throw away the rest?


It's a good question. The name for that - pick one sample, toss the rest - is "nearest neighbor sampling". Whether it's accurate depends on what you are trying to achieve, but the short answer is that in general it is not the most accurate, and it can cause a lot of problems.

But your intuition isn't wrong. Nearest neighbor can be a bit more "crisp" feeling on certain images, especially natural photos. A lot of photos have enough blurriness in the 1-3 pixels range that nearest neighbor sampling is a good choice if you're just resizing an image by a factor of 2.

But to see why nearest neighbor isn't the best in general, imagine down-sampling a checkerboard and what happens as individual checkers become smaller than a pixel. You'll get moiré patterns if you don't filter. You'd also expect pixels to turn grey as multiple checkers land there, rather than stick to only black or white.
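
Here's a tiny sketch of that checkerboard case (the sizes and the 4x factor are arbitrary choices for illustration):

    # Shrink a fine checkerboard 4x: "pick one sample" vs. averaging each 4x4 block.
    import numpy as np

    n, f = 64, 4                                  # source size, downscale factor
    y, x = np.indices((n, n))
    checker = ((x + y) % 2).astype(float)         # 1-pixel checkers, finer than a destination pixel

    nearest = checker[::f, ::f]                   # keep one sample per output pixel
    boxavg = checker.reshape(n // f, f, n // f, f).mean(axis=(1, 3))

    print(np.unique(nearest))   # [0.] - the fine checkers alias to a flat colour (phase-dependent)
    print(np.unique(boxavg))    # [0.5] - the mid-grey you'd actually expect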

Interestingly, some "sharpening" and up-sampling algorithms do the opposite - they subtract the value of surrounding pixels, rather than add. Those are assuming that some averaging already took place, and trying to un-average, if you will, to restore the original point sample.


This is a fairly deep topic so your question has a really long & detailed answer. The simple answer is "aliasing." The concept is that picking one sample for each pixel makes an aggregate image that could be confused with other images, whereas a good downsampling algorithm only represents a (lower frequency, bandlimited) version of the original image.


The short answer is that our eyes and brains don't work that way.

The long answer is... really long.


Indeed, Marching Cubes and similar algorithms treat voxels as point samples and drive home the "not a little cube" point.



