Cameras and Lenses (ciechanow.ski)
2132 points by mariuz on Dec 10, 2020 | 212 comments

Although I knew most of the technical aspects described, having immersed myself in photography several years ago, I cannot express properly how GOOD this article is. The sliders and diagrams are brilliant for helping a layperson understand how it all works. Kudos!

You'd like their other work:





Check out the whole blog! It's amazing.

These are some of the most intuitive explanations coupled with the slickest animation / demo work I've ever seen. All of that put together into crystal clear educational material is such a rare gem and requires incredible talent.

Impressive as hell.

I hope someone is paying them to do this full time. I'd pay for this.

And the articles render so fast and so smoothly on my laptop. In fact they render more smoothly and load faster than 99% of much, much less technically advanced articles out there. Wow!

Heck, they render fast and smooth on my 3 year old iPad.

I also like his floating point numbers' explorer https://float.exposed/0x3fb999999999999a

The final words on the Tesseract page are lovely!

> I find it very inspiring that while we can’t physically experience a four dimensional space, with just a bit of ingenuity we can easily simulate how a tesseract and its shadow would look in our day-to-day world.

> You may find math’s indifference to the limitations of our human perception quite cruel, but I think it’s liberating. Reflecting on higher dimensions is transcendent – it removes the shackles of the physical world and allows us to explore the realms we’ll never encounter.

Regarding the lights and shadows articles, how the hell are they getting raytraced-quality lighting, with no noise, with changes to the scene in realtime?? Are these all pre-rendered PNGs?

Probably closed-form solutions for those simple lighting situations.

I was wondering why it was lighting up as a visited link. Turns out it's the same blog that posted, a few years ago, something so special it took me an entire Christmas to read!

Added the blog to my Thunderbird RSS-Feeds :-)

You're right. Similarly, I cannot express how good this site really is in just a few words. Having been an enthusiastic amateur photographer for decades as well as having worked in electronic imaging for about as long, this is one of the best practical illustrations of the subject that I've come across.

I'd highly recommend it to anyone who is seriously interested in the underlying 'mechanics' of imaging.

(Incidentally, to anyone who has read my many anti-JavaScript raves, I'd only say this is exactly how JavaScript ought to be used.)

+100 because I came here to post exactly what you did. I just bookmarked this to share with my tween who's getting interested in photography and starting to ask things like "Why that camera?" and "Why this lens?"

I agree, it's pretty basic but for people that don't know any of this it's an excellent explanation.

I wouldn't say it's basic at all. It's an excellent distillation of the physics of cameras.

My one minor nit-pick would be introducing the wave nature of light so early--to motivate Snell's law and total internal reflection, but without discussing diffraction until well later and only in passing.

Snell's law can be motivated from a purely classical particle-based model as:

1. Light travels slower through media other than vacuum (this is mentioned, but hand-waved as boundary conditions, which probably isn't meaningful to the intended audience).

2. Light takes the shortest (in time) path between two points. (Light... finds a way.)

Either way, the audience might find the obvious "why" for the first point more satisfactorily addressed with a short sidebar on permittivity and permeability of materials (their electrical properties, effectively the capacitance and inductance "density").

That can also lead to an interesting discussion of why conductors are usually not optically transparent and insulators are.
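Fermat's least-time principle (point 2 above) can even be checked numerically: fix a start point on one side of the boundary and an end point on the other, minimize the optical path length over the crossing point, and Snell's law falls out. A minimal sketch, with made-up indices and geometry:

```python
# Fermat's principle: light crossing a boundary takes the least-time path.
# Minimizing travel time over the crossing point x recovers Snell's law.
import math

n1, n2 = 1.0, 1.5          # refractive indices (e.g. air -> glass), assumed
ax, ay = 0.0, 1.0          # start point, in medium 1 (y > 0)
bx, by = 1.0, -1.0         # end point, in medium 2 (y < 0)

def optical_path(x):
    # optical path length = n * geometric distance, proportional to time
    d1 = math.hypot(x - ax, ay)
    d2 = math.hypot(bx - x, by)
    return n1 * d1 + n2 * d2

# crude 1-D minimization by ternary search over the crossing point x
lo, hi = ax, bx
for _ in range(200):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if optical_path(m1) < optical_path(m2):
        hi = m2
    else:
        lo = m1
x = (lo + hi) / 2

sin_i = (x - ax) / math.hypot(x - ax, ay)   # sine of the incidence angle
sin_t = (bx - x) / math.hypot(bx - x, by)   # sine of the refraction angle
print(n1 * sin_i, n2 * sin_t)               # approximately equal: Snell's law
```

The least-time path bends at the boundary exactly so that n1·sin(i) = n2·sin(t).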

I agree; I think the motivation for refraction was the weakest point. They introduced waves, and even showed that the waves have to be continuous at the boundary... and then... back to rays!

It just needs a diagram showing the waves refracting, and the geometry of the wave fronts. (Pretty annoying that they never explained refraction like that in school either - Snell's law was just a given when it's actually really trivial to derive from first principles.)

The only thing better than a pinhole camera is a pinhole camera with multiple holes.

Let me explain:

When you have two holes (apertures), their images would likely overlap (depending on the arrangement).

Imagine you took two photos from a few millimetres apart to the left and to the right, then you superimposed them. Some areas would look ok (as both views saw the same distant object) and some would be less ok (of objects that are closer).

If you have a DSLR take the lens off and put a sheet with holes in front. Just try it.

Now as you add more holes you get more and more overlapping images on the same area of the sensor and eventually end up with a blur.

This blur is the same blur you see with a single large hole. It's just that with a large hole you had infinitely many overlaps!

Even cooler, if you happen to arrange the holes in a specific pattern you could capture images with different combinations of perspectives from different holes and you may even undo the overlaps. This is called coded aperture imaging:


This doesn't just solve the biggest problem (limited light) of a single hole, but also captures depth information and you can use it for 3d reconstruction, refocusing etc.

One final bit, with a warning of a deep rabbit hole:

That "infinitely many overlaps" I was talking about happens with lenses too and is essentially a convolution where you convolve the image with itself (actually many different perspectives of itself if I am correct). Which is just the Fourier transform.

>That "infinitely many overlaps" I was talking about happens with lenses too and is essentially a convolution where you convolve the image with itself (actually many different perspectives of itself if I am correct). Which is just the Fourier transform.

That statement is a bit muddled, let me unpack it. The infinitely many overlaps thing can be expressed mathematically as the convolution of the image with a function that's one where the aperture is open, and zero where it's not. The thing about the Fourier transform is actually related to a different phenomenon. When the slit is really small, you start getting diffraction effects. The diffraction bands are approximately the Fourier transform of the slit function. However that is not significant unless the slit is extraordinarily tiny.
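That convolution view is easy to demonstrate in a rough 1-D sketch (scene and aperture sizes here are arbitrary): a pinhole acts like a delta function and reproduces the scene, while a wide-open aperture smears each point over many pixels while conserving the total light:

```python
# The "infinitely many overlaps" from an open aperture, modeled as a
# 1-D convolution of the scene with the aperture's transmission function.
import numpy as np

scene = np.zeros(64)
scene[20] = 1.0          # a single bright point in the scene
scene[40] = 0.5          # and a dimmer one

pinhole = np.array([1.0])        # ideal pinhole: no overlap at all
wide    = np.ones(9) / 9         # wide aperture: 9 overlapping views

sharp   = np.convolve(scene, pinhole, mode="same")
blurred = np.convolve(scene, wide, mode="same")

# the pinhole reproduces the scene exactly; the wide aperture smears
# each point over 9 pixels but the total collected light is unchanged
print(np.allclose(sharp, scene))
print(np.isclose(blurred.sum(), scene.sum()))
```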

Indeed! We like to think of light as photons, traveling in rays in straight lines. That's geometric optics and works great for most purposes in photography.

Physical optics takes into account the wave nature of light. This becomes important when the size of the lens becomes small (eg pinholes) ... there's diffraction around edges and pixels receive contributions from many points in space.

Geometric optics lets you model using ray-tracing, reflections, and Snell's law refraction.

Physical optics uses tools such as Fourier transforms, convolutions, and sinc functions.

Understanding a simple lens system? Geometric optics is your friend. Building an astronomical telescope? Check into physical optics.

Thanks for weighing in. My kid and I are really enjoying The Cuckoo's Egg!

I read that a few times when I was younger. It's such a great story, and I've always been meaning to revisit it as an adult. When my kids get a little older I'm sure we'll enjoy it.

> The diffraction bands are approximately the Fourier transform of the slit function.

Is this correct from a physics standpoint? Feynman had a lot to say about light in his book (QED https://www.amazon.com/QED-Strange-Theory-Light-Matter/dp/06...) and never framed it this way. At one point he remarked that the explanations in the book were related to diffraction, and the explanation there was very different from the Fourier transform.

In short, yes. An ideal image source incident on a positive optical lens produces its spatial Fourier transform at the lens's focal point. This is easiest to see with a laser backlighting a transparency, since the light is collimated and monochromatic. The transparency produces diffraction at its edges, which causes the effect. Actually, you'd also see the spatial Fourier transform at infinity if you took away the lens. The result of this is that you can do cool spatial frequency filtering effects at the focal point, then convert it back into an image with another lens. Laser systems that require high precision will use such a setup to remove high-frequency components and pass just the collimated light.
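The slit/Fourier relationship can be sketched numerically (sampling sizes here are arbitrary): take the FFT of a 1-D aperture function and the far-field intensity comes out as the familiar sinc-squared pattern of single-slit diffraction, with the first dark band at a spacing set by the slit width:

```python
# Far-field (Fraunhofer) diffraction: the amplitude pattern is the
# Fourier transform of the aperture, so a single slit gives a sinc.
import numpy as np

N = 4096
aperture = np.zeros(N)
aperture[N//2 - 32 : N//2 + 32] = 1.0   # open slit, 64 samples wide

field = np.fft.fftshift(np.fft.fft(aperture))
intensity = np.abs(field) ** 2
intensity /= intensity.max()

# central maximum at the middle; the first zero of the sinc sits at
# frequency index N / slit_width = 4096 / 64 = 64 away from the center
center = N // 2
print(intensity[center])               # 1.0 (normalized central peak)
print(intensity[center + 64] < 1e-6)   # first minimum is (numerically) zero
```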

Pinhole lenses create visible diffraction effects pretty fast. Actually, even normal lenses cause diffraction at small ~f/11-f/22 apertures.

I suspect the GP comment is referring in context to the use of the Fourier transform to efficiently implement convolution.

The pattern of holes technique in that link (MURA) is also used in UV lithography for IC fabrication (example: https://youtu.be/8eT1jaHmlx8?t=1543).

> captures depth information and you can use it for 3d reconstruction, refocusing etc

That’s crazy cool! Got any good links for reading about how to do those things that way; 3d reconstruct and refocus?

Here's a paper on depth: http://groups.csail.mit.edu/graphics/CodedAperture/

Stanford has a class on computational imaging: http://stanford.edu/class/ee367/

I think these slides should give you a pointer to enough papers to fill a lifetime: https://web.stanford.edu/class/ee367/slides/lecture9.pdf

Georgia Tech also has a class, that's on youtube and Udacity: https://www.udacity.com/course/computational-photography--ud...

Thank you :)

Let us know if you make something ;)

Does Bartosz have a Patreon or something? I feel like I'm deeply indebted to him for the privilege of having read this work. Really, one of the most comprehensive and fantastic explanatory pieces I've ever read.

I cannot imagine the effort that goes into writing and coding a post like this.

Looking at the page source I'm doubly impressed there seems to be no use of third party libraries. I personally happily use three.js, but still, wow.

https://ciechanow.ski/js/base.js https://ciechanow.ski/js/lenses.js

I'm sure there's a simple explanation if I look at the code, but how he did the z-order for the fins on that aperture is breaking my brain.

I'm speechless. Even on a mid-range Android phone every animation is smooth; it's like dark magic to me.

FYI, there are digital sensors that don't use Bayer filters. Some are classic sensors minus the Bayer filter (the Leica Monochrom, for example); others are based on different tech, like Foveon sensors [0]

Many people also debayer their cameras themselves, especially in astrophotography [1]

[0] https://en.wikipedia.org/wiki/Foveon_X3_sensor

[1] https://stargazerslounge.com/topic/166334-debayering-a-dslrs...

Fuji also use what they call an X-Trans sensor.


I picked up a Fuji last Christmas, and it’s one of the best purchases I’ve made in the last decade. Their X-line takes incredible pictures, and is so approachable. If anyone has been considering more serious photography, or is having a child and wants to capture more unique shots of them, I can’t recommend them highly enough.

I'm a big fan, I'm using my XT3 as a webcam right now too!

Which is still pretty much a Bayer sensor, just with a different filter layout. You still have to demosaic to get a usable file, and you need a well-tuned RAW converter to do so.

Well, it's still a CFA (color filter array), but definitely not Bayer, which is a specific class of arrangements.

Bryce Bayer was my father. Hopefully I have a few more decades before I'm "debayered".

While the mod is interesting and a way to get into monochrome photography, it should be noted that it has disadvantages. Mainly, you end up removing the microlenses, which seems to cancel out any gains, unfortunately.

source: I've done this mod on a couple cameras and wouldn't recommend it for performance gains alone. It's a fun project, not much more.

In RawTherapee you can choose which demosaicing algorithm to use.


That debayering sensor mod looks really interesting :) Nice way to quadruple the effective resolution of your camera, if you don't care about colors.

I've seen this kind of essay described as an "explorable explanation" - I absolutely love them. There's a directory of other examples of this kind of thing here: https://explorabl.es/

I obviously haven't finished reading the entire post, but I wanted to simply appreciate the hard work that has gone into this. I absolutely love interactive articles. Great job!

Lately, one of my areas of interest in digital photography is scene-referred vs. output-referred image formats and subjective perception. The author (justifiably) walks around this topic in his camera sensor overview, but the data captured by a sensor from photons hitting it doesn't really contain a readily viewable image.

Data values captured by a modern camera far exceed the ranges that can be reproduced by output media such as paper or screens—meaning that the only way[0] to obtain an image we can communicate to someone else (or future ourselves) as a static artifact is by throwing out data and conforming values to ranges that fit in the output color space, converting scene-referred ("raw") data to output-referred.

This is where subjective perception comes in: how we perceive colors and shapes in a given scene depends a lot on what we had seen prior to this scene, our general mood, and various other aspects of our mind state. It’s only by taking control of processing scene-referred data that we can use the full range of captured values to try to convey most convincingly, within the constraints of the output space, our subjective perception of the scene around the time we pressed the shutter trigger.

(Naturally, further down this rabbit hole come the questions about e.g. what our conscious perception—but not the camera—was blind to in the scene, and eventually about the nature of reality itself, at which point one may feel compelled to give up and go into painting instead.)

[0] This would be quite niche, but I wonder if we could develop tools for exploring raw data at viewing stage, allowing the audience to effortlessly adjust their perception of the scene (even if within ranges specified by the photographer). Such exploration would require significant computing powers, but we’ll probably be there in a couple of years.

If I understand correctly, you're looking for a simplified RAW image editor? Many digital cameras allow storing RAW images alongside JPEG. The viewer can then load the RAW images into any (web-based) image viewer/editor that supports RAW format and have full control over tone mapping.

The tool interface needs to be simplified to make it a better fit for the use-case you present but I don't see computing power as a bottleneck.

Of course they'd still be limited by the dynamic range of the camera. This can also be resolved by calculating irradiance map based on multiple RAW images taken with different exposure times.

I think there is a fundamental difference between editor (designed to produce a deliverable) and viewer (designed for immediate experience) software. One of the things essential to the latter but not really to the former is immediacy, hence I suspect that the computing power commonly available today makes it impossible for now (but likely not for much longer).

Apart from performance, another crucial thing is that the viewer must not have to think about technical aspects (like exposure, color profile, etc.), as you noted, so the GUI would have to be radically different.

I am envisioning producers bundling N processing profiles with their “digital negative”, and software that somehow allows the user to fluidly explore the perception of the scene by interpolating inside an (N-1)-dimensional space bounded by parameters in those profiles with really low latency.
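As a rough sketch of that idea (all profile names and parameter fields below are invented for illustration), blending N bundled profiles amounts to a weighted interpolation over their parameter vectors; the viewer just moves the weights:

```python
# Hypothetical: fluidly "exploring" a scene by blending N processing
# profiles, each a vector of rendering parameters, with user-chosen
# weights that sum to 1 (a point in the (N-1)-simplex).
profiles = {
    "neutral":  {"exposure": 0.0,  "contrast": 1.0, "warmth": 0.0},
    "punchy":   {"exposure": 0.3,  "contrast": 1.4, "warmth": 0.1},
    "twilight": {"exposure": -0.5, "contrast": 1.1, "warmth": -0.3},
}

def blend(weights):
    # weights: {profile_name: weight}, expected to sum to 1
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    out = {}
    for key in next(iter(profiles.values())):
        out[key] = sum(w * profiles[name][key] for name, w in weights.items())
    return out

# halfway between the "neutral" and "punchy" renderings
mid = blend({"neutral": 0.5, "punchy": 0.5})
print(mid)
```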

To clarify, the alternative to taking control of processing scene-referred data is to use camera’s JPEG rendering of its own idea of how the scene should look.

If I'm conversing with a Japanese photographer in Japanese, what word do I use for "that Japanese word that American photographers use for a special kind of blur?"

It can't be "bokeh" because that's just an everyday word that means blur; for instance "mae no shashin ga boketeru kara, keshimashou ka" (the previous picture is blurry, so shall I erase it?)

Also, since that term came into use in around 1997, if I could have a conversation with an English-speaking photographer via time telephone to 1990, what would be the wording used?

I've seen a few of these kinds of interactive articles on hackernews but this is on a whole new level. Wow. One of the best explanations of just about anything I have ever seen.

It's truly amazing to me how someone can be this talented. The text, the animations, the style. It's all high quality.

I'm in awe.

In terms of sharpness, each lens is different. And the sharpness varies depending on the focal length (zoom), aperture and the sensor.

A great way to visualize this is to take a look at the sharpness of Canon's nifty fifty lens.


Click into the heat map https://www.imaging-resource.com/lenses/canon/ef-50mm-f1.8-s...

Move the aperture from 1.8 to 2.8, and note how the sharpness increases, starting in the center of the lens.

Then at about f/16, diffraction kicks in and things start getting less sharp again.

Things get even wilder when you introduce variable zoom and different camera bodies.

It's a great way to visualize the good/fast/cheap, pick-two trade-offs of lenses.

Looks like the author works at Apple:


"By default all animations are enabled, but if you find them distracting, or if you want to save power, you can globally pause all the following demonstrations."

I love this writer / developer

This is beautiful.

I've been working with computer vision since university, but nowadays work underwater. I guess I should have a pretty good understanding of this stuff, but I still occasionally get caught out by optics.

This is the best possible pre-reading for anyone wanting to jump into the field.

I’m always impressed by the amount of work and quality this guy puts into his posts. Especially the interactive widgets.

What an incredible piece of work. I can only dream of having the patience, skill and drive to make something so beautiful and informative.

For a markedly less engineering oriented introduction see https://en.wikibooks.org/wiki/Modern_Photography/The_camera

One interesting thing I learned recently is how they make cheap plastic lenses, and the fact that plastic lenses were invented by an Australian. They are created by choosing a polymer with good optical properties and then thermoforming it by carefully applying air pressure to raise a rounded shape to a dome after clamping the edges, then holding the air pressure constant until it has cooled. Because nothing actually touches the lens portion, optical properties are preserved.

Do you mean, like creating a bubble of plastic fed from below where surface tension holds it in a shape like a bubble? And then letting it harden while in that shape? Does the air pressure you talk about mean blowing a stream down onto the lens?

And then, no further polishing is necessary? I'm fascinated to learn more if you have a source to read more from!

You can go up or down (probably sideways if you can account for gravity, eg. through rotary motion). Up is more common since you only need significant equipment on one side and access requirements mean that keeping the bulk of it 'down' is easier. No polishing is necessary. However, I recall reading that some polymers can be polished if you want, but IIRC it depends heavily on the polymer, the abrasive, intermediate fluids in use and result desired. Re: Source, I raided libgen for thermoforming over the last few months, visited about 10 equipment factories and studied their mechanisms and control systems, but actually learned this tidbit from 1000 Australian Inventions or some such book which I bought for my daughter before our family returned to China... to keep inventing ;)

This works for some lens shapes. For most though, molding is necessary.

I’ve been an amateur photographer for over 10 years now, and I’ve never seen such an easy-to-understand but still mathematically in-depth explanation of depth of field until now. Bravo. This is excellent.

It may not seem interesting at first, but I am intrigued by how they managed to draw the overlapping blades of the iris.

> In real camera lenses an adjustable aperture is often constructed from a set of overlapping blades that constitute an iris.

If you naively create 6 blades and rotate them around the pivot (the tiny screw), they will never overlap EACH OTHER. One will end up at the very top, and one at the very bottom.

If you know how this was created, please let me know.

This is interesting. I don't know much about html canvas, but it looks like this is handled in the author's custom js [0] in the line that starts with: "} else if (mode === "blades") {"

[0] https://ciechanow.ski/js/lenses.js

Do you mean, how it was drawn using canvas?

I'm guessing they drew the overlapped part of the last blade twice.

Looks like so, with a clever rotation, clipping, and drawing twice (had a cursory look at 'lenses.js', look for 'draw_blade')

Not sure I get your question; do you want to know how it works in real life? Have a look here: https://www.youtube.com/watch?v=cR7cibDvYyo

I think the question is how this is drawn on a canvas. It's not trivial: if you draw 6 shapes successively, one will be on top and another at the bottom; you would not see the tip of one blade at the top and the other part of the same blade at the bottom. Nicely spotted, interesting question!

Great article. One thing that's always stumped me as an amateur photographer is how an object is "in focus" when the rays from that object converge to the smallest point, thus hitting the smallest number of pixels. It would seem to me you would want to collect the max amount of information and thus use more of the sensor. How does converging onto less of the sensor give you a sharper photo?

Ideally, the mapping from object to image is injective. What you're proposing will lead to "hash collisions" aka blurriness, since each point in the object will bleed colors into neighboring points in the image.

The entire object doesn't get collapsed to a single point. Rather, a single point of the object radiates light in all directions. A lens then captures a fraction of that radiation and collapses it back down to a single point. Then we iterate over each point in the scene with a "for each".

The amount of total light is the same, but the smaller the points of focus, the easier it is to distinguish image elements from each other because each point overlaps less with its neighbors.
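A quick 1-D sketch of that overlap (sizes here are arbitrary): blur two nearby point sources with a small and a large "circle of confusion" and check whether the dip between them, which is what lets you tell them apart, survives:

```python
# Why a smaller spot of focus means a sharper image: blur two nearby
# point sources with a narrow and a wide point-spread function and see
# whether they remain distinguishable.
import numpy as np

scene = np.zeros(41)
scene[15] = 1.0
scene[25] = 1.0        # two point sources, 10 pixels apart

def blur(img, radius):
    # box point-spread function of the given radius, normalized
    kernel = np.ones(2 * radius + 1)
    kernel /= kernel.sum()
    return np.convolve(img, kernel, mode="same")

in_focus  = blur(scene, 1)    # small circle of confusion
defocused = blur(scene, 12)   # big circle of confusion: the spots overlap

# in focus, a dip between the two peaks survives; defocused, it doesn't
dip_in  = in_focus[20]  < in_focus[15]
dip_out = defocused[20] < defocused[15]
print(dip_in, dip_out)
```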

Here’s a metaphor. Let’s say you have 10 glasses, with 10 different amounts of water in them. When you hit them with a spoon, each glass makes a different musical note: 10 different notes.

If you pour the water from each glass individually into 10 new glasses, you preserve those 10 notes.

If you combine every two glasses into one, now you have 5 glasses, and only 5 notes. It’s the same total amount of water but you lost information.

If you take up a larger part of the sensor, then the distance between any two details must be larger so the areas that their light covers over the sensor don't overlap. If they overlap, you have no clue where the information is coming from.

Look up the circle of confusion, the Rayleigh resolution criterion and the Sparrow resolution criterion for the technical details of resolution.

It's more that the rays from a point on an object converge to the smallest point when in focus? The rays from the object as a whole are spread out over an object-shaped area on the image plane, unless the object is (effectively) at infinity, in which case they will converge to a point...

How is this the first time I have gained an intuitive feeling for refraction? Instead I've been taught to simply accept Snell's law. It feels very powerful that a single hypothesis - and a reasonable one at that (namely that the electrical field is continuous at the junction between materials) is enough to explain refraction!

Refraction goes beyond electrical fields. An intuitive feeling for refraction can be obtained from:

- water: when waves go from, say, deeper to shallower water, their speed changes. If they are moving at an angle, refraction is seen. This can be explained by drawing wavefronts. The wavefront doesn't hit the shallower water all at the same time, and so does not slow down all at once, but gradually.

- wheels: when a rolling axle with two wheels goes from pavement to grass at an angle, its direction will bend, due to the change in speed, which is experienced by one wheel ahead of the other.

Brilliant work.

It really helps build an intuitive foundation for these phenomena.

With such an intuitive foundation, one has the confidence to look into the math behind the curtain; even if the math gets complex, you will still feel confident.

IMO, this kind of explanation is the correct way to start teaching anything.

The first blog post of his that I read was Lights and Shadows (https://ciechanow.ski/lights-and-shadows/), and wow his blog is impressive. I love it!

> Alternatively, we could increase the sensitivity of the sensor which is described using the ISO rating.

Umm well, not really.

The sensitivity of a digital CCD or CMOS sensor never changes; it has one 'native ISO' equivalent sensitivity and that's all. It creates the same output signal in response to a photon every time.

Varying the pseudo-ISO simply varies the gain of the downstream post-readout amplifier. That's where noise is introduced.

Sometimes it's better to take a deliberately underexposed photo and tweak regions of it in post to achieve the desired exposure instead of introducing global noise from in-camera amplification.

> Sometimes it's better to take a deliberately underexposed photo and tweak regions of it in post to achieve the desired exposure instead of introducing global noise from in-camera amplification.

That is definitely not my experience with Canons in low-light situations. Here [1] is an example shot at ISO 100 but pushed by 5 EV to match the brightness of the same scene shot at ISO 3200. The noise is much more tolerable in the latter.

[1] https://imgur.com/a/Ca9ccbK

P.S.: it's raw, no in-camera JPEG denoising in place.

Perhaps what we're seeing here is that quantization errors for the non-amplified signal become larger than the noise introduced by the amplification?
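That guess can be sketched with a toy model (the gain, bit depth and signal levels below are made up, and real sensors add read noise on top, which is the other half of the story): quantizing a weak signal first and amplifying afterwards leaves much coarser steps than amplifying before the ADC:

```python
# Sketch of why pushing a deeply underexposed raw file can look worse
# than raising ISO: analog gain before the ADC shrinks the relative
# quantization error of a weak signal.
import numpy as np

rng = np.random.default_rng(0)
signal = rng.uniform(0.0, 2.0, 100_000)   # weak scene, ~2 ADU worth of light

def adc(x, full_scale=2**14):
    # idealized 14-bit analog-to-digital converter: round and clip
    return np.clip(np.round(x), 0, full_scale - 1)

gain = 32                                  # e.g. ISO 100 -> ISO 3200
low_iso  = adc(signal) * gain              # quantize first, push in post
high_iso = adc(signal * gain)              # amplify first, then quantize

err_low  = np.abs(low_iso  - signal * gain).mean()
err_high = np.abs(high_iso - signal * gain).mean()
print(err_low > err_high)   # the post-pushed file keeps the coarse steps
```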

Only some camera sensors are ISO-less.

There are also "dual native ISO" sensors. Still different amplification levels, but in this case before the ADC.

See https://www.youtube.com/watch?v=g8hHFt3ChZ8

"Dual native ISO" sensors don't perform different amplification before the read out amplifier or anything like that.

With an image sensor you get to decide between high sensitivity and high capacity (how much light it can detect before saturating). So a highly sensitive sensor can handle less light before clipping (which makes sense).

This trade-off can be tweaked by changing the size of the photodiode (bigger diode -> more sensitive), but also electrically - simply add capacitance in parallel to the photodiode. This requires more energy, i.e. more photons, to change the voltage over the photodiode, which is what the readout measures.

Dual native ISO sensors simply add another transistor (as a switch) in series to extra capacitance; this allows them to switch between low-capacitance (high sensitivity, but saturates faster) and high-capacitance (lower sensitivity, can handle more light).

Edit: Haven't watched the whole video but his core explanation in there seems to be that Dual ISO means the sensors has a PGA (programmable gain amplifier) -- almost all sensors use that approach, and that's what is controlled by the ISO setting in them.
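The capacitance trade-off described above comes down to simple arithmetic: the voltage the readout sees is Q/C, so switching extra capacitance in parallel scales the conversion gain down by the capacitance ratio. Toy numbers, purely illustrative (not any real sensor's values):

```python
# Conversion-gain switch in a dual-native-ISO pixel: the same photocharge
# produces a different voltage swing depending on whether the extra
# capacitor is switched in across the photodiode node.
e = 1.602e-19                 # electron charge, coulombs

c_small = 1.6e-15             # node capacitance alone (farads), assumed
c_big   = c_small + 6.4e-15   # with the extra capacitor switched in

electrons = 5000              # photocharge collected during the exposure
v_high_gain = electrons * e / c_small   # high sensitivity, clips sooner
v_low_gain  = electrons * e / c_big     # lower sensitivity, more headroom

print(v_high_gain / v_low_gain)   # same light, 5x the voltage swing
```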

Maybe you should watch that video…

I have a question for the more knowledgeable:

Color filters on the sensor split the light into three wavelength ranges - red, green, blue. Then photosites measure intensity of light, which means that sensor only knows that the incoming photon is "roughly green", but doesn't actually recognize its precise wavelength.

So e.g. when the light is pure orange, it falls down into red wavelength range and is counted as red.

Based on this I would expect that cameras would often produce pretty incorrect colors, but they are usually pretty good (after correcting for stuff like white balance).

You are right, this is a challenge. The wavelength of the light cannot be measured directly, only inferred from the intensity of the pixels with the different color filters. On the other hand, most reproductions of photos don't reproduce the original frequencies either. A computer screen has red, green and blue dots, which produce light at the corresponding wavelengths. So if you have orange light, you get a signal on the green and red pixels, and the green and red dots on your screen will light up, which will be detected by the sensors for red and green light in your eyes. Nowhere in the chain, not even in your eye, is there a sensor for "orange" directly; it is just the mixture of the red and green sensitivity.

It is important to note that neither the sensor pixels nor your eyes have a completely separate reaction to each wavelength. The sensitivity curves strongly overlap. So for hues of green with rather long wavelengths, you get some reaction on the red pixels, which gets stronger as you move towards orange, where both red and green pixels detect, until it gets more and more red and less green. The exact absorption curves of the sensor's color filters matter here; that is one reason different manufacturers have slightly different color rendition. On top of that is calibration: when converting the raw image into a proper RGB image, one can further balance the response. For that, color calibration targets are used, which have something like 24 patches of different colors. Taking a photo of this target, the calibration software can calibrate both for the light illuminating the target and for the color response of your camera.

A common reason for red-green colorblindness is that the affected persons have the sensitivities of the red and green receptors overlapping too strongly, so they lose the ability to differentiate: a green creates almost as strong a signal in the "red" cells. A way to improve color vision for those people is glasses which increase that separation by absorbing the frequencies between the red and green colors.
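The overlap idea can be sketched with made-up Gaussian filter curves (no resemblance to any real camera's or eye's curves): a monochromatic orange at ~600 nm lands mostly in the "red" channel but also partly in "green", and that ratio is what encodes the hue:

```python
# Toy model of overlapping color-filter sensitivities: a monochromatic
# wavelength is inferred from the ratio of channel responses, not
# measured directly. Peak wavelengths and widths are assumptions.
import math

def gaussian(wl, center, width):
    return math.exp(-((wl - center) / width) ** 2)

def sensor_response(wavelength_nm):
    r = gaussian(wavelength_nm, 600, 50)   # "red" filter curve
    g = gaussian(wavelength_nm, 540, 50)   # "green" filter curve
    b = gaussian(wavelength_nm, 460, 50)   # "blue" filter curve
    return r, g, b

r, g, b = sensor_response(600)   # pure orange light
print(r > g > b)                  # mostly red, some green, almost no blue
```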

> The wavelength of the light cannot be measured directly, only inferred by the intensity of the pixels with the different color filters.

Well, it depends on what you mean by "directly", but you can get pretty far with a spectrometer, i.e. a device that splits light and measures the intensity spatially to collect a spectrum. It's not impossible in principle to build a camera on that basis; you'd just need to sample the light in an array to make pixels.

I was talking here about typical photo cameras. Of course you can measure the wavelength of light with other devices like spectrometers. I was specifically talking about camera sensors which have separate filters, usually in 3 colors, in front of the pixels. The sensors made by Sigma (formerly Foveon) use a different principle: they determine the wavelength by measuring how deep in the silicon the photons generate electrons, since that depth depends on the wavelength of the light. However, it is more difficult to get a precise color response that way, as you cannot just use predefined color filters.

Thank you for your great explanation. The overlap was the missing piece in my mental model ...

It works because the system is faulty in the same way our eyes are faulty, we basically have sensors for three main wavelengths and derive every color we perceive from a mix of those three, check this out:



They work the same way as a screen. In additive RGB terms, orange is not pure red: it is red + yellow, and yellow is red + green, so orange is red + _some_ green. The in-camera processing renders the image based on the sensor input and the known properties of the filter (think of it as a color profile, a mapping between what the sensors read and the color). The processing also includes color interpolation for each pixel: each photosite has only one color filter, but the resulting image pixel has all three colors, which are calculated from the neighboring photosites.

Different sensors/cameras have different filters, and combined with the manufacturer specific post-processing this gives different cameras/manufacturers a different color rendition and feel.
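As a rough illustration of the interpolation step, here is a minimal bilinear demosaic for an RGGB Bayer mosaic. Real cameras use far more sophisticated, edge-aware algorithms; this sketch just averages each channel's nearest known samples:

```python
import numpy as np

def demosaic_bilinear(mosaic):
    """Bilinear demosaic of an RGGB Bayer mosaic (H x W, even dims).

    Each photosite records one color; the two missing channels at every
    pixel are filled in by averaging nearby photosites that did record
    them (with wrap-around at the edges, for simplicity).
    """
    h, w = mosaic.shape
    rgb = np.zeros((h, w, 3))
    # Masks marking which channel each photosite recorded (RGGB layout).
    r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True
    b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True
    g_mask = ~(r_mask | b_mask)
    for ch, mask in enumerate([r_mask, g_mask, b_mask]):
        known = np.where(mask, mosaic, 0.0)
        weight = mask.astype(float)
        # Average the known samples in each 3x3 neighborhood.
        neighborhood_sum = lambda a: sum(
            np.roll(np.roll(a, dy, 0), dx, 1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
        rgb[..., ch] = (neighborhood_sum(known)
                        / np.maximum(neighborhood_sum(weight), 1e-9))
    return rgb
```

A sanity check on the idea: a uniform gray scene should come out as a uniform gray image, since every neighborhood average equals the original value.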

It is because our eyes are also "incorrect". Cameras, processing, and displays were engineered to match our eyes' expectations. If some alien race with a sufficiently different physiology looked at our pictures, they could tell it is off (and it wouldn't even mean that their eyes are more correct, just that their model is different).

> So e.g. when the light is pure orange, it falls down into red wavelength range and is counted as red.

If you only work with one photon at a time, then yes, you don't know if the photon passing through the filter was red, or orange, or with smaller probability even green or blue. But when you have trillions of photons passing through, you can "see" the difference by the relative intensity of light.

Remember that at the quantum level, things don't happen deterministically; you have to consider the probability that a given outcome occurs. So the photon has a certain probability to hit the filter and get absorbed, a certain probability to pass through the entire depth of the filter, a certain probability to hit the sensor without generating an event, a certain probability to hit the sensor and initiate a chemical reaction (for film or biological eyes) or an electron cascade (for CCD sensors), a certain probability to quantum tunnel to the other side of the universe...

So getting back to your question, when pure orange photons hit a red filter, many of them will make it through the filter, but not as many as if they were pure red photons. When pure orange photons hit a green filter, some of them will make it through, but not as many as if they were pure green photons. So if your brain knows what the "white point" of a given environment is, relative to that white color it'll see a specific combination of "some red, some green" as orange. (Of course, if known-to-be-white objects are already orange--like when you put on amber ski goggles--your brain will eventually adjust and recalibrate to perceive that color as something else... perception is tricky!)

It is just the same with your eye - and camera response curves are optimized to be similar to your eye. There is metamerism - many spectral distributions will look the same to your eye as well as to a camera. See https://en.m.wikipedia.org/wiki/Metamerism_(color)

>> So e.g. when the light is pure orange, it falls down into red wavelength range and is counted as red

'Pure' orange light isn't red. It has a wavelength of 590–620 nm. Red is 625–740 nm and Green is 495–570, so 'orange' is between red and green. The sensor filters each allow a range of wavelengths through. So green is triggered as well as the red filter. In RGB terms orange is 255, 127, 0, i.e. with a strong red component and a smaller green component.

White balance is computed downstream from the sensor, and is used to resolve the effect that a coloured light source creates a colour cast on objects, most noticeably on white ones. The human visual system auto-compensates for this but cameras require special processing, sometimes done using presets for different types of light (sunlight, shade, tungsten etc).
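A minimal sketch of one classic auto white balance heuristic, the gray-world assumption (real camera pipelines and the presets mentioned above are considerably more elaborate):

```python
import numpy as np

def gray_world_white_balance(img):
    """Gray-world white balance: assume the scene averages to neutral gray,
    so scale each channel until the per-channel means are equal.

    img: float array of shape (H, W, 3) with values in [0, 1].
    """
    means = img.reshape(-1, 3).mean(axis=0)
    gains = means.mean() / means  # per-channel gain to neutralize the cast
    return np.clip(img * gains, 0.0, 1.0)
```

For example, a flat image with a warm cast (more red than blue) comes out neutral: every channel ends up at the original overall mean.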

Your eyes cannot see orange directly, but only as an overlap of different stimulation intensities of the receptors for the base colors.

So as long as the filters in the camera have roughly the same transmission curve as the sensitivity curve of the color receptors, all is good.

However, to animals (or hypothetical aliens) with other color receptors, the images produced by photo prints and screens would look quite weird, with colors all wrong.

Filters allow rather large families of wavelengths through. You don't have just red, you have a fraction of green. Because your eyes only have three types of photoreceptors they can be fooled by playing back a similarly large family of wavelengths back.

Not an expert but: the orange color code is FFA500. It has red and green. I expect when orange hits the sensor, it will register as X amount of red and less than X green?

That is in the RGB color model. In the physical world, there are (almost) infinitely many different spectra that could be perceived as the same "orange". It is honestly pretty amazing how well human color vision works despite that.

That is correct. The sensitivity of the "red" and "green" pixels overlaps in the orange light frequencies.

If you tried to turn the sensor data into light again, there would not be enough information to do so accurately. Everything is built around human perception of color. When a photon hits your eye, it produces a "tristimulus value" for your brain. (The tristimulus is produced by "S", "M", and "L" sensitive-cones; these roughly correspond to blue, green, and red; that's why those colors are what we use for base colors. But, you can use other colors, and you could use more than 3 if you wanted to. There is no law of the universe that splits colors into red/green/blue parts... that's just a quirk of human anatomy.)

The goal of a digital camera is to be sensitive to colors in the same way that your eyes are. If that tristimulus can be recorded and played back, your brain won't know the difference. The colors your monitor emits when viewing a photograph could be totally unrelated to what was in the original scene, but since your brain is just looking for a tristimulus, as long as that same tristimulus is produced, you won't be able to tell the difference.

(Fun fact -- there are colors you can see that don't correspond to a wavelength of light. There is no single wavelength of light that fully stimulates your S and L cones, but plenty of things are magenta.)

TL;DR: computerized color is basically hacking your brain, and it works pretty well!
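The "hack" can be demonstrated numerically. With toy Gaussian cone curves (illustrative values only, rough stand-ins for the real L/M/S sensitivities), you can solve for a mix of "red" and "green" monochromatic light that produces nearly the same tristimulus as pure orange, i.e. a metamer:

```python
import numpy as np

# Hypothetical Gaussian cone sensitivities (center_nm, width_nm) -- rough
# stand-ins for the human L/M/S curves, chosen only for illustration.
CONES = {"L": (570.0, 60.0), "M": (545.0, 55.0), "S": (445.0, 30.0)}

def tristimulus(wavelength_nm, intensity=1.0):
    """L/M/S response vector for monochromatic light of given intensity."""
    return intensity * np.array([
        np.exp(-((wavelength_nm - c) / w) ** 2) for c, w in CONES.values()])

# Target: pure orange at 590 nm.
target = tristimulus(590.0)

# Metamer: solve for intensities of 620 nm ("red") and 540 nm ("green")
# light whose L and M responses match the target exactly.
A = np.column_stack([tristimulus(620.0)[:2], tristimulus(540.0)[:2]])
a, b = np.linalg.solve(A, target[:2])
metamer = a * tristimulus(620.0) + b * tristimulus(540.0)
```

The two-light mix matches the orange's L and M responses exactly, and the residual S error is tiny because all three wavelengths are far from the S peak; to these (toy) eyes the spectra are indistinguishable.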

Amazing, just amazing. Thanks a lot. I would only suggest the author set og:image on the pages: when you share them on social media no image appears, and you know how that affects spreading, right? :)

I built a trinocular microscope mount for my DSLR camera out of a couple of small magnifying lenses. When I was trying to get it to work and improve the quality, I did a bunch of reading on optics. Building your own camera lens that works is pretty easy, but making it capture nice pictures is hard.

Having this article back then would have been a big help! The interactive widgets are great, and it looks like they're entirely home-grown. I am really impressed with the level of detail and work that went into this post.

The article is very good but you have to consider the existence of other filters, not only the Bayer one.

And the Canon Dual Pixel AF that works a bit different than the article explains https://www.canonwatch.com/dual-pixel-af-has-become-a-canon-... (but the substance is the same)

In the article it is clearly acknowledged that there are different types. When you create a introduction like this you kind of have to limit your scope

Very impressive work, the interactive examples are excellent and will help many understand the concepts.

I’m a novice when it comes to anything related to web development, so I’m going to go and look at how this type of thing is done. Would anyone have any suggestions or recommendations for getting started?

I teach fluid mechanics with Jupyter and I use pyGEL3D for inserting 3D models, but it’s far from ideal. I’d love to do something like this.

The pinhole diameter slider is backwards; the objects should appear sharper as the aperture shrinks. Also it'd be good if you showed the effect of diminished light-gathering power at the same time, to emphasize the tradeoff.

All of the sliders should be labeled.

All in all, this is a good conceptual effort, but the details aren't strong.
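For what it's worth, the sharpness/light tradeoff mentioned above has a well-known quantitative form: geometric blur grows with the pinhole diameter while diffraction blur shrinks with it, so there is an optimal size. A toy model (the focal length and wavelength below are assumptions for illustration, and the subject is taken to be distant):

```python
import math

def blur_diameter(pinhole_d, focal_len=0.05, wavelength=550e-9):
    """Approximate total image-plane blur of a pinhole camera, in meters.

    Geometric blur is roughly the pinhole diameter itself; diffraction
    blur (Airy disk) is roughly 2.44 * lambda * f / d. Light gathering,
    meanwhile, scales with d**2 -- the other half of the tradeoff.
    """
    geometric = pinhole_d
    diffraction = 2.44 * wavelength * focal_len / pinhole_d
    return geometric + diffraction

# Sweep pinhole sizes from 0.05 mm to ~1 mm and find the sharpest.
sizes = [d * 1e-5 for d in range(5, 100)]
best = min(sizes, key=blur_diameter)

# Analytic minimum of d + K/d is at d = sqrt(K).
optimum = math.sqrt(2.44 * 550e-9 * 0.05)
```

For a 50 mm "focal length" and green light this lands around a quarter of a millimeter, which is in the ballpark of the classic pinhole-camera rules of thumb.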

While this is a very long article, I just want to say: great webpage. I'm saving this for an SEO reference file.

This is so good!

I wonder if the author publishes the source code for the interactive bits somewhere (I could of course look at them from the presented JS, but still). I've been looking for a non-fan-inducing way to generate a DOF-effect, and the one in the first example is definitely good enough for my use case.

I was 45 years old when I learned why squinting your eyes make you see sharper. Never too old to learn!

This article is just superb.

This article is stunning. The design, the way it presents the information, everything. I don't think I've ever seen such a good webpage.

Edit: Now I've seen some of his other pages... This man is a genius.

As someone who spends most of his time either shooting, teaching or hustling, this truly is phenomenal. It makes some of the more mind-bending aspects of photography perfectly clear. Great post!

Great work, we live in amazing times! I can't imagine how humanity will evolve from now on if knowledge is as available and as well and accessibly presented as it is these days!

I can only echo my impressed-ness. With or without an interest in the obvious topic, it is well written, well illustrated, accurate. A superb example of the internet at its best.

It's really fascinating - I had always assumed digital cameras had different sensors for different light colours like eyes do.

Wow - just wow - such a great resource. This is what I love about www - exposure to such enthusiasm and expertise.

Awesome article and blog, as an amateur photographer this is highly educational and fun to read!

This is really interesting. In the first picture what does the blue slider do?

It changes the focal length of the lens, although it does not seem to affect the lens (blue ring) in the left-hand pane which seems like a bug.

There is a demonstration later in the article (below the 5th mention of "focal length") which does affect the lens (blue ring) in the left-hand pane. Rotate the scene so that you are viewing the camera side-on, and you will see the focal length of the lens changing when you move the slider.

It's changing the focal length of the lens (roughly the distance between focus points of the lens). If you rotate the scene, you see the lens is blue. Further in the article, they mention the focal length, using the same coloring.

this is quite a beautiful page, and in the first few seconds you can immediately grasp focus and depth-of-field/bokeh. But then it keeps going!

Is it possible to make a color pinhole camera?

Sure! "Color" vs "Black and White" vs "Sepia" is primarily a function of the photosensitive (aka film or digital sensor) layer.

This is great. Thanks for sharing.

Love this

As a hobby photographer, this is simply amazing and the most intuitive article I’ve come across. This is a must read.

I am curious, however, why we still can’t digitally reproduce bokeh. Apple is getting close. I thought LiDAR would theoretically solve that and could yield indistinguishable renders compared to analog lenses. That would be a game changer in my view and why I would like to see Apple develop a full-frame sensor coupled with their technology.

A large lens captures information over an area, and so to a certain extent can "see around" out of focus objects. A selective blur of a fully focused scene captured from a single viewpoint (i.e. a small lens) can only approximate this effect, because it simply doesn't have access to the same information. Even with a perfect depth map, you still don't know what's behind occluded objects.

If, instead of resolving points of light on the image sensor, you use a group of pixels to resolve an entire tiny image, you can effectively also see around things. You end up with a picture made of many small sections of the large image, each at a different angle: the sub-image on the far left sees a different angle than the one on the far right. This is exactly what the Lytro camera did, and it's why you can take the picture first and focus later. Of course, you sacrifice overall image resolution quite severely:

* https://www.researchgate.net/figure/a-b-The-first-and-second...
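The refocusing trick itself is essentially shift-and-add over the sub-aperture views. A minimal sketch (the data layout here is hypothetical, not Lytro's actual format): views taken from different aperture positions are shifted to cancel the parallax of the chosen depth, then averaged, so objects at that depth align and stay sharp while everything else blurs.

```python
import numpy as np

def refocus(subviews, shift_per_unit):
    """Synthetic refocus by shift-and-add over sub-aperture views.

    subviews: dict mapping aperture position (u, v) to a 2-D image array.
    shift_per_unit: pixels of parallax per unit of aperture position for
    the depth we want in focus. Shifts wrap at the edges (np.roll) for
    simplicity; a real implementation would pad or crop instead.
    """
    acc = None
    for (u, v), img in subviews.items():
        dy = int(round(v * shift_per_unit))
        dx = int(round(u * shift_per_unit))
        shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        acc = shifted if acc is None else acc + shifted
    return acc / len(subviews)
```

With the correct shift, a point at the target depth adds up coherently across views; with the wrong shift it smears out, which is exactly the synthetic bokeh.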

Well yes but you're still going to need a sensor as large as the aperture of the lens you want to simulate, which makes it a non-starter for phones.

Yea, this type of image recording has huge compromises but I personally find it really cool.

So do I! If I had a bit more photography budget I'd try them out.

Also, I'm excited about cameras with dual/quad pixel AF, that are kind of a hybrid between lightfield and traditional cameras. I wonder what kind of sorcery one would be able to do with the light field data in those cameras!

One of the limiting factors for modeling bokeh and flare-like effects is dynamic range limitation. You need extreme HDR capturing to accurately reproduce these effects, as they often play the largest part with bright, especially colored, light sources. I did work on flare simulation and while many effects can be modeled by a rather simple convolution (in the spectral space of course -- you cannot make a rainbow out of RGB straightforwardly), the problem is that kernels (PSFs, point spread functions) for these convolutions have very long tails and it's the shape of these tails that gives most of the 'natural' artistic feel.

The thing is, these tails become apparent only when you convolve with a very very bright source -- which on a typical 12-bit level linear raw image would amount to something like 10⁵-10⁶, i.e. needing 4-8 additional bits of HDR.
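A 1-D toy version of that point (collapsing the spectral aspect to a single channel, with a made-up PSF): convolving with a long-tailed kernel only produces visible flare around a source orders of magnitude brighter than white, and a clipped (non-HDR) capture destroys exactly that information.

```python
import numpy as np

def flare_1d(scene, psf):
    """Convolve a 1-D 'scene' with a point-spread function."""
    return np.convolve(scene, psf, mode="same")

# Long-tailed PSF: sharp Gaussian core plus weak slowly-decaying tails
# (purely illustrative shape, not a measured lens PSF).
x = np.arange(-50, 51).astype(float)
psf = np.exp(-(x / 1.5) ** 2) + 1e-4 / (1.0 + np.abs(x))
psf /= psf.sum()

scene_hdr = np.zeros(201); scene_hdr[100] = 1e5  # very bright source
scene_ldr = np.clip(scene_hdr, 0.0, 1.0)         # clipped, non-HDR capture

flare_hdr = flare_1d(scene_hdr, psf)
flare_ldr = flare_1d(scene_ldr, psf)
```

Fifty pixels away from the source, the HDR version still carries a visible tail while the clipped version's tail is vanishingly small: the flare scales linearly with the (lost) true brightness.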

Here are some useful links on the topic of flare simulation; I believe bokeh has many similar aspects:



Here's another interesting paper [1] on that topic. It shows that synthetically blurred images are significantly more realistic if they're based on recovered radiance maps (HDR).

[1] https://people.eecs.berkeley.edu/~malik/papers/debevec-malik...

Phone cameras are extremely wide-angle so everything is in focus and there is no natural bokeh. To add bokeh, you have to separate the subject from the background, and then also determine how far different parts of the background are. This requires very advanced AI for non-trivial images (see the imperfections in Photoshop's "select subject" tool), which Apple is actually still doing (that's what portrait mode is). But if it's not perfect, it quickly becomes worthless, so in short, they are doing it, but only the most advanced companies can try.
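A crude sketch of the depth-based part of that pipeline, ignoring the hard segmentation/matting problem entirely: keep pixels near the focus depth sharp and blur the rest. Real portrait modes use per-pixel blur sizes, careful edge mattes, and learned subject masks.

```python
import numpy as np

def portrait_blur(img, depth, focus_depth, tol=0.1, radius=2):
    """Toy portrait mode: box-blur pixels whose depth differs from the
    focus depth; leave the in-focus subject untouched.

    img, depth: 2-D float arrays of the same shape (grayscale + depth map).
    Blur uses wrap-around shifts (np.roll) for brevity.
    """
    blurred = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            blurred += np.roll(np.roll(img, dy, 0), dx, 1)
    blurred /= (2 * radius + 1) ** 2
    in_focus = np.abs(depth - focus_depth) < tol
    return np.where(in_focus, img, blurred)
```

The comment above explains why this is still only an approximation: even a perfect depth map gives you no information about what is behind occluded edges, which a real large lens partially sees around.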

Not sure, but as film e.g. captures actual photons from the scene, probably some kind of information is encoded through that.

Bokeh is a kind of space representation, similar to how you can basically "see" a sound stage of properly separated instruments through hearing when someone has a really good sound system, or how dogs have a "5.1/7.1" sense of smell.

How does one encode that I have no idea.

Look into light field photography tech. It is possible to capture a ”volume of light”, within which bokeh & more can be adjusted after the fact. Issue is the amount of data generated and complexity of tech versus getting a ”good enough for most situations” image via simpler means (regular photo). Regular + depth images (Apple LiDAR etc) with help of AI can create something vaguely similar to actual bokeh, but they’re missing a lot of source data.

In the world of 3D rendering (content created from scratch) very advanced & realistic bokeh effects are possible, as an example see http://lentil.xyz for the Arnold renderer.

Wow. I left the CG industry 3 years ago in a sad bout of defeat, involving both an inability to make a decent living and a realization that it would never meet the standards of creative engagement set by my lifetime love of photography and film. But this project is very cool.

Wow, thanks!

Sidenote, it appears the author works for Apple.

The website sliders don't work well with a touch interface. If I use a mouse, I can smoothly move the slider values. But attempting to tap and slide them either has no response or makes the slide move to the extreme left or right.

I'm using Firefox (Nightly) on a Windows 10 tablet (Microsoft Surface Go).

I did some more testing and I think it's due to how Windows interacts with the website's sliders.

I've tried with Chrome on Windows 10 and find that the slider works but only if I touch on the line the slider is sitting on to set a value. Attempting to tap on the round circle and pulling it to slide does not work.

I tried again on Firefox and if I very carefully tap on the line, I can set the slider. But touching the circle and sliding does not work. So, Firefox is more sensitive than Chrome when it comes to determining whether I am touching the line or not.

I presume this is due to how Windows interprets touch events and passes them on to the browser, making the sliders hard to work with in my case.

Sliders work perfectly for me on latest FF on latest LineageOS, which suggests that it's not a problem with the website, rather with either FF or Windows on your device.

It works flawlessly on my iPad in Safari.

Firefox on Android here, every animation works flawlessly.

Wow. What is this tool used to make these animations?


"Eschew flamebait. Don't introduce flamewar topics unless you have something genuinely new to say. Avoid unrelated controversies and generic tangents."


We detached this subthread from https://news.ycombinator.com/item?id=25371653.

But it's way less effort than checking, isn't it? The name in the footer isn't English, so it's hard to know from that and there's no about page that I could find. The Twitter profile makes it more obvious, but I only clicked on it to check because of your comment. Why bother??

I don't see mention of pronouns anywhere. Usually when people don't list pronouns or hints it's because they haven't thought about it or don't care. Without any comment from him, you have no place telling others how to refer to him.

I'm going by the photo on his Twitter account, but it's as valid a guess as any other. If you went by how I look or by my legal name, you would be wrong, and I would be annoyed at someone on a website chastising others for not using what they assumed was correct.

> The site is the work of a guy, "him", and "his" work. We don't have to guess and say "them" and "their" work.

Assumptions are presumptuous and harmful. Everyone is a they/them until otherwise stated. These are extremely useful pronouns.

You're a they/them.

Calm down. You seem to be overly attached to regressive labels. I don't know what drives this in you, but the universe is burning and people just want to be happy. Your life is short; you're going to die and rot away. Is this really something to get worked up over?

In the future, when our descendants are all uploaded to computers or replaced by an AI overlord, people will be all sorts of genders. They won't be bound by yesterday's norms. Would you be a stuffy gender policeperson, or will you just let people be who they are?

>You're a they/them.

Why on Earth are you trying to dictate someone else's pronouns?

“They” is historically understood to be plural. Some information fidelity is lost if “they” can mean both multiple people and a single person.

Language is hard, so I understand the need for compromise. “Zee” could be used to avoid gender centric labeling while maintaining language clarity.

Historically understood to be plural by you.

Actual-historically, from 1375, not so much.


That is not true. Singular they has been in use as long as plural they. It has always been understood to be usable in either situation.

I'm sure there are examples, but in common use it is understood to be plural and has been used that way in all writing for a long time.

That is not correct


> They with a singular antecedent goes back to the Middle English of the 14th century... and has remained in common use for centuries...

> in common use it is understood to be plural

What I said is correct. Prior to about 4-5 years ago you did not find people using they in the singular, hence "in common use".

> I’m sure there are examples, but in common use it is understood to be plural and has been used that way in all writing for a long time.

It is grammatically plural, but it has been accepted for semantically singular use since long before the Victorian effort to impose Latin-inspired rules on English usage which failed to eradicate it despite intense effort. (Victorian prescriptivism did have some good effects — regularized spelling FTW — but trying to eradicate clear and useful usages like singular “they” was one of its less-well-considered, but fortunately also less-successful, efforts.)

> in common use it is understood to be plural

Nothing in what you wrote contradicts what I stated. The man on the street thinks they is plural, and nearly all writing treats it that way. Historical examples excepted.

Anyone else come here expecting a Haskell post?

Impressive work! Nit on design: in my first ~3 attempts to scan through the page to see if it was worth reading, I found myself stopping very early when the black background started, since that usually signals a footer and the end of useful content. The only reason I discovered there was more content was the sanity check that if it's front-page HN news, there has to be more to it :)

Another nitpick is that the sliders are all unlabelled and rely only on their colours matching the text. I found myself having to go back and forth between the text and the sliders, and I imagine it must be worse for colour blind people.

Maybe this should make you think about your "worthiness" scan ;)

Agreed, it is imperfect, but there is too much Internet and too little time in life, so fast heuristics and patterns are useful :)
