K|Lens – The third dimension in image acquisition (k-lens-one.com)
83 points by Tomte 51 days ago | 47 comments

A couple of reasons why I think this is cool.

First, why would you want lightfields? Computational photography -> variable bokeh (e.g. fix the bokeh for different depths, make it smoother, etc.), depth info (extremely valuable for VFX), having some variability in your camera position - so being able to add camera shake more realistically. There's a bunch of other cool things you could do.

I've only taken a brief look at existing lightfield solutions, but this looks cool. It's novel because it seems to use only the bare minimum number of ray samples needed to be called a lightfield, and makes up for it with heavy compute (even heavier than a typical lightfield).

This approach has interesting tradeoffs. A typical lightfield camera uses, say, 25 ray samples. That many samples still doesn't let you avoid heavy compute, but it does cut the resolution by a lot (25x in our case - instead of having 50 megapixels, you're at 2 megapixels). A typical implementation puts a micro-lens array in front of the camera sensor, which adds further overhead because tiling circles leaves a bunch of area empty.
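The back-of-the-envelope numbers above can be sketched like this (the 50 MP sensor and 25 samples are the illustrative figures from this comment, not actual specs of any camera):

```python
import math

# Effective per-view resolution of a microlens-array lightfield camera.
sensor_mp = 50_000_000                   # 50 MP sensor (illustrative)
ray_samples = 25                         # e.g. a 5x5 grid of views per lenslet
effective_mp = sensor_mp / ray_samples   # pixels left per view: 2 MP

# Circular lenslets tiled on a square grid waste area: pi*r^2 vs (2r)^2.
circle_fill = math.pi / 4                # ~78.5% of each square tile is usable
usable_mp = effective_mp * circle_fill

print(f"per-view resolution: {effective_mp / 1e6:.1f} MP, "
      f"~{usable_mp / 1e6:.2f} MP after circular-aperture waste")
```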

Their proposed approach "only" leaves you with 9x less resolution (which is cool), which they try to reconstruct / upscale back to 2x less resolution (which is also cool and should be workable).

3-4 years ago we already had Lytro / the Light camera. Wasn't great apparently[1], but a much more practical idea than this massive lens.

Judging from the samples, the iPhone with ML-based depth sensing and Lidar also seems to do a much better job. As will most of the latest ML models available.

[1] https://www.theverge.com/circuitbreaker/2018/4/10/17218758/l...

Light field tech is so exciting.

Light fields will be extremely conducive to building ML models around. It'll be possible to generate incredibly rich and realistic 3D scenes with the additional information light fields convey. Possibly with some downsampled number of rays to make compute easier.

Multiple view scene constructions will make for incredible filmmaking that can be tweaked in post. No shot, reverse-shot. No cheating the camera. (That's assuming we don't switch from optics to mocap + rendering outright.)

Very cool. I do always wonder, with projects like this that rely heavily on proprietary software: what happens if a few years down the road the company folds? Do I then have a fancy brick to mount on my camera? I was hoping to see a more explicit statement about whether the raw data remains reasonably usable, or at least readable, outside the proprietary software. That may be the case and I just missed it, but as a potential customer, the possibility of the lens being useless without their software is a bit spooky. That said, the tech looks great and I hope they do really well!

I had one of the first Lytro light field cameras [0], and that's exactly what happened. The camera was essentially bricked, and all my photos were deleted from the only cloud platform that could render them.

Luckily, or maybe unluckily, for me the camera ended up taking terrible pictures and no images of any consequence were taken with it.

[0] https://en.wikipedia.org/wiki/Lytro#Original_Lytro_Light_Fie...

This, exactly. The software never went past “cute demo”.

Don’t worry too much. It looks like this is Bring Your Own Camera so the files are whatever format you’re used to working with.

The algorithm for processing plenoptic images is also well-known and this company is not the first to do it. Someone will come up with a free tool for dealing with these images.

Good point! I’ve been excited to see where light field tech leads on the practical side of things. If indeed they’re able to overcome the resolution limitations using AI assist then this is a great step forward!

Sample images at https://hidrive.ionos.com/share/ayvcprk57q show the resulting 5k*5k depth map. It has lower depth resolution than I would expect from a $4k lens. It would be interesting to compare this depth map with an RGBZ image generated from Google's or Apple's Portrait Mode AI filters. The primary benefit of this is probably the ability to capture RGBZ video, but that can also be generated from 2 or more DSLR cameras + software.

It is kind of disappointing that it doesn't seem to map any of the folds in the garment, the individual fingers, or other details. It also seems to get the depth around the knob by the right elbow quite wrong. All-in-all, no apparent order-of-magnitude (if any?) improvement over single image "AI" algorithms that have been shown before.

I'm not sure how good our instinctive expectations are. The folds in the shirt, for example, are very prominent in the normal photo because of the shadows. But the difference in depth really isn't that large.

Say you have 256 shades of gray in RGB, and you want to spread them evenly over distances of 1-5 m. That gives you a 1-step increase in brightness for every 1.6 cm or so, which happens to be pretty close to what I believe these folds' magnitude might be. I'm not entirely sure how prominent the difference would be to the naked eye. IIRC, the MPAA considers even 50 to be plenty.

I'm leaving out lots of details (pun not intended, but appreciated): you'd spread your resolution logarithmically, for example, not linearly. And, of course, you could work with more than 256 levels. But it's a different domain, and I believe some intuitions are off if you compare it with the x and y dimensions.
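The arithmetic above, as a sketch (the 1-5 m range and the 8-bit assumption are the illustrative numbers from this comment):

```python
# 8-bit depth buffer spread over 1-5 m, linear vs logarithmic spacing.
near, far, levels = 1.0, 5.0, 256

# Linear: equal distance per gray level, ~1.57 cm each.
linear_step = (far - near) / (levels - 1)

# Logarithmic: equal *ratios* per level, so precision concentrates up close.
ratio = (far / near) ** (1 / (levels - 1))
step_at_near = near * (ratio - 1)        # finest step, right at 1 m
step_at_far = far * (1 - 1 / ratio)      # coarsest step, out at 5 m

print(f"linear: {linear_step * 100:.2f} cm/level")
print(f"log:    {step_at_near * 100:.2f} cm near, {step_at_far * 100:.2f} cm far")
```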

I'm not so convinced I'm seeing the limits of resolution, either angular or depth.

Using parallax to calculate depth undoubtedly has fundamental limitations for far-away details, and mapping to an 8-bit depth buffer is another very reductive step in that regard. Regardless, I'd expect even the folds to show up, at least qualitatively, if I looked at an 8-bit rendering of a 3D scene's z-buffer. The gradient contour steps here are clearly visible, and dense, yet fail to follow the folds, indicating that the underlying depth data simply doesn't track them at all.

Let's take the sleeves then -- clearly a large difference in relative depth, yet they blend into the rest of the garment. My impression is very much that of standard depth reconstruction "AI" that more or less guesses depths of a few image features, and does some DNN magic around where to blend smoothly and where to allow discontinuities, with the usual "weirdness" around features that don't quite fit the training sets.

Possibly all we can get out of this "parallax" method of depth reconstruction isn't a whole lot better than single-image deep learning. That wouldn't surprise me: it ultimately relies on the same approach for correctly recognizing and mapping image features across the 9 constituent images, versus a true lightfield sensor that captures the actual direction of incoming light.

Look at the shirt between the sleeve and whatever the cook is sprinkling. There's an obvious, soft "bump" there that doesn't seem to correspond to anything in the actual geometry - I'm betting it's an interpolation artefact.

This solution uses the same sensor for depth and RGB, which is a win. When you use separate depth and RGB sensors, like most smartphones do, the images don't align exactly, which causes problems at close range. You also get weird artifacts when the depth sensor saturates but the RGB does not.

I think this design will require a lot more CPU for post-processing, though.

That's super neat but that is quite the price tag for a KickStarter lens. They're backed by a solid lens manufacturer but this is going to need specialized software for post image processing which is a risk of its own. I would need a really compelling reason to spend $2-4k on this lens instead of others I can guarantee I'll use.

I'm tempted but I think I'll pass.

Recently my son was enthusiastic about anaglyph images viewed with red-cyan glasses. These are much better than what they had when I was a kid because the cyan filter passes both the green and blue channels. You can use whatever color you like as long as it isn't red or cyan.

I got enthusiastic too.

We found out that you can't (in general) make a good anaglyph image from a 2-d image and a height map, because the left eye and the right eye can see around objects, so the 2-d image is usually missing pixels that the left and right eyes would see.
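A toy 1-D sketch of that problem (my own illustration, with made-up depths): shifting each pixel by a disparity derived from its depth leaves holes next to the foreground object - pixels the original 2-d image simply never recorded.

```python
import numpy as np

width = 20
depth = np.full(width, 10.0)      # background at 10 m
depth[8:12] = 2.0                 # a foreground object at 2 m

baseline_focal = 20.0             # arbitrary baseline * focal product
disparity = np.round(baseline_focal / depth).astype(int)  # 2 px bg, 10 px fg

# Reproject: each source pixel lands at x + disparity[x] in the new view.
# (Simplified: ignores proper occlusion ordering; we only care about holes.)
target = np.full(width, -1)       # -1 marks a hole with no source pixel
for x in range(width):
    nx = x + disparity[x]
    if 0 <= nx < width:
        target[nx] = x

holes = int(np.sum(target[disparity.min():] == -1))
print(f"{holes} disoccluded pixels with no source data")
```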

You could reproject an anime image if you had the cels; you could reproject the image for the left and right eyes by simulating the multiplane camera.

I am just starting to take 3-d photographs by moving my camera and that way I get all the pixels for the left and right eye. That lens might be better for taking portraits at very close range but I don't know if it would beat my current lenses. Certainly I could get a really good telephoto lens for that price today whereas I don't even know if I'll be taking 3-d photos a year from now when that lens would arrive.

As crazy as it sounds, look for a cheap used Nintendo 3DS. They have a stereo camera for the lenticular display, and some Photoshop filtering could turn the pairs into anaglyphs.

I've got a New 3DS in my backpack. I love the 3D effects in Fire Emblem.

I'm a big fan of Loreo stereo lenses. They're well made and inexpensive.

You can use stereo pairs to calculate depth, and in this case you only lose half your horizontal resolution (vs. losing significant horizontal and vertical resolution on the K|Lens).
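For reference, a minimal sketch of the standard stereo disparity-to-depth relation (the focal length and baseline below are made-up numbers, not Loreo or K|Lens specs):

```python
# Pinhole stereo model: depth Z = f * B / d, with f in pixels,
# baseline B in metres, and disparity d in pixels.
focal_px = 2800.0    # hypothetical focal length in pixels
baseline_m = 0.065   # hypothetical 65 mm baseline, roughly eye spacing

def depth_from_disparity(d_px: float) -> float:
    """Depth in metres for a disparity measured in pixels."""
    return focal_px * baseline_m / d_px

for d in (182, 91, 18.2):
    print(f"disparity {d:>6} px -> {depth_from_disparity(d):.1f} m")
```

Note the hyperbolic falloff: depth precision degrades quickly for small disparities, i.e. for distant objects.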


Apple has both an SDK for turning a series of photos into 3D models and scenes, as well as LIDAR sensors in their high-end phones and iPads. Notably, though, last I checked these are separate initiatives and the SDK doesn’t use the LIDAR data, but depth information the camera gathers by using two of its lenses.

With lightfield cameras, you do actually get several slightly different perspectives. You can imagine it as if there were several pinhole cameras around the (large) front lens element.

It looks like this one splits the field 3 ways in 2 different directions for a total of 9 images. I guess I could pull out the middle left and middle right and make an anaglyph from that.

The depth map in all the examples looks like it's very low resolution with some fancy upscaling.

It has to be. This is like doing SfM = "Structure from Motion" with 9 images, each of which has roughly 2MP.

To me, this looks like a rip-off of https://raytrix.de/ who have been granted a patent for pretty much the same thing in 2013. And they have been selling pretty much the same thing, except that for Raytrix, the lens is pre-installed by them whereas with K-Lens, you install it onto your own camera.

Also, they effectively split the resolution by ~3x on each axis, because they are multiplexing a 3x3 image grid onto the same sensor. That means a $50k RED MONSTRO isn't "good enough" anymore to get 4K output.
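The resolution arithmetic, as a sketch (4K UHD target; the MONSTRO's 8192x4320 pixel count is its published spec):

```python
# Multiplexing a 3x3 grid of views onto one sensor divides each axis by ~3,
# so native 4K per view needs ~3x the output resolution on each axis.
out_w, out_h = 3840, 2160        # 4K UHD target
grid = 3                          # 3x3 sub-images

need_w, need_h = out_w * grid, out_h * grid
monstro_w, monstro_h = 8192, 4320

print(f"sensor needed: {need_w}x{need_h} (~{need_w * need_h / 1e6:.0f} MP)")
print(f"MONSTRO is enough: {monstro_w >= need_w and monstro_h >= need_h}")
```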

There have been several commercial manufacturers of lightfield cameras (Lytro being another one). Which part is patented?

The hardware tech is different too, Raytrix seems to be using a microlens setup (like most other lightfield cameras). A microlens setup also reduces resolution fwiw.

I was referring to US 8,619,177 B2. This one also uses a multi-lens arrangement which reduces resolution.

If I remember correctly, Lytro went bankrupt in the period between when Raytrix filed their patent and were granted it, so they "got lucky".

I'm surprised that there's a patent like that - university researchers have been making custom microlens lightfield setups for some time.

If I understand the K|Lens implementation correctly, they stuck a mirror-based kaleidoscope in the middle of the lens. I don't know if there's anything extra on top of that.

Agree. BTW, it looks like the K|Lens patent was applied for in 2012, while the Raytrix patent was still in evaluation. And both are German companies. There must have been a wave of research into this kind of technology.

That said, I wonder what KLens has been doing from 2014 (when their patent EP 2 533 199 B1 was issued) until now.

Interesting choices of examples there.

The still of the motorcycle rider has obvious motion blur, which immediately raises the question of how that can work with a depth channel. Can you access the depth data in some other way than just a Zdepth channel? If not, there are some serious limitations to what you can do with it.

In the video example, the depth channel flickers a lot. This seems to indicate the depth is some kind of calculated estimate, and not very reliable.

> The depth maps we currently generate come from our hardware and software prototypes. And yes, we’re already at incredibly high levels. That’s because our “deep-learning algorithms” learn from all small errors and inaccuracies.

I like that they put "deep-learning algorithms" in quotes.

Can anyone point to a very simple guide on how lightfield cameras work, by any chance? I've heard of the Lytro camera, which I believe used an array of lenses above the sensor.

Were all those lenses the same, though?

I believe the Lytro camera could 'refocus' an image, but I didn't think it could obtain 3D information. If that's the case, could anyone guess how 3D information is obtained from this camera?

From my brief reading of their website, I'm not sure I would call this a lightfield lens.

It's just 9 images from 9 slightly different offsets along with some clever post-processing. (Although maybe the difference is one of degree rather than kind - the Lytro is a "micro-lens array", so maybe the question becomes "how many images do you need to be a light field image?")

My go to for light field cameras would probably be this video:


It explains the specifics of how light field cameras work quite well, but doesn't go too deep into light fields. For more of an overview of light fields in general, I can recommend this video:


My answer to your question would be that light field cameras sample discrete perspectives from a 4D light field. These samples can either be 1) combined directly (requires a very high view density), 2) interpolated to try to recover the continuous 4D light field function (an active area of research), 3) downsampled to volumetric 3D (Google's approach with "Welcome to Light Field", etc.), or 4) downsampled to 2D + depth (a depth map).

Each of these uses different methods.
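As a sketch of what "combining directly" means, here's a toy shift-and-sum refocus over a 3x3 grid of sub-aperture views (the views are faked by cropping one synthetic array; a real camera would demultiplex them from the sensor):

```python
import numpy as np

H, W = 32, 32
rng = np.random.default_rng(0)
scene = rng.random((H + 8, W + 8))   # padded "scene" so shifted crops fit

def view(u: int, v: int) -> np.ndarray:
    """Fake sub-aperture image: the scene seen from grid offset (u, v)."""
    return scene[4 + u : 4 + u + H, 4 + v : 4 + v + W]

def refocus(alpha: int) -> np.ndarray:
    """Shift each view by alpha*(u, v) and average; alpha picks the focal plane."""
    acc = np.zeros((H, W))
    for u in (-1, 0, 1):
        for v in (-1, 0, 1):
            acc += np.roll(view(u, v), shift=(alpha * u, alpha * v), axis=(0, 1))
    return acc / 9

# This synthetic scene sits on the plane with 1 px disparity per view step,
# so alpha=1 aligns all nine views and the average comes out sharp.
img = refocus(alpha=1)
print(img.shape)  # (32, 32)
```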

This looks boring. 3D laser scanning for virtual home tours and drones like this are far more exciting. https://youtu.be/rzOrle_Dq_E

Want to use an iPhone 12/13 to make models? https://youtu.be/0F3uFeqFOOw

Or whole rooms/spaces/outdoor scenes using the '3D scanner app' (works best with lidar equipped iphones)

Looks like a 3x3 light field lens for full frame.

So will SNR be 9x worse than with a typical lens? On top of requiring a narrow aperture? I suppose with a modern full-frame there's a very low noise floor, so you might as well use it.

No, I don't think it would increase noise levels. The number of pixels the sensor captures remains the same; the resolution is cut to 1/9th. Each individual pixel in the final image will be some sort of combination of the respective, superimposed pixels. With naive averaging, a noisy pixel would lead to a larger but less prominent artifact. That, already, is close to what noise reduction in post-production does. And I can't think of a reason not to cut outliers altogether, which would eliminate that noise.

I didn't say noise levels, I said SNR. Signal is divided by 9, while noise remains constant...
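A toy shot-noise simulation (my own sketch, not from either commenter) shows both sides of this: each sub-view's SNR drops by sqrt(9) = 3, and averaging the nine views buys it back under pure shot noise - though per-pixel read noise, which is what "noise remains constant" points at, doesn't average away the same way.

```python
import numpy as np

rng = np.random.default_rng(42)
photons, trials = 9000, 200_000

full = rng.poisson(photons, trials)               # all light on one pixel
split = rng.poisson(photons / 9, (9, trials))     # same light shared by 9 views

snr = lambda x: x.mean() / x.std()
print(f"full frame:   {snr(full):.1f}")           # ~sqrt(9000) ~ 94.9
print(f"one sub-view: {snr(split[0]):.1f}")       # ~sqrt(1000) ~ 31.6
print(f"9-view mean:  {snr(split.mean(axis=0)):.1f}")
```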

Have we broken the web page? Goes to a 403/forbidden page for me.

Apparently, yes. I posted a link to an archive but got downvoted for my trouble. [1]

[1] https://web.archive.org/web/20211003065357/https://www.k-len...

Interesting that they're running a $75K Kickstarter for this. They must have some cash in the bank to have funded development on this, so is the Kickstarter purely a marketing trick?

IIRC this is a spin-off of a German research institute, which most likely provided the funding for the groundwork... Productizing is then left to private enterprises.

German news article about it: https://www.golem.de/news/k-lens-one-erstes-lichtfeldobjekti...

Whatever this was, it's apparently been hugged to death by HN. Currently 403's. I did find an archive from back in October, though.[1]

[1] https://web.archive.org/web/20211003065357/https://www.k-len...

f6.3 minimum seems very high.

It's f/2.2 with the light divided by 9, which works out to roughly f/6.6.
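The arithmetic, as a sketch: the f-number scales with the square root of the light loss, so splitting f/2.2 nine ways lands near the quoted f/6.3.

```python
import math

# Splitting the light 9 ways cuts the effective aperture area by 9,
# and f-number goes as the square root of the area loss.
f_base = 2.2
light_split = 9
f_eff = f_base * math.sqrt(light_split)

print(f"effective aperture: f/{f_eff:.1f}")  # f/6.6
```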
