Hacker News new | past | comments | ask | show | jobs | submit login

Webcams usually have somewhat poor latency, at least in my experience. And I think the performance critical code is all in OpenCV, not Python.

Webcams don't have memory, so they can't have latency. It's all in the software. In fact, I've done some apps on Linux that just did readout of the frame from the camera sensor to ram, and displayed the buffer on the next framebuffer flip with zero copies. There was no perceived latency.

It just takes some work.

Web cams do have memory, and do a remarkable amount of post processing on the frames, hence the latency.

Source: have used used webcam cube cams in a robotics application where latency needed to be accounted for.

I guess it depends on how the camera chip is configured.

But all kinds of post-processing can be (and is) done on readout (pixel by pixel) without needing to keep the whole frames and go through them. You just keep some calculated parameters from previous frames (and not the whole frames).

You can do a lot of processing this way, incl. scaling, white balance, color correction, effect filters, etc.

Web cams add usb interface and an additional controller chip, so maybe there's some added latency there. But if you use the camera sensor directly over CSI, you can get pretty low latency.

There was ~100ms of latency measured for CSI going directly into an FPGA. The vendor considered the post processing to be part of their value add, and had no way to access the non buffered output, even if you disabled the obvious stuff like noise reduction.

Not to drag this too far off topic, but what kind of post-processing? And is it standardized across different types of webcams?

Noise reduction is a big one. You usually need to interpolate over several frames to do that.

Also depending on the bus being used cameras often compress the image using JPEG to reduce the bandwith (very common with USB cameras, I doubt that the Mac camera does that).

Then on a computer you have the latency of whatever pipeline is being used and how much buffering is involved. For instance since typically the display doesn't run exactly synchronized with the camera you'll have some frames that'll end up "stuck" waiting for the monitor's vsync. If the app then does some internal buffering on top of that you can very easily reach a noticeable amount of latency.

In general you can consider than anything above 100ms is easily noticeable and that's only 3 frames at 30fps and 6 at 60, it's really not that much when you consider the complexity of modern video capture pipelines.

Why use averaging of multiple frames instead of just raising exposure time? I guess it can lead to smaller rhomboid distortion due to rolling shutter? But that should not increase latency, comapred to equivalent increase in exposure time - so that we're comparing apples with apples.

I expect it's more complicated than just simple averaging.

Not really standardized, as there is a broad range of camera hardware out there.

A pipeline might look something like demosaicing -> denoising -> light/exposure adjustment -> tone mapping -> encoding

Some of these steps will likely have hardware support (e.g. demosaicing, encoding) some won't. You may carry a few frames around for denoising, e.g. You have to sync up with the display at some point. Some of the steps might change order too, depending on what you are trying to do and what hardware support you have.

I haven't looked at available hardware specs recently. You used to be able to get very basic capture cameras that offloaded nearly everything to the computer - so that would give you minimal latency on the capture side (limited by your communication channel obviously); However keeping a stable image as resolution and frame rate grow is really hard that way, eventually impossible without a realtime system. More modern cameras are going to do much of that on board.

While not post-processing, often you can configure the exposure duration and frame rate which will make things feel more or less responsive.

This is as much as possible what the Camera cue in QLab[1] does (I'm the lead video developer), and there most definitely is perceived latency. It's better than it used to be, when webcams were DV over Firewire, but it's still enough that webcams are generally a poor choice for latency-sensitive situations like live performance.

The alternative that usually gets used in productions with a moderate budget is a hardware camera with HDMI or SDI output into a Blackmagic capture device. There's still some latency, but it's better than a webcam. It's pretty much down to the minimum of what you can do with a computer -- the capture device still has to buffer a frame, transmit it to the computer, which copies it into RAM, then blits it to video RAM, where it waits for the next vertical refresh.

All that said, the human finger doesn't travel all that fast, so for a HID application like this, you could probably get the latency way down by extrapolating the near-future position of the finger, much like iPadOS does with Pencil.

[1] https://figure53.com/qlab

>Webcams don't have memory, so they can't have latency.

You don't need memory to have latency. Speed of light / electrons is enough...

That said, webcams absolutely have extra processing (and thus latency).

If you can't store the frame, and have to send it out on a shift out from CMOS, where do you get the latency? The processing has to be real-time if you can't store multiple frames.

Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact