
Fast and accurate object detection in high resolution 4K and 8K video using GPUs - joubert
https://arxiv.org/abs/1810.10551
======
reaperducer
Somewhat OT, but standing in line at the grocery store checkout last night I
couldn't help but wonder why so many billions are being spent on complex
visual problems like self-driving cars, then at the same time the checkout
clerk has to look in a laminated binder and then manually punch in a code
number to get a price for bananas. (Happens at both big chains around here —
Safeway and Kroger alliances.)

Surely with all this image AI and ML coming out of SV, there must be a way to
augment cash registers to detect the difference between a dragonfruit and a
kumquat.

~~~
magnetic
Looking at the laminated binder is not a nominal case, thankfully. Think of it
as a cache miss.

Most checkout clerks have memorized the codes and don't refer to the binder
anymore. Perhaps you saw a new employee.

Also, many fruits/veggies have stickers on them with the code (either numeric
or bar-code) so you don't have to do a lookup into "the binder". That's how
you can easily go through the "self checkout" lanes (although there is a
fallback in the form of a "virtual binder" in the UI if you can't find the
code and want to select your produce from a set of buttons/images on the
screen, with optional string searches).

It's often key to have a differentiator in the form of a code because
different items can be visually equivalent (think about bananas vs organic
bananas).

~~~
reaperducer
_many fruits/veggies have stickers on them with the code_

IME, too often the sticker isn't useful because it's fallen off, or is
obscured by the green eco-friendly produce bags.

_That's how you can easily go through the "self checkout" lanes_

I don't use self-checkout, on principle. I'd rather wait in line three minutes
than cost a minimum-wage checkout person their job.

_It's often key to have a differentiator in the form of a code because
different items can be visually equivalent (think about bananas vs organic
bananas)._

This is an excellent point I didn't think of. I don't know that AI will be
able to distinguish regular from organic produce in my lifetime.

~~~
gattilorenz
OT, but...

> I don't use self-checkout, on principle. I'd rather wait in line three
> minutes than cost a minimum-wage checkout person their job

Thanks for stating this, I never realized the potential connection between the
two. I'm going to do the same thing from now on.

------
bcatanzaro
They used P100 GPUs that have a peak throughput of around 10 TFlops. V100
GPUs, or the RTX 2080 Ti/RTX Titan, have north of 100 TFlops.

So it’d be interesting for them to update their performance figures with
current GPUs. It likely would run at much higher frame rates.

------
ingenieroariel
I have six GoPros in a spherical arrangement, and each one produces 960p video
at 120fps; this renders a spherical 4k video.

From what I read, they can run YOLO at 6fps on multiple GPUs. Is a 20x speedup
with more GPUs possible based on their architecture?

~~~
CorvusCrypto
Well, for training the attention-CNN model I think it would speed it up, but as
they note in the paper, running the pipeline they experienced worker
saturation on their dataset (see figure 12). Looks like saturation occurs
after 7 parallel workers with their hardware.

------
nshm
Just 4 frames per second sounds crazy slow.

