500MP camera that can identify every face in a crowd of tens of thousands (telegraph.co.uk)
61 points by systemfreund on Sept 28, 2019 | 20 comments



This sounds like hell, especially coupled with China trying to "predict crime" before it happens [1] and a "social credit" score based in part on who you are seen with [2]. It's only a matter of time before this technology is implemented in your country [3].

[1] https://www.independent.co.uk/news/world/asia/china-ai-crime...

[2] https://www.bbc.com/news/world-asia-china-34592186

[3] https://www.nytimes.com/2019/04/24/technology/ecuador-survei...


This looks to be either 36 or 40 cameras; I can't quite count the discrete units correctly from the photo.

Let's say it's 40 x ~14MP cameras fixed into a housing.

What's stopping anyone from using higher-resolution subunits, and more of them?

Or, if you bolt ten of these things together do you get a 5000MP camera?


Well, they are having to align some number of sensors ensuring sufficient overlap to ensure that they can process the images together. This means they either are warping, aligning, and combining the image into one frame to do processing (computationally intensive), or they have significantly more overlap to ensure that one of the frames contains the entire face (storage intensive), or they are doing some super-resolution processing trickery with lower resolution sensors.

500MP x 10 images per second is 5 billion pixels per second that they have to process. That is equivalent to processing 20 4K30 streams at once, even without taking into account the extra data you would need. Then you actually need to store the data somewhere and do the facial recognition. How many of these do you think would be reasonable to have in an area?
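
For anyone who wants to check the arithmetic, here is a rough sketch. The 500MP and 10 images per second figures are from the comment above; treating "4K" as 3840x2160 is my assumption, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the pixel throughput quoted above.
# Assumption: "4K" means 3840x2160 (UHD); the thread does not specify.

def pixels_per_second(megapixels: float, frames_per_second: float) -> float:
    """Raw pixel rate for a sensor of the given size and frame rate."""
    return megapixels * 1e6 * frames_per_second

def equivalent_4k30_streams(pixel_rate: float) -> float:
    """How many 3840x2160 @ 30 fps streams carry the same pixel rate."""
    return pixel_rate / (3840 * 2160 * 30)

camera_rate = pixels_per_second(500, 10)
streams = equivalent_4k30_streams(camera_rate)
print(f"{camera_rate:.2e} pixels/s, roughly {streams:.1f} 4K30 streams")
```

This only counts raw pixels; overlap between sensors, storage, and the recognition step itself would all come on top.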


> Well, they are having to align some number of sensors ensuring sufficient overlap to ensure that they can process the images together.

It would be much easier to avoid the issue of stitching altogether and simply process the images separately, then merge the resulting output data. As far as I'm aware, you're not going to find a GPU that can process a 500MP image efficiently.


But when the images are processed separately, your required actual recorded resolution could go up quickly as your overlap between imagers needs to ensure that a face to be identified is contained within a single imager, or at least a large enough portion of the face is contained within an imager to give a sufficiently high confidence of being correct. So as I mentioned, this becomes even more storage intensive, though because there is less image processing, it becomes more CPU efficient.
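
A rough way to put a number on that storage cost, as a sketch only: if adjacent imagers overlap by at least the largest face size in pixels, any face fits entirely inside one imager. The sensor dimensions and the 200-pixel face size below are hypothetical values picked for illustration, not figures from the article, and the formula only approximates interior imagers in a large grid.

```python
# Estimate how much extra pixel data is recorded when adjacent imagers
# overlap enough that any face fits entirely inside at least one of them.
# Assumptions (not from the article): a ~14MP sensor of 4608x3072 pixels
# and a largest face of 200 pixels across.

def overlap_overhead(tile_w: int, tile_h: int, face_px: int) -> float:
    """Ratio of recorded pixels to uniquely covered pixels per imager."""
    if face_px >= min(tile_w, tile_h):
        raise ValueError("face is larger than a single imager")
    # Each imager contributes new coverage only outside the overlap band.
    unique_pixels = (tile_w - face_px) * (tile_h - face_px)
    return (tile_w * tile_h) / unique_pixels

factor = overlap_overhead(4608, 3072, 200)
print(f"recorded pixels are about {factor:.2f}x the unique scene coverage")
```

Under these assumed numbers the overhead is modest, but it grows quickly as faces get larger relative to the imager, which is the case for people close to the camera.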


I disagree. If you miss an individual on the first capture, there's always the second capture. The number of people you fail to detect because their face is halfway between two sensors would likely be far exceeded by the number of faces you fail to detect because there's an issue within the algorithm itself or simply because the face isn't fully visible.

That's okay. There are of course limitations. You can't detect faces you can't see (e.g. somebody walking the wrong direction). If you're detecting 10k faces at an accuracy of 99.99%, you'd expect about one failure in every detection frame.
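
The failure estimate can be checked with a small sketch, assuming each face is an independent trial with the same per-face accuracy (a simplification; real misses tend to be correlated, e.g. by lighting or angle).

```python
# Expected misses per frame for 10k faces at 99.99% per-face accuracy.
# Assumption: failures are independent and identically likely per face.

faces_per_frame = 10_000
accuracy = 0.9999

# Expected number of failures in one frame.
expected_failures = faces_per_frame * (1 - accuracy)

# Probability that a given frame contains at least one failure.
p_any_failure = 1 - accuracy ** faces_per_frame

print(f"expected failures per frame: {expected_failures:.2f}")
print(f"chance of at least one failure: {p_any_failure:.1%}")
```

So "one per frame" holds as an average; under the independence assumption, a fair share of frames would still have no failure at all while others have several.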


I guess the headline here isn't that these plucky scientists managed to stick a bunch of cameras together, but rather that the Chinese state can stick one of these in a crowded area and instantly identify everybody within it.


> these plucky scientists

I assume you used that word sarcastically?

   From Wiktionary:

   plucky (comparative pluckier, superlative pluckiest)

     Having or showing pluck, courage or spirit
     in trying circumstances.

   Synonyms

    brave
    spunky
    feisty

I dunno. IMO a "plucky" Chinese scientist would be one who would refuse to work on something as dystopian as this.


I think that was sarcasm...


Yeah, I guess achieving arbitrarily huge images is just a matter of increasing the resolution/number of sensors and then applying the correct optics. The bottleneck (and technical achievement) of a system like this would be the processing power required for stitching and analyzing these massive images in anything close to real time.


The processing can be parallelized too. Just split the data into chunks and hand it out to GPUs.

I don't foresee any scaling limits really, apart from money and power availability.
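
The chunking idea can be sketched like this. It is a toy: the "detector" is a hypothetical stand-in that reports every nonzero cell as a face, and a thread pool stands in for a set of GPUs. The point is only the split, process, merge shape.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tiles(frame, tile_h, tile_w):
    """Yield (row_offset, col_offset, tile) covering the frame, no overlap."""
    for r in range(0, len(frame), tile_h):
        for c in range(0, len(frame[0]), tile_w):
            tile = [row[c:c + tile_w] for row in frame[r:r + tile_h]]
            yield r, c, tile

def detect_in_tile(job):
    """Hypothetical stand-in detector: every nonzero cell counts as a face.

    Returns hits in full-frame coordinates by adding the tile offsets.
    """
    r0, c0, tile = job
    return [(r0 + r, c0 + c)
            for r, row in enumerate(tile)
            for c, value in enumerate(row) if value]

def detect_faces(frame, tile_h, tile_w, workers=4):
    """Process tiles concurrently, then merge the per-tile results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_tile = pool.map(detect_in_tile,
                            split_into_tiles(frame, tile_h, tile_w))
        return sorted(hit for hits in per_tile for hit in hits)

# Tiny 6x8 "frame" with two marked faces.
frame = [[0] * 8 for _ in range(6)]
frame[1][2] = frame[4][7] = 1
hits = detect_faces(frame, tile_h=3, tile_w=4)
print(hits)
```

Tiles here do not overlap, so a real face straddling a tile boundary would be split across two workers; handling that needs overlapping tiles plus de-duplication at the merge step.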


Exactly, this is a trivially parallelized problem. My iPhone 11 Pro can easily do real-time face boundary display with dozens of faces at once on a 12MP sensor (and that’s not its sole purpose; if all it cared about was face boundary identification, I’m sure it could do a lot more). With custom ASICs, the face boundary checking could be embedded alongside the sensor array in this mega camera sensor setup. Then once a face boundary is identified, you send just those pixels off to a GPU array to do the face identification.


Other things will probably limit visibility, like earth curvature, buildings, etc. It's probably better to have several of those arrays? I would be really curious to get details on the lens setups for this array though.


I really don't understand the concern. We keep putting more power in the government's hands in the name of "peace" or "security". This is just one step further.


> China currently has an estimated 200 million CCTV cameras watching over its citizens

I wish reporters would stop mixing up CCTV cameras with facial recognition cameras. 200 million CCTV cameras is a meaningless number in the context of facial recognition. How many of those cameras are capable of facial recognition? No wait, that’s way too “sciencey” and detailed for this clickbait article.


Facial recognition is a software function which can be performed on image data coming from any camera with sufficient resolution; that most likely includes the majority of surveillance cameras deployed in China. China might suffer under an antiquated and corrupt political system, but the country does not lack modern technology.


Furthermore, facial recognition is constantly improving, and a camera that might've been inadequate for it a few years ago can now be used for matching, if not training.

Combine that with the replacement rate of cameras, and you get two trends converging: the vast majority of those 200M cameras can already be used for recognition, and the ones that can't today will be soon.


My point was not about what China is doing and what they are capable of. My point is that the reporting is shoddy.

You simply cannot recognise someone’s face in a large crowd with a traditional wide angle 320x240 resolution cctv camera mounted high up on a pole. This is not the movies where you can “zoom in and clarify”. Traditional cctv can place people at a certain time based on the colour of their clothes, their height etc. It is used for gathering evidence after a crime has been committed, not for automatically recognising people.

Modern cctv has high enough resolution for facial recognition but nobody bothers reporting numbers properly. It’s the differentiation between the capabilities that I want to point out. It’s like saying that there are 1 bn toy guns AND real guns in existence. It’s a meaningless number.


Of course I don't have a number or source either, but it's not really that common to have such low-resolution cameras even for surveillance any more. At least not in China.


China has a very large number of cameras on their roads that essentially constantly surveil all of the drivers on the main roads. I imagine that a sizable chunk of those quoted CCTV cameras are those.



