MAC address tracking is a good, simple way to get an approximate number of people: it's very easy to install, it requires only a WiFi antenna, and the data is easy to translate into a count. However, there are privacy and legal concerns which have prompted phone vendors to obfuscate this data. And there can easily be zero (or more than one) broadcast MAC address per human, even when filtering by OUI.
Retailers have been using specialized "people count" technology like infrared break-beams, thermal cameras, and CCTV+CV systems for a long time. CCTV is the most accurate by far, but it's also commonly considered to be an invasive level of surveillance (and rightly so). It's also not particularly accurate in adverse situations. At Density, we looked at (and tried) many of these technologies - but ultimately found them lacking because they were either too invasive or too inaccurate. The device we've built uses a lower-resolution depth-only sensor because it can return extremely accurate results, without having the capability for facial recognition or other analysis techniques that are harmful to privacy. So far the technology is working very well - with an algorithm based on a deep-learning "human classifier" we're seeing accuracy above 99% in many of our deployments.
Here's a cool (and in-progress, excuse our dust) summary of some issues that the technology has to navigate: https://faq.density.io/algo/
We've switched from a people-counting location analytics system based on mobile phone MAC addresses (I believe founded by someone who worked on Google Analytics) to a stereo-camera-based system (bellwether).
We switched because our original provider's business model shifted (I believe in part due to the change in MAC addressing by telecom providers?).
We believe strongly in personal freedom and personal data security.
But we are scratching our heads as to what we can do to ethically and affordably better understand and serve our customers in an integrated (digital & meatspace) way.
I'm not sure if our product would be able to meet your needs right away, but reach out to firstname.lastname@example.org if you'd like more information about what we have.
Hint: You need more than just 1 antenna ;)
Also the MAC address is obfuscated until a device fully connects to an access point.
Disclaimer: I may or may not have worked on such a system before.
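On the obfuscation point: randomized addresses normally have the "locally administered" bit set (the second-least-significant bit of the first octet), so a counter can at least separate them from burned-in addresses before counting. A minimal sketch, assuming colon-separated MAC strings; the sample addresses and function names are made up for illustration:

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the locally administered (U/L) bit of the MAC is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def count_by_kind(macs):
    """Split a list of MACs into likely-randomized and globally assigned counts."""
    randomized = sum(1 for m in macs if is_locally_administered(m))
    return {"randomized": randomized, "global": len(macs) - randomized}


# Hypothetical sample data, not captured from real devices.
sample = ["da:a1:19:00:00:01", "3c:5a:b4:00:00:02", "02:00:00:00:00:03"]
print(count_by_kind(sample))
```

A set bit only says the address was not assigned by the manufacturer; it does not tell you how many physical devices are behind those addresses.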
Fully open source MIT Licensed project that does it nicely by the look of it:
The depth map you provide doesn't contain the full output of the sensor, but how is that any different than using cameras but not passing the data up any higher?
If you look at this video:
https://www.youtube.com/watch?v=YOKMx7EDVys you can see the kind of image that Density has access to, though their depth map they show is only what's on the left side.
Within spec, the algorithm does well tracking 8+ people at a time, and even when a line of people extends through a doorway we usually don't have to track more than that.
I was thinking more in terms of lidar tracking of people. It's been around for nearly a decade and works quite well.
Distance - I'd imagine that the distance from the top of a door frame to the top of the head of the average shopper is roughly equivalent to the distance from the Soli device to someone's hands.
Software - The Soli hardware streams radar contacts to software trained to separate and classify them, not too unlike the approach that Density is already using.
Benefits over a CV approach might come in power usage and lower complexity in the contact classification software.
The hard thing about that approach might be handling conflicting signals and determining a "source of truth" in adverse situations. Generally speaking, there are variables in the real world that will trip up any given strategy (even if you were to station a human at each doorway with a hand counter). And while nothing can count perfectly 100% of the time, the difficulty is in finding an affordable strategy that will be really close or perfect most of the time.
Here's a neat demo of a multi-sensing device (not ours) which combines many signals to guess the activity taking place in a room: https://www.youtube.com/watch?v=aqbKrrru2co
We are a CCTV/deep learning system that uses existing infrastructure (think old school grainy security cameras), and we're also able to capture additional information like age and gender of a person. Unlike other invasive/HD CCTV systems, we don't use facial recognition, and we also work over very wide areas, not just over restricted doorways.
It's commonly used by some retail and shopping malls to "estimate" traffic and conversion. It's great for determining peak times and visit duration. This kind of technology has been commercialized for some time now.
So we built a tool for the NGOs in this sector, with a very similar technology used by OP as basis:
One key difference is that each Aileen box is a client, which will upload its findings to a server, so that NGO management staff can review it.
Now that the basis is there (and being piloted), we hope to make it into a product tailored to the humanitarian aid sector. One key aspect is taking privacy seriously, others will drill down into the features that refugee camp managers tell us they'd need (for instance alerts if populations seem to be on a rapid move).
Seems like you wouldn't need to do that, just do something akin to a DoS where the traffic is spoofed to contain a different MAC? I guess you'd need driver level access to the adaptor though.
Or, maybe you can just send data whilst flipping random bits in the MAC address memory address location?
Sounds like a fun project!
We entered/exited the store several times to check out the general store appearance.
After about ten exits/entrances, the clerks approached me to ask me to stop doing so as I was wrecking their purchase/visitor ratio.
If you pay attention in a mall, you'll see employees crouching to avoid triggering the sensors (they're usually at 1.5m height).
It isn't "just" Amazon warehouse employees and other low-paid folks.
You can do this with just receive though. A $10 USB TV Tuner and a Raspberry Pi will listen in to most cellular bands, and pull enough out of the over-the-air machine to machine chatter to do a similarly accurate job of counting cellphones...
The cellular transmitters (can) run with a lot more power than Wi-Fi, and are on lower frequencies, so the range at which you'll detect them is significantly longer, which might make localised device counting less usable.
But then you are not in the price range anymore.
Retailers rolled back to IR sensors and are now slowly adopting camera-based counters with deep learning for disambiguation.
Isn't it odd that you can't read electromagnetic signals penetrating your walls without your consent?
When reality doesn't match people's expectations they legislate a 'fix' (making listening illegal) rather than fixing the fundamental technical issue (with encryption, randomized IDs, etc).
That is, it's a rancher that is gonna get up in your business if you put in a rain barrel, not a cactus.
I use promiscuous mode to monitor and send alerts about everybody with a wifi active phone that comes onto my property. I also have cameras and motion sensors. It's not illegal and if it's made illegal I'll keep doing it because I have an intrinsic fundamental human right to protect my property.
Two ideas to consider for next steps, if you're interested. One is to crowdsource the data, to build out a map of places and how busy those places are. You would need to add in a lot of privacy mechanisms though, e.g. only sharing data of mostly public places vs homes. You could also build out a map of interesting places based on this (e.g. we used foursquare data in our past research to build out clusters of places, see http://livehoods.org/)
Another is to estimate how busy a place really is based on traffic, in terms of #people, and #seats or #tables available. This could be especially useful for campuses (where is a good place to study?) or cafes. You might need some crowd-based approach to label ground truth, and it's unclear what the incentive would be.
We did consider commercialization back in the day, but never came up with a plausible business model. It's not clear why business owners would want this, and they might even have an incentive to cheat. Though I would definitely say that cities would be interested in this data, e.g. urban planners or depts of public works. They have so little data about what is actually going on in a city. For example, we spoke with people who wanted to know the effects of closing a bridge or closing off a street.
Google Maps does that. Search for a popular cafe in your area and you should see the "Popular times" graph with live data.
Google also sends local businesses a Bluetooth Beacon to install on the ceiling inside their business to help with getting the "popular times".
It's called Project Beacon and any local US business can request one.
"Project Beacon is designed to improve the performance of location-related features in Google products for your venue, such as popular times, reviews etc. The beacon itself is configured for just this purpose, and isn't re-configurable by the user. If you would like to obtain a beacon to use with our developer platform, you can find a list of manufacturers at g.co/beacons"
But, wait: Google only gives you relative data over time, where that error is irrelevant. Never mind.
Facebook pages for real-world locations also give hourly foot traffic, segmented by age, local/tourist, and gender.
Clippy: "It looks like you're having a party, I'm switching on the disco lights and music."
people ≥ 3, right? :)
1 - https://github.com/dom96/deauther
Sounds like great job security for DOTs.
Theoretical worst case: a MAC address has 24 bits of organisation identifier and 24 bits of device identifier. So if an organisation/manufacturer only makes one model of device, they'd "only" need to build ~16.7 million (2^24) of them before they repeated a device identifier (if they chose not to use up any of their organisation bits to reflect that rollover). Again, maybe half that if they just randomly choose a device ID each time instead of enumerating the space.
Practically? Manufacturers screw up...
(Also, many Wi-Fi adaptors have easily changeable MAC addresses. Back in the day when cafes used to charge for Wi-Fi access, it wasn't uncommon to sniff the network for a "paid up" MAC address, and either wait til they left and use it, or de-auth them and do a hostile takeover of their paid-for internet access. Apologies to anyone who used to pay for "unreliable" Wi-Fi at Atlas Cafe on Alabama St back in the late 90s/early 2000s...)
So 16 million devices.
That does, though, drop the number of devices at which any collision at all becomes likely (aka the birthday paradox), for devices discriminated solely by the 24 bit device identifier, down to roughly sqrt(2^24), which is only 4096. A significantly smaller number than I expected...
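The square-root figure can be checked with the usual birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)). A rough sketch, assuming device identifiers are drawn uniformly at random from the 24-bit space (sequentially assigned identifiers would not collide until the space is exhausted); the function names are just for illustration:

```python
import math

DEVICE_ID_BITS = 24
SPACE = 2 ** DEVICE_ID_BITS  # 16,777,216 possible device identifiers


def collision_probability(n: int, space: int = SPACE) -> float:
    """Approximate chance that at least two of n random IDs are equal."""
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))


def devices_for_probability(p: float, space: int = SPACE) -> int:
    """Approximate number of random IDs needed for a collision chance of p."""
    return math.ceil(math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p))))


print(math.isqrt(SPACE))                      # 4096
print(round(collision_probability(4096), 3))  # chance of a clash at 4096 devices
print(devices_for_probability(0.5))           # devices needed for a coin-flip chance
```

Under that assumption, 4096 devices gives a collision chance of a bit under 40%, and the 50% point is a little under 5000 devices, so the order of magnitude holds.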
That sounds intuitively and anecdotally incorrect.
The downside is that many people have rather unique sets of SSIDs that still allow for pretty good tracking markers.
An enterprising hacker could submit those to wigle and figure out not only uniques, but also tell what geographical part of the world you're from.
Nicer hackers share this for public knowledge on HN :-D
(edit: really? -1'ed? How is this wrong? Would love to hear from detractors, as this technique is how malls and supermarkets track individual users.)
No idea. I downvoted you because you complained about your downvotes. Don't interrupt the discussion to meta-discuss the scoring system.
It is a great example of how a simple concept like "I can see other devices connected to the network" can be transformed into "I can log who is at work when". This could be done with a $45 device. With $500 I could make something much more nefarious with regular arduinos.
It's interesting (and a bit frightening) to see just how many devices (e.g.: APs, mobile phones, dash cams, etc.) there are discoverable out in the wild.
 - https://github.com/felsokning/Cpp/blob/master/Public.Cpp.Res...
 - https://news.ycombinator.com/item?id=18932906
* As stated in the docs, not everyone has a phone, they estimate it being 70%, but I wonder how accurate this is in different regions for different crowd sizes and applications (cafe visitors vs a parade)
* As stated in the comments, some individuals carry multiple WiFi capable devices and it can be much more than 1 per person
* How large can a crowd get, before signals get jammed to a point this method stops being useful?
* Many people prefer mobile data and sometimes don't even turn WiFi on
So, I mean, it obviously can be used practically, but I struggle to estimate how much I should believe what the device reads w/o actually seeing the crowd.
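One way to make that uncertainty explicit is to report a range instead of a single number. A minimal sketch: the 70% carry rate is the figure from the docs mentioned above, while the other bounds and all the names are assumptions chosen only to illustrate the arithmetic:

```python
def estimate_people(unique_devices, carry_rate=0.70, devices_per_carrier=1.0):
    """Scale a device count to a head count.

    carry_rate: fraction of people with a detectable device (70% per the docs).
    devices_per_carrier: average detectable devices per carrying person (assumed).
    """
    if not 0 < carry_rate <= 1 or devices_per_carrier <= 0:
        raise ValueError("carry_rate must be in (0, 1] and devices_per_carrier > 0")
    return unique_devices / (carry_rate * devices_per_carrier)


def estimate_range(unique_devices, carry_rates=(0.5, 0.9), devices_per_carrier=(1.0, 1.5)):
    """Return (low, high) head-count bounds for assumed parameter ranges."""
    low = estimate_people(unique_devices, max(carry_rates), max(devices_per_carrier))
    high = estimate_people(unique_devices, min(carry_rates), min(devices_per_carrier))
    return low, high


low, high = estimate_range(70)
print(f"70 devices -> roughly {low:.0f} to {high:.0f} people")
```

The width of that range is the honest answer to "how much should I believe the reading": without ground truth for a given venue, the parameters are guesses.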
Basically just blasts thousands of MAC addresses into the air flooding their counters and ruining their data.
I sat in an ASHRAE technical session presentation the other day about the program. All of the technology goals intend for the people count results to be anonymous, and for the technology to be "open source."
Cost goals are <$0.06/ft^2 for residential and <$0.08/ft^2 for commercial systems.
"Wifi & Bluetooth driven, LoRaWAN enabled, battery powered mini Paxcounter built on cheap ESP32 LoRa IoT boards"
I'd think they could very quickly & automatically begin collecting data about income, repeat vs new customer, even streamlining the order process by recalling past orders or upselling, etc.
It seems like an area that is incredibly ripe for data collection & analysis; I almost wonder if there isn't a business in here, selling this feature-set as a B2B service.
For the record, I prefer not to be 'tracked' -- I don't know that I can realistically say it would actually impact my fast-food selections though. The concept does intrigue me nonetheless.
Those factors should completely swamp anything that can get teased out of reams of data.
Many of the ideas you suggested (recalling past orders, upselling) can easily be accomplished with existing smartphone technology. I've never used the McD app so I don't know if they utilize location tracking but if they do I can almost guarantee they are already checking to see which locations you visit the most, what you order, when you usually come in, whether you use the drive thru or not, etc. Even without location tracking they can still do all of that if you order online.
tl;dr In my opinion they don't do this because they can already collect the data through less invasive means.
We never wanted to track an individual person and would have hashed the MAC, nor did we want to get any data from them (phone brand or whatever). We wanted to be sure to not have more than 1000 people in the building (another law) and for security reasons we did not have turnstiles at the exit.
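For what "hashed the MAC" could look like in practice: a plain unsalted hash of a MAC can be reversed by enumerating the address space, so a keyed hash with a throwaway key is the safer variant. This is a sketch of that idea, not what we actually built; the key handling and names are assumptions:

```python
import hashlib
import hmac
import secrets


def new_session_key() -> bytes:
    """Random key kept only in memory; discard it when the event ends."""
    return secrets.token_bytes(32)


def pseudonymize(mac: str, key: bytes) -> str:
    """Replace a MAC with a keyed hash that is useless without the key."""
    normalized = mac.lower().replace("-", ":").encode("ascii")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()[:16]


def occupancy(seen_macs, key: bytes) -> int:
    """Count distinct devices without ever storing the raw addresses."""
    return len({pseudonymize(m, key) for m in seen_macs})


key = new_session_key()
# Hypothetical sightings; the first two are the same device in different notation.
print(occupancy(["AA-BB-CC-00-11-22", "aa:bb:cc:00:11:22", "aa:bb:cc:00:11:23"], key))
```

Whether this is enough to satisfy a regulator is a legal question the code cannot answer.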
In the end we had to use cameras that counted heads...
* you usually see entry and exits of both and can start to map them as "one" user.
* you see only wifi entries and exits without corresponding bluetooth (maybe user usually has bluetooth turned off or is out of range of BT capture).
* You see only BT with no corresponding wifi (user has discoverable bt but wifi off).
The normalization is complicated as wifi/bt range are pretty different (unless you have hardware to extend the pickup for bt) -- but if you deal with a time window slide (and repeat event correlations) it should be possible.
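The time-window correlation described above can be sketched roughly as follows. It assumes each sighting is a `(device_id, radio, timestamp)` tuple with stable identifiers (randomized MACs would break this), and the window and repeat thresholds are arbitrary illustrative values:

```python
from collections import Counter
from itertools import product


def correlate(events, window=60.0, min_repeats=3):
    """Pair wifi and bt devices whose sightings repeatedly coincide in time.

    events: iterable of (device_id, radio, timestamp), radio is 'wifi' or 'bt'.
    Returns the set of (wifi_id, bt_id) pairs seen together at least min_repeats times.
    """
    events = list(events)
    wifi = [(d, t) for d, r, t in events if r == "wifi"]
    bt = [(d, t) for d, r, t in events if r == "bt"]
    hits = Counter()
    # Brute-force comparison; fine for a sketch, too slow for large logs.
    for (w, tw), (b, tb) in product(wifi, bt):
        if abs(tw - tb) <= window:
            hits[(w, b)] += 1
    return {pair for pair, n in hits.items() if n >= min_repeats}


# Hypothetical log: W1 and B1 arrive together three times, B2 only once.
log = [
    ("W1", "wifi", 0), ("B1", "bt", 10), ("B2", "bt", 5),
    ("W1", "wifi", 1000), ("B1", "bt", 1020),
    ("W1", "wifi", 2000), ("B1", "bt", 1990),
]
print(correlate(log))
```

Devices that only ever show up on one radio (the second and third cases in the list) simply never get paired and are counted on their own.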
Interestingly, some parcel services use handheld barcode scanners that don't even try to randomize their MAC or hide their name.
(German blog post https://blog.rolandmoriz.de/2019/01/08/eine-flasche-metadate... )
I used ARP-scan to log and monitor the MAC addresses of the people on the network.
Had a lot of fun at the time!
 - https://en.wikipedia.org/wiki/Tally_counter
The pub chains are registered with our Information Commissioner's Office and I am struggling to understand how every one of those people (including the ICO!) could have missed this.
I note Bluetooth also uses MAC addresses, so this issue is not limited to WiFi. Spoofing your WiFi MAC on a mobile phone with Bluetooth may not be making you as anonymous as you hope to be.
> A business uses Wi-Fi analytics data to count the number of visitors per hour across different retail outlets. It is not necessary to know whether an individual has visited an individual store (or multiple
> This involves the business processing the Media Access Control (MAC) addresses of mobile devices that broadcast probe requests to its public Wi-Fi hotspots. MAC addresses are intended to be unique to the device (although they can be modified or spoofed using software).
> If an individual can be identified from that MAC address, or other information in the possession of the network operator (the business, in this example), then the data is personal data.
Indeed, collecting unique MAC addresses, potentially from multiple endpoints, can reveal a lot of personal, sensitive, information (location, trips, time you go to the coffee shop or which hospital you visit, etc.).
The only ways to properly collect and store MAC addresses are either using privacy-protecting methods (e.g. cutting the last bits of the MAC address, potentially using bloom filters) or immediately aggregating the collected MAC addresses.
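A rough illustration of the truncate-then-aggregate approach (bloom filters left aside). It assumes colon-separated MACs and Unix timestamps; how many octets to keep is an arbitrary choice here, and the function names are made up:

```python
def truncate_mac(mac: str, keep_octets: int = 4) -> str:
    """Drop the trailing octets so several devices share each stored value."""
    return ":".join(mac.lower().split(":")[:keep_octets])


def hourly_counts(sightings, keep_octets: int = 4):
    """Aggregate (mac, unix_timestamp) sightings into per-hour distinct counts.

    Only truncated addresses are held, and only until the count is produced.
    """
    buckets = {}
    for mac, ts in sightings:
        buckets.setdefault(int(ts // 3600), set()).add(truncate_mac(mac, keep_octets))
    return {hour: len(ids) for hour, ids in buckets.items()}


# Hypothetical sightings: the first two collapse into one truncated value.
sightings = [
    ("aa:bb:cc:dd:00:01", 10),
    ("aa:bb:cc:dd:00:02", 20),
    ("aa:bb:cc:ee:00:03", 30),
    ("aa:bb:cc:dd:00:01", 3700),
]
print(hourly_counts(sightings))
```

The trade-off is visible in the sample: truncation makes devices harder to single out, at the price of undercounting when two devices share a prefix.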
I know my Netgear router has a history of basically any connection attempts and their MAC address; it does this by default, not sure I can even turn it off.
And how would you even present a GDPR notice to the user prior to logging that kind of information?
There is an issue with retailers tracking shoppers' movements in a store.
FTC take on the issue:
In other words: no, you probably will not be in the clear.