
Transport for London plans to collect data from passengers' mobiles - lnguyen
http://news.sky.com/story/tfl-may-make-322m-by-selling-on-data-from-passengers-mobiles-via-tube-wifi-11056118
======
AliAdams
That is a pretty misleading headline - this isn't asking to see their text
messages or something like that; it's just tracking when and where a MAC
address is seen in order to work out traffic trends. In fact some of the data
looks really interesting:
[https://imgur.com/Hx6mDSm.jpg](https://imgur.com/Hx6mDSm.jpg) (credit to
bcraven for the link)

Most large public WiFi deployments come with this capability already included
(albeit normally with an additional license required), whether or not the
owners of the system are aware of the capability / are utilising it. Punishing
TFL with sensationalist journalism for being open about this application will
only make such use in future more hidden and isn't constructive.

~~~
criddell
Don't Android and iOS randomize the MAC addr these days? Are they investing in
something that's going to have a very short life?

~~~
benbristow
You need to log in to the TFL WiFi using your Virgin Media or supported phone
network's login, so they can identify you by that instead.

~~~
criddell
Is that just if you want signal under ground, or is having WiFi turned on a
requirement for riding TfL?

~~~
Angostura
If you want to use their wifi :)

------
bcraven
Here's the pdf from the pilot findings:
[http://content.tfl.gov.uk/review-tfl-wifi-pilot.pdf](http://content.tfl.gov.uk/review-tfl-wifi-pilot.pdf)

I like this figure of the ways that people travel between KGX and Waterloo;
this is the sort of data that you can't pick up using Oyster cards:
[https://imgur.com/Hx6mDSm.jpg](https://imgur.com/Hx6mDSm.jpg)

~~~
jjp
The PDF is a great read to understand the project's motivations, data collected
and examples of how the data analysis could help anybody using the underground
network make better informed routing decisions.

------
matthewmacleod
I'm actually happy about this.

The data that has been collected is great
([http://content.tfl.gov.uk/review-tfl-wifi-pilot.pdf](http://content.tfl.gov.uk/review-tfl-wifi-pilot.pdf)) and
not realistically available in any other way. The only data stored is (hashed)
MAC and timestamp. It's taking place on a system where most users are already
tracked semi-anonymously using contactless cards or Oyster cards, and is
trivially avoidable by disabling wifi.

This seems like a good trade-off.

~~~
monort
MAC addresses do not contain enough entropy to be usefully one-way hashed.

A MAC address is 48 bits, of which 22 bits identify the vendor.
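
A rough sketch of the concern, assuming an unsalted SHA-256 (the thread doesn't say which hash is used) and an attacker who can guess the 3-byte vendor prefix. The names `hash_mac` and `brute_force_suffix`, the prefix and the device address are all made up for illustration, and the search is capped at 16 bits so it finishes quickly; a full search per vendor prefix would cover 2**24 suffixes, about 16.7 million candidates.

```python
import hashlib

def hash_mac(mac: bytes) -> str:
    """Unsalted one-way hash of a 6-byte MAC (SHA-256 is an assumption)."""
    return hashlib.sha256(mac).hexdigest()

def brute_force_suffix(target: str, oui: bytes, suffix_bits: int = 16):
    """Try every device suffix under a known 3-byte vendor prefix.

    Only the low `suffix_bits` bits are searched to keep the demo fast;
    a full search covers 2**24 suffixes per vendor prefix.
    """
    for n in range(2 ** suffix_bits):
        candidate = oui + n.to_bytes(3, "big")
        if hash_mac(candidate) == target:
            return candidate
    return None

oui = bytes.fromhex("001a2b")            # hypothetical vendor prefix
victim = oui + bytes.fromhex("00beef")   # hypothetical device address
recovered = brute_force_suffix(hash_mac(victim), oui)
print(recovered.hex(":"))
```

With no salt, the same lookup works for every stored hash, which is why the size of the input space matters more than the strength of the hash function.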

~~~
gsnedders
The proposal, as it stands, is a salted hash, with the salt discarded once a
day. If the salt isn't stored anywhere, then there's enough entropy.
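
Roughly what that could look like, as a sketch only: the class name `DailyPseudonymizer`, the 32-byte salt and the use of SHA-256 are my assumptions, not details from TfL's proposal. The point is that the salt lives only in memory and is replaced once a day, so earlier tokens can't be recomputed afterwards.

```python
import hashlib
import secrets

class DailyPseudonymizer:
    """Salted hashing where the salt is kept in memory and replaced daily."""

    def __init__(self) -> None:
        self._salt = secrets.token_bytes(32)   # 256 bits of randomness

    def rotate(self) -> None:
        """Discard the old salt; earlier tokens can no longer be recomputed."""
        self._salt = secrets.token_bytes(32)

    def token(self, mac: bytes) -> str:
        """Pseudonym for a MAC address, stable only until the next rotate()."""
        return hashlib.sha256(self._salt + mac).hexdigest()

p = DailyPseudonymizer()
mac = bytes.fromhex("001a2b00beef")   # hypothetical device address
same_day_a = p.token(mac)
same_day_b = p.token(mac)
p.rotate()                            # end of day
next_day = p.token(mac)
print(same_day_a == same_day_b, same_day_a == next_day)
```

Within a day the same device maps to the same token, so journeys can still be linked; across days the tokens are unrelated as long as the old salt really is gone.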

------
raimue
Even here in Germany, which most people think has good privacy protection
laws, mobile phone providers are collecting and selling anonymized movement
profiles of their customers. Public transport companies use these to improve
their service, as well as companies offering navigation systems.

I would be surprised if tracking via the phone network was not common practice
in other countries. In case of London, WiFi would of course enable tracking of
people on the Tube, but in general using WiFi tracking should yield a more
fine-grained movement profile, which would not be possible with phone networks
only in a dense city center.

~~~
peterjmag
For those who haven't seen it, here's a great illustration of what that data
looked like for one person back in 2009:

[http://www.zeit.de/datenschutz/malte-spitz-data-retention](http://www.zeit.de/datenschutz/malte-spitz-data-retention)

I can imagine that it's only gotten more granular and comprehensive since
then.

~~~
KGIII
That site makes me happier about being able to seldom carry my phone. I don't
leave it home for privacy reasons, but that's a nice side benefit.

------
cjsuk
Why do they need to do this? They already track people with Oyster and
contactless which gives them journey information. This is just a convenient
data grab and profiteering.

I'm going to sound old and curmudgeonly here, but I'm starting to miss the
days when we bought paper tickets with cash.

~~~
nkoren
> This is just a convenient data grab and profiteering.

I know people working on this project, and that is absolutely not their
motivation. They want to better understand the movement of people within the
system, so they can design better stations, trains, and services.

The traditional way of understanding such passenger movements is by parking
students with clipboards and people-counters all around stations, doing
passenger interviews and such. This traditional methodology is not only hugely
expensive, but the data it generates is _spectacularly_ bad, and has led to
some incredibly poor planning at times.

~~~
spuz
Could you give more details of the kind of information they receive from wifi
tracking that they don't already through oyster barriers and how this helps
with service designs?

~~~
shawabawa3
Just a guess, but: accurate real-time data of passenger movement through the
stations.

Barriers only tell you when they enter/exit; they give you no information
about the routes people use, how long it takes them, how many people transfer
between lines, etc.
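
To make that concrete, here is a sketch of how routes and timings could fall out of nothing more than (token, location, timestamp) sightings. The record layout, the function names `journeys` and `leg_durations`, and the sample data are all invented for illustration; this is not how TfL's pipeline is known to work.

```python
from collections import defaultdict

def journeys(sightings):
    """Group (token, location, unix_seconds) sightings into ordered paths."""
    by_token = defaultdict(list)
    for token, location, ts in sightings:
        by_token[token].append((ts, location))
    paths = {}
    for token, seen in by_token.items():
        seen.sort()                      # order each device's sightings by time
        path = []
        for ts, location in seen:
            # collapse repeated sightings at the same place
            if not path or path[-1][0] != location:
                path.append((location, ts))
        paths[token] = path
    return paths

def leg_durations(path):
    """Seconds between first sightings at consecutive locations."""
    return [(a[0], b[0], b[1] - a[1]) for a, b in zip(path, path[1:])]

sightings = [                            # made-up sample data
    ("abc", "King's Cross", 0),
    ("abc", "King's Cross", 60),
    ("abc", "Oxford Circus", 420),
    ("abc", "Waterloo", 900),
    ("xyz", "King's Cross", 30),
    ("xyz", "Leicester Square", 500),
    ("xyz", "Waterloo", 1000),
]
paths = journeys(sightings)
for token, path in paths.items():
    print(token, leg_durations(path))
```

Both devices would look identical at the barriers (in at King's Cross, out at Waterloo); only the intermediate sightings show that they changed at different stations.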

------
corney91
Surely this would be an automatic sanction when GDPR comes into effect;
leaving your wifi on isn't exactly "explicit consent".

~~~
delinka
Is the Underground a private institution? Is it government? Does GDPR provide
exemptions for government institutions? Do the police assist the Underground
in performing searches and data collection? Can the police assume "consent"
when they collect your Wi-fi MAC or your Bluetooth ID or any other data your
mobile device voluntarily broadcasts?

~~~
jbreckmckye
Yes, no, no, no, N/A as GDPR provides escape hatches for actions to do with
'public safety'.

~~~
superqwert
LU is pseudo-private. It falls under TfL, which falls under the London Mayor.

------
superqwert
Only access point connection time, hashed MAC address and time are logged.

The only thing that can be found out about anyone, is where they were, at what
time, possibly who they were (all of which can be obtained from CCTV footage),
and (additionally) how many web requests were made.

No browsing data is stored.

This doesn't particularly change the state of existing LU monitoring.

~~~
icebraining
Agreed, this can be obtained from CCTV footage; so why don't they? Why log even
more stuff?

~~~
superqwert
I believe that some more modern trains allow TfL to measure the weight change
of carriages under passenger load, which can be used to estimate numbers, but
imagine the cost of trying to roll that out on all trains and platforms... The
fares would have to double for a year at least.

~~~
gsnedders
I'm sure the Class 345s, soon to appear on Crossrail, do this to show loadings
on platform screens (to try to avoid crush loading in specific coaches
while others are empty), but I can't find any source for that. The Class 700s
on Thameslink show loadings on platform screens in the "Core" section, but
this is based upon CO2 levels within the coaches.

------
jnsaff2
This is why I switch my wifi radio off every time I leave home. Especially
when going to travel via the tubes.

And this is why the "50-shades of wifi off" of ios 11 is so idiotic.

Don't have iOS 11 yet so don't actually know how many steps I need to take.

~~~
knolan
I’m not sure I see the issue with iOS11. 3D Touch Settings and select WiFi and
toggle it off. It’s about as much work as opening control center.

If you don’t have a 3D Touch phone you have to open Settings and Navigate to
WiFi which is right at the top of the list with Bluetooth and Airplane Mode.

~~~
ricardobeat
You just gave me the first real use for 3D Touch since I got my 6S. Thanks!

On the subject, one swipe + one press using control center is really much
faster than finding the settings app. For me it takes at least six actions to
get to the switch (press home twice, open a folder, scroll, open settings,
click wifi). Even if you keep settings in your home screen that's at least
four clicks/touches away.

~~~
lucaspiller
Note that in iOS 11, doing this through the Control Center will say “Not
Connected”, which is exactly what it means - the WiFi hardware is still
switched on, it just disconnects:

[https://support.apple.com/en-gb/HT208086](https://support.apple.com/en-gb/HT208086)

~~~
knolan
We get it. We’re talking about turning WiFi off via settings.

------
grahamel
A more informative article was posted earlier last month
([https://www.ianvisits.co.uk/blog/2017/09/08/tracking-smartph...](https://www.ianvisits.co.uk/blog/2017/09/08/tracking-smartphone-wi-fi-signals-reveals-curious-journeys-on-the-london-underground/)), and there
was more info on the data privacy elsewhere too
([https://www.theregister.co.uk/2017/09/09/london_tube_trackin...](https://www.theregister.co.uk/2017/09/09/london_tube_tracking_trial_may_make_commuting_less_miserable/))
saying that TfL "depersonalized [MAC addresses] using a salt which [they] then
discarded at the end of each day."

------
lifeisstillgood
I used to work with a startup that did this data collection for British Rail
at just these stations. It was used for fairly important things like "will
this new bridge be sufficient for the passengers who currently walk all the
way round to get to platform X. Will it be a crush risk / bottleneck".

A railway station should be thought of as a maze where the walls really do
move in and out every few minutes. Hundreds of signals can appear and
disappear just like that.

It's worth mentioning that I am a big supporter of GDPR _but_ I will be one of
the first to allow a variety of researchers to have blanket access to this
kind of stuff for medical, town planning etc research. I need to look more
deeply into that part but there surely must be ways for legitimate research
such as this to be conducted?

~~~
saalweachter
The thing about slippery slopes is that they are _really awesome_ places to
live, except for the constant risk of sliding where you don't want to be. The
price of that awesomeness is vigilance, to remain precisely at the point on
the slope you want to.

------
codingmyway
They've been collecting data for at least 3 years. It's only the detecting of
attempted wifi connections but it's the best data source for predicting
overcrowding of stations. A lot of what they tried out to manage crowding
during the Olympics is being refined and used to manage passenger flows so
some passengers are sent on different routes.

------
znep
For an example of a somewhat similar project, see Austin's bluetooth traffic
sensor data:

[https://github.com/cityofaustin/hack-the-traffic/tree/master...](https://github.com/cityofaustin/hack-the-traffic/tree/master/docs)

They used to have the code used to anonymize the data available, but I can't
find it. I believe they re-randomized the MAC addresses daily so you could
track trips but not a device long term. With enough sensor locations that
seems like it could still be vulnerable to de-anonymization to some degree.

------
shimfish
Does anyone know if the iPhone's MAC randomization has any effect on this?

~~~
Matt3o12_
As far as I know, the iPhone only randomizes its MAC address after the device
has been locked for two minutes and not connected to any known networks. Two
things that are very unlikely to happen in a subway. One of the advantages of
riding the subway is that you can use a phone and the other one is that there
is rarely any cell connection (at least in my subway), so you need WiFi
anyways.

WiFi randomization is great if you are going to the store by car because the
devices on the way there cannot track you, but once you arrive there and check
your texts, it is game over.

~~~
joosters
There’s not great WiFi coverage on the underground right now, but I imagine
that it will continue to improve, especially if TfL want to collect more of
this data...

~~~
matthewmacleod
Really? The vast majority of tube stations have comprehensive coverage. I
can't imagine how it could get much better, beyond being available on trains
themselves.

------
mm4
They talk about some great benefit for the passenger but they did not say what
it actually is...

~~~
superqwert
This is currently the only method available to get good estimates of # of
people on a platform - unless you would prefer visual recognition software.

TfL needs this information to be able to stop overcrowding on platforms.

~~~
zimpenfish
> unless you would prefer visual recognition software

Given they already have multiple cameras covering platforms from multiple
angles, this seems like it would have been on the list.

~~~
superqwert
When TfL have previously suggested visual recognition software, the media
exploded. So they took a step back.

~~~
zimpenfish
I suppose calling it "visual recognition software" invokes a kind of "facial
recognition and tracking" vibe. I'm thinking more of the "is this platform
crowded" visual recognition stuff - "How much of the reference image of an
empty platform do we have today?" kind of thing.
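
Something like the following is what I have in mind, as a toy sketch: frames are plain grids of grayscale values, and `visible_fraction` and its `tolerance` parameter are names I've made up. It ignores lighting changes and perspective entirely.

```python
def visible_fraction(reference, current, tolerance=10):
    """Fraction of pixels still within `tolerance` of the empty-platform image."""
    total = 0
    unchanged = 0
    for ref_row, cur_row in zip(reference, current):
        for ref_px, cur_px in zip(ref_row, cur_row):
            total += 1
            if abs(ref_px - cur_px) <= tolerance:
                unchanged += 1
    return unchanged / total

# Made-up 2x4 grayscale frames: bright floor, darker where people stand.
empty = [[200, 200, 200, 200],
         [200, 200, 200, 200]]
busy = [[200, 60, 55, 200],
        [70, 65, 200, 205]]

print(visible_fraction(empty, busy))
```

A low fraction would mean most of the floor is hidden, without identifying or tracking anyone in the frame.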

~~~
superqwert
Without recognising single persons, it would be much harder. You would have to
consider that a person twice as close to the camera takes up 2x as much room
as someone 2x further away. Generally, it can also be very hard to see people
on a crowded platform and you can almost double the amount of people on the
platform and see as little of the platform after as you did before. Wifi will
probably generate better customers in that case. Add in the fact that you are
still processing visual data and that you will get a lot of media attention
and Londoners will lose trust in your services and you have a definite no-go
for your project.

~~~
zimpenfish
> you can almost double the amount of people on the platform and see as little
> of the platform after as you did before

I would suggest that "overcrowding" was achieved well before that point
though.

> Wifi will probably generate better customers in that case.

Assumes that one person equals one Wifi though which isn't necessarily the
case.

> you are still processing visual data and that you will get a lot of media
> attention

Perhaps. It would definitely require a deft hand on the PR tiller.

~~~
superqwert
1\. Not necessarily - 4 short people behind 4 tall people will do the job. But
I would argue London platforms ARE overcrowded already, rather often. Mostly
because you can only control where people go a certain amount. If everyone
gets off many trains at the same stop, but doesn't leave, there isn't much TfL
can do short of asking people to leave for their own safety. Maybe if TfL
actually got any decent amount of funding (Like Southern and SouthEastern got
due to being rubbish) from the government, they could work on creating further
lines and spaces that would decrease Underground congestion. Instead, the
government has cut funds and the Mayor has frozen fares. And yet people still
want infrastructure to improve.

2\. 1 Person != 1 Wifi, but if 1W = 0.2P, you can still get a good estimate.

3\. TfL is a large bargaining chip for the mayor - no mayor would even
consider risking a PR disaster like the one that could come from this.
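
On point 2, a sketch of the scaling idea, assuming the ratio between devices seen and people present is roughly stable and can be calibrated from occasional manual counts. The function names and all the numbers below are invented for illustration.

```python
def calibrate(pairs):
    """Estimate people-per-device from (devices_seen, people_counted) pairs."""
    devices = sum(d for d, _ in pairs)
    people = sum(p for _, p in pairs)
    return people / devices

def estimate_people(devices_seen, people_per_device):
    """Scale a device count up to an estimated headcount."""
    return round(devices_seen * people_per_device)

# Hypothetical manual counts taken alongside device counts.
ratio = calibrate([(40, 100), (60, 140), (20, 60)])
estimate = estimate_people(50, ratio)
print(ratio, estimate)
```

The estimate is only as good as the assumption that the ratio holds across stations and times of day, so it would need recalibrating now and then.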

------
nereid666
[http://content.tfl.gov.uk/review-tfl-wifi-pilot.pdf](http://content.tfl.gov.uk/review-tfl-wifi-pilot.pdf)

~~~
ZoFreX
Anyone who hasn't seen this yet should take a look, it explains in detail:

\- Why they are doing this
\- What the potential passenger benefits are
\- And bonus, it includes lots of cool infographics on passenger journeys

------
timrichard
I only connect to their WiFi to email my workmates, and say "sorry I'm
running late again, ----ing Circle line".

They can intercept that traffic as often as they like.

~~~
victorhooi
They're not logging any wifi traffic.

They're storing a hash of your MAC address and timestamp - the idea is to log
foot traffic over time.

I think this issue has been blown up somewhat by people who don't understand
the technical issues.

------
chaz6
They have already collected the data. The issue now is whether or not to make
the data available to third parties, albeit anonymized.

------
jimmcslim
I think a lot of transportation departments have been doing the same thing
with car Bluetooth transmitters to monitor traffic patterns in real-time.

