The $24B Data Business That Telcos Don't Want to Talk About (adage.com)
314 points by larrys on Oct 27, 2015 | 170 comments



Many hedge funds have started to consume this type of data.

There is a piece in the WSJ that discusses how one fund, Two Sigma, uses cell phone tracking data, as well as many other sources of data, to build trading signals.

http://www.wsj.com/articles/how-computers-trawl-a-sea-of-dat...

In fact, I think the biggest funds are putting more effort into this type of big data exploration than they are into trying to glean more information out of the time series data provided by the exchanges' data feeds.

10 years ago, being able to scrape the web was a competitive advantage for funds; 5 years ago it was real-time sentiment analysis of news reports.

Today, it's being able to consume 100's of disparate data feeds and build alpha generating signals from it.


Are they tracking the CEOs of companies in which they invest, or what?

Tracking A, B, and C-level executives should be very profitable. Who meets with whom is valuable info. Especially when you also have their phone call metadata. The activity in advance of a merger should be quite visible.

Tracking elected officials should reveal who influences whom, and who's bribing whom. It may be possible to detect bribery to the level of establishing probable cause for an investigation in that way.


No, it's more about understanding consumer sentiment about a product category or company based on mining consumer information in aggregate. For example, a hedge fund could know how many people went inside a Wal-Mart over the last few months based on cellular location data. If there's an unfavorable downturn in store traffic, the hedge fund can shift its investments based on information that Wal-Mart won't report for another few months.
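
A rough sketch of what that aggregate store-traffic count could look like. Everything here is an assumption for illustration: the ping format (device_id, day, lat, lon), the store coordinates, and the 150 m radius are all made up, and a real carrier feed would look different.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def daily_visitors(pings, store_lat, store_lon, radius_m=150):
    """Count distinct devices seen within radius_m of the store, per day."""
    seen = defaultdict(set)
    for device_id, day, lat, lon in pings:
        if haversine_m(lat, lon, store_lat, store_lon) <= radius_m:
            seen[day].add(device_id)
    return {day: len(devices) for day, devices in sorted(seen.items())}

# Hypothetical pings: (device_id, day, lat, lon)
pings = [
    ("a", "2015-10-01", 36.3729, -94.2088),
    ("b", "2015-10-01", 36.3730, -94.2090),
    ("a", "2015-10-01", 36.3731, -94.2089),  # same device, same day: counted once
    ("c", "2015-10-01", 36.4500, -94.3000),  # several km away: ignored
    ("a", "2015-10-02", 36.3729, -94.2088),
]
traffic = daily_visitors(pings, 36.3729, -94.2088)
```

Only the per-day counts leave the function, which is the "aggregate" part; the input is still per-device, which is why people further down the thread are skeptical of the anonymization.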

> Tracking elected officials should reveal who influences whom, and who's bribing whom. It may be possible to detect bribery to the level of establishing probable cause for an investigation in that way.

The way political fundraising is done in the US, most bribes look like legitimate campaign finance money. In some states, it's even legal for a candidate to simply redirect whatever is in his campaign fund directly into his own bank account after the election. And even in states where that is illegal, 501(c)(4)s / Super PACs allow politicians to do whatever they want with the money with no oversight.

With laws like that, why even bother to investigate bribery?


>> Are they tracking the CEOs of companies in which they invest, or what?

> No ...

How could you know that? I doubt anyone would reveal it, and I don't see why they wouldn't do it: for the unscrupulous (and there are plenty of them), it could be very valuable. Remember the story about Uber tracking journalists, for example.


They would track CEOs, if only they could. I'm guessing that even in the US, mobile networks only share aggregate or anonymized data about customer location, and don't allow for spying on specific individuals.

If you want to track CEOs, I'm guessing it would be much easier to track their private jets or the license plates of company cars.


They MAY only share aggregate data, but what's stopping, for example, VZW from using the data for their own ends? Competitors who use their towers, potential acquisitions, market/political threats, or other interesting entities. Laws and morals? Hahaha!


Most telecoms won't do this because the potential reward simply isn't worth the liability they would have for insider trading (if they get caught spying on corporate execs, then the SEC will assume they spy on every corporate exec). In fact, most big telecoms have a secret "VIP" program that flags accounts of especially wealthy, influential or famous people and restricts access to them unless you're on the "VIP" customer service team.

I'm sure hedge funds track CEOs, but they're not using data from Verizon to do it.


They calculate risk:reward and do whatever nets the most profitable return. Fines and fees are only anathema when the costs exceed the gains gotten from illicit behavior. Otherwise, paying fines without admitting guilt is the corporate way.


There are better ways to track CEOs than through mass data collection -- targeted data collection (aka hiring a private investigator) works much better. Though most CEOs of big companies have security details that keep the wolves at bay.


http://www.cnbc.com/id/38722872

Satellite recon of Walmart parking lot usage, to estimate business trends.


Yeah, I was using that as an example -- but I'm sure you can think of other things you could measure using bulk cell phone data that would give you insight into the future financial performance of a company -- often before the company itself knows.


aren't the companies employing wireless trackers in store? even small shops can get some useful tracking data from "custom" OpenWRT based firmwares on consumer-level wifi/router devices.


> aren't the companies employing wireless trackers in store

Yes. Look up "beacons".


>In some states, it's even legal for a candidate to simply redirect whatever is in his campaign fund directly into his own bank account after the election

This doesn't sound right. Any sources?


This might have been mentioned on John Oliver - the episode about oil in North Dakota


I thought that kind of tracking was illegal:

https://www.ftc.gov/news-events/press-releases/2015/04/retai...

maybe it's ok as long as it's anonymous.


My initial thought was how many teens are patronizing $MALL_STORE or how many people are stopping at Chevron gas stations versus Shell.

At least that's more demographic/general, in my opinion. What you suggest is bordering on "creepy".


Peter Lynch, of Magellan fame, wrote that he'd hang out in shopping malls looking at store traffic.


Just bordering? Seems full on creepy to me.


I'm not privy to what Two Sigma do, but I'm guessing all the tracking data they have is anonymized (did I just make up a word?) and of a more general-trend variety.


What. The. F*ck?

People are being tracked for trading advantage and that is not at all illegal? Not inside trading at all? How the hell did we get _here_?


It's not "inside" trading if it's data available in the marketplace.


> People are being tracked for trading advantage and that is not at all illegal?


Aggregate data. So it's supposedly anonymized.


As in "We tasked one of our interns, the one with a Russian literature degree, to anonymize the DB ... two hours before we let the clients access it".


> Today, it's being able to consume 100's of disparate data feeds and build alpha generating signals from it.

How do they validate that this has any connection to reality? Or do those who have budgets to spend not ask such silly questions?


There are a number of traditional statistical methods for validation that can be used to answer this type of question. These sorts of firms have some of the largest private clusters in the world; they are certainly capable of answering this question, and do, before trading.
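
To make "traditional statistical methods" a bit more concrete, here is a minimal sketch of one of them, a permutation test: shuffle the returns against the signal many times and see how often chance alone produces a correlation as strong as the real one. The data is synthetic and the function names are made up; this is not a claim about what any particular fund runs.

```python
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(signal, returns, n_shuffles=2000, seed=0):
    """Share of shuffles whose |correlation| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(signal, returns))
    shuffled = list(returns)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between signal and returns
        if abs(pearson(signal, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)

# Synthetic data: one series built to depend on the signal, one pure noise.
rng = random.Random(1)
signal = [rng.gauss(0, 1) for _ in range(60)]
linked_returns = [0.8 * s + rng.gauss(0, 1) for s in signal]
noise_returns = [rng.gauss(0, 1) for _ in range(60)]

p_linked = permutation_p_value(signal, linked_returns)
p_noise = permutation_p_value(signal, noise_returns)
```

A small p-value only says the relationship is unlikely to be a fluke in this sample. It says nothing about whether it survives out of sample or after trading costs, so in practice this would be one check among several.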


Sure. I could answer this meme with another meme: most statistics-based studies are wrong in every possible way: biased data sets, misapplied methods, wishful thinking, untestability.


So what's your point? Sure, you can mess up your test -- does that mean you shouldn't trade? Does that mean you shouldn't run validation tests at all?


My point is that there is more faith and wishful thinking than science in all these "models". Moreover, the markets are by their very nature too complex to model, especially using such data sources. The result would be some naive description of reality, like pre-scientific religious cosmology, a wrong but sophisticated enough 'model' based on 'current consensus'.


Have you worked in a variety of quant funds? What makes you think markets are too complex to model? What evidence do you have that the models used at successful shops are not sufficiently rigorous?


Because all one needs is to bullshit one's boss or sponsor. I gave an example of a religion for a reason. And bullshitting is a much more efficient strategy than actual modeling and rigorous research. Groupthink is much easier and cheaper than proper science, especially when even the best economists are full of doubts.

I took Yale's financial markets courses, so I know what ideas they might have. This is the Nobel laureate's course.


The bosses in the typical quant firm are also quants themselves, and have no trouble following the rigor used to justify the validity of a trading strategy. Sure, at a certain point in the chain, you're talking with someone (eg. COO) who isn't allowed to see the details of the modeling procedure, but this is the point of scaling up and having risk/pnl limits -- no HFT firm, for example, is at risk of blowing out from lack of statistical rigor.

I doubt taking Yale's financial markets courses is sufficient to understand or be exposed to what is currently being used in quantitative finance. I took Stanford's financial markets and statistics classes, and those are far behind and unaware of what's currently being used in the industry (I've worked in two highly successful HFT groups since Stanford).


I find this hard to believe: "consume 100's of disparate data feeds and build alpha generating signals from it". Can you point to an example?


Well, there are 40 different exchanges on which to trade in the US alone; Canada adds another 10.

So there are 50 different signals before you even start up your servers for the day.

Since there is no one true currency market, most HFT funds make their own currency signal from anywhere between 5 and 20 different feeds. Let's call it 10, which brings our list of data feeds to 60 before we get into any futures and bond markets, or into Europe and Asia.

Consuming the above mentioned brings us close to 100 already.

Add in feeds from

- Google Trends

- analytics about consumer trends from say 5 different providers, each covering say 5 demographics across 5 sectors

- Twitter data feeds aplenty, following 200-300 different hashtags representing stocks, e.g. #AAPL

- Twitter follows for news feeds, like AP, Reuters, etc., to do sentiment analysis.

- don't forget your machine-readable news feeds

- now let's get into government data. FRED has 100's of data sets to parse to use as model inputs. See: https://research.stlouisfed.org/fred2/

What are we up to? 200-300 different signals, and we haven't really broken a sweat yet.

What sort of source would you like me to provide other than what I see on my computer screen and do for a living?


I know of one company that takes photos of retail parking lots using nanosats and then sells that data to hedge funds. They're trying to put up more satellites so they can get even more photos per unit time.


I've been trying to think of a practical, effective way for smartphone end users to protect their confidentiality. This is the simplest solution I've come up with, but I'd appreciate any feedback:

Hardware:

* Tablet, or smartphone with baseband disabled.

* Cellular-wifi router (i.e., wifi hotspot), prepaid so the provider doesn't need your personal info.

.

Software:

* Android with per-app permissions controlled by user (e.g., user can enable/disable access to location data for particular apps). This could be a fork of Android or maybe there is security software that could be installed, such as on a rooted phone.

* VOIP app on phone

* VPN

.

By decoupling the baseband from the handheld computer (i.e., by keeping the tablet and cellular connection on different devices), using the cellular service without providing identifying info, and sending only encrypted data over the cellular connection (via VPN), you would protect your confidentiality from the cellular provider.

Because your phone number is decoupled from your cellular service (because you use VOIP over a VPN), nobody can tie your phone number to your location.

Of course someone who is determined could track you down. Your identity needs to be tied to your phone number or nobody will know how to call you; and your VOIP vendor could point someone to your VPN provider, who could point them to your cellular provider, who could figure out which hotspot you use. But I think it does protect you from everyday mass surveillance.

Any thoughts on how practical or effective this would be?


The problem with all of this is that you would need to swap SIM cards on the cellular-wifi router almost every time you change locations -- and that's if you can even find a data vendor willing to sell you a prepaid data plan without a credit card on file.

Practically, this solution you've come up with is too fragile to be relied upon. It's only secure if you can maintain this level of care with every operation you perform on the phone. This is why the NSA's data collection is so insidious -- it only takes one slip-up to connect everything you ever did anonymously to your "profile".

With the type of data collection that is done today, it's nearly impossible to avoid unless you use the Internet in a very different way than the average person. Avoiding data collection through technical means is a futile exercise at this point -- if you object to the data collection, public policy is the best avenue to prevent it.


I'm not sure how any of this will help, if there's a chance they can link the phone/router IMSEI[1] to your person in some way.

It's reasonable to assume, that any network provider will, for billing, internal use (eg: document/fight network abuse) and/or as mandated by data tracking laws store IMEI, MAC and IMSEI numbers, along with connection meta-data (tower, exact location if available, timestamp etc).

I don't think it's possible to get meaningful privacy from an attacker that either a) is your service provider, or b) works with your service provider (eg: NSA, buyers of data for "advertising")?

You could use TOR - but you'd have to use it for everything -- which pretty much rules out real-time voice/video chat AFAIK. Perhaps a VPN that crosses jurisdictions and corporate ownership would help against "commercial" attackers (eg: the ad networks). I'm doubtful how effective such a "single-hop" defense would be against state actors.

Not that I necessarily think all threat models should try to circumvent illegal government wiretaps - just pointing out that if that is wanted in addition to just un-linking meta-data on data/communication from meta-data/data on shopping/banking -- the needed security measures are likely to be inconvenient.

[1] https://en.wikipedia.org/wiki/International_Mobile_Station_E...


> you would need to swap SIM cards on the cellular-wifi router almost every time you change locations

Assuming the user can get an anonymous pre-paid service (which I agree isn't certain), why would they need to change SIMs? The cellular provider only sees an unknown person sending encrypted data to a VPN hosting service.


Assume they're selling your home address to stores you visited. If you carry the same phone to the same house every day for a month, they can guess your home address.(1) Boom: targeted direct mail.

Let's take this a step further: they link your phone's location to when you leave the checkout line, and now they know where you live and what you're buying.

(1) They may or may not be doing this, but at a technical level it would be easy.


Yes, I assume that a store could very easily link WiFi MAC addresses to CC numbers, and then loyalty cards.

The probability that someone checks out and leaves the store at the same time that you do is high. But two times in a row? Lower. 5 times in a row? Extremely small.
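
The arithmetic behind that, as a sketch. Assume (purely for illustration) that any one unrelated device has a 5% chance of leaving the store in the same time window as a given checkout, that visits are independent, and that the store sees 10,000 distinct devices. Then the expected number of devices that coincide by chance on every one of k visits is N * p^k:

```python
def expected_false_matches(p_coincide, visits, candidate_devices):
    """Expected number of unrelated devices that coincide on every visit.

    Assumes each visit is an independent event with the same probability.
    """
    return candidate_devices * p_coincide ** visits

# Hypothetical numbers: 5% coincidence chance per visit, 10,000 devices.
for k in (1, 2, 5):
    print(k, expected_false_matches(0.05, k, 10_000))
```

With those made-up numbers there are hundreds of candidates after one visit, a couple dozen after two, and well under one after five, so the one device that keeps showing up is almost certainly the cardholder's.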


Identifying home or work location is something Google Maps does for your convenience. It would be trivial for cell operators.


Carriers don't have data that specific; cell phone triangulation isn't that precise.


There is a very thorough guide for achieving pretty much this from the Tor Project blog.[0] The most significant part in this process is completely separating the baseband from your user device.

[0] https://blog.torproject.org/blog/mission-impossible-hardenin...


Sounds great so far. There would have to be a temporary switch for 911 emergency calls from the user-endpoint. I have heard that VOIP services are not compatible with the phone/location trace programming necessary for emergency dispatch centers.


VoIP uses E911. Usually you pre-register your address with the VoIP service. But services with dedicated apps may pass along your GPS location today, I'm not too sure there.


There are ways to pass the coordinates in a VoIP 911 call. Unfortunately, these standards are designed by astronaut architects and are rather cumbersome. For instance, one uses MIME multipart bodies inside a SIP message. Which few systems handle properly (because it's such a bad idea and has no practical use cases).

Instead of being engineers and saying "ok, well just throw in a header X-911Location" or something simple, they gotta make shit complicated. It's ridiculous.

They also use this IETF designed address format. Which is rather ... comprehensive. All sorts of designation for streets and branches and so on. Except... none of the 911 systems actually use that. So they design this protocol with no care to how actual systems work. Or how actual street addresses in the US work. They basically made up their own idea of how address topographies could work.

But I'm guessing some of the larger companies doing VoIP probably have it sorted (like when you do a VoIP call on T-Mobile and there's no GSM just WiFi).


> prepaid so the provider doesn't need your personal info.

I don't think this is possible in the USA anymore. I tried with AT&T, Verizon, and T-Mobile, but I'm sure there are smaller players. All required some of my personal info, although for prepaid T-Mobile did not ask for my SSN. All told me this is because you're getting equipment "practically free". When I offered to actually pay for it, I was always told "we don't have an option like that".

Europe is much simpler. I walked into a Play store [1], put $60 on the table, and got a USB dongle with a prepaid scratch card for 6GB on a 3G network. No IDs, no documentation, nothing. Refills are very simple - you purchase scratch-offs for 3/6/9 GB or more, over the counter, no ID required.

[1] http://www.play.pl


I think you can still get around that with some effort.

In 2010, I purchased a dumb phone/SIM card for use on a prepaid plan. At the time it was the only way to get onto a super cheap prepaid plan, since they wouldn't sell you a SIM or the plan directly. I then put the SIM into smartphones I had bought myself (multiple Nexus devices and a OnePlus One). For convenience, I refill my account online with a CC, but I technically could simply buy the T-Mobile refill cards for cash and then activate the time on them.

As far as I know, there's nothing stopping me from doing this fresh again with a new prepaid phone and then never associating my real identity in any way.


They don't even need to; your telco can know your home address based on a week's worth of location data, and could likely determine more about you based on where you go to work, shop, etc. Humans are creatures of habit, so it's relatively easy to draw inferences from simple patterns (i.e. if you spend 6-8 hours in the same location every night, that's probably where you live.) All it takes is one time checking your e-mail from an IP address to forever link that IP address to you. Know enough IP addresses someone has connected from and you can come up with a pretty good picture of their friend circle, their movements, how old they probably are, etc.
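
To show how little it takes, here is a toy version of the "where you spend your nights is where you live" inference. The ping format, the 22:00-06:00 window, and rounding coordinates to three decimals as a crude grid are all assumptions for the sketch, not how any carrier actually does it.

```python
from collections import Counter
from datetime import datetime

def guess_home(pings, night_start=22, night_end=6, precision=3):
    """Return the most frequent nighttime grid cell, or None if no night pings.

    pings: iterable of (timestamp, lat, lon); cells are coordinates rounded
    to `precision` decimal places.
    """
    cells = Counter()
    for ts, lat, lon in pings:
        if ts.hour >= night_start or ts.hour < night_end:
            cells[(round(lat, precision), round(lon, precision))] += 1
    return cells.most_common(1)[0][0] if cells else None

# One hypothetical week: nights at home, days at work, one late night out.
home, work, bar = (34.0522, -118.2437), (34.0407, -118.2468), (34.0900, -118.3617)
pings = []
for day in range(1, 8):
    pings.append((datetime(2015, 10, day, 23, 0), *home))
    pings.append((datetime(2015, 10, day, 3, 0), *home))
    pings.append((datetime(2015, 10, day, 10, 0), *work))
    pings.append((datetime(2015, 10, day, 14, 0), *work))
pings.append((datetime(2015, 10, 3, 0, 30), *bar))
guess = guess_home(pings)
```

Swap the night window for office hours and the same few lines give a guess at the workplace.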

I work in the telecom industry, and this behavior exists because it's not explicitly illegal. This type of data mining is just scratching the surface of their capability -- turns out that a combination of about 30 seconds worth of data scraped off http traffic (not https, though even https can tell you something) and a location are enough to identify most individuals and link them to a profile in a DMP -- which can tell you all sorts of information like which products you've purchased recently (both online and in brick and mortar stores with a credit card), any relevant demographic information, and even what kind of porn you like.

The real restraint on this has been in the use of this information. Marketers have been remarkably conservative in using this information; likely for fear of scaring off customers with "creepy" data. But rest assured they know more about you than you do yourself.


Thanks; that is valuable information. So my whole plan (in the top-level post at the root of this discussion) is hopeless, at least in terms of having anonymous, untracked phone usage?

> your telco can know your home address based on a week's worth of location data

How accurate is that location data, do you know? Within 10 yards? 100 yards?


I can't remember the limit, but it's a legal one not technological. The military has a legal monopoly on precise GPS.


Similar here. I too have used my CC for refills and may have given my name for caller ID purposes, but I don't think it was mandatory nor do they have my SSN or other hard details on my identity.

AFAIK you shouldn't have to get a throwaway phone either. "SIM Card Kit" or "Bring your own phone/device" may be keywords to research -- you should be able to get a SIM that you can put in any GSM phone, and it will come with instructions about a website to visit, or phone number to call, to activate it. If they insist on names etc and it's prepaid, you could give a fake name since it will have no bearing on your ability to use it, although I'd be worried that it might be illegal for some stupid reason.


Go to a smaller vendor like Cricket; they will still do cash prepaid.


Computer + wifi hotspot is a strong local optimum. I started down this road, and ended up with a Samsung i9500 because I wanted something small enough to fit in my pocket (everything without a baseband was too big, or wicked old). Hotspot is from Freedompop - $4/mo is great for experimenting.

I didn't have a clear direction after that, so I proceeded with the works-right-now solution of Google Hangouts (with all of its associated proctology). I haven't done much else with it besides backslide a bit by getting a cheap SIM so that it would ring reliably (Hangouts is a flaky POS).

Location tracking is my biggest concern (Getting on the PSTN anonymously seems like a completely different problem, and I'm less interested in tackling it), but I don't see much way around it when you're using the cell network for backhaul. Unless you religiously shut off your wifi point, pay for a new wifi point + plan in cash every few weeks, and get enough people doing this that you can blend in.

IMHO all of these guides that talk about prepaid sims and burner phones really only work for exceptional situations where someone is willing to jump through many opsec hoops. They aren't congruent with people's standard expectations of cell phones. Any solution has to roughly work with people's expectations to be adopted, since most people only casually want to defend their privacy.

The proper privacy-preserving cell solution would use bearer tokens to pay for network access, and have no device identifier tokens. This is obviously a pipe dream. The only advance I see on the horizon is as more wifi points open up, ideally coupled with software control of your identifiable cell radio that would selectively allow tracking for checking in if it had been too long since seeing a wifi spot. Most people are probably only out of the range of wifi for an hour at most times, so if it was acceptable to delay reception of messages that long, a lot of every day privacy could be practically achieved.

If you had $100M perhaps it would be possible to start a privacy-preserving MVNO with devices that shuffled identifying information every day.


Thanks for the comment -- just a reminder for people reading this that in the wifi hotspot solution case, in order to thwart location tracking you should change your device MAC address, otherwise the wifi providers can identify your device (and some of them will!).
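
For what it's worth, generating a replacement address is the easy part. A minimal sketch (the function name is made up) that produces a random locally administered, unicast MAC; actually applying it to an interface is OS-specific and not shown here:

```python
import secrets

def random_local_mac():
    """Return a random MAC with the locally-administered bit set and the
    multicast bit cleared, formatted as colon-separated hex."""
    octets = bytearray(secrets.token_bytes(6))
    # Bit 0x02 of the first octet marks the address as locally administered;
    # bit 0x01 would mark it as multicast, so clear it.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)

mac = random_local_mac()
```

Setting the locally-administered bit keeps the random address out of the vendor-assigned space, so it doesn't impersonate a real manufacturer's device.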

An unexpected obstacle about the privacy-protective MVNO that I heard about from someone who was investigating this is taxes on "telephone lines" that the MVNO is supposed to pay whenever it "activates" a "line". It may be difficult to reconcile this with changing subscriber identities every day (assuming you can get ahold of devices that change device identity every day).

Whenever I talk about this I say: it's too late to change the cell phone network cheaply now (though we should still be vocal about the problem and not give up: we shouldn't accept that there is someone who knows where almost everyone is almost all of the time, which is the case today). If you're designing a new communications system, make sure that it starts with privacy protection and user and device anonymity, and layers optional identity on top where needed, rather than the reverse! Let's not be saying in 2030 "oh, if only people in 2015 had thought about the privacy issues with this technology...".


It seems like you could solve the "new line" problem within a constant factor - just pay the cost for a pool of triple the number of "lines", which then get allocated a day ahead of time by spending a bearer token (round-tripped through eg TOR such that the provider cannot link the new and the old). Device "identities" can be reused because the mix pool is constrained by the subscriber base anyway.

I'd worry about the IMEI/etc becoming a secret token that could be reused after it was assigned to someone else's device. But even if the SIM's technology couldn't fix that, it could be fixed at a higher level after gaining network transit.

But yeah, I was teetering between $30M and $100M for my estimate, and the unknown unknown wtfs pushed me on up :>


I'm not positive whether it counts as a "new line" if you swap existing lines of existing users, but maybe there's some way to arrange it to avoid triggering the tax every day for every single user.


Thanks. This deserves to be repeated:

it's too late to change the cell phone network cheaply now (though we should still be vocal about the problem and not give up: we shouldn't accept that there is someone who knows where almost everyone is almost all of the time, which is the case today). If you're designing a new communications system, make sure that it starts with privacy protection and user and device anonymity, and layers optional identity on top where needed, rather than the reverse! Let's not be saying in 2030 "oh, if only people in 2015 had thought about the privacy issues with this technology...".


Tax basis for mobile in the US is generally usage based to satisfy USAC/USF; not per act.


> Location tracking is my biggest concern ... but I don't see much way around it when you're using the cell network for backhaul

If the cellular provider doesn't know who you are, why does it matter if they track your location? Under my proposed plan, they won't know who you are because the prepaid cellular service is anonymous and all your data on the wire is encrypted and going to the same VPN host.

> Getting on the PSTN anonymously seems like a completely different problem, and I'm less interested in solving it

Is there a VOIP service that will take bitcoin or some other anonymous currency? The problem is that you have to give people your name and phone number (unless you do only outgoing calls); inevitably they will become associated in many databases.


> If the cellular provider doesn't know who you are, why does it matter if they track your location?

I guess they won't, so you'd defeat some commercial advertising. But the location information still gets dumped to government databases which will easily correlate your path to home/work/friends/CC etc and find an identity for that IMEI. It's similar to saying that Bitcoin is "anonymous" - sure, if you jump through a lot of opsec hoops you can achieve privacy for a particular transaction. But the system lacks the stronger property of untraceability, which is required for common casual users to retain their privacy.

Pseudonymous PSTN access doesn't actually seem that hard - a prepaid CC to some VOIP provider. It's just orthogonal to location tracking, so I haven't thought much about it.


To add, IMO there is a gap in the market for a dumb phone that can also act as a 4G router. I can then pair it with my tablet/iPod touch when I need to go online.

But most times, I want to use the phone just to make/receive calls/text & have good battery. No need for a big display or other stuff.


The biggest problem with this, I think, is that it flies in the face of the goals and incentives of all service providers. They want your personal data for marketing (not to mention engineering reasons like infrastructure optimizations), and they need your personal data because government.

Some commenters mentioned that you can still get burner SIMs in Europe, but last time I checked it was incredibly hard anywhere in the world (maybe I don't know where to look) - even in China they want your official documents to sell you a prepaid card. The reason I've always been told, which sounds plausible to me, is that easy access to burner phones leads to too much mess with criminals using them for their criminal things, and with random pranksters calling in fake bomb alerts.


> it was incredibly hard anywhere in the world (maybe I don't know where to look)

I was never asked for any sort of ID to get a SIM in Mexico or Vietnam. In any case, retail mobile store workers don't care enough / can't tell a real document from a fake one.


When I was in Europe and Latin America both, I could buy SIMs by writing down whatever number I concocted for my DNI. Now, I actually wrote my passport number down but nobody really looked. One time I used my CA driver's license, which could have been a Boy Scout troop membership card for all anyone cared.


Note that document forgery is a "real crime" most everywhere - so if you were caught (perhaps due to some unrelated investigation) - it might not be the best idea.

I'm not saying you shouldn't break the law to preserve your anonymity (at your own discretion) -- I'm just saying it's something to take into consideration. If you for example fear that you might get dropped in a black site, fed polonium or pushed down some stairs over your on-line activities -- a little document forgery might be just the ticket.


Astonishingly simple in the states to get a SIM without an ID; a random exception.


>Android with per-app permissions controlled by user (e.g., user can enable/disable access to location data for particular apps). This could be a fork of Android or maybe there is security software that could be installed, such as on a rooted phone.

This is a built-in feature in Marshmallow [0].

[0]: http://www.greenbot.com/article/2990078/android/how-to-toggl...


Yes, one can limit some permissions, but there are others that vanilla Android will not allow one to block. For instance: Device ID. I don't have a complete list of the un-blockable permissions, but simply using Android's built-in functionality is not sufficient to block all identifiable data that your phone's apps leak.


> simply using Android's built-in functionality is not sufficient to block all identifiable data that your phone's apps leak

Do you happen to know a solution? I know there are many confidentiality utilities and Android forks which emphasize it, but I don't know enough to evaluate them.


Not exactly. Years ago I used an app called LBE Privacy Guard with success, but it's closed-source and cannot be trusted with such a privileged level of access to my phone.

Now I use a combination of blocking with the built-in app, running a custom ROM with root (which allows you to use many of the known privacy apps at their full potential), regularly changing my DeviceID with the help of yet another app and routing some of my device traffic through Tor via Orbot. It's definitely not perfect, but I feel like I'm at least partially thwarting the rampant collection/trade of my personal and private data.

So, sorry I can't be of more help, but you really have to just dig in and read the fine print with this sort of thing. Many of the apps only work with specific versions of Android, require root, might require ROM patching, etc. YMMV, unfortunately. Good luck.


Doesn't solve the problem of the network reporting your location (disabling the baseband does, but operating system level permissions don't impact the network).

I know that's outside the scope of your comment, but I wanted to point out that the baseband is not a thing the operating system really controls in any meaningful fashion.


An Android tablet with no baseband, e.g. a Nexus 9. FreedomPop sells such a device.

permissions: Android Marshmallow has this bog standard, no root needed, so might have to watch to make sure it's available on whatever device you choose.

voip: google voice/hangouts can make this possible, also skype and a number of others.

VPN: basically anything android compatible.

It's not terribly convenient to set up initially, but it's not difficult anymore either. I actually did this for a few months when I broke my phone and decided to go all IP instead of buying a new phone (I had the tablet and the hotspot device; my previous phone didn't have data). It worked well enough in the Los Angeles area, but outside a city it was basically non-functional.

The other points about 911 and such are bigger issues than this.


VoIP calls over VPN over cellular data are likely to have extreme reliability, latency, and jitter problems.


I was wondering about that.

* Regarding the VPN: Inevitably it adds an extra hop but with a VPN that provides sufficient bandwidth, low latency, and sufficient processing resources to decrypt/encrypt at wire speed -- I assume that either it will cost extra or will be something the user has to set up at a good hosting provider -- couldn't performance be sufficient for voice? IIRC, from long ago, voice needs ~80 Kbps.

* Regarding cellular data: Cellular connections are very widely used for voice, of course. Cellular data connections are used now (e.g., VoLTE). On one hand I'd have the same doubts you do; on the other it seems to work. Aren't there already VOIP apps?


"Uncompressed" voice "needs" 64kbps (hence ISDN was 2 voice channels etc). If you use a modern codec, you could probably get away with less, for high-quality mono voice. Note that you might not want to use VBR, for other reasons (I thought this was already submitted to hn - couldn't find it - so here's a fresh submission):

"Attacks on packet length may be surprisingly good: Hookt on fon-iks": https://news.ycombinator.com/item?id=10462363

That said, I suspect that e.g. Ogg Vorbis at 64 kbps CBR would be fine for voice, assuming you could keep the latency down.

Latency and drops would probably be the biggest issue. And latency of course sets a limit on the number of hops -- say, if you wanted to hop through a European jurisdiction from the West Coast.

I'm actually planning to (try to) move to a pure data-oriented (not sure if there will remain a cellular part, or if I'll just use wlan) (probably) SIP for all my phone needs. That'll probably involve a SIP server on dedicated (rented) hw in Germany, and VPN from the phone/laptop. Not quite sure yet if it'll work out, or if it'll be good enough.
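
For what it's worth, the 64 kbps figure and the ~80 Kbps figure upthread both fall out of simple arithmetic. A rough sketch, assuming G.711-style sampling (8 kHz, 8 bits per sample), 20 ms packets, and 40 bytes of IPv4/UDP/RTP headers per packet. The 60 bytes of VPN overhead is a made-up placeholder, since the real number depends on the tunnel:

```python
# Back-of-the-envelope VoIP bandwidth, one direction.
SAMPLE_RATE_HZ = 8000        # narrowband telephony sampling rate
BITS_PER_SAMPLE = 8          # G.711-style companded PCM
PACKET_MS = 20               # assumed packetization interval
HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP headers, no options


def codec_bitrate_bps(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bitrate of the audio payload alone."""
    return sample_rate_hz * bits


def ip_bitrate_bps(codec_bps, packet_ms=PACKET_MS, extra_header_bytes=0):
    """Bitrate at the IP layer once per-packet headers are added."""
    packets_per_second = 1000 // packet_ms
    payload_bytes = codec_bps * packet_ms // 1000 // 8
    total_bytes = payload_bytes + HEADER_BYTES + extra_header_bytes
    return total_bytes * 8 * packets_per_second


raw = codec_bitrate_bps()
on_wire = ip_bitrate_bps(raw)
tunneled = ip_bitrate_bps(raw, extra_header_bytes=60)  # hypothetical VPN overhead
print(raw, on_wire, tunneled)
```

So the headers matter about as much as the codec choice: with small packets, a lower-bitrate codec saves less than you'd expect once the per-packet overhead is counted.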


I think we should differentiate two things:

First, there's state-level surveillance. It's a hard technical challenge and I'm not even sure those steps would help.

Second, there's what this article talks about. Using a VPN (and maybe some software that turns off the 3G/4G connection unless it's strictly necessary) would probably neutralize most of this issue, because planting baseband malware and risking exposure doesn't seem to be worth it just for ads.


> First, there's state-level surveillance. It's a hard technical challenge and I'm not even sure those steps would help.

If the state is determined to track or identify the user, I agree this doesn't help. It might delay them a little. But I don't mind the state investigating people for legitimate reasons (e.g., under a warrant); I'm not trying to protect criminals. Also this won't protect people persecuted by repressive states - that's a very valuable goal, but outside the scope of this idea.

For dragnet surveillance I suspect this would work, simply by adding enough complexity to the task that it's not worth it for one more data point among billions.

Of course anyone could make a social graph based on my phone calls and learn a lot that way, but I don't think anonymous phone calls are possible. If I want to receive incoming calls then I have to give out my phone number; my name and number inevitably become associated. (I can think of a few weak solutions, such as having many phone numbers, but that's imperfect and impractical.)


The thing is: just by doing those things, you mark yourself as a target (maybe worthy of deeper probing).

And if everybody does so (hard to believe), the state, since it seems to really value dragnets, would make another move in this probably infinite cat-and-mouse game.


I would consider this illegal wire tapping. Just because the carrier has the location data as part of its operations doesn't mean it can use that data for any other reason besides providing service.

I am glad this story is out. I have seen AirSage data and it is easy to deanonymize. This company shouldn't be in business.


It is illegal but cops don't care because they want the same data. Time for a massive class action lawsuit.


Speaking of wiretapping, this kind of tracking also boosts the carrier-NSA partnership (more data the NSA can get through their friends, the carriers).


I can understand that when people are using a free service, they are the product. But mobile contracts are by no means free. I find that amazing that the TelCos would even contemplate charging their customers and at the same time using them as products.


Cable TV is also not free, but people put up with ads (at least in the US). In fact, I'm not sure people even remember that one of the original propositions of cable was that it was ad free.

I think a more accurate adage would be that if companies can get away with ads (or gathering and selling personal data), then they will do so. Unfortunately it would appear that giving money directly to service providers does not actually protect you from such things, and I suspect the reason is fairly straightforward: All companies are driven to increase margins as much as possible, and will eventually feel financial pressure to try such measures. Unless consumers object strongly (i.e.: leave the service in numbers large enough to offset the benefits of a measure under consideration), such measures will in general find their way into use.

So what we're really saying is consumers need to pay companies more than the money they would get otherwise. If consumers aren't willing to pay that price (and it shouldn't surprise us if they aren't---this can be a lot of money), then we shouldn't be surprised if such things show up, regardless of whether the service is paid or not.


>In fact, I'm not sure people even remember that one of the original propositions of cable was that it was ad free.

That's mostly a myth. The original purpose of cable was to get TV signal in areas where broadcast didn't go. The first basic cable channels were TBS--which had advertisements--and Christian Broadcast Network--which probably didn't. The unfiltered cable stations, for the most part, had advertisements.

You are likely remembering HBO, Cinemax, The Movie Channel advertisements. They didn't have advertisements because you had to pay per channel.


> So what we're really saying is consumers need to pay companies more than the money they would get otherwise.

And what will stop them from running ads on top of that anyway?


The TV business model is dying. A few more years and they will only be watched in retirement homes.


The "if you're not paying, you're the product" quip was just a way to get people to swallow the surveillance business model.

Now that it's a viable model, why would anyone choose to have N-1 revenue streams when they could have N? It's not like businesses always act on principle over profits.

Combine the consumer's low perceived value of privacy (thanks to intentional and unintentional actions of the businesses doing the surveillance), the fact privacy is largely a market for lemons, and the low number of options in the marketplace. Together you get service providers that rarely lose business for choosing to surveil their customers.


> I find that amazing that the TelCos would even contemplate charging their customers and at the same time using them as products.

Nothing the telcos might do would be amazing to me anymore.


> I find that amazing that the TelCos would even contemplate charging their customers and at the same time using them as products.

What are you going to do? Go to a TelCo that doesn't? It's a gamble, but considering the limited alternatives it's not much of one.


It won't take long for alternative phone OSes to appear that do not broadcast location (whether that complies with regulations or not). If all the telco can tell is which tower is activated, it will greatly limit their spying capacity.


The network itself knows your location because your signal is usually visible to multiple cell towers. They can triangulate your location from that.
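
A toy illustration of the geometry, assuming three towers at known 2D coordinates and a distance estimate to each (in practice these come from timing or signal strength and are noisy). Strictly speaking, working from distances rather than angles is trilateration. The coordinates and function name are made up, and real networks use more elaborate methods:

```python
import math


def trilaterate(towers, distances):
    """Estimate (x, y) from three (x, y) towers and a distance to each."""
    (x1, y1), (x2, y2), (x3, y3) = towers
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two
    # leaves two linear equations in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("towers are collinear; position is ambiguous")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


towers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
phone = (3.0, 4.0)
distances = [math.hypot(phone[0] - tx, phone[1] - ty) for tx, ty in towers]
print(trilaterate(towers, distances))
```

Nothing here depends on the phone's OS cooperating; the inputs are all measured on the network side.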


You mean like the NYT and pretty much every magazine sells the product, but also has ads?


and those companies are doing great!


The ones who know what they are doing are doing great, yeah.


It's crazy. I was at a big data conference and had a salesperson tell me that they are a broker for several of the large telcos and that you could use the telcos' real-world data as a datapoint for your programmatic ad buys. His example was horrible, but he said that they found that diaper companies had a shot at advertising to dads and getting moms to try a new brand, so they used cell data to look at porn usage and browsing patterns that show that they have a child to do programmatic buys. It's getting way too creepy for me.


What conference was that?


They are using public airwaves. "We" can regulate them. If there is any "we" remaining in our shitty excuse for democracy.


I am always surprised that ad block users do not also refuse to use smartphones.

If you ask me, I'd rather Amazon retarget a bag of chips at me via a 300x250 display ad than Verizon sell my location breadcrumbs to some unknown entity.


Even more amazing to me is Surveymonkey's Contribute Plugin [1] - where you voluntarily run all your mobile device's traffic through their VPN and they analyze your browsing and app use... (and I tend to like SM as a company, but I can't even imagine agreeing to that).

"The Contribute Plug-in enables your iOS device to use our free VPN service that helps to secure your mobile data. In exchange for providing you with this free service, we collect and analyze some of your mobile data that passes through our VPN to gain insights and understand how consumers like you use mobile apps and mobile devices. We may also combine profile data from your Contribute account with your mobile data."

1: http://help.surveymonkey.com/articles/en_US/kb/SurveyMonkey-...


What in god's name does doing this service have to do with SM's core business?

Let's not forget who SM was tightly leaning in with...

/tinfoil


You can't see a connection between marketing research and browsing habits?


I can't see WHY I would want a survey company to also be my browsing VPN -- and then also telling me they will be mining my habits..

I can't make the connection as to why I, as a user, would ever want to use that service from them....


Ah, I see. I misread your comment then. I'm not familiar with SM's product, but there is a high chance they are incentivizing participation.

edit: Yes, this seems to be the case:

"You can install the Contribute Plug-in on an iOS device (US only) to increase your contribution to participating charities from $0.50 to $1.00 each time you take a survey."


I'd say the opposite. The bag of chips display ad is annoying, ugly, and slows down whatever I'm trying to do on the web.

The location breadcrumbs ... well, it's not that I really want those shared. But at least it happens entirely in the background. I'm not even sure I care that much, given that I never even see the precision ad it theoretically enables.


upvote. I don't agree with you, but I don't think you deserve downvotes for what you are saying.


Adblocking is available on mobile platforms.


Carriers have been known to add unblockable identifiers to your mobile traffic, and sell that data.


Not really available on Android. Google is using the "not interfering with other apps" excuse to ban all adblockers from the store.


> Not really available on Android. Google is using the "not interfering with other apps" excuse to ban all adblockers from the store.

Good thing Android isn't iOS and you can install whatever the hell you want, whether it's on the Play Store or not.


Ghosted has a _web browser_ for Android. Works great on 5.1.1.


$!#/# AutoCorrect... Ghostery has...


Ghostery browser FTW.



As the article you linked to mentions, GhostRank is opt-in, and I am always careful to never opt in to it.


I have bad news for you: Ghostery's privacy policy and T&C allow them to sell data about your browsing activity to third parties.

I'll link them here once I get home and off my throttled connection.


If you're talking about GhostRank then that's totally opt-in. If you're not then I'll need to see some evidence of this because their privacy policy mentions nothing about selling browsing habits: https://www.ghostery.com/about-us/privacy-statements/ghoster...


It seems you are correct. However, I remember having removed Ghostery from all of my computers after reading their privacy policy. The current version looks to be fine.


Yeah, privacy is dead and gone.

The other day I was in UK at a Tesco open doors event. They talked mainly about tribes and agile, but also demoed a couple of new technologies.

Turns out they have face tracking operational at all their petrol stations. And they have, in the lab, cameras and software that do face recognition and eye tracking. They plan to send targeted ads and coupons based on which shelf products caught the customer's attention.


Somewhat ironically, given many people's feelings about Google and tracking, Google's Project fi might be the best network for privacy. Yes, Google targets ads based on some portion of your profile, but they do not sell your data to 3rd parties like the carriers.

Edit: I got curious and it looks like fi excludes call data from being shared with other Google services. https://support.google.com/fi/answer/6181037?hl=en

Disclaimer: I work for Google (not on Fi) so take my opinion with whatever size grain of salt you feel is appropriate.


Fi traffic still traverses the telco's network. Unless Google has special privacy clauses in place, there does not appear to be anything blocking Sprint (et al) from implementing the 'what websites are users visiting while in your store' feature mentioned in the article.


Except Telcos cannot link (time, ip) with a paid account when traffic is only passing through them.


I don't get why Google acting as a vertical monopoly in the data tracking -> advertising value chain makes it better.


Because Google doesn't sell your data to marketers and aggregators like telcos do.


But Google is a marketer and aggregator. What is the difference?


That your data isn't passed around to multiple companies that you don't know, have no relationship with and you don't know their privacy policies.


I'm surprised that HN readers would find this surprising. It's well known that cellular providers track user locations, Internet usage, and probably other things. It's well known that a very widely used strategy is to collect as much information as possible about end users by businesses for targeted sales and marketing, and by governments for security/control (to varying degrees depending on where you live, but it's even spread to poor nations such as Sudan [1]).

Why is this story a surprise? I assumed it has been happening for a long time.

[1] http://www.defenseone.com/technology/2015/10/african-states-...


I'm gonna try really hard to not go over the line of the recent rule about too much negativity, but this line of questioning/reasoning really pisses me off. I see it quite a bit on reddit/twitter too. As a matter of fact, one time Jacob Appelbaum asked why people do it and my response was that it seems like a variation on the loaded question fallacy. It implies "you should have expected X", thereby attempting to avoid actual discussion, or shifting the discussion via the Overton window.

It's just such a useless statement. "Why is this a surprise?"

Well, first of all, I'm sure there are plenty of people who might not have had any idea cell data was being used this way and on this scale. Second, I'm willing to bet of the many who might have had a vague idea, this gives a more concrete background with rough numbers to solidify the idea. Third, it's not a surprise to those of us who have been paying attention, but the problem is when we have said something in the past it almost invariably has been ignored or dismissed as paranoid or crazy... At least until a good story or leak comes out and gets enough attention to grab enough media mindshare. Edward Snowden was a classic example of this at work. Sure, many of us knew about tempest and echelon and five-eyes, knew about cell tower metadata issues via watching ownership of said towers. Every time, with few exceptions though, publicly stating these things got us called "conspiracy theorists".

Maybe you just mean this rhetorically and are one of us who have been paying attention, but this statement is not conducive whatsoever to intellectual discussion of a subject, and we need to address its fallacy when we see it because it's too pervasive.


/choir

/agree

next instruction?


Next instruction is to vote out all incumbents in congress and instate term limits in order to shift the balance of power back to the people.

You left that pretty open so I took it where I wanted...


What's particularly outrageous about this is that there is seemingly no way for consumers to make a market "choice" for privacy.


Some of us wish to think the best about people or corporations. Naive, sure, but you have to start your chain of trust somewhere.

> It's well known that a very widely used strategy is to collect as much information as possible about end users by businesses for targeted sales and marketing

Yes, but it's usually limited to free services, or low cost services. It's rarely applied wholesale to a group who is already paying through the nose for crappy service.


> Yes, but it's usually limited to free services, or low cost services. It's rarely applied wholesale to a group who is already paying through the nose for crappy service.

Why would you expect that? There's nothing about increased privacy levels in the ToS or in product marketing, nor was there ever, historically, an actual, significant demand for privacy services. It may change now, as all the stories broken over the years slowly make people care about tracking. But why would anyone assume that if you pay for something, you get special treatment? You get only the minimum required treatment that's spelled out in the contract. That's how it always has been everywhere.


Can you please explain the point of your post?

Why is it relevant whether the story is a "surprise"?

Expressing your blasé attitude does nothing but downplay the significance of this.

The type of people that express arguments like this are the reason that these sort of things continue to happen.

"Oh, the Iraq war was based on lies? Yeah, everyone knows that".

Just because it's an accepted reality does not mean that you should just let things get a free pass. That's how you allow the same things to keep happening.

Congratulations on not being surprised. Everyone is very impressed.

Sorry for the rant. This is probably my #1 pet peeve.


One of my problems with this is that there is such a huge lack of alternatives for some people. I dropped having a cell for almost a year, and it was wonderful, in that I felt more secure about my privacy, but it also greatly affected my interaction, observation, and awareness of my surroundings. After meeting someone and getting a job that required on call, though, it's now hard for me to imagine how I could go back to that again. To me, it's one more way that technology is evolving faster than people can keep up with, but the kicker is that it's ripe for abuse by nefarious entities. As Thomas Drake says, the Stasi would have wet dreams about this tech today, and although you may not think the $currentpower is so bad, what happens when the next guy uses the precedent, forces the telcos to share, and disappears anyone who disagrees about $policy? Oh, and now he has enough data to walk the cat back a few years and ex post facto you to $blacksite?


Google gets into "fiber," which is a strategic threat to carriers.

Carriers get into analytics, which is a strategic threat to google.

The headline is misleading- I think the carriers have been pretty upfront with their shareholders about their intention to get into this space.

In a world with Facebook, Google, et al., writing an article like this without that context is incredibly cynical.


Deanonymization technology is very good. Does that still surprise anybody who's technically knowledgeable?


https://projectbullrun.org/surveillance/2015/video-2015.html...

DJB's hilarious talk on this topic. ("I AM the man in the middle!")


Telcos are using deep learning to cluster and classify users, and model and detect ab/normal behavior for everything from ads to fraud detection. Some are using us: http://deeplearning4j.org


See this paper "Header Enrichment or ISP Enrichment? Emerging Privacy Threats in Mobile Networks": http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/hotmi...

For example Orange in Jordan adds an HTTP header with the phone number of the client to every connection. And there are technical people out there still saying HTTPS/TLS should not be mandatory…
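
Roughly what that looks like on the wire: a sketch of an in-path proxy rewriting a plaintext HTTP request. The header name `X-Example-MSISDN` and the phone number are placeholders for illustration, not what any particular carrier actually sends. With TLS the proxy only sees ciphertext, so there is no request head for it to rewrite:

```python
def enrich_request(raw_request: bytes, msisdn: str) -> bytes:
    """Insert a subscriber-identifying header into a plaintext HTTP/1.1 request."""
    head, sep, body = raw_request.partition(b"\r\n\r\n")
    if not sep:
        # Not a complete request head; pass it through untouched.
        return raw_request
    # Hypothetical header name; real deployments use their own.
    injected = b"X-Example-MSISDN: " + msisdn.encode("ascii")
    return head + b"\r\n" + injected + sep + body


request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
print(enrich_request(request, "15555550100").decode())
```

The server receives the extra header as if the client had sent it, and neither the user nor the app ever sees it.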


This says "Telcos" and then lists specific names. Do we know for sure of any telcos (my concern is US-based) that do not do this?

e.g.: T-Mobile? I have less hope for AT&T...


I wrote this story and spent months trying to get these guys to talk. As I note in the story, "Verizon and Sprint declined to comment for this story. AT&T and T-Mobile said they don’t share consumer or location data with SAP, Sybase, AirSage or Vistar."


The data cannot for all intents and purposes be anonymized and still be useful. I have seen AirSage data: they give the user a new unique identifier, but it is consistent over a 30-day window. That makes it very easy to deanonymize; the unique identifiers are only opaque to people who can't do math.
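
To make that concrete, here is a rough sketch of why a stable pseudonym is weak. It uses made-up records of the form (pseudonym, hour of day, cell id); the record layout, hour ranges, and cell names are all hypothetical and not AirSage's actual format. The most frequent night-time cell and the most frequent working-hours cell give a home/work pair, which is often distinctive enough to join against other data:

```python
from collections import Counter, defaultdict


def home_work_pairs(pings, night=(0, 6), day=(9, 17)):
    """Map each pseudonym to its (likely home cell, likely work cell)."""
    night_cells = defaultdict(Counter)
    day_cells = defaultdict(Counter)
    for pseudonym, hour, cell in pings:
        if night[0] <= hour < night[1]:
            night_cells[pseudonym][cell] += 1
        elif day[0] <= hour < day[1]:
            day_cells[pseudonym][cell] += 1
    pairs = {}
    # Only pseudonyms seen in both time ranges get a pair.
    for pseudonym in night_cells.keys() & day_cells.keys():
        home = night_cells[pseudonym].most_common(1)[0][0]
        work = day_cells[pseudonym].most_common(1)[0][0]
        pairs[pseudonym] = (home, work)
    return pairs


pings = [
    ("u1", 2, "cell_A"), ("u1", 3, "cell_A"), ("u1", 10, "cell_B"),
    ("u1", 14, "cell_B"), ("u1", 15, "cell_C"),
    ("u2", 1, "cell_A"), ("u2", 11, "cell_D"),
]
print(home_work_pairs(pings))
```

Note that u1 and u2 share a home cell but not a workplace, which is exactly why the pair identifies people much better than either location alone.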


Yeah, data anonymization doesn't really work, except perhaps for highly aggregated data (eg: percentage of people out of work in a nation). It does provide a small barrier in the form that "if you don't look, you won't see" -- so it might prevent some inadvertent leaks of private information.

But it should rarely be considered a "security boundary" so to speak.


Why not? It's not hard to set up streaming filters that count how many people visit a certain mall in realtime without ever persisting the raw location stream.
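
Something like the following, say. A minimal sketch assuming a stream of (device id, latitude, longitude) events and a rectangular bounding box standing in for the mall; all names and coordinates are invented. It keeps only salted hashes of device ids in memory for the current window and reports a count, though the set itself is still sensitive for as long as it exists:

```python
import hashlib
import os


class VisitorCounter:
    """Counts distinct devices inside a bounding box; stores no positions."""

    def __init__(self, south, west, north, east):
        self.box = (south, west, north, east)
        self._salt = os.urandom(16)  # replaced on every reset
        self._seen = set()

    def observe(self, device_id, lat, lon):
        south, west, north, east = self.box
        if south <= lat <= north and west <= lon <= east:
            digest = hashlib.sha256(self._salt + device_id.encode()).digest()
            self._seen.add(digest)
        # Events outside the box are dropped without being recorded.

    def count(self):
        return len(self._seen)

    def reset(self):
        """Start a new window; old hashes cannot be linked to new ones."""
        self._salt = os.urandom(16)
        self._seen.clear()


counter = VisitorCounter(34.00, -118.50, 34.01, -118.49)
events = [
    ("imei-1", 34.005, -118.495),
    ("imei-2", 34.200, -118.495),   # outside the box
    ("imei-1", 34.006, -118.494),   # same device again
]
for event in events:
    counter.observe(*event)
print(counter.count())
```

For large crowds you would swap the set for a probabilistic counter so that not even the hashes are kept, but the shape of the pipeline stays the same.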


Right. But when the data you have is detailed logs of individual movements, second by second, meter by meter - stepping down to "number of unique IMEIs in a 100 square km area" is moving to "highly aggregated data". If that is what is seen as anonymization, then yeah. But then you've pretty much reduced a dataset to information that answers a specific question. Typically anonymization is (AFAIK/IMNHO) taken to mean taking a dataset that can answer some questions, and transforming it into a dataset that can still answer some questions we haven't come up with, but avoids answering questions that lead to identifying individuals. E.g. the problem with the NYC taxi driver data:

http://arstechnica.com/tech-policy/2014/06/poorly-anonymized...


What other companies are utilizing this data? Any beyond SAP, Airsage, etc... I'm trying to do some research in this area


> I wrote this story ...

Thanks for joining us here. I'd love to hear more.

That you can get an audience on Hacker News isn't surprising, but how much interest is there from the general public? My very limited experience with non-technical people I talk to is that they don't know anything about it, don't understand the implications, and don't want to bother to figure it out. I'm hoping your much broader experience is more encouraging!

By the way, if you post something to the top level of the discussion identifying yourself, you'll probably get plenty of feedback and interest (unless it's too late for this discussion). Where you posted is buried too deep to be found by most HN readers.


Sure, but it's short term. Calls and messages are dropping every month, even in developing markets. The rise of WhatsApp and similar messengers (FB, Hangouts) is exponential, no hyperbole.

Telcos are dead and they don't want to admit it. I'd bet my money on super cheap mobile ISPs arising soon, based on a completely different technology and making better use of the mostly empty spectrum.


If you're concerned about this, you should use an Android device with a no-contract plan you pay for with cash.


how would this circumvent the issue? you'd still have to get a plan with a carrier. they can still track your location and your habits through the data that you're sending through their network.


They can't correlate that behavior with you as an individual if you don't provide any personal information to the carrier. You could also just run your traffic through a VPN and avoid all of this.


do they really need us to provide personal information? they can get a +/- 10m GPS lock on our home location.


Several dozen people live within a 10m radius of where I do. I'm not even a registered resident at that location, so they'll have a difficult time correlating "my" data to my personal information.


There's probably, over the last year of your movements, a point at which you've made a purchase that can be linked to you. Maybe more than one, and those can be matched against your behavioural data. Or maybe they have parking information.

Perhaps you deal only in cash, and never give your name to a commercial entity (Hotel, gun retailer, car rental or seller...) -- but for most people that isn't true.

I also don't see why anyone would need your exact address. The only use-case that comes to mind is to send you postal mail. If they have your SIM/IMEI/phone number/full name (see above) and behaviour patterns, if and when they want to approach you "in person" that would be easy?

[ed: come to think of it, pair this data with an archive of public web cams, and you could probably a) automatically pick out faces, and b) match recurring faces with location data, to c) pair faces with recorded data streams. Makes all those cameras in the UK seem even more creepy.]


If there is any continuity to your phone usage, then that is you as an individual that is being tracked. That's what it means for a person to have an identity: continuity in behavior patterns.

It's safe to assume those several dozen people don't share your phone. And even if they did, that's a level of familiarity that means you are among a group of people that probably operate as an economic unit. Nobody is stealing your identity here: they are assigning you one.


I believe there are anti-terrorist laws that were passed here in Europe that force you to register burner phones with the provider. At least here in Spain...


They don't track individual users the way you think they do.


Could you be more specific?


They don't care about the house you live in, they care about the zip code, so what you propose is irrelevant. A cpg brand says "tell me where do segments which buy from Retailer XY at location Z live?" and gets an answer. That's what this technology does.


I'm aware of the application you're describing, but aren't you saying that you don't know of applications where they track individuals?

Perhaps that's a bit conspiratorial, but it's hard to believe that people who otherwise have been unscrupulous would suddenly hesitate to use information about individuals. What about job applicants, competitors, the guy dating your daughter, etc.?


I don't think that your solution is appropriate to the problem. The telcos would not make money on our data if we were not customers. We should either a) demand that they prove that we're not part of it, e.g. make it opt-in only, b) demand that they share revenue, or c) sue them into oblivion.


This won't help anything unless you also have no job, don't travel the same route or sleep in the same bed. You are the places you go and the people you interact with.


How do telcos get GPS or wifi-level accurate location information? Are the phones sending that (easy enough to disable with software changes) or some other mechanism?


They get cell-tower-level accurate location information.


There are marketers out there who do ping your cell location so that they can use that to market items to you.


I wonder where they get that number from.


Was anyone super annoyed by the gigantic unnecessary top menu that moves with you as you scroll?



