CanaryTrap: Detecting data misuse by third-party apps on online social networks (arxiv.org)
83 points by massacre 40 days ago | 22 comments

This is a great idea and in my opinion should be a best practice for any company. We're actively working on enabling this functionality for our core data aliasing engine at the company where I work.

The idea is pretty cool when you start to think about adding self-destructing properties to individual pieces of data, so reasoning about data type and entropy becomes a risk modeling problem.

A concrete example: imagine if bank account numbers, credit card numbers, emails, etc. had self-destructing properties, where an outer shell "points" to the data but the underlying data can be destroyed (using techniques like crypto-shredding). The outer shell would have canary properties that work in real-world systems, but since the underlying data is destroyed, all that would be left are the canary properties, with no underlying data to leak.
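To make the "outer shell" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ShreddableRecord` class, the `canary_id` field, and the toy one-time-pad cipher are hypothetical, not how any particular product works. A real system would use an authenticated cipher from a vetted library and keep keys in a key management service.

```python
import secrets


class ShreddableRecord:
    """Hypothetical outer shell: a canary ID plus a payload under a per-record key."""

    def __init__(self, plaintext: str) -> None:
        data = plaintext.encode("utf-8")
        # Toy cipher for illustration only: a one-time pad, i.e. XOR with a
        # random key as long as the data. Not a production choice.
        self._key = secrets.token_bytes(len(data))
        self.ciphertext = bytes(a ^ b for a, b in zip(data, self._key))
        # Canary property: a unique, meaningless token that can be monitored
        # and that survives after the payload is shredded.
        self.canary_id = secrets.token_hex(8)

    def read(self) -> str:
        if self._key is None:
            raise ValueError("record has been shredded")
        plain = bytes(a ^ b for a, b in zip(self.ciphertext, self._key))
        return plain.decode("utf-8")

    def shred(self) -> None:
        # Crypto-shredding: dropping the key makes the ciphertext useless.
        # Note that this does not securely wipe memory; a real system would
        # delete the key from wherever it is actually stored.
        self._key = None
```

After `shred()`, the record still carries its `canary_id`, so a leak of the shell can be detected, but `read()` fails because the key is gone.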

A few good examples of tools that offer something similar:

- https://canarytokens.org/generate

- https://github.com/thinkst/canarytokens

- https://canary.tools/

Pretty cool technology that can really go far.

We started something similar with BreachInsider (https://breachinsider.com) to allow businesses (or I guess individuals?) to do this themselves with minimal overhead or resources. The idea is that they sprinkle these ‘users’ throughout their databases, see where they show up, and get alerted if those users are ever contacted or show up somewhere unusual (Pastebin, etc.).

We ran something similar, firing ‘insiders’ across many of the top 100 sites and services, to spot breaches (either in the traditional sense of security incidents, or lapses in privacy for end users).

canarytokens.org is great. Highly recommend them.

> Our further investigation reveals that Facebook does not fully enforce its policies [9] that require app developers to disclose their data collection and sharing practices as well as respond to data deletion requests by users. Of the analyzed apps, 6% apps fail to provide the required privacy policies, 48% apps do not respond to data deletion requests, and a few apps even continue using user data after confirming data deletion.

This is the really alarming part.

I remember suggesting to our security admin to add a 'honeytoken' user to our production database in case it ever got owned. At least you would know you were owned.

This is good practice. Extra points if you insert them regularly so you can tell when you were owned.
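A rough sketch of how regularly inserted honeytokens could date a breach, in Python. The function names, the `secret` parameter, and the `example.com` domain are all hypothetical; the only idea taken from the comments above is that each token is tied to its insertion date.

```python
import hashlib
from datetime import date
from typing import Iterable, Optional, Tuple


def honeytoken_email(inserted_on: date, secret: str, domain: str = "example.com") -> str:
    """Derive a fake user's email that encodes (opaquely) when it was inserted."""
    digest = hashlib.sha256(f"{secret}:{inserted_on.isoformat()}".encode("utf-8"))
    return f"user-{digest.hexdigest()[:12]}@{domain}"


def breach_window(
    leaked_emails: Iterable[str],
    insert_dates: Iterable[date],
    secret: str,
    domain: str = "example.com",
) -> Optional[Tuple[date, Optional[date]]]:
    """Return (newest leaked insert date, next insert date that did not leak).

    The copy of the database was presumably taken between those two dates.
    Returns None if no honeytoken appears in the leak.
    """
    leaked = set(leaked_emails)
    ordered = sorted(insert_dates)
    present = [d for d in ordered if honeytoken_email(d, secret, domain) in leaked]
    if not present:
        return None
    newest = max(present)
    later = [d for d in ordered if d > newest]
    return newest, (later[0] if later else None)
```

For example, if tokens were inserted on the first of each month and a dump contains the January and February tokens but not the March one, `breach_window` returns the February and March dates as the bounds.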

Great idea! Decades ago, we would do this when selling lists of (voluntary) subscribers. We would seed the lists with our own secret monitoring names and addresses in order to track the use and misuse of the lists. It's a shame that these simple things need to get reinvented constantly. Sometimes, people see differences when they should see similarities.
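The list-seeding trick can be sketched in a few lines of Python. This assumes each buyer gets its own seed address, which is also roughly the per-app idea in the paper's title; the names `seed_address`, `seed_list`, and `attribute`, the `secret` key, and the `example.com` domain are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable, List, Optional


def seed_address(buyer: str, secret: bytes, domain: str = "example.com") -> str:
    """Derive a monitoring address unique to one buyer of the list."""
    tag = hmac.new(secret, buyer.encode("utf-8"), hashlib.sha256).hexdigest()[:10]
    return f"sub-{tag}@{domain}"


def seed_list(subscribers: Iterable[str], buyer: str, secret: bytes) -> List[str]:
    """Return the subscriber list with this buyer's seed address mixed in."""
    return sorted(list(subscribers) + [seed_address(buyer, secret)])


def attribute(recipient: str, buyers: Iterable[str], secret: bytes) -> Optional[str]:
    """Given an address that received unexpected mail, find which buyer's copy leaked."""
    lookup = {seed_address(b, secret): b for b in buyers}
    return lookup.get(recipient)
```

Mail arriving at a seed address from anyone other than the buyer it was issued to suggests that buyer's copy of the list was shared or misused.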

We used those as well.

The research paper[1] contains a list of the 16 apps on page 11.

[1] https://arxiv.org/pdf/2006.15794.pdf

Holy crap, Tom's Hardware (tomshardware.com), the well-known IT hardware review site, is one of the offenders. :(

"Facebook app" is an ambiguous term. IIUC, this is "third-party apps running on the Facebook platform", rather than "apps created by Facebook".

Not that different from Android apps or Windows apps. Facebook also refers to them as apps so calling them "Facebook apps" is a pretty accurate description.

I'm not saying it's inaccurate. I'm saying it's ambiguous. Literally every search result I got on the first page for "Facebook app" was about a mobile app written by Facebook to access a Facebook-owned service (fb, WhatsApp, Instagram).

Since some people might not read the article and just the title, it seemed worth calling out.

Edit: ah, the title was edited from "Facebook app" to "third party social network app". So never mind :)

Hello everyone,

Lead author of the paper here. I am encouraged to see such insightful discussion on our work. Excited to discuss and address any questions/concerns that anyone may have.

A preprint of our full paper can be found here: https://arxiv.org/pdf/2006.15794.pdf.

We are also publicly sharing a disclosure page (https://github.com/shehrozef/canarytrap). This page contains details of third-party apps which are detected as misusing user data or violating Facebook's TOS in our work.

If Facebook really cared, they'd offer to share an anonymized email proxy when connecting for the first time, like Apple does when signing in with Apple on a website that supports its SSO.

This is a good idea, but third-party developers might lose interest in Facebook's SSO if they are not getting users' real email addresses :)

I'm pretty sure they used to, a long time ago.

16 out of 1024 apps... Surprisingly low. Lots of these are games, presumably funded by ads and/or IAP, so scrounging a few more bucks from email lists fits right in.

Hey, lead author of the paper here. I would like to highlight that these detected apps amount to more than one percent of monitored apps. Considering Facebook has millions of apps, there could be tens of thousands of apps potentially misusing users' data.
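The back-of-the-envelope arithmetic behind that estimate, as a small Python sketch. The 16 and 1024 come from the thread; the platform totals passed in below are hypothetical round numbers standing in for "millions of apps", and the extrapolation naively assumes the monitored apps are representative of the whole platform.

```python
def detection_rate(detected: int, monitored: int) -> float:
    """Fraction of monitored apps that were flagged."""
    return detected / monitored


def extrapolate(detected: int, monitored: int, total_apps: int) -> int:
    """Naively scale the observed rate up to a hypothetical platform size."""
    return round(detection_rate(detected, monitored) * total_apps)


rate = detection_rate(16, 1024)
print(f"{rate:.2%}")                      # 1.56%
print(extrapolate(16, 1024, 1_000_000))   # 15625
print(extrapolate(16, 1024, 2_000_000))   # 31250
```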

It's actually 16 out of the 300-odd that were installed properly. But the goal was to prove this is a good way to monitor.
