Mark Zuckerberg Promised a Clear History Tool Almost a Year Ago (buzzfeednews.com)
578 points by jmsflknr 4 months ago | 185 comments



Also, the "View As" feature was "temporarily" disabled months ago because of a security issue, never to return.

"View As" was a vital tool to assess what data is shared with whom, and guarding one's privacy is harder without it.


Impersonation code is some of the worst and scariest to write.

I've seen advisory after advisory just on user impersonation. Just no. Don't implement.

If you want to impersonate, use a private session or have users you control do the testing.


This is why it took Microsoft 20 years to write 'runas.' Privilege elevation is easy; privilege-shedding is hard, and probably orders of magnitude harder when the OS didn't grow up with it the way Unix did.


I even just wish they had a simpler "View As non-friend" feature.


Couldn't you just view your profile while logged out? Or using a browser in incognito mode?


There's a more limited view if you're not logged in with an account. You could create another account just for this, though.


No, you have to either make a fake account or get a friend to show you their view when you temporarily unfriend them. If your profile is moderately locked down you can't view much if you're logged out completely. So there's no good way to quickly see everything about you that's visible to anyone on Facebook.


I wouldn't be surprised if FB doesn't have a concept of "non-friend." People are either friends or friends you haven't met yet.


What's more, now that I think about it, some security settings can be set to "friends of friends".


FB does have a concept of enemy, though. Which is implemented as communication embargo.


Nah, those are "old friends".


You can also block people who are not your friends.


Yeah Google+ had that when it asked you to categorize your contacts into friends, coworkers, family, and acquaintances.


wondered when or if they would bring this back


Taking the approach of a scientist, i.e. a data scientist, this is a hard problem to solve. Not only does Facebook have to implement the simple user.delete.all.data(), but think about the number of internal products that are machine learning based. In order to preserve the usefulness of those models and the hundreds of millions in R&D spent to develop them, you have to carefully understand exactly what products and algos will be affected by just deleting a single record.

The sheer number of records per user is probably astounding, and Facebook has to figure out exactly how deleting thousands of records per user will affect its models.

Thought experiment: imagine all of the Hacker News community and other privacy-minded people go right into the tool and delete everything. That is an entire community and classification of people whose data loss Facebook has to reconcile while fixing its machine learning models. The difficult part is that, since so much current AI logic is based on historical data, a user's deleted data will be nearly impossible to untrain from the model.

I know there are more nuances here, but I think there are some challenges Facebook has in implementing this, and business wise it's a drain on resources that will hurt them. It's easy to think it would be so simple for FB to just call user.delete.all.data(), but the reality is the loss of data could seriously mess up their algos.


> In order to preserve the usefulness of those models and the hundreds of millions in R&D spent to develop them, you have to carefully understand exactly what products and algos will be affected by just deleting a single record.

My answer to that is 'who cares?'

It's not my problem, let me delete all my data.

In fact it's required by law in some places.


Exactly. The ability to delete user data should have been something engineered from the ground up. They are paying the penalty of technical debt and hopefully learning some tough lessons.


> Taking the approach of a scientist, i.e. a data scientist, this is a hard problem to solve. Not only does Facebook have to implement the simple user.delete.all.data(), but think about the number of internal products that are machine learning based. In order to preserve the usefulness of those models and the hundreds of millions in R&D spent to develop them, you have to carefully understand exactly what products and algos will be affected by just deleting a single record.

This is not a science problem and it is not hard to solve: a scientist would delete the data and re-derive their models because they want the best data possible. This is how you "preserve the usefulness of those models," by ensuring that the data has integrity.

> facebook has to figure out a way to understand exactly how thousands of records per user, being deleted, will affect it's models.

This is looking through the telescope backwards. You create your models with the data that you have, and if some data is removed then you regenerate.

You can bet your ass they groom their data after detecting stuff like new botnets and fake clicks, and individual user data does not have to be different.

> That is an entire community and classification of people that facebook has to figure out how to reconcile the loss of data for, and fix its machine learning models

Again, they already know how to do this, and they do it all the time. There are no nuances to it outside of roadblocks installed by FB themselves.

It's 110% a business problem, and 110% Zuckerberg's choice to make it difficult, and to invent difficulties in bringing it about.


I don't think that's what this article is about.

I for one would be OK with a machine learning model that's been trained using my data. I wouldn't demand that it be "re-trained" just because I asked for my history to be deleted. I think most people, given an explanation of what a machine learning model is, would be fine with this.

The clear-history thing is about personally identifiable information, though. It's about not having your information dangling around after you leave Facebook. It's about having the option to delete your shadow profile.

Throughout all the hearings, FB execs have been very wishy-washy about how they handle your browsing history, particularly what they collect outside of FB properties.

People should continue to call them out on this until they get their act together.


The problem is that these lookalike models are constantly being retrained to build accurate profiles that advertisers can then target. If you clear your history, the data used to train these models no longer exists. OC is saying FB needs time to adjust its models for a potential mass deletion of input data.


1. Anonymize the data in the machine learning training corpora.

2. That's it. Problem solved.
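The anonymization step could start with simple pseudonymization. Here's a hedged sketch (the salt value and field names are made up for illustration) that swaps direct identifiers for salted hashes so records still join up for training without naming the user. As a reply below notes, this alone is far from true anonymity, since quasi-identifiers can still re-identify people.

```python
import hashlib

# Assumption: a secret salt, rotated periodically, known only to the pipeline.
SALT = b"rotate-me-regularly"

def pseudonymize(record: dict, id_fields=("user_id", "email", "phone")) -> dict:
    """Replace direct identifiers with truncated salted hashes.

    Non-identifier fields pass through unchanged, so the record remains
    usable as a training example.
    """
    out = dict(record)
    for field in id_fields:
        if field in out:
            digest = hashlib.sha256(SALT + str(out[field]).encode()).hexdigest()
            out[field] = digest[:16]
    return out

row = {"user_id": 42, "email": "a@example.com", "clicked_ad": True}
clean = pseudonymize(row)
assert clean["user_id"] != 42 and clean["clicked_ad"] is True
```

The same input always maps to the same pseudonym (for a given salt), which is what keeps the training corpus internally consistent.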

Question for extra credits:

These machine learning models you speak of, do they mostly do stuff that...

a) is in the interest of the users

b) goes against the interest of the users

Society does not have an obligation to tolerate negative externalities just to protect someone's business model. That is precisely what regulation is for.


Unfortunately, anonymizing the data is not what's being demanded. People want to completely delete their data, which presents a harder problem from a modelling perspective.

Also, worth noting that FB has 2 primary users: you and advertisers. The second class of users is very interested in having your data. Unfortunately, FB seems to favor them bc they are the user who's paying them.


> Unfortunately, anonymizing the data is not what's being demanded. People want to completely delete their data, which presents a harder problem from a modelling perspective.

99.9% of the people are not even aware of the distinction. This is an attempt to follow the letter of the law to create an excuse not to follow the spirit of the law. What people want is personal privacy and cognitive freedom. They don't care about implementation details.

> Also, worth noting that FB has 2 primary users: you and advertisers. The second class of users is very interested in having your data. Unfortunately, FB seems to favor them bc they are the user who's paying them.

You are right. Sorry for the typo. I did not mean "users", I meant the "used".

Supermarkets have two types of users: customers and other businesses that use supermarkets to distribute their products. It would be in the interest of those other businesses if, for example, food safety standards were more lax. But not in the interest of the customers. The food industry has a lot more money and power than the customers, so we solve this with regulation. It's kind of Civilization 101, really.


> Anonymize the data

In our world of Big Data, where it's easy to combine many disparate databases, effectively anonymizing data is also a very hard problem to solve.


> In order to preserve the usefulness of those models and the hundreds of millions in R&D spent to develop them, you have to carefully understand exactly what products and algos will be affected by just deleting a single record.

Indeed, that sounds like a hard problem.

So the question becomes "how much of the hundreds of millions in R&D is being allocated to solving the record-deletion problem?". I'd like to see spending reports and new research publications from Facebook (e.g. retroactive record deletion, attempts to build models that are robust to deletion, attempts to build k-anonymity into models from the start).

> I know there are more nuances here, but I think there are some challenges Facebook has in implementing this, and business wise it's a drain on resources that will hurt them.

I can give Facebook the benefit of the doubt and suppose that they're trying to figure out how to transition smoothly without destroying their models. But I'd also like to see a hard timeline when they promise to roll out user.delete.all.data() anyways (business value be damned).

The real test of whether a company really "takes your privacy seriously" is whether privacy ever becomes a priority, i.e. whether they decide to do the right thing even when it could be bad for the bottom line (under the current business model).


It might be a difficult problem to solve, but there’s no evidence that Facebook is working on it or that they ever intended to work on it.

The easiest explanation is that Zuckerberg was lying. We could give him the benefit of the doubt but since he’s always lying it makes more sense to save time and assume he’s lying about this as well.


Based on your use of the word "always" to describe Zuckerberg as a liar, I infer that you are upset with FB, understandably. I'm not a fan of FB either, and my account is deactivated, but unless you work at FB, commenting on whether they are working on it is just speculation. A more probable speculation is that at least one employee is working on this tool.


> have to implement the simple user.delete.all.data()

That may not be as simple as you think. FB probably has thousands of data types and each one needs to define how a deletion should affect everything else. E.g. if I delete my comment, what happens to your reply? That one is obviously solved already since it's a core part of the product, but there are probably lots of pieces of data that were never designed to be deleted so will need effort to support.
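"Each data type needs to define how a deletion should affect everything else" could be sketched as a policy registry. The type names and policies below are hypothetical illustrations, not anything Facebook has published:

```python
from enum import Enum

class Policy(Enum):
    CASCADE = "cascade"        # delete dependent records too
    TOMBSTONE = "tombstone"    # keep a placeholder so replies still thread
    ANONYMIZE = "anonymize"    # strip authorship, keep the content

# Hypothetical registry: every data type must declare what deletion means
# for it before a delete-everything feature can be offered product-wide.
DELETION_POLICY = {
    "comment": Policy.TOMBSTONE,   # your reply survives under "[deleted]"
    "like": Policy.CASCADE,
    "photo_tag": Policy.CASCADE,
    "group_post": Policy.ANONYMIZE,
}

def deletion_policy(record_type: str) -> Policy:
    # A type with no declared policy blocks the feature: exactly the long
    # tail of never-designed-for-deletion data described above.
    try:
        return DELETION_POLICY[record_type]
    except KeyError:
        raise NotImplementedError(f"no deletion policy for {record_type!r}")
```

The comment/reply case maps to TOMBSTONE here; the hard work is filling in the registry for every legacy type that was never designed to be deleted.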


Yep, totally agree. And, these lookalike models are the bread & butter that drive FB's advertising revenue, so it won't be taken lightly. My guess is that if this feature is ever launched, it'll be in a form that is necessarily not heavily adopted.


Facebook quite literally figured out how to build a global social network connecting billions of people.

Given they solved that insanely hard problem pretty well, I’d suggest they could also solve this if they had sufficient motivation.


Oh no, not the poor neural networks!


Can I propose that we start on a "noise" tool/browser plugin for these surveillance capitalism companies? Not only would it be very interesting to implement, it would help mess up their profitability by training their algorithms on arbitrary stuff that no one is interested in.

It seems more sensible than trying to block them: instead, give them loads of bad data and ruin their business model.

For instance you could pick which set of noise you wanted to feed Google i.e. convince Google you are single mother from Lithuania... or Facebook you're a 63 year old man from Italy or Instagram you are a Teenager from Korea.

Every time a tracker or data point is sent to them to "know" you, it could be a lie. This would lead you out of your search bubble too, so political ads bought by Russians to undermine Hillary's black vote might be clearer to everyone.

Trusting Facebook or Google to do the right thing against their financial interest won't work, will it?
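The persona idea above can be sketched as a toy noise generator. The personas and query pools are made-up illustrations taken from the examples in this comment; a real plugin would have to drive an actual browser session rather than just produce strings:

```python
import random

# Hypothetical decoy personas mapped to plausible-but-false interest queries.
PERSONAS = {
    "single mother, Lithuania": ["daycare vilnius", "part-time jobs kaunas"],
    "63-year-old man, Italy": ["pensione requisiti", "serie a classifica"],
    "teenager, Korea": ["kpop comeback schedule", "pc bang near me"],
}

def noise_queries(persona: str, n: int, seed: int = 0) -> list[str]:
    """Emit n fake interest signals for the chosen persona.

    Seeded RNG so runs are reproducible; a real tool would also randomize
    timing and pacing so the traffic doesn't look machine-generated.
    """
    rng = random.Random(seed)
    pool = PERSONAS[persona]
    return [rng.choice(pool) for _ in range(n)]
```

The hard part, as replies below point out, isn't generating the noise but making it statistically indistinguishable from real usage.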


AdNauseam[0] hides all the ads from you but clicks them in the background to increase the noise ratio. As you might assume, it's not available in the Chrome extension store.

I've used a similar add-on for Google search (the add-on made some human-like Google searches in the background), but I can't remember its name now.

Facebook would be a pointless exercise, since the <div> names are automatically generated. You can't even hide the sidebars you never use with CSS, because the <div> IDs change on every refresh. That, plus the murkiness of what exactly counts as an ad on Facebook, leaves even the most thorough adblockers struggling to work properly, even in an otherwise ideal situation (a desktop browser with an adblocker).

[0] https://adnauseam.io/


> Facebook would be a pointless exercise, since the <div> names are automatically generated.

Also:

> all ads are supposed to contain the word “sponsored” as part of a mandatory disclosure, so users can distinguish between ads and their friends’ posts. Our tool recognized ads by searching for that word. Last year, Facebook added invisible letters to the HTML code of the site. So, to a computer, the word registered as “SpSonSsoSredS.” Later, it also added an invisible “Sponsored” disclosure to posts from your friends.

https://www.propublica.org/article/facebook-blocks-ad-transp...
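If the marker letters are merely interleaved with invisible ones, a subsequence match still finds them. A small sketch, assuming the rendered text (visible and invisible characters together) can be extracted at all:

```python
def contains_as_subsequence(text: str, marker: str = "sponsored") -> bool:
    """True if marker's letters appear in order anywhere in text.

    "SpSonSsoSredS" defeats a literal search for "sponsored", but the
    word is still present as an in-order subsequence of the characters.
    """
    it = iter(text.lower())
    # `ch in it` advances the iterator until ch is found, so the letters
    # must occur in order for every check to succeed.
    return all(ch in it for ch in marker)

assert contains_as_subsequence("SpSonSsoSredS")
```

Note the counter-countermeasure described above defeats even this: once an invisible "Sponsored" is injected into friends' posts too, any text-based detector yields false positives across the board.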


On top of that, Facebook and I have different views on what constitutes an ad.

A Facebook page that I'm following paying money to reach me is an ad according to Facebook. To me, that's extortion.

On the other side, we have notifications about friend suggestions. You can't turn them off in the notification settings. That, to me, is an "ad" for Facebook itself, even though nobody paid them money. It's undesired content that pulls my attention away from what I came there to do in order to promote a product.


> A Facebook page that I'm following paying money to reach me is an ad according to Facebook. To me, that's extortion.

How would you fix that? The only thing I can think of would be a chronological feed where everybody sees every post.


That would indeed be best and has been the number one complaint about Facebook UX since time immemorial. It would also end one of the most annoying issues on FB's timeline - "I just saw an interesting post, but the timeline randomly refreshed itself and now it's gone".

The model they currently use serves two goals: extorting money from fan pages, and hooking users through intermittent reinforcement.


Oh yes, very occasionally it's been an ad, or at least mass-media output. Even when I recall who posted it, it's hard to find.

They probably already record every item displayed on screen (and how long for, and whether my cursor hovered it, etc.) so a list of items I saw (on any device) in the chronological order I saw them should just be a simple db lookup away.


Do they need to solve it or do they need to provide me with the tools to help me filter them out?

If you build a cockpit by measuring everyone's dimensions and averaging them, you'll end up with a cockpit that fits nobody[0]. I believe the same is true for feeds as well.

There's an amount of information that works for the average Facebook user. However, none of us are precisely on that average. By building algorithms whose mission is max consumption by the average user, they've made their product useless to everyone.

[0] https://www.thestar.com/news/insight/2016/01/16/when-us-air-...


Delete Facebook.

Honestly, everything this company does is shady beyond all shame.

Twitter keeps fucking around with their timeline feed, but they've preserved a "chronological, show me everything" option through all of their other bad ideas.


Do you hide tweets liked by the people you follow? Shadow notifications? "In case you've missed it" sections?

After a few refreshes of my Twitter homepage in a short time span, I did a little research: I've edited the CSS to make the distinction of "liked" tweets more obvious. Over 50% of "fresh" content turned out to be likes by the people I follow that are in no way chronological.

You can fight it... for now, but I assume not for long. I've stopped relying on Twitter for that same reason I've stopped relying on Facebook.


> Shadow notifications?

Is that the name for the "recent tweet from X" notifications? Those have made Twitter notifications completely useless for me. :(


That's how I refer to all the notifications that I didn't expect and don't need.

As far as I know, there's no universal term for them. Since we already associate darkness with bad UX patterns, "shadow notifications" as a subset of "dark patterns" has a nice ring to it.


This sounds to me like an argument for browsers to stop honoring markup that makes text invisible. Are there legitimate use cases for interspersing invisible zero-width characters in the middle of a word? I don't know what the markup that Facebook uses actually looks like, so I'm not sure how closely it resembles legitimate design patterns.

We've long had the ability to set a minimum font size, but I suspect browser vendors have viewed that purely as an accessibility feature.

(Tangent: I've also been increasingly disappointed that browsers let sites make text unselectable. I understand that there was originally a use case, but I only ever encounter it as an anti-feature these days.)


Yes, there are many such use cases, and it would be impossible to predict what they are (what are you going to do, live word segmentation in the browser with any reliability across all languages in the world?). At the LDC, we do character-based annotation on words, which, since it includes highlighting individual characters, means each character has its own span (this is actually something I've seen done for wildly unrelated reasons on other sites, with spans, divs, etc). I'm sure there are millions of sites doing things like hiding/showing text that's next to other text for a million different reasons.


Can you elaborate on what "the LDC" is, and how broken your site would look if every piece of text you have styled to be invisible were instead shown by the browser?

Breaking up a word into multiple tags is very different from inserting invisible extraneous content into the middle of a word. The latter is what I'm questioning the usefulness of; the former I can see as occasionally useful, but should be avoided when it breaks screen readers and search engines.


Hey, sorry I missed this. LDC is the Linguistic Data Consortium (https://www.ldc.upenn.edu/). Our web annotation tool (internal app) would probably be broken in many places.

Take a modal for example - the text is not visible until the modal is opened. Let's think about menus that display on hover - they wouldn't work. What about tooltips or other sorts of floating label type things? What about other text that only appears on hover, or when some action is performed?

Sure, you could wait to insert these things into the page until the last second, but it would be less performant and complexity would be higher.

If you saw what the web looked like without display: none and visibility: hidden, I think you'd change your mind. There are infinite applications.


>This sounds to me like an argument for browsers to stop honoring markup that makes text invisible

Aside: this reminds me of the brief period when "hacking" SEO by pasting huge volumes of keywords in text the same color as the background, and/or hidden behind other page elements, was a relatively common trick. Search engines simply switched to more sophisticated methods of page ranking; I wonder what similarly non-disruptive change could remedy this while only affecting the bad actors.


So, I can understand those who argue for ad-supported websites, and against ad-blockers.

But can anyone honestly claim that putting an invisible 'sponsored' on non-sponsored posts is ethically OK? Can't we all agree that ads should be readily identifiable as such?


> Can't we all agree that ads should be readily identifiable as such?

By human users for sure. Scripts are a whole other question.


Are they though? Why would they be? Or taking it another way, what moral reasoning justifies making it purposefully difficult for scripts (read: user agents) to identify ads?

BTW. to the extent clearly marking ads/sponsored content is required by law, I wonder whether Facebook could be sued for making this difficult for users with disabilities - after all, accessibility tools like screen readers are scripts and rely on machine-readable metadata.


> Scripts are a whole other question.

I disagree entirely.


You mean identifiable to bots? Since they're already visually identifiable to humans, who don't see the invisible 'sponsored'.


This seems like a solvable problem. Ignore elements that have display: none, visibility: hidden, or opacity: 0. Do fuzzy text matching if you must. If you know that they'll never put just "Sponsored" on the real ads, ignore anytime you find that string.
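A minimal sketch of the "ignore hidden elements" step using only the standard library. It checks inline styles only (resolving external CSS would require a real style engine) and assumes well-formed markup with no void tags, so it's an illustration of the idea rather than something that would survive real Facebook HTML:

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect only text that is not inside an inline-hidden element."""

    HIDDEN = ("display:none", "visibility:hidden", "opacity:0")

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside a hidden subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style", "").replace(" ", "").lower()
        if any(h in style for h in self.HIDDEN):
            self.depth += 1
        elif self.depth:
            self.depth += 1  # children of hidden elements stay hidden

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth:
            self.chunks.append(data)

def visible_text(html: str) -> str:
    p = VisibleText()
    p.feed(html)
    return "".join(p.chunks)
```

With this, `visible_text('Sp<span style="display:none">S</span>onsored')` reassembles the word a human actually sees, after which the fuzzy matching described above can run on clean text.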


So here's a different idea. I wish EU would legislate that all 3rd party ads in electronic media need a machine-readable tag constructed in a way that is trivially readable (e.g. <div class='ad'> instead of requiring training a ML model on rendered page to recognize; metadata or a red pixel in the corner on video streams, etc.), and denotes the exact area taken by the ad, or otherwise facilitate its removal. Coupled with GDPR, this would end ad-tracking bullshit by fiat (at least for Europe).

(By 3rd party ads I mean all ads that are advertising anything not directly sold or provided by the owner of the service showing that ad. That's to resolve the immediate objection of whether telling people about new features is an ad or not.)

Man can dream.

EDIT: But this man would also vote for the person pushing for such legislation.


Isn't it just weird that we can't vote on that issue without also getting every other issue the person wants? Why must issues be lumped into person-bundles? It's just broken.

Personally I'm stuck with voting for "the party not dismantling the NHS who stand a chance of getting in".

Maybe in my kids' lifetime we'll get proper PR at the national level (UK).


It's what I find appealing about the concept of "fluid democracy": the ability to vote on a particular issue in isolation. Taking it to the extreme is probably a bad idea, but so is our current way.


As is commonly quoted, putting every issue to public vote would lead to bringing back the death penalty.

...also brexit ;-)


Brexit has really damaged my faith in democracy.


I'd be nervous that Google would consider this click fraud even if you aren't the publisher and would limit some account features.


They will detect your noise. Then you will end up as one of their "noise client" datapoints, associated with the actions and behaviors of the kind of person who would pay for privacy, and they will try to serve you ads according to your behavioral patterns as usual.


I think the goal would be to get a sufficient percentage of the entire internet to generate noise, without having to be aware of it.

In a world where Firefox and Safari were the major internet browsers, this wouldn't actually be that hard, since it could be built into those browsers directly and would already align pretty nicely with their respective parent companies' ethos.


"You cannot log onto Facebook with this web browser, please use Google Chrome or Microsoft Edge to continue" --FB


Circling back again: the more likely warpath for FB is to make the site work very poorly on non-cooperative browsers and excellently on cooperative ones.

Then users will blame the browser for being trash rather than FB.


Yes, that is one potential countermeasure. I wonder if it would be effective or if it would be the beginning of the end for fb.


>In 2015, it is expected that 70.1 percent of Facebook users worldwide will access the social network through their mobile phone.

What I don't know is whether that means the browser or the app. Not effective at all if most people use the app.


Is there precedent for this? Have others tried this and been classified as "noise clients", or are you just surmising ahead of time? You are probably right, but feeding them noise has been on my mind for a long time and I don't see another answer.


What about not using their services, and using a privacy extension to prevent other sites from using those services on your behalf?


Never had FB account. Never will. I saw this on day one. Not using FB is insufficient. Privacy tools are insufficient. Gov't involvement is insufficient. Make them "eat noise." I don't know if it will work, but we're running out of options quickly.

https://www.newsweek.com/facebook-tracking-you-even-if-you-d...


Surmise is a strong word. I'll settle for "deducing" it.


I have been using this strategy for years. I can't really tell if it works or not, but it makes logical sense.

One issue is that there's data about you that's not generated by you, but by people you know (i.e. John adds me as a contact to his phone along with my birthday and address; Jane friends me on VK and keeps tagging me in pictures that have location data; etc).

I suspect that right now, companies give higher importance to data that you've entered about yourself, but once they realize you're feeding them noise, they can adjust their algorithms to trust other people's data about you (if they're not already doing that), which would significantly decrease the effectiveness of feeding them noise about yourself.

It will still work for all data points about you that can't be collected from other people (like browsing habits, your phone's location, etc)


> they can adjust their algorithms to trust other people's data about you (if they're not already doing that), which would significantly decrease the effectiveness of feeding them noise about yourself.

Maybe some people can inject noise about their friends, instead.

For instance:

1. install the FB app on a phone with an address-book full of garbage and share it with the app,

2. inject fake geotag data in the photos you tag your friends in,

3. put the actual tag on strangers in the background,

4. tag your friends in stock photos, etc.

Some of these you obviously can't do with friends who don't consent, since they may get annoyed by all the fake tags, but others would be transparent to most social media users.


I've been thinking about this as well, after using a manual version of this strategy. Might still implement it one day.

That said, it's pretty hard. It's not entirely clear which data points Facebook uses, but it's very likely you can't influence all of them: friends adding you as a contact, for example, or linking your profile to your phone number. If you want to keep using the site as normal, your real usage will still reveal a lot about what you view, what you interact with, and when you use it, all of which might stand out from the random stuff. And if you don't want to keep using it as usual, the extension will have to visit the website by itself, without interfering with your regular browsing activity and without relying on your computer being on.

I also think that if something like this would get big enough to actually impact the quality of their algorithms, it'd be pretty easy for Facebook to detect abnormal activity.

(And of course, it's not even that much of a problem if the algorithm screws up more often, as long as advertisers believe it's accurate.)


Reminds me of the good old days of meetups organized by the UK email newsletter NTK (Need To Know) where people would swap supermarket loyalty cards - ideally as far apart in demographic terms as possible, such as a single guy and a mother of two - to fk with the data.

Edit for extra nostalgia: https://en.wikipedia.org/wiki/Need_to_Know_(newsletter)


>Can I propose that we start on a "noise" tool/browser plugin for these surveillance capitalism companies?

It's been done :)

http://ceur-ws.org/Vol-1873/IWPE17_paper_23.pdf

https://github.com/dhowe/AdNauseam/wiki/FAQ


I read through a lot of this. I think it is still worth another try -- they approach it more from an obfuscation point of view -- I want to take it to the next level and "drown them" in noise. Don't just say "It's been done" and then decide it will never work. It may take several failures to find the best methods -- much like search was a major failure until the right recipe was found.


If you don't like Facebook/Google/etc., wouldn't you be better off just not using them rather than spending time trying to hurt them? How will data vandalism make your life better?


You can't meaningfully opt out of Facebook/Google/etc. They're embedded in tons of websites, and your friends will almost certainly give your information to those companies.


You can block trackers pretty easily, or just not visit those websites. You'll probably be a happier person if you just don't visit foo.com than spending any portion of your brief time on Earth trying to trick an ML model into thinking you're a single mother from Lithuania.

Also, if your friend gives someone else information about you, do you expect that you should be able to control it? I'm pretty sure many of the details of my life appear in other people's emails, voicemails, journals, etc. that will never have a "forget me" button.

I'm mainly suggesting that there are lots of ways to improve this situation, from market forces to legislation, and it's sad to see people wasting time on things that won't work.


> You can block trackers pretty easily, or just not visit those websites.

And since they're buying data from non-internet data brokers such as credit card companies, retailers that track you in their stores, etc., blocking the trackers and avoiding the websites doesn't actually stop the spying.

To stop them from spying on you pretty much requires that you stop engaging in large swaths of real-world society.


You can choose not to use Facebook and Google services, use ad/tracker-blocking browser plugins, and null-route their domains in your /etc/hosts file. That would deny about 98% of their activity as far as you are concerned.
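For the /etc/hosts approach, entries resolving tracker domains to an unroutable address look like this. The handful below is illustrative only; community-maintained blocklists run to thousands of lines:

```
# /etc/hosts additions: resolve tracker domains to nowhere
0.0.0.0 facebook.com
0.0.0.0 connect.facebook.net
0.0.0.0 google-analytics.com
0.0.0.0 doubleclick.net
```

Using 0.0.0.0 rather than 127.0.0.1 avoids pointless connection attempts against any web server you happen to run locally.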

You can't really control what your friends are telling them about you though.

Personally, I've just about opted out of the entire internet except for work, email, and a few web sites that I read regularly. The flotsam is just too thick to be worth the bother.


And they fingerprint and track you regardless of you having an account!


> data vandalism

Words are powerful. Normalizing terms like "data vandalism" is the road to being legally required to use FAANG in such a way that maximizes their profitability. Probably not a great idea.


> If you don't like Facebook/Google/etc., wouldn't you be better off just not using them rather than spending time trying to hurt them?

I don't use Facebook/Google/etc., but they collect data about me without my consent anyway, and there is no way to delete that data or to make them stop.

That's what makes me so furious with them and all other ad companies. Since they're never going to change, I wish that they'd go out of business pretty much every day.


What if instead of don't like them you identify them as an active enemy?


Might be fun. This would be similar to what the recording industry did on P2P networks. At first it was easy to get what you were looking for. Then it got saturated with garbage.


I recall the P2P sabotage efforts being pretty feeble. I particularly recall Madonna's record company sharing what was labeled as her most recent album on KaZaa, which was actually just a bunch of MP3s of her saying "What the fuck do you think you're doing!?". Funny stuff.

Edit: Found a copy of the audio: https://www.youtube.com/watch?v=XU5xC00m-gA


I remember that! From around 2001-2003, a good chunk of the Top 40 tunes found on file sharing networks would be one verse, looped over and over so the MP3 duration was equal to that of the actual song.

This worked best on mainstream P2P clients like Limewire, Kazaa, Ares (all based on Gnutella I think), because the fake MP3s would propagate more widely.

It didn't work as well (or at all) on relatively obscure, community-based networks like Slsk, where people were more likely to share entire albums, not just the singles.


Did it? I didn't notice it.


I barely noticed it back in the day too. I had a friend who was into piracy quite a bit more who complained about getting the wrong movie or getting the wrong mp3s.

I don't know if I ran into that issue, but I wasn't really downloading a lot of mainstream stuff compared to him either.


Think Kazaa(?) days


You know, anybody can submit an "Ask HN" post rather than hijacking the comments of another.


I think the largest issue would be hiding the plugin from their eyes, like youtube-dl sort of being a secret. A noise-creating plugin can't be advertised as such, because then the noisy history is easily de-noised.

The tool would need to be something like a color picker with a hidden command to run headless searches in a tab of your browser.

This (sort of) worked for teenagers hiding pictures in fake calculator apps on smartphones years ago, but your mom won't try to pry into your Facebook obfuscation.

Just a thought


Noiszy does that https://noiszy.com


I've used this for quite some time and can verify it works on some level. Generally, after running it, my Google Now suggested cards on Android are all out of left field and totally irrelevant to anything I'm actually interested in.


Amazing, but I really don't trust them to browse the web for me!


Can anyone vet this? How do I know they are not evil?


This might work. I vaguely remember Google actually panic-pulling a Chrome extension that invisibly hid and clicked every ad on every page visited.

So I think they see this as a weakness that's tough to protect against, too.


Something like this? https://cs.nyu.edu/trackmenot/


Is this similar to what PrivacyBadger does, or would they complement each other?


Not similar at all.

TrackMeNot runs as a low-priority background process that periodically issues randomized search-queries to popular search engines, e.g., AOL, Yahoo!, Google, and Bing. It hides users' actual search trails in a cloud of 'ghost' queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles.

https://cs.nyu.edu/trackmenot/

Privacy Badger is a browser add-on that stops advertisers and other third-party trackers from secretly tracking where you go and what pages you look at on the web. If an advertiser seems to be tracking you across multiple websites without your permission, Privacy Badger automatically blocks that advertiser from loading any more content in your browser. To the advertiser, it's like you suddenly disappeared.

https://www.eff.org/privacybadger
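The mechanism TrackMeNot describes can be sketched in a few lines. Everything below (the wordlist, the `example-search.test` URL, the function names) is illustrative and not taken from TrackMeNot itself; a real extension would actually issue these requests at random intervals rather than just constructing URLs:

```python
import random

# Toy sketch of the TrackMeNot idea: build a stream of decoy search
# queries from a seed wordlist. A real extension would issue these
# over time at low priority; here we only construct the URLs.
SEED_WORDS = ["weather", "recipes", "lyrics", "football", "gardening",
              "vacation", "laptop", "history", "coffee", "running"]

def decoy_query(rng):
    # Combine 1-3 random words into a plausible-looking query.
    return " ".join(rng.sample(SEED_WORDS, k=rng.randint(1, 3)))

def decoy_urls(n, seed=0):
    # Seeded RNG so the sketch is reproducible; a real tool would not seed.
    rng = random.Random(seed)
    base = "https://www.example-search.test/search?q="
    return [base + decoy_query(rng).replace(" ", "+") for _ in range(n)]

for url in decoy_urls(3):
    print(url)
```

The hard part, as the site notes, isn't generating queries; it's making the ghost trail statistically indistinguishable from a real one.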



We could choose to cease to have preferences, limiting what there is to target.


I made an app with a friend that generates random phone numbers and adds them to your address book. And my pleasure is letting all the social networks siphon my address book.
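A sketch of what such an app might generate. The number format and vCard fields here are my assumptions, not the parent's actual app; the point is just that chaff contacts are trivial to mass-produce:

```python
import random

def random_us_number(rng=random):
    # Hypothetical helper: a plausible-looking (but random) US number
    # to pollute an address book with.
    area = rng.randint(200, 999)
    exchange = rng.randint(200, 999)
    line = rng.randint(0, 9999)
    return f"+1-{area}-{exchange}-{line:04d}"

def fake_contact_vcard(name, number):
    # Emit a minimal vCard 3.0 entry that address-book apps can import.
    return (
        "BEGIN:VCARD\n"
        "VERSION:3.0\n"
        f"FN:{name}\n"
        f"TEL;TYPE=CELL:{number}\n"
        "END:VCARD\n"
    )

print(fake_contact_vcard("Chaff Contact 001", random_us_number()))
```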


I am totally down with this.


There is a great interview with him that was recorded a couple days ago where he talks about the status of the history tool: https://youtu.be/WGchhsKhG-A


Summary of Zuckerberg's statements:

* They're still working on Clear History, and think it is worth implementing (Zuckerberg is the one who introduces the topic–the interviewer hasn't asked him about it). It's important that users have a tool like this, even if it makes their experience "worse" (i.e. less personalized).

* Clear History is nontrivial to implement, because Facebook was not built with this functionality in mind. And they want a user interface that is more capable than just a single "delete everything" button, because their research has shown that users frequently want to clear only certain aspects of their data (e.g. data shared to a specific service or app via Facebook), so this adds complexity.

* (In the context of a Facebook subscription service, which they were discussing before Zuckerberg mentioned the Clear History tool)–having Clear History is a prereq for any sort of subscription service, but they want Clear History (and all privacy controls) accessible to all users, not as a privilege for certain users.


Any highlights or links to specific segments from that 1 hour and 43 minute video?


He starts talking about it at 1:04:08: https://www.youtube.com/watch?v=WGchhsKhG-A&t=1h4m8s


I like how the only comment that isn’t highly speculative gets immediately downvoted.


is there a money timestamp or is the whole interview about the history tool?


It's clearly not a priority. Such a tool was a nice rhetorical device to deflect some heat during the congressional hearings, but there's no way to monetize it or use it to "increase engagement" so it's languishing in some Facebook backwater. I think it was naive to think that it would turn out otherwise.

And to be clear, I'm not saying that Facebook has necessarily been malicious or lied. It's just that a product manager faced with the choice to prioritize this vs. some new engagement enhancing widget is going to pick the widget every time unless there's constant pressure from above.

Edit: This is why legislation is required. So privacy issues can't be ignored simply because they don't contribute to the bottom line.


The Recode article in December touched an important point about the name "Clear History" [0]

> There’s a reason that Clear History isn’t called “Delete History”: Using the feature will disassociate browsing data that Facebook collects from your specific account but it won’t be erased from Facebook’s servers completely, Baser said. Instead it’s just “de-identified,” which means it’s stored by Facebook but no longer tied to the user who created it.

[0] https://www.recode.net/2018/12/17/18140062/facebook-clear-hi...


Zuckerberg did promise the Winklevoss twins that he'd build their social network as well.


From the FB statement generator:

We promised a history tool and have not delivered it. We are deeply sorry. We founded Facebook to connect the world and facilitate relationships. We remain committed to doing things.


They really like the BP oil spill apology approach, but it's getting old. You can only say sorry so many times before people get sick of hearing it.


The vast majority of the user base that they care about doesn't even know there's anything to say sorry for. If you would use this tool, you're worth much less to them anyway.

addendum: If this bothers you, I would strongly recommend just jumping ship now. There's almost no incentive for them to change and no sign of one on the horizon. If you're not getting enough value from them right now to accept the data they keep on you, you'll be happier just giving up and moving on early.


But how are those of us who already don't use their services supposed to protect ourselves?


My point is to stop holding your breath that Zuck is going to be the one to do anything for you. Do you think that's any less true if you don't even use his service?


Maybe it's time for south park to do another one of these, but with Zuckerberg as the star:

https://www.youtube.com/watch?v=15HTd4Um1m4


True. But I expect the statement from BP would have sounded a bit more like it had been written by a human being :-)


You'd be absolutely right. Here are two statements for comparison . They're similar in context: both were published after a major crisis, during the pressure from the legislators, but before they actually talked to the lawmakers:

https://www.theguardian.com/business/2010/jun/17/bp-tony-hay...

https://eu.usatoday.com/story/tech/news/2018/03/21/read-face...

Even though the BP one is substantially more detailed, at least I read it all the way through. I skimmed several paragraphs of Zuckerberg's statement because they essentially say nothing.

Some part of it can be blamed on the medium itself, since Zuckerberg couldn't post links, subsections, nor visually-distinguishable lists in a fucking Facebook status.


We had no idea we weren't developing it.


It won’t happen again.


We'll do a better job (in the future™).


"Will you start development, then?"

"Well yes, but actually no"


“We wanted everyone to be connected so we gave the world herpes. Thank you. Sorry. Thank you.”


FB won't change unless major advertisers do. And even then, there are plenty of fly-by-night companies selling utter garbage (think tshirts/hoodies with dynamically generated text labels) that will continue to find customers on the platform because of FB's targeting ability.

I hate to say it, but it's probably too late anyway. FB has billions in the bank and are likely working on much bigger-picture projects with their surveillance tech. Why bother with a B2C product that brings tons of PR grief when you can enter billion-dollar contracts with militaries and governments where secrecy is the default MO?


In the all hands meetings they would talk so much about establishing or re-establishing trust, and I always wondered why they don’t just have a “clear everything from high school and college” option. People would _love_ that option, and it would make so many more people comfortable with using the platform again.

Their loss.


A few months ago I went through and deleted all of my Facebook content from 2005 through 2010, except for a few pictures. It took 5+ hours to delete each item individually. A tool to assist this would be very helpful! I’m sure I’m not the only one who wrote a lot of stupid posts and uploaded a lot of embarrassing (and some slightly incriminating) photos to Facebook in college that they don’t want saved forever.



That would have been helpful!


I did the same thing, and yet every couple of years some of those deleted items reappear on the news feed of my second account that I only use to double-check my privacy settings.

It's a European account, and I'm wondering if only marking the content as deleted, instead of actually deleting it, violates the GDPR.


Can you report that to the privacy regulator in your European country? The "mark as deleted" versus actual deletion is an important distinction. I can't do anything as a non-European...


I was planning on looking into how to best document it the next time it happens, and whom exactly to report it to (i.e. the national or European data protection authority). Not sure if there is anything else I could do.


People would love an option to unsubscribe from magazines/newspapers online too. There's a reason they make you call their support line to do it.


Pardon my ignorance here, I've never worked on this kind of program before, but isn't it actually hard to truly remove data from a complex database like Facebook's? Isn't it better to just mark it as unviewable?


It's not just complex databases, but any database that uses indexes (basically any production database).

Deleting a record from a table can cause a re-index, which is very intensive. It's much easier to flag a column in a record as "deleted" or whatever, and then run a cleanup during off-peak hours.

I'm sure there are clever ways around that with properly knowledgeable DBAs on your team, but as a web dev on smaller-audience projects, I don't touch the kinds of optimizations I'm sure Facebook has implemented.
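A minimal sketch of the flag-then-cleanup pattern described above, using sqlite3 purely for illustration (the table and column names are made up):

```python
import sqlite3

# "Soft delete": instead of removing the row (which touches every
# index on the table), flip a flag and filter it out of reads. A
# periodic cleanup job can physically remove flagged rows off-peak.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT, "
    "deleted INTEGER DEFAULT 0)"
)
conn.execute("CREATE INDEX idx_posts_deleted ON posts (deleted)")
conn.executemany("INSERT INTO posts (body) VALUES (?)",
                 [("hello",), ("embarrassing college post",)])

# "Delete" = mark, don't remove
conn.execute("UPDATE posts SET deleted = 1 WHERE id = 2")

# Normal reads exclude soft-deleted rows
visible = conn.execute("SELECT body FROM posts WHERE deleted = 0").fetchall()
print(visible)  # [('hello',)]

# Off-peak cleanup actually removes the rows
conn.execute("DELETE FROM posts WHERE deleted = 1")
print(conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0])  # 1
```

The catch, of course, is that between the flag and the cleanup the data still exists, which is exactly the "mark as deleted" concern raised elsewhere in this thread.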


> Deleting a record from a table can cause a re-index

As will adding new data, and FB loves adding to their profiling DB.

This is not an indexing issue but a "we love adding data but hate removing it" issue.


Long story short, no. If that data were a liability instead of an asset, it would be done yesterday.


Why? Google has had the option for many years; it is hard but definitely doable.


I'm sure there's some technical reason they could come up with, but to me, that's admitting they didn't design the database with that use-case in mind.

Disclaimer: Have no experience with databases at that scale so maybe this isn't entirely unreasonable.


The HN database must be very complex, then; you can't do either.


"Oh, we actually meant advertisers/governments/Bing and other partners will have a tool which will provide a clear view of your internet history"


It probably just takes a long time to develop! I mean, keep in mind, Facebook's app doesn't even support landscape mode. It's hard to find the development bandwidth for advanced features.


My biggest issue with metrics and a general "data" focus is that it's incredibly hard to generate data about goodwill. E.g., we can prove that call-to-cancel reduces cancellations by 20%, but we have no way to measure how many customers now hate your company and tell their friends never to use it, how many people would have returned but now won't, or how many just cancel 3 months later.

In Facebook's case, we know they're seeing huge hits in active, engaged users, but their whole metrics ecosystem is built to optimize micro-engagement. Taken as a whole, those micro-optimizations are killing goodwill and turning off users, but it's much harder to measure that.

So many decisions are made because variant b "won", without any discussion of the layers upon layers of factors that aren't easy to measure.


There's also the issue of 100s, if not 1000s of random businesses having added my contact information for advertising purposes, and I can't remove them all without manually clicking each and every one. Anyone should be able to remove these advertisers who have no business having my info (random car dealerships and real estate agents 1000s of miles away, typically). Questions asked in the help forum have gone unanswered for months.


What exactly are engineers and designers at Facebook doing? There are thousands of them! I've seen teams of 10 ship more (granted, no one is delivering on Facebook's scale)...

I know most of them are likely working on backend stability and performance... but come on.


It's in Development™


> It's in Development™

Probably somewhere in the backlog :-)


Move fast and break things. It's mostly in the "break things" phase, though.


I probably wouldn't be among the first in line for any 'tool' Facebook offers to completely own your data. I feel as if something underpinning the tool would make it useless in itself.


Yeah, like the "delete your account" option. It's hard to find (disabling is easy to find, but also completely reversible, so useless), and I'm sure it doesn't actually delete anything, but it made me feel better than just leaving my Facebook account active/disabled. I didn't have much of value there anyway.

I would love more transparency here, but we're not getting that without government intervention, and that's unlikely to come anytime soon (and what gets passed may be worse than the current situation).


A few days ago my fiancée asked me to delete her Instagram profile. There's no user friendly way to do it. You have to go to help.instagram.com or search Google to find the link.


> I would love more transparency here, but we're not getting that without government intervention, and that's unlikely to come anytime soon (and what gets passed may be worse than the current situation).

The GDPR (General Data Protection Regulation -- an EU regulation that came into force last year) provides for this and has potentially very large fines to back it up. Obviously it only applies to EU residents (or businesses), but I've heard that California is getting its own version of the GDPR -- though I haven't looked into it. Facebook implemented the "delete your account" option that actually works in response to this, as well as asking for explicit, opt-in consent to whatever tracking they use.


The only way to clear history with Facebook is to create no history to begin with.

If you want to "clear" history to delude yourself that relevant Facebook information is gone (or at least make it difficult to obtain from your own account), there are plenty of options for that.


Well, I am in Europe, so we have the GDPR, and I deleted my account in January, then waited the (non-optional) month without logging in to confirm I wanted to be gone. I will now request the information they still might have. That will be fun.


I can't wait until CCPA takes effect and there's _some_ protections for us in the states.


This gets me wondering. What kind of incident would be required to provoke public outrage over Facebook? Something that would bring public awareness re their data collection policies? Or is Facebook Just Too Big To Fail™ anymore?


Massive public attention has already been brought to Facebook and its policies. If the public isn't as outraged as you expect them to be, that's because they don't think Facebook's data collection is a big deal, not because they're somehow unaware.


> Massive public attention has already been brought to Facebook and its policies.

Not so much, really. Outside of techie types, my experience is that very few people are aware of this stuff.


> If the public isn't as outraged as you expect them to be, that's because they don't think Facebook's data collection is a big deal, not because they're somehow unaware.

No, the public who's aware isn't outraged because they don't think there's anything they can do to stop this. Resignation is very different from thinking it's "[not] a big deal."


Private dicks posted all over the place? (See Snowden’s interview by John Oliver)


Anyone with a background at Facebook deserves to be questioned about their integrity. Obviously only a small subset of FB employees make the shitty decisions we see in the news, but it's very concerning to me that engineers still go to work there. There are at least 2 or 3 other BigCos that pay comparably, and many others a level down that don't require you to be complicit in unethical behavior of this scale.


> it's very concerning to me that engineers still go to work there

Me too. And, although it may be unfair, I have to admit that I view engineers who are still willing to work for them as suspect.


When Facebook releases a tool to clear your history, how will you know it’s really cleared? It will probably work as well as these instructions you can already use to remove your digital fingerprint from Facebook and several other sites: https://youtube.com/watch?v=oHg5SJYRHA0


As someone else said, if you're in Europe you can send a subject access request after sending the deletion request.

That will tell you what data they still have about you.

You can't _know_ they've deleted everything if the subject access request comes back empty, but it puts them in the position of having to actively break the law to lie about whether they've deleted your data.


> it puts them in the position of having to actively break the law to lie about whether they've deleted your data.

True, but I don't think Facebook is too bothered by that sort of thing.


I should also point out that most of the linked video quotes a statement by Facebook about what they promise to do with your data.


I made one: github.com/spieglt/fb-delete


Question for the HN crowd: would having event data and PII-ish data decoupled be an acceptable privacy compromise, given that you can wipe the PII and the only data left is completely anonymous (just a UUID that leads to nothing after deletion)?

Is anonymity in the eye of the beholder?
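A toy sketch of the decoupling I'm asking about (all names here are hypothetical): events carry only an opaque UUID, the UUID-to-identity mapping lives separately, and "forgetting" a user deletes just the mapping.

```python
import uuid

# Two separate stores: one mapping UUID -> PII, one holding events
# keyed only by the opaque UUID.
identities = {}   # uuid -> PII
events = []       # (uuid, action) pairs

def register(name, email):
    uid = str(uuid.uuid4())
    identities[uid] = {"name": name, "email": email}
    return uid

def record_event(uid, action):
    events.append((uid, action))

def forget(uid):
    # Wipe the PII; event rows keep only the now-dangling UUID.
    identities.pop(uid, None)

uid = register("Alice", "alice@example.test")
record_event(uid, "clicked_ad")
forget(uid)
print(uid in identities)  # False
print(len(events))        # 1, but no longer linkable to Alice
```

Whether the leftover events are truly anonymous is exactly the eye-of-the-beholder question; rich event data is often re-identifiable even without the mapping.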


> Would having event data & PII-ish data decoupled be an acceptable compromise to privacy

That depends on how you define "event" and "PII" data. The normal industry and legal definitions of PII leave out huge swaths of actual PII, after all.

Personally, I'm not even at the point where I can consider what is or is not acceptable for companies like Facebook to keep. I'm still stuck at the point of stopping them from collecting data about me without my permission in the first place.


I think "big data" companies realized that people simply were unaware of the extent of the tracking and showing them how much they're being tracked would make for terrible PR, potentially much worse than any publicized data leak. It turns the tracking from some abstract concept into a very tangible list of actions being recorded potentially forever.

Take that guy who managed to get a dump from his Spotify data thanks to the GDPR for instance: https://twitter.com/steipete/status/1025024813889478656/

I'm sure we all knew Spotify tracked us but seeing how much they track is pretty crazy to me even though I'm pretty sensitized to these privacy issues.

I expect that a full dump from Facebook would be at least as verbose if not more. I don't think they want the users to see how the sausage is made.


The problem is that privacy tools like this will severely hurt Facebook's business.

Facebook uses its pixels to track purchase and browsing history. This history is a crucial part of their lookalike-audience technology, which in turn is one of the most critical components of their advertiser tool suite. Without this data, advertisers will be less effective at targeting their ads, which means lost ad revenue.

It's not surprising to me that Facebook is backtracking on this and vying for other privacy improvements that don't hurt their fundamental business as much (like letting users view how advertisers got their contact info, or letting users view page ads).


Building shadow profiles for people who don’t use your service is inexcusable.

Perhaps their “business” would need to be rescaled if they were honest about serving ads to their users.


You have to do it manually. I did it using a plug-in, meticulously going one section at a time: likes, comments, posts, pictures, etc.


What the king says is what the king says, but what [facebook] stands for is undying opposition to [clear history tool].


The Memories feature makes for a nice deletion tool, but I know the posts are just being hidden.


Well, you see, people didn't leave Facebook in the numbers anticipated, so the project got deprioritized.


When beta testing, they used that tool to clear Zuck's history. Thus Zuck forgot about it.


One really shouldn't believe anything that man says.


Facebook's behavior is highly analogous to, and best viewed in terms of, narcissistic personality disorder. Thus when people bring up their legitimate concerns about how FB screwed them in one way or another, Facebook thinks the problem is the "beating" they're taking in the press.

I really don't want to read a single word about the "beating" they're taking. It helps Facebook by exaggerating the harm done to them, and it pats the press on the back for its supposed effectiveness. Facebook keeps growing and doing the same shit, that's how effective it is. "Maybe if we just had more facts!" says the well-meaning fool who then loses to a Zuckerberg or a Trump. Dream on, the world doesn't run on facts and virtue. By way of contrast, if Facebook executives had to suffer an actual beating, Rodney King style, things would be fixed in a hurry. I confess I would love to be holding one of the night sticks too.


Your larger point gets caught up in the hyperbole, but the point that 'the world doesn't run on facts' (meaning, I think, the political world) is a good one.

FB get the appearance of pain and attrition. Press get more views and plaudits. Nothing changes.

They need actual pain to their organization (not violence to individuals of course) before a proper change will come. Think Microsoft pain in the 90’s. It seems Europe will again lead the charge on this.


Yes, the political world, the one in which Mark Zuckerberg is a player and in which almost none of us commenting here are players, unless maybe we were to unite into one bloc. That world (and to the extent that "that world" runs "the world," "the world" too) has little use for facts. I'm not saying it's great or that we should give up. (Or that they can escape the facts permanently.) But it is a fact. And each downvote of that fact, proves the world doesn't run on facts!


A Clear History Tool was tested and it cleared itself from the history. So...



