Privacy Update: Third-party Tracking (backblaze.com)
175 points by busymom0 39 days ago | 25 comments

HN thread about the original issue: https://news.ycombinator.com/item?id=26536019


Backblaze Privacy Update: Third-Party Tracking - https://news.ycombinator.com/item?id=26550506 - March 2021 (69 comments)

I think this is a great example of how to handle such a mistake, and it makes me trust Backblaze more (yes, I am aware of https://en.wikipedia.org/wiki/Service_recovery_paradox - I'm more likely to trust a company that admits mistakes, because it makes it less likely that they are hiding other, potentially worse mistakes and I'm more likely to be transparently informed when something that affects me goes wrong).

I was a bit surprised that I didn't see something like "we're working with Facebook to delete any copies of the data Facebook may be storing".

> I was a bit surprised that I didn't see something like "we're working with Facebook to delete any copies of the data Facebook may be storing".

The email they sent to affected users says this:

> Facebook is obligated to only process information based on our instructions and we have instructed them to not further process this data and to delete it.

I still think it's sketchy that they used the Google code at all, and I think they should have immediately deleted that from all their pages.

It’s sketchy that they used Google Tag Manager? It’s a useful and extremely common tool on marketing sites. Putting it on user pages was the mistake.

It's sketchy they do not have separate repositories for user pages vs marketing pages. With the obvious sensitivity of user data, mixing very different applications seems unwise.

"Additionally, we’re continuing to evaluate steps to help ensure that such an issue does not occur again" most likely covers stuff like that.

I agree with you as a tech, but pushing back on Google Tag Manager outside of the tech team can be a painful exercise. Everyone in sales and marketing wants it, everyone in management wants it, heck I've seen a company where the legal team argued GTM was legally required on every page (so they could use it to manage a popup).

The googletagmanager or similar equivalent is pretty much required if you want to run advertisements for your service anywhere. Without that, you won't be able to figure out how many ad conversions you got - and thus the cost per install for your ad campaign.

Main thing to remember is to only have it on the landing page for the ad campaign clicks and definitely not on the signed in pages.
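To make the "landing pages only, never signed-in pages" rule concrete, here is a minimal sketch of a server-side check that a template could consult before emitting the tag manager snippet. The function name, the `signed_in` flag, and the path prefixes are all hypothetical; this is not how Backblaze or any particular site does it.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of marketing paths; real prefixes depend on the site.
MARKETING_PREFIXES = ("/landing/", "/pricing", "/blog/")


def should_load_tag_manager(url: str, signed_in: bool) -> bool:
    """Return True only for anonymous visits to allowlisted marketing pages."""
    # Never load third-party tag containers for authenticated sessions.
    if signed_in:
        return False
    path = urlsplit(url).path
    # Default-deny: anything not explicitly allowlisted gets no container.
    return any(path.startswith(prefix) for prefix in MARKETING_PREFIXES)
```

The allowlist is the point of the sketch: a page that nobody thought about gets no third-party script, rather than getting one by default.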

Well, not exactly, GTM makes it easy to slather your site (as much or little of it as you want) with 3rd party js / 1px image tracking pixels. Mostly I see it as a low-code tool to allow the dev team to wash their hands of having to add those manually, but it is certainly possible to hardcode the tags into your templates instead (but then it's a much bigger headache to add and remove them willy-nilly). For tracking your conversions etc. you will often see Google Analytics on a site (and it, in turn, may be installed on the site via GTM... turtles all the way).

GTM makes it easier if you are using multiple advertisement platforms. If you are using twitter, fb and google ads, then GTM makes it easier.

Google Analytics is different from GTM. And as far as I know, Google Analytics only works with Google, not FB, Twitter etc (which GTM does).

It's a bit embarrassing to have this happen, but I can't throw stones about configuration causing issues like this.

Addressing it publicly like this (identifying exactly what was shared, who was impacted etc) builds a lot of trust. No denial or fobbing off, just clear communication.

> I can't throw stones about configuration causing issues like this.

True, such misconfigurations are bound to happen.

That's why it's so worrying that they thought it was OK to have a third-party script injector on the signed-in pages to begin with.

I see now that they've removed Google Tag Manager from those pages. However, they're still fairly opaque about how they will prevent such a glaring security and privacy issue from happening again.

Not a customer, however I was seriously considering switching to B2 for my backups before this happened. This post does very little to make me want to pursue that further.

It's refreshing to see BB actually continue the investigation and share data with the public regarding the incident.

I hope to see a blog post on HN sharing what they've done and concluded from their review of 3rd party JS. Hopefully the conclusion is that it's best to just purge FB from the site entirely.

I was one of the critics and I definitely trust backblaze more after the transparency, communication, and seriousness shown here.

Will check out B2 for our backups sometime :)

I suggest blocking all facebook domains via DNS/hosts etc. Even if you don't use FB they collect data about you from crap like this.

I’ve done this for years and can’t remember a time a website hasn’t worked (other than FB of course, no loss there).
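For anyone who wants to try the hosts-file approach, here is a rough sketch that turns a domain list into hosts-file lines. The three domains are only illustrative and nowhere near a complete list, and the `hosts_entries` helper is made up for this example. Hosts files do not support wildcards, so every subdomain has to be listed separately, which is why DNS-level blocking tends to be less work.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Build hosts-file lines that point each domain at a non-routable address."""
    # Normalize, drop blanks and duplicates, and sort for a stable output.
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    return "\n".join(f"{sink} {domain}" for domain in cleaned)


# Illustrative sample only; a real blocklist is much longer.
SAMPLE_DOMAINS = ["facebook.com", "www.facebook.com", "connect.facebook.net"]

if __name__ == "__main__":
    print(hosts_entries(SAMPLE_DOMAINS))
```

The printed lines can be appended to the system hosts file (for example `/etc/hosts` on Linux and macOS), which usually requires administrator rights.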

Unexpected but very appreciated.

Also appreciated that GTM is being removed from private pages altogether.

I’m not happy with how Backblaze handled informing affected users at all.

They were aware of it on March 21 and said that they “were preparing a communication with affected users at that time”.

I received an email about that only yesterday.

That’s 52 days it took them to write me an email which imo just isn’t good enough and afaik does not conform with the GDPR requirements for data breaches.

It sounds like the data leaked was not PII, which is outside the scope of GDPR personal data breach guidelines.

Filenames were leaked, which could very well contain PII.

Something as simple as an IP address (which you're implicitly leaking by just loading any resource from the third-party domain) is considered PII under the GDPR.

If you use Google Tag Manager on a page but don't add any Google tracking codes to it (e.g. Google Analytics), does the page still send data to Google?

Any OSS self-hosted Google Tag Manager alternative to recommend?

If a GTM container loads on a page but is not configured to fire any tags on that page, then the only request that will be made to Google servers is the one that loads the (empty) Container. While this request might end up in a logfile, the "interesting" data is limited. In particular, GTM is served from a different domain than any other Google properties so even third-party cookies couldn't be used to tie the request to your identity.

Not sure about OSS alternatives. I know of a few competitors (Adobe Launch, Ensighten, Tealium), but they're all commercial. I don't expect there's a large market: the key value proposition of tag management platforms is cutting down coordination overhead between the marketing and IT teams, and self-hosting is the opposite of that. However, the trend towards first-party tracking may open a market for it.

> even third-party cookies couldn't be used to tie the request to your identity

Google has enough visibility on the internet to be able to correlate mere IP addresses with high accuracy. IP addresses are also still considered PII under the GDPR.
