Facebook adding “fbclid” parameter to outbound links (thisinterestsme.com)
135 points by hiby007 on Oct 22, 2018 | hide | past | favorite | 89 comments

This is breaking some links. You might believe it "shouldn't", and a server "should" ignore the added params, but the reality is it's breaking them. This past weekend, I posted a link to an image on Facebook, and FB generated the preview fine, but created a link that 404's.

My link: https://pbs.twimg.com/media/Byv5uWSIIAEf38C.jpg

Facebook made: https://pbs.twimg.com/media/Byv5uWSIIAEf38C.jpg?fbclid=IwAR2...

I guess if FB really wanted, they could make a second fetch to ensure that their added params don't break the third-party server. Or they could add a whitelist of domains that use their first-party tracking?

I really don't like the end result right now: it looks like "the web works" from inside FB, but not when you try to follow a link out of it. I don't believe at all that that is FB's intent here, but it's just one more time that some silo breaks another part of the ecosystem, and to an untrained eye it looks like the third party is the culprit.

I had a quick look at the protocol but couldn't see anything on handling that case.

Regardless of the protocol, it's not unreasonable to return a 422 or 403 by default for that (malformed) request when first seen, as it would indicate that something sketchy may be going on; the parameter can be whitelisted later.

I believe it was breaking all NYMag links today; it appears fixed on NYMag's end now.


Sure, where "fixed on NYMag's end" means "NYMag has implemented a workaround because FB broke things".

It also broke a couple of safe paste sites for me; that's /r/assholedesign.

There are a number of commenters here who seem genuinely happy about this. This is a perspective that is hard for me to understand, largely because I'm strongly in the pro-privacy, anti-tracking camp.

So if you are part of the group who sees this as a good thing, I'm genuinely interested to understand why you see this as a good thing and whether you view the mass surveillance of the general public by advertising companies as bad?

Targeted ads have helped small businesses and indie brands thrive.

Previously, only large brands and national/multinational corporations could afford to advertise at scale and reach customers through TV/radio/newspapers, and with a high minimum spend at that.

Now your local mom-and-pop bakery can spend as little as $100 a month to reach their customers and help drive their business.

The world is not black and white, and neither is the morality of advertising.

I hope this perspective was useful to you.

Even ignoring the morality, this is all ultimately driving the adoption of ad blocking. Advertising online is an industry digging its own grave, which would be fine except they’re going to bury a lot of those small and indie sites with them.

I don't think ads inherently promote ad blocking, but you're right: this specific instance of an ad mechanism breaking links and lowering the quality of other sites does indeed promote a culture of ad blocking (or at least of stripping the fbclid param).

I don't see any indication that advertising online is an industry digging its own grave. On the contrary, the increase in quality of ads over the past 10+ years (especially in terms of unobtrusiveness and relevance) would suggest to me that online advertising is digging its way _out_ of the grave it dug itself with flashy, irrelevant ads throughout the 90s and early 00s.

While this is completely true, it's worth noting that the vast majority of the $100 ad budgets marketed as the channel that enables small businesses to "thrive" go to Facebook or Google ads, and those companies are professionals at confusing small-business owners with vanity metrics, payment/pricing models, constraints on packages, etc. (And that's just the buyers' side; these companies have massive leverage on the publishers' end as well.)

Until the $100 runs out and the ad service starts marketing the competition.

It was, and thank you for sharing it.

Engineer here: I used to be completely anti-tracking.

Then, I started needing analytics for my own business. Without analytics, I wouldn't be able to sell with efficiency, and therefore, I wouldn't have a business. Granted, the anti-consumerist in me thinks maybe as a society we shouldn't be so concerned with our efficiency to sell. But, we live in a capitalist world, and I don't see that changing any time soon.

The way I see it now, I'm less concerned about tracking than I am about how big some businesses are -- especially in this space.

Every startup I know uses analytics, and no one is doing anything spooky. But I'm sure there's plenty of spooky stuff going on at the FAANGAMUs.

"Sell with efficiency" sounds a bit vampiric to me; a bit growth-at-all-costs or refusing to accept the normal costs of doing business (although I can sympathise with that mindset).

Where do you draw the line? This is parallel to the discussion around government surveillance. Just because they/you can, doesn't mean they/you should.

If Internet tracking had no potential use to governments, they'd be regulating the shit out of it. The problem is that governments want their own noses in the same trough, and so all these privacy-invasive technologies continue to be developed. The fact that it's not illegal means that anyone with the ability to implement it can, as long as they can sleep at night.

As for solutions that could help with "selling efficiency", maybe some kind of agreed tiers of analytics, from benign to spooky, that users can opt in to or out of when visiting a website or using an app; GDPR is a bit of a kludgy stab at this. The problem is that it only takes one bad advertiser breaking the agreed rules for the trust to be gone again for all advertisers.

One bad apple.

Analytics are unquestionably useful. Collecting the data without user consent is what potentially should be regulated.

I'd be completely ok with intent based advertising - I search for $X you show ads that are related to $X. And I'm ok with measuring what percentage of those who click the ad that converts as well. However, that's not where we are today. The appetite for marketing and advertisement has grown to such a level that companies like Facebook want to know every aspect of your life so they can satisfy that appetite. Clearly a) there's a demand from companies big and small and b) it's working. Companies can still be ethical about it - like why should Facebook track every URL I visit (via the Like button)? Or, why not provide an option to opt-out of targeted ads - I don't like them and I do have friends who like them. I get it but provide an option to opt out.

The general retort I hear is "what do you have to hide?", as if the only things people want to hide are the bad and evil stuff.

And those companies aren't just selling that gathered data to online marketers; they're pairing it with offline data and selling it to anyone with enough money.

How extensive were the analytics you needed? Do you believe it's possible to make enough money with a respectful, privacy-friendly approach?


Facebook, Amazon, Apple, Netflix, Google, AirBnb, Microsoft, Uber


Edit: M for Microsoft, derp

The FAANGAMU acronym doesn't have a lot of hits. Why did you add Microsoft, Uber, and Airbnb?

Microsoft owns Bing, one of the most important search engines and advertising platforms on the Internet...

Uber had that secret API access granted directly from Apple so it could see which apps you had open at all times (in case you opened Lyft) so it could charge you different rates.

AirBNB is similarly large, so I thought they were worth mentioning.

TL;DR version: I like tracking now because it now makes money for me.

As in "I was against child labour, but then I inherited a textile factory in China and paying adult wages is not efficient".

Or, "I was against child labour, until I lived in a country where it was culturally acceptable and I couldn't buy food / compete without it."

The point I was trying to make is: before I worked in the growth team at several mid-sized startups, I had this naive assumption that tracking data was basically the food for an evil monster.

I had this idea of an evil group of people getting together everyday and looking at this data and somehow using it to puppet my entire online-life.

Sure, this group of people exists at every decent sized online company, and sure they're trying to get you to spend more time and money on their site/app/whatever, and sure this tracking data helps them.

Sure, SOME of these websites are peddling fake news or selling scams or preying on the poor/unfortunate/uneducated/etc. But I think that's the exception, not the norm.

Most successful companies make a product people genuinely like. There are millions of people that would buy and enjoy this product if they knew about it. Most companies are just trying to use this tracking data to get their product in front of as many of those people as they can, and in front of as few of the people who don't want it. They're trying to fine-tune their messaging to make sure it appeals to the people that actually like their product. They're trying to use it to figure out how to BETTER make a product people actually want!

Again, if you're saying that increasing our efficiency in sales is a bad thing, you're saying that capitalism is bad. But I've just come to see this data as something that enables product evolution to occur much faster. I see this data as something that's helping the world, mostly, get more of what it wants.

Like everyone says, Capitalism is the worst economic system, except all the others we've tried.

> Again, if you're saying that increasing our efficiency in sales is a bad thing, you're saying that capitalism is bad. But I've just come to see this data as something that enables product evolution to occur much faster. I see this data as something that's helping the world, mostly, get more of what it wants.

Unfortunately, I've seen too many product decisions catering to the manipulative aspects of adtech. UX often suffers, rather than improves, with ads. Online platforms all seem to follow the same ad-monetization game plan these days, which results in messes like Frankensteinish apps; see the official Twitter app.

As for actual hands-on manufactured products or services, I'd like to know how ads improved the UX.

> Without analytics, I wouldn't be able to sell with efficiency, and therefore, I wouldn't have a business.

I guess your summary is about right.

> we live in a capitalist world, and I don't see that changing any time soon.

We were living in a capitalist world two decades ago, and we didn't have a significant amount of tracking back then. If you are concerned about your competition using tracking, then just try to make a better product.

If you are concerned with making a better product, you are going to need quite a bit of tracking and instrumentation to understand how to make a better product.

Using analytics to understand what users are thinking is like using tea leaves to divine the future. Just ask them.

Users will give you the preferences that they think are important, or that are important at the time of that interview.

Analytics can give you a decent glimpse of revealed preferences, which may or may not be what you're after.

Whether or not this is a good thing depends on a lot of subjectivity, sure. But suppose you run a porn site: if you asked most users what they wanted in porn (before they had seen any), they would probably say one thing. If you examine what kinds of videos people look at, you'll see another. (This theme, with actual data from Pornhub, is explored at length in the book "Everybody Lies" by Seth Stephens-Davidowitz.)

Both routes (asking and instrumenting) have their uses.

Glad you brought up porn because I believe that if you optimized any site's features just based on engagement analytics, you'd end up with a porn site with elements of gambling!

I'm kidding of course, but the idea is that analytics tell you part of the user story but don't answer the deeper "why" questions. Analytics certainly have a place in tech, but a smaller one than they're currently afforded.

Bender was on to something. "X but with blackjack and hookers" is the next untapped market for disruptive startups. We've worn out "X, but on the internet" and "X, but blockchain".

I mentioned before that "Thinking about it, I imagine that one instance of Google Analytics would be fine, but tying two instances of it in order to track a user would probably be ridiculous, right?".

My guess is that it's the Facebook employees, advertisers, and marketing people who like this.

There's a flawed belief that it's necessary. UX on the other hand suggests otherwise. AdTech is not concerned with UX though and tries to wrap targeting in some kind of pseudo user benefit—spin.

Good products and services sell even without tracking. Advertising is an economic powerhouse though and will always push for anti-UX trends because it fundamentally runs polar opposite to the user experience.

Hell, good products and services sell without advertising!

Advertisers study how to sell a product, and the most important product they have to sell is advertising.

I understand your stance from an end user's perspective completely.

However, it's not hard to reason why people whose livelihoods depend on being able to track users and increase the value of their ad inventory would be happy about this.

People are unusually good at separating their personal interests from their consumer interests. I've observed this firsthand in many entrepreneurs, whether in brick-and-mortar retail, conventional energy, or obsolete auto parts: it's common for people to be happy about events that benefit their livelihood even when they have a negative impact on humanity or on that ecosystem.

> people whose livelihoods depend on being able to track users

Those people can go hungry or find another line of work. I have zero compassion for that behavior. Justify it how you want, but most people abhor it.

FB has already publicly announced some of these changes:



Basically, FB is expanding its tracking, moving to first-party tracking alongside its third-party cookie tracking. I suspect the click-id query string is part of that rollout. This helps it get around things like Apple's new ITP (Intelligent Tracking Prevention) 2.0 in Safari.

This is actually fantastic news for advertisers that have their own data warehouses and need to create a better 1-to-1 click tracking to internal user data. This allows much better attribution and testing of incrementality so businesses can tell where their value is truly coming from.

I’m pretty excited to see this roll out more broadly.

And another reason for me to avoid FB links. I'm sure blockers will begin stripping that out.

FB just doesn't understand the optics they create.

>FB just doesn't understand the optics they create.

Sure they do; they just also know that the vast (VAST) majority of people don't understand the implications and/or don't care.

They do, and we do. But the mass sheep audience does not. All they know is: "Does the link work?" and "Can I share it?" That's all.

True enough, even though there is nothing but negative sentiment these days toward the Facebook brand.

I suppose there is someone even thinking that Portal will be good for their home.

I actually ordered a Portal device for each of my family's households (parents, grandparents, siblings, and in-laws) for Christmas.

I think it’s a great product and can’t wait to have mine at home.

All the fuss about tracking is nonsense. Ads are a great way to monetize products that you want to make available to a large audience. And obviously, as a user, you want meaningful ads and not just a bunch of garbage. To do that, tracking is necessary... seems like a straightforward value exchange!

I also want to note that I buy stuff frequently from ads...some of my most loved items found me through ads! It’s frankly a great way to discover great stuff.

Do I sometimes see ads that are not relevant? Sure, just as I see post from friends/family that are not relevant...I just scroll by, easy as that!

Nice try, Zuckerberg.



Absolutely not; AdWords already does this with gclid, and Bing with... mkclid, I believe. fbclid will be nice to have and a convenience for those who data-warehouse their own ad data.

If this wasn't Facebook this wouldn't be news, gclid has been around for years.

Totally. It’s really just feature parity - about time.

I can see urls eventually being thousands of characters long with referral links daisy chained.

I use Neat URL[0] with Firefox to strip things like that from URLs.

0: https://addons.mozilla.org/en-US/firefox/addon/neat-url/
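For what it's worth, the core of what these extensions do is simple enough to sketch. Here's a rough Python approximation, with a made-up blocklist of common tracking parameters (Neat URL's actual rule set is far more extensive):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical blocklist, similar in spirit to what param-stripping
# extensions ship by default.
TRACKING_PARAMS = {"fbclid", "gclid", "dclid", "utm_source", "utm_medium",
                   "utm_campaign", "utm_term", "utm_content"}

def strip_tracking(url: str) -> str:
    """Return the URL with known tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in TRACKING_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_tracking(
    "https://pbs.twimg.com/media/Byv5uWSIIAEf38C.jpg?fbclid=IwAR2abc"))
# https://pbs.twimg.com/media/Byv5uWSIIAEf38C.jpg
```

Note this keeps non-tracking params intact, which matters for sites where the query string selects actual content.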

That's cool, but only really protects the surfer. Instead of stripping them, you could also either fill them with garbage values, or with more effort, swap the parameters with another link. This would degrade the analytics results, but might be harder to detect. And if enough people used it, you'd get some kind of herd tracking immunity.

Do you know if the rewrite happens before a user clicks a link? If it happens after, the data ends up actually being sent out to their server...

Some already are.

A lot of malicious links are just base64'ed to another redirect service; to another base64'ed address (continue as long as your head can keep up.)

I'd expect something like uBlock stripping this off.

If I remember there is a 1024 character limit.

There isn't. HTTP doesn't define a limit, but browsers, servers, and other software can. A quick Google search turns up multiple places referring to Internet Explorer's limit being 2,083 characters and other browsers' being tens of thousands of characters.


I wonder why fb would move away from the well-established utm [1] link parameters to this? From the article, I can't see any functional difference.

[1] https://en.wikipedia.org/wiki/UTM_parameters

Web marketing analyst here.

UTM parameters tag campaigns at the aggregate level, to be used for reporting. The fbclid is almost certainly unique to the click. While you could make, e.g., utm_content unique per click, that's not what it's for. Anything in a UTM parameter is intended to be human-readable and will almost certainly appear as-is in a report somewhere. Click ID parameters are internal IDs used to join data sources, which is not the type of data that should go into UTM parameters.

Note that Google Analytics, the tool that invented UTM parameters, itself does not use UTM parameters when it does this sort of thing. Google Analytics uses the gclid (AdWords) or dclid (DoubleClick) to join against user or click level data from other tools.
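To make the "join key" role concrete, here's a toy sketch with entirely made-up data: the click ID is the only thing linking the ad platform's click export to the advertiser's own conversion records.

```python
# Toy illustration (hypothetical data): a per-click ID joins the ad
# platform's click log to a site's internal conversion records.
clicks = [  # as exported from the ad platform
    {"clid": "IwAR2aaa", "campaign": "spring_sale", "cost": 0.42},
    {"clid": "IwAR2bbb", "campaign": "spring_sale", "cost": 0.38},
]
conversions = [  # as recorded by the advertiser's own site
    {"clid": "IwAR2bbb", "order_value": 59.90},
]

# Index clicks by ID, then attribute each conversion to its click.
by_id = {c["clid"]: c for c in clicks}
attributed = [
    {"campaign": by_id[conv["clid"]]["campaign"],
     "cost": by_id[conv["clid"]]["cost"],
     "order_value": conv["order_value"]}
    for conv in conversions if conv["clid"] in by_id
]
print(attributed)
# [{'campaign': 'spring_sale', 'cost': 0.38, 'order_value': 59.9}]
```

None of this requires the ID to be human-readable, which is exactly why it doesn't belong in a UTM parameter.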

UTMs are used for a website's own analytics, while fbclid will link users' actions on a website to advertising done on Facebook and report on it within FB. UTMs just track where a user came from; fbclid will track who a user is and match it back to FB ads.

Many links will already have been submitted with UTM parameters, and replacing them may have unintended side effects.

First time I read about UTM. It looks like those parameters are used to track ad campaigns. They provide metrics over which campaign works best.

The "fbclid" parameter, on the other hand, seems intended to track individual clicks. That is, Facebook wants to keep tracking individuals when they follow links to off-site pages.

It's no different than what every other major player is already doing, they're just now catching up.

I'm not aware of this, but maybe I'm behind the curve? Are you saying that if somebody posts a link on Twitter, that link gets a tracking parameter appended when people click through? Or what is it they are all doing?

So they're modifying URLs? Facebook is breaking things. But sure, they've run the numbers and decided they don't care.

Browsers will now have to resort to removing query parameters to prevent tracking. And websites should really use click-to-enable sharing buttons to prevent Facebook from snooping on everything.

My guesstimate is that the number of URLs that are shared on Facebook AND that already have a completely orthogonal "fbclid" parameter is infinitesimal.

Maybe among the URLs shared on Facebook there are a few whose servers only respond to a fixed amount of parameters, changing their behaviour when additional unused parameters are appended to the query string, but I imagine that the number of such cases is so low it's not even worth considering.

What exactly is Facebook breaking, in your opinion?

Would Facebook also break things if they were instead making an async request to the destination and appending a custom header to it, something like "X-Coming-From-Facebook"?

It breaks some server-side caches. And the article itself notes that pages were indexed by Google with the "fbclid" parameter attached.

I don't get the part about async requests. What's the scenario?

> What exactly is Facebook breaking, in your opinion?
>
> Would Facebook also break things if they were instead making an async request to the destination and appending a custom header to it, something like "X-Coming-From-Facebook"?

Extra headers are typically ignored, if only because different clients have sent different headers since the beginning.

I know multiple systems, however, that decode the query string and complain about unknown options, or that don't accept a query string at all for some resources. In the latter case it is ignorance; in the former, intensive input validation.

This is the reason why Google Analytics can be configured to read marketing parameters out of the hash fragment instead of the query string. A surprising number of sites will choke when unexpected data shows up in either the query string or the hash fragment, but very few will choke on both (most sites that mishandle query parameters are from an era before rich use of fragments became common).

Notably, most links with GA marketing parameters are under the control of the website owner. Facebook links are not. This makes such a work-around less feasible.
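The reason the hash-fragment trick works is that the fragment is a purely client-side construct: the HTTP request only carries the path and query, so a server that chokes on unexpected query params never even sees data stashed after the `#`. A quick sketch:

```python
from urllib.parse import urlsplit

# Only the path and query go on the wire in the HTTP request line;
# the fragment stays in the browser (hypothetical example URL).
url = "https://example.com/page?id=1#utm_source=newsletter"
parts = urlsplit(url)
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(request_target)   # /page?id=1
print(parts.fragment)   # utm_source=newsletter  (never sent to the server)
```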

I've seen a broken link from this url parameter already.

I understand being upset about the tracking aspect, but attaching query params to a link isn't breaking anything. Of all the ways Facebook could have implemented something like this, I actually prefer it this way. Query params are easy for me and for ad blockers to strip off. Imagine if they were messing with request headers or something else that was harder to notice or change.

An incredibly small number of sites might already be using `fbclid` internally, and an even smaller number won't be able to update their sites.

I am totally on board the don't-break-the-web train, but this just doesn't seem like a problem to me. Maybe once stats come out we'll see that it's a bigger issue, but... I kinda doubt it.

I got caught this weekend: I linked an image on Facebook, and it generated the preview properly, but the link is broken because the host 404's with the added fbclid parameter.

I entered: https://pbs.twimg.com/media/Byv5uWSIIAEf38C.jpg

Facebook made: https://pbs.twimg.com/media/Byv5uWSIIAEf38C.jpg?fbclid=IwAR2...

So sure, there may be an argument that the server should ignore that param. But it's absolutely false to say it "isn't breaking anything".

Huh. That is very surprising to me, but I stand very corrected.

I would consider it pretty bad practice to treat query params this way; extra query params should be ignored. However, web standards are descriptive, not prescriptive. If enough sites have strict requirements about query params, then it doesn't really matter what good practice is, and Facebook should accommodate this by moving back to cookies, or at least make it opt-in or something.

I guess this makes sense. The request contained something the server didn't understand, and wasn't expecting, so returning a 404 seems sane.

After all, as a developer, it's my site - I choose the URLs (including the query strings) which are valid and acceptable to me.

I wonder how many sites other than Twitter are rejecting requests with unknown query parameters?
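As an illustration of the strict-validation stance, here's a hypothetical endpoint that whitelists its known params and rejects anything else, which is roughly the behavior people are reporting:

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical whitelist for one endpoint; everything else is rejected.
ALLOWED = {"id", "page"}

def validate(url: str) -> int:
    """Return the HTTP status a strictly-validating server might send."""
    params = dict(parse_qsl(urlsplit(url).query))
    unknown = set(params) - ALLOWED
    return 404 if unknown else 200

print(validate("/article?id=7"))               # 200
print(validate("/article?id=7&fbclid=IwAR2"))  # 404
```

Whether returning 404 (rather than ignoring the extra param) is good practice is exactly what's being debated upthread.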

Facebook can't easily set arbitrary request headers; that would require messing with how the user agent retrieves things. Cookies are the chosen way servers can set client request headers, so the tracking information is passed by a cookie. Now that browsers are starting to be selective about cookies, Facebook is getting inventive.

Adding a request parameter absolutely will break things. And they knew this. The only question was what's worse: Not being able to track some people or breaking some of their links. Facebook decided the former is more important to them and their customers.

And even if nothing else breaks, uglifying the URLs people are posting is in itself an anti-feature.

I'm less worried about uglifying URLs, and more about the 404 stuff that a few other people have posted about now. It never occurred to me that there'd be more than an extremely minimal number of sites that would error out when receiving extra query params.

That's a new one for me; I need to make sure I remember it in the future. From what I'm seeing online, it's not even necessarily considered bad practice, so... I dunno anymore.

But agreed, Facebook should back this out.

Try going to a Pizzeria and say "Pizza Margherita, with extra Zejeako please". Should they just give you a Pizza Margherita because they don't know what Zejeako is? Or should they tell you they can't do it?

> Facebook should back this out.

They won't. They knew what would happen and did it anyway.

Author failed to do any research, instead of going for the typical "FB is doing something secretive and cryptic" angle. Related links that explain this:




This hn thread is a perfect example of a news bubble. Googling "fbclid" returns the answer in the first result, but hn votes up an article that has no information and treats it as some secret tracking that fb has implemented. HN is excessively biased against any discussion of tracking/analytics on the internet. The community allows no room for true discussion - only blatantly biased opinions.


Edit - reworded to be less aggressive

This article was posted days before any of the links you've given.

According to the metadata for the site, it was originally published on 2018-10-14 and last updated 2018-10-16.

Facebook's own article about the feature came out 5 days after this article was published. So, at the time, Facebook _was_ being secretive about it. Aside from that one line, the entire article reads more like "this is new, I wonder what it does".

Lastly, when I googled "fbclid" the top three articles were completely unrelated to Facebook (but then, I'm not in marketing, so this doesn't surprise me) and the fourth was this very article.

That is fair, I didn't check when the author posted.

The first link (for me) when googling fbclid is the reddit post in r/analytics I linked to, which doesn't have a ton of info but gives more than what the author had. Though you're correct, it was posted after the author originally posted, and I can't fault him/her for not checking in again a few days later.

They were probably doing an A/B test / phased rollout and didn't want to announce it until it was available to everyone.

Not really seeing the comments here as jumping to the conclusion that this is a totally secretive and nefarious practice. I think most people here are used to links being instrumented with tracking of this sort.

Your first two links don't contain any technical information about the "fbclid" parameter. They can be read to understand why Facebook does it, but not how. I see how it is useful to get this info when I'm paying Facebook for views. But that's something different from understanding how it works. So articles like this are needed to piece together what is going on.

Now the interesting question will be whether "fbclid" can be tied to individuals. And I couldn't readily find this info in the links you posted. Maybe I'm bad at reading?

This will most likely be the same as gclid for AdWords. You can't tie it back to individuals with gclid, and I'd expect the same here.

Small nitpick, but I believe you meant "instead going for," which has the opposite meaning of "instead of going for."

> you should always explicitly set the canonical URL for each page.

Could someone explain or give a reliable article that explains this well?

When Google crawls your site, it can't tell whether two URLs with the same path but different GET params are the same page.

There's no way for them to know whether the extra params on the URL change the resulting page (i.e., example.com/index.php?post_id=1 and example.com/index.php?comment_id=1 could be very different pages, or they could be the same; you don't know).

So in comes the canonical URL! This tells Google the proper URL for a specific page. That way, if Google gets to a page using two different URLs, it can tell that they are the same page.

You can declare it by adding a tag to your HTML head.

You can even do fancy things like rewrite URLs entirely (i.e., if the crawler hits example.com/?category_id=1&item_id=2, you can correct the ugly URL by listing the canonical URL as example.com/category/1/item/2).
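A minimal sketch of the mechanics, assuming (for illustration only) a page whose query params never change the content, so the canonical URL is just the bare path:

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_tag(url: str) -> str:
    """Drop the query string and fragment, then emit the <link> tag
    that would go in the page's <head>. Real sites must instead keep
    any params that actually select content (e.g. post_id)."""
    parts = urlsplit(url)
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="%s">' % canonical

print(canonical_tag("https://example.com/article/42?fbclid=IwAR2abc"))
# <link rel="canonical" href="https://example.com/article/42">
```

With that tag in place, a crawler that arrives via the fbclid-decorated URL knows it's looking at the same page as the clean one.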


This might cause duplication issues for sites which do not have canonical URLs implemented.

Google is pretty good at automatically determining which query parameters actually modify the response content (pagenb=, id=, q=, etc.) and which do not (sortby=, highlight=, utm_source=, gclid=), so that should not be a problem.

Well, yeah... but so would UTMs and the millions of other click IDs out there. Adobe... gclid... the list goes on. You need a canonical.
