URL shorteners set ad tracking cookies (ylukem.com)
497 points by firloop on Jan 3, 2021 | 202 comments



This is really interesting. I suppose TinyURL gets a kickback from their ad network for this. I'm the creator of a URL shortener (T.LY) and a link unshortener tool. I spend most of my development time fighting bad actors. My goal is to have a legitimate competitor to Bitly that people benefit from. We do not set any cookies on redirects, but we do use cookies for user authentication.

T.LY: https://t.ly/

Link Unshortener: https://linkunshorten.com/


It is a shame that T.LY displays only the footer without JavaScript enabled instead of degrading gracefully. Surely a plain HTML form that POSTs should suffice?

I'm not sure how much work it would require for you to support this, but it would help cement your place as a good web actor if you're so inclined!


Sorry about that. I honestly didn't think anyone browsed the web without JavaScript enabled. How common is that? We do offer a simple-to-use API that you could build on top of to shorten links, as well as an extension that shortens a URL in one click.

API Docs: https://t.ly/docs/

Extension: https://t.ly/extension


I browse with NoScript blocking JavaScript by default, as too many web developers (or their managers) have violated my trust not to do Dodgy Stuff over the years. Unfortunately I don't believe accurate numbers will ever exist for the true portion of people who browse a subset of the web with JavaScript disabled, at least partially because many of those same folks block the very means used to collect the data in the first place.

It's no place for me to dictate how you do your development, so I won't do that. It is however my personal opinion that websites should depend on HTML and CSS, and progressively enhance functionality with sprinkles of JavaScript. The vast majority of websites are not interactive applications, and I think modern web development practices could do with something of a hard reset.

I'll leave it as an exercise for the reader to decide how things ended up where they are now and whether it's a good thing for them. Personally I think it's comical and horrifying just how much compilation goes on in projects written in that particular interpreted language these days!


It's an unfortunate effect. I feel conflicted about blocking all the tracking and stuff, as I believe most sites will ignore browsers such as mine in their statistics, where everyone has JavaScript (because analytics says so), and most will have WebGL, Wasm, canvas, web fonts, notifications, whatever...

They probably don't count private browsers even when I am a logged-in paying user - who parses server logs these days?


Do you run native phone apps? Because they are 10000x worse: 75 MB to display a website, plus 20 tracking/retargeting libs.

Apps are killing the open web anyway. People being born now will grow up without knowing what the web is.

I agree about the state of the web tho. I'm a web dev and I often browse with JS disabled and always with adblocking and pihole.


Not OP, but only open source ones. It's a shame, but such is life.

I have one client-provided phone, not neutered and running a closed-source app, that I only power on for specific purposes.


I wouldn't say it's common, but this is the one forum where a considerable amount of people disable JS when browsing (and likely only whitelist few if any sites). It's always a good thing to support nonetheless, so please go for it!


I use uBlock Origin in "medium" mode, where it blocks 3rd-party JavaScript by default, and it behaves the same. Unblocking the Cloudflare-originating JavaScript fixes it. I'd guess my setup is more common than having JavaScript disabled entirely. Not a complaint, just another data point.


Yes, I do have the site behind Cloudflare, but I could remove the JavaScript Rocket Loader feature. I will look into this. Thanks for sharing!


Is there some reason you want to block cloudflare JS? If not, wouldn't it be easier to add an exception to uBlock rather than try to get devs to change their site for you?


The extension is blocking 3rd party JS. Nothing against Cloudflare specifically.

There are various reasons to block 3rd-party JS: security, privacy, etc. CDNs and remote-linking of JavaScript and other such content are counterproductive to those endeavours.

A good citizen should aim to self-host anything as important as executable code _where possible_. The reasons, I hope, are obvious.


There are some people who have JS disabled by default. See the NoScript extension. So the dev wouldn't just be changing it for eikenberry, but for all such people.


No reason; I always just add a Cloudflare exception, and I'll probably look into making it a global exception, as it's pretty common. I was chiming in to help the site dev understand the issue, giving another data point.


Why would it be easier for all users to add an exception instead of one site owner to make one change?


>I honestly didn't think anyone browsed the web without javascript enabled. How common is that?

Not the person you asked, but speaking for myself, all the time. I have a javascript toggle I use several times a day, and leave it set to off as much as I can.


Interesting... How many sites work without JavaScript these days? Does Google?


More than you'd think, not as many as I'd hope. Easily 2/3rds of the ones I visit work, FWIW.

Google works fine without javascript. Stunningly fast.


> Easily 2/3rds of the ones I visit work, FWIW.

I started blocking JS a few weeks ago, and this has been my experience as well - a pleasant surprise.

For a long time I thought that would be a step too far, that browsing would become so annoying and unpredictable because of it. Any annoyance from having to turn on JS for individual sites is easily outweighed by the number of annoyances I avoid - news websites are actually readable, blogs open with their content rather than with an in-your-face pop-up, and as a bonus, I pay attention to things like: how many 3rd party domains is it trying to connect to? does it require 3rd party JS to be enabled to function at all? did they even consider the possibility of disabled JS and bother to write a noscript message? Things like this translate to a measure of trustworthiness to me now, and I've been both horrified (by simple blogs trying to connect to 80+ domains) and pleasantly surprised (by complex-seeming websites that don't use 3rd party JS at all).


It's funny when the blocker says 99+ scripts blocked on a news site because it can't display more than two digits. Or when you end up with a black page because they didn't bother with a no-script version.


> Google works fine without javascript. Stunningly fast.

Until they block you for "suspicious behavior" after a few minutes of using it like that.


Genuine question: has that happened to you?


Yes! It also happens when I am not logged in, or use a VPN (though that is understandable).

But simply disabling javascript and clicking next a few times is usually enough to set off their blockages. Even when not using a VPN. It happens less if you only ever look at the first page.


It makes sense: the fingerprint you create with JavaScript can identify you easily. Without that, Google treats you as a suspect.


Interesting! Thanks for responding. I haven't had that happen to me yet, but I'm often using the same static IP I've had for years. Perhaps that keeps the trigger at bay.


DDG has a no-JavaScript search.


Plenty, and you get used to either toggling or enabling the specific scripts that need to run.


Would you mind offering a bit more detail about the "toggle"? Which browser, what's the name of the extension, etc. I would love something like that but don't really want to go through the effort of setting up a whitelist right now.


Sure thing, I'll edit this reply with the extension in question when I get home. Won't be very long.

[edit] The extension is called Quick Javascript Switcher. https://chrome.google.com/webstore/detail/quick-javascript-s...

It works as advertised for being a Javascript toggle.


Not the person you asked, but NoScript is a choice. uBlock Origin as well, with the right settings. Should be on Chrome or Firefox. On mobile, only the latter I think. Kiwi too?


With NoScript, I blocked everything and then, when sites broke, I very much enjoyed figuring out which element was the culprit. I became very good at remembering which elements rescued which URLs.

When I switched to uBlock Origin, I didn't even realize how to do this. I just allowed all JS whenever I found broken sites.

Now, this very thread encouraged me to finally figure out uBlock Origin's settings and, finally, enable specific JS elements instead of a blanket "allow".

Here is a great user guide for the uBlock Origin toggle:

https://www.maketecheasier.com/ultimate-ublock-origin-superu...


> I honestly didn't think anyone browsed the web without javascript enabled.

Not very common in the general population. But there are those (mostly software developers) who prefer to be in control of what code they run on their computers. I know one person who does most of his browsing using Lynx. That is certainly extreme, but extensions like NoScript and uMatrix (which recently went out of maintenance) certainly have their user base.


> I honestly didn't think anyone browsed the web without javascript enabled.

I certainly do! I am sure that I am very unusual, but to this day I very much prefer not to grant execute permissions to every page I read. JavaScript is a huge security/privacy/performance hole, and is simply not needed for displaying lines of text and images, nor for accepting form data.

It has some pros, too, but on the whole I really miss the mid-2000s Web and am not fond of all the web applications out there.


I also run script blocking on everything. And I block the domains of all URL shorteners in my DNS so I don't accidentally go to some weird site. I also evangelize this to all my customers, and some actually want me to install it on their computers too. Sadly there are so many pages that break down without allowing a lot of external scripts, but it has to be something really important for me to bother with unblocking.


I realize you're not using React for t.ly, but the method I outline in this[0] blog post could perhaps be made to work for you if you at any point would like to accommodate users without JS enabled. Yours is the kind of site that this method is best suited for - a relatively simple UI with basic I/O.

The biggest hurdle I've encountered so far is that Stripe doesn't offer a fully no-JS alternative to enable users to make payments, although this would be incredibly easy for them to do, considering that they already offer a hosted checkout[1]. The only thing missing here is a way to get the checkout URL itself from the server side, when the Checkout session is generated.

[0] https://blog.klungo.no/2020/05/28/using-react-and-redux-to-a...

[1] https://stripe.com/en-no/payments/checkout


At work I'm not allowed to have any extensions, so I just turn JavaScript off in lieu of uBlock Origin.


You can actually sanity check how common it is for T.ly by triggering an analytics hit within a <noscript> tag. Looks like you're using GTM/GA on your site, so this[1] should put you on the right track.

You'll still be blind to individuals who are blocking GTM/GA itself, since you're not using the newer server-side GTM option; hence it's only a sanity check. But it's a fairly low-effort tweak to get a read on how common it is for your site specifically.

[1] https://www.simoahava.com/analytics/track-non-javascript-vis...
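
The linked article covers the GTM specifics, but the bare-bones version of such a hit looks something like this (a sketch using the older Universal Analytics Measurement Protocol; the UA-XXXXX-Y property ID and the cid value are placeholders, not your actual setup):

    <!-- Fires a pageview for JS-less visitors via a 1x1 tracking pixel. -->
    <noscript>
      <img src="https://www.google-analytics.com/collect?v=1&t=pageview&tid=UA-XXXXX-Y&cid=555&dp=%2Fnoscript"
           alt="" width="1" height="1">
    </noscript>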


> I honestly didn't think anyone browsed the web without javascript enabled.

I know a bunch of folks have replied, but I'm another one. I remember back before JavaScript; I remember when Flash was the bane of those who cared about privacy or security; I remember when 'DHTML' was the buzzword of the day.

I actually have a lot more appreciation for what JavaScript enables now than I used to. It really is neat that we have this platform-independent mostly-not-completely-insecure app runtime. Pity that it is built atop what should have been a hypertext system, though.


> I honestly didn't think anyone browsed the web without javascript enabled. How common is that?

I don't know how common it is, but I do. I have a secondary browser profile which does allow it, but frankly for just about any page I visit if it doesn't work without JavaScript I will skip it: the Internet is large and I rarely need to look at a page.


Javascript JIT is a massive attack surface, and is disabled by many on higher-assurance machines.


NoScript is my starting point, and I'd have to really need a site to enable it globally.


Can you instead use the meta refresh tag?


Why use a meta refresh tag over an HTTP redirect? https://t.ly/home


Try curl https://t.ly/c55j

The response is:

<meta http-equiv="refresh" content="0;url='https://weatherextension.com/'" />


samb1729 isn't talking about viewing shortened URLs. That works fine with JavaScript disabled. samb1729 is talking about viewing the homepage of t.ly and creating shortened URLs.

Side note, I think a 301/302/303/307/308 redirect is better than meta refresh (t.ly happens to use a 301 redirect + meta refresh).
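
You can check both from the command line (a quick sketch; this assumes the example link from upthread still resolves the same way):

    $ curl -sI https://t.ly/c55j
    # expect an HTTP 301 status with "location: https://weatherextension.com/";
    # fetching the body instead (curl -s, without -I) shows the <meta refresh> tag too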


Yes, T.LY uses a 301 redirect, which is better for SEO for the long URL's domain.


Thorrez is correct in their interpretation of my comment, so I have nothing to add there.

However given your username I'd like to let you know Cobra Kai season 3 recently released and is as silly as ever, in case you haven't already watched!


> Link Unshortener: https://linkunshorten.com/

Well, Google Analytics and Googlesyndication are known to set the infamous PREF cookie (remember Snowden and PRISM?)... so I wouldn't recommend that website either if the whole point of this discussion is to avoid ad tracking cookies.


Seems nice. I'm curious, how do you make money / stay in business? I couldn't find any paid options.


Thank you! The site and extension are free to use to shorten links. I do offer the ability to upgrade starting at $5 a month, which adds custom domains, the ability to customize links, link expiration based on date or clicks, private stats, and the ability to shorten links using the API (https://t.ly/docs/).

I also recently released a new feature called OneLinks that is great for social media bios. Here is an example of a OneLink: https://t.ly/TimLeland

Extension Link: https://t.ly/extension


Just a heads-up, "OneLink" is trademarked by AppsFlyer: https://support.appsflyer.com/hc/en-us/articles/115005248543...


Hmmm. My browser complained that you’re running three trackers on that site (Google, Cloudflare and DigitalOcean).


Are you seriously equating "has an image hosted on digitalocean" (which probably hosts the entire site) with "tracking"?


No. It said they were trackers operated by those companies.


Yes: Cloudflare for speed and protection, DigitalOcean for file storage. I may remove Google Analytics.


Using Cloudflare doesn't give me confidence this will in any way not track me.


Presumably, that's a tradeoff the OP is willing to make.


Maybe this has changed since your comment, but I see three paid plans on the homepage.


Yes, they have always been there. There are additional plans, once you register, for more short links and teams.


> We do not set any cookies on redirects but do use cookies for authentication for users

You also set cookies on every request apparently. What are they for?

    $ curl -Is https://t.ly/ | grep -ic set-cookie
    3


How is your project protected from being bought off by a bad actor?


Wow! https://preview.tinyurl.com/examplezoom really shows the https://zoom.us/j/123456789 link, whereas the Chrome network inspector confirms the viglink.com redirect. uBlock Origin blocks the latter via Dan Pollock’s hosts file and Peter Lowe’s ad and tracking server list.


As someone who uses a whitelist approach, I am curious whether people ever experience false positives or missing entries with these lists? I have little experience with those lists, except for going through one of them once and being shocked at what was in there.

The setup I use is customised for me, i.e., Rube Goldberg would be proud. I can view and manipulate all traffic from outside the application and outside the origin computer. I can strip cookies based on IP, domain or URL very easily. I also control DNS so only domains I approve would even return an IP address.


There are many false positives or grey negatives when using those filters.

But it mostly happens during these kinds of redirects, where one or more actors want to be in the redirect loop. This could be URL shorteners or price comparison websites.

uBlock asks if you want a one-time exception when a redirect leads you to a blocked URL.


What is the user interface for your setup like? It sounds attractive, but possibly too high-friction to be workable for me.

I currently use a combination of uBlock Origin blacklisting, NoScript whitelisting, and Little Snitch alerting, if you need a baseline to compare. I've also run a Pihole instance in the past to loop my phone in, but that's not running as of today.


No GUI.

I think what I have created is something like a cross between Pi-Hole, Burp and something yet to be named. But it's faster, more flexible, uses different software and is Java-free.


Sorry if I was unclear, I wasn't asking about a GUI. I mean how do you interface with it as the user? I assume it isn't just something you launch and forget about given your description.


Oh, sorry I misunderstood. It is ideally run on a gateway, but can also be run on the same machine if using a UNIX-like OS that isn't locked down. I do interface with it a lot because I like to look at logs and dumps and experiment with configurations, but that's not required. Setup consists of a single script that sets up all the servers and imports the data. Any changes while using it consist of editing text files. There are some tiny shell scripts and some helper tools I wrote in C to facilitate hands-on DNS management, as I am very active in managing DNS data; I like to see IP addresses rather than hide them. I intentionally do many DNS lookups semi-manually. This is purely personal preference, not required. This system could be "set it and forget it" once you have the proxy configs and DNS data you want. The amount of DNS data I actually need to survive is quite small. Those outsourced blocklists the ad blockers use could be larger than personally curated whitelists, depending on the user. The DNS and proxy servers use little system resources.

A programmer with an excellent track record for reliability once said something like "The best interface is no interface." This is how I like things. I do not want to be required to constantly interact. He is the author of the DNS server and daemontools, which I use to control the servers.

HTH


That sounds so cool, I'd love to know more about your setup!


Tried in a new profile and didn't see any viglink.com.

Edit: the link should be https://tinyurl.com/examplezoom (which does have viglink.com).

For some reason you wrote the preview link, https://preview.tinyurl.com/examplezoom, which does not have the tracker.


I think that's their point: preview.tinyurl.com is lying to you.


Ah, I misunderstood.

TBF I think they have the direct link on the preview page simply because they don't want to track the traffic from these pages (instead of trying to disguise it), but the practice is still bad.


I currently host https://femto.pw/ - a URL shortener I've kept up for ~4 years and intend to keep up indefinitely. It doesn't do anything with regards to tracking cookies or other dark patterns. It just redirects you using a 302 redirect.


FYI that your site is blocked by this list: https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list

HN post for that list here: https://news.ycombinator.com/item?id=25512273


That list is questionable at best.

There are many claims the list author makes without any source code at all, though with a lot of buzzwords. The Reddit r/pihole moderator pulled the post: https://www.reddit.com/r/pihole/comments/kh5dit/the_quantum_... . The thread was more entertaining before the list author deleted every downvoted comment they made.


[0] is perhaps even more concerning - apparently it bears a striking resemblance to Steven Black's (slightly more reputable) list[1] [edit: plus a few hundred thousand other rules of questionable sourcing].

[0] https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list/-/i...

https://github.com/StevenBlack/hosts/issues/1487

[1] https://github.com/StevenBlack/hosts


I agree that it's questionable. I commented the same in the thread I linked: https://news.ycombinator.com/item?id=25513161

However, at least for Pi-Hole users, more is usually better, so I added the list to my Pi-Hole.


> > We were testing an AI that could show some basic emotions about internet content, and turns out it was very precise at getting “annoyed” by ads and “unsolicited” third party connections…

Holy shit that's such bullshit.

They are basically claiming they invented an artificial general intelligence, with feelings, that happens to feel the same way about ads as us. It's basically sentient, but instead of publishing research papers, they turned it into an ad blocker.


It's just colorful language for the fact that ads and spyware score high on their model for bad websites.


First: Marketing bullshit is still bullshit.

Even if it's not morally wrong, it makes you look like an idiot who doesn't understand the technology you are selling. In the worst case it might even be used as evidence that your work is a fraud.

There is no benefit; to the lay person, it would sound just as impressive to say "We trained a machine learning model to detect ads and spyware", and that wouldn't immediately set off alarm bells with people familiar with the current state of machine learning.

Second: talking about fraud, the evidence linked above is pretty strong.

Their alleged AI is somehow flagging test domains planted by the authors of other lists as "ads or spyware", test domains that aren't linked anywhere on the internet.

In one "smoking gun" example, the test domain doesn't even have a DNS entry. The alleged AI can't even load the domain to scan it.


No, more is not usually better. Especially with a garbage ""AI-generated"" (not) list with untrustworthy maintainers like this one. It's better to add a low number of lists with trusted maintainers, who actively curate their lists and respond to false positives. That means no "mega-list" abominations like oisd.nl.

I suggest: https://www.github.developerdan.com/hosts/

https://gitlab.com/curben/urlhaus-filter/raw/master/urlhaus-...

https://raw.githubusercontent.com/notracking/hosts-blocklist...

https://raw.githubusercontent.com/anudeepND/blacklist/master...


Can you explain why more is not usually better?

I added the 4 lists you recommended to my Pi-Hole, which added a net new 73,253 domains to my Pi-Hole. My total is now close to 2M.


You could just blacklist *.com and be done with it.


You joke but I would be most happy if all my web needs could be served on .onion addresses


Hm, well I've got to work out how to get off that list! Thanks for giving me the heads up.

EDIT: I'm not sure quite how to deal with being put on ad lists. Sure, people can upload any file to our host, so it's plausible that someone, at some point, has uploaded an advert. Someone could also redirect to an advert domain, and we'd have no way to really deal with that unless it was reported. Ideas for solutions are welcome.


Just some thoughts:

1. Reach out to the list maintainer to see why your site was added.

2. Create a blocklist comprised of those ad lists. Don’t redirect to sites on the blocklist.

3. (Of dubious practical value) Create a Terms of Service that says users may not use your service to link to advertisements.


+1 to the second suggestion as a low-effort way to make some headway in staying off blocklists.

A place to start might be this large, very popular list that combines a bunch of other lists: https://oisd.nl/

Actual text file is here (large file warning): https://hosts.oisd.nl/

Just prevent your service from shortening links to any of those domains.


You might want to consider checking for hosts listed in https://github.com/notracking/hosts-blocklists

This is an excellent merged blocklist, with a public whitelist (oisd is fully closed, with no insight into what is whitelisted and why, which also causes more false positives...).


No longer the case: https://oisd.nl/excludes.php


Right on time, sjhgvr can't allow his rep to be (rightly) blemished on any corner of the internet.


> 3. (Of dubious practical value) Create a Terms of Service that says users may not use your service to link to advertisements.

That seems entirely unenforceable. Aren't ALL websites ultimately advertisements?


> Aren't ALL websites ultimately advertisements?

No. Some are just information, art, or what-have-you. Here's one I just found now.

https://aaron.axvigs.com/


That could still be considered an advertisement of his existence and writing skills.

If the goal is purely informational, why is the author's name attached?

The site also advertises the CMS it runs on.

That's my point: by a reasonable standard, ANY site that exists is an advertisement for something or other, thus a rule saying "no linking to advertisements" is worse than useless.


This must be the mindset it takes to work in the ad tech industry.

Ads are sort of like porn. There are lots of things you certainly know serve no other purpose than to advertise something and you can block them outright. Native advertising is certainly difficult though.


I do not work, nor have I ever worked, in ad tech.


I guess you have a different understanding of what "advertising" is than the general understanding.


advertising or ad·ver·tiz·ing [ ad-ver-tahy-zing ] - noun - the act or practice of calling public attention to one's product, service, need, etc.


I believe it's possible for a website to exist without calling attention to anything.

Or perhaps you believe the mere existence of information is a call for attention.


Doesn’t all content exist to receive attention?

I think there would be exceptions, like test sites, personal experiments etc. that could make it on to the internet without seeking attention, but any content designed for consumption is attention-seeking.


> Doesn’t all content exist to receive attention?

Maybe. Attention can also be granted without having been called there. There are also websites not designed for consumption.

If every website is advertising, then surely most of human discourse and activity would also be considered advertising. What's even the purpose of the word?

You're not going to convince me that everything is an ad, and I probably won't convince you either. I'm not interested in playing any further semantic word games. I'll read any replies you make if you choose to, but I have nothing more to offer in this thread.


I agree that not everything is an ad. I think the parent comment is fairly trite.

I do believe all content made for consumption (even purely informational content) is attention-seeking.


For me the problem is that you hide URLs: I can click one and have no idea where I'll end up. So I block all URL shorteners on principle on my Pi-hole.


What happens to it when you die? Do you have a contingency plan to export this data somewhere for archival purposes?


I've worked with the Internet Archive to ensure continuity if I get hit by a bus or anything. A list of all items that have been uploaded to the site will be provided to them if anything happens to me.


Femto's links don't seem to unfurl on Facebook


Tinyurl actually has a preview feature, which you can enable by default.

https://preview.tinyurl.com/examplezoom

Curiously, this specific tracking behavior (both the redirect and the cookie) goes away when turning on previews.

(Incidentally, my uBlock origin filters block the VigLink redirect as a tracker, by default, as a sibling commenter points out.)


Although "Oh By"[1] is not strictly a URL shortener it can be used as one quite nicely.

When used as a URL shortener, there are no cookies, no tracking, and ublock origin shows a nice big zero throughout. This is because the revenue model of Oh By is selling custom/vanity codes - not monetizing user data or advertising.

"If you're looking for a dead-simple URL shortener that respects your privacy and doesn't slow you down with ads or multi-megabyte interstitial pages, Oh By might be for you."[2]

[1] https://0x.co

[2] https://0x.co/faq.html


You have to type http:// in the message field to make it a redirect.


Yes, correct.

The typical use case is a human message, not a URL. If you want a redirect, you need to explicitly prefix it like that…


Everything that sits between you and your destination is a middleman tracking you, unless proven otherwise.


(Astronaut looking at planet Earth) "Wait, it's all trackers?"


(Stallman) "Always has been"


Isn't tracking the entire business model of URL shorteners?


Wasn't the primary use of URL shorteners to compress a given URL in order to reduce the character count? Given today's Twitter, what are they still used for besides visual convenience?

Do youtu.be, t.co, fb.me and dlvr.it next!


No, the primary point was always to add UTM trackers to the URL. That’s why companies kept using them after Twitter introduced t.co.


Can't you add the UTM tracker to the URL without shortening the URL?


My company uses them in its print assets like billboards, posters, and transit ads.

I see them all the time in commercial text messages, like from things I've subscribed to, or delivery alerts so I can track the pizza guy.


Do they use QR codes in addition to the shortened URLs? I’ve always wondered why QR codes haven’t caught on more, especially for things where the objective is to make accessing information more convenient than fat-fingering a URL.


QR codes are everywhere! They're on a lot more foods and such than they used to be even 5 years ago. French's mustard has one, Barq's Root Beer cans have one. A lot of electronics I buy have a card in the box with a QR code to get to the company's site.


It's a shame the Netflix app on smart TVs doesn't show one for login.

Rather than awkwardly typing in my username and password through a remote control, I should be able to open the Netflix app on my phone and scan a QR code.


T.LY generates QR codes for all short links. We also have a simple tool for creating QR codes from any URL: https://t.ly/qr-code-generator


> Given today's Twitter, what are they still used for besides visual convenience?

Data analytics - basically you spread out different shortened links on your campaigns / media, so you can track effectiveness while at the same time the user does not have to manually type in cryptic characters.


Yeah, what I mean is that I don't think URL shorteners do anything for users aside from being slightly better to look at.


Well, click tracking and click counting come to mind.


I mainly use them when I need to send a link that has to be manually typed at some point (e.g. asking a person to go to some website during a phone call).


Text messages still use short links, and carriers sometimes block by domain for links sent via A2P over their network.


We use Cloudflare Workers as a very simple URL shortener [1]. It has a very generous free tier (100k requests per day), so it's more than enough for a lot of use cases.

[1] https://lucjan.medium.com/free-url-shortener-with-cloudflare...
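
The core of such a Worker is tiny. A minimal sketch (the inline map and paths here are illustrative, not the article's exact code; a real deployment would more likely keep the mappings in Workers KV):

    // Look up the request path in a hard-coded map and 301-redirect.
    const LINKS = {
      "/hn": "https://news.ycombinator.com/",
      "/blog": "https://example.com/blog",
    };

    addEventListener("fetch", (event) => {
      const { pathname } = new URL(event.request.url);
      const target = LINKS[pathname];
      event.respondWith(
        target ? Response.redirect(target, 301)
               : new Response("Not found", { status: 404 })
      );
    });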


Cloudflare docs [1] recommend using an ‘AAAA’ record with the value ‘100::’ for the dummy DNS entry.

1. https://developers.cloudflare.com/workers/learning/getting-s...
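
In zone-file terms that's a single record pointing the hostname (a placeholder here) at the IPv6 discard prefix, so the Worker route has something to attach to:

    short.example.com.  1  IN  AAAA  100::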


Thanks, edited the article.


https://is.gd is the best URL shortener I know of. Straight to the point, fast, light, no snooping or tracking.


Thank you for sharing. This will be my default URL shortener from now on. Simple, fast, and customizable.


Not particularly surprising. I was building a URL shortener some 12-13 years ago but eventually abandoned it. This was exactly how I planned to monetize it, though.


This headline might be the biggest "duh!" I've ever read on the site. In this day, and in this surveillance market economy, you must assume that you WILL be tracked wherever you CAN be tracked.


I understand the downvotes, given this is HN, but while this is "duh", lots of people don't actively think about it whenever they see a shortened link. Posts like this are okay now and again to remind people that they can and will be tracked wherever possible.


For exactly this problem I built https://unshort.link

It is a service that unshortens the URL and removes (if possible) the tracking parameters.

It is GPL3, allows easy self-hosting, and has an automatic browser plug-in.


I really wish web browsers would strip tracking code bullshit from URLs. When I copy/paste a link for friends I always manually edit that crap.

On the other hand I do love websites like WireCutter which only exists because of referral codes.


You can do this with an extension like https://gitlab.com/KevinRoebert/ClearUrls

I found that it broke some sites though so I removed it.


Yes, Wirecutter exists because of affiliate links, but they do offer detailed reviews. I use/trust them often for purchases, and Amazon affiliate links do not cost the user anything extra.


If I recall correctly, Viglink does affiliate marketing. Essentially they are setting an affiliate cookie to make money from anything you purchase on Amazon, Walmart.com, eBay, etc. This cookie will override any other that was already set. So if you clicked a link to a book from a blog post and then clicked on a tinyurl, they would get the affiliate referral money and not the blog.

It's an easy way to make money because it doesn't involve a long sales process with major advertisers; Viglink does all that. TinyURL, Bitly, et al. are probably making a fair amount given their reach.


Basically:

1. TinyURL does not give Zoom any more customers than they would have had otherwise.

2. Zoom pays VigLinks and TinyURL.

3. An incompetent or unethical performance marketer gets to claim to their boss that they are driving X upgrades for $Y, when in reality they are driving 0 incremental upgrades for $Y.


TinyURL and several free alternatives have been known to do it for a while now. But not everybody does this, to be clear.

Running a free URL shortener costs time and money, which is why they do it. For my URL shortening service https://blanq.io, I am planning to remove this feature and only support custom domains. Free shortening is highly abused by spammers, and it's a daily battle to stay one step ahead of them.

Last week, a single bad user created a phishing link and brought down the entire site for an hour until I was able to restore it.

Lesson learned.


I am not surprised. URL shorteners will try to monetize eventually. One way is to support ad networks; another is to show ads and videos before navigating to the target URL. I am 100% sure the TOS have allowed it since the beginning.

As grim as that future seems, it is almost the only way they can monetize. Otherwise they will close their businesses, rendering millions of URLs broken, a future that I think is all too easy to predict.


> As grim as that future seems, it is almost the only way they can monetize.

Bitly charges $30/month (basic) which seems like an outrageous amount of money to me for what it does. How much more monetization do they need?


Could also cross-subsidize by being a sub-affiliate network as part of an affiliate network. Company earns percentage of affiliate commissions produced by in-network links, which subsidize the non-commissionable out-of-network links (and non-earning in-network links).


Delete your cookies regularly... here's how with Python and Firefox. Maximum privacy, no need for extensions.

        # Prune Firefox's cookie database down to a whitelist of hosts.
        # Close Firefox first; the database is typically locked while it runs.
        import sqlite3

        def cnt():
          rows=csr.execute("select count(*) as nr from moz_cookies").fetchall()
          return rows[0][0]

        # The XXXXXX/YYYY... placeholders are your own user and profile names.
        db=sqlite3.connect(r'C:\Users\XXXXXX\AppData\Roaming\Mozilla\Firefox\Profiles\YYYYYYYYYYYYYYYYYY\cookies.sqlite')
        csr=db.cursor()

        ckbefore=cnt()
        # Print a per-host cookie count before deleting anything.
        rows=csr.execute("select host,count(*) as nr from moz_cookies group by 1").fetchall()
        for r in rows: print("- %3d %s" % (int(r[1]),r[0]) )

        # Keep only cookies whose host is on your personal whitelist.
        csr.execute("delete from moz_cookies where host not in ("
                " 'your', 'own', 'list', 'of', "
                " 'sites', 'you', 'trust')")
        db.commit()

        ckafter=cnt()
        print("%d > %d cookies" % (ckbefore,ckafter))

        csr.close()
        db.close()


Don't want to be mean, but just to inform you: the guidelines say "Please don't delete and repost. Deletion is for things that shouldn't have been submitted in the first place." and I know you posted and then deleted the same post yesterday. It is fine to repost if it didn't get noticed, no worries.


Sorry about that, noted.


I honestly thought that was common knowledge. Like why else would you use a URL shortener, since Twitter started doing it on their own?

I can do more to help web users understand trackers... perhaps I will work on that this year.

I’ve worked in and around the space for too long to see outside of my bubble.


Not all URL shorteners do that. I know because I own and maintain one that doesn't.


[flagged]


There are cases where URL shorteners are useful. E.g. some websites will parse a link you embed within a text you post and replace it with the actual video if it's a link to YouTube; a shortener may be the only way to post a classic hyperlink to a YouTube video there. Shortened URLs also help when you need to put them on paper/merchandise or on TV, or say them in a voice call. It's sad that goo.gl has been discontinued; it was what you could rely on. IMHO archive.org should make their own.


No, it's public and has been running for 11 years already, and it will continue to do so for the foreseeable future. I would say it is the most popular one in my home country, and it has a good reputation among users. From my experience, most linkrot issues come from the fact that the sites and documents URL shorteners link to go down before the shorteners themselves. Many websites from 11 years ago don't exist anymore.


Do you have some form of information escrow in place? E.g. could archive.org store a page of all your short-url mappings?


Not at the moment but Archive.org is an option I'm considering.


[flagged]


Please don't post in the flamewar style to HN or cross into personal attack. Those things aren't compatible with curious conversation, which is what we're going for here. We're also trying to avoid the online callout/shaming culture [1].

Even if you're right, beating people with a stick will neither improve their behavior nor the quality of conversation for anybody else. The endgame of this is a ghost town inhabited by a few nasty diehards, abandoned by users one would actually want to have a conversation with. That seems to be the default fate of internet forums but the goal of this one has always been to stave it off a little longer [2].

[1] https://hn.algolia.com/?sort=byDate&type=comment&dateRange=a...

[2] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...


What harm have they "already caused"?

Is link rot such a damaging phenomenon that it warrants attacking hobbyists and their not-for-profit public service?

Will you help financially compensate their time setting up these fail-safes?


[flagged]


> [Unnecessary crude remark]. He made the mess, so if he has any integrity he'll foot the bill for cleaning it up.

He [set up a server with a link shortening service pro bono, eating the cost of server maintenance for 11 years], so if he has any integrity he'll [do more free work].

I'd argue it's the user's fault if they decide to trust a small hobby site to last until the end of time. How many link shortening services have you used which promptly died, causing you to find this ridiculous hill to die on?


It's the user's choice to use a shortener to shorten their long URLs. Calling shorteners middlemen is just wrong.


The person who uploads the link is not the only affected party. This affects every unrelated person who might ever want to follow those links long after the shortener is dead and gone.


Any link on the internet - shortened or not - can die after some time. Domain registrations expire, websites get shut down, domains change ownership and new sites go up. Relax. It's just the lifecycle of Internet resources. Let us end this conversation. You obviously see things differently.


Thank you for being concerned for my life. I've set it up in a way that someone will take it over after my sudden death, don't worry.

And I care about climate change, even after my death.


Use Cookiebro webextension to get rid of such tracking cookies automatically. Problem solved.

https://nodetics.com/cookiebro


They could of course be sharing the click "back channel" with the ad network, without any visible redirect at all, and still be capturing just as much data. I guess they couldn't actually set a cookie on the viglink.com domain that way, though.

Is that important enough to risk being "found out"? Or do they just not care that much about being found out, so they went with the option that is somewhat technically easier to implement but visible to the end user?


I used to run into a sci Usenet poster who usually provided 10-30 shortened links with his postings, pointing at books, papers, and previous postings (Google Groups). Arguing over a topic, he once explained that he had a clear analytics picture of which references other posters did and didn't read, who [silently] participated in the discussions, how much people read before and after writing a response, etc.


I do this for teaching. But I don't use a public url shortener because I trust none of them. I have a shortener built into my teaching site.


Some URL shorteners provide a copy of their mappings to the Internet Archive as a promise that if they stop functioning for any reason the archive will continue to provide the mapping (and make the mapping file available).

[0]: https://archive.org/details/301works-faq


Isn't this kind of redirection to set a cookie something explicitly blocked by Safari's Block Cross-Site Tracking feature? And I believe Firefox introduced a similar feature as well (not sure about Chrome). I feel like this kind of redirect thing was explicitly called out in the blog post announcing the very first version of this feature.


Does this kind of cookie work anymore at all with browsers who use restrictive rules for third party cookies? (like Safari)


Eh, for links to content on my website I just cooked up my own URL shortener using Apache rewrite maps and a little scripting to generate the short codes. Simple, private, and entirely under my control (which also means I don't have to worry about the links breaking).
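
A minimal sketch of that kind of setup (the map path and the /s/ prefix are made up for illustration; RewriteMap is the actual mod_rewrite directive):

    # httpd.conf / vhost context; shortlinks.txt holds "code long-url" pairs, one per line.
    RewriteEngine On
    RewriteMap shortlinks "txt:/etc/apache2/shortlinks.txt"
    # Redirect /s/<code> to the mapped URL; unknown codes fall back to /404.
    RewriteRule "^/s/(.+)$" "${shortlinks:$1|/404}" [R=302,L]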


I did that for a while with a short domain I used to own (urlb.at). Then ended up regretting it and shutting it down.

I eventually decided that URL shorteners were a terrible idea for the web and that I wanted the 'actual' URLs out there.


> Then ended up regretting it and shutting it down.

Care to elaborate?


I assume because it creates/introduces an arguably unnecessary point of potential future failure.


Also possibly because URL shorteners are frequently abused, e.g. to obfuscate links in spam. Operating one responsibly is a considerable amount of work.


I'm an avid tinyurl user. Anyone from that site want to explain their justification for this before I stop using your service?

What's a good alternative (with the ability to tailor the shortened url)? I wouldn't mind paying a couple bucks a year.


Take a look at https://t.ly/ as an alternative to tinyurl. You can update the url ending on the $5 a month plan. It’s a shorter domain with more options available.


I noticed that share buttons like ShareThis and AddThis do it also. I bet if you look deep into their privacy policy (which no one does) it'll vaguely mention their data acquisition and "monetization" usage.


The worst kind. These ones will outright share your social profile data with advertisers (and profit from it).


Maybe cookie stuffing for incentive payments? E.g. it's why those coupon sites make a ton of money even though they have zero to do with purchase intent (someone searching for a coupon is already in their cart, about to buy).


This post is a clear example of why the cookie law is an overreach. If you don't want websites setting cookies on your browser, why don't you configure your browser not to save cookies?


His GDPR letter is quite well written, too

https://ylukem.com/files/_viglink-gdpr-email.png



Cookie dropping is more common than people realize.

I've created a completely free service with no ads that also generates QR codes (https://qrli.to).

The problem with URL shorteners is usually the abuse they get (from the affiliate tracking above to MLM or CPL for dating sites). However, the barrier to entry is so low, and shorteners are still such a relevant part of the infrastructure, that I'm not surprised Bitly and TinyURL are monetizing this way.


Another reason I use a publicly curated HOSTS file (search GitHub for "hosts file" for examples), even if it is a little annoying that those links break.
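
Those lists are just plain hosts-file lines; blocking a shortener by hand looks like this (domains picked from this thread as examples):

    # /etc/hosts entries: resolve shortener domains to nowhere
    0.0.0.0 tinyurl.com
    0.0.0.0 bit.ly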


Are these cookies being caught and blocked/discarded/etc by Safari Intelligent Tracking Protection on macOS Big Sur and iOS 14?


Is it possible to get rid of tracking by disabling the third-party cookies in browser?


Pls pardon my ignorance.

Is this not addressed by blocking all 3rd party cookies at the Browser ?


Why do url shorteners even exist? They literally add no benefit whatsoever.


Simpler qr codes that can be read by your phone at a greater distance or with more error correction.

Text messages where going over a character limit adds to the cost


For example, I can literally type the short URL into a browser when using a different device. Its convenience cannot be overstated.


Malicious links can get warning pages instead of direct redirects; on the other hand, malicious URL shorteners can change URLs after they were promoted.


Not everything supports HTML, like calendar invites and SMS messages, and some of these things have character limits.


Is yourls.org an alternative? It requires some work, though.


The title should be "TinyURL sets ad tracking cookies", as that is the only service proven to do so in this article.

There are tons of URL shorteners, and not all of them do this.


bit.ly and t.co both do, and they're hugely popular. I just left the HTTP responses out of the post for brevity. From the post:

>While neither redirect you to an advertising company like TinyURL, Twitter’s primary business model is advertising, and bit.ly’s privacy policy says they share data with third parties to “…provide advertising products and services…”

Both services set long-lived tracking cookies:

    curl -v 'http://bit.ly/aFzVh0'
    ...
    < Location: http://nymag.com/daily/entertainment/2010/08/hear_katy_perrys_milk_milk_lem.html
    < Set-Cookie: _bit=l03lLp-b899a3350a02095760-00P; Domain=bit.ly; Expires=Fri, 02 Jul 2021 21:47:25 GMT

    curl -v 'https://t.co/45cMiYOHQ8'
    ...
    < location: https://luke.cat/
    < set-cookie: muc=6d0d0800-f738-4704-b292-f03b6e5a5f91; Max-Age=63072000; Expires=Tue, 03 Jan 2023 21:49:09 GMT; Domain=t.co; Secure; SameSite=None


Not my personal url shortener.


Of course they do? How would they make money otherwise?


Consider "commoditizing the complement" (https://www.gwern.net/Complement) e.g. a news site making their content linkable through social media for ad revenue at the actual page.


Wow never heard of that, thanks!

This is one of the thousand reasons that I don't think capitalism will be viable beyond 10-20 years from now. The endgame will be perfect monopoly: one global player in every niche of our daily existence, slowly force-feeding us a diet of whatever is most profitable (whatever service encompasses the most dysfunction in exchange for money).

Off the top of my head, a better system might be one that seeks to eliminate dysfunction instead of profiting from it. Web browsers could provide short links to all websites by using a hashing function instead of an encrypted refcount. They could remove as many identifying bits as possible (like cookies). I like the direction that Apple and others are going, preserving less user data and letting less spill between unrelated websites.

The question of what all these advertisers will do once they're not allowed to track us is a big one. But my guess is that targeted advertising is not needed in the first place. They did just fine (arguably better) with demographics in the centuries before tech revealed our personal browsing histories.


> This is one of the thousand reasons that I don't think capitalism will be viable beyond 10-20 years from now.

Hmm. You posted this from your phone or computer that was created by capitalism, from an OS created by capitalism, using a browser created by capitalism, to a message board for an organization who literally specializes in capitalism. While the original incarnation of the internet wasn’t created by capitalism, military funding and the inherent authoritarianism is probably not the ideal direction to return to. Yet you think all of this only has 10-20 years left?

Oddly, you express a preference for what Apple are doing instead, yet they are the single largest product of capitalism or any other economic system that the world has ever known, including Saudi Aramco. Capitalism just “cured” a pandemic faster than anyone thought possible.

Now, it’s not without its issues, but all of the evidence seems to suggest that we maybe ought to think twice before abandoning it and probably killing hundreds of millions of people (again).


> You posted this from your phone or computer that was created by capitalism, from an OS created by capitalism, using a browser created by capitalism, to a message board for an organization who literally specializes in capitalism.

... that are all based on centuries of research, science and technological development that happened before capitalism was even first proposed. Your point being?


Ah yes, “you dislike Society yet you contribute to it in some way, I am so smart”.

The classical Sciences and Arts were all founded and developed under “divinely ordained” Monarchies. I suppose that would’ve been a fantastic case for conserving that system for you?

Have you thought that maybe all those material accomplishments made under capitalism have less to do with the system itself and more to do with the fact that it's the only one around? Pretty sure much of today's tech is founded as much on innovation that came out of Soviet labs as anybody else's.

Also, incidentally, current day capitalism is at the beck and call of one of the last remaining communist countries. Just a curiosity.


> “you dislike Society yet you contribute to it in someway, I am so smart”.

Not even close to what I said. I didn’t suggest that he contributes anything to society.

> Have you thought that maybe all those material accomplishments made under capitalism have less to do with the system itself and more to do with the fact it’s the only one around? Pretty sure many of today’s tech is founded as much on innovation that came out of Soviet labs as anybody else’s.

It’s (mostly) the only one around because the others all failed spectacularly every other time. Not only did states collapse, but about 100 million people died. It’s amazing that you’d use the Soviet union as an example, considering where they ended up.

> Also, incidentally, current day capitalism is at the beck and call of one of the last remaining communist countries. Just a curiosity

China is the least communist of the remaining communist countries. And do you happen to know what major change allowed their GDP to explode and make them soon-to-be the biggest economy in the world?

Even ignoring that, do you really want to live somewhere like China? If you think poverty and working conditions are bad in the US, just you wait!

Unless you meant one of the other examples, like Cuba, North Korea, Vietnam or Laos. I’m guessing not.


How is this news in 2021?

I remember going to talks by tech people at link-shortener companies (bit.ly, IIRC) in like 2012 where they were talking about all the fancy analytics and tracking they offered and why it was so great that you should route all your links through them to get more "insight" into visitors.



