One of the major vectors for fingerprinting is font rendering. It's surprising how tightly integrated with the OS font rendering is, considering it's something that could be done completely in userspace for most applications.
I spent a long time integrating libpango/cairo (http://www.pango.org/ and https://www.cairographics.org/) into a semi-toy OpenGL game engine I wrote which had a Windows target, and was baffled at the level of complexity/interdependency required to build it. Gecko (Firefox's rendering engine) maintains a fork of cairo https://github.com/mozilla/gecko/blob/0ea4c5812c2adecbed1d84... , which is, as far as I can tell, the only way to statically link it. Both these libraries require GLib.
There isn't an alternative minimalist open-source font rendering library, as far as I'm aware, simply because handling all font-rendering edge cases in all languages is such a gargantuan task. So we're stuck with a huge, intertwined set of poorly understood C libraries, tightly integrated into the system and every application, and perpetually exploitable.
The real problem is that people get used to how their operating system renders fonts, and different operating systems render fonts very differently. The difference between how macOS and Windows render fonts is almost philosophical. macOS does not hint fonts at all in an attempt to preserve a font's character at small sizes, but Windows applies aggressive hinting (and no anti-aliasing) in the Y-direction to make fonts sharper and more readable, even if it totally changes the shape of the letters.
On Windows, at least, fonts can also differ between monitors (even on the same computer). The built-in ClearType tuner tool lets the user adjust the font rendering gamma for each monitor individually. It would be a shame if this didn't have an effect in the browser, where users probably do most of their reading these days.
If you made your own simple from-scratch font rendering library that rasterized glyphs to the same array of pixels on every operating system, regardless of user settings, I guarantee your users would be upset. When you render a lowercase 'e' to a 6x6 pixel grid, any slight difference in rendering makes a huge difference to the character of the letter, and it's the sort of thing that contributes to the "uncanny-valley" feeling that non-native apps have. This is why Pango and Firefox go through so much effort to use the native renderers on each platform and why they may seem complicated to an outsider.
macOS does have an autohinter, but it only makes very slight changes. More important is the stem darkening (which Apple calls font dilation), which distorts outlines to make fonts bolder at small sizes.
Windows actually has two font rasterizers: GDI, which lived in the kernel prior to Windows 10, and DirectWrite. GDI does no vertical antialiasing, but DirectWrite does. Both do vertical hinting, but only GDI does horizontal hinting. (With ClearType on, GDI applies a hacky subset of horizontal hints--Beat Stamm has documentation on it on his Web site.)
FreeType, naturally, can do all of the above (except the GDI hacks), depending on how you configure it.
I think most people don't really care that much how their fonts are rendered, or would even notice a change. (And no, I don't have data to back that up)
edit:
The emphasis here is on "most". In a world where an OS will have many millions of users, at least more than a few are going to care a lot about things like the differences between font renderers; programmers and graphic designers are no doubt among them. But the typical user? Without real data to back it up, count me as skeptical.
If anecdotal evidence counts, I've heard lots of complaints about font rendering, even from people in real life and from non-technical people. (I like the fonts on X. Why does Y look so bad? Why can't it just look like it does in Word on Windows?)
It turns out that if people spend most of their lives/jobs looking at text that appears a certain way, they will get accustomed to it. If these people are using LoDPI monitors where they can perceive each individual pixel, you'd better hope your custom font renderer replicates your users' favourite renderer down to the exact pixel layout of each glyph.
I don't think you need to be accurate to the pixel level, unless you're targeting bilevel (black and white) rendering. (This is empirical: nobody who I've tested my custom font renderer with has been able to tell the difference between it and the OS renderer, once the same features are enabled.) But you do need to do everything that the OS does: gamma correction, TrueType hinting, subpixel AA, autohinting, stem darkening…
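To illustrate why gamma correction alone is visible: blending a glyph's antialiasing coverage in gamma-encoded space versus linear light gives noticeably different pixels. A rough sketch, using 2.2 as a common sRGB approximation (not any particular OS's actual value):

```javascript
// Sketch: gamma-correct blending of a glyph's antialiasing coverage
// value against a background, versus the naive gamma-ignorant blend.
function blendGamma(fg, bg, coverage, gamma = 2.2) {
  const lin = (v) => Math.pow(v / 255, gamma);                  // decode to linear light
  const enc = (v) => Math.round(Math.pow(v, 1 / gamma) * 255);  // re-encode
  return enc(coverage * lin(fg) + (1 - coverage) * lin(bg));
}

// Naive blend in gamma-encoded space, for comparison.
function blendNaive(fg, bg, coverage) {
  return Math.round(coverage * fg + (1 - coverage) * bg);
}
```

A half-covered pixel of white text on black comes out around 186 with gamma correction but 128 without, i.e. visibly lighter stems, which is exactly the kind of stable per-pixel difference a fingerprinter can hash.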
Yeah, that's fair. I think there's a "Windows XP" preset for Infinality-patched FreeType and it gets pretty close to what Windows does. Its rendering of 9pt Segoe UI is near identical to Windows', except for slightly softer font smoothing, but this doesn't seem to be annoying for Windows users (it's probably less than the difference between GDI and DirectWrite.)
I heard of this because a friend switched from Windows to GNOME 3 a little while ago and told me about the Infinality preset. He seemed to be very happy with it.
At University I used to do lighting design for theatre (still do occasionally). I put lighting design in the same category as good typography, and the difference between 100ms and 300ms page load time.
After most shows, 90% of the audience couldn't tell you anything about the lighting, but good lighting absolutely lends to the overall experience. The right use of colours and angles in a scene can really add to both the immersion (in the sense that you believe in the setting) and the mood (in the sense that it adds to the tone of the scene).
We have data on fast page load times - we know that it makes people more likely to use certain kinds of site. They won't know why though. I am pretty sure that the same is true with font rendering, kerning etc.. Done right it makes for a pleasant reading experience, done wrong (or inconsistently) an unpleasant one, and most people wouldn't be able to say why.
You're getting a lot of downvotes, but while I think you're mistaken (I do believe most people care), it's an easy fact to miss because most people don't know or notice font rendering. Which is where caring comes in - people tend to care about change (and not particularly like it).
Nobody cares about how things are done as long as they're consistent, but once it changes in a way that's visibly perceptible to most mainstream users (and it actually is, you might be surprised), then they do care, and they tend not to like change.
I find it odd that I would be down-voted so many times merely for expressing skepticism about the proposition that the typical user is sensitive to the differences in font rendering between platforms, so much so that they would complain loudly and en masse if it were to change.
Obviously some people care about it. But most? Show me the data.
Though information leakage via font rendering is very much a solvable problem, another major vector for fingerprinting is WebGL: https://browserleaks.com/webgl
Since graphics rendering is tied so tightly to the hardware, I can't really imagine a scenario where all fingerprinting leaks from WebGL can ever be plugged.
If you need a minimalist open-source font rendering library, look at stb_truetype:
https://github.com/nothings/stb
A single header file.
Unfortunately it cannot do what HarfBuzz or Pango do, that is, shape complex scripts or perform any text layout.
Thanks, this project is really cool! Layout is a major difficulty though -- no one wants to use all-monospace or forgo linebreaks, or to have to code that logic by hand.
Perhaps Firefox could add noise to its canvas rendering; noise that would change on each render. This might make it more difficult to establish a reliable fingerprint, at least without acquiring multiple samples to average away the noise.
I might add that the noise doesn't necessarily have to always come from the same distribution. The more the noise changes distributions, the harder it would be to use sample averaging to extract the signal, I think.
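A minimal sketch of the pixel-level idea. A real implementation would hook canvas readback (getImageData/toDataURL) inside the browser; this just perturbs a pixel buffer with per-channel jitter so each readback hashes differently:

```javascript
// Sketch: add +/-1 noise per color channel before pixel data is exposed
// to scripts, so repeated readbacks of the same canvas differ.
function addReadbackNoise(pixels) {
  const noisy = new Uint8ClampedArray(pixels.length);
  for (let i = 0; i < pixels.length; i++) {
    // Leave alpha alone (every 4th byte) so transparency is unchanged.
    const jitter = i % 4 === 3 ? 0 : Math.floor(Math.random() * 3) - 1;
    noisy[i] = pixels[i] + jitter; // Uint8ClampedArray clamps to 0..255
  }
  return noisy;
}
```

Two readbacks of the same canvas would then produce different hashes, while the image itself stays visually indistinguishable.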
> There isn't an alternate minimalist open-source font rendering library as far as I'm aware...
You can use Freetype in conjunction with libagg. It will lack the more sophisticated multi-lingual layout features Pango provides but is doable for portable font rendering.
It would be interesting to make a fork of Pango that is independent of Glib or, more realistically, uses a stripped down Glib with minimal dependencies.
I find it demeaning to call other people’s hard work “poorly understood” just because you don’t understand it. The open source world is large enough that you’re bound to encounter projects written in a language you might not be familiar with, or written in a style you consider old-school. Grow up and have some respect for the software we are using every single day.
I don't get why the Battery Status API was dropped entirely instead of being "castrated"... by default, allow clients to detect only <5%, <25%, <50%, <75%, <=100% (or simply <25% & >25%) so that e.g. Unity or other high-performance apps can say "do you really want to play Unreal Tournament with low battery".
This could also be used by ads or stuff like background monero miners to prevent killing the user's battery...
> There probably is not much of a difference for a website if user is low in power than if it is low in memory. There should be then just one flag/event really: low resources (catch-all for: battery, memory, CPU time, storage etc.). But browser could just emit please-save event and give a second or so to complete. Then decrease timer intervals significantly.
Even if I have 100% battery, I don't want any of it going to ads or miners. For anything else, the OS already bugs the hell out of users when battery is low.
For me, I'd rather have a miner than ads. Ads are obnoxious, some of them actual malware - rid me of all the porn, viagra and tracking ads and I'll happily run a miner for you.
It does waste electricity, yes, and that's, to get back to the topic, something that a well-designed Power API could help keep in check.
Still, I prefer giving a "mining" contribution to a website operator (who actually needs some kind of revenue to live) over watching obnoxious ads and having my every move end up in dozens of tracker databases, in addition to hackers distributing malware over ad networks.
I prefer blocking them so bad they go out of business. If you can't pay to host your website then you don't have a viable business model and shouldn't be on the internet in the first place.
I am not indebted to anyone to allow them to use my machine's resources for their monetary goals.
If they want to make money off of me, make it clear -- I can respect that if I value your content. But exploiting browsers that are not caught up with latest security practices is being a scum and I will not tolerate it, ever.
This reasoning could be extended (and maybe it should be) to remove APIs/headers allowing detection of OS/browser version (maybe Mac users are willing to pay more?), screen size (bigger monitor = more disposable income?), etc. For all of them there are tradeoffs, but it would be an interesting exercise to think about.
I really doubt advertisers will make their ads less resource-hungry (and therefore less noticeable/appealing) when they detect low battery levels just because it would be better for the user (albeit worse for themselves).
Actually, it's trivial to prove advertisers will ignore it: If the API actually became popular, ad blocker plugins would override it to always return the low battery status. Ad networks would respond to that by ignoring it. And now it's a zombie API.
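For instance, a privacy extension could have shimmed the API with a few lines. A sketch; `nav` stands in for `window.navigator`, and the constant values reported are arbitrary:

```javascript
// Sketch of how an extension might have shimmed the (now removed)
// Battery Status API: always report a low, discharging battery so
// scripts learn nothing real about the device.
function fakeBatteryStatus() {
  return {
    charging: false,
    level: 0.1,               // always "10%", regardless of reality
    chargingTime: Infinity,
    dischargingTime: 3600,
    addEventListener() {},    // swallow levelchange/chargingchange listeners
  };
}

function installBatteryShim(nav) {
  nav.getBattery = () => Promise.resolve(fakeBatteryStatus());
  return nav;
}
```

Once enough users run such a shim, the signal is worthless to ad networks, which is the zombie-API dynamic described above.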
The moral of the story is that Web API development is now driven at least partially by "How will advertisers abuse this new API?", and probably will be moreso over time, and that has, ahhh, interesting interactions with the forces pushing the browser into being the universal VM platform.
Defense in depth; you can't count on ads getting blocked so you also work to make the environment as hostile to them in other ways as well. If you've got something as easy as forging a single number, you take it for the ads that may still get through.
The use case is for kiosk style applications to be able to give warnings to the user. Or for other apps to take extra precautions such as saving copies of dynamic state to non-volatile locations in case the power fails completely.
Though a battery percentage is not really useful for this, as it could mean many things. My current main phone at 25% could last until tomorrow quite easily; my previous main device had at most a couple of hours left at that point.

A simple "warning level" API would be more useful, allowing the app to register for events like "less than one hour of power left", "power loss likely in the next ten minutes", "entering super-power-saver mode" and "manual shut-down starting", and/or poll for the same flags. That way the device, which is much better placed to monitor such properties, can manage the warning levels, perhaps with an off switch in case users don't want apps to know.

Of course, you might have to enforce limited resource use in response to low power events, and you shouldn't allow much time between emitting "entering super power saver" events and actually entering the mode (which will shut off browser tasks until the mode is left).
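A sketch of what such a warning-level classification might look like. All event names and thresholds here are invented for illustration; no browser exposes anything like this:

```javascript
// Hypothetical "warning level" API sketch: the device maps its own
// battery model (time remaining, power modes) to coarse events instead
// of exposing a raw, fingerprintable percentage.
function powerWarnings({ minutesRemaining, superSaver = false, shuttingDown = false }) {
  const warnings = [];
  if (shuttingDown) warnings.push('shutdown-starting');
  if (superSaver) warnings.push('entering-super-power-saver');
  if (minutesRemaining <= 10) warnings.push('power-loss-likely-10min');
  if (minutesRemaining <= 60) warnings.push('less-than-one-hour');
  return warnings;
}
```

The point is that the app only ever sees a handful of coarse states, which carries far less fingerprinting entropy than a percentage plus discharge time.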
So while there is a valid use case, one that doesn't rely on trusting advertisers or other hogs to do something nice, the API as it stood wasn't fit for that purpose, and making something that is might be more complicated than you'd first think (so perhaps not worth the effort for the few apps that would actually make use of it).
> Or for other apps to take extra precautions such as saving copies of dynamic state to non-volatile locations in case the power fails completely.
The system can crash, hang, forcefully reboot or otherwise die at any moment, without warning. All it takes is one "oopsy", for the phone to hit the ground and the battery to fly off.
If there is important data that needs to be preserved, applications should already be committing it to non-volatile locations periodically.
Perhaps because there isn't a clear use for it other than fingerprinting? Battery percentages don't necessarily mean much. More useful would be a power-saving flag.
Endless nags don't do end users much good. Battery status is a highly visible UI element on mobile. We don't need extra nagging from within an application. Also even less granular reporting is still an element to perform fingerprinting on.
App developers should already be designing for these conditions by default. You should already have a game save system that can handle client shutdown or disconnection at any time.
That said, I could see a 15% or less warning but your mobile OS is doing that anyway, so why bother.
Earlier today I tested Safari, Chrome, and Firefox on my Mac using the Panopticlick site. The only browser that passed the test was Firefox in "Private" mode.
Both Safari and Chrome failed even when using "Private/Incognito" mode.
Randomizing your useragent is the worst idea ever.
Your fingerprint won't conform to the features that are expected. I won't go into too much detail here, on purpose. But e.g. if you claim to be Firefox while using Chrome, that's completely detectable server-side, even without JavaScript.
If JavaScript is available, it's of course even easier to just check whether your browser supports the features it should according to its user agent.
Edit: Btw. I have actually reversed some fingerprinting JS code in the wild, and yes, they do check if you lie about your useragent.
HTML5 is a living standard, so naturally different browser versions will have different features implemented.
Also, there's the thing where you can conform to a spec but it's still possible to tell a difference in implementation by observing certain things.
For example, the non-cryptographically-secure random number generators in Chrome and Firefox used to be different algorithms. So if you seeded the RNG with a number and called it, you used to get different results in Firefox and Chrome, and you could tell if the user agent was lying. (Actually it was a bit more complicated than that; you can't seed Math.random in JS, but you get the point.)
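Most scripts don't even need the RNG games; they just cross-check the claimed UA family against observable engine quirks. A minimal sketch (the observation flags are illustrative, real scripts test far more properties, and the `window.chrome` heuristic is just one well-known example):

```javascript
// Sketch: compare the browser family claimed in the User-Agent string
// against runtime observations. `observed` would be gathered from the
// live page (e.g. hasChromeObject = typeof window.chrome !== 'undefined').
function uaLooksForged(userAgent, observed) {
  const claimsFirefox = /Firefox\//.test(userAgent) && !/Seamonkey/i.test(userAgent);
  const claimsChrome = /Chrome\//.test(userAgent) && !/Edg|OPR/.test(userAgent);
  // Firefox has historically never defined window.chrome; Chrome always has.
  if (claimsFirefox && observed.hasChromeObject) return true;
  if (claimsChrome && !observed.hasChromeObject) return true;
  return false;
}
```

A randomized user agent fails checks like this almost immediately, which is why it makes you more identifiable, not less.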
This isn't about sites needing that info to work correctly. This is what sites do for fingerprinting.
I wonder when Safari will follow this example, given the efforts they're taking to make their browser more privacy-focused where it sorta counts: stopping the different ways advertisers spy on you. Now that Flash is on its way out the door, it's up to the DOM to be abused by advertisers. I think a safe bet is to disallow canvas manipulation from "Third-Parties" as a good first step, maybe by default?
The Canvas API allows fingerprinting precise enough to, by itself, create a unique identifier for a user.
However, fingerprinting doesn't require any single magic API; it will almost certainly use a collection of techniques to narrow down users. If they know your browser version, default language, and screen resolution, those three alone can narrow you down by many orders of magnitude.
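To see how the narrowing works, here's the usual entropy arithmetic in sketch form. The example fractions are made up; real per-attribute frequencies come from measurement studies:

```javascript
// Each attribute shared by a fraction p of users contributes -log2(p)
// bits of identifying information; bits add up (roughly) if the
// attributes are independent.
function entropyBits(fraction) {
  return -Math.log2(fraction);
}

// Hypothetical example: how common each of your attribute values is.
const signals = {
  browserVersion: 1 / 20,    // 1 in 20 users share your exact version
  language: 1 / 10,
  screenResolution: 1 / 50,
};

const totalBits = Object.values(signals)
  .reduce((sum, p) => sum + entropyBits(p), 0);

// On average, only 1 in 2^totalBits users matches all three values:
const anonymitySetDivisor = 2 ** totalBits; // here 20 * 10 * 50 = 10000
```

So three individually harmless attributes already shrink your anonymity set by a factor of ten thousand, before canvas or WebGL ever enter the picture.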
And they are adding more APIs like Device Memory which, while not as precise as a canvas fingerprint by itself, could probably whittle it down an order of magnitude further.
Apparently turning on "Do Not Track" has long been known as a way to single you out on the web. You'll help them narrow you down by about three orders of magnitude. It was a silly idea to begin with.
> Apparently turning on "Do Not Track" has long been known as a way to single you out on the web. You'll help them narrow you down by about three orders of magnitude. It was a silly idea to begin with.
Wouldn't that be solved if FF turned DNT on by default?
It would also make it even more useless. IE originally had it default on; they turned it off after people complained that it meant advertisers wouldn't honour it anymore.
Well it was useless to begin with anyways, as it's just kindly asking ad-publishers to please not track you, which is the closest thing we have to the platonic idea of naivety.
It's not useless in the context of a legislative requirement to honor it, which was the original goal that DNT was supposed to drive to. Since we now have pretty much 100% regulatory capture in the Internet space that of course is now dead.
> It's not useless in the context of a legislative requirement to honor it, which was the original goal that DNT was supposed to drive to. Since we now have pretty much 100% regulatory capture in the Internet space that of course is now dead.
It was never really likely to happen, especially when even advocacy groups like the EFF opposed legislative approaches to enforcing DNT.
DNT is meant as an explicit opt-out. A unified way for users to expressly say that they do not want to be tracked, if possible. Users actually going into the settings and flipping that switch makes it clear that it's their request and therefore could have given it legal bearing. (Exceptions from that for functionality that requires tracking, like online shopping, would be given implicitly by using that functionality.)
Unfortunately, Google, Facebook and Microsoft managed to kill that effort off.
Microsoft by doing exactly that, they default-enabled it in Internet Explorer, taking away any legal bearing. And Google+Facebook came out saying they don't give a fuck, they'll track users anyways, and then with something around 2/3 of all bigger webpages having Google Analytics deployed and some other ridiculous percentage having a Facebook Like-Button, there were essentially no webpages left that could have come out and said that they want to respect it in some fashion.
How can that be? If 100 users buy a specific MacBook or ThinkPad they have identical hardware. Or do different revisions of the same chip have different rendering? Windows and browsers force updates at about the same schedule on everyone.
I'd expect it to narrow you down to hardware/OS, perhaps driver (though most users use whatever Windows or the OEM throws at them). How can it do more?
If 2 people buy a new MacBook and run a WebGL or Canvas fingerprint, why would they have different results?
I do not believe it is unique at all. It will be unique for a hardware-OS-driver-browser combo. And if you are not doing anything strange, those numbers will be pretty common (most users just use whatever OS/driver package ships, and the browser gets upgraded like everyone else).
The sad part is that this requires a sort of herd immunity to be effective: if 85% of users are still running chrome from within the Facebook app, the added anonymity provided by this technology is (if I understand correctly) less effective at preventing tracking.
Even worse; by default this setting is disabled even for Firefox users. If 95% of users don't enable `privacy.resistFingerprinting`, then that setting is far less effective for the 5% of users who do.
This isn't true at all. If you know how to use Chrome right (with the right flags / startup options [1]) it is a way more private and more advanced browser than FF. All it takes is some reading and some patience (but you get useful knowledge in return).
Security and privacy is just a marketing gig for Firefox in my personal opinion. Not truly reflecting the reality.
Additionally you can use ungoogled-chromium to take it to the next level.
> Fingerprints use information that’s gathered passively from your browser such as the version number, operating system, screen resolution, language, list of browser plugins and the list of fonts you have installed.
Why is any of this information, apart from the screen resolution and the language (assuming it's the Accept-Language header's contents), accessible to web pages anyway?
Remember when HTML first came about? The whole idea was that the browser would re-flow everything to fit on whatever resolution you had. Lots of sites hired writers, editors, and design people from the print world and tags were abused and added so those people could control the layout like they were used to doing instead of adapting to a better way. SVG scales and doesn't require the site to know your resolution.
We've really screwed up by allowing the browser developers to pander to site developers and give them access to the kingdom. Now that we can all see the folly in that, it's time to reverse the trend for the next 20 years.
Agreed; it seems to me Firefox doesn't need to tell every site that I'm using "X11; Ubuntu; Linux x86_64; rv:56.0" right in the request header. That's surely a big help to fingerprinters.
Perhaps the sites could figure it out anyway with more effort, but at least force them to make the effort.
The list of browser plugins was useful for knowing what to serve the client. I think the list of fonts was IE-only, although you could try to detect them by setting the font of a piece of text and checking whether the size of the rendered text changed.
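That width-measurement trick is easy to sketch. Here `measureWidth` is a stand-in for rendering a string and reading back its width (canvas `measureText` in a real page), injected as a callback so the logic is shown without a browser:

```javascript
// Sketch: detect an installed font by comparing the rendered width of a
// probe string in "candidate, fallback" against the fallback alone. If
// the candidate is installed, the widths differ for at least one fallback.
function detectFont(candidate, measureWidth) {
  const probe = 'mmmmmmmmmmlli'; // wide + narrow glyphs amplify differences
  const fallbacks = ['monospace', 'serif', 'sans-serif'];
  return fallbacks.some(
    (fb) => measureWidth(probe, `'${candidate}', ${fb}`) !== measureWidth(probe, fb)
  );
}
```

Run over a list of a few hundred candidate font names, this yields the "list of fonts you have installed" signal mentioned above, no plugin API required.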
The version and OS are part of the User-Agent, and the idea was also to allow the server to send the most appropriate/compatible version of the page.
Is that for FF or your specific user agent? I always thought Panopticlick wasn't being fair there. Browsers update all the time, so your agent is changing every 6 weeks. If they are comparing your UA of today against UAs from their entire project, it won't be remotely accurate.
For the User Agent, actually. And my test with Chrome gave me a similar score.
Yes, agents change relatively rapidly, so unless trackers have a model predicting User Agent changes, a tracking profile is only good for so long.
Still, I suppose that one could either try to find the most common User Agent string, and set your browser to that, or have it change somewhat stochastically (but only enough so we don't compromise legitimate uses)
This "breaks" some libraries with a legitimate use of reading the canvas data, like favico.js[0] (although arguably, the only reason favico.js has to read the canvas data is because the favicon API is stuck in the nineties)
That's how I feel about most browser "features" these days. What ever happened to plain old static sites? Why does everything need little floating and collapsing stuff all over? And fade-in imagery and all that nonsense.
I agree, the only reason favico.js needs that is because favicons can only be set through a link, which means you need to pass a data URL, which means you have to read the data out of a canvas.
I don't see many other situations where that applies.
This seems like a hack of a solution that breaks more than it solves.
I think the "correct" solution would be for canvas text rendering to be better specified in the standard such that it couldn't be used for fingerprinting.
We'd all benefit from standardized canvas text rendering in the long run.
Will extensions be able to ask for the permission at install and then be able to use it in the background without having to explicitly ask each time?
I ask because it's useful to be able to draw an image to canvas to then extract image data without having to parse the file format in javascript. Having to pop up a permission dialog each time would simply annoy the user.
Speaking for myself, as someone who until recently was an FF-only user: FF needs to up its embeddability game. Many people are using Chromium outside of Chrome, and that number is growing as more and more custom UIs are built around existing browsers. Some apps use things like Electron and some use things like CEF. Servo has a stated goal of embedding via CEF, but it's not here yet.
I think if Gecko had an easy to use C API and prebuilt binaries to download to link with your app, you could see an uptick in FF usage. I'm familiar w/ past attempts/work at doing this, but it needs to be real, supported effort. Until then, I remain a Chrome (...er...Chromium) user (of course, I am not representative and maybe the ROI on embedding isn't worth it).
I'm using Chrome on my personal laptop running Debian 9, since Firefox is still a minor pain on Debian. The "firefox-esr" package is based on version 52, and Firefox only became usable again (IMO) around 56.0 or so, which is when I switched on my Windows-based work laptop. The alternatives to firefox-esr are OK but a bit fiddly: extract a tar.gz somewhere and add it to PATH, or muck around with symlinks, or install via apt using an unsupported third-party (Ubuntu) package source.
That's my excuse, not great but switching is on my TODO :-)
Personally, I keep the ESR-package installed, which means the .desktop-file and icon and such continue to be in their correct places, and then I just override the firefox-command via the PATH. So, in steps:
1. Download the tar.bz2 off of Mozilla's webpage.
2. Unpack it into a directory where you want to keep it. Make sure your user has write permissions to that directory, otherwise the auto-updater won't work.
3. In your PATH, place a file named "firefox" with this content:
#!/bin/bash
/path/to/where/you/unpacked/firefox/firefox "$@"
4. Make the file you just placed in your PATH executable (chmod +x /path/on/your/PATH/firefox).
(I'm actually not on Debian, for me it's rather that I replace Firefox Stable with Firefox Nightly, but I don't think this makes a difference.)
Firefox 57 on Windows 10 still has performance problems. JS-heavy sites like Gmail and ElixirForum, for example, drag it down to 100% CPU usage (on 2 cores even, not 1!), so while I am glad to use it for a lot of browsing, it's still not quite there as a full-on Chrome replacement.
It's not a personal choice, it's a moral choice. You are contributing to making the web a shittier place for me, for everyone by using Chrome.
Whether you give a fuck about me being angry at you for that is of course a different story, but there's no question about people being allowed to be angry at you for it.
One example: applying an effect to an image using custom JS.
I also once used this as a fun way to arrange the repeated string "FILE NOT FOUND" in the shape of the characters "404", using the system's native font.
I've used it in the past to prerender components on a secondary canvas, before combining them together on the main canvas. There might be a better way of doing this though.
It sounds like the major sources of variations in canvas behavior are the OS and the browser. So why would results of using this canvas method not correlate strongly with just using user agent string?
> Subtle differences in the video card, font packs, and even font and graphics library versions allow the adversary to produce a stable, simple, high-entropy fingerprint of a computer. In fact, the hash of the rendered image can be used almost identically to a tracking cookie by the web server.[1]
Surely not all browser version numbers are required for a site to optimize rendering. Surely some versions could be collapsed? Too much info is just given away.
I applaud Firefox and Mozilla. More can be done though.
Could you defeat canvas fingerprinting by adding a bit of subtle random noise to the canvas every time it is rendered to an image before sending off to the server for fingerprinting?