Hacker News
Tor at the Heart: Firefox (torproject.org)
416 points by nachtigall on Jan 3, 2017 | 71 comments

Why doesn't Tor Browser just automagically run a read-only lightweight Linux VM whose only program is Firefox, and whose only network connection is proxied through Tor? Seems like that would solve almost every fingerprinting and sandbox escape vulnerability.

That shares your X11 session, which is a security nightmare. If you're running an untrusted application, you should absolutely not give it access to your X11 session.

(There are other reasons you might want to have a Tor browser running inside a container, but if the main goal is to nullify fingerprinting and sandbox exploits, you're better off just using an actual VM).

Wayland is less of a security nightmare, how would it work with Wayland?

Seems OK until you want to upload a file somewhere.

That doesn't seem like a particularly hard problem to solve.

See Qubes, Whonix, Tails, Subgraph etc.

I'd like to see the canvas fingerprinting dealt with in Firefox mainline, it's used everywhere.

Fingerprinting (in general) is the next thing on the agenda after First Party Isolation. Addressing canvas fingerprinting is in the plan:



How accurate is canvas fingerprinting? And could it be used in the courts (which would seem a pertinent question for many Tor users)?

Someone else can answer better than me, but https://panopticlick.eff.org claims the canvas fingerprint provides 17 bits of identifying information (click detailed results after testing)

On mine, the System Fonts give 17 bits of info (1 in roughly 200,000 computers). User-Agent is next with only 8 bits.

Based on that alone, it seems that just replying back with either a blank font list or the minimal standard font list (e.g. only Times & Arial) would solve most of this problem.

I'd love to see the Firefox team fix that first.
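For a sense of scale, here is a small sketch converting between "bits of identifying information" and "1 in N" odds. The helper names are made up for illustration; the only inputs are the figures quoted above.

```python
import math

def bits_to_one_in(bits):
    """Number of equally likely values that `bits` bits can tell apart."""
    return 2 ** bits

def one_in_to_bits(n):
    """Bits of identifying information in a trait shared by 1 in `n` users."""
    return math.log2(n)

print(bits_to_one_in(17))                 # 131072
print(round(one_in_to_bits(200_000), 2))  # 17.61
print(bits_to_one_in(8))                  # 256
```

So 17 bits is exactly 1 in 131,072, and "1 in roughly 200,000" is closer to 17.6 bits; the two quoted figures agree only up to rounding. The 8-bit User-Agent narrows things to 1 in 256.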

A blank font list where? There's no way to get a direct list of fonts: you just try rendering text with a given font and look at the metrics of it versus the fallback. Font lists are done using side-channels (and you also therefore have to have a list of fonts to sniff in the first place).

The only way to stop font-based side-channels is to limit the web to a fixed set of fonts: and that will horribly break the web in some linguistic communities where there's a fair amount of web content that relies on specific fonts (that typically map old Windows codespaces to other characters for support for their language, often before Unicode covered those characters).

You also need identical fonts for a given user agent, and that's very hard to guarantee short of shipping your own fonts (e.g., consider an OS update that changes a font!), and that becomes expensive fast.
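The side channel described above can be sketched as a toy simulation. This is not browser code: the widths, the `measure` function, and the "installed" set are all made up, standing in for a real page measuring a rendered element.

```python
# Hypothetical rendered widths (px) of one test string, per font.
# In a browser these would come from measuring a real element.
FALLBACK_WIDTH = 100.0
INSTALLED_WIDTHS = {"Arial": 96.5, "Comic Sans MS": 104.0}  # simulated machine

def measure(font):
    """Simulated renderer: a font that is not installed silently falls back."""
    return INSTALLED_WIDTHS.get(font, FALLBACK_WIDTH)

def sniff_fonts(candidates):
    """Infer installed fonts by comparing each measurement with the fallback."""
    return [f for f in candidates if measure(f) != FALLBACK_WIDTH]

print(sniff_fonts(["Arial", "Futura", "Comic Sans MS"]))
# ['Arial', 'Comic Sans MS']
```

Note that the sniffer has to bring its own candidate list, and that a font whose metrics happen to match the fallback would be missed. The page never asks for a font list at all, so there is nothing to "reply blank" to.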

Disable looking at the metrics of text rendering results?

Get two block level elements, render some text in between, and calculate how far apart the two block level elements are, and you can already determine the height of the glyph.

So, yeah, to disable that you'd have to entirely disable the CSSOM, which would cause ridiculous amounts of breakage.

Unless every browser in the world adopts the same list, replying with a fixed list of fonts would make users of a given browser immediately recognizable (especially for low-marketshare browsers like Tor). Seems like you'd want a system where the response to a list-of-fonts query would be semi-random and likely to overlap with the lists that are naturally produced by other browsers.

Generally speaking, you have two approaches (that I'm aware of) for addressing fingerprints: one is to "hide in the crowd", i.e., return values that are common across the browser population. The second is to create a unique value for each separate session (like incognito and cookies). See: https://www.microsoft.com/en-us/research/wp-content/uploads/... [PDF!]

But user agents already identify the browser, right?

I agree that implementing this first in Tor is probably not a good idea, but if Firefox were to do it first, then I don't see the problem. "They're a Firefox user" isn't nearly as specific information.

User agent gives the browser version and platform version. Two macs with the same OS version and the latest version of Chrome will have the same user agent.

That's the point. With this feature, two computers with the latest version of Firefox would have the same font list.

Is that true? I thought the point of using the font list for fingerprinting is that it can vary widely from user to user.

It would reduce the variability. 1 in 200,000 is reasonably unique. But if all Firefox browsers reported the same result for fonts, then it would provide no more information than the spying website already has (i.e. the user is using Firefox).

I'd bet that Chrome would follow quickly, which would put pressure on Apple to do the same. If that happened, we'd have a minor victory.

All I'm trying to do is reduce information that is needlessly leaked out by a browser. True privacy still requires more.
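The "no more information than the site already has" argument can be checked with a toy entropy calculation. The population below is invented purely for illustration: eight Firefox users, first with distinct font lists, then with one fixed list.

```python
import math
from collections import Counter

def entropy_bits(observations):
    """Shannon entropy (in bits) of the observed values."""
    counts = Counter(observations)
    total = len(observations)
    return sum(c / total * math.log2(total / c) for c in counts.values())

# Made-up population: 8 Firefox users, each with a distinct font list.
varied = [("Firefox", f"fontlist-{i}") for i in range(8)]
# The same users if every Firefox reported one fixed list.
fixed = [("Firefox", "standard-list") for _ in range(8)]

print(entropy_bits(varied))  # 3.0
print(entropy_bits(fixed))   # 0.0
```

With distinct lists the fingerprint separates all eight users (3 bits); with a fixed list the font attribute adds nothing beyond "this is Firefox".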

This would also have the additional positive effect of reducing differences in rendering across browsers. At the moment there's a risk of the browser a webpage is viewed in not having the right fonts.

There's no reason for browsers to make a large number of fonts available if websites aren't able to use them because not all browsers make them available.

However, there may be an issue with internationalisation.

I think they are working on the fonts problem.

I think this is an over-estimation of the amount of entropy. If canvas hardware acceleration is disabled, the only things that can really have an impact on the output of the panopticlick canvas fingerprint are OS version, user-agent and available cpu vectorization instructions.

Interesting. I wonder to what extent one could trap vectorization instructions in e.g. KVM. Presumably they are unprivileged instructions on x86?

> Presumably they are unprivileged instructions on x86?

They're all unprivileged; having to go to the kernel would defeat the purpose most of the time.

Also, trapping them wouldn't make a difference. Fixing the CPUID fields on the other hand (so that these code paths are not taken in the first place)...

The 2012 UCSD paper [1] claims they observed 5.73 bits of entropy in their admittedly non-representative population.

As with everything, it depends on the user's threat model. In a court setting, it'd depend on how individual pieces of evidence stack up against a user to make them look bad, and whether there is enough reasonable doubt.

[1] https://cseweb.ucsd.edu/~hovav/papers/ms12.html

That's less than picking one character randomly from a keyboard. Seems pretty small to me.

    >>> import math
    >>> print math.log(95) / math.log(2)


    >>> print math.log(95, 2)
    6.56985560833

It is frighteningly effective.

How do you fix it without disabling canvas?

The Tor version of Firefox patches it to be unique per website, IIRC. There is this Firefox add-on that does something similar: https://addons.mozilla.org/en-US/firefox/addon/canvasblocker...
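A rough sketch of the "unique per website" idea, not taken from Tor Browser or CanvasBlocker: derive a per-origin seed from a secret and use it to nudge pixel values, so the same site sees a stable result while two sites see unrelated ones. `SESSION_KEY`, `site_seed`, and `perturb` are hypothetical names, and flipping the low bit is just one possible kind of noise.

```python
import hashlib
import hmac
import secrets

SESSION_KEY = secrets.token_bytes(32)  # hypothetical: regenerated each session

def site_seed(origin, key=SESSION_KEY):
    """Derive a stable per-origin seed; unlinkable across origins without the key."""
    return hmac.new(key, origin.encode("utf-8"), hashlib.sha256).digest()

def perturb(pixels, origin, key=SESSION_KEY):
    """XOR the low bit of each 0-255 channel value with a per-origin seed bit."""
    seed = site_seed(origin, key)
    return [p ^ (seed[i % len(seed)] & 1) for i, p in enumerate(pixels)]

pixels = [200, 13, 77, 255]
same_site = perturb(pixels, "https://a.example") == perturb(pixels, "https://a.example")
print(same_site)  # True
```

Each value changes by at most 1, so the image looks the same to a human, but a hash of the canvas output no longer matches across sites or sessions.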

I don't think it can be solved without origin-specific permissions that allow a site to use local fonts and whatnot on the canvas. At the same time, I don't think that would be an issue.

You don't even need canvas to do font-sniffing. It gets better with canvas, but it's largely good-enough without.

Why don't browsers just ship with their own fonts?

Few (none?) of the ancient "web-safe fonts" are permissively licensed. And nice-looking fonts that look good on all OSes and that have broad glyph coverage are quite expensive to make (though I'm sure Google/Microsoft/Apple could afford it).

Google does sort of do this with the Droid family of fonts. https://fonts.google.com/specimen/Droid+Sans They're not perfect though, but it is a start. I think one or two of Apple's might be permissively licensed also but I'm not certain.

They used to, loooong time ago... At least I think so? I seem to remember Netscape on Linux bundling the fonts, not sure what they did with Arial/Helvetica on Windows 3.1.

However, good fonts are a massive undertaking and only make sense for OS vendors, at least the fonts which include many languages.

It will increase security and privacy in Firefox, that's great.


Note that Google is not only in the browser business but also the fingerprinting business. They want a future where everything is on the web and where they have an acceptable way of seeing everything everyone is doing.

Realistically, most users use off-the-shelf hardware, so for every machine there are millions that are specced exactly the same. That's not very useful for fingerprinting. It would be a good idea, though, to stop adding more discriminating features to browsers, but as you can imagine, that is not the direction Google wants to go in.

For every fingerprinting trick there is an obfuscation trick though. People just need to keep checking the fingerprinting scripts. A great advantage of the web is that you can in fact see the source code.

Also, we expect publishers to embrace the post-ad world. Why would it be easy to block ads so much they stop being viable, but impossible to stop fingerprinting?

I'm getting HTTPS errors on two platforms (and two internet connections) for this website. It seems fairly ironic, but I guess it's just me. Am I doing something wrong?

IIDRN says it's up: http://www.isitdownrightnow.com/torproject.org.html

I hate to be that guy but considering the subject of the site.. perhaps tampering? Might be worth collecting some info to understand the problem.

Yeah, it's just me. Didn't work on wifi at work or 4G, but does at home. Maybe they're lagging behind a CA update....

Why did they make up this term "uplift" instead of just saying "upstream"?

That's not the same thing. "uplift" for that page means "backport from development trunk to a more-stable branch".

Theoretically Tor Browser's fork is just another development branch of Firefox, just like Mozilla's primary development branch, and they are "uplifting/backporting" individual patches into stable the same way Mozilla might backport crucial dev-branch improvements/bugfixes into stable.

But they aren't landing patches on "stable", but on the upstream development branch (i.e. mozilla-inbound/central). Since it seems more reasonable to view tor browser as a release branch of firefox than as a development branch, it seems like they are using "uplift" in the opposite sense to the normal Mozilla usage i.e. they are taking a patch from a product branch and moving it to a development branch. In that context "upstream" might indeed be a less confusing choice of words.

On the other hand, I think it's clear what they actually mean here, so it's probably not worth worrying about too much.

That page doesn't explain the use of "uplift" to mean what's commonly called "upstreaming". That page refers to "uplift" meaning a patch being added to an older Firefox branch ahead of normal whole-branch promotion. (I.e. cherry-picking from the Nightly channel to Aurora, Beta or even Release ahead of Nightly as a whole becoming the new Aurora, etc.)

Truth be told, I can't find an explanation or definition of what an "uplift" actually means there.

It looks like this page is designed for people who already know what an "uplift" is, but wish to implement it properly. That being said, it also appears that "uplifts" will include bug fixes made in Tor Browser and sent upstream to Firefox, rather than just features added (but disabled by default) as was implied in the OP article. I would have assumed that bug fixes made in a downstream product would already have a mechanism to find their way to Firefox. Maybe "uplift" was the term all along for that mechanism, or is a rebranding of it?

Uplift to us is bringing the patches into mozilla-central pref'd off so that Tor developers can just pref features on, rather than re-merge patches for each major and dot release. We also tend to add tests.

Typically landing a patch on a release branch, "uplifting" it from the main development branch (but occasionally uplifting it from thin air into the release branch).

The article describes it as an upstreamed patch that is disabled by default, which allows Firefox to be less discriminating when it comes to accepting patches.

That sentence did not read to me as a definition, and the sentence still works for me using the normal "upstreamed".

"uplift" in this context seemed to mean to me that they were not upstreaming the patches verbatim but neutering them for Firefox.

It's a mix. Some patches are just getting rebased and landed. For others, the Firefox and Tor Browser teams are working together to re-implement the feature in a way that makes more sense in the broader Firefox architecture.

For example, for First Party Isolation, we took the "origin attributes" feature that we built to support containers (user-specified tracking limitations) and reused it for isolation. In the containers case, origins get tagged with a user-specified label; with First Party Isolation, they get tagged with the top-level origin.

And to be clear, there's no "neutering" going on here. We're adding the full features that Tor Browser has, since the whole point of this exercise is to let Tor Browser use preference changes instead of patches. That means that the full capability of the Tor Browser features is in Firefox if users want to enable them.
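The tagging described above (state keyed by the top-level origin as well as the origin itself) can be pictured with a toy cookie jar. This is only a model of the idea, not Firefox's origin attributes code; the class and the `.example` hosts are invented.

```python
class IsolatedCookieJar:
    """Toy cookie store keyed by (first-party origin, cookie origin)."""

    def __init__(self):
        self._store = {}

    def set(self, first_party, origin, name, value):
        self._store.setdefault((first_party, origin), {})[name] = value

    def get(self, first_party, origin, name):
        return self._store.get((first_party, origin), {}).get(name)

jar = IsolatedCookieJar()
# tracker.example sets an ID while embedded on news.example
jar.set("news.example", "tracker.example", "id", "abc123")

print(jar.get("news.example", "tracker.example", "id"))  # abc123
print(jar.get("shop.example", "tracker.example", "id"))  # None
```

The same third party gets a separate jar under each top-level site, so it cannot use one cookie to follow a user from news.example to shop.example. Swapping the first key for a user-chosen label would give the containers behaviour instead.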

Regarding your last sentence, does that mean that in the future I could open a link in a 'tor browser' container? That's awesome if so.

Speaking as someone who is familiar with some parts of the Firefox architecture or concepts thereof, but not the source code itself: If you were to implement per-container pref overrides, theoretically yes. AFAIK, prefs are global right now. I don't know if it's feasible to implement this in Fx or if that will leak through to the chrome (process).

But it's an interesting idea! Would you mind filing a bug at bugzilla.mozilla.org?

And will the default for 'Private Browsing' be a Tor container?

Why use Firefox at all? Why not something based on libcurl that absolutely does not talk back to the server after receiving the document unless the user clicks on a link or submits a form?
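A minimal sketch of that "never talk back" behaviour, using Python's standard library rather than libcurl. It assumes the document was already downloaded in a single request; the code only lists the links a user could choose to follow and never requests images, scripts, or stylesheets. `LinkLister` and `list_links` are made-up names.

```python
from html.parser import HTMLParser

class LinkLister(HTMLParser):
    """Collect <a href> targets; never fetches images, scripts, or styles."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def list_links(html):
    parser = LinkLister()
    parser.feed(html)
    return parser.links

page = '<img src="http://t.example/pixel.gif"><a href="/next">next</a><script src="x.js"></script>'
print(list_links(page))  # ['/next']
```

The tracking pixel and the script are parsed but ignored, so the server hears nothing further until the user picks a link. Rendering anything beyond plain text this way is where it stops being easy.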

Writing a web browser is non-trivial.

There is a middle ground, using electron + per-site customization. HN user megous got a bit of attention when he teased a bit more detail two weeks ago:


That's the Richard Stallman approach: https://stallman.org/stallman-computing.html

He has a script that he can poke to download the content and email it to himself. Then he reads it with Emacs or maybe Lynx with no networking enabled.

At least he doesn't print it out.

I wish you good luck doing online banking that way :')

Are people doing online banking using Tor?

That would be rather dumb, I think, assuming the bank account is tied to you in meatspace

If your country blocks access to online banking, you would use Tor.
