Hacker News new | past | comments | ask | show | jobs | submit login
Firefox – Fix parsing of content-length http3 header (services.mozilla.com)
113 points by jiripospisil 14 days ago | hide | past | favorite | 87 comments

Reading the bug report: https://bugzilla.mozilla.org/show_bug.cgi?id=1749957#c5 and then the code, isn't there still a problem if content-length is not set by the server? Not that it should ever not be set but...

  if (contentLengthStart == -1) {
    // There is no content-Length.
That's the same flow it would have taken with the case sensitive code.

That comment is interesting too:

"I'm just a random kibbitzer, so my apologies if this is off-base, but... isn't an even more fundamental problem here that the code is doing a naive string-based search in the first place? For example, I believe this is a valid HTTP header block that could be passed into this code:

  GET / HTTP/1.1\r\nHost: example.com\r\nCookie: foo="Content-Length: 100"\r\n\r\n

(In particular, GET requests normally don't have a content-length header at all, since the default if none is present is to assume an empty body.)

Wouldn't this cause the code to compute the wrong body length and break things?"
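The cookie trap described in the quoted comment is easy to demonstrate. Here is a Python sketch (not the actual Firefox code; the helper names are made up) comparing a naive substring scan with a line-by-line parse:

```python
# Sketch of why a naive substring search over a raw header block is
# dangerous: "Content-Length" here appears only inside a cookie *value*,
# yet a plain find() still matches it.
def naive_content_length(header_block: str):
    marker = "Content-Length:"
    start = header_block.find(marker)
    if start == -1:
        return None  # no Content-Length found
    end = header_block.find("\r\n", start)
    raw = header_block[start + len(marker):end].strip()
    return int(raw.rstrip('"'))  # crude cleanup, just for the demo

def parsed_content_length(header_block: str):
    # Safer approach: split into header lines first, then match the
    # field *name* before the first colon.
    for line in header_block.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-length":
            return int(value.strip())
    return None

req = 'GET / HTTP/1.1\r\nHost: example.com\r\nCookie: foo="Content-Length: 100"\r\n\r\n'
print(naive_content_length(req))   # 100 -- wrong, it's inside a cookie value
print(parsed_content_length(req))  # None -- correct, no such header
```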

> Not that it should ever not be set

I am an HTTP/1.1 dinosaur. Is chunked transfer encoding no longer a thing in HTTP/3? It used to be a valid reason to omit Content-Length.

> Is chunked transfer encoding no longer a thing in HTTP/3? It used to be a valid reason to omit Content-Length.

IIRC HTTP/2 and HTTP/3 are essentially always chunked. I would expect a missing content-length to be completely valid.

That is correct. By default HTTP/2 and HTTP/3 requests are streamed and can have an "infinite" length. The presence of a content-length header will limit that length and inform the peer upfront of how much data will be sent.

For HTTP/1.1, a stream whose length is not known upfront required the use of chunked encoding. With HTTP/2 and /3 that encoding is no longer required, since the underlying protocol already supports framing and carries end-of-stream information.

I suspect not, because there is no mismatch between this file finding no content-length header and the header later being discovered by some other part of the code. It'll probably require more digging, but I don't think FF would crash if the header was simply missing; this was likely caused by the header existing but this part of the code disagreeing about it.

Plenty of servers out there are configured not to return /Content-Length:\s?(\d+)/i at all. I mostly stumble upon them in my video plugin; m3u8 (content-type: application/vnd.apple.mpegurl) resources seem to be configured this way most of the time.

Note that this code is parsing the client’s requests, not the server’s responses.

Is this the bug that's causing firefox to hang as of recently? https://news.ycombinator.com/item?id=29918052

[edited, link was to this same discussion]

Welp, yes thanks. Edited.

Seems so. It made Firefox crashloop as well.

Wow, this sounds like a lot of trouble I avoided by having QUIC blocked at the network level.

Smart person. I wish I'd had that foresight. Would have saved me a couple of hours.

Interesting, I was wondering why it was doing that recently. It seems to trigger for me when I try to use some extensions.

>/* aIgnoreCase = */ true

Very interesting, is that comment inserted automatically by an IDE?

A nice way to tell what a random bool parameter does without having to declare an ignoreCase=true variable first and pass it in to self-document. JetBrains IDEs do show parameter names inline, though, and several languages have named parameters, which is a nice thing for booleans at least.

This is a great reason why programming languages should support named arguments.

This is standard practice in a lot of places because bare Booleans in parameter lists don't communicate anything useful to the reader. Don't know if there are IDEs that automate it though. (In most languages one should avoid designing functions that take Booleans like this in the first place and use a flags/options parameter instead.)

In Python, you can just specify `foo=bar` when invoking the function, or even better, make `foo` a keyword-only argument.
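To make that concrete, a small sketch: the bare `*` in the signature makes the flag keyword-only, so call sites document themselves (the function and parameter names here are made up):

```python
# Keyword-only argument: everything after the bare "*" must be passed
# by name, so call sites are self-documenting.
def find_header(block: str, name: str, *, ignore_case: bool = True) -> bool:
    if ignore_case:
        return name.lower() in block.lower()
    return name in block

headers = "content-length: 42\r\n"
print(find_header(headers, "Content-Length"))                     # True
print(find_header(headers, "Content-Length", ignore_case=False))  # False
# find_header(headers, "Content-Length", False)  # TypeError: must be named
```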

From the comments in Bugzilla I learned about https://profiler.firefox.com/docs/#/ , looks super cool! C/C++ tooling has improved a lot in the last decade.

There are a lot of profiling and debugging tools for C/C++; funnily enough, many of them originated in the KDE community. Off the top of my head:

* valgrind (https://valgrind.org/): Contains a lot of tools

* Heaptrack (https://apps.kde.org/heaptrack/): For memory allocation analysis

* KCachegrind (https://apps.kde.org/kcachegrind/): For call graph visualisations

* ELF Dissector (https://apps.kde.org/elf-dissector/): For analysis of binary files/dependencies

* Massif-Visualizer (https://apps.kde.org/massif-visualizer/): Valgrind massif visualiser

* Gammaray (https://www.kdab.com/development-resources/qt-tools/gammaray...): Tool to analyse the current state of a Qt application, a bit like you would do with the Firefox inspector.

FYI, most of these work great with other languages too. Massif and most of the valgrind suite (cachegrind, callgrind, memcheck) can be used with Rust.

Yeah I use kcachegrind with xdebug/php for my day job

> funnily enough, many of them originated in the KDE community.

How is that funny?

Funny is probably a funny way of saying noteworthy here.

and most importantly ASAN/MSAN/TSAN/UBSAN. (Although I admit that the inability to use ASAN and MSAN simultaneously is a big limitation that Valgrind doesn't have.)

Not as much as Rust has, though. I'd be hard pressed to go back to C-land now. I'm surprised they wouldn't consider a greenfield component (HTTP/3) a good candidate for it, sad.

I was going to respond to parent with "or just write it in Rust /s" but I should have known better.

For a greenfield component I do kind of agree, though. I wonder whether Mozilla still has the in-house Rust expertise for something like that after the layoffs?

Mozilla has a Rust QUIC implementation (one of three good ones in Rust) https://github.com/mozilla/neqo

I'm not sure why it's not used here.

Mozilla didn't lay off all their engineers using Rust. Just the ones whose primary job was working on Rust (as opposed to Rust components in Firefox). They still have tons of in-house Rust expertise.

I wonder if they still have the in-house expertise to know what in-house expertise they have.

On the contrary, the tooling ecosystem is huge with 50 years of history.

Naturally most of it isn't free beer.

Free beer C and C++ tooling you mean.

Plenty of nice tooling like Purify, Insure++ and V-Tune exist since decades.

Oof, treating Content-Length as case sensitive?

But why does failure to parse http headers cause the browser to busyloop and block all requests?

There is a response from a Mozilla security member at lobste.rs.


This is the interesting bit I had no idea about:

> [Q] Do I understand from this that Mozilla can just update my browser settings remotely without my updating‽

> [A] Yes. This is part of the “Remote settings” service, which we can use to ship or unship features (or gradually) or recover from breakage (like here!). We mostly use it as a backend for Firefox Sync and certificate revocation lists. Technically, we could also ship a new Firefox executable and undo / change settings, but that would realistically take many more hours. Needless to say, every “large” software project has these capabilities.

This is new to me. Can anyone tell me which other "large" projects have this capability?

I didn't know about Remote Settings.

I found some detailed information on Mozilla's website [1], and I have three entries about RemoteSettings in about:config. I set them all to false.

This [2] explains what Normandy is.

[1] https://firefox-source-docs.mozilla.org/services/settings/in...

[2] https://firefox-source-docs.mozilla.org/toolkit/components/n...

Most large software that hits servers on the network will have this because it's borderline malpractice not to. Not only would it make it difficult or impossible to fix third-party regressions like this, but if your software started causing issues for third parties (like crashing their servers or getting users kicked off the internet or corrupting their data), you'd have no way to mitigate it in a reasonable amount of time.

FWIW I maintained a popular chrome extension (100k weekly users) and I eventually had to add an extensive set of chicken bits pulled from my server multiple times a day. Why? Because chrome updates take over a day to propagate and the extension could regularly break when webdevs made frontend changes to the websites my end users visited. It simply wasn't feasible to ask end users to manually pull a chrome update every time this happened (not to mention that chrome's extension store is designed to make this very hard.) I added an option to disable it but as far as I know none of my users turned that on.
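A "chicken bits" setup like the parent describes can be as small as a periodic JSON fetch with baked-in defaults. A hypothetical sketch (the URL and flag names are invented, not from any real extension):

```python
import json
import urllib.request

# Hypothetical remote kill-switch ("chicken bit") pattern: ship baked-in
# defaults, let a periodically fetched JSON document override them, and
# keep running on the defaults whenever the fetch fails.
DEFAULTS = {"site_scraper_enabled": True, "new_parser_enabled": False}

def load_flags(url="https://example.com/flags.json", timeout=5):
    flags = dict(DEFAULTS)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            remote = json.load(resp)
        # Only accept keys we already know, so a bad payload can't inject flags.
        flags.update({k: v for k, v in remote.items() if k in DEFAULTS})
    except Exception:
        pass  # network or parse failure: keep shipping defaults
    return flags
```

The important property is that a failure to reach the flag server degrades to the shipped behavior rather than breaking the client.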

Not exactly the same, but most modern hardware has equivalent functionality: 'chicken bits' in the hw that can be flipped to disable various features in response to regressions; the Spectre and Meltdown mitigations are a good example. Naturally those are instead flipped by OS or firmware updates, but considering those come down the wire automatically now, it's basically the same idea.

> Not only would it make it difficult or impossible to fix third-party regressions like this, but if your software started causing issues for third parties (like crashing their servers or getting users kicked off the internet or corrupting their data), you'd have no way to mitigate it in a reasonable amount of time.

I completely agree that in the current case resetting it remotely is the easiest solution for both parties. However, it is probably the first time many users learned something like this exists. One would think an open browser promoting user empowerment would have more transparency on this feature and how it can be controlled.

This is also a channel through which corrupt governments could impose sanctions on other countries.

If they want to do that, they'll do it. Governments have forced online service providers to implement back doors before, so Mozilla or Google could similarly be forced to do that.

Multiple reports on the initial thread (https://news.ycombinator.com/item?id=29918717) that telemetry features are getting silently re-enabled after users explicitly disable them. If you had telemetry on, you hit this bug immediately due to the server configuration.

I wonder if this is to blame?

In any case, this is incredibly sketchy behavior for a "user-empowering", "privacy focused" browser.

> Can anyone tell me which other "large" projects have this capability?

I'd say that all popular apps (Android/iOS) have such functionality. Most will probably use it for A/B testing, but technically you could change anything you'd like with it.

Apart from the companies creating their own implementations [0], popular services I have seen include Optimizely [1] and Firebase Remote Config [2].

[0]: https://engineering.fb.com/2014/01/09/android/airlock-facebo...

[1]: https://www.optimizely.com/products/intelligence/full-stack-...

[2]: https://firebase.google.com/docs/remote-config/

That also explains how the telemetry stuff could have been returned to defaults even though it was disabled. This mechanism has a lot of bad use cases; you have to trust Mozilla to never abuse it.

That's insane. Is there a way to disable "Remote settings" in about:config?

Not tested and I have no idea what this breaks, but you could set services.settings.server to something unusable like localhost. The pref was listed here (https://remote-settings.readthedocs.io/en/latest/tutorial-de...) in the dev environment instructions.

For what it’s worth this sentence was updated.

They are busy making us hate them as much as they can, eh?


I wonder if remote settings is what changed my proxy settings after update too.

I wonder if mozilla will migrate development to github.com now that phabricator development has been stopped.

(Former Mozilla employee)

I would be very surprised if they did. The Firefox development workflow does not map very well onto Github.

care to explain concretely how it does not map?

For one, phabricator is only used for code reviews, not for issue tracking which is managed by bugzilla. Github issues are nowhere near usable for a large product such as gecko/firefox.

note: IIRC Wikimedia deprecated Bugzilla for bug reporting and replaced it with Phabricator.

> Github issues are nowhere near usable for a large product such as gecko/firefox

I have huge doubts about that. Looking at the development of Servo on github was beautiful, the colored tags, the PR views, the activity stats, the reviews diffs, the wikis, everything about it was much more readable than it is on Bugzilla/phabricator IMHO. https://github.com/servo/servo The same can be said currently about webrender.

About scale: GitHub is battle-tested. Rust, for example, is a large-scale repo with a LOT of activity (~more weekly commits than gecko-dev), so the premise that GitHub issues are nowhere near usable for a large product seems false. What immutable idiosyncrasies could Mozilla devs have that make them so much different from large projects developed on GitHub?

You specifically mention issues: GitHub can do meta-issues, can auto-close issues after a PR merge, has bidirectional PR/issue references, can auto-close issues on a markdown checklist check, etc. And if you couldn't fit into existing GitHub features, I'm pretty sure you could solve your issues with Actions such as https://github.com/marketplace/actions/dependent-issues

Also, GitHub is accommodating: when the Apache foundation migrated to GitHub they were open to making changes to the platform to facilitate Apache's needs, cf: https://github.blog/2019-04-29-apache-joins-github-community...

In addition to my belief that GitHub can be better than, or at least as good as, Bugzilla: if gecko migrated to GitHub it would attract orders of magnitude more open source contributions because of the lower barrier to entry and the familiarity (also, Bugzilla is awful on smartphones). I really think you should seriously consider the migration.

Last I looked github issues had no good way to express dependencies between issues, which is used a lot in gecko's bugzilla.

So you may be right, but I'm not a MoCo employee so I won't be part of any decision about that. Also consider the cost of moving all the tooling that exists to github (in the specific context of gecko): a migration needs to bring a lot of benefits, especially if you take into account that gecko's canonical repo is a mercurial one, not git.

Improving onboarding for new contributors would be nice for sure, but "order of magnitude more contributions" seems optimistic. I mostly expect there would be tons of low quality issues filed.

Pretty interesting!

The joys of HTTP and human-readable data formats in general!

More like the joys of "robust" APIs, where instead of having one valid input and rejecting everything else you have billions of valid inputs, and raising an error is considered a bug on your side.

Worst of all is case insensitivity: not all languages have the same casing rules, so you are just asking for bugs that only appear when your user is not US American.
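The classic demonstration is the Turkish dotted/dotless i. A Python sketch of why protocol tokens need ASCII-only casing (the `ascii_lower` helper is made up for illustration):

```python
# Why locale/Unicode-aware casing is dangerous for protocol tokens:
# Turkish has a dotless i, and Unicode lowercasing of the dotted
# capital I ("İ") doesn't round-trip to a plain ASCII "i".
s = "İ"           # U+0130, LATIN CAPITAL LETTER I WITH DOT ABOVE
print(s.lower())  # "i" followed by U+0307 COMBINING DOT ABOVE
print(len(s.lower()))  # 2, not 1!
print("ı".upper())     # 'I' -- Turkish dotless i uppercases to plain I

# Safe header comparison: restrict case folding to ASCII A-Z only.
def ascii_lower(t: str) -> str:
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in t)

assert ascii_lower("Content-Length") == "content-length"
assert ascii_lower("İ") == "İ"  # non-ASCII left untouched
```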

Isn’t this issue in an http/3 component, specifically a non-human-readable format?

No, the issue was that the Content-Length header was searched for in a case-sensitive manner
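Field names in HTTP are case-insensitive (RFC 9110 §5.1), so a correct lookup has to normalize the name. A minimal sketch of a tolerant header store (a hypothetical class, not Firefox's actual code):

```python
# HTTP field names are case-insensitive (RFC 9110 §5.1), so lookups
# must normalize the name rather than compare bytes directly.
class Headers:
    def __init__(self, pairs):
        self._h = {name.lower(): value for name, value in pairs}

    def get(self, name, default=None):
        return self._h.get(name.lower(), default)

h = Headers([("content-length", "42"), ("Host", "example.com")])
print(h.get("Content-Length"))  # '42', regardless of the sender's casing
print(h.get("HOST"))            # 'example.com'
```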

The code looks so error-prone. That's scary! (+1/-1 offset arithmetic, string constants, boolean parameters)

They should introduce some functional programming into this!

Hopefully this will be a wake-up call for Mozilla, to stop with their fruitless rewrite and catch up with Chrome, or die like Netscape did.

What a mean and pointless comment. It's just a bug. It's not even a "fruitless rewrite" — this bug is in C++ code.

And implementing HTTP/3, which was Google QUIC, is required for chasing Chrome's "fire and motion".

HTTP/3 is QUIC but it is not Google's QUIC aka gQUIC. The IETF's QUIC working group developed this protocol over several years, and it isn't much like gQUIC as a result. Most obviously gQUIC pre-dates TLS 1.3 and so it has its own cryptography layer, in the standard QUIC that's provided by the TLS 1.3 engine, so almost all TLS 1.3 work can be reused.

But do you think the final design of QUIC is more closely aligned with anyone's incentives more than Google's?

It's a pretty clear evolution of http2 overcoming well documented shortcomings. It does help the mobile use case but overall I'd say no.

How so?

I'm not referring to anything specific, just the general tendency for systems to change over time to match the preferences of the entities with the most influence. Sometimes this is beneficial for others, sometimes it's detrimental. The point is that the evolution of the internet is now primarily selecting for what big tech wants.

This is much more obvious with the web. Everyone uses Chromium, so the web as a standard is now effectively whatever Google decides it is. See Widevine.

FWIW I don't see anything specific wrong with QUIC. I'm quite excited about it and think it will be good for the internet.

But you could apply this vague feeling (that big companies change the world) to everything.

Is it really Apple's Donald Trump presidency? Google's Democratic Backsliding and Facebook's Climate Change?

I don't think it's useful to distinguish "big tech" in this regard from say, the Military Industrial Complex's outsize influence on our world, or the way Malls bulldozed America's distinctive local businesses, or the way Hollywood decides how much of the world imagines things.

And I especially don't see value in assigning them name ownership of things that were done by large numbers of individuals working together just because it so happens that you find it easier to remember "Google" than say Jana Iyengar or Martin Thomson (editors of QUIC, neither of whom work for Google).

Unlike at the W3C, corporations can't actually participate in the IETF, which is a human activity and so is done by humans. Corporations can, and do, pay their employees to participate, but everybody is welcome, and, as outfits like EDCO discovered, just flooding the IETF with bodies doesn't achieve anything except make you look stupid. (EDCO wanted TLS 1.3 to have defective encryption because it would be easier than doing their jobs properly.)

These are fair points, and I appreciate you pointing out that the IETF has a pretty dang good process.

But do you think if QUIC wasn't exactly what they wanted, Google would have signed off on it? I don't think so. I think they would have kept using SPDY. So I don't see much difference between QUIC coming from the IETF or Google itself, since I'm assuming the final output artifact is the same.

At the end of the day all I'm trying to say is we should be wary of the tech choices of big companies. Most websites probably don't need HTTP/2+, a CDN, etc, and the complexity that comes with it. Most startups probably don't need microservices. Most web apps probably don't need more than Mithril.js (or HTML forms for that matter). I'm not arguing against things existing or being good, just that it's important to be cautious buying in wholesale and assuming the way large companies do things is the way everyone should do things. If you're not careful you can end up in a place where the simple option is no longer available. For example it's basically impossible to make a web browser with a small team now.

I'm excited about HTTP/3 as long as they don't try to take HTTP/1.1 away from me.

Oops *gQUIC not SPDY.

Eh, I mean yeah it is a bug and bugs happen, but one does have to wonder how a bug this visible and immediately detected by so many users wasn't caught in any kind of testing. Is Mozilla in the Microsoftian habit of just not testing the code they push at all?

If this affects Firefox 96, then it made it through Nightly and Developer Edition, and at least four weeks of wide deployment in Beta before being promoted to the release version.

For Mozilla as a company, as a business, is there a benefit of pouring resources into own web engine?

That's the wrong way to think about it. Mozilla isn't a for profit business, they are a non-profit with a mission.

The right question is: does pouring resources into their own web engine align with their mission statement?


This is incorrect. The Mozilla Foundation (the non-profit) owns the Mozilla Corporation, which is for-profit and does Firefox development. The Corporation is the one that gets money from Google, and sends millions of dollars per year to the Foundation.

Technically you are correct in that the corporate entity that is Mozilla Corporation is registered as a for-profit company.

However, since its sole shareholder is a non-profit Foundation, it is able to invest differently than a typical corporation.

For example, there is no pressure from the Foundation to "maximize shareholder value."

Is it though?


> The Mozilla Corporation is a taxable subsidiary that serves the non-profit, public benefit goals of its parent, the Mozilla Foundation, and the vast Mozilla community.

The Mozilla Corporation is a wholly owned subsidiary of the Foundation. The constructs are for regulatory and financial hand waving.

Well, Mitchell Baker gets millions of dollars a year while the browser market share continues to decline. So there's some profit motive involved.

So now it becomes a grievance about CEO compensation for poor performance?

> to stop with their fruitless rewrite

What are they rewriting?

> catch up with Chrome

The standards driven development that Mozilla fostered in the aughts is mostly gone.

Today the standards are mostly written by the dominant browser and everyone else just plays catch up.
