Feel free to AMA, though no guarantees I'll have good answers.
Am I understanding this correctly to mean that you were part of a group that invented this font? Independent of the great work you obviously did on everything else, if this is correct, I just wanted to give a specific thank you for that! It's been my go-to font for my terminal, text editor, monospace font in the browser, and even the font my Linux machines use when they boot up before launching the GUI for years now.
I'm glad you like the font!
Man, I love your PhD thesis! I use it regularly to show my students an example of what a PhD is supposed to be; and, independently, to teach them some planar differential geometry. Your study of Euler's elastica is the best written account of it that exists (and I have read many more geometry books than I would like to admit).
Oh, wow! Keep up the amazing work!
If you’re concerned about tracking as a website operator you can simply load the fonts from your own server. Most of them can be found on GitHub and have liberal licenses (I think that's true of all fonts on fonts.google.com).
It's been suggested that Browsers or Operating Systems choose a useful X value and install the top fonts everywhere like they used to install the classic "core web fonts".
Web developers who are concerned about privacy exposure from Google Fonts can use google-webfonts-helper to extract the font files and self-host them.
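For anyone wondering what self-hosting amounts to in practice: once the font files are on your own server, all the page needs is an @font-face rule pointing at them. A minimal sketch in Python that generates such a rule; the helper name `font_face_rule` and the `/fonts/roboto-regular.woff2` path are made up for illustration, and `font-display: swap` is just one reasonable choice.

```python
def font_face_rule(family: str, weight: int, style: str, url: str) -> str:
    """Return a CSS @font-face rule that points at a self-hosted WOFF2 file."""
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f"  font-style: {style};\n"
        f"  font-weight: {weight};\n"
        "  font-display: swap;\n"  # show fallback text while the font loads
        f'  src: url("{url}") format("woff2");\n'
        "}\n"
    )


if __name__ == "__main__":
    # Hypothetical path: wherever you put the extracted font file on your server.
    print(font_face_rule("Roboto", 400, "normal", "/fonts/roboto-regular.woff2"))
```

You need one rule per weight/style you actually use, which is also a nice nudge to keep that number small.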
Note that some browsers are going in the opposite direction and actively hiding installed fonts from websites due to the potential for fingerprinting.
Different machines render fonts in unique ways. Different GPUs (hardware differences), different drivers (kernel/driver differences), browsers (user-land differences), and sizes (user settings) can all contribute to a font being rendered differently per user, increasing the entropy of a particular unique font ID.
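To put "increasing entropy" in numbers, here is a rough sketch. The shares below are invented for illustration, not measurements, and adding the bits together assumes the signals are statistically independent, which real fingerprinting signals usually are not.

```python
import math


def surprisal_bits(share: float) -> float:
    """Bits of identifying information revealed by a value seen in `share` of users."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)


# Hypothetical fraction of users who share your value for each signal.
SIGNAL_SHARES = {
    "gpu": 1 / 16,
    "driver": 1 / 4,
    "browser": 1 / 2,
    "font_size": 1 / 2,
}

# Under the independence assumption, the bits simply add up.
total_bits = sum(surprisal_bits(share) for share in SIGNAL_SHARES.values())
# Number of users you would expect to share the combined fingerprint: 1 in this many.
anonymity_set = 2 ** total_bits

if __name__ == "__main__":
    print(f"{total_bits:.1f} bits, i.e. about 1 in {anonymity_set:.0f} users")
```

Each extra signal that differs between machines shrinks the crowd you can hide in, which is why browsers are restricting what sites can learn about installed fonts.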
> "...your requests for fonts are separate from and do not contain any credentials you send to google.com while using other Google services that are authenticated, such as Gmail."
> "Google Fonts logs records of the CSS and the font file requests, and access to this data is kept secure."
> The Google Fonts API is designed to limit the collection, storage, and use of end-user data to what is needed to serve fonts efficiently.
> When millions of websites all link to the same fonts, they are cached after visiting the first website and appear instantly on all other subsequently visited sites. [...] The result is that website visitors send very few requests to Google: We only see 1 CSS request per font family, per day, per browser.
I guess, what would you want to see that would assuage your concerns, beyond what is written in the FAQ?
Apparently they need to collect and store end-user data for serving fonts efficiently. Wonder what that could be...
And if that information happens to be enough for further tracking then it seems to be fair game!
Not a Google fan, but at their volume of traffic seems like it could be something they’ve optimized for.
But they could have said so. And they could also have said that the information is not correlated with anything else.
All we know is that they have, very carefully, written something vague that they could do pretty much anything they wanted with.
And we are left with the question, why would they do that?
How? Stylesheets can't use fingerprinting or Flash cookies or anything like that, only scripts can.
What’s the rule of thumb on file size ... is 1 variable font smaller than 1 normal font? Or is it more like 1 variable font is smaller than 3 normal fonts combined (eg bold, italic, regular).
This is only a rough guide, don't take it as gospel. I'm also researching ideas (radial basis function interpolation) to make it more sparse for larger numbers of dimensions.
The compression ratio varies depending on how the variable font was made though. E.g. Noto Sans clocks in at 1.5 MB because it is made from 8 masters for two axes. Reducing it to 5 masters results in a 900 KB binary.
Smaller overall file size, fewer HTTP requests. Combine that with preloading and font-display, and we're getting to the point where webfonts aren't a giant bandwidth suck.
(More room for ad tracking scripts instead!)
I remember when Adobe launched Multiple Master fonts in 1992 (same concept)... and then killed them in 1999 because not enough people were using them.
I'll be fascinated to see if there's actually demand for it this time around. I personally have had tons of times I've wanted a semibold where none existed, or something halfway between regular and condensed.
But I'm not convinced they're going to save bandwidth for most sites. After all, they incorporate two sets of font outlines along each dimension instead of one, right? And you rarely see a webpage that uses three variations of a font along a single dimension (e.g. sans-serif regular, semibold and bold). A "third" font and beyond is more often a different style (italics) or typeface (serif) entirely. So I don't see the savings in most cases...
I'm 99% sure it'll catch on this time around, in short :)
So different sites can more specifically tailor font rendering based on their own needs at no extra cost to the user.
(I think the features I was thinking of were ligatures and lowercase numbers)
* each font load is 170 KB (I downloaded Roboto and that's the file size of one weight)
* 36.3 trillion font loads (per the source)
* 36.3 trillion * 170 KB/font load = ~6171 PB 
Plug 6171 PB into the GCP calculator with all egress via GCP Cloud CDN to N. America and the bill comes out to just shy of $130 million.
OP is off by an order of magnitude by my napkin math, but it's closer than I was expecting.
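The napkin math above, spelled out so anyone can plug in their own numbers. This assumes decimal units (1 KB = 1,000 bytes, 1 PB = 10^15 bytes) and treats the "just shy of $130 million" calculator result as a flat $130M; the per-GB figure is only what those two numbers imply, not a quoted price.

```python
FONT_LOADS = 36_300_000_000_000  # 36.3 trillion font loads (per the source)
KB_PER_LOAD = 170                # size of one Roboto weight, in KB
BYTES_PER_KB = 1_000             # decimal units assumed
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15

total_bytes = FONT_LOADS * KB_PER_LOAD * BYTES_PER_KB
total_pb = total_bytes / BYTES_PER_PB
total_gb = total_bytes / BYTES_PER_GB

# Upper bound taken from the calculator result quoted above.
ESTIMATED_BILL_USD = 130_000_000
implied_usd_per_gb = ESTIMATED_BILL_USD / total_gb

if __name__ == "__main__":
    print(f"total egress: {total_pb:,.0f} PB")
    print(f"implied rate: ${implied_usd_per_gb:.3f} per GB")
```

So the estimate works out to roughly two cents per GB, and the total scales linearly with whatever per-load size you assume.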
Also we don’t know the number of HTTP cache check requests. I’m assuming the number is way, way higher than the number of fonts served? That can cost you 160 million alone, with no bandwidth.
Edit: It's now apparently 1 euro/TB without VAT.
Basically, one out of a thousand requests delivers a broken font that breaks document.fonts.load().
https://gs.statcounter.com/os-market-share has Linux usage at just under 1%.
I bet many of them use Google Fonts, and with them auto-refreshing every 5 seconds all day long, they might get seriously overcounted.
Also, Google Fonts does more than just deliver a font to the browser: the text= and font subset features can reduce the download size a lot, which you couldn't do if you were serving the WOFF2 yourself on a dynamic site (e.g. you didn't know what characters were needed before the page rendered on the server or client).
Lastly, for common fonts like Roboto and Open Sans, there's a non-zero chance it'll already be in the user's cache. That's a win.
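For reference, the text= feature is just a query parameter on the CSS request. A small sketch that builds such a URL; the helper name `subset_css_url` is mine, and deduplicating the characters first is an optional nicety rather than something the API requires, as far as I know.

```python
from urllib.parse import urlencode

CSS_ENDPOINT = "https://fonts.googleapis.com/css"


def subset_css_url(family: str, text: str) -> str:
    """Build a Google Fonts CSS URL that requests only the glyphs used in `text`."""
    # Deduplicate and sort so the same set of characters always yields the same URL.
    chars = "".join(sorted(set(text)))
    return CSS_ENDPOINT + "?" + urlencode({"family": family, "text": chars})


if __name__ == "__main__":
    # A logo or headline only needs a handful of glyphs.
    print(subset_css_url("Roboto", "Hello"))
```

This only helps when you know the text up front (a logo, a headline); for body text you'd rely on the language subsets instead.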
Looks like Macs have roughly a 17% share of the Windows + Mac total requests. The overall market share per Wikipedia (https://en.wikipedia.org/wiki/Market_share_of_personal_compu...) is something like 7% for Apple.
Fonts are probably used by most "normal" computer users in some way so I wonder where that difference comes from
- Windows breaks faster and must be replaced more often, increasing the number of sold Windows systems per year
- Windows users use computers differently? Must be a pretty big thing to cause such a difference
- more Windows systems not connected to the internet?
I kinda doubt that the typical enterprise Windows setup with strict firewall rules etc. will have a meaningful impact on web fonts, and IE 6+ seems to support Google Fonts (https://developers.google.com/fonts/faq#what_browsers_are_su...)
- macOS has broken browsers and doesn't cache, resulting in multiple font downloads per website browsing session.
- macOS users use computers differently? Do they refresh the page multiple times, racking up a download count?
The first odd thing about the stats is Linux's share; with a 2% market share it's racked up 10T downloads. A major part of that is Android-based browsers, but a decent part of that will be Linux desktop users, developers, automation tests and so on. Overall we should expect users' browsing habits to play a part; there may be a large shift towards mobile/tablet web consumption and reduced desktop web consumption causing a stats skew.
The same effect shows up to a greater degree on mobile, where Android has something like three times iOS’s units sold yet the same amount of mobile traffic and half as much app store revenue.
When I worked at Apple, the plastic MacBooks outsold everything else because they were cheapest. It was weird that we did all our power/performance benchmarks on a $2000 15" MacBook Pro when they only made up 10-15% of Mac sales.
From that time, we also knew that Macs were extremely popular for home use, IIRC something like 30-40% penetration, but people were using Windows for work or being given a Windows laptop from work for personal use.
One of the reasons Apple never made a Netbook was that it was an inferior good. People, when asked, preferred a larger screen and trackpad but couldn't afford it. The bottom-tier 11" MacBook Air and the iPad were a response to upsell that demand.
30% penetration in the USA. Here we are talking about global usage. In many other first-world countries Macs are nowhere to be seen, eg 3%.
Then later in this same thread you say:
> Webpages in non-latin alphabets probably use way less fonts from Google Fonts.
So uh, which is it?
Or they are replaced more often just because they are cheaper.
There is a second-hand market for Apple stuff; in PC land that's unthinkable.
More Windows devices are in corporate/enterprise environments and are less likely to be used for casual web browsing?
Answers? Oh, I don't have them, I am just pointing out the problem.
Anytime an advertisement company decides to do something for free, my cynicism goes on high alert. I am genuinely curious though - what's the business model? Are they collecting massive amounts of data while offering fonts?
As far as the unsavory interpretation of how they could conceivably use Fonts data to further that end: calls to download their fonts are another touchpoint with the Google ecosystem, and a theoretical vector for further tracking the browsing behavior and device graph of an individual.
That said, traditional font licensing for commercial use is absolutely bonkers. Even if Google is using the data they collect from serving font files to feed into their user tracking, they've done a service to the web.
 Working at a creative agency, I learned that we can't so much as legally use a client-dictated (and licensed) font in a mockup without paying thousands of dollars for a license ourselves.
 Webfonts tend to be licensed on a per-pageview basis. One client got hit with a temporary flood of scraping/bot traffic, and the biggest economic impact was the unexpected six-figure font bill that month. We convinced them to put the site behind Cloudflare and used a Worker to strip out the font include for suspected bot traffic and inadvertently lowered their licensing cost by more than our annual retainer.
 I can't say how it is everywhere, but working at one of the top three marketing agencies, Google Fonts are the only approved open-source fonts we're allowed to use (since they're primarily free for commercial use, as well). Not every client is able or willing to absorb a 5-6 figure font line item for every engagement, so without Google Fonts we (and I presume other major agencies) would be stuck with using system default fonts everywhere.
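On the bot-traffic footnote above: the idea of stripping the font include for suspected bots is simple enough to sketch. This is not the actual Worker code (that would be JavaScript running at the edge); it is the same logic in Python, with a made-up font host `fonts.example.com` and a deliberately naive user-agent check standing in for real bot detection.

```python
import re

# Hypothetical licensed-font host; matches the <link> tag that pulls in its CSS.
FONT_LINK = re.compile(
    r'<link\b[^>]*\bhref="[^"]*fonts\.example\.com[^"]*"[^>]*>\s*',
    re.IGNORECASE,
)

# Naive heuristic for illustration only; real bot detection is much more involved.
BOT_HINTS = ("bot", "crawler", "spider", "curl")


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the user agent contains an obvious bot marker."""
    ua = user_agent.lower()
    return any(hint in ua for hint in BOT_HINTS)


def strip_font_include(html: str, user_agent: str) -> str:
    """Remove the metered font <link> for suspected bots; leave other traffic alone."""
    if looks_like_bot(user_agent):
        return FONT_LINK.sub("", html)
    return html


if __name__ == "__main__":
    page = '<head><link rel="stylesheet" href="https://fonts.example.com/css?family=Demo"></head>'
    print(strip_font_include(page, "ExampleBot/1.0"))
```

Bots that never see the include never trigger a billable font request, while real visitors get the page unchanged.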
More importantly though, it gives Google accurate insight into web traffic (many users block Google Analytics, but almost everyone loads web fonts), and it allows them to crawl websites more easily – before web fonts, many websites used pre-rendered PNGs to show web-unsafe fonts, which made crawling impossible.
Google has poured a lot of money into this, the fonts are free and open source, and the users aren't the product. Overall, I think it's a rare win-win story in this age of dystopian adtech.
Why, though? Not charity, I'd assume. The only other answers I can come up with are pretty damn nefarious.
1. A lot of text was rendered into PNGs, and that made the web less searchable, as well as less accessible, slower to load, and less mobile-friendly. All 4 of these factors do have economic impact at Google scale.
2. Fonts were one of the few features that Flash had that HTML5 was lacking, and we wanted to accelerate that transition. Again, mobile was one of the major driving factors.
3. For a while, we were organizationally funded under Google Docs. Again, fonts were one of the major missing features compared with Microsoft Office, so filling that gap was strategic. Here, our open source approach really paid off, otherwise dealing with proprietary font licensing in the context of documents that can be shared and copied would have been nightmarish.
4. To the extent that you are able to make the case that fonts make ads better (or advertisers happier), getting modest amounts of funding ceases to be a problem. To be clear, when I was on the team this was more of a glimmer of future abundant resources than day-to-day reality.
Lastly, while "charity" isn't exactly the right word, the motivations of the people working on the team are/were basically that we love fonts and want to make the Internet better. At Google scale, we were able to sell the project using basically a combination of the above arguments.
Never once when I was on the team were we asked to implement any form of individual user tracking, nor did I hear a suggestion of such a thing. All our work on collecting analytics was to improve performance and quantify our impact. I have no reason to believe things have changed on that front since I was directly involved.
I loved that site.
> it allows them to crawl websites more easily – before web fonts, many websites used pre-rendered PNGs to show web-unsafe fonts, which made crawling impossible
Since the browser hands font rendering off to the operating system, however, the most pertinent browser-specific adjustment would be updating the (configurable) defaults (Serif, Sans-serif, etc...) for each supported platform - Georgia instead of Times for a default, for example.
One motivation was sharing the cache among all sites that linked the fonts, which at the time I think was a big win, but as has been mentioned elsewhere, this is going away.
Is there a way to somehow proxy a subset of popular fonts locally rather than block the CDN entirely?
The closest I could find was Google Search Appliance but I wouldn't think that is applicable to these stats.
Caches are no longer shared across domains because doing so leads to privacy leaks: https://www.jefftk.com/p/shared-cache-is-going-away
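A toy model of what changed, in case it helps. This is a sketch of the idea only: keying on (top-level site, resource URL) is the gist of cache partitioning, but actual browser cache keys are more detailed than this. The site names and font URL are invented.

```python
class HttpCache:
    """Toy browser cache that counts how often a resource must hit the network."""

    def __init__(self, partitioned: bool) -> None:
        self.partitioned = partitioned
        self._keys = set()
        self.network_fetches = 0

    def load(self, top_level_site: str, resource_url: str) -> str:
        # Old behavior: key on the resource URL alone, so every site shares one entry.
        # Partitioned behavior: the visiting site becomes part of the key.
        key = (top_level_site, resource_url) if self.partitioned else resource_url
        if key in self._keys:
            return "hit"
        self._keys.add(key)
        self.network_fetches += 1
        return "miss"


if __name__ == "__main__":
    font = "https://fonts.example.com/roboto.woff2"  # hypothetical URL
    for partitioned in (False, True):
        cache = HttpCache(partitioned)
        results = [cache.load(site, font) for site in ("a.example", "b.example")]
        print(f"partitioned={partitioned}: {results}")
```

With the shared cache the second site gets the font for free; with partitioning every site pays for its own first load, so the "already in cache from another site" argument no longer applies.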
Enough so that if you rely on caching for user experience, your average user is not going to have a good time.
- not all sites use CDN, even if using the same font
- not all sites use the same CDN
- users clear the cache (clearing history does that on Firefox)
- incognito mode is a thing
- first load matters a lot
- browsers have a limit on the disk space they use for cache, and they evict older entries when they reach it. Given sites are now bloated, this fills up fast.
- this repeats for each browser. I have 5 on my laptop, 3 on my mobile, 2 on my tablet.
The challenge is that to ensure consistency, you need to use the "lowest common denominator" of fonts that are installed by default across all operating systems. Which leaves you with (like) Arial, Times, Courier, Verdana, Georgia, Palatino, and (hahahaha) Comic Sans.
The real answer here is why web designers use Google Fonts as opposed to embedding their own fonts. To which the answer is: it's so much easier. (Tech, licensing, formats, compatibility, etc.)
And most of those fonts have licenses that are inconvenient at best. The only thing that allows them to be packaged for Linux distributions is Microsoft's '90s-era "Core Fonts for the Web" initiative. This initiative is long discontinued, and so the fonts cannot be downloaded from Microsoft anymore. Only the '90s versions of the fonts are free. Worse yet, the license forbids packaging the fonts in any way other than with their original 32-bit Windows installer, which means that hacks like cabextract are necessary to install them on any other system.
Instead we're restricted to only a few fonts that actually have decent cross platform support.
It’s kind of the same problem as saying that the browser should include common images. Which images? Why? How many?
Images follow a different usage distribution than fonts. I'd say that the top 100 fonts are enough to render most web content. For images this is obviously different: the top 100 images might appear often, but not as often as the top 100 fonts.
Google is already distributing the equivalent for text, in the form of the brotli corpus which ships in every Chrome installation.
Well, pontificating here:
1. Because many applications don't ship with additional fonts, and it's an additional layer of complexity. Some do—Microsoft Word comes to mind, IIRC.
2. Because you're still left with the same problem: unless all browsers can agree on an additional set of standard fonts, you as a web developer will only want to use those installed by Chrome and Edge and Safari and Firefox.
3. Because licensing for desktop applications may (?) be more of a pain than licensing for web usage. Which may not matter for Google Fonts, since they may be the license holder for all anyway. I don't know.
I cringe when I think about how many petabytes of jQuery have been sent over the wire over the years.
HTML5 as a living standard covers the majority of the needs that used to be served by plugins – interactivity, dynamic pages, two-way communications, multimedia.
WebSockets provide for realtime communications with the browser, something again not possible without 3rd-party plugins.
So, what you are saying is in fact happening. But that doesn't mean it's not an issue with sliding goalposts. It's just that it doesn't happen as quickly as we all may like it, but that's because "basic functionality" is a constantly moving target with different definitions depending on who asks.
At least in this case a lot of the requests are cached across sites.