
I agree that memory size isn't going to matter much, but memory speed should be noticeable, especially on large projects.


It's definitely blending between sphere and torus, and then torus and sphere. Otherwise it would keep getting fainter as time went on.


It's nice to have something that the humans know with absolute certainty so that we know how much to trust the machines.


My guess is either compression or stuff lingering in RAM. The CPU can't be smart here since it doesn't know what any of the future ifs will be. It doesn't know they're in order, or unique, or even valid instructions. You could (theoretically; the OS probably wouldn't let you) replace an if with an infinite loop while the program is running.


Could it be the branch predictor of the CPU somehow catching on to this? Perhaps something like 'the branch for input x is somewhere around base+10x'.


> Perhaps something like 'the branch for input x is somewhere around base+10x'.

That's unlikely. Branch predictors are essentially hash tables that track statistics per branch. Since every branch is unique and only evaluated once, there's no chance for the BP to learn a sophisticated pattern. One thing that could be happening here is BP aliasing. Essentially all slots of the branch predictor are filled with entries saying "not taken".

So it's likely the BP tells the speculative execution engine "never take a branch", and we're jumping as fast as possible to the end of the code. The hardware prefetcher can catch on to the streaming loads, which helps mask load latency. Per-core bandwidth usually bottlenecks around 6-20GB/s, depending on if it's a server/desktop system, DRAM latency, and microarchitecture (because that usually determines the degree of memory parallelism). So assuming most of the file is in the kernel's page cache, those numbers check out.
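For context, here's a minimal sketch (hypothetical, in Rust) of the kind of generated code being discussed: a long chain of unique branches, each compared against the input at most once per run.

    // Hypothetical sketch of the generated code under discussion: a huge chain
    // of unique branches, each evaluated at most once per run.
    fn check(x: u32) -> bool {
        if x == 0 { return true; }
        if x == 1 { return false; }
        if x == 2 { return true; }
        if x == 3 { return false; }
        // ...billions more unique, never-repeated branches...
        false
    }

Since nearly every one of those branches falls through, a predictor that has aliased them all to "not taken" is almost always right, and execution becomes a linear streaming read through the instruction bytes until the one matching branch is reached.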


I doubt it; branch predictors just predict where one instruction branches to, not the result of executing many branches in a row.

Even if they could, it wouldn’t matter, as branch prediction just lets you start speculatively executing the right instruction sooner. The branches need to be fully resolved before the following instructions can actually be retired. (All instructions on x86 are retired in order).


I was expecting to find out how much data YouTube has, but that number wasn't present. I've used the stats to roughly calculate that the average video is 500 seconds long. Then using a bitrate of 400 KB/s and 13 billion videos, that gives us 2.7 exabytes.

I got 400KB/s from some FHD 24-30 fps videos I downloaded, but this is very approximate. YouTube will encode sections containing less perceptible information with less bitrate, and of course, videos come in all kinds of different resolutions and frame rates, with the distribution changing over the history of the site. If we assume every video is 4K with a bitrate of 1.5MB/s, that's 10 exabytes.
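For anyone who wants to check the arithmetic, here's the back-of-the-envelope version (the inputs are the rough assumptions above, not anything YouTube publishes):

    fn main() {
        // Rough assumptions from the estimate above; none are official numbers.
        let videos = 13.0e9_f64;     // ~13 billion videos
        let avg_seconds = 500.0;     // ~500 s average length
        let fhd_bytes_per_s = 400e3; // ~400 KB/s for FHD
        let uhd_bytes_per_s = 1.5e6; // ~1.5 MB/s for 4K

        let to_eb = |bytes: f64| bytes / 1e18;
        println!("FHD: {:.1} EB", to_eb(videos * avg_seconds * fhd_bytes_per_s)); // ~2.6 EB
        println!("4K:  {:.1} EB", to_eb(videos * avg_seconds * uhd_bytes_per_s)); // ~9.8 EB
    }

The small differences from the figures above come from rounding the average length and bitrate.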

This estimate is low for the amount of storage YouTube needs, since it would store popular videos in multiple datacenters, in both VP9 and AV1. It's possible YouTube compresses unpopular videos or transcodes them on-demand from some other format, which would make this estimate high, but I doubt it.


That storage number is highly likely to be off by an order of magnitude.

400KB/s, or 3.2Mbps as we would commonly express it in video encoding, is quite low for an original-quality upload in FHD, commonly known as 1080p. The 4K video number is just about right for the average original upload.

You then have to take into account that YouTube compresses those into at least two video codecs, H.264 and VP9, each at every resolution from 320p up to 1080p or higher, depending on the original upload quality. Many popular videos and 4K videos are also encoded in AV1, and some even come in HEVC for 360° video. Yes, you read that right: H.265 HEVC on YouTube.

And all of that doesn't even include replication or redundancy.

I would not be surprised if the total easily exceeds 100EB, which is 100 times the size of Dropbox (as of 2020).
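To put very rough numbers on that compounding (every multiplier here is my guess, just to illustrate the shape of it):

    fn main() {
        // Every factor below is a guess to show how the multipliers compound,
        // starting from the ~10 EB original-upload estimate upthread.
        let originals_eb = 10.0_f64;
        let transcode_factor = 2.0;   // H.264 + VP9 ladders, plus AV1/HEVC for some
        let replication_factor = 3.0; // copies across datacenters / redundancy

        println!("~{} EB", originals_eb * transcode_factor * replication_factor); // ~60 EB
    }

Nudge any of those factors up and you're past 100EB quickly.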


For comparison, 100-200EB is roughly the order of magnitude of all HDDs shipped per quarter:

https://blocksandfiles.com/2023/07/11/disk-drive-numbers/


Sure but exabyte-level storage is done with tape, not HDDs.


I doubt any of YouTube is being served off tape. Some impressive amount of its data is probably served from RAM.


I mean, it would explain the minutes-long unskippable ads you get sometimes before a video plays. There's probably an IT maintenance guy somewhere, fetching that old video tape from cold storage and mounting it for playback.


I don’t have ads. No wait times.


> EB

I pine for the day when "hella-" extends the SI prefixes. Sadly, they added "ronna-" and "quetta-" in 2022. Seems like I'll have to wait quite some time.


> Originally 'quecca' had been suggested for 10^30 but was too close to a profane meaning in Portuguese

(from https://iopscience.iop.org/article/10.1088/1681-7575/ac6afd, via wikipedia)

Seems like we missed the opportunity to have an official metric fuckton.


For anyone wondering, "queca" would be the normal spelling of the "profanity", although it's probably one of the milder ways to refer to having sex. "Fuck" would be "foda" and variations. "Queca" is more of a funny way of saying having sex, definitely not as serious as "fuck".

Hyundai Kona, on the other hand, was way more serious, and they renamed it after another island for the Portuguese market. The closest translation of Kona (actual spelling "cona") would be "cunt", in the US sense in terms of seriousness, not the lighter Australian one.

Source: I'm Portuguese.


> Two of the suggestions made were brontobyte (from 'brontosaurus') and hellabyte (from 'hell of a big number'). (Indeed, the Google unit converter function was already stating '1 hellabyte = 1000 yottabytes' [6].) This introduced a new driver for extending the range of SI prefixes: ensuring unofficial names did not get adopted de facto.

Rats. So close!


We should petition for "fukka-"


So is YouTube storing somewhere in the realm of 50-100EB somewhere? How many data centers is that?


> The 4K video number is just about right for average original upload.

No, it definitely is not.


For smartphone / CE-grade recorded video, not for professional ones. Remember that number is an average.


4K is not the average. SD videos outnumber HD, and of the HD videos most are 1p max.


What is 1p?


1k. 1280x1024


It's not the average in either case.


I was under the impression that all Google storage including GCP (and replication) is under 1ZB.

IIUC, ~1ZB is basically the entire hard drive market for the last 5 years, and drives don't last that long...

I suspect YouTube isn't >10% of all Google.


On one hand: just two formats? There are more, e.g. H264. And there can be multiple resolutions. On the same hand: there might be or might have been contractual obligations to always deliver certain resolutions in certain formats.

On the other hand: there might be a lot of videos with ridiculously low view counts.

On the third hand: remember that YouTube had to come up with their own transcoding chips. As they say, it's complicated.

Source: a decade ago, I knew the answer to your question and helped the people in charge of the storage bring costs down. (I found out just the other day that one of them, R.L., died this February... RIP)


For resolutions over 1080, it's only VP9 (and I guess AV1 for some videos), at least from the user perspective. 1080 and lower have H264, though. And I don't think the resolutions below 1080 are enough to matter for the estimate. They should affect it by less than 2x.

The lots of videos with low view counts are accounted for by the article. It sounds like the only ones not included are private videos, which are probably not that numerous.


You’re forgetting about replication and erasure coding overhead. 10 exabytes seems very low tbh. I’d put it closer to 50-100EB at this point.


I did the math on this back in 2013, based on the annually reported number of hours uploaded per minute, and came up with 375PB of content, adding 185TB/day, with a 70% annual growth rate. This does not take into account storing multiple encodes or the originals.
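For reference, a rough reconstruction of that kind of estimate (the assumptions here are mine: YouTube's widely quoted ~2013 figure of 100 hours uploaded per minute, and the ~400 KB/s bitrate used upthread; the parent's exact inputs aren't stated):

    fn main() {
        // Assumed inputs, not the parent's actual numbers.
        let hours_per_minute = 100.0_f64; // YouTube's ~2013 "100 hours per minute"
        let video_seconds_per_day = hours_per_minute * 60.0 * 24.0 * 3600.0; // ~518M s/day
        let bytes_per_second = 400e3; // ~400 KB/s, as above

        let tb_per_day = video_seconds_per_day * bytes_per_second / 1e12;
        println!("~{:.0} TB/day", tb_per_day); // ~207 TB/day, same ballpark as 185 TB/day
    }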


Keep in mind YouTube permanently keeps a copy of the original upload, which may have an even larger file size.


Do you know that for certain? I always suspected they would, so they could transcode to better formats in the future, but never found anything to confirm it.


I know it was true at least a couple of years ago, but there is no guarantee that they keep it.


On all of the videos I have uploaded to my YouTube channel, I have a "Download original" option. That stretches back a decade.

Granted, none of them are uncompressed 4K terabyte sized files. I haven't got my originals to do a bit-for-bit comparison. But judging by the filesizes and metadata, they are all the originals.


I'm interested in the carbon footprint.



Thanks, 10 million metric tons per year. ~0.1% of global emissions.

Pretty wild stuff.


The most important thing is to sort by date modified by default. Usually, the file you want is very new.

After that I mostly just use "pic:" or "path:".


Nothing's app was a branded version of Sunbird ( https://www.sunbirdapp.com/ ), which is technically similar to the iMessage part of Beeper Cloud. It is essentially a way for you to log in with your Apple ID on a Mac Mini in a server farm and then interact with its desktop iMessage client from an Android app.

This was done because Apple obfuscates how its notification system works, so the cheapest short-term solution is to just use real Apple hardware.

When Nothing released it, it was found to have many flaws, which is where that video comes in. Nothing unreleased it and hasn't followed up since.

Beeper Mini uses the long-term solution of reverse engineering Apple's notification system so that it can run independently of an Apple device.


Most likely yes.

However, if you don't specify the argument type, it compiles fine. It's rare that you need to specify argument or return types on a closure, so it's not actually a large issue.
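For anyone unfamiliar, closure parameter and return types are inferred from how the closure is used, so the annotated form is rarely needed. A minimal sketch (not the exact snippet under discussion):

    fn main() {
        // Types are inferred from use; no annotations needed.
        let add = |a, b| a + b;
        println!("{}", add(1, 2)); // a and b are inferred as i32 here

        // Annotations are still allowed when you want to be explicit.
        let add_explicit = |a: i32, b: i32| -> i32 { a + b };
        println!("{}", add_explicit(3, 4));
    }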


This is definitely not helped by Rocket. Even I, knowing Rust well, don't enjoy the way Rocket puts so much logic into attribute macros and function signatures.
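For anyone who hasn't used it, a Rocket route looks roughly like this (paraphrased from Rocket 0.5's hello-world, from memory; details may differ by version). The HTTP method, the path, and the typed parameter parsing all live in the attribute and the function signature:

    #[macro_use] extern crate rocket;

    // Method, path, and parameter parsing are encoded in the attribute
    // and the function signature.
    #[get("/hello/<name>/<age>")]
    fn hello(name: &str, age: u8) -> String {
        format!("Hello, {} year old named {}!", age, name)
    }

    #[launch]
    fn rocket() -> _ {
        rocket::build().mount("/", routes![hello])
    }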


Yeah, I generally have no issue with Rust's syntax but Rocket's API is certainly something.


I get the reason, but there's no need to worry. Most people here know that links aren't personal unless it's a Show/Ask/Tell HN. You can find other examples on the front page.

HN has a thing against questions in titles because a lot of them are clickbait. I don't care about this rule but that's why it was changed.

