S3 isn't getting cheaper (matt-rickard.com)
288 points by rckrd on July 28, 2022 | 205 comments



> It's hard to come to a firm conclusion since pricing data is held very secretly (AWS quickly deletes announcements of historical price decreases).

That's not correct. Here is the history of every (properly tagged) price reduction AWS ever announced: https://aws.amazon.com/blogs/aws/category/price-reduction/


Those blog posts are a good start. I was referencing the full pricing page for s3 that breaks down by region and class. AFAIK there are some third parties that track the granular data, but it's not preserved anywhere on the AWS page. A lot of price decreases also might have happened silently (without a blog post) or as de facto decreases (widely but privately negotiated, e.g. "sticker price").


> I was referencing the full pricing page for s3 that breaks down by region and class.

You were referring to the announcements and these announcements are still available.

The actual pricing pages of course only show current prices. What would be the motivation for AWS to show outdated prices there as well? No other company I can think of does that and given the complexity of the pricing structure for AWS services that'd only confuse users even more.

> A lot of price decreases also might have happened silently (without a blog post)

Do you have an example for that or is this just hearsay?

> or as de facto decreases (widely but privately negotiated, e.g. "sticker price")

Privately negotiated deals aren't regular price reductions as they're only available for customers with a fairly large spend for such AWS services.


I can say that price _increases_ happened for a few AWS products in specific regions/countries without blog posts. Often AWS simply emails affected customers 1-3 months in advance, and again after making the change. That said, I imagine prices shifted due to currency, labour costs, energy costs or other regional concerns. And I’ve seen the opposite - I’ve seen AWS freeze prices and eat the cost differences long after they should have raised prices. It varies by product and region and marketing strategy, I expect.

The only rule of thumb is that what is posted on the pricing page is supposed to be what you get charged. No APIs exist for the most part, and the actual charges can sometimes differ anyway (e.g. grandfathered pricing etc.)


I don't disagree with you but as parent said, op was talking specifically about decreases.


> Those blog posts are a good start. I was referencing the full pricing page for s3 that breaks down by region and class.

This is not a good look.

From your blog post:

> It's hard to come to a firm conclusion since pricing data is held very secretly (AWS quickly deletes announcements of historical price decreases).

There’s nothing wrong with “I missed that. Thanks, let me update the blog”.


If that's his core argument then it undermines the entire blog post, right? Not to mention it calls into question other information he's provided. All AWS pricing is available publicly; saying the prices are "held very secretly" and being obviously wrong isn't a good look.


It doesn't appear to be his core argument, so what is the point of this hypothetical?


That was incorrect of me to say. I was trying to say something like "He only has x key (core) points and if one is wrong it doesn't look good"

I used "core" incorrectly


> as de facto decreases (widely but privately negotiated, e.g. "sticker price").

Widely, AWS doesn't negotiate on pricing. Small businesses get the list price or they can switch to another cloud. Medium and large businesses (from 100 employees to Fortune 500) get blanket discounts (excl. some specific line items) in exchange for a spend commitment, e.g. if you commit to at least $10M AWS spend per year, you'll get a 10% discount on all your spend. AWS doesn't give different discounts to S3 compared to other AWS products.

Do unique top-top-tier customers of AWS get special S3 pricing? Nice for them, but that isn't _widely_ available.


I don't know about S3 specifically, but with a large enough consumption you can negotiate a specific rate card for individual AWS services in addition to your Enterprise discount.


FWIW $10M is at least 1 order of magnitude, and in some cases 2 orders of magnitude higher than necessary in order to negotiate discounts, depending on services used.


AFAIK enterprise discounts start at or above $1M. I don't have recent data on minimums or common discount rates.


I was at two companies that negotiated data transfer rates with a minimum commitment. I think one was $0.01/GB with a minimum of 5PB a month.


I think at one company I worked, our discount specifically excluded data transfer. It is X% off for all AWS services not including their data transfer costs (both inter-AZ and extra-VPC). It makes sense to me that AWS, with their networking-heavy pricing, negotiates on services and data transfer separately.


I expect pricing is probably pretty bespoke, given there are humans involved. I worked for a company that had X% off all AWS costs, including all data transfer - except for CloudFront, which had a whole separate negotiated rate card (including data transfer, per-request rates, etc.).


AWS's pricing strategy changed a few years ago. Previously, they'd drop prices aggressively with much fanfare every time they did. Nowadays, they almost never drop prices, but release new offerings they say will save money [for certain use cases]:

- Graviton2 is slightly cheaper than Intel instances, with a greater price/performance ratio—at least for some applications.

- m6i is the same price as m5, but with a new generation of CPU and increased network performance.

- gp3 is 20% cheaper than gp2 and has better baseline performance. It can also scale performance independently of size, so no more overprovisioning storage to hit a certain IOPS.

- S3 has IA, Glacier, Glacier Deep Archive, all offering cheaper storage but more expensive retrieval.

Not defending AWS here, just noting that they don't seem to be interested in direct price reductions anymore.

(EDIT: Previously I stated gp3 and gp2 were the same price. Thanks zerocrates and maxxam for the correction.)


This is how every market player is incentivized to behave, absent regulatory authorities acting to prosecute monopolistic behavior.

1. Price and compete aggressively to grow market share, put competitors out of business, and even transform the market, if possible

2. Once you become the dominant player (i.e. a monopoly), raise prices (or don't lower them) as much as possible without damaging your market position

This is why it's so important, especially in the US, for government to begin aggressively enforcing antitrust law again -- something we've failed to do against companies founded since the 1980s.

Those laws are there for a reason - not just as a weapon against "evil" (though Standard Oil was admittedly pretty bad) but to keep the market healthy and growing. There are so many positive second- and third-order effects that occur only when competition is healthy and well-regulated, and when prices are transparent.


> This is why it's so important, especially in the US, for government to begin aggressively enforcing antitrust law again

It already does, and it never particularly stopped. The problem is that there is no obvious damage to consumers here (which is what US antitrust law is based on).

GCP and Azure are both significant players with similar offerings. The fact that people use proprietary Amazon APIs to manage stuff isn’t a high enough bar to show a monopoly.

“We are locked into their product because it’s a big engineering expense to move off” isn’t an argument for monopoly busting. It’s a reflection on poor business decisions by the complainer. It has never worked against Oracle/MS in the past, it won’t start now.

A whole generation of engineers is about to relearn the importance of open source that drove everyone to open source stacks 15 years ago.


> A whole generation of engineers is about to relearn the importance of open source

it's not really about open source, but about interoperability and open protocols.

Imagine if AOL internet were the de facto standard, and every website were its own walled garden? Oh wait, no, we already have that - it's the mobile app ecosystem!

The reason the web is so successful (but not monopolistic) is that the HTTP protocol is open, and the HTML standards are open (at least until Google started meddling, now that they are almost a browser monopoly...)

So laws for anti-trust should now take that into account - platform monopolies can be beaten by forcing interoperability via legislation.


On the software side of the equation --> open source is NOT the answer, abstractions ARE. The problem with programmers TODAY is that they take technical articles written by a cloud provider as gospel, and start slathering large layers of proprietary API calls/technical debt all over their solution - because that's what example code is designed to do, LOCK YOU IN. No need for contract nasties when your devs are doing the work for them!

Use an eggshell architecture, put your dependencies on the edge. Global dependencies are the enemy.

For IaC, that's a slightly different story.


Note: ex-AWS employee

The problem with the above narrative in this case (which I agree is generally true) is that it's not what AWS did. If you look at the announcement history shared in a sibling comment you'll see that so much of it happened long before the cloud wars really heated up. I find that particularly curious. Is that because margins used to be very very fat and AWS just trimmed them down as economies of scale allowed in an attempt to (unsuccessfully) stave off competition? Is it just coincidence that the system found a natural level of margin efficiency right around when competitive pressures started to ramp up? Something else?

I have zero insight on why it has played out this way. I'd love to know though.


Seems likely that the competitors they were trying to dig into were on-prem/leased machine incumbents more than the nascent cloud competitors, though? Trying to build the brand of "Cloud" as an alternative, to the point where now it's practically the default choice for big enterprise/government customers. They combined that with extremely aggressive credit packages to get startups while they were young, so they would adopt the various services rather than building out their own infra on bare metal and saving a bundle long term.

Though making it harder for cloud competitors was certainly good for them as well.

It's like with any startup, your biggest competitor is what people do currently, not necessarily another company, let alone another startup. For many software startups, the biggest competitor is something like pen and paper, excel, standard email, etc.


Again I want to be clear that I have zero special insight on this based on my previous employment...

I was at a startup that was an early and very large user of AWS. The alternatives for us at the time would have been companies like Rackspace and other innovative (at the time) colo type providers. AWS wasn't super competitive on a $/compute basis at that point, and the credits didn't last very long, but it was wwwaaayyyyy more flexible. Add in the incredible convenience of S3 relative to most other alternatives at the time and it was an easy, though not obviously cheap, option.

The common narrative I hear though is it was the startup focus that won it for AWS. Everyone else was chasing the on-prem and enterprise market as you said. AWS went after startups, dangled some modest credits to make it happen, and they stuck around. The conventional wisdom was this is a terrible mistake. Enterprises pay the bills, startups go bust in an economic downturn (and we were coming out of a cycle at that point, so companies were understandably nervous). Except those startups that AWS attracted turned into Netflix, Airbnb, Uber, Lyft, etc. and a whole host of voracious consumers of infrastructure. The startups had become the enterprises. The competitors were _still_ trying to convince enterprises that cloud was safe enough to adopt. They belatedly realized they'd played the wrong game, tempted some of the not-so-startup-anymore companies across with more competitive pricing commitments, and finally the battle began. By this stage AWS had won most of the viable early adopters and used that as the beachhead to grow into the big enterprise and gov areas.

At least that's the narrative I've been told a few times over the years, and it seems plausible and maps onto my own experience. Though all of that experience has been startups and not enterprise/gov so it's a very skewed perspective.


But AWS invented IaaS, started at 100% market share, lost market share for years, and is now below 40%. Egress pricing is anti-competitive but that's a different narrative.


Market share alone isn't a valid metric IMO, especially in cloud where the industry still grows at breakneck pace.

In fact, inviting other players to join you to heat up the capital market might be a beneficial deal, as it signals interest.


It's hard lock-in rather than antitrust. I don't know what should be done.


I suspect we'll arrive at a place as a society where regulators closely monitor use of protocols and require companies to fully support open protocols and network APIs.


There have been 9 antitrust lawsuits filed against big tech in the past two years; what are you suggesting there should be more of?


How much have they had to pay in penalties, and how many significant structural changes have there been? That's the standard to use, not number of lawsuits or even number of verdicts.

Antitrust penalties, more often than not, aren't even as big as the extra profit a company made!


There should be results. Tech giants are too big to fail, as are the big banks and insurers. If we break up the banks and insurers by product offerings, then why not break up big tech?

Each company gets split into 3 smaller firms, each with access to all the IP of the mothership. They get to fight for their customers, and barter over business assets, to figure out who owns what.


I find it so highly ironic that anyone truly believes this would end up better.

They would not fight over customers, in the same way the breakup of AT&T did not result in a fight over customers - only a more complete, nimble, and effective domination of the entire US that lasts to this day.

These companies are hamstrung by their size and internal cultural infighting - you will break them into much more effective units than they can culturally achieve themselves.

The typical trotted out example of Microsoft is an aberration - Microsoft did not believe anything bad would ever happen, was horribly defiant, and refused to prepare. It still bounced back anyway, to a point of serious domination again.

Meanwhile, all of these current tech companies are prepared for this eventuality, as AT&T was.


I'm not convinced either way on whether breaking them up is a good thing.

But for me the biggest "for" argument isn't that the broken up companies will compete with each other (that makes no sense because you'd break them up along business lines where they don't compete anyway).

Instead, for me it is that it removes cross subsidies, both financial (as in "we can give away this loss-making application because we make so much money from X") and marketing ("our new product Y is inferior to the existing product Z, but we can make Y the default in our other apps and then people will just use it").

This removal of cross subsidies does increase competition.

However it's possible to force the removal of cross subsidies by other means (eg, the old "force the user to be able to select the search engine in Internet Explorer" regulation etc).


Yeah, you could kind of see this when Google started making Maps pay its own way, and started charging seriously for Maps API access - suddenly a host of decent alternatives sprang up, because they stopped sucking all of the oxygen out of the market.


It’s interesting - those financial cross-subsidies could lead to a situation where the sum(consumer_value_from_free_product) > total_consumer_harm.


Yes that's the common argument, and there's often a good chance it is better. Hard to make a generic policy when details matter a lot


Great point and a very underacknowledged aspect of governing.


Acting on splitting up the monopolies. Making Amazon/Google/Meta into multiple smaller companies. Antitrust fines have now just become the expenses one pays to run a monopoly.


I seemed to remember gp3 being (incrementally) cheaper than gp2... yeah, gp3 is 8 cents per GB-month vs 10 for gp2 in us-east-1. That's actually a reasonably significant price difference, now that I look at it.


There were use cases with gp2 where one would over-provision capacity to get IOPS. gp3 lets you just dial up IOPS, which is much cheaper than over-provisioning. This may seem a corner case, but it was moderately common. Also, AWS should cut their prices; well, in this environment maybe holding them is a cut....
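
A rough back-of-envelope sketch of that trade-off, in TypeScript. The gp2 price and its 3 IOPS/GB baseline, and the gp3 included-IOPS and extra-IOPS rates, are my own assumptions from public us-east-1 list pricing as I remember it, so treat the exact numbers as illustrative:

  // Cost to reach a target IOPS on gp2 (by over-provisioning capacity)
  // vs. gp3 (by provisioning IOPS directly). Assumed us-east-1 list prices.
  const GP2_PER_GB = 0.10;          // $/GB-month
  const GP2_IOPS_PER_GB = 3;        // gp2 baseline: 3 IOPS per provisioned GB
  const GP3_PER_GB = 0.08;          // $/GB-month
  const GP3_INCLUDED_IOPS = 3000;   // included with every gp3 volume
  const GP3_PER_EXTRA_IOPS = 0.005; // $/provisioned IOPS-month above the included 3,000

  function gp2Cost(neededGiB: number, targetIops: number): number {
    // On gp2 you may have to buy extra capacity just to hit the IOPS target.
    const gib = Math.max(neededGiB, targetIops / GP2_IOPS_PER_GB);
    return gib * GP2_PER_GB;
  }

  function gp3Cost(neededGiB: number, targetIops: number): number {
    const extraIops = Math.max(0, targetIops - GP3_INCLUDED_IOPS);
    return neededGiB * GP3_PER_GB + extraIops * GP3_PER_EXTRA_IOPS;
  }

  // Example: 500 GiB of data that needs 9,000 IOPS.
  console.log(gp2Cost(500, 9000)); // 300 -> 3,000 GiB of gp2 just for the IOPS
  console.log(gp3Cost(500, 9000)); // 70  -> 500 GiB of gp3 plus 6,000 extra IOPS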


It’s noteworthy to me that available memory has not increased in some time. You can find machines with more memory, and perhaps the number of classes has increased, but outside of very narrow envelopes it might be cheaper to have 2 m6i’s with undersubscribed CPUs than to try to increase the amount of memory per core on something else.


This is mostly due to DRAM stagnation and Intel/AMD staying at 8 channels for a while. Memory capacity should increase with Genoa.


see the u6tb-1 and x2i machines


I don’t know what data center u6tb-1 is in but it’s not in the ones I use.

X2i is interesting; it may overshoot the mark for the service I’m thinking of, but might be appropriate for another. I may have to look again at whether I can make the r6i math work.


gp3 is 20% cheaper than gp2 per GB.


Storage price reductions slowing down or even coming to a halt makes me wonder what this means for "infinite storage" companies.

I'm talking services like Facebook, Youtube, that are "free" (ad supported) where on a daily basis an absurd amount of new content is added, yet almost nothing ever removed.

If storage needs grow endlessly yet storage costs have stopped going down, wouldn't that mean that the model is unsustainable in the long term? Sure, you can delay the inevitable (compress content, move old stuff to cold storage), but ultimately storage cost per user goes up whilst income likely does not.


I disagree, I’m still in the datacenter storage business and our service delivery costs have been going down. You have to look at the service as a whole… nvme let us retire frames, etc.

I think hyperscale margins are high, and I think between the many tiers of storage and backend technology they are constantly cooking cost out.


But still content keeps growing, and income per user won't. Doesn't that mean that at one point you're out of runway?


I don’t think so.

The highest speed storage is cheaper than ever and getting cheaper. That’s the expensive stuff. Our “big” high speed storage array was like $5M in 2012. Now it’s like 10x bigger, 5x faster, and 40% of the cost.

Bulk spinning disk prices drop at a slower rate but historically they are performance bound. The marginal byte of cold data has no marginal cost, as you always have lots of space on disks where you’re short on IO budget.

Really cold workloads that need to live on-prem for reasons and need to be retained for 30+ years can go even cheaper. That stuff lives on tape at a lower price point, like a half penny per raw gig.


Not really. As costs go up, you can mitigate by pushing more of your data out to slower mediums, like hdd.


I don't think the article indicates that storage isn't getting cheaper. Equally possible is that amazon's S3 margins are getting better.

For companies like Facebook and YouTube it is possible that they're not affected in the same way that a regular S3 customer would be.


> Equally possible is that amazon's S3 margins are getting better.

Which is acceptable if they're providing better value in features. From casually looking around it seems that Backblaze's B2 is much cheaper, but doesn't have a good story around reliability, namely uptime guarantees and simple multi-region redundancy. If Backblaze could match AWS in this regard, and with their Bandwidth Alliance with Cloudflare, they could provide better downward pressure on pricing.


I'm not worried about the ad supported free storage providers. On YouTube the ratio is something like 10,000 views for every 1 upload, so they amortize that storage cost across a lot of other users, not just the person who uploads it. Ratio will be lower on Facebook but even active users probably don't average more than 1 or 2 uploads per day. I've had the same gmail address for over 10 years, never delete anything, and I'm still under 50% of their free storage limit.


Once ancient content no longer produces enough ad revenue for the companies storing it, I suspect they will support it only in paid tiers for users who really want it. Something like this is already happening in Google's paid tiers: each user is valuable as part of Google's advertising audience, but the economics no longer make sense once those users keep many GB on Google's servers.


Ancient content can be moved away from edge servers and more to archival.


This already happens at YT. It's pretty common knowledge that after 300 views, the handling of a video changes a lot, including moving from a cheap storage medium to the media used for popular content.


already happening. ever notice youtube videos with low view counts take a really long time to load?


Presages an interesting digital future in which nothing ever truly goes away, it just gets slower and slower. An endless inner migration that never quite hits the stopping point, and which could theoretically be reversed if only the slowed content garnered enough attention.


Endlessly falling towards an event horizon of obscurity, but never reaching it (from the perspective of an external observer).


I'd expect a significantly different scale. Specifically, I'd expect normal tiers ranging from "instant" to "several seconds", then a huge gap, then a rock bottom tape archive.

From a more zoomed out look, you could simplify it to only two speeds. The fast speed taking 0-15 seconds, and the slow speed taking minutes to hours. Any content that's been accessed once or twice in the last week or month would be on the normal tiers. Extremely dead content could fall to the tape tier, but it has nowhere further to fall, and it would take only a tiny amount of activity to rescue it.

I don't really see a reason for there to be a continuous falloff in speed. There's not really anything between hard drives and tape for responsiveness, either existing or proposed, that I'm aware of. Nor is there anything slower than tapes.


I suppose that's what it would look like currently, but as content increases and the aggregate long tail of unpopular material grows ever larger there could be shifts in desirable storage solution characteristics that fit different economic niches. An extremely dense, extremely cheap, and extremely slow WORM storage device could find a place somewhere in the future. Cheap as in order(s) of magnitude less.

The next immediate step from online tapes today could be offline tapes with online indexes & robotic retrieval systems. These exist today. The continuous falloff would be a matter of priority ranking given to content requests-- not merely FIFO-- so ever less popular irrelevant content gets shoved further back in the robotic retrieval queue. A recently iced bit of content might be top priority for the tape loader while something not touched in years might sit hours down the queue. The continuous decline isn't defined by the storage media but instead by the capacity of the retrieval systems. Speed would continue on a slow decline as content increases even more and the low economic value of that content make investing in increased capacity impractical.

Eventually you get to a point in some far off future where the retrieval time for some obscure bucket of bits is measured in significant fractions of a human lifetime, where a dying grandfather requests a video of his wedding 70 years earlier only to have it arrive just in time for his own grandson's dying moments decades later.

I think I've gone too far imagining unlikely slow storage dystopian futures though, so I'm going to stop now before I start ranting about the Slow God who needs only enough access requests from the masses of his adherents to prioritize his retrieval from the depths of cold storage. But Dante Alighieri warned of what was stored in the coldest depths and it was no god... Oh God what hath this comment awakened?!?!

Okay now I'm really done.


I don't really see a reason to prioritize content that hasn't been accessed in 2 months over content that hasn't been accessed in 200 months, when picking what tape to get next. Either way there is only one person waiting.

And I'm already assuming the tapes are offline, because online tapes would just be a waste of money.

Another issue is finding enough content suitable for very high latency systems. Right now they seem to basically just be for backups.


Tape and cheap disk are about the same price per GB, but tape is more stable over long periods and doesn't have to be powered (although power for very large, slow and rarely accessed disk is low).


> There's not really anything between hard drives and tape for responsiveness, either existing or proposed, that I'm aware of. Nor is there anything slower than tapes.

It makes me wonder what a storage device would look like that is cheaper than HDD, similar or better storage density, and allows random access, with a trade-off for slower speed?


Blu-ray disc?


Unfortunately, even putting eight double-sided discs in a cartridge only gives you 5.5 TB which is a joke compared to 26 TB hard disks. It costs a fortune as well. https://pro.sony/ue_US/products/optical-disc-archive-cartrid...


They cost more than a hard drive and are less dense. And those "archival disc" cartridges Sony makes are even more expensive, with drives that cost more than tape drives.


Good point, this is what I would also expect.

It's also quite telling how in most of these services, old content is very well hidden in the UI and sometimes near impossible to get to.


That content has extreme temporal properties. An image that's uploaded is increasingly less likely to be accessed over time. You can probably store it on very cheap hardware with pretty serious compression.

Not to mention that it's mostly text and images that get uploaded to these sites. Videos are the worst case. Let's say there are 1 billion facebook accounts, each one uploads 1GB of data (seems huge), and let's assume that compression just cancels out replication.

That's 1,000 petabytes. On S3 that's in the low 6 figures per year, obviously ignoring the exfil/access costs.

That's not that much. Obviously you want to keep some of it "hot" - profile pictures, recently uploaded pictures, etc. Hand waving, assuming 1% of the data needs to stay hot (seems high), that's 10PB of data. Certainly you're in "big data" land but it's not like there aren't databases that'll handle it.

Not to mention FB invests in tech like zstd.

Cold storage is definitely still getting much cheaper. Dropbox SMR drives are pretty crazy: https://dropbox.tech/infrastructure/smr-what-we-learned-in-o...


This is a huge concern for me too. I suspect that FB and Youtube will start auto-deleting old content that people haven't seen in the last X months.


I've noticed pictures in Facebook Messenger getting compressed after three or six months. I'm not sure if this is new or if it's just more aggressive now.


There would be exabytes of content on YouTube over 5 years old with under 20 views. Stuff no one will miss when it's gone.


Except people that use YT as a family video archive


I'm certain that in the ToS they reserve the right to remove any video at any time for any reason. Dumping your family videos on a public video sharing site that you don't pay for and expecting them to exist there forever for free was never a good idea.


Tell that to the people who lose videos that way. Google would consider that a cost (loss of goodwill with a minute set of people).

Idk about YouTube but I know many people whose sole copy of their treasured images is on Facebook.


They’ll probably migrate those people to a “Google Photos” or “Google Drive” storage option so they pay for their usage but still have archive and sharing options.


How does that compare to today’s data though, given larger numbers of higher-bitrate higher-resolution videos?

Perhaps during the first year of youtube’s existence people uploaded an exabyte of video… but if people today are uploading one exabyte per day, then it hardly seems worth the hassle of deleting that first year.


Historians might miss it.


If storage lifetime is increasing it might not be so bad.

Or compression/deduplication.

As long as they have consistent ad revenue, they’ll be ok.


I wouldn't expect dedupe to be a big thing with social media user-created content.


I suspect (but have no data for) that a significant chunk of content on social media is memes and still-image-video. Those are likely to be easy to dedupe or compress aggressively


Storage price continues to decrease. The point is that S3 price doesn't drop at close to the same rate. Facebook and YouTube aren't hosting on S3, so they get to experience the real decreases in storage cost.


The main issue we have with S3 is the extortionary egress bandwidth fee. Storage pricing seems OK, but what's the point if I can't send those files to users?


The margin on bandwidth is enormous. I think Cloudflare did some research into it a while back. But there are rivals providing cheaper S3-compatible storage at scale: Cloudflare's R2 ($15/TB, and no fees for egress), Oracle Cloud Storage ($25/TB, and the first 10TB of egress is free), and Wasabi ($5.99/TB, and no fees for egress or API requests), to name but three. Of course each of those has drawbacks, e.g. Wasabi is intended for long-term storage, R2 is still in beta, etc. But there are options for new data.
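
As a rough illustration of how much the egress side can dominate, here's a back-of-envelope comparison in TypeScript using the per-TB figures above; the S3 numbers (~$23/TB-month storage, ~$90/TB egress) are my own assumed approximations of list price, not something from this thread:

  // Back-of-envelope monthly bill: store `storedTB`, serve `egressTB` out.
  // Prices are $/TB-month for storage and $/TB for egress; request fees ignored.
  type Provider = { name: string; storage: number; egress: number };

  const providers: Provider[] = [
    { name: "S3 (assumed list price)", storage: 23, egress: 90 },
    { name: "Cloudflare R2", storage: 15, egress: 0 },
    { name: "Wasabi", storage: 5.99, egress: 0 }, // "free" egress has a fair-use cap
  ];

  function monthlyBill(p: Provider, storedTB: number, egressTB: number): number {
    return storedTB * p.storage + egressTB * p.egress;
  }

  // Example: 10 TB stored, 20 TB served out per month.
  for (const p of providers) {
    console.log(p.name, monthlyBill(p, 10, 20)); // S3 ~2030, R2 150, Wasabi ~59.90
  }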


I didn't realize Cloudflare R2 was available yet. Took a look based on your mention... "currently in beta". But you can still just sign up and use it already?

Anyone here done that and want to report back?

Ah, I found this posted by its author elsewhere in these threads:

https://www.vantage.sh/blog/cloudflare-r2-aws-s3-comparison

Very useful. Wait.... "no public access" AND "no pre-signed URLs"? Am I misunderstanding what that means? That would seem to rule out the use case of serving files to end-users? Or what am I missing?


I haven't used it, but my understanding from reading about it is that the intended use-case is to create a Worker (roughly equivalent to an AWS Lambda) that mediates access to the bucket. So you could have the worker provide public access, or have the worker do some kind of authentication — you just have to roll that part yourself.
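
For anyone curious, a minimal sketch of what such a Worker might look like (this assumes an R2 bucket bound as MY_BUCKET in wrangler.toml and the types from @cloudflare/workers-types; it roughly follows the pattern in Cloudflare's docs, with no auth added):

  // Serve objects from an R2 bucket bound to this Worker as MY_BUCKET.
  // Anything fancier (auth, signed URLs, listing) you'd roll yourself here.
  export interface Env {
    MY_BUCKET: R2Bucket; // type provided by @cloudflare/workers-types
  }

  export default {
    async fetch(request: Request, env: Env): Promise<Response> {
      const key = new URL(request.url).pathname.slice(1); // strip leading "/"
      const object = await env.MY_BUCKET.get(key);
      if (object === null) {
        return new Response("Not found", { status: 404 });
      }
      const headers = new Headers();
      object.writeHttpMetadata(headers); // content-type etc. from the upload
      headers.set("etag", object.httpEtag);
      return new Response(object.body, { headers });
    },
  };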


They plan on adding features so you don't need to use workers.

See the last section of this article: https://blog.cloudflare.com/r2-open-beta/


Thanks!

Also interesting to see that a future goal is: "Integration with Cloudflare’s cache, to scale read requests and provide global distribution of data."

I had wrongly assumed that would be the whole point of R2, that it would do that out the gate!


I haven't used it either but it certainly sounds like it's open to anyone to use. As @chc mentions, it reads like you use a Worker to control access. So you can make your files public by simply not requiring any authentication. I've just been scrolling through their getting started guide. It looks promising: https://developers.cloudflare.com/r2/get-started


We do have presigned URLs actually. Public buckets coming soon


Thanks! That may meet all my requirements...

Although one of my use cases is video files (fairly small non-profit usage), which I have understood are not allowed by CloudFlare CDN terms of service (at least at non-"enterprise" tiers?)... it's been confusing to me understanding if I could, for example, serve video files from Backblaze via CloudFlare CDN and Bandwidth Alliance; or with R2, if there's a way to serve video files from R2 to the public that is allowed by tos.


Yes you can serve whatever you want from R2 directly. From [1]:

> The Cloudflare Developer Platform consists of the following Services: (i) Cloudflare Workers, a Service that permits developers to deploy and run encapsulated versions of their proprietary software source code (each a “Workers Script”) on Cloudflare’s edge servers; (ii) Cloudflare Pages, a JAMstack platform for frontend developers to collaborate and deploy websites; and (iii) Workers KV, Durable Objects, and R2, storage offerings used to serve HTML and non-HTML content.

Now it’s important to note that the Cache product does not fall into these supplemental terms (even if you use the Workers API to access it). So if you are caching the video files you’d potentially run into problems (but that would also be true of serving video content from Backblaze that you were caching).

[1] https://www.cloudflare.com/supplemental-terms/#cloudflare-de...


Thanks this is helpful.

Delivering video files without caching them is probably a mistake, of course!


Sadly this does not speak about data integrity via e.g. CRC.

Also I don't like it one bit that I can't see how often and where my data is replicated.


Overlooked this question:

> But you can still just sign up and use it already

Yes. You can just enter a credit card & start using it.


> I think Cloudflare did some research into it a while back

I remember! They called it the Bandwidth Alliance https://www.cloudflare.com/bandwidth-alliance/


Yes! That was it. There was a blog post too with the estimated markup: https://blog.cloudflare.com/aws-egregious-egress/


Wasabi's egress is not fully free as you make it sound. Their egress works like this: if you store 1 TB of data, you have 1 TB of free egress, as seen in their FAQ [0]. This means it's unfit for certain cases and another competitor could be a better option.

[0]: https://wasabi.com/paygo-pricing-faq/


You are correct, that would be another example of one of _its_ drawbacks. They certainly promote "No Fees for Egress" and "No egress charges" (home page). But yes, whether that actually is the case depends on your usage.


> and no fees for egress

This is somewhat wrong. You have to serve R2 Objects via a Worker which will cost you something per hit.

So you have a kind of egress fee.


True, it should be no fee for egress _bandwidth_. But yes, some form of egress fee (because of the fetch/get operation).


A bit tangential, but in some cases you can use S3 Gateway endpoints, which can dramatically reduce your egress costs if you're just accessing it within a VPC.

[0] https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpo...
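
A minimal sketch of creating one programmatically with the AWS SDK for JavaScript v3 (the VPC and route table IDs are placeholders; in practice this usually lives in Terraform or CloudFormation instead):

  // Create an S3 gateway endpoint so in-VPC traffic to S3 stays on AWS's
  // network instead of going through a NAT gateway or the public internet.
  import { EC2Client, CreateVpcEndpointCommand } from "@aws-sdk/client-ec2";

  const ec2 = new EC2Client({ region: "us-east-1" });

  await ec2.send(
    new CreateVpcEndpointCommand({
      VpcEndpointType: "Gateway",
      VpcId: "vpc-1234567890abcdef0",            // placeholder
      ServiceName: "com.amazonaws.us-east-1.s3", // must match the region
      RouteTableIds: ["rtb-1234567890abcdef0"],  // placeholder; S3 routes now use the endpoint
    })
  );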


This is the entire point of the egress fees. If you can keep traffic inside AWS it's basically free. Getting it out costs money.


Have you looked into other S3 compatible services?

We switched to DO Spaces because of the lower bandwidth and storage fees. The savings were actually quite significant, and there were no noticeable differences for our use case.

I know there are others services out there that are also s3 compatible and cheaper.


For my use case, DO Spaces was quite bad in terms of artifacts being served from stale caches after being replaced and the 'purge' API being called on the bucket; this was over 2 years ago so maybe it's better now.


This is the other issue with using another cloud, S3 is extremely well documented.

Some other cloud providers give no statement on reliability/availability/consistency. And worse some providers give statements that violates the CAP theorem.

The big clouds are somewhat reasonably documented, but many smaller vendors leave you guessing, or promise what I know they can't keep.


I think this would be an excellent area for regulation since it is anti-competitive (portability is dramatically reduced since it has been made artificially expensive to move the data to another cloud service) and the cloud services haven't shown any interest in doing something about it on their own.


Regulation for bringing down prices for unnecessary cloud services like hosting a file?

Let's focus on things users cannot change. Using the cloud to host files is an easy and expensive way to store files that should be a luxury or a tax on the foolish.


No, for bringing down the cost of moving that file to a competing cloud service provider.


Are you seriously suggesting that S3 is a monopoly with the means to block competition? I mean even when discussing cloud in general, what you are suggesting is a joke.


Anti-competitive practices (https://en.wikipedia.org/wiki/Anti-competitive_practices) don't require a monopoly.


Is that what S3 is for?

I've used it to store large datasets that are processed within a region, a backup system, logs and metrics, and as an origin for CloudFront. I think you're referring to using it to host a consumer download service, though for web objects in a serious business you'll need a CDN.


All S3 usage I have encountered in the wild has been user-submitted files. Stuff like profile pics, document attachments, etc. You could in theory store these on the VM's storage, but that runs into the frequent problem of the storage filling up, while S3 is unlimited and easy to integrate.


Hard disks have been getting larger capacity but I/O capability is fairly flat. The result is that price/GB is dropping but price/IOPS is not. So _cold_ storage is where you would expect to see pricing fall. I don't follow AWS pricing closely, but I've seen a lot of news around Glacier over the years, so that might reflect this fact.


The capacity and speed curves of drives is quite different. When the first TB drives were released about 15 years ago, the fastest ones were about 100MB/s. Now you can get 20TB drives (20x capacity) and the fastest ones are about 300MB/s (3x speed). It is just far easier and cheaper to make a drive twice as big than it is to make it twice as fast.
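
One quick way to feel that gap: the time just to read (or rebuild) a whole drive end-to-end at its best-case sequential speed, using the rough figures above:

  // Hours to read an entire drive sequentially, using the rough figures
  // from the comment above (best-case sequential throughput).
  function hoursToReadFullDrive(capacityTB: number, speedMBps: number): number {
    const capacityMB = capacityTB * 1_000_000;
    return capacityMB / speedMBps / 3600;
  }

  console.log(hoursToReadFullDrive(1, 100));  // ~2.8 hours for a ~2007-era 1 TB drive
  console.log(hoursToReadFullDrive(20, 300)); // ~18.5 hours for a modern 20 TB drive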

SSDs are a different animal, but have some similar characteristics. Within the same generation (e.g. M.2 PCIe gen 4), you can get drives that have a lot more capacity but roughly the same access speeds (i.e. the 2TB version is very similar to the 1TB version). The speed increases between generations are much better than with HDDs. PCIe gen 3 drives seemed to max out at about 3500MB/s while the gen 4 drives are about double that. I have seen reports that gen 5 drives might double it again.

With HAMR technology we might get HDD drives with capacities in the 50TB-100TB range. You can bet that the speed won't be 5x current technology even if they get dual-actuators in them. There will need to be some kind of breakthrough technology to improve it significantly.

This is why we need better data management systems. If the meta-data (e.g. file table) is only 1% of the data that is still a lot of data to read in and store in RAM. We need better systems where the file records are much smaller. https://didgets.substack.com/p/where-did-i-put-that-file


There's now a new Glacier product, called Glacier Instant Retrieval, which supposedly makes it cheaper and faster to retrieve from Glacier, if you do it a limited number of times.


Yeah this is magic price wise


Well, at least AWS has not increased the price like GCP did: https://cloud.google.com/storage/pricing-announce


The R2 vs S3 story is pretty interesting. For archival use cases, S3 still wins by a mile, but for running apps, R2 often wins (minus lacking features).

I wrote up my findings here: https://www.vantage.sh/blog/cloudflare-r2-aws-s3-comparison


For my company, neither egress nor storage cost are the big issue. It’s the API call (PUT) cost.

We deal with payloads that are just a little too big for a database (we run Postgres and Clickhouse) but just too frequent (~100 per second) and small (think largish json blobs) to be effective on S3.

We are write heavy. Reads are probably 1% but need to be instant for a good UI and API experience.


Yeah, S3 is not for tiny blobs..

What I have seen done before is concatenating many small blobs into a single large blob that is stored on S3. This works great for batch processing afterwards.

If you need read access to the objects, one option is to merge them into a large blob, and then create a small index file that keeps offsets for each of the tiny blobs. Then you fetch the index file, find the offset of the tiny blob you want, and do a range request for this offset into the large blob.

This mostly works when you're not read heavy. I recently did an index file for serving HTML files out of a tarball, as an alternative to uploading many small files.
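
A minimal sketch of the read path for that pattern, assuming the AWS SDK for JavaScript v3 (recent enough to have Body.transformToString()) and a made-up index format of { name: [offset, length] } entries; the bucket and key names are hypothetical:

  // Read one tiny blob out of a big concatenated S3 object using the
  // index-file + Range-request pattern described above.
  import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

  const s3 = new S3Client({ region: "us-east-1" });

  type Index = Record<string, [offset: number, length: number]>;

  async function getTinyBlob(bucket: string, name: string): Promise<string> {
    // 1. Fetch the small index file mapping blob name -> [offset, length].
    const idx = await s3.send(
      new GetObjectCommand({ Bucket: bucket, Key: "blobs.index.json" })
    );
    const index: Index = JSON.parse(await idx.Body!.transformToString());

    // 2. Range-read only the bytes of the blob we want from the big object.
    const [offset, length] = index[name];
    const part = await s3.send(
      new GetObjectCommand({
        Bucket: bucket,
        Key: "blobs.concat",
        Range: `bytes=${offset}-${offset + length - 1}`,
      })
    );
    return part.Body!.transformToString();
  }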


Have you looked at Kinesis Firehose? It was pretty much built for this use case, although you will still need to see if you can define a partitioning scheme, probably in combination with an S3 Select query, to meet your query requirements.

https://aws.amazon.com/kinesis/data-firehose/?nc=sn&loc=0

https://aws.amazon.com/blogs/aws/s3-glacier-select/


We are using Kinesis. It’s fine. Great actually. We still need to store user logs and generated data persistently. Cold storage is also not an option. This is data that needs to be accessible the moment the event that generates it happens. Don’t want to push my product too much, but I run a synthetic monitoring company. Check my bio and you’ll get a gist of the type of workloads.


> For my company, neither egress nor storage cost are the big issue. It’s the API call (PUT) cost.

> We are write heavy. Reads are probably 1% but need to be instant for a good UI and API experience.

It sounds like the recently released OVH High Performance Object Storage[1] might be a good fit.

It has better performance than S3[2], completely free API calls, and $0.015 / GB egress.

[1] https://corporate.ovhcloud.com/en/newsroom/news/high-perform...

[2] https://blog.ovhcloud.com/what-is-the-real-performance-of-th...


You can host your own S3 API-compatible object storage service on some EC2 instances (exercise left to the reader to figure out how to make that reliable). Zero PUT cost, higher operational overhead.

  Minio: https://github.com/minio/minio
  SeaweedFS: https://github.com/chrislusf/seaweedfs
  Ceph: https://ceph.com/en/discover/technology/
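
All of these speak the S3 wire protocol, so a stock S3 client can usually be pointed at them just by overriding the endpoint. A sketch with the AWS SDK for JavaScript v3 against a local MinIO (the endpoint and credentials are placeholders):

  // Use a standard S3 client against a self-hosted, S3-compatible store
  // (MinIO here) by overriding the endpoint. No AWS account involved.
  import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

  const s3 = new S3Client({
    region: "us-east-1",               // largely ignored by MinIO, but the SDK requires one
    endpoint: "http://localhost:9000", // your MinIO / SeaweedFS / Ceph RGW endpoint
    forcePathStyle: true,              // path-style URLs, typical for self-hosted stores
    credentials: {
      accessKeyId: "minioadmin",       // placeholder credentials
      secretAccessKey: "minioadmin",
    },
  });

  await s3.send(
    new PutObjectCommand({
      Bucket: "my-bucket",
      Key: "hello.txt",
      Body: "zero PUT fees, your own ops burden instead",
    })
  );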


This is a very naive question so I might be very wrong, but isn't postgres pretty flexible about objects now?


100/s is 3 billion per year. Postgres has a hard limit of 4 billion blobs in a column.

that ignores the reality that RDS storage is comparatively expensive.


If you think a table might get anywhere near this size (blobs or not), I highly recommend table partitioning.


My problem is customer ops, not mine so much. They run our software.


That's not true. Our ops struggle too. 32bit oids really are a barrier.


Yes, but it does blow up TOAST and has a lot of impact on the deletion behavior on busy tables. We removed all larger JSON blobs from PG. Typical settings or config stuff in JSON in PG is fine; we use that all the time. But larger JSON blobs of several kilobytes are still an issue for semi-timeseries data.


Could you elaborate on the TOAST issues you're having? We're pretty liberal with our use of large JSONB objects and might hit a billion objects in a year or so.


Have you tried using PostgreSQL Large Objects (LOs)?


I cannot speak for the big companies, but for private cloud storage, I suggest going the self-hosting route; all the tools are available in 2022: ZFS for reliability, Proxmox for separation of concerns, Opnsense/pfsense for access control, Nextcloud for convenience (if you need such file sync at all). Add a photovoltaic plant and your electricity bill will be _Ok_ (you should do this anyway).

I have a 40 TB ZFS Z2 Pool consisting of 6x 8TB drives, and a 16TB offsite pool that is booted for backup snapshots weekly. You'll have to replace the 6 drives running 24/7 approximately every 5 years. If a drive costs $200.00, that will be $1200.00 per 5 years, or $20.00 per month. Add about $400.00 (with PV) to $800.00 (without) for electricity per year ($30.00/$60.00 monthly) and $7.00 monthly for UPS batteries. For these $57.00, you will get a full virtualization feature set under your control, not only a 30TB ingress data sink.

With Amazon Glacier, the cheapest "data sink" cloud storage, 30TB would equal $123.00 monthly (or $30.00 with S3 Glacier Deep Archive), with quite a few feature caveats.
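
Putting those numbers side by side (these are just the figures from the paragraphs above; the Glacier per-GB rates are my rough assumptions):

  // Reproduce the back-of-envelope monthly costs from the comment above.
  const driveReplacement = (6 * 200) / (5 * 12); // 6 drives x $200 every ~5 years -> $20/mo
  const electricityWithPV = 30;                  // $/mo (~$400/yr with photovoltaics)
  const upsBatteries = 7;                        // $/mo

  console.log(driveReplacement + electricityWithPV + upsBatteries); // 57 $/mo, self-hosted

  // The cloud "data sink" comparison for 30 TB stored:
  console.log(30_000 * 0.0041);  // ~123 $/mo, S3 Glacier (assumed ~$0.0041/GB-month)
  console.log(30_000 * 0.00099); // ~29.7 $/mo, S3 Glacier Deep Archive (~$0.00099/GB-month)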


I admire that setup, but it could read like an ad for S3.

Awesome for a hacker, but too much if you just want a lot of easy reliable storage.


Easy reliable cheap

Pick two


If someone is considering the “store at home” route and this makes their head spin then consider just buying a prebuilt nas. You pay for someone doing this for you, but it has similar monthly price in the long run.


Your setup doesn't factor in labor cost and has a vastly lower durability.

That's not to say that S3 is superior compared to your setup, but it's a different solution for different needs.


Yes, of course. Although I weigh labor cost and the return in technological knowledge gained against each other, i.e. it's a solution _for different needs_.


It looks like storage costs haven’t changed much since the last S3 price reduction five years ago.

My other take on this is that given how slowly HDD costs are going down at this point, tape is going to remain relevant for some consumers for a lot longer than many of us thought.


> AWS has a healthy margin to continue to offer _strategic price cuts_ only when necessary

I’ve never seen a non-strategic price cut. :)


I think the contrast is between:

A: Cutting prices whenever costs go down

B: Cutting prices only when competitors do so


Inflation, virtually!


Cloudflare's entry into this market will be interesting; no bandwidth costs might put real pressure on AWS to get reasonable with its pricing.

https://blog.cloudflare.com/introducing-r2-object-storage/


That sounds a lot like "let's hope the new AMD GPUs will be priced low so Nvidia will cut their prices"


I mean they announced pricing already and it's much cheaper.... And are you implying that AMD pricing GPUs cheaply doesn't impact Nvidia?


As a European, I'll take prices not getting many times more expensive, the way our fuel, energy and food prices have. I wonder how the European cloud will fare this winter, and how many European customers we will lose.


Because of inflation, the price is effectively going down in real terms.

However also because of inflation, I have had a 10% "pay cut" since mid last year.


> Even with a slowdown of Moore's Law, it seems like AWS has a healthy margin to continue to offer strategic price cuts only when necessary.

Which doesn't surprise me much, really. If your customers are mostly stuck with you, competition is sparse and people pay the price you demand - why would you reduce prices?


Has anyone had any direct experience between S3 and Backblaze for the basic usecase of uploading and downloading files?

https://www.backblaze.com/b2/cloud-storage-pricing.html

Is just radically different.


B2 isn't great for low latency serving, objects that aren't in hot cache have extremely variable delays on first fetch, and the delay (at least as of a year ago) scales according to object size.

For largeish video (over 500mb) I remember seeing >1 second latency, enough to rule out using it for anything public facing


Lately I'm doing Backblaze B2 + Bunny CDN and I'm very satisfied with the prices.


Backblaze reliability and performance are below AWS's; same for Bunny CDN. Although I understand it can be interesting for some use cases where perf/reliability is not critical.


Can you cite your sources?

AWS "reliability" has been the direct cause of a number of sleepless nights for me over the years. Comparing to a few years ago when I worked on a large-scale product hosted on bare metal servers that worked beautifully, I don't think AWS is all it is hyped up to be.

Anecdotal, I know, but even with no experience using Backblaze or Bunny, the bar they would have to meet is a lot lower than you're implying.


I'm talking about my personal experience: on Backblaze the number of 500 errors was simply not acceptable for my use case, likewise for performance and latency. I was a bit disappointed by Bunny CDN rps/latency. But indeed the price is not at all comparable.

Also I'm not talking about any AWS service but more specifically about S3 and CloudFront.

Finally, as I said above, Backblaze and Bunny are amazing if optimizing cost is your main goal.



I guess you get what you pay for, but how much of a difference is there? The cost saving it advertises is quite alluring.


Afaik Backblaze has maintenance every Thursday during which they don’t guarantee uptime.


There are other options available depending on your risk appetite.

For example, I built a file sharing tool (https://www.fileyeet.io/) off the back of Storj (https://www.storj.io/) which is a distributed file storage backed by a crypto coin (maybe one of the few legitimate uses of crypto, although I'm not convinced yet).

Storj was a much cheaper option than S3, although I do have to trust that their systems are as secure as they advertise them to be. Likewise, R2 seems like a good "in-between" option.

Both R2 and Storj share the S3 API for integrating with them.


I love Storj, very nice idea. Also they don't need cryptocoins to work; they could just replace the tokens with something else.


Any customer of scale negotiates a deal with AWS. AWS loves making pricing deals. Public pricing is advertising.


All Amazon services seem so expensive: easily hundreds of dollars a month in bills for running a few large EC2 servers all month.


AWS is great if you have lots of money and little time. Which was very true in the last ~decade, where we had lots of venture funded startups that had very much money, but little time.

Also, while you definitely pay a markup, the standard EC2 pricing also contains instant availability. If you skip buying a car and instead pay a taxi to wait 24/7, your costs will also be insanely high. Additionally, AWS provides a great ecosystem in which your app can easily be managed - things like getting an HTTPS certificate, setting up a redundant load balancer and even a CDN, database or a Kubernetes cluster can be done with a few clicks in a UI. If you don't have someone who knows how to configure those services, it can detract a lot from what you're actually trying to do as a business. Lastly, it has all these enterprise features you suddenly need - solid billing, encryption, certificates etc.

Don't get me wrong, it's expensive, but there's a reason so many businesses use AWS.


If all you are running are a few EC2 servers then you have quite a lot of competitors to choose from and price out/balance with reliability.

We use AWS at work, but I've been running side projects on DigitalOcean for years for way less than hundreds a month.


Why hasn't there been an upstart taking market share? Why do only the big players get into the game and hold similar prices?


It's easy to enter the market and offer a cheap product. It's hard to enter the market and offer a very solid product. It's nearly impossible to enter the market and provide the hundreds of services the big players can provide, and operate at the scale they can provide, servicing the number of markets and customers they do, with the level of support they do.

The big players know that what separates the majors from the minors is trust. If you buy from AWS, you know what you get works, and you will pay a premium for that assurance. And also it is really fricking expensive to be AWS.


Very much this - I don't have much experience with providers in this space but Digital Ocean has taken maybe a decade to build and offer a small number of services.

That said, there is a lot of AWS that I probably wouldn't know existed - I know in my day job I make big use of maybe five services, plus maybe another 10 glue services between them (CloudWatch, IAM, VPC etc).


DigitalOcean, Linode, Hetzner, Vultr...


DO's spaces didn't even work for the most basic test of putting a few images on a page when I tried it. I couldn't say it's comparable to S3.


I use DO spaces heavily - for images no less. It works great. What issues did you have?


Uploaded about 100 images. Put those images on a page. 50% of them timed out when downloading. Maybe I was just unlucky, but it was a good way to make me instantly lose all confidence in DO Spaces. S3 and Backblaze B2 worked fine for the same thing at the same time as when this happened.


That's definitely incorrect behavior. Did you reach out to support?


Something tells me you are doing something wrong.


Which raises my question: is there some open source GitHub project that lets you create an S3-compatible API on top of a cluster of heterogeneous VPS hosts? I'm guessing even with this, storage would still be expensive, so how does Backblaze pull this off?


Yes, several. Minio is the big one but I believe Ceph can do this too. Others exist.

Disk is cheap. Real cheap.


Ah yes, Minio! The one issue I have with using a VPS for production is: are they hardened? Or is it enough just to have proper Unix user management and UFW? I have this fear that the VPS box has attack surfaces or zero-day vulnerabilities, but when I am on the cloud I do not have this worry.

Perhaps irrational, but you can't argue with the peace of mind that expensive clouds offer. Although we've seen misconfigured S3 buckets leaking data, so there's that.


What makes you believe the threat model for AWS is different than DO? They are just VMs at the end of the day.


Large fixed costs, economies of scale, whole product (need compute AND storage). It's possible that a company like Cloudflare can disrupt certain verticals in storage that are mispriced by AWS yet have a larger than expected TAM.


Because infrastructure is not that high of a cost for most companies, and every dev and devops out there knows AWS and has no clue what competitors' dashboards even look like.


Human biology is not programmed to abide conflict of interest. On paper things can be written just so, but what’s on paper does not stop feelings, friendship, and connection between two people from forming given biology. Pointing at some philosophy to equivocate away physical laws is the simple con leveraged against the people.

The big players leverage their understanding of science and well paid lawyers to play a cognitive game where investment in storage is set aside, as storage is “a solved problem”, they collude to focus government spend on new things they can charge consumers for after charging us via taxes and agency, to build it.

Good luck finding a VC willing to compete against Bezos. They’re not going to target the guy managing the infra risk, providing a cheap platform key to their cheap startup gambling. They’re going to target naive college kids to try and build a rocket for them. Because VCs are smarter than Bezos; do none of the work, own the reward.


I suspect it might be an issue of "cheap enough". For smaller projects, the S3 storage cost doesn't matter too much. Medium-sized projects may find the tradeoff between the price and the cost of a custom solution to still be more than good enough. Large enough projects simply choose a different, more specialized solution anyway.


Big projects get custom discounts; there's a whole different pricing game for huge customers.


Or they make lots of money and choose not to care.


There is such a thing. Like if it ain't broke don't fix it, compute costs aren't always that big, sometimes single-thread is enough even on Wall Street exchanges.

And like come on originally the alternative was hiring employees and managers for those employees and dealing with human error and deception overlapping with the guilt of exploiting them, the whole management game. Difficult to find true leadership, and a work-ethic shared between manager and employee. And the education I got in the nineties and noughts was made for desk jobs with stationary, not computers.

Sometimes automation alone is fine. Even on a computer that hasn't been reset in fifty years and is obsolete according to everyone else, hey if it does the trick. Make sure it stays powered. Cobol. Does the trick. Nuclear plants, they use really old software, one using very new software and connected to the internet was hit by the Morris Worm.

I literally picked up a 32-ounce rock on the street that was intended, judging by its shape and the way it was cut out of cement-and-pebble composite, for stoning. Like for Biblical harlots. It was left behind right after a protest was cleared one Friday afternoon at Portugal and Alameda, Santiago Centro, Santiago, Chile. Found it on a Friday, around April, at 22:16. I roam; I checked out the scene--as is my wont--to see what's up, different graffiti on the walls, and then, whoops, you don't see a lot of pint-sized rocks on the sidewalk, somebody might trip, better clear it. Took it home. Realized I should have worn gloves. The next time I went to hang out in front of the police station I told them about it; hey, you could do forensics, I said, and they're like, uh, no, wrong station for forensics, uh... thanks for clearing it away, those are meant for cops.

I got stoned with similar rocks after watching cops on motorcycles retreat from a mob; it didn't click that the mob was throwing rocks, kind of aiming at whatever moved, and fuck did I then move: I sprinted away to safety before the gates closed on me. Luckily I didn't take a direct hit; none nailed me.

But the moral of the story is this: nothing about being tens or hundreds or thousands or, in this case, millions of years old renders it ineffective as a weapon. Same laws of physics. Same gravity. Muscle equals strength. Rock is just as hard now as it was then. Harder than my skull. Death is bad. Bad then, and bad now.


> Yet, AWS S3 pricing hasn't decreased as fast as the underlying storage costs.
> […]
> Another blog post analyzes the same theory for compute and finds a similar story using pricing data from AWS EC2.

AWS used to do frequent price reductions years ago. At a certain point they seem to have stopped doing that and are now only doing them rarely. That's really a shame as there are still a lot of AWS offerings which are priced way too high (data transfer being the most prominent one).

It'll be interesting to see whether, and for how long, AWS will keep prices stable with rising inflation.


One of the benefits of inflation!


Besides storage cost, S3 API request costs can also be high if the data is frequently accessed. And latency is unpredictable.

You can use the SeaweedFS Remote Object Store Gateway to cache S3 (or any S3-API-compatible vendor) on local servers, access the data at local network speed, and asynchronously sync changes back to S3.

https://github.com/chrislusf/seaweedfs/wiki/Gateway-to-Remot...
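
A quick way to see the latency difference is to time the same GET directly against S3 and through the local gateway. A small sketch; the gateway URL and port below are assumptions, adjust to whatever your setup actually exposes:

    import time
    import boto3

    def timed_get(client, bucket, key):
        start = time.perf_counter()
        client.get_object(Bucket=bucket, Key=key)["Body"].read()
        return time.perf_counter() - start

    direct = boto3.client("s3")
    cached = boto3.client("s3", endpoint_url="http://localhost:8333")  # assumed gateway address

    print("direct S3:    ", timed_get(direct, "my-bucket", "big-file.bin"))
    print("local gateway:", timed_get(cached, "my-bucket", "big-file.bin"))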


Like the other comments mentioned, your post misses how transparent AWS has been about price reductions in the past.

More importantly, S3 now has several pricing tiers depending on how frequently you access the data. So maybe lately they haven't reduced the price of the top S3 tier, but they've made it significantly cheaper for other data-access patterns. That runs contrary to the comments about the innovator's dilemma.
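
For what it's worth, you can move data into those cheaper tiers without touching application code via a lifecycle rule. A minimal sketch (the bucket name and day thresholds are made up):

    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-example-bucket",   # placeholder
        LifecycleConfiguration={
            "Rules": [{
                "ID": "tier-down-old-objects",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},   # apply to every object
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},   # infrequent access after a month
                    {"Days": 90, "StorageClass": "GLACIER"},       # archive after a quarter
                ],
            }],
        },
    )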

(I used to work at AWS but have no knowledge about pricing decisions)


This is why every cloud migration will eventually get undone: as cloud providers dial up their profit margins, going on-prem will be a no-brainer.


If you account for IOPS and not just cost per GB, the pricing structure of HDDs hasn't changed a bit in years. Especially since S3 (the blue spot) used to be way above the HDD cost-per-GB price.

So while I do agree that AWS's pricing structure has changed in recent years, I don't see how the data shown correlates with that conclusion.


I’m sure AWS’s margins are growing on S3, but to be fair, S3 is more than just raw storage. The majority of the cost and value is the API and the ever-growing features it provides. Those aren’t getting cheaper nearly as fast as the underlying storage hardware is.


I get the feeling the next frontier in storage is freeing up spare capacity, be that local-first or someone cracking efficient P2P storage.


You also need to take into account the decreasing cost of bandwidth. This article only mentioned disk storage as an underlying cost.


Moving entirely to R2 is currently on our roadmap. Definitely interested in Cloudflare's offerings.


It looks like at the most recent data point, S3 was actually cheaper than the raw storage, while providing 11 9s of durability across multiple AZs. Still looks like a pretty good deal, as the price of storage has mostly flatlined in this graph.

Nor does this even account for cold storage, or reduced redundancy.


The prices are in different units: S3 is in $/GB-month, raw storage is in $/GB. So when they are equal on the graph, you are paying AWS each month what the raw disk costs outright. Now, yes, you need a lot more than just raw disk to effectively store data, but even if you just assume a 5-year lifespan for that disk, the price difference displayed on the graph as equality is actually a 60x difference in plain dollars.

The price also appears to flatline because they are using a linear scale for data that spans orders of magnitude. The y-axis should use a log scale.
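
A back-of-the-envelope sketch of that unit mismatch, with purely illustrative prices:

    # Illustrative numbers only: the point is the unit mismatch, not the exact prices.
    disk_price_per_gb = 0.02        # one-time hardware price, $/GB (assumed)
    s3_price_per_gb_month = 0.02    # recurring price, $/GB-month (assumed equal on the graph)
    disk_lifespan_months = 5 * 12   # 5-year lifespan

    disk_total = disk_price_per_gb                            # paid once
    s3_total = s3_price_per_gb_month * disk_lifespan_months   # paid every month

    print(s3_total / disk_total)    # 60.0 -> "equal on the graph" is ~60x in plain dollars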


Now factor in redundancy and server costs, and the difference is not so huge. You're still paying a multiple of the raw cost, but unless you're storing on the exabyte scale I think it shouldn't really matter in the grand scheme of things.

The cost of a single engineer to manage a MinIO cluster probably already outweighs the extra cost you're paying at any reasonable scale (i.e., most companies). And if you're a big player, the published prices are not what you're paying.


They use wide erasure code stripes, so redundancy is ~ 1x within a data center. Let's assume 3 data centers.

It's well known how to build a storage node whose cost is mostly disks; let's say 50% of the hardware cost is not disks.

So redundancy and server costs explain up to a 6x markup.

Power and networking really shouldn't account for the other 54x.
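
Spelling out that estimate (all three factors are the assumptions above, not measured values):

    erasure_overhead_per_dc = 1.0    # wide stripes: roughly 1x the logical data per data center
    data_centers = 3                 # assumed number of facilities holding the data
    non_disk_hardware = 2.0          # disks are ~50% of node cost, so hardware costs ~2x the disks

    print(erasure_overhead_per_dc * data_centers * non_disk_hardware)   # 6.0 -> the ~6x markup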


You also need to house these servers (including backup power, etc.), manage them, maintain them, and develop and deploy the software. Also, you should really check the power usage of enterprise disks and servers; it may be cheap, but it's far from comparable to your average desktop. Then you need to add in reserve capacity: you can go and store 100 TB on S3 today and AWS will be fine with it, but they need to have those disks up and running already.

Don't get me wrong, S3 is expensive, but replicating the availability, feature set and scalability is going to be very expensive, too. You can cheap out if you don't need these features, of course.


I agree with you, but if the 6x markup is accounted for, you only have 10x left to explain out of the 60x.


It depends on whether you're comparing to raw disk or to the replicated storage, but yeah, spending 9x more on network and power than on machines is still pretty extreme.


That is the cost per month. Every single month you pay the full cost of the hardware you use.

The real killer on S3 is bandwidth anyway. Getting your data out of S3 costs much more than leaving it there, which I'm sure is by design.
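
To put rough numbers on that, using approximate 2022-era list prices for the first pricing tier in us-east-1 (treat them as illustrative, not authoritative):

    storage_per_gb_month = 0.023   # S3 Standard, first tier, approx.
    egress_per_gb = 0.09           # internet data transfer out, first tier, approx.

    tb = 1024  # GB per TB, close enough for this comparison
    print("store 1 TB for a month: $%.2f" % (tb * storage_per_gb_month))   # ~ $23.55
    print("download 1 TB once:     $%.2f" % (tb * egress_per_gb))          # ~ $92.16
    # Pulling the data out once costs roughly four months of just keeping it there.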


What happened in 2012 to bump the hard drive prices?



Egress costs (primarily) and also S3 storage costs are the main reasons we're looking beyond AWS for alternatives.



