Moore's Scofflaws (oxide.computer)
139 points by steveklabnik 7 months ago | 100 comments



I am constantly impressed by Oxide. Their writing is top notch, I think they're exceptional engineers and product thinkers, and clearly they care about the ecosystem they inhabit.

One day I would love to get all of their OSS up and running locally. Truly, why not try to run your own private cloud? How many old laptops, desktops, Pis, etc. must we all have lying around?


People have been setting up servers since they've had internet connections; it's only in the last few years that people started calling everything "the cloud" and acting like having a physical server on the internet is difficult or strange.

I don't know what old laptops and Pis have to do with anything, though; one $600 desktop is going to dwarf all of them combined in performance. There is no reason for anyone to do that unless they are a kid with only an allowance for income.


Yeah, a lot of us that are a touch older learned a lot of unix/linux fundamentals from running our own server in our closet, low upload bandwidth of DSL be damned.


One honest question for all people who miss the "good old days" of on-prem: have you ever worked in a shop that ran their own private datacenter? Because I did, and it wasn't lovely.

My first job ran their own datacenter. We started with a rack of about 50 nodes and some odd computers here and there. By the time I left we had a rack of 250 nodes and we had thrown away all individual servers (oh Solaris I won't miss you).

> Truly, why not try to run your own private cloud? How many old laptops, desktops, Pis, etc. must we all have lying around?

Although VMs and containers have made a world of difference in running your own datacenter, there are still reasons why just using the hardware that you have lying around for a business is a bad idea.

First, random laptops and desktops don't have a good ratio of price, performance, and energy consumption. Second, despite virtualization, all this heterogeneity in hardware and software is bound to cause an ongoing maintenance headache. Third, I hope your business doesn't have a sudden spike of demand, or you'll be struggling to find hardware to support it. Fourth, say goodbye to quick iteration and experimentation when you have huge lead times and tons of capex.

Bottom line: cool for a hobbyist project, not cool for starting a business (and a maybe for a well-established and mature business).


> why not try to run your own private cloud?

Because until you're big enough to have local weather systems evolving inside your offices, that's called 'servers'. I'm not saying not to do it, it's great, and cost effective, and much safer, but calling a small-to-mid business's servers 'cloud' is like saying I have a 'private Uber' in my garage that I can drive myself.


Right or wrong, I think the notion of "the cloud" that Oxide is selling is the "elastic" part--the ability to request resources by quantity instead of having to think about the exact servers responsible for serving your request. So physical servers become an implementation detail rather than a developer concern. That's a concept that can apply to how you provision two on-site servers (or even one, really) as much as a whole multi-data-center cloud computing service.


People have been doing that for 10+ years solid, even more with rudimentary middleware such as grid computing.


I don't think Oxide is trying to say no one's done this before, more that they're the best and most comprehensive offering to buy an onsite setup for this fully set up from a third party.


That is correct, yeah. Like every company and product, we do think that we are doing something special, but we'd never claim that nobody has done something like this before. Heck, a lot of the choices that we're making have come out of personal experiences of the founders doing that sort of work previously, as well as hyperscalers talking publicly about the choices they've made internally. We build on the shoulders of giants, just like everyone else.


It's a great value product, and I would definitely be excited if I was still filling my own data centre with VMs.

But you can't buy integrated like this without software support.

So when 0xide have the (nice) problem of topping out whatever market share they can win, their shareholders will want to extract more revenue from existing customers. That can only come in the form of licensing updates (or selling new sleds) according to _how_ they're used: by core, by power, AI model, number of customers, or whatever else.

Either 0xide are such a roaring success that Amazon reduce their cloud prices to compete. Or they save their select customers so much money on the cloud that they can start raising the rent.

(not a negative judgment by the way, customers win either way. Just putting this kind of marketing in context and a reminder that you don't own anything any more, haha)


You're right: their shareholders (especially the less imaginative ones) could pressure them to take a page out of IBM's playbook – the parallels between an 0xide cabinet and an IBM big iron cabinet hosting multiple VMs/LPARs are too tempting. Or, on a positive note, they could offer customers the option to invest in the "go the extra step" customer service engineering that IBM offers.


I really like 0xide's insight here. I don't know if Bryan has an MBA but this article reminds me of how I learned that good systems engineers make for good product people.

A friend of mine from Sun went to Santa Clara University for their MBA. I was always interested in running a company, and we talked about what core knowledge the school felt that a "Master" of business administration should know. Not too surprisingly there was a lot about how you could organize a bunch of unreliable elements (employees) into a structure that reliably delivered results (products). And in one sense that is exactly what cloud computing did as well, organizing unreliable "cheap" PC-type servers into a fabric of reliable service delivery. It is the system of organizing the parts that is the solution.

Greg Lindahl was one of the founders of Blekko, and his experience managing hundreds of machines for the supercomputer types with big machine budgets and low IT budgets was essential to Blekko's ability to deliver its search engine. With a DevOps group of six (one manager and five SREs), Blekko managed 2000+ machines in two data centers. IBM was astonished when they bought us that we could manage with so few DevOps engineers.

We had also done the "What would this cost us on AWS" calculation many times with many different variables, and the break-even point, where the cost to self-host at a colocation facility fell below the cost to host on AWS, was 120 servers. And that included reducing the DevOps group size to 3. So about 10 racks' worth of servers. There was a really funny experience when IBM insisted we move our infrastructure to SoftLayer (an IBM business) after IBM acquired us and we pointed out that the "soft money" cost of running our infrastructure on SoftLayer was about $9M a month versus $120K a month. At which point their finance group shut down that talk and just renewed the lease in the colo.
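
To make the shape of that break-even math concrete, here is a minimal sketch in Python; every dollar figure in it is an illustrative assumption, not Blekko's actual numbers:

    # Hedged sketch: find the server count where colo becomes cheaper than cloud.
    # All dollar figures are illustrative assumptions, not real quotes.

    def monthly_cloud_cost(servers, per_instance=950):
        # rough all-in cost of an equivalent cloud instance per month (assumed)
        return servers * per_instance

    def monthly_colo_cost(servers, devops_salaries=3 * 15_000,
                          rack_cost=2_000, servers_per_rack=12,
                          amortized_server=250, power_network=150):
        racks = -(-servers // servers_per_rack)   # ceiling division
        return (devops_salaries
                + racks * rack_cost
                + servers * (amortized_server + power_network))

    for n in range(10, 301, 10):
        if monthly_colo_cost(n) < monthly_cloud_cost(n):
            print(f"break-even at roughly {n} servers")
            break

With these assumed inputs the crossover lands around 120 servers, but the point is the structure of the comparison, not the specific numbers.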

But to successfully run a big distributed thing like that, you needed to both put your servers in a colocation facility and have Greg's software which allowed you to manage them with a small team. The total cost of that infrastructure depended on it.

I got a chance to go visit the 0xide team and they absolutely "got" that requirement. If I were running Engineering and Operations in Blekko today we totally would have kicked Supermicro to the curb and replaced them with the 0xide solution. I don't know if they have teamed up with Antithesis for testing their stuff but man, that combo of management software that never breaks and integrated server cabinets with all the things? That's a pretty good integration.


> I don't know if Bryan has an MBA

An MBA from the Sun Microsystems School of Hard Knocks, maybe. Otherwise I don't think so.


Ha, you're right on both counts. I really love the case study approach for MBAs, though -- and I feel like one of the best things we read in preparing for Oxide was the Stanford GSB case study on Nebula[0].

[0] https://www.gsb.stanford.edu/faculty-research/case-studies/n...


What was the secret sauce?


perl


Hah, was imagining some kind of organization breakthrough… “we wrote scripts,” not what I expected.


Ah, but it's really encapsulating a decade-plus of running clusters on limited resources into scripts that automated nearly all of the detective work on fault analysis, resulting in very clear signaling of problems. Combined with a good process for managing "microfaults" (Greg's term, not mine), that led to a system that was well instrumented (we only kept the data that was important for diagnosing faults) and robust in the presence of nearly any failure.
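
As a rough illustration of the idea only (a hedged sketch, not Greg's actual Perl; the metrics and thresholds are invented): keep just the data that matters for diagnosis and map it to a single clear verdict per node.

    # Hedged sketch of automated fault triage: retain only diagnostic metrics,
    # map them to one clear verdict per node. Metric names and thresholds are invented.

    THRESHOLDS = {
        "disk_errors":  ("replace disk",  lambda v: v > 0),
        "ecc_errors":   ("replace DIMM",  lambda v: v > 100),
        "load_average": ("investigate",   lambda v: v > 50),
        "ping_ms":      ("check network", lambda v: v > 200),
    }

    def triage(node, metrics):
        # yield (node, verdict) pairs for anything that needs a human
        for name, (verdict, exceeded) in THRESHOLDS.items():
            if name in metrics and exceeded(metrics[name]):
                yield node, f"{verdict} ({name}={metrics[name]})"

    fleet = {"node042": {"disk_errors": 3, "load_average": 12},
             "node117": {"ping_ms": 15}}

    for node, metrics in fleet.items():
        for node_id, verdict in triage(node, metrics):
            print(node_id, "->", verdict)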


Bryan Cantrill is a million times smarter than me so I'm probably missing something. I thought people bought into the enterprise cloud ecosystems because of cost savings elsewhere? If you go into a platform like AWS/GCP/Azure shouldn't you be designing your applications to take advantage of the flexibility in the wide range of services offered? If you don't need that and you're not buying into all the abstractions they provide, you could use any other cloud-lite (Vultr, OVH, Linode etc)?

Maybe I'm stuck in 2015 but are people really out here rolling their own clouds these days?


Our big belief is that elastic infrastructure (that is, cloud computing) should be orthogonal to the economic model (that is, own versus rent). Today, that is not the case: if you want cloud computing, you more or less have to rent it. But we know that there are many who wish (or need) to run cloud computing but on an owned asset. There are many factors that deter that today, not least the odious per-core licensing that we highlight here. (For a particularly brazen example of this, see VMware's infamous "AMD tax" ca. 2020.[0])

[0] https://news.vmware.com/company/cpu-pricing-model-update-feb...


Why aren't there prices on the website? (Seems weird that there isn't given that blog post, where costs are pretty central.)


(not Bryan, not actively part of the decision to put prices on the website or not, just my own take.)

Different audiences have different expectations. In the place we are, that is, enterprise sales, the expectation is not that you can go to a website, get a price, and click a button. The expectation is that the two organizations will communicate over a period of time ("the sales cycle") to work through everything that goes into a deal, and eventually come to an agreement or not. This includes so many variables that putting the price on the website wouldn't make sense, as you're never going to end up at that exact price.

I used to find this attitude frustrating, as an engineer, but the longer I have been a professional, and worked at various organizations, the more I come around to that being a good thing, not a bad thing.


As someone who often has limited time to research viable options to present to leadership, I expect to be able to negotiate pricing at the enterprise level if we want to move forward. What's deeply frustrating, and sometimes keeps me from even mentioning an option, is having no idea what the rough order-of-magnitude pricing is. Some kind of pricing is deeply appreciated.

If I don't know whether something will fit within the approximate budget for the project and can't quickly get an idea from other research, I'm not going to mention it as an option. I'm used to spending about a million per rack, but that's for a complete ESX cluster with storage and networking; if I have no idea how alternatives stack up against that, it's hard to put it on the table.


You are right that this sort of sales style has its downsides. I am also not saying there's no value in having something on the page, or that Oxide is always going to be this way. I am just talking about why it is a norm in many parts of the B2B world.


I will echo this. A product with no pricing information will often just get no further exploration, especially if a competitor does have it. Don't underestimate the overhead of just having to talk sales before you have a price.


I’m surprised there is anything that you can spend a million dollars on where nearly all the alternatives aren’t “please contact sales”. And even if there was, you couldn’t do any real analysis without contacting sales, because any displayed price would be highly negotiable.

Where I’m frustrated by this attitude is when I just want to buy 10 seats of something and it doesn’t have a price, not at the seven digit level.


All I really want to know is "ballpark" - are we talking four, five, ten figures?

I feel bad burning salesman time for a solution that is way out of our range, but I can't learn that until after a meeting or two.


I hear you, but I can assure you that this comes with the territory for sales people. Qualifying leads is part of the job.


Oxide is offering you radically different -better- terms. The price they mean to charge will almost certainly be based on what they need to a) fund more development and b) grow, and will probably also relate to what their competition charges. The price you would actually negotiate surely depends on other factors, like how badly you want those better terms. If the better terms aren't enough to make you talk to them about pricing, then Oxide surely isn't for you.


It's between a half million and a million.


If the price isn't listed, I always take it as a sign that nobody is buying it for price reasons ("if you have to ask, you can't afford it"). Outside of the blog, I don't even see a claim that it's cheaper so that's consistent.


That is understandable, but you would be mistaken.

That quote (which may not even have been said by J.P. Morgan) is talking about luxury consumer goods, which is a completely different market than business to business sales.


I wonder how many startups are overpaying for things like cloud because they are using consumer thinking instead of business thinking.


A huge number. I do a lot of due diligence of companies, and the over-engineering in the cloud and the costs have to be seen to be believed. A common theme now is people using lambdas and step functions in place of function calls. So a page load or transaction may involve two dozen serial lambda invocations. It is madness.
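
Back-of-the-envelope, the latency cost of that pattern is easy to see; the per-hop overhead below is an assumption for illustration, not a measurement from any of the companies reviewed:

    # Hedged sketch: added latency of chaining serial invocations vs. in-process calls.
    # The 30 ms per-hop overhead is an illustrative assumption.

    per_hop_overhead_ms = 30     # network + invoke + marshalling per hop (assumed)
    serial_hops = 24             # "two dozen serial lambda invocations"

    added_latency_ms = per_hop_overhead_ms * serial_hops
    print(f"~{added_latency_ms} ms added per page load")   # ~720 ms

    # The same work as in-process function calls adds effectively nothing.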

By way of contrast, a payments provider we explored had four moderate sized boxes running everything. They were about six years old and fully depreciated, but more than adequate to run millions of transactions through a month.


"If you have to ask, you probably can't fit it in your house."


> Our big belief is that elastic infrastructure (that is, cloud computing) should be orthogonal to the economic model (that is, own versus rent).

I’m probably missing something obvious, so take this as a genuine question rather than an attempt to debate, but why aren’t they fundamentally connected?

If you sometimes need 10 servers and sometimes need 100 (elasticity) then with renting you can always have the proper number. If you own, then you have to own all 100 servers.


(not Bryan)

> why aren’t they fundamentally connected?

They are connected in the sense that capacity planning is always a thing. But that doesn't mean that the ownership model is inextricably tied to the deployment model, which is the term I personally use instead of "elastic infrastructure." "I am making an HTTP call to request a new VM" is very different than "file a ticket with IT to procure me a new server, install it in the data center, and then send me the keys." The slogan "pets vs cattle" is an advanced form of this change in thinking. In some sense, it is hard to own your hardware yet treat your servers like cattle.
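
To make "an HTTP call to request a new VM" concrete, here is a minimal sketch against a hypothetical provisioning API; the endpoint, fields, and response shape are invented for illustration and are not Oxide's actual API:

    # Hedged sketch: provisioning as an API call rather than a ticket.
    # The URL, token, payload, and response fields are hypothetical.
    import requests

    resp = requests.post(
        "https://cloud.internal.example/v1/instances",
        headers={"Authorization": "Bearer <token>"},
        json={"name": "build-runner-7", "vcpus": 8, "memory_gib": 32,
              "image": "ubuntu-22.04", "disk_gib": 200},
        timeout=30,
    )
    resp.raise_for_status()
    print("instance id:", resp.json()["id"])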

You basically have four basic options:

* deploy elastic, rent your hardware: this is the default today for many startups

* deploy to hardware directly, buy your hardware: this is the Old Times

* deploy elastic, buy your hardware: this is Oxide

* deploy to hardware, rent your hardware: this is a thing, though not nearly as popular as the other three

(I call them basic because hybrid is a thing: you can own usual capacity and then rent more for bursts, or as a fallback, etc)

Now, Oxide is not the only game in town when it comes to "deploy elastic, buy your hardware": this is what many IT departments do. You could, for example, buy some servers, toss OpenStack on there. Oxide's thesis is that doing this is less than ideal, and we can make it significantly better. In fact, it is so suboptimal that many people choose "deploy elastic, rent your hardware" because it is so much easier. And now we're back to Bryan's statement.

Does that make sense?


It does make sense, thanks.


I'm not Bryan, obviously, but part of the answer here is that 100 servers running at 100% capacity is an absolute upper bound, but most of the time you're nowhere near that. Most of the time few things are at full capacity, which means that you can multiplex your physical hardware resources to increase utilization efficiency.
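
A quick sketch of why that multiplexing matters; the utilization numbers are assumptions for illustration:

    # Hedged sketch: physical servers needed when workloads rarely peak together.
    # Utilization and headroom figures are illustrative assumptions.

    workloads = 100          # each sized as if it needs a whole server at peak
    avg_utilization = 0.15   # typical steady-state load per workload (assumed)
    peak_headroom = 2.0      # capacity buffer for uncorrelated bursts (assumed)

    dedicated_servers = workloads                                             # 100
    multiplexed_servers = round(workloads * avg_utilization * peak_headroom)  # 30

    print(dedicated_servers, "dedicated vs", multiplexed_servers, "multiplexed")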


I actually find it interesting how many things we (by which, in part, I mean I) got wrong about cloud computing early on. Various other things too.


Even taking into account the cost of skilled staff, AWS was never price competitive with in house, at least in my experience. I simply could never make the numbers work.

In my industry (telco) we had two teams: my team ran our own hardware, the other team ran less than 10% of our workload on an AWS stack that cost as much per month as we paid per year - including annualised capital costs.

They also had double the ops team size (!!), they had to pay for everyone to be trained in AWS, and their solution was far more complex and brittle than ours was.

Assuming Oxide would have been price competitive with what we were already using, I would have jumped at the chance to use them, I could have brought the other team on board, and I think it would have given us a further cost and performance advantage over our AWS based competitors.


Were you using cloud-scale-style purchasing in house, or were you on enterprise servers with enterprise switches and enterprise storage?

The cloud is good for many users, especially if they migrate to cloud native system design, but as a telco you would probably have facilities and connectivity which helps out a lot.

Companies like Mirantis, choosing a technology completely inappropriate for distributed systems (Puppet), put a bad taste in the mouths of many people.

I implemented OpenStack at one previous employer to just convince them that they could run VMs, intending it to allow for a cloud migration in the future.

As they ran a lot of long lived large instances it was trivial to make it cheaper to run in our own datacenter. Well until I moved roles and the IT team tried to implement it with expensive enterprise gear and in an attempt to save money used FEXs despite the fact I had documented they wouldn't work for our traffic patterns.

Same thing during the .com crash. I remember our cage was next to one of the old Hotmail cages with their motherboards on boards. We were installing dozens and dozens of Netras and Yahoo was down the hall with a cage full of DEC gear...we went under because we couldn't right-size(cost) our costs.

A lot of the companies who save a lot in cloud migrations were the same, having decked out enterprise servers, SANs, and network gear that was wasted in a private cloud context.

Enterprise ___ is often a euphemism for "we are fiercely defending a very expensive CYA strategy irrespective of value to the company or material risk."


We provided SaaS billing services to telcos. We were very successful in our market, but not very big. Just a couple of racks of gear.

Our production workload was pretty homogenous, and we were super cheeky - we’d use previous generation servers to keep capex down. We didn’t even use VMs, just containers; docker-swarm was good enough (barely). Our bottleneck was always iops, so we’d have a few decked-out machines to run our redundant databases. It worked fine, but I have subsequently enjoyed working with k8s a lot more.

We did use enterprise gear, but previous gen stuff is so much cheaper than current gen. So our perf per watt was not great, and we’d often make the decision to upgrade a rack when we hit our power limits, since we rarely had other constraints.

As mentioned elsewhere, we did use AWS spot instances for fractional loads like build and test. It’s not that we didn’t use cloud, it’s that we used it when it made sense.

All of that said, I do suspect the equation has changed - not with AWS, but with Vultr. I’ve deployed some complex production systems there (Nomad, Kafka+ZK, PG) and the costs are much closer to in house. They have also avoided the complexity of binding all their different services together. They also now provide K8s and PG out of the box, charging only for the cost of the VMs - as opposed to the wild complexity of AWS billing.

So maybe I’m coming around.


> in an attempt to save money used FEXs despite the fact I had documented they wouldn't work for our traffic patterns.

What is a FEX? Feline Expedited eXchange?


Cisco Fabric EXtender. Like a TOR switch but dumber.


At my previous employer (which was telco adjacent in a way) we basically came to the same conclusion. Big plans were drawn up to only locate specific hardware in physical data centers and move 90% of the load to the cloud. It never left the planning stage because our VP could never get the numbers to work. Once you cross a certain threshold of services, scale and reliability you’re paying a premium to be in the cloud.


This has been my (admittedly limited and small-scale) experience: it's hard to make cloud competitive unless you have something like vastly changing requirements, huge burst needs, etc.


Right - we made a lot of use of spot instances for building and testing. It’s great for that kind of fractional use, for sure.


Yes. Example: oil & gas, and telcos. Some of them believe that the public cloud doesn't do what they need it to do. Super-high-performance networking for example, IPv6, SCTP protocol, whatever. IIRC Schlumberger built a private cloud on OpenShift and Verizon are a decade into rolling their own cloud built around OpenStack and OpenShift (they have blogged/presented about it quite a bit). AT&T built a private cloud and then tied up with Microsoft to make Azure Operator Nexus (see https://learn.microsoft.com/azure/operator-nexus/overview).

I honestly don't know who is on the right side of technology and economics here. One glance at Oxide or Nexus says these systems aren't cheap. But if those industries are right, and regular cloud genuinely won't cut it, then maybe these systems can find a niche.

One thing that would determine the fate of these platforms is if the major vendors, that everyone in a given industry uses, commit to it. For example, in telco that might be Ericsson, Nokia, and Huawei. I don't know if they have sufficient incentives to do so yet. They would, I guess, prefer to wait for a clear leader to emerge and then commit to only that rather than bet on three or four speculative horses?


There are alleged cost savings elsewhere. I haven't seen a company that's actually realised them. Every company I know of that's gone all-in on cloud is paying a fortune and still has a big team to manage everything.


My company alone has over 150 case studies that document specific positive ROI on cloud adoption. Not everyone can or will run their own; I'm pumped for Oxide and I think they are going to smash, but some teams really really benefit from Public Cloud.


I think Kubernetes has partly been about not relying directly on all these cloud services anymore. Sure, AWS and friends have 200+ services that you could use, and if you do use all of them, it would be hard to move to Oxide.

Many applications don't depend on all that stuff, just vm, storage and db. Partly that is by design so the application can be deployed in different clouds. Also older application that moved to the cloud.


Looking at Oxide, aren't hyperconverged computing solutions a product category? Looks like Oxide's yearly revenue is on the order of $15M/year with about $45M of Series VC cash. So these folks might be smart but they've not shipped very much product yet. At scale (1-10K servers) I'm not sure why I would tell my CTO or CIO to buy from them instead of a larger, more established vendor?

Definitely agree that you can easily pay too much for cloud services and that the hyperscalers have every incentive to put as much margin in their pockets as they can get away with.


I don’t know Oxide’s solutions personally, but the fact that the economics have always favored self-hosting and yet AWS is a gazillion dollar business suggests that while they may be building in a pre-existing product category, nobody’s actually gotten the product right yet.


I wrote a bit about "hyperconverged" vs "hyperscale" and how we're the latter here https://news.ycombinator.com/item?id=30688865

> So these folks might be smart but they've not shipped very much product yet.

We just started shipping late last year, announcing our first two customers then: https://oxide.computer/blog/the-cloud-computer

So yes, still early :)


Tangent but "Your margin is my opportunity" was originally a Sam Walton quote when he was building the 5 and dime stores. Bezos was deeply inspired by early Walmart and Sam.

https://www.acquired.fm/episodes/walmart


That's interesting -- the quote is very much attributed to Bezos. Did the quote actually come from Sam Walton, or merely the sentiment? Either way, it's clear that the Waltons were an inspiration to Bezos!


Quote Investigator comes down pretty hard on the Bezos side and doesn't mention Walton. https://quoteinvestigator.com/2019/01/13/margin/


Thank you for correctly citing Wright's law, but I will say that Moore's law is very much over-cited. Moore spoke only on number of transistors per chip, not anything about transistors per dollar or transistors per unit area.


This was a great blog post highlighting some of the weird business models that plague cloud computing. I tried to find competitors to suppliers like Oxide in the EU where there should be a huge amount of business cases for "datacenter in a box". Surprisingly there are very few options available (yes, Azure Stack is available but there is still licensing per cpu for that).

Why aren't there more players in this area?


We have definitely had the same question! I think the answer is twofold: it's technically hard -- and it's cross-domain in that you need both hardware AND software to pull it off. On the one hand, it's not like such expertise doesn't exist (and we are not doing our own silicon here!), but on the other, even with the right team, it is time-consuming (and therefore expensive).

If the difficulty and cost were the only challenge, this would be a candidate to do at a larger company, but the cross-domain nature of it makes it really thorny: you would need a lot of internal alignment to succeed -- and (more challenging) you need to maintain that internal alignment for a protracted period of time. I had done something not wholly dissimilar (though frankly much less ambitious) at Sun back in the day[0] -- and even though Sun was much more amenable to this kind of disruptive endeavor than any company of its size[1], we barely pulled it off. Indeed, some of the worst behavior I ever saw at Sun was from the people who were trying to prevent us from succeeding because they felt it threatened them. I simply cannot imagine doing something more ambitious at Sun -- or as ambitious anywhere else.

We came to the conclusion that it really has to be new company formation -- which means raising money to do it, which means finding the right investors. Even though the upside is extraordinary, finding the right investors in hard tech is really, really tough[2]. So yes, there should be more options -- but there aren't, and there are frankly unlikely to be in the foreseeable future...

[0] https://bcantrill.dtrace.org/2008/11/10/fishworks-now-it-can...

[1] https://bcantrill.dtrace.org/2011/07/12/in-defense-of-intrap...

[2] https://oxide-and-friends.transistor.fm/episodes/deep-tech-i...


VxRack, Nutanix, Outposts, Azure Stack as you said, probably something from Oracle Cloud...


Isn't this just an ad?


It's definitely content marketing, but I find Oxide interesting enough that I don't mind.


It's just an ad with a kitschy title, but people hate realizing too late that something was an ad.


KVM is free. ZFS is free. Linux & BSD are free. The margins on server kit are tiny. If I've grown to the point that I need to buy my own hardware, the application software is probably something written in-house, so no licensing there either. Oxide gets plugged here non-stop, but the audience here still struggles to see what the point is. Sounds like a vanity project to keep some of the old Solaris talent in-pocket.

edit: and the current growth-area is GPU/ML processing, where Oxide has bupkis.


>KVM is free. ZFS is free. Linux & BSD are free.

Your time, hopefully, isn't.

>The margins on server kit are tiny.

And man, does it ever show.

>Oxide gets plugged here non-stop, but the audience here still struggles to see what the point is.

I'd like someone to appreciate music with; but alas, my cat doesn't like Debussy, she likes tuna, and I must love my cat for what she is.

That the HN audience struggles to see the point of a properly-integrated systems offering isn't an indictment of that offering. They (and good for them!) may never have had the professional experience of having to manage a large enterprise IT resource on questionable hardware with indifferent software support from multiple vendors.

>Sounds like a vanity project to keep some of the old Solaris talent in-pocket.

Then that would be a vanity project for Oracle; but I believe that ship has sailed.

>the current growth-area is GPU/ML processing, where Oxide has bupkis.

Nobody but nvidia has anything hw-wise in that area (for the moment...).


> That the HN audience struggles to see the point of a properly-integrated systems offering isn't an indictment of that offering.

It definitely shows who it's actually an indictment of though.


> Your time, hopefully, isn't.

Even on low-end hardware, the cost of competent deployment amortizes to zero. Like most industrial pursuits, the deployment cost of the first box is high; the cost for each additional box is orders-of-magnitude less. Add the end of Moore's Law into the mix--getting many years of economic service out of boxes that would have been obsolete within the year--and the cost is lower still. There are people out there who've been running aisles of rock-bottom Supermicro gear, for almost a decade, without incident. Gabriel's Law: "worse is better". FAANG rode Gabriel's law to the bank.

> Then that would be a vanity project for Oracle

As for "in-pocket", I did not mean Frankenstein's Solaris. I meant Operation Paperclip. Knowledge was acquired on someone else's dime, and it would be imprudent to leave this talent disengaged--to be retrieved and used by one's competitors.

> Nobody but nvidia has anything hw-wise in that area

Exactly. That is the castle that needs to be stormed. That is where the treasure is. 0xide is running a war against an already-dying kingdom. They might as well be playing minecraft.


> There are people out there who've been running aisles of rock-bottom Supermicro gear, for almost a decade, without incident.

Have there? But not you I take it?

Bryan worked at Joyent and ran a public cloud on Dell and Supermicro gear. And it seems they had plenty of problems.

I have heard the same sentiment in other places.

Additional evidence comes from the cloud providers themselves. Why did all of them develop their own stuff if operations with that stuff is so problem free?

Google tried it and rejected it quickly.


> Have there? But not you I take it?

I have a variety of customers running a variety of different x86_64 gear. From an up-time perspective, there has been low correlation between price and reliability. If we're talking non-x86_64 systems--like zSeries or industrial control systems--then the price and the reliability are correlated, but that is not an x86_64 market, and probably never will be.

> Bryan worked at joyant and ran a public cloud on dell and supermicro gear. And it seems they had plenty of problems.

Never called it perfect. I wrote "worse is better"--the "best" losing out to the "good enough". The Supermicros I mentioned were "good enough". Of course Cantrill et al. had problems with them, since they were running Solaris/Indiana on x86_64 gear, expecting Sun-tier SLAs. They'd need Sun-tier hardware to accomplish that, but the Sun that made that kind of gear is dead, just like Joyent. They are both dead for a reason: worse is better. Google also used bare-bones discount servers in their early days (not wasting money on brands, SLAs, and fancy cases[0]).

> Additional evidence comes from the cloud providers themselves. Why did all of them develop their own stuff if operations with that stuff is so problem free?

Let's say there are three tiers of users 1) small-fries who spin up a few AWS images, 2) businesses who've grown to the point where public cloud is no longer economical, but not invested enough in computing to justify developing their own hardware, and 3) FAANG, who have so much money and so much need that it would be absolutely stupid for them not to develop their own hardware. At FAANG-scale it's no longer just about cost or quality; it's about being in control of your own business. #2 would seem to be 0xide's target market, but as someone who inhabits that space, I am unimpressed.

[0] https://en.m.wikipedia.org/wiki/File:Google%E2%80%99s_First_...


> Never called it perfect. I wrote "worse is better"--the "best" losing out to the "good enough".

What counts as good enough depends. Sure, maybe if you built new processors and everything it would be even better.

But using the same processor at the core as typical servers and just surrounding it with a slightly different architecture can improve things. That's exactly what Google does, too.

> Of course Cantrill et. al. had problems with them, since they were running Solaris/Indiana on x86_64 gear

Ah, the old 'it's the OS's fault' excuse. Classic.

> dead, just like Joyent

Joyent was bought, it didn't go bust and the actual infrastructure is still running.

> Google also used bare-bones discount servers in their early days

Again, that's exactly what I am saying. Why do you think google stopped doing that?

> At FAANG-scale it's no longer just about cost or quality; it's about being in control of your own business.

Except if you actually listen to interviews with the people who did those innovations at Google and co., the reason they stopped working with the traditional vendors and did custom stuff was that they couldn't get the quality they needed out of the traditional architecture.

That is also why google invested in coreboot, linuxboot and are running their internal infrastructure on things like NERF firmware, instead of the standard you get from the typical vendors.

> #2 would seem to be 0xide's target market, but as someone who inhabits that space, I am unimpressed.

Have you done an actual detailed comparison with real price quotes compared to a traditional system plus all the software and setup and so on? Because if you haven't, it's not worth much as an opinion.


> Ah the old 'its the os fault' excuse. Classic.

The problem wasn't Solaris. The problem wasn't x86. The problem was Cantrill & Co.'s expectations. The industry began moving to availability in software instead of hardware back in the early 2000s. Premium iron vendors fell on hard times because of this. They had a premium product with a premium price, but demand for it dried-up. Cantrill came from one of those companies, and is right to think commodity x86 hardware is crap in comparison, but it doesn't matter for the mid-tier and below markets anymore. They've already cleared this obstacle.

> That is also why google...

What Google does is irrelevant to this discussion. That is not 0xide's market. Businesses seriously competing in the compute space are going to vertically integrate as much as possible--not just motherboards and lights-out management, but all the way down to the CPU: Google's Cypress & Maple, AWS's Graviton, Apple's M1. Switching to 0xide kit is not vertical integration.


I have heard a lot of whoppers in my time, but hearing that my expectations are too high might be a first!


Just going off what fanboy wrote. If you went into the business of running foreign workloads on commodity x86 hardware expecting anything more than pain and sadness, your expectations were too high.


I am so confused by your position. It seems to be that you believe x86 PC hardware is garbage, but that for most companies garbage is just perfectly fine and the only people who shouldn't use garbage are companies that can make their own CPU? And anybody that tries to improve the status-quo shouldn't even try. That's quite a line of argument.


> I am so confused by your position

If you read with a little less haste and hostility, and a little more humility--rather than just jumping in and getting personal--you will see that I am being perfectly clear.

I am saying that this battle already played-out, 15-20 years ago. The people who need primo hardware already have things like zseries, which 0xide is in no position to compete with. The people who are competing against Azure and AWS have to make their own hardware--partly to distinguish themselves in the marketplace, partly because they'd end up sharecropping for their vendor if they did otherwise. The rest is in the commodity band, where interchangeability is key. An okay server that I can swap-out with any other vendor is an asset. A fantastic server, with custom components & bespoke tooling, that I can only get from one supplier, is a liability. All of their open sourcing makes no difference if it only runs on their equipment.


> The people who need primo hardware already have things like zseries

>"People who need primo hardware" ...can buy a mainframe. That is quite the take.

>The rest is in the commodity band, where interchangeability is key. An okay server that I can swap-out with any other vendor is an asset.

Setting aside the trouble one has already incurred with a system that needs to be swapped out, switching vendors (even of erstwhile "commodity" systems) isn't a neat process in any medium-to-large organization. There's purchasing, contracts for maintenance, depreciation, validation by your DC people that power and cooling for the new box is good...

And...in the end, if your hardware is truly commodity, swapping out for another vendor is not going to yield a better outcome in the long term. You'll be riding the crap-hardware merry-go-round all over again, just on a different horse.

Here's a nickel, kid. Go buy yourself a better server.


> If you read with a little less haste and hostility, and a little more humility--rather than just jumping in and getting personal, you will see that I am being perfectly clear.

You literally called me a 'fanboy', but somehow I am the hostile one. I didn't say anything hostile in the slightest.

You even doubled down on your position: if you were ever going to be Google or buy a mainframe you would have done it already, so be happy with the garbage provided. That's literally your argument.

> I am saying that this battle already played-out, 15-20 years ago.

Markets change over time. It's never set in stone. The industry is continuing to grow, computing needs increase for all companies, and many, many billions get spent each year on new servers.

The main reason the 'fancy' hardware from Sun and friends failed was the processor on the hardware side and open-source Linux on the software side. The custom motherboards that cost slightly more, and the firmware, weren't actually the real problem.

Also, 20 years ago was before virtualization and scripted setup were universal; things were just totally different. Decisions from back then shouldn't dictate what the right decisions are today.

> The people who need primo hardware already have things like zseries

Again, the market is growing: many companies that might have needed 10 computers in the past now need 100. And those that needed 100 now need 1000. And so on.

There is a gigantic gap between a mainframe and your standard Dell PC Server, there is lots of space between those things.

> The rest is in the commodity band, where interchangeability is key.

And yet a huge amount of it isn't actually this perfect interchangeable commodity. Sure, you can get any basic OS to boot and there are some standards, but the situation isn't close to actually being a true commodity.

It also depends on where you expect interchangeability. If you define much of your infrastructure with terraform scripts and a few services to monitor everything, then that's the interchangeability layer you actually care about. Sure, if you buy a bunch of Oxide servers, you can't just rip one out and replace it with a server from some other vendor; you have to rip out the whole rack and replace it with some other rack.

But whatever that something else is, it's just going to be another target for your terraform scripts that gets monitored in a comparable way (again, that part isn't neatly standardized anyway). Industry 'standards' like Redfish are not close to providing that even for pretty low-level stuff. And once you want to do more complex things there are even fewer standards.

If you operate a number of Dell VxRail racks, it's not that easy to just rip them out and replace them with some rack of HP machines. Again, this is especially true if you use anything but the most basic features, because very quickly you are in proprietary, non-standard software hell. Guess what: Dell doesn't actually want their racks to be perfectly easy to replace.

And if you have an issue where your Dell rack and your HP rack behave inconsistently, guess who isn't going to care about that: Dell and HP. And guess what, neither of their implementations is open source either, so good luck with that. There is a reason people stick to a low number of vendors.


One can simultaneously be impressed by the engineering while wondering if this level of custom implementation at what will probably be pretty low scale really makes sense.


Bezos had a clear proposition: "I can handle your workload better than you can for X cents per hour." Cantrill doesn't.


Bezos's proposition needed to be qualified for each customer; in some cases, he was wrong. Yes, sometimes renting other people's hardware makes sense, for a time, until it doesn't.


I wrote 'proposition', not 'fact'. Jeff made a simple proposal, then I, as the customer, can evaluate it, accept it, or reject it. 0xide doesn't have a simple proposal. They have a constellation of nitpicks and tastes, but that's not good enough. The typical consumer in this space wants quantifiables.


> edit: and the current growth-area is GPU/ML processing, where Oxide has bupkis.

Yes, and about 5000 startups plus many large companies are going after that market.

The market Oxide is after is proven to be large, not very innovative, and has very few new competitors. And guess what, it's not industrial machines that survive 50 years: servers are not cost effective after a few years, so there is a lot of turnover. Customers also don't have a huge amount of brand loyalty.

I can see worse business strategies than that.


I know AWS has a reputation for being expensive overall. But it is crazy that it has the cheapest VMs. See here: https://cloudoptimizer.io/

But of course other things like egress, serverless services more than make up for it.


Per core licensing is pretty dumb, especially considering commodity systems now have 32, 64, or more cores, and server cores vary widely in capabilities.

Per core maybe made a tiny bit of sense when a server was one or two cores. But that era's long gone.
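
A quick illustration of how that scales with modern core counts; the per-core price is an assumption, not any vendor's actual list price:

    # Hedged sketch: per-core licensing cost vs. core-count growth.
    # $1,500/core/year is an illustrative assumption, not a real price list.

    price_per_core = 1_500
    for sockets, cores_per_socket, era in [(2, 2, "~2006"), (2, 16, "~2015"), (2, 64, "today")]:
        cores = sockets * cores_per_socket
        print(f"{era}: {cores:>3} cores -> ${cores * price_per_core:,}/year")

Same software, same license terms; the bill grows with the core count even though the per-core capability varies wildly.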


IMHO per socket was extractive most of the time, especially when memory capacity was the limit on performance.

The old line 'Oracle doesn't have customers, it has hostages' applies here.

The main purpose shifted from a hypothetical development cost offset to an effort to extract more fees from customers who were locked into your product.

Broadcom's changes to VMware, where they have publicly stated that they will increase revenue without attempting to build new revenue streams or add any value for customers, are happening now.

I get that any technology moves to an extraction phase after growth slows.

But for me, moves like per core licenses are an indicator we need to consider vendor mitigation.

Unfortunately, capital markets seem to prefer these lower but predictable long-term gains through extraction vs. long-term dividends, etc...


Cloud, for better or worse, seems to be sold in vCPUs (read: cores), RAM (gigglebites), and disk.

I wish there was a much better way to compare vCPUs than just "count" because one core of some five year old server is not the same as a core of a modern performance beast.


It's not like they charge the same price for a current gen vCPU as a five-year-old vCPU. The benefit of self-service is that you can benchmark it yourself.


Also it is a bizarre unit to break things up over. Actually, how do per-core programs account for things like SMT? What about Bulldozer? Can I get a discount license if I only run it on an “efficiency” core? And we won’t even get into GPUs…

What next? Let’s charge users based on how many instructions they have in flight for our application. These ILP cheaters have been mooching off the rest of us for ages!


Oracle and IBM have detailed pricing tables with dozens of cases like "one mainframe core is worth three Intel cores". But then you negotiate the price anyway so it seems kind of pointless.


It made a lot more sense back in the day. Then a lot of things switched to per-socket.


I am really curious if part of the reason Oxide is leaning into the rack size is that Joyent scared them off attempting a 1U/2U solution.


Case is inverted with javascript disabled. I've never seen that before!


I can't reproduce in Firefox on macOS. Tell us more about your setup!


For that visit, Windows 11 with Firefox developer. Ublock Origin and Noscript, with some combination of previously allowed and default-blocked script sources. I didn't really investigate what was missing to mess with the case, but allowing all scripts on the page cleared it.

Just a weird novelty impact that I hadn't seen before, not really a big deal. Also I retract "invert". I went back and looked again and it's definitely not inverted. Check it out!

https://zlnp.net/serve/weird-case.png


Ooh, very cool. Looks like you're overriding the font at the browser or OS level? We suspect that has something to do with it. If you look at the raw HTML response, which is what should be getting rendered when you have scripts turned off, you can see that the casing is normal. The mystery to me is why turning scripts back on would fix it. Maybe that overrides your font override and changes it back to our font.


I'm not aware of any font overrides (that's not something I typically fuss with) but now you have me suspicious that I don't know my own setup!


The cloud is where Moore's Law goes to die.


Clouds = heaven, it fits.



