Sun engineer responds to the Backblaze "Petabytes on a budget" design (c0t0d0s0.org)
169 points by bensummers 2944 days ago | 103 comments

Ok I must be missing something. Everyone is knocking this Backblaze thing because it doesn't do ZFS or because it is not a "super duper high end SAN replacement" or because the components are not rated for enterprise level work. It seems to me that a company whose product is personal and small business backups does not need any of those things. They have software to do mirroring. Why would they need any fast network filesystems if all data comes and goes via the internet? Will the hard drives see that much data I/O, or will they mostly just fill up and sit idling with the very occasional read for restore (I strongly suspect the latter)[1].

Of course on the other hand, there are the people touting it as the end-all be-all, a solution to kill NetApp.

I guess what I'm wondering is: how did so many people get the idea that this article about a specific solution to a specific problem was actually some sort of general purpose solution attacking all the big name people? What am I missing?

[1] A huge chunk of this article is about the hard drives and PSUs not being enterprise ready. That matters for an enterprise load, but I just don't see that load here. I bet a lot of these boxes run idle a large chunk of the time. I have 10 year old desktop hard drives running just fine in a file server, because it has a similar load: mostly idle most of the time.

Exactly - this is a replacement for a tape library, not a database server.

Indeed. It would be nice to see some comparisons to the architectures (and, as everyone seems to care so much, prices) in the various 'VTLs' sold by the likes of Falconstor, HP, EMC, and StorageTek/Sun!

"how did so many people get the idea that this article about a specific solution to a specific problem was actually some sort of general purpose solution attacking all the big name people?"

Backblaze did it. They did it when they decided to put this graph in the article:


It's an apples to oranges comparison.

Well no, it's not an apples to oranges comparison because no hardware vendors offer a solution which is comparable to the one Backblaze have devised.

That graph is simply comparing their solution to the closest commercial equivalents. It just so happens that these are all way off because all hardware manufacturers want to design their hardware to provide the highest throughput.

(That said, our experience with the Sun X4500s hasn't been great from an I/O point of view.)

It is an apples to oranges comparison because Backblaze is only including the cost of components in that graph for their solution, but they're including all of the research, development, assembly, and support costs in the other vendors' solutions.

What are the labor costs for testing all their hardware (with 10 SATA controllers no less)? What are the labor costs for assembling all those systems? What are the labor costs of developing the system architecture? What are the labor costs of creating their storage application which reduces their need for in-box redundancy? And if you want to talk about Amazon there is data center rental, power, cooling, and ongoing administration.

In short, they're comparing the costs of their components to the costs of a ready-to-go solution from other vendors. Sounds like apples and oranges to me.

They did invite these criticisms by drawing comparisons to S3, EMC, etc. However, they already tried to factor out the cost of operations so that S3 was on equal footing with the non-service offerings. We can quibble about the specific numbers they used, but I think the bigger quibble is that by their own estimates, those costs are substantial, seemingly far outstripping their hardware costs by something like an order of magnitude. Even so, at the scale of a petabyte or more, the savings are substantial enough to be worth addressing.

As for the missing costs that you cite, think this through. What do you really think the per-unit costs for assembly and testing are? Even if each unit required a couple of man-days, the costs would probably only add ~10% or so per unit. The other costs you cite for the initial hardware and software engineering are fixed costs, the same whether they are storing 1PB or 1,000. Also, if they did their job right, much of the per-unit testing cost should be mitigated by the overall systems design. The system management automation can do a test cycle on new nodes before promoting them to production use, and of course, failures should be dealt with automatically.
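
A rough way to sanity-check the "~10% per unit" estimate and the fixed-versus-per-unit distinction above. This is only a sketch: the $8,000 unit cost, the $50/hour rate, the 8-hour day and the $500,000 engineering figure are assumptions picked for illustration, not numbers from the thread.

```python
def labor_overhead(unit_hardware_cost, man_days, hours_per_day=8, hourly_rate=50.0):
    """Return assembly/test labor as a fraction of one unit's hardware cost."""
    labor_cost = man_days * hours_per_day * hourly_rate
    return labor_cost / unit_hardware_cost


def cost_per_unit(fixed_engineering_cost, unit_hardware_cost, per_unit_labor, units):
    """Spread a one-time engineering cost across every unit built."""
    return fixed_engineering_cost / units + unit_hardware_cost + per_unit_labor


# Hypothetical unit: $8,000 of parts, two man-days of assembly and testing.
print(f"labor overhead: {labor_overhead(8000.0, man_days=2):.0%}")

# The fixed engineering cost matters less and less as the fleet grows.
for units in (100, 1000):
    print(units, cost_per_unit(500_000.0, 8000.0, 800.0, units))
```

With these made-up inputs the labor share comes out at 10%, and the engineering share per unit shrinks from $5,000 to $500 when going from 100 to 1,000 units.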

Good point, but I suspect even given the amount of money they've spent on development, and given the production runs on these things are tiny, they'll probably still beat Sun on total price. Hell, even if they only made one of the things they could spend $800k on developing/manufacturing it and still beat Sun.

As far as I'm concerned, you'd have to write the custom software even if you used Sun's boxes. I wouldn't trust data to a single machine of anyone's design. Though that's probably just me.

The S3 comparison is totally invalid though, I agree.

"That said, our experience with the Sun X4500s hasn't been great from an I/O point of view."

Of course it's not. It's an IBM PC down inside. I bet it can still boot MS-DOS and run GW-BASIC. ;-)

PCs are not a nice architecture for servers: there are starvation points all over the system, from registers (AMD64 solves some of it) to memory to I/O. Sun could have based the Thumper on a more server-ish (SPARC?) design, with plenty of memory and I/O bandwidth, but then it would not run Windows and the ability to run Windows is a defining advantage in the high-volume server market.

If I were to design such a box from scratch, I would couple the disks close to the network interfaces over a dedicated bus so the CPU could just say something like "hey, disk 3, drop blocks 10239 through 10300 on buffer 12 of your network controller while I go on with header parts and prepare to send it off". While I am at it I would skip the PSUs and go with DC power and a small battery, Google-style.
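
The closest software analogue to the "disk 3, drop these blocks on the network controller" idea is probably the sendfile family of calls, which hands file data to a socket inside the kernel instead of copying it through the application. It is not the dedicated bus described above, just a sketch of the same intent. The function name `send_block_range` and the 512-byte block size are assumptions for illustration.

```python
import socket


def send_block_range(sock, path, first_block, last_block, block_size=512):
    """Send blocks first_block..last_block (inclusive) of a file over a socket.

    socket.sendfile() uses os.sendfile() where the platform provides it, so
    the payload moves from the file to the socket inside the kernel rather
    than through a user-space buffer. It falls back to plain send() otherwise.
    """
    offset = first_block * block_size
    count = (last_block - first_block + 1) * block_size
    with open(path, "rb") as f:
        return sock.sendfile(f, offset=offset, count=count)
```

The return value is the number of bytes handed to the socket; the application never touches the payload itself and stays free to build headers in the meantime.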

I fear I probably described something Thumper-ish and I will be ridiculed by someone with lots of server design experience, but that's the life of a hardware engineer who went the software route very early. And I welcome such criticism ;-)

  PCs are not a nice architecture for servers: there are
  starvation points all over the system, from registers (AMD64
  solves some of it) to memory to I/O.
Could you clarify the ways in which you think a PC architecture will hobble a data server? It looks like the X4500 runs dual Opterons, which relatively speaking have no shortage of registers. And it supports 16 GB of memory, which doesn't seem too shabby. And while I'm not sure how the SATA controllers are connected to the bus, I'd be a little surprised if this was the bottleneck. Which part of the 'PC architecture' do you think is the limiting factor?

- It's not the amount of memory, it's how much memory you can transfer without disrupting program execution due to bus contentions. It makes a huge difference whether your 16 GB of memory are in two, four or eight sockets (different sockets can (or at least could) be accessed simultaneously) or if it's attached to a single processor or pooled system-wide.

- AMD64 is a little better than vanilla 32-bit x86, but it still has few registers compared to POWER or SPARC architectures. This increases the risk of memory access, which is bad. I suppose there is a point when it's pointless to add registers and x86s do some convoluted stuff with shadow registers, so the picture is not really clear. Optimizing compilers should alleviate this too, but, like car builders say, there is no substitute for the cubic-inch.

- Still about processors, the least a multi-threaded CPU can do for you is to keep an execution context in-chip and prevent a context swap from memory. That saves a lot of memory bus time that would otherwise be unavailable to other parts of the system. AMD64s (and their Intel counterparts, AFAIK) max out at 2 threads/core. POWER and SPARC max out at 4 and 8 tpc respectively (again, a number off the top of my head).

- On PCs (defined here as "a computer that can run Windows"), there is little distributed intelligence. I never saw a PC where CPU, disk controller and network interfaces could do the chat I described. Contrast it with the typical vintage mainframe design, where there are as many things going in parallel as designers can think of. As an example, there is an IBM disk-drive in the Computer Museum where you can see two sets of heads/arms on opposing sides of the disk, effectively being able to read/write different cylinders simultaneously. While I don't believe such machines are in current use, this serves to illustrate how far a server designer is willing to go in order to beat a throughput record.

It's an apples to oranges comparison.

Maybe for the low-level tech, but certainly not what it's used for. There are lots of online backup services out there. Some of them publicly make it known they use Amazon S3 for storage. I'm sure there are some other companies, who own their hardware, like Backblaze, but didn't optimise their cost structure using custom builds.

If "doing backups" is your business, then "look how cheap our kit is!" probably isn't the best marketing strategy. A "small business" is someone's bread-and-butter, they don't care that their backup needs aren't sexy to storage geeks.

I dunno, it made me consider them. Instead of "we spend $2m on storage, and someday hope to make enough money from people like you to not go out of business", it's "look we aren't charging you much, and we are promising you 50 yrs of storage, this is how we can actually do that and stay in business".

I agree. Do you remember a company called Bandwagon? http://ridethebandwagon.com/

They had a big splash a couple of years ago about how they were going to offer unlimited backup for all your mp3s in iTunes, all nicely and automatically synced, for about $10/month or something. They were going to do the storage on S3. They offered anyone who mentioned them on their blog a free year's service.


I had about 400G of mp3s at the time I think. I thought, "OK!"

Their service lasted about 2 days as they underestimated, seemingly by about an order of magnitude, how much data people actually had.


Then they went back to the drawing board and came up with some lame service where you buy the storage space on S3 yourself and then they sell you the ability to access it or something. Lucky they didn't go out of business.

Anyway, knowing the details of the kind of system Backblaze is using, and the price per gig it breaks down to for them, is evidence they're not overselling themselves, and gives me confidence that they'll still actually be there in a few years, unlike Bandwagon. I'm much more likely to use them because of that. Add in all the discussion they've generated, and it's been a very clever marketing event for them. Kudos!

The answer to your question is in the price comparisons that Backblaze provides. If you compare the price of something you imply that it fills the same or similar needs. Since there is no explicit caveat, and they don't explain why their solution is so much cheaper, competitors will come out and provide that explanation for them.

I'm pretty sure that the Backblaze solution is actually cheaper and that they are cutting some of the right corners for their particular application scenario. But it remains an open question how much cheaper it ends up being.

I think their price comparison is a bit misleading since it doesn't include replacement costs for the desktop grade hard disks they use and it doesn't include any of the costs incurred by the software/labor/maintenance required to make this a reliable backup solution.

Interesting papers, but based on what I've skimmed and remember I don't think that they compare desktop grade disks with server grade disks.

Given the length of history, they do, in effect, by comparing disks with server-grade interfaces (e.g. SCSI) with desktop-grade interfaces (e.g. IDE).

I used to work for a large consumer photo site, and we wished we could trade speed for lower cost or lower power consumption. Forget SCSI drives, SATA disks were 10x faster than what we needed for long-tail storage. If a photo became popular, we could cache it forward. The Pareto-ization of the photos was probably 99.9/0.1. I'm actually surprised that drive manufacturers haven't come out with a slower disk that is cheaper, lower power, and/or more reliable. We could find a zillion places to hide latency from our users.

Samsung's new generation of drives is currently only available in "EcoGreen" versions: 5400rpm. Doesn't particularly affect bandwidth when you're at 500GB/platter, though you lose on latency.


Seagate Barracuda LP and WD RE-GP are ~6000 RPM disks that are lower power than normal 7200 RPM SATA. Maybe next year they'll come out with ~5000 RPM disks.

Cool. I wonder what sort of density could be achieved with slow solid state storage. The line between disk and tape continues to blur, I guess.

That's a pretty weird analysis, comparing a system that is special purpose with a general purpose one.

He's concentrating on disk-to-board throughput, but this being network attached storage on a LAN, using the motherboard's single Ethernet port, the amount of data read or written to those drives per second will never exceed 1 Gbit/second anyway.
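
The bottleneck argument above can be written down in a few lines. This is a minimal sketch that ignores protocol overhead; the 120 MB/s per-disk figure is the one quoted from the article elsewhere in this thread, and the choice of five disks is just an example.

```python
def effective_throughput_mb_s(num_disks, disk_mb_s, link_gbit_s):
    """Sustained throughput is capped by the slower of the disks and the network."""
    disk_total = num_disks * disk_mb_s
    link_mb_s = link_gbit_s * 1000 / 8  # 1 Gbit/s is 125 MB/s before overhead
    return min(disk_total, link_mb_s)


# Five disks at 120 MB/s behind a single GigE port.
print(effective_throughput_mb_s(num_disks=5, disk_mb_s=120, link_gbit_s=1))  # 125.0
```

Under these assumptions a single disk (120 MB/s) already nearly fills the link, so extra disk-side bandwidth goes unused until more network ports are added.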

I agree with the reliability analysis though, backblaze seems to be on fairly thin ice there.

Still, it seems to work for backblaze, I'm taking it as read that they've lived through a bunch of drive failures by now and they seem to still be in business.

"That's a pretty weird analysis, comparing a system that is special purpose with a general purpose one."

I think that's why he says right at the beginning:

  This device is that cheap because it cuts
  several corners. That's okay for them. But
  for general purpose this creates problems.
  I want to share my concerns just to show you,
  that you can't compare this to a X4540 device.

And then he goes and does the comparison anyway...

Let's not forget that BackBlaze started this, by making a graph that compared their prices to S3, etc. (and were also criticised for that in the previous HN thread).

They are talking about cloud storage. Having 3 live copies of your data in 3 locations is far more important than having lots of redundancy at a single location. If you want redundancy just add more locations. Large scale companies like Google, Microsoft, and Yahoo understand this but Sun is selling feel good hardware to companies that like to think they have unusual needs.
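
A back-of-the-envelope sketch of why extra locations help so much. It assumes site failures are independent, which real deployments only approximate, and the 5% per-site loss probability is a made-up number for illustration.

```python
def loss_probability(p_site_loss, copies):
    """Probability that every copy is lost, assuming independent site failures."""
    return p_site_loss ** copies


# Hypothetical 5% chance of losing any one site's copy over some period.
for copies in (1, 2, 3):
    print(copies, loss_probability(0.05, copies))
```

Each added location multiplies the loss probability by the per-site figure again, which is hard to match by adding redundancy inside a single box.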

As to speed, nothing you do is going to reduce a disk's latency so talking about bandwidth is mostly a waste of time. Device bandwidth is rarely the limitation once you start to scale, because you can add devices to your network faster than you can build a better network.

No, he contrasts them :)

I think the can of worms is that Backblaze themselves compared their system to more general solutions. While they were doing that for their own purpose, they weren't explicit enough about that point, so many people (such as GigaOM, etc.) took it as a general comparison.

They explicitly said that it was a storage unit/$ comparison, not throughput, not reliability.

It would have helped if they'd published graphs next to it showing the same list of systems sorted by throughput and another sorted by reliability!

For sure. I think one of the big problems with the original article is that the fabricator of the project did the comparison and failed to see the suggestion that was being made. Probably not on purpose, but plenty of folks ran with it thinking they could have all of 'cheap, fast, reliable'.

They just got 'cheap', reliable comes at the expense of using multiple of these units and fast looks like it is going to be impossible without striping across multiple setups like this.

Interesting for an extremely narrow use case (long tail storage, think backup services, video sites, photo sites).

Outside of that maybe useable with some adaptation but not 'as is', besides that it is not as though their machines are up for sale, the figures they quote are mostly after all the r&d is already factored out, if you had to do this for a one-off, just cutting and folding that case would cost a pretty penny.

I'm fairly sure there is a market for a device that is a more robust version of this.

Probably at a 50% or so premium compared to the one they showed you could get a wider spectrum of uses and a bit more robust construction.

I would have bought one three months ago, now I had to settle for 20T or so in one 4U box, for a price a little over half of what they quoted in the article.

Ironic though how Sun spent the last 20 years trying to persuade companies that networks of Unix mini-computers were a cheap reliable alternative to mainframes - yet when it comes to putting disks in boxes you need to have everything stamped with the magic 'enterprise' badge.

If you believed their spin presumably you would buy from a proper enterprise company like IBM or EMC - not some failed workstation vendor.

If you look at SANs in general, Sun undercuts that whole space, especially if you look at what they do with the 7000 series which combine SSDs and HDDs.


The 7000 series is an incredibly cool system, and it's definitely giving NetApp/EMC a run for their money.

That said, it's still really expensive if you're willing to go that extra mile and design your own system for your needs. And it's really not that hard.

Yeah I totally agree it's expensive - though compared to the other enterprise storage solutions it's cheap. That said people should be making cheaper solutions when you don't need that enterprise level system - especially if you can handle the failures in your software stack.

Backblaze is making a backup server. They don't need to handle power supply failures. If one goes down, you'll replace it long before someone needs their backup. They don't need to worry about disks failing while writing (zfs and battery backup). Hell they can restart from scratch any one backup that encounters disk failures. They have all the time they want to do a backup.

> you'll replace it long before someone needs their backup

How do you know when I'll need my backup?

Because you can access it from that other server that bb kept, and maybe even those two other servers if they're halfway smart.

Replication is what it is all about.

Statistically it would be a rare case for a customer to request a backup restore from a failed unit even if you don't replicate. Bb could handle these with polite mails explaining your data will be available in 24 hours and offering you some free service as compensation - it wouldn't hurt them, they do consumer backups.

He did a good job explaining why one should shop around and weigh their options: it works for Backblaze, but that doesn't mean it will work for other applications, especially concerning power failure and vibrations. He did a very good job of objecting to their message professionally regarding its use for others.

I think his 'DC-3' analogy is valid - BB has a decent system if you don't mind the risks, Sun is selling peace of mind.

>Sun is selling peace of mind

Right up to the point where they are sold to a database company that isn't interested in selling disks.

The nice thing about the blaze is that if Seagate decide to double their prices or push you to a new expensive platform - you just buy Toshiba instead.

Ask Joyent about the "peace of mind" they got from their Thumper ...

FUD and RAS being two of the basic tenets of marketing most anything in the enterprise IT space, after all.

There is a lot of good engineering in the Thumper. There is a lot of clever engineering in the Backblaze. The comparison states a lot of valid points on power distribution, drive reliability and RAID throughput.

I see Backblaze will learn a lot from building their storage systems and future versions may be much better, but this one cuts too many corners for me to feel confident it will have the same kind of reliability I would recommend.

What amazes me most is the use of desktop-grade hard drives. Not because they are sloppily built, but because their performance requirements are so different from a server environment.

This makes for some nice reading: http://cacm.acm.org/magazines/2009/6/28493-hard-disk-drives-...

Can anyone estimate the practical benefits (or not) of adding some extra Gbit network cards in the free motherboard slots on the BB device?

Probably not much. The disk IO of this system is probably pretty bad. It might not be able to saturate a single GigE connection. Additionally the data is going mostly to consumer residential internet users. It's doubtful they need more than one GigE connection per box.

You would have to do work (either 'etherchannel' or similar bonding, or load balance requests over 2 IPs) but in theory, if you had a high-bandwidth bus like pci-express, you should be able to get more bandwidth out of a chassis by adding more ethernet cards.

Me, I would add more cache at the same time. SATA tends to be really good for sequential, but really, really bad for random. Multiple sequential streams hitting the same disk start to look an awful lot like random access after a while.
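
A toy model of that last point, counting how often the head has to jump when several sequential readers share one disk. It is a sketch only: it models no drive firmware, queueing or real cache, and the `chunk` parameter (blocks served per stream before switching) merely stands in for what more cache or read-ahead would buy.

```python
def count_seeks(num_streams, blocks_per_stream, chunk=1, stream_spacing=1_000_000):
    """Count head jumps when sequential streams are served round-robin.

    Each stream reads consecutive block addresses starting at its own offset.
    A seek is counted whenever the next block is not adjacent to the previous one.
    """
    positions = [s * stream_spacing for s in range(num_streams)]
    remaining = [blocks_per_stream] * num_streams
    last = None
    seeks = 0
    while any(remaining):
        for s in range(num_streams):
            for _ in range(min(chunk, remaining[s])):
                block = positions[s]
                if last is not None and block != last + 1:
                    seeks += 1
                last = block
                positions[s] += 1
                remaining[s] -= 1
    return seeks


print(count_seeks(1, 100))            # one stream: purely sequential
print(count_seeks(4, 100))            # four streams, one block per turn
print(count_seeks(4, 100, chunk=50))  # four streams, 50 blocks per turn
```

In this model four interleaved streams seek on almost every block, while serving larger chunks per stream brings the seek count back down to a handful.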

They are using a consumer grade motherboard with one ethernet port, 3 PCI slots and one PCI express slot, the slots are all used up by the sata interfaces.


> Given the reliability info he cites, the fact that backblaze is still in business must mean their usage-to-storage ratio is small.

No, it probably means that they get their reliability at a different level, such as simply storing the data on a number of machines. Systems like glusterfs make this completely transparent.

Gluster is broken. Hadoop is pretty proven, but I haven't heard of a single, non-academic Gluster deployment that worked. From personal experience I can honestly say that software cost us several customers, and caused a support nightmare for a good month before we bought a SAN instead.

> Gluster is broken

Agreed, but I do know a few situations where it is being used, the 'stack' has been undressed to the minimum and tested to the hilt before deployment.

Still, I would not risk my company on it. I've been a big fan of their architecture from pretty much day 1, but so far they seem to be feature oriented and not stability oriented.

At least a couple of pharma companies actively use Gluster. Haven't used it myself, but have heard some stories (not positive) about the stability

( ahhh I deleted my post before I saw the response )

From the numbers he cited I got the impression that the drives they are using would fail 3+ times more often than enterprise drives (lower MTTF + specced at low temps). The cost of replacing the drives (even if they are 1/2 the price) would make the system uneconomical.

Also: because of their case design (non-hot-swappable) the cost of human replacement is higher too.

Thus they must have a low usage rate to have a low failure rate for their system to make business sense.

I think their use case is sufficiently atypical to get away with this.

They're primarily write-only, only when a customer retrieves their data does stuff get read.

If they spin down the drives when the volume is not being accessed the mttf for a single drive goes up a lot, possibly beyond the point where it matters.

That made me think of MAID (and COPAN), which is a decent approach.


Floor loading (and rack size) is a problem with their 'very big box of disks'.

That, and the fact that the company seems to be struggling to get sales.

Given proper redundancy and an adequate read/write imbalance, it's safe to power down drives during read-heavy cycles and only power them up when you have some redundant data to write.

But you will have to manage the cycles for redundancy, lifetime and power savings. The software to do it must be really clever.

Or even to power down the entire pod if the data on it is quiescent. If you managed the fill and aging properly, this wouldn't even be hard to do -- as a backup service, they're write-heavy, which gives them a lot of flexibility that other applications don't have.

"It's the same with this storage, this hw needs the parachute in form of the software in front of the device."

Yes, software on (relatively) unreliable hardware is exactly how Google has been able to scale to such an immense scale.

I think everyone agrees the Backblaze (bb) has no performance - and that's fine.

But a lot of his complaints are about reliability, and that's kind of a moot point since Backblaze (and anyone sane) stores every piece of data on at least two (and ideally 3 or more) separate bbs. I would argue if you don't need performance (i.e., no huge database), but you do need a lot of space, you're better off just setting up two 'mirrored' bbs (so that when one goes down, the other takes over) than almost any 'enterprise' solution.

He does have a point about ZFS though - I'm sure at that kind of scale eventually the RAID 5/6 write-hole is going to bite you in the ass.

I read that as "every piece of data on at least two (and ideally 3 or more) separate bbs", meaning BBS as in a bulletin board system. Wow, that caught me off guard.

Fascinating read, and it's nice to see he brought up a lot of the concerns that were mentioned here. A few things that I noticed:

He mentioned that 1 Disk uses 120 MB/s, so 5 of them is 600 MB/s. He then says that converts to 6 GB/s. Now, I'm not all that familiar with speeds, but I know 1000 MB is 1 GB, so shouldn't 600 MB/s be .6 GB/s?

Other than that, he raised a number of valid points. The Backblaze system is NOT general purpose. It's designed to take a bunch of data (backups) and hold on to it for as long as possible. In that situation you don't need high throughput, or even all that reliable hardware. Once the drives are full, their usage will drop, and as long as the RAID can be kept up to keep the data secure, as long as costs stay down they're fine to keep replacing hardware.

One thing of note, and something I never saw in the Backblaze article: Their service promises to hold your data for 50 years. Assuming that one of their pods is filled, what is the estimated cost to keep that data over the course of 50 years? I would assume they calculated that out, and that the data showed to go with the consumer grade parts, but I would really like to see the numbers.

You got confused about the GB/s because he uses the capital "B" at some point where it should have been the smaller 'b' for bits, not bytes. So, 600 MB/s is 600*8 = 4800 Mbps.

I hope that clears it up.

Please note that this is the guy's private blog, he definitely does not speak for Sun in that spot and to label the article "SUN Engineer responds" makes it seem like the response is an official one but it very clearly isn't. The text is full of other mistakes as well.

From the article: "600 MByte/s, a little bit less than 6 GBit/s."

bytes vs bits.
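
For anyone who wants the conversion spelled out, a minimal sketch using decimal (SI) prefixes. The `bits_per_byte` parameter is an addition for illustration: 8 is the plain conversion, while 10 reflects links with 8b/10b line coding (SATA is one), where ten line bits carry each data byte.

```python
def mbyte_s_to_gbit_s(mbyte_per_s, bits_per_byte=8):
    """Convert MByte/s to Gbit/s using decimal (SI) prefixes."""
    return mbyte_per_s * bits_per_byte / 1000


print(mbyte_s_to_gbit_s(600))      # 4.8 Gbit/s of payload
print(mbyte_s_to_gbit_s(600, 10))  # 6.0 Gbit/s on an 8b/10b coded link
```

Either way, 600 MByte/s is 0.6 GByte/s and nowhere near 6 GByte/s.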

He also leaves out that you could buy 10 of these Backblaze things for the price of one Sun X4540. So with a distributed load, that's 10x the performance, 10x the reliability, etc... I'm not an expert on storage, but I'd be curious to see how that stacks up, never mind the comet-proof redundancy by placing your storage in datacenters across the world.

10x the rack space, 10x the cooling, 10x the power ...

People often forget that the rack space, the actual physical space, is very costly. Depending on the datacenter, the power might run you just as much too. Last company I worked at was essentially forced out for using too much power per U.

Would you really buy a hackish low-spec solution like this and then place it in a premium rack?

Given the idea, buy 10 and place them at geographically diverse points, you may as well put them in the back room at branch offices or something. Who cares if one goes offline for a while. Hell, put them in countries with good consumer internet connectivity (Japan, Korea, Sweden etc) and just use that.

If you're going ghetto, go all the way!

Yeah, cooling systems are designed to handle a certain amount of heat per square foot. Most of the he.net locations around here can only handle 1 15a 120v circuit per rack. That's 5 low-power dual-socket servers for a full-sized rack.

When I'm buying data center space I look at cost per watt and ignore the space requirements. I've not yet hit a place that was willing to sell me more power than I could use in a rack. If the space is low/medium density, I fill it with 2 or 3u chassis (I'm currently using http://supermicro.com/products/chassis/3U/833/SC833S2-550.cf... because I got a good deal on a lot of them.) If it looks like they will sell me more power than I can eat with those, I pop in a few of the supermicro 2 in 1u boxes: http://www.supermicro.com/Aplus/system/1U/1021/AS-1021TM-T+....

This is part of why you see so many people leaving airspace between 1u servers. (me, I use the 3u servers for 'air gaps' as they run really cool.)

To give people an idea; our main datacenter gives us 2kW per rack, and we pay extra for 4kW. A rack full of these ~700W storage pods would need about double that again, and isn't going to leave you much wiggle room.
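
The arithmetic behind that estimate, as a small sketch. The ~700W per pod is from the comment above and the 4U height is the figure given elsewhere in this thread; the 42U rack is an assumption (a common full-height size), as is filling the rack with nothing but pods.

```python
def rack_power_kw(rack_units, unit_height_u, watts_per_unit):
    """Return (devices that fit, total draw in kW) for a rack of identical devices."""
    count = rack_units // unit_height_u
    return count, count * watts_per_unit / 1000


pods, kw = rack_power_kw(rack_units=42, unit_height_u=4, watts_per_unit=700)
print(pods, kw)   # 10 7.0
print(700 / 4)    # 175.0 W per U
```

Ten pods at 7 kW is close to double a 4 kW allocation, which matches the "double that again" estimate.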

It's not though. Both devices are 4U rack space servers. This actually sticks 67TB in 4U of rack; the X4540 gets 48TB in 4U.

This is less rack space per TB, maybe more power.

Even if you need two of these for redundancy the cost savings is significant.
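
Putting the density numbers from this comment side by side. A trivial sketch, using only the 67TB and 48TB per 4U figures quoted above; power is left out because the thread gives no wattage for the X4540.

```python
def tb_per_u(capacity_tb, height_u):
    """Storage density in TB per rack unit."""
    return capacity_tb / height_u


pod = tb_per_u(67, 4)
x4540 = tb_per_u(48, 4)
print(pod, x4540)   # 16.75 12.0
print(pod / x4540)  # the pod holds roughly 1.4x as much per U
```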

OK, so it's less rackspace per TB. But it's still more power, and more cooling. And in terms of filling a rack, power and/or cooling become limiting factors long before volume does.

What's that in Watts/U?

10x the capacity though - what's your point?

Those costs are minor compared to the hardware solutions being discussed.

Haha! Unexpectedly needing a new cage at your DC, or an extra 50kW power, or being told you can't put anything more in your half-full racks because you've exceeded floor loading gets expensive real quick.

Upvoted. I would upvote a million times over if I could. So many developers seem to not realize just how expensive physical plant (and the people to administer it) really is -- "hardware is cheap" fails to account for the facility costs involved in acquiring necessary space, power, cooling to run it -- often ahead of time.

Managing the IT team and a growing data center installation for a medium size business was one of the most enlightening experiences I've had as a developer writing software targeted at "internet-scale".

A few years ago I did some consulting for a datacentre provider that you've almost certainly heard of who were giving serious consideration to buying a utility and operating their own power station, to get their power and cooling at cost. They'd already contemplated and rejected moving their entire operation to northern Alaska/Canada/wherever and just paying to plumb in the bandwidth (customers wanted to be able to get at their kit if they needed to).

At svtix, a tier3 data center in downtown San Jose, not at the top end of the price scale, but not at the bottom, I can get 2 20a 120v circuits (4800w, that's 3360 usable.) and a full locking rack for around $1100 a month. I can buy two or three racks full of power. If I wanted to pull a gig connection to another nearby data center, say, because I couldn't get more at svtix, we are talking something like $500-$1000 a month, depending on how badly I get screwed by my poor negotiation skills.

Just sayin, at those prices, it takes a long time to amortize out a six figure appliance.
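
The numbers in that comment work out as follows. A sketch only: the 0.7 usable fraction is simply what the quoted 3360 W out of 4800 W implies, and $100,000 is an assumed stand-in for the low end of "six figure".

```python
def circuit_watts(circuits, amps, volts, usable_fraction=1.0):
    """Nominal (or derated) power available from a set of identical circuits."""
    return circuits * amps * volts * usable_fraction


def months_to_match(appliance_price, monthly_rack_cost):
    """Months of rack rental that cost as much as the appliance."""
    return appliance_price / monthly_rack_cost


print(circuit_watts(2, 20, 120))       # nominal watts for 2 x 20 A x 120 V
print(circuit_watts(2, 20, 120, 0.7))  # usable watts at a 70% derating
print(months_to_match(100_000, 1100))  # months of rack rent per $100k appliance
```

At $1100 a month, a $100,000 appliance costs about as much as seven and a half years of rack, power included.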

At svtix, a tier3 data center in downtown san jose, Not at the top end of the price scale, but not at the bottom, I can get 2 20a 120v circuits and a full locking rack for around $1100 a month. I can buy two or three racks full of power.

As long as that space is available. To guarantee business capacity you'll probably need to buy a large block of rackspace at one time. Of course, capacity availability varies based on the current local business climate, but that's something you have to plan for too.

If I wanted to pull a gig connection to another nearby data center, say, because I couldn't get more at svtix, we are talking something like $500-$1000 a month

Upwards of $1000, but note that GigE is a fairly small link for data center cross-connection and will still usually have a large lead-time. Not only do you need to be able to guarantee sufficient capacity either at svtix or at the other colo, but you need to be prepared to wait to get that capacity allocated for your use.

You have a very good point that you pay much higher prices (or just massively overbuy, which is similar) if you want guaranteed capacity right now, right where you are.

My model is such that it's not that big of a deal to just get an unconnected rack at a new data center and buy transit there. When I was blocked on new power at SVTIX, I slapped a few new servers in at rippleweb.com, which is in Sacramento. If I had a big, expensive SAN I couldn't replicate, well, then I'd have a different model. (rippleweb has awesome prices on high-power 1U server co-location, but it's not as good as getting a full rack elsewhere.)

But yes, the observation that you pay more if you want your stuff provisioned right now is very correct. At prgmr.com, well, I wait for the deals, so I end up paying a lot less than I would otherwise. (but then, sometimes the 'we are out of space, come back later' message is up for longer than I'd like, as well. It's definitely a compromise.)

The use case for a company like Backblaze, though, lets you buy bandwidth very cheaply, because the traffic will be mostly upstream, the direction in which datacenters normally don't see much traffic.

A 48TB Sun X4540 is 50K (list price, sun.com) (2x6core, 32GB RAM, 4xGigE, 4U, 2 year warranty).

The BackBlaze device is $8K for 67TB.

With 1 processor and 4GB RAM.

So if you normalize this a bit, for around 50K you can get 6 BackBlazes (6 processors, 24GB of RAM and 402TB of storage).

Obviously we're still comparing apples and oranges, so taking this too much further would likely be silly. But they're still pretty impressive numbers.
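For anyone who wants to check the arithmetic, a quick sketch using only the prices and capacities quoted above (list price for the Sun box, Backblaze's stated build cost for the pod). The function name is made up for illustration, and this ignores software, labour, and redundancy overhead.

```python
def cost_per_tb(price_usd, capacity_tb):
    """Raw hardware cost per terabyte, ignoring everything but purchase price."""
    return price_usd / capacity_tb


sun_price, sun_tb = 50_000, 48   # Sun X4540, list price
pod_price, pod_tb = 8_000, 67    # Backblaze pod

# How many whole pods fit in the price of one X4540?
pods = sun_price // pod_price
total_pod_tb = pods * pod_tb

print(f"X4540: ${cost_per_tb(sun_price, sun_tb):,.0f}/TB")
print(f"Pod:   ${cost_per_tb(pod_price, pod_tb):,.0f}/TB")
print(f"{pods} pods for ${pods * pod_price:,} give {total_pod_tb} TB")
```

That works out to roughly $1,042/TB against roughly $119/TB at the quoted prices.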

At first don't compare the mentioned list prices with the street prices for components.

This is why small companies, startups, and people new to the storage world hate working with storage vendors. I understand the economics, I get why they sell storage the way they do, and I like my 70% discount off list price, but it makes for a horrible user experience.

If I were a cleverer person, I'd apply an every-day low price model to "enterprise" storage.

I went with the reseller I did at svtix, primarily because they were upfront enough to list prices on their website. http://egihosting.com - I don't want to spend time fucking around over price, and I'm sure you don't want me to waste your time. Post the price, and I'll buy it or I won't.

Seriously, If you say you want six figures for your product, I'm walking. I don't have that kind of scratch, and I'm not going to waste your time or my time lowballing you. If you really mean $30K, well, I'd have to scrimp and save for a few months, but it's doable if the product really does solve more than $30K worth of problems for me.

Even though Sun are at the cheaper end of the SAN market - I struggle to see why adding a network interface to a hard drive still ends up as 10x the cost of the drive.

I agree that what the article misses is that Backblaze is spending the money on software rather than hardware, since the price difference is so huge (and the software solution is a fixed cost).

What I'm more interested in is whether the lower MTBF of the cheap drives and home-brewed chassis ends up with a higher cost per year due to higher failure rates. If a desktop drive costs $100 and fails three times a year, but a server drive costs $200 and fails once every 2 years, the initial cost savings is moot.
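A minimal sketch of that comparison, using the hypothetical prices and failure rates from the example above (they are illustrative, not measured). It assumes every failure means buying a replacement at full price, so it ignores warranty returns and labour.

```python
def annual_replacement_cost(drive_price, failures_per_year):
    """Yearly spend on replacements for one drive slot, at full price per failure."""
    return drive_price * failures_per_year


# Hypothetical figures from the example: not real failure data.
desktop = annual_replacement_cost(drive_price=100, failures_per_year=3)
server = annual_replacement_cost(drive_price=200, failures_per_year=1 / 2)

print(f"desktop drive slot: ${desktop:.0f}/year")
print(f"server drive slot:  ${server:.0f}/year")
```

With those made-up rates the cheap drive costs three times as much per slot per year, so the whole question hinges on what the real failure rates are.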

Well, remember that if you were seriously on a budget you'd just return all the failed drives for warranty replacement (Seagate is 5yrs). The hardware costs would never add up to more than the enterprise option; it's the labour costs of dealing with it that would be higher, and worse, you'd be more likely to lose data. Enough more likely to justify the additional cost? Hm, I doubt it, not for this use case.

How is MTBF derived? He claims "1.2 million hours normalized on 40 degrees", that's almost 140 years. It seems to me that it's just some extrapolated test statistic being bandied about with no relevance to real world conditions.
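For what it's worth, an MTBF figure is usually read as a population statistic rather than the lifetime of a single drive. A rough sketch of the conversion, assuming a constant failure rate (exponential model), which is a simplification, and assuming 45 drives per chassis, a number not stated in this thread:

```python
import math

HOURS_PER_YEAR = 8760


def mtbf_in_years(mtbf_hours):
    """The naive reading: MTBF expressed as years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mtbf_hours):
    """Chance a single drive fails within a year, assuming a constant failure rate."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


mtbf_hours = 1_200_000           # figure quoted from the article
drives_per_chassis = 45          # assumption, not from this thread

years = mtbf_in_years(mtbf_hours)
afr = annualized_failure_rate(mtbf_hours)
expected_failures = drives_per_chassis * HOURS_PER_YEAR / mtbf_hours

print(f"MTBF as years: {years:.1f}")
print(f"annualized failure rate per drive: {afr:.2%}")
print(f"expected failures per chassis per year: {expected_failures:.2f}")
```

So the spec would translate to well under 1% of drives failing per year, if the rated conditions held; whether they do in the field is a separate question.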

Seems like he's just making it easier for Backblaze to match feature to feature by listing out the differences O_o.

I'm pretty sure Sun isn't worried about Backblaze as competition in the enterprise storage market. Building cheap storage boxes is much different than building what Sun provides.

That's usually how disruptive technologies start out though; the fact that Sun doesn't care makes Backblaze that much more dangerous.

ZFS with OpenSolaris is a good tip. 2009.06 is much less of a mess (leaving aside that we-will-rewrite-everything-in-java svc:/* horror).

zfs performance in 2009.06 has some very serious problems, particularly with zvols.

Do you have references and pointers? Is this going to hurt me when I put two non-storage servers live running 2009.06?

You can find a number of them on the opensolaris forums if you search on terms like 0906 iscsi, comstar 0906, etc... here's an example: http://opensolaris.org/jive/thread.jspa?threadID=104593&...

Our experience was that 0906 was utterly unusable if you're using comstar, with performance being mildly degraded for straight zfs usage.
