Hacker News
jbooth 1264 days ago | link | parent

Ok, you're up to 3k random reads per second at the absolute peak. Realistically more like 2k for 6x10k disks at RAID 0, and it's awfully hard to even fit 10TB on a RAID1+0 setup. What's that, 10x2TB? Good luck with the latencies on those 2TB disks.

Still doesn't add up to 10k select statements per second over a 10TB dataset on a single node. Even without writes, that's not happening. I call BS on the grandparent post.
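
As a rough sanity check on the per-spindle math (the seek and rotational-latency figures below are typical datasheet values, not numbers from this thread, and command queuing at depth can push real drives somewhat higher):

    # Back-of-envelope random-read IOPS for spinning disks.
    # Seek/latency figures are assumed typical datasheet values.
    def disk_iops(avg_seek_ms, rpm):
        rotational_latency_ms = (60_000 / rpm) / 2  # average half a rotation
        return 1000 / (avg_seek_ms + rotational_latency_ms)

    per_10k = disk_iops(avg_seek_ms=4.5, rpm=10_000)   # ~130 IOPS
    per_15k = disk_iops(avg_seek_ms=3.5, rpm=15_000)   # ~180 IOPS
    print(f"6 x 10k rpm striped:  ~{6 * per_10k:.0f} random reads/sec")
    print(f"25 x 15k rpm striped: ~{25 * per_15k:.0f} random reads/sec")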



CrLf 1264 days ago | link

That shows a lack of understanding of what kind of hardware is being used in the real world to handle this.

With a half-decent SAN with 15k drives and 4Gbit Fibre Channel connections, you can get 1000+ IOPS without the storage system even breaking a sweat. Under load it can easily give 10 times that.

This is something that's everywhere in the business world.

Pair this with a bunch of cores and a few GB of memory, and you can have an RDBMS that chews through impressive amounts of data. Unless, of course, you optimize nothing and swamp it with lame queries that do nothing but table scans. Funny enough, the same people who are fine with doing everything in code are the ones who can't be bothered to think for more than a second about what kind of queries they are throwing at the database.

-----

jbooth 1264 days ago | link

Well, yeah, you can get some decent IOPS for 100k, but 20 servers with 4 spindles apiece still add up to more.

-----

CrLf 1264 days ago | link

You will also consume more power, have 20 servers to manage instead of one, and have to partition your data across those 20 servers, which hampers your ability to query that data for the patterns that let you optimize your business.

-----

fleitz 1264 days ago | link

No kidding, it's like a battery-backed write cache doesn't even exist in the NoSQL world. I was able to easily drive 200MB/sec of random IO on 25 15K drives.
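
For what it's worth, a quick back-of-envelope on what that figure implies per drive (the 8 KB IO size is an assumption, roughly a typical RDBMS block size, not something stated here):

    # What 200 MB/s of random IO spread over 25 spindles implies per drive.
    # The 8 KB IO size is an assumption (typical RDBMS block size).
    total_mb_per_sec = 200
    drives = 25
    io_size_kb = 8

    total_iops = total_mb_per_sec * 1024 / io_size_kb   # 25,600 IOPS
    per_drive = total_iops / drives                      # ~1,024 IOPS per drive
    print(f"~{total_iops:.0f} IOPS total, ~{per_drive:.0f} per drive")
    # Well beyond what a 15k spindle can serve for truly random IO,
    # which is the point: the battery-backed cache acknowledges the writes
    # and destages them to disk in a far more orderly pattern.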

-----

fleitz 1264 days ago | link

Btw, this was 200MB/sec of random writes. I didn't even bother trying to make them sequential: I could have gotten the writes to be basically sequential if I had bothered to write a COMB-style UUID generator. I happen to be a fan of UUIDs for surrogate keys, as it makes database merging so much easier.
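
For anyone unfamiliar with the term, here is a minimal sketch of the COMB idea: a UUID whose leading bytes carry a timestamp, so keys generated around the same time sort near each other and index inserts stay roughly sequential. The byte layout below is one possible choice, not a standard, and it skips the RFC 4122 version/variant bits for brevity:

    # Minimal COMB-style UUID sketch: a 48-bit millisecond timestamp in the
    # high bytes followed by 80 random bits, so values sort roughly by
    # generation time. Layout is illustrative; adapt it to how your database
    # orders UUID/GUID columns.
    import os
    import time
    import uuid

    def comb_uuid():
        millis = int(time.time() * 1000) & 0xFFFFFFFFFFFF
        return uuid.UUID(bytes=millis.to_bytes(6, "big") + os.urandom(10))

    for _ in range(3):
        print(comb_uuid())  # later calls compare higher (same-millisecond ties aside)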

-----

gaius 1264 days ago | link

6 disks? An EMC array can throw 20x as many physical disks at this sort of problem. An Exadata can compile SQL down to microcode and execute it on the storage, like a graphics card doing matrix operations on dedicated hardware.

Again, as I say, the NoSQL crowd have no idea about what the state of the art is in the RDBMS world.

it's awfully hard to even fit 10TB on a RAID1+0 setup

It would actually be hard for me to buy an array that small...

-----

jbooth 1264 days ago | link

What's the cost on that EMC array?

How many commodity servers could I buy for that?

I have a pretty solid idea what state of the art is in the RDBMS world - it's diminishing returns as a machine that's twice as powerful costs 10X as much, all the way up the enterprise ladder. It's spending 100k on your software licenses, 100k on your storage and 500 bucks on a CPU.

Not that there's anything wrong with that. It's ok. If your domain is highly transactional, it's probably a better move than implementing your own transactions over something else. Just don't pretend that your limitations are actually strengths -- you have your own strengths.

-----

gaius 1264 days ago | link

It doesn't matter. You see, in business, there is no "cheap" or "expensive". There's worth the money, or not. It doesn't matter how many commodity servers I could buy for the cost; no matter how cheap they are, the money would be wasted if that's simply the wrong technical approach.

Because you can't compete at this level by chucking increasing amounts of anything at the problem - people, dollars, spindles, nodes, you name it.

-----

jbooth 1264 days ago | link

You see, in business, everything is about cheap or expensive. It's just a broader definition, one that includes developer time and ROI.

If your problem is extremely transactional and legitimately unshardable, feel free to drop 6 mil on Exadata, or half a mil on a database server and backup. But frankly, your objections are starting to have a religious feel to them. All I was saying is that PL/SQL is a pile of crap to code in and fundamentally unscalable without spending a boatload of money. A slightly better design can get you the same thing with a lot less cash.

EDIT: No, those are facts, PL/SQL looks like it was designed in 1965 and, yes, putting all of your CPU processing into a single node is fundamentally unscalable. I've seen it. It was fundamentally unscalable.

I'm not making a religious point about RDBMS - it can be the best model in many situations. I'm making a point about single bottlenecks for your architecture.

-----

ora600 1264 days ago | link

BTW, all architectures have a single bottleneck. That's pretty much by definition.

Oracle tried to market their Exalogic as having "no bottlenecks", which is nearly as funny as "Unbreakable Linux" and "zero latency".

-----

gaius 1264 days ago | link

"pile of crap" and "fundamentally unscalable" and I'm the religious one o_0

-----

cosmicray 1264 days ago | link

You buy the EMC unit, because you want the EMC tech to call you and say "we see you have a failing drive, I'm en-route, and I'll be there in 15 minutes with a replacement". Even if you are not paying attention, the unit called in and told the control center it needed attention.

-----



