AWS X1 instances – 1.9 TB of memory (amazon.com)
340 points by spullara on May 18, 2016 | 180 comments

Does anyone have numbers on memory bandwidth and latency?

The x1 cost per GB is about 2/3 that of r3 instances, but you get 4x as many memory channels if you spec the same amount of memory via r3 instances, so the cost per memory channel is more than twice as high for x1 as r3. DRAM is valuable precisely because of its speed, but the speed itself is not cost-effective with the x1. As such, the x1 is really for the applications that can't scale with distributed memory. (Nothing new here, but this point is often overlooked.)
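A quick back-of-the-envelope check of that ratio, using only the two rough figures quoted above (x1 at about 2/3 the cost per GB, r3 with about 4x the memory channels for the same amount of memory). The variable names are only for illustration, and the inputs are estimates rather than AWS numbers:

```python
# Rough ratios taken from the comment above; estimates, not AWS figures.
x1_cost_per_gb = 2 / 3       # relative to r3 = 1.0
r3_cost_per_gb = 1.0
r3_channel_multiple = 4      # r3 fleet has ~4x the memory channels for equal GB

# For the same total memory, total cost scales with cost per GB.
x1_total_cost = x1_cost_per_gb
r3_total_cost = r3_cost_per_gb

# Normalize x1 to 1 unit of memory channels; the r3 fleet then has 4 units.
x1_cost_per_channel = x1_total_cost / 1
r3_cost_per_channel = r3_total_cost / r3_channel_multiple

ratio = x1_cost_per_channel / r3_cost_per_channel
print(f"x1 costs about {ratio:.2f}x as much per memory channel as r3")
```

With these inputs the ratio comes out a little above 2.5, which matches the "more than twice as high" claim.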

Similarly, you get a lot more SSDs with several r3 instances, so the aggregate disk bandwidth is also more cost-effective with r3.

Not sure I quite understand your math here. The largest R3 instance is the r3.8xlarge with 244 GB of memory. 4 times of that would only get you to 1 TB. Also, this: "DRAM is valuable precisely because of its speed", is wrong (https://en.wikipedia.org/wiki/Dynamic_random-access_memory).

1. 4 of those R3 instances cost less than the X1 but offer nearly double the bandwidth. The X1 is cheaper per GB, but much more expensive per GB/s.

2. If DRAM was not faster than NVRAM/SSD, nobody would use it. "Speed" involves both bandwidth and latency. Latency is probably similar or higher for the X1 instances, but I haven't seen numbers. We can make better estimates about realizable bandwidth based on the system stats.

This is probably a dumb question, but what does the hardware of such a massive machine look like? Is it just a single server box with a single motherboard? Are there server motherboards out there that support 2 TB of RAM, or is this some kind of distributed RAM?

For example Dell sells 4U servers straight out of their webshop which max out at 96x32GB (that's 3TB) of RAM with 4 CPUs (max 18 cores/CPU => 72 cores total). They seem to have some (training?) videos on youtube that show the internals if you are curious:

https://www.youtube.com/watch?v=vS47RVrfBvE main system board

https://www.youtube.com/watch?v=_poMPOUGRa0 memory risers

Don't know what hardware AWS is using, but Ark has server boards supporting 1.5TB, which is close enough to make 2TB believable: http://ark.intel.com/products/94187/Intel-Server-Board-S2600...

Edit: Supermicro has several 2TB boards, and even some 3TB ones: http://www.supermicro.com/products/motherboard/Xeon1333/#201...

(Disclaimer: AWS employee, no relation to EC2)

This would require expensive 64GB DDR4 LR-DIMMs though.

We have some supermicros that have about 12TB RAM, but the built in fans sound like a jumbo jet taking off so consider the noise pollution for a second there.

Er, are you summing a TwinBlade chassis? You have to be.

6TB is about where single machines currently top out due to the hardware constraints of multiple vendors and architecture, and memory bandwidth starts being an issue. You have to throw 96x64GB at the ones that exist so wave buh bye to a cool half a million USD or so. If you're sitting on a 12TB box I want a SKU (I want one!).

I don't actually think Supermicro makes a 6TB SKU, even. That's Dell and HP land.

We do have a twinblade chassis, but I'm pretty sure they are a 6TB SKU. To be honest, I'm not the one who procured them so I can ask if you are interested in a SKU.

> Are there server motherboards out there that support 2 TB of RAM

Sure, http://www.supermicro.com/products/motherboard/Xeon/C600/X10... supports 3TB in a 48 x 64GB DIMM configuration.

Once upon a time I hacked on the AIX kernel which ran on POWER hardware (I think they're up to POWER8 or higher now). In my time there the latest hardware was POWER7-based. It maxed out at 48 cores (with 4-way hyperthreading giving you 192 logical cores) and a max of I think 32TB RAM. Not the same hardware as mentioned in the OP, but pretty big scale nonetheless.

This shows a logical diagram of how they cobble all these cores together: http://www.redbooks.ibm.com/abstracts/tips0972.html?Open

I've seen these both opened up and racked up. They are basically split into max 4 rackmount systems, each I think was 2U IIRC. The 4 systems (max configuration) are connected together by a big fat cable, which is the interconnect between nodes in the Redbook I've linked above. The RAM was split 4 ways among the nodes, and NUMA really matters in these systems, since memory local to your nodes is much faster to access than memory across the interconnect.

This is what I observed about 5-6 years ago. I'm sure things have miniaturized further since then...

yeah, sure, you can get a quad xeon 2U server with 2TB of RAM for around $40K. Here's a sample configurator: https://www.swt.com/rq2u.php change the RAM and CPUs to your preference and add some flash.

No insight into what Amazon uses, but we've got HP DL980s (g7s, so they're OLD) with 4TB of RAM and just started using Oracle X5-8 x86 boxes with 6TB of RAM and 8 sockets. I believe 144 cores/288 threads.


4 CPU, 60 cores, 120 threads (cloud cores), 3TB RAM, 90TB SSD, 4 x 40GB Ethernet, 4 RU. $120K.

Same price as the AWS instance for one year of on demand.

I can stick 1.5 TB and two sockets in blades right now. Blades. Servers can carry a lot more, and it's not even especially expensive.

Yeah, just realized my knowledge of server hardware is hopelessly outdated. They seem to be a couple of orders of magnitude more powerful than what I assumed was available.

4 physical CPUs and 1.9TB of RAM is doable in a 4U server for sure, and possibly in a 2U. So, it just looks like a big server.

Intel processors support up to 1536 GB of RAM, so basically 1.5 TB per processor.

How flipping awesome is it that some very large portion (90% or so?) could probably all be one nice contiguous block of mine from x86_64 userspace with a quick mmap() and mlockall().

I think I have picked this up from an earlier thread discussing huge servers: http://yourdatafitsinram.com/

One of the links on the top points to a server with 96 DIMM slots, supporting up to 6 TB of memory in total.

IDK about AWS, but for SAP HANA, this is done via blades. I've seen 10 TB+.

My guess is that it is not really DRAM but flash memory on a DIMM like this product from Diablo Technology:


Your guess is wrong. It's DRAM plain and simple.

You're right but the next few years will likely bring cloud offerings with non-volatile-memory-on-DIMM like 3Dxpoint.

As a reference the archive of all Reddit comments from October 2007 to May 2015 is around 1 terabyte uncompressed.

You could do exhaustive analysis on that dataset fully in memory.

Your point is accurate, but I'd like to point out that the dataset isn't actually all the comments on Reddit -- it's only what they could scrape, which is limited to 1000 comments per account. So basically it's missing a lot of the historical comments of the oldest accounts.

I only point this out to try and correct a common error I see. You're absolutely right that it is awesome that the entire data set can be analyzed in RAM!

Are you sure? The dataset is from here: https://archive.org/details/2015_reddit_comments_corpus

Looking at the thread from the release, I see no explanation of how he got the data, but I see several people commenting that they finally have a way to get comments beyond the 1000 per account: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_eve...

It looks like they got clever, so what they are doing is getting 1000 links from each subreddit, indexing the comments and linking it back to the user. So yeah, they did figure out how to get back more than 1000 comments for some users, but they'll still be limited by 1000 links per reddit per sort. So for a very active reddit, you can't get really old comments.

IIRC you can use the search API to enumerate 1000 at a time, using a "created_at>=..." restriction to move the cursor starting position. I had to poke in the code to find out how to talk Amazon's search syntax directly.
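A minimal sketch of that cursor-walking idea, assuming a capped search call that returns results oldest first. `search`, `fake_search`, and the record shape are hypothetical stand-ins, not the real reddit or Amazon search API:

```python
from typing import Callable, Dict, Iterator, List

PAGE_LIMIT = 1000  # the per-query cap mentioned in the thread

Comment = Dict[str, int]  # {"id": ..., "created_at": ...}; illustrative shape


def enumerate_all(search: Callable[[int, int], List[Comment]],
                  start: int = 0,
                  limit: int = PAGE_LIMIT) -> Iterator[Comment]:
    """Walk a capped search endpoint by moving a created_at cursor forward.

    `search(since, limit)` is a hypothetical callable returning up to
    `limit` comments with created_at >= since, oldest first.
    """
    cursor = start
    seen = set()  # ids already yielded, because >= re-fetches the boundary
    while True:
        page = search(cursor, limit)
        fresh = [c for c in page if c["id"] not in seen]
        if not fresh:
            return  # nothing new: done (or > limit items share one timestamp)
        for c in fresh:
            seen.add(c["id"])
            yield c
        cursor = fresh[-1]["created_at"]


# Stand-in for the real API: 2,500 comments, two per timestamp.
_DATA = [{"id": i, "created_at": i // 2} for i in range(2500)]


def fake_search(since: int, limit: int) -> List[Comment]:
    return [c for c in _DATA if c["created_at"] >= since][:limit]


results = list(enumerate_all(fake_search))
print(len(results))
```

The `seen` set is there because a ">=" cursor returns the boundary items again on the next page. If more items than the page cap share a single timestamp, this simple loop stops early, so a real crawler would need a finer-grained cursor.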

Yeah, that might work too. That would be pretty clever.

What about an official data dump, like StackExchange does? Wouldn't that be beneficial to all parties? Are there legal restrictions?

I haven't pored over the code, but reddit is most definitely more than one server so there's gotta be a sync protocol in there somewhere. So, technically speaking, no restrictions there.

Practically speaking, on the other hand, reddit likely never took user content licensing into account to begin with - they never knew they'd get all of digg a couple years in, and 2006 was a different time besides.

The only way reddit would be able to release the data now would either be a specific clause (added from the start) allowing such in the TOS, or a general-purpose "we can do w/e we want with your data"; nobody thought to add the former (as noted), and of course the latter would have been roasted alive the moment it was noticed.

So, me thinks licensing issues.

Modmailing /r/reddit.com is the canonical way to reach the admins (for any reason), if you'd like more info.

It would be interesting to see the distribution of 1000 comments from each account over the period of 12 months. Some people go dormant - like vacation, or depression, or lack of interest in topics - then cluster a bunch of comments when, say, they are on a drunken rage binge.

What time of day the accounts most frequently comment. (I'd bet there is an interesting grouping of those that post while at work during the day, and those who post from home at night.)
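As a rough illustration of the time-of-day question, here is a small sketch that buckets comment timestamps by UTC hour. The timestamps are made up, and the assumption that each comment carries a Unix timestamp is mine, not something stated in the thread:

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up Unix timestamps standing in for one account's comments.
sample_timestamps = [1431000000, 1431003600, 1431086400, 1431090000, 1431172800]


def comments_by_hour(timestamps):
    """Count comments per UTC hour of day (0-23)."""
    return Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )


hist = comments_by_hour(sample_timestamps)
for hour in sorted(hist):
    print(f"{hour:02d}:00  {'#' * hist[hour]}")
```

Splitting "at work" from "at home" would also need the commenter's time zone, which a UTC timestamp alone does not give you.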

or what subreddits people comment in most during the day vs which /r/ they post to at night ;)

Some of those have already been done.

Sounds like you'd enjoy reading this guy's stuff: https://www.reddit.com/user/minimaxir/submitted/?sort=top

Love it thanks!

You may be interested in an SQLite version of the dataset that is 553 GB vs. the 908 GB JSON: https://archive.org/details/2015_reddit_comments_corpus_sqli...

The storage format of a dataset can make a big difference in memory usage.

I would like to know how much of that is memes and shitposts

That is pretty remarkable. One of the limitations of doing one's own version of mass analytics is the cost of acquiring, installing, configuring, and then maintaining the hardware. Generally I've found AWS to be more expensive but you get to "turn it on, turn it off" which is not something you can do when you have to pay monthly for data center space.

It makes for an interesting exercise to load in your data, do your analytics, and then store out the meta data. I wonder if the oil and gas people are looking at this for pre-processing their seismic data dumps.

Why does everyone (really!) compare aws to colocation? I've never heard an aws-believer ever mention dedicated servers.

Why don't you compare aws to building your own cpu?

I suspect it is because "everyone" (which is to abscond with your definition) believes that colocation is an alternative to AWS (well the EC2 part anyway). I would be interested to hear how you see them as not being comparable.

On your definition of "aws-believer" is that someone who feels that AWS is a superior solution to deploying a web facing application in all cases? Does your definition include economics? (like $/month vs request/month vs latency?)

Can I assume that you consider comparing AWS to building your own CPU as an apples to oranges comparison? I certainly do, because I define a CPU to be a small component part of a distributed system hosting a web facing application.

by "everyone" I mean >90% of comments on this forum comparing aws to an alternative (90% of cases it's straight to colocation)

Building your own datacenters is also an alternative to aws-ec2, but you go to that step by step (I think?) (dedicated > colocation > datacenter). In some cases when you have crazy growth you can skip a/some steps (ex: dropbox going from aws to their own datacenter)

They don't even compare them in whatever $/request/latency/$metric since they don't even mention them. And dedicated has also many options from low-price+no-support-shitty-network to high price/good-network/support etc.

Just curious - but wouldn't the GPU based instances be more efficient for oil and gas people?

Or load a data set in this monster and then use GPU workers to hit it?

GPUs work when the data is small and the calculation can be parallelized. Random access to memory from a GPU would be slow. It's more like a separate computer (or lots of separate computers) that you can send a small program to execute and get the result.

Ah thanks - I've never worked with GPUs

It's probably why they have their own memory.

More food for thought: how many neurons + synapses can one model with that amount of RAM?

Are you using seismic to describe what the data is, how big the data is, or both?

Here is a good link: http://www.seismicsurvey.com.au/

Basically you take waves that are transiting the area of interest and do transforms on them to ascertain the structure underground. Dave Hitz of NetApp used to joke these guys have great compression algorithm, they can convert a terabyte of data into 1 bit (oil/no-oil).

One of the challenges is that the algorithms are running in a volume of space, so 'nearest neighbor' in terms of samples has more than 8 vectors.
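To make the "more than 8" point concrete, here is a small sketch that counts adjacent cells, assuming samples sit on a regular grid (a simplification; real survey geometry is messier):

```python
from itertools import product


def neighbor_offsets(dims):
    """All offsets to adjacent cells in a regular grid, excluding the cell itself."""
    return [d for d in product((-1, 0, 1), repeat=dims) if any(d)]


print(len(neighbor_offsets(2)))  # 8 neighbors in a 2D grid
print(len(neighbor_offsets(3)))  # 26 neighbors in a 3D volume
```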

In the early 2000's they would stream their raw data off tape cartridges into a beowulf type cluster, process it, and then store the post processed (and smaller) data to storage arrays. Then that post processed data would go through its own round of processing. One of their challenges was that they ended up duplicating the data on multiple nodes because they needed it for their algorithm and it was too slow to fetch it across the network.

A single system image with a TB of memory would let them go back to some of their old mainframe algorithms which, I'm told, were much easier to maintain.

Spot instances are about $13 - $19/hr, depending on zone. Not available in NorCal, Seoul, Sydney and a couple of other places.

Do you mean on-demand instances? The announcement says "Spot bidding is on the near-term roadmap." And $13 / hour is the on-demand price in US East.

Indeed, doesn't look like it's there yet. Based on that I guess the spot prices will be around $1-3/h - not bad, if you have a workload that can be interrupted.

Going to comment out the deallocation bits in all my code now.

You jest, but sometimes that's exactly what you need for short-lived programs¹. Bump alloc and free on exit is super fast if your space complexity is bounded.
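For anyone who hasn't seen one, here is a toy sketch of a bump allocator. It is written in Python purely to show the shape of the idea (a real one would live in C or C++), and `BumpArena` is an illustrative name, not taken from the linked article:

```python
class BumpArena:
    """Toy bump allocator: hand out slices of one big buffer, never free."""

    def __init__(self, capacity: int) -> None:
        self._buf = bytearray(capacity)
        self._view = memoryview(self._buf)
        self._top = 0  # offset of the next free byte

    def alloc(self, size: int, align: int = 8) -> memoryview:
        # Round the cursor up to the requested alignment (must be a power of two).
        start = (self._top + align - 1) & ~(align - 1)
        end = start + size
        if end > len(self._buf):
            raise MemoryError("arena exhausted")
        self._top = end  # the whole "allocation" is this one bump
        return self._view[start:end]

    @property
    def used(self) -> int:
        return self._top


arena = BumpArena(1024)
a = arena.alloc(10)
b = arena.alloc(4)
a[:5] = b"hello"
print(arena.used)
```

There is no `free`: everything is released at once when the arena (or the process) goes away, which is why it only works when total memory use is bounded.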

¹ http://www.drdobbs.com/cpp/increasing-compiler-speed-by-over...

JonL White actually wrote a serious paper about just this idea in 1980: http://dl.acm.org/citation.cfm?id=802797

Memory leaks be damned... Seriously, that is just huge.

Add some bitcoin mining with the power you still have afterwards

Question for those who have used monster servers before:

Can PostgreSQL/MySQL use such type of hardware efficiently and scale up vertically? Also can MemCached/Redis use all this RAM effectively?

I am genuinely interested in knowing this. Most of the times I work on small apps and don't have access to anything more than 16GB RAM on regular basis.

Postgres scales great up to 256gb, at least with 9.4. After that it'll use it, but there's no real benefit. I don't know about MySQL. SQL Server scales linearly with memory even up to and past the 1TB point. I did encounter some NUMA node spanning speed issues, but numactl tuning fixed that.

I setup a handful of pgsql and Windows servers around this size. SQL Server at the time scaled better with memory. Pgsql never really got faster after a certain point, but with a lot of cores it handled tons of connections gracefully.

I've very successfully used shared buffers of 2TB, without a lot of problems. You better enable huge pages, but that's a common optimization.

I don't work on 2TB+ memory servers, but one of my servers is close to 1TB of RAM.

PostgreSQL scales nicely here. Main thing you're getting is a huge disk cache. Makes repeated queries nice and fast. Still I/O bound to some extent though.

Redis will scale nicely as well. But it won't be I/O bound.

Honestly, if you really need 1TB+ it's usually going to be for numerically intensive code. This kind of code is generally written to be highly vectorizable so the hardware prefetcher will usually mask memory access latency and you get massive speedups by having your entire dataset in memory. Algorithms that can memoize heavily also benefit greatly.

I've used Postgres out to the terabyte+ range with no probs, so it all works fine. Of course, whenever you approach huge data sizes like this, it tends to change how you access the data a little. eg. Do more threads equal more user connections, or more parallel computation? Generally though, databases aren't really hindered by CPU, instead by the amount of memory in the machine and this new instance is huge.

No idea about MySQL, people tend to scale that out rather than up.

For MySQL, it depends a bit what you're hoping to get out of scaling.

Scaling for performance reasons: Past a certain point, many workloads become difficult to scale due to limitations in the database process scheduler and various internals such as auto increment implementation and locking strategy. As you scale up, it's common to spend increasing percentages of your time sitting on a spinlock, with the result that diminishing returns start to kick in pretty hard.

Scaling for dataset size reasons: Still a bit complex, but generally more successful. For example, to avoid various nasty effects from having to handle IO operations on very large files, you need to start splitting your tables out into multiple files, and the sharding key for that can be hard to get right. But MySQL

In short, it's not impossible, but you need to be very careful with your schema and query design. In practice, this rarely happens because it's usually cheaper (in terms of engineering effort) to scale out rather than up.

Finally, an instance made for Java!

I dislike developing in Java. I am not a fanboy by any stretch of the imagination. That being said, someone who takes the time to understand how the JVM works and how to configure their processes with a proper operator's mindset can do amazing things in terms of resource usage.

It's easy to poke at Java for being a hog when in reality it's just poor coding and operating practices that lead to bloated runtime behavior.

For a long time I wondered if it was a failing of the language or the culture.

After spending 4 days trying to diagnose a problem with hbase given the two errors "No region found" and "No table provided" and finally figuring out it was due to a version mismatch I now believe it is the culture.

At the very least you should be printing a WARN when you connect to an incompatible version.

Definitely the culture. It seems to be a Java axiom that you should never use one object when eight will do.

I've never seen another community that would actually have a builder for the configuration for the factory for the settings for a class.

This is over a decade old now, but is still very appropriate http://discuss.joelonsoftware.com/?joel.3.219431.12

I assume you've never had the pleasure of configuring a Sendmail installation?

Actually configuring sendmail is a piece of cake these days thanks to Google and countless engineers documenting in excruciating detail the problems they faced.

Bad logging is not endemic to the Java development mindset... in fact I'd give Java the nod for the _option_ to generate tremendously large quantities of log data when requested, above almost any other platform, excepting maybe Windows if you figure out how to turn on ETL.

If you want bad logging, look at most PHP projects...

For really thoughtful logging, the Apache HTTP client and HikariCP are good Java examples.

Bad code is bad code, no matter what language it is written in.

So much this. Back in 2001 I used IntelliJ IDEA on a PC with 128MB of RAM. It worked perfectly, and it was the first IDE I used that checked my code while I was writing it. The much less evolved JBuilder on the other hand stopped every couple seconds for garbage collection.

Both were written in Java.

And don't get me started on Forte (developed by Sun itself, no less). It was even slower and more memory-hungry than JBuilder.

I love Java. We shifted from c++ a year after it arrived on the scene. Since then, I've never needed to learn a new language in any depth. To me, that's a good thing and shows the longevity of the language.

> ...can do amazing things in terms of resource usage.

Sorry, but you just made my day. :P

You jest, but think about how unbelievably painful it'd be to write a program that uses >1TB of RAM in C++ .... any bug that causes a segfault, div by zero, or really any kind of crash at all would mean you'd have to reload the entire dataset into RAM from scratch. That's gonna take a while no matter what.

You could work around it by using shared memory regions and the like but then you're doing a lot of extra work.

With a managed language and a bit of care around exception handling, you can write code that's pretty much invincible without much effort because you can't corrupt things arbitrarily.

Also, depending on the dataset in question you might find that things shrink. The latest HotSpots can deduplicate strings in memory as they garbage collect. If your dataset has a lot of repeated strings then you effectively get an interning scheme for free. I don't know if G1 can really work well with over 1TB of heap, though. I've only ever heard of it going up to a few hundred gigabytes.

>With a managed language and a bit of care around exception handling, you can write code that's pretty much invincible without much effort because you can't corrupt things arbitrarily.

The JVM has crashed on me in the past (as in hard crash, not a Java exception). Less often than the C++ programs I write do? Yes, but of course I wouldn't test a program on a 1TB dataset before ironing out all the kinks.

>The latest HotSpots can deduplicate strings in memory as they garbage collect

Obviously when working with huge datasets I would implement some kind of string deduplication myself. Most likely even a special string class and a memory allocation scheme optimized for write-once, read-many access and cache friendliness.
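A minimal sketch of the hand-rolled deduplication idea: keep one canonical copy of each distinct value and point every record at it. `StringPool` and the sample values are illustrative, and this is not how HotSpot's collector does it:

```python
class StringPool:
    """Keep one canonical copy of each distinct string value."""

    def __init__(self) -> None:
        self._pool = {}

    def intern(self, s: str) -> str:
        # setdefault returns the stored copy if an equal string is already present.
        return self._pool.setdefault(s, s)

    def __len__(self) -> int:
        return len(self._pool)


pool = StringPool()
# Build equal strings at runtime so they start out as separate objects.
raw = ["".join(["us-", "east-", str(1)]) for _ in range(3)] + ["eu-west-1"]
deduped = [pool.intern(s) for s in raw]

print(len(pool))                 # number of distinct values kept
print(deduped[0] is deduped[2])  # True: both entries now share one object
```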

Or I would use memory mapping for the input file and let the OS's virtual memory management sort it out.

mmap is not "a lot of extra work".
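For the simple case of a dataset that already exists as a file, a read-only mapping really is only a few lines. This sketch writes a small temporary file as a stand-in for the real input; the file name and contents are made up:

```python
import mmap
import os
import tempfile

# Create a small stand-in for the "input file"; a real dataset would already exist.
path = os.path.join(tempfile.mkdtemp(), "dataset.bin")
with open(path, "wb") as f:
    f.write(b"header--" + bytes(range(256)) * 16)

with open(path, "rb") as f:
    # Length 0 maps the whole file; the OS pages data in lazily on access.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        header = mm[:8]      # slicing copies bytes out of the mapping
        last_byte = mm[-1]   # indexing returns an int
        size = len(mm)

print(header, last_byte, size)
```

This covers only the read-only, file-backed case. Data built up from a network feed or mutated in place needs more coordination than this.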

You're assuming the dataset is a file you can just mmap there. If it's more complex (built up from a network feed or something) then yes, there is more work involved than just an mmap call, coordinating shm segments is not zero complexity.

But the real impact is if you want to be mutating that data set. The default behaviour of "tear down the entire process on error" can of course be worked around even if you ignore data corruption errors, but not having to do things by hand is the point of managed runtimes in the first place.

Hah. You'd be surprised how few developers actually know that shared memory exists.

Use shared memory.

When you suddenly realize that your "big" data is not really that big! Who needs a Hadoop/Spark cluster when you can run one of these bad boys?

That was kind of my thought as well... I worked on a small-mid sized classifieds site (about 10-12 unique visitors a month on average) and even then the core dataset was about 8-10GB, with some log-like data hitting around 4-5GB/month. This is freakishly huge. I don't know enough about different platforms to even digest how well you can even utilize that much memory. Though it would be a first to genuinely have way more hardware than you'll likely ever need for something.

IIRC, the images for the site were closer to 7-8TB, but I don't know how typical that is for other types of sites, and caching every image on the site in memory is pretty impractical... just the same... damn.

I think you're missing a unit. 10-12 thousand? million?

million... lol

Heh, but I wonder what the default per account limits are on launching these... prolly (1) per account.

Why would they put any kind of a limit on it?

All AWS accounts have "limits", which set the default cap on how many instances you can launch in a region.

The reason is so if you fuck up a scaling script for example you can't launch 1000 machines and take all the capacity and then bitch that you won't pay for it.

It's a stop gap.

However, aside from the hard limit of 100 S3 buckets, all other limits are configurable at the request of your AWS rep.

It looks like that hard limit became a soft limit in August:


Hmm... I didn't know this. The last time I asked they said it would never happen. I was told the original reason was that all buckets had to have a unique name

I thought that too but I successfully raised our limit after they made it a soft limit, so give it a shot!

because they can only put these into racks so fast

And to prevent a run-away script from suddenly spooling up thirty of them. Besides issues with their hardware capacity, they're generally pretty good about refunding mistakes like that, so they're eating the cost...

All I can think about is the 30 minute garbage collection pauses.

Wow, did not know that's available on AWS marketplace! Thanks for sharing! : )

Actually, as far as VMs go, the JVM is fairly spare in comparison with earlier versions of Ruby and Python -- on a per object basis. (Because of its Smalltalk roots. Yes, I had to get that in there. Drink!) That said, I've seen those horrors of cargo-cult imitation of the Gang of Four patterns, resulting in my having to instantiate 7 freaking objects to send one JMS message.

If practice in recent decades has taught us anything, it's that performance is found in intelligently using the cache. In a multi-core concurrent world, our tools should be biased towards pass by value, allocation on the stack/avoiding allocating on the heap, and avoiding chasing pointers and branching just to facilitate code organization.

EDIT: Or, as placybordeaux puts it more succinctly in a nephew comment, "VM or culture? It's the culture."

EDIT: It just occurred to me -- Programming suffers from a worship of Context-Free "Clever"!

Whether or not a particular pattern or decision is smart is highly dependent on context. (In the general sense, not the function call one.) The difficulty with programming, is that often context is very involved and hard to convey in media. As a result, a whole lot of arguments are made for or against patterns/paradigms/languages using largely context free examples.

This is why we end up in so many meaningless arguments akin to, "What is the ultimate bladed weapon?" That's simply a meaningless question, because the effectiveness of such items is very highly dependent on context. (Look up Matt Easton on YouTube.)

The analogy works in terms of the degree of fanboi nonsense.

A small word of caution: I'd strongly recommend against using a huge java heap size. Java GC is stop the world, and a huge java heap size can lead to hour long gc sessions. It's much better to store data in a memory mapped file that is off heap, and access accordingly. Still very fast.

Good advice. Even with G1GC it's hard to run heaps that large. However, not to be overly pedantic, Java GC has many different algorithms and many avoid STW collection for as long as possible and do concurrent collection until it's no longer possible. I don't think it's fair to just call it stop the world.

Azul C4 will collect with stop-the-world pauses due to GC of no longer than 40ms. You might want to think beyond Oracle/OpenJDK (although OpenJDK with Shenandoah GC will be interesting in the future) if you are dealing with heap sizes like this.

I know that you are probably going to be modded into oblivion, but can Java address this much memory in a single application? I'm genuinely curious, as I would assume, depending on the OS that you'd have to run several (many) processes in order to even address that much ram effectively.

Still really cool to see something like this, I didn't even know you could get close to 2TB of ram in a single server at any kind of scale.

Bigger iron has been at 64-512 TB for a while:



Or significantly higher if you don't restrict yourself to single-system-image, shared memory machines - there are at least 2 1300-1500 TB systems on the Top 500 list.

Not using the out of the box solutions. But while I haven't done this personally my understanding is Azul Zing will allow you to efficiently use multi TB heaps in Java.

Java can address heaps of up to about 32GB with compressed oops (-XX:+UseCompressedOops) enabled. Once that flag is off, you can address as much as 64 bits will allow. http://stackoverflow.com/questions/2093679/max-memory-for-64...
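The 32GB figure falls out of simple arithmetic: compressed references are 32 bits wide and, with HotSpot's default 8-byte object alignment, each value names an 8-byte-aligned slot. A quick check (the constant names are just for illustration):

```python
REFERENCE_BITS = 32    # compressed references are 32 bits wide
OBJECT_ALIGNMENT = 8   # HotSpot's default object alignment, in bytes

# A 32-bit value can name 2**32 aligned slots, each 8 bytes apart.
max_compressed_heap = (2 ** REFERENCE_BITS) * OBJECT_ALIGNMENT
print(max_compressed_heap // 2 ** 30, "GiB")
```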

Do a little research before implying that there's no way that Java can address gigantic heaps.

You can but garbage collection will kill your performance for very large heaps. You either end up needing to use off heap memory to take it out of scope for garbage collection or using many small JVMs with more reasonable sized heaps.

I wasn't implying or assuming anything, I was genuinely asking... I'm more familiar with windows, than other OSes, but iirc, windows apps can only get 4GB per process. (Maybe that was just 32bit windows apps).

Just 32bit apps. 64bit apps can go much higher.

I was incorrect... Then again, I've never needed to address more than a couple gb of ram.

[1] http://stackoverflow.com/a/11892191/43906

and Scala too.

Scala _beats_ Java in most of the benchmarks: http://benchmarksgame.alioth.debian.org/u64q/scala.html

> _beats_ Java

Not according to that data!

A bit under $35,000 for the year.

Yes, but only if you pay for three years up front. Spelling out all the options:

  * $117K / year on-demand
  *   81K / year for one-year commitment, nothing up front
  *   69K / year for one-year commitment, $34,285 up front
  *   67K / year for one-year commitment, $67,199 up front
  *   35K / year for three-year commitment, $52,166 up front
  *   33K / year for three-year commitment, $98,072 up front
All these prices are total, e.g. for the last option, you pay $98,072 up front and then nothing more, which divides out to about $33K / year over the three years.
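The per-year figures can be sanity-checked with a few lines. This sketch uses the two all-up-front totals from the list and the $13.338 US East on-demand hourly rate quoted elsewhere in this thread, and assumes the instance runs 24/7:

```python
HOURS_PER_YEAR = 24 * 365

on_demand_hourly = 13.338   # US East on-demand rate quoted in this thread
all_upfront = {             # term in years -> total paid up front (USD)
    1: 67_199,
    3: 98_072,
}

on_demand_per_year = on_demand_hourly * HOURS_PER_YEAR
print(f"on-demand: ${on_demand_per_year:,.0f} / year")

for years, total in all_upfront.items():
    per_year = total / years
    saving = 1 - per_year / on_demand_per_year
    print(f"{years}-year all up front: ${per_year:,.0f} / year "
          f"({saving:.0%} below on-demand)")
```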

Plus, eventually, the spot market, and of course you can save money with on-demand if you only need the instance occasionally.

AWS pricing is a bit ridiculous. $100,000 can buy so much hardware even while taking account of the additional costs like electricity and hardware failures.

It's a lot, yes, but you're getting a lot more than just the instance:

- You can back up the image with fantastic reliability and recover into multiple AZs automatically with an autoscaling group for practically peanuts.
- You can place that instance into your own private VPC complete with security groups (stateful firewall), Network ACLs, and inter/intra AWS routing.
- You get strong authentication and RBAC through IAM over the instance and its environment.

That's what I consider in the "free or nearly free" tier, off hand. The other benefits come with being able to interface seamlessly and quickly (same infrastructure) to the rest of AWS services.

You might do better at finding a piece of hardware that does that (and I'm curious now what a 2TB RAM server goes for) but I think you'd be hard pressed to find a way to start from scratch to deploy that instance and all of the services that come with it for under that price. People with on-prem compute likely have some of that already, but the value here is that you could request an X1 today without ever having been a customer before, and you'll get all of that, and access to more, just with that one instance.

If that's not a good value proposition today, then I'd say wait just a few months. Today probably marks the highest price anyone will ever see for an X1. Given past history, it's just going to go downhill from here.

Also, the common use case of "I don't need this hardware 24/7/365."

If you need to do some analysis or computation on a massive data set once a month or something, it's going to be cheaper to pay $5k/yr (assuming you run for 24 hours a month and don't make use of the spot market) than to purchase and maintain the hardware and infrastructure.

The AWS instance is very predictable, in operation and cost, and that's valuable to business.

If your unique snowflake in your own datacenter (don't forget to factor in the physical space and your datacenter personnel into your costs) doesn't work well, it can mean replacement, additional costs and downtime. If the AWS instance has a hiccup, terminate and replace (I'm not saying that's going to be trivial either at a 2TB RAM instance size).

A big hunk of hardware of your own also represents significant CapEx and a depreciating asset for business. Spinning this monster up costs you $13 to start testing immediately, and you can walk away from it at any time. That's worth a lot.

"Fail quickly, fail cheaply."

And the staff costs of good sysadmins, and the reliable network, and everything else that comes with AWS?

For some use cases, the pricing of AWS makes sense.

I tend to agree with you on AWS. I swear by it. But in this case I would probably go at it this way...

How many hours can I run the machine before it costs more than building one? Probably a month? (random guess) Will I be running it longer than that?

If yes, build the machine, if no, just rent it from AWS.

I can't even imagine a scenario where I run this 24 hours a day 365 that I wouldn't build out a Hadoop cluster or similar for.

Well, the advantage of AWS is that you don't have to run it 24/7 unless your workload requires it.

Exactly, so say you are running it 3 hours a day every day and it costs x an hour and you want an ROI on your hardware of 3 months. If y is the cost of building it...

if 3x * 90 > y build else rent

So at $13.338 (assuming no reserved instance) if y is less than $3600 you make your money back in 3 months. Of course a machine with those stats will not be $3600.
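The rule of thumb above can be written out as a small sketch. The function names and the two hardware prices tried at the end are made up for illustration; the hourly rate, the 3 hours a day, and the 90-day payback window come from the comment. Power, colocation, and staff costs are ignored.

```python
def rental_cost(hourly_rate: float, hours_per_day: float, days: int) -> float:
    """On-demand spend over the payback window."""
    return hourly_rate * hours_per_day * days


def should_build(hardware_cost: float, hourly_rate: float,
                 hours_per_day: float = 3, days: int = 90) -> bool:
    """Build only if renting over the window would cost more than the hardware."""
    return rental_cost(hourly_rate, hours_per_day, days) > hardware_cost


rate = 13.338  # on-demand $/hour quoted in the thread

print(f"90 days at 3 h/day: ${rental_cost(rate, 3, 90):,.2f}")
# Hypothetical hardware prices, only to exercise the rule.
print("build a $3,000 box? ", should_build(3_000, rate))
print("build a $40,000 box?", should_build(40_000, rate))
```

Renting for 3 hours a day over 90 days works out to about $3,601, which is where the $3600 threshold comes from.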

Your math is off. At $13.338/hour you're looking at just shy of 10 grand per month ($9763 for a month of 30.5 days.)

It looks like one of these bad boys would set you back about $40000. So you'd break even at 4 months. If you're going to go for the 3 year reserved instance, with $100K upfront, you'd be way better off on capex and opex just buying and colocating the thing (not considering other expenses.)
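The same estimate can be expressed as months until the hardware pays for itself. This is only a sketch: the $40,000 price and the 30.5-day month are the figures used in this comment, the function name is invented, and colocation, power, and staff are left out.

```python
def months_to_break_even(hardware_cost: float, hourly_rate: float,
                         hours_per_day: float = 24.0,
                         days_per_month: float = 30.5) -> float:
    """Months of on-demand rental that add up to the hardware price."""
    monthly_rent = hourly_rate * hours_per_day * days_per_month
    return hardware_cost / monthly_rent


rate = 13.338  # on-demand $/hour quoted in the thread

print(f"24 h/day: {months_to_break_even(40_000, rate):.1f} months")
print(f" 3 h/day: {months_to_break_even(40_000, rate, hours_per_day=3):.1f} months")
```

The answer is very sensitive to utilization: the fewer hours a day the machine is needed, the longer renting stays cheaper.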

The bigger you grow the more the balance tips towards do it yourself because your costs for colocation, managing hardware, etc amortize out better with scale. AWS gets a little better with scale but not nearly as quickly.

My math isn't off... it's for 3 hours a day and I state that in my post (perhaps not clearly enough). Your number is for 24. The point was if you're not full utilization it could make sense.

People using AWS are not typically, say, small business people looking to eke out savings however possible so they can take an extra vacation or buy a new car. They are corporate people or people running startups using other people's money. Their success or failure doesn't depend on saving the organization money so they are willing to pay more to have less to worry about themselves personally.

Quite the opposite. If you're a small business in need of compute, then it's quite a value.

I'll give you a case in point. I have a friend who works for a construction company. He has three servers he's cobbled together all at one site, and a NAS for storage. They provide email and document sharing, accounts management, etc. Every new blueprint that comes through needs to be reviewed for accuracy, but they come in as images, so he has another server that just runs tesseract OCR when he gets a pdf in. He asked me if he should get three servers, or one and virtualize the three. He wanted to plan for growth, but didn't know what that would be. The NAS was underutilized.

He's buying no servers. He's setting up Workmail and Workdocs for their users. (Workspaces is an option in the future.) Blueprints are uploaded to S3 as they come in, where the file arrival notification triggers a lambda function that performs an OCR pass on the file and dumps it into a new bucket. Because this function runs on-demand, he avoids paying for a server to sit there, and only pays a few cents for a doc. His other services no longer run on servers he has to manage, so he doesn't worry about patching. He has reliable, off-site backups for the first time and version control, and he only pays for the exact amount of storage he's using as opposed to buying space for his NAS in advance. And he doesn't need to worry about scaling.
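A rough sketch of what such an upload-triggered function could look like, assuming the standard S3 event notification layout. This is not the friend's actual code: the function names, bucket names, and the injected `fetch`, `ocr`, and `store` callables are all hypothetical stand-ins, so the flow can be tried locally without AWS or an OCR engine installed.

```python
from typing import Callable, Iterator, List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def process_event(event: dict,
                  fetch: Callable[[str, str], bytes],
                  ocr: Callable[[bytes], str],
                  store: Callable[[str, str, str], None],
                  output_bucket: str) -> List[str]:
    """Run OCR on each uploaded file and store the text in another bucket.

    In a real Lambda, fetch and store would wrap an S3 client and ocr would
    wrap a tool such as tesseract; here they are injected (an assumption).
    """
    written = []
    for bucket, key in uploaded_objects(event):
        text = ocr(fetch(bucket, key))
        out_key = key + ".txt"
        store(output_bucket, out_key, text)
        written.append(out_key)
    return written


# Local dry run with stand-ins for S3 and the OCR engine.
sample_event = {"Records": [{"s3": {"bucket": {"name": "blueprints-in"},
                                    "object": {"key": "plans/site+A.pdf"}}}]}
saved = {}
written = process_event(
    sample_event,
    fetch=lambda bucket, key: b"%PDF...",
    ocr=lambda data: f"recognized {len(data)} bytes",
    store=lambda bucket, key, text: saved.__setitem__((bucket, key), text),
    output_bucket="blueprints-text",
)
print(written)
```

Because the function only runs when a file arrives, there is nothing to pay for or patch between uploads, which is the point being made here.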

That story plays out all the time. It's arguably a bigger savings for smaller people who can't afford the upfront costs than large corporations that already are likely to have massive sunk costs in datacenter space and assets. If you're a small business, the moment you go cloud you not only save money, but you gain capabilities (the biggest being in redundancy and reliability, which are the most important...how many small businesses have a DR/COOP capability on-prem?) which you would never have had otherwise.

> It's arguably a bigger savings for smaller people who can't afford the upfront costs

Well first let me say that I have been doing all of this (as a business person who knows tech and Unix) for quite some time. And what you wrote above spins my head because I don't know much about AWS (but I can easily turn up a server if needed). My point is what you are describing for a small business person requires a tech person knowledgeable with AWS to implement (and keep on top of), which is the "cost" as opposed to perhaps saving money in this particular example for this particular customer. I am not doubting that AWS (which I see you work for in your profile) is good for certain types of uses (such as perhaps the example you are giving). But you still need a knowledgeable tech guy to keep it all working; it is not like using webmail vs. running your own email server (imap, smtp etc.)

Additionally once you are on AWS you are pretty much locked into AWS by the way (as you describe) things are setup. I am not convinced that at a time in the future AWS will not change their pricing to take advantage of the lock that they have (even though now they continually drop prices). Or add extra fees or whatever. Being in business so many years I have seen this happen and changing from AWS will be near impossible. (In other words for people on AWS such as you describe there is no easy alternative competition for a system once designed and specific to AWS in the way you detail).

> Additionally once you are on AWS you are pretty much locked into AWS by the way (as you describe) things are setup. I am not convinced that at a time in the future AWS will not change their pricing to take advantage of the lock that they have (even though now they continually drop prices). Or add extra fees or whatever.

This is just FUD. AWS prices are always lowering, there is never a time they have increased. Even if they increased their prices, Google, Rackspace and Microsoft would eat their lunch. There is plenty of competition in the computing space.

> Being in business so many years I have seen this happen and changing from AWS will be near impossible. (In other words for people on AWS such as you describe there is no easy alternative competition for a system once designed and specific to AWS in the way you detail).

If your solutions are so narrowly defined that they won't run on anything else than AWS, then you are doing it wrong. You might run on specific AWS services, but there should be no reason why someone could not recreate their solutions over on Google Compute or other services. Yes, you might have to rework some of your solutions, but it should be very doable.

> This is just FUD. AWS prices are always lowering, there is never a time they have increased. Even if they increased their prices, Google, Rackspace and Microsoft would eat their lunch. There is plenty of competition in the computing space.

The prices are lowered, but AWS is also continuously introducing new, more powerful instance types and deprecating older ones.

> If your solutions are so narrowly defined that they won't run on anything else than AWS, then you are doing it wrong. You might run on specific AWS services, but there should be no reason why someone could not recreate their solutions over on Google Compute or other services.

AWS has many solutions that are different from what you are used to. For example there is no NFS (well there is EFS, which is supposed to be similar, but it's been in preview for at least a year now). You're forced to improvise and use their solutions. Typically that means S3, which is not exactly a 1:1 alternative to NFS, so you'll need to rework your applications to that model.

If you want to move to something else, you no longer will have these services available.

> Yes, you might have to rework some of your solutions, but it should be very doable.

Exactly, that's what's pointed out. You'll have to rewrite your applications. There's no lock in that can't be solved by rewriting. The problem is that the rewrite might be quite expensive.

>My point is what you are describing for a small business person requires a tech person knowledgeable with AWS to implement

It may seem that way, but in actuality, it really doesn't. In the case I described, their "tech" guy was someone who'd been pushed into the role by necessity. He had the most knowledge out of a body of people who didn't really make it their area of expertise, and became the IT guy as a result. He did a good enough job working on his own. But there are things he's going to miss, and it's a lot to keep up with if you want to run an infrastructure right. As you pointed out, running a Software-as-a-Service solution like Webmail is a lot different from running Infrastructure or Platform as a Service.

But that's the nice thing about AWS (and to be fair, Azure and Google do this quite well too) is that there's a solution for your tech level. If he were really knowledgeable and wanted to roll his own mail solution, he can, and stand up instances running sendmail and put in his own MX records into our DNS service and manage all of the bells and whistles. Or, he can use Workmail (or Gmail for business if you want to be agnostic) and point and click his way to getting mail set up.

So there's a lot of range there, and the higher you go up the stack, the more management you hand off to the cloud provider at the (hopefully small) expense of control. But if the provider's offering fulfils all of your requirements, then it's a great deal for small, non-technical, shops who can offload their workloads into a provider who are experts at managing infrastructure, and you, the business owner, can create the occasional account and otherwise focus on business.

There's a great chart on this you might have seen, but it looks something like this: http://1u88jj3r4db2x4txp44yqfj1.wpengine.netdna-cdn.com/wp-c...

In any case, there's a solution for everyone depending on their skillset, and arguably, (in my humble yet biased opinion =P ) AWS has the best spread. If I had one self-criticism about us, it's that it's not always intuitive as to which of our services fall in the SAAS/PAAS/IAAS stack. Education will always be a challenge no matter who the provider is, and being that cloud is still relatively young, it's up to AWS, Microsoft, Google, and everyone else to explain just what "cloud" means and how to start with it.

As for locked in, it’s AWS’ goal to keep you by ensuring you /want/ to use the service, not because you’re forced to be here. There’s no lock in other than transit fees. Which, if you have thousands of terabytes, can be hefty. That’s bandwidth for you. But there’s nothing proprietary. Take your code and data with you. I’ve seen engineers walk customers through the process of getting their stuff out. It happens. Not often, but it does, and we’ll help with that too.

> If he were really knowledgeable and wanted to roll his own mail solution, he can, and stand up instances running sendmail and put in his own MX records into our DNS service and manage all of the bells and whistles.

Ok that is something I actually need to do. I have a project now to bring up a smtp and imap server and I can put it on the colo box or do it on AWS. [1] Is there a specific step by step guide to doing this with AWS? Quite frankly with all of the things you do there I wouldn't even know where to start. Otoh if I go to, say, Rackspace I can easily and quickly spin up a server (with RAID) and not have to give any thought to doing so because it's a centos server running on a vps. With AWS I don't know the difference (and would have to learn) as far as the different instances, availability zones, all of that (and quite frankly I have little time to do that so I would typically go with what I already know). Make sense?

[1] So what I am saying is I can do this on a colo box but don't even know where to begin to do it on AWS. I don't mean I need handholding to install and get the server working, just to get it working given the various offerings on AWS.

> bring up a smtp and imap server and I can put it on the colo box or do it on AWS.

Is that all the server will be used for? If you just want to run a Linux server 24x7 you will not be taking advantage of any AWS services, so you would probably be better off getting a colo box or VPS to act as your mail server.

If you really want to run this on AWS, you just need to spin up an EC2 instance, and configure the Security Group (think of it like a firewall). There is a wizard which will guide you through the process[0]. Once the instance is running you can (optionally) use Route 53 to set up DNS records for your new instance.
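To make the Security Group step a little more concrete, here is a minimal sketch that builds ingress rules for a mail server in the `IpPermissions` shape the EC2 API uses. The port selection and the `ingress_rules` helper are assumptions for illustration; it only builds the data and does not call AWS.

```python
from typing import Dict, List

# Ports a small mail server commonly exposes (an assumption; adjust to
# what you actually run). SSH is left out and should be limited to your own IP.
MAIL_PORTS = {"smtp": 25, "submission": 587, "imaps": 993}


def ingress_rules(ports: Dict[str, int], cidr: str = "0.0.0.0/0") -> List[dict]:
    """Build one TCP ingress rule per port, sorted by port number."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": name}],
        }
        for name, port in sorted(ports.items(), key=lambda item: item[1])
    ]


for rule in ingress_rules(MAIL_PORTS):
    print(rule["FromPort"], rule["IpRanges"][0]["Description"])
```

With boto3 installed and credentials configured, a list like this could be passed as `IpPermissions` to the EC2 client's `authorize_security_group_ingress` call along with the group's `GroupId`. The launch wizard does the same thing through the console.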

Also, feel free to buy my book :-) It starts with the basics like launching instances, and then moves on to more interesting AWS-specific features [1].

0: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/launching...

1: http://www.amazon.com/AWS-System-Administration-Practices-Sy...

With regard to my comment (below) I guess what I am looking for is a guide like "hello world" for AWS.

I think this is a great intro into the different components. https://www.airpair.com/aws/posts/building-a-scalable-web-ap...

If you have any other questions, feel free to reach out to our forums, or drop me a line. =)

It really depends on your needs... if I can have my postgres/mysql, or other table storage serviced by another company without having to become intimately familiar with another set of software to administer and/or pay someone to do so it may be a good option.

If you can save a FTE admin, you're saving at least $10K/month, that can cover the hosting and bandwidth for a pretty big site/software/service on AWS and the like.

$9.6K/mo without a reservation, but a hefty 70% off with a 3-year reserved instance purchase. It's actually good value, it's 8x the RAM of the nearest memory optimized instance (R3.8XLarge) at just 5x the price.

If you have a whole bunch of memory optimized instances, this can allow you to simplify and consolidate and still save you money. Still, don't put all your eggs in one virtual basket.

Under $35k per year AFTER you've paid the $98k up-front cost for a reserved instance. The blog post shows pricing for reserved instances. Hourly price is $13.338 without up-front payment. Expensive, but still nice to have this available as an option for niche scenarios that need all that RAM.

I'd love to see someone benchmark how fast you can get ~2TB of data into that instance from S3, process it, get the results back into S3, and power down.

Big data version of Cannonball Run.

Or you could just use BigQuery =).

What fun would that be :)

An R930 on Dell's site configures out to double that. And that doesn't include power/bandwidth.

I'm sure you would also get a hefty discount if you were a customer who regularly buys servers that cost that much.

Thanks for looking that up... I was going to say, I'd be surprised if these servers weren't costing Amazon over $70K to begin with.

IIRC that's not a whole lot more than i2.8xl, which has 1/8 of the RAM. But anyway, most likely this will probably be used for on demand "Small Data" type jobs that will benefit from running on a single machine.

If you paid hourly (instead of using a reserved instance) it costs $116840.88 per year (13.338 per hour * 24 * 365). A 1 year reservation + hourly charges for a year is $67197.96 (all up front).
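The arithmetic here, and the roughly 70% reserved discount mentioned elsewhere in the thread, can be checked with a short sketch. All dollar figures are the ones quoted in the thread; the variable and function names are made up.

```python
HOURS_PER_YEAR = 24 * 365

ON_DEMAND_HOURLY = 13.338          # $/hour, from the thread
ONE_YEAR_ALL_UPFRONT = 67_197.96   # $ total for one year, from the thread
THREE_YEAR_ALL_UPFRONT = 98_072    # $ total for three years, from the thread


def discount(reserved_per_year: float, on_demand_per_year: float) -> float:
    """Fraction saved per year compared with running on-demand 24/7."""
    return 1 - reserved_per_year / on_demand_per_year


on_demand_year = ON_DEMAND_HOURLY * HOURS_PER_YEAR
three_year_per_year = THREE_YEAR_ALL_UPFRONT / 3

print(f"on-demand, 24/7:    ${on_demand_year:,.2f} per year")
print(f"1-year reservation: {discount(ONE_YEAR_ALL_UPFRONT, on_demand_year):.0%} cheaper")
print(f"3-year reservation: ${three_year_per_year:,.2f} per year, "
      f"{discount(three_year_per_year, on_demand_year):.0%} cheaper")
```

The comparison only holds for a machine that runs around the clock; a reservation is paid for whether or not the instance is in use.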

You can buy a server from Dell with 1.5 TB of RAM and four beefy processors (80 cores) for about $60K. You can power and cool it for a year for maybe $4K.

Every time I look at AWS, it just doesn't make sense from a financial standpoint (even after you add another machine for redundancy, and remote hands -- you're ahead after 12 months).

I'm sure there are companies that will run this instance type 24/7 for years. But you can also spin one up for a short period of time, use it to do things you would normally need to do across a bunch of machines and then tear it down.

Having this type of instance available via an API call within seconds is really cool.

So this thing is worth about three humans in the western world :(

(not that I don't understand that this could be said of many things, might actually be true in economic terms etc. It just never occurred to me with other heavy machinery).

It's always fun when making devops forecasts to plot the date when the servers will be paid more than you due to anticipated growth.

The use cases for this class of machines, I think: spin up, do work for a few minutes/hours/whatever, then spin down. They're far from running 24/7 which _can_ make it more economical than buying dedicated hardware which might sit idle.

The cost of those machines is tiny compared to the SAP HANA licenses you'll need for your "Enterprise-Scale SAP Workloads" mentioned in the blog.

Recompiling tetris with BIGMEM option now...

I'm curious about this AWS feature mentioned: https://aws.amazon.com/blogs/aws/new-auto-recovery-for-amazo...

We've experimented with something similar on Google Cloud, where an instance that is considered dead has its IP address and persistent disks taken away, then attached to another instance (live or just created). It's hard to say whether this can recover from all failures without having experienced them, or whether it even works better than what Google claims it already does (moving failing servers from hardware to hardware). Anyone with practical experience in this type of recovery where you don't duplicate your resource requirements?

How does this thing still only have 10 GigE (plus 10 dedicated to EBS)? It should have multiple 10 Gig NICs that could get it to way more than that.

Funny how the title made me instantly think: SAP HANA. After not seeing it for the first 5 paragraphs or so, Ctrl+F, ah yes.

Not too surprising given how close SAP and Amazon AWS have been ever since SAP started offering cloud solutions. Going back a couple years when SAP HANA was still in its infancy; trying it on servers with 20~100+ TB of memory, this seems like an obvious progression.

Of course there's always the barrier of AWS pricing.

The pricing is surprisingly enough not terrible. Given that dedicated servers cost $1-1.5 per GB of RAM per month the three year price is actually almost reasonable.

That being said, a three year commitment is still hard to swallow compared to dedicated servers that are month-to-month.
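The per-GB comparison can be worked through as a quick sketch. The memory size is derived from the thread's "8x the RAM of an r3.8xlarge (244 GB)", and the prices are the all-upfront totals quoted in the thread; the function name is invented.

```python
MEMORY_GB = 8 * 244  # thread: 8x the RAM of an r3.8xlarge (244 GB)


def per_gb_month(total_cost: float, months: int,
                 memory_gb: int = MEMORY_GB) -> float:
    """Dollars per GB of RAM per month over the whole term."""
    return total_cost / months / memory_gb


three_year = per_gb_month(98_072, 36)  # 3-year, all upfront
one_year = per_gb_month(67_199, 12)    # 1-year, all upfront

print(f"3-year reserved: ${three_year:.2f} per GB per month")
print(f"1-year reserved: ${one_year:.2f} per GB per month")
```

Only the three-year price lands inside the $1-1.5 range quoted for dedicated servers; the one-year price is roughly double that.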

It's not just the commitment, but the fact you have to cough up $52,000-$98,000 up front.

Wow! http://codegolf.stackexchange.com/a/22939 would now be available in production.

Hmm around $4/hr after a partial upfront. I'm guessing that upfront is going to be just about the cost of a server which is around $50k.

What happened to the other 16 threads?

18(core) * 4(cpus) * 2(+ht) = 144

Some fraction has to go to the physical host for its resources.

dom0 happened.

I'd be guilty if I ever used something like this and under utilized the ram.

"Ben we're not utilizing all the ram."

"Add another for loop."

I'm taking it this is so people can run NodeJS or MSSQL on AWS now? Heh, sorry for the jab - what could this be used for considering that AWS' top tier provisioned storage IOP/s are still so low (and expensive)?

Something volatile running on a RAM disk maybe?


That's amazing.

That's the reserved pricing. You have to pay $52,000+ up front to get that price. Standard on-demand is over $13.

Only if you reserve them for three years, which carries an upfront cost of ~$98K.

Partial upfront is only $52,166, and over the life of the reservation, only costs an extra 2%.

Yeah so you better know exactly why you want these and how to use them

16GB of ram should be enough for anyone.

Edit, y'all don't get the reference: famous computer urban legend...


We detached this subthread from https://news.ycombinator.com/item?id=11723858 and marked it off-topic.

A joke about an ancient myth in our tech community is "off topic" - you guys are too stringent with this. I think it's lame.

The joke doesn't make any sense because the Bill Gates quote refers to personal computing, not development.

Just imagining the entirety of Google running on 16GB of RAM :).

To answer the question at hand though, MySQL seems great at up scaling with larger hardware, as does Redis (from what I've seen, it seems to almost get more efficient at sizes >500GB). MongoDB (my goto database) occasionally doesn't seem to scale very nicely off the bat, but after some configuration that too scales very well, possibly even better than MySQL.

MongoDB's In-Memory storage engine would be pretty sweet to try out with the X1 instance :)

Pretty sure every single person on HN gets the reference; it's just trite and doesn't add anything to the conversation.

Lighten up. You know what's funny - is that I have tons of friends in work, on multiple private slack groups and all over, guess what - we all joke.

That's what makes us friends.

I'm not here to prove myself - I participate in the tech community here and it's like everyone like you just tries to put on some superiority complex and act like tech is ONLY serious business.

That's bullshit.

Get over yourself.

I've seen plenty of jokes that were good, relevant, and tasteful upvoted on HN. Do you think it is possible you are mistaken about the quality of yours?

It's 100% relevant to the tech community as nobody outside our community would get it off hand.

Encouraging folks to write more inefficient code?

I'd be interested in hearing what Gates [1] has to say about it, though.

[1] "640 kB ought to be enough for anybody"

Too bad no one has a real source for that quote, probably because he never actually said it (though like "Beam me up Scotty" which was never quoted in the original Star Trek, he may have come close, but it's not nearly as memetastic)
