

Ask HN: 10k+ node _windows_ cluster in AWS? - leakybucket

For testing purposes, I'd like to know if anyone has successfully created, controlled, and torn down clusters of 10,000 or more Windows instances in AWS? Or in other cloud providers?

I'm looking to set up a large-scale QA test that will drive Windows client traffic (think Outlook, CIFS file sharing) to a data center installation.
======
mwasser
If you're looking to experiment -- I think the best answer is to talk to AWS
support, as this isn't a normal case that default AWS account limits allow.

Check out
[http://aws.amazon.com/ec2/faqs/#How_many_instances_can_I_run...](http://aws.amazon.com/ec2/faqs/#How_many_instances_can_I_run_in_Amazon_EC2).

To quote it: "You are limited to running 20 On-Demand or Reserved Instances,
and running 100 Spot Instances per region. Cluster Compute Eight Extra Large
instances are limited to running 8 On-Demand instances per region and Cluster
GPU Quadruple Extra Large instances are limited to running 2 On-Demand
instances per region (Currently cc2.8xlarge and cg1.8xlarge instances are only
available in US East (N. Virginia))"

However, these are the default limits -- AWS will raise your account limits by
quite a bit if you just ask. That said, the most I've ever seen AWS approve was
400 on-demand instances per region (I think this means 2800 instances total
over 7 regions... I never saw whether the limit applied to all regions). Also,
my guess is that the likelihood of having a 10,000 spot instance request
fulfilled is near zero, even if they let you make a request that big without
first clearing it with them.

I take it you're also familiar with the cost of running that? The absolute
minimum you could possibly be charged using small instances is $1,200 (10,000
instances for a single billed hour) -- partial hours are rounded up.

~~~
leakybucket
My rough estimate was ~$14,000 for 24 hours of 10,000 Windows instances:

- 24 hours of 10,000 non-persistent on-demand Windows micro instances in
  Amazon West: $7200
- Assume 0.5 mbit/sec average to each instance over 24 hours (105TB): $5020
- AWS data rate from AWS is currently 0.

If I understand their pricing, cyclecomputing adds a 20% upcharge: $1440
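
The arithmetic behind this estimate can be checked with a short sketch. The dollar figures are the ones quoted above; the function names are made up for illustration, and the assumption that the 20% upcharge applies only to the instance cost is inferred from $1440 being 20% of $7200.

```python
# Figures as quoted in the estimate above (USD, 24-hour run).
INSTANCES = 10_000
HOURS = 24

instance_cost = 7200    # quoted: on-demand Windows micro instances
transfer_cost = 5020    # quoted: data transfer for "105TB"
surcharge_rate = 0.20   # quoted: upcharge, as the poster understood it


def implied_hourly_rate(total, instances, hours):
    """Per-instance hourly price implied by a quoted total."""
    return total / (instances * hours)


def transfer_terabytes(mbit_per_sec, instances, hours):
    """Total volume in decimal TB at a constant per-instance rate."""
    bits = mbit_per_sec * 1e6 * instances * hours * 3600
    return bits / 8 / 1e12


# Assumption: the upcharge is applied to the instance cost only.
surcharge = surcharge_rate * instance_cost
total = instance_cost + transfer_cost + surcharge

print(f"implied rate: ${implied_hourly_rate(instance_cost, INSTANCES, HOURS):.2f}/hour")
print(f"surcharge:    ${surcharge:,.0f}")
print(f"total:        ${total:,.0f}")
print(f"volume:       {transfer_terabytes(0.5, INSTANCES, HOURS):.0f} TB")
```

The three quoted items sum to $13,660, which matches the rough ~$14,000. Under these assumptions a constant 0.5 mbit/sec per instance comes to about 54 TB in decimal units, roughly half the quoted 105TB, so the quoted figure may be counting traffic in both directions; that is a guess, not something stated in the thread.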

~~~
mwasser
I'm unfamiliar with the HTC/HPC space, but I'd highly suggest first running
just a few Windows micro instances with some kind of dummy job to verify they
work well enough for your purposes. For what I've tried to use them for
(website + database type of demo machines), they've been just about unusable:
way too unresponsive to get much done. Linux micros are good enough for most
stuff I need, but Windows ones are not.

While not that big a charge relatively speaking, you'll need to include EBS
as well, since micros only have EBS:

- EBS space (10c/GB/month) -- I'd imagine 10GB minimum per machine, though
  AWS-provided instances have 30GB.
- Est: (10,000 machines * 10GB * $0.10) / 30.4 (avg days in a month) =
  $328.95 per day
- EBS IO (who knows? depends on what you do) -- rate is 10c per 1 million IO
  ops.
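
The EBS estimate above can be written as a small sketch so the volume size and run length are easy to vary. The rates are the ones quoted in this comment; the function names and the IO operation count in the usage lines are hypothetical, since the real IO load is unknown.

```python
def ebs_storage_cost_per_day(machines, gb_per_machine,
                             usd_per_gb_month=0.10, days_per_month=30.4):
    """Daily storage cost: monthly charge divided by average days in a month."""
    monthly = machines * gb_per_machine * usd_per_gb_month
    return monthly / days_per_month


def ebs_io_cost(io_ops, usd_per_million_ops=0.10):
    """IO cost at the quoted rate of 10c per 1 million IO operations."""
    return io_ops / 1_000_000 * usd_per_million_ops


# 10GB per machine, as assumed in the comment above.
print(round(ebs_storage_cost_per_day(10_000, 10), 2))
# 30GB per machine, the size mentioned for AWS-provided instances.
print(round(ebs_storage_cost_per_day(10_000, 30), 2))
# Hypothetical load: 1 million IO ops per machine over the run.
print(round(ebs_io_cost(10_000 * 1_000_000), 2))
```

With 10GB volumes this reproduces the $328.95 figure for one day; the IO line only shows how the rate scales and says nothing about what a real test would generate.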

------
ovechtrick
I was at Super Computing 11 this past November.

I haven't done it personally, but I saw a demo at the CycleComputing booth
that looked pretty slick.

They've brought up as many as 30,000 Linux nodes. Not sure if they have an
offering for Windows or Azure.

There are entire companies being built to solve this problem.

~~~
leakybucket
Thanks - I saw the Ars article about CycleComputing, and was researching them.

~~~
ovechtrick
I followed cloud pretty closely at the conference.

My biggest takeaway was that if I'm going to do public cloud, I'll contract a
company to deal with the details of bringing up the cluster/images.

------
JoachimSchipper
An AWS (support) forum may be a better resource. I don't see why this wouldn't
work - maybe the Amazon guys would appreciate a heads-up, though.

