
Amazon EC2 Elastic GPUs - madmax108
https://aws.amazon.com/blogs/aws/in-the-work-amazon-ec2-elastic-gpus/
======
kyledrake
For people who don't have huge budgets for this stuff but need a GPU in
production for some light work, I wanted to mention that with a little bit of
effort you can do this a lot cheaper with gaming/mining GPUs and a 4U/midtower
rack space. This also gets you a video card attached to an X interface for
doing things like WebGL screenshots with SlimerJS (my use case). I'm not sure
the GPU compute instances let you do that; you usually need a display emulator
to make X work with headless GPUs.
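
A minimal sketch of that screenshot step, assuming an X server is already
running on the card at :0 (via a dummy-monitor xorg.conf or a dummy HDMI
plug), slimerjs is on the PATH, and webgl_screenshot.js is just a placeholder
name for whatever SlimerJS script you use to render the page:

    # Rough sketch only: point SlimerJS at the GPU-backed X display and
    # have it render the page to a PNG.
    import os
    import subprocess

    env = dict(os.environ, DISPLAY=":0")  # display owned by the real GPU

    # webgl_screenshot.js (hypothetical) loads the URL and calls
    # page.render() to save the screenshot.
    subprocess.run(
        ["slimerjs", "webgl_screenshot.js", "https://example.com", "out.png"],
        env=env,
        check=True,
    )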

I'm not mentioning this to be smug - if you don't need massive GPU clusters on
demand, the cost difference is _substantial_. I can build a GPU rig for $1,200
and it's going to cost $40/mo to host it. Compare that to the $500+ per month
you're going to spend at a cloud provider.

I'm actually shipping one out today. It looks like this
[https://twitter.com/neocitiesweb/status/804065175045218304](https://twitter.com/neocitiesweb/status/804065175045218304)

~~~
dandelany
Are there easy services out there that let you mail them a physical server and
they'll host it in their rack? I've never heard of doing this on a server-by-
server basis... But I'm also not a DevOps guy :P

~~~
seanp2k2
As someone who previously worked in an ISP / webhosting datacenter and now
sees most younger devs as people who don't actually know how their stuff runs
("it's in the cloud"), this is an amusing comment.

"The cloud" is just marketing on top of shared hosting and virtual private
servers. Those things have been around for decades. At the end of the day,
your code is still executing on some poor Xeon out there, but maybe it's using
some kernel features to restrict memory access and you deploy it using a fancy
whale with a shipping container. We've had LXC, cgroups, jails, and xen for
many years as well (LXC was 2008, Xen 2003). Solaris had zones in ~2004. We
have SSDs, DDR4, 10GbE, and 32-core-per-socket Xeons now, so you don't have to
really give two shits about crazy amounts of random reads / writes, storing
state in 4 separate network-connected nosql systems with REST APIs, 4 layers
of reverse proxies, 3 layers of virtualization, and 60-frame-deep call stacks
as most stuff just works well enough with little thought put into the systems
aspect of it. Despite all of this, Google Sheets is less responsive than the
1985 release of Excel, and takes longer in wall-clock time to enter in 100
rows of integers, then sum up a column. See also:
[https://en.m.wikipedia.org/wiki/Wirth's_law](https://en.m.wikipedia.org/wiki/Wirth's_law)

#progress

The only thing which is really new is a generation of people who don't
understand anything system-related deeper than the first abstraction layer.

~~~
dandelany
I'm well familiar with shared & dedicated hosting, and with colocation as a
concept - I guess I've just thought of colo as something only big companies
with big budgets and lots of servers do. I like the idea of building a single
rack server in my apartment for a project, and mailing it away never to worry
about it again.

Good points re: abstraction of the deeper system layers, though. I guess my
position as a frontend developer is that every brain cell I spend thinking
about things like hardware and memory allocation is one less to spend on the
UX of my app. Abstracting layers away is a good thing - except when they leak,
which is often...

~~~
x13
"never to worry about it again" .. i guess this is why i stopped renting a
cabinet at level 3 .. when a hard drive dies, who wants to go fix it?

------
arielweisberg
This is really trippy. GPUs generally need a fast interface like PCI-E and
won't work over a network like a disk.

So how does elastically providing GPUs across a rack actually work? How do you
avoid having a ton of GPUs just sitting around, given the physical constraints
of PCI-E and the fact that you can attach GPUs to some common instance types
at any time? How do you not run out of capacity and just have to say no?

~~~
tw04
You can extend PCIe over distance with switches and keep the GPUs in an
external enclosure:

[http://www.dolphinics.com/products/pent-dxseries-dxs410.html](http://www.dolphinics.com/products/pent-dxseries-dxs410.html)

[http://www.dell.com/us/business/p/poweredge-c410x/pd](http://www.dell.com/us/business/p/poweredge-c410x/pd)

*There are newer/better/faster versions of the above available; these are just
two examples I had handy as my guess at how to make this happen. I'm confident
that even if AWS is doing something similar, it's being done on customized
hardware.

~~~
arielweisberg
As a gamer I was initially surprised that this works. It makes sense if
server-side use cases are more batch-oriented/compute-intensive.

Latency to start and stop "jobs" is critical in gaming, since you're trying to
hit a 60-144 Hz frame rate - roughly a 7-17 ms budget per "job".

~~~
etrain
GPUs in the cloud aren't targeted at gamers. They're targeted at people doing
things like running render farms and training deep learning models.

~~~
rl3
What about companies interested in real-time streaming of 3D content? That use
case is extremely sensitive to latency, probably more so than normal gaming
is.

Fortunately the amount of additional latency introduced is likely to be
negligible (another comment cites PCI-E switches incurring <= 1µs).

------
badloginagain
Does this enable remote render farms? That would be huge for independent
animation studios.

~~~
pfranz
Most render farms at animation studios run CPU-based renderers. VFX companies
have been using cloud rendering here and there for a few years now[1]. There
has been chatter for 10+ years that GPU-based rendering is on the cusp of
changing everything (predating the cloud push). Anecdotally, small shops (1 or
2 people) have had luck with onsite GPU rendering when they have a lot of
simple things to do. Larger places need more flexibility than GPU renderers
have offered, and places with large in-house render farms don't have any GPUs
installed in them.

Recently, though, GPU rendering has gotten some traction in larger
facilities[2]. Cloud rendering makes it easier to move toward these kinds of
things because you don't have to commit to the hardware upfront. However, the
big problem with cloud rendering at even modestly sized animation/VFX houses
is transferring the terabytes to and from the cloud (the consensus is to leave
it in the cloud or use a local cloud).

[1] [http://www.postmagazine.com/Publications/Post-Magazine/2012/...](http://www.postmagazine.com/Publications/Post-Magazine/2012/November-1-2012/VFX-Flight-makes-use-of-the-cloud.aspx)

[2] [https://www.redshift3d.com/blog/blizzards-overwatch-animated...](https://www.redshift3d.com/blog/blizzards-overwatch-animated-shorts-rendered-with-redshift)

------
vinayan3
Exciting. Hopefully, the GPUs are made by Nvidia and support CUDA.

I'm curious what the pricing will be.

~~~
puzzle
They don't mention any manufacturers and do not support CUDA, just OpenGL.

It's clear what they are doing: you call Amazon's OpenGL library, which
applies some batching and compression when talking to a remote GPU somewhere
else in the cluster. You aren't allowed to know, and don't even need to know,
what kind of hardware is on the other side; they could even mix different
manufacturers. And because of this proxying, you can only use an open,
vendor-neutral API like GL, hence no CUDA support.
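
A toy sketch of that proxying idea (emphatically not Amazon's actual library,
just the general shape): record the GL-style command stream client-side, then
compress and ship batches to whichever host actually owns the GPU:

    # Illustration only: a client shim queues "GL calls", batches them,
    # compresses the batch, and sends it to a pretend remote GPU host.
    import json
    import socket
    import threading
    import zlib

    def gpu_host(server_sock):
        # Pretend remote GPU host: read one length-prefixed, compressed
        # batch and decode it (a real GPU would execute the commands).
        conn, _ = server_sock.accept()
        size = int.from_bytes(conn.recv(4), "big")
        data = b""
        while len(data) < size:
            data += conn.recv(4096)
        print("remote GPU would execute:", json.loads(zlib.decompress(data)))
        conn.close()

    class RemoteGL:
        # Client-side shim: record calls instead of issuing them locally.
        def __init__(self, addr):
            self.addr = addr
            self.batch = []

        def call(self, name, *args):
            self.batch.append([name, list(args)])

        def flush(self):
            # One compressed round trip instead of one per call.
            payload = zlib.compress(json.dumps(self.batch).encode())
            with socket.create_connection(self.addr) as s:
                s.sendall(len(payload).to_bytes(4, "big") + payload)
            self.batch = []

    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    worker = threading.Thread(target=gpu_host, args=(server,))
    worker.start()

    gl = RemoteGL(server.getsockname())
    gl.call("glClearColor", 0.0, 0.0, 0.0, 1.0)
    gl.call("glClear", "GL_COLOR_BUFFER_BIT")
    gl.call("glDrawArrays", "GL_TRIANGLES", 0, 3)
    gl.flush()
    worker.join()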

~~~
vinayan3
I read another comment on /r/machinelearning saying it's likely that this
won't be for CUDA; they would have highlighted CUDA support instead of just
showing OpenGL. Furthermore, they only showed Windows, so maybe Linux won't be
available?

------
MichaelBurge
Does anyone have hardware recommendations for getting as much GPU performance
as possible for $7,500 or so (maybe with 8 or 12 GTX 1080/1070s)? And how does
it compare with the K80s or P100s that AWS and Google are using?

Just looking at the spot pricing, I see that a p2.xlarge is $0.57 an hour,
while a p2.8xlarge is $72/hour and a p2.16xlarge is $144/hour. That tells me
there is extreme demand for heavy GPU instances, and a home cluster would be
one way to insulate myself from that:

[https://aws.amazon.com/ec2/spot/pricing/](https://aws.amazon.com/ec2/spot/pricing/)
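
For what it's worth, at the spot prices quoted above, the per-GPU-hour cost of
the bigger instances works out to roughly 16x the single-GPU instance, which
is what makes the home-cluster math attractive:

    # Back-of-the-envelope per-GPU-hour cost at the spot prices quoted above.
    instances = {
        "p2.xlarge": (0.57, 1),     # ($/hour, GPU count)
        "p2.8xlarge": (72.00, 8),
        "p2.16xlarge": (144.00, 16),
    }
    for name, (price, gpus) in instances.items():
        print(f"{name}: ${price / gpus:.2f} per GPU-hour")
    # prints:
    #   p2.xlarge: $0.57 per GPU-hour
    #   p2.8xlarge: $9.00 per GPU-hour
    #   p2.16xlarge: $9.00 per GPU-hour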

------
old-gregg
Also check out [http://www.bitfusion.io](http://www.bitfusion.io) (no
affiliation)

------
erichocean
Vulkan is definitely interesting, but OpenGL 4.4 or later on Linux is a must
for me.

