
Microsoft Readies Azure GPUs - kungfudoi
https://www.virtualizationpractice.com/microsoft-readies-azure-gpus-38918/
======
n00b101
Not sure about the exact pricing shown here, but it looks like Azure blows AWS
GPU pricing out of the water.

Comparing specs:

    
    
      AWS g2.2xlarge:
      * DRAM: 15 GB
      * GPU processor: 1x GK104
        * 1536 CUDA cores
        * 2.29 Tflop/s peak single-precision throughput
      * GPU memory: 4 GB
        * 160 GB/s peak bandwidth
      * Price: $0.65/hr (US East, Linux)
    
      Azure NC6:
      * DRAM: 56 GB
      * GPU processors: 2x GK210
        * 4992 CUDA cores
        * 8.73 Tflop/s peak single-precision throughput
      * GPU memory: 24 GB
        * 480 GB/s peak bandwidth
      * Price: $0.66/hr
    

I expect AWS will introduce a new GPU fleet and reduce pricing of the existing
G2 fleet, in response to Azure.
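A quick back-of-the-envelope on the figures quoted above (a sketch only, using the listed peak specs and hourly prices, which of course say nothing about real-world utilization):

```python
# Rough price/performance comparison using the quoted specs.
instances = {
    "AWS g2.2xlarge": {"tflops": 2.29, "mem_bw_gbs": 160, "price_hr": 0.65},
    "Azure NC6":      {"tflops": 8.73, "mem_bw_gbs": 480, "price_hr": 0.66},
}

for name, spec in instances.items():
    tflops_per_dollar = spec["tflops"] / spec["price_hr"]
    bw_per_dollar = spec["mem_bw_gbs"] / spec["price_hr"]
    print(f"{name}: {tflops_per_dollar:.1f} Tflop/s per $/hr, "
          f"{bw_per_dollar:.0f} GB/s per $/hr")
```

On these numbers the NC6 delivers roughly 3.7x the peak compute per dollar-hour of the g2.2xlarge.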

~~~
JoshGlazebrook
I wish one of the main cloud platforms would significantly cut their per-GB
bandwidth costs. It feels like there have been no adjustments in years.

Also, funny story: when I was 14 or 15 and trying to come up with a name for a
stupid little chat client I was making in C#, somehow azure popped into my
head. Not surprisingly, azure.com was taken and just had a generic Microsoft
landing page (this was a little before Azure was announced in 2008).

~~~
blahi
That's a major lock-in mechanism: the data stays in for the compute. Which is
a major pity for Microsoft, since all the marketing-measurement startups got
built on top of AWS while Azure has superior analytics tools.

------
wjoe
This is pretty interesting from a video encoding point of view. We're
currently using g2.2xlarge AWS instances which are Kepler based. All we really
need from those instances is the GPU (more specifically, the NVENC H264
encoding chip), so other specs aren't too important.

It looks like the NV6 Azure instances use a newer Maxwell-based chip, which
can handle many more (concurrent, or higher-bitrate/resolution) videos
compared to the two-generations-older AWS GPUs.

And the NV6 is virtually the same price as the g2.2xlarge... time to do some
testing. Hopefully Amazon can catch up soon.

~~~
dogma1138
Just for reference:

K1 (AWS): 1 GPU x 6 H.264 streams (max single stream 720p@30fps)

K2 (Azure K80): 1 GPU x 6 H.264 streams (max single stream 720p@30fps)

M6 (don't know if it's used by anyone ;)): 1 GPU x 18 H.264 streams (max
single stream 1080p@30fps)

M60 (Azure M60): 1 GPU x 18 H.264 streams (max single stream 1080p@30fps)

You can sacrifice encoder bandwidth (max streams) for higher resolutions or
frame rates.
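That trade-off can be approximated by treating the encoder as a fixed pixels-per-second budget (a simplification I'm assuming here; the real limits are enforced by the driver and vary by codec settings):

```python
def max_streams(rated_streams, rated_w, rated_h, rated_fps, w, h, fps):
    """Approximate encoder capacity as a fixed pixel-rate budget.

    rated_*: the spec-sheet rating (e.g. 6 streams of 720p@30fps).
    w, h, fps: the target stream you actually want to encode.
    """
    budget = rated_streams * rated_w * rated_h * rated_fps
    return budget // (w * h * fps)

# A Kepler part rated for 6x 720p30 would fit about this many 1080p30 streams:
print(max_streams(6, 1280, 720, 30, 1920, 1080, 30))  # -> 2
```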

~~~
zip1234
Is GPU better than CPU for encoding? I thought the performance ended up
similar. Also, are you saying the K1 on AWS can re-encode 6 streams at once?

~~~
dogma1138
The performance of the GPU is orders of magnitude better; you can easily do
multiple 1080p streams in real time at high bitrates.

The quality, on the other hand, even at the same bitrate, is considerably
worse than highly optimized CPU encoders, as the GPU takes a lot of shortcuts
along the way.
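For anyone who wants to compare the two paths themselves, here is a sketch of the relevant ffmpeg invocations (`h264_nvenc` and `libx264` are the standard ffmpeg encoder names; the file names and rate-control values are placeholders, and available presets vary by ffmpeg build):

```python
# Build (but don't run) ffmpeg command lines for GPU vs CPU H.264 encoding.
gpu_cmd = [
    "ffmpeg", "-i", "in.mp4",
    "-c:v", "h264_nvenc",   # NVENC hardware encoder: very fast
    "-b:v", "4M",           # fixed target bitrate for a fair comparison
    "gpu_out.mp4",
]
cpu_cmd = [
    "ffmpeg", "-i", "in.mp4",
    "-c:v", "libx264",      # software encoder: slower, better quality per bit
    "-preset", "slow",
    "-b:v", "4M",
    "cpu_out.mp4",
]
print(" ".join(gpu_cmd))
print(" ".join(cpu_cmd))
```

Encoding the same clip both ways at the same bitrate and eyeballing (or VMAF-scoring) the outputs makes the quality gap easy to see.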

------
gwern
/r/machinelearning discussion:
[https://www.reddit.com/r/MachineLearning/comments/4wd5kn/ms_...](https://www.reddit.com/r/MachineLearning/comments/4wd5kn/ms_expands_cloud_gpu_offerings_k80s_at_056hour/)

------
LogicFailsMe
AWS: 4 year-old GPUs

Azure: 2 year-old GPUs

Wouldn't it be awesome if a cloud provider could keep up with the GPU roadmap
and ship something within a year or so of its release?

~~~
perryh2
The GPUs that hosting providers are buying need to support vGPU features,
which are only supported by the higher-end workstation and server products.

~~~
LogicFailsMe
Which is a software/binning issue (probably 100% software), not inherently HW,
because M60==GTX 980==GM204 and M40==GTX Titan X==Quadro M6000==GM200.

Interestingly, for the first time ever, GP100 is unique (well, OK, K80 too but
K80 was too late). And since Quadro P6000==Titan XP==GP102, it's probably just
a software block here as well.

Also, for the first time ever, the high-end Quadro will be the best FP32 GPU
of its generation. That's the most interesting part for me.

~~~
dogma1138
It is a software/BIOS block. Once the driver grabs a desktop card and it's
initialized, it's locked. The cards also come with a UEFI BIOS, which means
that if your host initialized the card during boot it's also locked.

To overcome this you need to disable UEFI boot/GPU boot in the BIOS and
blacklist the card in the host OS and then create a PCI-stub device that will
be used for the passthrough.
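As a rough sketch of those steps on a Linux KVM host (the `10de:xxxx` vendor:device ID is a placeholder for the actual card, and the file paths are illustrative):

```
# Kernel command line (e.g. in /etc/default/grub):
# bind the GPU to pci-stub at boot before any real driver can claim it
pci-stub.ids=10de:xxxx

# /etc/modprobe.d/blacklist-gpu.conf:
# keep the host's GPU drivers off the card entirely
blacklist nouveau
blacklist nvidia
```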

This is an utter and complete hack, and you can't really use it for
production-grade implementations.

I'm not sure that Quadro actually supports vGPU; AFAIK only Tesla and GRID
parts do. Tesla has some additional in-GPU support for virtualization; I
don't know how GRID handles it. GRID parts are less for compute and more for
game/video streaming, so they aren't fully virtualized: AFAIK they do not
support P2P GPU communication or shared memory access, they are just more
hypervisor-friendly so they can be thin-provisioned.

~~~
LogicFailsMe
Bare metal grid GPUs do support P2P. Xen breaks that on AWS.

------
dharma1
I signed up for the beta but haven't heard anything back yet. Anyone else?

~~~
gwern
I haven't either.

------
Athas
This is interesting. Is there some way for academics to get discounted access
for experiments (not for running heavy simulations, but for testing whether
our stuff works)?

~~~
dogma1138
Microsoft has granted "free" subscriptions to professors and researchers
before, so it can't hurt to ask :P
[https://azure.microsoft.com/en-gb/community/education/](https://azure.microsoft.com/en-gb/community/education/)

------
kim0
Does anyone know of an access protocol (like PCoIP) that lets multiple users
have concurrent 3D-accelerated sessions on one VM? Each at just 720p.

------
Dolores12
Any idea if it's worth running a bitcoin miner on these like it was a few
years back?

~~~
akiselev
No, profitable bitcoin mining is now only possible using purpose-built ASICs,
which are far more power-efficient per hash.

