
Rapids – Open GPU Data Science
http://rapids.ai/
======
gok
"Scaling Out on Any GPU"

Love the implication that "any GPU" means it works on expensive Nvidia chips
and also slightly less expensive Nvidia chips.

------
deepGem
I wonder how much of this is driven by market need. PyArrow + pandas is
already quite fast.

[http://wesmckinney.com/blog/high-perf-arrow-to-pandas/](http://wesmckinney.com/blog/high-perf-arrow-to-pandas/)

Also, pandas 2.0 is going to roll in a lot more utilities for parallel
computing. Is there really a need for 50-100x speedups today?

~~~
ah-
I often run into situations where I wish pandas were 50-100x faster.

Dask can help, but introduces quite a bit of additional complexity.

I'm also looking forward to stricter data models than what pandas currently
uses, in particular proper null support for all dtypes and less random type
conversion.
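A minimal sketch of what that stricter model looks like, using pandas' nullable extension dtypes (assuming a recent pandas that ships the "Int64" dtype):

```python
import pandas as pd

# Nullable integer dtype: the missing value is kept as a proper NA,
# and the column stays integer-typed.
s = pd.Series([1, 2, None], dtype="Int64")
print(s.dtype)  # Int64

# Default behavior: None silently upcasts the column to float64 --
# the kind of surprise type conversion complained about above.
t = pd.Series([1, 2, None])
print(t.dtype)  # float64
```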

------
minimaxir
Is this from NVIDIA itself? The only indication that's the case is from a line
in the blog post.

~~~
lmeyerov
Yep! Companies in the GPU Open Analytics Initiative like Anaconda (Dask,
Conda), MapD, Ursa Labs (Arrow, Pandas), BlazingDB, and ourselves (Graphistry!) are
making a full ecosystem of GPU analytics, similar to what happened with Hadoop
for the last 10-15 years. There is little point for all of us to keep
reimplementing basic things like in-GPU database scans (PyGDF), and we want
fast interop for ecosystem composition / network effects (GPU take on Arrow).
Members are contributing pieces we believe are / should be commodity for the
GPU wave to get here faster and bigger.

The RAPIDS team has been really stepping up here -- PyGDF, and various
bindings & IO helpers -- and collaborating with many of us to get them right.
Another member had intended their platform to serve as the open compute core,
but it was insufficiently open (multi-GPU kept proprietary?) and saw little
uptake, so Nvidia stepped up with RAPIDS. The result is a more neutral
solution, already with demonstrated framework dev uptake. And hopefully, more GPU
compute everywhere, faster :)

~~~
dman
Is there a mailing list / forum that I can join to come up to speed with the
efforts?

~~~
lmeyerov
Many of us are in the GoAi Slack, tho more for syncing. Most of it is per-
project work -- our JS work for Arrow/GoAi is in the Graphistry Slack, PyGDF
is in the GoAi one, GPU Arrow is in Arrow mailing list or GoAi Slack, MapD is
in MapD, etc.

2018 has really been internally focused for pulling GPU islands into a GPU
mountain. Expecting 2019 to be way more externally focused. Each milestone
like this gets us closer :)

~~~
tmostak
And fwiw MapD is now OmniSci.

------
breck
Looks interesting. The performance chart seems a bit misleading, as you can
get a good 100-core CPU setup for $20k but a DGX-2 costs 20 times that at
$400k.

~~~
twtw
1 node != 1 core.

~~~
breck
Whoops! I read core and not node. That makes more sense :). Thanks.

------
brian_herman
I don't understand: it uses CUDA as a cornerstone? How is that open?

~~~
antoineMoPa
It makes me sad that CUDA is more popular than OpenCL. I would be willing to
sacrifice some performance for portability, openness and avoiding vendor lock-
in.

~~~
twtw
From
[https://news.ycombinator.com/item?id=18197988](https://news.ycombinator.com/item?id=18197988):

> MIOpen[1] is a step in this direction but still causes the VEGA 64 + MIOpen
> to be 60% of the performance of a 1080 Ti + CuDNN based on benchmarks we've
> conducted internally at Lambda. Let that soak in for a second: the VEGA 64
> (15TFLOPS theoretical peak) is 0.6x of a 1080 Ti (11.3TFLOPS theoretical
> peak). MIOpen is very far behind CuDNN.

That performance penalty is a bit too steep for me. Vega64 should be 1.3x perf
compared to 1080ti, but instead it is 0.6x. 50% lower performance is quite a
sacrifice.

~~~
antoineMoPa
But did you ever care about openness, vendor lock-in, and portability? Most
people in the world can't afford $600 Nvidia GPUs.

~~~
twtw
> Most people in the world can't afford 600$ Nvidia GPUs.

Ok? I'm not sure what your point is here - most people can't afford a $600 AMD
GPU either. If you can't get a $600 GPU, buy a cheaper GPU - a GTX 1060 is
$250 or a GTX 960 is $50.

Performance comes at a premium, and that is just how it is. It would be great
if we lived in a world where you could buy a Titan V for $1, but in the real
world valuable things cost more money, and that unfortunately means not
everyone can buy them.

