
DMTK: Distributed Machine Learning Toolkit from Microsoft - mrry
http://www.dmtk.io/
======
arbre
It is very exciting to see these distributed deep learning frameworks open
sourced by top companies. What I don't understand is why Amazon AWS and the
other top cloud services aren't integrating these frameworks into their
offerings. Training a distributed neural net should be as simple as defining
the model and specifying the resources.

~~~
vegabook
agreed - though I'm somewhat saddened to see that this library doesn't seem to
be getting the same positive reception (in terms of scoring / comments) as
Tensorflow did yesterday. Am I right and if so, is there a technical reason
for that?

~~~
blazespin
NO GPU, not for production use.

~~~
nightski
You do realize there are a fair number of machine learning algorithms that do
not run efficiently on the GPU, right? Deep learning isn't the only method out
there...

~~~
blazespin
Fair enough! None that I'm interested in, however.

~~~
gcr
Come on, keep an open mind! Random forests still work great!

There's still some love for genetic algorithms _somewhere_, right?

...Right ??

~~~
irascible
Someone make me a painting of a shriveled genetic algorithm dying in a random
forest..

------
jhartmann
This is very interesting. I was very surprised that Tensorflow did not use a
central parameter server, and this looks like a good foundational parameter
server from Microsoft on which to build something like DistBelief. I wonder
if they have a Deep Learning module that they will open source eventually.
Microsoft has a few tricks they used in some of their recent papers, and it
would be interesting to see them in a production-quality system.
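The parameter-server pattern itself is simple enough to sketch in a few
lines. This is only an illustration of the general idea (the class and method
names here are made up, not DMTK's actual API): workers pull the current
weights, compute gradients locally, and push updates back to one central
store.

```python
import numpy as np

class ParameterServer:
    """Toy single-process sketch of the parameter-server pattern
    (hypothetical names, not DMTK's API): workers pull a snapshot of
    the weights, compute gradients locally, and push updates back."""

    def __init__(self, dim, lr=0.1):
        self.weights = np.zeros(dim)
        self.lr = lr

    def pull(self):
        # Workers fetch a snapshot of the current parameters.
        return self.weights.copy()

    def push(self, grad):
        # Server applies a (possibly stale) gradient from a worker.
        self.weights -= self.lr * grad

# Two simulated "workers" pushing gradients toward a target of all ones.
server = ParameterServer(dim=3)
target = np.ones(3)
for _step in range(100):
    for _worker in range(2):
        w = server.pull()
        grad = w - target          # gradient of 0.5 * ||w - target||^2
        server.push(grad)

print(np.round(server.weights, 3))
```

In a real distributed setting the pull/push calls would be network RPCs and
the staleness of pulled weights becomes the interesting design question.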

~~~
amaks
Here Jeff Dean explains why there is no central parameter server in
Tensorflow:
[https://youtu.be/90-S1M7Ny_o?t=28m59s](https://youtu.be/90-S1M7Ny_o?t=28m59s)

------
fitzwatermellow
I was doing some research for a small demo: implementing a small ConvNet
using WebGL shaders as a proof of concept. In the course of my search I
stumbled on this very broad and interesting patent granted to Microsoft Corp.

Processing machine learning techniques using a graphics processing unit

[http://www.google.com/patents/US7548892](http://www.google.com/patents/US7548892)

Pretty amazing when you consider they wrote the application a decade ago!
There is no arguing we are experiencing a golden age and an embarrassment of
riches provided by ML toolkits and GPU cloud capabilities. But further
inspection of patents in the space brought up this gem, in which Emotient is
attempting to patent the crowdsourcing of training data...

Collection of machine learning training data for expression recognition

[https://www.google.com/patents/US20150186712](https://www.google.com/patents/US20150186712)

------
azinman2
That they're providing a distributed LDA implementation with an O(1) Gibbs
sampler is a big deal. I haven't played with it yet, but the numbers they are
reporting relative to cluster size are an orders-of-magnitude improvement.
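For anyone curious how a Gibbs step can be O(1) at all: the usual trick
(used in LightLDA-style samplers) is Walker's alias method, which spends
O(n) preprocessing on a discrete distribution so that each subsequent draw
is constant time. A minimal sketch, not taken from DMTK's code:

```python
import random

def build_alias(probs):
    """Walker's alias method: O(n) setup so each draw from a discrete
    distribution costs O(1) -- the core trick behind amortized-O(1)
    Gibbs samplers for LDA (e.g. LightLDA-style samplers)."""
    n = len(probs)
    scaled = [p * n for p in probs]
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    prob, alias = [0.0] * n, [0] * n
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]          # keep s's own mass in its bucket
        alias[s] = l                 # fill the rest of the bucket from l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for leftovers in (small, large):
        for i in leftovers:
            prob[i] = 1.0            # numerical leftovers: full buckets
    return prob, alias

def draw(prob, alias):
    # O(1): pick a bucket uniformly, then flip one biased coin.
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]

# Usage: sample topic indices from a skewed topic distribution.
random.seed(0)
prob, alias = build_alias([0.5, 0.3, 0.15, 0.05])
counts = [0] * 4
for _ in range(100_000):
    counts[draw(prob, alias)] += 1
```

The empirical frequencies come out close to the input distribution; the
remaining cleverness in the published samplers is keeping the alias tables
cheap to rebuild as topic counts change.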

------
math_and_stuff
Wow, even more surprising than this being open source from Microsoft: it is
based on MPI.

~~~
vegabook
and it looks like Linux is a first-class citizen.

------
blazespin
Where's the GPU support?

~~~
brazzledazzle
Sitting in your future pull request.

