

GoLearn – Machine Learning in Go - alixaxel
http://www.sjwhitworth.com/machine-learning-in-go-using-golearn/

======
timtadh
I have found that the most difficult thing about doing machine learning in Go is
the lack of a really good matrix library. I always just end up using Python
because I end up wanting to do something like "find eigenvectors." Easy in
Python but no one has yet wrapped up a nice interface into the BLAS/LAPACK
libraries to do this from Go. (That I know of! If you know a good library for
this let me know!!!!)

Numpy and Scipy are so mature in this respect it is difficult to compete with
them. I looked into implementing eigenvalue algorithms recently with the idea
that I would just write a native Go library for doing this kinda stuff.
However, reading the source of JAMA[1] was sufficiently humbling for me to
realize this was not a good idea. (If you really want to be humbled try
reading the Fortran implementations in LAPACK.[2] I believe SRC/dgeesx.f is a
good starting point.)

[1]
[http://math.nist.gov/javanumerics/jama/](http://math.nist.gov/javanumerics/jama/)
[2]
[http://www.netlib.org/lapack/#_software](http://www.netlib.org/lapack/#_software)

~~~
Osmium
> Numpy and Scipy are so mature in this respect it is difficult to compete
> with them.

Tell me about it. What I'd give to have something equivalent for Objective-C
(or certain other languages too, e.g. Julia). I'm looking at PyObjC as a stop-
gap solution for now, but it sure adds complexity to a project.

Edit: Since SciPy is BSD-licensed and presumably mostly C behind-the-scenes,
perhaps there's potential for a group to try and package it up for other
languages? I have no idea how large an undertaking like that would be...

~~~
timClicks
At least with Julia's PyCall, you don't need to sacrifice access to the
Python stack. You can work with NumPy arrays without needing to copy data
around.

~~~
Osmium
Thanks! I didn't realise this. This looks really useful.

------
willis77
Considering this is four months old and the only method implemented is still
just knn, I think it's disingenuous to (a) call this a library and (b) write
in such general terms (admittedly, hindsight is 20/20). I don't mean to
detract from the premise behind starting the project, but language like "I
couldn't find any comprehensive ML library for Go, so I decided to write one"
has a bit more hubris than is warranted.

~~~
gamegoblin
Indeed. Also, the KNN implementation is rather lightweight. It only has
Euclidean distance and a basic matrix data structure.

Even just KNN needs several distance metrics built in (Manhattan, Hamming,
Mahalanobis, to name a few) and a good ball tree implementation for use on
large datasets so that the search time per query goes from O(N) to O(log N).
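
As a rough sketch of what pluggable metrics could look like in Go (this is not GoLearn's API; the `Distance` type and every function name below are hypothetical), three of the metrics can share one function signature and be passed to a brute-force nearest-neighbour search. Mahalanobis is left out because it needs a covariance matrix inverse, and the search below is the O(N) scan that a ball tree would replace.

```go
package main

import (
	"fmt"
	"math"
)

// Distance is a hypothetical signature for a pluggable metric.
// All metrics below assume a and b have the same length.
type Distance func(a, b []float64) float64

// Euclidean returns the straight-line (L2) distance.
func Euclidean(a, b []float64) float64 {
	var s float64
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return math.Sqrt(s)
}

// Manhattan returns the sum of absolute coordinate differences (L1).
func Manhattan(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += math.Abs(a[i] - b[i])
	}
	return s
}

// Hamming returns the number of positions at which a and b differ.
func Hamming(a, b []float64) float64 {
	var n float64
	for i := range a {
		if a[i] != b[i] {
			n++
		}
	}
	return n
}

// nearest returns the index of the point closest to q under dist,
// using a brute-force linear scan. It returns -1 for an empty set.
func nearest(points [][]float64, q []float64, dist Distance) int {
	best, bestD := -1, math.Inf(1)
	for i, p := range points {
		if d := dist(p, q); d < bestD {
			best, bestD = i, d
		}
	}
	return best
}

func main() {
	points := [][]float64{{0, 0}, {3, 4}, {1, 1}}
	q := []float64{2, 2}
	metrics := []struct {
		name string
		dist Distance
	}{
		{"euclidean", Euclidean},
		{"manhattan", Manhattan},
		{"hamming", Hamming},
	}
	for _, m := range metrics {
		fmt.Println(m.name, "nearest index:", nearest(points, q, m.dist))
	}
}
```

Keeping the metric as a function value means the search code does not change when a new metric is added.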

------
cnbuff410
Cool project.

There are already a couple of Machine Learning libraries[1][2] written in Go
and some of them are actually more mature than GoLearn.

Also just curious, I always thought Go was not really a good language for DM/ML
stuff due to the lack of a good matrix library and generics. If anyone here
has actually tried to write an ML library in Go, what's your genuine feeling
about it?

[1] [https://github.com/huichen/mlf](https://github.com/huichen/mlf)
[2] [https://github.com/xlvector/hector](https://github.com/xlvector/hector)

~~~
ajtulloch
I wrote a decision tree library (random forests, gradient boosting, etc) in Go
while learning the language
([https://github.com/ajtulloch/decisiontrees](https://github.com/ajtulloch/decisiontrees)).

It's nice being able to trivially parallelise operations in Go - e.g.
constructing the weak learners for a random forest, generating candidate
splits, recursing down left and right branches, etc.

    
    
        // Recur down the left and right branches in parallel
        w := sync.WaitGroup{}
        recur := func(child **pb.TreeNode, e Examples) {
            w.Add(1)
            go func() {
                *child = c.generateTree(e, currentLevel+1)
                w.Done()
            }()
        }
    
        recur(&tree.Left, examples[bestSplit.index:])
        recur(&tree.Right, examples[:bestSplit.index])
        w.Wait()
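
The snippet above is an excerpt from inside a larger function. A self-contained sketch of the same pattern might look like the following, where `node`, `build`, and `leaves` are hypothetical stand-ins (not the decisiontrees library's types) and the "split" is simply the midpoint of the slice.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a hypothetical stand-in for a tree node type.
type node struct {
	Left, Right *node
	Size        int // number of examples that reached this node
}

// build splits examples in half until maxLevel is reached, constructing
// the two children concurrently and waiting for both before returning.
func build(examples []int, level, maxLevel int) *node {
	n := &node{Size: len(examples)}
	if level >= maxLevel || len(examples) < 2 {
		return n
	}
	mid := len(examples) / 2

	var wg sync.WaitGroup
	recur := func(child **node, e []int) {
		wg.Add(1) // register before the goroutine starts
		go func() {
			defer wg.Done()
			*child = build(e, level+1, maxLevel)
		}()
	}
	recur(&n.Left, examples[:mid])
	recur(&n.Right, examples[mid:])
	wg.Wait()
	return n
}

// leaves counts the leaf nodes of the tree rooted at n.
func leaves(n *node) int {
	if n == nil {
		return 0
	}
	if n.Left == nil && n.Right == nil {
		return 1
	}
	return leaves(n.Left) + leaves(n.Right)
}

func main() {
	tree := build([]int{1, 2, 3, 4, 5, 6, 7, 8}, 0, 3)
	fmt.Println("leaves:", leaves(tree))
}
```

Each goroutine writes to a different pointer, so no lock is needed. This sketch spawns a goroutine per branch without any limit; a real implementation would probably cap the parallelism at some depth.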
    

As you said, generics and a matrix library would make the experience nicer.
Just having

    
    
        sort :: Ord a => [a] -> [a]
    

would strip a decent amount of mildly error-prone boilerplate, and there are
other cases (splits for cross-validation, etc) where it would be nice to be
able to abstract over the type of the slice, etc.
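
For readers who have not hit this, the boilerplate in question is presumably the `sort.Interface` pattern: every slice type that needs sorting gets its own three-method wrapper. A minimal sketch, with a made-up `Example` type and `byFeature` wrapper:

```go
package main

import (
	"fmt"
	"sort"
)

// Example is a hypothetical training example with a single feature.
type Example struct {
	Feature float64
	Label   int
}

// byFeature adapts []Example to sort.Interface. A wrapper like this has
// to be repeated for every element type and every sort key.
type byFeature []Example

func (s byFeature) Len() int           { return len(s) }
func (s byFeature) Less(i, j int) bool { return s[i].Feature < s[j].Feature }
func (s byFeature) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// sortByFeature sorts es in place by ascending Feature.
func sortByFeature(es []Example) {
	sort.Sort(byFeature(es))
}

func main() {
	es := []Example{{2.5, 1}, {0.5, 0}, {1.5, 1}}
	sortByFeature(es)
	fmt.Println(es)
}
```

The three methods are trivial, but a typo in `Less` or `Swap` still compiles and quietly produces a wrong ordering, which is the mildly error-prone part.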

------
sjwhitworth
Hi, author here. Thanks for posting it. As you can see, it's definitely
incomplete as it stands, as I haven't been able to spend as much time on it as
I would have liked. I'm hoping that will change. If anyone fancies working
more formally on it with me, send me a mail at stephen dot whitworth at
hailocab dot com.

------
JPKab
This is cool. Just wondering, how would speed compare to Julia?

~~~
c0g
Poorly. Julia calls well-tested and really carefully programmed BLAS
libraries for the numerical heavy lifting; this doesn't. It is likely to be
less accurate, less stable, and slower just for that reason.

The comparison doesn't really work anyway. A better one would be Julia vs. Go,
or some ML toolkit in Julia vs. this.

------
AYBABTME
This lib doesn't seem to be ready. A quick look at the code doesn't suggest
any attention was given to it besides throwing it on GitHub. I don't
understand this submission. There is nothing to be used here.

