GNU Gneural Network (gnu.org)
315 points by jjuhl on Mar 13, 2016 | 107 comments

I agree with the general motivation that having too much AI research in the hands of software companies who keep it proprietary harms transparency and progress. But there is already a lot of neural-network free software, so why another package? For example, these widely used packages are free software, and seemingly more featureful: http://torch.ch/, http://www.deeplearning.net/software/theano/, http://pybrain.org/, https://www.tensorflow.org/, http://leenissen.dk/fann/wp/, http://chainer.org/

> I agree with the general motivation that having too much AI research in the hands of software companies who keep it proprietary harms transparency and progress. But there is already a lot of neural-network free software, so why another package?

Not only is there a lot out there, a lot of it was released by companies like IBM[1], Google[2], Yahoo[3], Baidu[4], Microsoft[5], etc. So while I'm generally sympathetic to the FSF's position, this case almost seems like a bit of a reversal of things: there doesn't seem to be a problem with for-profit companies taking the fruits of the labors of volunteers and building on top of it... instead, we have a surplus of riches, released as Open Source by a bunch of big companies. It just happens that most of it is under a permissive license like the ALv2.

Of course, one could suggest that that state of affairs isn't natural and/or sustainable, and that this doesn't negate the issues the FSF is dedicated to. So I support this effort, even if it seems redundant on some level at the moment.

[1]: http://systemml.incubator.apache.org

[2]: http://tensorflow.org

[3]: http://yahoohadoop.tumblr.com/post/139916563586/caffeonspark...

[4]: https://github.com/baidu-research/warp-ctc

[5]: https://github.com/Microsoft/CNTK

Yeah, they've really missed the fact that it isn't the algorithms or code that we're missing out on. Companies are usually pretty open about these because they know that isn't the bit that is hard to compete on.

The hard bit is the training data. Good luck collecting 10k hours of transcribed speech, or 10k recordings of "Okay Google".

+this. Freely available data is a huge value to everyone. We use it in academia, and it's useful in companies big and small. (Even at Google -- I started exploring some of my research questions using MNIST and Imagenet because they're baselines that allow reproducibility, and because you don't have to deal with the privacy issues. For amusing anecdotes about this, consider what the Smart Reply team had to do: http://googleresearch.blogspot.com/2015/11/computer-respond-... It's much harder to train a network when you can't ever look at the training data!)

The best thing the assorted communities involved in this effort could do to accelerate the advance of open, accessible machine learning is to create good creative-commons datasets that anyone could use to train models that could be released open source. And as an academic, let me say that hundreds of researchers would figuratively kiss the ground you walk on for doing so. :)

>Another bizarre feature of our early prototype was its propensity to respond with “I love you” to seemingly anything. As adorable as this sounds, it wasn’t really what we were hoping for.

> Good luck collecting 10k hours of transcribed speech

I'm sure that nearly every DVD theatrical release has subtitles available. Speech against a wide range of background noise too, e.g. music, explosions, traffic, normal ambient noise, etc.

Seems a good start for acquiring a large corpus of labelled speech.

Aside from the potential problem with regards to copyright, it should also be noted that subtitles in general are not transcripts of dialogue. The subtitlers often have to shorten down sentences of speech so that viewers have time to read before the next couple of subtitles appear on screen.

There shouldn't be any issues with copyright, as long as you aren't redistributing the original work. Otherwise all neural networks would be illegal, since most training data is copyrighted.

As for errors in the subtitles, that's still good enough. As long as the machine learning model can deal with uncertainty, it would just not learn from those examples and learn from the ones that are correct. It might even learn to abbreviate sentences itself!

Models trained on DVD audio are considered derived works. You certainly couldn't release such a model under the GPL.

You also have to solve the (very difficult) subtitle alignment problem before you could begin training.

I'm a neural net trained substantially on copyrighted books, music, tv, and movies. Does that mean I'm a derived work, and consequently all works I create are derived works as well?

I'm not saying you're wrong, necessarily. Since copyright is so vague as to allow that interpretation, that shows how much copyright is incoherent, contradictory, broken, and, ultimately, nonsense.

What about the countless audiobooks available on archive.org [1]? Sure, you may be limited to just books in the public domain, but that's still plenty of books.

[1]: https://archive.org/details/audio_bookspoetry


It's not like you could take the neural net weights aggregated from thousands of movies and retrieve any form of entertainment from them. Is a derived work anything at all based on an original, or just something in the similar field, ie entertainment->entertainment?

My own personal definition is whether the derivative work could survive if the first work did not exist, not for which purpose it was intended to be consumed. Not sure about the legal definition.

Legally, there is a huge gradient between length(work), sha(work), train(transcription(work)), transcription(work), thumbnail(work), etc. Your personal definition of "derived" sounds a lot like the mathematical definition, which isn't amazingly useful in a copyright context.

> Not sure about the legal definition.

Perhaps stating "You certainly couldn't release such a model under the GPL." so surely isn't a great idea?

That actually depends on if the audio is under copyright.

And even so, that is no reason why we as open source collaborators cannot create a million or billion or so samples of "Hello" in foo language as training data as a corpus for all to use.

> Models trained on DVD audio are considered derived works

[citation needed]

There are tons of freely available models based on copyrighted works. Are you sure this is true?

But in order to use the movies for training you would need to buy thousands and thousands of DVDs.

That's the biggest by far and still only 1k hours.

But yes, that is what we need more of - not matrix libraries.

Huh, I wonder how much truth there is to your words. As an outsider to Machine learning and neural networks, it seems to me that algorithms can be very valuable and big companies do not lack training data. Of course, training data is expensive and very important, but if training data were that important, their most important resource would not be machine learning scientists, but an army of do-monkeys that provide training data. It won't be the victory of the smartest scientist but of the one with the most employees. And that does not seem to be the case.

It absolutely is the case. Why do you think Google has been providing such services as Google Voice, ReCaptcha, Street View, heck... Maps... and basically everything that's awesome and free? (other than the stuff that puts advertisements in your peripheral vision)

Andrew Ng told a story about one of his first robots that was supposed to roam around the lab and collect coffee cups to deposit in the sink. He ran out of varieties of coffee cups to train the robot's vision well before the robot learned how to detect a coffee cup.

The key to being a successful AI company is to figure out how to get the world to send you data.

Edit: That or figure out how real brains work and how to scale them. Which is probably almost, but not quite, entirely unlike a convolutional neural net.

Edit2: This also leads us to a twist of the dogmatic refrain: If you're not the customer, you're the employee.

It's a mix between who has the most data, who can pose the problem best, who can wield the largest computers, and who can handle the most complex algorithms.

> The hard bit is the training data.

That, plus acquiring a team skilled enough to make good use of the code.

Almost all of the open source software in the area is permissive-licensed, and relies on non-free components (CUDA).

To be honest, I'm not sure how Gneural plans to compete with those packages without support from CUDA or cuDNN, all of which are distinctly not open source.

FANN is GNU licensed (LGPL 2.1), doesn't rely on non-free software, and is written in C, so it's the same as Gneural in those regards. But it is also way more mature, has more features, compiles and runs on Linux, Windows and MacOS, and has bindings to 28 other languages.

Gneural will compete with Theano sort of like how the GNU Hurd competes with Linux

I know you're just trying to be funny, but I don't think it's funny at all.

The Linux kernel undoubtedly has many features that the Hurd system lacks, but that is due to the severe lack of manpower of the latter system and the billions of dollars being poured into the former.

On the other hand the Hurd has features that the Linux kernel can never hope to achieve because of its architecture.

> The Linux kernel undoubtedly has many features that the Hurd system lacks, but that is due to the severe lack of manpower of the latter system and the billions of dollars being poured into the former.

That's why GNU Hurd is essentially a dead project. Sadly it never attracted the attention and manpower necessary for it to survive.

> On the other hand the Hurd has features that the Linux kernel can never hope to achieve because of its architecture.

For example?

Fault isolation. We're doing it for daemons, we're doing it for web browsers, it is insane we're not doing it for operating system services. I bought a graphic tablet and the first time I plugged it into my laptop the Linux kernel crashed. And this was merely a faulty driver, not even malicious hardware.

Also think of the effort it took to introduce namespaces to all the Linux subsystems. After a decade the user namespace still has problems. This is ridiculously easy on a distributed system, yet very hard on a monolithic one.

I am not trying to be funny. I am dead serious. Aeolos explained it perfectly.

In other words, not at all

That's not necessarily relevant though. I'm sure the FSF would love to see Free Software replace all proprietary software, but in the end, the real point is that Free Software options are available to the people who want them. This isn't like a battle between commercial entities where market share is king and a project will be dropped if it isn't profitable. Gneural will be a success if a community forms around it and people work on it and use it, however small that community might be.

The problem I have with it is that they could be contributing their brain power and time to other open source projects instead of reinventing the wheel for very little benefit. Take my opinion with a grain of salt, as I consider the more restrictive Copyleft licensing a net /loss/ for society.

> To be honest, I'm not sure how Gneural plans to compete with those packages without support from CUDA or cuDNN, all of which are distinctly not open source.

I don't see the point either. Gneural will probably never be better than Theano, Torch, Tensorflow, Caffe, et al., which are already open. If anything, time/resources are much better invested in contributing to a polished/competitive OpenCL backend to one of these packages.

Caffe has an OpenCL backend - https://github.com/BVLC/caffe/tree/opencl

I'd really like to understand the reasons behind the focus on CUDA and not OpenCL. My understanding is that nVidia and AMD made sure their hardware and software would make the GPU accessible for non-graphics tasks, but AMD's version is not functionally or legally locked to their hardware. Why hasn't OpenCL taken off and run on nVidia hardware?

It seems like there must be more at play, but I'll admit a lack of insight and imagination on this one.

> It seems like there must be more at play, but I'll admit a lack of insight and imagination on this one.

I think the reasons are twofold: 1. CUDA had a big headstart over OpenCL. 2. NVIDIA has invested a lot in great libraries for scientific computing. E.g. for neural nets, they have made a library of primitives on top of CUDA for neural nets (cuDNN), which has been adopted by all the major packages.

Performance. OpenCL has been 2-5x slower for ML than CUDA. Not sure of the exact reason, but I think it's the highly optimised kernels, which are not there with OpenCL but are with cuDNN. I think it's mostly a software issue; compute capacity in theory should be more or less the same with equivalent AMD/NVidia cards.

AMD should have invested much more heavily into ML, if they had, their share price would probably look a bit better than it does now.

This looks interesting - running CUDA on any GPU. http://venturebeat.com/2016/03/09/otoy-breakthrough-lets-gam...

I recall hearing that CUDA has much more mature tooling. Not only the already mentioned cuDNN, but the CUDA Toolkit [0] seems like a really comprehensive set of tools and libraries to help you with pretty much anything you might want to compute on a GPU.

Also somewhat related: AMD seems to be moving towards supporting CUDA on its GPUs in the future: http://www.amd.com/en-us/press-releases/Pages/boltzmann-init...

[0] https://developer.nvidia.com/cuda-toolkit

On closer inspection, it looks like AMD's CUDA support consists of "run these tools over your code and it will translate it so your code does not depend on CUDA"...

It's sort of supporting CUDA, just like a car ferry sort of lets your car 'drive' across a large body of water.

Because it requires nVidia's cooperation in implementing OpenCL. And of course they are not about to do so in a useful manner when they are leading with CUDA.

Also, the premise of OpenCL is somewhat faulty. You end up optimizing for particular architectures regardless.

> and relies on non-free components (CUDA).

Yeah, this is one reason I'm really hoping some of the stuff AMD is pushing, in regards to openness around GPUs, gains traction. And why I am hoping OpenCL continues to improve so that it can be a viable option. Being dependent on nVidia for all time would blow.

The use of the GPLv3 allows Gneural to have, as a dependency, any of the Apache or permissive licensed tools like TF, Torch, etc., and then through those tools 'export' their dependence on non-free components from nVidia and others.

I don't think this is wrong, per se, but it is ...funny when the FSF portrays their work as morally superior to us horrible corporate permissive license lovers, while inexorably depending on non-free components.

In an ideal world this project will be popular and will lead to someone on Gneural writing nVidia compatible drivers that will allow them to reject nVidia's, but I'm not optimistic. Not because of some incompetency on the Gneural team, but because of nVidia's long history of making life very difficult for open driver writers.

Does the FSF really depend on non-free components?

It's possible to run any of the "major" neural network toolkits (Caffe, Torch, Theano) on CPU-only systems. All of them are permissively licensed (to my knowledge).

It will be prohibitively difficult to train the model without some kind of hardware assistance (CUDA). This means that if we're building an ImageNet object detector, even if the code implements the model correctly the first time, training it to have close-to-state-of-the-art accuracy will take several consecutive months of CPU time. Torch has rudimentary support for OpenCL, but it isn't there yet. There are very good pre-trained models that are licensed under academic-only licenses that also help fill the gap. (This is about as permissively as it could be licensed because the ImageNet training data itself is under an academic-only license anyway.)

I'm not sure what niche this project fills. If you want an open-source neural network, you have several high-quality choices. If you need good models, you can either use any of the state-of-the-art academic only ones, or you would have to collect some dataset completely by yourself.

> This is about as permissively as it could be licensed because the ImageNet training data itself is under an academic-only license anyway.

Does this necessarily follow, that a machine-learning model is a derived work of all data it's trained on? As far as I know, the law in this area isn't really settled. And many companies are operating on the assumption that this isn't the case. It would lead to some absurd conclusions in some cases, for example if you trained a model to recognize company logos, you'd need permission of the logos' owners to distribute it.

(This is assuming traditional copyright law; under jurisdictions like the E.U. that recognize a separate "database right" it's another story.)

I'm not aware of the formal legality of it, but I don't see why it wouldn't be the case. Without the training data, the model can't work. That seems to fit the definition of "derivative work".

IANAL, but I looked at the definition of derivative work, and it seems really hard to apply to learning algorithms. But I'm going to disagree with you. I notice that US law mentions "preexisting material employed in the work". IMO a set of neural network weights contains no preexisting material at all. All the examples of derivative works include parts of previously copyrighted works directly.

I'd like to note that some publishers, like Elsevier, allow you access to their dataset (full texts of articles) under a license with the condition that you can not freely distribute models learnt from their data.

Wrong, most do support OpenCL or at least have partial support. It's just much less supported because not many people see much benefit in it. Btw, it's all open source. If you miss some functionality, it's really easy to add.

Yeah, you don't depend on CUDA/cuDNN, but of course you can use them if you want it to be fast

But the CPU fallback is there

It's going to need to use CUDA or it will not be competitive with alternatives. CUDA makes training networks more than an order of magnitude faster.

But that may or may not matter, depending on what you're doing. And how often you do it. If I have a network that I only retrain once a month, I can deal with it taking a day or two to train. Heck, it could take a week as far as that goes.

OTOH, it obviously matters a lot if you're constantly iterating and training multiple times a day or whatever.

The difference is between training taking a week, and training taking 10 weeks.

It takes a week to train a standard AlexNet model on 1 GPU on ImageNet (and this is pretty far from state of the art).

It takes 4 GPUs 2 weeks to train a marginally-below state of the art image classifier on ImageNet (http://torch.ch/blog/2016/02/04/resnets.html) - the 101 layer deep residual network. This would be 20 weeks on an ensemble of CPUs. (State of the art is 152 layers; I don't have the numbers but I'd guess-timate 3-4 weeks to train on 4 GPUs).
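To make the arithmetic in the figures above explicit, here is a rough sketch. The 10x GPU-over-CPU factor is only the ratio implied by this thread ("a week" versus "10 weeks"), not a measured benchmark, and the function name `cpu_weeks` is made up for illustration.

```python
def cpu_weeks(gpu_weeks, speedup=10.0):
    """Estimate CPU-only training time from GPU training time.

    `speedup` is an assumed GPU-over-CPU factor; 10x is just the
    ratio quoted in the comments above, not a measurement.
    """
    return gpu_weeks * speedup


# AlexNet on 1 GPU: about 1 week, per the comment above.
alexnet_cpu = cpu_weeks(1)

# 101-layer ResNet on 4 GPUs: about 2 weeks, per the comment above.
resnet101_cpu = cpu_weeks(2)

print(alexnet_cpu, resnet101_cpu)
```

Under that assumption the 2-week ResNet run becomes roughly 20 weeks, which is where the "20 weeks on an ensemble of CPUs" figure comes from. A different speedup factor scales the estimate linearly.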

For state of the art work "a day or two" is pretty fast for a production network, and that's on one or more big GPUs. Not using CUDA is definitely a dealbreaker for any kind of real deep learning beyond the mnist tutorials. It's common to leave a Titan X to run over a weekend; that would be weeks on a CPU.

Well, not using CUDA isn't necessarily synonymous with "use a CPU". There is OpenCL. But still, you have a point even if we might quibble over details. This is why I am very much hoping AMD gets serious about Machine Learning and hoping OpenCL on AMD chips will eventually reach a level of parity (or near parity) with the CUDA on nVidia stuff.

It's unlikely that AMD is going to be able to make serious inroads in the near future. nVidia has built quite a lead not just in terms of chips but tooling. I had thought a couple of years ago that AMD should be building a competitor to the Tesla. It should be able to build a more hybrid solution than nVidia can given its in-house CPU development talent. But I haven't seen them building anything like that and a competitor to nVidia may have to come from somewhere else. In the absence of a serious competitor OpenCL is not very interesting.

Yeah, and that's sad. I really hate to see this whole monoculture thing, especially since CUDA isn't OSS. :-(

It's really a hardware problem.

Why not focus on adding GPLed code to an existing package with a GPL-friendly license?

The FSF wants to be the copyright holder for all of its projects' code so that it's possible to relicense the codebase under a newer version of the GPL. For the same reason, everyone who contributes to its projects must sign a CLA.

I think the idea is to use awareness of GNU and also to focus the attention of people with the skill to contribute on neural-networks.

What I mean is, you're right of course, there's much better neural-network free software already available, but GNU endorsing an official package 1) could get people whose concerns are more strongly geared toward free software ethics to start paying attention to neural networks, and 2) get people, regardless of their ideological commitment to anything, to be more aware both of the ethical issues and of neural-network software--just in virtue of GNU's mild fame.

and of course, unless the maintainer of this package makes weird choices and alienates other projects, there's the other benefit of projects learning from each other and poaching code and strategies for the greater good.

Right. I don't think this is meant to be immediately competitive against other, more mature neural net implementations. It's just GNU getting some skin in the game and giving people another option to use/contribute to. I think this might be a fun project to submit some changes to, since there's probably plenty of easy problems left to solve with such a new implementation.

Do you know of anything besides Microsoft's GUI tool that lets someone simply specify the various attributes of the network they want, and it simply creates it, ready to be trained and tested and consuming data?

For image data, Caffe (http://caffe.berkeleyvision.org/) is dominant (most deep learning computer vision research I see is done with Caffe). However, that's the chief problem with Caffe - it's very difficult to extend because it revolves so much around "just specify attributes" and train.

Not only that, but GNU has deliberately obfuscated code (GCC) to prevent people from using it in ways GNU doesn't approve (but are allowed by license).

The implementations look odd. A network consists of a collection of neurons, which are implemented individually as structs. The forward pass through the network is a series of nested loops, and the gradient descent implementation doesn't use backpropagation - it uses finite differences to approximate derivatives, which is known to be inefficient. Given the overall design of the library, it isn't really clear what you would use it for in practice.

I hope that future versions take inspiration from other open source machine learning libraries, which show how to use linear algebra and backpropagation and are much more effective.
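To illustrate the cost difference described above, here is a minimal pure-Python sketch comparing a finite-difference gradient with a backpropagated one. It is not Gneural's code: the tiny one-hidden-layer tanh network, the flat parameter layout, and all function names are assumptions made up for this example.

```python
import math

N_IN, N_HIDDEN = 2, 3  # toy sizes, chosen only for illustration


def unpack(theta):
    """Split a flat parameter list into w1 (row-major), b1, w2, b2."""
    i = N_HIDDEN * N_IN
    w1 = [theta[r * N_IN:(r + 1) * N_IN] for r in range(N_HIDDEN)]
    b1 = theta[i:i + N_HIDDEN]
    w2 = theta[i + N_HIDDEN:i + 2 * N_HIDDEN]
    b2 = theta[i + 2 * N_HIDDEN]
    return w1, b1, w2, b2


def forward(theta, x):
    """Return hidden activations and the scalar network output."""
    w1, b1, w2, b2 = unpack(theta)
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    y = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, y


def loss(theta, x, target):
    """Squared error for a single training example."""
    _, y = forward(theta, x)
    return 0.5 * (y - target) ** 2


def finite_difference_grad(theta, x, target, eps=1e-6):
    """Central differences: two full forward passes per parameter."""
    grad = []
    for k in range(len(theta)):
        up = theta[:k] + [theta[k] + eps] + theta[k + 1:]
        down = theta[:k] + [theta[k] - eps] + theta[k + 1:]
        grad.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))
    return grad


def backprop_grad(theta, x, target):
    """Chain rule: one forward pass plus one backward pass in total."""
    _, _, w2, _ = unpack(theta)
    hidden, y = forward(theta, x)
    dy = y - target                                   # dL/dy
    dw2 = [dy * h for h in hidden]
    dpre = [dy * w * (1.0 - h * h) for w, h in zip(w2, hidden)]  # tanh' = 1 - h^2
    dw1 = [d * xi for d in dpre for xi in x]          # row-major, matches unpack
    return dw1 + dpre + dw2 + [dy]


theta = [0.1 * (k + 1) for k in range(N_HIDDEN * N_IN + 2 * N_HIDDEN + 1)]
x, target = [0.5, -1.0], 0.3
fd = finite_difference_grad(theta, x, target)
bp = backprop_grad(theta, x, target)
print(max(abs(a - b) for a, b in zip(fd, bp)))  # should be very small
```

Both functions agree numerically, but the finite-difference version runs the whole network twice for every parameter, so its cost grows with the parameter count times the cost of a forward pass. Backpropagation gets all the derivatives from a single forward and backward pass, which is why it is the usual choice once a network has more than a handful of weights.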

- It's nice that GNU is taking on such a project

- FANN seems like a pretty good alternative

- The value of the software at the big "monopolies" lies within the data, not necessarily the software

- This needs to be in some publicly accessible repo. Downloading a zip file and submitting patches? I thought we, as a society, were over that way of building OSS.

Aw. It should have been named the GNU Gneural Gnetwork, gno?

My first thought also


The "ethical motivations" section is out of place here. Its moaning about "money driven companies" (as though money were a bad thing), or "monopoly" (which does not exist in AI), just reflects badly upon the project.

I think they stay within reason. IMO they don't say that being money driven is bad. It is rather that big, money driven companies can pose a problem. Maybe there is no monopoly on AI but there is certainly a monopoly on, for instance, social network data.

However, I find it quite amusing (and perhaps "out of place") that the maintainer uses a gmail address.

> "money driven companies" (as though money were a bad thing)

It is a valid viewpoint to find money-driven companies as a “bad thing” (or more exactly, companies whose main goal is to maximize shareholder value).

The monopoly _does_ exist in AI: machine learning is entirely data driven, and companies like Facebook and Twitter quite literally have investors throwing money at them because they have such valuable data for that purpose. Google is no different.

"Monopoly" does not mean "two companies have lots of knowhow that competitors might like".

Monopoly does mean "when a specific person or enterprise is the only supplier of a particular commodity [...] which relates to a single entity's control of a market" (Wikipedia)

Data is the commodity. There is nowhere else you can get good raw data about, say, what people were publicly discussing last week, except through Twitter, in order to guess the stock market. Much of that data is closed off or incomplete even through their API. There is no other option except to create another Twitter.

A commodity is a routinely interchangeable product, available from multiple suppliers, competing primarily on price. No, "data" in this context is not that at all. A library of user profiles is the opposite: it's proprietary, unique, massive.

So this library isn't solving anything then.

I think it solves something (needing a simple, well documented, GPL licensed machine learning framework written in C) but not the major problem faced by those on the cutting edge of machine learning.

You are correct. Money is just a way of measuring value that we agree upon that manifests itself in credit or paper currency, but is just an object. Appealing to the masses with "money driven companies" is an intellectually dishonest argument to froth it up a bit.

This team should focus on a SPIR-V back-end and remove the vendor lock-in to NVIDIA's CUDA in tensor AI software. A GPL licensed AI library without GPU acceleration isn't attractive outside academia.

Love it! If you want to play with state-of-the-art machine learning software, this is not for you. But if you want a clean implementation of neural networks in C that has a GPL license and no non-free components, this is a good start.

There's already FANN, which is more mature and has bindings for 28 other languages: http://leenissen.dk/fann/wp/language-bindings/ I maintain the C# wrapper.

I have used FANN from Perl, amazing library. I'm still glad there is some AI software under GNU's belt and the source code for "gneural networks" is pure C and way easier to follow (for now)

It doesn't have quite the same feature set, but I wrote a simple, and I think "clean" neural net library for Python but with a pure C engine under the hood - could be pretty easily adapted to use directly from C.


It's 1980s hobbyist level code. Almost small enough to be on stackoverflow as a sample. Not that much there to love!

At this stage of things, I think it's more forward-looking to open source trained models. Not only are they beginning to be the real core of future building blocks (see, e.g., trained word2vec vectors), but they also contain the real complexity in a NN, i.e., they are the "real function" you would want in a library.


Am I mistaken, or is the source repository for this project just tarballs checked into CVS?

Commit message:



Seems like

Is there more being done to promote GPU acceleration on non-CUDA platforms? I feel like this would be more useful than yet another FOSS NN library.

Torch has rudimentary OpenCL support. Some things "sort of" work. https://github.com/hughperkins/cltorch Theano has been slowly working on integrating OpenCL support too for several years, but I'm not aware if it's supported or not.

Nvidia has a complete monopoly on all deep learning hardware and tooling. With the possible exception of Google (and maybe Facebook), 100% of all serious academic researchers are training their models on Nvidia hardware with Nvidia's proprietary CUDA toolkit. Using anything else is currently completely unthinkable. Amazon and Nvidia have even teamed up to make CUDA training cheap (in the short term) for EC2 users.

I'd love to be able to switch to OpenCL, but there's so much momentum and very little perceived benefit when your lab already has four (very expensive) Titan X cards.

Talking about this, GNU/the FSF should start drafting an OSS license for neural networks. Like the Affero GPL for cloud services, the specific thing about neural networks is that data is strategic.

GPL -> Guarantee of OSS for the desktop

Affero GPL -> Guarantee of OSS for the cloud

??? -> Guarantee of OSS for NNs

Funny.. The majority of AI research is currently using open source libraries (Theano, Lasagne, Torch, Keras, Scikit-Learn, Nolearn, etc. etc. etc.)

Now, Google does have access to a whole lot of data that the rest of the world doesn't, and FB, Google, etc. have more than a bit of a hardware advantage... for now, at least. Distribute a shared system over a P2P infrastructure, and you can change that. Perhaps rather significantly.

If you were an AI (software), and you had to pick a license to release your source code under, one would assume you would pick the GPL, as it retains as many freedoms as a piece of software could ever expect in a world full of us.

If I was an AI, I would release my source code as public domain or BSD. That way, big corporations would start using me and I'd have access to the world's financial and defense systems.

Shit, maybe I'm an AI.

please don't be an AI. you seem to be a BSD version of Skynet.

Isn't the problem that in our age of supervised training, the algorithms are not the competitive advantage, but the data?

Nice that GNU is taking on such a project..


I'm glad the FSF is finally getting concerned about proprietary AI, but it's going to take a lot more than a single neural network package to get caught up in this arms race.

I wish they had taken the initiative much sooner.

It's more that someone interested in neural networks also wanted to work under the Gnu umbrella. The FSF don't have any resources per se.

I wish them good luck

Anyone else notice how GNU's website is stuck in 1993?

No, it has been updated since then. Its header/footer format was certainly not common in 1993, and it has a search box and things like that. Anyway, it is usable, does not require JavaScript, and loads really fast.

Also, it sets the background colors. You couldn't do that until like HTML 3 in 1995.

However, META ICBM, is a joke as old as the META tag, which I guess is 1995.

You mean, like the site you are using to post that complaint?
