I know I'm a decent developer, but I feel entirely inadequate to participate in this enormous, scary world of AI.
The people doing this aren't magical geniuses; they've just put the time and work into the subject and have managed to get themselves into a position where they can do this all day, surrounded by others they can collaborate with.
As with most human endeavors, the trick is to just get started, and not get frustrated and give up when it turns out you don't know anything at first. Some people don't mind starting with a ton of abstract learning about the subject, others prefer trying to accomplish specific tasks, learning the theory along the way.
For your specific question: as with all software, there's likely a lot to be done that has little to do with the main task of the tool and is just the everyday work of ease of use, interoperation with other tools, testing, etc. If, on the other hand, you want to get into the scary world of AI, then like many others I'd recommend the Coursera machine learning course as a great place to start.
Says the person named 'magicalist'. :)
I don't think I have any affinity for the topic, but at least now I can read/discuss it without being completely blind. There are also a lot of smaller components in the course that I found useful even without working directly in AI/ML. Just some general data modelling and linear algebra stuff was nice to pick up along the way.
The answer I got was rather unpleasant and won't please everybody. No, you cannot contribute. At least not in a direct, meaningful fashion. You can be a waterboy. So that's what I did for 2 years. I wrote a lot of ETL jobs using custom Scala DSLs that fed the input to these ML jobs. It was a total waste of time. Sure, I learnt MapReduce and Hadoop and all that jazz, but at the end of the day, I wasn't doing ML. I was doing ancillary tasks. These tasks no doubt have some economic worth, because I was getting paid. But no company is going to let you do the ML when they have hundreds of ML PhDs on their payroll and you aren't one of them. So you just do the data prep, or do ETL, or do data viz, or crunch some numbers aka BI, and convince yourself you are doing real ML. This went on for a while. Finally I couldn't put up with this farce and quit.
What worked for me personally was finding a very small company with a tiny data science department, that was headed by an ML PhD who was ready to mentor me, tell me which papers to read, get me to start working on my own papers, get me to build ML systems for image recognition on company time, get me to read textbooks and present topics...it was all quite painful and very humbling, but I learnt a shit ton of stuff. So my frank suggestion is to be brutally honest with yourself. You aren't going to get from here to there by hanging around on coursera or writing ETL. This stuff is seriously hard. If you want to make a genuine contribution, be willing to put in serious time - and that means literally stopping whatever shit you are doing now as a webdev/back-end dev/ETL dev/data-eng etc. Those ancillary tasks won't get you anywhere. Buckle down and do the real deal. You'll thank yourself one day if you did.
This is a path I took around 4 years ago, and I have built some seriously valuable stuff in that time.
Also, I would personally say not to discount the value of backend work: scalable ML processing is not something you get off the shelf, and a lot of value can be created by building scalable, reusable systems that can run machine learning models. Most of those ML PhDs can't (or don't want to) build scalable distributed systems. If you can train experiments faster, or actually do the work to make an end-to-end system work outside the lab, you can make real, measurable contributions.
IMO this is excellent advice for anyone who wants to move into a new field, and I've personally learned many of my skills through this kind of situation. Get yourself a job somewhere small but ambitious, and you'll end up wearing a lot of hats and doing new things simply because you put your hand up for them. Then when it's time to move on, voila! You have X years commercial experience in a whole bunch of things and you put the one that you want to focus on next onto your resume.
The bigger the company, the smaller the pigeonhole you'll live in and the less likely you are to ever end up doing anything outside your original job description.
2. Start small. Really small. Tinker with tutorials. See what happens when you change stuff around.
3. Read the documentation with another browser tab open to look up definitions.
4. Don't be afraid of the math. It's your friend. If it feels too complicated (it happens to all of us), then look for tutorials / online help on forums to figure it out.
5. Realize that nobody knows much about AI. It's still being developed. Pompous people will try to make themselves look smart by making it seem like they know a lot about it. They might know a lot of what we collectively know about AI, but they don't know a lot about it. No one does.
PS. I'm currently working in AI applications and this is how I started. Don't believe the hype.
It takes time to become really good at it, but the basics are relatively easy to acquire. Libraries and learning resources are much better now than they used to be. You can totally learn how to apply ML using just the scikit-learn tutorials. The Coursera ML course gives a nice introduction. For deep learning, check out the excellent http://www.deeplearningbook.org/ and try to implement what you're reading about, e.g. using https://github.com/Theano/Theano.
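To give a flavour of what "implement what you're reading" can look like, here's a minimal logistic-regression sketch in Theano; the toy data and hyperparameters are made up purely for illustration:

```python
import numpy as np
import theano
import theano.tensor as T

# Toy data: two Gaussian blobs with labels 1 and 0
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2) + 2, rng.randn(50, 2) - 2])
y = np.hstack([np.ones(50), np.zeros(50)])

# Symbolic graph for logistic regression
x_sym = T.dmatrix('x')
y_sym = T.dvector('y')
w = theano.shared(rng.randn(2), name='w')
b = theano.shared(0., name='b')

p = T.nnet.sigmoid(T.dot(x_sym, w) + b)            # predicted probability
loss = -T.mean(y_sym * T.log(p) + (1 - y_sym) * T.log(1 - p))
gw, gb = T.grad(loss, [w, b])                       # Theano does the calculus

# Compile a training step that also applies gradient descent updates
train = theano.function(
    inputs=[x_sym, y_sym],
    outputs=loss,
    updates=[(w, w - 0.1 * gw), (b, b - 0.1 * gb)])

for step in range(100):
    train(X, y)
```

Once the computation graph clicks, the jump to the deep learning book's material is mostly about stacking more of these pieces together.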
The world of AI is enormous indeed; dozens of new worthwhile papers are published every day. Just reading them can be a full-time job, but you don't have to know all the state-of-the-art techniques to be dangerous. Over time it becomes easier to read new papers, and you will start to realize how ideas are related. Often a paper provides just a small tweak to a known algorithm, or combines existing "building blocks" in a new way.
As with all skills, it takes time to become good at it, but there is nothing to be scared of. Keep learning, do something practical every week. If you don't have problems at hand to solve using ML, register at kaggle.com. There is more to learn than e.g. in devops, but everyone can do it.
Oftentimes you can be dangerous with just scikit-learn, xgboost, and Keras if you can do data prepping, pipelines, and stacking/ensembles. Besides wonderful frameworks such as TF, Theano, and Torch, there are also "zoos" of ready-made models, word vectors, and such that can be used out of the box.
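As a rough illustration of the pipeline part, here's a minimal sketch chaining preprocessing and xgboost behind scikit-learn's Pipeline API; the dataset and hyperparameters are arbitrary:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

# Data prep + model chained into a single estimator object
pipe = Pipeline([
    ('scale', StandardScaler()),                      # basic feature scaling
    ('clf', XGBClassifier(n_estimators=200, max_depth=3)),
])

# The whole pipeline cross-validates like any other estimator
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```

The nice part is that the pipeline behaves like a single estimator, so the same object drops straight into grid search or a stacking setup.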
You are mixing up "applying ML black boxes" with "creating ML black boxes". The former is not especially hard (if you know the language the libraries are written in, and have curiosity and time), but you need to know a little bit of what you are doing unless you want to be just one of the thousand monkeys with a typewriter. The latter is hard, like very hard.
If you want to make actual contributions to the machine learning part of machine learning (i.e. not building infrastructure and pipelines), then you need to get your math into a really good state. That comes first, before you can even hope to begin to learn ML. That means acing probability and linear algebra, which is most of what ML is based on.
If you have no experience with probability or linear algebra, it'll take you at least a year of solid studying to get up to speed.
Then, you can start taking ML courses. That will take you another year to really understand well.
At that point, as dxbydt said, you should find a ML PhD who is willing to mentor you. Otherwise, you will not be able to make meaningful contributions, since there are thousands of ML PhDs out there who are much better skilled than you are, and companies would rather hire them than you.
So the ultimate answer is: yes, with a ton of sacrifices and years of work. The only question for you to answer is whether or not you want to make those sacrifices and put in that work.
Though it might be too much for a beginner ML project. If you are starting from zero, this is a good intro to CNN's - http://cs231n.github.io/convolutional-networks/
I still don't know a ton but I'm fumbling my way through it working towards my end goal. If I can start working through it and figure it out I don't see why any developer couldn't!
Also, never forget the golden rule: largely, no one really knows what they're doing. Sure, we have AI/ML experts, and me saying they don't know what they're doing sounds insulting, but overall people rarely know everything they're doing or how to specifically accomplish a goal. Most of the time it's a haze and you work through it.
 https://www.simex.io (sorry, shameless plug but it seemed relevant! Anyone can get into this category)
What really helped advance my understanding from zero to knowledgeable novice was rewriting some existing code line by line (using expanded variable names and comments), and thinking about each line and what it does as you go. It's the software development equivalent of Hunter S. Thompson re-typing The Great Gatsby just to get the feel of writing a great novel. Here's one I did based on Denny Britz's tutorial:
Britz's Original: http://www.wildml.com/2015/09/implementing-a-neural-network-...
My version: https://gist.github.com/sthware/c47824c116e6a61a56d9
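For anyone curious what that exercise looks like, here's a minimal sketch in the same spirit (not the gist itself, just an illustrative toy): a two-layer network with a tanh hidden layer and a softmax output, trained with plain gradient descent on made-up data.

```python
import numpy as np

# Toy dataset and network sizes -- all numbers here are arbitrary
rng = np.random.RandomState(0)
num_examples, input_dim, hidden_dim, output_dim = 200, 2, 5, 2
X = rng.randn(num_examples, input_dim)
y = (X[:, 0] * X[:, 1] > 0).astype(int)            # synthetic labels

W1 = rng.randn(input_dim, hidden_dim) * 0.1
b1 = np.zeros(hidden_dim)
W2 = rng.randn(hidden_dim, output_dim) * 0.1
b2 = np.zeros(output_dim)

learning_rate = 0.01
for epoch in range(1000):
    # Forward pass: tanh hidden layer, softmax output
    hidden = np.tanh(X.dot(W1) + b1)
    scores = hidden.dot(W2) + b2
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)

    # Backward pass: gradient of cross-entropy loss w.r.t. each parameter
    dscores = probs.copy()
    dscores[np.arange(num_examples), y] -= 1
    dscores /= num_examples

    dW2 = hidden.T.dot(dscores)
    db2 = dscores.sum(axis=0)
    dhidden = dscores.dot(W2.T) * (1 - hidden ** 2)  # tanh derivative
    dW1 = X.T.dot(dhidden)
    db1 = dhidden.sum(axis=0)

    # Plain gradient descent update
    W1 -= learning_rate * dW1; b1 -= learning_rate * db1
    W2 -= learning_rate * dW2; b2 -= learning_rate * db2
```

Typing out and commenting each of these steps yourself is exactly the point of the exercise: the mechanics stop being mysterious once you've written the gradients by hand.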
The challenge is to find a figure who is not a researcher driving for papers who has authority and experience managing delivery and can engage and orchestrate mainstream contributors. Occasionally these people emerge like a messiah (cf. Linux) but it seems to me that this is one of the hit/miss factors that mean that some of the best "1000 flowers" fail to bloom.
Does Tensorflow have an Open Source project manager employed by Google?
People in those communities can point you in the right direction if you show you're willing to read the docs and do the work of learning. One thing you will find is that AI requires a lot of non-AI components to work. That is, you may be able to help an open-source project develop its UI, or improve its data pipeline, or any number of other useful things. By working on the edges of such a project, you'll learn the lingo and grasp some of the basic ideas. From there, you can start working on the math, etc.
Bottom line, there are a lot of ways to get involved with AI.
OpenAI and DL4J are equivalent in the sense that they both have active communities focused on parts of machine learning. The difference, of course, is that DL4J is a commercially backed OSS framework, while OpenAI is a not-for-profit think tank that uses other groups' DL libs while building its own sandbox and, obviously, conducting its own research.
I wouldn't say that DL4J is more serious because Java is enterprise. It's just tackling different problems. It does some things better (many integrations) and some things worse or not at all (automatic differentiation), which is simply a reflection of our priorities. Not research, just production solutions and apps. That said, we are making it easier to create custom layers with DL4J, so people will be able to implement a lot of the latest papers there soon.
A requirement for this one is an NVIDIA GPU with compute capability 3.5+. It turns out that my NVIDIA GPU is a 3.0 one (a laptop with a Quadro K1100M). I'm half lucky, because my laptop lets me upgrade the GPU (it's an HP ZBook), but do I want to spend the money just to experiment? I didn't check the price of a 3.5+ GPU that fits in my laptop, but compare that to just cloning a CPU-only repository from GitHub and giving it a try.
As GPUs are used more and more for computing tasks, it feels more and more like back in the 80s, when you found a nice 6502 assembly program in a magazine but your CPU was a Z80 (or the program was for the 68000 and you had an 80286). We eventually standardized on x86/amd64 because the market went there. I wonder if we're going to standardize on a GPU architecture too.
By the way, before somebody points it out, a solution is to get a cloud GPU server (AWS has them). Still I'd like to be able to develop locally.
I'm working on segmentation in the context of traffic counting.
The problem with AI is that one needs the enormous money and resources of a big corporation to produce the results you see in AI competitions. In their "research" they make progress, like everyone else, by trial and error (well, augmented with a decent heuristic search process), and the more people and hardware resources they can put into it, the better their chances of outperforming other teams. It is resources, not "smartness".
I completed the very first AI MOOC by Andrew Ng years ago - 780 out of 800 or something. The only difficulty is that it combines methods from math and programming, so one needs some background to really understand the hows and whys. A really decent knowledge of English is also required, otherwise one might miss the nuances in very dense, terminology-loaded lectures, so the entry level is quite high (of course, one could always copy-paste code without understanding and use ready-made toolkits and tutorials).
Apart from that, it is nothing special. Basically, it is function composition, with linear algebra and some numerical optimization. Once you have understood the basic building blocks - the mathematical functions, processes, and algorithms involved - there is not much else to do: one has to apply the theoretical knowledge, which is, again, not a big deal, to real problems, and this is where big corps with resources have the advantage.
Once you manage to get inside one of the big-corp AI labs, you become a star, simply because of the well-funded PR machine of the corp. Everything that comes out is super cool, of course, so even being mentioned in that context makes one super cool too.
In my opinion, the guys from upper-middle-class families, who went through a decent technical school (with mom and dad's money) which taught them the basics needed for entering AI, are not that special. I could probably beat one or two of such snobs, having no high school education at all, never having studied English or programming in school, and having been raised in an impoverished family, but that is another story.
As for ML, take Andrew Ng's course, it is pretty accessible, and then take the one on the Udacity (with all these arrogant hot shots) and realize that there is really no magic in it. It is not that hard.
When you see a cool paper or video about some "breakthrough" in ML, take into account that it is mostly due to the resources spent on it, not some kind of extraordinary genius of the paper's authors. Remember, all they do is basically heuristic search and constraint satisfaction: train, test, change the function (layer) composition (usually without a real understanding of why), re-train, re-test. The problem is that there are very few such slots in megacorps.
The guys like Andrew Ng himself are, of course, the real stars. But this kind of success comes from years of rigorous training similar to what Olympic champions undergo. I personally don't think I have such ambitions or the wish to embrace such a lifestyle.
That would be dangerous. They could follow the SL example and turn them into cardboard-like figures, or get more creative and let you dress them up as clowns, Barney the Dinosaur or some other avatar of your choice. :)
But also, my understanding is their glasses cut a lot of light and are meant for outdoor use. Indoor use would be very useful too!
For those who are familiar, new episodes are apparently arriving on Oct. 21st.
A hackathon project inspired by White Christmas: http://jonathandub.in/cognizance/ (Brand Killer)
I'm still playing around with fasttext (which is amazing, btw), which was only officially announced last week, so I'm surprised to see Facebook Research announce and release another project so soon.
The Python package which serves as an API to fasttext (https://pypi.python.org/pypi/fasttext/0.7.2 ) is well documented, is constantly being updated, and is easy to use.
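If I remember the 0.7.2 API correctly, basic usage looks roughly like this (the file names and example word are placeholders):

```python
import fasttext

# Learn word vectors with the skipgram model; 'data.txt' is a plain
# text corpus, one document per line.
model = fasttext.skipgram('data.txt', 'model')
print(model['king'])                       # vector for a single word

# Text classification; the training file uses __label__<class> prefixes.
classifier = fasttext.supervised('train.txt', 'classifier',
                                 label_prefix='__label__')
result = classifier.test('test.txt')
print(result.precision, result.recall)
```

The whole thing trains on CPU and finishes in seconds on modest corpora, which is a big part of the appeal.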
If I can draft off tech developed to sell more ads to do some good I'm all for it.
Layers are an abstraction we use to make artificial neural networks dramatically faster to work with using linear algebra. By keeping each node interacting only with the layers above and below, we make the computations a lot nicer for our model of computation.
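Concretely, restricting connections to adjacent layers means a whole layer of neurons can be evaluated for an entire batch as one matrix multiply plus a nonlinearity; here's a rough numpy sketch (sizes are arbitrary):

```python
import numpy as np

batch, n_in, n_out = 32, 100, 50
x = np.random.randn(batch, n_in)     # activations from the layer below
W = np.random.randn(n_in, n_out)     # connection weights to the layer above
b = np.zeros(n_out)

# One dense layer for the whole batch: ReLU(xW + b), shape (32, 50)
activations = np.maximum(0, x.dot(W) + b)
```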
In the brain, neurons link freely to other arbitrary neurons based on adaptation processes we (or at least I) don't really understand.
The brain also lacks a clear idea of direction to count the layers along, since it has innumerable different inputs coming in at all times, and the resulting signals interact all over the place.
The most meaningful analog would probably be to ask "How many neuron firings typically occur between an external stimulus and a response to that stimulus?" Even that is extremely rough though, because through evolution a lot of 'short circuit' structures have formed in our bodies. The gag reflex is obviously triggered by sensory input, but it probably doesn't check with your frontal lobe before firing the appropriate muscles.
A real neuron takes on the order of 10 ms to integrate and fire to the next neuron. Many subconscious reactions take less than 1 second, which leaves time for a chain of fewer than 100 neurons. Note that those neurons are not strictly arranged in layers.
The human visual cortex has 10^12 synapses. One popular 2015 deep learning net (the 152-layer ResNet) used 10^12 FLOPs to classify objects in one image (but has far fewer weights).
In terms of depth, we're there. In terms of breadth, it will take several years. But the brain does things very differently. For example, it has top-down signals during "prediction."
> To capture general object shape, you have to have a high-level understanding of what you are looking at (DeepMask), but to accurately place the boundaries you need to look back at lower-level features all the way down to the pixels (SharpMask).
A Noob question:
If it reaches the "breadth" of the human brain, how close will that be to the "Skynet becomes self-aware" moment?
What would it tell us about humans and our society after it studies and analyzes millions or billions of hours of FB and YouTube videos?
That's several orders of magnitude more "neurons" and "connections" than even the largest ANNs.
You can probably find a lot of u-net implementations from this contest.
One that performed really well. It uses "inception-style" blocks for feature extraction instead of VGG, but is otherwise pretty similar.
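For anyone who hasn't seen the architecture, here's a very rough Keras-style sketch of the u-net idea: one encoder step, one decoder step, and a skip connection concatenating encoder features into the decoder. Real contest entries are much deeper, and the input shape and layer sizes here are arbitrary.

```python
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate
from keras.models import Model

inputs = Input((128, 128, 1))

# Encoder: convolve, then downsample
c1 = Conv2D(16, (3, 3), activation='relu', padding='same')(inputs)
p1 = MaxPooling2D((2, 2))(c1)

c2 = Conv2D(32, (3, 3), activation='relu', padding='same')(p1)

# Decoder: upsample, then concatenate the matching encoder features
u1 = UpSampling2D((2, 2))(c2)
u1 = concatenate([u1, c1])                  # the "skip connection"
c3 = Conv2D(16, (3, 3), activation='relu', padding='same')(u1)

# Per-pixel mask output for segmentation
outputs = Conv2D(1, (1, 1), activation='sigmoid')(c3)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='binary_crossentropy')
```

Swapping the plain conv blocks for inception-style or residual blocks, as the well-performing entry apparently did, keeps the same overall encoder/decoder-with-skips shape.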
They require you to log in to FB, then redirect to GitHub.