
Ask HN: What is the skillset for programmers at AI startups? - jason_slack
For programmers working at AI companies, what is your skill set for the job?

* OpenCV

* Neural networks

* R

* Machine learning

* CUDA

* something else?
======
dbfclark
Very company dependent. At Luminoso
([http://luminoso.com](http://luminoso.com)), we live in Python, with varying
doses of JavaScript for frontend, very little R (but some of us know it), Java
and C++ are pluses (for working with other people's software), and we have at
least some Haskell in production (mostly for preprocessing). And plenty of ops
tools for deployment. We have used a few neural network frameworks and
machine learning packages; the Python machine learning ecosystem is definitely
big for us, but we haven't found our One True neural network framework yet.
Other companies (say in image recognition, Ditto Labs for instance) are
definitely more CNN-oriented, yet others are certainly doing work in R.

If you have a company that is doing genuinely interesting AI work, it's likely
that they are somehow on the forefront of research, pushing existing tools to
do things that they can only barely do. If you find yourself in an AI role
(typically only some roles actually do AI, of course -- systems need to stay
up, great UIs need to be built, etc.), I would guess that you'll need to
familiarize yourself with their particular toolsets and methods rather than
assuming that there are universally correct things to go learn.

~~~
hazard
Why not Keras for your 'one true neural network framework'?

------
antognini
My approach:

* The Python scientific stack for prototyping, data manipulation, etc. (IPython, matplotlib, numpy, scipy, pandas, scikit-learn).

* TensorFlow for training NNs.

* Everything gets ported to C++ once it's ready to be integrated into the codebase.

* I work on the automated interpretation of EEGs, so I use our software (Persyst) to actually visualize the EEGs and see what's going on in the data at the ground level.

My boss is a big fan of Statistica for data manipulation and NN training. I've
been learning it, but I have a natural affinity for Python.

I think you'll find that the tools you need will be pretty dependent on what
you're doing, though. I deal with relatively small datasets (~100 EEG records,
dozens of GB total) and the NNs need to be well understood. So we use small NNs
and spend a lot of time poking and prodding them to make sure that they're
behaving the way we expect them to. If you're working on much larger datasets
and are more willing to tolerate the NN acting as a black box you'd have to
use a different set of tools (bigger NNs, more complicated NN architectures,
GPUs for training, etc.).
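
As a toy illustration of why small networks are easier to poke at: when the network is tiny, every intermediate activation can be returned and checked by hand. The sketch below is plain Python with hypothetical, hand-picked weights; the `forward` function and its parameters are assumptions made for illustration and have nothing to do with Persyst or any in-house library.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Run a one-hidden-layer network; return (output, hidden_activations)."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Weighted sum of the inputs for this hidden unit, squashed by tanh
        pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(math.tanh(pre_activation))
    # Single sigmoid output unit on top of the hidden layer
    pre_output = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    output = 1.0 / (1.0 + math.exp(-pre_output))
    return output, hidden


# Hypothetical hand-picked weights, chosen only so the result is easy to check
HIDDEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]
HIDDEN_BIASES = [0.0, 0.0]
OUTPUT_WEIGHTS = [1.0, 1.0]
OUTPUT_BIAS = 0.0

output, hidden = forward([1.0, 1.0], HIDDEN_WEIGHTS, HIDDEN_BIASES,
                         OUTPUT_WEIGHTS, OUTPUT_BIAS)
# The first hidden unit computes tanh(x0 - x1), so equal inputs must give 0.0
print(hidden, output)
```

Returning the hidden activations alongside the output is the design choice that matters here: it lets you assert on what each unit is doing, which gets impractical once the network is large.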

~~~
aprao
If you don't mind, can you tell me which C++ matrix manipulation library you
use at your startup?

~~~
antognini
I'm using Boost, but we also have our own NN library that we use. Once I have
everything working using Boost, I then swap that out for the in-house NN
library to make it consistent with the rest of the codebase. (I'm fairly new
to the company so I'm still getting accustomed to the existing code.)

------
vonnik
I agree with _dbfclark_ that it's case by case. At Skymind
([https://skymind.io/](https://skymind.io/)), we support an open-source Java
library ([http://deeplearning4j.org/](http://deeplearning4j.org/)).

* So obviously Java, Scala, Clojure, and lower-level languages like C++, C and CUDA.

* Since deep learning needs very large datasets to train on, our engineers need experience with open-source libraries used in production environments, such as Hadoop, Spark, Kafka. We also work with Lagom, NiFi, etc.

* A lot of the same math underpins many machine learning and deep learning algorithms. So a high degree of comfort with linear algebra, calculus and probability is a plus.

* A general knowledge of the strengths and weaknesses of various algorithms -- neural networks, reinforcement learning, etc. -- their combinations and applications is helpful.
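
To make the math point concrete, here is a minimal sketch of gradient descent fitting a line by least squares. It touches calculus (the partial derivatives of the loss) and a little linear algebra (the dot products), in plain Python. The function name `fit_line`, the learning rate, the step count, and the data are all assumptions chosen for illustration, not anything taken from deeplearning4j.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every data point
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(error^2) w.r.t. w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(w, b)  # should end up very close to 2 and 1
```

The same loop, with the gradient computed by backpropagation instead of by hand, is roughly what a neural network framework automates.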

The GitHub repo is here if you're curious:

[https://github.com/deeplearning4j/deeplearning4j/](https://github.com/deeplearning4j/deeplearning4j/)

~~~
jason_slack
Can you do CUDA with AMD GPUs? I mean the AMD FirePro GPUs in a Mac Pro.

~~~
ngould
My understanding is that the CUDA interface is for NVIDIA GPUs only. There are
non-CUDA APIs for other types of GPUs, but they don't necessarily work out
of the box with neural network libraries like TensorFlow yet.

~~~
jason_slack
So really you need a box with NVIDIA GPUs to begin?

~~~
lovelearning
Not necessarily.

If you want to start with CUDA, you can rent EC2 instances with GPUs and pay
only for the time used.

If you want to start with deep learning, you don't need GPUs or CUDA. All the
popular frameworks work fine on CPUs. GPUs are not guaranteed to accelerate
all deep learning use cases; sometimes the time taken to transfer data to and
from GPU dominates the time taken to process that data.
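
A back-of-the-envelope sketch of that trade-off, assuming a deliberately simple cost model (GPU time = transfer time + CPU time divided by a speedup factor). The function name and every number below are hypothetical; real figures would have to be measured on your own hardware.

```python
def gpu_is_worthwhile(bytes_moved, cpu_seconds, gpu_speedup, bus_bytes_per_second):
    """Return (worthwhile, gpu_total_seconds) under a simple offload cost model.

    All inputs are estimates you supply yourself; none come from a benchmark.
    """
    # Time spent copying data to and from the GPU
    transfer_seconds = bytes_moved / bus_bytes_per_second
    # Time the GPU needs for the computation itself
    gpu_compute_seconds = cpu_seconds / gpu_speedup
    gpu_total_seconds = transfer_seconds + gpu_compute_seconds
    return gpu_total_seconds < cpu_seconds, gpu_total_seconds


# Light workload: 1 GB moved, 0.05 s of CPU work, 10x kernel speedup, 10 GB/s bus
small = gpu_is_worthwhile(1e9, 0.05, 10.0, 1e10)
# Heavy workload: same data volume, but 10 s of CPU work
large = gpu_is_worthwhile(1e9, 10.0, 10.0, 1e10)
print(small, large)
```

In the light case the 0.1 s copy alone exceeds the 0.05 s the CPU would need, so offloading loses; in the heavy case the same copy is negligible next to the compute saved.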

If you want to start with deep learning using an AMD GPU, Theano has support
for OpenCL, which is an alternative to CUDA. But as I understand it, this
support is still limited and incomplete.

~~~
jason_slack
I also saw this:
[https://github.com/hughperkins/DeepCL](https://github.com/hughperkins/DeepCL)

It's an OpenCL library for training neural networks. This could be an alternative to Theano.

~~~
eivarv
Worth noting: development of an OpenCL-based backend for Theano is under
way [1].

I don't know how it's progressing (I haven't checked its status for a
while), but a couple of GitHub issues [2, 3] might be worth checking out.

[1]:
[http://deeplearning.net/software/theano/tutorial/using_gpu.h...](http://deeplearning.net/software/theano/tutorial/using_gpu.html#gpuarray)
[2]:
[https://github.com/Theano/Theano/issues/2936](https://github.com/Theano/Theano/issues/2936)
[3]:
[https://github.com/Theano/Theano/issues/1471](https://github.com/Theano/Theano/issues/1471)

------
gelisam
I'm a programmer working at an AI startup,
[http://keatext.ai](http://keatext.ai), but I'm not touching the AI part, so I
just want to point out that there are other ways to get in. In my case, I had
experience with Haskell, their backend uses Scala, both are functional
languages, and that was good enough!

Also, we're in Montreal and we're hiring :)

~~~
testallthestuff
Saw Keatext's presentation at FounderFuel Spring 2016 demo day. Really cool
stuff you guys are doing. Would love to join a cool startup like that, but I'm
still in school and don't have much AI experience :/

------
jeffreysmith
I work at x.ai ([https://x.ai/](https://x.ai/)), building an artificial
intelligence that schedules meetings. You could take a look at this job post
for our team: [https://x.ai/jobs/#data-engineer](https://x.ai/jobs/#data-engineer).
As stated there, we need an engineer who knows about distributed systems,
functional programming, databases, NLP, cloud infrastructure, etc. These roles
take a wide range of skills with some real depth in key areas. Even if you're
well prepared, you'll be expected to constantly be learning. But if you're up
for it, it's great work!

------
lowglow
What would someone recommend as a good platform/framework to start with for
longevity and robustness of community? We're just now looking for an ML/AI lead
to help lay down this foundation for our latest product Asteria, and I'd like
to know how to start searching for quality candidates/hackers.
[https://baqqer.com/projects/asteria](https://baqqer.com/projects/asteria)

------
beachstartup
statistics.

~~~
zerr
Yes, unfortunately, symbolic AI is not popular in startups...

~~~
olliwaw
Why unfortunately? That approach to AI led to billions in investment losses in
the 80s and an AI winter.

The current AI boom is driven by the fact that deep learning works.

------
andrewstuart
Adventure games.

------
joeld42
Tensorflow

------
wdiamond
pay me 1000 bucks for that information. 2000 bucks for more.

~~~
val314159
I will do this if I can get a Kickstarter to give me $10,000!

~~~
wdiamond
If that was ironic, I would say that some people need money but don't need to
share implementations.

