
“Joel Test” for Data Science - gk1
https://blog.dominodatalab.com/joel-test-data-science/?hn=1
======
maxaf
A thinly veiled sales pitch for a "data science collaboration platform" is
trotted out as some profound measure of how attractive a given company is to
prospective applicants for data science positions.

I'm wary of the Joel Test and others like it for one very simple reason: the
points are goals to strive towards, not some filter by which prospective
employers should be judged. Especially in the case of a fledgling field such
as data science, wouldn't _you_ want to be deeply involved in laying down the
foundation for all that this screener list seems to take for granted?

Almost every point on this list is far from a solved problem. This is an
exciting time for the industry, a happening time when we begin to establish
some semblance of best practices by groping in the dark. What good are
prospective data scientists who recoil from the challenge of working with
engineers on tools that will enable future research?

~~~
mseebach
The Joel Test was actually explicitly pitched as a filter to judge employers
against (incidentally, Joel's firm scored a perfect 12/12).

------
afarrell
The original Joel Test was a piece of content marketing by Joel Spolsky, the
CEO of the company that builds FogBugz among other tools for collaboration
among software engineers. This appears to be the same. We should evaluate it on
its merits: Are these useful questions to ask in pursuit of a more productive
and less frustrated data science team?

~~~
mseebach
I don't see the need to make the Joel Test sound so nefarious. The Joel Test
was content marketing, yes, but not of the FogBugz tool (while a bug database
is item 4 out of 12, he doesn't even plug FogBugz in the item - it was a
recruitment piece more than a sales piece). The reason we're still talking
about the test is that it really struck a chord - it listed practices that
were true and important, but not widely held to be so. Today, it's mostly
redundant, because those practices are now totally uncontroversial standard
industry practice, but it's worth dwelling on the fact that not too long ago
you could "content market" recruiting developers by promising them source
control systems and an Excel sheet of bugs.

~~~
di4na
Don't go too fast. It is widespread in the startup world. In the enterprisey
world, not so much. Here we have no version control, no continuous builds, and
bugs never get solved.

Don't even think about hallway testing. And the people who hired me asked no
questions about my coding ability or my GitHub account. And it is not a small
company, nor an isolated thing.

And don't ask about the tooling; I spend my days fighting it... not being able
to use a package manager due to a bad proxy is a PITA.

The Joel Test is still on my list of things to check with every company that
wants to hire me now, because I see what happens when it is not respected every
day...

------
burgerdev
"5\. Is the team using our software?"

~~~
apathy
Bingo. I like the Domino guys but this is a transparent pitch for their
particular platform. Last time they called, they weren't able to help with my
use case. But my use cases are bizarre. Still, there are others like me. And
they won't be well served by trying to shoehorn their work into Domino.
Sometimes you end up having to build the platform yourself, because the
existing ones can't do it.

Other times a pre-existing solution like Domino may help. I found out about
sigOpt yesterday, for example, and balked at the price until I discovered they
allow academics to use their platform for free. All of a sudden the hypercube
searches I was dreading for tuning regularized tensor factorizations became a
nonissue, and it costs me nothing.
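(For the curious: the kind of hypercube search a service like SigOpt automates can be sketched as a random search over hyperparameter ranges. This is a minimal illustration, not SigOpt's actual method, and the objective below is a stand-in; a real one would fit a regularized tensor factorization and return its validation error.)

```python
import random

random.seed(0)  # reproducible sketch

# Each hyperparameter gets a (low, high) range; together the ranges
# define the "hypercube" being searched. Names here are hypothetical.
space = {
    "rank": (2, 50),            # factorization rank
    "l2_penalty": (1e-6, 1.0),  # regularization strength
}

def objective(params):
    # Placeholder loss: pretend rank 20 with penalty 0.01 is ideal.
    # In practice this would train a model and return validation error.
    return (params["rank"] - 20) ** 2 + (params["l2_penalty"] - 0.01) ** 2

def random_search(space, objective, n_trials=200):
    """Sample points uniformly from the hypercube, keep the best."""
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {k: random.uniform(lo, hi) for k, (lo, hi) in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best, loss = random_search(space, objective)
print(best, loss)
```

Tools like SigOpt replace the uniform sampling with a model of the objective (Bayesian optimization), so far fewer trials are wasted in bad regions of the cube.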

If that becomes something that other people want to use (probably so, now that
I think about it) I don't see why sigOpt shouldn't make a pile of money off
it. And the Domino people have offered their platform free or cheap to our
group in the past. It's not because of principles that I turned it down (my
only real principle is "don't buy the cow if you get the milk for free"). It
just didn't serve our group's needs. But the general principle of a shared,
convenient store for data and analyses is a good one.

The other general principle that I tend to emphasize is that you should always
see about a free trial beforehand. :-)

~~~
afarrell
Isn't the original Joel Test a transparent pitch for FogBugz?

~~~
apathy
Point taken. FogBugz was also, AFAIK, not a perfect solution for everyone
(e.g. I don't recall it offering much for a shop where everyone uses screen or
tmux as their window manager). But the underlying principles were handy
anyways.

------
sgt101
Ok - "latest tools without IT"; let's imagine that you have valuable or
personal data, someone picks up an unchecked tool and heyyyyy presto! You lose
all that data (well, you still have it, also other people have and your
company is in the newspapers).

Also - the best tools money can buy? Where does money come into data science
tools? TensorFlow = free, RStudio = free, Shiny = free (OK, you can get
commercial versions of the last two, and we have, but they are cheap!), sparklyr
= free, Python = free....

~~~
djcjgshjjc
Well, the best analytics database for your use case may not be cheap. E.g. a
column store database.

Even open source options can require expensive eng/ops resources to set up a
cluster.

~~~
collyw
Oracle is probably the best-in-class relational database, but Postgres or MySQL
will work fine for most use cases.

------
kfk
meh... for me, number 1 should be: do they understand that domain knowledge
beats fancy analytics hands down most of the time, or do they spend their time
optimizing things nobody cares about?

~~~
dagw
Dear god yes. If everybody on your team just has a generic math/CS background
and no one has a deep understanding of the domain you are working in, then
you'll end up wasting so much time you might as well give up and go home.

------
collyw
Just hire a decent software engineer and let them learn the data science.

