
Is “data scientist” the new “programmer”? - jdhzzz
http://blogs.harvard.edu/philg/2018/09/18/is-data-scientist-the-new-programmer/
======
thidr0
Something about this article strikes me as a thinly-veiled complaint about
poorly designed object-oriented systems. Take, for example this comment by the
author:

>Even if the money were half of what today’s coder gets paid it might still be
a better job because one is spared the tedium of looking at millions of lines
of Java that do almost nothing!

What all those millions of lines of code are is abstractions, decoupling, and
modularization of logic/responsibility. This is hard-won knowledge from the
field of software engineering. Granted, a lot of it is probably very poorly
designed or organized. But the problem is the design and not the philosophy.

Because scientists all use the same basic rules of math, but each business
will have its own special rules (i.e. not all payroll software
implements the same policy/axioms), this makes it really easy for the hard
logic of scientific work to be in a general-purpose library. "Normal"
developers need to customize their own rules, or in other words, develop their
own services unlike the data scientists.

Now if every data scientist had to roll his/her own version of numpy, pandas,
scikit-learn, tensorflow, etc. the author would probably be decrying the
deluge of procedural spaghetti produced by data scientists. The data
scientists' notebooks look simple because all that indirection is hidden
away in the many libraries.

~~~
skybrian
All of today's software is built on millions of lines of code. This isn't much
of a problem when it's hidden away behind a good abstraction. Being written in
another language forces the API to be documented well enough so you don't need
to go deeper: "native code", "kernel code", "part of the browser".

Crappy million-line Java apps are generally crappy not due to raw line count
but rather due to leaky abstractions and badly designed APIs, so you do have
to navigate through a lot of that code.

~~~
portal_narlish
I sometimes wonder if programmers instinctively overcomplicate things in
the interest of collective job security. Some of the stuff I've seen in
(particularly awful) Java code bases is perplexing to the point where it seems
intentional.

~~~
PeterisP
It's more likely the natural entropy of code - it's easy to add stuff to a
system in a way that makes it more messy; and if the system _already_ is a big
mess, then it's much harder to do non-messy additions and the bigger mess it
is, the harder it is to start cleaning it up.

~~~
portal_narlish
I refer to this as the "spaghetti law of attraction". The burden of
refactoring things gets higher and higher and no one wants to touch it. So
they just add another try-catch block and do some side effect and get the PR
merged.

~~~
time-traveler
It is also a question of management: it is easier to motivate adding new
features than keeping things nice and tidy, especially as finding good
abstractions is very hard and time-consuming.

------
vinayms
The takedown of abstraction and software engineers (by using Java as an
example) is similar to saying "back in the day to find a prime number we would
simply use a sieve, but today it is a tedium, what with all the pi's and e's
and thetas that get in the way, and what are geometry and polynomials doing
here, and what in the god's name is this _i_ , I just want to count the prime
numbers which are nice round whole numbers".

That's what happens when a topic grows from being a curiosity where
dilettantes dabble into a proper field that is applied to solve problems.
Granted, some of the developments can indeed be tedious and self indulgent,
but otherwise this is the natural progression. It's sad and frustrating when
people who ought to know better make such statements. Is it done to provoke a
critical analysis, positive trolling if you will?

About the role of data scientist, I find it both amusing and disappointing
that just about anyone with a three week MOOC gets to work in this field, who
otherwise had never dealt with statistics before. I mean, statistics is a
three year long grueling applied maths degree, and condensing it to three
weeks is silly. It is actually in this way that it is similar to the programming
jobs of the 90s (I don't know how it was in the 70's, I wasn't born yet). Just
about anyone who could learn Java or Visual Basic, or the self-taught cowboys
who used C, ended up programming professionally. Actually it was not that bad,
for coding is not as hard as it's made out to be, but only until they got sucker
punched by n-squared complexity, to say the least, on big data. Coding
couldn't help them and they realized programming was more than learning to
code and using some APIs and system calls. (I was one of them in a way, when
I started to code in C++ to model and simulate my mechanical engineering
project, and it led me to the path of enlightenment.) So, today's data
scientists who are not bona fide statistics graduates or statisticians have it
coming as well, whatever the analogue is, unless they are merely "data
monkeys" in which case all is well and as expected.
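For anyone who hasn't hit the "n squared" wall described above, here is a minimal sketch in Python. The task (duplicate detection) and both function names are made up for illustration and do not come from the thread. The first version compares every pair of items, so its work grows with the square of the input size. The second makes one pass with a set and, assuming the items are hashable, does roughly linear work.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: about n*(n-1)/2 comparisons."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Single pass with a set: about n lookups (items must be hashable)."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both return the same answer on any input. The difference only shows up as the input grows: at a million items the pairwise version is on the order of 10^11 comparisons, while the set version stays near 10^6 lookups.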

~~~
thisisit
> who otherwise had never dealt with statistics before.

And why is statistics required? Let's face it _most_ companies who need "Data
Scientists" are looking for regular BI guys with fancy terms. Most of the
problems are solvable using out of box functionalities in python/keras etc.
Sure there are places and problems which require hard mathematics and stats
but those are few and far between.

~~~
mywittyname
This is true in my experience. Data Scientists seem to run the gamut from
"knows SQL" to "has a Ph.D in Behavioral Psych and spent four years getting
scientific results published in peer-reviewed journals".

The company that I work for has changed role titles from Data Analyst to Data
Scientist specifically because people who know SQL, but don't program, won't
apply to/accept jobs without that title.

------
analog31
What I took from the post was not that a "data scientist" qualifies as a
"programmer" in the modern sense, but in the sense of the kinds of things
programmers did in the 1970s. And maybe he's expressing some nostalgia for
those times.

I learned programming around 1982. I didn't pursue a programming career, but
went to college and majored in math and physics. Today I often use programming
in the way that a data scientist might, solving problems using high level
tools. The data that I deal with are physical measurements. I'm not employed
as a programmer.

I also work with a lot of programmers, so I get a glimpse of what they're
doing, maintaining a million-line code base. And I have to admit that being
thrust into that environment would have me waxing nostalgic about the good old
days too. I'm happy doing what I'm doing, and happy that someone knows how to
turn my stuff into production code if it ever gets to that point.

What I'm really doing is applying my domain knowledge in a domain that happens
to depend heavily on computation. To answer Greenspun's question, what I'm
doing is certainly more interesting -- to me. I have colleagues for whom
wrestling with the monster code base, and the kinds of engineering it
requires, are their source of fascination.

~~~
itronitron
Yes, "data scientist" is the new "70's programmer"... write some code in one
file that runs within a hosting system (mainframe, spark).

Regarding the complexity and tedium of many production code bases I think they
got there because many developers don't have the ability (experience) or
opportunity (iterations) to do things simply.

~~~
raducu
True, but the reverse is also applicable: academia more often than not does
not get the chance to do anything really complex in a tight schedule; this
explains what the author says:

"Consider that the “data scientist” uses compact languages such as SQL and R.
An entire interesting application may fit in one file. "

I have seen horrendous, multi-page SQL queries in very large systems.

~~~
collyw
SQL is still one of the best languages for readability in my opinion.

------
minimaxir
> Consider that the “data scientist” uses compact languages such as SQL and R.
> An entire interesting application may fit in one file. There is an input,
> some processing, and an output answer.

The argument this post is making is reductive. Yes, sometimes data science is
simple. Sometimes it isn't, and that's when you _really_ need someone with the
appropriate skillset.

------
portal_narlish
Data scientist, programmer and software engineer are different things. They
are not disjoint by any means, but this guy is conflating them in a way that's
totally wrong.

Software engineers have to _engineer things_. They deal with production
applications, distributed systems, concurrency, build systems,
microservices... coding is sometimes only a small part of the job.

Data scientists nowadays _do programming_ in the interest of research, modeling
and data visualization. But they are not only programmers - they are usually
supposed to have an applied statistics or research background. Some also do
software engineering, especially at companies serving data science/ML in their
products.

A programmer is actually someone like a data analyst or business systems
developer. They don't have to build systems themselves, they just write
loosely structured code against existing systems. Like writing SQL queries for
dashboards, or drop-in code for things like Salesforce. This is probably the
closest thing to what he's describing as the "70s archetype". Minus the deep
optimization stuff.

~~~
Fiahil
I agree with you. I've seen brilliant Data scientists struggling to understand
how git branching works. But, as you say, their principal focus is applied
statistics, not programming.

My role as a software engineer is to create a good enough architecture so
they can properly use the information contained in their 60 GB CSVs.

As a side note, I also noticed that clients have no issue paying _a lot_ for
_Data Science_, but for the "software guys"? That's a whole other story,
despite being of equal importance to the project.

------
Jtsummers
Deming, from _Out of the Crisis_ (1986):

    
    
      People with master's degrees in statistical theory accept
      jobs in industry and government to work with computers. It is
      a vicious cycle. Statisticians do not know what statistical
      work is, and are satisfied to work with computers. People
      that hire statisticians likewise have no knowledge about
      statistical work, and somehow suppose that computers are the
      answer. Statisticians and management thus misguide each other
      and keep the vicious cycle rolling. (p. 133)
    

This is what today's data scientists are. Last century's statisticians,
similarly hired for misguided reasons (we need them because our competitors
have them!).

~~~
thousandautumns
I'm not sure I follow. Specifically, what is meant by, "Statisticians do not
know what statistical work is, and are satisfied to work with computers."?

~~~
Jtsummers
What he meant was that they (being fresh out of school) don't know what
statistical work for _business_ is. They can crunch numbers, but they don't
actually know what questions they need to be answering for their employers.
And their employers don't really know what's needed either, just that they
need a statistician. So they end up doing number crunching on the computers,
which is all fine and good, but of relatively low value. Additionally, and
many who have programmed can attest to this, computers offer an emotional
satisfaction when you work with them: Oh, cool, I finished this neat Travis CI
integration so now my workflow is more automated. But I spent two weeks doing
that and what value have I added to the business? (Not that automation is bad,
but people get distracted by the side problems and not the core problem.)

Of course things have changed, you don't have to invent your own statistical
packages anymore. But see some comments elsewhere in this post: People are
saying that the data scientists are the ones that know process automation
better than anyone else in their offices. The ones who best understand docker
and continuous integration. This leads to a question: Why are they so good at
that? Is it because it's solving real problems for them and letting them be
more effective? Or are they like every "data scientist" in the last couple
offices I worked in: They have no real work to do because no one knows what's
expected of them, so they solve interesting (to them) problems rather than
business problems.

I'm not trying to knock the whole field, but it's a trend we've seen play out
before. Smart companies and smart people figure out that they need X. Or they
discover a technique or process that works well (see devops). They do it, they
create a position called Xian or X Scientist or X Analyst. Now everyone wants
to be like the successful guys and start imitating, without comprehending the
value or purpose of the work or process. Lots of people take on the new title,
schools offer courses in the techniques they use, but with a poor emphasis
(due to time or their own lack of understanding) on the business case for it.

The current trend with data scientist is no different. There's positive value
when it's understood, and negative value when it's not (best worst case: just
an extra body being fed but doing no harm to the business other than the cost
of their salary).

------
manfredo
No, at least in my understanding data scientists specialize in the analysis of
data rather than the development of software. You'd hire a data scientist to
look for interesting patterns in data, or create machine learning models, and
other data analysis tasks. These tasks may involve writing code, but it's
usually specific to data analysis, often in R or Matlab or similar. A lot like
how many people in the natural sciences pick up coding to enhance their
capability, but the software writing is a means to an end.

I wouldn't hire a data scientist to build a web app (well, I would if he or
she had the necessary knowledge and skills - the job title wouldn't be "data
scientist" though). "Software developer" is much closer to "programmer".

~~~
mr_toad
I think the point of the article was that it used to be more common back in
the 60’s and 70’s for programmers to work on data problems. From basic stuff
like census tabulation or designing file systems, to creating trigonometry or
t-statistic tables, to AI.

There was less specialisation, less of a divorce between programmers and
users.

There also seemed to be a conflation of computing and AI back then. Lisp was
considered AI. And the early computing pioneers and theorists were strongly
interested in AI, logic, and mathematics.

------
sixdimensional
This post states that a data scientist uses compact languages such as SQL and
R.

Genuine question - do people really believe that being able to write and
understand complex SQL makes you a data scientist?

I ask because, I've been writing some of the nastiest, most difficult looking
SQL around for probably at least 15 years. And yet, I would NOT call myself a
data scientist because I know and can work with data and use SQL. It might
make me a data engineer.

What would make me a scientist is the process, method and rigor I apply to
data-driven research and in practice. It's not about what tool I use or how
complicated that tool is.

I often get a whiff of imposter syndrome over this because, if being "great at
SQL and R" is enough to get the big bucks as a data scientist, then I'm
clearly doing it wrong. But, then again, maybe I'm being too literal thinking
that a scientist means something different.

~~~
telchar
I've been working as a data scientist for several years and have written some
pretty gnarly looking SQL myself. I have a background in math and hard science
so I have some understanding of the scientific method as well. While I respect
our DBAs I wouldn't call any of them qualified to be data scientists.

While I have been able to hold my own in this job I went back to school to
pursue a graduate degree (partly) because being in the field has shown me how
much more there is to know. While it's easy enough to train a simple model in
R there are so many ways to fool yourself and produce an invalid analysis and
so many variations on otherwise-simple problems.

It seems this field has a lot of variation. A glorified report writer might
get the DS title but they're not going to get the really cool jobs.

If you're interested in data science try out a kaggle competition and try to
place high. The variety of methods and tricks people try to improve their
entries can be illuminating, I think.

~~~
yvrev
I'll preface this with I've not had a look at any Kaggle competition, but I
always assumed Kaggle competitions were on par with programming competitions in
terms of how the skills transfer professionally. A great programmer is not
necessarily great at programming competitions after all.

Am I way off here?

~~~
telchar
No, there's way more to data science than competitions. But for someone who is
already a data engineer more or less, I think it could be a good window into
the complexity of modeling.

------
booleandilemma
Sure, we’ve stolen the term “engineer” for long enough, let’s bother the
scientists now.

~~~
gnulinux
Why is software engineering not a valid engineering? I worked on both software
and hardware engineering and general principles seem to be the same. You deal
with complexity and simplify it by making abstractions. You make calculations
to make sure your project is feasible. It's not like EE and Aerospace
engineering are literally the same field but there are some principles shared
in those fields, and with software engineering. Am I missing something?

~~~
throwaway2048
When software engineers stop disclaiming all liability for their products
failing, we can talk about them like we talk about engineers (sidenote: some
already take responsibility)

~~~
M_Bakhtiari
It's foolish for anyone to take responsibility for any software written under
the current prevailing industry practices.

It would be funny if real engineers were able to get away with making
crumbling messes that can't hold their own weight because their middle
managers don't believe in concepts like stress and strain like ours don't
believe in refactoring or abstractions.

------
3chelon
I've interviewed a few "data scientists". Some of them were pretty arrogant.
Their idea of a "close to the metal" language was Numerical Python. I don't
think these guys are going to be writing the next generation of OS anytime
soon.

~~~
alvis
Never. I'm a data scientist myself and know many other so-called data
scientists. But coming from an engineering background, I pretty much agree
with you. And interestingly, all of the people I know who claim they want to write
an OS one day are data scientists. Seriously! They don't even know what
unit testing is!!!

~~~
serversystem
Why would data scientists aspire to write an OS? Sounds puzzling.

~~~
3chelon
I was talking from the viewpoint of the OP who was asking if data scientists
were replacing programmers.

~~~
yorwba
The OP is explicitly about the kind of programming work where "An entire
interesting application may fit in one file.", which doesn't tend to apply to
OS's nowadays.

------
danso
This is such a bizarre post. The reason why people use a language like R is
because it is easy to learn and use (and install, via RStudio) for data
analysis without having to be a well-trained programmer. I can’t recall ever
hearing from anyone who has relied on R, doing so because it was
computationally efficient. The point of the language is convenience —
particularly with how easy it is to create attractive graphics using ggplot2’s
defaults.

It’s a testament to the R library’s developers (particularly Hadley Wickham)
for making APIs that do so well in streamlining data work. But I’m willing to
bet a majority of R users, particularly in academia, could not load a simple
delimited data file without a high-level call such as read.csv.

(By “simple”, I mean a delimited text file that could be parsed with regex or
even split. I don’t expect the average person to be able to write a parser
that dealt with CSV’s actual complexity)
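To make the "parsed with split" case concrete, here is a minimal sketch. It uses Python as a stand-in, since the comment itself is about R's read.csv. The function name `read_simple_delimited` and the sample data are hypothetical. It assumes exactly the "simple" case described above: one header line, one record per line, and no quoted fields or embedded separators.

```python
def read_simple_delimited(text, sep=","):
    """Parse simple delimited text into a list of dicts keyed by the header.

    Assumes no quoting and no separators inside fields; every value
    stays a string.
    """
    # Drop blank lines so a trailing newline does not produce an empty record.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split(sep)
    rows = []
    for line in lines[1:]:
        fields = line.split(sep)
        rows.append(dict(zip(header, fields)))
    return rows


sample = "name,score\nada,90\nbob,85\n"
rows = read_simple_delimited(sample)
```

This breaks as soon as a field contains a quoted separator such as `"Smith, J."`, which is the "actual complexity" that a real CSV parser (for example Python's standard `csv` module) exists to handle.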

~~~
EdwardDiego
The fact that R has such buy-in despite being a rather awful programming
language (a friend of mine worked on the next Lisp-like version of R under
Ross Ihaka, and the next version is based on the fact that current R is a bit
awful) is precisely because it offers such convenience to non-programmers.

In my sister company, they have data scientists, and data engineers. The data
scientists write their algorithms in the language they're most comfortable
with (typically JS), and the data engineers rewrite to perform efficiently in
the application that's applying them.

Data scientist and programmer are two very different specialisations.

~~~
110285591136
> ... despite being a rather awful programming language (...) it offers such
> convenience to non-programmers

I've heard people say similar things about MATLAB - that it's a poorly
designed language, but that many people (mostly non-CS folk) use it out of
convenience.

Can someone with experience using R explain what makes it so appealing to non-
programmers? It seems like these two factors, "poorly designed" and "easy to
use", should be at odds with each other.

~~~
peatmoss
Eh, it’s not as bad as people like to whinge that it is. There are indeed
warts, but they’re pretty overblown. If you are comfortable with functional
idioms R mostly does what you want without a great deal of fuss. If you’re
predisposed to procedural idioms, then you’re going to be fighting the
language.

I started learning R about the time I started reading How to Design Computer
Programs, and I found it pretty easy to transfer that model of thinking to R.
And I find Clojure, Racket, and Scheme to also be somewhat comfortable after a
short reacclimation period.

Some of the convenience bits have to do with most functions working on vectors
without needing to explicitly iterate most of the time. Also libraries. If you
want to estimate a linear regression, or make some exploratory plots, or try
some rando statistical method that your graduate advisor suggests, you don’t
have to worry about whether it’s already been implemented for you in R.

You can do a lot of heavy lifting by cribbing off of example code because most
code is short. You just get heaps of leverage by using R.

Look, I like to do things the hard way a lot. My whole life is pretty much a
string of highest friction path choices. For data science R is easy because
all the work has been done for you. It's the difference between writing GUI
apps against Cocoa APIs vs, I dunno, XLib or Motif.

------
aogl
In my company I am a software engineer and my colleague is a data scientist,
our current project that we work together on does a lot of NLU and NLP type
work (think bots) and our skillsets often don't overlap and are both equally
valuable to the project's success. That is, I tend to write the infrastructure
and platform code that ties everything together and deal with all the software
engineering type work, while my data scientist feeds in trained models and the
likes. Both are necessary to handle contractual requests/responses as per our
scope design.

~~~
gnulinux
My experience is very similar to this as a "software engineer" in a company
that has a 50/50 split of software engineers and data scientists.

------
DonHopkins
There was a sign on the door to the Vax Lab at the University of Maryland that
said "Department of Research Simulation".

~~~
falcor84
Haha, it seems that quite a few scientists in academia are simulating the act
of performing research ;)

~~~
DonHopkins
In my case, staying up all night playing SimCity in the computer lab was
actually good preparation for a career in simulation game programming.

[https://medium.com/@donhopkins/designing-user-interfaces-
to-...](https://medium.com/@donhopkins/designing-user-interfaces-to-
simulation-games-bd7a9d81e62d)

------
dekhn
A data scientist is just a statistician who works in the Bay Area.

------
screye
As someone who has been looking for Data Scientist jobs in the past few
months, I can reliably say that the term can mean anything from a software
engineer for big data systems, to a SQL guy, to a person who builds complex
machine learning models.

It is just as vague as the job profile of a "programmer". In that sense, the
title is right. But, in the context of the article's content, I disagree.

The job done by a data scientist in demanding roles requires a strong grasp
of undergrad-level statistics. But because of the recent trends towards ML,
the person also needs to have a strong grasp of linear algebra, vectorization
and software engineering / undergrad algorithms.

While it is unlikely that one data scientist may need to summon the whole
skill set, an interviewee will never know which subset of these skills they
will be asked to demonstrate to get hired.

Modern software jobs have figured out distinct subsets of skills needed to
differentiate between different software roles for experienced employees.
Junior level employees are barely even expected to know anything other than
algorithms, data structures and high level system design (at least during
interviews)

Another funny observation (anecdotal) is there seem to be more openings for
"senior data scientist" (who is expected to know everything), than "junior
data scientists" whom the company is willing to mentor.

As of now, I find myself scrambling to decide which skills I need to
prioritize, often feeling like I am being pulled in opposite directions.
Almost all of them require formal instruction (the maths), and can't be picked
up like software skills through youtube and online projects. This isn't a knock
against software, just a different type of subject matter.

Companies interviewing for these roles may ask everything from leetcode
algorithms questions to statistics to questions about modern ML algorithms and
domain specific models (in NLP, Vision, finance, recommenders)

I personally find a "junior" Data Scientist's role (in expectations) to be
harder than that of a junior SDE. There is a reason many of these jobs put a
PhD into preferred qualifications. It is ironic that there has been such a
massive surge of people without the necessary background, who do a couple of
MOOCs and crown themselves data scientists. Being good at any software & math
heavy domain is hard. Data Science is no exception.

------
rawoke083600
Forgot who said it but it was great: "A data scientist is a programmer better
at stats than any 'normal' programmer and better at programming than any
'normal' statistician." :P

~~~
screye
This is the best short form definition of a data scientist I've heard yet.

------
jeffreyrogers
There is a good comment on the original article by a user named LauraConrad.
I'm excerpting it so HN readers will see it:

> I was a “Programmer” in the ’70’s, and I keep thinking how much of what my
> early programs did would be done by a spreadsheet now (or any time since the
> late ’80’s).

------
alkonaut
I thought data scientist was closer to doing the work of a statistician than a
programmer. Visualizing data, and analyzing data. Programming becomes part of
it by necessity.

Data science is also a much sexier term than statistics, just like "Machine
learning" and "Artificial intelligence" are a lot sexier than, say,
"Regression".

As someone funnier than me put it: "A data scientist is just a statistician
with a mac".

~~~
sgt101
I prefer: "What's a data scientist -> a scientist"

It doesn't capture the whole, but it's a powerful way of thinking about what
the profession should really be trying to be.

------
theoh
The basic premise of the article is that

systems programming=irrelevant bloat and abstraction

while

data reduction=definite purpose and utility

People writing python notebooks to do data analysis are probably fairly
comparable to the scientific computing programmers of the past, but I feel
like this picture tends to dismiss the computer science side of systems
programming: things like GUIs, network code, processes and virtual memory, all
the architectural aspects of computing.

One might prefer APL or Forth for writing one-page programs, and it's probably
true that systems now are bloated relative to what they could be. Still, there
is much of interest going on in a typical operating system, compiler or video
game, while a typical data analysis notebook is IMO fairly dull and even
basic, from a software angle.

~~~
andyburke
Yeah, the author's take is myopic. What they call bloat, people from the 70s
would call wondrous: ubiquitous networking with and without wires, beautiful
graphical interfaces, encryption everywhere (and expanding), far more open
systems than proprietary re-engineered ones, the list goes on and on.

~~~
cdcfa78156ae5
The author is Philip Greenspun, who in the 1980s worked with the people that
created all of the things you listed:
[https://en.wikipedia.org/wiki/Philip_Greenspun](https://en.wikipedia.org/wiki/Philip_Greenspun)

There is nothing myopic about his perspective.

~~~
theoh
It's fair to mention that he is well known, though in fact I'm one of the old
guard that remembers when he had a higher profile.

But, as with a new Paul Graham essay, surely we can critique the blog post on
its merits instead of falling back on an assessment based on some kind of
appeal to authority/"expertise by association". Philip Greenspun doesn't need
to be treated with kid gloves as if he was the pope.

John Ousterhout made comments that touch on some similar (though not
identical) distinctions in programming practices. That was years ago, and he
was then a much more credible figure in software than Greenspun. All the same,
his essay was heavily criticised. That's what serious intellectual discussion
should involve.

[https://en.wikipedia.org/wiki/Ousterhout%27s_dichotomy](https://en.wikipedia.org/wiki/Ousterhout%27s_dichotomy)

[http://www.tcl.tk/doc/scripting.html](http://www.tcl.tk/doc/scripting.html)

------
coryfklein
If it means we now get a term that, at least for a couple of years, filters
out all the garbage roles recruiters throw at me then I'm on board with
adopting this terminology.

------
dschuetz
I wonder why "data engineer" isn't one of the suggested terms. Scientists do
not really program science, nor do programmers research programs, as their
respective fields of expertise.

~~~
3rdAccount
To be nitpicky, in the US, engineer means you graduated from an ABET
accredited program in something like: Chemical engineering, mechanical
engineering, civil engineering, electrical engineering, industrial
engineering, computer engineering....etc.

That is not to say programming isn't a difficult job that requires a lot of
analytical and creative thinking similar to an engineer. The difference is in
getting a degree in something that has 4-5 years of calculus based math,
physics...etc classes. There is also a rigorous 8 hour test to get a license
after 4 years on the job.

I guess the broad term of building something and doing analysis fits here, but
I don't see any Data Engineers in practice. What I see are Data Scientists and
Data Analysts. Of course I'm arguing over semantics here, but it is important
to get the distinction correct.

~~~
vostok
> To be nitpicky, in the US, engineer means you graduated from an ABET
> accredited program in something like: Chemical engineering, mechanical
> engineering, civil engineering, electrical engineering, industrial
> engineering, computer engineering....etc.

Do you happen to have a reference for this? At first glance, it seems to be
incorrect rather than nitpicky.

Anecdotally, I know plenty of people who do not have ABET accredited degrees
and have "engineer" in their title in the US.

~~~
ldarby
[https://motherboard.vice.com/en_us/article/vvapy4/man-fined-...](https://motherboard.vice.com/en_us/article/vvapy4/man-fined-dollar500-for-crime-of-writing-i-am-an-engineer-in-an-email-to-the-government)

(There are lots of other articles about that case, that one sums it up mostly
in the url)

~~~
learc83
The OP says "the US". That article is one state in the US. Most states don't
have restrictions on the use of Engineer in job titles. Canada does though.

------
ww520
Isn't it the new term for report writer?

~~~
pvarangot
Intern Analyst Automation Engineer.

------
d--b
This is total BS.

> Does the interesting 1970s “programmer” job still exist?

Sure! Go right there:
[https://www.linkedin.com/jobs/cobol-jobs/](https://www.linkedin.com/jobs/cobol-jobs/)

And enjoy the not-bloated-at-all systems you'll find there!

------
Silhouette
_How did applications get so bloated and therefore boring to look at?_

That's an easy one. Too many people who lacked sufficient experience to make
their own informed judgements yet trusted consultants peddling soundbites
instead of skilled and experienced developers who knew better.

If you have ever read a book or watched a talk by someone who advocated very
short functions and minimal nesting, and you subsequently adjusted your
personal programming style or corporate coding standards as a result, please
do yourself a favour and go back and look at whether they offered any evidence
-- anything at all -- or even just a reasonable argument that stands up to
scrutiny -- to support their position.

The relatively plentiful resources in a lot of modern systems do remove one
barrier that forced developers to do better, but I don't really believe that's
a big factor. It's more that when you have an industry so focussed on young
people, a lot of what happens is the one-eyed leading the blind, because too
many people who have been around long enough to see the big picture get
shipped off to management or other positions before they can pass on what
they've learned widely enough to advance common practice.

------
denzil_correa
Data Scientist has two terms in it : Data + Science. More often than not,
people ignore the "Science" part of that equation.

~~~
httpz
Someone said any field with "Science" in the name isn't really a science.
Computer science, data science, political science, social science, etc.
Physics, chemistry, biology don't have science in their name.

~~~
mycorrhizal
>> Someone said any field with "Science" in the name isn't really a science.
Computer science, data science, political science, social science, etc

The etc. would also include cognitive science/neuroscience, medical science,
earth science, material science, agricultural science, veterinary science,
geoscience, food science, etc.

And of course as we all know climate science is fake. /s

Generally, when I hear a field with the word "Science" in the name, I think of
it as a more interdisciplinary field. Take Earth Science: it draws on
different areas of physics (e.g. wave physics), biology (e.g. ecology) and
chemistry (e.g. kinetics). Earth science is still very much science; it just
doesn't fit perfectly into the more foundational fields.

------
joker3
The output of a data scientist's work includes plenty of things that aren't
code. Yes, the code that I write tends to be very short, but if it represented
everything I had to do to get there it would be quite a bit longer.

------
tanilama
I don't think so.

Per my observation, the most 'interesting' part of a data scientist's job is
storytelling, that is, using data analysis to draft a theory to push forward
for product direction. Some ML engineers work under the Data Scientist
umbrella, but since the DL thing happened, they are now being put under even
fancier titles like AI Engineer or such.

So data scientists are really product managers/owners with analysis skills. Is
this job interesting? For sure, when it follows this definition. Interesting?
That only depends on the problem domain, not the title, IMHO.

------
garyclarke27
True. Superficially, SQL may appear to be simple, old-fashioned and a bit
verbose, but once you are expert with it (which takes at least 5 years) it is
amazingly powerful. It operates at a much higher abstraction level than Java,
Python, etc., so it is, I would guess, 25 times more expressive. In
PostgreSQL, pure SQL CTEs give you variables and recursion, and PL/pgSQL gives
you dynamic SQL for macro/meta programming. If you use immutable tables, it
can be purely functional. SSDs and now even faster Optane memory have resolved
the I/O problem which handicapped RDBMSs until recently.

~~~
mgummelt
SQL is not more expressive than Turing Complete languages, no.

~~~
garyclarke27
They’re all Turing complete, including SQL with CASE and recursion. I meant
density: 1 line of code, a SQL window function with a filter clause, would
probably take a page of Java to achieve the same result.

~~~
YawningAngel
Nope, Java has map and filter just fine. Eg

```
// words is assumed to be a List<String>
words.stream()
     .map(word -> word.toUpperCase())
     .filter(word -> word.startsWith("A"))
```

~~~
JohnCohorn
SQL window functions aren't rocket science (not that I've used them much,
because ORMs and popular stripped-down DBs like MySQL tend not to support them
very well), but they do a lot more than you think if you're comparing them to
trivial map/filter operations.
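
To make the comparison concrete, here is a rough sketch of what one windowed
aggregate involves when written by hand in Java. Everything in it is an
assumption for illustration: the `sales` table, its `region`, `day` and
`amount` columns, and the class and method names are all made up. It also
simplifies the SQL semantics by assuming rows arrive already ordered by day
with no ties, and by reporting 0 where SQL would return NULL for an empty sum.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowSketch {
    // Approximates this hypothetical query:
    //   SELECT region, amount,
    //          SUM(amount) FILTER (WHERE amount > 0)
    //            OVER (PARTITION BY region ORDER BY day) AS running
    //   FROM sales;
    // Assumption: the two lists are parallel and already ordered by day.
    public static List<Integer> runningPositiveTotals(List<String> regions,
                                                      List<Integer> amounts) {
        // One running total per partition (region).
        Map<String, Integer> totalByRegion = new HashMap<>();
        List<Integer> running = new ArrayList<>();
        for (int i = 0; i < regions.size(); i++) {
            int amount = amounts.get(i);
            // FILTER (WHERE amount > 0): non-positive rows contribute nothing,
            // but they still get an output row carrying the current total.
            int contribution = amount > 0 ? amount : 0;
            int total = totalByRegion.merge(regions.get(i), contribution, Integer::sum);
            running.add(total);
        }
        return running;
    }

    public static void main(String[] args) {
        List<String> regions = List.of("east", "west", "east", "west");
        List<Integer> amounts = List.of(10, 5, -3, 7);
        System.out.println(runningPositiveTotals(regions, amounts)); // [10, 5, 10, 12]
    }
}
```

The point is not the line count, which is modest here, but that every output
row depends on state carried across earlier rows in its partition, which a
stateless map/filter chain does not express.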

------
brainpool
Data scientist is a misnomer except when there is a relevant Ph.D. and that
was never the bar for a programmer.

------
shrimpx
I would say "data engineer" is the new programmer, in that programming is
evolving away from procedural monolithic threaded code with locks everywhere,
to distributed message processing pipelines whose capacity can be flexibly
adjusted, etc. "Data scientist" is an actual role at some companies but most
data scientists are actually struggling with the contradiction between what
they learned in school and the harsh reality of what their job demands of
them.

Edit: misspelling.

~~~
linux2647
So data engineering would be a subset of software engineering?

------
Wed19Sep
The author simply has a grudge towards over-engineered code such as
[FizzBuzz Enterprise Edition](https://github.com/EnterpriseQualityCoding/FizzBuzzEnterpriseEdition).
Not all modern coding is like that, and you certainly cannot build complex
software (at least maintainable complex software) without abstraction layers.

------
mtrycz2
One thing I don't like about my source code is it "doing something
interesting".

I like my code to be boring. I like my frameworks to be boring. I like my APIs
to be boring. So I can focus on important things in life (or even the
important things at work), and be done with it.

------
nautilus12
Hacker News, finding new ways to make your job feel trivial since 2007.

------
dekhn
Similar story: once, when Googlers were talking about full stack engineers,
Rob Pike piped up to say they weren't really full stack engineers; he said he
was a full stack engineer back when he worked on the Voyager project because
he had to know everything from silicon quantum mechanics up to interplanetary
astrophysics.

Sure, there are parallels between modern programmers and the scientist-
engineer-calculators of the past.

------
tw1010
I would bet a lot of money on the fact that that isn't the case. (In fact, I
am betting a lot of money, in the shape of my choice of career.)

------
fogetti
Well I am "lucky" enough that I can work on a legacy application written in
OpenACS. It wasn't written in the 70's but it's definitely old and outdated.
So this kind of narrative that everything was better back then, simply does
not convince me at all. I might be wrong of course, but the author tells
anecdotes, which are not a real argument, so there is that.

------
sigfubar
No. Data scientists exist at the mercy of programmers: without the tooling and
the pipes, data science would not be a going concern.

~~~
closeparen
Hah! At my company a decent proportion of engineers spend their lives
scrambling to productionalize and operate the Lovecraftian concoctions of R
and Python that our data scientists cook up on their laptops.

~~~
natalyarostova
As a data scientist, I view other data scientists who need engineers to
productionize their code with contempt. Perhaps (probably?) their data
pipelines and systems are sufficiently more complicated than mine, but I'd
feel embarrassed if I couldn't write production Python code.

~~~
blt
They are different skills. Not saying that it's hard to learn both, but there
are standardized career paths that will lead you to be good at the modeling /
techniques side of "data science" without learning much about software
engineering. For example, studying math in undergrad. And there are certainly
lots of people capable of productionizing messy R scripts without fully
understanding the statistical ideas behind them. So I think, as a team leader,
you are restricting yourself to some degree if you only hire people who can do
both.

------
remram
Why is Java, a byte-compiled statically-typed language, "bloated", while R or
SQL, interpreted scripting languages, not? I find the point hard to follow.
Execution of an R script will go through many more "layers", downloaded from
many different sources, and implemented in different languages.

------
mollusk
I don't know, the number of hoops you need to jump through to use a trendy
data science tool like Hadoop, Spark etc. is way bigger than that of a simple
Java program. From my (limited) experience I'd say the data science (or
big data) way is the bloated and convoluted one.

------
casper345
> How did applications get so bloated and therefore boring to look at?

I love reading code at this level. 1. You get to see into the mind of the
programmer and learn new techniques. 2. Boring?! 3. A good text editor makes
it attractive to look at for hours :)

------
CodeSheikh
In the absence of marketeers (think bootcamps) and recruiters the alternative
or correct title would have been: 'Is “statistician” the new “programmer”?'

~~~
smt88
A lot of data scientists are doing very light statistics, though. Many of them
don't have formal degrees in stats at all. There's a huge range of ability
represented by that job title, and "statistician" doesn't capture the lower
end.

------
_pmf_
Programmer who has refreshed his high school math skills.

------
dchichkov
You can shape software out of chaos. You can shape software out of order. Both
are just sides of a multifaceted field called Software Engineering.

------
riskneutral
I thought “engineer” is the new “programmer.”

~~~
gilbetron
It's the old new term ;)

------
itronitron
data scientists write code for themselves while software developers write code
for other people

------
itronitron
only if "software engineer" is the new "system administrator"

~~~
gnulinux
This doesn't make any sense. My job title is "software engineer" and I never
do any system administration. I produce code in python, javascript, C and SQL;
never do any sort of administration. Sure, I occasionally deal with linux
since our servers are linux and so some knowledge of it is useful, and I use
unix tools pretty extensively (in OSX) since I prefer to write code this way.
All the "software engineer"s I know have similar experience to mine with
varying languages, so please suggest some evidence.

~~~
itronitron
I agree that it doesn't make any sense, however I recently interviewed at two
different companies for a software engineering role and both had
requirements/expectations for sys admin experience. I will refrain from going
off on a rant and just remain hopeful this is yet another short-lived trend.

------
_Codemonkeyism
No.

------
CyberDildonics
Is marketer the new journalist?

~~~
dschuetz
Is online journalist the new marketer?

~~~
CyberDildonics
I would say without any hyperbole, absolutely.

~~~
iamdave
Wait. Then what are "influencers"? The feedback loop seems to be eating
itself.

~~~
asknthrow
The human serpent of advertising.

------
drej
No.

[https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...](https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines)

------
drannex
coder -> programmer -> developer -> data scientist

------
speedplane
Everyone is a programmer now. Data scientists, accountants, marketers,
doctors, lawyers, project managers. Knowing how to write programs is just
knowing how to write.

