
The eigenvector of “Why we moved from language X to language Y” - platz
https://erikbern.com/2017/03/15/the-eigenvector-of-why-we-moved-from-language-x-to-language-y.html
======
The_suffocated
The research methodology in this blog post is fundamentally flawed. The author
only counts how many people move from X to Y, but he doesn't count how many of
them do not move at all. The diagonal entries of his (sample) transition
matrix are actually missing values, but he treats them as zeroes. This greatly
distorts the equilibrium distribution. As a result, he misinterprets each
equilibrium probability as the "future popularity" of a language as well, when
it at best only represents the future popularity of a language _among those
who constantly switch their languages_.

~~~
erikbern
Author here. You are absolutely right. As I mentioned in the notes, I think
this matters a bit less than it might seem like (the stationary distribution
does not change if you add a diagonal matrix) but clearly some languages will
have a higher propensity for people to stay.
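
The point about the diagonal can be checked on a toy example. The sketch below uses a made-up 3-language matrix (not the post's data), and the names `stationary`, `lazy`, `sticky` and `stay` are hypothetical. It suggests that giving every language the same stay-probability leaves the stationary distribution unchanged, while language-specific stay-probabilities reweight it:

```python
import numpy as np

def stationary(P):
    """Left eigenvector of row-stochastic P for eigenvalue 1, normalized to sum to 1."""
    eigenvalues, eigenvectors = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigenvalues - 1.0))
    pi = np.real(eigenvectors[:, k])
    return pi / pi.sum()

# Made-up switching probabilities among three languages (rows sum to 1, zero diagonal).
P = np.array([[0.0, 0.5, 0.5],
              [0.2, 0.0, 0.8],
              [0.3, 0.7, 0.0]])
pi = stationary(P)

# Same stay-probability for every language: 90% stay, 10% switch according to P.
lazy = 0.9 * np.eye(3) + 0.1 * P

# Language-specific stay-probabilities (assumed values, purely illustrative).
stay = np.array([0.5, 0.9, 0.99])
sticky = np.diag(stay) + (1.0 - stay)[:, None] * P

print(pi)
print(stationary(lazy))    # same distribution as pi
print(stationary(sticky))  # proportional to pi / (1 - stay)
```

In this toy setup the "sticky" result is the original distribution with each entry divided by that language's probability of leaving, so languages that people rarely leave gain share.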

I think this flaw is even smaller than the issue of using Google statistics to
infer transition probabilities. It's just a shitty proxy, at best.

At the end of the day, there's a lot of assumptions going into this analysis.
I hope I didn't make it seem more serious than I meant it to be – it's really
just a fun project and kind of a joke, not meant to be taken seriously.

That being said, I think the conclusions are at least "directionally" correct.
They might be off by a factor of 2x or 5x or even 10x, but the stationary
distribution exhibits an even bigger spread (multiple orders of magnitude) so
I suspect the final ranking is still "roughly" correct (with a very liberal
definition of "rough")

~~~
j2kun
The thing I like most about this post is that it's falsifiable. We will know
in ten years whether C and Java are still popular, and whether Go succeeds in
the sense this data suggests. So thank you for being concrete and clear, even
if it's all in fun and other people don't like it :)

~~~
thaumasiotes
> whether Go succeeds in the sense this data suggests

An interesting thing about this methodology is that it is extremely sensitive
to the age of a language. It's possible to switch _from_ an old language _to_
a new language, but not the other way around -- so if you happen to do your
measurements after a language has had some uptake but before it's been around
for long enough that people have built significant projects on it and
subsequently gotten sick of it, the future distribution by this method can
only be 100% New Language. (Because sometimes people switch to New Language,
but no one ever switches away.)

Actually, to predict the future distribution of language use, you also need to
know the rate of people moving from nothing ("I just had a brilliant idea!")
to each language. If everyone eventually transitions to Go, but everyone
_starts_ in Ruby, then the division of market share between Go and Ruby
depends in part on how frequently people start new projects.
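
A minimal sketch of the Ruby/Go scenario above, under assumptions that are not in the post: each period a fraction `new_rate` of projects is replaced by brand-new ones, every new project starts in Ruby, and 10% of Ruby projects move to Go per period. The function name `long_run_share` and all numbers are made up for illustration:

```python
import numpy as np

def long_run_share(P, start, new_rate, iters=5000):
    """Iterate share <- (1 - new_rate) * share @ P + new_rate * start."""
    start = np.asarray(start, dtype=float)
    share = start.copy()
    for _ in range(iters):
        share = (1.0 - new_rate) * (share @ P) + new_rate * start
    return share

# States: [Ruby, Go]. 10% of Ruby projects move to Go each period; nobody leaves Go.
P = np.array([[0.9, 0.1],
              [0.0, 1.0]])
start = [1.0, 0.0]  # assumption: every new project starts in Ruby

few_new = long_run_share(P, start, new_rate=0.01)
many_new = long_run_share(P, start, new_rate=0.20)
print(few_new)   # Ruby keeps a small share when new projects are rare
print(many_new)  # Ruby keeps a large share when new projects are frequent
```

Without the new-project term this chain would end up 100% Go. With it, Ruby's long-run share works out to `new_rate / (1 - (1 - new_rate) * 0.9)`, so it depends directly on how often projects are started.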

~~~
DocSavage
The sorted stochastic matrix shows that C contradicts your assumption that
it's not possible to switch from a new language to an old one. Or, at least,
it shows that portions of new language code are occasionally rewritten in C.

~~~
kem
What's missing from the matrix is "no language" language or null language.
That is, a column and row that represents people who start projects from
scratch in a given language.

I agree that the analysis makes it abundantly clear people move to older
languages, but the question is what new projects are started in, and how many
projects represent new versus transitioned projects.

This analysis is interesting, and gives a rough idea of what people are moving
from and to when they decide to do that, but not necessarily popularity.

What the author is indexing in the end isn't really predicted overall language
use, it's predicted transition target frequency.

------
Maro
I wish more people would read 'Hack and HHVM', written by Owen Yamauchi,
formerly a member of Facebook’s core Hack and HHVM teams.

[http://shop.oreilly.com/product/0636920037194.do](http://shop.oreilly.com/product/0636920037194.do)

The hidden lesson for me was that rewriting the code in <new-language> is not
the only option. Another option is to slowly improve the language/runtime
itself until you've essentially switched it out underneath the application,
which is what happened at Facebook. Meanwhile keep refactoring the code to
take advantage. (Granted, this isn't an option for a small company.)

I work at Facebook and sometimes write www code. When I was interviewing and
thought about writing PHP code, I didn't get a warm and cozy feeling, being
reminded of terrible PHP code I've seen (and written myself) in the 2000s as a
"webdev". Thanks in part to the advances described in the book, the codebase
is definitely not like that; it's easily the best large scale codebase I've
ever seen (I've seen 3-4).

My thoughts in blog form (written 3 months after I joined):

[http://bytepawn.com/hack-hhvm-second-system-effect.html](http://bytepawn.com/hack-hhvm-second-system-effect.html)

~~~
noir_lord
As an unintended side-effect (maybe), the mere existence of HHVM acted as a
spur to PHP generally, things have improved radically in PHP-land over the
last few years.

------
jeremyjh
I don't think the Google queries measure what the author thinks they do.

I noticed it lists 14 results from Haskell to Erlang, which I was skeptical
of. When I google "move from Haskell to Erlang" or "switch from Haskell to
Erlang" I do find results (such as Quora questions, versus questions, lecture
notes) but none of those results are the type of article we're looking for.

If they really want to do this, I think they also need to validate that some
of those keywords are in the title of each page.
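
A rough sketch of such a title check, assuming the page titles have already been fetched somehow (the post's script only used result counts, as far as the thread describes it). The helper name `is_migration_title` and the sample titles are hypothetical:

```python
import re

def is_migration_title(title, src, dst):
    """True if the title contains 'from <src> to <dst>' as whole language names."""
    pattern = (
        r"(?<!\w)from\s+" + re.escape(src) +
        r"\s+to\s+" + re.escape(dst) +
        r"(?![\w+#])"  # keeps "C" from matching inside "C++" or "C#"
    )
    return re.search(pattern, title, flags=re.IGNORECASE) is not None

titles = [
    "Why we moved from Haskell to Erlang",
    "Haskell vs Erlang: lecture notes",
    "Should I switch from haskell to erlang?",
]
hits = [t for t in titles if is_migration_title(t, "Haskell", "Erlang")]
print(hits)
```

This still lets question-style titles through, so it narrows the results rather than validating them.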

~~~
schoen
I'm also concerned that for some blogs, a single post might appear several
times in Google's estimated search results (due to crawling the same blog
under different hostnames or the same post under different paths, or because
of syndication or posts to link aggregators). So maybe some individual posts
reflecting particular teams' decisions are reflected 5, 10, or 100 times in
the Google result count.

------
ISL
Cache:
[https://webcache.googleusercontent.com/search?q=cache:XQb6R9...](https://webcache.googleusercontent.com/search?q=cache:XQb6R9BYLEQJ:https://erikbern.com/2017/03/15/the-eigenvector-of-why-we-moved-from-language-x-to-language-y.html+&cd=1&hl=en&ct=clnk&gl=us)

~~~
chalst
Is the table itself cached anywhere?

------
jlrubin
I appreciate that the author wanted to implement their own eigen vector/value
method, but really they should use:

    
    
        import numpy
        numpy.linalg.eig(x)[0]   # eigenvalues (index 1 holds the eigenvectors)
        numpy.linalg.eigvals(x)  # eigenvalues only
    

Numerical stability can be hard to get right...

~~~
mathgenius
Yes, but... The matrix has all non-negative entries, and the author is after
the highest eigenvalue/vector so I think this means stability is just not an
issue. The only possible issue is time to convergence.

The nice thing about the power method is its conceptual simplicity. In cases
like this, it's quite hard to screw it up. And it will scale far beyond those
numpy functions (not that this is needed for this example.)
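
For reference, a minimal sketch of the power method on a made-up 3-language matrix, cross-checked against `numpy.linalg.eig`. The matrix, the tolerance and the name `power_method` are assumptions for illustration, not the post's code:

```python
import numpy as np

def power_method(P, tol=1e-12, max_iter=10_000):
    """Repeatedly apply pi <- pi @ P, starting from the uniform distribution."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P  # one vector-matrix product per step
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

# Made-up row-stochastic switching matrix (rows sum to 1).
P = np.array([[0.0, 0.6, 0.4],
              [0.3, 0.0, 0.7],
              [0.5, 0.5, 0.0]])

pi_power = power_method(P)

# Cross-check: left eigenvector of P for the eigenvalue with the largest real part.
eigenvalues, eigenvectors = np.linalg.eig(P.T)
k = np.argmax(np.real(eigenvalues))
pi_eig = np.real(eigenvectors[:, k])
pi_eig = pi_eig / pi_eig.sum()

print(pi_power)
print(pi_eig)
```

The iteration only needs the matrix to be row-stochastic, irreducible and aperiodic, which holds for this toy matrix.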

Also, did anyone mention PageRank yet?

~~~
howeman
Why do you say it will scale far beyond? Mat mul is N^3 as is eigenvalue
solving.

It's actually the second highest eigenvalue. The highest eigenvalue is always
1 for stochastic matrices.

~~~
j2kun
Power method is not matrix-matrix multiplication (which is not N^3, BTW [1]),
but rather matrix-vector multiplication. So the power method is N^2*k where k
is the number of iterations required to reach precision (usually
polylogarithmic).

All this being said, scalability is _obviously_ a non-issue when talking about
a matrix of programming languages. All methods are constant time.

[1]:
[https://en.wikipedia.org/wiki/Matrix_multiplication_algorith...](https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Sub-cubic_algorithms)
Interesting tidbit: nobody can even prove it's not N^2 :)

~~~
howeman
Oh right, of course, because you're iterating the distribution. Duh.

And yeah, mat mul is not N^3 theoretically, but most implementations are. I've
heard that some (MKL maybe) are N^2.8, but nobody has pointed me to the code.
My personal attempts at implementing Strassen were slower than a tuned N^3
implementation, at least for matrices that fit into memory.

------
hasklel
10000 vocal webdevs make a blogpost about moving from Node/Python/Ruby to Go
because their app is slow as shit and the JVM isn't trendy enough for them.
Also I wonder if I'm reading this correctly but are there actually people
moving from Cassandra/DynamoDB to Mongo??

~~~
anvildoc
there is clearly a mistake in the database graph ... that it must be
inverted... because also no one is moving from mariadb to mysql

------
tray5
Well the fact that C is going so strong guarantees we'll be dealing with
easily preventable bugs for the next 100 years

~~~
pjmlp
Sadly getting rid of C means getting rid of UNIX, as they are symbiotic and
UNIX vendors will surely never rewrite their systems in anything else or
replace the POSIX standard.

~~~
rocqua
C and, say, Rust or C++ can interface.

You don't need to rewrite; just stop writing extra stuff in C, and maybe when
you do a really big refactor of C code, port it. In the end, we can migrate
away from C gradually.

That makes me wonder, is there any chance in hell we get some Rust in the
Linux source code?

~~~
viraptor
Upstream? Likely never. Out of tree?
[https://github.com/tsgates/rust.ko/blob/master/README.md](https://github.com/tsgates/rust.ko/blob/master/README.md)

~~~
majewsky
I recall that Linus once said on the LKML that he could see himself accepting
Rust code into the kernel. But I cannot find the source right now.

------
fishnchips
Am I reading this incorrectly, or is there more movement from Swift to
Objective-C than the other way around? Do I sense a methodological error?

~~~
thehardsphere
I had this exact reaction to the graphs that also showed:

1. Movement from Postgres to MySQL

2. Movement from MariaDB to MySQL (and NOT the other way around?!?)

3. Movement from PHP to Java (I remember the sort of people leaving Java for
PHP 10 years ago, and I don't think they'd go back, or that PHP people would
pick Java as their choice to move to)

I think maybe he has the axes labeled wrong?

~~~
mi100hael
MySQL is actually seeing a resurgence as people realize that ACID is valuable
and performance is just fine for 99.9% of use-cases.

And Java is seeing a bit of a resurgence as well as people get fed up with
shitty PHP and other dynamically typed languages. Java has some frameworks
like Dropwizard and Spring Boot that make it not as terrible anymore.

~~~
thehardsphere
> MySQL is actually seeing a resurgence as people realize that ACID is
> valuable and performance is just fine for 99.9% of use-cases.

That wouldn't explain why people jump from the database with better ACID
(Postgres) to the one with generally worse ACID (MySQL). Or why people would
move from the open source non-Oracle fork (MariaDB) to the Oracle-acquired
original project that everyone forked away from (MySQL).

> And Java...

Oh, I agree. Java is awesome these days. I'm just making a disparaging blanket
generalization about the people who jumped to shitty PHP to begin with.

------
ArneBab
This looks pretty interesting. Most striking to me is that Go is taking from
other 'target' languages. You can see the 5x5 block of the other strongest
target languages giving to Go, but not taking from it. To make what the
Eigenvector says explicit:

Top 5 giving to Go directly: C, Python, Java, Ruby, Scala

Top 5 giving to C: C#, R, Java', C++, Fortran

Top 5 giving to Python: C', Perl, Java', C#, C++

Top 5 giving to Java: C', C++, PHP, Python', C#

Top 5 giving to Ruby: Python', PHP, Perl, Java', (C++ — only 215)

Top 5 giving to Scala: Java', Ruby, (Python', C#, PHP — only 100, 17, 16)

': language also in top 5 givers to Go.

The other top languages take from each other (there is migration in both
directions), but currently Go mostly takes here. However it does lose people
to Rust — which is actually the strongest go-to language from Go. And C++ does
not give to Go.

This might point to a discrepancy between Go marketing and reality (efficiency
and replacing C++).

It would be great if you could repeat this exercise next year to see how
things changed.

(besides: the script is nice and concise!)

~~~
pavanky
Or it could be that Go is a fairly new language and has so few users (relative
to other top languages), the search queries will show up in only one
direction. You should note down what is happening in the other "new"
languages.

------
rjbwork
Highly skeptical of so many people migrating from C# to C, or python to
Matlab, to give a couple of examples. This seems like a highly flawed
methodology from many perspectives, as pointed out in comments.

~~~
bluejekyll
Or Rust to COBOL? I had the same thought.

Then reading more I realized the results included things like optimizing
certain portions of programs into a language for hot areas of code. Though the
Rust to COBOL one I should go read. That's nuts.

------
verytrivial
I find it amusing that for Javascript frameworks the approximate end-state is
perpetual oscillation between React and Vue.

------
jaimex2
Anyone else routinely roll their eyes at "why we moved from x to y" blogs?

They are always just "We wanted to do this in a particular way. So we fought
the framework till we decided to move to another that does things the way we
thought they should be done. Now things are much better but we will fail to
mention down the line all the new compromises we have to deal with"

~~~
douche
Most often, it's a "We built this thing initially in X until the legacy
technical debt and proto-duction compromises caught up with us, then we
rewrote it in Y, leveraging all the domain knowledge and experience we've
gained after doing it in X. Amazingly, the second system in Y is
better/faster/has less bugs!"

------
santaclaus
Who is the one person who rewrote their matlab homework in php?

~~~
aaronjg
Apparently it is a scientific program that was made into a web app
[https://www.ufz.de/index.php?en=39156](https://www.ufz.de/index.php?en=39156)

------
peterwwillis
It should be noted that the premise dictates this eigenvector is limited in
scope. It seems to apply mostly to people who both wish to create products
(usually for some commercial venture) and have a habit of saying things like
"Well this looks hard. Let's try reinventing this wheel with different tools
and see what happens."

------
urs2102
This is super interesting. It's also interesting to see the converse - who
isn't moving anywhere. Go, Elixir, Dart, and Clojure all seem pretty happy!

~~~
throwaway729
This is an absolutely terrible methodology for asking that question. People
tend to blog about major language changes. Very few C# shops/devs are
publishing blog posts about how they're still using C# this year.

Unfortunately, you'd probably be hard-pressed to find a decent methodology.

But indeed, "we used this language for 10 years without thinking about
switching" is a much more interesting metric than "we switched languages 3
months ago".

~~~
urs2102
Yeah true. My mistake, but fair point. I'm not exactly sure how to measure
that to be honest. Seeing what people used for ten years without switching
would be pretty neat.

~~~
fauigerzigerk
Just look at the labour market. Even when companies don't switch languages
they constantly have employee turnover.

The only problem is that the labour market is a lagging indicator. Perhaps the
best approach is to combine statistics about active switchers (like the OP)
with labour market statistics.

------
jstewartmobile
I really appreciate his method of breaking it down per-niche.

When it comes down to it, all languages are DSLs. Even LISP/Scheme are DSLs
for making DSLs (like Butterick's " _Beautiful Racket_ " earlier today).

Presenting them as per-niche directed graphs is probably less likely to steer
newbies (and sadly, not-so-newbies) into another round of "let's redo
everything in X!"

~~~
tempodox
Agreed. Nevertheless, everything _will_ be redone in JavaScript, by the look
of things.

~~~
jstewartmobile
[https://www.os-js.org/](https://www.os-js.org/)

~~~
shokunin
"Has a small, but awesome CUMmunity"

~~~
tempodox
I wonder whether that typo was intentional ;)

~~~
jstewartmobile
The whole thing is like a "Yo Dawg, I hear you like Javascript" meme, so
probably so.

------
pjmlp
I don't see any database that we actually care about.

SQL Server, Oracle, DB2, Informix.

~~~
mrweasel
I can't really tell if you're joking, but I think your question illustrates a
different flaw in the article.

People rarely write about the databases you listed, because their users have
an entirely different mindset. Sure, people may switch from Oracle to DB2, or
from SQL Server to Oracle, but some organisations just have "standard
databases" that they work with. Switching would be a multi-year process, and
certainly not something to be advertised, unless it's: "Now with support for
SQL Server" in the marketing material.

~~~
pjmlp
I am not joking, those are the type of databases I use daily on the
programming stacks I posted in another thread.

My employer does enterprise consulting.

------
w8rbt
I love golang, but I hate trying to search HN articles for the word go... I
need a _find go but not ago_ search button in Firefox ;)

grep go | grep -v ago

~~~
bluejekyll
I can't tell you the number of times that I've searched for _Rust_ and get
back results about protecting your old favorite car. And project names like
_corrode_ don't help narrow the search, it in fact raises the error rate.

It's funny, there was this company Yahoo! that was trying to organize the
internet to try and fix this...

------
hamilyon2
Only relatively small projects can afford a rewrite. So this is a statistic
among projects that can afford a switch. And as far as I can see, this is the
eigenvector of a trend: the first derivative of the actual state of things. It
is more informative about the state of fashion today.

------
kilon
I always thought that there is no such thing as the best programming language
in the world. After all, they only recycle the same recipe again and again.
Then I found Smalltalk and realized how wrong I was.

There is nothing that comes close that can compete with the massive success
that Smalltalk has been. It blows my mind how it can be so much better than
anything else out there including the usual suspects (Lisp, Haskell, blah
blah).

But in the end it's not about the language, it's about the libraries. Hence why
Python remains my No1 choice.

In the end, however, even Smalltalk is terribly outdated. The state of software
is in abysmal condition trapped in its own futile efforts of maintaining
backward compatibility, KISS and do not reinvent the wheel.

In short, software is doing its best to keep innovation at a minimum and as such
pretty much everything sucks big time and is still stuck in stone age.

I once considered becoming a professional coder working in a company doing the
usual thing, I am glad I was wise enough not to choose that path. I would have
killed myself right now with all this nonsense that makes zero logical sense.

But my hope is in AI, the sooner we get rid of coders, the better. Fingers
crossed that is sooner than later. Bring on our robotic overlords.

Saying that I know a lot of people that really love coding and respect it as
an art and science, so there is definitely hope.

------
Yuioup
The author is surprised that angular is holding up. I've been learning
angular2 the past few weeks after having never used a single page application
framework before and I'm loving every second of it. I'm never going back to
ASP.NET MVC except to use it as an API.

~~~
camus2
> I'm never going back to ASP.NET MVC except to use it as an API

Angular isn't going to help you write your server. That's such a strange
statement.

~~~
douche
Quite often ASP.NET MVC is/was taught with the Razor view syntax wrapped
around the axle of everything else that MVC can do, because it was such a
revelation compared to the suck of WebForms.

------
brightball
The Python/Ruby axes are interesting. You've got over twice as many Python to
Erlang posts out there...and a fraction of the Python to Elixir's. Ruby has
the opposite. A bunch of Ruby to Elixir and very few Ruby to Erlangs.

I wonder why that is?

~~~
WJW
The Elixir syntax is quite alike to Ruby's, so if you are going to change from
Ruby to a language running on the Beam VM, switching to Elixir is easier than
to Erlang because you don't have to relearn as much. I suppose that this
effect is missing for Python to Elixir and that therefore relatively more
people choose to switch to Erlang.

------
raesene9
To me it's unsurprising that people move from "thing that was popular a while
ago" to "newer thing".

Generally when a language/framework/toolset first hits, it looks magical and
fixes loads of problems people are currently experiencing.

Its own crop of problems has yet to emerge (generally these only emerge once
a sufficiently large number of projects have been using it for a sufficient
length of time).

So at the moment Go is the new thing, and it's supplanting the older new
things... come back in 3-4 years and it'll likely be on the losing end to
something else.

------
chiefalchemist
This is certainly interesting. Great for happy hour bullshitting, but too
flawed to take seriously.

I'm not dismissing it. Just wanting to define proper context. Else some
twithole will start a shit storm over null.

------
mngr
The prevalence of Go may be caused by the stupid fact that Google search
results include pages where "go" is used just as a common verb and not a
language name.

------
eigenwhat
@platz : The time component is missing in the analysis to make comparisons
meaningful to me - so in comes a cubic meter of salt.

------
PDoyle
> Surprisingly, (to me, at least) Go is the big winner here. There’s a ton of
> search results for people moving from X to Go.

You mean ... Google search results?

I'm not trying to suggest that Google's search engine is intentionally biased
toward Google projects, but I think it's reasonable to assume that their own
projects wouldn't fall into whatever unintentional blind spots their search
engine may have.

~~~
jrocketfingers
They didn't exactly make it search engine friendly either, given that the name
is quite generic/short.

------
z3t4
It would be interesting to do the same thing but with Bing and see if C# and
.NET come up first ... ;)

------
wlllmunn
I find it very hard to believe that no-one is moving to node.

~~~
richmarr
They are. Mainly from Java, Python and PHP.

You may be reading the axes the wrong way around... look down the node column
rather than the node row.

------
azinman2
Oi. Another person who thinks the number of search results returned is a real
number that means something... The fact that it gives even plausible results
is impressive, as the number is made up by Google's servers.

~~~
laughfactory
Made up? Can you explain? I ask because I'm professionally working on a
project which uses those results and I've often wondered about their validity.
I know there are... issues with them in various ways, but what are you aware
of?

~~~
azinman2
It's not appropriate for me to give you the details, so I'll just say I
wouldn't rely on that at all.

------
iopq
I half-expected Unicode to have an emoticon for this
[https://assets-cdn.github.com/images/icons/emoji/trollface.p...](https://assets-cdn.github.com/images/icons/emoji/trollface.png)

But alas, this post had to use an image.

