Show HN: Analysis of 2018 Hacker News “Who Is Hiring” Job Posts (letstalkalgorithms.com)
301 points by vthallam 29 days ago | 186 comments



I think there are some flaws in this analysis that can be confirmed when the code is released.

I think the reason Scala seems surprisingly popular is that it was simple string matching, so a post that said "We are looking for someone who can build scalable infrastructure" would count as one mention of "scala".

Could be wrong though.

Also see: https://news.ycombinator.com/reply?id=18725655&goto=item%3Fi...
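To make the suspected failure mode concrete, here is a minimal Python sketch. The sample posts and both helper names are invented for illustration; the analysis code had not been published, so this only assumes how the mentions might have been counted.

```python
import re

# Hypothetical sample posts, invented for illustration only.
posts = [
    "We are looking for someone who can build scalable infrastructure",
    "Backend role: Scala, Akka, Kafka",
    "You will help us scale our Rust services. Trust matters here.",
]

def count_substring(term, texts):
    """Count posts containing `term` anywhere, even inside another word."""
    return sum(term.lower() in text.lower() for text in texts)

def count_whole_word(term, texts):
    """Count posts containing `term` as a standalone word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sum(bool(pattern.search(text)) for text in texts)

print(count_substring("scala", posts))   # 2: "scalable" is counted too
print(count_whole_word("scala", posts))  # 1: only the real mention
```

If the real counts drop by a similar margin once whole-word matching is used, that would support the guess above.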


This is why 90% of data science is the messy, grunt work of just simply cleaning/normalizing the data. The analysis is the easy part.


> grunt work

Honestly, it is an insult to call good data preprocessing skills grunt work.

It may not be as attractive, but it requires a lot of understanding of the nature of the data and the way in which patterns present themselves.


That's the point of the comment. Modern data science tutorials assume the source data is a nice and tidy CSV, when the reality is that data is very often a mess, and it's a constraint often not reflected in the glamorization of the field. (it's also a reason that I recommend that code always be published when making posts such as these, as it's very easy to do wrong!)


Hadley Wickham's book "R for Data Science"[1] does a good job emphasizing the data cleaning and reshaping steps in the analysis process.

[1] https://r4ds.had.co.nz/


>Modern data science tutorials assume the source data is a nice and tidy CSV

For what it's worth, I don't really see this as a problem in and of itself. Of course, if you want to do data science in the real world, you need to learn about data cleaning, manipulation, and warehousing in addition to the "pure" data science process that begins with tidy data. But it's good that tutorials segment these out, since not everyone needs to learn both at the same time. Anecdotally, I had plenty of real-world experience dealing with messy data by the time I started learning data science.

As an aside, my impression is that technical assessments for data science positions tend to underemphasize data cleaning and manipulation. Granted, they're still really time-consuming as it is, but there's probably some room for optimization there.


> Of course, if you want to do data science in the real world, you need to learn about data cleaning, manipulation, and warehousing in addition to the "pure" data science process that begins with tidy data.

True, although ideally the data is cleaned/structured before it hits a data warehouse and the data scientist starts working with it. It's an iterative process (DS finds messy data, flags the problem upstream, repeat).

I do wish there were more tutorials with messy data; the ones I make I deliberately try to use complex datasets and highly relational ones, although as a result the tutorials are more complicated and more imposing for beginners.

> As an aside, my impression is that technical assessments for data science positions tend to underemphasize data cleaning and manipulation.

When I was interviewing for DS jobs, I got a lot of the implement-binary-search-on-a-whiteboard questions which annoyed the hell out of me. But for the takehome assignments, they required some data ETL, which I felt was more representative. (some assignments had deliberately flawed datapoints the user was expected to identify and remove while not explicitly being told to do so)


Grunt work, noun [U], UK /ˈɡrʌnt ˌwɜːk/, US /ˈɡrʌnt ˌwɝːk/, informal: the basic, hard work, often physical or boring work, that is necessary for something to succeed.

They're just calling it like it is.


My God. This "it is an insult" thing is insufferable. Please stop doing it.

It's literally what the phrase means.


It's insulting to insinuate that grunt work is insulting.

(lightheartedly; I'm not like disgusted over here :-) )


Eventually it will be split from actual Data Science work, which is essentially either Machine Learning or Statistics. Imagine making a master sculptor mine his own marble: it would be absolutely absurd. Similarly, it would be silly to make a PhD Machine Learning expert work on data cleaning. A massive waste of talent on trivial work.

Edit: trivial isn't the right word, it's obviously an important step in the process but quite different from what you are really paying a "Data Scientist/ML Engineer/etc" to do.


It's quite different... I would hope the marble miner would know quite a bit about geology, mining equipment, etc....

It's not that putting the sculptor on the task would be a waste, nearly as much as that the sculptor doesn't possess any of the necessary skills.

How many people know how to find a source of marble, run a mine, and sculpt a great work of art? Roughly nil, but it's how many organizations seem to hire.


On the contrary, if the statisticians aren't intimately familiar with the data and the cleaning process, it's likely that they'll overlook some important features. When I interview potential hires, I make sure they'll be self-sufficient when writing data extraction code.


This is why most large tech companies have data engineer roles that are separate from the data science role.


I worked for a company, which is still in business, 80% of whose work involves "cleaning/normalizing" data, basically everything that isn't sales and client services.

Market research is all about getting different kinds of data and massaging it into queryable and productizable form (reports).


Where do they usually get the market research data from?


Depends on the market being researched. Focus groups are one source, POS data is another. Large operators like Nielsen collect all kinds of data for all kinds of industries, which can be purchased. You collect what exists where it lies for the stuff you want to look at.


Often it can be faster to enter data manually than to make a script collect it automatically with the same accuracy.


VaLUAble. tRUST.

I would be pretty surprised to see these two so much more commonly requested than Javascript.


Probably the same for rust. Looking at the October 2018 data set with https://kennytilton.github.io/whoishiring/, rust gives 50 results, trust gives 21, and \brust gives 26. Similar results for scala.


Yeah, similarly with "Excel". Are there really that many posts looking for people with deep spreadsheet experience? Seems more likely that it's a lot of false positives for "excellent" and "excel at".


Excel is probably mislabelled, it should say HTML instead. Look at the list above


Yup. JavaScript, Node.js, Node, npm, React, Angular, Vue, Express and a few more words mean pretty much the same thing and are fishing for the same pool of developers. (Yes, I know there's a difference between Angular and React.)


> Yes, I know there's a difference between Angular and React.

When can you start?


I've noticed that in the hiring threads, a particular company is always hiring, and the email contact username is "austin". It would be easy to pick this up as a location even though it's a person's name.

This analysis has Austin as one of the top 10 locations, and I'm curious whether or not that is skewing the numbers. There is a steady stream of jobs actually in Austin, so either seems possible.
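One way to rule that out would be to drop email-like tokens before running any location matching. A rough Python sketch follows; the pattern is deliberately loose, and the sample post and the `strip_emails` helper are made up for illustration.

```python
import re

# Deliberately loose pattern: any run of non-space characters around an "@".
EMAIL_RE = re.compile(r"\S+@\S+")

def strip_emails(text):
    """Replace email-like tokens with a space so they cannot match a city name."""
    return EMAIL_RE.sub(" ", text)

# Hypothetical post, invented for illustration.
post = "Acme | Denver, CO | Onsite | Email austin@example.com to apply"
cleaned = strip_emails(post)

print("austin" in cleaned.lower())  # False
print("denver" in cleaned.lower())  # True
```

Whether this matters depends on how the location extractor treats lowercase tokens, which I have not checked.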


Same with product manager. Many descriptions list working with product management as a requirement. There have not been 425 product management jobs listed this year. Trust me, I've applied to most of them.


You’re very likely correct. See the compare by volume:

https://hnprofile.com/compare?search=scala,python,rust

Not the only issue I see, for instance my company posted for “machine learning engineers”, yet that isn’t in the results at all.

I think they are doing regex search in the strings, as opposed to identifying the words and doing comparisons


Here's a link with more languages in case anyone else got curious: https://hnprofile.com/compare?search=scala,python,rust,cloju...


Gotta add some word boundaries to that regex!


I wonder what's included for the regex of .net. Eg: asp.net core, dotnet, c#, VB.net, f#, .net, xamarin,...

Because there are a multitude of possibilities.


Agreed. It looks like the author may be looking for something like a /.net/ regexp? I don't think .net is popular at all in HN job postings. A better regex would be pretty much what you suggest:

    /asp\.net|dotnet|c#|VB\.net|f#|xamarin/i
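For anyone who wants to try it, here is a quick Python check of that idea against a naive /.net/ pattern. The sample strings are invented, and the dot in "VB.net" is escaped here so that it matches a literal "." rather than any character.

```python
import re

# Python spelling of the explicit pattern, with every dot escaped.
DOTNET_RE = re.compile(r"asp\.net|dotnet|c#|vb\.net|f#|xamarin", re.IGNORECASE)

# The naive pattern: an unescaped dot matches any character before "net".
NAIVE_RE = re.compile(r".net", re.IGNORECASE)

# Hypothetical sample lines, invented for illustration.
samples = (
    "Stack: ASP.NET Core and SQL Server",
    "We write C# and F# on Linux",
    "Our network team uses Python",
)

for text in samples:
    print(bool(DOTNET_RE.search(text)), bool(NAIVE_RE.search(text)), text)
```

The last sample is the interesting one: the explicit pattern rejects it, while the naive one matches the " net" inside "network".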


It appears that the same problem occurs with locations. For example it would appear that the small town of Visby (Sweden) which is 1/100 of the size of Stockholm has 1/10 the jobs of Stockholm. Or more likely the location matches the company name Visby of startup visby.io.


Pittsburgh (120) seems to be beating Pittsburg (6) as well, so there's a lot of data cleanup to be done.


I think there's also a bug where the numbers don't account for "among those who mentioned [category]."


Would there be an issue with just changing the string to be matched to "scala " or " rust"?


There are numerous ways to fix the glaring flaws in this analysis but a very simple approach would be using regex word boundaries, that way it'll work not only for the space but for commas, periods, end-of-line, etc.


The correct method is probably something like "\<$lang\>"
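The \< and \> anchors are GNU-style word boundaries; Python's re module spells the same idea as \b. A sketch of a Python variant is below. It swaps in lookarounds instead of \b, which is my own choice rather than anything from the original analysis, because \b needs a word character on one side and so fails for terms such as "c#" or ".net". The `term_pattern` helper is hypothetical.

```python
import re

def term_pattern(term):
    # (?<!\w) and (?!\w) require that no word character touches the term.
    # Unlike \b, this also works when the term starts or ends with
    # punctuation, as in "c#" or ".net".
    return re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)

for term, text in [
    ("rust", "We trust Rust"),        # matches the standalone "Rust" only
    ("scala", "scalable systems"),    # no match
    ("c#", "Skills: C#, SQL"),        # match, where \bc#\b would fail
    (".net", "C# / .NET developer"),  # match
]:
    print(term, "->", bool(term_pattern(term).search(text)))
```

One known limitation of this rule: "ASP.NET" would not count as a ".net" mention, because the dot is preceded by a word character.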


For the record, the original post has been edited: your suspicion was correct, and they've fixed it now.


OP here. As some people were commenting, Scala and Excel, I guess, have some false positives, which explains their unusual popularity. I did try to limit the search to exact words, but I think I overlooked that part. I will try to fix the code and update the skills chart soon. Thanks for pointing it out.

Edit: I fixed the false positives using a word boundary in the regex, "\\bword\\b". The data now seems a little saner; I will keep looking for any false positives. I updated the data and regenerated the chart using the new data.


I'd also expect some tough to catch exceptions for remote, as that's something I always look at and despite the posting guidelines, posters can't stop themselves from adding things like "sorry, no remote available at this time" or "part-time remote might be an option for the right person in some roles".

I get that you're trying to be complete, but if it's not a truly remote position, it's not remote. Put nothing about remote in the position, or at the very most "REMOTE: NO". Working from home 10% of the time but needing to be in a SF office the rest of the time is not a remote position; it's the flexibility I'd expect from most jobs in this industry.


Try searching craigslist for a shared apartment for a couple.

"Sorry, no couples!"


You have to get clever when searching CL. Plenty of places that allow cats don't click the "Cats OK" box. Instead of using that search filter, include the terms -"no cats" -"no pets". In your case, -"no couples". My CL searches to filter the fake 1BRs get pretty crazy (-roommate -"room for rent" -jr -junior -studio, etc...) but their search engine handles complex queries well.


Same kind of issue for us Canadians. Search for remote: "remote us only"


I'm guessing they get a lot of speculative remote applications, so are trying to prevent that.


You've got some citizens of Null Island in your data!

https://en.wikipedia.org/wiki/Null_Island


It's really cool, but it's not going to be useful until you correct these errors. You might want to provide a note to readers in the meantime.


Thanks. I added a note at the skills section about the false positives. I really should have paid attention, but I guess I was just excited to show it off first :(. I am on it, fixing the regex.


You might want to consider performing your text analysis using something with a negex implementation; this would distinguish an instance of "remote" from "no remote" or "can't hire remote".
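As a rough illustration of the idea, and not the actual NegEx algorithm, here is a toy Python version that looks for a negation cue in the few tokens before the term. The cue list, the window size, and the `is_negated` name are all assumptions made for this sketch.

```python
import re

# Hypothetical negation cues; a real NegEx implementation uses a much longer,
# curated trigger list and also limits how far a negation reaches.
NEGATION_CUES = ("no", "not", "cannot", "can't", "without", "sorry")

def is_negated(text, term, window=5):
    """Return True if a negation cue appears within `window` tokens before `term`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    for i, token in enumerate(tokens):
        if token == term:
            preceding = tokens[max(0, i - window):i]
            if any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False

print(is_negated("REMOTE or onsite in Berlin", "remote"))               # False
print(is_negated("Sorry, no remote available at this time", "remote"))  # True
print(is_negated("We do not accept REMOTE applicants", "remote"))       # True
```

This toy version ignores sentence boundaries, so "No visa sponsorship. Remote OK" would be wrongly flagged as negated; that is exactly the kind of case a proper implementation is meant to handle.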


There seem to be errors in geo-locations as well. No results for Zurich while it's the biggest hub in Switzerland. A suspiciously high number in Venice, Italy (I think it's safe to assume California here).


I completely relied on the Python package "GeoText" for locations. I see Geneva and Lausanne from Switzerland; I will have to double-check whether the package missed it or there were not many posts in the hiring thread. Thanks for pointing it out.


There were definitely posts for both "Zurich" and "Zürich" in almost every thread.
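If the two spellings need to be counted as one city, a common trick is to fold diacritics before matching. A small Python sketch using only the standard library is below; the `fold_diacritics` name is made up, and I have not checked whether this is what GeoText trips over.

```python
import unicodedata

def fold_diacritics(text):
    """Decompose characters and drop combining marks, so "ü" becomes "u"."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold_diacritics("Zürich"))  # Zurich
print(fold_diacritics("Zurich"))  # Zurich
```

Letters that have no decomposition, such as "ø", pass through unchanged, so this is only a partial normalization.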


How did GeoText do on mine? https://news.ycombinator.com/item?id=18358038


{'Huntsville': 1, 'Arlington': 1, 'Florida': 1, 'Virginia': 1, 'Greenville': 1, 'Melbourne': 1, 'State College': 1, 'Maryland': 1, 'Austin': 1, 'San Antonio': 1, 'Texas': 2}


I don't know how best to handle "go" as well as "golang" without false positives.
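One heuristic that might get part of the way, offered as a guess rather than a solution: accept "golang" in any casing, and accept "Go" only when it is capitalized and followed by a list separator, a period, or the end of the text. The pattern below is my own assumption and will still make mistakes in both directions.

```python
import re

# "golang" in any casing, or a capitalized "Go" followed by a separator
# (comma, slash, semicolon, pipe, closing paren, period) or end of text.
GO_RE = re.compile(r"(?i:\bgolang\b)|\bGo\b(?=\s*(?:[,/;|).]|$))")

# Hypothetical sample lines, invented for illustration.
for text in (
    "Stack: Python, Go, Postgres",   # match
    "We use GoLang daily",           # match
    "Go to our careers page",        # no match
    "ready to go, apply now",        # no match: lowercase "go"
):
    print(bool(GO_RE.search(text)), text)
```

It misses phrases like "Go developer" and would wrongly accept a sentence ending in "Go.", so a manual review of a sample is still needed.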


Maybe it's in Facebook's word2vec?


word2vec is a Google project.


OP, thanks for the analysis. Really appreciate it. Quick question - Does San Francisco include abbreviations like "SF" or "San Fran" ? Does it also include jobs in the Valley towns like Palo Alto and San Mateo ?


"We do not accept REMOTE applicants" would still be a false positive.

And same would go for visa. "We do not sponsor VISA"

I really think you're going to have to do some sort of sentiment analysis, either via human or machine.


Edit 2 is still broken. I find it hard to believe there are fewer skills than the total number of jobs or positions posted.


While you're at it, you've got `vuejs` and `vue.js` listed separately :)


Same for “react” and “reactJS”.
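Duplicates like these can be merged with a small alias table applied before counting. The sketch below is in Python; the table contents and the `canonical` helper are hypothetical and would need to be extended from the real data.

```python
from collections import Counter

# Hypothetical alias table; extend it as new spellings show up in the data.
ALIASES = {
    "vuejs": "vue",
    "vue.js": "vue",
    "reactjs": "react",
    "react.js": "react",
    "nodejs": "node",
    "node.js": "node",
}

def canonical(skill):
    """Lowercase the skill and map known aliases to one canonical name."""
    key = skill.strip().lower()
    return ALIASES.get(key, key)

# Hypothetical extracted mentions, invented for illustration.
mentions = ["vuejs", "Vue.js", "React", "reactJS", "react"]
counts = Counter(canonical(m) for m in mentions)
print(counts)  # Counter({'react': 3, 'vue': 2})
```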


Take that all you people crying "Scala is dead" after Java 8+ comes out. /s

Anyway, I've personally seen (and interviewed for) jobs that were advertising Scala (or plans to move to Scala, or that Scala experience is a plus) but actually meant there was one proof-of-concept project in Scala, 18 months ago, nobody ever touches it nowadays, the person who wrote it is long gone and by the way, our legacy JDK5 application is on fire, do you mind fixing that for the rest of your life?


I suspect there's false positives from "scalability" and "scalable"...


I hope OP took that into account. Seems like pretty low hanging fruit when cleaning your data.


Yep just split on whitespace


Or preferably, just split on word boundaries.
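The difference between the two suggestions is easy to see in a few lines of Python. The sample sentence is invented; `\w+` is used here as a simple stand-in for "split on word boundaries".

```python
import re

# Hypothetical sample text, invented for illustration.
text = "Stack: Scala, Spark and Kafka. Must design scalable systems."

whitespace_tokens = text.lower().split()
word_tokens = re.findall(r"\w+", text.lower())

print("scala" in whitespace_tokens)  # False: the token is "scala," with the comma
print("scala" in word_tokens)        # True
print("scalable" in word_tokens)     # True, but as a separate token from "scala"
```

Note that `\w+` also strips the punctuation from terms such as "c#" or ".net", so those need separate handling.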


Oh for sure; I'd take any job listing with something cool with a huge grain of salt; hardly any proper software development employer actually uses hip technology, but they will use hip terms to try and draw in developers.

Hype-driven development is a problem, and hype-driven recruitment another one. Just be a good java developer. Maybe push for Kotlin slowly. Don't pursue Scala for the language's sake.

See also: blockchain, IoT, etc. I've seen a job advert that basically bolted those onto the not well hidden "we're just looking for decent java developers" job description.


Scala's in good shape, but not quite so popular as the posted analysis presents, based on my findings of the same data:

https://www.hntrends.com/2018/nov-react-still-top-containers...


The fact that Rust and Python are quite high in the rankings, makes me hopeful about the future of programming again. The fact that remote work is not growing is a bit concerning, but if people from the valley want to overpay - it is none of my business. I hope eventually the market will correct it.


If you don’t mind sharing, why do you find the lack of growth in remote work concerning?


Not OC, but for me it's just a real wonder why it's not better embraced at this stage of the game. We have all the tools to accommodate collaboration within remote teams and in (most) places the broadband to handle it. Add to this the continued funneling of companies into these metro areas where COL is high (NYC, SF, Seattle) and thus people may find themselves being forced into higher commute times just to attain a better COL situation.

I personally am commuting close to 2 hours each way, so 4 hours total, because the job market is much stronger in NYC than in my immediate (30-45 minute) area. If a job is open around here the salaries are almost 30-40% lower than NYC despite our COL still being high.


We have tools to accommodate collaboration but I believe something gets lost when you remove those water cooler conversations and physical interaction.

In regards to your commute, wouldn't a 40% lower salary be worth 4 extra hours every day? I guess it depends on the math, but unless you're being compensated for the commute (in which case it's just work on the train/metro), that seems like an incredible sacrifice for a bigger paycheck. At what point is your time worth it?


That watercooler talk getting lost is absolutely a pain point with remote work, but there are mitigations and benefits that offset it in my opinion.

One option is to re-create that talk. At a past job we would often all fuck around before/after standup calls for a bit, but sometimes that fucking around became work and solved problems. Other times we would just call each other either one-on-one or in small groups just to shoot the shit, and that can recreate that same feeling.

A big part of that is getting over the idea that "calls" are somehow different than walking over to someone's desk, you wouldn't hesitate most of the time to walk over to a coworker and start chatting, but most people hesitate to call someone on slack. At a past job that hesitation wasn't there because the culture embraced it, and suddenly we had our watercooler talk back, just over video calls.


> One option is to re-create that talk

One remote first company I interviewed with told me all their daily calls were video because they wanted to make sure people still felt human. And I think it's such a crucial piece in all this because people have been accustomed to just doing audio only which only further creates a sense of loneliness.


Yeah. I'm remote at a mostly collocated company and I'm always trying to get my face on other people's screens, whether that's in a stand-up, by not screen sharing right off the bat during demos, or even just by over-commenting in group chats to get at least my name out there.

It's worth it though, and when I'm on-site I get people talking to me that I've never met who recognize me from somewhere I was on a screen.


That is the most obvious problem with developers, thinking that you can solve people problems with tools.

So you are right, something is lost. Skype, Jira, and Slack will not convey the feeling that there is another person on the other side of the line. You will not see coworkers getting sick or going through hardships in life. If you see someone in person, you can tell he had a bad night's sleep and that is why he is upset. You get the watercooler conversation about the other guy's wife being annoying... You may not care, and it should not affect the quality of work, but no one is a robot. Via electronic communication everything seems so perfect... Then you expect people to be perfect, and they expect you to be perfect, and then you get upset, when you just have the flu and cannot really focus...


By tools I meant for collaborating, not replacing humans. I'm not some robot who just wants to sit head down because I'm terrible at socializing. I do enjoy being in the office to see faces, but I have also realized this is not a very strong argument for every company being so resistant to remote work.

And as I stated you can replace this in other ways. Because you're home you may end up at a gym or coffee shop or group bike rides more often. That gives you a different form of social interaction in the day to day to replace "water cooler" talk.


I think "water cooler" talk is overstated because many companies will have different channels to discuss these kinds of things. You no doubt lose the face-to-face human interaction, but there are other ways to account for this. Getting out and socializing with people via hobbies can help greatly here.

As for the commute, I've worked out a flexible work schedule that lets me be home a few days but as I stated I'm in a higher COL state (NJ). So a 30-40% reduction in pay ends up being a fairly big change in QOL. I've learned to make use of the time on the train by engaging in things I enjoy (video games, podcasts) which may be harder at home. Moving closer also doesn't help because A. it gets even more expensive B. QOL drops due to higher density areas which makes it harder to ride bikes, garden, etc.

We've accepted the choice we make to be where we are because of what we get from our life outside of work, but it doesn't mean I can't hope for better remote possibilities.


It's not overstated. The only people I've heard say this are the ones that prefer remote work. You're definitely more productive around where the action is happening than in a house with kids screaming.


First off, forgive me if I don't take your word for it, but I just saw an article recently saying that remote workers are more productive on average. Whether that study was just an anomaly or there's some other explanation like "only productive programmers can manage working remotely", it would hint that your hot take represents the world as you imagine it, not necessarily the world as it is.

Secondly there's something like a "no true Scotsman" vibe about the implication that you can't trust people who prefer remote work to comment on their productivity like you can people who are onsite. That may not be the right fallacy, but there's a fallacy in there somewhere.

Thirdly, nobody should be working with kids crying. God invented doors for just this reason. If you can't make a quieter space at home than you can at work, you're either pulling in some serious perks on the job with your private, sound insulated office or you don't have the fundamental amenities to work at home, it's not the nature of working remote, it's showing up to work unprepared that's holding you back.


Fair enough. Different strokes. Giving up on hope is a scary wormhole, always keep dreaming.


My time is absolutely valuable and I make it abundantly clear on any phone screens for new roles. I've established my own set hours within the office to accommodate the long commute without impeding my life more than it has to. But the reality of going from 100k -> 60k in a high COL state is quite a change. It's entirely possible, we've been there, but it def requires you to readjust. And as the cost of certain goods keeps rising it's not as easy as it was for us 3-5 years ago.


At my last gig, "water cooler" conversations were discouraged because Slack conversations were archived and searchable. No need to wonder what someone said last week, last month or last year. It's all there and ready to read.

It was kind of stupid to come into work when this rule is in effect.


Is that even legal?


Probably not illegal to discourage but maybe illegal to ban


i've found some of the 'water cooler' talk could easily foster gossip and cliques, leading to people "in the know" getting tapped for special projects and promotions, simply because they 'fit in' with a particular manager better, skills be damned. The 'smoke break' phenom too - if your boss smokes, figure out a way to get out there and spend time - that's their water cooler time, and the spoils go to the other smokers.

Of course it's not 100% that way, but my own experience has seen it play out a few times that way.


> those water cooler conversations

You mean managers and owners getting to oversee "their people".


That's definitely a part of it. Much easier to hold people accountable when you physically interact with them. If your manager or owner is abusing this, then I imagine working for them remote would involve a webcam and a key logger. Good managers usually find ways to motivate their workers, but it is definitely harder when you never get to physically interact.


For true remote with employees at their homes, security failures become far more likely.

There is a middle ground. One can open up small offices in small cities. For example, instead of 1000 people in a city of 5,000,000 people, it could be 20 people in 50 cities that have 100,000 people. Each site gets a VPN connection, with the hardware physically secured in a commercial building.

The COL goes way down. The commute becomes tiny for most people. I'm in that situation, and my commute is 3 minutes if I use a car.


Not OP but due to a lack of investment in public transportation in NA over the previous decades, many people spend a few hours a day commuting and in many cases this is done in private vehicles that pollute. For many in our profession there is no need to go to an office to work.


Only if you consider churning out code to be your work; I for one prefer to work in an office for a number of reasons. Not everyone's a robot that doesn't require social contact; not everyone has a good work-life balance allowing them to not have to be in the office to fulfill their needs.

Plus, free lunch.


Not everyone gets free lunch


There is no free lunch.


Interesting, I haven't put much thought into the environmental impact of resistance to remote work. I would disagree with your second point however, I believe that you lose something when work is performed remote, regardless of your role. That benefit certainly isn't worth destroying the planet.


I (re)started my consultancy in part so I could work from home. My commute had gone from 2 hours a day to 3 hours a day with the rise of traffic in the Toronto area.

I meet with clients a couple times a month and I am on the phone, slack or webex/ringcentral/google hangouts etc.. a few hours a week.

I am _sooooo_ much more productive now (with fewer distractions) that I work on average 500 hours a year less than I used to. That doesn't even begin to factor in the quality of life issues. I now drive 8000km per year versus over 20,000km per year.

Some jobs, some roles, yes, you need to physically go somewhere. For me, for what I do? There is no benefit.


First of all, congratulations! Always scary to (re)start over, and I'm glad it's worked out for you.

I'm a big believer in serendipity. To use a machine learning example, your algorithm needs to have some temperature.[1] Sometimes you want to sacrifice your queen in order to checkmate your opponent in 5 moves. In this situation, that might mean sacrificing on productivity during a project in order to meet with the client more frequently in person, allowing you to develop a long lasting relationship.

I'm sure that things come up in those physical meetings that don't come up during phone calls or Slack for a variety of reasons. In my opinion, if you met with clients a dozen times a month, instead of a couple, you would not be as productive, but you would drastically improve your relationship with the clients.

Loyalty is a currency like any other, it can be earned and spent. You can't quantify the value of a relationship the same way you can quantify productivity, but your improved empathy and sympathy to your clients' problems will improve your performance. You might also find that your clients trust you more, and give you more freedom and time to find solutions to their problems. Finally, developing relations will pay off in your professional and personal life down the road, long after you have finished working on the current project.

Of course, if you're optimizing for work-life balance, or spending less time on the road, then this can all be ignored. Take the necessary steps to achieve your desired lifestyle. If your goal is to promote growth, build infrastructure, and deliver value, I believe you lose something by going remote.

Personally I think the best system is a combination of Monday and/or Friday remote, with the rest of the days in a physical location. This allows employees to enjoy parts of the remote lifestyle, while still keeping many of the benefits of meeting in person.

1) https://www.quora.com/What-is-the-temperature-parameter-in-d...


Not OP, but I'd like the world to be moving toward a paradigm where physical location is not a significant factor in career / pay / advancement. It makes things more "meritocratic," and it puts pressure on some of these big tech hubs to keep their costs of living competitive.

But I don't want to see it if the market doesn't justify it. I think today there is something beneficial to having people working together in an office, but I'd prefer to be proven wrong by some new remote-work management style (or something). And software development is one of the more ideal use cases for remote work, so if it's not expanding for us, it's less likely to expand in other industries.


Rust I can see.

Python is deeply troubling. It is a regression from FORTRAN and COBOL. Long ago, we invented compile-time type checking. The benefits for software quality have been enormous. There isn't really a downside here, as there would be with the performance loss of garbage collection or bounds checking. Shaking out lots of bugs before even attempting to test the software is a wonderful advance that we made half a century ago. Python's incompatibility with compile-time optimization is also horrifying. The situation is so extreme that you can't even make a decent-performing JIT.


You're awfully unfamiliar with the history of computer science if you truly believe that dynamic typing is some sort of recent invention or that static typing is some panacea that solves all of your problems.

Both systems of type checking have existed as long as... Programming languages have existed, essentially and, sadly, so have the endless comparisons and flamewars.


i learned to love smalltalk and lisp after i realized that despite their age they are just as good as most younger programming languages.

it felt like a revelation, and almost disappointment that there is hardly any innovation in newer languages.


Python has strong typing as of 3.6, although not at compile-time, since it doesn't compile.


And you can use mypy to do some static type inference in 2.x


Python was already strongly typed long before Python 3.

Python does not have static typing as a built-in part of the language. It has had the ability to annotate arguments and returns of functions/methods since 3.0, a standard-library module containing helpful code to use this for type hints since 3.5, and the ability to annotate variables since 3.6. Annotations are a completely optional feature and no built-in part of Python will check these annotations or analyze code for correctness in advance of execution; there are third-party tools to do this, if you want it.
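A tiny example of what "optional" means in practice; the function name is made up for illustration. The annotations are stored on the function but nothing in the interpreter enforces them.

```python
def double(x: int) -> int:
    """Annotated as int -> int, but the interpreter does not enforce that."""
    return x * 2

# Passing a str violates the annotation, yet the call runs fine.
print(double("ab"))            # abab
print(double(21))              # 42

# The annotations are simply stored as metadata on the function object.
print(double.__annotations__)
```

A third-party checker such as mypy would flag the `double("ab")` call; plain Python will not.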

Also, Python most certainly does compile -- the CPython interpreter is a virtual machine which runs bytecode, and Python source code is compiled to bytecode for that VM. There isn't a requirement to run a completely separate standalone Python compiler ahead-of-time to generate the bytecode (if bytecode isn't available, Python will compile source to bytecode on a per-module basis as those modules are loaded), but that doesn't mean it isn't compiled.
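You can watch the compilation step happen from within Python itself, using the built-in `compile()` function and the standard-library `dis` module. The source string below is just an example.

```python
import dis
import types

source = "c = a + b"

# compile() turns source text into a code object without executing it.
code = compile(source, "<example>", "exec")

print(isinstance(code, types.CodeType))  # True
print(type(code.co_code).__name__)       # bytes: the raw bytecode

# Disassemble the bytecode; the exact instructions vary between Python versions.
dis.dis(code)
```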

As to "strong" typing:

"Strong" typing is a term that's only vaguely defined, but most commonly refers to whether a language will implicitly coerce/cast values of incompatible types in order to make an operation succeed. Consider this code:

    a = 1
    b = "2"
    c = a + b

In a strongly-typed language this is an error¹. Depending on other aspects of the language, it may be a compile-time error or it may be a runtime error, but the important thing is that the third line of that sample will never successfully execute. In a weakly-typed language, the third line could execute, and would assign a value of either 3 (if the string is coerced to number) or "12" (if the number is coerced to string). And in fact, in Python the third line above raises a runtime TypeError, since str and int are incompatible types for the "+" operator to work with.
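Here is the same sample as a runnable Python snippet, with the error caught so the script keeps going, followed by the two explicit conversions that a weakly-typed language would pick implicitly.

```python
a = 1
b = "2"

try:
    c = a + b
    outcome = "no error"
except TypeError as exc:
    outcome = "TypeError"
    print(f"TypeError: {exc}")

# Explicit conversions make the intent unambiguous.
print(a + int(b))   # 3
print(str(a) + b)   # 12
```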

Static typing refers to a situation where both names and values have types, and where all attempts at binding must involve names and values of compatible types. For example, in Java:

    int a = 3;
The name "a" is declared to be of type int, and the value 3 is of type int, so the binding of the value to the name succeeds. Attempting to bind a non-int value to the name "a" would fail. In a dynamically-typed language, only values have types, and the type of a value does not restrict which names it can be bound to.
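
For contrast, a small sketch of the dynamic side in Python: the name carries no type of its own, so it can be rebound to a value of a different type, and only the values know what they are.

```python
a = 3
first_type = type(a).__name__    # type of the value currently bound to a

a = "three"                      # rebinding the same name is allowed
second_type = type(a).__name__
```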

You can remember this easily by considering why the name "static" is used: it's because you can perform checks of name/value types and bindings statically, without needing to run the code to determine types. In some languages (like Java) this is accomplished by requiring all names to be explicitly annotated with their types; in others the types will usually be inferred automatically from usage, with the option to annotate when desired or to resolve ambiguity.

--

¹ Yes, yes, I know someone on HN is going to suffer a terrible career-ending injury from how fast his knee jerked at that and possibly from breaking his wrists in his rush to post "Well actually there may be a type defined somewhere that's a union of string and number, so how dare you say that's an error when you don't know if someone might be using such a type!" My advice is not to be the type of person who suffers severe injuries due to such an obsessive need to nit-pick, because reasonable/charitable readers will correctly understand the example with no difficulty.


Cool to see that other folks find this data interesting. I've been analyzing languages/frameworks/skills in the "Who is Hiring" posts for a few years (https://www.hntrends.com/) as well.

I know it's just one part of the analysis, but the skills list appears to be pretty far off from what I've been seeing. React gets over 200 a month by itself. Are you capturing all pages of the postings in each month's thread? Here's the data (counts) I have through November, broken down by month and term - https://www.hntrends.com/data/data-20181101.js


So the next report / analysis needs to be:

Of everyone who asked for work or applied for work via these posts - how many secured a position?

I've long given up on these posts trying to find freelance work or other permanent work. There's too much competition (here) to stand out from the flood of replies.


Anecdotally, the one time I posted a job, I had about 80 replies for a remote work junior engineering job where I specified the salary at $60k/yr in the post (this was in 2015). Of those 80, about 10 were promising and I probably would have made an offer to at least one or two if we hadn't shifted priorities and stopped the hiring process.


I applied to matching jobs for three months about five years ago. Also did the same earlier this year for three months.

My advice: skip this site and weworkremotely.com. Both are complete wastes of time based on the responses.

If you’re in IT, your best bet is LinkedIn (for a referral) or Careerbuilder/Dice/Indeed. At least those will result in face to face interviews.

Also find companies you like and apply directly on their website.


Check your map, it looks like you may be matching "Remote, or XXX" with the town of Remote, Or (Oregon).
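
I don't know how the location matching in the post works internally, so this is only a sketch of one possible guard: read a two-letter code as a state only when it is uppercase. The pattern and the `city_state_pairs` helper are hypothetical.

```python
import re

# Hypothetical pattern: capitalized city name(s), a comma, then an UPPERCASE
# two-letter code. A lowercase "or" is read as the conjunction, not Oregon.
CITY_STATE = re.compile(r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)*), ([A-Z]{2})\b")

def city_state_pairs(text):
    """Return (city, state_code) tuples found in text."""
    return CITY_STATE.findall(text)

conjunction = city_state_pairs("Remote, or San Francisco")
real_place = city_state_pairs("Onsite in Portland, OR")
```

A post that writes "REMOTE, OR ..." in all caps would still slip through, so this narrows the problem rather than solving it.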


Some highlights I saw -

* SF is center of the tech world. Next biggest is just around half of its size. (notwithstanding HN bias).

* Approx 25% jobs allow visa sponsorship

* HTML, Python, .Net overwhelmingly dominate everything else.

* Reduced interest in databases, SQL, Obj-C, Java

* No TensorFlow or PyTorch in demand

* 90% jobs in development/software engineering, 10% in management and misc.


It probably understates the SF market, TBH, since some of the largest employers in SV that hire en masse (Google, Facebook, Apple, Salesforce, Uber, etc.) rarely post jobs on HN. Also, Google, FB, Uber, Airbnb and others have a boot camp system, where job postings do not correlate to hiring since one posting is used to hire an entire bootcamp class.


>SF is center of the tech world. Next biggest is not even half of its size.

The article lists SF as having 2922 jobs and New York 1746 jobs


Wow, is .NET that popular in startups?

I always assumed it was used heavily in enterprise, but never that it would also be such a popular stack in Silicon Valley and NY. That said, London and the UK are well known to be MS stack users.

Can some engineers from SF or NY voice their opinions on this?


When you look at the job postings in your browser, and do a simple string search for ".net" (so not a regex search), it doesn't show up all that often. E.g. if I look at this month's posts (4 pages), I see it 31 times. By contrast, Python shows up 251 times. I suspect the posts were searched with a regex like /.net/i without escaping the ".".
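
Here is a sketch of what the unescaped dot does, using Python's re module and three made-up post snippets. I haven't seen the author's code, so the patterns are my guess at the failure mode, not the actual ones:

```python
import re

# Made-up snippets for illustration only.
posts = [
    "Backend role: C# and .NET Core",
    "We use Kubernetes and Jupyter notebooks",
    "Strong networking skills required",
]

unescaped = re.compile(r".net", re.IGNORECASE)   # "." matches ANY character
escaped = re.compile(r"\.net", re.IGNORECASE)    # "\." matches a literal dot

loose_hits = sum(1 for p in posts if unescaped.search(p))
strict_hits = sum(1 for p in posts if escaped.search(p))
```

The unescaped version hits "rnet" in "Kubernetes" and " net" in "networking", so all three snippets count as .NET; the escaped one only counts the first.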


The answer is less deep than that. Hacker News is now regularly used as a recruiting tool. It may be that bigger companies are now also posting every month. .NET has come a long way though and I have a friend who is using it at his place, but from anecdotal evidence I see a lot more Python and JavaScript than anything.


The graph of top skills doesn't remotely correspond to the keyword values (actual data). For example rails/ruby is nowhere on the graph, when both score over 50.


As much as I like Scala I find that hard to believe. I actually look for Scala in the job listings and I never felt it came up that much, compared to say Python or Javascript. A dumb search on the page for scala mostly turns up scalable ...


Ditto with Rust, most of the matches are part of the word "trust" when using ctrl+f in Chrome.


This is one reason why we need regexes (to name just one point) in browser and other search. I've said before in other places that Gmail could benefit from an SQL-like query language, for example. So could many other web and non-web apps. Google search too. One potential issue is excessive load on the servers if it gets used a lot, though.


As others have mentioned, it’s possible the analysis was done through straight string matching; would know for sure once the code is open sourced.


A bit surprised, pleasantly I must admit, to see Django keep pace with Rails. I have never been quite able to explain the startup world's love affair with Rails. Nothing wrong with it but the money question is why?


We all wanna see the stack we work in succeed, so we can pay them bills. I'm gonna assume yours is Django. Mine is Rails. Plenty of work for all of us, no need to fight...


Everything in rails is figured out already, you don’t have to pick which ORM or library to choose, and then find out half way down the line it’s not working as expected. It just works consistently, reliably. There’s no other reason to use another server side framework if you’re creating a CRUD app or API (unless you’re specifically comfy in something else).


Doesn't answer parent's question. Django and Rails are both web frameworks.


There's an implicit position that those qualities ("just works", etc) are more prevalent in rails than django.


When you build web apps with a fairly narrow set of business rules and requirements rails is THE server side framework if you ask me. Development of simple features, that conform to some aspect of the framework can be impressively fast and reliable. The trouble always starts when requirements come piling in that were never talked about when the project started -- But this is a management issue and definitely not unique to ruby backend world.


I could argue the same about Django too. And Python is expressive, for my money, more expressive than Ruby. Ruby isn't bad but outside of functional languages, nothing beats Python in how compactly ideas can be expressed.


    "Engineering Manager": 271,
    "Principal Software Engineer": 25,
    "Product Designer": 243,
    "Product Manager": 425,
    "Program Manager": 35,
    "QA": 353,
    "Senior Software Engineer": 818,
    "Software Engineer": 2975,
    "Staff Software Engineer": 10
I think this section should have been ordered by volume and cleaned up a little bit. 'Software Engineer' and 'Staff Software Engineer' are the same.


"Staff Software Engineer" is typically the level above senior, and below principal. It's not commonly used at smaller companies. I agree about the ordering/ranking.


Even though react and angular and vue are different frameworks, they are all modern JS libraries and I would lump that data together instead of keeping javascript so fragmented in the data set.


You could make the argument that Go is underrepresented as the match is only for golang and not go.


Yep. A quick glance at December's Who is Hiring shows at least 2 jobs that aren't included by searching for Golang only:

https://news.ycombinator.com/item?id=18662978

https://news.ycombinator.com/item?id=18590124

There's probably more in December alone that the regex missed.
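
For what it's worth, here is a sketch of a pattern that would also pick up plain "Go". It assumes posts capitalize the language name, which is my assumption and not something the article states; `GO_PATTERN` and the sample strings are made up:

```python
import re

# "Go" must be capitalized and a whole word; "golang" may be either case.
GO_PATTERN = re.compile(r"\b(?:Go|[Gg]olang)\b")

samples = [
    "Stack: Go, Postgres, Kubernetes",   # language
    "We write golang services",          # language
    "Ready to go remote? Apply now",     # verb, lowercase
    "Go-to-market lead wanted",          # not the language, but capitalized
]

matches = [bool(GO_PATTERN.search(s)) for s in samples]
```

The last sample shows the limit of this approach: a capitalized "Go" at the start of a phrase still counts, so some manual review would be needed either way.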


I wonder if some of the "scala" hits are actually from the word "scalable"...


Really hope the OP did full word searches...otherwise he would be double counting "react" and "reactjs" as well
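
One way to avoid that, as a sketch: map every spelling to a canonical skill and count each post at most once per skill. The `ALIASES` table, the tokenizer and the sample posts are all hypothetical:

```python
import re
from collections import Counter

# Hypothetical alias table: every spelling maps to one canonical name.
ALIASES = {"react": "react", "reactjs": "react", "react.js": "react"}

# Rough tokenizer: a letter followed by letters, digits, or . + #
TOKEN = re.compile(r"[a-z][a-z0-9.+#]*")

def skills_in(post):
    """Return the set of canonical skills mentioned in one post."""
    tokens = (t.rstrip(".") for t in TOKEN.findall(post.lower()))
    return {ALIASES[t] for t in tokens if t in ALIASES}

posts = [
    "React / ReactJS frontend, reactive streams a plus.",
    "We love React.js.",
]

# Using a set per post means two spellings in one post count once.
counts = Counter(skill for p in posts for skill in skills_in(p))
```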


I'm not sure where embedded or firmware would fit - developing for it's a skill but it's not a language or framework - but I'm curious what % of posts taps the skill. Going by memory it feels like roughly the same number of firmware positions are on there every month, and it's the same guys looking to fill them.


What no Clojure? I find that hard to believe.


There are Clojure jobs posted every month, just an omission in the list of language regexps s/he searched for.


Scala as the top skill for 2018. I thought it would be high, but I’m surprised to see it at number 1...


Keep in mind, HN will mostly have startup jobs, so you're more likely to see languages like Scala represented here than in the job market at large, where I would expect Java, PHP, C++, etc. to dominate.


So if I'm understanding, raw text is fed into the geotext python library, which parses out geographical locations and adds to the count.

I'm very skeptical that there are 18 new jobs on the island of Jinja, and that instead the analysis is picking up the Jinja templating language

Edit: same thing for "cambridge" in the same location as Jinja Island. Obviously probably refers to Cambridge MA, where Tufts/Harvard/MIT are all nearby, so tons of companies will be there.


This is great, but I would have liked to see more data around how many jobs are entry level, management, executive, etc. Thanks for this! Really cool.


I was curious what the number would be for Washington, DC, but it seems that the scraper couldn't distinguish between DC and Washington state.


So one simple way to solve the scala and excel issue is just to match for the space at the end or punctuation. So search for "scala " and "scala." instead of "scala". Same for "excel " and "excel." In fact you could just combine them all into word + end where end can be [" ", ".", "!", "\n" etc]
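
Regex engines already have a shorthand for "word + end": the \b word boundary, which also covers the start of the word. A minimal sketch in Python (the `mentions` helper is a made-up name):

```python
import re

def mentions(term, text):
    """True if term appears in text as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

scalable = mentions("scala", "build scalable infrastructure")
scala = mentions("scala", "Scala, Spark and Kafka")
excellent = mentions("excel", "excellent benefits")
trust = mentions("rust", "a culture of trust")
verb = mentions("excel", "people who excel at what they do")
```

This rejects "scalable", "excellent" and "trust", but note the last case: the verb "excel" is a whole word, so it still counts as a hit.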


Page doesn't load properly in chrome, like the rendering doesn't get finished, it's all blurred. Loads fine in firefox.


I find it hard to believe only 5% of jobs are willing to sponsor visas, considering most engineering teams are full of H1 visa holders. I think in this case it may be worth looking at the negative text, as the default assumption is that the job sponsors visa transfers.


You're comparing an HN post, which is typically smaller companies, to an overall statistics where large companies hire from body shops.


Good work. Besides fixing the false positives others have suggested, I would really like to see the total number of posts per category in comparison to the total number of posts per month, i.e. a relative quantity to better explore whether there is a trend or not.
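
A sketch of what I mean, with invented numbers (none of these figures come from the article, and `monthly_share` is a made-up helper):

```python
def monthly_share(mentions, totals):
    """Fraction of each month's posts that mention a skill."""
    return {
        month: mentions.get(month, 0) / total
        for month, total in totals.items()
        if total  # skip months with no posts
    }

# Invented counts, for illustration only.
mentions = {"2018-01": 50, "2018-02": 60}
totals = {"2018-01": 500, "2018-02": 800}

share = monthly_share(mentions, totals)
```

In this made-up case the raw count goes up from 50 to 60 while the share drops from 10% to 7.5%, which is exactly the kind of thing absolute numbers hide.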


I live in Italy, and the chart shows 24 job openings for Napoli on Hacker News (64 in Rome...). I checked and I could not find a single text match for the words Italy, Rome, Napoli, Neaples, Naples, etc.

Can you better express how you extracted the location data?


Have you seen this interactive map of HN job offers? https://whoishiring.io/search/0/0/2?source=hn


Strange that Sydney has so many fewer jobs on HN than Melbourne, whereas I think in actual job count it is the opposite. This could be because of more startup-like companies in Melbourne which publish jobs on HN.


Melbourne, Florida has a surprising number of people advertising on HN.

Also I don't think this is de-duplicating, and many of the Melbourne FL ads are repeated.
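
De-duplicating exact reposts is cheap. A minimal sketch, assuming repeated ads are identical apart from case and whitespace (`fingerprint` and `dedupe` are hypothetical names, and near-duplicates with edited text would need something fuzzier):

```python
import hashlib

def fingerprint(text):
    """Hash of the post with case and whitespace differences removed."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def dedupe(posts):
    """Keep the first occurrence of each distinct post, in order."""
    seen = set()
    unique = []
    for post in posts:
        key = fingerprint(post)
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique

posts = [
    "Acme | Melbourne, FL | Onsite",
    "Acme  |  Melbourne, FL | ONSITE",   # same ad, reposted
    "Widgets Inc | Melbourne, Australia | Remote",
]
unique_posts = dedupe(posts)
```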


Scala seems suspiciously high compared to the data on Hacker News Hiring Trends: https://www.hntrends.com/


Wow, 2nd most common skill is “excel”. I would have never guessed.


I've found that "scala" and "excel" are ranked exceptionally high.

I don't know, but maybe the author is simply looking for "excel" and "scala" in the text without taking into account surrounding space. So:

  - "scala" would match "scalable" and "scalability"
  - "excel" would match "excellent"


I was also quite confused.

I think it's so high because "excel" is also a verb so a lot of job posts will say things like "We are looking for people who excel at what they do" and this was counted in the analysis.

Changing the search to be case sensitive might help here.


Maybe the regex is checking for the pattern but not as a separate word?

"scalable solution"

"excellent benefits"


I think the chart is a little weird. Looks like Rust is second and Excel is actually third.


It shows 18 jobs in Rennes, France. But couldn't find a single job post. Wonder which word could produce false positive for "Rennes"?


It's not looking good for me, a remote C# developer.


If you look in the data .net is up there in numbers. Seems like the author is just doing straight regex matches not taking into account matches embedded in other words. Scala and rust both are exceptionally high, which I wouldn't expect.


Kind of the opposite, I do .NET Core development with Azure and I get at least 1 to 2 job offers for REMOTE positions; for on-site I get around double that.


Scala is the most popular skill. Is there something I don't know about Scala and the reason for its success?

PS: I use Kotlin/Java for web development.


The OP didn't correctly exclude false positives, so anything with "scalable" was registering for "Scala". It's fixed now.


There is "Oregon" on the map (with 54 jobs) but even if it was referring to Oregon City that's way off the actual location.


Weird to not see any devops role in your analysis.


TIL Jinja is a place, it's off the coast of Africa.


So does hacker news consist of “engineering managers” who know html or is that simply who everyone wants to hire?


Sloppy regex.


I am so thrilled to see Rust at #2.


I think it's become indisputable that python seriously beats ruby in the job market.

It is sad to me.


Scala, huh?


And julia over c++???


This honestly seems just a mistake.

Skimming a few of the posts, there are maybe 15+ references to C++ per post, and 1 for Julia at most.

The detection algorithm seems to be a bit iffy.


I'd be willing to bet that the regex parser doesn't like the pluses in C++
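
A sketch of two ways this can go wrong in Python's re module. I haven't run the author's code, so this only shows the general pitfalls. First, an unescaped "c++" is not a literal: depending on the Python version it is either rejected or treated as a quantifier. Second, even when escaped, wrapping it in \b fails, because "+" is not a word character:

```python
import re

# Pitfall 1: unescaped pluses. Older Pythons raise re.error; newer ones
# parse "++" as a quantifier. Neither matches the literal text "c++".
try:
    re.compile("c++")
    raw_status = "compiled, but '+' is a quantifier here, not a literal plus"
except re.error:
    raw_status = "rejected as an invalid pattern"

text = "Experience with C++, Python or Julia"

# Pitfall 2: \b needs a word character on one side, and neither "+" nor ","
# is one, so the trailing \b can never match here.
naive = re.compile(r"\b" + re.escape("c++") + r"\b", re.IGNORECASE)

# Lookarounds only require that no word character is adjacent.
careful = re.compile(r"(?<!\w)" + re.escape("c++") + r"(?!\w)", re.IGNORECASE)

naive_hit = naive.search(text)
careful_hit = careful.search(text)
```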


Definitely. I might be blind, but I don't see C++ at all in the list of skills.


It has changed compared to the initial version.

Pretty sure that a string like "c\+\+" was there before.

It seems that in the new version the old patterns have been commented out, the search method was changed and c++ removed altogether:

https://github.com/venkateshthallam/hn-whoishiring-analysis/...


One of the maps shows a disposed event handler error on my iPhone 6 w. Safari


The job posts for react are five times those of javascript :thinking:


As a Rails Dev, why does Python score so high? Is Django or scraping that important?

I know Python can be used in practically anything, but what do you keep seeing companies ask for that is a recurring need with Python?


I think you hit the nail on the head, "Python can be used in practically anything."

Our Data Scientist was able to pickup Django pretty quickly because of his experience with Python. And yea, actually someone at my company who was doing a bit of scraping on the side was able to pick up Django pretty easily as well. And last night, I chatted with my buddy who works at a big tech company and they use Python for database intrusion analysis.

I don't know the Ruby ecosystem very well, but it seems the biggest application of Ruby is Rails. I'm curious to hear what other ways people use Ruby, so please share.


Well, I also use Ruby for Chef. I use it to keep my servers perfectly documented and tested in code. Since I am a solo founder, I can stay away from the actual configuration for weeks. When I forget something or something breaks, I look at my Chef recipe. It takes a little longer to get set up.

I also use Chef to set up my local environment, which is wonderful. When you are learning how to really dig deep into your text editor and writing your own custom snippets (and commands, and keybindings) for React, you don't know how important it is to have your local env, your dotfiles and Atom config automated so they can be synced up with a single command no matter what computer you are on.

This, with private git repos and syncthing is a wonderful thing.


Would you go into more detail with using Python for db intrusion analysis?


I think Python is in most places a secondary thing. Lots of places have their primary code in something else, but some testing or infrastructure things in Python. Therefore it's listed in the advertisements.


Data science probably.


Probably data science.


Why is Montreal missing from the map?


Why the reverse scroll on the map?



