
I analyzed the history of “Who is Hiring?” threads - philipkiely
https://blog.floydhub.com/web-scraping-with-python/
======
hadsed
A little off topic... but I love this because it gives me an idea.

I've been looking at LinkedIn with some immensely negative feelings for some
time now. I've always wished someone would just disrupt the hell out of them,
because they don't deserve the near-monopolistic position they're in. They
just don't care about their users!

And before someone says, "well [pushes up glasses] their real users are
recruiters, to whom they sell $20k/year seats for their search tool." Yeah, as
a hiring manager I've tried using it. Awful! :)

And jobs boards are nothing new, I guess, but something feels really nice
about how much bigger these "Who is Hiring?" threads are getting on HN. Maybe
we really do need a jobs-centric "social network" for industry verticals.
Beyond terrible UX and general disrespect, LinkedIn also just doesn't do
enough with their data to make it useable. Maybe that's because it's just too
hard to do for every job and industry out there.

And maybe that next-gen professionals' social network can be the first open
and free one?

Or maybe I'm just wishing too hard for something that can't/won't be done.

~~~
gota
LinkedIn has become plagued by self-help-like stories and other
self-promoting, vague and meaningless write-ups, in my experience.

It wasn't always like this. I used to have a reasonably cool feed centered
towards technologies and news that are at least partly relevant to me and my
industry. No more. It's all 'I gave a homeless person lunch and today they are
the #1 duck breeder in Brazil' or some other nonsense.

I don't know when this started or if it is localized, but it renders the
'place to find useful information' aspect of social media completely void.
There's no amount of reasonable curating effort that will save a news feed
like that.

I try not to be too negative about things but without some serious change I'll
use a static LinkedIn profile and visit it for a fraction of a second every
week - to check for messages - and nothing more.

~~~
sharadov
I just can't stand the humble brag posts!

~~~
fnord123
I know! As someone with 500+ contacts to high powered professionals I'm
inundated with a huge amount of humble brag posts!

~~~
seoulbran
*chuckle*

------
philipkiely
Author here. I scraped 8 years of data on Who is Hiring? threads. The data is
available through the article and on GitHub. Spoiler alert: people want to
hire full time software engineers in San Francisco. Happy to answer any
questions.

~~~
ulucs
If not intended as a learning exercise, why not use the existing API?

~~~
cushychicken
The author is still an undergraduate, so yeah, I'd file under "learning
exercise".

~~~
philipkiely
I hope to engage in learning exercises long after I graduate!

~~~
filoleg
Hearing this was pretty refreshing and encouraging. Thanks for posting that,
and I mean it.

------
rsweeney21
The data on "Freelancer? Seeking freelancer?" is interesting and I think it
can be misleading.

The disparity between freelancers seeking work and people seeking freelancers
on HN is huge - way more people looking for work than people looking to hire.
That makes it feel like there is an oversupply of freelancers, which would
drive down the pay rates.

However, I think it's more of a reflection of the HN audience than the broader
market, i.e. HN readers don't hire as many contract/freelance employees. This
aligns with what we've seen in practice as well. If software is the company's
core competency, they will be less likely to hire contract or outsource.

On the other hand, there is a huge need for non-software companies to hire
senior devs and they hire a lot of contract/freelance. Demand is high and
rates are good ($120-200/hr)...at the right companies. So, if you are a
contractor looking for work, I would target companies where software isn't
their primary business.

Source: I run a company that helps senior software engineers find contract
work. (Shameless plug: www.facetdev.com)

~~~
felideon
> I run a company that helps FAANG-rite-of-passage-waving software engineers
> find contract work.

FTFY.

~~~
dang
Please don't do this here.

~~~
mattsfrey
I get that the comment is a bit snarky, but if you dig into this guy's site
it's required that you have worked 3+ years at a "top tier company", with a
rotating ticker of FAANG and other giant SV behemoths on its front page ad,
which IMO is fairly pretentious and warrants a bit of a call out.

------
nwsm
Neat. I'd be interested in seeing how many job posts are reposted each month,
and which companies post the most. I have noticed many jobs/companies that are
present almost every month.

The rainbow coloring of the graphs is distracting.

~~~
ryandrake
I had “plans” a while back to put together a cynically-named “Who’s Not
Hiring” script that did just that. Look at which companies post the same or
extremely similar jobs month after month. Life got in the way and I abandoned
the project early, but it would be interesting to see the results of such an
effort.
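
A rough sketch of how such a "Who's Not Hiring" check might look, assuming the posts have already been collected as (month, company, text) tuples. The `find_reposts` name, the tuple layout, the sample data, and the 0.9 similarity threshold are all illustrative assumptions, not anything from the article:

```python
from collections import defaultdict
from difflib import SequenceMatcher


def find_reposts(posts, threshold=0.9):
    """Return {company: [(month_a, month_b, similarity), ...]}.

    `posts` is a hypothetical list of (month, company, text) tuples,
    with months written as sortable "YYYY-MM" strings.
    """
    by_company = defaultdict(list)
    for month, company, text in posts:
        by_company[company].append((month, text))

    reposts = {}
    for company, entries in by_company.items():
        entries.sort()  # chronological order for "YYYY-MM" strings
        matches = []
        # Compare each post with the same company's next post.
        for (m1, t1), (m2, t2) in zip(entries, entries[1:]):
            ratio = SequenceMatcher(None, t1.lower(), t2.lower()).ratio()
            if ratio >= threshold:
                matches.append((m1, m2, round(ratio, 2)))
        if matches:
            reposts[company] = matches
    return reposts


# Made-up sample data, for illustration only.
sample = [
    ("2019-01", "Acme", "Acme | Backend Engineer | Remote | Python, Postgres"),
    ("2019-02", "Acme", "Acme | Backend Engineer | Remote | Python, Postgres"),
    ("2019-01", "Globex", "Globex | Data Scientist | NYC | Onsite"),
    ("2019-02", "Globex", "Globex | iOS Developer | Berlin | Swift, remote OK"),
]
print(find_reposts(sample))
```

Only consecutive posts from the same company are compared, so a job that skips a month would still be caught, but a company that alternates between two listings would not. The threshold would need tuning against real posts.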

~~~
jotux
I'm on a team at my current job with ~10 other engineers, and at my previous
job I was on a team of ~12. With growth, retirements, and turnover at both
jobs we are always hiring. My rough approximation from both jobs (spanning
about 10 years) is that we're able to hire one person every 6 months. I'm not
in a hot tech geographic region, and the field I work in is somewhat niche, but
there are many non-cynical reasons a job might be posted indefinitely.

~~~
unleashit
Indefinitely posting while hiring only once every six months is actually the
poster child of the problem. If it really takes six months to find someone to
fit the role, you might want to take a look at your hiring process, because
you must be rejecting (or ignoring the applications of) a lot of great,
qualified people.

IMHO if someone did take the time to publish the metrics of companies who
perpetually post the same positions, it would be a fair counterbalance.

~~~
jofer
I'd argue six months is about average for the jobs I've seen and had over the
years. Not every role needs an off-the-shelf front end dev.

A lot of things really do require significant domain and industry specific
knowledge. For those roles, it takes a very long time to turn up even one
minimally qualified candidate. It's easy to say "find someone bright who can
learn on the job", but that's not practical in the cases I've seen.

Many things need either specialized education or experience in a similar role
to be effective within a 2-3 year timeframe. When on-the-job training would
take multiple years, it's worth spending six months to a year to find someone
who can fill the role. At least in the industries I've worked in, the roles
that require extensive and rare training/experience outnumber the roles that
don't.

~~~
ryandrake
If you take six _weeks_ to hire someone (let alone six MONTHS!) then you’re
probably missing out on candidates who have other good, quicker options. I
remember doing a phone screen with a major telco whose name starts with A,
ends with T and has a T in the middle. It seemed to go well, then silence (as
often happens with tech interviews). Well I did a few interviews at other
companies, picked one, signed everything, moved my family across the country,
and started work there. About 9 weeks after that phone screen, they finally
got back to me with: “We thought the initial interview went great! How about
coming onsite for the next step?” LOL

~~~
jotux
Our hiring process doesn't take 6 months -- we're only able to find a
candidate, interview them, negotiate, and have them accept an offer every 6
months. Our biggest problem is just finding people to apply. In the last ~8
years I count only about 100 total applicants to our jobs (I'm not counting
internship job listings where we got a lot of student applicants). We tried
posting jobs on HN Who's Hiring for about 6 months and got _zero_ serious
candidates, so we stopped.

~~~
SOLAR_FIELDS
Out of curiosity, do you put a competitive salary range on your posts? I
suspect that if you put a range that is above the median for your locale you
might have more bites. Only other reason I can think is if your technology
stack is outdated or you are located in an area where no one wants to live.

------
depressedpanda
People who are saying that the author should have used an API are missing the
point; sometimes the only way to gather data is by resorting to scraping, and
this article details how one could go about doing that, by using an
interesting target: HN.

To the author: Good job, I'm sure many people will find your article useful! I
wish I would've had this article at hand a couple of years ago when I needed
to aggregate data from several websites that provided no API.

~~~
minimaxir
> sometimes the only way to gather data is by resorting to scraping

Nowadays, web scraping has a nonzero risk of hitting _legal_ issues for sites
when "the only way to gather data is by resorting to scraping", especially
when an API already exists with a ToS/Guidelines on how the data should be
obtained and used. And even more so if the data itself has monetary value.

------
minimaxir
HTML scraping should be the _last_ option you consider to get data after all
else fails.

Even though the Hacker News API
([https://github.com/HackerNews/API](https://github.com/HackerNews/API)) is
somewhat old, it's a much more kosher way of getting data.

Even better is to use the public data dump in BigQuery
([https://console.cloud.google.com/marketplace/details/y-combi...](https://console.cloud.google.com/marketplace/details/y-combinator/hacker-news)).
Quick query to get all top-level comments in posts by whoishiring:

    #standardSQL
    -- Stories submitted by the whoishiring account
    WITH whoishiring_posts AS (
      SELECT id FROM `bigquery-public-data.hacker_news.full`
      WHERE `by` = "whoishiring" AND type = "story"
    )

    -- Top-level comments are the ones whose parent is one of those stories
    SELECT text
    FROM `bigquery-public-data.hacker_news.full`
    WHERE type = "comment"
      AND parent IN (SELECT id FROM whoishiring_posts)

~~~
plibither8
The API that [https://hn.algolia.com](https://hn.algolia.com) provides can
also be used! In my experience, it's more verbose and the queries are more
customisable.

~~~
saagarjha
This is great! I've been wondering for a while how to efficiently load a
comment tree without spamming the Hacker News API with hundreds of requests,
and this seems like it would work nicely.
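
A minimal sketch of that idea, assuming the Algolia `items` endpoint (`https://hn.algolia.com/api/v1/items/<id>`) returns the story with its replies nested under a `children` key, each carrying `author` and `text` fields. The helper names and the sample tree are hypothetical, and the network call is left commented out so the walk can be tried offline:

```python
import json
from urllib.request import urlopen

# Assumed endpoint shape; check the Algolia HN Search docs before relying on it.
ALGOLIA_ITEM_URL = "https://hn.algolia.com/api/v1/items/{item_id}"


def fetch_item_tree(item_id):
    """Fetch a story and its nested comments in a single request."""
    with urlopen(ALGOLIA_ITEM_URL.format(item_id=item_id)) as response:
        return json.load(response)


def walk_comments(node, depth=0):
    """Yield (depth, author, text) for every comment below `node`."""
    for child in node.get("children", []):
        yield depth, child.get("author"), child.get("text")
        yield from walk_comments(child, depth + 1)


# Hand-written stand-in for an API response, for illustration only.
sample_tree = {
    "author": "whoishiring",
    "children": [
        {"author": "alice", "text": "Acme | Remote", "children": [
            {"author": "bob", "text": "Is this still open?", "children": []},
        ]},
        {"author": "carol", "text": "Globex | NYC", "children": []},
    ],
}

for depth, author, text in walk_comments(sample_tree):
    print("  " * depth + f"{author}: {text}")

# Real usage would look like:
#   tree = fetch_item_tree(some_story_id)
#   job_posts = [text for depth, _, text in walk_comments(tree) if depth == 0]
```

Filtering on `depth == 0` keeps only the top-level comments, which in a "Who is Hiring?" thread are the job posts themselves.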

------
m3at
While the article seems intended as a web scraping tutorial, it's good to
remember that there is an official HN API [1] in case you really want to do
data science on the site.

[1] [https://github.com/HackerNews/API](https://github.com/HackerNews/API)
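
For comparison, a sketch of collecting top-level comments through the official API, assuming the documented Firebase endpoint `https://hacker-news.firebaseio.com/v0/item/<id>.json` and its `kids`, `type`, and `text` fields. The function names and the fake data are hypothetical; the `fetch` parameter exists only so the logic can be exercised without network access:

```python
import json
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_json(path):
    """Fetch one resource, e.g. fetch_json("item/8863")."""
    with urlopen(f"{BASE_URL}/{path}.json") as response:
        return json.load(response)


def top_level_comments(story_id, fetch=fetch_json):
    """Return the text of each direct reply to a story.

    Makes one request for the story plus one per child comment.
    """
    story = fetch(f"item/{story_id}") or {}
    comments = []
    for kid_id in story.get("kids", []):
        item = fetch(f"item/{kid_id}")
        # Deleted comments come back without a "text" field.
        if item and item.get("type") == "comment" and "text" in item:
            comments.append(item["text"])
    return comments


# Offline stand-in for the API, for illustration only.
fake_api = {
    "item/1": {"type": "story", "kids": [2, 3]},
    "item/2": {"type": "comment", "text": "Acme | Remote | Python"},
    "item/3": {"type": "comment", "deleted": True},
}
print(top_level_comments(1, fetch=fake_api.get))
```

Because each comment is a separate request, a large thread costs hundreds of round trips, so a polite delay or a small cache would be sensible in real use.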

------
mfbx9da4
Why didn't you just use the Algolia API? I used it to build a 24/7 live
stream of randomly selected, highly voted HN videos:
[https://www.crowdform.co.uk/hntv](https://www.crowdform.co.uk/hntv). It
saved me a lot of time.

------
lazyant
The last section about when not to scrape warms my heart.

~~~
philipkiely
Thanks! It was important to me to include that in an article intended to teach
the skill.

------
tsumnia
If you're still feeling froggy with your data, I'd love a follow up with some
NLP term frequency used in the posts.
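
A first pass at that kind of term count could be as small as the sketch below. It assumes the post texts are already plain strings (HTML stripped); the stopword list, the tokenizing regex, and the sample posts are placeholder assumptions:

```python
import re
from collections import Counter

# Tiny placeholder stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "for", "we", "are", "is"}


def term_frequencies(posts, top_n=10):
    """Count lowercase terms across all posts, ignoring stopwords."""
    counts = Counter()
    for text in posts:
        # Keep +, # and . inside tokens so "c++", "c#" and "node.js" survive.
        raw = re.findall(r"[a-z][a-z0-9+#.]*", text.lower())
        tokens = (token.rstrip(".") for token in raw)
        counts.update(t for t in tokens if t and t not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up posts, for illustration only.
sample_posts = [
    "Acme | Python and Go engineers | Remote",
    "Globex | Senior Python developer | NYC",
]
print(term_frequencies(sample_posts, top_n=3))
```

Raw counts favor long posts and frequent posters, so counting each term once per post, or weighting with TF-IDF, may give a fairer picture.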

Did intern frequency drop or did the sheer increase in posts shrink their
overall percentage? The concern I have there is the idea of becoming too "top
heavy" where no one is interested in helping interns grow and only want to
recruit experienced people.
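
The two possibilities can be told apart by looking at the raw count next to the share. A small worked example with entirely made-up numbers (the months and counts below are hypothetical, not taken from the article's data):

```python
def share(count, total):
    """Fraction of posts in a category; 0.0 when there are no posts."""
    return count / total if total else 0.0


# Hypothetical (intern_posts, total_posts) pairs, for illustration only.
months = {
    "2012-01": (20, 200),
    "2019-01": (30, 900),
}

for month, (interns, total) in months.items():
    frac = share(interns, total)
    print(f"{month}: {interns} intern posts, fraction {frac:.3f} ({frac:.1%})")
```

In this invented case the number of intern posts grows while their share shrinks, which is the "sheer increase in posts" scenario rather than an actual drop.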

~~~
philipkiely
You raise a great question. I also wonder if Hacker News as a platform tends
towards more senior people; I know that I usually feel like one of the least
experienced people in most comment sections.

------
danbrooks
Nice work! One comment - the plots show the fraction (not percentage) of
postings for interns and remote workers.

------
killjoywashere
Since we're in the middle of the Who is Hiring cycle, what's the best way to
post a couple of jobs?

------
guessmyname
What happened in December 2011 and June 2013? Why is there no data?

~~~
philipkiely
A couple of months were missing from each data set. If you page through the
"whoishiring" user submissions, you'll notice them missing.

------
eurticket
love this, beautiful.

~~~
philipkiely
Thank you!

