Show HN: Where is “Who is hiring?” hiring? (whereis-whoishiring-hiring.me)
493 points by manlio on April 9, 2015 | 117 comments

Nice name and clean interface!

I'm glad to see Remote as a location, but due to the free-form writing in the original posts, there are errors. For example, "Haskell dev at Standard Chartered Bank" is listed under Remote, but the post itself says "Remote work isn’t an option". The post for Button similarly doesn't allow remote, but uses "Remote - no" to convey that.

I've been planning on building some filtering for the Who is Hiring threads, and I've pretty much determined that some degree of manual review will be needed. In the most recent thread, I found a huge number of posts containing "remote" which don't actually allow remote working. "No remote" is fairly common and easy to filter out, but there are any number of variations that you can't anticipate a priori.

> I've pretty much determined that some degree of manual review will be needed

You're spot on with everything. I did a lot of manual review and the site already filters out "NO REMOTE", "REMOTE no", "Remote not" and "No Remote" entries. I did spot the "Remote work isn’t an option" post, but I decided not to write that kind of completely ad hoc filtering rule; it's just ugly.
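For illustration, a minimal sketch of a pattern-based filter covering the four phrasings quoted above. This is not the site's actual code; the regular expressions and the name `allows_remote` are assumptions made for the example.

```python
import re

# Negation phrasings quoted in the comment above; extend as new ones turn up.
NO_REMOTE = re.compile(
    r"\bno\s+remote\b|\bremote\s*[-:]?\s*(?:no|not)\b",
    re.IGNORECASE,
)
REMOTE = re.compile(r"\bremote\b", re.IGNORECASE)


def allows_remote(post_text: str) -> bool:
    """True if the post mentions remote work and no known negation matches."""
    if NO_REMOTE.search(post_text):
        return False
    return bool(REMOTE.search(post_text))
```

A filter like this still lets "Remote work isn’t an option" through as remote-friendly, which is exactly the kind of one-off phrasing discussed in this subthread.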

You could break the text up into sentences [1] and do sentiment analysis [2] on the sentences with 'remote' in. Then flag based on that.

[1] https://opennlp.apache.org/documentation/1.5.3/manual/opennl...

[2] http://nlp.stanford.edu/sentiment/

Wikify it.

Let users log in and change the remote/non-remote status (and other attributes).

Have some kind of trust system (could be linked to HN points or whatever).

(Even better if the YC guys made a custom job board where you fill in a form with all the details so there is no inconsistency.)

Or you could hire people to do it via oDesk or Mechanical Turk. Not so interesting technically, but it's a job people are good at.

Hire people for cheap to help people be hired for $$$, with no reward for the upsell. Brilliant! :)

Sentiment analysis probably isn't the right option here, though it may work.

I think a combination of dependency parsing[1] and regex is the way to go.

regex examples: "Remote: No", "No remote please"

Dependency parsing examples: "Remote work isn’t an option", "Remote work will not be considered"

[1] look for negation in the parse tree using something like http://demo.ark.cs.cmu.edu/parse?sentence=Remote%20work%20is...
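A rough sketch of the idea, with one loud caveat: real dependency parsing needs an NLP library, so this example swaps in a much cruder stand-in. It splits the post into sentences and flags any sentence that contains both "remote" and a negation cue. The cue list and the name `remote_negated` are assumptions for illustration.

```python
import re

SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# Crude negation cues; a parser would instead check what the negation attaches to.
NEGATION = re.compile(r"\b(?:no|not|never|cannot)\b|n['’]t\b", re.IGNORECASE)
REMOTE = re.compile(r"\bremote\b", re.IGNORECASE)


def remote_negated(post_text: str) -> bool:
    """True if any sentence mentions 'remote' together with a negation cue."""
    for sentence in SENTENCE_END.split(post_text):
        if REMOTE.search(sentence) and NEGATION.search(sentence):
            return True
    return False
```

Because it only looks for co-occurrence within a sentence, this heuristic would wrongly flag something like "Remote is no problem", which is where looking at the parse tree would earn its keep.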

Sentence segmentation and sentiment analysis may be overkill.

N-grams + Naive Bayes is potentially Good Enough.
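To make that concrete, here is a toy multinomial Naive Bayes over unigrams and bigrams with add-one smoothing, in plain Python. The training phrases are a tiny made-up set (a few borrowed from this thread), and the labels `remote_ok` and `no_remote` are hypothetical; a real classifier would need far more labeled posts.

```python
import math
import re
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercased word unigrams plus n-grams up to n_max."""
    tokens = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    grams = list(tokens)
    for n in range(2, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


class NaiveBayes:
    def __init__(self):
        self.class_counts = Counter()
        self.gram_counts = {}
        self.vocab = set()

    def train(self, text, label):
        self.class_counts[label] += 1
        counts = self.gram_counts.setdefault(label, Counter())
        for gram in ngrams(text):
            counts[gram] += 1
            self.vocab.add(gram)

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.gram_counts[label]
            # Add-one (Laplace) smoothing over the shared vocabulary.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)
            for gram in ngrams(text):
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def build_toy_model():
    """Train on a handful of made-up example phrases."""
    model = NaiveBayes()
    for phrase in ["Remote OK", "Remote friendly", "Remote welcome",
                   "We hire remote engineers"]:
        model.train(phrase, "remote_ok")
    for phrase in ["No remote", "Remote: No", "Remote work isn't an option",
                   "Remote work will not be considered"]:
        model.train(phrase, "no_remote")
    return model
```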

All these strategies are interesting, but I'm afraid we are over-engineering the problem here. The pretty simplistic strategy I'm using now is basically just pattern matching, and so far I had only 4 misplaced posts out of the 840 for April alone: that is < 0.5%. And it's blazing fast! I can rebuild the entire db in less than 30 seconds.

Given these numbers I believe pretty much everything more complicated than that would be total overkill... Good food for thought though!
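As a quick check of the figure quoted above, using only the two numbers given (4 misplaced posts out of 840):

```python
misplaced = 4
total_posts = 840

error_rate = misplaced / total_posts
print(f"{error_rate:.2%}")  # just under half a percent
```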

I just manually curate in these cases. HN hiring threads don't ever exceed a level where 0.5% manual review would be onerous.

I think you will need 100% manual review to find those 0.5%

In my experience with data quality management, manual translation of these edge cases is not pleasant. Yet it can be very valuable. It's a bit like "online learning" in machine learning - each time an error is found, you provide the correct answer. Yes, you might end up with a long array of phrases/regexes to check against. However, it scales just right for the amount of data you have and provides high quality results.
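One way to picture that workflow is a table of manual corrections that takes precedence over whatever the automated rules say. The sketch below is an assumption about how such a table could be wired in; `classify_remote`, the `overrides` mapping, and the item IDs in the usage note are all hypothetical.

```python
def classify_remote(item_id, post_text, auto_classify, overrides):
    """Prefer a human-reviewed answer; fall back to the automated rule."""
    if item_id in overrides:
        return overrides[item_id]
    return auto_classify(post_text)
```

Each time a misclassified post is spotted, its ID and the correct answer go into `overrides` (for example `{111: False}`), so the same mistake never reappears on a rebuild.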


"REMOTE no problem!" :) Just kidding. Great job.

A better option would be to require job postings to make location and remote-ability explicit at the top, in a standard format/layout. Because quite often I'm Cmd+F-ing through a thread and landing on a ton of "no remote" posts, which is frustrating.

This is awesome. Only suggestion is to add a backlink to the post and/or user who posted it.

Yes, this please. Sometimes there are comments or you want to investigate the profile/history of the poster.

Aaaand, feature is up ;)

Amazing. Thanks so much!

There is no way to view the data on the HN website. Please add a link to a source, or at least https://news.ycombinator.com/user?id=whoishiring

Fixed that.

The only way to deal with the unstructured nature of the "Who is hiring" posts is to have some sort of schema that can be processed. I also wanted to do something similar (imagine it with dc.js, for example!), but the data is too diverse.

A sample entry could be:

  company: 'Some Company',
  jobs: [
      dev_type: 'Web/Mobile/Data',
      dev_sub_type: 'Frontend/Backend/DevOps/Android/iOS',
      visa: 'Required/Not required/Transfer only/Sponsored',
      remote: 'Yes/No/Maybe',
      locations: [

All posts could have a METADATA: compressed_json entry that can be processed by the site and displayed/filtered accordingly. Perhaps it could be built manually at the beginning until it catches up.
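A sketch of what such a `METADATA:` line could look like in practice. The comment does not say what "compressed_json" means, so this example assumes zlib-compressed JSON encoded as base64; the function names are hypothetical, and the `locations` value in the usage note is a placeholder because the sample entry above is cut off at that field.

```python
import base64
import json
import zlib

PREFIX = "METADATA: "


def encode_metadata(entry: dict) -> str:
    """Pack a structured entry into a single line a poster could paste."""
    raw = json.dumps(entry, separators=(",", ":")).encode("utf-8")
    return PREFIX + base64.b64encode(zlib.compress(raw)).decode("ascii")


def extract_metadata(post_text: str):
    """Return the decoded entry from the first METADATA line, or None."""
    for line in post_text.splitlines():
        if line.startswith(PREFIX):
            packed = base64.b64decode(line[len(PREFIX):])
            return json.loads(zlib.decompress(packed))
    return None
```

Usage: `encode_metadata({"company": "Some Company", "jobs": [{"dev_type": "Web", "remote": "No", "locations": ["placeholder"]}]})` produces a line that `extract_metadata` can read back out of the full post text.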

dev_type: 'Web/Mobile/Data/Design/Hardware' at least.

Sure, there are many things that could be improved.

Anyhow, I'm surprised. I'm a Hardware Engineer myself, so how could I miss that!

This is cool, nice to look at other projects analyzing this data. I publish HN Hiring Trends, http://www.ryan-williams.net/hacker-news-hiring-trends/ , that watches the various technology terms being mentioned in the postings.

One thing I changed was to include only top-level comments and no replies/discussion of the posting. Do you handle it similarly?

HN Hiring Trends is really sweet, I didn't know about it. I do include only top level comments; and incidentally, digging through HN's HTML code was... uhm... let's say a bit messy.

Yeah, I can imagine. HTML is really tricky. Why not use the API? With a few requests, you can query for and pull all of the whoishiring threads.

Here's how I pull them all down: https://github.com/ryanwi/hiringtrends/blob/master/lib/hirin...
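For readers who don't want to follow the link, a hedged sketch of the same idea against the public Hacker News API (the `user/<id>.json` and `item/<id>.json` endpoints under `hacker-news.firebaseio.com/v0`). This is not the linked project's code; the function names and the title check are assumptions, and the `fetch` parameter exists only so the logic can be exercised without network access.

```python
import json
from urllib.request import urlopen

API = "https://hacker-news.firebaseio.com/v0"


def fetch_json(path):
    """GET one resource from the HN API and decode it."""
    with urlopen(f"{API}/{path}.json") as response:
        return json.load(response)


def hiring_threads(fetch=fetch_json, users=("whoishiring",)):
    """Collect the 'Who is hiring?' stories submitted by the given accounts."""
    threads = []
    for user in users:
        profile = fetch(f"user/{user}") or {}
        for item_id in profile.get("submitted", []):
            item = fetch(f"item/{item_id}")
            if item and "who is hiring" in item.get("title", "").lower():
                threads.append(item)
    return threads


def top_level_comments(thread, fetch=fetch_json):
    """Fetch only the direct replies to a thread, skipping deleted/dead ones."""
    comments = []
    for kid_id in thread.get("kids", []):
        comment = fetch(f"item/{kid_id}")
        if comment and not comment.get("deleted") and not comment.get("dead"):
            comments.append(comment)
    return comments
```

Only the `kids` of the story itself are fetched, which gives top-level job posts without the reply discussion underneath them.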

Great project, however I've noticed some listings are missing. For example, from April, https://news.ycombinator.com/item?id=9303396 there was a posting from Questrade. It's missing on this website.

Thanks for the feedback! That particular post is missing because the city is written in all caps (TORONTO). I used a bunch of tricks to be able to catch as many cities as possible (say NY and NYC and Manhattan and even NEW YORK all go under "New York City"), but not every city is tested against its "all caps" equivalent atm because I wanted to be able to rebuild the db as fast as possible during development phase. I should probably fix that now.
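A minimal sketch of that kind of alias table, assuming case-sensitive regular expressions with word boundaries. The alias lists only contain the variants mentioned in this comment, and `compile_city_patterns` and `cities_in` are hypothetical names, not the site's code.

```python
import re

# Canonical city name -> spellings that should map to it.
CITY_ALIASES = {
    "New York City": ["New York City", "New York", "NEW YORK", "Manhattan", "NYC", "NY"],
    "Toronto": ["Toronto", "TORONTO"],
}


def compile_city_patterns(aliases):
    """Build one case-sensitive pattern per canonical city."""
    patterns = {}
    for canonical, names in aliases.items():
        # Longest alias first so "New York City" is tried before "NY".
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        patterns[canonical] = re.compile(rf"\b(?:{alternation})\b")
    return patterns


def cities_in(post_text, patterns):
    """Return the canonical cities mentioned in a post, sorted by name."""
    return sorted(city for city, pattern in patterns.items() if pattern.search(post_text))
```

Every extra alias makes each pattern a little slower, which is the rebuild-time cost mentioned above.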

I used to fall for this a lot so now most string comparisons I do something like

  // string contains string against lower case / no whitespace

Might help in your case too?
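The snippet in the comment above did not survive beyond its first line, so here is a guess at the general shape of such a helper, written in Python rather than whatever the commenter used. The names `squash` and `loosely_contains` are made up for the example.

```python
def squash(text: str) -> str:
    """Lowercase the text and remove every whitespace character."""
    return "".join(text.lower().split())


def loosely_contains(haystack: str, needle: str) -> bool:
    """Substring test that ignores case and whitespace differences."""
    return squash(needle) in squash(haystack)
```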

Unfortunately no: 1) you can't trim the whitespace if you're matching cities, and 2) matching the cities in lowercase opens up too many false positives. At the same time, matching every city with its uppercase equivalent doubles the time required to build the db but only adds a tiny handful of posts. That's why (for now) I settled on a tradeoff where I catch the uppercase equivalent for the biggest cities only.
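A small sketch of both halves of that argument. The false-positive sentences are invented for illustration, `BIG_CITIES` is a hypothetical shortlist, and `city_pattern` is an assumed name rather than anything from the site's source.

```python
import re


def normalize(text):
    """Lowercase and drop all whitespace, as in the suggestion above."""
    return "".join(text.lower().split())


# Hypothetical shortlist of cities big enough to earn an ALL CAPS variant.
BIG_CITIES = {"Toronto", "London"}


def city_pattern(city):
    """Case-sensitive, word-bounded match; ALL CAPS only for big cities."""
    variants = [city]
    if city in BIG_CITIES:
        variants.append(city.upper())
    alternation = "|".join(re.escape(variant) for variant in variants)
    return re.compile(rf"\b(?:{alternation})\b")
```

With whitespace stripped, "see more notes below" contains "reno" (from "more notes"), and in lowercase "mobile developer" looks like a job in Mobile. The case-sensitive pattern avoids both, at the cost of missing "RENO" unless Reno is added to the shortlist.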

Sorting by language would be a great feature for this as well, particularly for those of us who work in more obscure / esoteric languages.

Yes, some sort of filtering is missing, I would like to filter by frontend for example. It would be very practical that way.

Yeah would be nice to click "Clojure" rather than Ctrl+F.

Love this. I was able to get more useful information about remote working than I was able to when reading the original who is hiring April post. This is probably a good time to suggest changing the format of the who is hiring posts. Would be great if the use of a standard form template is encouraged. That would make an effort to parse the data much easier, and would make the reading of the original post easier, and would probably make it easier for companies to create their posts too. Win for all?

Worth noting April Houston listings have a false positive because an investor was named Drew Houston.

I'm surprised nobody else has requested this, but any chance for a state category? If you really want to impress me, perhaps a warm and cold climate section or maybe have to shovel snow vs. unlikely to shovel snow. :P Gotta set my priorities straight…

also alphabetize the cities when there is a tie

Love the alphabetize idea! I just fixed that, thanks for the suggestion :)
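For reference, the usual way to get "most posts first, alphabetical on ties" is a compound sort key. A sketch, with `rank_cities` as a hypothetical name and made-up counts in the usage note:

```python
def rank_cities(counts):
    """Sort by post count descending, then by city name ascending."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

For example, `rank_cities({"Seattle": 12, "Austin": 12, "Boston": 30})` puts Boston first, then Austin ahead of Seattle.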

Great work!

As a suggestion for the next feature, I'd recommend a selection for visa sponsorship or not.

I actually thought about it, but it's a total mess to parse :( Think of posts like "We're sorry we can't sponsor H1B at the moment, but we might get you a VISA for our London office".

On the other hand, once you're already browsing New York City, good ol' ctrl-f for VISA will probably serve you well enough.

Agreed on the parsing. It would be nice if there was a standard "form" to be posted. Something like:

Description: .... Company: keywords: python, startup, coolstuff Visas Sponsored: YES

Back when the web looked (to me) like it was going to move to XHTML2 rather than HTML5, data encoding using Microformats looked promising: http://microformats.org/wiki/job-listing. One doesn't hear "semantic web" much nowadays though.

I was gonna suggest a sentiment analysis of the sentence associated with H1B, but your last phrase summarizes my lack of ingenuity.

Although I'm sure companies won't complain about the additional publicity, are there any concerns of copyright issues with scraping and republishing the text from other HN posts?

If everyone cared that much about copyright I think no one would ever make anything. There's always a way to sue someone over something, especially copyright.

I asked out of curiosity, not criticism.

Maybe an error, but the entry for https://news.ycombinator.com/item?id=9305360 is getting confused with Melbourne, Australia. I think it should be under Melbourne, Florida.

I am from Melbourne, which made me look at the Australian entries.

General question about remote work: as a european, can you work remotely for an american company without an H1B visa?

Yes. You set yourself up as a contractor and handle country specific taxes etc yourself. The book "Remote: Office Not Required"[1] has a lot of additional information, it's a great read.

[1]: http://www.amazon.com/Remote-Office-Required-Jason-Fried/dp/...

Is it hard to find remote work in Europe at the moment? Just curious, I haven't had to look, but was under the impression there ought to be a reasonable amount available.

It might have to do with the USD being close to the EURO lately, so it opens more options.

Very cool idea, fun name, practical interface. I like it.

Perhaps add a simple tagging system where users can add tags to hiring posts. That way you don't need to comb through every post and hopefully you crowdsource some helpful taxonomic data.

Useful, but falls short for surrounding areas of Los Angeles like Venice (neighborhood of LA) & Santa Monica (adjacent and much a part of LA). I imagine there are issues like this for other cities and regions too.

Quite the opposite, both Venice and Santa Monica had so many posts I decided to treat them as independent cities, e.g. http://whereis-whoishiring-hiring.me/city/2015/4/Santa%20Mon...

As an LA-area resident, I definitely feel like an overall "Los Angeles" category is important. Santa Monica and Venice are LA; leaving them out is like leaving out Palo Alto or Mountain View from a "Silicon Valley" category.

I would suggest having both a "top cities" list as well as a "top regions" list, which would have SoCal vs NorCal or some similar groupings of cities.

i believe its not its own city. 2 posts for venice should be part of los angeles. you can still have venice its own section if u want, but excluding it from los angeles doesn't make sense.

what u have is too simplistic.

I realize that you may be using a limited input device, but surely you can afford a few spare grams of pressure for the shift key, or y and o? Anyway, I can see what you mean re: Venice/LA (Venice is not its own city), but as someone who used to work in Santa Monica there was a time when the ONLY places I would consider were SM, Venice, and just maybe El Segundo (the beach bike path did make for an amazing commute). Having WeHo or Northridge jobs mixed in there could be a pain.

having the problem of a few more posts to sift through is much better than the problem of missing something that might have been categorized as "Los Angeles" where the job is actually in venice or santa monica <- happens all the time.

these things are just a starting point - further investigation on the company's site and actual location are always necessary.

I was surprised at how few are hiring in Los Angeles. I'm thinking about going to grad school there. Can anyone in LA comment on the state of your tech economy?

LA-area tech community has been growing incredibly over the last few years. Lots of early stage startups; some are now maturing like Dollar Shave Club and Lynda.com

what about non-web stuff? I'm interested in robotics, sensors, embedded, manufacturing tech, etc.

A bit of a side note, it seems someone created an "alternate" who is hiring bot or account:


Only because the standard "bot" slept in last time.

Ah ha, that's why. I publish whoishiring technology trends every month[1] and was curious about the change. Fortunately the API allows a list of users, making it an easy thing to handle.

[1] http://www.ryan-williams.net/hacker-news-hiring-trends/

I assume that was just a DST issue. The bot posted it an hour later than usual, and the DST transition (for the US) took place between the March and April posts.

Ah so that's why. Good catch.

Can you add an option to sort by companies that are offering internships as well?

I would love to see this as well. If I have some time this weekend, I'll look into submitting a PR :)

Done! Better late than never ;)

Really nicely done. I did something similar as a blog post a good while ago and it was quite popular. The results haven't changed much it seems.

One comment I got was that I had just mapped where HN users are in the world.

Very cool, and interesting to see the results. I immediately looked for a 'trends' feature to see how cities rank change over time, or maybe this could be plotted?

This is really cool. Thanks for creating this. I'm not looking at the moment, but I'm always super interested by the Who's Hiring threads.

As someone who lives in Washington State, not Washington DC, it always frustrates me trying to sift through the DCs looking for something.

Boston is gaining and should probably be gaining more if lumped together with Cambridge.

Less sure about lumping San Francisco and Palo Alto. Thoughts?

I would suggest three buckets: San Francisco, including San Bruno, Millbrae, and Burlingame; Mid-Peninsula, covering San Mateo, Foster City, Belmont, San Carlos, Redwood City, Menlo Park, and Palo Alto; and South Bay, covering Mountain View, Sunnyvale, Santa Clara, San Jose, Cupertino, Campbell, Los Gatos, and Milpitas (and maybe Fremont?).
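The three buckets above translate directly into a lookup table. A sketch, with `REGIONS` and `region_counts` as hypothetical names; Fremont is left out because the comment only floats it as a maybe.

```python
from collections import Counter

# City lists copied from the comment above.
REGIONS = {
    "San Francisco": ["San Francisco", "San Bruno", "Millbrae", "Burlingame"],
    "Mid-Peninsula": ["San Mateo", "Foster City", "Belmont", "San Carlos",
                      "Redwood City", "Menlo Park", "Palo Alto"],
    "South Bay": ["Mountain View", "Sunnyvale", "Santa Clara", "San Jose",
                  "Cupertino", "Campbell", "Los Gatos", "Milpitas"],
}
CITY_TO_REGION = {city: region for region, cities in REGIONS.items() for city in cities}


def region_counts(city_counts):
    """Roll per-city post counts up into regions; unknown cities stay as-is."""
    totals = Counter()
    for city, count in city_counts.items():
        totals[CITY_TO_REGION.get(city, city)] += count
    return dict(totals)
```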

...apparently all the companies I am being rejected from are being acquired by all the companies too good to speak to me...

Just a small note about "Cambridge, MA" not being the same place as "Cambridge, UK".

Yep, plain Cambridge is "Cambridge, MA", the other one is listed as "Cambridge, UK". In the same fashion, "Venice" is actually "Venice, CA". It hurt a bit but it was the right thing to do ;)

Great work. Looks like Dublin, Ohio is getting categorised as Ireland though: http://whereis-whoishiring-hiring.me/country/2015/3/Ireland
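One possible way to handle these collisions is to look at the word right after the city name and fall back to a default when there is no recognized qualifier. The defaults below mirror the behavior described in this thread (plain Cambridge is Cambridge, MA; Dublin lands in Ireland; Melbourne lands in Australia), while the qualifier lists and the name `resolve_city` are assumptions for the sketch.

```python
import re

AMBIGUOUS = {
    "Cambridge": {"default": "Cambridge, MA", "UK": "Cambridge, UK", "England": "Cambridge, UK"},
    "Dublin": {"default": "Dublin, Ireland", "Ohio": "Dublin, Ohio", "OH": "Dublin, Ohio"},
    "Melbourne": {"default": "Melbourne, Australia", "Florida": "Melbourne, Florida",
                  "FL": "Melbourne, Florida"},
}


def resolve_city(post_text, city):
    """Pick a specific place based on the capitalized word following the city."""
    rules = AMBIGUOUS[city]
    match = re.search(rf"\b{re.escape(city)}\b[\s,]+([A-Z][A-Za-z]+)", post_text)
    if match and match.group(1) in rules:
        return rules[match.group(1)]
    return rules["default"]
```

This only inspects the first mention of the city in a post, so a post that names the state elsewhere would still fall through to the default.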

Awesome, awesome idea! Any plans to do the same with "Who's looking for work"?

Not really, but if you want to do it, all it takes is to replace the URLs passed to wget (i.e. you want to wget all the "Who's looking for work" pages instead of the "Who is hiring?" ones).

You can easily build your local Sqlite database like that. I wrote some more instructions about it on the README.md on Github.
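The project's README has the real instructions. Purely as an illustration of the "parse pages, load into SQLite, query by city" flow, here is a sketch using Python's standard `sqlite3` module; the table layout, column names, and function names are all assumptions and almost certainly differ from the actual repository.

```python
import sqlite3

# Hypothetical schema: one row per (post, city) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    item_id INTEGER,
    year    INTEGER,
    month   INTEGER,
    city    TEXT,
    body    TEXT,
    PRIMARY KEY (item_id, city)
)
"""


def build_db(rows, path=":memory:"):
    """Create the table if needed and load (item_id, year, month, city, body) rows."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.executemany("INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def posts_per_city(conn, year, month):
    """Count posts per city for one month, ties broken alphabetically."""
    query = """
        SELECT city, COUNT(*) AS n FROM posts
        WHERE year = ? AND month = ?
        GROUP BY city
        ORDER BY n DESC, city ASC
    """
    return conn.execute(query, (year, month)).fetchall()
```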

Ah, cool. I'll check that out. Thank you for open-sourcing this (and for documenting it)!

This is fantastic. Thank you!!

This is really cool. I'd love to see this by technology composition too.

This is pretty excellent.

This is great! Thanks for making it :)

Typo: "[brose by country]"

Should it have been "bros" by country? :-)

Awesome work, bookmarking this !

Nice work. Makes it easier for me to keep an eye out for jobs in Portland.

Can you please also provide a filter by technology?

cool! you should lump cambridge MA in with boston

Best URL ever.


I should mention that I and other hiring managers I've talked with are moving away from posting on the "Who is hiring?" post.

It was pretty useful ~6 months ago. But, the amount of spam generated from recruiting and sourcing firms, various startups trying to push their revolutionary new online coding tools, etc. is pretty ridiculous and many of them, especially the SV-area startups, have been quite aggressive (e.g., phone calls and switching to my personal e-mail address after I told them I was not interested).

Posting jobs on twitter has been a far more effective sourcing tool than HN "Who is hiring" has become recently, at least in the free space.

I've had some ok luck with the Who's hiring threads, but, what really bothered me was some of the practices from these companies.

One company, allowing remote work, sent me to do a personality inventory without even talking to me first -- which really bothered me. (They're still posting looking for DevOps and Developers in Indianapolis.)

One company scheduled an introduction phone call on the 25th of the month, and then didn't show up on time and attempted to reschedule on the 15th of the following month. (Apparently, they didn't understand "Hire fast, fire faster.")

Finally, one company wasn't up-front or honest about their salary expectations until after I had spent almost a month in their system -- even taking a week off of work to do one of their "trial weeks" only to discover that they were going to offer me approximately 50% less than what I was making now and that they had a standard 'formula' for salaries... things that, had I known them, would have kept me from wasting their time (and mine).

Don't get me wrong -- HN has brought me a lot of great things: context, opportunities, viewpoints, and friends. Unfortunately, the "Who is Hiring" has morphed into traditional HR -- where you send a resume and don't hear back anything from anyone, versus the near-immediate feedback that you would once get in 2012.

This is why we can't have nice things.

How do you go about posting jobs on Twitter? Rather, is there a special tag you use or something?

From what I've seen, it's more like "Hey, we're hiring an xyz! Know anybody who'd be interested?" with a link to a web page or email address.

Ruh roh, that sounds bad. What should we do?

Maybe have a special tag that you can add to the text for info that you only want karma users of a certain level or higher to be able to see -- ex. (karma>300)[Contact me at my@email.com], or instead of direct numbers, target people that are able to downvote, or have a moderate level of karma. People would post a link to the recruitment page/job description page on their website for all other users, which would hopefully work at deterring spammers from contacting them personally.

I think something like this would help you focus your recruitment efforts on those who have at least contributed to the community in some way, which should filter out people spamming every single email in the thread.
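A sketch of how the proposed `(karma>N)[...]` tag could be rendered per viewer. Nothing like this exists on HN as far as this thread says; the regular expression, the `[hidden]` placeholder, and the name `render_for` are all assumptions.

```python
import re

# Matches the tag syntax proposed above, e.g. (karma>300)[Contact me at ...]
KARMA_TAG = re.compile(r"\(karma>(\d+)\)\[([^\]]*)\]")


def render_for(viewer_karma, text):
    """Reveal gated spans only to viewers above the stated karma."""
    def replace(match):
        threshold = int(match.group(1))
        return match.group(2) if viewer_karma > threshold else "[hidden]"
    return KARMA_TAG.sub(replace, text)
```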

Another idea is to mask emails with a craigslist-like mailing address, which would give the end-user the ability to report an email as spam, and therefore tie that email to the offending party's hacker news account.

Edit: What I mean is that each hacker news account would see the email address as a different one, so when they emailed that account it uniquely identifies the account that originally viewed that email address. So, Spammer A sees Poster B's email address as hn-49384932842@ycombinator.com, and Legitimate Candidate C sees Poster B's address as hn-4494838943842@ycombinator.com. When either one emails that address, if Poster B reports the email as spam, and if enough reports accumulate, the HN account sending the spam can be docked karma until it falls below the threshold required to view further posts.
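A sketch of the per-viewer alias idea, assuming the alias is derived with an HMAC over the (poster, viewer) pair so it is stable for one viewer but differs between viewers. The secret, the report threshold, and the names `alias_for` and `AliasRegistry` are hypothetical; mail forwarding and the actual karma penalty are out of scope here.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-real-secret"  # placeholder; must be kept server-side


def alias_for(poster, viewer, domain="ycombinator.com"):
    """Derive a stable alias that is unique to this poster/viewer pair."""
    digest = hmac.new(SECRET, f"{poster}:{viewer}".encode(), hashlib.sha256).hexdigest()
    return f"hn-{int(digest[:12], 16)}@{domain}"


class AliasRegistry:
    """Remembers which viewer was shown which alias and counts spam reports."""

    def __init__(self, threshold=3):
        self.viewers = {}
        self.reports = {}
        self.threshold = threshold

    def issue(self, poster, viewer):
        alias = alias_for(poster, viewer)
        self.viewers[alias] = viewer
        return alias

    def report_spam(self, alias):
        """Record a report; True once the viewer has reached the threshold."""
        viewer = self.viewers[alias]
        self.reports[viewer] = self.reports.get(viewer, 0) + 1
        return self.reports[viewer] >= self.threshold
```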

I like this.

Nothing. HN is not a job board. Whatever you do to squelch recruiter spam on "Who's Hiring" threads is bound to have unintended consequences. Meanwhile: if the big problem is that recruiters use job posts as spam targets, everyone who posts an ad can come up with their own solution (if it's karma-locked, it can just be "mail your HN username here") and the best one will spread.

Should "Who Is Hiring" posts require a certain age of account or level of HN Karma to post in?

I'm open to that in this case, though in general people tend to object to karma requirements.

The Who Is Hiring threads belong to this community. If something needs to be done to protect them for the community, we'll do it. But we'd ideally like to see a consensus emerge.

We should probably discuss this in a separate thread (and probably not today, as I'm about to be traveling). And I feel bad for taking a Show HN further off-topic, so will mark this subthread as such (which lowers it), even though it's obviously an important question.

How about instead of requiring a certain amount of karma to post in the thread, require a certain amount of karma just to view it?

This would not work because the content is trivial to syndicate by people with privileges to view it.

+ exclude karma earned from submitting popular sites.

There is little correlation between the domain of a submission and the amount of points it receives on average. (The exception is the more niche posts by more renowned programmers)

The stories don't have to be popular to generate karma; as long as the domains are, some articles will get karma from other people submitting the same links and from manual upvotes too. Yesterday someone autosubmitted everything a dozen big tech sites published and got 100 - 200 karma without hitting the front page.

Ah, this makes sense. Karma-locking content sounds fun.

The spammers don't post HN replies; they send email or call.

Won't they find you on Twitter as well? Even LinkedIn has the same problem from what I hear from hiring managers, becoming increasingly frustrated by the number of recruiters that contact them when they post a job there.

Yup - I've been getting far fewer candidates (and the quality has gone down). Coupled with the spammers, it's getting quite annoying.

what hashtags/formatting are people using on Twitter for jobs posts?

Show HN: Ask HN: Where is “Ask HN: Who is hiring?” hiring?
