
Hey, a friendly reminder: I’m parsing this thread, so all job offers posted here are also available on the map at

https://whoishiring.io

I’ve also started a small campaign to update the thread format and make it more parser-friendly for whoishiring.io and others (I know of at least a few websites that do a similar thing). You can read more about whoishiring.io in a recent “Show HN” (https://news.ycombinator.com/item?id=13500701).

Here is the format.

  1) {company} | {job title} | {locations} | {attrs: REMOTE, INTERNS, VISA, company url}
  Google | Software Developer | SF | VISA https://google.com
  DuckDuckGo | Software Developer | Paoli PA | REMOTE, VISA
or

  2) {company} | {job title} | {locations}
  Google | Site Reliability Engineer | London, Zurich, Sydney
  Facebook | Web-developer | London, Zurich
I’m using this regex to test the first line.

  \s*(?P<company>[^|]+?)\s*\|\s*(?P<title>[^|]+?)\s*\|\s*(?P<locations>[^|]+?)\s*(?:\|\s*(?P<attrs>.+))?$
You can test it in Python or at https://regex101.com/r/relwQD/3 (look at the right-hand panel for the match).
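For anyone who wants to try it locally, here is a minimal Python sketch that applies the regex above to the example lines. The function name `parse_first_line` is only illustrative and is not the actual whoishiring.io code.

```python
import re

# The first-line pattern quoted above, unchanged.
FIRST_LINE = re.compile(
    r"\s*(?P<company>[^|]+?)\s*\|\s*(?P<title>[^|]+?)\s*\|"
    r"\s*(?P<locations>[^|]+?)\s*(?:\|\s*(?P<attrs>.+))?$"
)

def parse_first_line(line):
    """Return the named fields as a dict, or None if the line does not fit the format."""
    match = FIRST_LINE.match(line)
    return match.groupdict() if match else None

if __name__ == "__main__":
    for example in (
        "Google | Software Developer | SF | VISA https://google.com",
        "Google | Site Reliability Engineer | London, Zurich, Sydney",
        "We are hiring engineers in Berlin",
    ):
        print(parse_first_line(example))
```

Lines in format 2 simply come back with `attrs` set to `None`, and a line without at least two `|` separators is rejected.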

As a result of this call in previous editions of “Who is Hiring”, many posters actually complied, which resulted in more accurate map positions, better tagging (REMOTE, VISA, INTERNSHIP, …), and for some I was even able to get logos. Thanks!




From looking at postings so far, my guess is it's not natural for the submitters to remember or accurately list the job params as you've laid out.

My suggestion would be, if you're not already doing or working on it, to start doing some basic NLP around the specific verbiage used for each param, and then you can organize by match instead of by position in the list. I'm betting you'll see some nice results even without much optimization, since each expected param is pretty different from the others.
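As a rough illustration of matching by content rather than by position, here is a small Python sketch. It is a plain keyword and URL heuristic, not real NLP, and the names `KNOWN_TAGS` and `classify_fields` are made up for this example. Negations such as NO REMOTE would need extra handling.

```python
import re

# Keywords taken from the thread's posting instructions (assumed set for this sketch).
KNOWN_TAGS = {"REMOTE", "ONSITE", "INTERNS", "VISA"}
URL_RE = re.compile(r"https?://\S+")

def classify_fields(first_line):
    """Find tags and a URL anywhere in the line, regardless of field order."""
    urls = URL_RE.findall(first_line)
    # Remove URLs first so words inside them are not mistaken for tags.
    text = URL_RE.sub(" ", first_line)
    words = set(re.findall(r"[A-Za-z]+", text.upper()))
    return {
        "tags": sorted(KNOWN_TAGS & words),
        "url": urls[0] if urls else None,
    }

if __name__ == "__main__":
    print(classify_fields("SF | visa, remote | Google | https://google.com"))
```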

Just an idea, best of luck!


I agree. I'm talking to the HN people; they are in the loop here as well. There is a chance that they will suggest something, or that we will adopt the format I'm posting. It's always good to get some feedback early so as not to miss too many things later.

As for the NLP, I'm doing it (kind of; I've tested many things, including the Stanford NER library and external APIs). Having said that, if your post doesn't follow the format, no worries: I will do my best to find the location and tag the job if it's remote or an internship. Getting the rest, like the company name and position names, is the hard part.


I hear ya. I was able to format my submission properly (I believe), but just noticed that pattern as I scrolled.



Is this open source? Maybe I can contribute.


When multiple {locations} are specified, do they all get mapped?


I meant something else (locations in terms of types: countries, cities), but I think it would be possible to have multiple locations in the format. It does not work that way right now, but I can try to test (and extend) the script to work that way.


You need to account for ONSITE. That's specified at the top, but I don't see it anywhere in your spec.


If it's not remote, where else could it be?


You are correct, but imagine that, to avoid people enquiring about whether the position can be remote, they write NO REMOTE instead of ONSITE.

Anyone searching through the comments for "REMOTE" is going to get all those NO REMOTE jobs as well, which is a pain. Using ONSITE makes it much more clear.
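To make the problem concrete, here is a small Python sketch comparing a naive substring search with a check that looks for a negation first. The function name `allows_remote` and the NO/NOT rule are assumptions for illustration only.

```python
import re

# "NO REMOTE" / "NOT REMOTE" style negations (assumed wording for this sketch).
NEGATED_REMOTE = re.compile(r"\b(?:NO|NOT)\s+REMOTE\b", re.IGNORECASE)
REMOTE = re.compile(r"\bREMOTE\b", re.IGNORECASE)

def allows_remote(first_line):
    """True only if REMOTE appears and is not negated."""
    if NEGATED_REMOTE.search(first_line):
        return False
    return bool(REMOTE.search(first_line))

if __name__ == "__main__":
    posting = "Acme | Developer | NYC | NO REMOTE"
    print("REMOTE" in posting)     # naive search: a false positive
    print(allows_remote(posting))  # negation-aware check
```

With ONSITE in the posting instead, neither search needs any special casing.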


It's consistent with what people are posting and the directions given. From the very top:

> Please lead with the location of the position and include the keywords REMOTE, INTERNS and/or VISA when the corresponding sort of candidate is welcome. When remote work is not an option, please include ONSITE.


There's the possibility of (Remote, No onsite) in addition to (Remote, onsite)


I'd reduce that to 'REMOTE (?:ONLY)?'.
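A quick Python sketch of that idea follows. Note that the pattern exactly as written requires a space after REMOTE, so a bare REMOTE at the end of a line would not match; the variant below moves the whitespace inside the optional group. The name `remote_mode` and the return values are hypothetical.

```python
import re

# Variant of the suggested pattern: the whitespace is part of the optional group.
REMOTE_TAG = re.compile(r"\bREMOTE(?P<only>\s+ONLY)?\b", re.IGNORECASE)

def remote_mode(attrs):
    """Return 'remote-only', 'remote', or None when no REMOTE tag is present."""
    match = REMOTE_TAG.search(attrs)
    if match is None:
        return None
    return "remote-only" if match.group("only") else "remote"

if __name__ == "__main__":
    for attrs in ("REMOTE ONLY, VISA", "VISA, REMOTE", "ONSITE"):
        print(attrs, "->", remote_mode(attrs))
```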



