New: Please only post a job if you actually intend to fill a position
and are committed to responding to everyone who applies.
----
Please state the location and include REMOTE for remote work, REMOTE (US)
or similar if the country is restricted, and ONSITE when remote work is not an option.
Please only post if you personally are part of the hiring company—no
recruiting firms or job boards. One post per company. If it isn't a household name,
explain what your company does.
Commenters: please don't reply to job posts to complain about
something. It's off topic here.
Readers: please only email if you are personally interested in the job.
Searchers: try http://nchelluri.github.io/hnjobs/, https://hnresumetojobs.com,
https://hnhired.fly.dev, https://kennytilton.github.io/whoishiring/, https://hnjobs.emilburzo.com.
Don't miss these other fine threads:
Who wants to be hired? https://news.ycombinator.com/item?id=42017578
Freelancer? Seeking freelancer? https://news.ycombinator.com/item?id=42017579
I'm the CTO at the Common Crawl Foundation, which has a 17-year-old, 9-petabyte crawl and archive of the web. Our open dataset has been cited in nearly 10,000 research papers and is the most-used dataset in the AWS Open Data program. Our organization is also very active in the open source community.
We are expanding our engineering team. We're looking for people who are:
* Excited about our non-profit, open data mission
* Proficient with Python, and hopefully also some Java
* Proficient with cloud systems such as Spark/PySpark
* Willing to learn
Our current team is composed of engineers who do some data science, and data scientists who do some engineering. We are focused on improving our crawl, making new data products, and using these new data products to improve our crawl.
If you'd like a little tour of what our data looks like, please see https://github.com/commoncrawl/whirlwind-python/
Interested? Contact us at jobs zat commoncrawl zot org. Please include a cover letter addressing the above points. Thank you for your interest!