

New Job Aggregation Service in Three Days. - nateless

Hello HN.<p>My name is Nate, and I work at the Russian web agency CloudMill. Some of our staff, myself included, are considering getting an H1B visa, but it's a real pain in the ass to find an offer on Monster, Dice or CareerBuilder (which doesn't work from Russian IPs, by the way). Searching for "H1B" brings up totally irrelevant results with phrases like "Unfortunately we can't sponsor H1B", or offers where the company helps with relocation but doesn't sponsor, and so on.<p>I had some free time, and here we go: http://www.devhub.us. It's a dead simple aggregation service like SimplyHired. I tried to eliminate everything that annoyed me. And we have a checkbox to show only H1B offers :) For almost 90% of offers it correctly detects whether the company is willing to sponsor a visa.<p>You can also search with the full Sphinx query syntax. Here are some examples:<p>@title (ruby|rails) developer @details (rspec|ejb|tdd)<p>@title iphone developer @details object c<p>@title python @details "python developer"~5<p>@title senior developer @company google<p>TODO List:<p>* Currently works only in Chrome, FF and Safari. Needs full cross-browser support.<p>* Add more job sites<p>* Clean up the offers' HTML, because it looks ugly at the moment<p>The whole project took around 3 days. If it gets enough response, I'll continue to maintain it and add new features. But the main concept is to keep it simple: I think services like Monster and Dice are too huge, and only a few people need something more than search.<p>If you have any questions, don't hesitate to ask: it@cloudmill.ru<p>Considering this is HN, here are some tech details:<p>The whole thing is built with our corporate framework, which we call BlackGold. We forked Kohana 3 about two years ago, and since then our paths have diverged. We use only dedicated servers with FreeBSD (EC2 too). Knowing exactly which software we run and its versions allowed us to remove all the settings, checks and other stuff we didn't need.
BlackGold uses Memcached as its caching layer and session storage. We removed the ORM because it's too heavy and replaced it with the Data\Mapper pattern. Logging is routed to MongoDB, for which we have an admin interface where we can see all errors from all of our sites. This project uses Sphinx as the search engine, with MySQL and MongoDB. With all that, our average page generation time is below 0.002 secs, and memory usage is around 600KB-1.2MB on any project.<p>Other things include:<p>* Memcached CAS to avoid race conditions (multi-threaded scraping).<p>* Search uses Sphinx and scraping post-processing. Each offer goes through several methods which clean the HTML and find keywords, phrases, locations, etc.<p>* Offer duplication. This was solved with the shingle algorithm. But 50k offers produced around 10M shingles, and searching within them was too slow, so this part was moved to MongoDB with map/reduce.<p>* Geolocation with the free MaxMind database to get city and state (US only).<p>* HTML5 Boilerplate, CSS3, jQuery, LESS. Static content has been moved to Amazon S3 & CloudFront.
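
For anyone curious about the shingle-based duplicate detection mentioned above, here is a rough Python sketch of the general idea. It is not the BlackGold code: the function names, the shingle width of 4 words, the MD5 hashing and the 0.8 threshold are all assumptions picked for illustration.

```python
import hashlib
import re


def shingles(text, w=4):
    """Return the set of hashed w-word shingles for a text.

    Assumption: words are lowercased alphanumeric runs, so punctuation
    and letter case do not affect the result.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return set()
    if len(words) < w:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {hashlib.md5(" ".join(words).encode()).hexdigest()}
    return {
        hashlib.md5(" ".join(words[i:i + w]).encode()).hexdigest()
        for i in range(len(words) - w + 1)
    }


def jaccard(a, b):
    """Jaccard similarity of two shingle sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_duplicate(offer_a, offer_b, threshold=0.8, w=4):
    """Treat two offers as duplicates if their shingle sets overlap enough."""
    return jaccard(shingles(offer_a, w), shingles(offer_b, w)) >= threshold


if __name__ == "__main__":
    a = "Senior Ruby on Rails developer wanted for a fast growing startup in San Francisco"
    b = "senior ruby on rails developer wanted, for a fast-growing startup in San Francisco!"
    c = "Python engineer needed to build data pipelines at a large bank in New York"
    print(is_duplicate(a, b))
    print(is_duplicate(a, c))
```

Comparing every pair of offers this way grows quadratically with the number of offers, which is presumably why the lookup had to move into a map/reduce job once the shingle count reached the millions.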
======
demirhan
Simple and clean interface; however, I couldn't find the checkbox for "h1b".
Any thoughts on open-sourcing the project?

~~~
nateless
You need to do a search first, then the sidebar will change. We don't have
plans to open-source it at the moment.

