
JobFunnel, a job website search aggregator - paulm7242
https://github.com/PaulMcInnis/JobFunnel
======
vadym909
Unfortunately, a job search aggregator assumes that companies religiously post
new jobs. The reality is that when a company wants to hire, it 'may' post a job
to cover its butt in case of an EEO complaint. Mostly they have internal
candidates, internally referred candidates, retained recruiters, and applicants
from their career website, who all rank above an external job site. My guess is
that a very low percentage of jobs actually get filled through job aggregator
sites, imho. The number of applicants from these sites is just too high, and
they are far less qualified and trusted compared to other sources.

~~~
throwaway743
Oh man, I love how this works. Everyone's time gets wasted, and then the
applicant gets the shaft and a bullshit excuse.

Either companies shouldn't be required to post publicly when they intend to
hire internally or hand an employee a referral kickback, or something needs to
be figured out so that applicants aren't taken advantage of.

------
city41
I was always under the impression job sites don’t like to be scraped. What are
the risks in using a tool like this? I would suspect possibly getting your IP
banned.

~~~
everdev
As always, consult a lawyer.

But, my understanding is that you are allowed to scrape sites as long as you
don't impact their ability to deliver the site to other users. In other words,
don't DDoS them. Sites tried to ban scraping but last I remember seeing it was
not enforceable unless it was done at high rates or maliciously.

As for content, certain things like recipes aren't protected content because
they are "listings of ingredients". I can imagine a job opening / title might
follow that same logic, maybe the job requirements and benefits as well.

What is protected is the way it's presented and organized. So if you have the
same jobs with the same filters and same layout you might be exposing yourself
to trouble. You have to make changes in the way it's presented, which for
copyright law is changing 50% or more.

So if you're scraping a job site and only showing remote jobs or jobs in a
certain skill set, or you're showing the job along with some stats about the
company, you're probably in a safe spot. But again, if you want to make a
business out of it, it's probably good to talk to a lawyer first.
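
To make the "don't DDoS them" point concrete, here is a minimal sketch of a per-host throttle that a scraper could call before each request. The `Throttle` class, its parameter names, and the 2-second interval are my own assumptions for illustration, not anything taken from JobFunnel, and a polite delay says nothing about what a site's terms allow.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests to one host.

    Hypothetical helper for illustration; the clock and sleep functions
    are injectable so the behaviour can be checked without real waiting.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = {}  # host -> time of the previous request

    def wait(self, host):
        """Sleep if the previous request to `host` was too recent.

        Returns the number of seconds slept (0.0 if no wait was needed).
        """
        now = self._clock()
        last = self._last.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
            if delay > 0:
                self._sleep(delay)
        # Record when this request is considered to have been sent.
        self._last[host] = now + delay
        return delay


# Usage sketch: call wait() before each request to the same host.
throttle = Throttle(min_interval=2.0)
throttle.wait("jobs.example.com")  # first call for a host never sleeps
```

Keeping the state per host means a slow crawl of one board doesn't hold up requests to another.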

~~~
sverhagen
I don't think the question was: "is it legal to scrape", rather: "do they like
to be scraped". If they wanted to, I'm sure they could make scraping harder
(more annoying). Which, I suppose, thus comes down to: is being aggregated a
net plus or minus for their business. Probably "it depends" on what the
aggregator does with the data. So, just to get this pun in: it then comes down
to: is being aggregated a net plus or minus, IN AGGREGATE.

~~~
everdev
Then no, very few for-profit companies like to be scraped.

It would be easier for them to provide an RSS feed or a free API if they
wanted to share their data.

------
PretzelFisch
Somewhat related: freecodecamp.org had a video on aggregating job results with
a filter:
[https://www.youtube.com/watch?v=lauywdXKEXI&t=196s](https://www.youtube.com/watch?v=lauywdXKEXI&t=196s)

------
boyinthecloud
Any way I can automate applying for a job now? That would be the ultimate tool
to complement this.

~~~
codingslave
This wouldn't be that hard to do, but is it illegal to automate those
submissions?

~~~
zo1
May not be illegal, but none of the platforms like it. And it's only a problem
that needs solving because the internet is full of shit content.

There are N job postings, but recruiters and other "middle-men" have made it
so that there are N * X * Y places a candidate has to apply to, where X is the
number of recruiters and Y is the number of recruitment platforms/boards.

They literally take their clients' requirements, shuffle them around,
repackage them, add editorial fluff, and then post them to Y platforms. This
is information warfare and we are losing the battle.

This is the net consequence of all the "distributed" effects we've been
wanting for years. Instead, there should be _one_ database. Or at the very
least an identifiable entity, even if you want to repackage something. I.e.
each one of these postings needs to have an agreed UID for the employer, the
employer's job posting, the recruiter, and the type of roles it applies to. We
need more structure to the data, not less structure and more distribution
across the greater internet.

That is why they don't like scrapers, because they lead to consolidation,
aggregation or deduplication of data that has been needlessly duplicated in
order to confuse and obfuscate.

On a side, tech-related note, this is the same reason why I hate microservices
architectures: a good chunk of them distribute and duplicate pieces of data.
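
To illustrate the agreed-UID idea from a few paragraphs up, here is a rough sketch of how N * X * Y copies would collapse back to N openings if every posting carried such identifiers. All field names (`employer_id`, `job_id`, and so on) are hypothetical; no job board exposes this today as far as I know, which is the whole complaint.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    employer_id: str   # hypothetical agreed UID for the employer
    job_id: str        # hypothetical UID the employer gives the opening
    recruiter_id: str  # who repackaged and posted this copy
    board: str         # platform/board the copy appeared on


def collapse(postings):
    """Group copies of the same opening under one (employer_id, job_id) key."""
    groups = defaultdict(list)
    for p in postings:
        groups[(p.employer_id, p.job_id)].append(p)
    return dict(groups)


# One opening (N = 1), two recruiters (X = 2), three boards (Y = 3).
copies = [
    Posting("acme", "job-1", recruiter, board)
    for recruiter in ("rec-a", "rec-b")
    for board in ("board-1", "board-2", "board-3")
]
openings = collapse(copies)
print(len(copies), "copies ->", len(openings), "opening")
```

Without the shared key, a scraper has to guess at equivalence from the text, which is exactly what the repackaging makes hard.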

~~~
codingslave
I think it would be messy, but doable, to build a single application platform
that is tailored to individual websites behind the scenes. It knows how to
input the information, checks the boxes, and then collates the responses and
makes applications easy to track. Actually kind of curious why this doesn't
exist.

------
vultour
This is essentially what large job sites like Indeed do themselves, but they
have entire teams maintaining the scraping pipeline. I've worked for one of
their competitors.

~~~
flyingcircus3
One pet peeve I have with sites like Indeed is all of the subterfuge that goes
on. For instance, the half-assed sort-by-date feature. Indeed mixes month-old
sponsored postings in with new postings, and often repeats those old postings
on consecutive pages. Many sites also shamelessly reset their posting dates so
that nothing is ever more than a week old.

Stripping away the middleman's ability to manipulate the presentation of
search results is the most attractive feature here, in my opinion.

~~~
jlokier
One thing a multi-site scraping aggregator could do is detect listings that
have been re-posted with just a date change, and show the _real_ amount of
time the ad has been up, alongside how long ago it was most recently
refreshed.

It would be interesting to search for ads that were posted a long time ago and
are still unfilled.

And it would be interesting to see how that ad "posted 12 hours ago" is the
same one that was posted 6 months ago and is currently being posted by 5
different recruiters with almost identical wording apart from their
boilerplate.

And it would also be interesting to see the different salaries/ranges that are
mentioned on different copies of the same ad. I've noticed different
recruiters sometimes post the same ad, but with a different salary ceiling.
Guess which number I'd like to have when negotiating... And if I see a job
whose salary has been creeping up for a few months, that's an interesting
trend too.
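
A rough sketch of what that could look like, assuming the scraper keeps its own history of ads it has seen. The record fields (`body`, `posted`, `salary_max`, `recruiter`) are made up for illustration, and `posted` is assumed to be the date the scraper first saw that copy rather than the date the site claims. The fingerprint only matches copies that are identical after stripping case and punctuation; ads that differ by recruiter boilerplate would need fuzzy matching on top of this.

```python
import hashlib
import re
from collections import defaultdict
from datetime import date


def fingerprint(text):
    """Hash the ad body after lowercasing and dropping punctuation/whitespace runs."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()


def summarize(ads):
    """Group ads by fingerprint and report age, recruiters and salary spread.

    `ads` is an iterable of dicts with the hypothetical keys
    'body', 'posted' (datetime.date), 'salary_max' (int or None), 'recruiter'.
    """
    groups = defaultdict(list)
    for ad in ads:
        groups[fingerprint(ad["body"])].append(ad)

    report = []
    for copies in groups.values():
        dates = [ad["posted"] for ad in copies]
        salaries = [ad["salary_max"] for ad in copies if ad["salary_max"] is not None]
        report.append({
            "first_seen": min(dates),        # how long the ad has really been up
            "last_refreshed": max(dates),    # the date the site would show
            "recruiters": sorted({ad["recruiter"] for ad in copies}),
            "salary_max_range": (min(salaries), max(salaries)) if salaries else None,
            "copies": len(copies),
        })
    return report


ads = [
    {"body": "Senior C++ Engineer, embedded.", "posted": date(2020, 1, 5),
     "salary_max": 90000, "recruiter": "A"},
    {"body": "senior c++ engineer  embedded", "posted": date(2020, 6, 30),
     "salary_max": 110000, "recruiter": "B"},
    {"body": "Python developer", "posted": date(2020, 6, 29),
     "salary_max": None, "recruiter": "A"},
]
for row in summarize(ads):
    print(row)
```

Tracking `salary_max_range` per group over successive scrapes would also surface the "salary creeping up" trend.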

------
CosmicShadow
Awesome to see in the screenshot that this is from somebody in Kitchener-
Waterloo!

------
monksy
I wonder how this would be if it was backed by Elasticsearch.

------
matthewhartmans
Very cool OP!

------
abinaya_rl
At [https://remoteleaf.com](https://remoteleaf.com), we have been doing this
manually to curate remote jobs.

I spend more than 6 hours daily searching for, screening, verifying and
filtering hundreds of remote jobs. So it can save you time, energy, and
frustration – and hopefully, help you find a job faster.

I'll check out this tool at least for the scraping part.

~~~
simfoo
When I look at your signup form I can't really find my profession. I've had
this problem for a while now, but maybe I should simply ask: I do industrial
software engineering (in my case in automotive) that targets a mixture of
embedded, desktop and server environments. This is done mostly in C/C++ with
some Python, Shell and maybe Rust. What is the proper profession name for me?
Is this a full stack engineer? Not likely from what I've seen, but I really
don't know.

~~~
abinaya_rl
I agree and I'm not sure either. If you are interested, I can modify your
category in the backend, set it to a custom one, and start sending you
C/C++/embedded development remote jobs.

~~~
jlokier
I just took a look. Nice looking site, by the way :-)

I have the same issue at signup as the GP - I find myself rather unsure what
categories make sense for me on your signup page.

Similar to the GP, a mixture of embedded, servers, and webdev, except I
wouldn't say I do just industrial. Feel free to look at my LinkedIn profile
(the link is in my HN profile) if you think it could give you some ideas for
additional categories on your signup page.

~~~
abinaya_rl
Based on your skills, I would suggest choosing the Javascript and
System/Devops categories.

~~~
jlokier
Thanks for that! Armed with the advice I'll take another look.

Though I think that, on the face of it, completely leaves out my interest in
compilers, programming languages and toolchains, and in
kernels/devices/embedded systems, which are a big part of my work in practice.

Maybe the "System" in "System/Devops" covers that, but those two are a
peculiar match: I've met quite a few devops workers who consider software
development to be slightly outside their knowledge, and something like C or
C++ programming to be "very advanced". I find it peculiar that anyone can do
devops without being a competent programmer but hey, it takes all sorts!

