
Show HN: CoolQLCool – Turn Websites into GraphQL Accessible APIs - gavino
https://coolql.cool
======
lachenmayer
I've previously written a very similar project called "graphql-scraper" (which
is arguably a far less cool name...), you can check it out at
[http://github.com/lachenmayer/graphql-scraper](http://github.com/lachenmayer/graphql-scraper)

It works very similarly, with only superficial differences under the hood (e.g.
I used jsdom, and this uses cheerio). The `waitForSelector` feature is very
cool!

You can see a live demo of the HN example using graphql-scraper at
[https://graphqlbin.com/v2/lxNohP](https://graphqlbin.com/v2/lxNohP)

This example is deployed on Glitch - you can easily spin up your own using
[https://github.com/lachenmayer/graphql-scraper-server](https://github.com/lachenmayer/graphql-scraper-server)
(with 1-click deploys to Heroku, Now & Glitch)

Of course (as mentioned already) there is also
[https://github.com/syrusakbary/gdom](https://github.com/syrusakbary/gdom)
which uses Python+Graphene.

~~~
gavino
I remember seeing GDOM a while back when I first started this project, but
forgot to write it down as a source of inspiration. I'm gonna add all of these
as alternatives, because they're all great :D

~~~
syrusakbary
So happy to read that :) (and so glad it's served as source of inspiration for
your project, keep up the good work!)

------
maio
Nice! There is also a similar project, GDOM -
[https://github.com/syrusakbary/gdom](https://github.com/syrusakbary/gdom)
written in Python.

------
bryanrasmussen
Are you planning to build anything on top of this - a service, a company? I was
thinking it would be a good way to build an api for some projects I've been
thinking of working on, although I would probably want to switch out cheerio
for [https://github.com/intoli/remote-browser/](https://github.com/intoli/remote-browser/)

~~~
gavino
Nah, I don't really plan on turning it into a company. I'd gladly accept any
PR to swap out cheerio, I haven't touched that part in close to a year :D

------
canadev
This is a tangent but they link to a serverless deployment service where you
upload your code as a function and they execute it. Pretty interesting.

------
pdxandi
I've been looking for something like this! I'm trying to play around with it
but can't seem to get the selector right. How do I grab a table `td` by its
nth selector (tried `td:nth-of-type(n)` to no avail)?

------
VMG
Awesome name, awesome project!

------
conceptpad
Great project! I can imagine this may greatly improve certain classes of web
scraping. @gavino I'm curious what tooling and architecture you used to put
this together?

~~~
gavino
Sure! The backend is actually pretty straightforward: it's a NextJS app
deployed on Now with a few added endpoints to handle the incoming GraphQL
queries.

Then, for actually turning the query into a digestible output, I used the
GraphQL schema builder, which accepts HTML nodes from the requested page
and grabs the right variables.
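
To make the "HTML nodes in, fields out" idea concrete, here is a minimal Python sketch. It is not CoolQLCool's code: the `PAGE` snippet, the `QUERY` shape, and the `resolve` helper are all hypothetical, and the standard library's `xml.etree.ElementTree` with its limited XPath stands in for cheerio and CSS selectors. It assumes the page is well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a fetched page (assumption).
PAGE = """
<html><body>
  <table>
    <tr class="item"><td><a href="https://example.com/a">First</a></td></tr>
    <tr class="item"><td><a href="https://example.com/b">Second</a></td></tr>
  </table>
</body></html>
""".strip()

# Hypothetical nested query: each field names a path, and optionally an
# attribute to read or sub-fields to resolve against each matched node.
QUERY = {
    "stories": {
        "path": ".//tr[@class='item']",
        "fields": {
            "title": {"path": ".//a"},
            "url": {"path": ".//a", "attr": "href"},
        },
    }
}


def resolve(node, fields):
    """Resolve a nested field spec against a single element."""
    result = {}
    for name, spec in fields.items():
        matches = node.findall(spec["path"])
        if "fields" in spec:
            # Nested selection: resolve sub-fields on every matched node.
            result[name] = [resolve(m, spec["fields"]) for m in matches]
        elif "attr" in spec:
            # Leaf selection: attribute of the first match, if any.
            result[name] = matches[0].get(spec["attr"]) if matches else None
        else:
            # Leaf selection: text content of the first match, if any.
            result[name] = (
                "".join(matches[0].itertext()).strip() if matches else None
            )
    return result


data = resolve(ET.fromstring(PAGE), QUERY)
print(data)
```

The nesting of the query mirrors the nesting of the result, which is the property that makes a GraphQL-style interface a natural fit for scraping.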

------
simonhamp
Didn’t Yahoo do something like this many years ago, effectively a SQL for web
pages?

~~~
pnevares
You're probably thinking about Yahoo Pipes:
[https://en.wikipedia.org/wiki/Yahoo!_Pipes](https://en.wikipedia.org/wiki/Yahoo!_Pipes)

------
nurettin
Not sure what to make of this. How does it handle throttling or captchas?

------
halfjew22
If this is a community reference I’m going to be very happy.

~~~
gavino
Troy and Abed scraping websites!! :D

~~~
atomi
in the morning!

------
jarjar12
Sorry, a dumb question. What are the use cases? Thx

~~~
ralusek
1.) You have a website with data you'd like to consume.

2.) That website doesn't expose an api, but returns statically rendered html.

3.) You don't like parsing statically rendered html for the data you're
looking for, and you'd prefer getting the data using a GraphQL interface.

~~~
acct1771
Page isn't loading in Materialistic (HN reader on Android/F-Droid repo), but:
do you have this exact verbiage on it? It's very concise!

------
hokumguru
This is very cool.

------
powerslacker
Give this man an internet.

