Hacker News
Show HN: CoolQLCool – Turn Websites into GraphQL Accessible APIs (coolql.cool)
124 points by gavino on Dec 19, 2018 | 22 comments



I've previously written a very similar project called "graphql-scraper" (which is arguably a far less cool name...), you can check it out at http://github.com/lachenmayer/graphql-scraper

It works very similarly, with only superficial differences under the hood (e.g. I used jsdom, and this uses cheerio). The `waitForSelector` feature is very cool!

You can see a live demo of the HN example using graphql-scraper at https://graphqlbin.com/v2/lxNohP

This example is deployed on Glitch - you can easily spin up your own using https://github.com/lachenmayer/graphql-scraper-server (with 1-click deploys to Heroku, Now & Glitch)

Of course (as mentioned already) there is also https://github.com/syrusakbary/gdom which uses Python+Graphene.


I remember seeing GDOM a while back when I first started this project, but forgot to write it down as a source of inspiration. I'm gonna add all of these as alternatives, because they're all great :D


So happy to read that :) (and so glad it's served as a source of inspiration for your project, keep up the good work!)


Nice! There is also a similar project written in Python, GDOM: https://github.com/syrusakbary/gdom


Are you planning to build anything on top of this, such as a service or a company? I was thinking it would be a good way to build an API for some projects I've been thinking of working on, although I would probably want to switch out cheerio for https://github.com/intoli/remote-browser/


Nah, I don't really plan on turning it into a company. I'd gladly accept any PR to swap out cheerio, I haven't touched that part in close to a year :D


This is a tangent but they link to a serverless deployment service where you upload your code as a function and they execute it. Pretty interesting.


I've been looking for something like this! I'm trying to play around with it but can't seem to get the selector right. How do I grab a table `td` by its nth selector (tried `td:nth-of-type(n)` to no avail)?
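A note on the selector above: in standard CSS, the bare `n` in `td:nth-of-type(n)` is the counting variable and matches every `td`, while a literal index such as `td:nth-of-type(2)` targets only the second `td` in each row. Whether CoolQLCool forwards selectors to cheerio unchanged is an assumption here. The sketch below only illustrates the intended semantics using Python's standard library; `NthCellParser`, `nth_cells`, and `SAMPLE` are hypothetical names, and nested tables are not handled.

```python
from html.parser import HTMLParser


class NthCellParser(HTMLParser):
    """Collect the text of the n-th <td> (1-based) in every <tr>."""

    def __init__(self, n: int) -> None:
        super().__init__()
        self.n = n
        self.td_index = 0       # number of <td> elements seen in the current row
        self.capturing = False  # True while inside the n-th <td>
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A new row restarts the count, like nth-of-type among siblings.
            self.td_index = 0
        elif tag == "td":
            # Only <td> elements are counted; <th> siblings are a different type.
            self.td_index += 1
            if self.td_index == self.n:
                self.capturing = True
                self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.cells[-1] += data


def nth_cells(html: str, n: int) -> list:
    """Return the stripped text of the n-th <td> of each row."""
    parser = NthCellParser(n)
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Hypothetical sample table, not taken from any real page.
SAMPLE = """
<table>
  <tr><th>#</th><td>Alice</td><td>42</td></tr>
  <tr><th>#</th><td>Bob</td><td>7</td></tr>
</table>
"""

print(nth_cells(SAMPLE, 2))
```

Here `nth_cells(SAMPLE, 2)` picks the same cells that `tr > td:nth-of-type(2)` would select in a browser, since the `th` cells do not count toward the `td` index.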


Awesome name, awesome project!


Great project! I can imagine this may greatly improve certain classes of web scraping. @gavino I'm curious what tooling and architecture you used to put this together?


Sure! The backend is actually pretty straightforward: it's a NextJS app deployed on Now, with a few added endpoints to handle the incoming GraphQL queries.

Then, to actually turn the query into digestible output, I used the GraphQL schema builder, which accepts HTML nodes from the requested page and grabs the right variables.
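The general idea of mapping query fields onto HTML nodes can be sketched roughly as follows. This is not the project's implementation: it uses Python's standard library rather than GraphQL and cheerio, a plain dict stands in for the query, and `FieldExtractor`, `resolve`, `PAGE`, and `QUERY` are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Resolve {field: (tag, css_class)} against an HTML document.

    Each field collects the text of every element whose tag and class match.
    Assumes a matched element never contains a nested element of the same tag.
    """

    def __init__(self, fields):
        super().__init__()
        self.fields = fields
        self.result = {name: [] for name in fields}
        self.open_fields = []  # (field name, tag) pairs currently being captured

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name, (want_tag, want_class) in self.fields.items():
            if tag == want_tag and want_class in classes:
                self.result[name].append("")
                self.open_fields.append((name, tag))

    def handle_endtag(self, tag):
        # Stop the most recent capture that was opened by this tag, if any.
        for i in range(len(self.open_fields) - 1, -1, -1):
            if self.open_fields[i][1] == tag:
                del self.open_fields[i]
                break

    def handle_data(self, data):
        # Text is appended to the latest entry of every field being captured.
        for name, _ in self.open_fields:
            self.result[name][-1] += data


def resolve(html, fields):
    """Return {field: [text, ...]} for the given field-to-selector mapping."""
    extractor = FieldExtractor(fields)
    extractor.feed(html)
    extractor.close()
    return {name: [t.strip() for t in texts] for name, texts in extractor.result.items()}


# Hypothetical page and query, loosely shaped like a story listing.
PAGE = """
<ul>
  <li><a class="storylink" href="/a">First story</a> <span class="score">10 points</span></li>
  <li><a class="storylink" href="/b">Second story</a> <span class="score">3 points</span></li>
</ul>
"""

QUERY = {"titles": ("a", "storylink"), "scores": ("span", "score")}

print(resolve(PAGE, QUERY))
```

The caller names the fields it wants and gets back a structure of the same shape, which is the property that makes a GraphQL front end a natural fit for scraped pages.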


Didn’t Yahoo do something like this many years ago, effectively a SQL for web pages?


You're probably thinking about Yahoo Pipes: https://en.wikipedia.org/wiki/Yahoo!_Pipes


Not sure what to make of this. How does it handle throttling or captchas?


If this is a community reference I’m going to be very happy.


Troy and Abed scraping websites!! :D


in the morning!


Sorry, a dumb question: what are the use cases? Thanks.


1.) You have a website with data you'd like to consume.

2.) That website doesn't expose an API, but returns statically rendered HTML.

3.) You don't like parsing statically rendered HTML for the data you're looking for, and you'd prefer getting the data using a GraphQL interface.


Page isn't loading in Materialistic (HN reader on Android/F-Droid repo), but: do you have this exact verbiage on it? It's very concise!


This is very cool.


Give this man an internet.



