Show HN: CloudQuery – Turn any website to serverless API with SPA support (github.com/cloudfetch)
117 points by timqian on Jan 24, 2019 | 23 comments

Nice. I've been toying with the same idea, but with optional RSS formatting to be able to generate an RSS feed from any website.

What's stopping me though is the EU's Article 13, and I can't see this avoiding that either.

If you have a turnover of < 10M euros and fewer than 50 employees, the last version of the text I saw (though I get a little confused by the process) would exempt you from doing anything.

Hopefully, the Article won't pass as long as countries push back.


Hopefully it passes but with modifications that make it sensible. We need to constantly re-evaluate and update copyright law and rules for the ever-evolving Internet, but at the same time try to avoid limiting innovation. Not an easy task, but I believe in the EU in this case.

I was able to do this 10 years ago with Yahoo! Pipes :(

You can do this now with https://www.pipes.digital/ :)

And I'm actually working on making the extract block more powerful, having a graphical way to select elements.

Yahoo also has a similar tool nowadays called YQL. I tried it but failed to make a query.

YQL has been dead since Jan 3, 2019: https://developer.yahoo.com/yql/?guccounter=1

This looks good!

* Shameless plug *: Our little startup, Feedity - https://feedity.com, helps create XML/RSS feeds for any public webpage, even JS/XHR/SPAs and social networks (Facebook, Instagram, Twitter), via a visual feed builder and REST API.
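
For anyone wondering what the "RSS from any website" idea in this thread boils down to once the scraping part is done, here is a minimal sketch in Python. It assumes you already have the scraped entries as a list of dicts with `title` and `link` keys; the function name `items_to_rss` and that input shape are made up for illustration and are not taken from any tool mentioned here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def items_to_rss(title: str, link: str, description: str,
                 items: List[Dict[str, str]]) -> str:
    """Build a bare-bones RSS 2.0 document from already-scraped items.

    Hypothetical helper: each item is assumed to be a dict with
    "title" and "link" keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    # ElementTree escapes characters such as "&" in text for us.
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    scraped = [{"title": "First post", "link": "https://example.com/1"}]
    print(items_to_rss("Example feed", "https://example.com",
                       "Feed generated from scraped data", scraped))
```

A real feed would probably also want `pubDate` and `guid` per item so that readers can deduplicate entries.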

Where's Yahoo! Pipes when you need them...

dead. so sad.

Very neat idea! I like to do this locally in Emacs (by scraping `M-x eww` output!), but having an API is a great idea! How would you go about using it on sites that require login (for prototyping private APIs, for example)?

Also, it says it uses serverless-chrome for running Chrome on AWS Lambda... is that "expensive"?

For pages that require login, this is not implemented yet because it is a little complicated, but it is not impossible: the tool needs to record the actions the user performs and replay them in the remote browser.

About pricing: AWS Lambda provides 10 million free invocations, and billed invocations are cheap too ($0.2 per 10 million invocations). You can check the pricing details here: https://aws.amazon.com/cn/lambda/pricing/

*1 million
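
To put the corrected figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes 1 million free requests per month and $0.20 per additional million requests, and it only covers the per-request charge; Lambda also bills for execution duration, which likely matters more when running headless Chrome. The function name and defaults are illustrative, so check the pricing page for current numbers.

```python
def monthly_request_cost(invocations: int,
                         free_requests: int = 1_000_000,
                         usd_per_million: float = 0.20) -> float:
    """Estimate the per-request part of a monthly Lambda bill in USD.

    Assumed defaults: 1M free requests, $0.20 per extra million.
    Duration (GB-second) charges are deliberately left out.
    """
    billable = max(0, invocations - free_requests)
    return billable / 1_000_000 * usd_per_million


if __name__ == "__main__":
    for n in (500_000, 3_000_000, 50_000_000):
        print(f"{n:>12,} invocations: ${monthly_request_cost(n):.2f}")
```

Under those assumptions, 3 million invocations in a month leave 2 million billable requests, i.e. about $0.40 in request charges.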

Does anyone remember the Kimonify extension? It reminds me of that! Cool!

LOVED Kimonify, especially how it (somehow!) figured out the pattern of data after clicking on a few different items. This is very close to that, minus the pattern recognition.

I think the correct keywords for this would be "tool-assisted continuous scraping"

So people are starting to miss YQL

Looks great, reminds me of Kimono Labs.

What's the use case of this tool?

My guess: Rapid prototyping of tools that use scraping as a source of data.

We actually have multiple tools that do realtime scraping as the primary data source. Many of these tools act as a simplified interface to features from another service.

For example, there is a web app we were using for a single feature, and that app doesn't have an API available. Using the app required many clicks, and page loads were slow. By inspecting the HTTP requests, we figured out the minimum number of HTTP requests required to perform our common task.

Using a simple GUI that focusses on that simple task, the user can initiate the task from a single form, and the server will perform the correct HTTP requests and notify the user whether the process was successful or not.

We have plenty of these "micro tools" that wrap web apps to simplify their usage. Usually our micro tools are easier to use (because they're focused on our use case), add integrations with other tools, and are commonly significantly faster as well.

They are easy to build (usually within 40 hours) and are a real time-saver, because the users don't have to keep track of all the logins, don't have to load slow web apps, and don't have to have a guide with screenshots where they need to click.

These web apps sometimes change their request/response structure, but after building a few of these tools, we've found that it doesn't happen that often, and the tools are updated within hours.

This tool generates APIs that provide data, and developers can use those APIs to build other tools. For example, I built https://cloudfetch.info on top of it. Strictly speaking, this tool was extracted from cloudfetch, because I think an API might have more use cases.
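
The "micro tool" pattern described above (replay a minimal, fixed sequence of HTTP requests and tell the user whether it worked) can be sketched roughly like this in Python. Everything here is hypothetical: the `Step` type, the `run_task` function, and the `send` callable, which stands in for whatever HTTP client and session handling you would actually use and just returns a status code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    """One HTTP request in the recorded sequence (hypothetical shape)."""
    method: str
    path: str
    data: Dict[str, str] = field(default_factory=dict)


def run_task(steps: List[Step],
             send: Callable[[Step], int]) -> Tuple[bool, str]:
    """Replay the steps in order and stop at the first failure.

    `send` performs the request and returns the HTTP status code.
    """
    for number, step in enumerate(steps, start=1):
        status = send(step)
        if not 200 <= status < 400:
            return False, (f"step {number} ({step.method} {step.path}) "
                           f"failed with HTTP {status}")
    return True, f"all {len(steps)} steps succeeded"


if __name__ == "__main__":
    # Fake transport so the sketch runs without touching a real site.
    def fake_send(step: Step) -> int:
        return 200

    task = [Step("POST", "/login", {"user": "demo"}),
            Step("POST", "/reports", {"type": "weekly"})]
    print(run_task(task, fake_send))
```

Keeping the transport behind a callable makes it easy to swap in a real client later and to test the sequence logic without hitting the wrapped app.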

I use a very similar tool (https://wrapapi.com/) to generate JSON feeds for From Founders (https://www.fromfounders.com/).

I noticed that lots of these founder interview sites (Indie Hackers, Starter Story, etc.) didn't have APIs or RSS feeds, so I put the MVP together in an afternoon. Really useful for myself even if I don't have a ton of other people subscribed yet.
