
Show HN: Feed Creator 2.0 – Generate RSS feeds from web page elements - k1m
https://createfeed.fivefilters.org/
======
stekern
Cool project!

I've recently created something similar for personal use. I have many websites
(mainly webshops) I want to be notified about changes on, but they don't have
RSS feeds, subscriptions or APIs that you can use.

I set up a cron job that runs daily, scrapes websites according to some
XPaths, and saves the results to a DB. If any new elements have appeared, an
email will be sent out. The biggest challenge is handling false positives:
being able to distinguish between a new element and e.g., a previously seen
element with an updated title, description etc. For websites that directly
expose what seem to be unique server-side identifiers in their HTML, using
those as a primary key seems to work well. If that's not available, the href of
the HTML element seems to be fairly static.
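
A minimal sketch of that keying strategy, assuming the scraping step has already produced a list of dicts. The field names (`id`, `href`, `title`), the `seen` table and both function names are hypothetical, not taken from the commenter's setup:

```python
import sqlite3

def stable_key(item):
    """Prefer a server-side id if the page exposes one, else fall back to the href."""
    return item.get("id") or item["href"]

def find_new_items(conn, items):
    """Record unseen items and return them; items whose key is already stored are skipped."""
    conn.execute("CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY, title TEXT)")
    new_items = []
    for item in items:
        key = stable_key(item)
        already_seen = conn.execute(
            "SELECT 1 FROM seen WHERE key = ?", (key,)
        ).fetchone()
        if already_seen is None:
            conn.execute(
                "INSERT INTO seen (key, title) VALUES (?, ?)", (key, item["title"])
            )
            new_items.append(item)
    conn.commit()
    return new_items
```

Because only the key decides whether an item is new, a changed title or description on a known key does not trigger another email. The trade-off is that a site which changes its hrefs will make every item look new once.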

Do you have any thoughts on the issue of false positives and unique
identifiers?

~~~
k1m
Thanks! I haven't given the issue of unique identifiers too much thought
because in most cases I assume the item URL is less likely to change than the
text and will serve as the unique identifier for the RSS reader. It's possible
to create feeds without item URLs in Feed Creator, so in those cases maybe
letting users select an identifier to be the guid element in the feed would be
helpful.
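
As a rough illustration of the guid idea, here is a sketch that builds an RSS 2.0 `<item>` with Python's standard library. It is not Feed Creator's code; `build_item` and its `identifier` parameter are hypothetical. It relies only on the RSS 2.0 convention that a `guid` with `isPermaLink="false"` is an opaque string rather than a URL:

```python
import xml.etree.ElementTree as ET

def build_item(title, link=None, identifier=None):
    """Build an RSS <item>, using the link as guid when there is one,
    otherwise a user-chosen identifier marked as a non-permalink."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    if link:
        ET.SubElement(item, "link").text = link
        guid = ET.SubElement(item, "guid", isPermaLink="true")
        guid.text = link
    elif identifier:
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = identifier
    return item

# Item without a URL: the guid comes from a value picked out of the page.
print(ET.tostring(build_item("Price drop", identifier="sku-1234"), encoding="unicode"))
```

With a stable guid, a reader that tracks guids can treat an edited title as the same entry instead of a new one.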

Generally though, I'm hoping users understand that feeds produced in this way
could be a little more brittle than if the site offered its own feed.

One difference with your approach is that you have the data from previous
fetches in your database. With Feed Creator everything related to producing
the feed (source URL, selectors, filters, etc.) is embedded in the feed URL to
avoid having to record data on the server. So each request is treated as if
it's the first one - the server doesn't know if an item in the feed is new or
old. If we referred to feed data from previous fetches, maybe we could let
users introduce a delay before having a new item added to the feed. This might
help in cases where a typo is spotted and corrected by the publisher minutes
after publication. Can't think of a much better way of avoiding false
positives at the moment though.
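
The delay idea could look something like the sketch below. It assumes the server keeps a first-seen timestamp per item URL, which Feed Creator as described above does not do; `items_past_delay`, the `first_seen` dict and the 30-minute default are all made up for illustration:

```python
from datetime import datetime, timedelta

def items_past_delay(first_seen, items, now, delay=timedelta(minutes=30)):
    """Return only items first seen at least `delay` ago.

    `first_seen` maps item URL -> datetime of the first fetch that contained it
    and is updated in place, so it has to persist between fetches.
    """
    ready = []
    for item in items:
        seen_at = first_seen.setdefault(item["url"], now)
        if now - seen_at >= delay:
            ready.append(item)
    return ready

# Example: an item with a typo is held back, then released in its corrected form.
first_seen = {}
t0 = datetime(2020, 6, 4, 12, 0)
print(items_past_delay(first_seen, [{"url": "/a", "title": "Teh news"}], t0))
print(items_past_delay(first_seen, [{"url": "/a", "title": "The news"}],
                       t0 + timedelta(minutes=45)))
```

The item is emitted with whatever text it has at release time, so corrections made within the delay window never reach subscribers as a separate entry.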

------
k1m
Happy to get feedback and answer any questions about this here.

Here are two feeds we made earlier to give you an idea of what Feed Creator is
supposed to do. The links below will pre-fill the form with the parameters
you'd enter and produce a preview of the RSS:

* Chomsky.info articles: [https://createfeed.fivefilters.org/index.php?url=https%3A%2F...](https://createfeed.fivefilters.org/index.php?url=https%3A%2F%2Fchomsky.info%2Farticles%2F&in_id_or_class=main_container)

* Latest articles by John Pilger: [https://createfeed.fivefilters.org/index.php?url=http%3A%2F%...](https://createfeed.fivefilters.org/index.php?url=http%3A%2F%2Fjohnpilger.com%2Farticles&item=.entry&item_desc=.intro&item_date=.entry-date)
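
Since all the settings travel in the query string, links like the two above can be assembled and read back with standard URL tooling. A small sketch, using only the parameter names visible in those links (`url`, `item`, `item_desc`, `item_date`, `in_id_or_class`); the helper names are hypothetical, and this builds the form pre-fill link shown here, which may differ from the final feed URL:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "https://createfeed.fivefilters.org/index.php"

def build_prefill_url(source_url, **selectors):
    """Encode the source page and selectors into a single shareable URL."""
    params = {"url": source_url, **selectors}
    return f"{BASE}?{urlencode(params)}"

def read_params(prefill_url):
    """Recover the settings from the URL; nothing has to be stored server-side."""
    query = urlsplit(prefill_url).query
    return {name: values[0] for name, values in parse_qs(query).items()}

link = build_prefill_url(
    "http://johnpilger.com/articles",
    item=".entry",
    item_desc=".intro",
    item_date=".entry-date",
)
print(link)
print(read_params(link))
```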

~~~
Ambroisie
Is this service self-hostable?

~~~
k1m
Yes. If you have access to a server with PHP, you should be able to run it
yourself. We have a simple PHP file you can download to test the compatibility
of your server.[0]

We sell the self-hosted version[1] and have a blog post with some instructions
if you want to run it on a VPS[2].

[0] Zip file with PHP file inside:
[https://createfeed.fivefilters.org/fc_compatibility_test.zip](https://createfeed.fivefilters.org/fc_compatibility_test.zip)

[1]
[https://www.fivefilters.org/pricing/](https://www.fivefilters.org/pricing/)

[2] [https://blog.fivefilters.org/2020/06/04/feed-creator-2.html](https://blog.fivefilters.org/2020/06/04/feed-creator-2.html)

------
docuru
I’d be worried about copyright issues.

In many cases, the client doesn’t know or care about content copyright and is
just crawling.

(P.S. I used to develop a blogging platform and would find RSS links to seed
content.)

------
jslakro
This seems like a halfway point before building a web scraping solution. Adding
support for automation through Zapier or IFTTT could help close the circle.
Nice project.

~~~
k1m
Thank you! :) I don't have much experience with Zapier, but I assume the RSS
feed this produces can be plugged into both Zapier and IFTTT if they have RSS
support. Or maybe you had something else in mind?

~~~
jslakro
You're right. RSS could be enough to use for automation.
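
For a sense of what an automation on top of such a feed has to do, here is a minimal sketch that pulls titles and links out of RSS 2.0 text with the standard library. The sample feed and `feed_items` are invented for illustration; a real setup would fetch the feed URL on a schedule and act only on entries it has not seen before:

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content standing in for a fetched feed.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>First</title><link>https://example.com/1</link></item>
<item><title>Second</title><link>https://example.com/2</link></item>
</channel></rss>"""

def feed_items(rss_text):
    """Return the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]

for entry in feed_items(SAMPLE_FEED):
    print(entry["title"], entry["link"])
```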

