
Show HN: Dffer – Get notified when a website changes - dffer
https://www.dffer.com/
======
awill
There are many, many competitors in this space (visualPing, changeTower,
versionista, followThatPage, trackly.io). Most have similar pricing to Dffer,
but some are completely free (possibly with worse UI).

I wonder whether this sort of freemium model works, because there are some
people whose use case (like mine) absolutely isn't worth spending money on,
but I'd like more than 3 checks for free. Obviously I'll take a worse UI
that's free over spending money on such a ridiculously simple app. I could
literally buy an EC2 instance just to run 10-20 daily cron jobs for less than
signing up for Dffer.

My use case is to use trackly.io to see when new firmware/software updates
come out for devices I own. But of course, that's never something I'd be
willing to pay for.

~~~
aakilfernandes
There are some users who would actually rather pay $5/mo than use a free
service. That's because paying money gives the impression (real or not) that
the service is more likely to stick around and be fixed when it goes down.

~~~
devdad
For me, $5 for not having to set up cron jobs is worth it.

~~~
awill
I don't think price is about the task (cronjobs), it's about how you value the
output.

If I'm using cronjobs for some trivial unimportant task (like checking for
firmware updates), I am not willing to pay. If I'm using cronjobs for super
important competitive analysis, sure I'd pay.

------
rahimnathwani
Does this work with content that is loaded dynamically with JS? I've used
[http://www.changedetection.com/](http://www.changedetection.com/) in the
past, but it only scrapes the HTML.

~~~
bluepeter
We offer “full page” rendering using PhantomJS (soon to be headless Chrome)
for Versionista, a competing product. It works rather well, but it introduces
its own set of problems... eg you need timers to ensure all the AJAX calls
have completed, and so on.
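
A minimal sketch of the "timers" idea, in Python rather than PhantomJS: keep polling the rendered page until it stops changing for a few polls in a row. The `snapshot` callable is a stand-in (an assumption) for whatever returns the current DOM from your headless browser; this is not how Versionista or any other product actually does it.

```python
import time
from typing import Callable


def wait_until_stable(
    snapshot: Callable[[], str],
    quiet_polls: int = 3,
    interval: float = 0.5,
    timeout: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll snapshot() until it returns the same value quiet_polls times in a row.

    Raises TimeoutError if the page keeps changing past the timeout budget.
    """
    last = snapshot()
    stable = 1      # consecutive identical snapshots seen so far
    waited = 0.0    # time spent sleeping between polls
    while stable < quiet_polls:
        if waited >= timeout:
            raise TimeoutError("page did not settle in time")
        sleep(interval)
        waited += interval
        current = snapshot()
        if current == last:
            stable += 1
        else:
            # Content changed (e.g. an AJAX call landed), so start counting again.
            last, stable = current, 1
    return last
```

The trade-off is latency: every page costs at least `(quiet_polls - 1) * interval` seconds, and pages with tickers or rotating ads may never settle, hence the timeout.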

~~~
rahimnathwani
Thanks - just signed up.

I found the 'crawl' settings a little confusing, as the first item is about
loading additional URLs (which is what I thought I wanted, as JS is often
loaded from separate URLs rather than being embedded in the HTML), but then
there's a later option about a full browser. I chose this latter option, and
left the other one unchecked. I hope that's correct.

------
CHANCECHANEL
I wonder how it's implemented. I think it's possible to run this type of
service at a very large scale for extremely low cost using AWS Lambda and
CloudWatch Events.

~~~
bluepeter
That’s exactly what we use for a competing product. The difficulty is less in
the crawling, and more in the scheduling and difference visualization
(including filters). Presenting salient results without a ton of false
positives is hard work.
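
To give a feel for the filtering part, here is a rough Python sketch: scrub known noise (cache-busting asset parameters, timestamps) from both versions before diffing, so only the remaining changes are reported. The patterns in `NOISE_PATTERNS` are made-up examples, not anyone's production filters; a real list has to be tuned per site.

```python
import difflib
import re

# Hypothetical noise patterns, for illustration only.
NOISE_PATTERNS = [
    # Cache-busting query parameters on assets, e.g. style.css?v=1234
    (re.compile(r"([?&])(v|ver|ts|cb)=[0-9a-f]+", re.I), r"\1\2=X"),
    # ISO-like timestamps, e.g. 2018-01-05 10:22:31
    (re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(:\d{2})?\b"), "<timestamp>"),
]


def normalize(html: str) -> list[str]:
    """Replace noisy fragments, then return the non-empty lines, stripped."""
    for pattern, replacement in NOISE_PATTERNS:
        html = pattern.sub(replacement, html)
    return [line.strip() for line in html.splitlines() if line.strip()]


def meaningful_diff(old: str, new: str) -> list[str]:
    """Return only the added ('+ ') and removed ('- ') lines after normalization."""
    diff = difflib.ndiff(normalize(old), normalize(new))
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

An empty result means the page only changed in ways the filters consider noise, so no alert is needed.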

------
fabiospampinato
I've made something similar a while ago called "rssa" [1]. You give it some
urls, some way of extracting information from them (via a jQuery selector,
regex etc.) and if things change it tells you.

I use it for tracking prices on Amazon, some stats about my repositories and
so on.

[1]
[https://github.com/fabiospampinato/rssa](https://github.com/fabiospampinato/rssa)
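
For anyone who wants the gist without reading the repo, the core loop looks roughly like this in Python. This is an illustrative sketch of the general extract-and-compare idea, not rssa's actual code or API; `extract`, `check`, and the `state` dict are hypothetical names.

```python
import re
from typing import Optional


def extract(text: str, pattern: str) -> Optional[str]:
    """Return the first capture group of pattern (or the whole match), else None."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)


def check(name: str, text: str, pattern: str, state: dict) -> Optional[tuple]:
    """Compare the extracted value with the one remembered in state.

    Returns (old, new) when the value changed since the last call and None
    otherwise. The first sighting is only recorded, not reported. Persisting
    state between runs (e.g. as a JSON file) is left to the caller.
    """
    new = extract(text, pattern)
    seen_before = name in state
    old = state.get(name)
    state[name] = new
    if seen_before and old != new:
        return (old, new)
    return None
```

Fetching the page and sending the notification are deliberately left out; `text` is whatever your HTTP client returned.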

~~~
jdc0589
Seems like a lot of us did this. My version let you write little JavaScript
modules to extract data from a page (which had been parsed by jsdom already),
and then the change tracking picked things up.

------
jvagner
Here's something I've wished I'd had over the 20 years of web development
efforts I've been involved in: for production sites, a site that crawls a path
regularly and saves a copy of the site in a timeline fashion.

How many times I've thought, "Man, we've made a lot of changes in the last 9
months and no one has a visual narrative of all of those changes..."

~~~
dewski
I built exactly this once, was fun to set and forget and look back on it in a
few months. It calculated the diff and you could see obvious spikes when big
changes were rolled out. You saw a lot of big spikes on say Apple.com when new
products were announced or marketed, whereas Reddit.com was always changing
and I couldn't detect any meaningful patterns.
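
A rough Python sketch of the spike detection, assuming text snapshots (the original worked on screenshots, so this is a simplified stand-in, and the `threshold` value is arbitrary):

```python
import difflib


def change_score(old: str, new: str) -> float:
    """0.0 means identical snapshots, 1.0 means no lines in common."""
    matcher = difflib.SequenceMatcher(None, old.splitlines(), new.splitlines())
    return 1.0 - matcher.ratio()


def spikes(snapshots: list[str], threshold: float = 0.5) -> list[int]:
    """Indices of snapshots that differ from their predecessor by more than threshold."""
    return [
        i
        for i in range(1, len(snapshots))
        if change_score(snapshots[i - 1], snapshots[i]) > threshold
    ]
```

A site that is always changing scores high on every snapshot, which is why a fixed threshold tells you little there.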

It also had ffmpeg in the backend that would create a timelapse for you on
demand. I built it when we rolled out some rebranding/homepage changes on
GitHub.com and nobody really knew off the top of our heads when exactly the
last time we updated it was. Git only tracked code changes, no visual changes.

It was originally built on top of
[http://www.paulhammond.org/webkit2png/](http://www.paulhammond.org/webkit2png/)

------
tectonic
I use Huginn for this -
[https://github.com/huginn/huginn](https://github.com/huginn/huginn)

------
veritas3241
What I really want is something that can track changes to documents (PDFs, but
any doctype really) hosted on some site. The URL may or may not change, but
the location of the document URL on the page probably won't be changing.
Really I'm just interested in the MD5, so no need to track specific changes.

I've got some custom scripts but paying somebody would be nice. Management of
these becomes a pain after 10 or so.
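
The hash comparison itself is small; a sketch in Python, assuming the documents have already been downloaded as bytes and are keyed by a label of your choosing (such as the link's position on the page) rather than by URL, since the URL may change. `changed_documents` is a hypothetical helper name.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """MD5 hex digest of a document's bytes (swap in sha256 if you prefer)."""
    return hashlib.md5(data).hexdigest()


def changed_documents(previous: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Return labels whose content hash differs from the stored one.

    `previous` maps label -> last known digest and is updated in place.
    Labels seen for the first time are recorded but not reported.
    """
    changed = []
    for label, data in current.items():
        digest = fingerprint(data)
        if label in previous and previous[label] != digest:
            changed.append(label)
        previous[label] = digest
    return changed
```

The management pain is everything around this: finding the document link on each page, storing the digests, and scheduling the runs.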

~~~
dmoena
Hey, can you tell us more about why you need this? It would help
to know if there's a more general need and if so, maybe someone gets motivated
to implement it...

------
maturz
I've been using a Firefox add-on for this

[https://addons.mozilla.org/en-US/firefox/addon/update-scanne...](https://addons.mozilla.org/en-US/firefox/addon/update-scanner/)

It works quite well but there's not much in terms of excluding content. When I
needed to do that I just ran it through a simple script that filtered for me
(Node.js).

------
mrskitch
Curious if this works on SPAs, and if so, what are you using?

~~~
bluepeter
It’s possible to do this for SPAs, yes. We use PhantomJS for rendering
JavaScript at Versionista... soon we will swap to headless Chrome.

------
jdormit
What is the use case for this? Why would I pay for it?

~~~
graysonk
> What is the use case for this?

I've written something similar for a company who wanted to monitor their
rival's website for news about expansion.

> Why would I pay for it?

I'm not sure either. A quick script in a cron job could also do this.

~~~
sho
> A quick script in a cron job could also do this

I love comments like this. You couldn't hope for a better example of "shit's
easy syndrome".

Yeah, a quick script in a cron job. Oh, but now we need 2 pages monitored. Now
we need 100. Now provide an interface for marketing to add and remove pages.
Make sure it emails these N people (different for each page of course!). Oh,
this has to be up 24/7 so we'll need a server and monitoring and a test
environment and deploy strategy. Your script is giving false positives because
of changing asset timestamps. Oh, your script needs to provide a visual
indication of what changed and when. Etc etc etc et cetera.

That's why you pay for things like this. Because it's actually a shitload of
work.

I recently used a similar site (visualping.com) to score a pair of airpods,
which are in tight supply, by watching my local apple store for availability.
Worked beautifully. Write my own script, are you kidding?

~~~
Karrot_Kream
> Yeah, a quick script in a cron job. Oh, but now we need 2 pages monitored.
> Now we need 100. Now provide an interface for marketing to add and remove
> pages. Make sure it emails these N people (different for each page of
> course!). Oh, this has to be up 24/7 so we'll need a server and monitoring
> and a test environment and deploy strategy. Your script is giving false
> positives because of changing asset timestamps. Oh, your script needs to
> provide a visual indication of what changed and when. Etc etc etc et cetera.

> I recently used a similar site (visualping.com) to score a pair of airpods,
> which are in tight supply, by watching my local apple store for
> availability. Worked beautifully. Write my own script, are you kidding?

You are totally conflating use cases. For most people, monitoring 2/3 things,
a cron script works fine. In fact, I have one that's running right now and it
Just Works (TM). By the time I need to provide visual diffs and have bulk
email logic, I'm probably going to use an external tool (or whip it up as a
side project if I think it's worth my time).

You can't equate the n=1 case and the n=10000 case.

~~~
sho
> You can't equate the n=1 case and the n=10000 case.

Fair enough, but the comment I was replying to was equating those cases
too. I was simply pointing out that maybe, just maybe, a service like this
might have some value above and beyond a 5 minute script and a cron job. And
indulging in a small rant against the "I could do that in a script/I could
write that in a weekend" etc mindset.

------
lightedman
Long ago, web browsers used to actually have this capability built-in. I
remember doing it with Netscape Navigator.

------
bootcat
I've wanted to do something similar for so long!! I'm also interested in
whether you handle AJAX sites, and if so, how?

------
laktek
If you want to build something like this on your own, check Page.REST.

