
Webwatch - jgrahamc
https://github.com/jgrahamc/webwatch
======
pweissbrod
crontab -l | { cat; echo "* * * * * if curl 'https://mysite/' | grep -q mysubstring; then echo 'found it'; fi"; } | crontab -

~~~
jgrahamc
I've watched this command change repeatedly as you've been editing it to make
it work.

This is a good example of why I didn't do this in the shell.

~~~
chelmertz
Looking through the git history of your project, it doesn't seem you got it
correct the first time either. I don't think the biggest strength of gluing
"UNIX tools" together is "make it work fast & on the first try", but "have
independent tools that do 'one thing' well".

In relation to your tool, I think curl provides very many more features that
are easily accessible through command flags than the limited subset of HTTP
capabilities you expose (for example, basic auth or different set of headers).
The same argument goes for mailing, setting headers or such.

With that said, tools that do one thing and do it well are the ones that
get used. Personally, I'd just prefer it to be a function in <your-shell>
instead :)

~~~
jgrahamc
Yeah. The 'git log' really shows all the changes I had to make to the README.
Oh, and an error message.

~~~
chelmertz
I mean, a tool can be really useful (I write tools this size all the time) but
some of them need tweaks forever. I just think some 'tweaks' are already
solved by other projects, that's why using already written tools that are
somewhat UNIX-y sounds like a good idea to me. That's what I tried to say; of
course I don't want you to write a 100% complete program in the first commit,
that would make everything I write look really bad in comparison. Just be
prepared for that pull request that lands basic auth in your project, and the
next PR after that :)

------
onion2k
Nice idea but it needs work. Firstly, and most importantly, any open source
project lives and dies on its documentation. Without a basic guide to what
the thing even does, no one is likely to use or support the project. Give
some love to your README.md file. _How_ to use the project would be great.

Secondly, at the moment you're just doing a straightforward string comparison
on the <body> of a page[1]. It'd be more useful if I could define something
like a DOM querySelector or a regexp. It'd also be useful to look in the
header at that page title.

[1] At least, I think so. I've never used Go so that's just what I gather from
reading the source.
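
To make the regexp and page-title suggestions concrete, here is a minimal Go sketch using only the standard library. It is not code from webwatch: the helper names `pageTitle` and `matches` and the sample page are made up for illustration, and the `<title>` pattern is deliberately crude (a real HTML parser would be more robust).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// titleRE is a crude, illustrative pattern for the text inside <title>.
// (?is) makes it case-insensitive and lets '.' match newlines.
var titleRE = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// pageTitle returns the trimmed <title> text, or "" if none is found.
// Hypothetical helper, not part of webwatch.
func pageTitle(html string) string {
	m := titleRE.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

// matches reports whether text matches a user-supplied regular expression.
// Hypothetical helper, not part of webwatch.
func matches(pattern, text string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(text), nil
}

func main() {
	// Made-up sample page for the demonstration.
	page := "<html><head><title> Tickets: SOLD OUT </title></head>" +
		"<body>Next release 2015-11-02</body></html>"

	fmt.Println(pageTitle(page))

	ok, err := matches(`\d{4}-\d{2}-\d{2}`, page)
	fmt.Println(ok, err)
}
```

A regexp covers the "looser than a fixed string" case; a DOM selector would need an HTML parsing package outside the standard library.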

~~~
jgrahamc
This is a really short little program I wrote for a quick need I had. I added
a simple README. There are a ton of ways to improve it (regexp, DOM walking,
automatically figure out MX, ...); if people want to do that I'd be happy to
take PRs.

I tend to default to "stick it on Github and see if it helps someone else".

~~~
onion2k
That's a fair comment. I just figured if you were posting it to HN you were
looking for feedback.

~~~
jgrahamc
Happy to have comments and even PRs.

------
kauegimenes
If anyone is looking for something similar that runs in the browser, I
recommend these two extensions:

Chrome: [https://chrome.google.com/webstore/detail/page-monitor/pemhg...](https://chrome.google.com/webstore/detail/page-monitor/pemhgklkefakciniebenbfclihhmmfcd)

Firefox: [https://addons.mozilla.org/pt-br/firefox/addon/check4change/](https://addons.mozilla.org/pt-br/firefox/addon/check4change/)

------
ludbb
I assume I'm jealous of a project that brings nothing new compared to so many
other solutions and still grabs 76 stars (as I write this). It seems, after
all, github stars are another way to say "I'm popular" and not so much that a
project is good.

~~~
jgrahamc
I'm surprised this is popular. It was just a quick thing I wrote to solve a
specific problem that mattered a lot.

------
97-109-107
In a similar vein and quite easy to run locally:
[https://thp.io/2008/urlwatch/](https://thp.io/2008/urlwatch/)

~~~
lcswi
Or Specto.

------
hartator
Are there any perks to passing arguments like this: `-url=http://cloudflare.com`?
I was thinking the right way was `--url http://cloudflare.com` or
`-u http://cloudflare.com`.

~~~
ola
It's using the standard `flag` package that comes with Go, as for why `flag`
parses args this way I don't know.

~~~
scrollaway
The go devs are aware of it, and adamant that this stuff is fine and they
don't want to make the flag package "any more complex" since it's so easy to
install a different one (nevermind that of course people are going to use the
builtin one...). I find this absolutely ridiculous given how nonstandard it is
in today's shell scripts; -flag is supposed to be interpreted as -f -l -a -g
or -f "lag" depending on the -f argument.
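
For reference, a small sketch of how Go's standard `flag` package treats the spellings mentioned above. The `parseURL` helper and the flag set name are made up for this demonstration; only the `url` flag name comes from the thread. As far as the `flag` documentation goes, one or two dashes are equivalent and non-boolean flags accept both `-flag=x` and `-flag x`, while a short alias like `-u` only works if the program defines it separately.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseURL is a hypothetical helper: it parses args with a fresh FlagSet
// that defines a single string flag named "url".
func parseURL(args []string) (string, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress the usage text printed on errors
	url := fs.String("url", "", "URL to fetch")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *url, nil
}

func main() {
	forms := [][]string{
		{"-url=http://cloudflare.com"},
		{"--url=http://cloudflare.com"},
		{"-url", "http://cloudflare.com"},
		{"--url", "http://cloudflare.com"},
		{"-u", "http://cloudflare.com"}, // -u is not defined, so Parse returns an error
	}
	for _, args := range forms {
		u, err := parseURL(args)
		fmt.Printf("%q -> url=%q err=%v\n", args, u, err)
	}
}
```

What the package does not do is GNU-style bundling of single-letter options, which is the behaviour being contrasted here.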

------
ozcanesen
There is a startup for that
[https://monitorbook.com/](https://monitorbook.com/)

~~~
eric_cc
And it was: "Crafted with <3 in San Francisco"

so there is that

~~~
RoseO
I especially love the feature tick "Push Notifications (coming soon)" as a
reason to go for their higher tier subscription.

------
michaelmcmillan
There are quite a lot of edge cases that can be triggered when fetching HTTP
responses. Perhaps a small test suite would be beneficial in order to attract
new developers that don't feel like breaking anything? (-:

------
gavreh
Similar to [https://www.changedetection.com](https://www.changedetection.com)

~~~
carsonreinke
Or [http://followthatpage.com/](http://followthatpage.com/)

------
olouv
I built the same thing using NodeJS a couple of weeks ago, with phantomjs
support (javascript execution), mandrill (emailing) and some other nice
options:
[https://github.com/mgcrea/node-web-watcher](https://github.com/mgcrea/node-web-watcher)

------
ramon
What string? Is it a webpage modification or just a whois modification? What
exactly is it looking for?

~~~
arcatek
I assume that this program kinda works like "curl <page> | grep <string> &&
mailx -s 'Match' <email> <<< 'matched'".

Useful when you want to periodically check if a page changed - I already used
a similar thing to get concert tickets before anyone else.

That being said, I feel like a browser extension might be more useful than a
command line script, for this particular use case.

~~~
michaelmcmillan
A browser extension is handy, but requires your browser to be open in order
to work. A script, on the other hand, can just be thrown up on a server and
forgotten about.

------
TazeTSchnitzel
What about the _absence_ of a phrase? I would like to be able to do

    
    
      webwatch \
        -url=https://example.com/privacy/ \
        -warnmissing="never received a National Security Letter" \
        -from=me@example.net \
        -to=eff@eff.org
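
A rough Go sketch of what the decision logic for such an option might look like. `-warnmissing` is the hypothetical flag proposed above, not something webwatch is known to support, and `alertReason` is a made-up helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// alertReason returns a non-empty reason when an alert should be sent.
// present: alert when this phrase appears in the body.
// missing: alert when this phrase is absent (the hypothetical -warnmissing).
// An empty phrase disables the corresponding check.
func alertReason(body, present, missing string) string {
	if present != "" && strings.Contains(body, present) {
		return fmt.Sprintf("found %q", present)
	}
	if missing != "" && !strings.Contains(body, missing) {
		return fmt.Sprintf("missing %q", missing)
	}
	return ""
}

func main() {
	canary := "never received a National Security Letter"

	// Phrase still on the page: no alert.
	fmt.Printf("%q\n", alertReason("We have "+canary+".", "", canary))

	// Phrase removed from the page: alert.
	fmt.Printf("%q\n", alertReason("Privacy policy.", "", canary))
}
```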

~~~
jgrahamc
[https://github.com/jgrahamc/webwatch/pulls](https://github.com/jgrahamc/webwatch/pulls)

~~~
colinbartlett
I understood your point, but you might be better received if you responded
something like the following:

"That's a great idea! I have no personal need for such a feature, but if you
do and are able to submit a pull request, I'd be pleased to merge it."

------
chdir
Off topic: is it OK to re-post an ignored article [0] the very next day? Just
curious, not complaining :)

[0]
[https://news.ycombinator.com/item?id=10443814](https://news.ycombinator.com/item?id=10443814)

~~~
jgrahamc
I reposted because I got the following email from HN:

    
    
        Hi there,
    
        https://news.ycombinator.com/item?id=10443814 looks good, but didn't
        get much attention. Would you care to repost it? You can do so
        here: https://news.ycombinator.com/repost?id=10443814.
    
        Please use the same account (jgrahamc), title, and URL. When these match,
        the software will give the repost an upvote from the mods, plus we'll
        help make sure it doesn't get flagged.
    
        This is part of an experiment in giving good HN submissions multiple
        chances at the front page. If you have any questions, let us know. And
        if you don't want these emails, sorry! Tell us and we won't do it again.
    
        Thanks for posting good things to Hacker News,
        Daniel

~~~
bmelton
Interesting that the process needed you to repost it for the mods to boost it.
Seems like they could have just fiddled with it without you having to manually
interact with it.

I can't help but wonder what the logic is there.

~~~
eterm
Also interesting that HN is moving (has moved?) toward a curated site. HN asks
for reposts of things they deem good. They also adjust downward the score of
many articles. (As can be seen through large jumps on sites that track article
ranks, some of which will be automatic from the flamewar detector, some of
which are likely manual.)

It seems like we're reaching a "web 3.0" which uses users to do the expensive
bit of an initial sift but then the site admins edit/curate that into their own
vision.

We're moving away from user driven content, back to curated content with user-
sourcing.

~~~
mmahemoff
Web 3 or not, I'd see it as an extension of user sourcing, where users have
various levels of moderation powers. I would guess these HN emails (I also
received one recently and duly reposted) are triggered by some count of admins
voting up unloved posts, maybe from a list filtered by a user karma threshold.

As Jeff Atwood says about StackOverflow, it should be possible for a
sufficiently privileged user to do just about anything staff can do.

Not really a new concept as /. had the notion of metamoderation, but a richer
model with multiple levels of user.

------
cookiecaper
Could you please make this legal in the US by honoring robots.txt and scanning
any links to the ToS for words forbidding "automated access", "crawling",
"spidering", "polling", etc.?

------
lfx
Hey jgrahamc, neat tool. Could you please add bins to the repo? I know, I know,
I can compile it myself. But not everybody has the luxury of installing Go just
to try it...

~~~
jgrahamc
No. I really hate adding binaries to git repos.

~~~
jakevn
You don't need to add it to the git repo itself. You can create a release in
Github and attach binaries.

------
flixic
It's like self-hosted Google Alerts for one page.

~~~
GPGPU
I was going to suggest Google Alerts. It works really well!

~~~
ramon
Then the description should have started like this: Self-hosted Google Alerts
in Go!

That would have saved me a couple of minutes

------
ola
Useful tool for what is a very common task, nice work jgrahamc!

------
fiatjaf
The hard part is to know what string to search for.

------
kefka
Not to demean your work, but I also replicated what you did in Node-red.

And it also goes to twitter. And console. And MongoDB.

It took me 5 minutes.

[http://imgur.com/lbHoTIb](http://imgur.com/lbHoTIb)

(Below is JSON link to replicate what I did)

[http://pastebin.com/i6KhuwbX](http://pastebin.com/i6KhuwbX)

~~~
jgrahamc
Fun.

