
Monitors.txt - lazy webapp monitoring - eliot_sykes
http://monitorstxt.org/
======
cheald
I don't think I like this idea.

* Exposing a "monitors.txt" is a potential security hole that reveals more about your infrastructure than you meant to.

* Restricting monitors.txt to a given service (by htaccess IP restriction?) is brittle and gets you out of "write feature"-land and into "do sysadmin"-land, which you're trying to avoid anyhow.

* Cucumber is an unnecessary extra layer that doesn't do much besides just frustrate the user who has to write _just the right_ English syntax to get what he wants. See also: Monit config definitions. The files read nicely once you get them configured, but it can be a downright bear to find out what just the right magic combination of keywords and phrases is to get it to do what you want.

* If you want client-side managed monitoring, why not have some interface to your chosen monitoring service that you invoke via your Cap deploy? That's something that you can make reality today, rather than waiting on service providers to implement it.

The concept of being able to define your monitoring as a part of your
application and then just deploying it with the rest of the package is really
appealing, but you still have to go tell monitoring services where to find
your monitoring file, you still have to wait on monitoring services to
implement support for it, and monitoring just _isn't that difficult of a
problem_. How often do you change your monitoring rules?

~~~
icebraining
_Exposing a "monitors.txt" is a potential security hole that reveals more
about your infrastructure than you meant to._

Everything you put in monitors.txt has to be publicly accessible anyhow. What
could it reveal that the actual website doesn't?

~~~
cheald
Not true. I can have a service monitor
"<http://foo:bar@mysite.com/password/protected/endpoint/>" or
"[http://mysite.com/protected?auth_token=2caf5e77cfa057ad2f36e...](http://mysite.com/protected?auth_token=2caf5e77cfa057ad2f36e6fafd2b95787dd5a5edfdbeae8f73a5f7a7f892fb0a5f83d)"
just fine.

Passwords are security by obscurity, sure, but putting them in a
world-readable file at a common known location is the height of ridiculous.

~~~
icebraining
Then you can put monitors.txt behind that password-protected endpoint too. My
point is that whoever you need to give access to the file also has access to
the actual website, which has all the information (it has to, since it's
supposed to be able to test against it).

------
fleitz
English is an incredibly crappy language for writing tight specs in. The only
appeal of English as a programming language is that people who cut big checks
speak it. It's completely inappropriate for expressing formal logic.

The problem is when people think "Hey, I have a computer science problem. I
know, I'll express that problem in English!" and now they have two problems.

~~~
prodigal_erik
Only until they "parse" the English using regexes. Then they have three
problems. Four if you count "there are lots of trivial variations on the
correct English phrasing, but none of them work, so you have to refer to the
exact grammar anyway, this just made it bulkier."

~~~
cheald
That's my exact gripe with Cucumber. It's basically just regex soup that's
designed to make other people think that you just feed raw intent to your
computer and it divines the correct behavior. I honestly don't see how it adds
actual value, when Ruby is already so damn readable.

~~~
minhajuddin
I've tried using cucumber a couple of times and failed, and I knew there was
something wrong with it. Thanks for pointing that out.

~~~
cheald
Lots of people like it, so I'm sure that there's an argument to be made for
it, but it feels unnecessary and crufty. Personally, I think that the
strongest case for it is that it lets your manager without any technical
ability feel like he's a useful part of the process, but it just seems like a
solution in search of a problem.

Something like Steak (<https://github.com/cavalle/steak>) feels far more
comfortable and readable to me.

------
nikcub
I thought this was going to be as easy as 'if you can't reach /monitors.txt,
then the site is down'

but seriously, does nobody else see a problem with your monitoring
configuration being hosted on the site being monitored? It is a bit like
asking a hospital patient to keep an eye on his own charts and to let you know
if anything goes wrong ..

do people really switch monitoring companies often enough to justify having a
standard configuration format?

ps. cucumber syntax is horrible because it perpetuates a 'everybody speaks
english' view of the world

~~~
mseebach
> but seriously, does nobody else see a problem with your monitoring
> configuration being hosted on the site being monitored?

So, if the file is unreachable, something is probably very very wrong.

Malicious changing of the file could be addressed by firing an alarm when the
file changes.

> It is a bit like asking a hospital patient to keep an eye on his own charts
> and to let you know if anything goes wrong ..

No, it's a bit like leaving the chart with the patient, trusting that he won't
destroy it.

> do people really switch monitoring companies often enough to justify having
> a standard configuration format?

Do people really switch monitoring companies rarely enough to justify keeping
a myriad of incompatible configuration formats?

This also has the benefit of you owning all of your own data.

> ps. cucumber syntax is horrible because it perpetuates a 'everybody speaks
> english' view of the world

Cucumber syntax is great because a lot of people are already used to it. Oh,
and it's completely language independent too.
[https://github.com/cucumber/cucumber/tree/master/examples/i1...](https://github.com/cucumber/cucumber/tree/master/examples/i18n)
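The alarm on file changes suggested above could be sketched roughly as follows. This is only an illustration: the method name `monitors_changed?` and the idea of storing a SHA-256 digest of the last accepted file are assumptions, not something the monitors.txt proposal specifies.

```ruby
require "digest/sha2"

# Returns true when the freshly fetched monitors.txt body no longer matches
# the digest recorded the last time the owner approved the file.
# (Hypothetical helper; the digest storage is left to the caller.)
def monitors_changed?(body, known_digest)
  Digest::SHA256.hexdigest(body) != known_digest
end

approved = Digest::SHA256.hexdigest("GET http://example.com/\n")
monitors_changed?("GET http://example.com/\n", approved)   # unchanged file
monitors_changed?("GET http://evil.example/\n", approved)  # edited file
```

A provider using something like this would alert the site owner on the second call and keep running the previously approved checks until the change is confirmed.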

------
ricardobeat
I'd much rather have something akin to an appcache.manifest file:

    
    
        # monitors.txt - see http://monitorstxt.org for more info
    
        GET:
            http://monitors.txt
            http://otherservice.monitors.txt
        
        POST:
            http://monitors.txt/service q:lalala user:trololol
            http://monitors.txt/service2 id:250
        
        RESOLVE:
            monitorstxt.org 207.97.227.245
        
        PERFORMANCE:
            http://duckduckgo < 2s
            http://duckduckgo/images < 3s
    

IMO feature testing should be part of your test suite pre-deploy, not the
monitoring service.
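For what it's worth, a manifest-style file like the one above is cheap to parse. The sketch below assumes a section header is a line made only of capital letters followed by a colon, that `#` starts a comment line, and that each entry is split on whitespace; none of these rules come from a spec, and `parse_manifest` is a made-up name.

```ruby
# Parses a manifest-style monitors.txt into { "SECTION" => [[field, ...], ...] }.
# Assumed grammar (not from any spec): "NAME:" opens a section, "#" lines are
# comments, every other non-blank line is an entry of the current section.
def parse_manifest(text)
  sections = Hash.new { |hash, key| hash[key] = [] }
  current = nil
  text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?("#")
    if line =~ /\A([A-Z]+):\z/
      current = $1
    elsif current
      sections[current] << line.split(/\s+/)
    end
  end
  sections
end

sample = "# monitors.txt\nGET:\n    http://monitors.txt\n\nPERFORMANCE:\n    http://duckduckgo < 2s\n"
checks = parse_manifest(sample)
checks["PERFORMANCE"] # one entry: URL, comparison operator, threshold
```

Entries that appear before any section header are silently ignored here; a validator would probably want to report them instead.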

~~~
merlincorey
This syntax seems a lot less confusing and more powerful than the subset of
English presented.

Another bit of syntax that I think is missing is something like:

GAUGE:

    
    
      http://domain/stats/logins 60s
    

COUNTER:

    
    
      http://domain/stats/logins 60s
    

AVG:

    
    
      http://domain/stats/logins 24h
    

Something like this would allow you to set up not only with simple external
monitoring systems, but with internal systems like Nagios, Ganglia, ZenOSS,
etc.

Of course, I do have to agree with many people that I am a bit uneasy
PUBLISHING all of this data for all to see.

~~~
eliot_sykes
Interesting, incorporated this into the YAML sample:
[https://github.com/eliotsykes/monitorstxt/blob/gh-pages/moni...](https://github.com/eliotsykes/monitorstxt/blob/gh-pages/monitors.yml)

------
kanwisher
Really love the idea of text file based monitoring, hate the cucumber
interface. I stomach it in rails but I'm not a big fan of English in my code

~~~
politician
I'm curious about this -- don't you use APIs which use largely, if not
universally, English identifiers? Or is there a version of, say, the .NET
Framework where the API itself is written in French, German, or Spanish? Or
what language has a custom parser for each language so that you can write
keywords ("if", "for", " _do_ ") in your native language?

~~~
eCa
Believe it or not, Visual Basic (and every built-in function) in Excel is
translated in localized versions, and of course the English keywords are _not_
kept. It's a mess.

------
rednaught
Fantastic idea. I see humans.txt (<http://humanstxt.org/>) was an influence.
Why not stay in a concise format like that?

~~~
eliot_sykes
I've not been able to figure out an alternative to cucumber that has the same
flexibility and readability.

For example, I've not a clue yet on how to improve on this example:

    
    
      Feature: DuckDuckGo Search
        It should continue to kick ass
        And I should be able to search
    
        Scenario: Homepage performance
          When I go to http://www.duckduckgo.com
          Then I should see the page downloaded in less than 0.5 seconds
          And I should see the page assets downloaded in less than 2 seconds
    
        Scenario: Search over SSL
          Given I go to https://www.duckduckgo.com
          When I fill in "q" with "site:news.ycombinator.com"
          And I press submit
          Then I should be on https://duckduckgo.com/?q=site%3Anews.ycombinator.com
          And I should see "Hacker News" within "h2"

~~~
CJefferson
The problem is that this looks tempting, but what if I change:

    
    
        Then I should see the page downloaded in less than 0.5 seconds
    

to

    
    
        Then I want the page to download in under a second.
    

The natural English looks very tempting, but the actual terms you can use are
very restrictive.
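To make the restriction concrete, here is a sketch of how a step is typically matched: one regular expression per accepted phrasing. The pattern and the method name `parse_download_step` are illustrative assumptions, not taken from Cucumber or from any monitors.txt implementation.

```ruby
# Hypothetical step pattern: only this exact wording is accepted.
DOWNLOAD_STEP = /\AThen I should see the page downloaded in less than (\d+(?:\.\d+)?) seconds\z/

# Returns the limit in seconds, or nil when the phrasing is not recognised.
def parse_download_step(line)
  match = DOWNLOAD_STEP.match(line.strip)
  match && match[1].to_f
end

parse_download_step("Then I should see the page downloaded in less than 0.5 seconds")
parse_download_step("Then I want the page to download in under a second.")
```

The first call yields a number; the second yields `nil`, even though both sentences mean roughly the same thing to a human reader.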

~~~
eliot_sykes
Thinking monitors.txt would need a validator, plus the monitoring providers
can alert customers when they've entered something invalid.

------
crawshaw
Brilliant.

Given the target audience, I think a more concise description language than
pseudo natural language would work better. Regardless, such a service can
bring basic alerting to the huge number of sites which currently have nothing.

~~~
eliot_sykes
Thanks for the feedback. I'd love to hear any ideas on language alternatives.

~~~
bmurphy
I agree with the parent. Something very simplistic in json or yaml would be
preferred.

------
gojomo
Interesting idea. The proliferation of convention-dictated top-level URLs
offends the design sense of many web architects. Possible alternatives are to:

• shoehorn a pointer to a varying URL inside an existing convention-dictated
place – as for example with sitemap pointers inside robots.txt files.

• let the resource live anywhere but specify its location to consuming
services via some out-of-band mechanism. That is, you still have to tell any
monitoring provider where your particular monitoring-spec lives, probably via
its signup interface, but you can still use the same format and unmoving file
with multiple providers.
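The first alternative could look something like the sketch below. Note that the `Monitors:` directive is entirely made up here, modelled on the existing `Sitemap:` line; no standard or monitoring provider is known to define it, and `monitors_location` is a hypothetical helper.

```ruby
# Looks for a pointer to the monitoring spec inside robots.txt text.
# "Monitors:" is an invented directive used only for illustration.
def monitors_location(robots_txt)
  robots_txt.each_line do |line|
    match = line.match(/\A\s*Monitors:\s*(\S+)/i)
    return match[1] if match
  end
  nil
end

robots = "User-agent: *\nSitemap: https://example.com/sitemap.xml\nMonitors: https://example.com/ops/monitors.txt\n"
monitors_location(robots)
```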

~~~
eliot_sykes
I'm keen on your idea for an out-of-band mechanism, and it would be needed for
the obfuscated URL idea:

 _"For the more gung-ho, obfuscate the URL and use SSL
(e.g. <https://yoursite.tld/something-unguessable/monitors.txt>) and tell your
provider where to find your monitors.txt"_

~~~
mseebach
Another approach could be to support HTTP Basic auth.
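A minimal sketch of what the provider side of that might look like, using Ruby's standard `net/http`. The method name `monitors_request` and the credentials are placeholders; the request is only built here, not sent.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a GET request for monitors.txt that carries
# HTTP Basic credentials. Hypothetical helper for illustration.
def monitors_request(url, user, password)
  request = Net::HTTP::Get.new(URI(url))
  request.basic_auth(user, password)
  request
end

request = monitors_request("https://yoursite.tld/monitors.txt", "monitor", "s3cret")
request["Authorization"] # the "Basic ..." header the server would check
```

Basic auth sends the credentials with every request in an easily decoded form, so this only makes sense over HTTPS.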

------
LukeShu
I like the idea, but not the URL. How about moving it to
"/.well-known/monitors.txt" in conformance with RFC 5785?

~~~
eliot_sykes
Thanks for this, looks like a step in the right direction, do you know what
successful applications there have been to this RFC?

<http://tools.ietf.org/html/rfc5785>

~~~
LukeShu
The biggest one I know of is "host-meta" (RFC 6415), which is used by
webfinger (properly supported by gmail), and a few other "social"-type
protocols.

------
sunnydaynow
Nice idea. The monitors.txt should/could also contain selenium or imacros
scripts for transaction monitoring, for use by monitoring services like
alertfox or browsermob.

------
cultureulterior
Ugh, natural language. Never works well.

~~~
eliot_sykes
I'm open to implementing alternative formats to Gherkin.

If you've got an idea of some pseudo code you could write that'd make the
monitors.txt idea more attractive, please write some, I'm keen for feedback.

------
prodigal_erik
"Should see the page assets" is kind of vague. You don't want to get 3 AM
alerts because somebody else's transcluded widget is slow, if you can't do
anything about it and the page is usable without it anyway. You probably want
to distinguish your resources and third parties which may or may not have
their own monitors.txt, as well as your resources that are expected to be slow
because they're not static.

And is this presuming a headless browser? There are a lot of poorly-authored
documents which blow up if their js can't get certain resources, even though
the URLs don't appear in the markup where a scraper would see them. I'm all
for having a pure-HTML mode that always works, because that's basic
competence, but I'd want my progressive enhancements monitored as well at a
lower priority.

------
eliot_sykes
Put up some ideas for what yaml, xml, and json formats could be:

Cucumber [https://github.com/eliotsykes/monitorstxt/blob/gh-pages/moni...](https://github.com/eliotsykes/monitorstxt/blob/gh-pages/monitors.txt)

YAML [https://github.com/eliotsykes/monitorstxt/blob/gh-pages/moni...](https://github.com/eliotsykes/monitorstxt/blob/gh-pages/monitors.yml)

JSON [https://github.com/eliotsykes/monitorstxt/blob/gh-pages/moni...](https://github.com/eliotsykes/monitorstxt/blob/gh-pages/monitors.json)

XML [https://github.com/eliotsykes/monitorstxt/blob/gh-pages/moni...](https://github.com/eliotsykes/monitorstxt/blob/gh-pages/monitors.xml)

------
kirillzubovsky
Love the idea of using natural language to write all the specs! Also, got
amused by reading the HN comments, where all jedis, ninjas and rockstars are
arguing against using English... kind of expected.

------
eliot_sykes
Big thanks to HNers for the encouraging feedback. Based on it I'm going to:

\- Put some monitors.txt examples in JSON, YAML, Gherkin and XML up on
<http://monitorstxt.org> to help get the syntax right and satisfy the majority
of language preferences. Feedback/contributions welcome, fork on github here:
<https://github.com/eliotsykes/monitorstxt>

\- Continue work on prototype monitoring app, with support for the above
formats

\- Open up the prototype to interested devs

------
delano
I like the idea. What about if the definition was in the form of HTTP
requests, something like:

    
    
        GET /
    
        GET /login
    
        POST /login
        user={{CUSTID}}&pass={{PWORD}}
    
        GET /search?q={{QUERY}}
    

The monitors.txt file could contain pointers to the definition files (one file
per usecase), ideally with arbitrary metadata.

If there's some kind of consensus, now or later on, I'll implement it in our
monitoring service (blamestella.com).
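Filling in the `{{NAME}}` placeholders is the easy part of a format like this. A rough sketch, assuming placeholder names are plain word characters and that the values come from somewhere private on the provider's side (`fill_template` is a made-up name):

```ruby
# Replaces every {{NAME}} placeholder with its value.
# Raises KeyError when a placeholder has no value, so a typo in the
# definition file fails loudly instead of sending "{{PWORD}}" literally.
# Values are inserted as-is; a real client would URL-encode them first.
def fill_template(template, values)
  template.gsub(/\{\{(\w+)\}\}/) { values.fetch($1) }
end

fill_template("user={{CUSTID}}&pass={{PWORD}}", "CUSTID" => "42", "PWORD" => "s3cret")
```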

~~~
eliot_sykes
Thanks for the pseudo code and the consideration of implementing it if there's
consensus, much appreciated.

I'm working on implementing monitors.txt for my own sites and will open it up
to a few devs to get the API sculpted just right. After that I hope there'll be
other monitoring services as keen as blamestella to have a go at it.

------
eliot_sykes
So far people seem to like the idea, but not so keen on cucumber for the
language. Please contribute any pseudo code ideas below as I want to get this
right

~~~
politician
Honestly, I like the cucumber format, but what you ought to do is publish an
API of phrases so that monitoring providers can target a standard set of
monitoring operations. Right now, I don't see how they can execute a generic
spec as described.

~~~
eliot_sykes
Cucumber devs might not like this, but at the moment I'm taking cues for the
API phrases from cucumber-rails-training-wheels (web_steps.rb for capybara) -
I see why the devs wanted it removed, but it works great for something like
this that needs wide support.

------
nicpottier
Why make it human readable?

Especially if you want lots of providers to support it, pick something super
easy to parse.. JSON is the obvious choice there:

    
    
       [{ "method": "GET",
          "url": "index.html",
          "tests": { "contains": "nyaruka",
                     "response_in_ms": 500,
                     "status": 200 } },
        { "method": "POST",
          "url": "login.html",
          "params": { "foo": "bar" },
          "tests": { "status": 200 } }]
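To illustrate how little code a provider would need for a format like this, here is a sketch that evaluates one entry's tests against a response. It assumes the tests are written as a strict JSON object of name/value pairs, and it uses a stand-in `Response` struct instead of a real HTTP call; `failed_tests` is a hypothetical name.

```ruby
require "json"

# Stand-in for an HTTP response; a real monitor would fill this from a request.
Response = Struct.new(:status, :body, :elapsed_ms)

# Returns the names of the tests that did not pass for one monitor entry.
def failed_tests(tests, response)
  tests.reject do |name, expected|
    case name
    when "status"         then response.status == expected
    when "contains"       then response.body.include?(expected)
    when "response_in_ms" then response.elapsed_ms <= expected
    else false # unknown test names are reported as failures
    end
  end.keys
end

spec = JSON.parse('[{"method": "GET", "url": "index.html", "tests": {"contains": "nyaruka", "response_in_ms": 500, "status": 200}}]')
failed_tests(spec.first["tests"], Response.new(200, "welcome to nyaruka", 620))
```

With the sample response above, only the timing test fails, because 620 ms exceeds the 500 ms limit.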

~~~
jebblue
JSON is harder to read than XML, to my human eyes.

~~~
lparry
Wow, first time I've ever heard someone say that. Most people I know think XML
is horrid to read, myself included. I'd take json or yaml over XML any day

~~~
jebblue
I'm probably in the minority on this one yes.

------
huhtenberg
Ah, _that_ sort of monitoring... I was hoping it'd be something like a sitemap
with pages' timestamps, so that monitoring sites like
<http://www.followthatpage.com> wouldn't need to poll actual pages and
needlessly pollute my access.log (completely disrespecting robots.txt along
the way).

------
nickand
People can't even agree on one programming language even though they all
manipulate the same bits. Now we have a monitor.txt. So you are telling me
that the site simply existing and being up is not enough to satisfy its state
of being up? Keep making extra crap people and you dig your own hole.

------
charliesome
Cool, but you won't get anywhere without a formal grammar or, at the very
least, a reference implementation...

~~~
eliot_sykes
Agreed, working on it...

------
jpalomaki
I think there is some point in this. We deploy the same web application to
multiple servers. In some cases different versions of the app should be
monitored in different ways. With monitors.txt I could include the information
on the web app and the monitoring tool would automatically pick it up.

------
gnufied
I like this! I wish there was a way to authenticate though, so that you can
dig in deeper and check if services are really working. But of course, if
the file is public, that throws a spanner in the plans.

~~~
eliot_sykes
Talking about monitoring deeper in the app...Monitoring cron jobs is one thing
that I've never done in a way I'm comfortable with, or enough. That
realization is what led to monitors.txt. I'm working on something like this
along with a rails plugin for checking jobs run on time:

    
    
      Feature: Jobs should run on time
    
        Scenario: Search Subscription Job ran on time
          Given the job started and ended at:
            | start                   | end                     |
            | 2011-10-30 05:30:01 GMT | 2011-10-30 05:37:42 GMT |
            | 2011-10-29 05:30:01 GMT | 2011-10-29 05:37:42 GMT |
          Then I should see the job took less than 10 minutes
          And I should see the job started every day at "05:30 GMT"
    

The rails plugin generates the text above and start/end run times with
something like this:

    
    
      class SearchSubscriptionJob
    
        def perform
          # Fire off the search subscription emails
          # ...
        end
    
        add_job_monitor :job_method => :perform,
          :should_run => [:daily, "05:30 GMT"],
          :max_duration => 10.minutes,
          :alerts => {:email => 'admin@yoursite.tld',
                      :sms => '+441234567890',
                      :twitter => 'twitteruser'}
      end
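On the monitoring side, the two `Then` steps in the feature above boil down to simple checks over the start/end table. A minimal sketch, assuming each run arrives as a pair of timestamp strings in the format shown; the method names are invented, and the second check only compares the time of day, not whether any day was skipped.

```ruby
require "time"

# runs is a list of [start, end] timestamp strings, one pair per job run.
# True when every run finished in under max_seconds.
def runs_within?(runs, max_seconds)
  runs.all? { |start_at, end_at| Time.parse(end_at) - Time.parse(start_at) < max_seconds }
end

# True when every run started at the given "HH:MM" time (UTC).
def started_at?(runs, hhmm)
  runs.all? { |start_at, _| Time.parse(start_at).utc.strftime("%H:%M") == hhmm }
end

runs = [["2011-10-30 05:30:01 GMT", "2011-10-30 05:37:42 GMT"],
        ["2011-10-29 05:30:01 GMT", "2011-10-29 05:37:42 GMT"]]
runs_within?(runs, 10 * 60)
started_at?(runs, "05:30")
```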

------
eliot_sykes
Like monitors.txt? Henchmon Beta wants you
<http://news.ycombinator.com/item?id=3202535>

------
eliot_sykes
So how many of you like the cucumber/gherkin format over anything else
mentioned? Would be helpful to know to gauge interest.

~~~
mfjordvald
If I ran a monitoring service I would honestly never bother implementing it.
Parsing natural language can be really, really difficult. I realize this is
just pseudo natural language as it has a very restricted subset of terms,
however, monitoring is a complex field and to fully support most features
it'll eventually become really complex and really error prone.

------
inconditus
Interesting idea, I'm not so sure I agree about "For the more gung-ho,
obfuscate the URL and use SSL (e.g.
<https://yoursite.tld/something-unguessable/monitors.txt>) and tell your
provider where to find your monitors.txt".

<http://en.wikipedia.org/wiki/Security_through_obscurity>

~~~
tptacek
If "something unguessable" is a 64 bit number from /dev/random, and it's
disclosed only to monitoring providers, and indexes are turned off on the
server, it's not "security through obscurity"; it's a key.

Virtually every web app in the world relies on a similar security system. You
just don't notice, because we call the "key" in those systems a "cookie", not
a "dynamic URL path component".

There are reasons why the URL key is inferior to other keys used by web
applications, but they are fiddly. If monitors are unlikely to have extremely
sensitive information in them (and you'd hope they wouldn't given their
intent), it's fine to use URL keys.
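As a rough sketch of the "URL key" idea: Ruby's standard `SecureRandom` draws from a cryptographically secure source, so eight random bytes give the 64-bit token described above. The method name `secret_monitors_url` and the path layout are assumptions for illustration.

```ruby
require "securerandom"

# Builds a monitors.txt URL whose path contains a random 64-bit key.
# The key is what gets disclosed to the monitoring provider out of band.
def secret_monitors_url(host)
  token = SecureRandom.hex(8) # 8 random bytes = 64 bits, as 16 hex characters
  "https://#{host}/#{token}/monitors.txt"
end

secret_monitors_url("yoursite.tld")
```

The token would be generated once and stored; calling this on every deploy would change the URL and break whatever the provider was told.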

