
Readability - KevinBongart
http://lab.arc90.com/2009/03/readability.php
======
tptacek
It's slick, but does anyone else see the irony in the fact that they're not
only running banner ads on the Readability site itself, but also (subtly) in
the sites they're stripping the ads from?

It's worth pointing out that, like it or not, we're not really _entitled_ to
escape banner advertising.

~~~
ewiethoff
> we're not really _entitled_ to escape banner advertising.

"Oh, really?" I type into my heavily modded Firefox browser because some
computer graphics give me brain seizures. (Too bad I cannot log into HN from
my Lynx browser.)

Last time I checked, the industry term for a browser is "user agent," and page
rendering is under the control of the user agent. I can use whatever user
agent I want, and so can you. I can fiddle with my user agent as I see fit,
and so can you. That's what your browser's Options dialog is for.

Sorry I'm getting really testy here, but I get tired of pointing out to the
world that it's the _user_ who is in control, not marketers, not page
designers, not Yahoo! CSS Reset...

~~~
scott_s
In the technical sense, sure, you're allowed (entitled) to do whatever you
want.

But that's not what tptacek is referring to. He's referring to the "I want
something for nothing" sense of entitlement. You have no moral right
(entitlement) to read their content without viewing the ads they put up. That's
why ad-blocking has never sat right with me: it implicitly accepts that
the model of ad-supported, free content cannot work on the internet.

~~~
erso
Nonsense.

If you make the content of your site freely available with the knowledge that
users can access your site in a way that circumvents your business model,
you've signed up for a business model that isn't dependable.

This is no different than a street musician wanting money from people that
stop to listen to the music they play. Maybe the listener, having heard the
music, determines it's not of any monetary value. Are they obligated to pay
anyway? No, and the street musician knows this. They understand that only a
certain percentage of people that stop to listen will actually pay anything.
It's true, too, that maybe someone will listen to the music and be unable to
pay. Should they block their ears as they walk past because they cannot give
what the musician would like to receive for his service? In the case of a
screen reader, maybe the user doesn't even know the ads exist.

If the street musician decides his business model isn't supporting him as much
as he'd like, maybe he'll stop playing music. And will the public that decided
to not pay him anything really care that he doesn't play it anymore? No. They
decided it wasn't valuable enough to pay for.

If you feel your content is worth something or you need money to continue
providing it, you should charge people for it outright, or make an explicit
request for donations. If people find value in your content and are able to,
they will pay for it.

When you make content freely available, whether by publishing a blog or
playing music in the street, you should not expect everyone who reads or
hears it to find it of equal value or to be willing and able to pay for it.
You, as that site owner or musician, do yourself a disservice if you expect
otherwise.

~~~
ewiethoff
Agree with erso. Much of the internet consists of buskers, and buskers cannot
rely on revenue. They sing on the street for the love of it, or for self-
promotion, or because they don't have headshots and an agent to book a concert
hall, or because they're crazy--not because this will bring them a fortune,
because it won't. Pay-per-view, however, is booking a concert hall and
charging admission. And selling subscriptions is selling season passes to the
shows. Very different. Busking might drum up a fan base, but bona fide
concerts are where the real revenue is. Think for a moment: pay-per-view and
subscription pr0n are where the internet revenue is.

------
spydez
Interesting note: for those of you uninterested in karma or user names, the
bookmarklet will hide both. You'll still have your voting arrows and the
comment text; you just won't know what everyone else thought of the comment
(aside from their placement relative to each other).

It's pretty neat to go on a big thread and hit the bookmarklet. e.g.
<http://news.ycombinator.com/item?id=501696>

------
kylec
It could use more tweaking - I tried to read the latest pg essay with it, but
all it would display was the image with the light "PAUL GRAHAM" text at the
top of the page. I then tried to go back to the essay, but it looks like the
only way to un-Readability the page is to reload it. I then tried to highlight
the text that I wanted to read, but Readability doesn't take this into account
when generating the page.

When you have a service that guesses what the input is supposed to be, there
should be a way to gracefully fail and allow the user to manually specify it,
and if that doesn't work it should be easy to disable the service.
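
A rough sketch of that fallback order, for illustration only: prefer the text the user highlighted, fall back to the guess, and otherwise leave the page alone, while keeping the original markup so the change can be undone without a reload. The names `chooseContent` and `makeToggle`, the options object, and the `minLength` threshold are all assumptions, not anything Readability provides.

```javascript
// Hypothetical sketch; names, options shape, and threshold are assumptions.
// Decide what to display, preferring an explicit user selection over a guess.
function chooseContent({ selectionText, guessContent, minLength = 200 }) {
  const selected = (selectionText || '').trim();
  if (selected.length > 0) {
    return { source: 'selection', text: selected };
  }
  const guess = guessContent();
  if (guess && guess.trim().length >= minLength) {
    return { source: 'guess', text: guess.trim() };
  }
  // Nothing convincing was found: tell the caller to leave the page untouched.
  return { source: 'none', text: null };
}

// Remember the original markup so the rewrite can be undone without a reload.
function makeToggle(getHtml, setHtml) {
  let saved = null;
  return {
    apply(newHtml) {
      if (saved === null) saved = getHtml();
      setHtml(newHtml);
    },
    restore() {
      if (saved !== null) {
        setHtml(saved);
        saved = null;
      }
    },
  };
}
```

In a browser, `selectionText` could come from `window.getSelection().toString()`, and `getHtml`/`setHtml` could read and write `document.body.innerHTML`.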

~~~
ewiethoff
If you're a Firefox user and savvy about CSS selectors, I recommend the
Stylish add-on (<http://userstyles.org/>). It lets you create blanket tweaks
for all sites and specific tweaks for specific sites. And you can turn your
tweaks off and on at will. It works even with Javascript turned off. I set
colors and fonts and font sizes, adjust column widths, eliminate entire
columns, make images semi-transparent so they don't bug me so much, etc.

~~~
mnemonik
Also, if you're an Opera user or have Greasemonkey, the styles can be loaded as
userscripts, which just use JavaScript to apply the CSS changes you have
selected.

There are a few styles for Hacker News as well:
<http://userstyles.org/styles/search/hacker%20news>
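
For anyone curious what "using JavaScript to apply CSS" amounts to, here is a minimal sketch of the usual pattern: create a `<style>` element and append it to the page. The helper name `injectCss` and the sample rule are made up for illustration; the scripts generated by userstyles.org may be structured differently.

```javascript
// Hypothetical userscript helper; `doc` is passed in so the function can be
// exercised outside a browser with a stand-in document object.
function injectCss(doc, cssText) {
  const style = doc.createElement('style');
  style.type = 'text/css';
  style.appendChild(doc.createTextNode(cssText));
  // Fall back to the root element if the page has no <head>.
  const parent = doc.head || doc.documentElement;
  parent.appendChild(style);
  return style;
}

// In a userscript this would be called with the real document, for example:
// injectCss(document, 'body { font-family: Georgia, serif; max-width: 40em; }');
```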

------
sjs382
I like this a lot. It's sort of a bookmarklet for people who like to read
"Printer Friendly" pages. Unless there's pagination, of course.

~~~
pixelmonkey
You should check out the Firefox add-on "Print Hint":

<https://addons.mozilla.org/en-US/firefox/addon/700>

It will search a page for a "Printer-Friendly" link and display a glowing
green printer icon in the bottom of your Firefox if it finds one. I find that
on some pages, this can be a way to quickly navigate to a "Reader-Friendly"
page that does support pagination. Of course, sometimes Print Hint doesn't
detect a printer link, though I've never had a false positive.

------
palish
I like it. However, I found a site that breaks it:
<http://www.instigatorblog.com/the-art-and-science-of-the-small-exit/2009/02/04/>
... When I click my 'Readability' bookmarklet in Chrome, the
main article is removed completely, and all that's left is the blog entry's
comments. Here is the exact Readability bookmarklet I'm using:

    
    
      javascript:(function(){
      readStyle='style-newspaper';
      readSize='size-medium';
      readMargin='margin-medium';
      _readability_script=document.createElement('SCRIPT');
      _readability_script.type='text/javascript';
      _readability_script.src='http://lab.arc90.com/experiments/readability/js/readability-0.1.js?x='+(Math.random());
      document.getElementsByTagName('head')[0].appendChild(_readability_script);
      _readability_css=document.createElement('LINK');
      _readability_css.rel='stylesheet';
      _readability_css.href='http://lab.arc90.com/experiments/readability/css/readability.css';
      _readability_css.type='text/css';
      document.getElementsByTagName('head')[0].appendChild(_readability_css);
      _readability_print_css=document.createElement('LINK');
      _readability_print_css.rel='stylesheet';
      _readability_print_css.href='http://lab.arc90.com/experiments/readability/css/readability-print.css';
      _readability_print_css.media='print';
      _readability_print_css.type='text/css';
      document.getElementsByTagName('head')[0].appendChild(_readability_print_css);
      })();

------
swombat
Brilliant. I love the fact that it doesn't really change the look of my blog
very much (<http://www.danieltenner.com> - once you're inside an article,
applying Readability with medium margins, newspaper look, and large font only
gets rid of the red background and the image, that's all).

------
dood
Interesting, any idea how this works?

Edit: doh, source here:
<http://lab.arc90.com/experiments/readability/js/readability-0.1.js>

~~~
aston
You edited to include the link to the JS, but here's my reading of it:

1) It pulls out content in <p> tags (presumably those are the ones with data
you want).

2) It rewrites double line breaks as paragraph breaks (in case the site isn't
as semantic as it should be).

3) In order to pick the "main content" container, it looks for the container
element with the most <p>s inside.

4) It filters the "main content" to remove stuff that looks like trash.
Filters include having too much non-<p> content, having too few commas, and
having too few words.

5) It rips out all of the HTML on the page and puts its own in, which also
pulls together the user's selected style info. This is the sketchy step. I
think an overlay might've been more appropriate here, but the comments imply
the author had some difficulty there.
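
To make steps 2 through 4 concrete, here is a minimal sketch over plain objects rather than the real DOM. It is not the code from readability-0.1.js: the function names, the container shape, and every threshold below are assumptions picked for illustration.

```javascript
// Hypothetical sketch of the heuristics described above; not the actual
// readability-0.1.js source. Containers are modeled as plain objects:
//   { name: string, paragraphs: string[], otherElements: number }

// Step 2: treat two or more consecutive <br> tags as a paragraph break.
function brsToParagraphs(html) {
  return html.replace(/(<br\s*\/?>\s*){2,}/gi, '</p><p>');
}

// Step 3: pick the container holding the most paragraphs.
function pickMainContainer(containers) {
  let best = null;
  for (const c of containers) {
    if (best === null || c.paragraphs.length > best.paragraphs.length) {
      best = c;
    }
  }
  return best;
}

// Step 4: flag a block as trash if it has too few commas, too few words,
// or too much non-paragraph content. The default limits are assumed values.
function looksLikeTrash(
  block,
  limits = { minCommas: 1, minWords: 10, maxOtherPerParagraph: 3 }
) {
  const text = block.paragraphs.join(' ');
  const commas = (text.match(/,/g) || []).length;
  const words = text.split(/\s+/).filter(Boolean).length;
  const paragraphs = Math.max(block.paragraphs.length, 1);
  return (
    commas < limits.minCommas ||
    words < limits.minWords ||
    block.otherElements / paragraphs > limits.maxOtherPerParagraph
  );
}
```

A navigation bar such as `{ paragraphs: ['Home About Contact'], otherElements: 8 }` is flagged by these limits because it has no commas, while a paragraph of ordinary prose is kept.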

~~~
dood
Thanks, had a brainfart and forgot I could just read the code ;)

------
sarvesh
Useful but seems buggy. It disabled keyboard input completely in Opera.

