
Readable: A more readable version of "readability" - jbm
http://readable.tastefulwords.com/
======
j_baker
"Legally, Readable's source is under the Creative Commons Attribution-
NonCommercial-ShareAlike license."

Erm... Why? Correct me if I'm wrong, but Creative Commons wasn't made for
code. There are plenty (even too many) open source licenses to choose from.
Why not one of those?

EDIT: Yup, using CC for source code is not recommended:
[http://wiki.creativecommons.org/FAQ#Can_I_use_a_Creative_Com...](http://wiki.creativecommons.org/FAQ#Can_I_use_a_Creative_Commons_license_for_software.3F)

~~~
tastefulwords
Honestly, I browsed around for less than ten minutes before deciding on
Creative Commons for the code -- and I did so only because I previously saw
source-code under various Creative Commons licenses. I did not know that it
wasn't recommended; thanks for letting me know. I'll be sure to study the
issue more thoroughly and re-license the code.

~~~
telemachos
Open Source Licenses[1] is a good place to "shop" for licenses, though you
might find the "by name" view a bit overwhelming. Maybe start with the
"Licenses that are popular and widely used or with strong communities" section
on the "by category"[2] page.

[1] <http://www.opensource.org/licenses/index.html>

[2] <http://www.opensource.org/licenses/category>

~~~
tastefulwords
Thank you.

------
mgkimsal
Maybe a bit off topic, but readability, readable, instapaper, etc - weren't
user-defined style sheets supposed to address at least some of the issues
these services are addressing? I ask because it seems that sometimes we get
decent tech ideas (the _idea_ behind CSS is good at least) but then we settle
for half-baked implementations and don't pursue good user tooling. CSS
specifically - no browsers have good ways of defining/saving/using custom
stylesheets, yet this (IIRC) was a selling point of CSS back in the day. Had
we had better browser support for this, we may have had a much different web
than we do today.

As it is, people think it's really cool when there's those stupid 3 font size
icons at the top of a page and they can 'resize' the page's fonts between 9,
11 and 13 point fonts (w00t!)

~~~
tastefulwords
Couldn't agree more. If user defined style-sheets worked (and if content and
presentation were truly separated) the web would be a much better place.

When I was developing Readable, btw, I pretty much thought of it as a better
implementation of user-defined styles. The text-parsing and main body text
extraction is just my way of getting around the problem of
content/presentation.

~~~
mgkimsal
BTW, this wasn't a bash at you at all (if you took it that way). More a lament
that we ended up in this chicken-and-egg situation. No point in making user-
css tools because there's so much inline style and mixing, and no point in
moving towards building sites with user-defined CSS in mind because there's no
good support in browsers.

~~~
tastefulwords
Don't worry, I didn't take it in a bad way. And I do, actually, completely
agree with you; I thought that my comment made that clear -- but I do
appreciate you making sure I didn't take offense; that kind of courtesy is
rare.

Back to our issue, though: if, tomorrow, browsers all got good enough at user-
css that Readable wouldn't be needed anymore, I would gladly convert it into a
one-page tutorial explaining to people how to set up their user-css :)

~~~
mgkimsal
Yeah, I was probably writing back more to assure anyone else reading it that I
wasn't bashing you :)

Thanks for some great work, by the way!

~~~
tastefulwords
You are most welcome.

It was -- and still is -- my pleasure.

~~~
ajays
Awww..... group hug! :-)

------
jbm
(For what it's worth, I was a fan of the old Readability plugin, but I am
somewhat concerned about its direction now.)

------
bluelu
Our company (EDIT: not related in any form to Readable or Readability) does
non-RSS-based crawling (blogs, news, message boards), so we have spent quite
some time on automatically identifying the article, author, date, and comments
on a page.

If someone is interested in creating something similar to Readability (e.g.
with us doing the article extraction for you) or needs website article
extraction, you can contact me at t.britz@trendiction.com.

PS. We have so many other ideas of our own, and that's the reason we are not
doing it ourselves.

~~~
tastefulwords
Would've been nice if you mentioned that your company has nothing to do with
Readable, as there is a chance someone reading might get that impression --
especially for the short time that your reply is the first reply.

~~~
bluelu
Sorry, didn't think about that. Edited it. Thanks.

~~~
tastefulwords
Thank you, too.

------
ggchappell
What about _printable_?

One of my major uses for Readability was to format articles so that I could
print them out for later reading. But a recent change to Readability made it
so that (at least under Firefox) the browser print function will not break an
article into multiple pages. And that makes it pretty useless for printing.

Whatever the problem is, Readable has it, too. Can something be done about
this? (Also, does anyone know what the problem is?)

In any case, improving the readability of the web is a worthy goal. Thanks for
your efforts.

~~~
tastefulwords
I'm not aware of the problem you're describing. And I can't seem to reproduce
it.

Could you provide more details, please? -- URL of an example article, more
details on what exactly happens that shouldn't happen.

If you'd like, you can get in touch to do this
(<http://readable.tastefulwords.com/about-and-contact/>) -- as a matter of
fact, it would probably be preferable, as opposed to using HN as a bug
reporting forum.

~~~
ggchappell
I'll do both (report here & there).

I just tried the top five HN links. All exhibited the problem.

[http://blogs.msdn.com/b/powershell/archive/2011/04/16/powers...](http://blogs.msdn.com/b/powershell/archive/2011/04/16/powershell-language-now-licensed-under-the-community-promise.aspx)

<http://www.logolounge.com/article.asp?aid=lnPf>

<http://code.google.com/p/leveldb/>

[http://www.technologyreview.com/computing/37525/?p1=A2&a...](http://www.technologyreview.com/computing/37525/?p1=A2&a=f)

<http://openfarmtech.org/weblog/2011/05/solar-fire/>

For each, I clicked on the link, and then clicked my newly made Readable
bookmark, which I had set up using the default settings. Then I went to
Firefox's File:Print Preview. I have it set on the default settings (Shrink to
Fit, Portrait). For each of the five articles listed above, the result was
that only one of the displayed pages had article text on it, and this was not
the entire article, since it would not fit on a single page. Sometimes other
pages were displayed, sometimes not, but if there were any, then they were
blank pages.

I am running Firefox 3.6.16 under Ubuntu 10.04 (Lucid). I'm perhaps a bit
behind in my patching, but, in any case, this is not a new problem.

~~~
tastefulwords
The problem, in the steps you said you followed, is the "Print Preview" step.
File >> Print Preview will show you a preview of what the entire current page
will look like when printed -- but that's actually not what you want to
print.

What you want to print is only the contents of Readable's overlay. To that
end, please use the Print link, shown in the menu at the bottom of the
overlay -- unfortunately, there is no "Print Preview" available.

The technical explanation for this is that Readable's overlay is actually an
iframe: browsers can print an iframe's contents as if the iframe were a window
unto itself, but when you print the main page instead, they print the iframe
as just another element of that page.

~~~
ggchappell
Wow.

Well, you're right. "Print" works fine.

Three suggestions for you:

(1) Make the "Print" link easier to get at. Not just 'way down at the bottom
of the page. Maybe put one with the "Close" link at the upper-right?

(2) Change the text of the "Close" links. I'm not sure what they should say;
but "Close" doesn't really convey the right idea. What it actually does, from
the user's POV, is not _closing_ , but rather returning to the original
styling.

(3) Maybe figure out how to make Readable work with Firefox's Print Preview.
(I realize that FF's behavior isn't your fault, but it would still be nifty if
it worked.)

Thanks again for all your work.

~~~
tastefulwords
Readable's interface is due for a slight overhaul. But I haven't yet decided
what the new design is going to be. I will take your suggestions into
consideration, though.

As for the work: you are most welcome.

------
tastefulwords
I'm Readable's author.

If anyone has any suggestions for improvements, or even feature requests, I'm
all ears.

I'm not promising I'll implement them; but I'll definitely listen and consider
them carefully.

P.S. jbm, thanks a lot for posting this -- I've tried posting Readable to HN
myself (twice), but it didn't stick.

~~~
MediaBehavior
I'm another one who uses such services for PRINTability. And one of my big
sadnesses is that the author line, date line, and URL line are often missing:
a real nuisance on a piece of paper that gets discovered months (years?) later.

Note: Readability only _of late_ thought to start displaying the URL at the
end of articles.

EDITs:

1) Meant to say I'm pleased to see you working on a more flexible/powerful
version than Readability (don't much like their latest approach of redirecting
during reformatting).

2) You might want to try this sort of test-case - for which Readable only
presents the first paragraph:
[http://boston.com/bostonglobe/ideas/articles/2011/05/08/seni...](http://boston.com/bostonglobe/ideas/articles/2011/05/08/seniorland_circa_2050/?page=full)

~~~
tastefulwords
Boston.com is now fixed; thanks for the report.

As for your issues on printing: thank you.

I had honestly not given printing very much thought -- as I don't really use
it myself.

Your ideas make a lot of sense, though; so count on seeing them implemented.

~~~
MediaBehavior
Great, Gabriel...

I like the way your mind works - and so look forward to following your work.

------
beaumartinez
A great improvement over Readability! Readability's bookmarklet has been
replaced, at least on _my_ bookmark bar: Readable is faster, has increased
customization options and justification, and is open-source... Add hyphenation
to that mix and you've got a solid ten out of ten service.

(Justification and hyphenation were the two features I always wanted
Readability to have, and it always seemed odd to me that something branded
"readability" lacked them...)

In the FAQ you answer to whether Readable is open-source: _"no" in the sense
that the source isn't on display somewhere_. Not to split hairs here, but that
sentence is incredibly confusing; what does having your code "on display" have
to do with being open-source?

~~~
Sephr
Why would you want justification in something that's supposed to make
something more _readable_? Justification might make text look nice, but it's
terrible for readability: people with dyslexia often struggle to read
justified text, and the same usually goes for hyphenation.

~~~
beaumartinez
Why, then, are books typeset justified and hyphenated?

> _Justification might make text look nice._

And that's bad? You have to _look_ at text to read it; I'd rather it look
"nice" and as non-distracting as possible. Perhaps "readability" is
subjective, but I think text that flows _evenly_ is easy on the eyes.

~~~
Sephr
On the contrary, variable spacing from justification is very distracting.

------
Qz
I tried using Readability the other day to zap the SpaceX article from here
and it damn near froze my computer. When it was done, it didn't even show me
the right article...

------
abrowne
One thing that makes me prefer Readability (or in my case the Readability
Redux Chrome extension[1]) is that it's able to stitch together multiple-page
articles into one page. It makes reading articles on sites like The Register
and Ars Technica much more bearable. I'd love it if Readable gained this
feature. (Safari Reader mode also does this, using code from the old
Readability.)

[1]:
[https://chrome.google.com/webstore/detail/jggheggpdocamneaac...](https://chrome.google.com/webstore/detail/jggheggpdocamneaacmfoipeehedigia)

~~~
tastefulwords
Quoted from an email, where I answered the same exact question:

    
    
      Most likely, Readable will never have multi-page support.
      The reason for this is Readable's philosophy, and not any technical difficulties that feature may imply.
    
      It is Readable's intention to take whatever is in your browser window right now, and make it better.
      But it is not Readable's purview to go beyond that.
    
      Readable tries to act like a browser, in this respect.
      Think about it like this: Readable getting subsequent pages in the background would be pretty much equal to web-browsers doing the same thing for all paging, on all websites.
      Just because it is technically possible, does not mean it should be done.

~~~
abrowne
Fair enough. I like that you have a strong sense of Readable's goal(s) and
know what features would be overextension. As for sites that publish articles
spread across multiple pages, I should really vote with my feet (eyes?).
Looking forward to hanging quotes!

------
pasbesoin
It's been quite a small bit of time, now, but IIRC I had some direct
correspondence with the author/creator of Readable, and he was a quite
agreeable chap. I realize this doesn't speak to the technical merits, but the
exchange left me feeling more comfortable about using his product. (Leaving
aside the role of advertising and the need to support sites, a separate but
not insignificant concern.)

Here's the old URL/site, which is also more visible with Javascript disabled.

<http://readable-app.appspot.com/>

~~~
tastefulwords
Glad I made an impression; and thanks for the compliment. I'm actually about
the same in real life, too.

Don't worry: Readable won't ever have advertising -- the most I'm looking for
is something like a very small "Sponsored By XXX" banner, at the bottom -- if
any cool company will ever be interested that is.

The new site will be redesigned -- and it will be a lot better when JavaScript
is turned off. But I really don't recommend you use the old site any more, as
that still runs the old version of the application -- which is way worse, in
terms of performance.

I hope you're still using Readable, and that it's still helping you read the
web more comfortably.

------
tuxcanfly
I hope this doesn't track what I'm reading, unlike the new Readability.

~~~
tastefulwords
Yup; all text processing is done in the browser.

URLs are tracked, though -- they would be tracked in my server logs, even if I
didn't do anything specific to track them.

But they are intentionally tracked -- and a percentage of them are run through
an automated Readable test, every day, so that I can see, on average, how
often Readable gets things wrong.
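
How the daily percentage is chosen isn't stated anywhere in this thread. As a rough sketch of one way to do it, a hash of the URL and the date can decide whether a URL falls into that day's test sample. The function name `in_daily_sample`, the SHA-256 choice, and the 10% figure are all assumptions made for illustration, not Readable's actual code.

```python
import hashlib


def in_daily_sample(url: str, day: str, percent: float) -> bool:
    """Return True if `url` belongs to the test sample for `day`.

    Hypothetical helper: hashing "day:url" gives a stable pseudo-random
    number, so the same URL is always in or out of a given day's sample,
    while the selection changes from one day to the next.
    """
    digest = hashlib.sha256(f"{day}:{url}".encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction < percent / 100.0


# Usage: pick roughly 10% of the logged URLs for today's automated run.
logged_urls = [f"http://example.com/article/{i}" for i in range(10000)]
todays_sample = [u for u in logged_urls if in_daily_sample(u, "2011-05-10", 10)]
```

Hashing rather than calling a random number generator keeps the choice reproducible, which makes it easier to re-run a failed test on exactly the same set of URLs.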

~~~
tastefulwords
I will also be adding an option that will disable even this anonymous
tracking, for those who really want/need it.

However, you'll just have to take my word for it that I won't be reading the
server logs -- which contain the same, if not more, data.

------
eli
So the source is licensed CC-Attribution-NonCommercial-ShareAlike, but there's
no public repository or support? That's a curious approach.

~~~
tastefulwords
The source code isn't in any repository because I don't work with
repositories. And, if I were to list a specific version of the code, it would
become slightly obsolete in less than a week -- as I am continually tweaking
Readable's text-parsing algorithm. You can find an always up-to-date version
of the code here: <http://readable-static.tastefulwords.com/_r/bulk.js>

~~~
rudle
>> The source code isn't in any repository because I don't work with
repositories.

You really should be using version control. If you want people to contribute
to this open-source project, you can't beat a public repository (like github)
at which you can receive patches from other developers. Right now, it is not
apparent how I can work to improve your project.

~~~
tastefulwords
You are correct, sir. But I haven't really gotten into the open-source game
just yet -- maybe I should; I don't know.

I am using version control; just not public version control.

If you want to help, get in touch
(<http://readable.tastefulwords.com/about-and-contact/>). I'm happy to hear
any ideas; and, if you want to do some actual work, that would be quite
awesome too.

~~~
jackolas
I don't know why you would use your own version control and not release it...
Makes little sense.

------
FilterJoe
Comparison of the new versions of Readable and Readability, here:

[http://www.filterjoe.com/2011/04/11/web-page-reformatting-se...](http://www.filterjoe.com/2011/04/11/web-page-reformatting-services-readable-and-readability/)

Highlights:

The new Readability takes 6-12 seconds to reformat a page, vs. less than 2
seconds for both the new Readable and the old Readability.

New Readability has more features, including Instapaper-like sharing for those
who pay.

It is possible to use the old Readability, for those who prefer it.

~~~
riobard
Readability 2.0 choosing to load and reformat the pages via their server
instead of doing it in place in browser really slows it down.

~~~
ambiguity
It does make it nice that I can share links to the "readable" version with
other people.

~~~
tastefulwords
This is a planned feature for Readable, too.

Count on seeing it in a future release.

------
shiven
I still don't get how/why this is different from/better than the Firefox
Readability extension[1]?

The commercialized Readability was a step back from the bookmarklet, but this
Firefox extension seems to work as well if not better. Could someone more
experienced please explain this to an ignoramus ...

[1] <https://addons.mozilla.org/en-US/firefox/addon/readability/>

~~~
tastefulwords
Readable doesn't really want to be "better" than Readability. In all, they're
actually very different beasts.

And I honestly don't consider Readability to be my competition. Readable first
started because of my own desire to have text formatted a certain way, no
matter what website that text happened to be on.

Unfortunately, Readability beat me to the launch by 2 weeks -- otherwise you
would all now be talking about Readability as a version of Readable :)

The only reason I didn't kill Readable after that was that it was different
enough from Readability to deserve its own shot -- plus, I love working on
Readable's text-parsing algorithm; it's a very cool problem to solve.

P.S. The extension you pointed to is based on the first Readability
bookmarklet -- and it's made by a guy who also made an extension based on the
first version of Readable.

~~~
shiven
Thank you for the clarification, I was unaware of the details you just
mentioned.

Talking of algorithms, does Readable use something like the Knuth and Plass
line break algo used by LaTeX? A Javascript implementation was mentioned a
while ago on HN[1].

Good luck with Readable, anything that helps reduce clutter (in any part of
life) is a great gift. Thanks for sharing!

[1] <http://news.ycombinator.com/item?id=1974963>

~~~
tastefulwords
You're very welcome.

No; Readable doesn't use anything like the Knuth and Plass algorithm.

But I did thoroughly check out the JS implementation you mentioned; and
Readable will probably use a part of the Knuth algorithm in the future -- as I
am planning to support hanging quotation marks, hyphenation, as well as better
(typographic) justification.
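
For readers who haven't met the idea, here is a minimal sketch of "optimal" line breaking by dynamic programming. It is a simplified minimum-raggedness variant and not the full Knuth and Plass algorithm (no stretchable glue, penalties, or hyphenation), and it is neither Readable's code nor the JS implementation mentioned above. The function name `break_lines` and the squared-slack cost are assumptions chosen for illustration.

```python
def break_lines(words, width):
    """Split `words` into lines of at most `width` characters.

    Minimises the sum of squared trailing space over every line except
    the last. A word longer than `width` gets a line to itself.
    """
    n = len(words)
    # best[i] is the lowest cost of laying out words[i:];
    # nxt[i] is the index of the first word on the following line.
    best = [float("inf")] * n + [0.0]
    nxt = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        line_len = -1  # cancels the space counted before the first word
        for j in range(i, n):
            line_len += 1 + len(words[j])
            if line_len > width and j > i:
                break
            # The last line may be short at no cost.
            cost = 0 if j == n - 1 else (width - line_len) ** 2
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                nxt[i] = j + 1
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


# Usage: a greedy breaker would fill the first line with "aaa bb" and
# strand "cc" alone; looking at the whole paragraph avoids that.
print(break_lines("aaa bb cc ddddd".split(), 6))
# prints ['aaa', 'bb cc', 'ddddd']
```

The point of the comparison is that the choice of each break depends on the whole paragraph rather than on the current line alone, which is the core idea the full algorithm builds on.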

~~~
bramstein
Author of the JS implementation here. Let me know if you have any questions
about the implementation or problems integrating it. I was hoping someone
would pick it up and integrate it with a Readability-like service. (I've
slowly been working to add support for it to Treesaver
<http://www.treesaverjs.com/>.)

For hyphenation you might also want to check out my Hypher project
(<https://github.com/bramstein/Hypher>) which is a minimal hyphenation engine
written in JavaScript. In my benchmarks it is about 4 times faster than
Hyphenator.js (and a lot smaller.)

~~~
tastefulwords
Thanks for the info, Bram. I'll be sure to hit you up if/when I have any
questions.

------
orionlogic
This one is better than Readability; it seems faster. It's more instant, like
the old Readability.

I have been complaining about the new direction Readability has taken since
day 1, when they decided first to abandon the old Readability, then to force a
meaningless frame around content with a very slow implementation.

My only critique for all: please choose a name that doesn't contain "read".

~~~
tastefulwords
I agree with you about the names -- and I really wish I could've thought of
something clever when I first named it; unfortunately, it's a bit late now.

------
aba_sababa
A favicon. I just want a favicon. Every single bookmark in my bar has an icon,
and no words, and ALL I want is a favicon! Readability doesn't have it;
neither does Readable. Makes things much uglier, no matter how pretty the text
is. A golden star to whoever hacks it first.

~~~
tastefulwords
I believe there's an extension (for Firefox, at least) that allows you to
create custom buttons for bookmarklets.

In the near future, a native solution will be available -- i.e. I'm working on
a thin extension, for all browsers, that will act as Readable's launcher (with
benefits like keyboard shortcuts, an icon, and slightly faster load times).

------
arkitaip
It's nice that you can customize it because the default - serif - font isn't
that readable, at least not on Win7 + Opera.

Also, I think they need to update the default font families because Calibri
and other post-Vista fonts are as commonly used as Arial & co.

~~~
tastefulwords
If you have any suggestions, I'm very open to them. As a matter of fact, if
you're up to it, you can design your very own theme -- I'll let you name it,
and give you full credit. Or, if you just have a couple of quick suggestions,
feel free to list them here, or get in touch
(<http://readable.tastefulwords.com/about-and-contact/>).

~~~
arkitaip
Design my own theme? Sure, why not. How does this work?

Here's a bug report for you. Using Win7,32bit/Opera 11.10, the text field for
specifying custom font families doesn't really show because the borders aren't
visible. Screenshot <http://i.imgur.com/gwb2O.jpg>

~~~
tastefulwords
Well, just customize Readable (<http://readable.tastefulwords.com/?setup>) to
your heart's content. Then, get in touch, and send me the custom options you
used.

If the theme is awesome, I'll add it to Readable's selection of themes
([http://readable.tastefulwords.com/?setup#explain-style-theme...](http://readable.tastefulwords.com/?setup#explain-style-themes)).
Did you try those out, by the way?

If you know CSS, you can also heavily customize Readable via the "More CSS"
option.

------
terrcin
Doesn't work on this page:
[http://boston.com/bostonglobe/ideas/articles/2011/05/08/seni...](http://boston.com/bostonglobe/ideas/articles/2011/05/08/seniorland_circa_2050/?page=full)

I like it though, keep it up. :-)

~~~
tastefulwords
Boston.com is now fixed; thanks for the report.

------
loup-vaillant
I use NoScript, and they mentioned that possibility. I'm glad they showed
respect for my choice, even though there's no escaping Javascript here.

------
xijhing
I'd use it, but my Apture extension doesn't work while using your script.

------
geoffbp
Readability works just fine

