Show HN: Paste a link to an article, get a minimal version to read

bpierre · on Sept 15, 2014

Example with “The Rise And Fall Of The Dreamcast” (multiple pages): http://justread.mpgarate.com/read?url=http%3A%2F%2Fwww.gamas...

Original: http://www.gamasutra.com/view/feature/132517/the_rise_and_fa...

Awesome tool!

cliveowen · on Sept 15, 2014

http://evernote.com/clearly/ I've been using this for years; now I can't read an article without it.

topherjaynes · on Sept 15, 2014

To show off the functionality to first-time visitors, you should pre-populate the field with a popular article URL. I had to open a new tab and find an article to test. I almost didn't come back. Almost, but glad I did!

grimtrigger · on Sept 15, 2014

I see website after website make this mistake. How is this not painfully obvious?

huskyr · on Sept 15, 2014

What's the difference between this and something like Instapaper or the iReader plugin ([1])?

[1]: https://chrome.google.com/webstore/detail/ireader/ppelffpjgk...

mpgarate · on Sept 15, 2014

Useful for articles that have sluggish javascript behavior, span multiple pages, or are otherwise hard to read.

Example use with a TechCrunch article: http://justread.mpgarate.com/read?url=http%3A%2F%2Ftechcrunc...

Also works well as a bookmarklet:

javascript:window.location.replace("http://justread.mpgarate.com/read?url=" + encodeURIComponent(document.URL))
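
For anyone adapting the bookmarklet, here is a minimal sketch of the URL it builds, written as a standalone function. The function name `buildReadUrl` and the constant `READ_ENDPOINT` are made up for illustration; only the endpoint itself comes from the links above. It uses `encodeURIComponent` so that reserved characters such as `&`, `?` and `+` in the article URL survive as part of the `url` parameter.

```javascript
// Endpoint taken from the example links in this thread.
const READ_ENDPOINT = "http://justread.mpgarate.com/read?url=";

// Hypothetical helper: builds the justread link for a given article URL.
// encodeURIComponent percent-encodes ":", "/", "?", "&", "=" and "+",
// so the whole article URL is passed as a single query parameter.
function buildReadUrl(articleUrl) {
  return READ_ENDPOINT + encodeURIComponent(articleUrl);
}

// In a bookmarklet, the same call would be used with document.URL:
//   window.location.replace(buildReadUrl(document.URL));
```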

jscheel · on Sept 15, 2014

Ugh, I need to set up an extension to automatically rewrite all Techcrunch urls.
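
A rough sketch of the rewriting logic such an extension would need, not a full extension. The names `REWRITE_HOSTS`, `JUSTREAD_PREFIX` and `rewriteUrl` are hypothetical, and the host list is just an assumption; hooking this into a browser's navigation events is left out.

```javascript
// Assumed list of sites to redirect; extend as needed.
const REWRITE_HOSTS = ["techcrunch.com"];
const JUSTREAD_PREFIX = "http://justread.mpgarate.com/read?url=";

// Returns the justread link if the page is on a listed host
// (or a subdomain of one); otherwise returns the URL unchanged.
function rewriteUrl(pageUrl) {
  const host = new URL(pageUrl).hostname.toLowerCase();
  const matches = REWRITE_HOSTS.some(
    (h) => host === h || host.endsWith("." + h)
  );
  return matches ? JUSTREAD_PREFIX + encodeURIComponent(pageUrl) : pageUrl;
}
```

Matching on the parsed hostname rather than a substring of the URL avoids rewriting unrelated sites that merely mention the name in their path.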

glittershark · on Sept 15, 2014

Pentadactyl command to do this with the currently open page:

    :command! justread execute 'open justread.mpgarate.com/read?url=' + buffer.URL

hyp0 · on Sept 15, 2014

http://justread.mpgarate.com/read?url=news.ycombinator.com%2...

see also readability.

but what I really want is a low bandwidth version of a webpage, to conserve my mobile data plan.

analog31 · on Sept 16, 2014

Could the browser tackle this? If I understand it correctly, in broad strokes, nothing comes down unless the browser asks the server for it.

I'm dating myself, but when I first learned about HTML, the idea was that text would be organized so the browser could make it more readable for you, based on your needs. For instance, a blind person could use a text-to-speech browser, and perhaps the heading tags would help them navigate the document.

Today's web pages simply treat the screen as a graphical canvas.

In those old days, I also learned that having a crummy obsolete browser for my crummy obsolete computer actually sped up browsing because my browser was simply incapable of downloading the stuff that ate bandwidth.

mpgarate · on Sept 17, 2014

That is correct. justread will save you data compared to loading the original page. No javascript, ads, extra images unrelated to the article etc. It does, however, keep the images, since often they are an interesting part of an article.

cdbattags · on Sept 15, 2014

package it with http://squirt.io/ and suddenly we can read everything

goldfeld · on Sept 15, 2014

Anyone who thinks speedreaders like this are a good idea should look into the ophthalmologist Bates' research and method. Reading without moving your eyes equals tension and stress on your eyes and related muscles (neck, shoulders). It's a great way to increase your need for glasses.

vidyesh · on Sept 15, 2014

Not sure if it is practically possible to read a whole article this way, but this is an awesome tool.

How come I never stumbled upon this!?

Thank you very much.

jonalmeida · on Sept 15, 2014

It was on HN a while back, but when I used it in practical cases like WSJ, it seemed to pick up HTML code, whitespace characters and/or text from a sidebar.

I ditched it at the time, but I may try to start using it again if I can get it to work with ebooks.

scoot · on Sept 15, 2014

I've added bookmarklets for both, and you can Clean (which sorts out multi-page articles, sidebars etc.) then Squirt to speed-read the resulting article. They really do go hand-in-hand.

notastartup · on Sept 15, 2014

Oh my god, that is an amazing tool. It's really hard to say; I don't think I caught or remember everything I read, but I do feel like I can understand what I am reading. I keep trying to sound out the words, give up, and begin to look at the words only as pictures. It's a surreal experience.

masukomi · on Sept 15, 2014

How is this different from the original arc90 Readability algorithm with a URL field added to kick off the processing?

mpgarate · on Sept 15, 2014

This project directly uses the Readability api.

I created this as a simpler interface than Readability offers, primarily for my own personal use as a bookmarklet.

stavros · on Sept 16, 2014

I didn't know they had an API. Would it be easy to create a bookmarklet that used it? I don't like their extension, it feels too heavy. I want something that doesn't run until I invoke it.

akavel · on Sept 15, 2014

By the way, does anybody here know of an algorithm (and/or already implemented open-source library/app) that copes well with auto-extracting content from forum-like websites? (i.e. phpBB, StackOverflow, HN, reddit, ...)

syllogism · on Sept 17, 2014

My suggestion would be to understand the Boilerpipe algorithm, which as far as I can see is the best available solution (and much clearer than readability): http://www.l3s.de/~kohlschuetter/publications/wsdm187-kohlsc...

You can then easily adapt it for your requirements.
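
To give a feel for the kind of shallow text features that approach relies on (word count and link density per block), here is a toy sketch. It is not the Boilerpipe implementation: the block shape `{ text, linkText }`, the function names, and both thresholds are assumptions for illustration, and a real extractor would also look at neighbouring blocks.

```javascript
// Count whitespace-separated words; an empty string has zero words.
function wordCount(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Fraction of a block's words that sit inside links (0 for empty blocks).
function linkDensity(block) {
  const total = wordCount(block.text);
  if (total === 0) return 0;
  return wordCount(block.linkText) / total;
}

// Assumed thresholds: keep blocks that are reasonably long
// and are not mostly link text (menus, comment toolbars, footers).
function isContent(block, minWords = 15, maxLinkDensity = 0.33) {
  return (
    wordCount(block.text) >= minWords &&
    linkDensity(block) <= maxLinkDensity
  );
}

// Returns the text of the blocks classified as content, in order.
function extractContent(blocks) {
  return blocks.filter((b) => isContent(b)).map((b) => b.text);
}
```

For forum pages, the per-block decision is the part to adapt: comments are often short, so a lower `minWords` combined with a stricter link-density cutoff may work better than article defaults.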

krapp · on Sept 15, 2014

Umm... anything that uses xpaths should work I would think.

Apologies for blowing my own horn, but I've had some luck filtering HN and reddit with this project I built (I used to have an example in progress online but I've taken it down): https://github.com/kennethrapp/embedbug

akavel · on Sept 15, 2014

The point is I want some heuristic that would work "automagically" (like Readability, etc), not requiring me to invent a tailor-made xpath for each and every such website in the world.

rahimnathwani · on Sept 16, 2014

Try this:

http://fivefilters.org/content-only/

It has a default extractor, and site-specific recipes use the same format as Instapaper, so you can leverage the work Marco has done on different sites.

krapp · on Sept 15, 2014

Oh, alright.

If there is such a thing I'd be interested to learn about it myself. TBH "tailor make an xpath for every site" is the best solution I'm aware of.

suprjami · on Sept 16, 2014

Don't spose you're putting the source up anywhere?

Knowing my luck I'd get used to reading with this, then you'd disappear off the internet forever. It'd be nice to be able to self-host.

mpgarate · on Sept 17, 2014

That is a nice idea. I'll have to prepare the code a bit for this but will likely do so. In that case, I will let you know.

justread is written with golang!

infinitone · on Sept 16, 2014

At first I thought by minimal, you meant summarized/shortened. Perhaps use an additional word to describe what you mean.

Other than that, looks good.

praveenster · on Sept 15, 2014

Care to share the details of the HTML parser? Is it one of arc90/goose/boilerpipe/fivefilters or a new engine?

cag_ii · on Sept 15, 2014

> Built by mpgarate with the Readability API.

Is noted in the footer.

badloginagain · on Sept 16, 2014

Would like to see this as a browser extension: one button click to view the page in a readable format. Great work.

wehadfun · on Sept 15, 2014

This is great. I would like it even more if it removed all images and displayed text in a boring font.

nazgul · on Sept 15, 2014

You're not concerned about the copyright issues related to this?