Show HN: Notforest - cleaning up the Web one page at a time (notforest.com)
25 points by rhubarbcustard 1633 days ago


Thank you for making this, and please please please don't "pivot" from a bookmarklet into another half-baked social network.

heh, absolutely no chance of that happening.

Tried this on a well-marked-up, valid HTML5 page. It ended up pulling one random piece of content from near the bottom. I realized that it was probably pulling the longest content on the page. Looking at the source code, that's exactly what it does. It removes forms, objects, scripts, images, blank links, divs, etc. Then it goes through paragraphs and tables and finds the longest content. This algorithm seems pretty good for long-form article content, but not for the marketing homepage I tried it on. Overall, pretty cool.
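
For anyone curious, the "strip the noise, keep the longest block" idea described above can be sketched roughly like this. This is not Notforest's actual code (the bookmarklet is JavaScript); it is a minimal Python illustration using only the standard library. The names `SKIP_TAGS`, `BLOCK_TAGS`, `LongestBlockParser`, and `longest_block` are made up for the example, the skipped tags are an assumed subset of the list above, and it assumes `<p>` and `<table>` tags are explicitly closed.

```python
from html.parser import HTMLParser

# Assumed subset of the elements the comment says are removed.
SKIP_TAGS = {"form", "object", "script", "style"}
# Elements treated as candidate content blocks.
BLOCK_TAGS = {"p", "table"}


class LongestBlockParser(HTMLParser):
    """Collects the text of each top-level <p>/<table>, ignoring skipped elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0    # > 0 while inside a skipped element
        self.block_depth = 0   # > 0 while inside a candidate block
        self.current = []      # text pieces of the block being read
        self.blocks = []       # finished candidate blocks

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.block_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in BLOCK_TAGS and self.skip_depth == 0 and self.block_depth > 0:
            self.block_depth -= 1
            if self.block_depth == 0:
                # Collapse runs of whitespace before storing the block.
                self.blocks.append(" ".join("".join(self.current).split()))
                self.current = []

    def handle_data(self, data):
        if self.skip_depth == 0 and self.block_depth > 0:
            self.current.append(data)


def longest_block(html: str) -> str:
    """Return the text of the longest <p>/<table> block, or "" if there is none."""
    parser = LongestBlockParser()
    parser.feed(html)
    parser.close()
    return max(parser.blocks, key=len, default="")
```

On a page of short marketing blurbs, a heuristic like this just returns whichever blurb happens to be longest, which would match the "one random piece of content from near the bottom" result.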

Thanks for checking it out and, yeah, it was designed to work with blog posts and news articles and not marketing-type sites.

Great idea, works well on 4 sites out of the 5 I tried it on. Too bad it makes notforest.com completely blank and thus not readable at all.

I'll keep it anyway!

Cool idea. A Chrome extension would be useful since some of us hide the bookmarks bar.

Thanks, and totally agree about a Chrome extension; will look into that. Notforest has actually been knocking around for a couple of years now and is due a code update, so maybe it's a good time to add browser extensions too.

Hmm, what's the difference between this and Readability?

All Readability does is replace the crap on websites with other, different crap.

Unfortunately, this site does a much worse job of recognizing the content text on pages.

Is it ironic that notforest.com is blank after using it?

Well, why is the text on your page unreadable, then? Grey on white.

Not only that, the text seems to be made of data: images?!

