

Show HN: Never lose a website again - flippyhead
http://fetching.io

======
ark15
Two issues with your landing page

1\. major - Breaks back button

2\. minor (nitpick) - If I click on the green "notify me" button and then
"cancel" on the prompt dialog instead of "OK", it still shows me the green
"Success - you will be notified" message

Good luck though!

~~~
mmaunder
"Never lose a website again" ...because you will never leave this one.

~~~
flippyhead
Ha!

------
jgh
Your site breaks the back button (Chrome 35 OS X Mavericks)

~~~
flippyhead
You are right! I'm working on a fix.

~~~
brk
Out of curiosity, what are you doing with such a simple site (and I don't mean
that in a bad way) that requires any conscious effort to fix? It seems like
you really need to go out of your way to fuck up the back button and for a fix
to be "work".

But, recent nuanced trends in web design/navigation aren't my top skill, so
I'm asking the question honestly.

~~~
zevyoura
The site is built with the meteor framework, so it's not just the simple
implementation you might expect from the layout. I imagine the features of
meteor are useful in the actual app itself, which is probably not a simple
site.

~~~
rurounijones
Basically, wrong tool for the job. The top page should probably be static, no
need to involve meteor at all.

The rest? Maybe meteor is the best choice, cannot really say.

------
purplerails
Shameless plug: [https://www.purplerails.com/](https://www.purplerails.com/)

(1) Saves an exact copy of the page also.

(2) Indexes the text.

(3) Encrypted (search occurs on your computer).

Been in beta for a while. Thanks for feedback.

~~~
andyhmltn
Love the cartoon on your front page!

~~~
purplerails
Thank you for your kind words!

------
sgibat
I love this idea. Chrome's history is insufficient; I swear, it often doesn't
record sites I've visited. But... I'm not ready to trust an unknown process
sending my entire browsing history to unknown servers. Excited for a potential
local version.

~~~
joshvm
Chrome's history is inordinately awful. Get this addon:
[https://chrome.google.com/webstore/detail/better-
history/obc...](https://chrome.google.com/webstore/detail/better-
history/obciceimmggglbmelaidpjlmodcebijb?hl=en)

~~~
joshschreuder
I've always thought this, so thanks for the extension, looks pretty good. It
would be nice to have some geek stats like a top x list of your most visited
sites.
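
For what it's worth, a "top x" list like that is easy to compute from an exported history. This is only a sketch: `top_sites` and the sample `history` list are made up for illustration and have nothing to do with how the extension stores its data.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_sites(visited_urls, n=3):
    """Return the n most visited hostnames as (hostname, count) pairs."""
    hosts = (urlsplit(url).hostname for url in visited_urls)
    # Skip entries without a hostname (e.g. malformed or local URLs).
    counts = Counter(host for host in hosts if host)
    return counts.most_common(n)


# Hypothetical history export, one URL per visit.
history = [
    "https://news.ycombinator.com/item?id=1",
    "https://news.ycombinator.com/news",
    "https://github.com/explore",
    "https://news.ycombinator.com/newest",
    "https://github.com/",
    "http://lunrjs.com/",
]

print(top_sites(history, n=2))
```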

------
drdaeman
I was looking for just such an indexer this very evening.

Unfortunately, logon via Twitter fails with 500 error bar flashing at the
site's top and logon via FB fails with "App Not Setup: The developers of this
app have not set up this app properly for Facebook Login.", so I can't try it.

Nonetheless, my biggest worry is why this is a service instead of a standalone
package. (Actually, I've considered trying it in the hope that the plugin may
be FOSS and, if so, researching whether it could be hacked to work with a
locally installed Solr/Lucene server.) I'm not really comfortable with
directly or indirectly sharing my browsing history with most third parties.

~~~
flippyhead
Bah, I'm sorry! The Facebook login has now been fixed.

I totally hear you on some people not wanting to share browsing history
externally. This first version was easiest to do as a hosted service. Next up
I intend to package it up as an installable app.

~~~
volent
Twitter login still not working

------
ibrad
I am going to wait for the local version for this one. I am working on a
project that needs the full browsing history including content. You might have
read about it here on HN[1].

The goal is to make my computer a search engine that can also recommend
articles based on the ones I have visited. It can also check websites to see
if there is any new content.

The project is still in its very early stages and any tool I can use will be
very helpful. This looks just like what I need.

[1]:
[https://news.ycombinator.com/item?id=7822859](https://news.ycombinator.com/item?id=7822859)
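
The "check websites for new content" part can be sketched with a content fingerprint: store a hash of the page text at visit time and compare it against a hash of the text fetched later. The names `fingerprint` and `has_changed` are hypothetical, and fetching and text extraction are assumed to happen elsewhere.

```python
import hashlib


def fingerprint(text):
    """Hash the page text after collapsing whitespace, so trivial
    formatting differences do not count as new content."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(stored_fingerprint, current_text):
    """True if the current page text differs from what was stored."""
    return fingerprint(current_text) != stored_fingerprint


stored = fingerprint("Hello   world\n")
print(has_changed(stored, "Hello world"))   # whitespace-only difference
print(has_changed(stored, "Hello world!"))  # real difference
```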

------
smoe
Why server-side indexing rather than something like lunr.js[1] from the
beginning?

I like the idea of full-text search on history, but I'm not going to send my
full history to some random dude on the internet. No offence intended :)

[1] "Simple full-text search in your browser"
[http://lunrjs.com/](http://lunrjs.com/)

~~~
flippyhead
None taken! I looked at lunr.js (and a few other options) but felt that the
more sophisticated features of ElasticSearch were worth it (at least to me).

I'm curious, would you be more comfortable if all your content were encrypted
such that not even the app developer (me) could read it? Or would only a local
index do?

~~~
smoe
It's not just about storing the history itself. I store bookmarks in the cloud
as well.

My main problem is: As I understand it, the plugin sees what I see and just
sends away everything. Including payments and balance in my bank account,
business messages in Basecamp, code in private GitHub repos and content on
sites that are not yet public and that I've signed an NDA for.

The benefits don't justify the risks for me. Using incognito mode to avoid a
plugin I've installed isn't really feasible.

As a second reason: I have mediocre internet connections most of the time
since I'm traveling. Therefore I try to avoid as many requests as possible.
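
One mitigation for the "sends away everything" problem would be a local exclusion list that is checked before any page leaves the browser. The sketch below only shows the idea; `EXCLUDED_HOSTS` and `should_index` are hypothetical names, and nothing here describes what the plugin actually does.

```python
from urllib.parse import urlsplit

# Hypothetical exclusion list; a real one would be user-editable.
EXCLUDED_HOSTS = {"basecamp.com", "github.com", "mybank.example"}


def should_index(url, excluded_hosts=EXCLUDED_HOSTS):
    """Return True only for http(s) pages whose host is not excluded."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme not in ("http", "https") or not host:
        return False
    # Exclude the listed host itself and any of its subdomains.
    return not any(host == h or host.endswith("." + h) for h in excluded_hosts)


print(should_index("https://news.ycombinator.com/"))
print(should_index("https://gist.github.com/some-private-gist"))
```

A list like this only helps for sites the user thinks of in advance, so it narrows the risk rather than removing it.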

------
bujatt
I like the concept of this service.

Basically we should be able to find _any_ piece of information that we have
already encountered at any time in our lifetime. This service is indexing
this layer of information.

I would add another layer for information with which we engage more: e.g.
liking, linking, sharing etc.

------
tempestn
Cool. I've definitely had a desire for something like this in the past. It's
mostly been filled by Evernote and its web clipper now. Any time I have a
vague inkling that I might want a page again sometime, I clip it, so I can
easily find it with a future search. (Often by accident since you can
configure the clipper to show Evernote notes alongside search engine results.)

The downside compared to something like this is that it only works if I have
the foresight to clip a page. But the upside is that I don't end up indexing
all the crap I definitely won't want again, so it's easier to find things in
the remainder.

~~~
flippyhead
Thanks. My hope is that the search is good enough that it doesn't matter how
much is indexed -- you'll always be able to find what you are looking for.
This is one reason I felt it was easier, to start, to index content server
side.

~~~
tempestn
Of course, there's no reason not to do both. A service like yours could catch
anything that falls through the cracks with EN. If you could eventually create
browser extensions like theirs that add results alongside search engine
results, it might help people rediscover pages they've used in the past.

------
mooism2
What browsers does it work with?

Where's your privacy policy?

~~~
bradbeattie
Moreover, how does this work with sites that make use of ajax content? It'd be
frustrating to assume that it's recording just fine, only to later find that
patches of your history couldn't be recorded.

~~~
flippyhead
That's a great point. At the moment it only records what's in the DOM
(stripped of all HTML) after the first page load event. I'm working on how to
include AJAX content but it's not nearly as straightforward.
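
To illustrate what "the DOM stripped of all HTML" amounts to, here is a rough sketch using only Python's standard library. It is an assumption about the general technique, not the extension's code; `TextExtractor` and `strip_html` are invented names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces, then collapse runs of whitespace.
        # Note: this also splits words that are broken up by inline tags.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = "<body><h1>Title</h1><p>Hello <b>world</b></p><script>var x=1;</script></body>"
print(strip_html(page))
```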

~~~
waterside81
Take a look at readability.js ([https://github.com/Kerrick/readability-
js](https://github.com/Kerrick/readability-js)) and extract/upload the main
DOM content after all the JS trickery has completed.

~~~
bradbeattie
Assuming it completes. I can easily imagine a page with a news ticker that
updates once every few seconds.
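
A sketch of one way around the never-settling page: snapshot once the DOM has been quiet for a short period, but stop waiting at a hard deadline. In a browser the mutation timestamps would presumably come from something like a MutationObserver; here they are just a list of seconds since the load event, and `capture_time`, `quiet_period` and `deadline` are made-up names with arbitrary defaults.

```python
def capture_time(mutation_times, quiet_period=1.0, deadline=10.0):
    """Return when to snapshot the page, in seconds after the load event.

    Snapshot as soon as no DOM mutation has been seen for `quiet_period`
    seconds, or at `deadline` if the page never goes quiet.
    """
    last_change = 0.0  # treat the load event itself as the first change
    for t in sorted(mutation_times):
        if t - last_change >= quiet_period:
            break  # the page was already quiet long enough before this one
        last_change = t
    return min(last_change + quiet_period, deadline)


# A page that finishes its AJAX work within the first second.
print(capture_time([0.25, 0.5, 0.75]))

# A news ticker that mutates the DOM every half second, forever.
ticker = [0.5 * i for i in range(1, 200)]
print(capture_time(ticker))
```

With a ticker the snapshot is simply taken at the deadline, so the page still gets recorded, just without any guarantee that late-loading content made it in.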

------
pbhjpbhj
Do you just record n-grams indexed against the page URL? Are you then
uploading that index? If you're not uploading it, why is there no "local
version" available?

It's an interesting idea. Personally I have a script that wgets all the pages
I bookmark and I very rarely use that content. What use cases are you
anticipating?
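
For readers unfamiliar with the idea, "n-grams indexed against the page URL" could look roughly like this. It is a toy word-bigram index for illustration only; `word_ngrams` and `NgramIndex` are invented names, the URLs are placeholders, and the thread does not say the service works this way.

```python
from collections import defaultdict


def word_ngrams(text, n=2):
    """Split text into lowercase word n-grams."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


class NgramIndex:
    """Map each n-gram to the set of URLs whose text contains it."""

    def __init__(self, n=2):
        self.n = n
        self.postings = defaultdict(set)

    def add(self, url, text):
        for gram in word_ngrams(text, self.n):
            self.postings[gram].add(url)

    def search(self, phrase):
        """Return URLs containing every n-gram of the phrase."""
        grams = word_ngrams(phrase, self.n)
        if not grams:
            return set()  # phrase shorter than n words
        result = set(self.postings.get(grams[0], set()))
        for gram in grams[1:]:
            result &= self.postings.get(gram, set())
        return result


index = NgramIndex()
index.add("http://a.example/", "taco bell programming is great")
index.add("http://b.example/", "bell labs programming")
print(index.search("taco bell programming"))
```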

~~~
boyter
It would be useful for pulling back pages that have fallen off the internet
and are not in the wayback machine.

That's a use case I have hit a few times. I even started backing up useful
posts just in case they died.

[http://gigablast.com/rants.html](http://gigablast.com/rants.html) is an
example. It's a really good insight into the creation of a search engine which
really should be preserved.

One that did disappear but has since come back is this post
[http://widgetsandshit.com/teddziuba/2010/10/taco-bell-
progra...](http://widgetsandshit.com/teddziuba/2010/10/taco-bell-
programming.html) which I went looking for a few years ago but which had
disappeared from all search indexes.

~~~
ctrijueque
Pinboard offers an archival account ($25 a year) that works wonderfully for
this. Take a look if you don't already know it.

[https://pinboard.in/faq/#archiving](https://pinboard.in/faq/#archiving)

------
arikrak
Looks interesting, I've often wanted to find old pages I've visited. Though
I'll wait until it has a track record, before I trust it with my history.

It would also be important to rank searches well, since otherwise an entire
history may have too much. Though this will be a hard area to take on
Google...

------
someone5555
To be blunt, this project will never gain traction because not enough people
are willing to store their browsing history somewhere outside of their
control. The back button issue shows a lack of attention to detail, something
that's extremely important when dealing with sensitive personal data.

~~~
flippyhead
Please know this is the first exposure this project has ever had. I really did
try, in my spare time, to get this thing perfect before soliciting feedback
from this wondrous community but -- alas! -- there be bugs. It's this kind of
feedback that I was seeking and I intend to incorporate as I drive towards a
broader release.

Part of what I'm trying to validate is exactly the point you raised: will
people generally be too freaked out to store their browsing history "in the
cloud"?

The next thing I'm hoping to determine is if it'll be enough to encrypt this
content in such a way that people are OK with it being stored externally or if
_only_ a locally installed version will do. Security aside, there's a great
deal of advantage to offering this service hosted.

Thanks for your feedback!

~~~
lolatu54
I think the only way people would be comfortable is if you use strong one-way
encryption, meaning you yourself cannot decrypt a user's data. But to enable
trust, and being a startup, you would have to have your code reviewed by a
reliable third party or open-sourced for public review. Google gets away with
it because they have built a brand with a reputation that some people trust
enough with their data (I don't personally, but enough do). A much easier path
is to create a version that works locally with no external communication. Both
options would be ideal, but perhaps more work than you care to take on. No
matter how it pans out it will be a great learning experience.
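
One possible reading of "you yourself cannot decrypt a user's data" is a keyed "blind index": the client hashes each word with a secret key that never leaves the user's machine, and the server stores only those digests. The sketch below is a simplification under that assumption. It still leaks how often each word occurs, it is not a complete encrypted-search scheme, and `blind_tokens` and `BlindIndex` are hypothetical names.

```python
import hashlib
import hmac


def blind_tokens(text, key):
    """Client side: map each distinct word to an HMAC-SHA256 digest."""
    words = set(text.lower().split())
    return {hmac.new(key, w.encode("utf-8"), hashlib.sha256).hexdigest()
            for w in words}


class BlindIndex:
    """Server side: stores only digests and opaque page ids, no plaintext."""

    def __init__(self):
        self.postings = {}

    def add(self, page_id, tokens):
        for token in tokens:
            self.postings.setdefault(token, set()).add(page_id)

    def search(self, token):
        return self.postings.get(token, set())


user_key = b"k" * 32  # placeholder; a real key would be random and kept locally
server = BlindIndex()
server.add("page-1", blind_tokens("Taco Bell programming", user_key))

# The client hashes the query word with the same key before sending it.
(query,) = blind_tokens("taco", user_key)
print(server.search(query))
```

Without the key, the server cannot turn a digest back into a word or test guesses against the index, which is the property being asked for here.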

------
aeflash
Ah, so it's like Firefox's URL bar, except with full-text search, rather than
just the URL/Title.

------
eglover
If this doesn't repeat websites and categorizes them in a neat manner it would
be amazing. My obsessive bookmarking led me to throw this up:
[http://goo.gl/WNw5OG](http://goo.gl/WNw5OG)

I look forward to seeing what you come up with.

------
soundoflight
As an Opera user, there have been many times where the in-page content
searching has helped me find stuff. When I'm in different browsers, that loses
its usefulness. I'll definitely come back to this when the local version is
done.

------
meritt
I'm a bit confused. Is this essentially a browser plugin which extracts
plaintext of your browsing and then sends it off to fetching.io's servers for
indexing, and presumably some sort of search box area to let me search those?

------
vertak
Is each user's index stored somewhere in their browser or on your servers? It
would probably ease people's minds if they knew their history was stored
locally, but I can see the performance hit that might cause.

------
Flenser
What does it look like? Would be nice to have a demo page with an example
index to try without having to sign up. I bet you could A/B test it to see
what affected sign-up rates too.

------
Jonovono
For people that want something like this but locally:
[https://github.com/idibidiart/AllSeeingEye](https://github.com/idibidiart/AllSeeingEye)

~~~
hupili
Nice pointer! Not sure how much history it can support. It stores screenshots
of the pages, which consumes a lot of space.

------
saint-loup
I may be blind, but you don't seem to inform the user that the addon is
currently Chrome-only before he has subscribed. It's misleading.

------
ladybro
Ha, very random but I remember you from an Airbnb listing a few months ago.
Almost lived with you in Seattle for a couple weeks! Best of luck.

------
PeterWhittaker
Why do so many sites require cookies _just_ to tell me what they are all
about? Yes, I am of the lunatic fringe that blocks all cookies by default...

...and by default more and more web sites lose me as a potential
client/user/whatev because they require cookies just to display static
welcoming information...

...I lose nothing by this, as far as I can tell. Perhaps ignorance really is
bliss. :->

~~~
dguaraglia
I guess most websites nowadays are built using one of the myriad web
frameworks out there (Django, RoR, you name it.) Most of these frameworks
enable sessions by default, simply because it's what most websites will want
if they manage any kind of state. Nothing nefarious about it.

~~~
jarrett
> Most of this frameworks enable sessions by default

True, but in my experience, the major frameworks don't automatically lock out
users with cookies disabled. For example, on a Rails app with no before_filter
on the homepage, you can start the server and do this:

    
    
        printf 'GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n' | nc localhost 3000
    

You should get back the homepage HTML.

------
mappum
I don't understand, what is the advantage over using Chrome's history menu?

~~~
flippyhead
Chrome doesn't index the contents of the page and is limited in how much
history it records. This does full text indexing of the pages you visit and
lasts forever. It's backed by elasticsearch.

------
xanderstrike
Why do I need to create an account for what sounds like a browser extension?

------
minusSeven
not available on Firefox. Can't use it.

------
serf
when is firefox support planned?

------
KhalilK
hmmm, privacy?

------
PeterGriffin
Interesting service, but if you consider the market, it's the cross-section of
people...

\- who are both anal-retentive enough to want to index the content of every
page they visit, and yet...

\- are not anal-retentive enough to want to index the content of any page
visited on a mobile device, and also...

\- don't care about their privacy by sending the address and index of every
single page they ever visit to _someone_.

Good luck.

