
Show HN: Memex – annotate and instantly recall any website - Blahah
After 18 months of tinkering and iterating, we want to show you our Memex: An open-source browser extension to effortlessly organise your web-research.

We'd love to know what you think :)

Check it out at [https://worldbrain.io/hn](https://worldbrain.io/hn)

Memex features:

- Instantly find websites you visited with the fuzzy memories you have about them, instead of bookmarking everything or keeping dozens of open tabs. Search for every word of every website you’ve seen, and filter by time, domain, custom tags or bookmarks.

- Add your thoughts to websites via comments & annotations, directly in the browser - not in external, disconnected apps.

- Cite websites with precision: Share links to pieces of text on any website.
======
BlackForestBoy
[ MEMEX TEAM ]

Hey Hacker News crowd, my name is Oliver and I’m the founder of
worldbrain.io, the project developing this Memex. You can ask me anything
about our strategy and our financial model “Steward Ownership”, and roast me
about the details of our vision.

# Vision Summary:
[https://worldbrain.io/vision_deck](https://worldbrain.io/vision_deck)

# Economic Vision:
[https://worldbrain.io/crowdfunding-memex/#why](https://worldbrain.io/crowdfunding-memex/#why)

# Detailed Vision:
[https://worldbrain.io/vision](https://worldbrain.io/vision)

Almost exactly 4 years ago I started the journey of researching a solution on
how our society can collectively tackle information overload & misinformation
(also known as “fake news”). I believe that getting this right is crucial for
us to make the most complex personal, social and political decisions
effectively, sustainably and compassionately - or go bust within the next
century. Counter to common tactics of approaching this problem, I think we
should not dictate a certain truth to battle misinformation but give people
the tools that let shared truths emerge. Memex is our first step towards
making this possible and we are excited to show it to you today.

In order to build the first prototype of Memex I learnt how to code (or
rather how to be a script kiddie and piece together things from GitHub and
StackOverflow ;) ). Even though the code was pretty shitty, it helped to gather
a wonderful team and a community of volunteers that rebuilt everything from
scratch (and ditched my shipwreck of code). So in reality they did 99% of the
work of developing Memex and deserve the praise. :)

Now here we are. After tinkering and iterating on Memex over the past 18
months we feel ready to show you our progress. Let us know how we can improve
Memex to make it really useful in helping you to overcome the information
overload in your web-research.

\- Oliver

~~~
WhiteOwlLion
I'm glad you forked Falcon and developed it to this point.

[https://news.ycombinator.com/item?id=14648312](https://news.ycombinator.com/item?id=14648312)
HistorySearch came out with another Chrome extension, but I don't like their
approach at all: [https://historysearch.com/](https://historysearch.com/)

I'm willing to give your product a try again.

Thanks for taking it this far...

~~~
BlackForestBoy
Hey!

Thanks for the kind words.

Indeed we initially forked Falcon to develop "Research-Engine", the first
iteration of this.

After we could validate that people really wanted it, we rebuilt everything
from scratch. So no Falcon code is there anymore.

Yeah saw the new HistorySearch update too. Thought they were not in the game
anymore. Their product works well, however I don't feel comfortable pushing
all my history to some server I have no control over. Especially not by
default.

Also, offline-first is so important to me too. I often work from a plane or
with a shitty internet connection and want to add annotations or search my
history.

Hope you like our new updates. We know there is still toooons of stuff to do.
So let us know where you have troubles :) We'll do our best to improve it bit
by bit. Literally :)

------
mrkgnao
This looks interesting. I'm open to replacing my current (actually quite
rudimentary) "capture to org-mode" setup. Is there any way to dump
all/important parts of the information Memex captures to plain text/CSV/etc,
or, say, a sqlite database?

Overall, this is a space I'm very interested in and this looks like a polished
product. I'll be keeping an eye on it; I've installed the addon and am looking
forward to playing with it :)

PS. "Full-text search" seemed to me (and might to many) like full-text search
of webpage _contents_ , not just URLs/titles. It's not malicious, of course,
but it feels slightly misleading.

~~~
BlackForestBoy
hey! Oli here from the Worldbrain.io team

> Is there any way to dump all/important parts of the information Memex
> captures to plain text/CSV/etc, or, say, a sqlite database?

Yes indeed there soon will be. Currently working on a backup & restore feature
that would make that possible too. First cloud backup & local dump, then
exporting in different formats. Depends a bit though on what people really
want out of it and how they intend to use it.

Want to get this right together with the community.

> "Full-text search" seemed to me (and might to many) like full-text search of
> webpage contents,

You are right in your first impression. It IS full text search on the content
of each page you visit :)

------
pvinis
I don't remember how I found memex but I've been using it for 2 weeks. It's
pretty cool! I've been looking for a way to be able to search anything I've
visited online, full text search. Keep it up, and thank you. :)

If I had to add something, it would be the ability to sync between
computers/browsers, or fetch unfetched pages from history, so the syncing can
happen in the browser and memex just crawls again.

~~~
BlackForestBoy
Hey!

> it would be the ability to sync between computers/browsers,

This is right now in the making :) Give it a couple of weeks and you can do
that!

> or fetch unfetched pages from history

You can already do that now :)

Go to settings > import history & bookmarks

Does that do what you are looking for?

Cheers Oli

------
Gambit89
How well do you (plan to) support the W3C annotation standards [1], and could
one use this as a backend for hypothes.is [2]?

[1] [https://www.w3.org/annotation/](https://www.w3.org/annotation/)

[2] [https://web.hypothes.is/](https://web.hypothes.is/)

~~~
BlackForestBoy
hey!

Oli here from the WorldBrain.io team

Yes we are in close contact with the Hypothes.is team and may develop some
shared components in the future. We already use their anchoring library.

Internally we don't store the annotations in the W3C standard format yet, but
we will provide that as soon as we get to develop the ability to export or
access annotations via an API.
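
To make the export idea a bit more concrete, here is a minimal sketch of mapping an internally stored annotation to the W3C Web Annotation Data Model (JSON-LD with a `TextQuoteSelector`). This is an illustration only, not Memex code: the `InternalAnnotation` shape, the `toWebAnnotation` helper and the example.org IRIs are all assumptions made up for this example.

```typescript
// Hypothetical internal shape; the real Memex storage format is not
// described in this thread.
interface InternalAnnotation {
  id: string;        // IRI identifying the annotation
  pageUrl: string;   // page the highlight was made on
  quote: string;     // the highlighted text
  comment: string;   // the user's note
  createdAt: string; // ISO 8601 timestamp
}

// A small subset of the W3C Web Annotation Data Model.
interface WebAnnotation {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  created: string;
  body: { type: "TextualBody"; value: string; format: "text/plain" };
  target: {
    source: string;
    selector: { type: "TextQuoteSelector"; exact: string };
  };
}

// Convert the assumed internal record into a W3C-style annotation object.
function toWebAnnotation(a: InternalAnnotation): WebAnnotation {
  return {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    id: a.id,
    type: "Annotation",
    created: a.createdAt,
    body: { type: "TextualBody", value: a.comment, format: "text/plain" },
    target: {
      source: a.pageUrl,
      selector: { type: "TextQuoteSelector", exact: a.quote },
    },
  };
}

const exported = toWebAnnotation({
  id: "https://example.org/annotations/1",
  pageUrl: "https://example.org/article",
  quote: "information overload",
  comment: "Good summary of the problem.",
  createdAt: "2018-09-01T12:00:00Z",
});
console.log(JSON.stringify(exported, null, 2));
```

A real export would probably also add `prefix` and `suffix` to the selector so a quote can be re-anchored when the same text appears more than once on a page.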

How do you imagine Memex being used as a backend in your context? Once we
offer more integrations with other services, it is definitely conceivable to
also make annotations from Hypothes.is searchable. Is that something along the
lines you're thinking of?

Cheers! Oli

~~~
Gambit89
Internal representation doesn't matter to me as much as the external view, so
I guess I'm actually looking for your planned extensible API - I'll be eagerly
awaiting that.

With more and more annotation services, there may come a need for an
"annotation manager". Memex could be the manager for Hypothes.is as you
suggest, but a separate manager could also use Memex and Hypothes.is as
backends. So, for example, the annotation act is done with Memex, but the
research stage is done with a separate tool.

Cool product/vision! I like your guys' stance on the user's data and with an
API might settle on this one. :)

------
poltak
[MEMEX TEAM]

TL;DR: AMA about the full-text search functionality and general experiences
with using the extension.

Hello everybody. I'm Jon - one of the lead technical contributors to the Memex
extension. I have been involved in the project for roughly 18 months and
watched as it has changed and grown over that time through the hard work of
many different individuals.

I would be happy to provide answers to any questions regarding the full-text
search and how we enable it for each web page you visit. I have spent a large
portion of my time at Memex focused solely on the search and, with the help of
some fantastic technologies like Dexie ([http://dexie.org](http://dexie.org))
and IndexedDB, we have arrived at what we believe to be one of the fastest
in-browser full-text history search engines. That said, we hope to iterate on
and improve important aspects of our search - such as results accuracy - in
the near future.
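
For readers who have not built one before, the core idea behind this kind of search is an inverted index: a map from each word to the pages that contain it. The sketch below is a toy in-memory version and is not the Memex implementation; the `PageIndex` class and its methods are hypothetical names, and a real version would persist the index (for example in IndexedDB) and rank the results.

```typescript
// Toy inverted index: maps each term to the set of page URLs containing it.
// Hypothetical illustration only; nothing here is taken from the Memex code.
class PageIndex {
  private postings = new Map<string, Set<string>>();

  // Lowercase the text and split on anything that is not a-z or 0-9.
  // (A real tokenizer would also need to handle non-ASCII text.)
  private static tokenize(text: string): string[] {
    return text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((term) => term.length > 0);
  }

  // Record every term of the page text under the page's URL.
  addPage(url: string, text: string): void {
    for (const term of PageIndex.tokenize(text)) {
      let urls = this.postings.get(term);
      if (!urls) {
        urls = new Set<string>();
        this.postings.set(term, urls);
      }
      urls.add(url);
    }
  }

  // Return the URLs of pages that contain every query term (AND semantics).
  search(query: string): string[] {
    const terms = PageIndex.tokenize(query);
    if (terms.length === 0) return [];
    const sets = terms.map((t) => this.postings.get(t) || new Set<string>());
    return Array.from(sets[0]).filter((url) => sets.every((s) => s.has(url)));
  }
}

const index = new PageIndex();
index.addPage("https://example.org/a", "Fast full-text search in the browser");
index.addPage("https://example.org/b", "Browser extensions and annotations");

// Only the first page contains both "browser" and "search".
console.log(index.search("browser search"));
```

Looking up a term is a single map access, which is why this structure stays fast as the number of indexed pages grows; the cost moves to indexing time and storage size instead.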

I would also love to hear any feedback you might have from your own
experiences with using or having the extension installed in general - we're
always aiming to improve that experience.

------
rambojazz
"save locally" you mean like in my browser, or can I set up my own server?

~~~
BlackForestBoy
Hey! Oli here from worldbrain.io

Yes, it means all data is stored locally and in the browser. Later we aim to
make the backup & sync cloud self-hostable too.

------
VvR-Ox
Very interesting project.

* nice that there is an open source & for free edition
* what happens with very big data? (search performance etc)
* what about integration with other tools?

Good Luck & thx for sharing :-)

~~~
BlackForestBoy
Hey!

Oli here from worldbrain.io

So the search technology is horizontally scalable and should carry years of
your web-research locally in the browser. (and still be performant)

Yes we strongly plan to integrate with other services so you can search and
import data there too. The intention is to make it abstract enough for
developers to easily and autonomously integrate services on their own.
However that needs to wait a bit more since so many other priorities are on
the table and we are quite a small team still.

Cheers Oli

------
chrisweekly
Looks great! FWIW I like the vision / ethos as much as the product.

------
dublin
Interesting, but I don't think I'll be giving up 4 years worth of OneNote web
clips that do the same thing, but with more flexibility and compatibility...

------
icedchai
Memex... neat. I remember reading about Vannevar Bush many many years ago.

------
PurpleRamen
Sketchy. Big promises for privacy, but first thing they do is collect data.

Similar with the commercial plan. Yet there is no existing selfhosting (at the
moment), or even an export. There is no display of how many data are
collected. How much will this waste with time? How do I clean it up?

The Integration seems to be something they should work on. Search does not
work with Tridactyl (a vim-style addon for Firefox). Entrys in context-menu are
also missing. The Annotation "Toolip" (seems there is a typo in the settings?)
is kinda useless for me as I already use the "Swift Selection Search"-addon.
Some documentation on how to manually configure the integration might be
useful. Some tooltips are missing in the Toolip too. No clue what those icons
should do from the look of it.

BTW In Settings, Acknowledgements-Section there is a gif(?) from some movie
scene. Given the commercial plans of this project this will likely considered
as copyright infringement.

BTW2 Memex seems to be trademarked. Might be a problem.

~~~
BlackForestBoy
Heyho :)

Oliver here from the WorldBrain.io team.

Thanks for sharing your concerns. Hope I can adequately address some of them.
Yeah indeed we are collecting some data on how you use the features, so we
know how to improve the workflows of the tool. No terms you search, urls you
visit or blacklist, annotations you make or anything user generated about you
is collected though. You can also opt out of that, and then we don't even know
you exist.

> There is no display of how many data are collected.

You can find a list of everything collected here:
[https://worldbrain.io/privacy-policy/](https://worldbrain.io/privacy-policy/)

Re selfhosting: Yeah we also would like to be there already, but we are a
small team and have to prioritise things that make the tool useful first.
Over time we'll get there. Inside the team we have a strong commitment to
deliver this.

> Given the commercial plans of this project this will likely considered as
> copyright infringement.

Good point, we'd rather remove that!

> Entrys in context-menu are also missing.

Yep, we want that too :D Hopefully someone can help build this, as
unfortunately there are a couple of other priorities at the moment, e.g.
making annotations fully searchable.

> How do I clean it up?

You can already blacklist & retrospectively delete stuff via the settings, but
soon also a bulk select will be available.

Sorry if the tool is not yet fully like you want it :) We are working on it.

~~~
BlackForestBoy
To add to this: We really take your privacy seriously. This is why we went out
of our way to avoid using Google Analytics, the easy and common way to measure
user flows. There was nothing available that did it well for browser
extensions. Instead we built our own analytics tool, which ensures that no one
else has access to that data just because we are using someone else's
software. Props for that go to our GSoC student Mukesh :)
[https://github.com/mukeshkharita](https://github.com/mukeshkharita)

~~~
webmaven
I'd suggest that releasing the analytics tool as open source is likely to be
_very_ interesting to a large number of users here.

~~~
BlackForestBoy
Yeah, we have that on the map too. Still need to make it run smoothly and
factor it out into a separate library when we can. Limited (wo)man power at
the moment to do this. Kind of hacked together at the moment too :)

------
ShishKabab
[MEMEX TEAM]

TL;DR: AMA about Backup/Sync functionality, ways you use Memex and how we can
make your life easier, the Storex layer described in the second paragraph,
design methodologies and decentralization/Blockchain!

Hey there! I'm Vincent, a weird mix of software engineer slowly going service-
and UX-designer. Since joining WorldBrain in February, I've been architecting
and implementing parts of Memex's Cloud functionality, including Memex.Links
and the Backup and Sync functionality that is currently in the works. On the
technical side, I try to make sure that we can move fast and make the best
trade-offs between ideal and doable solutions, so that we can create stuff
that is as useful and easy to use as possible. But after 13 years in the field
building software, my main interest is shifting from programming to the whole
process, from research to designing functionality and communication
strategies, to make things that are as useful as possible, which I can then
help technically architect. And as such, in my spare time I'm currently
thinking about how to rethink web development to lower the barrier between
designers and developers, reducing the write-test-modify cycle while taking a
wholly new approach to QA that is driven by designers instead of programmers
churning out unit tests.

Lately the bulk of my work has been battling AWS and Google docs, trying to
get the Backup functionality to work as nicely as possible. But, in the
process I've taken Jon's work creating the fastest in-browser full-text search
engine currently available, and made it into a low-level library acting as an
abstraction layer on top of different kinds of data sources (IndexedDB, SQL,
NoSQL, and soon REST/GraphQL). It still needs some work to be in a presentable
state though (docs, more tests, etc.). The goal is to create many loosely
coupled, tightly cohesive layers on top of it that solve common storage
problems, like schema migrations, access rights, content feeds, live
collaboration and moving data between front- and back-end. Due to its origins
in Memex, we called it Storex. Feel free to brainstorm with us on how to make
this useful to lots of people and especially, what tedious problems you keep
solving time after time hovering just above the storage layer.
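
To give a rough feel for what "an abstraction layer on top of different kinds of data sources" can mean, here is a small sketch. It is not the Storex API: `StorageBackend`, `InMemoryBackend` and `PageStore` are hypothetical names invented for this example, and the calls are synchronous only to keep it short (a layer over IndexedDB or SQL would be asynchronous).

```typescript
// Hypothetical sketch of a storage abstraction; not the actual Storex API.
type StoredObject = Record<string, unknown>;

// Application code talks to this interface and never to a concrete database.
interface StorageBackend {
  put(collection: string, id: string, object: StoredObject): void;
  get(collection: string, id: string): StoredObject | undefined;
  find(collection: string, predicate: (object: StoredObject) => boolean): StoredObject[];
}

// Simplest possible backend. An IndexedDB or SQL backend would implement
// the same interface with different internals.
class InMemoryBackend implements StorageBackend {
  private collections = new Map<string, Map<string, StoredObject>>();

  private collection(name: string): Map<string, StoredObject> {
    let objects = this.collections.get(name);
    if (!objects) {
      objects = new Map<string, StoredObject>();
      this.collections.set(name, objects);
    }
    return objects;
  }

  put(collection: string, id: string, object: StoredObject): void {
    this.collection(collection).set(id, object);
  }

  get(collection: string, id: string): StoredObject | undefined {
    return this.collection(collection).get(id);
  }

  find(collection: string, predicate: (object: StoredObject) => boolean): StoredObject[] {
    return Array.from(this.collection(collection).values()).filter(predicate);
  }
}

// A higher layer that only depends on the interface, so the backend can be
// swapped without touching this code.
class PageStore {
  constructor(private backend: StorageBackend) {}

  savePage(url: string, title: string): void {
    this.backend.put("pages", url, { url, title });
  }

  findByTitleWord(word: string): StoredObject[] {
    const needle = word.toLowerCase();
    return this.backend.find("pages", (page) =>
      String(page.title).toLowerCase().includes(needle)
    );
  }
}

const store = new PageStore(new InMemoryBackend());
store.savePage("https://example.org/a", "Offline-first search");
store.savePage("https://example.org/b", "Annotation standards");
console.log(store.findByTitleWord("search"));
```

Things like schema migrations or access rights would then be further layers wrapping a `StorageBackend`, which is the "loosely coupled" part of the idea.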

