
Building personal search infrastructure for your knowledge and code - october_sky
https://beepb00p.xyz/pkm-search.html
======
djhworld
I've given up with trying to find The One True Note Taking Tool, so have ended
up writing my own thing that I tinker with now and again to tune it to exactly
what I need.

It's essentially a simple web server that sits on top of a bunch of markdown
files.

The frontend renders the markdown using markdown-it and supports KaTeX for
simple inline mathy things, along with the extended markdown stuff like tables
etc. I've even made it so that you can drag and drop files (including images)
into the edit box and it will upload them to the server and render the correct
markdown syntax so they can be rendered when you look at the note.

Alongside the files, the data is also stored in a SQLite database file with
some metadata, and I'm using the Full Text Search (FTS5) engine to support
search which seems to work ok.

If the database gets corrupted it can just be rebuilt, it's really just there
to augment the notes. If I stop developing it or want to move on, the notes
are there as text files.
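
A minimal sketch of the "files are the truth, the database is a rebuildable index" idea, in Python rather than whatever the tool above is actually written in. The function names, the `notes` table and its two columns are assumptions for illustration, and it presumes your SQLite build has FTS5 compiled in.

```python
import sqlite3
from pathlib import Path


def rebuild_index(notes_dir: str, db_path: str = ":memory:") -> sqlite3.Connection:
    """Rebuild the search index from scratch using the Markdown files on disk."""
    conn = sqlite3.connect(db_path)
    # The files are the source of truth, so dropping the table is always safe.
    conn.execute("DROP TABLE IF EXISTS notes")
    # `path` is stored but not tokenized; only `body` is searchable.
    conn.execute("CREATE VIRTUAL TABLE notes USING fts5(path UNINDEXED, body)")
    for md_file in sorted(Path(notes_dir).rglob("*.md")):
        conn.execute(
            "INSERT INTO notes (path, body) VALUES (?, ?)",
            (str(md_file), md_file.read_text(encoding="utf-8")),
        )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[str]:
    """Return paths of notes matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT path FROM notes WHERE notes MATCH ? ORDER BY rank", (query,)
    )
    return [path for (path,) in rows]
```

If the database file is lost or corrupted, calling `rebuild_index` again recreates it from the text files alone.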

It works well enough in a mobile browser, although admittedly a bit rubbish if
you need offline access.

Works well enough for me. I might open source it one day but I think I'd need
to clean up the code a bit first :)

EDIT: the core of the tool was mostly inspired by this article
[https://golang.org/doc/articles/wiki/](https://golang.org/doc/articles/wiki/)

~~~
gwgundersen
This sounds a lot like a tool I built for myself [1], sans the database. I
agree that Markdown + KaTeX with a local server seems like the right move for
most technical people. Lots of things like encryption, backups, and basic text
search can be done via other Unix tools. I also agree that the big win is
owning your data long-term, even if you get tired of maintaining the software.

[1] [https://github.com/gwgundersen/anno](https://github.com/gwgundersen/anno)

~~~
archontes
Sir! I have to say seeing you here that I appreciate your contributions.

------
sqs
Sourcegraph CEO here. I see the doc mentions Sourcegraph for code search
(cool!). Something like ripgrep is indeed better for your case, a single
person who just needs to search code in local directories on their own
machine. I made a PR for our docs at
[https://github.com/sourcegraph/sourcegraph/pull/8075](https://github.com/sourcegraph/sourcegraph/pull/8075)
that should clarify this.

Sourcegraph is a web-based code search tool that automatically syncs and
indexes many repositories from your organization's code host(s). It's intended
for every developer at an organization to use for searching across all of the
organization's code (and for navigating/cross-referencing with code
intelligence). It's self hosted and usually there is 1 Sourcegraph instance
per organization. If you love local+personal code search, I bet you _and_ your
teammates would love organization-wide code search, so give Sourcegraph a try
([https://docs.sourcegraph.com/#quickstart](https://docs.sourcegraph.com/#quickstart)).
:)

~~~
mikepurvis
Another great option for local code/repo search is Hound. I maintain an
instance of it at my workplace, but it's so lightweight and easy to deploy
that I could easily imagine running an instance of it on my laptop for offline
personal use.

[https://github.com/hound-search/hound](https://github.com/hound-search/hound)

~~~
blyry
YES! We have been using Hound for several years now. Having all of our org's
hundreds of repos searchable in one spot, in a LIGHTNING FAST manner, has been
invaluable in helping our various teams keep up with the legacy sprawl and
effectively remove old features and all their dependencies from our sprawly
systems. I even wrote a microservice that uses GitLab global hooks to keep
Hound up to date without polling, and a little C# config generator that runs
as a cron job on our GitLab instance and redeploys Hound with the newest repos
included.
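
A rough sketch of what such a config generator could look like, in Python rather than C#. The repo mapping is hypothetical, and the JSON layout (`dbpath`, `max-concurrent-indexers`, and a `repos` object mapping a name to a `url`) follows the example config in Hound's README as far as I remember it, so check it against your Hound version.

```python
import json


def build_hound_config(repo_urls: dict[str, str], dbpath: str = "data") -> str:
    """Render a Hound-style config.json from a {name: clone_url} mapping."""
    config = {
        "max-concurrent-indexers": 2,
        "dbpath": dbpath,
        # Sort by name so regenerating the file gives a stable diff.
        "repos": {
            name: {"url": url} for name, url in sorted(repo_urls.items())
        },
    }
    return json.dumps(config, indent=2)
```

In practice the mapping would come from the code host's API (not shown here), and the cron job would write the result to disk and restart Hound.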

Hound falls short on the access control front (we wrapped our instance with a
SAML proxy), but it's still a case of 'you either can search every piece of
software for "password" or you don't have any access at all'. Having to index
a specific branch instead of all of them kinda stinks too; for those two
specific reasons we have been eyeing Sourcegraph, esp. as the GitLab
integration matures.

I can't emphasize enough how fast hound is and how pleasurable it is having a
regex based code search that doesn't make me wait.

~~~
mikepurvis
Yeah, the access control thing is not ideal— I have my instance behind Apache
for the active directory plugin. Potentially as a hack you could run multiple
Hound instances and reverse proxy the correct one based on a user's role?
Might be easier to just add in proper support upstream. :)

Anyway, for now I'm at a small enough org that everyone still just sees
everything, and it's been super valuable.

As far as competition with other tools, the infrastructure team at my org has
their Elastic instance plugged into our GitLab, but most of the engineers
agree that Hound is better— it's faster, it does regex, and it doesn't do
goofy stuff like return pages of the same result from everyone's fork of the
same repo.

------
ssivark
Meta-observation. This topic seems to be getting a lot of attention on HN over
the last few months, indicating massive interest. Further, looking at the
landscape of developments in this space (past all the me-too Markdown note
taking apps): Evernote seems to have a fading presence on the landscape,
Notion seems to be a (too?) well-funded behemoth startup, Roam is trying some
exciting things, and Tiago Forte is putting together some interesting things
under the BASB banner. (Any others? Oh btw, there’s also Perkeep)

It’s amazing for how long Emacs’ Org-mode has been largely unparalleled! Apart
from the revered desktop setup, there are now a bunch of mobile offerings
including Organice — not quite slick, but definitely useful.

I‘m sincerely rooting for more experiments in this area. I would love to be
able to write by hand or speak to my memex (multi-modal interaction). Vannevar
Bush’s “As we may think” has languished uncourted for pitifully long. In some
ways, this was supposed to be the first “killer app” for personal computing.

~~~
marviel
It's a ripe space. I'm using notion mostly right now, but I've also used:

-Coda.io (big, more scriptable player)

-Hypernote (super new player, but with a cool new take on inter-note relationships)

-Tiddlywiki (super customizable, really fast -- but also has a fair amount of footguns)

-Airtable (only played with it a few times but it's usually mentioned in the same breath as notion, I notice)

Hopefully someday we'll achieve Alan Kay's dream :)

~~~
DavideNL
If only Notion was private (like for example Omnifocus.) I can't imagine
uploading all my private data to the cloud of a "free" app.

OmniFocus is more expensive but I gladly pay to prevent my data from being
analyzed & sold.

~~~
marviel
I don't disagree -- though they do have a paid version, and a modest team
size, which seems promising

------
klft
(1) For note taking I stumbled across anno[1] via[2] two weeks ago. It's a
python flask application which you run on your localhost. You write markdown
which gets stored locally as file and is rendered as html using pandoc[3].
It's really basic but I love it.

(2) For physical documents I use a Fujitsu ScanSnap iX500[4] for scanning. A
runtime licence of ABBYY FineReader for OCR is included. The resulting PDF
has embedded text which I extract using pdftotext[5]. I wrote a Python
application to search and tag these documents. It loads all the text in memory,
which is perfectly fine as I have < 10,000 documents. I have used it for 5
years and it works OK.
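
For anyone curious how little code the in-memory approach needs, here is a minimal sketch. It is not the application described above; the `Document` class and `search` function are made up for illustration, and the `text` field is assumed to hold whatever pdftotext produced for each PDF.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    path: str
    text: str  # assumed: the pdftotext output for this PDF
    tags: set[str] = field(default_factory=set)


def search(docs: list[Document], words: list[str], tag: Optional[str] = None) -> list[Document]:
    """Return documents containing every word (case-insensitive), optionally limited to a tag."""
    needles = [w.lower() for w in words]
    hits = []
    for doc in docs:
        if tag is not None and tag not in doc.tags:
            continue
        haystack = doc.text.lower()
        if all(n in haystack for n in needles):
            hits.append(doc)
    return hits
```

A linear scan like this stays fast at a few thousand documents, which is why no index is needed at that scale.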

[1] [https://github.com/gwgundersen/anno](https://github.com/gwgundersen/anno)

[2]
[https://news.ycombinator.com/item?id=22033792](https://news.ycombinator.com/item?id=22033792)

[3] [https://pandoc.org/](https://pandoc.org/)

[4]
[https://www.fujitsu.com/global/products/computing/peripheral...](https://www.fujitsu.com/global/products/computing/peripheral/scanners/scansnap/ix500/)

[5]
[https://en.wikipedia.org/wiki/Pdftotext](https://en.wikipedia.org/wiki/Pdftotext)

~~~
lifeisstillgood
Actually, what has been bugging me recently is the inability to "tag" photos
on my iPhone. All I want is to snap a copy of my bill / invoice / whatever, tag
it with "gas bill" and let it upload to iCloud / Dropbox. From there I am sure
I can do onward processing by looking for "gas bill", but there seems to be no
obvious way to do it (I even looked into EXIF data), and I guess it will have
to wait till I learn iOS coding.

~~~
jamiek88
Touch and hold , then tap an option. Custom: Tap , tap Enter New Tag, type a
custom tag, and tap Done. Create additional custom tags: Tap , tap Enter New
Tag, type a custom tag, and tap Done. Add more than one custom tag to a photo:
Tap , and tap each tag you want to add (so a checkmark appears next to it).

~~~
mceachen
Is this a real UX, or something you'd like? (This isn't how either Apple or
Google Photos works)

------
stillwater56
Does anyone else find that the simple act of writing notes helps them remember
and process better? I spent forever trying to find an ideal note-taking
solution, but now I just write things in a single notebook. I rarely review my
notes, but I find that simply writing thoughts down consistently has improved
my memory and understanding of new concepts.

~~~
otakucode
This certainly applies for me personally. My theory is that it ties in with
the sort of 'geographic' memory where when you think of something, you might
not be able to remember exactly what it is, but you can remember pretty
precisely that it's in the middle of a certain notebook, on a heavily marked-
up page, in the bottom left corner. By tying things to a location which you
can remember, placing it in a bit of a context, it's easier to hold on to. I
also find, and for this I have no explanation at all, that I can remember
sequences of numbers and code very well, better than anything else. I couldn't
tell you the date I started or left my job 2 employers ago, but I could rattle
off my 7-digit numeric security code for the door no problem. The brain is
weird.

~~~
lazyasciiart
Well, you also used that 7 digit code a lot more often than you ever had to
recall your start or end dates.

------
lcall
I wrote and use daily [http://onemodel.org](http://onemodel.org) (AGPL, uses
postgres), for many reasons listed there :) . One way to think of its current
state is a text-mode, easy-to-learn (i hope) infinite mind map of things,
where I store _and can query_ effectively everything: calendar, reminders,
quasi-anki-like knowledge review, journal, automatic activity log, notes on
subjects, very efficiently for the user. (It also stores documents, but that
is not very smooth compared to other document systems, nor is browser
integration smooth at all.)

Edit: It also has a very basic security model (private, public, unspecified),
and with that in mind, can export trees of notes as html or as outline
documents (text), with or w/o indentation & numbering, which I've found very
useful. And anything can be in as many places in the tree as is helpful. The
export to simple html, I use to generate my 2 web sites.

(I plan to move it to Rust, and maybe sqlite, eventually, as well as add
features like anki, internal code attached to entity classes for cheap
internal customization/automation, etc, but have been slow lately.)

(Edit: it is currently only self-hosted by each user. Have considered doing
hosting for other users, and might some day.)

~~~
gotts
telnet demo seems to be down at the moment: Trying 52.37.29.12...

~~~
lcall
True; sorry about that. Maybe I should remove that from the web site until I
decide better. But the best thing is probably to check the screen shots via
the web site, then install/try it if you like...

Edit: I have removed mention of the telnet demo from the site. If there were
sufficient real interest I would put it back (or consider hosting the system
for others). If so, email me via the mailing list at the site, or via the
address at the site footer. Thanks.

------
gricardo99
A great time saver for me was simply setting up better bash history and search
capabilities[1].

I wrote a wrapper function, sbh (search bash history) that allows me to input
date strings like "2 months ago", or "last week", which narrows the search.
Linux 'date' function with --date string arg is pretty powerful[2].
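
This is not the sbh wrapper itself, just a Python sketch of the same idea under some assumptions: the history text is in the format bash writes when HISTTIMEFORMAT is set (a `#<epoch>` line before each command), and the hypothetical `ago()` helper stands in for date strings like "2 months ago".

```python
import re
from datetime import datetime, timedelta


def parse_history(text: str) -> list[tuple[datetime, str]]:
    """Parse bash history text in which each command follows a '#<epoch>' line."""
    entries, stamp = [], None
    for line in text.splitlines():
        if re.fullmatch(r"#\d+", line):
            stamp = datetime.fromtimestamp(int(line[1:]))
        elif stamp is not None and line.strip():
            entries.append((stamp, line))
    return entries


def ago(**kwargs) -> datetime:
    """Hypothetical stand-in for `date --date`: ago(weeks=1), ago(days=60), ..."""
    return datetime.now() - timedelta(**kwargs)


def search_history(text: str, pattern: str, since: datetime) -> list[str]:
    """Return commands matching `pattern` that were run at or after `since`."""
    regex = re.compile(pattern)
    return [cmd for when, cmd in parse_history(text) if when >= since and regex.search(cmd)]
```

Something like `search_history(history_text, r"^ssh", ago(weeks=1))` then narrows the search to last week's ssh commands.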

1 - [https://spin.atomicobject.com/2016/05/28/log-bash-history/](https://spin.atomicobject.com/2016/05/28/log-bash-history/)

2 - [https://www.thegeekstuff.com/2013/05/date-command-examples/](https://www.thegeekstuff.com/2013/05/date-command-examples/)

------
dchichkov
Reminds me of something similar: the CEO of Wolfram developed a nice way of
record keeping:
[https://writings.stephenwolfram.com/2019/02/seeking-the-prod...](https://writings.stephenwolfram.com/2019/02/seeking-the-productive-life-some-details-of-my-personal-infrastructure/)

By the way, is there, by chance, a "note taking/indexing tool from photo"? I'd
like to be able to take a photo of a title/abstract of a computer science paper
with my phone, and then be able to find it by approximate date and keywords.
(I use Android. Seems like something relatively easy to hack, actually, on top
of Google photos.)

~~~
tibu
Evernote does character recognition quite well. I don't know if there are any
others but would be good to have something else too so I can leave Evernote
for Notion.

------
napoleond
I've been thinking a lot about how I manage my own data lately (notes, photos,
code, reference material, etc) and have concluded that the primary feature I'm
looking for is longevity. I'm saddened by the amount of data I've lost over
the years, either because of hard disk failures or third-party services going
out of business/making it difficult to extract things/getting too expensive.

In light of this, I'm biasing toward simple file formats managed by tools I
write myself, and optimizing for cost in a way that I otherwise don't, since
any recurring costs incurred by the system are effectively a lifelong
commitment. I _am_ relying on S3 for primary storage (so that it is accessible
anywhere) but with a sync to offline backup.

So far, I've implemented a personal Zettelkasten tool (with built-in spaced
repetition, so doubles as an Anki replacement) and a search engine that's
based on Presto (via AWS Athena) so that I don't need to keep an Elasticsearch
instance alive. I'm planning to build out other repository tools as I go.

It's been very liberating to build tools that are never meant to be used by
anyone other than myself, and with the confidence that the tools don't matter
too much anyway since the underlying files are stored in evergreen formats.

~~~
silicon2401
what's the optimal setup for long-term, large-scale (personal) data storage?

I want to build one big Backup. Some initial research has pointed me to
something like Bacula to manage the data backup process from a machine. With
the 3-2-1 rule, I know I also need my Backup itself to have at least 3 copies,
in at least 2 different forms (cloud/hard disk), at least one of which is off-
site from me.
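
The 3-2-1 rule as stated above is easy to write down as a check. This is only a sketch; the `Copy` class and its fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    medium: str   # e.g. "hard disk", "cloud", "tape"
    offsite: bool


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 distinct media, with >= 1 offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )
```

For example, two local hard disks plus one cloud copy passes, while three hard disks in the same house does not.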

As an individual, do you or anybody else know the best way to implement such a
system? Should I buy one giant hard drive, use many hard drives to create a
RAID array, something else?

~~~
kortex
Oooh. I've been wrestling with this problem for a while now.

Basically I'm working on a tiered system. Files/dirs are categorized by size
(<10MB, <25GB, >25GB) , and by sensitivity (public, confidential, secure. And
importance is usually proportional to security). I have fortunately found that
security is usually inverse to size. Github/lab anything which makes sense.
Confidential small stuff (sans keys) is just stored in gmail/drive. Big,
boring stuff (music, ebooks) is just kept on external hard drives.

Secure, ultra-important stuff, I don't really have a system for.

The system I'm leaning towards is just encrypt archives and store the
key/password securely, and store it like you would any boring data, with a
local NAS and a cloud backup service of some sort, or just stored on drives
offsite.

~~~
silicon2401
Do you feel comfortable using cloud storage for so much of your content? My
ideal is to be entirely self-backed-up. I want a personal git server, photo
archive, etc. With bandwidth, service costs, vendor issues (dealing with
google seems like a nightmare from reading online).

How did you construct your NAS? Is it a single system, or multiple hard
drives/storage solutions connected to your network?

~~~
kortex
It depends. Github is not going down. Gmail is not going down. If they do,
it's Bug-out-bag time, and I am working on curating what information subset I
need for that.

Ideally though yes I would have my own entire backup system but I frankly
don't trust myself enough to do it right, so hence some redundancy in the
cloud.

The NAS I am still designing actually :p

------
spdustin
I'd really like a personal "correlate all the things!" setup that has a plugin
architecture for any source and creates a time series and document-based store
of whatever I want. Tweets, e-mails, text messages, time tracking, etc.

There are lots of tools that do the individual moving parts, but a personal
aggregator of everything would be interesting. Basically, a tool that lets you
become your own personal data broker—just for your own personal data.
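
To make the plugin idea concrete, here is a minimal sketch of a source registry that merges everything into one time series. All names (`Event`, `SOURCES`, `source`, `timeline`) are hypothetical, and each plugin is assumed to yield its events already sorted by time.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Event:
    when: datetime
    source: str
    text: str


# Registry of source plugins: name -> function yielding events in time order.
SOURCES: dict[str, Callable[[], Iterable[Event]]] = {}


def source(name: str):
    """Decorator that registers a plugin function under `name`."""
    def register(fn):
        SOURCES[name] = fn
        return fn
    return register


def timeline() -> Iterator[Event]:
    """Lazily merge all registered sources into one chronological stream."""
    streams = (fn() for fn in SOURCES.values())
    # heapq.merge relies on every input stream already being sorted by `when`.
    return heapq.merge(*streams, key=lambda e: e.when)
```

A plugin for tweets, e-mails or text messages is then just a function decorated with `@source("tweets")` that yields `Event` objects; storage and querying would sit on top of `timeline()`.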

~~~
karlicoss
I'm kind of working on that too :)
[https://github.com/karlicoss/my](https://github.com/karlicoss/my)

I wrote a post on some data that I collect and have/will integrate:
[https://beepb00p.xyz/my-data.html#consumers](https://beepb00p.xyz/my-data.html#consumers)

~~~
K0SM0S
I only skimmed through and the combined breadth + intent of your projects
seems very, very interesting — I mean it speaks to me. So, way to go! Mad
props, please keep it up!

If you ask me, this is the shape of things to come.

~~~
karlicoss
Thanks! :) I wish it was easier to share with other people, lots of things are
tedious to set up

------
user00012-ab
My problem with a lot of services listed below, is they all eventually go
away, and all your data is off somewhere else. Unless you store your data
locally in a human readable format (markdown) you are just putting all your
data into a system that WILL go away at some point in the future.

Google has already had 2-3 services to manage your data that they have closed
down. Maybe they are the ones that taught me not to trust your data with
anything on the web.

Even something like Evernote is iffy, they seem like they are constantly on
the verge of shutting down.

Although I do find it sad that the human race as a whole puts so little
value into this type of software, and so much value into sports and politics.

~~~
lcall
[http://onemodel.org](http://onemodel.org) , described briefly elsewhere in
this discussion page and more at that site, is self-hosted, which today means
installing postgres and editing one config file, doing backups & upgrades (but
I might be able to help some).

Maybe I could host for others sometime if there were sufficient interest.
And/or move it to sqlite.

~~~
capableweb
Yeah, seems neither self-hosting (onemodel) nor letting someone else (you or
Evernote) host it is particularly attractive, because the chance of data loss
is always there.

Is it possible there is a solution that makes the data more permanent and
allows multiple parties to backup the same sources, or something similar? Some
sort of federation protocol maybe.

~~~
lcall
Thanks. That is on the future roadmap (though I have been slow lately):
selective sharing/copying/synch. I encourage anyone with possible interest to
sign up for the announcements list at least, and maybe decide sometime to
help. :)

------
ketzo
It's been mentioned a few times in these comments, but I want to add a +1 for
Roam[1]. Note-taking/personal knowledge tool that's very, very different from
anything I've seen before -- closest thing I can compare it to is Wikipedia.
It's still in beta with some rough edges, but VERY worth checking out.

[1] roamresearch.com

~~~
qot
Worth mentioning the pricing [0]:

$30 / month

$10,000 / lifetime

[0]:
[https://twitter.com/Conaw/status/1214855473876201472](https://twitter.com/Conaw/status/1214855473876201472)

------
Fiveplus
> _all digital trace I 'm leaving (tweets, internet comments, annotations)_

I would be open to the idea of a tool which combines the entirety of my
digital presence at any point in time in a single platform. Kinda like a
dynamically updated list which updates itself - every time a linked account
makes a comment, 'likes' a post or performs any activity that may link it back
to me.

~~~
kirubakaran
I'm building this [https://histre.com/](https://histre.com/) It has Hacker
News, Telegram, and web browsing (notes, bookmarks, history) integrations
already. Up next: Emacs org-mode exports, integrations with Pocket and
Pinboard.

Here is a bit longer comment on that which I made earlier today:
[https://news.ycombinator.com/item?id=22160026](https://news.ycombinator.com/item?id=22160026)

~~~
swozey
This is cool, I'd dig a $2-5/mo unlimited account for 1 person/team with the
same unlimited settings.

~~~
kirubakaran
Thanks swozey. Can you please send me an email? k@histre.com

------
wtracy
This gave me a harebrained idea for a browser extension that drops every web
page you visit into a private Lucene database.

~~~
user00012-ab
I was kind of having the same idea, except any site you bookmark gets added to
a personal web crawler, and then you have your own search site for things you
find interesting.

~~~
walterbell
This exists on iPhone/iPad! DevonThink2Go, local crawl/search + optional
encrypted sync over self-hosted WebDAV or public cloud services. Can also
take/search markdown notes.

------
jmakov
No mentions of [https://tiddlywiki.com/](https://tiddlywiki.com/)?

~~~
user00012-ab
tiddlywiki was great until all the browsers stopped supporting writing to local
files; now saving changes is a pain, making me look for something else.

~~~
ahnick
Maybe this solves your problems? It creates a database in your browser's
LocalStorage.

[https://noteself.github.io/](https://noteself.github.io/)

~~~
BLKNSLVR
And the database can be setup to sync with a self-hosted CouchDB instance.

~~~
bachmeier
While that works, the original appeal of Tiddlywiki was that you could open a
file in your browser, type away, and save naturally. Once you get into "self-
hosted", you just have a regular old wiki. I used it everyday for several
years but gave up once the transition happened. I keep trying to go back, but
it just can't compete with files edited in a text editor and stored locally.

------
capableweb
Everything I write about (journal + other things, task lists and what not) is
written in plain markdown files currently (about to move it to TiddlyWiki, one
of these days...) and to get search, I just use `the-silver-searcher` which
searches the entire directory of my files. Simple and scalable (got around 9k
documents by now)

------
insomniacity
My eternal frustration in this space is that my employer has strict firewalls,
web filtering and data-loss prevention software, and remote access is over
Citrix with no copy-paste. Consequently, if I build a knowledge base, it is
stuck inside the firewall. Equally, if I build it outside, I can't use it at
work.

~~~
fctorial
Why don't you host it on an ec2 instance? They won't be blocking amazon ip.
Where do you work?

~~~
insomniacity
There's definitely no external access without going through the web proxy. And
a new uncategorized site would be blocked by the web proxy - and it wouldn't
pass review.

I work in a highly regulated industry...

Any workaround would be grounds for termination. So there's no point to my
comment really - just curious if anyone else is in the same boat.

~~~
dr_baba
Can't you use a personal mobile device with a 4G connection to access a
knowledge database outside the firewall, without moving any data across your
employers network? As long as the data you wish to read/write is not sensitive
in itself, and it's mostly just plaintext notes that you can read/write from
any device, I don't see the issue with that.

~~~
KineticLensman
Secure physical sites (e.g. some military bases) may require you to place
personal electronics in a lockable cabinet. You have to use a paper notepad if
you don't have a device certified by the local security team. Using a non-
certified device can result in being evicted or prosecuted.

------
karlicoss
Hey, author here. Happy to answer any questions!

~~~
saadalem
Is there a way we can subscribe to the blog?

------
porker
> Ideally I want to be able to do fulltext realtime search over anything that
> I ever had in my visual field. Not even necessarily text, but audio and
> video as well.

Where I find all these systems break down is recall. They're designed for
someone who can recall a word or phrase that was in the content. I can usually
recall "It was about X" or "The document/web page/image looked like Y". But an
actual word? The author's name? Not a chance.

While a more difficult problem, if the tool is to live up to the "Future"
section of this page, it's got to go a long way beyond what's in the source
data, to what's thought of by the user.

------
albertzeyer
This topic comes up again and again. I collected some notes about this here:
[https://github.com/albertz/wiki/blob/master/personal-knowled...](https://github.com/albertz/wiki/blob/master/personal-knowledge-base.md)

E.g. one piece of software I started to use is nvALT, via:
[https://www.macstories.net/links/organizing-everything-with-...](https://www.macstories.net/links/organizing-everything-with-plain-text-notes/)

But I'm nowhere near a perfect and complete solution yet...

~~~
computronus
The successor to nvALT, nvUltra, is currently in private beta. I'm looking
forward to its release!

[https://nvultra.com/](https://nvultra.com/)

~~~
porker
And still Mac only :(

------
tomerbd
I have fewer notes after being fed up with notes. It's really time consuming to
manage notes, so I manage logs instead. I just log everything I do, each task
in its own new page. It's append only.

For notes which I mutate I just keep a personal web site and I tried to keep
this as cheatsheet and as compact as possible so I don't need to manage it.

So append only log in quip new folder for each task.

Mutative cheatsheet super compact pages in personal website.

Oh, and for quick snippets: Alfred.

That's it.

------
ajphdiv
I self host a confluence server. All my content is available to me offline.
Might be a bit overkill, but I have knowledge bases for all my work. If there
is a web page I come across I can just copy/paste the content into a new post.
Everything is searchable. It really is great. They offer a starter license,
which is $10 per year:

[https://www.atlassian.com/licensing/starter](https://www.atlassian.com/licensing/starter)

------
glinkot
I use a few things for this (on windows):

\- For notes, OneNote, though I'm always on the lookout for an alternative
with decent UI and syncing, but using open file formats. Full text search
simple enough with this. Code formatting isn't good but there's an addin where
the free version formats it as it was copied.

\- To search local files, Voidtools Everything is great. Searching instantly
by filename is a real time saver.

\- If I want full text search of a large base of documents, I used Likasoft
Archivarius which cost me $30 about 10 years ago and is still handy. It's the
only local desktop search I've found that supports full text indexing of tons
of formats like outlook .ost, etc and can look inside archive files

\- For backups I've continued to stick with external drives, mirrored
periodically with Freefilesync. 3 copies - one as master, two mirrors ensuring
one is offsite.

~~~
seized
Take a look at Standard Notes. It is privacy focussed with encryption but has
markdown and code editors and can be self hosted

~~~
glinkot
Thanks, looks interesting. I find Markdown a great idea in theory, but have
found very few examples of wysiwyg markdown editors that work 'as you'd
expect'. For me that means:

\- Bullets with multiple indents going from 1 to 1) a. etc

\- Table handling

\- Usual formatting like heading levels etc

And there seem to be lots of flavours of markdown too, just to add another
layer to things.

------
dapithor
I wish things like [https://piggydb.net/](https://piggydb.net/) had more
momentum or competitors... personal knowledge databases seem to be such a
tough niche to tackle.

Edit: since there is a new project, here are more details from years back:
[http://www.linux-magazine.com/Issues/2014/160/Workspace-Pigg...](http://www.linux-magazine.com/Issues/2014/160/Workspace-Piggydb)

------
flaque
If you're into this sort of thing, you might want to checkout Roamresearch:
[https://roamresearch.com/](https://roamresearch.com/)

~~~
losteric
Seems similar to ZIM ([https://zim-wiki.org/](https://zim-wiki.org/)), except
proprietary/hosted? I've just started using zim - can someone more experienced
compare the two?

------
jslakro
We could fill a whole internet with each personal method for storing,
classifying and accessing. We're missing an OS for our own memory.

------
jefurii
I wish there was a method for printing QR codes or URLs on paper that would be
the reverse of scanning a QR code. This would make it easy to write complex
URLs in your paper diary/techo/commonplace book/notebook.

------
andreygrehov
I keep my knowledge in a private Git repo managed by
[https://www.gitbook.com/](https://www.gitbook.com/). So far it works out
great. Going to make it public soon.

~~~
karlicoss
That's cool, please drop me an email (or just share here?) when you release
it, I'm collecting
([https://beepb00p.xyz/tags.html#exobrain](https://beepb00p.xyz/tags.html#exobrain))
other people's wikis!

~~~
spoontoeat
Thanks for making your notes public. It inspired some further thinking for my
org-mode setup.

------
Unsimplified
Tried the custom webapp and DB solution for a while. Wasn't publicly portable
enough (for others to copy paste/export easily).

Currently using markdown files in git repos.

------
ziyadb
The holy grail
[[https://beepb00p.xyz/pkm-search.html#future](https://beepb00p.xyz/pkm-search.html#future)] of this
really resonated with me and fully mirrors what I've been thinking about the
past few months. In my observations, it's input capture, information
organization, and subsequent retrieval:

Information Capture:

Input Capture - You’re going to have all-encompassing tracking and recording
of all activity, but want configurable privacy on the extent to which you want
your daily conversations and observations of external things you encounter and
are exposed to. Capturing input needs to be holistic and incorporate all
properties of encounters and new information.

Potential sources of input:

\- Vision: point-of-view recording; see Snapchat Spectacles, etc. as primitive
examples.

\- Audio (voice notes and multi-party conversations): voice calls, video, etc.
and other forms of audio transmission where there is more than a single party
in the interaction.

\- Digital interactions: you will need to keep track of web pages you visit and
at what times, conversations you see on Twitter, etc.

Properties and cues must be extrapolated from the information that is captured
on input. In the case of audio, transcriptions are sufficient for search and
retrieval purposes; however, since video is a visual medium, it includes
significantly more properties that need to be accounted for.

The aim here is to identify sufficient data points (cues) that are
subsequently represented in such a way that they are easy to search across
things you have encountered but only seem to recall a certain property or cue
from. This is because of the fact that human beings tend to remember things in
fragments, for instance, you might remember a certain color on a page that you
visited within the last 6 months and nothing else.

So long as you are capturing sufficient input and actions then you should be
able to go back to any given point in time. How and where are you going to
store this information? Storing everything is going to be a large amount of
data. The essence of the information and context must be preserved. If you
want to wind back to an arbitrary position in time with the original context
intact, you want to retain as much as you can in the most efficient manner
possible, so determining which data points to retain is essential. (Once the
content structure has been figured out, this will be viable).

Examples of Primary Cues:

\- Time: humans generally keep track of things in a linear time-based fashion.

\- Color: invokes emotion and is memorable.

\- Physical Location: the efficiency of information retrieval is highly
influenced by the location at which it is originally synthesized, encountered,
and stored.

\- Keywords: the default conventional mode. Can and should be extracted from
video/imagery and audio.

\- Imagery: search for images based on their contents and ambience.

Potential Secondary Cue — Music - see historical associated input and actions
while certain music was played. (What else?)

Meta Cues — Subjects - Automated tagging of keywords/encountered content.

Any combination of these queries is possible, but ultimately the killer
feature is the ability to backtrack through time to find a certain piece of
information that is made available thanks to the always-on recorded nature of
your interactions with the physical and digital worlds combined.
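
To illustrate "any combination of these queries", here is a toy sketch of cue-based recall. The `Memory` record and `recall` function are invented for this example; a real system would need proper storage and fuzzy matching rather than exact set lookups.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Memory:
    when: datetime
    location: str = ""
    colors: set[str] = field(default_factory=set)
    keywords: set[str] = field(default_factory=set)


def recall(
    memories: list[Memory],
    *,
    after: Optional[datetime] = None,
    before: Optional[datetime] = None,
    location: Optional[str] = None,
    color: Optional[str] = None,
    keyword: Optional[str] = None,
) -> list[Memory]:
    """Filter by whichever cues are given; omitted cues match everything."""
    def matches(m: Memory) -> bool:
        return (
            (after is None or m.when >= after)
            and (before is None or m.when <= before)
            and (location is None or m.location == location)
            and (color is None or color in m.colors)
            and (keyword is None or keyword in m.keywords)
        )
    # Return results oldest first so they can be stepped through like a timeline.
    return sorted((m for m in memories if matches(m)), key=lambda m: m.when)
```

The "orange page I saw sometime in the last 6 months" case becomes `recall(memories, color="orange", after=six_months_ago)`, where `six_months_ago` is whatever cutoff you compute.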

Knowing what to store, and how, + displaying it needs to be worked on further.

~~~
lcall
[http://onemodel.org](http://onemodel.org), described elsewhere here and
moreso at that site, tries to model arbitrary knowledge and has a vision
encompassing any kind of info one wanted to be tracked (again, more at the
site). (Edit: If you have possible future interest, there is an announcements
list.)

------
maurits
I've been pondering on building something like this for a while.

For now, I've settled on sphinx because it can be easily exported to dash, and
tied in to an alfred workflow for search.

------
hvasilev
I use a vim plugin called vimwiki and I export my todos and notes into HTML.
Works fine for me.

------
chimichangga
I just email links, code, docs, etc. to myself with descriptive subjects and
tags.

------
JabavuAdams
I basically live in Evernote. Will gradually transition to personal tooling.

------
rawoke083600
Most stuff (links, photos, docs, etc) I just email it to myself

------
voltagex_
Is there anything for people who don't use Vim?

~~~
executesorder66
emacs has org-mode.

------
jacquesm
A search infrastructure for my knowledge would require access to wetware. Code
I can see working.

------
marv3lls
Ya lost me at $(emacs)!

