Seems like a silly question, but how do you organize your files?
I struggle to find a way to keep my reference documentation organized, such as ebooks relevant to programming or other related tasks.
I made an automatic document tagger and categorizer. It collects any docs or HTML pages saved to Dropbox, dropped into a Telegram channel, saved with Zotero, Slack, Mattermost, a private WebDAV share, etc.; cleans the docs; pulls the text; performs topic modeling along with a bunch of other NLP stuff; then renames all docs into something meaningful, sorts them into a custom directory structure where folder names match the topics discovered, tags them with relevant keywords, and visually maps the documents as an interactive graph. Full-text search for each doc is provided via Solr. HTML docs are converted to clean text PDFs after ads are removed. This 'knowledge base' is contained in a single ECMS, and the external accounts for data input are configured from a single YAML file. There's also a web scraper that takes crawl templates as JSON files and uploads the data into the CMS as files to be parsed with the rest of the docs. The idea is to be able to save whatever you are reading right now with one click, whether you are on mobile or desktop or collaborating in a group, and have a single repository where all the organizing is done actively 24/7 with ML.
Currently reconstructing the entire thing to production spec as an AWS AMI, perhaps later polished into a personal knowledge base SaaS where the cleaned and sorted content is publicly accessible via a REST/CMIS API.
This project has single-handedly eaten almost a third of my life.
I use the LDA algorithm for topic modeling. It has been the standard go-to within the NLP community for a while now, and there are implementations of it in many languages. The tricky part is cleaning the text, building domain-specific stopword lists, and in general controlling how text is processed depending on the context, so that topic assignments stay useful when the text corpus represents more than a single field of knowledge. There are also some interesting ways of combining recent advances in RNNs with the more old-school LDA topic modeling; I think this is where the most substantial advances will come from.
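For a rough idea of the topic-modeling step, here is a minimal LDA sketch with the gensim library (assuming the documents are already cleaned and tokenized, and using a tiny, hypothetical stopword list):

    # Minimal LDA sketch with gensim; assumes docs have already been extracted as plain text.
    from gensim import corpora, models

    docs = [
        "neural networks for image classification",
        "convolutional neural network tutorial",
        "hydrodynamic modelling of coastal flows",
    ]
    domain_stopwords = {"for", "of", "the", "a"}  # hypothetical domain-specific stopword list

    texts = [[w for w in d.lower().split() if w not in domain_stopwords] for d in docs]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(t) for t in texts]

    lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
    for topic_id, words in lda.print_topics():
        print(topic_id, words)

    # Assign each document to its most probable topic, e.g. to pick a folder name for it.
    for bow in corpus:
        print(max(lda[bow], key=lambda pair: pair[1]))

In practice most of the work is in the cleaning and tokenization that happens before this step, not in the model call itself.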
Home directory is served over NFS (at work). Layout is as follows:
phireal@pc ~$ ls -1
Box/ - work nextcloud
Cloud/ - personal nextcloud
Code/ - source code I'm working on
Data@ - data sources (I'm a scientist)
Desktop/ - ...
Documents/ - anything I've written (presentations, papers, reports)
Local@ - symlink to my internal spinning hard drive and SSD
Maildir/ - mutt Mail directory
Models/ - I do hydrodynamic modelling, so this is where all that lives
Remote/ - sshfs mounts, mostly
Scratch/ - space for stuff I don't need to keep
Software/ - installed software (models, utilities etc.)
At home, my main storage looks like:
phireal@server store$ ls -1
archive - archived backups of old machines
audiobooks - audio books
bin - scripts, binaries, programs I've written/used
books - eBooks
docs - docs (personal, mostly)
films - films
kids - kids films
misc - mostly old images I keep but for no particular reason
music - music
pictures - photos, organised YYYY/MM-$month/YYYY-MM-DD
radio - podcasts and BBC radio episodes
src - source code for things I use
tmp - stuff that can be deleted and probably should
tv_shows - TV episodes, organised show/series #
urbackup - UrBackup storage directory
web - backups of websites
work - stuff related to work (software, data, outputs etc.)
.
├── Desktop
├── Downloads
│ ├── Google Drive // My de facto Documents folder
│ ├── legal
│ ├── library // ebooks and anything else I read
│ ├── ...
├── Sandbox // all my repositories or software projects go here
├── Porn // useful when I was a teen, now just contains a text file with lyrics to "Never Gonna Give You Up"
I back up my home folder via Time Machine. I haven't used Windows in years, but when I did, I used to do something similar: I always kept a separate partition for games and software because those could be reinstalled easily, and personal data was always kept in my User folder.
- bin :: quick place to put simple scripts and have available everywhere
- build :: download projects for inspection and building, not for actively
working on them
- work-for :: where to put all projects; all project folders are available to
me in zsh like ~proj-1/ so getting to them is quick despite depth.
- me :: private projects for my use only
- proj-1
- all :: open source
- proj-2
- client :: for clients
- client-1
- proj-3
- org :: org mode files
- diary :: notes relating to the day
- 2017-06-21.org :: navigated with binding `C-c d` defaulting to today
- work-for :: notes for project with directory structure reflecting that of
~/work-for
- client
- client-1
- proj-3.org
- know :: things to learn from: txt's, books, papers, and other interesting
documents
- mail :: maildirs for each account
- addr-1
- downloads :: random downloads from the internet
- media :: entertainment
- music
- vids
- pics
- wallpaper
- t :: for random ad-hoc tests requiring directories/files; e.g. trying things
with git
- repo :: where to put bare git repositories for private projects (i.e. ~work-for/me/)
- .password-store :: (for `pass` password manager)
- type-1 :: ssh, web, mail (for smtp and imap), etc.
- host-1 :: news.ycombinator.com, etc.
- account-1 :: jol, jolmg, etc.
Not all folders are available on all machines, like ~/repo is on a private server, but they follow the same structure.
- ebooks: I don't love Calibre, but it's the only game in town.
- music: Musicbrainz Picard to get the metadata right. I've been favoring RPis running mpd as a front-end to my music lately.
- movies/TV: MediaElch + Kodi
I don't have a good solution for managing pictures and personal videos that doesn't involve handing all of it to some awful, spying "cloud" service. Frankly most of this stuff is sitting in Dropbox (last few years worth) or, for older files, in a bunch of scattered "files/old_desktop_hd_3_backup/desktop/photos"-type directories waiting for my wife and I to go through them and do something with them. Which is increasingly less likely to happen—sometimes I think the natural limitations of physical media were a kind of blessing, since one was liberated from the possibility of recording and retaining so much. Without some kind of automatic facial recognition and tagging—and saving of the results in some future-proof way, ideally in the photos/videos themselves—this project is likely doomed.
My primary unresolved problem is finding some sort of way to preserve integrity and provide multi-site backup that doesn't waste a ton of my time+money on set-up and maintenance. When private networks finally land in IPFS I might look at that, though I think I'll have to add a lot of tooling on top to make things automatic and allow additions/modifications without constant manual intervention, especially to collections (adding one thing at a time, all separately, comes with its own problems, like having to enumerate all of those hashes when you want something to access a category of things, like, say, all your pictures). Probably I'll have to add an out-of-band indexing system of some sort, likely over HTTP for simplicity/accessibility. For now I'm just embedding a hash (CRC32 for length reasons and because I mostly need to protect against bit-rot, not deliberate tampering) at the end of filenames, which is, shockingly, still the best cross-platform way to assert a content's identity, and synchronizing backups with rsync—ZFS is great and all but doesn't preserve useful hash info if a copy of a file is on a non-ZFS filesystem, plus I need basically zero of its features aside from periodically checking file integrity.
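A minimal sketch of that embed-a-CRC32-in-the-filename idea (the tag format and helper names here are hypothetical, not the exact scheme described above):

    # Sketch: append a CRC32 of each file's contents to its name, and verify it later.
    import os
    import re
    import zlib

    def crc32_of(path, chunk=1 << 20):
        """Stream the file and return its CRC32 as an 8-hex-digit string."""
        crc = 0
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                crc = zlib.crc32(block, crc)
        return f"{crc & 0xFFFFFFFF:08x}"

    def tag_file(path):
        """Rename e.g. photo.jpg -> photo.crc32-1a2b3c4d.jpg (skips already-tagged files)."""
        stem, ext = os.path.splitext(path)
        if re.search(r"\.crc32-[0-9a-f]{8}$", stem):
            return path
        tagged = f"{stem}.crc32-{crc32_of(path)}{ext}"
        os.rename(path, tagged)
        return tagged

    def verify_file(path):
        """Return True if the hash embedded in the name still matches the contents."""
        m = re.search(r"\.crc32-([0-9a-f]{8})", os.path.basename(path))
        return bool(m) and m.group(1) == crc32_of(path)

As noted, CRC32 only catches accidental corruption; tamper resistance would need a cryptographic hash and a longer filename.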
One thing I do that I've found to be pretty helpful is to prefix files/directories with a number or date, for sorting. Some things are naturally ordered by date, for example events. So I might have a directory "my-company/archive", where each item is named "20170621_some-event".
Other things are better sorted by category or topic. For tools or programming languages I'm researching I might have a directory with items "01_some-language", "02_setup", "10_type-system", "20_ecosystem", etc.
I do something very similar. I save files into folders watched by Hazel (Google Drive, Dropbox, and my Downloads folder). Hazel has a rule that renames the file to YYYYMMDD_Filename.ext and then, depending on the extension, filters it into a different folder, or, for a PDF, runs OCR on it and stores it in DEVONthink Pro.
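A rough script equivalent of that rename-and-file rule, for anyone without Hazel (the watched folder and the extension-to-folder mapping are hypothetical, and the OCR/DEVONthink step is left out):

    # Sketch of a Hazel-like rule: prefix files with a date and file them by extension.
    import datetime
    import shutil
    from pathlib import Path

    WATCHED = Path.home() / "Downloads"                  # hypothetical watched folder
    DESTINATIONS = {                                     # hypothetical extension -> folder map
        ".pdf": Path.home() / "Documents" / "pdf",
        ".jpg": Path.home() / "Pictures",
    }

    def file_away(path: Path):
        date = datetime.date.fromtimestamp(path.stat().st_mtime).strftime("%Y%m%d")
        new_name = path.name if path.name[:8].isdigit() else f"{date}_{path.name}"
        dest_dir = DESTINATIONS.get(path.suffix.lower(), WATCHED / "misc")
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(dest_dir / new_name))

    for p in WATCHED.iterdir():
        if p.is_file():
            file_away(p)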
I did this (with dates) for all calculations I did during my PhD and it turned out very useful despite its simplicity. I guess it was simple enough that I could actually be consistent over years. I wish the rest of my projects were as orderly ;)
That's pretty much how I did it on my NAS. My top level is basically "fiction", "non-fiction", "music", "pictures", "software" - then it just goes from there.
But it has problems. For instance, I like to collect information about robotics and artificial intelligence. In many cases I have papers with titles like "Using Computer Vision to Control a Robot Arm via a CNN". Do I put it under "robotics/sensors/vision" or "ai-ml/anns/cnn" or "robotics/motion-control/platform/arm"...or...? It can technically fit into any and all of those categories!
That's a problem with hierarchical category structures; when something can fit into multiple categories, you can either duplicate the information (not good - unless your system has some way of using pointer refs or such to prevent data duplication - which most file systems don't), cross-link the information (put it in a canonical spot and symlink to it), or just say "f-it" and stick it someplace, and hope you can find it later (which sometimes you can't).
What I wish I had, instead, was a simple means to search my filesystem in a very quick fashion. Ideally, it would be something like the old Google Search Appliance, which could spider and index my filesystem, read each file (and any metadata stored in the file, such as in the case of videos and images) and build up an index that can be quickly and easily searched. It would also keep this index up-to-date as files are added, removed, or changed.
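For reference, a bare-bones DIY version of that spider-and-index idea can be sketched with the Whoosh library; this only handles plain-text files (real document formats need text extraction), and keeping the index current would need a watcher or cron job, so it is a far cry from a turnkey appliance:

    # Sketch: index plain-text files under a root directory, then search them.
    import os
    from whoosh import index
    from whoosh.fields import ID, TEXT, Schema
    from whoosh.qparser import QueryParser

    ROOT = os.path.expanduser("~/files")            # hypothetical directory to index
    INDEX_DIR = os.path.expanduser("~/.file_index")

    schema = Schema(path=ID(stored=True, unique=True), content=TEXT)

    def build_index():
        os.makedirs(INDEX_DIR, exist_ok=True)
        ix = index.create_in(INDEX_DIR, schema)
        writer = ix.writer()
        for dirpath, _, names in os.walk(ROOT):
            for name in names:
                full = os.path.join(dirpath, name)
                try:
                    with open(full, errors="ignore") as f:
                        writer.add_document(path=full, content=f.read())
                except OSError:
                    pass
        writer.commit()

    def search(query_text):
        ix = index.open_dir(INDEX_DIR)
        with ix.searcher() as searcher:
            query = QueryParser("content", ix.schema).parse(query_text)
            return [hit["path"] for hit in searcher.search(query, limit=20)]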
Unfortunately, I've yet to find a low-cost (ideally free) open-source solution to this problem that is also easy to set up and maintain. I've found more than a few solutions (or partial ones) which, given enough admin, configuration, maintenance, and/or glue code, could potentially become the system I want, but none of them were "turnkey": install, simple setup (nothing more complex than a NAS or WiFi router, for instance), and "let it go". They were all very "enterprise-y" and required more than a bit of effort to install and maintain. It isn't that I couldn't do that; I just don't have the time to dedicate to such a task. But it might be something I just have to bite the bullet for.
Maybe what I need to do is research the latest offering of FreeNAS - maybe they've (since the last time I used it) implemented a decent search engine module (or some third party plugin has been created) to handle this issue?
One external RAID (mirrored) holds information only needed when I'm working at my desk. Within that drive I have an archive folder with past files that are rarely if ever needed. The folder structure is labeled broadly, such as "documents" and "media", with more specific folders within. At the file level I usually put a date at the beginning of the name, going from largest to smallest unit (2017-6-21_filename). For sensitive documents, I use encrypted DMG files with the same organization structure.
As for all "working" documents, they're local to my machine under a documents or project folder. The documents folder is synced to all my devices and looks the same everywhere with a similar organization structure as my external drive. My projects folder is only local to my machine, which is a portable, and contains all the documents needed for that project.
TL;DR Shallow folder structure with dates at the beginning of files essentially.
If you're particularly asking about reference material that you take notes about and would like to search, retrieve, and produce reports on, Zotero might work for you. I have many years of research notes on it. It's a hyper-bookmarking tool that can keep snapshots of web pages, keep PDFs and other "attachments" within saved articles, and lets you tag, organize, and search them.
Outside of that scope, my files reside randomly somewhere in the ~/Documents folder (I use a mac) and I rely on spotlight to find the item I need. It's not super great but is workable often enough.
It's not a silly question!
edit: I've been trying to find a multi-disk solution and haven't had much success with an easy enough to use tool. I use git-annex for this and it helps to some extent. I've also tried Camlistore, which is promising, but has a long way to go.
Another option is to have a look at a tag-based filesystem instead of a hierarchical one to organize everything semantically. I've been using Tagsistant (there are other options) for a couple of months now and I'm almost happy; I'm more satisfied with the idea itself and its potential.
I mostly work with Golang, so usually all work-related stuff is in my GOPATH under ~/code/go/src/github.com/company-name/.
Non-Golang code goes into ~/code, sometimes ~/code/company-name, but I also have a couple of ad hoc codebases spread around in different places on my filesystem.
So it is a bit disorganized. However last few years I have rarely ever needed to cd outside of ~/code/go.
Some legacy codebases I worked on (and still need to contribute to from time to time) live in the most random places, as it took some effort and time to configure the local environment for some of these beasts to work properly (and they depend on stuff like Apache vhosts), so I am too afraid to move them to ~/code in case I break my local environment.
I have my files pseudo-organized, meaning I kind of try to keep them where they should logically be, but since this varies a lot, they're not really organized.
The thing is, I use "Everything", a free instant file search tool from voidtools.
It is blazingly fast, just start typing and it finds files while you type.
It uses the NTFS file system's existing index (Windows only, sorry everyone else) to perform instant searches. It is hands down the fastest file search tool I have ever encountered: files are literally found as you type their names, without waiting even a millisecond.
So, no organization (the OCD part of me hates this), but I always find my files in an instant, no matter where I left them.
My file layout is quite uninteresting. The most noteworthy thing is that I have an additional toplevel directory /x/ where I keep all the stuff that would otherwise be in $HOME, but which I don't want to put in $HOME because it doesn't need to be backed up.
- /x/src contains all Git repos that are pushed somewhere. Structure is the same as wanted by Go (i.e., GOPATH=/x/). I have a helper script and accompanying shell function `cg` (cd to git repo) where I give a Git repo URL and it puts me in the repo directory below /x/src, possibly cloning the repo from that URL if I don't have it locally yet.
As I said, that's not in the backup, but my helper script maintains an index of checked-out repos in my home directory, so that I can quickly restore all checkouts if I ever have to reinstall. (A rough sketch of the URL-to-directory mapping follows this list.)
- /x/bin is $GOBIN, i.e. where `go install` puts things, and thus also in my PATH. Similar role to /usr/local/bin, but user-writable.
- /x/steam has my Steam library.
- /x/build is a location where CMake can put build artifacts when it does an out-of-source build. It mimics the structure of the filesystem, but with /x/build prefixed. For example, if I have a source tree that uses CMake checked out at /home/username/foo/bar, then the build directory will be at /x/build/home/username/foo/bar. I have a `cd` hook that sets $B to the build directory for $PWD, and $S to the source directory for $PWD whenever I change directories, so I can flip between source and build directory with `cd $B` and `cd $S`.
- /x/scratch contains random junk that programs expect to be in my $HOME, but which I don't want to backup. For example, many programs use ~/.cache, but I don't want to backup that, so ~/.cache is a symlink to the directory /x/scratch/.cache here.
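Regarding the /x/src layout above, the URL-to-directory mapping that `cg` relies on is simple; here is a rough Python sketch of the idea (the real helper is a script plus a shell function, and the URL handling here is simplified):

    # Sketch: map a Git URL to a GOPATH-style directory under /x/src, cloning if missing.
    import os
    import re
    import subprocess
    import sys

    SRC_ROOT = "/x/src"

    def repo_dir(url):
        """e.g. https://github.com/user/repo.git -> /x/src/github.com/user/repo"""
        path = re.sub(r"^(https?://|git@)", "", url)
        path = path.replace(":", "/")
        if path.endswith(".git"):
            path = path[: -len(".git")]
        return os.path.join(SRC_ROOT, path)

    if __name__ == "__main__":
        url = sys.argv[1]
        target = repo_dir(url)
        if not os.path.isdir(target):
            subprocess.run(["git", "clone", url, target], check=True)
        print(target)  # a shell wrapper would `cd` into this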
I use `mess` [1].
Short description: New stuff that is not filed away instantly goes into a folder "current", linked to the youngest folder in a tree (mess_root > year > week).
If needed at a later time: file it accordingly, otherwise old folders are purged if disk space is low.
Taking it a step further: syncing everything across work and personal machines using `syncthing`.
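The core of that scheme fits in a few lines; a sketch (paths here are hypothetical) that creates this ISO week's folder and repoints a "current" symlink at it:

    # Sketch: keep ~/current pointing at mess_root/<year>/<ISO week>, creating it as needed.
    import datetime
    import os

    MESS_ROOT = os.path.expanduser("~/mess")      # hypothetical mess_root
    CURRENT = os.path.expanduser("~/current")     # the "current" link

    def ensure_current():
        year, week, _ = datetime.date.today().isocalendar()
        target = os.path.join(MESS_ROOT, str(year), f"week-{week:02d}")
        os.makedirs(target, exist_ok=True)
        if os.path.islink(CURRENT):
            os.remove(CURRENT)
        os.symlink(target, CURRENT)

    if __name__ == "__main__":
        ensure_current()

Run from cron or a login script, this keeps "current" pointing at the newest week; purging old weeks when disk space is low would be a separate pass.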
└─Filename preserved, ordered by date or grouped in arbitrary functional folders
Drivers
├─Video
├─Sound
└─MB
Music
└─Primary Artist
└─YYYY.AlbumName (Keeps albums in date order)
└─AlbumName Track# Title.mp3 (truncates sensibly on a car stereo)
Pictures
└─YYYY-MM-DD.Event Description (DD is optional)
Projects
├─scripts - reusable across clients
│ └─language
│ └─purpose
└─clientname
├─source code
└─documents
Utils (single-executable files that don't require an install)
I use Beyond Compare as my primary file manager at home and work. Folder comparison is the easiest way to know if a file copy fully completed. Multi-threaded move/copy is nice too.
Besides the usual `Images`, `Videos`, and `code` directories, the single most important directory on my system is `~/flash` (as in flash memory). This is where my browser downloads files and where I create "daily" files, which I quickly remove.
This is a directory that can be emptied at any moment without fear of losing anything important, and it helps me keep the rest of my filesystem clean. Basically a `/tmp` for the user.
Most of my files stay in the download folder. If I think I will need them at a later stage, I upload them to my Google Drive. Google is quite good at searching stuff; for me that also works for personal files. I have probably 100 e-books that are on my reading list and will never get read by me...
Symlinking ~/Downloads and ~/Documents into ~/Dropbox is my only interesting upgrade. Across my various devices I have different things selectively synced. Large media files are the only things that don't live in Dropbox in some way or another. It's pretty convenient for mobile access (everything is accessible from web/mobile). I've done some worrying about sensitive documents and such, but most of it is also present in my email, so I think I lost that battle already. It also means there's very little downside to wiping my HD entirely if I want to try a different OS (which I used to do frequently, but I ended up settling on vanilla Ubuntu).
Organizing my files has been an obsession of mine for many years, so I've evolved what I think is a very effective system that combines the advantages of hierarchical organization and tagging. I use 3-character tags as part of every file's name. A prefix of tags provides a label that conveys the file's place in the hierarchy of all my files. To illustrate, here's the name of a text file that archives text-based communications I've had regarding a software project called 'Do, Too':
- pjt>sfw>doToo>cmm
'pjt' is my tag for projects
'sfw' is my tag for software and computer science
'doToo' is the name of this software project
'cmm' is my tag for interpersonal communications
Projects (tagged with 'pjt') is one of my five broad categories of files, with the others being Personal ('prs'), Recreation ('rcn'), Study ('sdg'), and Work ('wrk'). All files fall into one of these categories, and thus all file names begin with one of the five tags mentioned. After that tag, I use the '>' symbol to indicate that the following tag(s) is/are subcategories.
Any tags other than those for the main categories might follow, as 'sfw' did in the example above. This same tag 'sfw' is also used for files in the Personal category, for files related to software that I use personally--for example:
- prs>sfw>nameMangler@nts
Here, NameMangler is the name of the Mac application I use to batch-modify file names when I'm applying tags to new files. '@nts' is my tag for files containing notes. I also have many files whose names begin with 'sdg>sfw' and these are computer science or programming-related materials that I'm studying or I studied previously and wanted to archive.
A weakness of hierarchical organization is that it makes it difficult to handle files that could be reasonably placed in two or more positions in the hierarchy. I handle this scenario through the use of tag suffixes. These are just '|'-delimited lists of tags that do not appear in the prefix identifier, but that are still necessary to convey the content of the file adequately. So for example, say I have a PDF of George Orwell's essay "Politics and the English Language":
The suffix of tags begins with '=' to separate it from the rest of the file name. A couple of other features are shown in this file name. I use '_' to separate the prefix tags from the original name of the file ('orwell9' in this case) if it came from an outside source. I'm an English teacher and use this essay in class, and that's why the tags 'wrk' for Work and 'tfl' for 'Teaching English as a Foreign Language' appear. 'wrt' is my tag for 'writing', since Orwell's essay is also about writing. The tag 'georgeOrwell' is not strictly necessary since searching for "George Orwell" will pick up the name in the text content of the PDF, but I still like to add a tag to signal that the file is related to a person or subject that I'm particularly interested in. Adding a camel-cased tag like this also has the advantage that I can specifically search for the tag while excluding files that happen to contain the words 'George' and 'Orwell' without being particularly about or by him.
That last file name example also illustrates what I find to be a big advantage of this system: it reduces some of the mental overhead of classifying the file. I could have called the file 'wrk>tfl>politicsAndTheEnglishLanguage=sdg|wrt|lng|georgeOrwell', but instead of having to think about whether it should go in the "English teaching work-related stuff" slot or the "stuff about language that I can learn about" slot, I can just choose one more or less arbitrarily, and then add, as a suffix, the tags that would have made up the prefix I didn't choose.
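A small sketch of how such names could be parsed programmatically, assuming '>' separates the hierarchy tags, '_' precedes an original file name, and '=' introduces the '|'-delimited suffix tags, as described:

    # Sketch: split names like "prs>sfw>nameMangler@nts" or
    # "wrk>tfl>politicsAndTheEnglishLanguage=sdg|wrt|lng|georgeOrwell" into parts.
    import os

    def parse_name(filename):
        stem, ext = os.path.splitext(filename)
        stem, _, suffix = stem.partition("=")      # suffix tags come after '='
        stem, _, original = stem.partition("_")    # original name comes after '_', if any
        return {
            "hierarchy": stem.split(">"),
            "original_name": original or None,
            "suffix_tags": suffix.split("|") if suffix else [],
            "extension": ext,
        }

    print(parse_name("pjt>sfw>doToo>cmm.txt"))
    print(parse_name("wrk>tfl>politicsAndTheEnglishLanguage=sdg|wrt|lng|georgeOrwell.pdf"))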
There's actually a lot more to the system, but those are the basics. Hope you find it helpful in some way.
My main collection of files is for my startup, computing, applied math, etc. All those files are well enough organized. Here's how I do it and how I do related work more generally (I've used these techniques for years, and they are all well tested).
(1) Principle 1: For the relevant file
names, information, indices, pointers,
abstracts, keywords, etc., to the greatest
extent possible, stay with the old 7-bit
ASCII character set in simple text files
easy to read by both humans and simple
software.
(2) Principle 2: Generally use the
hierarchy of the hierarchical file system,
e.g., Microsoft's Windows HPFS (high
performance file system), as the basis
(framework) for a taxonomic hierarchy
of the topics, subjects, etc. of the
contents of the files.
(3) To the greatest extent possible, I do
all reading and writing of the files using
just my favorite programmable text editor
KEdit, a PC version of the editor XEDIT
written by an IBM guy in Paris for the IBM
VM/CMS system. The macro language is Rexx
from Mike Cowlishaw from IBM in England.
Rexx is an especially well designed
language for string manipulation as needed
in scripting and editing.
(4) For more, at times make crucial use of
Open Object Rexx, especially its function
to generate a list of directory names,
with standard details on each directory,
of all the names in one directory subtree.
(5) For each directory x, have in that
directory a file x.DOC that has whatever
notes are appropriate for good
descriptions of the files, e.g., abstracts
and keywords of the content, the source of
the file, e.g., a URL, etc. Here the file
type of an x.DOC file is just simple ASCII
text and is not a Microsoft Word document.
There are some obvious, minor exceptions,
that is, directories with no file named
x.DOC from me. E.g., directories created
just for the files used by a Web page when
downloading a Web page are exceptions and
have no x.DOC file.
(6) Use Open Object Rexx for scripts for
more on the contents of the file system.
E.g., I have a script that for a current
directory x displays a list of the
(immediate) subdirectories of x and the
size of all the files in the subtree
rooted at that subdirectory. So, for all
the space used by the subtree rooted at x,
I get a list of where that space is used
by the immediate subdirectories of x.
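That per-subdirectory size report is easy to reproduce in other languages too; here is a rough Python equivalent of the idea (the original is an Open Object Rexx script):

    # Sketch: for the current directory, list each immediate subdirectory and the
    # total size of all files in the subtree rooted at it.
    import os

    def subtree_size(root):
        total = 0
        for dirpath, _, names in os.walk(root):
            for name in names:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass
        return total

    for entry in sorted(os.scandir("."), key=lambda e: e.name):
        if entry.is_dir():
            print(f"{subtree_size(entry.path):>15,}  {entry.name}")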
(7) For file copying, I use Rexx scripts
that call the Windows commands COPY or
XCOPY, called with carefully selected
options. E.g., I do full and incremental
backups of my work using scripts based on
XCOPY.
For backup or restore of the files on a
bootable partition, I use the Windows
program NTBACKUP which can backup a
bootable partition while it is running.
(8) When looking at or manipulating the
files in a directory, I make heavy use of
the DIR (directory) command of KEdit. The
resulting list is terrific, and common
operations on such files can be done with
commands to KEdit (e.g., sort the list),
select lines from the list (say, all files
x.HTM), delete lines from the list, copy
lines from the list to another file, use
short macros written in Kexx (the KEdit
version of Rexx), often from just a single
keystroke to KEdit, to do other common
tasks, e.g., run Adobe's Acrobat on an
x.PDF file, have Firefox display an x.HTM
file.
More generally, with one keystroke, have
Firefox display a Web page where the URL
is the current line in KEdit, etc.
I wrote my own e-mail client software.
Then given the date header line of an
e-mail message, one keystroke displays the
e-mail message (or warns that the date
line is not unique, but it always has
been).
So, I get to use e-mail message date lines
as 'links' in other files. So, if some
file T1 has some notes about some subject
and some e-mail message is relevant, then,
sure, in file T1 just have the date line
as a link.
This little system worked great until I
converted to Microsoft's Outlook 2003. If
I could find the format of the files
Outlook writes, I'd implement the feature
again.
(9) For writing software, I type only into
KEdit.
Once I tried Microsoft's Visual Studio and
for a first project, before I'd typed
anything particular to the project, I got
50 MB or so of files nearly none of which
I understood. That meant that whenever
anything went wrong, for a solution I'd
have to do mud wrestling with at least 50
MB of files I didn't understand; moreover,
understanding the files would likely have
been a long side project. No thanks.
E.g., my startup needs some software, and
I designed and wrote that software. Since
I wrote the software in Microsoft's Visual
Basic .NET, the software is in just simple
ASCII files with file type VB.
There are 24,000 programming language
statements and about 76,000 lines of
comments for documentation, which is
IMPORTANT.
So, all the typing was done into KEdit,
and there are several KEdit macros that
help with the typing.
In particular, for documentation of the
software I'm using -- VB.NET, ASP.NET,
ADO.NET, SQL Server, IIS, etc. -- I have
5000+ Web pages of documentation, from
Microsoft's MSDN, my own notes, and
elsewhere.
So, at some point in the code where some
documentation is needed for clarity for
the code, I have links to my documentation
collection, each link with the title of
the documentation. Then one keystroke in
KEdit will display the link, typically
have Firefox open the file of the MSDN
HTML documentation.
Works great.
The documentation is in four directories,
one for each of VB, ASP, SQL, and Windows.
Each directory has a file that describes
each of the files of documentation in that
directory. Each description has the title
of the documentation, the URL of the
source (if from the Internet which is the
usual case), the tree name of the
documentation in my file system, an
abstract of the documentation, relevant
keywords, and sometimes some notes of
mine. KEdit keyword searches on this file
(one for each of the four directories) are
quite effective.
(10) Environment Variables
I use Windows environment variables and
the Windows system clipboard to make a lot
of common tasks easier.
E.g., the collection of my files of
documentation of Visual Basic is in my
directory
H:\data05\projects\software\vb\
Okay, on the command line of a console
window, I can type
G VB
and then have that directory current.
Here 'G' abbreviates 'go to'!
So, to command G, argument 'VB' acts like
a short nickname for directory
H:\data05\projects\software\vb\
Actually that means that I have --
established when the system boots -- a
Windows environment variable MARK.VB with
value
H:\data05\projects\software\vb\
I have about 40 such MARK.x environment
variables.
So, sure, I could use the usual Windows
tree walking commands to navigate to
directory
H:\data05\projects\software\vb\
but typing
G VB
is a lot faster. So, such nicknames are
justified for frequently used directories
fairly deep in the directory tree.
Environment variables
MARK.TO
MARK.FROM
are used by some other programs,
especially my scripts that call COPY and
XCOPY.
So, to copy from directory A to directory
B, I navigate to directory A and type
MARK FROM
which sets environment variable
MARK.FROM
to the directory tree name of directory A.
Similarly for directory B.
Then my script
COPYFT1.RXS
takes as argument the file name and does
the copy.
My script
COPYFT2.RXS
takes two arguments, the file name of the
source and the file name to be used for
the copy.
I have about 200 KEdit macros and about
200 Rexx scripts. They are crucial tools
for me.
(11) FACTS
About 12 years ago I started a file
FACTS.DAT. The file now has 74,317 lines,
is 2,268,607 bytes long, and has 4,017
facts.
Each such fact is just a short note,
sure, on average
2,268,607 / 4,017 = 565
bytes long and
74,317 / 4,017 = 18.5
lines long.
And that is about
12 * 365 / 4,017 = 1.09
days per fact, that is, an average of
right at one new fact a day.
Each new fact has its time and date, a
list of keywords, and is entered at the
end of the file.
The file is easily used via KEdit and a
few simple macros.
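A sketch of how appending to and searching such a flat facts file could work (the exact record format here is guessed; the original is plain ASCII driven by KEdit macros):

    # Sketch: append a timestamped, keyword-tagged fact to a flat file, or search it.
    import datetime
    import sys

    FACTS = "FACTS.DAT"
    SEPARATOR = "=" * 40

    def add_fact(keywords, text):
        with open(FACTS, "a", encoding="ascii") as f:
            f.write(SEPARATOR + "\n")
            f.write(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S") + "\n")
            f.write("keywords: " + ", ".join(keywords) + "\n")
            f.write(text.rstrip() + "\n")

    def find_facts(keyword):
        with open(FACTS, encoding="ascii") as f:
            blocks = f.read().split(SEPARATOR + "\n")
        return [b for b in blocks if keyword.lower() in b.lower()]

    if __name__ == "__main__":
        if sys.argv[1] == "add":               # e.g. add "phone,alice" "Alice: 555-0100"
            add_fact(sys.argv[2].split(","), sys.argv[3])
        else:                                  # otherwise, a keyword to search for
            print("\n".join(find_facts(sys.argv[1])))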
I have a little Rexx script to run KEdit
on the file FACTS.DAT. If KEdit is
already running on that file, then the
script notices that and just brings to the
top of the Z-order that existing instance
of KEdit editing the file -- this way I
get single threaded access to the file.
So, such facts include phone numbers,
mailing addresses, e-mail addresses, user
IDs, passwords, details for multi-factor
authentication, TODO list items, and other
little facts about whatever I want help
remembering.
No, I don't need special software to help
me manage user IDs and passwords.
Well, there is a problem with the
taxonomic hierarchy: for some files, it
might be ambiguous which directory they
should be in. Yes, some hierarchical file
systems permit a file to be listed in more
than one directory, but AFAIK the
Microsoft HPFS file system does not.
So, when it appears that there is some
ambiguity in what directory a new file
should go, I use the x.DOC files for those
directories to enter relevant notes.
For ebooks I created folders for main categories and some sub-categories (inspired by Amazon.com or some other ebook shop's structure).
For photos, folders per device/year/month.
For Office documents, prepending the date using the ISO date format (2017-06-21 or 170621) works great for sharing with others over various channels like mail/chat/fileserver/cloud/etc.