Seems like a silly question, but how do you organize your files?
I struggle to find a way to keep my reference documentation organized, such as ebooks relevant to programming or other related tasks.
I made an automatic document tagger and categorizer. It collects any docs or HTML pages saved to Dropbox, dropped into a Telegram channel, saved with Zotero, Slack, Mattermost, a private WebDAV share, etc.; cleans the docs; pulls the text; performs topic modeling along with a bunch of other NLP stuff; then renames all docs into something meaningful, sorts them into a custom directory structure where folder names match the topics discovered, tags them with relevant keywords, and visually maps the documents as an interactive graph. Full-text search for each doc is provided via Solr. HTML docs are converted to clean text PDFs after ads are removed. This 'knowledge base' is contained in a single ECMS, and the external accounts for data input are configured from a single YAML file. There's also a web scraper that takes crawl templates as JSON files and uploads the data into the CMS as files to be parsed with the rest of the docs. The idea is to be able to save whatever you are reading right now with one click, whether you are on mobile or desktop or collaborating in a group, and have a single repository where all the organizing is done actively 24/7 with ML.
Currently reconstructing the entire thing to production spec as an AWS AMI, perhaps later polished into a personal knowledge base SaaS where the cleaned and sorted content is publicly accessible via a REST/CMIS API.
This project has single-handedly eaten almost a third of my life.
I use the LDA algorithm for topic modeling. It has been the standard go-to within the NLP community for a while now, and there are implementations of it in many languages. The tricky part is cleaning the text, building domain-specific stopword lists, and in general controlling how text is processed depending on the context, so that topic assignments stay useful when the text corpus represents more than a single field of knowledge. There are also some interesting ways of combining recent advances in RNNs with the more old-school LDA topic modeling; I think this is where the most substantial advances will come from.
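For a rough idea of the topic-modeling step, here is a minimal LDA sketch with the gensim library (assuming the documents are already cleaned and tokenized, and using a tiny, hypothetical stopword list):

    # Minimal LDA sketch with gensim; assumes docs have already been extracted as plain text.
    from gensim import corpora, models

    docs = [
        "neural networks for image classification",
        "convolutional neural network tutorial",
        "hydrodynamic modelling of coastal flows",
    ]
    domain_stopwords = {"for", "of", "the", "a"}  # hypothetical domain-specific stopword list

    texts = [[w for w in d.lower().split() if w not in domain_stopwords] for d in docs]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(t) for t in texts]

    lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
    for topic_id, words in lda.print_topics():
        print(topic_id, words)

    # Assign each document to its most probable topic, e.g. to pick a folder name for it.
    for bow in corpus:
        print(max(lda[bow], key=lambda pair: pair[1]))

In practice most of the work is in the cleaning and tokenization that happens before this step, not in the model call itself.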
Home directory is served over NFS (at work). Layout is as follows:
phireal@pc ~$ ls -1
Box/ - work nextcloud
Cloud/ - personal nextcloud
Code/ - source code I'm working on
Data@ - data sources (I'm a scientist)
Desktop/ - ...
Documents/ - anything I've written (presentations, papers, reports)
Local@ - symlink to my internal spinning hard drive and SSD
Maildir/ - mutt Mail directory
Models/ - I do hydrodynamic modelling, so this is where all that lives
Remote/ - sshfs mounts, mostly
Scratch/ - space for stuff I don't need to keep
Software/ - installed software (models, utilities etc.)
At home, my main storage looks like:
phireal@server store$ ls -1
archive - archived backups of old machines
audiobooks - audio books
bin - scripts, binaries, programs I've written/used
books - eBooks
docs - docs (personal, mostly)
films - films
kids - kids films
misc - mostly old images I keep but for no particular reason
music - music
pictures - photos, organised YYYY/MM-$month/YYYY-MM-DD
radio - podcasts and BBC radio episodes
src - source code for things I use
tmp - stuff that can be deleted and probably should
tv_shows - TV episodes, organised show/series #
urbackup - UrBackup storage directory
web - backups of websites
work - stuff related to work (software, data, outputs etc.)
.
├── Desktop
├── Downloads
│ ├── Google Drive // My de facto Documents folder
│ ├── legal
│ ├── library // ebooks and anything else I read
│ ├── ...
├── Sandbox // all my repositories or software projects go here
├── Porn // useful when I was a teen, now just contains a text file with lyrics to "Never Gonna Give You Up"
I back up my home folder via Time Machine. I haven't used Windows in years, but when I did, I used to do something similar: I always kept a separate partition for games and software because those could be reinstalled easily, and personal data was always kept in my User folder.
- bin :: quick place to put simple scripts and have available everywhere
- build :: download projects for inspection and building, not for actively
working on them
- work-for :: where to put all projects; all project folders are available to
me in zsh like ~proj-1/ so getting to them is quick despite depth.
- me :: private projects for my use only
- proj-1
- all :: open source
- proj-2
- client :: for clients
- client-1
- proj-3
- org :: org mode files
- diary :: notes relating to the day
- 2017-06-21.org :: navigated with binding `C-c d` defaulting to today
- work-for :: notes for project with directory structure reflecting that of
~/work-for
- client
- client-1
- proj-3.org
- know :: things to learn from: txt's, books, papers, and other interesting
documents
- mail :: maildirs for each account
- addr-1
- downloads :: random downloads from the internet
- media :: entertainment
- music
- vids
- pics
- wallpaper
- t :: for random ad-hoc tests requiring directories/files; e.g. trying things
with git
- repo :: where to put bare git repositories for private projects (i.e. ~work-for/me/)
- .password-store :: (for `pass` password manager)
- type-1 :: ssh, web, mail (for smtp and imap), etc.
- host-1 :: news.ycombinator.com, etc.
- account-1 :: jol, jolmg, etc.
Not all folders are available on all machines, like ~/repo is on a private server, but they follow the same structure.
- ebooks: I don't love Calibre, but it's the only game in town.
- music: Musicbrainz Picard to get the metadata right. I've been favoring RPis running mpd as a front-end to my music lately.
- movies/TV: MediaElch + Kodi
I don't have a good solution for managing pictures and personal videos that doesn't involve handing all of it to some awful, spying "cloud" service. Frankly most of this stuff is sitting in Dropbox (last few years worth) or, for older files, in a bunch of scattered "files/old_desktop_hd_3_backup/desktop/photos"-type directories waiting for my wife and I to go through them and do something with them. Which is increasingly less likely to happen—sometimes I think the natural limitations of physical media were a kind of blessing, since one was liberated from the possibility of recording and retaining so much. Without some kind of automatic facial recognition and tagging—and saving of the results in some future-proof way, ideally in the photos/videos themselves—this project is likely doomed.
My primary unresolved problem is finding some sort of way to preserve integrity and provide multi-site backup that doesn't waste a ton of my time+money on set-up and maintenance. When private networks finally land in IPFS I might look at that, though I think I'll have to add a lot of tooling on top to make things automatic and allow additions/modifications without constant manual intervention, especially to collections (adding one thing at a time, all separately, comes with its own problems, like having to enumerate all of those hashes when you want something to access a category of things, like, say, all your pictures). Probably I'll have to add an out-of-band indexing system of some sort, likely over HTTP for simplicity/accessibility. For now I'm just embedding a hash (CRC32 for length reasons and because I mostly need to protect against bit-rot, not deliberate tampering) at the end of filenames, which is, shockingly, still the best cross-platform way to assert a content's identity, and synchronizing backups with rsync—ZFS is great and all but doesn't preserve useful hash info if a copy of a file is on a non-ZFS filesystem, plus I need basically zero of its features aside from periodically checking file integrity.
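A minimal sketch of that embed-a-CRC32-in-the-filename idea (the tag format and helper names here are hypothetical, not the exact scheme described above):

    # Sketch: append a CRC32 of each file's contents to its name, and verify it later.
    import os
    import re
    import zlib

    def crc32_of(path, chunk=1 << 20):
        """Stream the file and return its CRC32 as an 8-hex-digit string."""
        crc = 0
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                crc = zlib.crc32(block, crc)
        return f"{crc & 0xFFFFFFFF:08x}"

    def tag_file(path):
        """Rename e.g. photo.jpg -> photo.crc32-1a2b3c4d.jpg (skips already-tagged files)."""
        stem, ext = os.path.splitext(path)
        if re.search(r"\.crc32-[0-9a-f]{8}$", stem):
            return path
        tagged = f"{stem}.crc32-{crc32_of(path)}{ext}"
        os.rename(path, tagged)
        return tagged

    def verify_file(path):
        """Return True if the hash embedded in the name still matches the contents."""
        m = re.search(r"\.crc32-([0-9a-f]{8})", os.path.basename(path))
        return bool(m) and m.group(1) == crc32_of(path)

As noted, CRC32 only catches accidental corruption; tamper resistance would need a cryptographic hash and a longer filename.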
One thing I do that I've found to be pretty helpful is to prefix files/directories with a number or date, for sorting. Some things are naturally ordered by date, for example events. So I might have a directory "my-company/archive", where each item is named "20170621_some-event".
Other things are better sorted by category or topic. For tools or programming languages I'm researching I might have a directory with items "01_some-language", "02_setup", "10_type-system", "20_ecosystem", etc.
I do something very similar. I save files into folders watched by Hazel (Google Drive, Dropbox, and my Downloads folder). Hazel has a rule that renames the file to YYYYMMDD_Filename.ext and then, depending on the extension, filters it into a different folder, or, for a PDF, runs OCR on it and stores it in DEVONthink Pro.
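A rough script equivalent of that rename-and-file rule, for anyone without Hazel (the watched folder and the extension-to-folder mapping are hypothetical, and the OCR/DEVONthink step is left out):

    # Sketch of a Hazel-like rule: prefix files with a date and file them by extension.
    import datetime
    import shutil
    from pathlib import Path

    WATCHED = Path.home() / "Downloads"                  # hypothetical watched folder
    DESTINATIONS = {                                     # hypothetical extension -> folder map
        ".pdf": Path.home() / "Documents" / "pdf",
        ".jpg": Path.home() / "Pictures",
    }

    def file_away(path: Path):
        date = datetime.date.fromtimestamp(path.stat().st_mtime).strftime("%Y%m%d")
        new_name = path.name if path.name[:8].isdigit() else f"{date}_{path.name}"
        dest_dir = DESTINATIONS.get(path.suffix.lower(), WATCHED / "misc")
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(dest_dir / new_name))

    for p in WATCHED.iterdir():
        if p.is_file():
            file_away(p)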
I did this (with dates) for all calculations I did during my PhD and it turned out very useful despite its simplicity. I guess it was simple enough that I could actually be consistent over years. I wish the rest of my projects were as orderly ;)
That's pretty much how I did it on my NAS. My top level is basically "fiction", "non-fiction", "music", "pictures", "software" - then it just goes from there.
But it has problems. For instance, I like to collect information about robotics and artificial intelligence. In many cases I have papers with titles like "Using Computer Vision to Control a Robot Arm via a CNN". Do I put it under "robotics/sensors/vision" or "ai-ml/anns/cnn" or "robotics/motion-control/platform/arm"...or...? It can technically fit into any and all of those categories!
That's a problem with hierarchical category structures; when something can fit into multiple categories, you can either duplicate the information (not good - unless your system has some way of using pointer refs or such to prevent data duplication - which most file systems don't), cross-link the information (put it in a canonical spot and symlink to it), or just say "f-it" and stick it someplace, and hope you can find it later (which sometimes you can't).
What I wish I had, instead, was a simple means to search my filesystem in a very quick fashion. Ideally, it would be something like the old Google Search Appliance, which could spider and index my filesystem, read each file (and any metadata stored in the file, such as in the case of videos and images) and build up an index that can be quickly and easily searched. It would also keep this index up-to-date as files are added, removed, or changed.
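For reference, a bare-bones DIY version of that spider-and-index idea can be sketched with the Whoosh library; this only handles plain-text files (real document formats need text extraction), and keeping the index current would need a watcher or cron job, so it is a far cry from a turnkey appliance:

    # Sketch: index plain-text files under a root directory, then search them.
    import os
    from whoosh import index
    from whoosh.fields import ID, TEXT, Schema
    from whoosh.qparser import QueryParser

    ROOT = os.path.expanduser("~/files")            # hypothetical directory to index
    INDEX_DIR = os.path.expanduser("~/.file_index")

    schema = Schema(path=ID(stored=True, unique=True), content=TEXT)

    def build_index():
        os.makedirs(INDEX_DIR, exist_ok=True)
        ix = index.create_in(INDEX_DIR, schema)
        writer = ix.writer()
        for dirpath, _, names in os.walk(ROOT):
            for name in names:
                full = os.path.join(dirpath, name)
                try:
                    with open(full, errors="ignore") as f:
                        writer.add_document(path=full, content=f.read())
                except OSError:
                    pass
        writer.commit()

    def search(query_text):
        ix = index.open_dir(INDEX_DIR)
        with ix.searcher() as searcher:
            query = QueryParser("content", ix.schema).parse(query_text)
            return [hit["path"] for hit in searcher.search(query, limit=20)]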
Unfortunately, I've yet to find a low-cost (ideally free) open-source solution to this problem that is also easy to set up and maintain. I've found more than a few solutions (or partial ones) which, given enough admin, configuration, maintenance, and/or glue code, could potentially become the system I want, but none of them were "turnkey": install, simple setup (nothing more complex than a NAS or WiFi router, for instance), and "let it go". They were all very "enterprise-y" and required more than a bit of effort to install and maintain. It isn't that I couldn't do that; I just don't have the time to dedicate to such a task. But it might be something I just have to bite the bullet for.
Maybe what I need to do is research the latest offering of FreeNAS - maybe they've (since the last time I used it) implemented a decent search engine module (or some third party plugin has been created) to handle this issue?
One external RAID (mirrored) holds information only needed when I'm working at my desk. Within that drive I have an archive folder with past files that are rarely if ever needed. The folder structure is labeled broadly, such as "documents" and "media", with more specific folders within. At the file level I usually put a date at the beginning of the name, going from largest to smallest unit (2017-6-21_filename). For sensitive documents, I use encrypted DMG files with the same organization structure.
As for all "working" documents, they're local to my machine under a documents or project folder. The documents folder is synced to all my devices and looks the same everywhere with a similar organization structure as my external drive. My projects folder is only local to my machine, which is a portable, and contains all the documents needed for that project.
TL;DR Shallow folder structure with dates at the beginning of files essentially.
If you're particularly asking about reference material that you take notes about and would like to search, retrieve, and produce reports on, Zotero might work for you. I have many years of research notes on it. It's a hyper-bookmarking tool that can keep snapshots of web pages, keep PDFs and other "attachments" within saved articles, and lets you tag, organize, and search them.
Outside of that scope, my files reside randomly somewhere in the ~/Documents folder (I use a mac) and I rely on spotlight to find the item I need. It's not super great but is workable often enough.
It's not a silly question!
edit: I've been trying to find a multi-disk solution and haven't had much success with an easy enough to use tool. I use git-annex for this and it helps to some extent. I've also tried Camlistore, which is promising, but has a long way to go.
Another option is to have a look at a tag-based filesystem instead of a hierarchical one to organize everything semantically. I've been using Tagsistant (there are other options) for a couple of months now and I'm almost happy; I'm more satisfied with the idea itself and its potential.
I mostly work with Golang, so usually all work-related stuff is in my GOPATH under ~/code/go/src/github.com/company-name/.
Non-Golang code goes into ~/code, sometimes ~/code/company-name, but I also have a couple of ad hoc codebases spread around in different places on my filesystem.
So it is a bit disorganized. However last few years I have rarely ever needed to cd outside of ~/code/go.
Some legacy codebases I worked on (and still need to contribute to from time to time) live in the most random places, as it took some effort and time to configure the local environment for some of these beasts to work properly (and they depend on stuff like Apache vhosts), so I am too afraid to move them to ~/code in case I break my local environment.
I have my files pseudo-organized, meaning I kind of try to keep them where they should logically be, but since this varies a lot, they're not really organized.
The thing is, I use "Everything", a free instant file search tool from voidtools.
It is blazingly fast, just start typing and it finds files while you type.
It uses the NTFS file system's existing index (Windows only, sorry everyone else) to perform instant searches. It is hands down the fastest file search tool I have ever encountered: files are literally found as you type their names, without waiting even a millisecond.
So, no organization (the OCD part of me hates this), but I always find my files in an instant, no matter where I left them.
My file layout is quite uninteresting. The most noteworthy thing is that I have an additional toplevel directory /x/ where I keep all the stuff that would otherwise be in $HOME, but which I don't want to put in $HOME because it doesn't need to be backed up.
- /x/src contains all Git repos that are pushed somewhere. Structure is the same as wanted by Go (i.e., GOPATH=/x/). I have a helper script and accompanying shell function `cg` (cd to git repo) where I give a Git repo URL and it puts me in the repo directory below /x/src, possibly cloning the repo from that URL if I don't have it locally yet.
As I said, that's not in the backup, but my helper script maintains an index of checked-out repos in my home directory, so that I can quickly restore all checkouts if I ever have to reinstall. (A rough sketch of the URL-to-directory mapping follows this list.)
- /x/bin is $GOBIN, i.e. where `go install` puts things, and thus also in my PATH. Similar role to /usr/local/bin, but user-writable.
- /x/steam has my Steam library.
- /x/build is a location where CMake can put build artifacts when it does an out-of-source build. It mimics the structure of the filesystem, but with /x/build prefixed. For example, if I have a source tree that uses CMake checked out at /home/username/foo/bar, then the build directory will be at /x/build/home/username/foo/bar. I have a `cd` hook that sets $B to the build directory for $PWD, and $S to the source directory for $PWD whenever I change directories, so I can flip between source and build directory with `cd $B` and `cd $S`.
- /x/scratch contains random junk that programs expect to be in my $HOME, but which I don't want to backup. For example, many programs use ~/.cache, but I don't want to backup that, so ~/.cache is a symlink to the directory /x/scratch/.cache here.
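Regarding the /x/src layout above, the URL-to-directory mapping that `cg` relies on is simple; here is a rough Python sketch of the idea (the real helper is a script plus a shell function, and the URL handling here is simplified):

    # Sketch: map a Git URL to a GOPATH-style directory under /x/src, cloning if missing.
    import os
    import re
    import subprocess
    import sys

    SRC_ROOT = "/x/src"

    def repo_dir(url):
        """e.g. https://github.com/user/repo.git -> /x/src/github.com/user/repo"""
        path = re.sub(r"^(https?://|git@)", "", url)
        path = path.replace(":", "/")
        if path.endswith(".git"):
            path = path[: -len(".git")]
        return os.path.join(SRC_ROOT, path)

    if __name__ == "__main__":
        url = sys.argv[1]
        target = repo_dir(url)
        if not os.path.isdir(target):
            subprocess.run(["git", "clone", url, target], check=True)
        print(target)  # a shell wrapper would `cd` into this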
I use `mess` [1].
Short description: New stuff that is not filed away instantly goes into a folder "current", linked to the youngest folder in a tree (mess_root > year > week).
If needed at a later time: file it accordingly, otherwise old folders are purged if disk space is low.
Taking it a step further: syncing everything across work and personal machines using `syncthing`.
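The core of that scheme fits in a few lines; a sketch (paths here are hypothetical) that creates this ISO week's folder and repoints a "current" symlink at it:

    # Sketch: keep ~/current pointing at mess_root/<year>/<ISO week>, creating it as needed.
    import datetime
    import os

    MESS_ROOT = os.path.expanduser("~/mess")      # hypothetical mess_root
    CURRENT = os.path.expanduser("~/current")     # the "current" link

    def ensure_current():
        year, week, _ = datetime.date.today().isocalendar()
        target = os.path.join(MESS_ROOT, str(year), f"week-{week:02d}")
        os.makedirs(target, exist_ok=True)
        if os.path.islink(CURRENT):
            os.remove(CURRENT)
        os.symlink(target, CURRENT)

    if __name__ == "__main__":
        ensure_current()

Run from cron or a login script, this keeps "current" pointing at the newest week; purging old weeks when disk space is low would be a separate pass.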
└─Filename preserved, ordered by date or grouped in arbitrary functional folders
Drivers
├─Video
├─Sound
└─MB
Music
└─Primary Artist
└─YYYY.AlbumName (Keeps albums in date order)
└─AlbumName Track# Title.mp3 (truncates sensibly on a car stereo)
Pictures
└─YYYY-MM-DD.Event Description (DD is optional)
Projects
├─scripts - reusable across clients
│ └─language
│ └─purpose
└─clientname
├─source code
└─documents
Utils (single-executable files that don't require an install)
I use Beyond Compare as my primary file manager at home and work. Folder comparison is the easiest way to know if a file copy fully completed. Multi-threaded move/copy is nice too.
Besides the usual `Images`, `Videos`, and `code` directories, the single most important directory on my system is `~/flash` (as in flash memory). This is where my browser downloads files and where I create "daily" files, which I quickly remove.
This is a directory that can be emptied at any moment without fear of losing anything important, and it helps me keep the rest of my filesystem clean. Basically a `/tmp` for the user.
Most of my files stay in the download folder. If I think I will need them at a later stage, I upload them to my Google Drive. Google is quite good at searching stuff; for me that also works for personal files. I have probably 100 e-books that are on my reading list and will never get read by me...
Symlinking ~/Downloads and ~/Documents into ~/Dropbox is my only interesting upgrade. Across my various devices I have different things selectively synced. Large media files are the only things that don't live in Dropbox in some way or another. It's pretty convenient for mobile access (everything is accessible from web/mobile). I've done some worrying about sensitive documents and such, but most of it is also present in my email, so I think I lost that battle already. It also means there's very little downside to wiping my HD entirely if I want to try a different OS (which I used to do frequently, but I ended up settling on vanilla Ubuntu).
Organizing my files has been an obsession of mine for many years, so I've evolved what I think is a very effective system that combines the advantages of hierarchical organization and tagging. I use 3-character tags as part of every file's name. A prefix of tags provides a label that conveys the file's place in the hierarchy of all my files. To illustrate, here's the name of a text file that archives text-based communications I've had regarding a software project called 'Do, Too':
- pjt>sfw>doToo>cmm
'pjt' is my tag for projects
'sfw' is my tag for software and computer science
'doToo' is the name of this software project
'cmm' is my tag for interpersonal communications
Projects (tagged with 'pjt') is one of my five broad categories of files, with the others being Personal ('prs'), Recreation ('rcn'), Study ('sdg'), and Work ('wrk'). All files fall into one of these categories, and thus all file names begin with one of the five tags mentioned. After that tag, I use the '>' symbol to indicate that the following tag(s) is/are subcategories.
Any tags other than those for the main categories might follow, as 'sfw' did in the example above. This same tag 'sfw' is also used for files in the Personal category, for files related to software that I use personally--for example:
- prs>sfw>nameMangler@nts
Here, NameMangler is the name of the Mac application I use to batch-modify file names when I'm applying tags to new files. '@nts' is my tag for files containing notes. I also have many files whose names begin with 'sdg>sfw' and these are computer science or programming-related materials that I'm studying or I studied previously and wanted to archive.
A weakness of hierarchical organization is that it makes it difficult to handle files that could be reasonably placed in two or more positions in the hierarchy. I handle this scenario through the use of tag suffixes. These are just '|'-delimited lists of tags that do not appear in the prefix identifier, but that are still necessary to convey the content of the file adequately. So for example, say I have a PDF of George Orwell's essay "Politics and the English Language":
The suffix of tags begins with '=' to separate it from the rest of the file name. A couple of other features are shown in this file name. I use '_' to separate the prefix tags from the original name of the file ('orwell9' in this case) if it came from an outside source. I'm an English teacher and use this essay in class, and that's why the tags 'wrk' for Work and 'tfl' for 'Teaching English as a Foreign Language' appear. 'wrt' is my tag for 'writing', since Orwell's essay is also about writing. The tag 'georgeOrwell' is not strictly necessary since searching for "George Orwell" will pick up the name in the text content of the PDF, but I still like to add a tag to signal that the file is related to a person or subject that I'm particularly interested in. Adding a camel-cased tag like this also has the advantage that I can specifically search for the tag while excluding files that happen to contain the words 'George' and 'Orwell' without being particularly about or by him.
That last file name example also illustrates what I find to be a big advantage of this system: it reduces some of the mental overhead of classifying the file. I could have called the file 'wrk>tfl>politicsAndTheEnglishLanguage=sdg|wrt|lng|georgeOrwell', but instead of having to think about whether it should go in the "English teaching work-related stuff" slot or the "stuff about language that I can learn about" slot, I can just choose one more or less arbitrarily, and then add, as a suffix, the tags that would have made up the prefix I didn't choose.
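A small sketch of how such names could be parsed programmatically, assuming '>' separates the hierarchy tags, '_' precedes an original file name, and '=' introduces the '|'-delimited suffix tags, as described:

    # Sketch: split names like "prs>sfw>nameMangler@nts" or
    # "wrk>tfl>politicsAndTheEnglishLanguage=sdg|wrt|lng|georgeOrwell" into parts.
    import os

    def parse_name(filename):
        stem, ext = os.path.splitext(filename)
        stem, _, suffix = stem.partition("=")      # suffix tags come after '='
        stem, _, original = stem.partition("_")    # original name comes after '_', if any
        return {
            "hierarchy": stem.split(">"),
            "original_name": original or None,
            "suffix_tags": suffix.split("|") if suffix else [],
            "extension": ext,
        }

    print(parse_name("pjt>sfw>doToo>cmm.txt"))
    print(parse_name("wrk>tfl>politicsAndTheEnglishLanguage=sdg|wrt|lng|georgeOrwell.pdf"))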
There's actually a lot more to the system, but those are the basics. Hope you find it helpful in some way.
My main collection of files is for my startup, computing, applied math, etc. All those files are well enough organized. Here's how I do it and how I do related work more generally (I've used these techniques for years, and they are all well tested).
(1) Principle 1: For the relevant file
names, information, indices, pointers,
abstracts, keywords, etc., to the greatest
extent possible, stay with the old 7-bit
ASCII character set in simple text files
easy to read by both humans and simple
software.
(2) Principle 2: Generally use the
hierarchy of the hierarchical file system,
e.g., Microsoft's Windows HPFS (high
performance file system), as the basis
(framework) for a taxonomic hierarchy
of the topics, subjects, etc. of the
contents of the files.
(3) To the greatest extent possible, I do
all reading and writing of the files using
just my favorite programmable text editor
KEdit, a PC version of the editor XEDIT
written by an IBM guy in Paris for the IBM
VM/CMS system. The macro language is Rexx
from Mike Cowlishaw from IBM in England.
Rexx is an especially well designed
language for string manipulation as needed
in scripting and editing.
(4) For more, at times make crucial use of
Open Object Rexx, especially its function
to generate a list of directory names,
with standard details on each directory,
of all the names in one directory subtree.
(5) For each directory x, have in that
directory a file x.DOC that has whatever
notes are appropriate for good
descriptions of the files, e.g., abstracts
and keywords of the content, the source of
the file, e.g., a URL, etc. Here the file
type of an x.DOC file is just simple ASCII
text and is not a Microsoft Word document.
There are some obvious, minor exceptions,
that is, directories with no file named
x.DOC from me. E.g., directories created
just for the files used by a Web page when
downloading a Web page are exceptions and
have no x.DOC file.
(6) Use Open Object Rexx for scripts for
more on the contents of the file system.
E.g., I have a script that for a current
directory x displays a list of the
(immediate) subdirectories of x and the
size of all the files in the subtree
rooted at that subdirectory. So, for all
the space used by the subtree rooted at x,
I get a list of where that space is used
by the immediate subdirectories of x.
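That per-subdirectory size report is easy to reproduce in other languages too; here is a rough Python equivalent of the idea (the original is an Open Object Rexx script):

    # Sketch: for the current directory, list each immediate subdirectory and the
    # total size of all files in the subtree rooted at it.
    import os

    def subtree_size(root):
        total = 0
        for dirpath, _, names in os.walk(root):
            for name in names:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass
        return total

    for entry in sorted(os.scandir("."), key=lambda e: e.name):
        if entry.is_dir():
            print(f"{subtree_size(entry.path):>15,}  {entry.name}")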
(7) For file copying, I use Rexx scripts
that call the Windows commands COPY or
XCOPY, called with carefully selected
options. E.g., I do full and incremental
backups of my work using scripts based on
XCOPY.
For backup or restore of the files on a
bootable partition, I use the Windows
program NTBACKUP which can backup a
bootable partition while it is running.
(8) When looking at or manipulating the
files in a directory, I make heavy use of
the DIR (directory) command of KEdit. The
resulting list is terrific, and common
operations on such files can be done with
commands to KEdit (e.g., sort the list),
select lines from the list (say, all files
x.HTM), delete lines from the list, copy
lines from the list to another file, use
short macros written in Kexx (the KEdit
version of Rexx), often from just a single
keystroke to KEdit, to do other common
tasks, e.g., run Adobe's Acrobat on an
x.PDF file, have Firefox display an x.HTM
file.
More generally, with one keystroke, have
Firefox display a Web page where the URL
is the current line in KEdit, etc.
I wrote my own e-mail client software.
Then given the date header line of an
e-mail message, one keystroke displays the
e-mail message (or warns that the date
line is not unique, but it always has
been).
So, I get to use e-mail message date lines
as 'links' in other files. So, if some
file T1 has some notes about some subject
and some e-mail message is relevant, then,
sure, in file T1 just have the date line
as a link.
This little system worked great until I
converted to Microsoft's Outlook 2003. If
I could find the format of the files
Outlook writes, I'd implement the feature
again.
(9) For writing software, I type only into
KEdit.
Once I tried Microsoft's Visual Studio and
for a first project, before I'd typed
anything particular to the project, I got
50 MB or so of files nearly none of which
I understood. That meant that whenever
anything went wrong, for a solution I'd
have to do mud wrestling with at least 50
MB of files I didn't understand; moreover,
understanding the files would likely have
been a long side project. No thanks.
E.g., my startup needs some software, and
I designed and wrote that software. Since
I wrote the software in Microsoft's Visual
Basic .NET, the software is in just simple
ASCII files with file type VB.
There are 24,000 programming language
statements and about 76,000 lines of
comments for documentation, which is
IMPORTANT.
So, all the typing was done into KEdit,
and there are several KEdit macros that
help with the typing.
In particular, for documentation of the
software I'm using -- VB.NET, ASP.NET,
ADO.NET, SQL Server, IIS, etc. -- I have
5000+ Web pages of documentation, from
Microsoft's MSDN, my own notes, and
elsewhere.
So, at some point in the code where some
documentation is needed for clarity for
the code, I have links to my documentation
collection, each link with the title of
the documentation. Then one keystroke in
KEdit will display the link, typically
have Firefox open the file of the MSDN
HTML documentation.
Works great.
The documentation is in four directories,
one for each of VB, ASP, SQL, and Windows.
Each directory has a file that describes
each of the files of documentation in that
directory. Each description has the title
of the documentation, the URL of the
source (if from the Internet which is the
usual case), the tree name of the
documentation in my file system, an
abstract of the documentation, relevant
keywords, and sometimes some notes of
mine. KEdit keyword searches on this file
(one for each of the four directories) are
quite effective.
(10) Environment Variables
I use Windows environment variables and
the Windows system clipboard to make a lot
of common tasks easier.
E.g., the collection of my files of
documentation of Visual Basic is in my
directory
H:\data05\projects\software\vb\
Okay, on the command line of a console
window, I can type
G VB
and then have that directory current.
Here 'G' abbreviates 'go to'!
So, to command G, argument 'VB' acts like
a short nickname for directory
H:\data05\projects\software\vb\
Actually that means that I have --
established when the system boots -- a
Windows environment variable MARK.VB with
value
H:\data05\projects\software\vb\
I have about 40 such MARK.x environment
variables.
So, sure, I could use the usual Windows
tree walking commands to navigate to
directory
H:\data05\projects\software\vb\
but typing
G VB
is a lot faster. So, such nicknames are
justified for frequently used directories
fairly deep in the directory tree.
Environment variables
MARK.TO
MARK.FROM
are used by some other programs,
especially my scripts that call COPY and
XCOPY.
So, to copy from directory A to directory
B, I navigate to directory A and type
MARK FROM
which sets environment variable
MARK.FROM
to the directory tree name of directory A.
Similarly for directory B.
Then my script
COPYFT1.RXS
takes as argument the file name and does
the copy.
My script
COPYFT2.RXS
takes two arguments, the file name of the
source and the file name to be used for
the copy.
I have about 200 KEdit macros and about
200 Rexx scripts. They are crucial tools
for me.
(11) FACTS
About 12 years ago I started a file
FACTS.DAT. The file now has 74,317 lines,
is 2,268,607 bytes long, and has 4,017
facts.
Each such fact is just a short note,
sure, on average
2,268,607 / 4,017 = 565
bytes long and
74,317 / 4,017 = 18.5
lines long.
And that is about
12 * 365 / 4,017 = 1.09
days per fact, that is, an average of
right at one new fact a day.
Each new fact has its time and date, a
list of keywords, and is entered at the
end of the file.
The file is easily used via KEdit and a
few simple macros.
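A sketch of how appending to and searching such a flat facts file could work (the exact record format here is guessed; the original is plain ASCII driven by KEdit macros):

    # Sketch: append a timestamped, keyword-tagged fact to a flat file, or search it.
    import datetime
    import sys

    FACTS = "FACTS.DAT"
    SEPARATOR = "=" * 40

    def add_fact(keywords, text):
        with open(FACTS, "a", encoding="ascii") as f:
            f.write(SEPARATOR + "\n")
            f.write(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S") + "\n")
            f.write("keywords: " + ", ".join(keywords) + "\n")
            f.write(text.rstrip() + "\n")

    def find_facts(keyword):
        with open(FACTS, encoding="ascii") as f:
            blocks = f.read().split(SEPARATOR + "\n")
        return [b for b in blocks if keyword.lower() in b.lower()]

    if __name__ == "__main__":
        if sys.argv[1] == "add":               # e.g. add "phone,alice" "Alice: 555-0100"
            add_fact(sys.argv[2].split(","), sys.argv[3])
        else:                                  # otherwise, a keyword to search for
            print("\n".join(find_facts(sys.argv[1])))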
I have a little Rexx script to run KEdit
on the file FACTS.DAT. If KEdit is
already running on that file, then the
script notices that and just brings to the
top of the Z-order that existing instance
of KEdit editing the file -- this way I
get single threaded access to the file.
So, such facts include phone numbers,
mailing addresses, e-mail addresses, user
IDs, passwords, details for multi-factor
authentication, TODO list items, and other
little facts about whatever I want help
remembering.
No, I don't need special software to help
me manage user IDs and passwords.
Well, there is a problem with the
taxonomic hierarchy: for some files, it
might be ambiguous which directory they
should be in. Yes, some hierarchical file
systems permit a file to be listed in more
than one directory, but AFAIK the
Microsoft HPFS file system does not.
So, when it appears that there is some
ambiguity in what directory a new file
should go, I use the x.DOC files for those
directories to enter relevant notes.
For ebooks I created folders for main categories and some sub-categories (inspired by Amazon.com or some other ebook shop's structure).
For photos, folders per device/year/month.
For Office documents, prepending the date using the ISO date format (2017-06-21 or 170621) works great for sharing with others over various channels like mail/chat/fileserver/cloud/etc.