
Prefer associative ontologies to hierarchical taxonomies - dredmorbius
https://notes.andymatuschak.org/z29hLZHiVt7W2uss2uMpSZquAX5T6vaeSF6Cy
======
motohagiography
The reason I use hierarchical models is because they provide more powerful
abstraction than purely associative ontologies. I do use ontologies for
defining logical objects in an organization (people, document types, costs,
risks, etc) but I don't trust end users to invent their own because, unless
they have a background in it, they just re-create the same narrow taxonomies
they use in spreadsheets.

An example would be how a corporate wiki/confluence and sharepoint sites are
basically documentation landfills that support labour intensive ad hoc
processes without a lot of intelligence. In contrast, a library of functions,
microservices directory, data dictionaries, tool kits and categories of
processes are an abstraction layer over those landfill elements that enables
clarity and scale.

The skill and intuition to aggregate things into abstractions isn't common, so
most people don't rely on it, but when you have it, it's really valuable. A
rule I use in building ontologies is that a thing without a type is just a
poor design decision that leads to conceptual debt, and a type without a thing
is just a thing.

------
charlysl
Hierarchies scale well, though, and are often found in complex systems, both
natural and manmade, where this is an issue.

This classic paper (The Architecture of Complexity, by Herbert Simon, Nobel
prize winner) explores hierarchy as an organizing principle in complex
systems:

[https://www.andrew.cmu.edu/course/15-440/READINGS/simon-arch...](https://www.andrew.cmu.edu/course/15-440/READINGS/simon-architecture-of-complexity-1962.pdf)

~~~
throwaway4585
>Hierarchies [...] are often found in [natural] systems

Nah they aren't. The 'tree' of life is more of a full-fledged graph and the
whole kingdom-order-genus-species thing is muddy and ill-defined. Most of the
'hierarchy' we find in natural systems is just our own projected deformation
of much more complex relationships.

~~~
throwaway_pdp09
Once you get complexity past the ability to do horizontal transmission, and
past the point where divergent species can no longer interbreed, it pretty
much becomes a tree I think. If not, please show me where 2 branches rejoin
where there's no H transmission nor mutual fertility.

> the whole kingdom-order-genus-species thing is muddy and ill-defined

Well yes, it's somewhat an arbitrary construct for human convenience, but
isn't much of the muddiness down to our limited understanding of the true
relationships?

Regarding "cadherin/catenin or how epithelia are formed to see how a tissue is
much more than a bundle of cells." could you give a link cos this is so far
outside my area I really don't know what to look for (wiki article on Catenin
wasn't very revealing of your point, sorry if I missed it)

~~~
klyrs
> If not, please show me where 2 branches rejoin where there's no H
> transmission nor mutual fertility.

Viral infections modify host DNA, and can also incorporate host DNA -- this is
a mechanism by which lateral gene transfer can occur between macroscopic
organisms. The platypus has mammal, reptile and avian DNA, and they seem to
have come about long after those branches diverged -- evidence that lateral
gene transfers occur between macroscopic organisms.

~~~
throwaway_pdp09
WTF horizontal gene transfer = platypus!?!?

Show me some paper on this. I mean an actual scientific paper, not a paper
used to roll a yard-long spliff.

~~~
klyrs
Google is free and your aggro stance doesn't encourage me to satisfy your
petulant demand.

~~~
catalogia
Relating to platypus and horizontal gene transfer, I found this, which seems
to be not quite what you're talking about: Horizontal transfer of BovB and L1
retrotransposons in eukaryotes. Genome Biology, 19(1).
doi:10.1186/s13059-018-1456-7

Regardless, horizontal gene transfer certainly does happen.

~~~
throwaway_pdp09
Your paper is what the article I quoted seems to have been based on. It also
mentions L1 and BovB. Well, I have some serious taking back to do!

~~~
pacman83
If you'd like a curated collection of _many_ papers on horizontal gene
transfer in eukaryotes, it comprises much of the evidence presented at
[https://www.panspermia.org/archindex.htm](https://www.panspermia.org/archindex.htm)
. Having followed this collection since 2010, I have gotten the sense that
horizontal gene transfer to/from eukaryotes is not only common but an
important mechanism of evolution.

~~~
throwaway4585
Yeah most people think evolution is just natural selection and accidental
mutations, which is a high schooler's understanding of it. I'm not being
derogatory, it's literally what I learned in high school and I don't blame
people for not digging further if they haven't taken classes afterwards.

------
battery_cowboy
I don't get why we don't use an S3 like filesystem with tags now, folders are
very limiting. Get rid of folders, tag things properly, and then make a good
search interface. Then you add a tag for each file that has its old path,
"/user/bin" for example, so that legacy folder structure still exists
somewhere for legacy programs.
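
To make that concrete, here is a rough sketch of a flat, tag-only store with the old path kept as just another tag. `TagStore`, its methods and the `path:` prefix are names invented for illustration, not any real filesystem or S3 API, and "/user/bin" is simply the example path from above.

```python
from collections import defaultdict


class TagStore:
    """Toy flat store: every file has an id and a set of tags, no folders."""

    def __init__(self):
        self._tags_by_file = defaultdict(set)  # file id -> tags
        self._files_by_tag = defaultdict(set)  # tag -> file ids (inverted index)

    def add(self, file_id, *tags):
        for tag in tags:
            self._tags_by_file[file_id].add(tag)
            self._files_by_tag[tag].add(file_id)

    def search(self, *tags):
        """Return the ids of files carrying all of the given tags."""
        if not tags:
            return set(self._tags_by_file)
        matches = [self._files_by_tag.get(tag, set()) for tag in tags]
        return set.intersection(*matches)

    def list_legacy_dir(self, path):
        """Emulate a folder listing by looking up the hypothetical 'path:' tag."""
        return sorted(self.search("path:" + path))


store = TagStore()
store.add("ls", "executable", "coreutils", "path:/user/bin")
store.add("grep", "executable", "text-search", "path:/user/bin")
store.add("notes.txt", "text", "path:/home/me")

print(sorted(store.search("executable", "coreutils")))  # ['ls']
print(store.list_legacy_dir("/user/bin"))               # ['grep', 'ls']
```

Legacy programs would only ever call something like `list_legacy_dir`; everything else goes through tag search.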

------
paulsutter
This note-keeping system is really nice, super fast navigation, would strongly
prefer over Notion which is clunky and unnaturally slow.

Consider supporting him on Patreon:
[https://www.patreon.com/quantumcountry](https://www.patreon.com/quantumcountry)

Apparently the note-keeping system is a work in progress:

> PS: Many people ask, so I’ll just note here: no, I haven’t made this system
> available for others to use. It’s still an early research environment, and
> Premature scaling can stunt system iteration.

~~~
meowface
As a Notion user, my immediate reaction was also "where can I get this". I've
never seen a UI/UX as nice as that one. If he made a business out of it, I bet
people would be begging to throw money at him.

~~~
andymatuschak
That's kind; thank you. Perhaps someday, but for now, I mostly want to
understand what "notes" actually are / should be.

~~~
meowface
I understand. I think what I and perhaps many other readers are primarily
interested in is the raw link-following interface that enables quick jumping
around to and from interconnected things, rather than note-taking or the
methodology or philosophy of note-taking.

For example, if TV Tropes switched to an interface like this, I bet way more
people would get lost down rabbit holes, and they'd go deeper and last longer.

Admittedly, I'm not much of a note-taker, let alone an Evergreen note-taker,
and so far I haven't been sold on the idea. Essentially, I'm just too lazy.
(I've found a lot of other very insightful information in this quasi-blog,
though.)

I'm more hoping for a good neural interface to come out within the next five
or so decades, so that this whole process can happen with very little effort
and work almost automatically. Potentially insight-generating prior thoughts
are pretty lossy when they're just in your head, as you describe it, but a
neural interface and associated organization system could possibly eliminate
this lossiness problem. I acknowledge it may take a lot longer than five
decades before such a thing is practical and seamless, but it's just a hope.

At the end of the day, what I think I personally want right now is a hybrid
note-taking, list-making, bookmark/document-organizing, mindmapping, knowledge
base/wiki, table/chart-making, scheduling, planning app with an interface
similar to this one, where it's easy and fast to follow different nodes, see
how they relate, and unwind the stack at any point. I understand such a
project is probably way beyond the scope of anything you're trying to do, let
alone something you want to build a whole business around.

For reference, a lot of what I do is investigative and research work
(generally centered around network security). This rarely involves long-form
prose, and is more like trying to map and discover connections between lots of
different kinds of small entities and findings. But I'd also find this
valuable for more typical business use cases, like what Notion is generally
used for.

Basically, I just wish apps like Notion, Coda, and Nuclino would implement an
interface that's as smooth and useful as this one. These apps are all
basically just trying to clone each other and are using interfaces that
could've been designed in the 90s, rather than trying to fundamentally alter
the experience.

------
wenc
There are 3 practical relationship roles in knowledge modeling: parent-child,
associate, source-target.

Parent-child is hierarchy. Associate is undirected peer. Source-target is less
useful in general, but it roughly represents a directed peer relationship
(e.g. A exports this good to B, and B exports that service to A -- there's a
direction).

Associative relationships are very easy. Bob is friends with Mary, and with
Pete. But they are also very loose and do not convey any information around
belonging/precedence/ordering, which is a major downside. Associations don't
have sufficient representational power for capturing structures.

Hierarchical relationships are very useful for expressing structure. For
modeling real knowledge, you also need the ability to represent _multiple
hierarchies_. For example, Bob is Dan and Chloe's father (parent-child in one
context), and Bob is also Beth and Joe's boss (parent-child in another
context). It can also happen in reverse -- Beth is Bob's superior in a social
club they both belong to (yet another context). Multiple hierarchies appear
everywhere yet we don't often recognize them as such.

Multiple hierarchies are sometimes captured through tags (e.g. Gmail
categories) or in the UNIX file system, via symbolic links. But apart from
those examples, I haven't seen any widespread recognition of _multiple
hierarchies_.

Most techniques/software tend to be designed for single hierarchies (e.g.
outlines). This simplicity appeals to how the human brain processes
information. It also only assumes a single context at a time which may be too
simplistic and underpowered in many situations.

~~~
catalogia
> _There are 3 practical relationship roles in knowledge modeling: parent-
> child, associate, source-target. Parent-child is hierarchy. Associate is
> undirected peer. Source-target is less useful in general, but it roughly
> represents a directed peer relationship (e.g. A exports this good to B, and
> B exports that service to A -- there's a direction)._

I feel like this is missing something, namely in modelling relationships where
a single child can have an arbitrary number of parents _in a single context_
(or source-target without cycles.) I don't think that's adequately captured by
multiple-hierarchies when you can't break the parents into discrete categories
(e.g. mothers and fathers) and thereby into discrete hierarchies. If you try
to model a DAG like that as multiple hierarchies by somehow imposing an order
onto the set of parents, you'll get yourself into a quagmire.

~~~
wenc
I'm not sure I follow completely but I'd like to explore this. To make things
a bit more concrete, what is an example of what you're describing?

~~~
catalogia
In a version control system, a commit might have an arbitrary number of
parents. Furthermore a single commit might have an arbitrary number of
relationships to another commit (e.g. parents can also be grandparents, etc)
Trying to reason about such a system as multiple-hierarchies will have you
reasoning about a potentially huge number of hierarchies if the graph is well
connected. I think it's more straight forward to reason about the system as a
single graph, rather than a huge set of spanning trees.
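
A small sketch of what I mean, using a made-up five-commit history (the commit ids and the `parents` mapping are hypothetical, not taken from any real repository or VCS API). Treated as one graph, finding ancestors is a single walk. Treated as separate parent chains, the same history already contains three distinct lineages from "e" back to the root.

```python
# Hypothetical commit graph: commit id -> list of parent ids.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # merge commit with two parents
    "e": ["d", "b"],  # "b" is a parent of "e" and also a grandparent via "d"
}


def ancestors(commit, parents):
    """Collect every ancestor of `commit` with one iterative walk of the graph."""
    seen = set()
    stack = list(parents[commit])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def count_chains(commit, root, parents):
    """Count the distinct parent chains leading from `commit` back to `root`."""
    if commit == root:
        return 1
    return sum(count_chains(p, root, parents) for p in parents[commit])


print(sorted(ancestors("e", parents)))  # ['a', 'b', 'c', 'd']
print(count_chains("e", "a", parents))  # 3
```

The graph view stays at five nodes and six edges, while the number of chains keeps multiplying with every merge.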

~~~
wenc
> I think it's more straight forward to reason about the system as a single
> graph,

So I'm still not sure I totally get the objection -- please help me out here.
In my mind, multiple hierarchies _are_ represented by a single graph.

When one reasons about the system, one will naturally only consider single
hierarchies at a time for simplicity. The way to do this is center yourself on
a node at a time, and then from that vantage point, traverse the node in
different directions.

In the system, each edge belongs to one of only 3 categories: P ("parent", directed),
A ("associate", undirected), S ("source", directed). We're only considering
hierarchies, so we'll restrict ourselves to "P". Each node can have an
arbitrary number of relationships with other nodes. A family and work
relationship could look like this.

You -P-> your kid 1

You -P-> your kid 2

Mother -P-> You, Sister

Father -P-> You, Sister

Boss -P-> You

Boss -P-> Colleague

Company President -P-> Boss

Company -P-> Company President

Seems to me it should be easy to reason about multiple parents in different
contexts -- they all belong in the same graph.

Wikipedia sort of does this with its Categories box at the bottom of every
page, but Wikipedia only has one kind of relationship category: P.
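
As a rough sketch, the edge list above can be loaded into one graph and queried from any node. The tuple layout and the helper names (`children`, `parents_of`, `upward`) are my own choices for illustration, and the node names are shortened. Only P edges are used here.

```python
from collections import defaultdict

# (source, kind, target) triples; kind is "P", "A" or "S" as described above.
edges = [
    ("You", "P", "Kid 1"),
    ("You", "P", "Kid 2"),
    ("Mother", "P", "You"),
    ("Mother", "P", "Sister"),
    ("Father", "P", "You"),
    ("Father", "P", "Sister"),
    ("Boss", "P", "You"),
    ("Boss", "P", "Colleague"),
    ("Company President", "P", "Boss"),
    ("Company", "P", "Company President"),
]

children = defaultdict(list)    # parent -> direct children
parents_of = defaultdict(list)  # child -> direct parents
for src, kind, dst in edges:
    if kind == "P":
        children[src].append(dst)
        parents_of[dst].append(src)


def upward(node):
    """Every node reachable from `node` by walking P edges backwards."""
    seen = set()
    stack = list(parents_of.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents_of.get(current, []))
    return seen


print(parents_of["You"])      # ['Mother', 'Father', 'Boss']
print(sorted(upward("You")))  # ['Boss', 'Company', 'Company President', 'Father', 'Mother']
```

Note that nothing in this edge list records which context (family or work) a P edge belongs to. Adding a context label to each edge would be an extension beyond what I described.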

------
volume
Translation to MS Outlook users: don't file individual emails into different
folders. Tag the messages or just search.

------
wellpast
> Things don’t always fit exactly. Maybe once enough new ideas are collected,
> a new category would emerge… except you can’t see its shape because
> everything’s already been sorted.

This is so very true and precisely why PL type systems (in their common usage,
which involve creating ADTs/classes/records -- taxonomies) are so bad for the
necessary creative evolution that software systems must undergo to keep up
with the dynamic real world.

Dynamic entities, untyped maps, allow for creative evolution; type-oriented
ADTs ubiquitous in typed PLs blind you and _bind_ you in exactly the way this
post describes.

~~~
jmeister
As someone looking to learn Haskell, I was pondering over this from another
post on the frontpage:
[https://news.ycombinator.com/item?id=22843181](https://news.ycombinator.com/item?id=22843181)

Just out of curiosity, do you have significant experience in a statically-typed
FP? What’s your favourite language right now? Your comment reminds me of Rich
Hickey’s famous talk, so I would guess Clojure?

~~~
wellpast
I have a similar take on Haskell and I do think its adherents are clinging
more to a near-religious commitment to the language and its
counterproductive purity.

That said I think

> I feel learning Haskell was worthless.

is a little harsh. I suppose b/c I don't think learning (anything) is ever
worthless.

Yes I'm a Clojurist, per se. I agree with Rich Hickey, who will even say
that it's fundamentally not about the specific programming language.

However I do think Haskell falls into a different category. Most PLs at least
make pragmatic concessions. PLs with strong static typing take a
fundamentalist approach that does not give any flex and therefore I find anti-
pragmatic.

But why not learn a strong static PL? Why take my or anyone else's word for
it? The best way to discover the right tools is to pick them all up and wield
them. If _outcome_ is your goal, if productivity is your sincere goal (as
opposed to something else like intellectual curiosity or purposeful
rigidity...b/c you're, say, afraid of change) then I do think you will
gravitate away from static typing in many cases. Especially in the prevalent
case of development of web and distributed data systems.

~~~
wellpast
And to answer your question, my experience in strongly typed FP was
building a few applications in Elm, a derivative/cousin of Haskell.

------
yewenjie
Nice to see his website has org-roam like backlinks to each note.

------
anovick
After thinking about this issue for a long time, I came up with a solution of
my own.

The solution I came up with is a personal knowledge base made up of files
assigned with multiple categories. Categories form a hierarchy and the system
strictly preserves this hierarchy and takes it into consideration with minimal
user intervention:
[https://github.com/amitnovick/catalog](https://github.com/amitnovick/catalog)

Read about my journey and thoughts on it here:
[https://dev.to/amitnovick/a-catalog-of-your-files-2nd7](https://dev.to/amitnovick/a-catalog-of-your-files-2nd7)

------
modwest
The most interesting applications of this idea lie outside software
development, in my opinion.

Often, I find, hierarchical taxonomies are “the box” I need to “think outside”
of.

------
smitty1e
Just flatten your hierarchical paths to associative keys. Load your tree into
a dictionary. Declare victr'y.
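
Roughly like this, assuming the tree is a nested dict whose leaves are the stored values (the example tree and the `flatten` helper are made up for illustration):

```python
def flatten(tree, prefix=""):
    """Turn a nested dict into a flat dict keyed by slash-joined paths."""
    flat = {}
    for name, value in tree.items():
        path = f"{prefix}/{name}"
        if isinstance(value, dict):
            # Recurse into the subtree, carrying the path so far as the prefix.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


tree = {
    "home": {"me": {"notes.txt": "evergreen", "todo.txt": "transient"}},
    "etc": {"hosts": "localhost"},
}

flat = flatten(tree)
print(flat["/home/me/notes.txt"])  # evergreen
print(sorted(flat))                # ['/etc/hosts', '/home/me/notes.txt', '/home/me/todo.txt']
```

One caveat of this sketch: empty subtrees produce no keys, so an empty folder disappears once flattened.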

------
chrisweekly
IMHO they both have their place, and systems that support both align best to
how many people's minds work.

------
chadlavi
Predictable hierarchical taxonomies plus associative groupings like tags plus
search

~~~
modwest
That is a very efficient mechanism for building cognitive structures that
entrench oneself in orthodoxy, in my opinion.

~~~
chadlavi
What does this even mean?

~~~
modwest
the fact you're asking means you wouldn't understand the answer

------
dredmorbius
Related, podcast with the author, touches on some of the topics here:

[https://www.perell.com/podcast/andy-matuschak](https://www.perell.com/podcast/andy-matuschak)

------
kristianp
It would be useful if he provided examples. What makes an "evergreen note" as
opposed to a transient one?

------
tgbugs
In general I agree with the sentiment expressed in TFA and generally try to
follow it when building new ontologies because asserted subClassOf hierarchies
create a huge impedance mismatch between how the computer interprets them and
what the designer thought they meant (e.g. I've seen people use subClassOf
hierarchies to represent a partonomy, but more subtle variants in what "is a"
means within a single hierarchy can wreak havoc on usability). It also allows
you to defer the deeper modelling until a use case presents itself and build
your hierarchies from the underlying structure of the domain (with data)
rather than trying to impose an almost surely incorrect hypothesis about the
structure of the domain from the top down (without data).

That said, there are a number of cases where hierarchical taxonomies are vital
in building information systems. Some examples.

A use case where you need a way to guarantee that a space is completely
covered without duplicates (important for accounting, or creating menus).
Having a single non-overlapping reference space (like a map) is critical for
clear communication.
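
A minimal sketch of that coverage check, with invented expense items and category names: given overlapping, tag-like categories you have to verify coverage and uniqueness after the fact, whereas a single-parent mapping rules out duplicates by construction.

```python
from collections import Counter


def check_partition(universe, categories):
    """Return (uncovered, duplicated) items for a proposed set of categories."""
    counts = Counter(item for members in categories.values() for item in members)
    uncovered = set(universe) - set(counts)
    duplicated = {item for item, n in counts.items() if n > 1}
    return uncovered, duplicated


universe = {"rent", "salaries", "travel", "software"}

# Overlapping categories: "travel" is filed twice, so totals would double-count it.
categories = {
    "facilities": {"rent"},
    "people": {"salaries", "travel"},
    "tools": {"software", "travel"},
}
print(check_partition(universe, categories))  # (set(), {'travel'})

# Single-parent hierarchy: each item maps to exactly one category,
# so duplicates cannot occur and only coverage needs checking.
parent = {"rent": "facilities", "salaries": "people", "travel": "people"}
print(sorted(universe - set(parent)))  # ['software']
```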

A use case where you need to capture some semi hierarchical knowledge from
domain experts, e.g. an org chart, or the parts of a car, or the parts of the
brain.

Org charts are the perfect example of the tradeoff. Consider the US Department
of Defense (don't actually do this, you will lose your mind). The chain of
command is famously a single parent hierarchy, and if it is not, it is a sign
that something can go wrong due to conflicting orders. However, let's say I
wanted to know which branch of the military funded a certain research project
based on the office that wrote the RFP. This question is pretty much
impossible to answer for an arbitrary RFP, and the time and effort needed to
maintain the full associative ontology and keep it up to date is stupefyingly
huge, AND it is not even clear that such an ontology would actually be
pointing to anything that was actually meaningful in the real world (beyond
ill defined social and economic relationships between arbitrary groups of
primates).

Hierarchies are an effective way to collect knowledge from domain experts in a
systematic way that does not require them to know that they are writing down a
bunch of axioms, but instead can just draw a diagram -- multiple hierarchies
are very important here, because hierarchies from different experts usually
agree at a high level, and then differ in the details, often due to the use
case for the knowledge or from the experimental perspective, not because there
is some fundamental ontological difference. In this sense having multiple
hierarchies is a way around the problem of ambiguity in the meaning of "is a".

------
stillbourne
Which is just a fancy way of saying favor composition over inheritance.
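
A toy illustration of the analogy, with class names made up for the example: the inheritance version forces each note into one slot of a class tree, while the composed version just attaches tags and links.

```python
from dataclasses import dataclass, field


# Inheritance: every note has to pick exactly one place in the class tree.
class Note: ...
class MeetingNote(Note): ...
class ResearchNote(Note): ...  # where would a research meeting go?


# Composition: a note is assembled from independent parts.
@dataclass
class Link:
    target: str


@dataclass
class ComposedNote:
    title: str
    tags: set = field(default_factory=set)
    links: list = field(default_factory=list)


note = ComposedNote("Lab sync", tags={"meeting", "research"})
note.links.append(Link("Prefer associative ontologies"))

print(sorted(note.tags))     # ['meeting', 'research']
print(note.links[0].target)  # Prefer associative ontologies
```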

