Building an effective tagging system can be much harder than people realize. I once worked on a tagging system for a collection of math problems. I thought I could code a simple tagging model, let users tag their own math problems, and make it much easier to find the problems you're most interested in.
Then I realized that tags like algebra 1, Algebra 1, Algebra I, Alg I, and all other variations should mean the same thing. So I started to develop a closed set of tags. That led to a fascinating rabbit hole about taxonomies that I don't even remember how to speak about clearly at this point.
That project is still a work in progress, and it's left me with immense respect for people who build well-structured systems that involve tagging.
Two impressive site-wide systems I've seen are the categories of Wikimedia Commons (multimedia) and the tags of Archive of Our Own (fanfiction). The Commons guideline[0] explains its system, and the interesting ontological theory behind it, well. Its scope is extremely broad, aiming to simultaneously include any possibly useful categorization scheme,[1] and overall it is a fairly freeform, (ideally) directed acyclic graph. Variations are handled with redirects and disambiguation pages in a typical wiki manner, with the limitation that individual category uses must have the canonical name. Ao3, in contrast, has a schema of sorts, and synonyms are made equivalent during resolution (its tags FAQ[2] is also an interesting read).
I tried to write a more thorough comment but also struggled with being coherent. Thus, some ideas, only briefly:
- At an even higher level, the web itself and the overlapping userbases/communities ('intersectionality', without the discrimination--the original set-theory kind?) of individual sites can also be considered a way of organizing content
- Thus, analogously: Search engines replaced directories and webrings as algorithms did tags. The present SEO meta, though...
- Generalizing from Commons, all Wikimedia wikis (Wikipedia, etc.) have parallel category structures, only less developed due to the greater reliance on links. So do most wikis in general, though Wikimedia also unifies categorization and structured data with Wikidata. From there you get to knowledge graphs and databases in general, wrapping back around to Google trying to determine the Knowledge Graph item that each query refers to.
[1] all the typical keying on depicted people, things, times, and places, plus the ways that we categorize those. Niches from 'horizontal bicolor blue and white flags' to 'Luxembourgish pronunciation by gender', 'trams on route 709', 'ships with 6 funnels'. There's a tool (now called vCat) to visualize categories, some outputs here: https://commons.wikimedia.org/wiki/Category:Wikimedia_catgra...
On a similar note, Danbooru-style image boards often have highly developed tagging systems, ranging from tags for specific characters or artists to tags for art styles, poses, or even specific features which happen to appear in the artwork (like "hat bow" or "blue eyes").
I ended up building out a hierarchy as well. But figuring out the structure of that hierarchy was not trivial at all. How does the name of a repeatable class (Algebra 1) fit with the name of a specific class (Algebra 1 Fall 2020 Section 2)? How does that relate to an area of math like algebra, geometry, or number theory? How does that relate to things like context (i.e. problems about Minecraft, Lego, physics, etc.)?
I developed a closed system of tags, and then gave people the ability to define aliases.