Wiki Bankruptcy (critter.blog)
168 points by mcrittenden on Aug 10, 2020 | 139 comments

This is one of those articles that is pure nonsense but helps drive a rich HN discussion on the topic.

This is a very bad idea. Imagine what would happen if Wikipedia did that. Fortunately, they don't; they found other ways to fix the problem.

The issue here is: part of the documentation is no longer up to date.

The underlying issue is: the culture of the company probably doesn't encourage written documentation and probably prefers 1-1 communication. As a result, documentation goes out of date.

The right solution is: don't share documentation 1-1, always write it on the wiki, and give people the ability to flag that some part of a page is out of date so that the owner of the document can fix it.

This is not a side issue, especially in the "new" world in which a lot of organizations need to work remotely.

Don't share documentation 1-1: Set your Slack retention ridiculously low to force people to put information in documentation if they wish it to be retained at all.

This is one of those instances of pure nonsense that I hope will drive rich discussion.

As with every other attempt to "force people" to do something via UX limitations, I think that's doomed to fail.

The whole notion of forcing users to change their behavior by placing obstacles in their way is futile in my experience. It's a grave misunderstanding of who users are and how to design for them.

Forcing users to do what you want them to is how you get desire paths [1]

[1]: https://en.wikipedia.org/wiki/Desire_path

> Set your Slack retention ridiculously low to force people to put information in documentation if they wish it to be retained at all.

I worked at a company that tried something similar.

The result was that individual teams set up their own off-the-record private Slack workspaces or Discords so they could regain some control over their conversation history. Obviously not great from a company information security perspective, but many managers looked the other way because they just wanted to get their work done without worrying about conversation history arbitrarily disappearing.

The idea that you can force people to produce better documentation by making their lives more difficult never really works out.

The issue with 1-1 communication is not the retention part but the exclusive and incomplete part. If you have no retention but allow 1-1, the same person will just keep asking the same question over and over, which will make everything worse.

Also, as the sibling comment says, you won't change the culture of an organization with only a UX change or a software change. That can help, but it won't be enough.

When you say don’t share documentation 1:1, do you mean always share broadly? Or do you mean always write it down?

Terrible idea indeed. Why would I opt for a one year storage option when I could send an email blast with the info and have it retrievable for a lot longer?

This would push people away from a wiki.

I think what this idea is calling out is that whatever you sent in the email a year ago is probably stale now - it answers the question for that time, but not necessarily for right now, whereas a wiki should have versioning.

Unfortunately, once you update the wiki, it grows stale again after that moment, so in some ways it's intractable :)

Perhaps that’s not a bad thing.

You want to encourage thorough and up to date documentation? Don't allow code to be merged without docs. Review documentation at least as stringently as code. Insist on clear exposition, good grammar, and accuracy. If you're not willing to pay that price, your docs will only be as good as your programmers' intrinsic motivation to write them.

The fact that I have never been able to get buy-in for this proposal would argue that our docs may actually be good enough.
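
As a rough illustration, a CI gate like that could be as simple as the sketch below; the file extensions, the docs/ layout, and feeding it the output of something like `git diff --name-only main...HEAD` are all assumptions, not anyone's actual setup.

```python
# Sketch: fail CI when a change set touches code but no documentation.
# CODE_EXTS, DOC_PREFIXES, and the repo layout are invented for illustration.

CODE_EXTS = (".py", ".go", ".rs", ".java", ".ts")
DOC_PREFIXES = ("docs/",)
DOC_EXTS = (".md", ".rst")

def docs_gate(changed_paths):
    """Return True if the change set is mergeable under the docs policy."""
    touches_code = any(p.endswith(CODE_EXTS) for p in changed_paths)
    touches_docs = any(
        p.startswith(DOC_PREFIXES) or p.endswith(DOC_EXTS)
        for p in changed_paths
    )
    return (not touches_code) or touches_docs

print(docs_gate(["src/app.py"]))                 # False: code with no docs
print(docs_gate(["src/app.py", "docs/app.md"]))  # True
```

Of course a check this crude only proves a doc file was touched, not that the docs are any good; the stringent review still has to happen in code review.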

> Imagine if wikipedia would do that

That's not the use case the blog post is addressing. It shouldn't be considered a mark against the author if, when their advice is applied in a way they never advocated, it turns out not to hold up so well.

I’m not sure Wikipedia is a fair comparison, but I'm open-minded. What I appreciate about this approach is that it forces the org to think from first principles about what should be in its knowledge base.

What else do you see causing the problem aside from culture? Are there tools that do this well?

> The right solution is: Dont share documentation in 1-1, always write it on the wiki

You are restating the problem; the question is how.

Not really. "Get people to always add to the wiki" is a different problem than "don't have 'outdated' pages on the wiki".

Sounds like one of the people at a former employer of mine had a time machine to come forward in time, read this article, and take it to heart. He deleted more than 800 articles I had written on the company wiki, which were never recovered, and when he eventually left the company it was realized that those articles were important reference points which were never replaced in the transition. It turns out deleting 800 articles is easier than writing 800 articles when the original author with the domain knowledge has moved on into another role.

In short, this is a terrible idea; don't do this. Even old and outdated articles serve as a historical record which can leave breadcrumbs for your mental processes as you troubleshoot an issue or try to understand the mindset that went into creating the system you're working on. "Outdated" just means that it's not an accurate representation of the current state, but it doesn't mean it isn't an accurate representation of historical state.

> Sounds like one of the people at a former employer of mine had a time machine to come forward in time to read this article and take it by heart. He deleted more than 800 articles I had written on the company wiki, which were never recovered, and then when he eventually left the company it was realized that those articles were important reference points which were never replaced in the transition. It turns out deleting 800 articles is easier than writing 800 articles when the original author with the domain knowledge had moved on into another role.

What did the company do after realizing it?

I have trouble comprehending how a company which requires hundreds of wiki pages to function somehow does not have, if not a versioning system, then at least a weekly backup of essential documentation.

Having a backup is one thing. Restoring from that backup in a way which is piecemeal is another thing entirely.

Point taken. Selective restore of a subset of the deleted wiki pages looks non-trivial indeed.

This is a pretty terrible idea. I've run a wiki at work and a lot of the information is outdated but a lot of it is also still really relevant even given that the page has not been touched in 3-5 years.

Somewhat wiki-related, but on the software side: I created a very basic plugin for MediaWiki that solves a specific problem, and because I did not have any commits in my GitHub repo for 3-6 months they flagged my plugin as "do not use" and "unmaintained". It was frustrating to put something together in my free time and have the community basically shun it because there were no new features I could think of to implement. I think documentation is the same way: creating friction by deleting, hiding, or even obtrusively flagging pages as "old and invalid" can make people not want to use your system.

I'm a MediaWiki developer; out of curiosity, what was the name of your extension?

Sometimes people get carried away with that process (deletionists are everywhere ;), but it's primarily supposed to be for extensions that haven't been updated in a long time and don't work with the current version of MediaWiki.

I'm not familiar with MediaWiki plugins. In some plugin ecosystems there's an expectation that plugins need to adapt to deprecated APIs, bump dependencies, etc. Is that the case for MediaWiki plugins?

Regardless, 3 months seems like a very short time to consider something "unmaintained"

This is a rather pointless exercise. One can have a page documenting a very rare system crash and how to fix it. If the crash doesn't repeat itself annually, the page will be deleted. Not a great outcome, methinks.

Instead of a forced page renewal there should be a bonus for fixing highly useful pages and a bounty for finding errors in the documentation.

Otherwise the same crappy page with a few nuggets of useful links/info will be copied verbatim forever and ever.

We've used Guru[0] for some of our internal documentation, which will prompt you to re-verify the accuracy of docs on a set schedule. So every six months (configurable per doc), it will prompt the designated person to glance over the doc and say "yup, still good".

I'm undecided as to how well it works, but it seems like a better solution to the stale-data problem than the "let's just delete everything and hope we don't ever need it" approach.

[0] getguru.com

I am a fan of Markdown based wikis. Have been using GitLab for some time now.

On a separate note: one can script simple link checking/verification of the fetched content. And flag all pages with dead links, or with links where PDF/html/whatever did change from a snapshot taken say every month or so.
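
Something like this sketch, say; the actual HTTP liveness check is stubbed out here (in practice it would be a HEAD request with a timeout), and the regex only handles simple inline Markdown links.

```python
# Sketch: extract links from Markdown wiki pages and flag the dead ones.
# is_alive is injected so the network part stays pluggable/stubbable.
import re

LINK_RE = re.compile(r"\[([^\]]*)\]\((https?://[^)\s]+)\)")

def extract_links(markdown_text):
    return [url for _text, url in LINK_RE.findall(markdown_text)]

def dead_links(markdown_text, is_alive):
    """is_alive: callable(url) -> bool, e.g. an HTTP HEAD with a timeout."""
    return [u for u in extract_links(markdown_text) if not is_alive(u)]

page = "See [docs](https://example.com/docs) and [old](https://gone.example/x)."
print(dead_links(page, is_alive=lambda u: "gone" not in u))
```

Run nightly over all pages, this gives you a cheap "pages needing attention" report without deleting anything.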

What sort of bonus/bounties are you envisioning?

Alternatively: abandon wikis and keep documentation in your monorepo, next to the code/project it concerns (and in a doc/ global path for org-wide documentation). Then render it with whatever; I wrote a little tool [1] for that, because I couldn't find anything that worked for this use case.

Lowering the effort barrier for documentation work (i.e. making it easy, and expected, for developers to commit documentation changes _alongside_ code changes) has been critical for the organizations I've worked with. Disorganized, slow-loading wikis with obtuse markup syntax tend to make people really not want to touch docs, or keep postponing cleanups. Developers will thank you for making the documentation grepable, for using the standard code editing/review/submission flow for documentation changes, and for making ownership of documentation clear.

[1] - https://hackdoc.hackerspace.pl/devtools/hackdoc/README.md?re...

In most companies I've worked for, only some of the contributors to the company's wiki were developers. What about product managers, BizDev people, etc.?

Do you expect everyone at the company to use git and work with markdown? (Don't get me wrong, I wish everyone did. But knowing people in these roles makes me think this would never happen)

I may be in an unusual position of being able to dictate this, but I always push for non-coders to use git and markdown: design assets are checked into a repo, blog posts use markdown and are pushed to a Netlify site, marketers can tweak language and images directly in our Vue app (partly chosen for Vue's fantastic accessibility), etc.

TBH I have not pushed non-coders to document stuff in git / markdown, only the developers do that.

Free, friendly tools like GitHub Desktop, VS Code, and Netlify CMS help a ton. It also helps that marketing, design, and product management are all technically inclined positions, where hiring smart people pays off.

There's a tradeoff as it takes more time to preach and teach the use of these tools, but I see it as a good investment for workflow and in people. Being able to wrangle git is a great skill!

This would be insanity in a monorepo. A developer would have to look over every PR to make sure some critter from the marketing department didn't accidentally change an important bit of code.

1) You'd set up a policy based on directory or a regex of files and organize your monorepo so that only the people that need to see the PR would. Use your tools to your advantage.

2) It really depends on the culture at your company. Having open access for marketing to view the code and submit small PRs (or possibly move into development), or to submit blog posts or spelling fixes to the marketing website, can make for a great team.

3) This adversarial "critter", developers > marketing view makes for a very toxic workplace. Marketing is a skill set just like development; without marketing or sales there is nothing to develop, and without development there's nothing to sell.

> You'd set up a policy based on directory or a regex of files and organize your monorepo so that only the people that need to see the PR would. Use your tools to your advantage.

Hard to catch everything

> It really depends on the culture at your company. Having open access for marketing to view the code and submit small PRs (or possibly move into development), or to submit blog posts or spelling fixes to the marketing website, can make for a great team.

I'm all for open access, but systems should have safety rails to stop people from making mistakes. This is why we have the PR process in the first place. If you have non-programmers approving PRs created by other non-programmers into repositories that have mission critical code, then you're really just asking for trouble.

> This adversarial "critter", developers > marketing view makes for a very toxic workplace. Marketing is a skill set just like development; without marketing or sales there is nothing to develop, and without development there's nothing to sell.

My position actually straddles product and development, so I spend much more time with marketing types than developers these days. They're fine people, but they aren't as patient and detail oriented as developers and lack awareness of the dangers of certain actions. I've seen them do all sorts of crazy stuff that would throw the security team into an absolute fit if they found out about it.

That's something your ownership and CI systems catch, and is a prerequisite to use a monorepo sensibly in the first place.

The ownership system ensures a per-subpath commit/review policy, while the CI system ensures a dependency change doesn't cause cascading failures in other parts of the repository.
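
For illustration, a minimal sketch of such a per-subpath ownership policy, CODEOWNERS-style; the prefixes, team names, and "union of all matching owners" rule are invented here, not any particular tool's behavior.

```python
# Sketch: map a touched path to the teams whose approval it requires.
# A path needs sign-off from the owners of every prefix it falls under.

OWNERS = {
    "marketing/": {"marketing-team"},
    "src/":       {"dev-team"},
    "docs/":      {"dev-team", "marketing-team"},
}

def required_reviewers(path):
    matches = [teams for prefix, teams in OWNERS.items() if path.startswith(prefix)]
    # Fall back to a catch-all group when no prefix matches.
    return set().union(*matches) if matches else {"admins"}

print(required_reviewers("src/billing/invoice.py"))   # {'dev-team'}
print(required_reviewers("marketing/blog/post.md"))   # {'marketing-team'}
```

The CI side is the same idea in reverse: recompute which subtrees a change can affect and run those tests before merge.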

I have no experience with monorepos - but yes, developers have to look over PRs that touch apps. No committing to the main branch!

Also what I have found to be essential to enable technical non-coders (and juniors! and contractors!) is that PR reviewers are responsible for merging and any needed final quality tweaks.

This isn't going to work with every team. Monorepos would make this whole effort harder; I'd guess that just teaching people what to edit would be painful.

Alternatively, you could have some UI that allows writing to the wiki and converts each edit to a commit, basically the same way editing a README can be done on GitHub. In any case, as mentioned in a different comment, most of the wiki is edited by developers anyway.

Git is another level. But I have seen documentation of Linux systems in Google Docs: commands, fragments of config files, etc. This can burn your brain for sure.

Markdown should therefore have higher priority than crappy Word or Excel skills. Even if 80% of end users use the Markdown wiki/docs just as viewers, there is no point in hogtying sysadmins/developers by asking them to document in Word or Google Docs and keep data in Excel files.

The solution to this is to make a UI which can be used like a wiki but in fact creates a git commit for each edit. Allow documentation-only changes without reviews or pull requests, etc.

Still use ACLs in the monorepo so that people can edit their own documentation with a single click, but editing another team's docs requires some kind of pull request/approval from that team.

Create a culture where code changes must have accompanying documentation changes, and where any substantial code change without documentation changes needs a good reason.

Then everyone can use the tools they are familiar with and still get all the benefits of history and always being up to date.
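
A rough sketch of the mapping such a UI might do behind the scenes; the page-to-path layout and the commit message format are pure assumptions, and a real implementation would hand these to `git`, GitPython, or libgit2.

```python
# Sketch: turn a wiki-style edit into the ingredients of a git commit.
import re

def edit_to_commit(page_title, author, summary):
    # Slugify the page title into a stable file path (layout is invented).
    slug = re.sub(r"[^a-z0-9]+", "-", page_title.lower()).strip("-")
    path = f"docs/wiki/{slug}.md"
    message = f"wiki: {page_title}: {summary}\n\nEdited-by: {author}"
    return path, message

path, msg = edit_to_commit("Deploy Procedure", "alice", "update rollback steps")
print(path)                    # docs/wiki/deploy-procedure.md
print(msg.splitlines()[0])
```

Documentation-only commits like these could then skip review, while edits under another team's prefix open a pull request instead.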

If you can learn pivot tables in Excel, you can learn markdown.

THIS ^^^

Google took this approach internally using markdown files (If I remember correctly).

Perforce is also going this way - https://www.perforce.com/manuals/swarm/Content/Swarm/basics.... - but I wish they took it further (right now it's only in their "project overview pages").

This allows for code reviews on the docs too, and it's easier to auto-generate, fix, etc. than a non-file interface like a wiki.

One gotcha: if your playbook is stored there, make sure it's backed up on secure external USB drives, as the source system, the markdown renderer, or something else needed to read it may go down, and now you and your team/project are affected.

Pre-rendered to .html, .chm, even PDF files, whatever it takes for your on-call personnel to be able to read them back if needed.

Presumably your on-call personnel will have those repos checked out?

What if your source system goes down, you can't connect to your local machine, or you need to use a VM? No, the on-call needs to have the USB stick (or more) and be able to use it on any other company laptop.

Hackdoc itself has a mode to run from a local checkout, with no other service dependencies (vs. the production mode that expects a git/grpc proxy service to access the live repository), and as it's Go it can easily be moved around as a single, relocatable binary that will run on nearly all operating systems.

> Google took this approach internally using markdown files (If I remember correctly).

Yes: https://www.usenix.org/sites/default/files/conference/protec...

I feel like there's a vast difference between documentation relevant to code and a lot of the stuff accumulating in a company wiki. And if you put the latter in a global path, it doesn't become magically better maintained just because your wiki is backed by a repo.

Age without further context is in any case a terrible indicator of relevance. What's IMHO important is a culture of providing the context with your entries.

I prefer this as well (except for the monorepo part :) ) for anything that concerns the code/app in any way. We chose Slab for our wiki partly because it can surface markdown files from github. So, the onboarding guide just links to relevant markdown files right in Slab.

This looks great, and I wish my company adopted the monorepo & Bazel. The navigation to //devtools from the link you shared results in a 404, however; perhaps the tool's routing is confused by the Bazel "//<project>" syntax? Maybe not, as the URL path looks correct; maybe parent directory links should not show as active if there is nothing to serve in the parent directory.

It's broken :) This is mildly internal for now, and needs a bunch of bug fixes before release.

The behaviour I would expect for this is that the 'devtools' part of the path shouldn't be clickable - as there isn't any documentation markdown in that path of the repository. I'll see if I can implement this today.

EDIT: fixed.

Biggest issue with this is that it destroys all history about technical decisions/processes/etc. I regularly find myself looking through old documents to figure out why something was set up the way it was.

I like the idea of marking all potentially outdated content as such. Maybe automatically archive old pages, putting a "Warning: Potentially Outdated Content" notice at the top of any page that hasn't been updated in X months. Could also add a simple counter to the top of every page, "Last updated XXX days ago", rather than the typical "Last edited on <date>".
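
A sketch of that banner logic; the 180-day threshold and the wording are arbitrary choices, not a prescription.

```python
# Sketch: compute an age-based banner for a wiki page.
from datetime import date

def staleness_banner(last_edited, today, max_age_days=180):
    age = (today - last_edited).days
    if age > max_age_days:
        return f"Warning: Potentially Outdated Content (last updated {age} days ago)"
    return f"Last updated {age} days ago"

print(staleness_banner(date(2020, 1, 1), date(2020, 8, 10)))  # warning: 222 days
print(staleness_banner(date(2020, 8, 1), date(2020, 8, 10)))  # Last updated 9 days ago
```

The nice property is that nothing gets deleted: readers get the warning, and the page history stays intact.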

This is a terrible idea. You're destroying institutional knowledge because "it's cluttered."

What do you do with clutter? You clean it. You don't burn your house down every year because it's cluttered.

Wikis aren't free. Organizations need to take it upon themselves to continuously groom the contents and organization of their wiki. They can be very useful if you do. But it's an ongoing process, just like organization in any other area in life.

You don't write code in a single file and then delete it when it gets too big. You create folders. You move and prune. Sometimes you delete. Make it intuitive.

> This is a terrible idea. You're destroying institutional knowledge because "it's cluttered."

It's also a terrible idea because, depending on the type of company and its life stage, it might also be a colossal waste of time and energy, even assuming wiki rot is a problem you actually have (and that it's causing any damage, which, honestly, is a moot point for many companies).

Are you really telling me, and especially now, with revenue down in the middle of a global pandemic and economic crisis, that the best use of everybody's time is to trash our internal wiki and start it again rather than, oh, I don't know, work on projects that we can actually charge people money for?

What I need most, right now, is for everyone to focus on making it raaaaaaaaaaaaiiiiin. That's not likely to change anytime soon, especially if the economy takes a protracted period to bounce back, as seems likely.

More generally I tend to be sceptical of people whose main output is corporate busywork[0]. If there's one thing any kind of recession makes clear it's this: the people you want are the people who either make a valuable product or offer a valuable service that can be sold, and the people who can effectively market and sell those products or services. Everything else is ancillary and, whilst you might need it, you want to run it as lean as possible. This isn't about being ruthlessly mercenary: it's about building a sustainable business with a decent buffer of cash that allows you to ride out the rough times so you can afford and keep paying people rather than be forced to make a load of them redundant.

As you've implied, keeping a wiki up to date is a cultural issue, and is something that needs to be done continuously. I think the key here is that sense of active maintenance, and encouraging everybody to be involved. If you change something and that something is referenced in the wiki, you need to change the wiki too otherwise the job isn't done.

[0] I imagine the places where wiki rot is most prevalent are those where creating the wiki in the first place was busywork - some sort of checkbox exercise, or a one off project with no thought given to ongoing maintenance.

> Are you really telling me, and especially now, with revenue down in the middle of a global pandemic and economic crisis, that the best use of everybody's time is to trash our internal wiki and start it again rather than, oh, I don't know, work on projects that we can actually charge people money for?

I bet if you forced the managers to set everyone's OKRs or priorities for the quarter, none of them would include "Rewrite Wiki" in the written goals.

That's the real problem with these exercises: The intent may be good, but it's a massive amount of work that no one is willing to prioritize over other projects. So instead, the efforts stall out.

Furthermore, no one wants to contribute to a new Wiki if they think it will simply be discarded in a few years when the next up-and-coming manager wants to declare Wiki bankruptcy.

I worked at a company that was upset that too many people were relying on Slack backlog search to find answers to old questions. They decided to shut down the Slack workspace and start over new to encourage everyone to become less reliant on Slack. This solved nothing, but made everyone's jobs more difficult when they couldn't remember conversations from a few years ago.

Worse yet, many teams set up their own separate Slack workspaces, off the record, so they could have control over the destiny of their backlogs.

Nothing was solved, but many things became significantly worse.

This is probably the biggest unsolved problem in the "corporate wiki" space. It's easy to create content, but it's hard to organize it properly, it's even harder to re-organize a large mess, and it's harder still to have a person or team whose job is solely to manage document organization.

IMO, the tools themselves are really failing us here.

I'm creating a new document? Give me some kind of super-powered organizer algorithm. Auto-suggest the appropriate tags & folder/space location based on the content.

I'm an admin and trying to clean up the clutter? Run that same algorithm across the entire space and auto-suggest a better layout & document naming. Make it easy to bulk move/delete documents, folders, etc. Surface old, unvisited documents with no edits & recommend for archiving.

Search indexes could also take into account age of the document, # of edits, # of days _since_ an edit, etc.

I could go on.
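
For example, a toy scoring function along those lines; the weights and decay constant are invented here, and a real ranker would tune them against click data.

```python
# Sketch: fold freshness and edit activity into a search ranking score.
import math

def rank_score(text_match_score, days_since_edit, num_edits):
    freshness = math.exp(-days_since_edit / 365)   # decays over roughly a year
    activity = math.log1p(num_edits)               # diminishing returns on edits
    return text_match_score * (0.5 + 0.5 * freshness) * (1 + 0.1 * activity)

fresh = rank_score(1.0, days_since_edit=10, num_edits=20)
stale = rank_score(1.0, days_since_edit=900, num_edits=1)
print(fresh > stale)  # True: equally relevant text, but the fresher page wins
```

Keeping the freshness factor bounded below (the 0.5 floor) means old-but-relevant pages still surface, just below their fresher competitors.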

A corporate wiki is the easiest and simplest form of corporate documentation you can have. If you really want proper, up-to-date, well-written documentation, a wiki isn't for you. A wiki is the first step towards proper documentation just above the "no documentation" stage.

If you want to improve on the wild-and-easy stage of everyone just using the wiki however they feel like, just have some rules.

For devops work, have a page for all your objects. Track lifetime events (the non-automatic kind) there. Swapped yet another disk in machine b47f8? You'll see the record in the wiki and can maybe also swap the controller or cables. Looking for the local installation procedure of some software? look at the wiki page. Doing another installation on a different host group? put a link in the wiki from hostgroup to software and vice versa. Have page templates for all objects and tasks.

The trick is "no change without wiki change". "No identical second support answer without wiki FAQ entry". The trick is documentation discipline.

(I know that some tasks may be better served with e.g. proper inventory software, however, there is something to be said for everything documentation-like in one place).

Oh, and ffs, never delete history, it will definitely bite you someday.

I remember reading something somewhere that points out that Library Science is exactly about the skills to solve this problem: organizing, managing, and indexing large amounts of otherwise-unstructured data.

It strikes me that all the various "corporate wiki" services are really just trying to patch over the absence of staff dedicated to this role.

> IMO, the tools themselves are really failing us here.

Yes, but this is not the biggest issue. The biggest issue is the culture. In the short term it is easier not to have proper documentation, so if you don't have an enforced culture of written documentation, this will always be an issue.

Agree. What you need for most content is not structure; what you need is really great search and discovery tools.

Contrasting pattern: Ise shrine

They literally do burn it down periodically, and build another. Because they do this, and plan to do it, the knowledge of how to make it doesn't fade into history.

At non-negligible cost, and without needing to support customers that are still on Shrine 173.5 for Workgroups two rebuilds later ;)

This is actually my coding philosophy. Burn it down and rebuild nearly weekly. The rewrites get smaller, less buggy and more flexible.

This method acts as a forcing function to do exactly that, because the author tells his team to "recreate the pages they thought were important". If an organization's wiki really is 75% irrelevant, unused, or incorrect, then a major correction is needed, and "declaring bankruptcy" seems like a good motivational tool to initiate a repair.

> a good motivational tool to initiate a repair

The 75% of articles nobody reads aren’t a problem. The problem is the 20% of articles everyone reads of which 80% is outdated or inconsistent.

This solution doesn’t solve that problem. The natural response to this policy is to copy and paste across resets. Not only does that do nothing useful, it risks adding more cruft.

To be fair, what the article proposes is more akin to a parent telling the kids, “pick up all the toys you care about, and anything that’s still on the floor by tomorrow is going to be donated.”

My question is why can’t the process be automated? Just make the whole thing behave like a cache with LRU eviction.
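
A toy sketch of that cache analogy, with the twist that eviction archives rather than deletes (page names and the size budget are made up):

```python
# Sketch: treat the wiki like an LRU cache, archiving the least recently
# used pages once a size budget is exceeded.
from collections import OrderedDict

class LruWiki:
    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.pages = OrderedDict()   # insertion/usage order = recency
        self.archive = {}

    def write(self, title, body):
        self.pages[title] = body
        self.pages.move_to_end(title)
        while len(self.pages) > self.max_pages:
            old_title, old_body = self.pages.popitem(last=False)
            self.archive[old_title] = old_body  # evicted, but recoverable

    def read(self, title):
        self.pages.move_to_end(title)  # reading counts as "use"
        return self.pages[title]

wiki = LruWiki(max_pages=2)
wiki.write("deploy", "...")
wiki.write("oncall", "...")
wiki.read("deploy")
wiki.write("lunch-spots", "...")  # evicts "oncall", the least recently used
print(list(wiki.pages), list(wiki.archive))
```

As the sibling comment notes, though, recency of access is a weak proxy for correctness: a frequently read wrong answer would happily survive eviction.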

Fresh timestamps on the wrong answer defeats the purpose of a wiki.

I've personally seen a Fortune 500's informal wiki/support infrastructure, and knowledge consolidation is not as black and white as FIFO or GIGO (it was nonexistent, as a matter of fact).

It's rare to find an F500 company that has big-data analysts working front-line roles to even get a grasp on a consolidation strategy, but shareholders can dream.

> To be fair, what the article proposes is more akin to a parent telling the kids, “pick up all the toys you care about, and anything that’s still on the floor by tomorrow is going to be donated.”

This works because the kids having one specific toy probably doesn't matter.

However, having one specific wiki page might matter quite a bit.

For example, how often do you perform emergency procedures? Is it often enough that you're going to edit the various emergency procedure pages in your wiki often enough to keep them from the chopping block? How about pages about things which aren't emergencies, but which are involved and delicate and require non-obvious knowledge? Again, how often will you edit those pages? Will you remember all of these vital pages every time the man comes around? Forget one and... You're on the hook for suddenly remembering everything which got deleted a few cycles ago.

But what about deleting clutter? In the analogy about the kids and the toys, should the children be told to throw away toys they don't want to play with?

Indeed, this is the same terrible idea I see whenever my next job wants me to switch to the new latest-and-greatest JavaScript framework, or the new latest-and-greatest ML framework. If you never have time to build up the simple knowledge, you keep wasting time.

> You don't burn your house down every year because it's cluttered.

If you can afford it... it isn't the worst idea to just start from scratch.

Why not, if you give everything away? Otherwise it's such a waste of resources...

One of the most memorable experiences of my life was when I was moving out of a place and I had a beautiful loft bed with a slide for my son.

I more or less gave it away (or... sold a $1500 bed for $50) and the mother who bought it cried because she said this is the nicest thing she has ever bought for her son.

So... If you can afford to do it, do it.

My fear is that the wiki would be abolished/ignored in favour of each team maintaining its own internal documentation.

My organization essentially does wiki bankruptcy every now and then simply by never agreeing on the informational tools we will use and constantly changing them. We have had three bug trackers and three documentation processes in a year.

It has made the wiki basically ignored as you don’t know what you need until you need it in many cases. We are back to relying mostly on the knowledge inside people’s heads.

I've noticed that the number of bug trackers, like the number of wikis, seems to grow roughly logarithmically with respect to org size. It does also seem that as these numbers grow, the more teams revert to tribal knowledge, as nobody knows quite where to look for or create documentation.

We have a development team of 15 at most, but yeah, I get your point. Each project ended up having a different tracker depending on who was heading the department (also had three of those in a year).

Brilliant. This is like the McDonald's theory. [1]

> I use a trick with co-workers when we’re trying to decide where to eat for lunch and no one has any ideas. I recommend McDonald’s. An interesting thing happens. Everyone unanimously agrees that we can’t possibly go to McDonald’s, and better lunch suggestions emerge. Magic!

[1] https://medium.com/@jonbell/mcdonalds-theory-9216e1c9da7d

Wikis are the technical version of Bikeshedding.

Trying to build grand unification schemes certainly is, I think.

However, having active editors of sections with an incentive on people to contribute is good because it moves more info from closed spots to accessible spots.

I’ve never found a good, sustainable way to get librarians and editors. When I’ve made it people’s jobs, it ends up being people out of touch who aren’t able to edit in a meaningful way. So we’d get perfect spelling and grammar but statements like “Visual Studio is the top programming environment, so all programmers must use MSDN.” That’s an obviously insane statement, since we have lots of non-MS devs, but the editor didn’t know that, left it in, and caused lots of confusion.

This is the prototypically terrible idea that "information architects" love. There is an underlying streak of distrust of any information not curated by a professional, and it's fundamentally antithetical to Wiki values.

The BigCo I work at tends to decide, every couple of years, to change Wiki platforms and more or less delete the previous Wiki. The new Wiki starts with some beautifully polished pages anointed by the high priests of information. Some other useful pages make it over (usually losing considerable historical context). Some other pages get lost, because the people most qualified to maintain them have moved on. Other pages get lost because the people most qualified to maintain them get tired of getting jerked around.

No. I concede that wiki rot is a problem, but just because no one has used a page for a year and all those involved have left the company doesn't mean the info isn't important.

Better solution: stop using big branded, overly clever and difficult to use wiki software. Have a single wiki, not split artificially by project, that can be searched as a single entity.

I think a way to archive content and remove it from the search index is better than deleting.

Old content is still handy to have accessible, even if it’s not completely accurate, but it makes searching awful.

Some kind of 'voting' could be handy - so you can see at the top of each wiki page "BobSmith last reviewed this page on 01/02/2007 and reported it accurate at the time".

This is more or less how my team handles it in ServiceNow. Any document that hasn't seen an edit or a rating in 90 days gets flagged for review. Every Friday I clear the review list by either rating the doc or, if I can't divine the purpose of the document, assigning it to an SME to be updated.

I don't think there's any way to curate an evergreen wiki that doesn't involve some regularly applied TLC. It's like doing your dishes and vacuuming the rug.

If it's not in the search index, it's not accessible. Nobody is going to look through every wikipage to find something.

You could make recency a factor in ranking though.

I could see a flag indicating whether you want to include archived content in the search index.

But more generally, archived content would be where you’d browse for content you know used to be available. Anything that you know should be searchable is the content that should be migrated each time bankruptcy is declared.

It’s actually not that bad of an idea, it could be made less dramatic by auto-archiving stale pages, according to whatever rules you come up with.

Internals documentation is a subject I have given up on. Everyone has wildly different opinions on what it should contain, and how much of it there should be. I keep my own notes and just pretend most of the internal wiki doesn’t exist to stay out of it.

The problem with wikis in organizations is that they are a bit too unstructured.

When a company is small they are great, but they accumulate cruft, the people who wrote the pages leave, they get disorganized, no one knows where to put pages, no one can find anything, etc, etc.

So when you look up something in the wiki you don't know whether it is up to date or not. Maybe it contains a procedure written by the support staff which you need to follow when X happens but the developers have changed the subsystem so it is actively dangerous now?

As orgs grow and their data grows you need a more structured place to put it than a wiki (or the shared filestore for that matter, ewww). One day you'll have Compliance requiring that docs are kept up to date, organized properly and owned by somebody.

Eventually you'll end up using a more sophisticated tool like Confluence which can enforce ownership of pages, enforce structure, archive pages which aren't updated and generally tick the compliance boxes.

(The above is my personal experience from growing a company from 0-50 staff ;-)

The problem is that the structure isn't one dimensional - it's n-dimensional (e.g. product, technology, lifecycle (dev/support)).

And it's hard to structure anything in a way that works on all of the axes along which people want to view the information.

I think the core problem is this: In building ANY documentation you have a slider between cost, freshness and accuracy. You can pick any two. You can NEVER obtain all three. My last place of work was a large FAANG and we had multiple knowledge management systems.

I was, and still remain, firmly in the wiki camp. If you find information that's out of date in a wiki, you can flag it, comment on it, or update it. Wikis are not a "single source of truth". The effort to make everything authoritative and run it through 20 layers of approvals kills the will/time/ability to document things and tosses tar into the speedy gears of your documentation machine.

So you have articles from 10 years ago? Are they hurting you? No, they are not. Just because 12 editors did not find them useful does not mean that the 'irrelevant' out-of-date article from 10 years ago won't help the oncall getting paged at 3 AM. Keep your stale data. Lessons learned and documentation of old things can find new ways of being important long past the time that they served their core function.

Storage is cheap. No need to be a deletionist; you gain nothing. What you lose is history and perspective.

Personally I had plenty of documents I would look at that were over 6 years old and I was very thankful to find something from the past- "Ohhhh so that is why we chose this architecture!" so now I can say "Yeah, I actually found out why we did things this way- we no longer need to do this and should revisit how we stand these things up."

What I see is a 3-4 year cycle/tenure/promo/depart on good engineers. If your history only goes back that far then you are basically losing history every generation of engineer. For what it is worth, in my experience, most of the people who propose this sort of thing have never been on the other side of the pager at 3AM.

Instead of deleting the pages, why not preface them with something like "This page may contain outdated information" after some timeout condition? It seems like this would really solve the problem...

In general, what's worse than having outdated information is having information vanishing, so that you have no way of validating how appropriate it is. Out of date information, whilst not spelling out the correct process, may give you an idea of the steps involved.

Anyway, to implement this you would probably do something like:

    git log -1 --format=%cI -- dir/file

Check the date; if it's older than some threshold and grep doesn't find an existing warning, prepend a warning to the header of the file.
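Fleshing that out, a small script along these lines could flag stale files in a git-backed docs repo (a sketch only; the one-year threshold and the warning banner text are arbitrary choices, not anything from the original comment):

```python
import subprocess
from datetime import datetime, timezone

STALE_DAYS = 365   # arbitrary threshold; tune to taste
WARNING = "> WARNING: this page may contain outdated information.\n\n"

def last_commit_date(path):
    """Ask git for the committer date (%cI, strict ISO 8601) of the
    last commit that touched `path`."""
    out = subprocess.check_output(
        ["git", "log", "-1", "--format=%cI", "--", path], text=True
    ).strip()
    return datetime.fromisoformat(out)

def is_stale(last_edit, now=None, stale_days=STALE_DAYS):
    """True if `last_edit` is more than `stale_days` days before `now`."""
    now = now or datetime.now(timezone.utc)
    return (now - last_edit).days > stale_days

def flag_if_stale(path):
    """Prepend the warning banner to a file whose last edit is too old."""
    if not is_stale(last_commit_date(path)):
        return
    with open(path) as f:
        text = f.read()
    if WARNING.strip() not in text:   # the grep step: don't double-flag
        with open(path, "w") as f:
            f.write(WARNING + text)
```

Run over every file in the repo from a cron job or CI step, this gets you the "may be outdated" banner with no manual bookkeeping at all.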

Wiki rot is real, but this is a terrible idea.

As others in this thread have pointed out, if you have a page (or twenty) in your wiki that captures arcane information that doesn't have even yearly applicability -- yet still may have applicability -- then by deleting it you've willfully destroyed some organizational knowledge.

Just as bad would be cases where editors don't even know if the information is still important, but due to the timestamp choose to delete it.

This is as about as close to "throwing out the baby with the bathwater" as it gets.

I worked on the wiki at a large company; we had several problems but came up with a few changes.

- A wiki reaper that would reap docs that hadn't been updated in more than 1 year. We'd give people 1 month to action.

- In search, older pages were weighted lower than newer pages.

- Pages had "team" owners rather than individuals.

- Teams would get a monthly digest of their top viewed pages and lowest viewed pages to encourage updates.

- Star rating for pages 1 through 5, would feed into above digest.

Wikis grow organically but need active curating.
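The reaper described above (reap after a year untouched, one month to action) reduces to a small decision function. A hypothetical sketch — the state names and fields are mine, not the original system's:

```python
from datetime import datetime, timedelta

def reaper_action(last_edit, flagged_at, now,
                  max_age_days=365, grace_days=30):
    """Decide the reaper's move for one page: pages untouched for over
    a year get flagged, and flagged pages get reaped after a month's
    grace period with no action."""
    if now - last_edit <= timedelta(days=max_age_days):
        return "keep"
    if flagged_at is None:
        return "flag"   # notify the owning team; start the grace clock
    if now - flagged_at > timedelta(days=grace_days):
        return "reap"   # archive (or delete) the page
    return "wait"       # flagged, still within the grace period
```

Any edit resets `last_edit` and clears the flag, so actioning a page is as cheap as touching it.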

That reaper was the bane of my existence. I would save wiki links and they would disappear without me realising it in time.

If you are a distributed company, even outdated and wrong docs are more useful than nothing at all.

It stopped deleting and started archiving back in 2016/2017 when Dex was merged with the Wiki.

I know, it was a glorious day.

Wiki rot is a real thing. The problem I've got with this suggestion is that it doesn't handle the case of pages that are useful but don't need to be updated frequently.

The bigger issue that I see is that keeping internal documents in a good state is hard, and it's usually no one's responsibility. I've been on teams where the internal documentation is great, but it's always been because one person has put a ton of time into maintaining it.

I wonder if there is any wiki software that would actively help curators in grooming it by say:

* giving readers the ability to quickly highlight some text/content to mark sections as "out of date"

* provide "rot report" views where maintainers can see which pages are the most stale in terms of never being viewed or edited. Sort of highlighting of content which has gone the longest without being seen at all (or possibly, edited).

MediaWiki has various useful maintenance reports. Pywikibot can help automate maintenance tasks via the MediaWiki API.


It's hard enough to get people to write these things down in the first place. The absolute worst case is "historical knowledge"; the piece of software that's just working, so nobody touches it, and the person who maintained it is leaving, so you need to get their knowledge in case you dig it up again. Which may be years later.

Maybe one problem with a lot of Wiki software is they make the filetree-like structure seem too important with huge filetree sidebars and constantly make you consider the vastness of your outdated collection of knowledge. Why care if there are really old pages as long as their update times are very apparent?

For our engineering team we've been happy with StackOverflow Teams. Some of the questions are probably just as outdated as the author complains about, but with the design of SO it doesn't matter much. We use tagging heavily and make sure to include keywords in the content of the questions and answers.

One downside of SO Teams is that you are subject to SO's search engine instead of Google or your search engine of choice since your team's pages are private.

(I'm not affiliated with StackOverflow, I'm just a customer.)

I would leave a job if someone tried to pull this. Some of us are busy and can't just drop everything to try and stop some whacko from destroying years of notes we've created to share with our coworkers.

Perfect way to ensure that teams will never write stuff down anywhere.

Ah, I was thinking a different tactic: Everyone would just make mega-pages so that the page itself isn't stale but specific sections are.

I think the best approach to this problem I've seen is a Last Edited date (or preferably an activity graph, "N edits in the last year", or similar) at the top that gets increasingly obnoxious as content gets more stale. Obviously far from perfect, but it would have saved me from relying on out-of-date documentation many times.

This is so very unnecessary. Search has never been a great way to find things in a wiki. Much is found by discovery, navigating from the roots. To fix the wiki you start with the roots. And an informal B tree algorithm works pretty well for that.

If anyone happens to mention to me how it's after 4 and they don't want to start anything new, I suggest they clean up the wiki a bit.

Start from the landing pages and project pages. Every page should say what it's for right at the top (that way I know when I've gone to the wrong page). Any page that is too long or has too many outbound links needs to be simplified. Any content that is not about that subject, should go to one that is. If you spent 1/3 of a page about one topic talking about a different topic, that page has been hijacked, and it should be fixed.

Eventually all of the dead content is 5 links away from the root, the contemporary stuff and the things that are always true are a few clicks away from the landing pages. At this point it barely matters if old stuff is there.

I suppose the idea here is that every wiki page is meant to be a living source of truth; and that no facts are better than outdated/incorrect facts?

In that case, I'd offer an additional suggestion: rather than deleting the wiki, just add some flag-bit to pages to hide them, where un-hiding them requires an edit. (Sort of like the flag-bit on user-accounts that means the user needs to change their password before they can log in again.)

Then, add a plagiarism filter (that is only active during these edits), such that during that re-enabling edit, the user cannot accept any of the previous text as is, but must at least paraphrase it. (Just like, in that "must change your password" mode, you can't reuse your old password.)

This would hopefully force the user to consider the value of everything they're re-writing as they're re-writing it. It incentivizes deleting, because deleting is less work than keeping; information would only be kept if it "pulled its weight."

You could even make the process easier, using a bit of client-side JS to run the user through the text of the original hunk-by-hunk ala a `git add -p`: show one sentence at a time, and require that the user type it out again if it's true; write a corrected version if it's slightly incorrect; or "drop" the line if it's now useless.

As for who exactly would have the responsibility of doing this, two considerations:

• Corporate wikis don't tend to have people whose full-time role is to edit/maintain them, ala Wikipedia editors. If they did, it would probably be such a person's job; but that'd also imply that such people would need to be "lore gatherers" for the company, such that they know what is outdated/incorrect. Sounds hard.

• While wikis are designed to enable a shared contribution model, I feel like corporate wiki pages can usually be said to have one main "owner" who has a "responsibility" for them (sort of like a component in a source tree does.) Perhaps it could be that person's responsibility to "refresh" the wiki pages for which they're the "owner." (Hopefully they're something like a product manager, where things like this—communication to ensure productivity—is already what they do all day, and this isn't getting in the way of other tasks for them.)

Taking Confluence as a popular corporate wiki, there’s no need to dump content so brutally. You could easily write an automated pruning script that does things like:

- find pages that haven’t been updated in a while

- messages the last editor: “this page will be deleted in a week. If you think it should be kept, please update it”

- a week later, deletes the page (or puts it in an archive)

It probably needs to be smart about how it determines age - certain pages only get used annually (e.g. the annual budget info)
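A first pass at such a script could lean on Confluence's CQL search endpoint. A sketch under assumptions: the base URL and credentials below are placeholders, the CQL follows Atlassian's documented syntax, and exact field behavior varies across Confluence versions:

```python
import requests
from datetime import datetime, timedelta

BASE = "https://wiki.example.com"   # placeholder Confluence base URL
AUTH = ("bot-user", "api-token")    # placeholder credentials

def stale_cql(days, now=None):
    """Build a CQL query matching pages not modified in `days` days."""
    now = now or datetime.utcnow()
    cutoff = (now - timedelta(days=days)).strftime("%Y/%m/%d")
    return f'type = page and lastmodified < "{cutoff}"'

def stale_pages(days=365):
    """Fetch candidate pages via the REST content-search endpoint."""
    r = requests.get(f"{BASE}/rest/api/content/search",
                     params={"cql": stale_cql(days)}, auth=AUTH)
    r.raise_for_status()
    return r.json()["results"]

# For each result you'd then notify the last editor, record the warning
# date somewhere, and archive or delete the page on a later run once
# the one-week grace period has passed.
```

The annual-pages problem could be handled by a per-space or per-label override on `days` rather than one global threshold.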

All that said, most of the problem of cluttered corporate wikis is simply because their search engines suck. Confluence is particularly bad.

The #1 thing you need in a corporate wiki search is results ranked by recency ... most things that happen in a corporation have a half life of 2 weeks before they’re no longer interesting. Relevancy also comes into it, of course. Factoring in the user themself would also be nice - a given person with a given role will naturally be more interested in some content than others, and that’s easy to determine from their search history
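Recency-weighted ranking is easy to sketch: with a two-week half life, an exponential decay applied on top of the relevance score does it. Illustrative Python, not any particular search engine's actual formula:

```python
HALF_LIFE_DAYS = 14  # the "half life of 2 weeks" from above

def ranked_score(relevance, age_days, half_life=HALF_LIFE_DAYS):
    """Decay a relevance score by document age: halves every `half_life` days."""
    return relevance * 0.5 ** (age_days / half_life)

# A fresh-but-mediocre match can outrank a stale-but-strong one:
docs = [("old runbook", 0.9, 180), ("last week's update", 0.6, 7)]
docs.sort(key=lambda d: ranked_score(d[1], d[2]), reverse=True)
```

Tuning the half life per content type (longer for reference pages, shorter for announcements) would address the complaint that some pages are legitimately evergreen.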

I think it depends on the problem. Most organizations where I’ve worked have more of a problem with info being siloed, stuck in people’s heads, or stuck in out of date pdfs that people email around.

In that case, I’d rather have the chaos of a wiki as long as there’s version history that shows users. That at least improves the asks to individuals and helps find the mavens of info.

So I’d rather just work on a wiki with search and let the organization happen organically than “burn it down and start over.”

I find it really hard to get the right people to contribute as superficial info from non-SMEs or some contractor whose job is to crank out docs that don’t have usage metrics or quality metrics is worse than nothing.

Currently I’m trying out, not a wiki, but a git repo that takes pull requests and generates a web site that’s hooked into intranet search. It allows easy read and any employee can send a PR or open an issue so all changes are tied back to a specific individual.

What I want to know is, why isn't "Wiki maintainer"/"Librarian" an accepted job category, and why don't companies hire dedicated individuals and teams to do it? I think that would have a decent ROI, but we probably need someone in the industry to prove the concept, à la Google and SREs.

This might be a job for a technical writer. Of course, I've seen way too few of those in my career too.

Having worked at companies that use knowledge management software like Jive and Confluence I think I've found my favorite way to deal with this stuff. And the answer is Slack.

I slowly found myself searching Slack for answers and the search is pretty good. I'm sure deprioritizing older messages and posts is built in to the index algorithm but the messages are still there for if I need to go back that far.

I hate to shill for Slack but this is the closest thing I've found to a Stack Overflow/Google substitute for enterprise knowledge. I usually arrive at these conclusions through experience and test feedback loops.

Confluence I've found is tolerable for static documents like business requirements but the UI feels stuck in the early 2010's and isn't as usable as it could be. Google Docs and Sheets seem to suffer from discoverability issues at a bunch of the places I've worked at.

Possibly leads to teams backing up pages and "recreating" the same ones again.

Would love to hear if anyone actually tried this

I have been through various wiki migrations and so on (and on, and painfully on). And yes, there's a lot of rot, and unfleshed skeletons, and potemkin villages of Sharepoint subsites, and so forth. And there's no structure or consistency because everything is self-service.

The term is outdated but the need is not: to tame this kind of thing, you need a webmaster, a person who has been granted the authority to create structure, to move things, to say "This belongs somewhere else" or "If you don't update this I am going to take it away." This is a person who has a responsibility to run a link checker and to see when the last time a page was accessed.

It's a real need, but we just forgot about it during the fragmentation of all things web-related.

Something I'm trying to do on my own team is to make the wiki more of a browsable "table of contents", a glorified list of all the salient things. Searching to find the answers on Slack/Email/Jira/etc is fine and dandy when you KNOW what to look for but otherwise a gateway glossary gives the visibility necessary for a starting point.

For longer prose and procedures (particularly troubleshooting a very specific problem) I am encouraging blog posts with tags that link back to the parent page. These degrade more obviously due to their chronological nature and feel less jarring to prune (which can be done simply by changing the tags).

I tried to do this, but every time I suggested getting rid of the old wiki someone would object and say they checked it three months ago. It was still like this after it had been read-only for years.

There's a Japanese site for tech blogging called Qiita that puts a warning at the top of any post that's over a year old. While a scorched earth policy could be more satisfying (and potentially safer), using a warning might make it easier to get buy-in.

Here's a random Qiita post with a warning (yellow box):


What knowledge you might need in the future is an unknown unknown. If everything is going well with a piece of infrastructure you might not touch the wiki for years.

The author's justification (at least in part) for deleting all of this information? It can be dangerous in the hands of employees who aren't able to recognize that it is out of date. The solution? Trust those same employees to recognize that even though they know how to perform a completely obscure process, they should download a wiki that they never use but that will be incredibly important in the case of their own death or departure.

Absolutely brilliant.

The solution is not destruction and a lot of manual work. It's software. Make it more usable. Keep a record of last page updates and last page visits. Flag pages that meet your criteria (by default, say, no views in a year or no updates in three years) for review, update, or archival.

Archive old pages, heavily compressed, but in the wiki data store. Mark them as potentially obsolete but possibly of historical interest. Leave links to them intact and have an option to restore a page to its intact state for viewing.

Fully purging the knowledge base is a terrible idea, but having a "review timer" associated with each page might work. As pages approach, say, 1 year without edits/actions, they get flagged for review. A reviewer can confirm if the info is still good, or needs to be updated, or needs to be archived etc.

Assigning pages to owners is also important, so you know who's responsible for the content. In Confluence we just use Spaces as logical groupings but YMMV.

This knowledge base rot is one of the main reasons why I've been working on https://histre.com/

It doesn't solve all the use cases of an internal wiki (yet), but the main idea is, if you rely on people to keep a knowledge base up to date, it will fall into disrepair pretty fast. So you need to integrate with their workflow and make this extremely seamless.

TBH that sounds super great!! Nice product, really, but I would never EVER put my KB on someone else's servers. Self-hosting, i.e. installing it on my own internal server, should be an option, a bit like the Atlassian model.

Thank you! I may build a self-hosted version in the future.

Double thumbs up!!

It might be nicer to age pages into "faded out" stage after a period of time of no edits (deliberately harder to read, low contrast, but there's a public "unfade" button if anyone cares to press it), and then "archived" stage after a bit of time faded. If you want to get a "archived" page back, you can ask the admin to re-publish it.

Just install a plug-in that warns about the age of the page you are viewing. No muss, no fuss, no overly clever harassment of your team.

Well, I had a wiki at a big insurance company. The point was... it was for me, my servers, and my overall IT KB; I tried to bring other people in, without success. My solution was: if a page had no change in 180 days, the page went automatically into archive mode, out of the search index - but of course not MY pages ;)

Just because something isn't accessed often doesn't mean it isn't relevant & worthy of keeping around. There's a pretty big long-tail of highly specific institutional knowledge which may not be needed daily but is still important to document & have searchable for those few times it's needed.

The Google internal wiki system solves this by showing you when the last update was and when the last review for staleness was. A user is assigned as the owner for a document. They get pinged when the article hasn't been updated in X months. They look it over and fix it, and the cycle starts over.

I've seen this with people that receive tons of email as well. They're out for a month and have 10k emails to sort through. Instead of doing that, they send a message to everyone who sent them an email that if it was important they should re-send. Then just delete everything.

I thought MediaWiki has version history. Couldn't you just write a "blank" version of every article that hasn't been accessed in over a set time period?

And if someone gets there and wants the old version back, they can.

Have worked at a firm where this was done. Confirming it's a bad idea.

Very interesting. Can you say more about how it played out?

We use HelpDocs.io (love it) and they have a "set article to go stale" option. I haven't used it, but it seems like a nice reminder to re-evaluate or delete the page.

Do you want shadow IT? Because this is how you get Shadow IT.

To avoid software rot, delete all the code every year!

I'm pretty sure I read the same about Jira tickets.

At my company, every single PR is linked to a ticket. These tickets can be linked to each other as causes, related issues, etc. Having them around and reasonably organized has been invaluable for us since we can simply blame a change, trace it all the way back to when it was introduced, and see why it was put there to begin with. Jira and our wiki system (Confluence) cross-reference each other where appropriate as well. Altogether, it makes the investigation part of our jobs way simpler. I would not think to purge entries simply because they're older than an arbitrary date.

Facebook has this. Tasks are autoclosed after 6 months of inactivity, unless specifically flagged. It's sort of nice as a way to close things you were never going to get to, but it's also terrible if you actually have a long backlog but do want to get things done, because the auto closer adds a lot of noise.

This is called archiving. Everyone needs a good archive system: all finished cases get archived, and archives older than 5 years are thrown away.

I must say, this is a brilliant manual GC process.

Feeling this pain a hell of a lot right now. Where I work, confluence has several simultaneous problems.

Misuse of the medium:

- It's used as a repository for product requirements (which are usually either stale, or may actually be speculative of future incarnations of a feature. In either case this is usually linked to as reference acceptance criteria in JIRA tickets, which creates all the problems you'd imagine).

- It's used to hoard information about what X team did to get something done according to another team's processes (i.e. getting secrets added to our secrets store, how to add a piece of infrastructure, etc.), which have likely changed since X team went through this (and the reference material is rarely ever linked to).

- It's used to document abstractions in code, and is usually stale as the code and documentation are not reviewed together in pull requests.

Lack of ownership/maintenance:

- All public confluence articles appear to get indexed in search (and they are by default published as public), and however search is configured it seems to be very bad at determining relevancy, so you are just as likely to get someone's stale instructions for how they followed some internal procedure as you are the owning team's documentation for doing it

- There seem to be no standards for what information needs to be provided in documentation about internal APIs. No consistency on providing information about request/response types, parameters, contextual/semantic information, limits, true SLAs (as almost everything advertised as an SLA is an SLO), downstream dependencies, etc. This may actually be a "wrong tool for the job" instance too, but in the absence of any other consolidated way of documenting APIs, this is the best thing this company has.

- No one seems to be responsible for culling stale information, and since hoarding is so prevalent, people are reluctant to actually do it. Sometimes I come across stale information in articles published by users with deactivated accounts, sometimes the only indication that the information is stale is that someone has added "(Archived)" in the title (or something similar), sometimes there's just a banner on the page saying something like "this information is no longer valid, see other team" and then linking to the wrong place to find the current information, etc. All of these pages still rank highly in search results.

There may be several organizational dysfunctions involved in creating this state of affairs, and you should have a plan for incentivizing good wiki maintenance and proper use (a more complicated and sophisticated process), but generally I'm in favor of more radical plans to cull stale information when you find yourself in this kind of situation (if you can get buy-in). This is fundamentally different from crufty old code, because nothing battle-tests bad documentation the way reality battle-tests old code, and stale information is effectively disinformation: it propagates all the usual costs of disinformation (slowed retrieval of necessary information, ill-informed decisions, etc.).

Your real documentation will always be your source code and configuration. Concentrate on making these clearer and more accessible (as appropriate, for instance, don't make your passwords too accessible).

Any other "documentation" should point to your real documentation.

Highly depends on your environment. If you have many products/projects, many things are worth documenting independently of the specific thing that encountered or used them.
