
Using the same blogging software for 20 years - cosmojg
http://boston.conman.org/2019/12/04.1
======
gwern
> I gave up on dealing with link rot years ago. If I come across an old post
> with non-functioning links, I may just find a new resource, link to The
> Wayback Machine or (if I'm getting some spammer trying to get me to fix a
> broken link by linking to their black-hat-SEO-laden spamfarm) remove the
> link outright. I don't think it's worth the time to fix old links, given the
> average lifespan of a website is 2½ years and trying to automate the
> detection of link rot is a fool's errand (a page that goes 404 or does not
> respond is easy—now handle the case where it's a new company running the
> site and all old links still go to a page, but not the page that you linked
> to). I'm also beginning to think it's not worth linking at all, but old
> habits die hard.

After about nine years of writing, I've concluded something similar: my
existing reactive approach
([https://www.gwern.net/Archiving-URLs](https://www.gwern.net/Archiving-URLs))
is not going to scale either with
expanding content or over time. Fixing individual links is OK if you have only
a few or aren't going to be around too long, but as you approach tens of
thousands of links over decades, the dead links build up.

So the solution I am going to implement soon is taking a tool like ArchiveBox
or SinglePage and _hosting my own copies_ of (most) external links, so they
will be cached shortly after linking and can't break. The bandwidth and space
will be somewhat expensive, but it'll save me and my readers a ton in the long
run.

~~~
kick
_So the solution I am going to implement soon is taking a tool like ArchiveBox
or SinglePage and hosting my own copies of (most) external links, so they will
be cached shortly after linking and can't break. The bandwidth and space will
be somewhat expensive, but it'll save me and my readers a ton in the long
run._

Wouldn't a more ideal solution be archiving via a variety of external (or
internal, I guess) sources the first time a link appears on your site, and
then after a year automatically switching all links to archived versions? This
would kill link rot in its tracks while preserving a lot of the value of links
for the people you're linking to, and would cost less in bandwidth costs given
the curve of access on old content.

~~~
gwern
> Wouldn't a more ideal solution be archiving via a variety of external (or
> internal, I guess) sources the first time a link appears on your site, and
> then after a year automatically switching all links to archived versions?

Yes, by 'shortly' I meant something like 90 days. In my experience, most pages
won't change too much 90 days after I add them (I'm thinking particularly of
social media-like things), but it's also rare for something to die _that_
quickly. 365+ days, however, would be perilously long. My main concern there
is balancing between delaying so long to snapshot that the link dies (thus
generating the manual linkrot-repair problem I'm trying to avoid) and being
too eager to snapshot and archiving a version which is not done and would
mislead a reader.

(I also went through all my domains and created a whitelist of domains that my
experience suggests are trustworthy or where local mirrors wouldn't be
relevant. For example, obviously you don't really need to worry about Arxiv
links breaking, or about English Wikipedia pages disappearing - barring the
occasional deletionist rampage.)
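
A minimal sketch of the policy described above, not my actual tooling: skip whitelisted domains, and snapshot everything else once the link is about 90 days old. The names `WHITELIST`, `SNAPSHOT_DELAY`, and `should_snapshot` are hypothetical, and the two whitelist entries are just the examples mentioned here.

```python
from datetime import date, timedelta
from urllib.parse import urlparse

# Assumption: the 90-day delay comes from the comment; tune to taste.
SNAPSHOT_DELAY = timedelta(days=90)

# Assumption: illustrative entries only, not a real whitelist.
WHITELIST = {"arxiv.org", "en.wikipedia.org"}


def is_whitelisted(url: str) -> bool:
    """True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in WHITELIST)


def should_snapshot(url: str, added: date, today: date) -> bool:
    """Snapshot non-whitelisted links once they are old enough to have settled."""
    if is_whitelisted(url):
        return False
    return today - added >= SNAPSHOT_DELAY
```

A link added on 2019-01-01 to an ordinary site would be due for a snapshot by 2019-12-04, while an Arxiv link of the same age would be left alone.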

> This would kill link rot in its tracks while preserving a lot of the value
> of links for the people you're linking to, and would cost less in bandwidth
> costs given the curve of access on old content.

My traffic patterns are different from a blog, so it wouldn't.

------
criddell
> My PageRank is still high enough to get requests from people trying to leech
> off from it.

One of his guesses for this is because he doesn't try to game the PageRank
system. I hope he's right.

I've argued with other people that the best way to score well on Google is to
create the best website you can and forget about SEO games. The specific
example we were talking about is recipe websites. They all seem to have
pointless essays about _the first time the author tried ossobuco when they
were an exchange student_ or something else irrelevant. The theory is that the
essay is for Google and not for the poor schmuck trying to make dinner.

~~~
WorldMaker
The essays at the top of recipes are about creating "romance", and often are
intended for people as much as (or more than) SEO. They are intended to create
empathy from the recipe reader toward the writer and establish certain
bona fides, most of which are inconsequential to the recipe itself but help
separate one recipe website and/or one recipe author/collector/distributor
from another. It's actually something that goes way back in recipe books,
where a lot of classic, well respected recipe books for centuries have always
put the work into an introduction chapter describing the author, what their
passions are, how they approach recipes. It helps sell recipe books and helps
aspiring and/or home chefs see things like "oh, this author is just like me"
or "if this person can make this recipe, so can I". It's just that on the
internet when you are looking for one specific recipe that "introduction
chapter" starts to have to move into every single recipe because people aren't
going to pick up your website or blog like a book, they are going to go
straight to a single page. Though even that isn't entirely unique in the world
of cookbooks: there are also old traditions of narrative recipe books that
treat the cookbook as a diary of sorts laying out the author's discovery and
interest in every single recipe. Sometimes people want a story to go with your
dry lists of ingredients and otherwise boring step-by-step directions.

It emphasizes the _art_ in cooking, that it isn't just boring "food
chemistry", but that it is a way for people to connect to each other. Even if
it didn't help with SEO, there are probably lots of recipe bloggers that would
do it anyway because it shows passion, love for the art, narrative hooks for
the reader to explore who they are, their creativity, and/or their other
interests outside of just their kitchen.

~~~
criddell
What works in one medium isn't necessarily the best choice for a different
medium. Books and web pages have unique constraints and use cases. Everything
you've described sounds like the experience I want from a great cookbook.
It's something I will browse and spend time with. On the web I just want
information. I got there with a specific search and I'm not there to learn
about the author.

The essays probably wouldn't bother me if they would put the recipe first. If
you have a great story about why sage is your favorite herb put that after the
recipe. I don't care, but I suppose somebody might.

~~~
dhimes
I agree with you. I wish I didn't. I enjoy the romantic side that WorldMaker
points to, but the fact is that when I'm looking up a recipe on the web I'm
just trying to make it. All I want talked about is whether or not it's
authentic or (say) made differently for American tastes, possible
substitutions, etc.

I'm not proud that this is what I've become.

------
bradley_taunt
Really interesting read. With the current trend of revitalizing personal
sites/blogs, I hope to see more of them stick around for even 10+ years.
Everyone should have their own personal piece of the web.

------
chipotle_coyote
I admit I'm mostly amused/fascinated by the list at the end of the feed types
the blog supports, ranked by popularity. Gopher being more popular than Atom
is a quirk specific to this blog -- it's not like many other sites these days
support it, right? -- but it's the quite new JSONFeed being at top that was
most surprising. Possibly also a quirk specific to this blog, but I'm not as
sure of that as I am about Gopher.

~~~
WorldMaker
My impression is that because it is the same RSS discovery process that finds
traditional RSS, Atom, and JSONFeed alike, knowing which one your Reader
application favors and/or is using is sometimes difficult. I'd off-hand heard
that many of the big, strongest web-based Readers had started favoring
JSONFeed, but this is the first evidence I've seen of it. It's also why I
assume Atom is so low on the list, in that those same readers likely would
prefer Atom to RSS, but have switched to JSONFeed today, and RSS has the long
tail of longevity and legacy reading software.

It could also just be that, since the ranking is apparently by request count
(rather than unique origin or such), JSONFeed tools simply request it more
often than the others.
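
To make the discovery point concrete, here is a rough sketch of feed autodiscovery: a reader scans the page's `<link rel="alternate">` tags and picks among the advertised types. The class and function names are hypothetical, and which format a given reader actually prefers is not something this shows.

```python
from html.parser import HTMLParser

# MIME types commonly advertised for each feed format.
FEED_TYPES = {
    "application/rss+xml": "RSS",
    "application/atom+xml": "Atom",
    "application/feed+json": "JSONFeed",
    "application/json": "JSONFeed",  # type used by JSON Feed version 1
}


class FeedLinkFinder(HTMLParser):
    """Collects (format, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        kind = FEED_TYPES.get((a.get("type") or "").lower())
        if "alternate" in rels and kind and a.get("href"):
            self.feeds.append((kind, a["href"]))


def discover_feeds(html: str):
    """Return every feed a page advertises, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```

All three formats come out of the same scan, so the choice between them is entirely up to the reader software.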

------
mmsimanga
I must confess I have probably spent more time trying out blogging software
than actually blogging. Reading that OP has his entries as HTML has given me
food for thought. I am using Hugo but not being a front end developer I
battled (most probably due to my lack of knowledge, time and application) so
much to get my theme simple that I find myself hesitant to blog again. Moral
of the story: use what you understand.

------
mooreds
Love it. I've been blogging for 16 years myself. Find it a wonderful way to
share knowledge and see myself changing over time. I've tried keeping a diary
but it never works for me.

------
dddddaviddddd
Currently have been running essentially the same PHP + plain text files stack
on my website for about 10 years. No more complex than it needs to be.

------
jefftk
I've been using the same blogging infrastructure for fifteen years. I write
HTML, then run an offline script to create a static website (rss feed, tag
indexes, extracts for home page). I rewrote most of it once, switching from
parsing the html with regexps to using an actual html parser, and taking the
time to clean up some especially ugly parts.
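
As an illustration of the regexp-to-parser switch (a sketch, not my actual script), this pulls a post title out of hand-written HTML with Python's standard `html.parser`. The convention that the title lives in the first `<h1>` is an assumption made for the example, and `TitleExtractor` and `extract_title` are hypothetical names.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text inside the first <h1> element of a post."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # > 0 while inside the first <h1>
        self._done = False   # set once the first <h1> has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "h1" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self._done = True

    def handle_data(self, data):
        if self._depth:
            self.parts.append(data)


def extract_title(html: str) -> str:
    """Return the first <h1>'s text with whitespace collapsed."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Unlike a regexp, this copes with attributes on the tag, nested inline markup, and titles that wrap across lines.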

------
egypturnash
I’ve been using the same WordPress installation for seven years now. I thought
it was longer but I guess I was mentally counting the years before that when I
just had a modified version of a gallery script (Singapore) showing my art and
no actual text blog. I’ll probably be using it for years to come.

Does what I need it to, works for the amount of traffic I get, no need to
change.

------
Fizzadar
20 years! That's awesome. Makes me wish I didn't replace 7 years of history on
my blog in 2013.

~~~
kick
The front page of your blog has been archived a few hundred times since 2006.
I'm pretty confident you can recover most of your posts, if you want:

[https://web.archive.org/web/20060801000000*/pointlessramblin...](https://web.archive.org/web/20060801000000*/pointlessramblings.com)

~~~
WorldMaker
At one point I downloaded my site's Wayback Machine Archive curious what I
could recover. It didn't feel like enough because particularly in the regions
of time where I lost the most blog posts my front page mostly only had
synopses or excerpts of posts and the full posts themselves were on a follow-
up page that the Wayback Machine didn't archive.

------
ncmncm
According to the article, he hasn't been using the same software: he has
since rewritten all of it, much of it multiple times. The only thing that's
the same is the content and the file format.

"You have an ax. You replace the handle. Later, you replace the head. When did
it stop being the same ax?"

"When I replaced the handle."

~~~
gwern
You know, in DC
([http://dresdencodak.com/2015/05/18/dark-science-46/](http://dresdencodak.com/2015/05/18/dark-science-46/)),
the character who gives that answer is turning into a villain.

~~~
ncmncm
Good catch!

But her brain is her superpower.

RADNAR

By the stone of Daggoth!

