Other than that it looks quite useful, and it's definitely something to keep in the tool belt. Bonus points for the subtle Undertale references too :)
I've started doing this for my Rust projects too because Rust is becoming a really nice language for writing these sorts of things, and having a separate binary as a Rust file in src/bin is convenient.
Often you want to use some logic from the main application in your tasks and using the same language streamlines that.
In a past life, we used invoke for task running. It was incredible but has the same problem as rake: it introduces another language (Python) and more dependencies.
There's a fairly new task runner written in Go called mage, but it didn't seem worth the jump yet to me since it's still pretty immature (I haven't played with it in a few months, though). Did you consider trying that out?
Biggest advantage over a generic tool like Make is that you can import your project dependencies (such as your database configuration) into your task.
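With invoke, for instance, the task file is just Python, so it can import the application's own settings directly. A minimal sketch (the myapp module and database name are made up for illustration):

# tasks.py -- run with `invoke reset-db`
from invoke import task

from myapp.config import DATABASE_NAME  # hypothetical: reuse the app's own config


@task
def reset_db(c):
    """Drop and recreate the development database using the app's config."""
    c.run(f"dropdb --if-exists {DATABASE_NAME}")
    c.run(f"createdb {DATABASE_NAME}")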
I like keeping to one language and toolchain, but the biggest benefit for me is that I'm always forgetting the arcane bits of shell syntax, and using Go saves me from having to look it up every time. Despite being more verbose than bash, I find it saves me time overall.
But please, when making screencasts or screenshots, use a simple prompt.
The information included in a fancy prompt is often not just irrelevant but actively distracting: it makes reading less pleasant by increasing the eye movement and cognitive effort needed to filter out the useless content.
Small things like that make a difference.
I should pull out my old laptop or my backups and resuscitate the script.
The example at http://tty-player.chrismorgan.info/ was generated using that script.
# Start a new bash shell with a severely filtered environment and no initfile.
if [ -z "$_BFR_RUNNING" ]; then
    # Re-exec ourselves as the child shell's init file, passing through only
    # a marker (so we don't recurse forever) plus a bare minimum of environment.
    exec env -i _BFR_RUNNING=1 TERM="$TERM" HOME="$HOME" \
        bash --init-file "$0" "$@" <&0
fi
# What remains of this file is the initfile.
eval "$(dircolors -b)"
alias ls='ls --color=auto'
The concurrency level (512 connections) is a bit too aggressive for most servers, though. You'll get throttled or blocked, or your backend will crash (none of which is the end of the world, but it's probably not what you were after from a link checker).
Maybe just introduce something like an exponential backoff algorithm if you start getting too many 5xx errors or requests start hanging.
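Something along these lines, sketched here in Python (the thresholds and the fetch callable are placeholders, not anything from the tool itself):

import time

def backoff_delay(consecutive_errors, base=0.5, cap=60.0):
    """Exponential backoff: 0.5s, 1s, 2s, ... capped at 60s."""
    return min(cap, base * (2 ** consecutive_errors))

def fetch_with_backoff(fetch, url, max_attempts=6):
    """Retry fetch(url) while the server looks overloaded (5xx), backing off each time."""
    errors = 0
    for _ in range(max_attempts):
        status = fetch(url)          # assumed to return an HTTP status code
        if status < 500:
            return status            # success, or a "real" 4xx worth reporting as-is
        errors += 1
        time.sleep(backoff_delay(errors))
    return status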
This is why you'll see assets1.foo.com, assets2.foo.com, etc. all pointing to the same IP address(es). Server-side code picks one based on a modulus of a hash of the filename (or something similar) when rendering the HTML, to get additional pipelines in the user's browser. Not sure how, or whether, this is done in SPAs.
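The server-side selection can be as simple as hashing the path and taking it modulo the number of shards. A sketch (the host names are just the ones mentioned above):

from zlib import crc32

ASSET_HOSTS = ["assets1.foo.com", "assets2.foo.com", "assets3.foo.com"]

def asset_url(path):
    """Pick a stable host for this file so it always hits the same hostname
    (keeping caches warm) while spreading files across the shards."""
    host = ASSET_HOSTS[crc32(path.encode()) % len(ASSET_HOSTS)]
    return f"https://{host}{path}"

# asset_url("/img/logo.png") always returns the same host for that path.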
How does it compare to the older Python "linkchecker", which was resurrected here (and is quite fast): https://github.com/linkchecker/linkchecker
I still use linkchecker (because I’ve never completed my Rust-based link checker that I started several years ago), and have extended it at work to support client-side certificates which we use on CI, but I’m generally fairly unimpressed with linkchecker.
Separately, it’d be awesome if this also checked that anchor links resolve to id values to validate linking to specific elements on a page.
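A rough sketch of that check using just the standard library (a real checker would hook into whatever HTML parsing it already does):

from html.parser import HTMLParser

class IdCollector(HTMLParser):
    """Collect every id= attribute on the page; #fragment links target these."""
    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)

def anchor_resolves(html, fragment):
    """Return True if '#fragment' points at an existing id in the page."""
    collector = IdCollector()
    collector.feed(html)
    return fragment in collector.ids

# anchor_resolves('<h2 id="install">Install</h2>', "install")  -> True
# anchor_resolves('<h2 id="install">Install</h2>', "usage")    -> False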
I wrote https://github.com/jwilk/urlycue , but I'm not quite happy about it, I don't have energy to improve it either; so I'm looking for alternatives.
Edit: I see it's not quite that simple. However, I still think that with some stricter matching requirements this could work.
The URI language is of course regular, so it would be possible to construct a regexp that matches only URIs. But naively applying such a regexp wouldn't work in practice, because many punctuation characters are allowed in URIs. For example, single quotes are allowed, so in this Python code the regexp would match too much:
homepage = 'http://example.com/'
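For instance, with a naive pattern (made up here for illustration) the match swallows the closing quote:

import re

# Naive: "http(s):// followed by any run of non-whitespace characters".
NAIVE_URL = re.compile(r"https?://\S+")

line = "homepage = 'http://example.com/'"
print(NAIVE_URL.search(line).group())   # -> http://example.com/'  (trailing quote included)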
I run it in a CI job which builds a static site. It works pretty well for me.
Although, based on a quick look at the code, this thing isn't going to go particularly fast.
For more customizable spidering, Scrapy lets you build your own spiders and even deploy spider daemons to run in production (https://doc.scrapy.org/en/latest/topics/deploy.html). For an out-of-the-box option, try Spidy (https://github.com/rivermont/spidy). For super serious spidering, try Heritrix (https://webarchive.jira.com/wiki/spaces/Heritrix/overview) or Nutch (https://nutch.apache.org/).
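A minimal Scrapy spider for collecting links looks roughly like this (the spider name and start URL are placeholders):

import scrapy

class LinkSpider(scrapy.Spider):
    name = "links"
    allowed_domains = ["example.com"]
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # Record every link on the page, then follow same-site links recursively.
        for href in response.css("a::attr(href)").getall():
            yield {"page": response.url, "link": response.urljoin(href)}
            yield response.follow(href, callback=self.parse)

# Run with: scrapy runspider link_spider.py -o links.json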
Here's an interesting read on crawling a quarter billion pages in 40 hours: http://www.michaelnielsen.org/ddi/how-to-crawl-a-quarter-bil... From my own experience crawling massive dynamic state-driven websites, even if you're trying to just grab a single page, you will eventually want the extra features.