Thanks! His active years as a programmer coincide with those of the FIG F-83 Mike Perry, which was a time when there weren't very many programmers. So it's quite a coincidence!
I vaguely recall reading somewhere that for some nontrivial software (I forget what exactly), the speedup from hardware advances between the Apple II and ~2000 was roughly equal to the speedup you'd get from running the most modern iteration of the algorithms involved on the original machine.
I've terribly butchered this since the details completely escape me, but you get what I mean. It feels like it could be true, which is... neat. This sort of thing certainly happened several times with gaming consoles, where developers were able to squeeze every last ounce of performance from the hardware at the very end of its generation.
2000x is believable, but that doesn't mean the latest algorithm will run on an Apple II. Algorithmic speedup is often hardware-relative. For example, better cache locality matters much less on older hardware.
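A toy sketch of what I mean by hardware-relative (my own example, not from anything above): the same O(n^2) sum, traversed in two different orders, can differ severalfold on a machine with a cache hierarchy, while on a flat-memory 8-bit micro the two orders would cost about the same.

    /* Hypothetical illustration: same work, two traversal orders.
     * On a cached machine the row-major walk is typically much faster;
     * on flat memory (e.g. a 6502-era micro) there'd be little difference. */
    #include <stdio.h>
    #include <time.h>

    #define N 4096

    static double grid[N][N];

    int main(void) {
        double sum = 0.0;
        clock_t t0, t1;

        /* Row-major: walks memory sequentially, cache-friendly. */
        t0 = clock();
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += grid[i][j];
        t1 = clock();
        printf("row-major:    %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        /* Column-major: strides N doubles per step, so nearly every
         * access misses the cache on a modern machine. */
        t0 = clock();
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += grid[i][j];
        t1 = clock();
        printf("column-major: %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        printf("checksum: %f\n", sum); /* keep the loops from being optimized away */
        return 0;
    }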
I find that the main takeaway is that the time/space/complexity tradeoff space becomes immense with even modest Moore's Law scaling improvements.
On the actual back-in-the-day Apple II at the time of release, you probably did not get a configuration that maxed out RAM and storage, because it just cost too darn much. But as you got into the 80's, prices came down, and the Apple could expand to fit, with more memory and multiple disks, without architectural changes.
As such, a lot of the early microcomputer software assumed RAM and storage starvation, and the algorithms had to be slow and simple to make the most of memory; but Apple developers could push the machine hard if they ignored the market forces that demanded downsizing. It was a solid workstation platform. When the 16-bit micros came around, the baseline memory configuration expanded remarkably, so completely different algorithms became feasible for all software, which made for a substantial generation gap.
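To make that generation gap concrete (again, a made-up toy example, not something from the thread): the same job, counting distinct 16-bit values, looks completely different when you have almost no spare RAM versus when you can afford to burn a 64 KB table, which is roughly the entire address space of an Apple II.

    /* Sketch of a memory-starved vs memory-rich algorithm for one task:
     * count distinct 16-bit values in an array. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Memory-starved version: no extra storage, quadratic time. */
    static int count_distinct_scan(const uint16_t *v, int n) {
        int distinct = 0;
        for (int i = 0; i < n; i++) {
            int seen_before = 0;
            for (int j = 0; j < i; j++)
                if (v[j] == v[i]) { seen_before = 1; break; }
            if (!seen_before) distinct++;
        }
        return distinct;
    }

    /* Memory-rich version: one flag per possible value, linear time,
     * but it wants 64 KB of flags up front. */
    static int count_distinct_table(const uint16_t *v, int n) {
        static uint8_t seen[65536];
        memset(seen, 0, sizeof seen);
        int distinct = 0;
        for (int i = 0; i < n; i++) {
            if (!seen[v[i]]) { seen[v[i]] = 1; distinct++; }
        }
        return distinct;
    }

    int main(void) {
        uint16_t data[] = { 3, 7, 3, 42, 7, 7, 1000 };
        int n = (int)(sizeof data / sizeof data[0]);
        printf("scan:  %d distinct\n", count_distinct_scan(data, n));
        printf("table: %d distinct\n", count_distinct_table(data, n));
        return 0;
    }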
By the 90's, the "free lunch" scenario was in full swing and caching became normalized, so everything became about faster runtimes; but at systemic scale, with layers of cache, it's often hard to pinpoint the true bottleneck.