If the author had aimed for the same quality, the files would be much smaller; they instead opted for the same bitrate, because matching quality opens the can of worms of "similar quality is subjective in the eye of the author." If you watch the samples, you can clearly see how much more AV1 gets done with [roughly] the same number of bits; H.264 looks like a complete joke in comparison.
If you massaged the AV1 bitrate until it was the same blurry mess as H.264 (in your eyes), it would likely be much smaller.
Except they didn't achieve quite the same bitrate. For scene 1, for example, H.264 was 209.9 kbps, VP9 was 191.2 kbps, and AV1 was 230.1 kbps. Put another way: the AV1 stream used 39 additional kbps (or 20% more bits) than the VP9 stream. 20% is a pretty big deal (especially at these low bitrates), and it undermines the point the author was trying to make.
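The gap between those two streams is easy to sanity-check. This is just the arithmetic on the scene 1 numbers quoted above:

```python
# Bitrates quoted for scene 1 (kbps).
h264_kbps = 209.9
vp9_kbps = 191.2
av1_kbps = 230.1

extra_kbps = av1_kbps - vp9_kbps           # absolute gap, AV1 over VP9
extra_pct = extra_kbps / vp9_kbps * 100    # relative gap

print(f"AV1 used {extra_kbps:.1f} kbps ({extra_pct:.0f}%) more than VP9")
# prints "AV1 used 38.9 kbps (20%) more than VP9"
```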
The increase in quality (and decrease in bitrate) for H.264 → VP9 is really cool. But the increase in quality for VP9 → AV1 isn't as impressive, because the bitrate also increased. What would really have driven the author's point home is if the AV1 stream had been both higher quality and lower bitrate than the VP9 stream.
I was thinking of the comparison from this angle:
GIF: plays everywhere; horrible quality, giant file sizes, high CPU usage.
H.264: plays everywhere; good quality and file sizes; almost universal hardware acceleration, even on cheap devices.
VP9: plays in many places; competitive size with H.264; hardware acceleration is common, but entire popular platforms lack support.
AV1: limited support; great file sizes; hardware support has barely started shipping.
If the goal is to replace GIFs, I would weight compatibility and ease of playback much more heavily than bumping the file size savings from 95% to 97%.
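To put those savings percentages in concrete terms, here is a back-of-the-envelope sketch. The 10 MB starting size and the exact 95%/97% figures are illustrative assumptions, not measurements from the article:

```python
# Hypothetical 10 MB GIF, compressed at the two savings levels discussed above.
gif_mb = 10.0
codec_a_mb = gif_mb * (1 - 0.95)   # 95% savings -> 0.5 MB
codec_b_mb = gif_mb * (1 - 0.97)   # 97% savings -> 0.3 MB

delta_mb = codec_a_mb - codec_b_mb
print(f"95% savings: {codec_a_mb:.1f} MB, 97% savings: {codec_b_mb:.1f} MB, "
      f"difference: {delta_mb:.1f} MB")
```

Both codecs eliminate the overwhelming bulk of the GIF; the remaining gap between them is a fraction of a megabyte, which is why compatibility arguably matters more here.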