These typically work by changing the on-disk bit encoding so that it is easier to decode with plain 6502 instructions instead of a lookup table. Others simplify the table and use more RAM so that fewer lookups are needed. I haven’t looked at how Transwarp works yet, but it seems to be the latter.
Amazing. It should be noted that 50x faster is 20 KB/s; the original 1541 firmware was stupendously crippled. Loading the entire C64 memory space in 4 seconds is a tremendous quality-of-life improvement for a C64 user. Heck, you can read the entire disk in less than 9 seconds.
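A quick back-of-the-envelope check of those figures. The assumptions are mine, not from the thread: 1 KB is taken as 1000 bytes, and a standard 35-track 1541 disk is taken as 683 blocks of 256 bytes.

```python
# Rough sanity check of the quoted numbers.
# Assumptions (not from the thread): 1 KB = 1000 bytes, and a standard
# 35-track 1541 disk holds 683 blocks of 256 bytes each.
FAST_BYTES_PER_SEC = 20_000      # "20 KB/s"
SPEEDUP = 50                     # "50x faster"
C64_MEMORY_BYTES = 64 * 1024     # full 64 KB address space
DISK_BYTES = 683 * 256           # raw block capacity of one disk side

# Implied speed of the stock loader, derived only from the two figures above.
stock_bytes_per_sec = FAST_BYTES_PER_SEC / SPEEDUP

memory_seconds = C64_MEMORY_BYTES / FAST_BYTES_PER_SEC
disk_seconds = DISK_BYTES / FAST_BYTES_PER_SEC

print(f"implied stock speed: {stock_bytes_per_sec:.0f} bytes/s")
print(f"64 KB at 20 KB/s:    {memory_seconds:.2f} s")
print(f"full disk at 20 KB/s: {disk_seconds:.2f} s")
```

Under these assumptions the "4 seconds" and "less than 9 seconds" claims both hold, and the stock loader comes out at roughly 400 bytes per second.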
Also: "Plain Commodore GCR encodes 4 bytes raw to 5 bytes on disk, Transwarp encodes 3.5 bytes raw to 5 bytes on disk. (Thus only 223 = $df instead of 256 bytes per block, the 224th byte is the checksum.)
One trick is to encode those 3.5 bytes using a subset of plain Commodore GCR, such that it works just fine with d64 and regular transfer/copy solutions. Part of that trick is that decoding a GCR byte drive-side just takes a table lookup (4 cycles). =)"
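For reference, here is a minimal sketch of plain Commodore GCR (4 raw bytes to 5 bytes on disk) plus the block-size arithmetic from the quote. The table is the standard 1541 nybble-to-5-bit mapping; the function names are mine. It does not show Transwarp's own 3.5-byte scheme, since the quote doesn't say which subset of codes it uses.

```python
# Standard Commodore GCR: each 4-bit nybble maps to a 5-bit code.
GCR_ENCODE = [0x0A, 0x0B, 0x12, 0x13, 0x0E, 0x0F, 0x16, 0x17,
              0x09, 0x19, 0x1A, 0x1B, 0x0D, 0x1D, 0x1E, 0x15]
GCR_DECODE = {code: nybble for nybble, code in enumerate(GCR_ENCODE)}


def gcr_encode(raw: bytes) -> bytes:
    """Encode exactly 4 raw bytes into 5 GCR bytes (hypothetical helper)."""
    if len(raw) != 4:
        raise ValueError("plain GCR works on groups of 4 bytes")
    bits = 0
    for b in raw:
        # High nybble first, then low nybble, 5 bits each.
        bits = (bits << 5) | GCR_ENCODE[b >> 4]
        bits = (bits << 5) | GCR_ENCODE[b & 0x0F]
    return bits.to_bytes(5, "big")  # 8 codes * 5 bits = 40 bits


def gcr_decode(encoded: bytes) -> bytes:
    """Decode 5 GCR bytes back into 4 raw bytes (hypothetical helper)."""
    if len(encoded) != 5:
        raise ValueError("plain GCR decodes groups of 5 bytes")
    bits = int.from_bytes(encoded, "big")
    # Pull out the eight 5-bit codes, most significant first.
    nybbles = [GCR_DECODE[(bits >> shift) & 0x1F] for shift in range(35, -1, -5)]
    return bytes((nybbles[i] << 4) | nybbles[i + 1] for i in range(0, 8, 2))


# Block-size arithmetic from the quote: a 256-byte block carries
# 256 * 3.5 / 4 = 224 payload bytes, i.e. 223 data bytes plus a checksum.
TRANSWARP_PAYLOAD = 256 * 7 // 8
TRANSWARP_DATA = TRANSWARP_PAYLOAD - 1

sample = bytes([0x08, 0x01, 0x00, 0x12])
print(gcr_encode(sample).hex(), gcr_decode(gcr_encode(sample)) == sample)
print(TRANSWARP_PAYLOAD, hex(TRANSWARP_DATA))
```

A real drive-side decoder wouldn't do bit arithmetic like this; as the quote says, the point of picking a convenient subset is that one table lookup per GCR byte is enough.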
I haven't looked at it, but I bet it's hand-written assembly. You can disassemble that. It probably won't be easy, but with a memory map of the C64 and the 1541 you could work out what it's doing and learn from it.
Some discussion with the author: https://www.lemon64.com/forum/viewtopic.php?t=78558&start=0
There’s a newer version here: https://csdb.dk/release/?id=214786
Release notes: https://csdb.dk/release/?id=214786&show=notes#notes
More details from another advanced loader:
https://www.linusakesson.net/programming/gcr-decoding/index....
https://www.linusakesson.net/software/spindle/v3.php