After doing all of this, you make Tetris in the high-level language. It's a badass book, super well-written, and what I consider an essential text.
To give you a little head start on that: many digital circuits are clocked (that is, take one step) only on the rising edge of the clock, where it transitions from low to high. The 6502 clocks on both the rising and falling edges. Certain parts (like the hardware that initiates a memory access) fire on the rising edge, and then the part that needs that information fires on the falling edge half a cycle later (like the ALU starting to add the value that was just fetched from memory to the accumulator).
If you've ever heard someone say that a 6502 of one speed is roughly equivalent to a Z80 clocked at twice the speed, now you know why: the internal logic is essentially clocked twice as fast as the input clock.
IIRC this isn't completely correct: external memory is allowed to take up to 3/4 of a cycle to complete an access, to accommodate slower ROMs. But you get the idea.
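If it helps to see the "both edges" idea spelled out, here's a toy sketch in Python. It is not a cycle-accurate 6502 model; the function names (`active_edges`, `two_phase_trace`) are made up for illustration, and the split of work between edges just follows the description above.

```python
def active_edges(clock_hz, edges_per_cycle):
    """Clock edges per second on which internal logic can take a step."""
    return clock_hz * edges_per_cycle


def two_phase_trace(cycles):
    """List (cycle, edge, action) steps for a toy design that uses both edges.

    Assumption: memory access starts on the rising edge and the consumer
    (e.g. the ALU) acts on the falling edge, as described in the comment.
    """
    trace = []
    for cycle in range(cycles):
        trace.append((cycle, "rising", "start memory access"))
        trace.append((cycle, "falling", "ALU uses fetched value"))
    return trace


# A 1 MHz part stepping on both edges gets as many steps per second
# as a 2 MHz part stepping on the rising edge only.
dual_edge = active_edges(1_000_000, 2)
single_edge = active_edges(2_000_000, 1)

for step in two_phase_trace(2):
    print(step)
```

This only counts opportunities to do work per second. Real-world speed also depends on how many cycles each instruction takes, which is why the rule of thumb is rough.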
A 1 MHz 6502 is about as fast as a 3 MHz Z80, although your mileage may vary; some say up to a 4 MHz Z80.
Just ignore the religious stuff.