Basically you drive the "strobe" line of the shift register in the controller high, putting it in a state where it continuously updates to reflect the current button presses. Once you bring the line low again, it "freezes" its 8-bit state, representing whichever combination of the eight buttons was pressed at that moment, and then you make 8 successive reads from the shift register's serial line, reading off the state 1 bit at a time. It's up to the game software to be robust against bouncing, as well as against a tricky hardware bug where the DMC channel of the NES's audio processing unit conflicts with the latching mechanism used by the controller's serial line. Needless to say, accurate NES emulators have to emulate all of this, too :)
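To make the strobe-then-read sequence concrete, here's a rough Python sketch of it. This is a toy model, not emulator code: `ControllerModel`, `read_buttons` and `read_buttons_stable` are made-up names, the bit order (A, B, Select, Start, Up, Down, Left, Right) is the standard NES pad order, and the "read until two passes agree" loop is just one common style of software workaround for glitchy reads, assumed here for illustration.

```python
# Standard NES pad order: the first bit shifted out is A, the last is Right.
BUTTONS = ["A", "B", "Select", "Start", "Up", "Down", "Left", "Right"]


class ControllerModel:
    """Toy model of the controller's parallel-in, serial-out shift register."""

    def __init__(self):
        self.pressed = set()  # names of buttons currently held down
        self.strobe = False   # level of the strobe line
        self.shift = 0        # latched 8-bit state; bit 0 is read first

    def _sample(self):
        # Parallel load: copy the current button state into the register.
        self.shift = 0
        for i, name in enumerate(BUTTONS):
            if name in self.pressed:
                self.shift |= 1 << i

    def write_strobe(self, high):
        # While strobe is high the register follows the buttons; on the
        # falling edge it keeps whatever it saw last.
        if high or self.strobe:
            self._sample()
        self.strobe = high

    def read_bit(self):
        if self.strobe:
            # Still tracking the buttons: every read just returns A.
            self._sample()
            return self.shift & 1
        bit = self.shift & 1
        self.shift >>= 1  # each read clocks the register by one position
        return bit


def read_buttons(pad):
    """Strobe high, strobe low, then eight serial reads."""
    pad.write_strobe(True)
    pad.write_strobe(False)
    return [name for name in BUTTONS if pad.read_bit()]


def read_buttons_stable(pad, max_tries=4):
    """Re-read until two consecutive passes agree (assumed workaround)."""
    previous = read_buttons(pad)
    for _ in range(max_tries):
        current = read_buttons(pad)
        if current == previous:
            break
        previous = current
    return previous


pad = ControllerModel()
pad.pressed = {"A", "Start", "Left"}
print(read_buttons(pad))
```

Note that changing `pad.pressed` after the strobe goes low has no effect on the eight reads that follow, which is the "freeze" behavior described above.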
Look closely: the '165 is a little different, both in functionality and pinout, so you won't be able to swap one in for the other. But yes, both are parallel-in, serial-in/out shift registers.
The 4021 and 74LS165 are similar, inasmuch as they're both PISO shift registers, but they're not the same part. Here are representative datasheets for both parts from TI:
[1] http://www.datasheetcatalog.com/info_redirect/datasheet/phil...