I don't know about the TMS99 series, but on the 6502 memory read could be as short a single cycle. The 6502 had a concept of a "zero page" which treated the first 256 bytes of memory specially, with single cycle access. So they could be used as a kind of register.
But of course there was only one ALU, and to use it you had to use the Accumulator register.
> On the 6502 memory read could be as short a single cycle…
No, it couldn't. Even a NOP was 2 cycles. Memory access is at least 3 cycles for a zero-page read (read opcode, read immediate byte, read data), or more for the more complicated addressing modes.
The 6502 definitely reads or writes every cycle! - consult the data sheet (or VICE's 64doc) for more info. This is also not hard to verify on common 6502-based hardware.
Suppose it executes a zero page read instruction. It reads the instruction on the first cycle, the operand address on the second cycle, and the operand itself on the third. 1 byte per cycle.
(For a NOP, it reads the instruction the first cycle, fetches the next byte on the second cycle, then ignores the byte it just fetched. I think this is because the logic is always 1 cycle behind the next memory access, so by the time it realises the instruction is 1 byte it's already committed to reading the next byte anyway and the best it can do is just not increment the program counter.)
Oh, it certainly accesses memory every cycle. But the effective "efficiency" of the memory access instructions is much lower -- if you were writing something like a memcpy(), for instance, they wouldn't be contributing to its transfer speed.
The Renesas 740 (M740) 6502 variant had a T processor status bit which would cause certain instructions to operate on $00,X instead of the accumulator.
But of course there was only one ALU, and to use it you had to use the Accumulator register.