AnandTech ran bandwidth benchmarks on the M1 Max and could only use about half of the available bandwidth from the CPU, and the GPU used even less in 3D workloads because those workloads weren't bandwidth-limited. It's not all about bandwidth. https://www.anandtech.com/show/17024/apple-m1-max-performanc...
Indeed. RIP AnandTech. I've seen bandwidth tests since then that showed similar results for newer generations, but not for the M4. Not sure if the common LLM tools on Mac can use the CPU (vector instructions), AMX, and the Neural Engine in parallel to make use of the full bandwidth.