This is some great work!

One point I disagree with:

>What’s less obvious is that we can infer the size of the machine’s register file. On one hand, if 256 registers are used, the machine can still support 384 threads, so the register file must be at least 256 half-words * 2 bytes per half-word * 384 threads = 192 KiB large. Likewise, to support 1024 threads at 104 registers requires at least 104 * 2 * 1024 = 208 KiB. If the file were any bigger, we would expect more threads to be possible at higher pressure, so we guess each threadgroup has exactly 208 KiB in its register file.

>The story does not end there. From Apple’s public specifications, the M1 GPU supports 24576 = 1024 * 24 simultaneous threads. Since the table shows a maximum of 1024 threads per threadgroup, we infer 24 threadgroups may execute in parallel across the chip, each with its own register file. Putting it together, the GPU has 208 KiB * 24 = 4.875 MiB of register file! This size puts it in league with desktop GPUs.

I don't think this is quite right. To compare it to Nvidia GPUs, for example, a Volta V100 has 80 Streaming Multiprocessors (SMs), each with a 256 KiB register file (65536 32-bit wide registers, [1]). The maximum number of resident threads per SM is 2048, and the maximum number of threads per thread block is 1024. While a single thread block _can_ use the entire register file (64 registers per thread * 1024 threads per block), this is rare, and it is then no longer possible to reach the maximum number of resident threads. Reaching 2048 threads on an SM requires the threads to use no more than 32 registers each on average, with two or more thread blocks sharing the SM's register file.
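To make the V100 arithmetic concrete, here is a rough Python sketch of the register-limited thread count per SM. It only uses the numbers quoted above (65536 registers, 2048-thread cap) plus the 32-thread warp size. The function name is made up for illustration, and it ignores the real hardware's register allocation granularity and every other occupancy limit (shared memory, block slots), so treat it as an upper bound rather than what the occupancy calculator would report.

```python
def max_resident_threads(regs_per_thread,
                         regfile_regs=65536,  # 256 KiB of 32-bit registers per SM
                         thread_cap=2048,     # architectural limit per SM
                         warp_size=32):
    """Upper bound on resident threads per SM imposed by the register file.

    Simplified: ignores allocation granularity and all non-register limits.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    by_registers = regfile_regs // regs_per_thread
    # Threads are scheduled in whole warps, so round down to a warp multiple.
    by_registers -= by_registers % warp_size
    return min(thread_cap, by_registers)


for regs in (32, 64, 128):
    print(f"{regs:3d} regs/thread -> {max_resident_threads(regs)} threads per SM")
```

At 32 registers per thread the SM hits its 2048-thread cap; at 64 it drops to 1024, which is the single-full-block case described above.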

Similarly, the M1 GPU may support 24576 simultaneous threads, yet there is no guarantee it can do so while each thread uses 104 registers.
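As a quick sanity check on how much the 4.875 MiB figure depends on that assumption, the sketch below recomputes the total for 24576 threads at different per-thread register counts. The 104-register case reproduces the article's number; the 52-register case is an arbitrary value I picked purely for illustration, not anything measured on the M1. The helper name is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def register_file_bytes(threads, regs_per_thread, bytes_per_reg=2):
    """Register file size needed to keep `threads` resident at once.

    bytes_per_reg=2 matches the article's 16-bit half-word registers.
    """
    return threads * regs_per_thread * bytes_per_reg


THREADS = 24576  # Apple's stated maximum of simultaneous threads

# Article's assumption: every one of the 24576 threads can use 104 registers.
print(register_file_bytes(THREADS, 104) / MIB)  # 4.875

# Hypothetical: the thread maximum is only reachable at half that pressure.
print(register_file_bytes(THREADS, 52) / MIB)   # 2.4375
```

So the per-threadgroup 208 KiB looks well supported by the table, but the chip-wide total is only as good as the assumption that all 24 threadgroups get a full register file each.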

[1] https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.... : table 15, compute capabilities 7.0



