Hacker News

Where in the codebase is the logic specific to TPU vs. CUDA?

The codebase relies heavily on the PyTorch/XLA libraries (torch_xla.*), which are TPU-specific. The key TPU-specific pieces are XLA device initialization, SPMD execution mode, TPU-aware data loading, and mesh-based model partitioning.

[0] https://github.com/felafax/felafax/blob/main/llama3_pytorch_...

[1] https://pytorch.org/xla/master/
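To make the distinction concrete, here is a minimal sketch of how a codebase typically selects a TPU (XLA) device versus a CUDA device. The `pick_device` helper is hypothetical and not from the felafax repo; the `torch_xla` and `torch` calls shown are the standard public APIs from the libraries linked above, and the fallbacks let the sketch run even where neither library is installed.

```python
def pick_device():
    """Prefer a TPU (via the XLA backend), then CUDA, then CPU.

    Hypothetical helper illustrating the TPU-vs-CUDA split described
    in the comment; not code from the felafax repository.
    """
    try:
        # TPU path: torch_xla exposes the XLA device (pytorch.org/xla).
        import torch_xla.core.xla_model as xm
        return xm.xla_device()
    except ImportError:
        pass
    try:
        # CUDA path: plain PyTorch device selection.
        import torch
        if torch.cuda.is_available():
            return torch.device("cuda")
        return torch.device("cpu")
    except ImportError:
        # Neither library installed; return a plain string placeholder.
        return "cpu"

print(pick_device())
```

The point of the comment is that felafax has no such branch for CUDA: the TPU-specific `torch_xla` calls (device init, SPMD, mesh partitioning) are used directly throughout.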



