This is a quickstart example using LeRobot and Flower that demonstrates how to train a diffusion model collaboratively across 10 individual nodes (each with its own dataset). It uses the push-t dataset, where the task is to push a T-shaped block onto a T-shaped target that remains static.
The example is pretty easy to run, and runs efficiently if you have access to a recent gaming GPU. Although the diffusion model only takes 2 GB of VRAM (you can of course scale it up), the compute needed to train it isn't negligible. For context, running the example until convergence takes 40 minutes on a dual RTX 3090 setup. Convergence takes about 30 rounds of federated learning (FL), although the example runs for 50 rounds by default.
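In each FL round, every node trains locally on its own data and the server aggregates the resulting model updates. With FedAvg, the standard strategy this kind of setup builds on, aggregation is a weighted average by local dataset size. A minimal sketch of that step (the function name and plain-list weights here are illustrative, not the example's actual code):

```python
def fedavg(weights_per_node, num_examples_per_node):
    """Weighted average of model parameters across nodes (FedAvg).

    weights_per_node: one flat list of parameter values per node.
    num_examples_per_node: local training-set size of each node,
    used to weight its contribution.
    """
    total = sum(num_examples_per_node)
    num_params = len(weights_per_node[0])
    return [
        sum(w[i] * n for w, n in zip(weights_per_node, num_examples_per_node)) / total
        for i in range(num_params)
    ]

# Two nodes with unequal dataset sizes: the node with more data
# pulls the average closer to its own parameters.
print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # → [2.5, 3.5]
```

In the real example this averaging happens over the diffusion model's tensors rather than flat lists, and Flower's built-in strategy handles it for you.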
The example runs each node/robot in simulation by default (i.e. each node is a Python process, and some clever scheduling runs the jobs in a resource-aware manner). But it is straightforward to run it as a real deployment where each node is, for example, a different device (e.g. an NVIDIA Jetson). If you are interested in doing this, check out the links at the bottom of the example's README.md.
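In recent Flower versions, that resource-aware simulation is configured in the project's pyproject.toml. A sketch of what a 10-node setup might look like (the keys below follow Flower's simulation config conventions; the exact values in the example's own pyproject.toml are authoritative):

```toml
[tool.flwr.federations.local-simulation]
options.num-supernodes = 10                      # one simulated node per dataset partition
options.backend.client-resources.num-cpus = 2    # CPUs reserved per node
options.backend.client-resources.num-gpus = 0.25 # fraction of a GPU per node
```

With fractional `num-gpus`, several simulated nodes can share one physical GPU, which is what makes a 10-node run feasible on a one- or two-GPU machine.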
I'd love to hear your feedback and thoughts!!
Check out Flower: https://flower.ai
Learn more about LeRobot: https://github.com/huggingface/lerobot
The push-t Dataset: https://huggingface.co/datasets/lerobot/pusht
Learn more about the Diffusion Policy model: https://arxiv.org/abs/2303.04137