Unless I'm missing something big, this looks like a significant deal for independent developers of self-driving AI software: GPUDrive enables them to run driving simulations with hundreds of AI agents on consumer-grade GPUs at 1M FPS, and it comes with Python bindings, wrappers for PyTorch and JAX, and a friendly, standard MIT license. Thank you for sharing this on HN!
I am not an expert, but the way I understand self-driving systems is that there are multiple models running in parallel, and their outputs are fused by yet another model which outputs the raw controls/actuations. In other words, I see this model/trainer as the "conductor", telling the car how it should approach an intersection, enter a highway, deal with merging traffic or construction zones, etc.
There is another model which interprets visual data to assist with lane-keeping, slowing down or stopping for pedestrians, and informing the conductor of road signs... The final model combines all these inputs, incorporates the user preferences, and then decides whether to brake or accelerate and how much to rotate the steering wheel.
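To picture the fusion step being described, here's a rough toy sketch in PyTorch; every class name and dimension is made up purely for illustration, not how any real stack is wired:

    import torch
    import torch.nn as nn

    # Toy sketch of the layered setup described above: perception features and
    # user preferences are fused by a final model that emits low-level controls.
    # All names and sizes here are invented for illustration only.
    class ControlFusion(nn.Module):
        def __init__(self, vision_dim=64, route_dim=32, prefs_dim=8):
            super().__init__()
            self.head = nn.Sequential(
                nn.Linear(vision_dim + route_dim + prefs_dim, 128),
                nn.ReLU(),
                nn.Linear(128, 2),  # [steering command, brake/accelerate command]
            )

        def forward(self, vision_feats, route_feats, prefs):
            # Concatenate the upstream model outputs and map them to controls.
            return self.head(torch.cat([vision_feats, route_feats, prefs], dim=-1))

    fusion = ControlFusion()
    controls = fusion(torch.randn(1, 64), torch.randn(1, 32), torch.randn(1, 8))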
Idk heh. The point of the high-performance training is that you can train the "conductor" role faster, and run inference faster. Assuming the car has limited compute/GPU resources, if you have a very high-performance conductor function, you can dedicate that much more of the budget to visual/sensor inference and/or any other models, like the Trolley Problem decider (jk).
Is this just the location data being trained on, or is there image and sensor input data too? It looks like it's just location, which seems like it limits the applicability, but I'm not sure.
Edit: reading a bit more, it's somewhere in between. Afaict no raw sensor data, but different "parsed" sensor inputs are supported. I'm not sure whether this is synthetic or not. E.g., is the LIDAR view real LIDAR data from some system, or a processed result of what the system thinks LIDAR would be able to see? I can't tell.
If you have a simulation where realtime is 60 fps, you could simulate a little over 4.5 hours per second if you could run it at 1M fps. That would definitely help with the speed of learning.
He's not saying "break realtime into microsecond chunks."
He's saying: run through 4.5 hours' worth of 16-millisecond chunks of simulated time in a second. This is good for regression testing or producing training data quickly.
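Quick sanity check of that figure (plain arithmetic, assuming each simulator step corresponds to one 60 Hz frame):

    # Back-of-the-envelope check of the "a little over 4.5 hours per second" claim.
    steps_per_second = 1_000_000      # claimed simulator throughput (1M FPS)
    sim_seconds_per_step = 1 / 60     # one 60 Hz frame, ~16.7 ms of simulated time
    sim_hours_per_real_second = steps_per_second * sim_seconds_per_step / 3600
    print(sim_hours_per_real_second)  # ~4.63 simulated hours per wall-clock second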