Another great session of George and John recording Oh My Love [1] shows what I think of as embracing rough but promising ideas along with banalities like tuning guitars, learning chords, etc. The Beatles engaged in this process much more than most musicians, even though it meant most of what they played sounds relatively bad compared to tunes they'd already practiced a lot. Is this due to work ethic or some love of creation? I don't know, but it is fascinating to see the sausage get made.
Allocating 20% to safety would not be enough if safety and capability aren't aligned, i.e. unless Bostrom's orthogonality thesis is mostly wrong. However, I believe they may be sufficiently aligned in the long term for 20% to work [1]. The biggest threat imo is that more resources get devoted to AIs with military or profit-driven objectives focused on shorter-term capability and power. In that case, capability and safety are not aligned and we race to the bottom. Hopefully global coordination and this effort to achieve superalignment in four years will avoid that.
The nice thing about box breathing is that, unlike pursed-lip breathing for example, you can do it without anyone noticing, just by breathing quietly through your nose. So if you're in a stressful meeting, you can calm yourself discreetly :)
Turchin's political stability indicator predicts this due to rising economic inequality, overproduction of grads with advanced degrees, and exploding public debt [0][1].
However, civil wars usually occur in poorer countries [1][2]. Also the proportion of the world in civil war is 1% and falling exponentially [2].
Finally, Metaculus gives a 5% chance of US civil war before 2031; a recent July bump (Roe, Jan 6) is discussed in the comments [3].
Trillion-parameter networks are mentioned a few times, but Tesla is deploying much smaller networks than that (tens of millions of parameters, as I understand it). Trillion-parameter networks are mostly transformers like GPT-3 (actually 175B), etc., which are particularly parameter-heavy versus convolutional nets because they have no weight sharing. Tesla is definitely starting to use transformers though, e.g. for camera fusion, as evidenced by their focus on matrix multiply in the Dojo ASICs versus the conv-oriented ASICs in the on-vehicle chips.
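To make the weight-sharing point concrete, here's a rough back-of-the-envelope comparison (layer sizes are made-up illustrations, not Tesla's actual shapes): a 3x3 conv reuses one small kernel at every spatial position, while a dense mapping over the same flattened feature map needs a separate weight for every input/output pair.

```python
# Toy parameter counts: shared conv kernel vs. a dense layer over the same
# feature map. All sizes here are illustrative assumptions, not real networks.
c_in, c_out, k = 64, 64, 3     # channels in/out, 3x3 kernel
h, w = 80, 60                  # feature-map resolution (hypothetical)

conv_params = c_in * c_out * k * k + c_out          # one shared kernel + bias
dense_params = (c_in * h * w) * (c_out * h * w)     # unique weight per in/out pair

print(f"conv  3x3, 64->64 channels : {conv_params:,}")   # 36,928
print(f"dense over 64x80x60 inputs : {dense_params:,}")  # 94,371,840,000
```

The conv count doesn't grow with resolution at all, which is the whole point of weight sharing; the unshared mapping explodes.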
Yup, there are plenty of ML architectures that try to save on parameter count, achieving better generalization (less overfitting) at the expense of slightly costlier training and inference. The memory constraints on Tesla Dojo might not be a big deal after all.
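One concrete example of that kind of parameter saving (my example, not anything Tesla has confirmed) is the depthwise-separable convolution used in MobileNet-style architectures, which replaces a full 3x3 conv with a depthwise 3x3 followed by a pointwise 1x1:

```python
# Rough count for a standard 3x3 conv vs. a depthwise-separable one.
# Channel sizes are illustrative assumptions.
c_in, c_out, k = 256, 256, 3

standard  = c_in * c_out * k * k            # full 3x3 conv: 589,824
separable = c_in * k * k + c_in * c_out     # depthwise 3x3 + pointwise 1x1: 67,840

print(f"standard  : {standard:,}")
print(f"separable : {separable:,}  (~{standard / separable:.1f}x fewer)")
```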
[1] https://www.youtube.com/watch?v=yksV7YVuqdg