That's basically a known fact about LLMs, they need oversight. But if they make the task 100x easier, it's still useful as a starting point. This kind of neural net analysis is difficult to do manually.
I am curious if they just start making inventories for all neurons in all layers, then they can compare models based on neuron types, or even train them to achieve the right mix of concepts.
I am curious if they just start making inventories for all neurons in all layers, then they can compare models based on neuron types, or even train them to achieve the right mix of concepts.