Maybe the embedding could be paired with a set of words that embed to somewhere close to the original embedding? Then the embedding can be updated for new models by re-embedding those words. (And it would be more interpretable by a human.)
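Roughly what I'm picturing, just a sketch; `embed_old`, `embed_new`, and the word list here are placeholders, not any particular library's API:

```python
# Sketch: describe an embedding by its nearest vocabulary words,
# then "port" it to a new model by re-embedding those words.
import numpy as np

def nearest_words(vec, vocab, embed_old, k=10):
    """Return the k vocab words whose old-model embeddings are closest (cosine) to vec."""
    vecs = np.stack([embed_old(w) for w in vocab])
    sims = vecs @ vec / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(vec) + 1e-9)
    top = np.argsort(-sims)[:k]
    return [vocab[i] for i in top]

def port_embedding(words, embed_new):
    """Approximate the original point in the new model's space by averaging the words' new embeddings."""
    return np.mean([embed_new(w) for w in words], axis=0)
```

Averaging is just the simplest way to combine them; you could also keep the similarity scores and do a weighted average.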
I mean, it was just a thought I had. Maybe a "solution in search of a problem". I generate those a lot! haha. But it seems to me that with some sort of canonical set of training data and a canonical LLM architecture, we'd of course be able to generate consistent embeddings; I'm just not sure what the use cases are.