This is generally clickbait with a large amount of vapid information. To be honest, I'm surprised HackerNews is giving it the attention it is. I would not encourage giving this any more attention.
As an early draft, the authors putting in a placeholder like ~'this model uses a lot of compute {TODO: put in cost estimates here?}' does not at all mean 'the authors didn't even know how much it cost to train the model!' Additionally, of course the toxicity went down. There's a world of RLHF between that original draft and the released model, and they've shown that it significantly lowered the toxicity relative to the untrained base model. If the author of the tweets had done their due diligence, they might have noticed that.
Rather obviously, around the time the model was originally being developed, text-only was essentially the only way LLMs were done. Pivoting to multi-modal is just a natural part of following what works and what doesn't. This is really straightforward, and I'm mind-boggled that it's getting attention over discourse that is meaningful to the tidal wave of change coming with these models.
One final sign that this is a bit of shoveltext: at the bottom, the author offers up vague concerns followed by mass-tagging accounts with high follower counts, including Elon Musk.
I'd encourage you not even to give the tweet the benefit of your view count and to just move on to more valuable discussions that are taking place. Why not take a look at a fun little thread like https://news.ycombinator.com/item?id=35283721 ? (Not affiliated other than the fact that I made the first comment on it; I just pulled it from the rising threads on HN's front page.)