Feedback from early readers was that the work was too large to digest in a single reading, so I split it up into a series of posts. I'm not entirely sure this was the right call; the sections I thought were the most interesting seem to have gotten much less attention than the introductory preliminaries.
I think these articles may benefit from a more thorough table of contents at the beginning, or from some kind of abstract. If you briefly presented the whole list of topics in a single article, it would be clearer that your views on the topic are more complete. I initially thought the table of contents would be scoped to the article itself rather than connecting it to the adjacent ones.
I had never heard of you, and this article appeared very biased to me. I found the information ecology piece superior; it's a shame that it went unnoticed. I will try to go through all of them. I admire the breadth of topics you’re covering and appreciate the many sources. They’re clearly written in your own voice and that is great to see; I guess I mostly reacted to not being fully aligned with your view.
I'm not sure that HN vote count is a good indicator of interest? HN alerted me to the existence of the intro post. I read the intro, noticed that it was one in an ongoing series, and have been checking your blog for new installments every few days.
I suspect that if you'd not broken up the post into a series of smaller ones, the sorts of folks who are unwilling to read the whole thing as you post it section by section would have fed the entire post to an LLM to "summarize".
Sure, but 4 front-page posts from the same URL in 4 days surely sits at the tail of the distribution. (I guess they all capitalize on the same 'LLM-is-bad' sentiment.)
It's also aphyr, who is incredibly popular. Take one very popular author, have him write a series of posts on the zeitgeist everyone can't help but talk about, and yes, the outcome is that his posts are extremely popular.
I still remember his takedown of MongoDB's claims in the "Call Me Maybe" post years and years ago filling me with a good bit of awe.
Different URL, same domain, and exactly the kind of thing I’d expect a fair number of HN readers to have in a feed reader where they’d see it shortly after publication and decide to share it.
Also, if you think this is just “LLM is bad”, I highly suggest reading the series first. The social impacts they talked about at the start of the series should resonate with a lot of people here and are exactly the kind of thing which people building systems should talk about. If you’re selling LLMs, you still want to think about how what you’re building will affect the larger society you live in and the ways that could go wrong—even if we posit sociopath/MBA-levels of disregard for impacts on other people, you still want to think about how LLMs change the fraud and security landscape, how the tools you build can be misused, how all of this is likely to lead to regulatory changes.
FWIW ZeRO-3 refers to a common strategy for sharding model components across GPUs (in PyTorch, commonly called FSDP or FSDP2, Fully Sharded Data Parallel). The "3" is the level of sharding (how much stuff to distribute across GPUs, e.g. just optimizer state, versus gradients and weights as well, etc.)
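To make the "level of sharding" concrete, here is a rough back-of-the-envelope sketch of per-GPU memory for model state at each ZeRO stage. It assumes mixed-precision Adam with the usual accounting of 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer state; the function name and defaults are made up for illustration, and activations and buffers are ignored.

```python
def zero_bytes_per_gpu(num_params, num_gpus, stage,
                       param_bytes=2, grad_bytes=2, optim_bytes=12):
    """Approximate model-state bytes held by each GPU at a given ZeRO stage.

    Hypothetical helper for illustration only. Assumed byte counts:
    fp16 weights (2), fp16 gradients (2), fp32 Adam state (12) per parameter.
    stage 0: nothing sharded (plain data parallel)
    stage 1: optimizer state sharded
    stage 2: optimizer state and gradients sharded
    stage 3: optimizer state, gradients, and weights sharded
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")

    params = num_params * param_bytes
    grads = num_params * grad_bytes
    optim = num_params * optim_bytes

    # Each stage shards one more component evenly across the GPUs.
    if stage >= 1:
        optim /= num_gpus
    if stage >= 2:
        grads /= num_gpus
    if stage >= 3:
        params /= num_gpus

    return params + grads + optim


if __name__ == "__main__":
    # Example: a 7.5B-parameter model on 64 GPUs.
    for stage in range(4):
        gb = zero_bytes_per_gpu(7_500_000_000, 64, stage) / 1e9
        print(f"ZeRO-{stage}: {gb:.1f} GB per GPU")
```

Under these assumptions the jump from stage 2 to stage 3 is the big one for large models, since the weights themselves stop being replicated on every GPU, at the cost of extra communication to gather them when needed.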
It's important to note that pathological liars don't stop lying. In fact, when they're caught lying red-handed, they usually double down and lie even more.
Pretty disgusting behavior from the founders, just posting as normal on LinkedIn/Twitter as if this is run-of-the-mill. Fraudsters need to be nipped in the bud, lest we get Trump-like scenarios.