Funny thing is that this textbook can easily be found on LibGen. I do not know a lot about LLM datasets, but they probably include books from these shadow libraries, right?
Nice catch. GPT-3 at least was trained on “Books2” which is very widely suspected of containing all of Libgen and Zlibrary. If the questions are in Libgen, this whole paper is invalid.
Considering that textbooks are probably the single highest quality source of training data for LLMs, I would be very surprised if OpenAI wasn't buying and scanning textbooks for their own training data (including books that aren't even in Books2).
It’s highly unlikely that they will spend hundreds of thousands of dollars on buying their own copies when it’s not even remotely clear that doing so will be enough to answer the copyright violation cases they are facing.
I’ve always thought of brutalism as the ultimate “function over form.” It’s brutal in the sense that it “brutally” strips away the form until all that’s left is the function. It’s like minimalism without the vanity.
Unfinished concrete, specifically. I would translate "brut" as "bare" or "unrefined" here. English "brutality" is related to French "brut", but in French it means something more like crude, unrefined, raw, or blunt, with no particular sense of anything animalistic or violent.
It's called brutalism because brute is the French word for concrete. To many, being made out of concrete would be a defining feature of brutalism; to others, it's the exposed building materials without a façade.
Maybe the article should link directly to that. I'm all for quirky and interesting design, unlike some in these parts, but it is too glitchy and uncomfortable for me.
Impressive and important work. I was wondering if there ever was a web version of this. The archived site mentions something, but I'm not sure if it was ever done. Using a standalone application always feels like unnecessary friction to me. I'm out of my depth here, but since the models are available, is it possible to just render them in WebGL or something like that? Would that be technically difficult?
Actually yes. The PC app can be 'easily' exported to WebGL, as shown here: https://z-anatomy.netlify.app/
The UI is buggy in this version; I share it only to show that it is possible, and in fact it should not take much work.
This was in progress just when the association dissolved, which is why it was left half done, but it is something I still have pending, along with implementing all the code improvements I am making in the VR version.
Their kit is 99 dollars, but how expensive is the actual DNA testing nowadays? It's my understanding from this [0] video that it is approximately 1000 dollars for the whole genome, but 23andMe sequences only a tiny portion of the genome (the video mentions 500,000 bases out of the 6 billion total bases of the human genome), and the sequencer processes thousands of samples in parallel, so there are some economies of scale. It is sometimes very confusing how all these tech/techish companies have such bad financials.
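For a sense of scale, here is a quick sketch of that ratio. It uses only the two figures quoted in the comment above (500,000 bases tested, 6 billion bases total), which come from the cited video and are not independently verified here. The function and constant names are just illustrative.

```python
# Figures as quoted in the comment above (from the cited video);
# treat them as rough, unverified numbers.
GENOTYPED_BASES = 500_000
GENOME_BASES = 6_000_000_000


def genotyped_fraction(genotyped: int = GENOTYPED_BASES,
                       total: int = GENOME_BASES) -> float:
    """Return the share of the genome covered, as a fraction between 0 and 1."""
    return genotyped / total


if __name__ == "__main__":
    fraction = genotyped_fraction()
    # Integer division is exact here because 6e9 is a multiple of 5e5.
    one_in = GENOME_BASES // GENOTYPED_BASES
    print(f"Share of genome tested: {fraction:.4%}")
    print(f"That is about 1 base in every {one_in:,}")
```

So on those numbers the kit looks at well under a hundredth of a percent of the genome, which is part of why it can cost so much less than whole-genome sequencing.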
Their financial reports consistently show a gross profit of approximately 50% of revenue (gross profit = revenue - cost of revenue). However, their R&D burn is enormous. Sales, marketing, general and admin costs are approximately equal to their gross profits.
In an alternate reality, you probably could have structured 23andMe into a company making a modest net profit. But that would not have matched the 6 billion dollar valuation either. And it's quite likely that a substantial part of the R&D burn was because they recognized the limitations of their core product.
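To spell out what that structure implies, here is a rough sketch with revenue normalized to 100. The 50% gross margin and "SG&A roughly equal to gross profit" come from the comment above; the R&D figure of 40 is purely a made-up placeholder, not a number from any report.

```python
def operating_income(revenue: float, gross_margin: float,
                     sga: float, rd: float) -> float:
    """Operating income = gross profit - SG&A - R&D."""
    gross_profit = revenue * gross_margin
    return gross_profit - sga - rd


if __name__ == "__main__":
    revenue = 100.0                # normalized, not a real figure
    gross_margin = 0.5             # "approximately 50%" from the comment
    sga = revenue * gross_margin   # "approximately equal to gross profit"
    rd = 40.0                      # hypothetical placeholder for the R&D spend

    result = operating_income(revenue, gross_margin, sga, rd)
    print(f"Operating income: {result}")
```

With SG&A cancelling out the gross profit, the operating loss ends up roughly equal to whatever is spent on R&D, whatever that number actually is.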
In short, 72–76% of the world’s industrial fishing vessel activity, and 21–30% of transport and energy vessel activity, is missing from public tracking systems.
The actual article from Nature, in open access: https://www.nature.com/articles/s41586-023-06825-8