Solving the strawberry problem will probably require a model that works directly on the bytes of the text. There have been a few attempts at building this [1], but so far they just don't work as well as models that consume pre-tokenized strings.
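To make it concrete, here's a minimal sketch of what the model actually gets handed; the tokenizer library and vocabulary name are just common choices I picked for illustration, not anything tied to a particular model:

```python
# Rough sketch (assumes the tiktoken package; cl100k_base is just one
# common BPE vocabulary, not necessarily the one any given model uses).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("strawberry")
print([enc.decode([i]) for i in ids])  # a few subword chunks, not letters
print("strawberry".count("r"))         # 3 -- trivial once you work on characters
```

The model only ever sees the token ids, so "how many r's" asks it about structure it was never directly shown.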
Or just a way to compel the model to do more work without needing to ask (isn't that what o1 is all about?). If you do ask for the extra effort it works fine.
+ How many "r"s are found in the word strawberry? Enumerate each character.
- The word "strawberry" contains 3 "r"s. Here's the enumeration of each character in the word:
-
- [omitted characters for brevity]
-
- The "r"s are in positions 3, 8, and 9.
I tried that with another model not that long ago and it didn't help. It listed the right letters, then turned "strawberry" into "strawbbery", and then listed two r's.
Even if these models did have a concept of the letters that make up their tokens, the underlying problem would still exist. We catch these mistakes, and we can work around them by rewording the question until the model answers correctly, because here it's easy to see how wrong the output is. But even if we fix this particular problem, we still don't know whether these models are correct in the more complex use cases.
In scenarios where people use these models for actually useful work, we don't get to keep altering our queries until the correct answer comes out, because we don't already know the answer. If the models can't answer the question when asked normally, they can't be trusted.
I think o1 is a pretty big step in this direction, but the really tricky part is going to be getting models to figure out what they're bad at and what they're good at. They already know how to break problems into smaller steps, but they need to know which problems need to be broken up, and what kind of steps to break them into.
One of the things that makes that problem interesting is that during training, “what the model is good at” is a moving target.
Perhaps. LLMs are trained to be as human-like as possible, and you most definitely need to know how the individual human you are asking works if you want a reliable answer. It stands to reason that you would need to understand how an LLM works as well.
The good news is that if you don't have that understanding of an LLM, at least you'll laugh it off with "Boy, that LLM technology just isn't ready for prime time, is it?" Whereas when you don't understand how the human works, that leads to name calling at the very least (e.g. "how can you be so stupid?!"), a bigger fight, or even all-out war at the extreme end of the spectrum.
You're right in the respect that I need to know how humans work to ask them a question: if I were to ask my dad how many Rs are in strawberry, he would say "I don't have a clue", because he doesn't speak English. But he wouldn't hallucinate an answer; he would admit that he doesn't know what I'm asking him about. I gather that here the LLM is convinced that the answer is 2, but that means LLMs are being trained to be alien, or at least that when I'm asking questions I need to be precise about what I'm asking (which isn't any better). Or maybe humans also hallucinate 2, depending on the human.
It seems your dad has more self-awareness than most.
A better example is right here on HN. 90% of the content found on this site is just silly back-and-forths around trying to figure out what the other person is saying, because the parties never took the time to stop and figure out how each other works so they could tailor the communication to what is needed for the actors involved.
In fact, I suspect I'm doing that to you right now! But I didn't bother trying to understand how you work, so who knows?
It's interesting how all the focus is now primarily on decoder-only next-token-prediction models. Encoders (BERT, the encoder of T5) are still useful for generating embeddings for tasks like retrieval or classification. While there is a lot of work on fine-tuning BERT and T5 for such tasks, it would be nice to see more research on better pre-training architectures for embedding use cases.
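For reference, the usual recipe looks roughly like this; the model name and the mean-pooling step are just common defaults I'm assuming for the sketch:

```python
# Sketch: sentence embeddings from an encoder (BERT via Hugging Face
# transformers, one common choice), mean-pooled over non-padding tokens.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["retrieval needs embeddings", "so does classification"]
batch = tok(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state       # (batch, seq_len, dim)

mask = batch["attention_mask"].unsqueeze(-1)        # zero out padding positions
embeddings = (hidden * mask).sum(1) / mask.sum(1)   # (batch, dim)
print(embeddings.shape)                             # torch.Size([2, 768])
```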
I believe RWKV is actually an architecture that can be used for encoding: given an LSTM/GRU, you can simply take the last state as an encoding of your sequence. The same should be possible with RWKV, right?
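In code, the idea would look roughly like this; I'm using a plain GRU as a stand-in, since I can't speak to RWKV's exact API:

```python
# Sketch: use the final hidden state of a recurrent model as a fixed-size
# encoding of a variable-length sequence (GRU standing in for RWKV-style
# recurrence; the dimensions are arbitrary).
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128
embed = nn.Embedding(vocab_size, embed_dim)
gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)

tokens = torch.randint(0, vocab_size, (1, 17))  # one sequence of 17 tokens
_, last_state = gru(embed(tokens))              # last_state: (1, 1, hidden_dim)
encoding = last_state.squeeze(0)                # the sequence embedding
print(encoding.shape)                           # torch.Size([1, 128])
```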
Let's be fair and acknowledge the difference between a small local farm and the larger "industrial"-like ones. I grew up on a farm, and I know others who did too; people are kind to their animals.
Of course, the larger ones don't care. The same happens even when producing vegetables: no regard for nature in those cases either.
For JIT-ing you need to know the sizes upfront. There was an experimental branch for introducing jagged tensors, but as far as I know, it has been abandoned.
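The usual workaround is to pad everything to a fixed (or bucketed) length so the compiled function only ever sees static shapes. A rough sketch, assuming a JAX-style jit since the thread doesn't name the framework:

```python
# Sketch: pad variable-length inputs to a fixed length so the jitted
# function compiles once instead of re-tracing for every new shape.
import jax
import jax.numpy as jnp

@jax.jit
def score(tokens, mask):
    # toy computation; jit specializes on the static shapes of its inputs
    return jnp.sum(tokens * mask)

def pad_to(tokens, max_len):
    # pad with zeros and carry a mask so the padding doesn't affect the result
    pad = max_len - tokens.shape[0]
    return (jnp.pad(tokens, (0, pad)),
            jnp.pad(jnp.ones_like(tokens), (0, pad)))

for seq in ([1, 2, 3], [4, 5], [6, 7, 8, 9]):
    t, m = pad_to(jnp.asarray(seq), max_len=8)
    print(score(t, m))  # every call reuses the same (8,)-shaped compilation
```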
Back when GST was introduced, they asked all exporters of software services to pay the 18% GST and then claim a refund because exports are exempted. This was before they introduced a process to apply for a "NOC" (which must be renewed periodically I think). So, I duly paid GST for the first couple of months.
5 years later, I'm yet to receive the refunds. First, they said that they were authorized to sanction refunds only for exporters of physical goods; then they said that this is handled by the State government, who said that it was handled by the Central government. Finally, the Central government accepted that they are in charge of the refund, but by then COVID happened and I could not follow up for two years. In May, when I tried to claim the refund again, they said that this refund pertains to 2016, that it's difficult to "reopen" the accounts for that year, and that it's handled by yet another department. So, they have written to them to ask about the process. No replies after two emails. Story continues...
My family’s liquor shop was forced to close because our accountants could not keep up with the constant GST changes. They quit and we could not find another on short notice. The worst part was that the shop was on the hook for the unpaid taxes of upstream suppliers - money that is yet to be fully refunded after years.
India's legal system and the enforcement of its laws are among the biggest drags on a country with immense potential.
Yes, you can get a waiver if you apply for something called the "Letter of Undertaking" which allows you to not pay anything for exports. This was not available at the start.
Modules were my #1 most-wanted feature in C++, but they are very disappointing to me:
- they allow `.` in their names to signify hierarchy, but have no built-in notion of hierarchy or submodules. This means the resulting error messages will necessarily be suboptimal
- they are orthogonal to namespaces, meaning that if you have a module `foo.bar`, any identifier it exports will be in the global namespace unless you put it in the `foo::bar` namespace inside the `foo.bar` module file. As a user, this means I can import a `foo.bar` module and its symbols may land in any namespace whatsoever.
- btw, what happens if two different modules export the same name? I'll let you guess :-D. Modules were the chance to eliminate this kind of issue by eliminating the notion of shared namespaces, but oh well...
- modules have interface units, implementation units, partitions, interface partitions and implementation partitions. There must be exactly one non-partition interface unit, called the primary module interface. Instead of, you know, just having module and submodule files like every modern module system in existence.
- modules add new and fun ways to get ill-formed, no-diagnostic-required programs, such as not re-exporting all interface partitions.
I really hoped modules would be the chance to introduce a simpler compilation model for C++. I'm not pleased with the result.
That's just the usual legacy-driven design, like timespec. It's actually a very good move that the module separator is '.' instead of something ugly like '::' or, even worse, ':::'. Not even new languages can easily let go of angle brackets and double colons.
Yikes, I was so excited for modules back when I was still writing C++17 and a bit of early C++20. It's sad that they have so many problems and/or are still just plain not implemented. Honestly, I'm happy I've moved on from the language.
Popular languages that went for a strict 1:1 mapping between modules and the filesystem, like Java and Python, often end up moving away from it, or at least adding various hacks (e.g. the way you can split submodules across packages in Python).
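To make that Python hack concrete, here's a runnable sketch of a PEP 420 namespace package split across two directories; all the names are made up for the example:

```python
# Sketch: split the submodules of one package across two directories
# (a PEP 420 namespace package -- note the absence of __init__.py files).
import os, sys, tempfile

root = tempfile.mkdtemp()
for site, mod in [("site_a", "a"), ("site_b", "b")]:
    pkg_dir = os.path.join(root, site, "mypkg", "plugins")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, f"{mod}.py"), "w") as f:
        f.write(f"NAME = {mod!r}\n")
    sys.path.insert(0, os.path.join(root, site))

# Both halves resolve under the single mypkg.plugins package.
import mypkg.plugins.a
import mypkg.plugins.b
print(mypkg.plugins.a.NAME, mypkg.plugins.b.NAME)  # -> a b
```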
I don't know about Java, but in Python, 99% of the modules I create respect the 1:1 mapping between modules and filesystem.
Same in Rust: the overwhelming majority of modules I create follow the standard filesystem <-> module mapping. For generated code, I use the special syntax that allows deviating from this mapping, but that's once in a blue moon.
IMO, C++ should have taken the same approach: providing sane, correct and easy defaults, while allowing the flexibility to override them when necessary (with special annotations).
I'm disappointed that a modern C++ feature was designed in the long tradition of having bad defaults instead.
[1]: https://arxiv.org/abs/2106.12672