
Solving the strawberry problem will probably require a model that just works with bytes of text. There have been a few attempts at building this [1], but they just do not work as well as models that consume pre-tokenized strings.

[1]: https://arxiv.org/abs/2106.12672


Or just a way to compel the model to do more work without needing to ask (isn't that what o1 is all about?). If you do ask for the extra effort it works fine.

    + How many "r"s are found in the word strawberry? Enumerate each character.

    - The word "strawberry" contains 3 "r"s. Here's the enumeration of each character in the word:
    - 
    - [omitted characters for brevity]
    -
    - The "r"s are in positions 3, 8, and 9.


I tried that with another model not that long ago and it didn't help. It listed the right letters, then turned "strawberry" into "strawbbery", and then listed two r's.

Even if these models did have a concept of the letters that make up their tokens, the problem would still exist. We catch these mistakes and can work around them by altering the question until they answer correctly, because we can easily see how wrong the output is. But even if we fix that particular problem, we don't know whether these models are correct in more complex use cases.

In scenarios where people use these models for actual useful work, we don't alter our queries to make sure we get the correct answer. If they can't answer the question when asked normally, the models can't be trusted.


I think o1 is a pretty big step in this direction, but the really tricky part is going to be to get models to figure out what they’re bad at and what they’re good at. They already know how to break problems into smaller steps, but they need to know what problems need to be broken up, and what kind of steps to break into.

One of the things that makes that problem interesting is that during training, “what the model is good at” is a moving target.


Are you saying that I have to know how LLMs work to know what I should ask an LLM about?


Perhaps. LLMs are trained to be as human-like as possible, and you most definitely need to know how the individual human you are asking works if you want a reliable answer. It stands to reason that you would need to understand how an LLM works as well.

The good news is that if you don't have that understanding, at least you'll laugh it off with "Boy, that LLM technology just isn't ready for prime time, is it?". In contrast, when you don't understand how the human works, that leads to, at the very least, name-calling (e.g. "how can you be so stupid?!"), a grander fight, or even all-out war at the extreme end of the spectrum.


You're right in the sense that I need to know how humans work to ask them a question: if I were to ask my dad how many Rs are in strawberry, he would say "I don't have a clue" because he doesn't speak English. But he wouldn't hallucinate an answer - he would admit that he doesn't know what I'm asking him about. I gather that here the LLM is convinced that the answer is 2, which means that LLMs are being trained to be alien, or at least that when I ask questions I need to be precise about what I'm asking (which isn't any better). Or maybe humans also hallucinate 2, depending on the human.


It seems your dad has more self-awareness than most.

A better example is right there on HN. 90% of the content found on this site is just silly back-and-forths where people try to figure out what each other is saying, because the parties never took the time to stop and figure out how the other works so they could tailor their communication to what is needed for the actors involved.

In fact, I suspect I'm doing that to you right now! But I didn't bother trying to understand how you work, so who knows?


Or "when trying to answer questions that involve spelling or calculation, use python". No need for extra training really.


There are many different classes of problems that are affected by tokenization. Some of them can be tackled by code.


It's interesting how all the focus is now primarily on decoder-only next-token-prediction models. Encoders (BERT, the encoder of T5) are still useful for generating embeddings for tasks like retrieval or classification. While there is a lot of work on fine-tuning BERT and T5 for such tasks, it would be nice to see more research on better pre-training architectures for embedding use cases.


I believe RWKV is actually an architecture that can be used for encoding: given an LSTM/GRU, you can simply take the last state as an encoding of your sequence. The same should be possible with RWKV, right?


> Cows are treated very well in all the ranches I've been on.

Unfortunately the entire premise of your argument is based on a personal anecdote that's a gross generalization across a vast number of cattle farms.


Let's be fair and acknowledge the difference between a small local farm and the larger "industrial"-like ones. I grew up on a farm and I know others too; people are kind to their animals.

Of course, the larger ones don't care. The same happens even when producing vegetables; there's no regard for nature in those cases either.


For JIT-ing you need to know the sizes upfront. There was an experimental branch for introducing jagged tensors, but as far as I know, it has been abandoned.


Can you give some examples of problematic C idioms in C++?


Manipulating C-style strings and arrays instead of the respective C++ standard library classes, and calling malloc()/free() directly all over the place.
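A minimal sketch of the contrast (hypothetical function names, not from any particular codebase):

    // A sketch contrasting the C idioms with their C++ counterparts.
    #include <cstdlib>   // std::malloc, std::free
    #include <cstring>   // std::strcpy
    #include <string>
    #include <vector>

    // C-style: raw buffers, manual lifetime and size management
    void c_style() {
        char* name = static_cast<char*>(std::malloc(64));
        std::strcpy(name, "strawberry");                  // no bounds checking
        int* values = static_cast<int*>(std::malloc(10 * sizeof(int)));
        // ... easy to leak, overflow, or double-free ...
        std::free(values);
        std::free(name);
    }

    // C++ standard library: the containers own their memory and know their size
    void cpp_style() {
        std::string name = "strawberry";
        std::vector<int> values(10);
    }   // everything is released automatically here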


Every billion dollar idea is a million dollar idea.

Every million dollar idea is a 100K dollar idea.

There are always smaller segments within large segments that you can go after.


Back when GST was introduced, they asked all exporters of software services to pay the 18% GST and then claim a refund because exports are exempted. This was before they introduced a process to apply for a "NOC" (which must be renewed periodically I think). So, I duly paid GST for the first couple of months.

5 years later, I'm yet to receive the refunds. First, they said that they were authorized to only sanction refunds for exporters of physical goods, then they said that this is handled by the State government, who said that it was handled by the Centre. Finally, the Central government accepted that they are in charge of the refund, but by then COVID happened and I could not follow up for 2 years. In May, when I tried to claim the refund again, they said that this refund pertains to 2016, that it's difficult to "reopen" the accounts for that year, and that it's handled by yet another department. So, they have written to them to ask for the process. No replies after two emails. Story continues...


My family’s liquor shop was forced to close because our accountants could not keep up with the constant GST changes. They quit and we could not find another on short notice. The worst part was that the shop was on the hook for the unpaid taxes of upstream suppliers - money that is yet to be fully refunded after years.

India's legal systems and the enforcement of laws are among the biggest drags on a country with immense potential.


That sounds awful. Curious to hear if you were able to avoid GST payments in later years as should be the case for exports.


Yes, you can get a waiver if you apply for something called the "Letter of Undertaking" which allows you to not pay anything for exports. This was not available at the start.


This is exactly what happened to me.


This is awful.

I had to apply for a Foreign Export of Services Certificate as well, which turned out to be the PAN number.

They also made it impossible for services exporters to claim GST back on business expenses.


Are you sure these aren't just thinly-veiled requests for a bribe?


No, they usually are very brazen about that, but this time they are generally clueless.



What are some ways in which they are bad/broken?


modules were my #1 wanted feature in C++, but they are very disappointing to me:

- they allow `.` in their names to signify hierarchy, but have no built-in notion of hierarchy or submodules. This means the resulting error messages will have to be suboptimal

- they are orthogonal to namespaces, meaning that if you have a module `foo.bar`, any identifier it exports will live in the global namespace unless you explicitly put it in the `foo::bar` namespace inside the `foo.bar` module file. As a user, this means that when I import a `foo.bar` module, its symbols could land in any namespace (see the sketch below)

- btw, what happens if two modules you import export the same name? I'll let you guess :-D. Modules were the chance to eliminate this kind of issue by eliminating the notion of shared namespaces, but oh well...

- modules have interface units, implementation units, partitions, interface partitions and implementation partitions. There must be exactly one non-partition interface unit, called the primary module interface. Instead of, you know, just having module and submodule files like all modern module systems in existence.

- Modules add new and fun ways to get ill-formed, no-diagnostic-required programs, such as not re-exporting all interface partitions.

I really hoped that modules would be the chance to introduce a simpler compilation model for C++. I'm not pleased with the result.
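To make the namespace point concrete, here is a minimal sketch under C++20 modules (hypothetical file, module, and function names):

    // foo.bar.cppm -- primary module interface unit (the file name is just a convention)
    export module foo.bar;   // the '.' carries no hierarchy; "foo.bar" is one flat name

    export int answer() { return 42; }   // exported into the *global* namespace

    namespace foo::bar {
        export int scoped_answer() { return 43; }   // in foo::bar only because we put it there
    }

    // main.cpp
    import foo.bar;

    int main() {
        int a = answer();                   // global, despite coming from module foo.bar
        int b = foo::bar::scoped_answer();  // namespaced only because the author opted in
        return a + b;
    }

Nothing ties the module name to either the namespace or the file name, which is exactly the disconnect described above.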

Some references:

[0]: https://vector-of-bool.github.io/2019/01/27/modules-doa.html

[1]: https://vector-of-bool.github.io/2019/03/10/modules-1.html


That's just the usual legacy-driven design, like timespec. It's actually a very good move that the module separator is '.' instead of something ugly like '::' or, even worse, ':::'. Not even new languages can easily let go of angle brackets and double colons.


Yikes, I was so excited for modules back when I was still writing C++17 and a bit of early C++20. Sad that they have so many problems and/or are still just plain not implemented. Honestly, I'm happy I've moved on from the language.


Popular languages that went for strict 1:1 mapping between modules and filesystem, like Java and Python, often end up moving away from it, or at least adding various hacks (e.g. the way you can split submodules across packages in Python).


I don't know about Java, but in Python, 99% of the modules I create respect the 1:1 mapping between modules and filesystem.

Same in Rust: the overwhelming majority of modules I create follow the standard filesystem <-> module mapping. For generated code, I use the special syntax that allows not respecting this mapping, but that's once in a blue moon.

IMO, C++ should have taken the same steps: providing sane, correct and easy defaults, while allowing the flexibility to override them when necessary (with special annotations).

I'm disappointed that a modern C++ feature was designed in the long tradition of having bad defaults instead.


Yes, we plan to do that, but we will start off by first supporting a raw vector data type and search on that.

