LLM evaluations are very sensitive to the details of how the prompt is structured. This post shows how using structured generation reduces both the variance of the results and the accompanying shifts in model rankings.
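For anyone unfamiliar with the idea, here is a minimal sketch of what structured generation looks like in an evaluation setting, written against an Outlines-style API. This is illustrative only, not necessarily the exact setup used in the post, and the API names may differ across library versions; the point is that the answer is constrained to a fixed choice set, so parsing the output is deterministic.

    # Sketch: constrain a multiple-choice eval answer with an Outlines-style API.
    # Model name and API details are assumptions, not the post's exact setup.
    import outlines

    model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")

    # The generator can only ever emit one of the listed strings, so the eval
    # harness never has to parse a free-form answer.
    generator = outlines.generate.choice(model, ["A", "B", "C", "D"])

    prompt = "Question: ...\nChoices:\nA) ...\nB) ...\nC) ...\nD) ...\nAnswer: "
    answer = generator(prompt)  # always exactly "A", "B", "C", or "D"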
That whole structured generation line of work looks promising. I hope someone else takes this and runs evaluations on other benchmarks. Curious to see if the results translate!
Agreed! While these results are very promising, there's still a lot to explore in this space.
In addition to the "prompt consistency" and "thought-control" ideas mentioned in the post, I'm definitely curious how it performs on more complex structured data (things like codegen).
This article presents a way to make structured generation with LLMs much faster than standard generation, but what I find most interesting is the discussion towards the end of the issues that tokenization introduces.
It is currently limited by the time it takes to build the index. There are obvious optimizations we can apply here; however, in a production setting it doesn't matter much, since you only need to build the index once for each (schema, vocabulary) pair.
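To make the "build once per (schema, vocabulary) pair" point concrete, here is a minimal caching sketch. The build_index function is a hypothetical stand-in for the real compilation step, not the library's actual API:

    import functools
    import json
    import time

    def build_index(schema: dict, vocabulary_id: str) -> dict:
        # Hypothetical stand-in for the expensive schema -> token-index compilation.
        time.sleep(2)  # placeholder for the real (potentially long) build
        return {"schema": schema, "vocabulary": vocabulary_id}

    @functools.lru_cache(maxsize=None)
    def get_index(schema_json: str, vocabulary_id: str) -> dict:
        # Cache key is the (schema, vocabulary) pair; the build only runs on a miss.
        return build_index(json.loads(schema_json), vocabulary_id)

    def generate(prompt: str, schema: dict, vocabulary_id: str) -> str:
        # Every request after the first for the same pair reuses the cached index.
        index = get_index(json.dumps(schema, sort_keys=True), vocabulary_id)
        # ... run constrained decoding against `index` here ...
        return f"constrained generation for: {prompt!r}"

In practice you could also persist the compiled index to disk, so a freshly launched process doesn't pay the build cost again on startup.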
Is there a rough guide as to how long to wait? I think this is definitely important: if building takes 10+ minutes (or hours?) even for very basic models, that's a fundamentally different production architecture (launching from a blank slate is no longer feasible). It's also a big devx issue.
I'd highlight this somewhere in the readme, as I wasn't sure whether it was just broken or I simply had to wait longer.