Hacker News
aliljet
1 day ago
on:
Qwen-AgentWorld: Language World Models for General...
The benchmarks here are confusing at best. Am I reading correctly that this model is essentially as good as or better than all frontier models right now?
anana_
1 day ago
I believe the benchmark listed is about simulating the environment for the various tasks, rather than doing them. It seems that the point of this model is to generate sim data for improving other models.
blourvim
1 day ago
Benchmarks in general are a little iffy; the whole industry is going off vibes anyway. Can't decide before trying it out.