No, I think I've been pretty clear that I'm interested in how mechanically sound the simulation is. Also, those measures cover an even shorter duration, so they're even less relevant to how coherent it is at real game scales.

How should this be concretely evaluated and measured? A vibe check?

I think the study’s evaluation, using very short videos and humans, is much more of a vibe check than what I’ve suggested.

Off the top of my head: DOOM is open source, so it should be reasonable to set up repeatable scenarios and use some frames from the game to create an identical starting scenario for the simulation. The player's input to the game could then be used to drive the simulated version. You could go further and instrument events occurring in the game for direct comparison with the simulation. I’d be interested in setting a baseline for playtime of the level in question and using sessions of around that length as the ultimate test.
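To make the comparison step concrete, here's a rough sketch of what I mean, not a real harness. It assumes you've already logged per-tick state from the instrumented game and somehow recovered the same values from the simulated run (e.g. by reading the HUD off the generated frames, which is its own problem). `TickState`, its fields, and `first_divergence` are names I made up for illustration; nothing here comes from the DOOM source or the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TickState:
    """Hypothetical per-tick snapshot. Fields are illustrative only."""
    health: int
    ammo: int


def first_divergence(
    reference: Sequence[TickState],
    simulated: Sequence[TickState],
    health_tolerance: int = 0,
) -> Optional[int]:
    """Return the index of the first tick where the two traces disagree.

    Both traces are assumed to start from the same scenario and to be
    driven by the same recorded player input. Returns None if they match.
    """
    for i, (ref, sim) in enumerate(zip(reference, simulated)):
        health_off = abs(ref.health - sim.health) > health_tolerance
        if health_off or ref.ammo != sim.ammo:
            return i
    # One trace ending early also counts as a divergence.
    if len(reference) != len(simulated):
        return min(len(reference), len(simulated))
    return None


# Made-up numbers: the game applies floor damage, the simulation doesn't.
game = [TickState(100, 50), TickState(100, 50), TickState(95, 50)]
sim = [TickState(100, 50), TickState(100, 50), TickState(100, 50)]
print(first_divergence(game, sim))
```

Run that over sessions about as long as a normal playthrough of the level and report how far the simulation gets before it first diverges.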

There are some obvious mechanical deficiencies in the videos they’ve published. One that really stood out to me was the damage taken while in the radioactive slime. So I don’t think the analysis would need to be particularly deep to find differences.

