Hacker News | rsf's comments

> The claim that the code is inefficient is really not substantiated well in this blog post.

I didn't run benchmarks, but in the case of clang writing zeros to memory (which are never used thereafter), there's no way that particular code is optimal.

For the gcc output, it seems unlikely that the three versions are all optimal, given the inconsistent strategies used. In particular, the code that sets the output value to 0 or 1 in the size = 3 version is highly unlikely to be optimal in my opinion. I'd be amazed if it is!

Your point that unintuitive code is sometimes actually optimal is well taken though :)


Stefan Kanthak has previously noted that GCC's code generator is quite horrible, in these extensive investigations:

https://skanthak.hier-im-netz.de/gcc.html


Interestingly, it's actually not a flaw: the key appears after a while if you're not in the room (and don't have one):

https://news.ycombinator.com/item?id=17460392

I assume the agent somehow found this out and developed the behavior of going in and out of the room until the key shows up (which, with enough agent randomness, it apparently will).


> In addition, the agent learns to exploit a flaw in the emulator to make a key re-appear at minute 4:25 of the video

After a bit of debugging, this appears to be a very intentional feature in the game rather than a flaw. That key appears after a while if you're not in the room (and don't have one).

Based on this disassembly: http://www.bjars.com/source/Montezuma.asm

Here's the relevant code with some annotations added:

https://goo.gl/VUDr9F

I'm not sure if this is a previously known feature in the game (a quick Google search does not reveal much). It would be quite interesting if the RL agent was the first to find it!

PS: If you launch MAME with the "-debug" option and press CTRL+M, you can see the whole memory (the Atari 2600 only has 128 bytes of RAM!) while playing the game. If you keep an eye on the byte at 0xEA, you will know when the key is about to pop up. Alternatively, you can speed things along by changing it yourself to a value just below 0x3F.
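
For anyone who wants to reason about the timer without firing up the debugger, here is a rough toy model in Python. It is only a sketch of my reading of the behavior, not a translation of the disassembly: the assumption that the byte at 0xEA counts up by one per step, and that the key appears once it reaches 0x3F, is mine, and the helper names (make_ram, peek, poke, tick, ticks_until_key) are made up for illustration. The one hard fact used is that the 2600's 128 bytes of RAM sit at addresses 0x80 to 0xFF.

```python
RAM_BASE = 0x80         # Atari 2600 RAM is mapped at addresses 0x80-0xFF
RAM_SIZE = 128          # 128 bytes in total
KEY_TIMER_ADDR = 0xEA   # the byte to watch in the MAME memory view
KEY_THRESHOLD = 0x3F    # ASSUMPTION: key appears once the byte reaches this value


def make_ram():
    """Return a zeroed 128-byte RAM image."""
    return bytearray(RAM_SIZE)


def peek(ram, addr):
    """Read one byte using the console's address (0x80-0xFF)."""
    return ram[addr - RAM_BASE]


def poke(ram, addr, value):
    """Write one byte, like editing memory by hand in the debugger."""
    ram[addr - RAM_BASE] = value & 0xFF


def tick(ram, in_key_room, has_key):
    """Advance the hypothetical timer one step.

    Returns True when the key would appear. The timer only runs while the
    player is outside the room and is not carrying a key.
    ASSUMPTION: the counter goes up by exactly one per step.
    """
    if in_key_room or has_key:
        return False
    value = peek(ram, KEY_TIMER_ADDR)
    if value >= KEY_THRESHOLD:
        return True
    poke(ram, KEY_TIMER_ADDR, value + 1)
    return peek(ram, KEY_TIMER_ADDR) >= KEY_THRESHOLD


def ticks_until_key(ram):
    """Count steps until the key appears, staying outside the room with no key."""
    steps = 0
    while True:
        steps += 1
        if tick(ram, in_key_room=False, has_key=False):
            return steps


if __name__ == "__main__":
    ram = make_ram()
    print("steps from a fresh timer:", ticks_until_key(ram))

    ram = make_ram()
    poke(ram, KEY_TIMER_ADDR, KEY_THRESHOLD - 1)  # the "just below 0x3F" shortcut
    print("steps after poking the byte:", ticks_until_key(ram))
```

In this model, standing in the room or holding a key freezes the counter, which would fit with the agent having to leave the room and wait. How long one "step" is in real game time is not modeled here.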

