Hacker News

My techniques are probably shaped by having started 30 years ago.

I rarely enter an interactive debugger. I have TONS of logging statements I can toggle. I make program execution as deterministic and reproducible as possible (for example, all random numbers are generated from random number generators that are passed around). When something goes wrong, I turn on (or add) logging and run it again. Look for odd stuff in the log files. If it doesn't make sense, add more logging. Repeat.
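The two habits described above — injecting the RNG instead of using global randomness, and logging that can be toggled without code changes — can be sketched roughly like this (a toy Python sketch; the `simulate` function and its details are illustrative, not from the commenter's actual code):

```python
import logging
import random

log = logging.getLogger("sim")

def simulate(steps: int, rng: random.Random) -> list[int]:
    """All randomness comes from the injected rng, so the same seed
    reproduces the exact same run, bit for bit."""
    results = []
    for i in range(steps):
        value = rng.randint(0, 99)
        log.debug("step %d -> %d", i, value)  # toggled via log level, not code edits
        results.append(value)
    return results

# Two runs with the same seed are identical -- the precondition for
# "turn on logging and run it again" to show the same bug.
a = simulate(5, random.Random(42))
b = simulate(5, random.Random(42))
assert a == b
```

The key design choice is that nothing in the program calls a global `random` function directly; the generator is passed around, so a single seed pins down the whole run.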

I worked on a pretty large videogame in the 90s where /everything/ was reproducible from a "recording" of timestamped input that was automatically generated. The game crashed after half an hour of varied actions? No problem, just play the recording that was automatically saved and attached to the crash report. It was amazing how fast we fixed bugs that might otherwise take weeks to track down.
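A minimal sketch of that record-and-replay idea, assuming (as the comment implies) that all state changes flow through recorded input events — the `Game` class and event shape here are hypothetical stand-ins, not the 90s game's actual code:

```python
class Game:
    """Toy deterministic game: state changes only via apply()."""
    def __init__(self) -> None:
        self.state = 0
        self.recording: list[dict] = []  # timestamped inputs, saved automatically

    def handle_input(self, timestamp: float, delta: int) -> None:
        event = {"t": timestamp, "delta": delta}
        self.recording.append(event)  # every input is recorded as it happens
        self.apply(event)

    def apply(self, event: dict) -> None:
        self.state += event["delta"]  # stand-in for the real game logic

def replay(recording: list[dict]) -> Game:
    """Rebuild the entire session from the saved inputs alone."""
    game = Game()
    for event in recording:
        game.apply(event)
    return game

# Live session; on a crash, the recording ships with the crash report.
live = Game()
live.handle_input(0.0, 3)
live.handle_input(0.5, -1)

# Replaying the recording reproduces the exact same state.
assert replay(live.recording).state == live.state
```

This only works if `apply` is fully deterministic given the event stream, which is why the RNG-injection discipline above is a prerequisite.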




Was that game Quake 3 by any chance? After reading about how Quake 3's event system worked (http://fabiensanglard.net/quake3/), I started using those techniques not only in C++ game engines, but even in Python GUI applications and I'm now experimenting with it in Javascript with Redux. I'm a huge fan of that pattern. It takes a bit of work to set up, but it's magical when it works correctly.


Terra Nova: Strike Force Centauri. 1996, so it predated Quake 3.


Can you elaborate on how you're doing this in Redux? Are you adding a logging middleware to log actions?


Trace is a powerful tool. I've shipped OSs with internal wraparound trace buffers that ran on a million machines for years - just so when I received a crashdump from the field I'd have something to sink my teeth into. Net cost: nearly zero. Net value: occasionally golden.
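A wraparound trace buffer of the kind described can be sketched in a few lines (a toy Python version; a real OS implementation would be a fixed memory region written lock-free, but the retention behavior is the same):

```python
from collections import deque

class TraceBuffer:
    """Fixed-size ring buffer: always holds the last N trace entries,
    so a crash dump carries recent history at near-zero steady-state cost."""
    def __init__(self, capacity: int) -> None:
        self.entries: deque[str] = deque(maxlen=capacity)

    def trace(self, msg: str) -> None:
        self.entries.append(msg)  # oldest entry drops off automatically

    def dump(self) -> list[str]:
        return list(self.entries)  # what you'd pull out of a crashdump

buf = TraceBuffer(capacity=3)
for i in range(10):
    buf.trace(f"event {i}")
assert buf.dump() == ["event 7", "event 8", "event 9"]
```

The point is the cost model: tracing is always on, memory use is bounded, and the buffer is only ever read when something goes wrong.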


How would one play the recording? Was the game able to load the logs and replay the events automatically from them?


Yes, exactly. Assuming that the error was not in the rendering system, which took the bulk of the CPU time and which was isolated as much as possible, the recording could even be replayed at high speed by omitting the rendering of most frames.

I forgot to mention that we actually found a lot of bugs by seeing a playback diverge from the original recording; this was often due to uninitialized variables or reading from random memory. We could generally see when the divergence happened because we computed a checksum of the state of the world every frame and stored it in the recording as well.
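The per-frame checksum trick can be sketched like this (a hypothetical Python version using CRC32 over serialized state; the real game's checksum and state layout are unknown):

```python
import json
import zlib

def frame_checksum(world: dict) -> int:
    """Cheap checksum of the serialized world state (CRC32 here)."""
    return zlib.crc32(json.dumps(world, sort_keys=True).encode())

def replay_and_verify(events: list[int], checksums: list[int]):
    """Re-run the simulation; return the first frame whose state
    diverges from the recorded checksum, or None if it matches."""
    world = {"x": 0}
    for frame, (event, expected) in enumerate(zip(events, checksums)):
        world["x"] += event  # stand-in for the per-frame update
        if frame_checksum(world) != expected:
            return frame
    return None

# Record a session: state plus a checksum per frame.
events = [1, 2, 3]
world = {"x": 0}
checksums = []
for e in events:
    world["x"] += e
    checksums.append(frame_checksum(world))

# A faithful replay matches every checksum...
assert replay_and_verify(events, checksums) is None
# ...and a divergent one (e.g. an uninitialized-memory read changed
# behavior) is caught at the exact frame where it happened.
assert replay_and_verify([1, 5, 3], checksums) == 1
```

Storing the checksum in the recording itself means divergence is detected automatically on replay, without keeping a full copy of the original state history.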


Interesting stuff; games seem like an ideal target for such recording, because their data is available offline and they are event-focused by nature.

I wonder how/if this could be applied to network-facing software like a daemon, or programs that transform large amounts of data.




