Serialization for C# Games (chickensoft.games)
133 points by jolexxa 10 months ago | 48 comments



If I learned one important lesson from writing savegame systems, it's this: don't directly serialize your entire game state at the "game object level" (e.g. don't create a savegame by running a serializer over your game object soup). Instead, decouple the saved data from your game logic internals, keep a clear boundary between your game logic and the savegame system, keep the saved state as minimal as possible, and reconstruct or default-initialize the rest of the data after loading.
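
A minimal sketch of that boundary (all type names hypothetical): the save model is plain data that knows nothing about live game objects, and two mapping functions are the only crossing point.

    using System.Numerics;

    // Plain data with no references into the live object graph.
    public record PlayerSave(float X, float Y, int Health);
    public record SaveGame(int Version, PlayerSave Player);

    public class Player
    {
        public Vector2 Position;
        public int Health;
        public object? Target;   // transient; never saved
    }

    public static class SaveBoundary
    {
        // Capture: copy only what must survive into the save model.
        public static SaveGame Capture(Player p) =>
            new(1, new PlayerSave(p.Position.X, p.Position.Y, p.Health));

        // Restore: apply the save model, then let normal game logic
        // reconstruct everything transient (targets, caches, etc.).
        public static void Restore(SaveGame save, Player p)
        {
            p.Position = new Vector2(save.Player.X, save.Player.Y);
            p.Health = save.Player.Health;
            p.Target = null;   // re-acquired by gameplay code next frame
        }
    }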

With that approach, a language-assisted serialization system also loses a lot of its appeal IMHO (although it can of course still be useful for describing the separate savegame data format).

Also: resist the architecture astronaut in you; savegame systems especially are a honey trap for overengineering ;)


If your game engine is built on a data-first architecture like ECS, then it can be pretty trivial to directly serialize your game state. I have had good luck with this using bitECS: https://github.com/NateTheGreatt/bitECS/blob/master/docs/INT...


Agreed: when the data is already in a table format (instead of an "object spider web"), automating serialization makes more sense; it essentially becomes a "database problem". I would still very carefully consider which data columns need to be persisted and which should be reconstructed, and I wouldn't try to come up with too generic a solution.
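
As a rough illustration of that table framing (a sketch with made-up component columns): persistence becomes deciding which columns get written and which get rebuilt.

    using System.IO;

    public class World
    {
        // Structure-of-arrays: each component is a "column".
        public float[] PosX = new float[1024];    // persisted
        public float[] PosY = new float[1024];    // persisted
        public int[] Health = new int[1024];      // persisted
        public int[] TargetId = new int[1024];    // rebuilt after load
        public int Count;
    }

    public static class WorldSaver
    {
        public static void Save(World w, BinaryWriter writer)
        {
            writer.Write(w.Count);
            for (int i = 0; i < w.Count; i++)
            {
                writer.Write(w.PosX[i]);
                writer.Write(w.PosY[i]);
                writer.Write(w.Health[i]);
                // TargetId deliberately skipped: targeting logic
                // re-acquires a target in the first frame after loading.
            }
        }
    }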

For instance in some games it might not be necessary to save a reference to a targeted object, if the gameplay targeting mechanism picks up a target in the first frame after loading a savegame anyway (etc etc...). Whether that target is exactly the same as at the time of creating the savegame might not be relevant (but very relevant for other games).

I guess the TL;DR is: in many cases it might be much easier to build a specialized per-game savegame system than a generic one that works for all types of games.


Depending on how large your save state is, it could be as simple as a function mapping a list of game objects to the saveable object. That approach works really well with Redux on the web, since you really don't want to save most things. Where it really gets tricky is when you want to get fancy and support things like saving only the changed portion of the state.


All very good advice that I feel deeply. I think I fell into the honey trap some time ago, but I've made peace with that — the tools I'm making will probably do more good than any game I could finish making, at least for now.

Jokes aside, though, I do try to dog-food my tooling as much as possible. I maintain a Godot/C# 3d platformer game demo with full state preservation/restoration (<https://github.com/chickensoft-games/GameDemo>) to demonstrate this.

By the time I've finished writing tests and docs for a tool, I've usually identified and fixed a bunch of usability pain points and come up with a happy path for myself and other developers — even if it's not 100% perfect.

I also have a bunch of unreleased game projects that spawned these projects, and even gave a talk on how this stuff came about (<https://www.youtube.com/watch?v=fLBkGoOP4RI&t=1705s>) a few months ago if that's of interest to you or anyone else.

The requirements you mentioned in your comment cover selectively serializing state and decoupling saving/loading logic, and I could not agree more. While you can always abuse a serializer, I hope my demonstration in the game demo code shows how I've selectively saved only relevant pieces of game data and how they are decoupled and reconstructed across the scene tree.

Also probably worth mentioning the motivation behind all this — the serialization system here should hopefully enable you to easily refactor type hierarchies without having to maintain manual lists of derived types like System.Text.Json requires you to do when leveraging polymorphic deserialization.

Manually tracking types (presumably in another file, even) is such an error-prone thing to have to do when using hierarchical state machines where each state has its own class (like <https://github.com/chickensoft-games/LogicBlocks>). States-as-classes is super common when following the state pattern, and it's well supported by IDE refactoring tools since states are just classes. Basically, this serialization system exists to help save complex, hierarchical state without all the headaches. While I was at it, I also introduced opinionated ways to handle versioning and upgrading, because that's also always a headache.
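
For comparison, this is the manual bookkeeping System.Text.Json asks for out of the box when you opt into polymorphic serialization (state names here are made up): every derived state must be registered on the base type, so any rename or addition means editing this list.

    using System.Text.Json.Serialization;

    [JsonPolymorphic(TypeDiscriminatorPropertyName = "$type")]
    [JsonDerivedType(typeof(Idle), "idle")]
    [JsonDerivedType(typeof(Jumping), "jumping")]
    [JsonDerivedType(typeof(Falling), "falling")]
    public abstract record PlayerState;

    public record Idle : PlayerState;
    public record Jumping : PlayerState;
    public record Falling : PlayerState;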


[flagged]


Please don't put comments like this in threads. There is a button for it.


Spot on


In my personal projects I've been using variations on the same simple code for saving/loading objects for a decade or so, and have had very few problems. The heart of the code is this interface:

    // K is the key type used for object IDs (string, int, Guid, ...).
    public interface IStashy<K>
    {
        void Save<T>(T t, K id);      // serialize t and store it under id
        T Load<T>(K id);              // load and deserialize one object
        IEnumerable<T> LoadAll<T>();  // load every stored object of type T
        void Delete<T>(K id);         // remove the stored object
        K GetNewId<T>();              // mint a fresh id for type T
    }
And implementations of that are very stable over time. Objects get serialized as json and stored in a folder named after their type.

There’s a small number of gotchas, for which I have well known work arounds:

- I generally won’t remove a property, but mark it as obsolete and stop using it.

- If I’ve added a new Boolean property, I’d tend to name it such that it defaults to false, or if it must default to true, have it stored in a nullable boolean, and if it loads as null (from an older instance of the type), set it to the default.

- Some convenient types I want to use (as properties) are not serializable, so before saving I'll copy their data into a serializable type, like an array of key-values, then on loading rehydrate that into a dictionary; see the sketch below. (I guess this is a harsh performance penalty if you're doing a lot of it in a game.)
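
A sketch of that last workaround (type and property names made up):

    using System.Collections.Generic;
    using System.Linq;
    using System.Text.Json.Serialization;

    public class Inventory
    {
        // Skipped by the serializer; rebuilt from Pairs after loading.
        [JsonIgnore]
        public Dictionary<string, int> Counts { get; set; } = new();

        // Serializable stand-in: a plain list of key/value pairs.
        public List<KeyValuePair<string, int>> Pairs { get; set; } = new();

        public void BeforeSave() => Pairs = Counts.ToList();

        public void AfterLoad() =>
            Counts = Pairs.ToDictionary(p => p.Key, p => p.Value);
    }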


(Blogpost with an implementation of IStashy is here — https://secretgeek.net/stashy_gist)


How do you deal with serializing properties "by reference"? E.g., if 3 objects reference object "Foo", then Foo is serialized once instead of being duplicated in the json 3 times?


It depends. I don't tend to end up with deep object graphs that need to be saved/reloaded.

It might be that we serialize Foo, and Foo has a list of references to its 3 children. The "parent" reference from the child back to Foo is marked as do-not-serialize; an "after rehydration" function on Foo can then set the value of each child's parent reference.

But more often — say Baz, Bar, and Bam reference Foo — the speed at which Baz changes is different from the speed at which Foo changes. The reference to Foo from Baz is marked do-not-serialize; Baz also has a property holding the ID of Foo. (For IStashy<K>, K is the type used for the keys, the IDs; it might be a string or an int or a GUID, I tend to use string. All objects in the system have the same kind of ID, and it is unique per type.)

Generally, if cyclic data structures are possible, then some part of the cycle will be marked as not serializable and I'll keep a key reference adjacent to it.
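
A sketch of that ID-adjacent-to-reference pattern (hypothetical types, using the IStashy interface from above):

    using System.Text.Json.Serialization;

    public class Foo
    {
        public string Id { get; set; } = "";
    }

    public class Baz
    {
        public string Id { get; set; } = "";

        // Persisted: the key of the Foo this Baz points at.
        public string FooId { get; set; } = "";

        // Live reference; never serialized, re-resolved after load.
        [JsonIgnore]
        public Foo? Foo { get; set; }

        public void AfterLoad(IStashy<string> store) =>
            Foo = store.Load<Foo>(FooId);
    }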

Situations that trigger huge cascading saves are kind of an anti-pattern for how I work. If one little change changes everything, then perhaps it can be calculated on the fly from a pure function and not persisted at all, or perhaps there's over-coupling, etc.


> I generally won’t remove a property, but mark it as obsolete and stop using it.

Presumably because loading will break?

> - If I’ve added a new Boolean property, I’d tend to name it such that it defaults to false, or if it must default to true, have it stored in a nullable boolean, and if it loads as null (from an older instance of the type), set it to the default.

Why?


`default(Boolean)` is false, so you can load an old object and it'll substitute the default, rather than having to throw an error on a missing property. You could do the same with, say, a new Int32 field, so long as it should default to zero.

Similarly, `default(Nullable<Boolean>)` is (wait for it) null, so you can do "oldVal ?? true".
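
So the load path looks something like this (hypothetical property):

    public class Door
    {
        // Added in a later version; old save files lack this key,
        // so it deserializes as null.
        public bool? IsLocked { get; set; }

        // Substitute the intended default (true) for old saves.
        public void AfterLoad() => IsLocked ??= true;
    }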


(I meant to respond to the gp comment here, soz. I agree with everything the parent comment says.)

> Presumably because loading will break?

I think the object will still load OK; I'm not sure whether it would break, because it's been so long since I was in a scenario where I really wanted to delete a property. Normally when I make a property obsolete, there's also some new property or properties that replace it. When loading the object, if the (now obsolete) property is not null, I translate its value into the new property/ies (i.e., "migrate" it into the new properties), then null out the old property value, so that the migration only happens that one time.
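
That one-time migration might look roughly like this (hypothetical properties):

    using System;

    public class Npc
    {
        [Obsolete("Replaced by FirstName/LastName.")]
        public string? FullName { get; set; }

        public string? FirstName { get; set; }
        public string? LastName { get; set; }

        public void AfterLoad()
        {
            // Only runs for objects saved before the split; nulling
            // the old value ensures the migration happens just once.
            if (FullName != null)
            {
                var parts = FullName.Split(' ', 2);
                FirstName = parts[0];
                LastName = parts.Length > 1 ? parts[1] : "";
                FullName = null;
            }
        }
    }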

I guess using something allegedly “simple”, over a long time, only appears simple because you will slowly internalise any idioms you’re using, and they don’t take much thought anymore.

Looking back — I've used this pattern for over 20 years, across various platforms. I've had a backing store that is anything from XML files to JSON to SQLite to in-memory (for rapid tests), in a few languages. Things that seem natural or intuitive (to me) at this point are just habits that are rusted on, whether good or bad.

Sometimes I start building fast indexing systems on top of it, or archiving systems or record versioning… and the better tool would be to switch to a db or to a more full featured key value store. But it’s such a lot of fun!


Sometimes, when battling these issues, I wish the Smalltalk-style approach[1][2] was more popular/feasible. Basically, saving the entire state of the VM is a fundamental operation supported by the language. Only truly transient things like network connections require special effort.

There are some echoes of this with things like Lua's Pluto/Eris, or serializable continuations in other languages (eg: Perl's Continuity).

It's just such a pain to thoroughly handle that sort of stuff without language-level support. And doing a "good enough" approach with some rough edges is usually shippable, so it's hard to build a critical mass of demand for such support. And even if there was, it's very hard to add it to a language/framework/etc that wasn't designed for it to begin with.

I've had a decent experience with 'struct string' style approaches, like Lua's string.pack() or Perl's pack()[3]. It's a little brittle, but extremely straightforward and "not framework-y," which suits me. But it leaves out things like program execution state; it's just for plain data. (A rough C# analogue is sketched below.)

[1] https://en.wikipedia.org/wiki/Smalltalk#Image-based_persiste...

[2] example of using this serializable statefulness for serving web apps: https://en.wikipedia.org/wiki/Seaside_(software)

[3] https://perldoc.perl.org/functions/pack
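
A rough C# analogue of that pack-style approach, for flavor (field layout is made up). The fixed field order plays the role of the format string, and it's brittle in the same way, since both sides must agree exactly:

    using System.IO;

    public record Checkpoint(int Level, float X, float Y, short Lives);

    public static class CheckpointCodec
    {
        public static void Write(BinaryWriter w, Checkpoint c)
        {
            w.Write(c.Level);
            w.Write(c.X);
            w.Write(c.Y);
            w.Write(c.Lives);
        }

        public static Checkpoint Read(BinaryReader r) =>
            new(r.ReadInt32(), r.ReadSingle(), r.ReadSingle(),
                r.ReadInt16());
    }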


A full memory dump for a game will nowadays often be multiple gigabytes, that's a non-starter.

Even back in the day, Game Maker had a function to dump the state to disk that was intended for game saves. It sucked - turns out there's a bunch of state that you don't want in your savefile - keybinds, settings, even most game state actually.

Save state should be opt-in, not opt-out, and on top of that a VM/memory dump makes it a very big pain to opt-out.


All valid issues, to be sure. But I do think that a big chunk of the suckiness is poor tooling (lacking easier/better ways to opt-out or customize various parts, for instance) rather than a conceptual problem. That's why I feel like it would require a language to fully embrace it (kinda like Smalltalk did) rather than being a bolt-on feature. And for games it would likely need to be innately aware of GPU concerns, too.

On the flipside, there are (hopefully obvious) big advantages for the development process, when you can snapshot full states.

Of course, none of it matters if you actually need max performance — no AAA shooters would use it. But there are lots of not-performance-critical games which might benefit more from the better development experience at the expense of some performance. Perhaps point-and-click adventures, sidescrollers, shootemups, and such.

Anyway, just spitballing (:

I'm working on a game now that has almost no state, and I wish for a way to have that same freedom I feel in a more stateful traditional game, without having to muddy up everything with serialization interfaces et al.


You can surprisingly sort of do this in Java. Just create a lambda which will start the game at the current state when you call it.


Huh?


I don't; in fact, I mostly hate serialization systems. IMO they lead to extremely long load times. It might be more work to put the data that actually needs to be saved into some binary blob, but it's the difference between a game that loads instantly and a game (like Source games) that takes 10-20 infuriating seconds per level every time you die.


This seems to cover many common pain points, but I’ve written my fair share of .NET serializers and for anything I build now I’d just use protocol buffers. Robust support, handles versioning pretty well, and works cross platform.

I’d like to know their reasons for making yet another serializer vs. just using PB or Thrift.


This is a good point. I don't think anyone wakes up wanting to make a new serializer. At this point, I was already pretty deep into making and releasing tools for my game projects so doing this didn't seem like such a stretch (although it actually ended up being one of the hardest things I've ever done).

A lot of small to mid-size games (which are the focus of the tools I provide) want to save data into JSON, whether it is to be mod-friendly or just somewhat human-friendly to the developer while working on the game. Not familiar with Thrift, but PB is obviously for binary data and has a focus on compactness and performance, which isn't the primary concern on my list of priorities for a serialization system. My primary concern for a serialization system is refactor-friendliness. I want to be able to rework type hierarchies without breaking existing save files, or get as close to that as possible.

I suppose you could say I'm only really introducing "half" of a serialization system: the heavy lifting is being split between the introspection generator (for writing metadata at compile time via source generation) and System.Text.Json (which handles a lot of the runtime logic for serializing/deserializing things).


In my experience, the pain of dealing with changes outweighs the pain of dealing with boilerplate, so it's better to explicitly write out save and load functions manually than rely on reflection.

Also means you can do stuff like if(version<x) { load old thing + migrate} else {load new thing} very easily. And it's just code, not magic.
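
A sketch of that shape (format and fields made up):

    using System.IO;

    public class PlayerData
    {
        public int Health, Coins, Gems;
    }

    public static class PlayerLoader
    {
        public static PlayerData Load(BinaryReader r)
        {
            int version = r.ReadInt32();
            var data = new PlayerData { Health = r.ReadInt32() };
            if (version < 2)
            {
                // Old format: Gems didn't exist yet; pick a default.
                data.Coins = r.ReadInt32();
                data.Gems = 0;
            }
            else
            {
                data.Coins = r.ReadInt32();
                data.Gems = r.ReadInt32();   // added in v2
            }
            return data;
        }
    }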


That's essentially what this system does — it identifies the models and their properties that you've marked as serializable at build-time using source generation, and then allows you to provide a type resolver and converter to System.Text.Json that lets you make upgrade-able models with logic like you just described.

The assist from the source generation helps reduce some of the boilerplate you need, but there's no escaping it ultimately.


Something like DB schema upgrading would be good but if you have versions you should be able to do that just fine. Reflection and changes are not at odds.


I want to like this because it seems well done but I kind of grimace instead. It's not the library's fault.

Game engines have some form of serialization already (most of what a game engine does is load a serialized game state into memory, imo).

I've found it's usually better to try to leverage those systems so you're not building multiple model objects and doing conversions between game-engine and serialized types.

Engines often do a lot of (design) work to load things directly into memory in such a way that the engine can use the inflated object immediately, without a lot of parsing. It's nice to leverage that. Moreover, fewer plugins means less complexity in the build process, etc.

Those desires give me pause when looking at serialization plugins in the context of game engines.

However, it's also not entirely feasible to use only the core engine systems in all cases. Often what's available at runtime for a game engine isn't the same as what's available at build time. You might need to read this data outside of the engine, and then you're really out of luck... life's so complicated.


RunUO has an implementation of this; it's like 25 years old but still works really well.


I really like this implementation, but it's probably worth mentioning here that RunUO and other tools like it are solving the problem at a layer of abstraction beneath what I was introducing here.

The serialization system I am providing here actually leverages System.Text.Json for reading and writing data — it's more concerned with helping you represent version-able, upgrade-able data models that are also compatible with the hierarchical state machine implementation I use for managing game state.


Wow, clicked into the thread to see if anyone might mention RunUO :) It's the only exposure I've had to serialization in C#, and I always wondered how it ranked compared to other approaches.


As someone who also fell in love with C# through RunUO: I never actually looked at the serialization at the time. Need to spend some time in RunUO or the fork soon.

https://github.com/runuo/runuo/blob/master/Server/Serializat...


Ultima online solved all our problems 25 years ago.


Naive question: is there a reason why SQLite wouldn’t work for something like this?


Well you still need to solve for what happens when a new version of your app (maybe with a new embedded version of SQLite) loads up an old data file saved by an old version of your app.

The old version might not contain all the tables you need, and the ones it has may not have the columns you expect. So you need to run some data migrations on the database. Now you no longer have a serialization problem but instead you have a schema versioning problem.


https://www.sqlite.org/pragma.html#pragma_user_version

I use SQLite for game state management. It's just like any other database scenario. I write migrators that check the user_version of the database. It's just a for loop from user_version to current version. The migrators themselves can be arbitrary methods that sometimes modify game state to bring it up to date. The most common scenario is adding a new property to something, and then figuring out the appropriate defaults to assign for existing rows (typically null/0). But you can go all the way into the ETL rabbit hole.
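
A sketch of that loop using Microsoft.Data.Sqlite (the migration SQL is made up):

    using Microsoft.Data.Sqlite;

    public static class SaveDb
    {
        // One entry per schema version bump, in order.
        static readonly string[] Migrations =
        {
            "ALTER TABLE units ADD COLUMN morale INTEGER DEFAULT 0",
            "CREATE TABLE relics (id INTEGER PRIMARY KEY, owner INTEGER)",
        };

        public static void Migrate(SqliteConnection conn)
        {
            var cmd = conn.CreateCommand();
            cmd.CommandText = "PRAGMA user_version";
            var version = (long)cmd.ExecuteScalar()!;

            for (long v = version; v < Migrations.Length; v++)
            {
                cmd.CommandText = Migrations[v];
                cmd.ExecuteNonQuery();
            }

            cmd.CommandText = $"PRAGMA user_version = {Migrations.Length}";
            cmd.ExecuteNonQuery();
        }
    }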

I think the relational model via SQLite is the best way to manage state for the more complicated games like in the 4X and deck builder genres.


I do the same thing, but I've increasingly found myself wanting to serialize JSON into columns, because a rigid schema can sometimes add a lot of friction. Experience has taught me, though, that it's worth the extra effort to define a schema, because nine times out of ten the flexible JSON will ossify into an unexpected format that the code relies on anyway, and now the database doesn't help enforce integrity. I would definitely recommend defining a schema and doing it right the first time. It will save you time in the long run and make for much fewer bugs.


I am not against the JSON-in-columns hybrid path, but I have typically found it grows into a monster over time. In my experience, it caused performance problems more than anything else.


> instead you have a schema versioning problem

That same versioning problem also exists with other approaches. Having a versioned schema of the savegame format around for version migrations is generally a good idea.


Could solve this with a migration framework (I'm sure there is something for SQLite). I've also done something similar with object/document storage: store the version of the schema in the record and write a map function for each version from the previous one.


We actually used SQLite in a couple of singleplayer RPGs (the Drakensang games).

The initial world state was baked into tables in an SQLite database file, and savegames were just mutated SQLite files (we kept a record of created, mutated and deleted database rows, and periodically flushed those changes into SQLite).

It worked well, but was overkill because we didn't actually make use of any advanced SQL features (just simple search over an object-id column). It would have been easier to cut SQL out of the loop and just write a simple table-based persistency system.


You could use it, but it's not really solving the same problem.

For a game, you generally don't need the relational database features. You aren't doing queries. You just want to load an entire level into memory, or save an entire level. For the serialization and persistence aspect, I don't see an advantage of SQLite over just calling JsonSerializer.Serialize().

The author's system then adds a bunch of features like version tolerance, AOT compilation of class metadata for iOS, polymorphic serialization, support for List<> and Dictionary<>, integration with the Godot game engine, etc. As far as I know, SQLite doesn't help you with any of that.

Anything that can write data to disk can ultimately save and load your game data; it's just a question of how easily.


While you don't need the relational features, some games do need the ability to make partial updates to make auto-save performant.

Do a search for something like "Minecraft save game size", and you'll see some people have multi-gigabyte saves. Similar issues crop up with some Paradox Interactive games.


Games are hugely varied. No doubt there are games out there for which SQLite is perfect. But I wouldn't use it for making partial updates in something like Minecraft.

It's not practical to store individual Minecraft blocks as table entries, so if you were using SQLite, you'd likely just store chunks (e.g. 16x16x16 blocks) as binary blobs. Then you'd rewrite entire chunks on save. It's not really taking advantage of what SQLite offers.

There are a lot of serializers and frameworks out there you could choose from, but even something as simple as just writing one map region per file and overwriting modified regions on save would be better than SQLite.


I wonder the same thing. Perhaps less portable? I.e., can't package that up in a binary? (I have absolutely no idea, just spitballing.)

And a very cursory search suggests maybe there is nothing to that guess: https://www.reddit.com/r/golang/comments/tqffv2/packaging_an...

It's an interesting question because I've run into some data scientists who were so used to working in memory with dataframes and the like that they moved mountains to do things like de-duplicate CSVs in memory (that they couldn't all fit in at once), whereas they could have done so trivially with SQLite.


When N=1, normalizing and denormalizing the data would be slower and more cumbersome than just reading and writing the whole blob.

You could use DB schema upgrade tooling to accomplish some of what's done by this library, but now you're at SQLite + <some other middleware>. If you have a tool you already like, then that's perfectly OK.


For simpler games with simple state that can be expressed in relations, it is definitely a good solution. However, as games get more complex, modeling the game state in just relations is harder. It's much simpler to model state in an object-like structure, at least for me.


Because that would require an additional step: converting the object graph to a relational-database representation when saving, and the reverse when loading. It is simpler to save the graph right away.


There's a very good MessagePack serialization library for C#. I've used it in many of the games I worked on.

https://github.com/MessagePack-CSharp/MessagePack-CSharp


How is this subject approached in ECS? Just add a Save component (that knows how to serialize) to all objects that need to be saved, and a system that dumps live objects into a file?





