In my personal projects I’ve been using variations on the same simple code for saving/loading objects for a decade or so, and have had very few problems. The heart of the code is this interface:
    public interface IStashy<K>
    {
        void Save<T>(T t, K id);
        T Load<T>(K id);
        IEnumerable<T> LoadAll<T>();
        void Delete<T>(K id);
        K GetNewId<T>();
    }
Implementations of that interface have been very stable over time. Objects get serialized as JSON and stored in a folder named after their type.
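For anyone curious what an implementation might look like, here is a minimal sketch. `FileStashy`, the `.json` file extension, and the GUID-based IDs are assumptions made for illustration, and `System.Text.Json` is just one possible serializer. It also assumes IDs are safe to use as file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public interface IStashy<K>
{
    void Save<T>(T t, K id);
    T Load<T>(K id);
    IEnumerable<T> LoadAll<T>();
    void Delete<T>(K id);
    K GetNewId<T>();
}

// Hypothetical file-backed implementation: one folder per type, one JSON file per object.
public class FileStashy : IStashy<string>
{
    private readonly string _root;

    public FileStashy(string root) => _root = root;

    // Folder named after the type, created on demand.
    private string FolderFor<T>()
    {
        var folder = Path.Combine(_root, typeof(T).Name);
        Directory.CreateDirectory(folder);
        return folder;
    }

    // Assumption: the id is usable as a file name.
    private string PathFor<T>(string id) => Path.Combine(FolderFor<T>(), id + ".json");

    public void Save<T>(T t, string id) =>
        File.WriteAllText(PathFor<T>(id), JsonSerializer.Serialize(t));

    public T Load<T>(string id) =>
        JsonSerializer.Deserialize<T>(File.ReadAllText(PathFor<T>(id)))!;

    public IEnumerable<T> LoadAll<T>()
    {
        foreach (var file in Directory.EnumerateFiles(FolderFor<T>(), "*.json"))
            yield return JsonSerializer.Deserialize<T>(File.ReadAllText(file))!;
    }

    public void Delete<T>(string id) => File.Delete(PathFor<T>(id));

    // Assumption: a GUID is an acceptable id; a per-type counter would work too.
    public string GetNewId<T>() => Guid.NewGuid().ToString("N");
}
```

Swapping the backing store then means writing another class against the same five methods.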
There’s a small number of gotchas, for which I have well-known workarounds:
- I generally won’t remove a property, but mark it as obsolete and stop using it.
- If I’ve added a new Boolean property, I’d tend to name it such that it defaults to false, or if it must default to true, have it stored in a nullable boolean, and if it loads as null (from an older instance of the type), set it to the default.
- Some convenient types I want to use (as properties) are not serializable, so before saving I’ll copy their data into a serializable type, like an array of key/value pairs, then on loading rehydrate that to a dictionary. (I guess this is a harsh performance penalty if you’re doing a lot of it in a game.)
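The third workaround might look roughly like the sketch below. `Inventory`, `Pair`, `BeforeSave` and `AfterLoad` are made-up names, and the string-keyed dictionary is only a stand-in for whatever type your serializer refuses to handle (many serializers cope with this particular one directly).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical serializable stand-in for one dictionary entry.
public class Pair
{
    public string Key { get; set; } = "";
    public int Value { get; set; }
}

public class Inventory
{
    // Convenient in-memory type; excluded from the JSON.
    [JsonIgnore]
    public Dictionary<string, int> Counts { get; set; } = new Dictionary<string, int>();

    // The array that actually gets persisted.
    public Pair[] CountPairs { get; set; } = Array.Empty<Pair>();

    // Copy the dictionary into the array just before saving.
    public void BeforeSave() =>
        CountPairs = Counts.Select(kv => new Pair { Key = kv.Key, Value = kv.Value }).ToArray();

    // Rebuild the dictionary from the array just after loading.
    public void AfterLoad() =>
        Counts = CountPairs.ToDictionary(p => p.Key, p => p.Value);
}

public static class InventoryDemo
{
    // Save-then-load round trip through a JSON string.
    public static Inventory RoundTrip(Inventory original)
    {
        original.BeforeSave();
        var json = JsonSerializer.Serialize(original);
        var loaded = JsonSerializer.Deserialize<Inventory>(json)!;
        loaded.AfterLoad();
        return loaded;
    }
}
```

The copy happens on every save and load, which is where the performance cost mentioned above comes from.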
How do you deal with serializing properties "by reference"? E.g., if 3 objects reference object "Foo", then Foo is serialized once instead of being duplicated in the json 3 times?
It depends. I don’t tend to end up with deep object graphs that need to be saved/reloaded.
It might be that we serialize foo and foo has a list of references to its 3 children. The “parent” reference from the child back to foo is marked as do not serialize; an “after rehydration” function on foo could then set the value of each child’s parent reference.
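A rough sketch of that parent/child arrangement, assuming `System.Text.Json` and its `[JsonIgnore]` attribute as the "do not serialize" marker. `Child`, `AfterRehydration` and `FooDemo` are illustrative names.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Child
{
    public string Name { get; set; } = "";

    // Back-reference to the parent: never written to JSON, so no cycle is serialized.
    [JsonIgnore]
    public Foo? Parent { get; set; }
}

public class Foo
{
    public string Id { get; set; } = "";
    public List<Child> Children { get; set; } = new List<Child>();

    // Called once after loading to restore each child's parent reference.
    public void AfterRehydration()
    {
        foreach (var child in Children)
            child.Parent = this;
    }
}

public static class FooDemo
{
    public static Foo Load(string json)
    {
        var foo = JsonSerializer.Deserialize<Foo>(json)!;
        foo.AfterRehydration();
        return foo;
    }
}
```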
But more often (say baz, bar and bam all reference foo) the rate at which baz changes is different from the rate at which foo changes. The reference to foo from baz is marked do not serialize, and baz also has a property holding the ID of foo. (For IStashy<K>, K is the type used for the keys, the IDs; it might be a string or an int or a guid, and I tend to use string. All objects in the system have the same kind of ID, and it is unique per type.)
Generally, if cyclic data structures are possible, then some part of the cycle will be marked as not serializable and I’ll keep a key reference adjacent to it.
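The "key reference adjacent to it" idea could be sketched like this. `FooId`, `Resolve` and the `Func<string, Foo>` lookup are assumptions for illustration; in practice the lookup might simply be the stash's `Load<Foo>`.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Foo
{
    public string Id { get; set; } = "";
    public string Label { get; set; } = "";
}

public class Baz
{
    public string Id { get; set; } = "";

    // Persisted: the key of the Foo this Baz points at.
    public string FooId { get; set; } = "";

    // Not persisted: resolved from FooId after loading.
    [JsonIgnore]
    public Foo? Foo { get; set; }

    // loadFoo is a hypothetical lookup, e.g. id => stashy.Load<Foo>(id).
    public void Resolve(Func<string, Foo> loadFoo) => Foo = loadFoo(FooId);
}
```

Foo is saved once under its own ID, and saving baz, bar or bam never rewrites it.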
Situations that trigger huge cascading saves are kind of an anti-pattern for how I work. If one little change changes everything, then perhaps the result can be calculated on the fly by a pure function and not persisted at all, or perhaps there’s over-coupling, etc.
> I generally won’t remove a property, but mark it as obsolete and stop using it.
Presumably because loading will break?
> - If I’ve added a new Boolean property, I’d tend to name it such that it defaults to false, or if it must default to true, have it stored in a nullable boolean, and if it loads as null (from an older instance of the type), set it to the default.
`default(Boolean)` is false, so you can load an old object and it'll substitute the default, rather than having to throw an error on a missing property. You could do the same with, say, a new Int32 field, so long as it should default to zero.
Similarly, `default(Nullable<Boolean>)` is (wait for it) null, so you can do "oldVal ?? true".
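To make that concrete, here is a small sketch, assuming `System.Text.Json` (which leaves a property at its default when the JSON has no value for it). `Settings` and its property names are invented for the example.

```csharp
#nullable enable
using System.Text.Json;
using System.Text.Json.Serialization;

public class Settings
{
    public string Name { get; set; } = "";

    // New property whose natural default is false:
    // absent from old JSON, so it stays at default(bool).
    public bool IsArchived { get; set; }

    // New property that must default to true: stored as bool?,
    // absent from old JSON, so it stays null.
    public bool? NotificationsEnabledStored { get; set; }

    // What the rest of the code reads; null means "old object, use the default".
    [JsonIgnore]
    public bool NotificationsEnabled => NotificationsEnabledStored ?? true;
}

public static class SettingsDemo
{
    // JSON as an older version of the type would have written it.
    public const string OldJson = "{\"Name\":\"old\"}";

    public static Settings LoadOld() => JsonSerializer.Deserialize<Settings>(OldJson)!;
}
```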
(I meant to respond to the gp comment here, soz. I agree with everything the parent comment says.)
> Presumably because loading will break?
I think the object will still load ok, but I’m not sure, because it’s been so long since I was in a scenario where I really wanted to delete a property. Normally when I make a property obsolete there is also some new property or properties that replace it. When loading the object, if the (now obsolete) property is not null, I translate its value into the new property/ies (i.e., “migrate” it), then null out the old property value, so that the migration only happens that one time.
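That migrate-on-load step might look something like this sketch. `Person`, `FullName`, `FirstName`/`LastName` and `AfterLoad` are hypothetical. The old property is marked obsolete only by a comment here; an `[Obsolete]` attribute is the usual C# way, but check how your serializer treats it first.

```csharp
#nullable enable
using System.Text.Json;

public class Person
{
    // OBSOLETE: replaced by FirstName and LastName. Kept so old files still load.
    public string? FullName { get; set; }

    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";

    // Run after every load; only does work the first time an old object is seen.
    public void AfterLoad()
    {
        if (FullName == null) return;

        // Migrate the old value into the new properties.
        var parts = FullName.Split(new[] { ' ' }, 2);
        FirstName = parts[0];
        LastName = parts.Length > 1 ? parts[1] : "";

        // Null out the old value so the migration never repeats.
        FullName = null;
    }
}

public static class PersonDemo
{
    public static Person Load(string json)
    {
        var person = JsonSerializer.Deserialize<Person>(json)!;
        person.AfterLoad();
        return person;
    }
}
```

Once the migrated object is saved again, the old value is gone from the file and later loads skip the migration.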
I guess using something allegedly “simple”, over a long time, only appears simple because you will slowly internalise any idioms you’re using, and they don’t take much thought anymore.
Looking back, I’ve used this pattern for over 20 years, across various platforms. I’ve had a backing store that is anything from XML files to JSON to SQLite to in-memory (for rapid tests), in a few languages. Things that seem natural or intuitive (to me) at this point are just habits that are rusted on, whether good or bad.
Sometimes I start building fast indexing systems on top of it, or archiving systems or record versioning… and the better tool would be to switch to a db or to a more full featured key value store. But it’s such a lot of fun!