
Show HN: Serialize++ – Tiny Data Serialization Library for C++ - apankrat
https://github.com/apankrat/cpp-serializer
======
hellofunk
I’m curious what problem this is trying to solve that hasn’t been already
solved in many other ways. There are so many serialization libraries for C++.
The API for this isn’t too far from what YAS offers:

    
    
    YAS_OBJECT_NVP("myobject"
        ,("a", aa)
        ,("b", bb)
    )
    

instead of:

    
    
    OBJECT_SPEC( foo )
        __f( aaa ),
        __f( bbb )
    END_OF_SPEC
    

YAS also aims to do it with _as little fuss_ as possible, similar goals, yet
YAS binary archives are endian independent, and there are other advantages.

~~~
apankrat
I take it it's this one -
[https://github.com/niXman/yas](https://github.com/niXman/yas) ?

I haven't seen it before, but there are bound to be similar solutions to the
same problem. The library I posted is used internally for a number of our
projects and it has evolved over the course of several years. It is not _trying_
to solve a problem, it _is_ solving a very specific need that our code had
since the beginning. So if you are implying that this is a pointless
duplication of some other existing solution, then you are off. There's more
than one way to skin a cat.

Re: endianness - to each (project) their own. We don't need normalized byte
ordering, so with all other things being equal a YAS-like library will be
marginally slower. Other advantages - I don't see any that would benefit our
case, but I'm sure there are some that may come in handy in other contexts.

------
Jyaif
You can add
[https://github.com/USCiLab/cereal](https://github.com/USCiLab/cereal) to your
footnotes as it works in a similar way: you list the fields of your structure
in a serialize function.

~~~
apankrat
Done.

------
jokoon
I want to store data as binary, and generally I would just lay everything in
arrays, write the vector size as an int, the type id (arbitrary value) as a
char, write data, done. I just have 2 functions to do that, and it's enough.
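A minimal sketch of that two-function approach, under stated assumptions: the names `write_array`, `read_array` and `Vec2` are made up for illustration, the size is a 32-bit integer, and the bytes are written in host order, so the blob is only readable on a machine with the same layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct Vec2 { float x, y; };  // example payload type (hypothetical)

// Appends [element count][type id][raw elements] to `out`.
template <typename T>
void write_array(std::vector<char>& out, char type_id, const std::vector<T>& items) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    const std::uint32_t count = static_cast<std::uint32_t>(items.size());
    const char* c = reinterpret_cast<const char*>(&count);
    out.insert(out.end(), c, c + sizeof count);
    out.push_back(type_id);
    const char* d = reinterpret_cast<const char*>(items.data());
    out.insert(out.end(), d, d + items.size() * sizeof(T));
}

// Reads one array starting at `pos`; returns false on a type mismatch or a short blob.
template <typename T>
bool read_array(const std::vector<char>& in, std::size_t& pos, char expected_type,
                std::vector<T>& items) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    std::uint32_t count = 0;
    if (pos > in.size() || in.size() - pos < sizeof count + 1) return false;
    std::memcpy(&count, in.data() + pos, sizeof count);
    if (in[pos + sizeof count] != expected_type) return false;
    const std::size_t start = pos + sizeof count + 1;
    if (count > (in.size() - start) / sizeof(T)) return false;  // not enough data left
    const std::size_t bytes = static_cast<std::size_t>(count) * sizeof(T);
    items.resize(count);
    if (bytes != 0) std::memcpy(items.data(), in.data() + start, bytes);
    pos = start + bytes;
    return true;
}
```

Several arrays can be appended to the same blob one after another and read back in the same order by reusing `pos`.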

If you have fat data, binary is a wise solution. Generally flattening data into
arrays is not so complicated, and you gain speed. SQLite is also a good
solution, although I'm not sure it's good to store things like arrays of vec2,
pictures...

I tried protocol buffers once, I was really horrified by the size of the
headers it required.

------
unlinked_dll
Some quick and dirty things to make this useful: put it all in one header and
wrap your code in a namespace. Not really viable as a dependency otherwise.

------
apankrat
Under 500 lines, this probably barely qualifies as a _library_, but it still
allows packing structs and containers to/from binary blobs with as little
effort as possible.

It has virtually no dependencies, requires no pre-processing (like ProtoBufs
do) and it should be easy to understand and extend. It knows how to handle
basic types and std containers and then uses a couple of C++ features to
coerce the compiler into auto-generating all needed methods for custom types.

All you need to do is to specify which struct/class members must be serialized
and then call store() on an instance to produce a blob. To restore an instance
- feed the blob into parse() and that's it.
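To illustrate the general idea of listing members once and letting the compiler generate the rest, here is a rough sketch. It is not the library's actual implementation or API: the `sketch` namespace, `Writer`, `Reader`, the static `fields()` hook and these `store()`/`parse()` signatures are all assumptions made for this example, and only arithmetic types and `std::string` are handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {  // hypothetical namespace

using Blob = std::vector<char>;

// Appends every visited field to a blob (host byte order).
struct Writer {
    Blob& out;

    template <typename T>
    typename std::enable_if<std::is_arithmetic<T>::value>::type
    operator()(const T& v) {
        const char* p = reinterpret_cast<const char*>(&v);
        out.insert(out.end(), p, p + sizeof v);
    }
    void operator()(const std::string& s) {
        (*this)(static_cast<std::uint32_t>(s.size()));  // length prefix
        out.insert(out.end(), s.begin(), s.end());
    }
};

// Fills every visited field from a blob; `ok` turns false on a short read.
struct Reader {
    const Blob& in;
    std::size_t pos;
    bool ok;

    template <typename T>
    typename std::enable_if<std::is_arithmetic<T>::value>::type
    operator()(T& v) {
        if (!ok || in.size() - pos < sizeof v) { ok = false; return; }
        std::memcpy(&v, in.data() + pos, sizeof v);
        pos += sizeof v;
    }
    void operator()(std::string& s) {
        std::uint32_t n = 0;
        (*this)(n);
        if (!ok || in.size() - pos < n) { ok = false; return; }
        s.assign(in.data() + pos, n);
        pos += n;
    }
};

struct Foo {
    int aaa = 0;
    std::string bbb;

    // The only place where the serialized members are listed.
    template <typename Self, typename Visitor>
    static void fields(Self& self, Visitor& v) { v(self.aaa); v(self.bbb); }
};

template <typename T>
Blob store(const T& obj) {
    Blob out;
    Writer w{out};
    T::fields(obj, w);
    return out;
}

template <typename T>
bool parse(const Blob& in, T& obj) {
    Reader r{in, 0, true};
    T::fields(obj, r);
    return r.ok && r.pos == in.size();  // reject truncated or oversized blobs
}

}  // namespace sketch
```

Because the same `fields()` list drives both the writer and the reader, the two directions cannot drift apart within one build.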

This was originally written to implement an IPC protocol for a pair of
cooperating processes, but it can be readily reused for quickly storing data
on disk and other things.

~~~
cygaril
Being only five hundred lines doesn't disqualify it from being a library.

But being impossible to safely integrate into other projects because of
illegal names and polluting the global namespace does.

This is at best a rough proof of concept which could at some point be turned
into a library.

~~~
apankrat
Such a clever swipe!

Don't know where you are sourcing your definitions from, but, conventionally,
being a library merely means that it's a reusable piece of code that does
something well-defined.

~~~
rowanG077
It's not reusable. Ergo not a library.

------
kjgkjhfkjf
Does this work if the two communicating processes use different versions of
the struct?

~~~
apankrat
Depends on the differences, but generally - no, of course not.

~~~
kjgkjhfkjf
It'd be useful if you mentioned this in your caveats list, since version skew
is a common gotcha that some serialization technologies (e.g. protobuf) are
designed explicitly to address.

~~~
apankrat
OK, will do. I assumed it was obvious that schemas on both ends must match.

~~~
quietbritishjim
I would assume it's obvious that schemas on the two ends _don't_ need to
match because solving that problem is usually a basic requirement of a
serialisation library. Of course they need to be compatible in some way though
e.g. in protobuf, don't reuse old field numbers.

~~~
apankrat
Depends on the project. Lots of cases when there's no need for schema
versioning. For example, when two processes need to talk to each other and
they both run from the same binary or share the exact same datamodel. Even if
they aren't run from the same binary, it's often _much_ simpler to just
require both sides to use the same protocol revision.
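One way to enforce "same protocol revision" cheaply is to prefix every message with a revision number and refuse anything else. The sketch below is only an illustration of that idea; `kProtocolRevision`, `wrap()` and `unwrap()` are hypothetical names and the 16-bit host-order header is an arbitrary choice, not something the library does.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint16_t kProtocolRevision = 3;  // hypothetical value

// Prepends the sender's revision to the payload.
std::vector<char> wrap(const std::vector<char>& payload,
                       std::uint16_t revision = kProtocolRevision) {
    std::vector<char> msg(sizeof revision);
    std::memcpy(msg.data(), &revision, sizeof revision);
    msg.insert(msg.end(), payload.begin(), payload.end());
    return msg;
}

// Extracts the payload only if the message carries our own revision.
bool unwrap(const std::vector<char>& msg, std::vector<char>& payload) {
    std::uint16_t revision = 0;
    if (msg.size() < sizeof revision) return false;      // too short to hold a header
    std::memcpy(&revision, msg.data(), sizeof revision);
    if (revision != kProtocolRevision) return false;     // peer was built from another revision
    payload.assign(msg.begin() + static_cast<std::ptrdiff_t>(sizeof revision), msg.end());
    return true;
}
```

This does not make differing schemas compatible; it only turns silent misparsing into an explicit rejection.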

------
hoistbypetard
Looks nice to use, at least for the happy path.

I'd also say it looks a lot like the non-intrusive pattern for the boost
serialization library. That's my default tool for this problem just because
I'm pretty much always pulling in a pile of boost libraries for any non-
trivial C++ project anyway.

------
pubby
Does it handle endianness?

~~~
apankrat
No. See caveat #4.

The code stems from a same-machine IPC library. No point in doing any byte
order normalization. Trivial to add though if needed.
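For reference, byte order normalization for a 32-bit integer could look roughly like the sketch below. The helper names `put_u32_le` and `get_u32_le` are made up for this example and are not part of the library; the shifts always produce little-endian bytes regardless of the host's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `v` as four bytes, least significant byte first.
void put_u32_le(std::vector<unsigned char>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xFF));
}

// Reassembles a value from four little-endian bytes starting at `p`.
std::uint32_t get_u32_le(const unsigned char* p) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return v;
}
```

The cost is a few shifts per integer instead of a single raw copy, which is the marginal slowdown mentioned earlier in the thread.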

~~~
chrisseaton
> The code stems from a same-machine IPC library. No point in doing any byte
> order normalization.

Some processors can run one process in one byte order, and another process in
another byte order! For example Itanium and the UM.be bit which the user can
set.

~~~
apankrat
Nice, didn't know that. Seems a bit unfair though they didn't cover middle-
endian order.

Seriously though, how many machines are configured like that in the real world
and what's the overlap with our target installation base?

~~~
chrisseaton
No idea - but it's something to keep in mind if you have the assumption in
your head that 'byte order is constant for a machine'.

~~~
apankrat
It almost universally is though, no? I'd expect a processor core with inverted
endianness to be excluded from general use and reserved for select
processes that require that feature.

~~~
chrisseaton
I don’t think so - it’s just another processor flag and processes can switch
it as needed. For example you could switch while reading data from a network
connection and back again afterward.

~~~
apankrat
It'd be interesting to see any real-world uses of this. Must be something
rather exotic that could benefit from not doing bswap/s in bulk.

------
zabzonk
Note that names containing double underscores are reserved in C++ for the C++
implementation, which your library is not part of. Using such names can lead
to all sorts of difficult to diagnose problems.

~~~
apankrat
Noted. From my perspective it's the same class of issues as with using _t
suffix for custom C types - also reserved, so technically not good, but
trivial to fix if it ever causes any trouble.

PS. In practice, the use of underscores, both single and double, as a name
prefix is wide-spread. From the use of _foo notation to pass arguments to
functions, to using _bar() to designate member functions that should be called
under some sort of lock, to using __xxx for macros - I mean, yes, all this
doesn't align with the C++ standard, but it exists, it's actively used in live
code and it also leads to coding habits that are hard to change. Hence the
use of __f() in this particular library.

~~~
zabzonk
> The use of underscores, both single and double, as a name prefix is wide-
> spread.

Leading single underscores are OK at class scope, so OK for member variables
or functions. They are not OK at global scope, in either C++ or C.
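A short illustration of those rules in C++. The `sketch` namespace, `Counter` and the `SKETCH_FIELD` macro are hypothetical names chosen for this example; the macro is just one possible non-reserved spelling for something like `__f`.

```cpp
#include <string>

// Reserved for the implementation everywhere in a C++ program:
//   __f, my__name   (contains a double underscore)
//   _Field          (underscore followed by an uppercase letter)
// Reserved in the global namespace only:
//   _field          (leading underscore followed by a lowercase letter)

namespace sketch {  // hypothetical namespace

struct Counter {
    int _count = 0;  // allowed: leading underscore + lowercase at class scope
    int next() { return ++_count; }
};

}  // namespace sketch

// Macros ignore namespaces, so a project prefix is the usual way to avoid
// clashes without touching the reserved forms.
#define SKETCH_FIELD(name) #name
```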

And, yes it is wide-spread, as are many other dangerous bad practices. It's
something that is easy not to do, and which has no worth, so why do it?

~~~
apankrat
Because it's not a "dangerous bad practice".

There are examples of actually bad and dangerous language misuse, but this is
not one of them. It's mostly pedantry. Like using KiB instead of KB.

~~~
saagarjha
> Like using KiB instead of KB.

They mean different things.

~~~
Bjartr
Outside of the context of hard drives, KB is almost always used to mean KiB.

