
Serde 1.0.0 for Rust released - taheris
https://github.com/serde-rs/serde/releases/tag/v1.0.0
======
the_duke
Serde really is one of the gems of the Rust ecosystem.

It is a (de)serialization framework that can be quite easily implemented for
various serialization formats like JSON, MessagePack, YAML, TOML, ...

It enables automatic and very performant (de)serialization of your data into
different formats.

Often with a simple:

    
    
      #[derive(Serialize, Deserialize)]
      struct Data { ... }
    

~~~
jadbox
I really really would love Avro support added... I'd do it if I had more Rust
experience. The icing on the cake is if it could also interact with the Avro
registry somehow.

~~~
steveklabnik
What's Avro?

~~~
lobster_johnson
It's not as popular as Protocol Buffers, Thrift, MessagePack etc., probably
because it started out in Java land (I believe it came from Doug Nutch of
Lucene fame).

I've never used it, but one benefit of Avro is that it can embed the schema in
the serialized output, which allows clients to consume data even though they
don't have the IDL, which isn't possible with Protobuf and Thrift. MessagePack
does include types and map keys, but doesn't have a schema.

~~~
spearo77
s/Doug Nutch/Doug Cutting/

~~~
lobster_johnson
Oops. That should have been "Doug Cutting of Nutch and Lucene fame".

------
makmanalp
Serde is the kind of magic that we all hoped a type system like Rust's would
cause. The transcode and zero-copy stuff is so neat. Doing stuff like this
under the hood and "for free" is a nightmare in other languages.

------
geofft
I'm working on an strace implementation in Rust, with an enum (tagged union)
for system calls:

    
    
      enum Syscall {
         Open { pathname: Buffer, flags: u64, mode: u64 },
         Read { fd: u64, buf: Buffer, count: u64 },
         ...
      }
    

I was _very_ pleasantly surprised at being able to add a couple of
dependencies, add #[derive(Serialize)] right above the struct, change two
lines of my main driver program, and get an strace --json with useful output
with no further effort. It's the sort of experience I expect from a higher-
level language with dynamic types and runtime reflection, but available to me
in a systems language.

~~~
stevedonovan
Yes, and sometimes you really pay for runtime reflection - I've seen Java
programs reduced to Python performance by excessive use of reflection.
Whereas, this is _compile-time reflection_!

------
killercup
Very nice:

> Zero-copy deserialization

> […] The semantics of Rust guarantee that the input data outlives the period
> during which the output struct is in scope, meaning it is impossible to have
> dangling pointer errors as a result of losing the input data while the
> output struct still refers to it.

~~~
edmccard
>The semantics of Rust guarantee that the input data outlives the period
during which the output struct is in scope

To be fair, so do the semantics of every language with garbage collection --
keeping things alive while there are references to them is the bread and
butter of GC.

EDIT: I do think it's impressive that Rust can manage this without the
overhead of GC. But the sentence from the release notes immediately before
'killercup's quote was: _This uniquely Rust-y feature would be impossible or
recklessly unsafe in languages other than Rust_ which struck me as a bit over-
hyped.

~~~
geofft
No, I think Rust still does this uniquely in a way that isn't available with a
GC language. Imagine the following design (assume fd is a UDP socket or
something, and so each read returns a complete message):

    
    
        char buf[1024];
        while (read(fd, buf, 1024)) {
            messages.push_back(deserialize(buf));
        }
    
        for (auto& message : messages) {
            print(message);
        }
    

Garbage collection will keep buf alive, but won't guarantee that buf isn't
being mutated while it's alive. Rust's ownership system will guarantee that.
In Rust, the read() function would require a mutable (i.e., unique) reference
to the buffer, and serde's deserialization function also requires a reference
to the buffer, preventing read() from being callable while the deserialized
objects continue to exist.

I think doing reader/writer refcounting at runtime is hard because this is a
case where there's nothing reasonable to do at runtime if you have
incompatible references. At best you can do copy-on-write, but then you
silently lose the zero-copy performance. You really want a compile-time error
saying "You structured this code wrong, go redesign it or add some copies."

------
Sharlin
As an aside:

> Rust's "orphan rule" allows writing a trait impl only if either your crate
> defines the trait or defines one of the types the impl is for. That means if
> your code uses a type defined outside of your crate without a Serde impl,
> you resort to newtype wrappers or other obnoxious workarounds for
> serializing it.

I hadn't realized this. The justification is reasonable - preventing ambiguity
when resolving traits - but it precludes one of the major use cases for
traits/type classes: adapting a type from one library to an interface in
another library without wrapper types.

~~~
floatboth
Yeah, in Haskell it's allowed but outputs a warning. So a lot of my modules
use -fno-warn-orphans :D

~~~
mrkgnao
The reason is that class instances are really not first-class in Haskell. You
can never import a module and choose not to import its instances, not even
with

    
    
      import Module (just_one_symbol_i_need)
    

or to not import a couple of them. I think named instances a la PureScript let
you do

    
    
      import Module (instance toJsonText, ...)
    

which would be great to have in Haskell, but probably breaks something deep
inside instance resolution.

PureScript as a "Haskell without the mistakes" really is a great idea.

------
eslaught
I really love Serde, but one consequence of splitting the formats out into
different libraries is that they can be of substantially different levels of
quality. My impression is that the JSON support is best-in-class, but I have
no idea about the others.

For example, I've been looking at the CBOR library for Serde [1], and it's not
obvious whether the library is full-featured, robust, actively supported, etc.
Same goes for many of the other Serde formats. At the moment I'm likely to
just choose JSON for new projects since I don't want to build on top of
something that isn't known to be solid, but it would be really nice to be able
to use binary formats for what I want to do.

Now that Serde is 1.0 it would be nice to do a push on the individual formats
so that users coming in can tell what's active and well-supported vs a
(possibly inactive) community contribution.

[1]: [https://github.com/pyfisch/cbor](https://github.com/pyfisch/cbor)

~~~
xiphias
This is a global problem for packages, and I think there's a push for a
feature where developers can mark how stable a crate is. In any case, for
packages without passing tests we already know they can't be stable yet,
since they don't even guarantee that their unit tests pass :)

------
ktta
There's also capnproto-rust[1] if anyone was wondering.

[1]: [https://github.com/dwrensha/capnproto-rust](https://github.com/dwrensha/capnproto-rust)

~~~
emmelaich
... but it doesn't use Serde.

Should it? Or can it? Am curious, not a criticism.

~~~
ktta
They target different things. Serde is a framework that serializes data into
common text-based or binary formats.

But there's been some innovation to make that even more efficient. It started
with protocol buffers[1]. You write .proto files based on protocol buffers'
own schema language[2], which look like this[3]. What's special about these
schemas is that they are strongly typed, and once a schema (a .proto file, in
the case of protocol buffers) is written, code for any language can be
generated to receive and parse the binary-encoded message properly, with
proper error checking. This avoids rewriting code in different languages when
an RPC protocol changes. It also offers other advantages; you can look into
the docs for those.
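For illustration, a tiny schema in that style (proto3 syntax; the message and field names here are made up):

```proto
syntax = "proto3";

// A toy message type; the generated code gives every target language a
// typed parser and serializer for it.
message Person {
  string name = 1;
  int32 id = 2;
}
```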

Then the author of protocol buffers left Google and created Cap'n Proto[4],
which improved on it in many ways. What I linked to is a Rust implementation
supporting Cap'n Proto's own schema language[5].

[1]: [https://developers.google.com/protocol-buffers/](https://developers.google.com/protocol-buffers/)

[2]: [https://developers.google.com/protocol-buffers/docs/proto3](https://developers.google.com/protocol-buffers/docs/proto3)

[3]: [https://github.com/WhisperSystems/libsignal-protocol-c/blob/...](https://github.com/WhisperSystems/libsignal-protocol-c/blob/master/protobuf/WhisperTextProtocol.proto)

[4]: [https://capnproto.org/](https://capnproto.org/)

[5]: [https://capnproto.org/language.html](https://capnproto.org/language.html)

~~~
emmelaich
I'm aware of the details of Cap'n Proto.

But I thought that since Avro is somewhat similar to Cap'n Proto, and it uses
Serde (in Rust), then capnproto could/should too.

But it sounds like it is a "should not".

------
eridius
This is really cool. But how does zero-copy deserialization work for &'a str
with formats like JSON where the input data may have string escapes (and
therefore needs to be mutated during deserialization)?

~~~
whyever
It doesn't. You can use Cow<'a, str> if you want string escapes to work, but
then it is not really zero-copy anymore.

~~~
eridius
So, what, if you use &str you get the raw data, escapes and all? That's not
very good, especially because it means round-tripping your struct through JSON
can return the wrong result.

I assume that Cow<'a, str> will only produce an owned string if there are
string escapes that need decoding? If so, that's probably the best approach,
as you'd get zero-copy as long as no mutations are required, but
round-tripping through JSON would still work right.

~~~
conradev
That makes sense. I want to rewrite my plist parser to both use Serde and be
zero-copy ([https://github.com/conradev/plist-rs/issues](https://github.com/conradev/plist-rs/issues)),
and for binary plists you would similarly need to convert UTF-16 strings to
UTF-8, but wouldn't need to touch UTF-8 strings.

I wonder how the API would work. Would json::Value be modified to have a
lifetime and contain Cow enums? How would it implement ToOwned? It almost
seems like there would have to be separate types, a json::Value<'a> and
json::OwnedValue.

------
leshow
Awesome! I end up using Serde in just about every project. It's great!
Congrats guys.

------
newsat13
Does serde support serializing to protobuf and back?

~~~
steveklabnik
[https://crates.io/crates/serde_protobuf](https://crates.io/crates/serde_protobuf)

(I haven't tried it)

~~~
stusmall
From the README: "Serialization is not yet implemented in this version."

~~~
steveklabnik
We really have to start rendering READMEs on crates.io...

~~~
Veedrac
IMO you should just merge with docs.rs.

