chaokunyang's comments


See https://github.com/apache/fury for details about the Apache Fury serialization framework.


Apache Fury is a blazingly fast serialization framework for the JDK. For GraalVM native images, Fury uses automatic codegen for object graph serialization at image build time, which eliminates all reflection and introspection at runtime. Combined with its optimized serialization protocol, it runs up to 50x faster than GraalVM JDK serialization. Users don't have to specify a reflection config JSON file and no annotations are needed, so it's easy to use and keeps the mental overhead low.
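
A minimal usage sketch, based on my reading of the project README (package and builder method names such as org.apache.fury.Fury may differ across versions); for native images the Fury instance is typically created in a static initializer so serializer codegen can happen at image build time:

    import org.apache.fury.Fury;
    import org.apache.fury.config.Language;

    public class Example {
      // Built in static initialization so that, under --initialize-at-build-time,
      // serializer generation runs at native image build time instead of runtime.
      static final Fury FURY = Fury.builder()
          .withLanguage(Language.JAVA)
          .requireClassRegistration(true)
          .build();

      static {
        FURY.register(Point.class); // explicit registration replaces reflection config
      }

      static class Point {
        public long id;
        public String name;
      }

      public static void main(String[] args) {
        Point p = new Point();
        p.id = 1L;
        p.name = "fury";
        byte[] bytes = FURY.serialize(p);
        Point copy = (Point) FURY.deserialize(bytes);
        System.out.println(copy.id + " " + copy.name);
      }
    }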



JSON/Protobuf use a KV layout when serializing: they write field names/types multiple times for multiple objects of the same type. And that sparse layout is not friendly to the CPU cache or to compression.

We added a scoped meta share mode in Apache Fury 0.6.0, which greatly improves both performance and space usage.

With meta share, we write the field name & type metadata of a struct only once for multiple objects of the same type, which saves space and improves performance compared to protobuf. We can also encode the metadata into binary in advance and write it with a single memory copy, which is much faster.

In our test, for a list of numeric structs, Fury is 6x faster than protobuf and the payload is about half the size.
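
A rough sketch of the idea (not Fury's actual wire format; the names below are made up for illustration): within one serialization scope, each struct type's field name/type metadata is encoded once, and later objects of the same type only reference it by a small index:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.lang.reflect.Field;
    import java.nio.charset.StandardCharsets;
    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only: tracks which types already had their meta written in this scope.
    class ScopedMetaWriter {
      private final Map<Class<?>, Integer> metaIndex = new HashMap<>();

      // Writes the type meta the first time a type is seen; afterwards only the
      // returned index needs to precede each object's field values.
      int writeTypeMeta(Class<?> type, ByteArrayOutputStream out) throws IOException {
        Integer index = metaIndex.get(type);
        if (index != null) {
          return index; // meta already present in the stream for this scope
        }
        StringBuilder meta = new StringBuilder(type.getName());
        for (Field f : type.getDeclaredFields()) {
          meta.append(';').append(f.getName()).append(':').append(f.getType().getName());
        }
        // The encoded bytes could be cached per type and written with one memory copy.
        out.write(meta.toString().getBytes(StandardCharsets.UTF_8));
        index = metaIndex.size();
        metaIndex.put(type, index);
        return index;
      }
    }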


Pretty cool, the multi-level IR is great. Does KQIR support predicate pushdown?


We already support that kind of dictionary encoding: we let users register a class with an ID and write the class by that ID; the ID is encoded as a varint, which takes only 1~5 bytes. But not every user likes the registration step. The meta string encoding here is for cases where no such mapping is available.
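
For illustration, a minimal varint writer in Java (a generic LEB128-style sketch, not necessarily Fury's exact encoding), showing why a class ID fits in 1~5 bytes:

    import java.io.IOException;
    import java.io.OutputStream;

    final class VarInt {
      // Writes an unsigned 32-bit value in 1~5 bytes: 7 payload bits per byte,
      // with the high bit set on every byte except the last.
      static void writeVarUint32(OutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
          out.write((value & 0x7F) | 0x80);
          value >>>= 7;
        }
        out.write(value);
      }
    }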


Meta string is used only for ASCII chars, so every char fits in a single byte and we can random-access them. But your reminder is a good one: I just found out that we don't check that the passed string is actually ASCII, although our own callers always pass ASCII strings.
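
Something like the following check (a hypothetical sketch, not the actual fix) would catch non-ASCII input before each char is treated as a single byte:

    // Hypothetical validation: reject strings containing code units above 0x7F.
    static boolean isAscii(String s) {
      for (int i = 0; i < s.length(); i++) {
        if (s.charAt(i) > 0x7F) {
          return false;
        }
      }
      return true;
    }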


Glad to see the H2 database engine benefits from this. I use H2 too.


Yes, we only use this encoding when it brings benefits. `wire format` is a good term for such cases, where the context is constrained to the current message only. So the data available for compression is small, and many statistical methods won't work precisely because it's a `wire format`: the statistics themselves would have to be sent to peers, which costs more space and CPU.

