> - You ask for a review on your site but that just links to the github project page. Have you done a show-hn by the way?
I'm asking for a review of the specifications, which is what the main github page is for. There's also a reference implementation, but that's in a separate repo. I've done a show HN in the past. It might be time for a fresh one since it's been a couple of years.
> - I believe schema representation and encoding should be two separate things; In other words, a good data manipulation tool should support several schema formats and several encoding formats
The schema representation will not be tied to the encoding. I'm leaning towards maybe https://cuelang.org/ as the officially endorsed schema format, but there's nothing stopping someone from using something else.
> - I personally prefer the types to be a little less opinionated (for instance, there are many legitimate definitions for a "date" so I do not want the data layer to favor one over the others, although I reckon having some support for dates is convenient)
I put a date in there because it's such a fundamental data type that everyone wants one, and if it's not specified in an opinionated manner, everyone comes up with their own (likely incompatible) interpretation, which is what I want to avoid. The idea is that a user of the format shouldn't have to worry about HOW to encode their data unless they're using exotic types.
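As a rough illustration from the Go side (a sketch, assuming the reference implementation maps Go's time.Time onto the format's temporal types, which is the obvious mapping but not confirmed here):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// The caller just hands over a time.Time; the format spec, not the
	// application, decides the one wire encoding for temporal values.
	record := map[string]interface{}{
		"name":    "example",
		"created": time.Date(2023, 5, 1, 12, 0, 0, 0, time.UTC),
	}
	fmt.Println(record)
}
```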
> - having a type for markup is particularly suspicious.
I'm still not 100% decided on whether markup will stay or go. I've already tried and dropped dozens of other types, so it might go before I release...
> - Are strings UTF-8? UTF-8 only?
Yes, UTF-8 only.
> - Why do you need specific types for edges and nodes?
Because nodes alone can't represent weighted, non-directed, or otherwise complex graphs, and edges are by their nature too bulky for representing trees.
> - No type for records/structures apart from the top-level one??
Not sure what you mean? Structures are represented using the map type and a schema.
> - No sum types? How do you handle nulls?
Null is allowed. You can pass {"result" = 500} or {"result" = null} or {"result" = [1 2 3 4]} if you want. Sum types would be enforced by the schema.
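To make that concrete in Go terms (a sketch, assuming the natural mapping of Go nil to the format's null; the schema layer is what would narrow "result" down to a sum of allowed types):

```go
package main

import "fmt"

func main() {
	// Three documents, all structurally valid; in CTE they'd read
	// {"result" = 500}, {"result" = null}, and {"result" = [1 2 3 4]}.
	docs := []map[string]interface{}{
		{"result": 500},
		{"result": nil}, // nil stands in for the null type
		{"result": []int{1, 2, 3, 4}},
	}
	// A schema acting as a sum type would accept or reject each shape;
	// the wire format itself doesn't care.
	for _, doc := range docs {
		fmt.Println(doc)
	}
}
```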
> - Lists are of heterogeneous types? Can we have a type for "lists of some type"
This would be the job of the schema. There are typed arrays for primitive types like int and float since that's a fairly common use case and would be too bulky otherwise.
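Sketching the distinction from the Go side (an assumed mapping, not confirmed library behavior):

```go
package main

import "fmt"

func main() {
	// A homogeneous slice of a primitive type is a candidate for a typed
	// array: one type header plus a packed payload.
	compact := []int16{1, 2, 3, 4}

	// A mixed slice needs a general list, where every element carries its
	// own type information - bulkier, but fully heterogeneous.
	mixed := []interface{}{1, "two", 3.0}

	fmt.Println(compact, mixed)
}
```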
> - How come comments ended up being types?
Comments are "types" in the sense of what kinds of data a document can physically contain. They aren't "types" in the sense of actual data to be passed to applications (although an application could listen for them; a CTE reformatter or sanitizer, for example).
> - The front-page should display a comparison in encoding and decoding speed and encoded size compared to the major contenders (which are protobuf and json, I guess)
Encoding and decoding speed would depend on the implementation. This is just the specification.
> - It should also display the corresponding go code for each presented schema examples
Not sure how useful that would be, since every single example would be cbe.Marshal(myobject, stream) and myobject = ce.Unmarshal(mytype, stream).
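For what it's worth, here's the shape every example would take. The marshal/unmarshal functions below are local stand-ins for the calls named above, declared only so the sketch compiles; the real signatures in the reference implementation may differ:

```go
package main

import (
	"bytes"
	"fmt"
)

// Stand-ins for cbe.Marshal / ce.Unmarshal; they do no real encoding.
func marshal(v interface{}, stream *bytes.Buffer) error     { return nil }
func unmarshal(stream *bytes.Buffer, out interface{}) error { return nil }

type Point struct{ X, Y int }

func main() {
	var stream bytes.Buffer

	// Encoding: the call site is identical no matter what the schema says.
	in := Point{X: 1, Y: 2}
	_ = marshal(in, &stream)

	// Decoding: likewise one call, with the schema doing the real work.
	var out Point
	_ = unmarshal(&stream, &out)
	fmt.Println(in, out)
}
```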