
Show HN: Verify JSON using minimal schema - yusufnb
https://github.com/yusufnb/verify-json
======
tylerchr
Very neat. I had a similar need recently which led to a similar solution[1].
Notably, I used a syntax nearly identical to yours, albeit without the very
handy pluggable validators yours has. Nice job.

What led you to choose “!” as the “optional” modifier? My intuition would have
guessed that character to have the opposite meaning.

[1]: project:
[https://github.com/tylerchr/jstn/](https://github.com/tylerchr/jstn/) —
playground:
[https://tylerchr.github.io/jstn/](https://tylerchr.github.io/jstn/)

~~~
willvarfar
A long time ago, before MyPy, I wrote a Python schema checker for JSON. It grew
into argument checking too:
[https://pypi.org/project/obiwan/](https://pypi.org/project/obiwan/)

Simple stuff like:

    
    
       Person = { "name": str, "age": int }
    

The basic idea was to use a file full of obiwan types to document the REST
API. The schema was both human-readable and valid Python, so the API could
trivially check incoming data against it. Once the schema check passes, of course,
it's safe to go walk the JSON knowing that things aren't going to crash on you.

It worked great. Still works great. I still prefer the syntax to that of MyPy.

No real adoption, of course :) But fun to remember what could have been :)
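The schema-as-plain-Python idea can be sketched in a few lines (an illustration of the concept, not obiwan's actual implementation):

```python
def check(value, schema):
    # A type in the schema means an isinstance check; a dict means a
    # required set of keys, each checked against its sub-schema.
    if isinstance(schema, type):
        return isinstance(value, schema)
    if isinstance(schema, dict):
        return (isinstance(value, dict)
                and all(k in value and check(value[k], sub)
                        for k, sub in schema.items()))
    return False

Person = {"name": str, "age": int}

print(check({"name": "Ada", "age": 36}, Person))  # True
print(check({"name": "Ada"}, Person))             # False: "age" missing
```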

------
smoyer
Maybe I'm in the minority, but I don't want light-weight at the expense of
compatible/standard. It's being recognized that validation is more important
than we'd expected; the recent ACM article urging moderation on Postel's Law
really resonated with my experiences with RESTful/JSON micro-services
([https://queue.acm.org/detail.cfm?id=1999945](https://queue.acm.org/detail.cfm?id=1999945)).
But if we're going to the effort of validating, let's do it in a
language/framework agnostic way ... jsonschema.org is mentioned in other
comments but I'm a fan of the OpenAPI initiative
([https://www.openapis.org/](https://www.openapis.org/)). Then we can all
share the validation rules!

~~~
yusufnb
I understand. In my experience, the usability of those standards is pretty
bad, which in effect leads to projects not using anything at all.

One of the benefits of JSON over XML is that it is concise and fast to work
with. The standard should reflect that as well.

With this lib, verifying the JSON for a simple REST request is as simple as
`verify(json, "{a,b,c}")`.
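The spirit of that one-liner can be sketched in a few lines (a Python illustration of the idea, not the library's actual implementation, which is JavaScript):

```python
def verify(data, schema):
    # Parse a flat "{a,b,c}" schema string and require each named key.
    keys = [k.strip() for k in schema.strip("{}").split(",")]
    return all(k in data for k in keys)

print(verify({"a": 1, "b": 2, "c": 3}, "{a,b,c}"))  # True
print(verify({"a": 1}, "{a,b,c}"))                  # False
```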

~~~
smoyer
> leads to projects not using anything at all

We're counting that against products when we're analyzing what services to
adopt/purchase. Doing nothing at all, or something non-standard, results in
everyone else having to attempt to do it for you, with the resulting
inconsistencies. I'm not sure why you think OpenAPI is tailored for XML ...
Here's the specification for the sample application (petstore) with service
and model definitions in YAML:
[https://github.com/OAI/OpenAPI-Specification/blob/master/examples/v3.0/petstore.yaml](https://github.com/OAI/OpenAPI-Specification/blob/master/examples/v3.0/petstore.yaml).

~~~
yusufnb
I think we are talking about different things here. I love OpenAPI and
wouldn't do a project without it. verify-json is a lightweight way to validate
JSON against a schema and does not intend to change/improve anything about OpenAPI.

This is more for client-server JSON validation, which comes in handy when
different teams are writing the client and server side code, especially in
startups where iterations are rapid.

~~~
smoyer
A long time ago in a galaxy far, far away, CORBA implemented ICDs as
"contracts" between parties. It seems like a client and server developed by
different teams is still a good use-case for "something" that specifies the
contract. In the 2000's, we wrote mil-spec style ICD documents that we
attempted to enforce when writing code that had to interact (via a message
bus). A language/framework-agnostic contract, enforced by something executed
in code, is a great step forward from just having an ICD document. Note that
early CORBA, XML-RPC and SOAP/XML could perform this function, but people did
horrible things to the data (using an indexed array of strings as parameters)
that broke both typing and the ability to do type-specific validation.

From the way you've phrased your comment, I'm going to assume that the
different teams are within the same start-up ... having a versioned,
executable contract is even better when iterations are rapid and there are
breaking changes. I've been in enough start-ups to know that things are skipped over to
get a product out the door. I also know that the marketing and sales guys will
say "we're already using a RESTful/JSON API for our back-end, let's open our
API to third-parties to accelerate adoption of our SaaS". Ouch! (but I've been
there)

I will admit that "back-filling" an OpenAPI specification for an existing
service can be pretty daunting ... it's quite a bit easier if you start with
the first MVP and iterate the ICD with each iteration of the client and
server. Admittedly, we (software developers) still have issues versioning
services. It's not so bad for incremental changes but breaking changes to a
service where you have to support both older and newer clients is painful.

------
catlifeonmars
This has already been said in the nested comments, but why not “?” for
optional? Coming from Typescript, Swift, etc “!” feels more like an assertion
of non-nullity (is that a word?). It’s also easy to read as “not <key>”.

~~~
yusufnb
Hmmm, true. ? might be better. Let me think it through. I will update the
package to support that.

~~~
yusufnb
You can track this issue for that change:
[https://github.com/yusufnb/verify-json/issues/2](https://github.com/yusufnb/verify-json/issues/2)

------
oefrha
Exclamation mark for optional is just weird. Question mark would be a lot more
intuitive, unless there’s an origin story here I’m not aware of.

~~~
angry_cactus
Not sure of the exact motivation, but Swift uses exclamation mark to unwrap
optionals.

~~~
oefrha
In Swift optional types are marked by a question mark (exactly as I
suggested). Unwrapping an optional makes it non-optional, so the exclamation
point has the opposite meaning.

------
Eikon
It's crazy the amount of work that is spent to essentially implement static
typing into dynamically / duck typed languages.

Projects like this, mypy and others make the whole thing _bizarre_ to say the
least.

~~~
ken
What statically typed language lets me define a type as “number between -180
and +180”, or “string which contains only alphanumeric chars”?

I think that would be a great feature but from what I see static typing fails
here, too.

~~~
rraval
> What statically typed language lets me define a type as “number between -180
> and +180”, or “string which contains only alphanumeric chars”?

Pretty much all of them. Any simple predicate like this can be encoded with
witness types.

Here's an example in Java, which is hardly the paragon of static typing (i.e.
it's no Haskell/Idris/Agda/Rust/Typescript):

    
    
        class AlphaNumericString {
            private final String str;

            // use a fallible factory with a `private` constructor if you're
            // morally opposed to exceptions
            public AlphaNumericString(String str) throws AlphaNumericException {
                if (!str.matches("^[a-zA-Z0-9]*$")) {
                    throw new AlphaNumericException();
                }

                this.str = str;
            }

            public String value() {
                return str;
            }

            // public so that callers can actually declare and catch it
            public static class AlphaNumericException extends Exception {
            }
        }
    

Now code can freely use `AlphaNumericString` and be guaranteed that it has
been validated.

You may object and say that newtype wrapping is cumbersome, but:

1. That's an argument about sugar and ergonomics, not about the semantics
that the static type system enforces.

2. Some languages make it easier to generate forwarding methods to the
underlying type (a la
[https://kotlinlang.org/docs/reference/delegation.html](https://kotlinlang.org/docs/reference/delegation.html)).

3. `AlphaNumericString` describes a smaller set of values than
`String`. In general, you should strongly consider the methods you allow
and make sure that all paths continue to enforce the semantics you intend.

------
gunn
Why are schemas strings?

It would seem much more natural to make them JSON structures since they're
almost that anyway.

~~~
ZenPsycho
then you might as well use jsonschema.

~~~
mikl
Indeed, there seems to be no reason to use this instead of JSON Schema.

------
tckr
JSON schema and ajv are my go to tools for this:
[https://github.com/epoberezkin/ajv](https://github.com/epoberezkin/ajv)

------
suref
I built something similar for my latest webapp to validate JSON requests in
Python; the syntax is like this:

    
    
        {
            "content": Required(str),
            "username": Required(str, validate=username_is_ok, transform=lambda x: x.lower(), 
                       fail_message="Username isn't valid"),
            "message": Optional(str),
            "some_list": Required({
                "name": Required(str),
                "date": Required(str)
            }, is_list=True)
        }
    

Then I can provide this in a hook to the request method.

~~~
1f97
can you tell me a bit more about this? i wanted something like this for a
small api i have but ended up using marshmallow to validate jsons against
defined schemas which i think is a bit overkill.

~~~
kingosticks
If you haven't seen, these alternatives might also be helpful:

    
    
       * https://github.com/keleshev/schema
       * https://github.com/alecthomas/voluptuous
       * https://github.com/Pylons/colander

~~~
1f97
thanks, i will take a look!

------
ivanhoe
This is OK for simple structures, but in deeply nested structures and larger
APIs you'd probably also need some equivalent of json-schema $refs and the
ability to reuse blocks in your API schema.

------
Zinggi
You shouldn't validate JSON. You should parse it, i.e. transform it into your
desired data structure if it comes in a valid shape.

[https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)

I really like Elm's way of JSON decoding for this.

~~~
guitarbill
I read the post. It seems to be mainly applicable to statically-typed
languages. Apparently, the core problem with validation is "shotgun parsing":

> Shotgun parsing is a programming antipattern whereby parsing and input-
> validating code is mixed with and spread across processing code—throwing a
> cloud of checks at the input [...]

> The problem is that validation-based approaches make it extremely difficult
> or impossible to determine if everything was actually validated up front
> [...]

Err, what? So validating against a well-defined schema won't necessarily cause
this. But okay, again, I can buy the benefits for statically typed languages.
There's more though:

> Don’t be afraid to parse data in multiple passes. Avoiding shotgun parsing
> just means you shouldn’t act on the input data before it’s fully parsed
> [...]

> Use abstract datatypes to make validators “look like” parsers. Sometimes,
> making an illegal state truly unrepresentable is just plain impractical
> given the tools Haskell provides, such as ensuring an integer is in a
> particular range.

I've experienced this in Java/Jackson (which btw, proves this is not exactly
new, sexy, or rare in the statically typed world).

What is the suggestion for a dynamic language? In e.g. JavaScript, classes
will only get you so far, and they seem needlessly heavyweight if you aren't
going to get the other benefits of type-safety. I really don't see how this is
helpful, even after putting in the time to investigate this.

~~~
Zinggi
In JS, my preferred way of handling JSON is heavily inspired by Elm's JSON
decoding.

I want my decoders to essentially do 2 things:

1. Transform received data into a data structure that is best suited for my
app. This might involve converting lists to objects, objects to sets, parsing
dates, etc.

2. Only succeed on valid data.

This way, my application never has to deal with bad data. I also get to design
the data structures I use, rather than having them dictated by the APIs I
consume. By only validating and not transforming, you push more advanced
validation further away from where the data was received (since you'll need to
transform the data at some point anyway).
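Those two points can be sketched in Python (a hypothetical `decode_user` for illustration, not any particular library):

```python
from datetime import date

def decode_user(raw):
    # Point 2: fail loudly on invalid data...
    if not isinstance(raw.get("name"), str) or not raw["name"]:
        raise ValueError("name must be a non-empty string")
    # Point 1: ...and transform while decoding: ISO string -> date,
    # wire-format list -> set.
    try:
        joined = date.fromisoformat(raw["joined"])
    except (KeyError, TypeError, ValueError):
        raise ValueError("joined must be an ISO date string")
    return {"name": raw["name"],
            "joined": joined,
            "tags": set(raw.get("tags", []))}

user = decode_user({"name": "Ada", "joined": "2020-05-01",
                    "tags": ["a", "a", "b"]})
# user["joined"] is a datetime.date; user["tags"] == {"a", "b"}
```

Code downstream of the decoder never sees a raw date string or a duplicate-laden list, which is the whole point.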

As for libraries, I want composable parsers. For TS I'd probably use:
[https://github.com/paperhive/fefe/](https://github.com/paperhive/fefe/)

------
jacekm
Karate[1] has its own DSL for describing both JSON and XML. The output is
short, but not very readable; for people new to the project it's rather
cryptic. It is easy to maintain, though, once your team becomes fluent in
Karate.

[1] [https://github.com/intuit/karate](https://github.com/intuit/karate)

------
andrenarchy
You can also use fefe
([https://github.com/paperhive/fefe/](https://github.com/paperhive/fefe/)) to
validate _any_ data with its pure functional and minimalist approach. Bonus:
it's 100% type-safe automatically if you use Typescript.

------
crtlaltdel
i saw this and was thinking “hrm...this is the sort of thing i use lodash
for...if i even need it”.

and now i see lodash is a dependency.

~~~
ddoolin
It also exposes itself as a Lodash mixin, which I have never used nor seen, so
that's something learned.

------
verdverm
Enter [https://cuelang.org](https://cuelang.org)

------
philliphaydon
Wow this is very clean!

~~~
eventreduce
I do not think so. If you check the example schema, it is very hard to
understand. What does ':lat' mean? Is it a string or a number? What does
'!b' mean, can it only be false?

~~~
philliphaydon
If you read the whole example:

lat = custom validator

b = shorthand for boolean (as is s for string)

! = optional

So for me, reading the whole example, it's very easy to understand and digest.

~~~
eventreduce
Yes, reading the docs explains what these keywords mean. But by just looking
at the schema it is impossible to understand what they are. And if you check
the definition of clean code, which is something like "intuitively
understandable", then it becomes clear that this is not clean.

~~~
philliphaydon
Ah I see where you’re coming from. Good point.

------
therufa
There is already a schema specification for JSON. It's conveniently called
`json-schema`: [https://json-schema.org/](https://json-schema.org/)

~~~
eventreduce
I think the author knew this but wanted a more minimal type of validation.

~~~
arethuza
JSON Schema does scale down pretty well (indeed, as the link above shows, the
simplest possible JSON schema is just "{}"). It also has pretty good tool
support (e.g. VS Code), and there is a JSON schema for JSON schemas, which
helps a lot.

The main advantage of the approach in the article appears to be that it's easy
to extend validation with code, which is a nice touch. Pretty much any
complex JSON validation I have written has to combine both schema-based
validation _and_ code-based checks, usually done completely separately.

~~~
eventreduce
Json-schema itself scales down pretty well, but the libraries for validating
it do not. Even if your schema is only {}, the build-size increase from
pulling in a jsonschema validator is enormous because of all the features it
has to support.

~~~
arethuza
Fair enough, I guess; I don't think I've ever had a need to do client-side
JSON validation against a 'schema', so I've never had that concern.

