
Designing the perfect TypeScript schema validation library - colinmcd
https://vriad.com/blog/zod/
======
cjdell
I built something like this a few months ago but struggled to explain to
almost everyone except die hard TypeScript fans what the big deal was.

I explained it was all about the type inference from the schema so I could
make guarantees across runtime boundaries.

This is what I came up with, though this library looks like it may be more
comprehensive than what I wrote. Struggled to come up with a good name to
represent exactly what it did...

[https://www.npmjs.com/package/type-safe-validator](https://www.npmjs.com/package/type-safe-validator)

The ability of the TypeScript compiler to allow this stuff continues to make
it one of my favourite languages to work with, despite it still being
JavaScript underneath.
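
For anyone wondering what "type inference from the schema" means in practice, here is a minimal sketch. It is not the API of type-safe-validator or Zod; `Validator`, `str`, `num`, and `obj` are hypothetical names made up for illustration. The point is that the static type is derived from the runtime schema, so data crossing a boundary such as `JSON.parse` is checked once and typed from then on.

```typescript
// Hypothetical mini-validator: every name below is illustrative only.
type Validator<T> = (input: unknown) => T;

const str: Validator<string> = (input) => {
  if (typeof input !== "string") throw new Error("expected a string");
  return input;
};

const num: Validator<number> = (input) => {
  if (typeof input !== "number") throw new Error("expected a number");
  return input;
};

// Builds an object validator whose static type is derived from its shape.
function obj<S extends Record<string, Validator<unknown>>>(
  shape: S
): Validator<{ [K in keyof S]: ReturnType<S[K]> }> {
  return (input) => {
    if (typeof input !== "object" || input === null) {
      throw new Error("expected an object");
    }
    const record = input as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(shape)) {
      // Each field validator throws if its field has the wrong type.
      out[key] = shape[key](record[key]);
    }
    return out as { [K in keyof S]: ReturnType<S[K]> };
  };
}

const userSchema = obj({ name: str, age: num });

// Inferred from the schema; no separate interface has to be written.
type User = ReturnType<typeof userSchema>;

const user: User = userSchema(JSON.parse('{"name":"Ada","age":36}'));
console.log(user.name.toUpperCase(), user.age + 1);
```

The schema is the single source of truth here: changing `userSchema` changes `User`, and the compiler flags every use that no longer fits.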

~~~
colinmcd
I guess it's something in the air! Given the popularity of the other
libraries, it's surprising that there isn't an optimal one.

And yeah, unless you've gone down the rabbit hole of type safety obsession,
it's hard to understand the necessity of this. I like that you specifically
call out REST API validation in your docs...the difficulty of type safe API
implementation was what drove me to build zod.

I'm using Zod plus a hand-rolled codegen tool to implement an end-to-end type
safe RPC API that validates all data at runtime AND generates a statically
typed TypeScript SDK for use on the client.

PS If people like the sound of that I might publish that too. Let me know if
it sounds interesting.

~~~
Lapland
I think it sounds super interesting, would love to see how you've solved it :)

------
Gehinnn
I've been working on something similar, also struggling with `io-ts`.
However, I want the schema to be serializable (possibly compatible with JSON
Schema) so that a rich ecosystem of tools can be built upon it. There is still
a long way to go though.

My library `@hediet/cli` [1] uses this technology to offer a browser based UI
for CLI applications [2]. In the long term, I want to replace my use of
`io-ts` with this technology in my JSON RPC implementation [3], so that I can
implement something similar to Swagger for JSON RPC.

I always felt the possibilities of JSON Schema are not really explored, which
I think might be due to bad design of JSON Schema (it's really hard to
consider all these denormalizations, it's like HTML 20 years ago).

I'll have a more detailed look at your library once I find time!

[1] [https://github.com/hediet/ts-cli/blob/master/cli/README.md](https://github.com/hediet/ts-cli/blob/master/cli/README.md)

[2] [https://github.com/hediet/ts-cli/blob/master/cli/docs/gui.pn...](https://github.com/hediet/ts-cli/blob/master/cli/docs/gui.png)

[3] [https://github.com/hediet/typed-json-rpc](https://github.com/hediet/typed-json-rpc)

------
hn_throwaway_99
While I think this is pretty cool, given his use case (wanting to use a
strongly typed client and server), I can't see why one wouldn't just use
GraphQL with TypeScript on both the client and server. I've done this and I
love it:

1\. Define the GraphQL interface with GraphQL's typedef language. GraphQL then
takes care of validating all requests and responses at runtime.

2\. Use something like
[https://github.com/dotansimha/graphql-code-generator](https://github.com/dotansimha/graphql-code-generator)
to generate all your TypeScript types from your GraphQL schema. Place these
types on your resolvers and then you get compile-time type checking.

3\. Since GraphQL actually exposes your schema as part of the endpoint,
clients can use the same tool to keep their TypeScript types in sync, and get
the same typing benefits when writing the client code.

~~~
pas
One use case of validators is for config files.

~~~
eyelidlessness
FWIW, you can somewhat accomplish this with just the `config` library and
config files written as TypeScript.

~~~
pas
Sure, this discussion is full of options, but none are really perfect, so
there are plenty of opportunities for bikeshedding!

~~~
eyelidlessness
I guess you could call trying to offer a helpful suggestion bike shedding
¯\\_(ツ)_/¯

~~~
pas
Excuse me if I have offended you, my intent was rather far from that.

The bike-shedding is about which one to pick right now, when there doesn't
seem to be a clearly superior choice.

I saw your suggestion as a bit half-baked, just like the ad-hoc stuff I
actually use now and the other chime-ins in the discussion. (io-ts,
utility-types, runtypes, etc. all are great, but as the title indicates it
feels there should be one schema system specifically designed for TS. That can
be used for description and validation of all interfaces of a TS system,
including incoming requests, static and runtime configuration, outgoing data,
and mapping these to internal domain modeling.)

~~~
eyelidlessness
> Excuse me if I have offended you, my intent was rather far from that.
>
> The bike-shedding is about which one to pick right now, when there doesn't
> seem to be a clearly superior choice.

Fair enough. My intent was not bike shedding, but just mentioning a thing that
exists if you would like to use it. In my experience most people are not aware
`config` supports TypeScript config files.

> I saw your suggestion as a bit half-baked, just like the ad-hoc stuff I
> actually use now and the other chime-ins in the discussion.

For what it's worth, for libraries which require config I combine io-ts (well
my wrapper around it) for validation and the types it generates for type
safety. It is not an experience that feels half-baked to me. It guarantees
that clients using my libraries provide valid configuration at runtime, and
allows them to get static guarantees of the same (which I also employ in
clients which use those libraries).

> (io-ts, utility-types, runtypes, etc. all are great, but as the title
> indicates it feels there should be one schema system specifically designed
> for TS. That can be used for description and validation of all interfaces of
> a TS system, including incoming requests, static and runtime configuration,
> outgoing data, and mapping these to internal domain modeling.)

IMO there doesn't need to be just one, as long as people can find one that
suits their needs. I certainly use io-ts for all of the things you describe,
it is designed specifically for TS, and I am quite happy with it.

------
eyelidlessness
This is great. I recently built a fairly large amount of functionality
_around_ io-ts, adding support for mixed required/optional fields, eliminating
the need to deal with `Either` monads, and a whole lot of other stuff that's
mostly just relevant to my company's usage. I think if this had been available
at the time I was looking for an underlying library, it would have been a no
brainer to choose this one.

Worth noting, another option is Runtypes[1], which also looks great. I can't
remember off the top of my head why I ultimately picked io-ts over Runtypes,
but it's another one for folks to consider (and I'd be curious what the Zod
author thinks of it).

[1] [https://github.com/pelotom/runtypes](https://github.com/pelotom/runtypes)

~~~
eyelidlessness
Oh, I do want to add one bit of minor feedback for the author: one of the
things I like about the io-ts interface is that fields/codecs are values
unless they require a parameter (e.g. `t.string` vs `z.string()`). It's a
little bit easier for me to read. It's also not clear to me at a glance
whether calling `z.string()` is creating a new instance of something and
whether that affects the behavior of a given schema.

~~~
colinmcd
I personally prefer to standardize everything as a function. It also leaves
some breathing room in case I decide to include parameters as part of a future
API augmentation.

~~~
eyelidlessness
I hope you'll give me the opportunity to try to convince you otherwise.

I understand the context of this is that io-ts puts an onerous emphasis on FP
concepts and data structures. But FP _principles_, when applied in an
effective and usable way, are quite good. They make it easier to reason about
and maintain code.

One such principle is that a "function" with no parameters is a good sign that
it will cause some kind of side effect. I sort of hinted at this when I
questioned the design, when I said that it wasn't clear to me whether calling
these functions was producing new instances and whether those new instances
affect behavior.

I would also argue that leaving room to parameterize primitive types is
leaving room to make a drastically more complicated API in the future. A
`string` with "options" suddenly becomes its own sub-API.

While io-ts is more complicated, once learnt its API is more predictable. A
`Type` is a value. An `interface` (or similar) type is a `Record<string,
Type>`. Always. `Type`s can be augmented/composed/etc (and you may provide a
function that accepts one or more `Type`s as parameters to accomplish that),
but they always resolve to a value. The value can be reused without concern
about side effects.

~~~
Ezku
Very strongly agreeing with this assessment and hoping Colin will come to
agree.

A zero-argument function is either a constant, or a side-effect. Given we
don’t want a side-effect, exposing a constant instead of an effectful-looking
function is preferable.
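
To make the distinction concrete, here is a small sketch. `Schema`, `stringValue`, and `stringFactory` are hypothetical names; this is not how io-ts or Zod implement their primitives, only an illustration of the two API shapes being debated.

```typescript
// Hypothetical schema interface, for illustration only.
interface Schema<T> {
  readonly name: string;
  check(input: unknown): input is T;
}

// Style 1: the primitive schema is a constant value.
const stringValue: Schema<string> = {
  name: "string",
  check: (input: unknown): input is string => typeof input === "string",
};

// Style 2: the primitive schema comes from a zero-argument function.
// This particular sketch allocates a fresh object on every call.
function stringFactory(): Schema<string> {
  return {
    name: "string",
    check: (input: unknown): input is string => typeof input === "string",
  };
}

const first = stringFactory();
const second = stringFactory();

// Behaviourally identical, but not the same object.
console.log(first.check("hi"), second.check("hi"));
console.log(first === second); // false here, because this factory allocates per call
```

With the constant, a reader knows there is exactly one value. With the factory, the call site alone does not say whether the result is shared, cached, or freshly allocated.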

------
jrimbault
I'd really like something like a compiler step/plugin to generate the
validators from the declared typescript types.

~~~
eyelidlessness
This is basically that but in the opposite direction. It also has the
advantage of being able to express validations that you can't express in the
type system.

~~~
randomchars
Do you have an example of the kind of validation that couldn't be expressed in
the type system?

~~~
root_axis
Min or max length of an input. Any type of string validation like email or
something similar that's validated via regex.
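
A rough sketch of such checks, assuming nothing about any particular library: `Check`, `minLength`, `maxLength`, `matches`, and `validateString` are hypothetical helpers. The static type of the value is just `string` in every case; only the runtime checks can tell a valid input from an invalid one.

```typescript
// Hypothetical helpers for illustration; not the API of any library in this thread.
type Check = (value: string) => string | null; // null means the check passed

const minLength = (n: number): Check => (value) =>
  value.length >= n ? null : `must be at least ${n} characters`;

const maxLength = (n: number): Check => (value) =>
  value.length <= n ? null : `must be at most ${n} characters`;

const matches = (pattern: RegExp, label: string): Check => (value) =>
  pattern.test(value) ? null : `must be a valid ${label}`;

// Runs every check and collects all failure messages.
function validateString(input: unknown, checks: Check[]): string[] {
  if (typeof input !== "string") return ["must be a string"];
  const errors: string[] = [];
  for (const check of checks) {
    const message = check(input);
    if (message !== null) errors.push(message);
  }
  return errors;
}

// Deliberately loose email pattern, for demonstration only.
const emailChecks: Check[] = [
  minLength(3),
  maxLength(254),
  matches(/^[^@\s]+@[^@\s]+\.[^@\s]+$/, "email"),
];

console.log(validateString("ada@example.com", emailChecks));
console.log(validateString("x", emailChecks));
```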

------
franky47
Adding to the pile of similar approaches, there's TypeBox [1] which uses JSON
Schema as an intermediate artifact (validation can happen with ajv or other
libs), and extracts static types for TypeScript.

Having the JSON Schema intermediate is useful when using Fastify [2], so that
request validation can happen with less boilerplate.

[1]
[https://github.com/sinclairzx81/typebox](https://github.com/sinclairzx81/typebox)

[2] [https://www.fastify.io/docs/latest/Validation-and-Serializat...](https://www.fastify.io/docs/latest/Validation-and-Serialization/)

------
nateabele
Also did something similar[1]. My motivations were a better API (mine is
modeled off of Elm's), and better error messages: I have a use case where
users need to be shown raw, dev-tools-style interactive data dumps, with
error information overlaid on top. The next version will also enable data
generation from schemas.

Also, and perhaps most importantly, this library has no respect for undefined,
because particularly in the context of data modeling, undefined is complete
nonsense.

[1] [https://github.com/ai-labs-team/ts-utils#decoder](https://github.com/ai-labs-team/ts-utils#decoder)

~~~
Gehinnn
I find `undefined` better than `null`. Either a user has contact information,
or the contact information is undefined (and not null). How do you model
that?

~~~
nateabele
See, I take the opposite view: consider that the 'model' is the abstract
structure of the data which your application knows about.

In your case, contact info is a piece of related data that your application
knows about, can traverse to, parse, and consume, and it can be either present
or absent. To me, `null` fits that case perfectly (personally I'd use Maybe,
but, same idea). Finally, in cases where you're modeling a collection of
values where the keys are unknown, see `dict()`.

~~~
Gehinnn
What is `dict()`? What's the meaning of `null`? `undefined` literally means
the value is not defined. `null` only has a technical meaning, which says its
value is the null pointer.

~~~
Dylan16807
That's not what null means in javascript and typescript. There, it's more like
'null' signals that there is no value, and 'undefined' signals that there is
no variable.

~~~
Gehinnn
All runtime differences are quite negligible if you use typescript, at least
in my experience. You never accidentally use unassigned variables in
typescript. Also, `{ x: undefined }` and `{}` are distinguishable.

Undefined plays well together with optional fields in typescript, null does
not and requires normalization! `undefined` in a JSON array is a problem
though.
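
A quick sketch of both points, using plain JavaScript semantics. The `Contact` type and the variable names are made up for the example.

```typescript
// Hypothetical type for illustration: an optional field that may also hold undefined.
type Contact = { email?: string | undefined };

const explicit: Contact = { email: undefined };
const missing: Contact = {};

// The two objects are distinguishable by key presence...
console.log("email" in explicit); // true
console.log("email" in missing); // false

// ...but reading the property gives undefined in both cases.
console.log(explicit.email === missing.email); // true

// JSON drops undefined-valued keys from objects,
// and turns undefined array elements into null.
console.log(JSON.stringify(explicit)); // {}
console.log(JSON.stringify(["a", undefined, "b"])); // ["a",null,"b"]
```

So the distinction survives in memory but not across a JSON boundary, which is where the array case bites.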

------
janpot
You might want to take a look at ts-json-validator [1] as well. It creates
type-aware validators for json-schema.

[1] [https://github.com/ostrowr/ts-json-validator](https://github.com/ostrowr/ts-json-validator)

------
__michaelg
No mention of runtypes which looks pretty similar?
[https://github.com/pelotom/runtypes](https://github.com/pelotom/runtypes)

~~~
colinmcd
Huh. Totally missed this, it looks like an excellent tool.

No support for recursive types (which I personally want/need for my project).

I really like their API for constraint checking...might have to steal that...

[UPDATE] runtypes actually does support recursive types! My bad! Great lib.

------
sakagami0
Wow, I made the same thing and put v1 up a couple of days ago.

[https://github.com/tetranoir/presi](https://github.com/tetranoir/presi)

A main difference is I tried to get rid of having a separate line to create
the Type and I tried to get rid of having to learn a new library's API as much
as possible.

~~~
eyelidlessness
This looks interesting too, but the syntax feels _really weird_ to me.

~~~
sakagami0
A goal was you can treat it Exactly like an "Interface". I agree it looks
totally nonsensical but I think of it as a text replacement.

~~~
eyelidlessness
The assignment to class properties reminds me an awful lot of stuff I've seen
in Python, particularly in Django/REST Framework.

But the class wrapping thing would just break my brain every time I tried to
use it. Calling plain functions just feels a lot more natural to me.

------
seanwilson
Are there any plans for building validation support like this directly into
TypeScript? If not, why not?

TypeScript seems designed to be incrementally added to large existing
projects. Why then isn't there a standard way to validate objects coming from
the untyped parts of your project?

~~~
eyelidlessness
It's explicitly one of their non-goals[1]:

> Add or rely on run-time type information in programs, or emit different code
> based on the results of the type system. Instead, encourage programming
> patterns that do not require run-time metadata.

This approach (build runtime behavior that produces static types) is the
intended one.

[1] [https://github.com/Microsoft/TypeScript/wiki/TypeScript-Desi...](https://github.com/Microsoft/TypeScript/wiki/TypeScript-Design-Goals#non-goals)

------
mirekrusin
Something similar, but for Flow and focused more on functional combinators:
[https://github.com/appliedblockchain/assert-combinators](https://github.com/appliedblockchain/assert-combinators)

~~~
eyelidlessness
Is the repo private? I get a 404

~~~
mirekrusin
Oh thank you, mistake, fixed.

------
AriaMinaei
I've used a few similar libraries [0][1][2] and wrote one for a personal
project. If we categorize them as embedded DSLs for runtime type checking with
_some_ support for static interop, they all share three major flaws:

1\. Sub-optimal developer experience: significantly noisier syntax compared to
pure typescript, convoluted typescript errors, and slower type checking.

2\. Unfixable edge cases in static type checking: Features like conditional
types work less reliably on the types produced by the library.

3\. Some typescript features can't be supported in the library (again,
conditional types).

I think a better way to approach the runtime+static type checking is to do it
as a babel plugin, which would fix the DX and edge-case problems and also
gracefully degrade in the case of typescript features that can't work in
runtime.

Since babel can now parse typescript, it is trivial to write a babel plugin
that takes regular typescript files and converts their type annotations into
runtime values [4]. Those values can then be fed into a simpler type checking
library, giving us runtime type checking on top of the native static type
checking experience.

[0] [https://github.com/gcanti/io-ts](https://github.com/gcanti/io-ts)

[1] [https://github.com/pelotom/runtypes](https://github.com/pelotom/runtypes)

[2] mobx-state-tree also has a schema validation library that works to some
extent statically: [https://github.com/mobxjs/mobx-state-tree/](https://github.com/mobxjs/mobx-state-tree/)

[3] [https://github.com/gcanti/io-ts#branded-types--refinements](https://github.com/gcanti/io-ts#branded-types--refinements)

[4]
[https://gist.github.com/AriaMinaei/2f1229178abad4363f5180db2...](https://gist.github.com/AriaMinaei/2f1229178abad4363f5180db238dd8b3)

------
random_savv
There is also class-validator, which I've used and find elegant; it works
well:

[https://github.com/typestack/class-validator](https://github.com/typestack/class-validator)

~~~
colinmcd
Huh, not sure how I missed this.

I don't love the class-based declarations: it's a bit verbose and doesn't
allow you to use things like the spread operator to "mix in" fields into
objects. There's also some redundancy required for basic types:

    
    
      @IsString()
      firstName: string;
    

For my purposes, I also needed support for recursive types, unions, and
intersections, which I don't believe are supported.

But their validation built-ins go way beyond Zod (IsEmail, Min, Max, Contains,
native Date support, etc). Thanks for sharing.

------
manuisin
A while ago, I used this library called ts-interface-builder to generate
runtime validation functions from TS types. It worked really well, I was able
to use the types from my front-end without any modifications for back-end
validation.

I also used the inferred response types from endpoint functions and passed them
back to the front-end to use with front-end fetch calls. This was probably the
biggest improvement in terms of productivity and improving accuracy. Imagine
every time you change an endpoint's response, any fetch calls that are
impacted would just show errors at compile time/in your editor.
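
A minimal sketch of that second idea, not tied to ts-interface-builder: `getUserHandler`, `GetUserResponse`, and `decodeUserResponse` are hypothetical names, and the handler is kept synchronous to keep the example short.

```typescript
// Hypothetical server-side handler; its return type is inferred by the compiler.
function getUserHandler(id: string) {
  return { id, name: "Ada", roles: ["admin"] };
}

// Shared type derived from the handler, so it can never drift from it.
type GetUserResponse = ReturnType<typeof getUserHandler>;

// Hypothetical client helper. Note this is only a cast: it trusts the server
// and performs no runtime validation of the body.
function decodeUserResponse(body: string): GetUserResponse {
  return JSON.parse(body) as GetUserResponse;
}

const decodedUser = decodeUserResponse(JSON.stringify(getUserHandler("42")));

// If the handler stops returning `roles`, this line fails to compile.
console.log(decodedUser.name, decodedUser.roles.length);
```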

------
chris_st
Possibly stupid question, but why does the dog array validate:

    
    
      const dogsList = z.array(dogSchema);
    
      dogSchema.parse([
       { name: 'Cujo', neutered: null },
       { name: 'Fido', age: 4, neutered: true },
      ]); // passes
    

Since 'Cujo' doesn't have an age? Assuming it's the same dogSchema as in the
previous block, age is required, right?

Oh, and a minor typo: _This lets you confidently This way you can
confidently..._.

~~~
colinmcd
Whoops! Should be `dogsList.parse(...)`.

Also fixed the other typo :)

Thanks!!

~~~
tobr
I don’t want to derail the thread, but just a heads up: the layout is very
broken on my phone (iPhone). The text is clipped on the left side which
basically makes the article unreadable!

~~~
colinmcd
Consider the thread derailed :P

Just fixed this! I'd done exactly zero testing on mobile (built the site from
scratch yesterday).

------
Ezku
You mention creating object types with optional keys is cumbersome in io-ts.
How is that solved in zod, exactly? What allows you to map `foo: union([bar,
undefined])` to `foo?: bar | undefined` (note the question mark on the left
hand side)? There’s nothing in the declaration to give away why this wouldn’t
yield `foo: bar | undefined` which is what I believe you’d get out of io-ts.

Looks useful - I would have an easier time introducing this than io-ts.

~~~
colinmcd
Good question! It wasn't easy to get the question mark on the left-hand side,
but it is possible.

Here's the Zod equivalent:

    
    
      const C = z.object({
        foo: z.string(),
        bar: z.number().optional(),
      });
    
      type C = z.TypeOf<typeof C>;
      /* {
        foo: string;
        bar?: number | undefined
      } */
    
    

And here's the code that pulls this off:

    
    
      type OptionalKeys<T extends z.ZodRawShape> = {
        [k in keyof T]: undefined extends T[k]['_type'] ? k : never;
      }[keyof T];
    
      type RequiredKeys<T extends z.ZodRawShape> = Exclude<keyof T, OptionalKeys<T>>;
    
      type ObjectType<T extends z.ZodRawShape> = {
        [k in OptionalKeys<T>]?: T[k]['_type'];
      } &
        { [k in RequiredKeys<T>]: T[k]['_type'] };
    
      export class ZodObject<T extends z.ZodRawShape> extends z.ZodType<
        ObjectType<T>, // { [k in keyof T]: T[k]['_type'] },
        ZodObjectDef<T>
      >{ 
        // ...
      }
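
The snippet above depends on Zod internals (`z.ZodRawShape`, `_type`). For readers who want to try the trick in isolation, here is a standalone sketch of the same mapped-type idea over plain object types. `WithOptionalKeys`, `Raw`, and `Cooked` are hypothetical names, and the sketch assumes `strictNullChecks` is on.

```typescript
// Keys whose value type admits undefined.
type OptionalKeys<T> = {
  [K in keyof T]: undefined extends T[K] ? K : never;
}[keyof T];

// Everything else.
type RequiredKeys<T> = Exclude<keyof T, OptionalKeys<T>>;

// Re-assemble the object type, adding `?` to the optional keys.
type WithOptionalKeys<T> = { [K in OptionalKeys<T>]?: T[K] } & {
  [K in RequiredKeys<T>]: T[K];
};

// Hypothetical input type: `bar` is required as a key but may hold undefined.
type Raw = { foo: string; bar: number | undefined };

// Roughly { foo: string } & { bar?: number | undefined }
type Cooked = WithOptionalKeys<Raw>;

const withoutBar: Cooked = { foo: "hi" }; // allowed: `bar` may be omitted
const withBar: Cooked = { foo: "hi", bar: 1 };

console.log(withoutBar.bar, withBar.bar);
```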

~~~
Ezku
Thanks for the reply. So, careful application of mapped types and removing the
ability to type a property as `foo: bar | undefined`. I understand this is
desirable a lot of times especially if you can’t affect the format of what’s
being parsed, but I’m not sure this is unambiguously better.

FWIW it’s made my life easier to say the keys will always be there, but the
values are possibly undefined. Less room for ambiguous interpretation.

------
adriancooney
Can't seem to spot a link to Zod's code, Colin. Fancy making it a bit more
obvious in the post?

Edit: Found it: [https://github.com/vriad/zod](https://github.com/vriad/zod)

~~~
colinmcd
Just made it _significantly_ more obvious...Thanks for pointing that out!

------
donatj
Call me old fashioned, but "mission-critical" and "rock-solid" are inherently
at odds with anything that runs on a client's machine. Doubly so for
something that runs in an interpreted language, especially one that doesn't
ship with its own interpreter to said client.

In these cases I'd far rather see an old fashioned reliable server-side
application built in a language with a lot of built in safety. The less you
ask the client to do, the more reliable the application.

~~~
pas
Engineering has to consider cost too. Haskell might be amazing at this, but
that might cost a lot more. TS sounds like a complicated beast, and of course
has a lot of security trade offs (npm, myriad of unverified/unaudited
packages), but using TS on both the client and the server makes things
simpler, can cut down costs, help with time to market, yadda-yadda.

Sure, TS has other problems too. (Soundness issues in its type system.) But
still much safer than C/C++ in many aspects.

Java might be an old fashioned server-side thing. But again, it has a rather
old ecosystem and it's not exactly known for its high quality and security
consciousness.

Contrast that with Rust, which is young, but tries to establish itself (or
already has?) as the de-facto ecosystem for critical/safety/performance
things.

TS is basically that. Rust for the masses. For anyone who picked up JS and
found themselves at the end of a bootcamp, or anyone who wants to step beyond
being a webdev. So compared to vanilla JS (and pure C, and pure Python, and
maybe even pure Java) TS stuff is rock solid. (Thanks to browser vendors
spending a lot on browser/DOM/JS-engine security.)

Furthermore, security _must_ consider ergonomics. Otherwise it will be
bypassed. (E.g. it simply won't spread, users will work around the secure way,
etc.) So if you can simply use the same validation things on your client to
provide early in-situ feedback about what's wrong with the input, you can
build more robust user interaction flows, which help with people using your
secure product.

