Project Cambria: Translate your data with lenses

jka · on Oct 6, 2020

This is really great stuff, as usual from Ink & Switch.

One initial thought that I'd offer is that in a simple running system, these 'lenses' are arranged into a series[1] that can take a datastructure from an origin 'base' version all the way through to the latest-known representation. This may be more familiar to most developers as an analogy to database migration scripts, as mentioned in the post.
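A minimal TypeScript sketch of the "series" idea above. The `Lens` shape and the helpers `without`, `rename`, `addField` and `series` are hypothetical names made up for illustration; this is not Cambria's actual API.

```typescript
// Hypothetical shapes for illustration only; not Cambria's API.
type Doc = Record<string, unknown>;

interface Lens {
  forward: (doc: Doc) => Doc;  // older version -> newer version
  backward: (doc: Doc) => Doc; // newer version -> older version
}

// Return a shallow copy of the document with one key removed.
function without(doc: Doc, key: string): Doc {
  const copy: Doc = { ...doc };
  delete copy[key];
  return copy;
}

// Rename a top-level field, in both directions.
function rename(from: string, to: string): Lens {
  const move = (a: string, b: string) => (doc: Doc): Doc => {
    if (!(a in doc)) return doc;
    const copy = without(doc, a);
    copy[b] = doc[a];
    return copy;
  };
  return { forward: move(from, to), backward: move(to, from) };
}

// Add a field with a default going forward; drop it going backward.
function addField(name: string, defaultValue: unknown): Lens {
  return {
    forward: (doc) => (name in doc ? doc : { ...doc, [name]: defaultValue }),
    backward: (doc) => without(doc, name),
  };
}

// Chain lenses: apply them in order going forward, in reverse going backward.
function series(lenses: Lens[]): Lens {
  return {
    forward: (doc) => lenses.reduce<Doc>((d, lens) => lens.forward(d), doc),
    backward: (doc) => lenses.reduceRight<Doc>((d, lens) => lens.backward(d), doc),
  };
}

const baseToLatest = series([rename("name", "title"), addField("done", false)]);

const latest = baseToLatest.forward({ name: "write docs" });
// latest is { title: "write docs", done: false }
const back = baseToLatest.backward(latest);
// back is { name: "write docs" }
```

In this sketch the backward direction of `addField` simply drops the field, so a round trip from the latest version to the base version and back resets `done` to its default.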

When a new lens is created, the developers may want to distribute it to a subset of users, ensure that it works correctly, and potentially adjust it based on feedback before issuing the final lens to the application's population[2].

If this is the intended release workflow, then the arrangement of lenses becomes a graph or chain rather than a simple series. There may be times where it's necessary to backtrack briefly.

It's possible (although hacky, I will admit) to create an NPM JavaScript module that has another version of itself as a dependency.

I mention that because this provides a way to distribute some code -- a lens, for example -- alongside a dependency graph using an ecosystem that is relatively well-evolved (in terms of release management, client upgrade support, etc) for versioned code distribution.

[1] It's nice that the analogy with light passing through a series of lenses fits

[2] Think of an optician testing your eyesight and asking questions about various sample lenses

pvh · on Oct 6, 2020

For those interested, the code for the library behind the project is available at https://github.com/inkandswitch/cambria

susiecambria · on Oct 6, 2020

I just have to say, it's my name, it's my name! A cool sciency thing has my name! (I'm not sciency at all. Social worker by training, policy and budget wonk as my one-and-only non-human/non-canine love).

pvh · on Oct 7, 2020

It's a good name, thanks for sharing it.

intrepidhero · on Oct 6, 2020

I really want to use lenses. The idea of a definition that allows lossless translation from one data representation to another is so elegant. But I wonder if it's a mathematical ideal that doesn't match up with real world problems? I thought to apply the concept of a lens in my problem domain, converting engineering files to machine readable config files, but the reality is the process is more like combining multiple inputs to form an output, which may not contain all of the input data. So I'm left applying some tricks (including unnecessary data in the output/input) to make the process act like a lens. And it's pretty error prone.

Are there applications for lenses that really allow lossless translation in both directions? Or am I misunderstanding the concept in some fundamental way?

tengbretson · on Oct 6, 2020

This was my take-away as well. This is cool and all, but if your mapping problem can be entirely solved declaratively like this I don't know if you actually had all that tough of a problem in the first place. Almost every data translation nightmare I've been sucked into involved some pretty sophisticated business logic, and any declarative language that has sufficient power to handle it inevitably becomes more complicated to write and maintain than a Turing-complete, plain old programming language.

random3 · on Oct 6, 2020

Without being too knowledgeable on the topic, but with a little math background: you'd need a bijective function to do that. A quick search for bijective lenses turns up https://arxiv.org/abs/1710.03248

pvh · on Oct 6, 2020

Close! In math lingo, you can only have a bijective function for domains with equal cardinality.

In other words, you couldn't have a bijective function between an integer and a boolean, because there are fewer possible boolean values than integer values.
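To make the cardinality point concrete, here is a small TypeScript sketch. The conversion rule (nonzero means `true`) and the function names are assumptions chosen for illustration, not anything taken from Cambria.

```typescript
// Illustrative conversions between a number and a boolean.
const toBool = (n: number): boolean => n !== 0;
const toNumber = (b: boolean): number => (b ? 1 : 0);

// Going number -> boolean -> number cannot recover every input,
// because many numbers collapse onto the same boolean.
const roundTrip = (n: number): number => toNumber(toBool(n));

const seven = roundTrip(7); // 1: the original 7 is lost
const zero = roundTrip(0);  // 0: this one happens to survive

// Going boolean -> number -> boolean is fine, since both booleans fit.
const stillTrue = toBool(toNumber(true)); // true
```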

Demanding bijective functions is probably more restrictive than you want to be for building real-world systems. We discuss this a bit in the paper -- I think you can find it by searching for "convert".

random3 · on Oct 6, 2020

Thank you! I guess in absence of that, the lens is the codomain so to speak so must retain the info. Will read when I get a chance :)

pfraze · on Oct 6, 2020

As I was reading, I was reminded of schema-migration definitions which some ORMs/DB tools give you (I currently use Knexjs). There might be a useful intuitive correlation there [edit: I see jka and the post already pointed that out]. Of course in the context that I&S is discussing, the migration is at “runtime” (so to speak) because we’re dealing with systems that can’t do all-at-once migrations. Like intrepidhero commented, I’m curious to know if declarative lenses will be sufficient for the task, but I’m glad that’s what they explored. I’ll be reading this post again more closely in the future.

gklitt · on Oct 6, 2020

Yes, this is definitely a useful point of comparison!

We hesitated to mention database migrations too much in the essay for the reason you mention: in some sense, the whole point of Cambria is that you never do a one-time "migration", but rather you just continuously translate the data.

Still, using the system does feel pretty similar to doing database migrations, and we intentionally modeled some of the dev workflow after ActiveRecord migrations:

https://www.inkandswitch.com/cambria.html#developer-workflow

ckluis · on Oct 6, 2020

This is one of the reasons I bought Muse. I watched some videos of them exploring ideas on how to build. The attention to detail is next level.

Their software is insane and I imagine part of this is about taking Muse to the web (at least a viewer).

Stoked!

masonhensley · on Oct 6, 2020

Do you have a link to Muse? Can't find one that fits this general topic.

pvh · on Oct 6, 2020

Muse is our first commercial spin-out from the lab, a spatial canvas to collect your thoughts: https://museapp.com/

OkGoDoIt · on Oct 6, 2020

https://www.inkandswitch.com/muse-studio-for-ideas.html

App is at https://apps.apple.com/us/app/muse-tool-for-thought/id150156...

It looks interesting, but $100 per year is really steep

vmchale · on Oct 6, 2020

Haskell taking over the world

pvh · on Oct 6, 2020

I'm not an expert on Haskell, but as I understand it, Haskell lenses are a distinct and independent derivation of the same research work by Pierce et al at UPenn: https://www.cis.upenn.edu/~bcpierce/papers/index.shtml#Lense...

samcheng · on Oct 6, 2020

Is this XSLT for JSON?

pvh · on Oct 6, 2020

I love the analogy, but no! First, XSLT is unidirectional and although I have built some tree-like XSLT processing systems, Cambria is explicitly graph-based and bidirectional.

In other words, with Cambria you can build a graph of lenses that describe various data schemas and then translate from or to any of them.
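As a rough illustration of translating across a graph of lenses, here is a TypeScript sketch. `LensGraph`, `LensEdge`, `omit` and the schema names are all hypothetical, and the breadth-first search is just one simple way to find a path; none of this is Cambria's actual API or algorithm.

```typescript
// Hypothetical types for illustration only; not Cambria's API.
type Doc = Record<string, unknown>;
type Convert = (doc: Doc) => Doc;

// One lens between two named schemas, usable in either direction.
interface LensEdge {
  from: string;
  to: string;
  forward: Convert;
  backward: Convert;
}

// Return a shallow copy of the document with one key removed.
function omit(doc: Doc, key: string): Doc {
  const copy: Doc = { ...doc };
  delete copy[key];
  return copy;
}

class LensGraph {
  private steps = new Map<string, { to: string; convert: Convert }[]>();

  // Register a lens as two directed steps, one per direction.
  add(edge: LensEdge): void {
    this.link(edge.from, edge.to, edge.forward);
    this.link(edge.to, edge.from, edge.backward);
  }

  private link(from: string, to: string, convert: Convert): void {
    const out = this.steps.get(from) ?? [];
    out.push({ to, convert });
    this.steps.set(from, out);
  }

  // Breadth-first search from one schema to another, converting as we go.
  translate(doc: Doc, from: string, to: string): Doc {
    const queue: { schema: string; doc: Doc }[] = [{ schema: from, doc }];
    const seen = new Set<string>([from]);
    while (queue.length > 0) {
      const current = queue.shift()!;
      if (current.schema === to) return current.doc;
      for (const step of this.steps.get(current.schema) ?? []) {
        if (seen.has(step.to)) continue;
        seen.add(step.to);
        queue.push({ schema: step.to, doc: step.convert(current.doc) });
      }
    }
    throw new Error(`no lens path from ${from} to ${to}`);
  }
}

const graph = new LensGraph();
graph.add({
  from: "v1", to: "v2", // v2 renames "name" to "title"
  forward: (d) => ({ ...omit(d, "name"), title: d["name"] }),
  backward: (d) => ({ ...omit(d, "title"), name: d["title"] }),
});
graph.add({
  from: "v2", to: "v3", // v3 adds a "done" flag
  forward: (d) => ({ ...d, done: false }),
  backward: (d) => omit(d, "done"),
});
graph.add({
  from: "v1", to: "beta", // an experimental branch off v1
  forward: (d) => ({ ...d, priority: "normal" }),
  backward: (d) => omit(d, "priority"),
});

// beta -> v1 -> v2 -> v3, even though no lens links beta and v3 directly.
const translated = graph.translate({ name: "ship", priority: "high" }, "beta", "v3");
// translated is { title: "ship", done: false }
```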

That said, XSLT has different strengths (syntax-aside) in that it is a general purpose templating language with strong functional programming underpinnings.

amenghra · on Oct 6, 2020

“Database columns would be renamed simultaneously in clients and servers”: databases should support aliasing column names.

pvh · on Oct 6, 2020

Databases should add support for lenses and also support type conversions, column relocations, new default values, and everything else a user might need.

22c · on Oct 7, 2020

Yes, they're called "Views" [1]

[1] https://en.wikipedia.org/wiki/View_(SQL)

Daishiman · on Oct 6, 2020

Using YAML as a data transformation language? What could possibly go wrong?

pvh · on Oct 6, 2020

YAML is a convenient stand-in for a better syntax and was chosen for being less tedious to type than JSON itself, while still supporting JSON Schema to make our editor integration work.

In the long run, I expect neither JSON nor YAML is really the best solution.

Daishiman · on Oct 6, 2020

The problem with these data transformations is that the limited descriptive capabilities of YAML are going to be a problem, and then they'll switch to some hybrid monstrosity of YAML+Typescript, or some ad-hoc language or some new template syntax.

This always ends up happening; this will happen here, without a doubt.

ben509 · on Oct 7, 2020

Nothing, unless you're from Norway.[1]

[1]: https://hitchdev.com/strictyaml/why/implicit-typing-removed/