Common Expression Language interpreter written in Rust (github.com/clarkmcc)
124 points by PaulHoule 57 days ago | 34 comments



Ah, I was wondering why the project was getting a few more eyes today! Maintainer here. I took over this excellent project from Tom Forbes in April 2023. He did a phenomenal job writing the parser and laying the groundwork for an interpreter. One of the beautiful things about this project right now is its simplicity: it’s tiny compared to cel-go, for example. I’m also a huge fan of our Axum-style functions[1], where you can register pretty much any closure as a custom function to be used in your CEL expressions. There’s still some way to go to support some of the more obscure aspects of the spec, but I feel like we’re getting close, and we have an excellent little cadre of contributors who have been extremely helpful in moving this forward.

[1] https://github.com/clarkmcc/cel-rust/blob/master/example/src...
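For readers who haven't seen the pattern, here is a minimal, self-contained sketch of how Axum-style registration can work in Rust. It is not cel-rust's actual code: `Context`, `IntoFunction`, `Erased`, and `add_function` are hypothetical names used for illustration, and the sketch only handles `i64` arguments. The idea is that a marker type parameter lets closures of different arities each match exactly one blanket impl.

```rust
use std::collections::HashMap;

// Type-erased form in which every registered function is stored.
type Erased = Box<dyn Fn(&[i64]) -> Result<i64, String>>;

// Hypothetical trait. `Args` is a marker tuple, so a closure of a given
// arity matches exactly one impl and the compiler can infer which.
trait IntoFunction<Args> {
    fn into_function(self) -> Erased;
}

impl<F> IntoFunction<(i64,)> for F
where
    F: Fn(i64) -> i64 + 'static,
{
    fn into_function(self) -> Erased {
        let f = self;
        Box::new(move |args: &[i64]| match args {
            [a] => Ok(f(*a)),
            _ => Err(format!("expected 1 argument, got {}", args.len())),
        })
    }
}

impl<F> IntoFunction<(i64, i64)> for F
where
    F: Fn(i64, i64) -> i64 + 'static,
{
    fn into_function(self) -> Erased {
        let f = self;
        Box::new(move |args: &[i64]| match args {
            [a, b] => Ok(f(*a, *b)),
            _ => Err(format!("expected 2 arguments, got {}", args.len())),
        })
    }
}

// Hypothetical evaluation context holding the registered functions.
#[derive(Default)]
struct Context {
    functions: HashMap<String, Erased>,
}

impl Context {
    // Accepts any closure for which an `IntoFunction` impl exists.
    fn add_function<Args, F: IntoFunction<Args>>(&mut self, name: &str, f: F) {
        self.functions.insert(name.to_string(), f.into_function());
    }

    // Looks the function up by name and calls it with the given arguments.
    fn call(&self, name: &str, args: &[i64]) -> Result<i64, String> {
        let f = self
            .functions
            .get(name)
            .ok_or_else(|| format!("unknown function: {}", name))?;
        f(args)
    }
}

fn main() {
    let mut ctx = Context::default();
    ctx.add_function("double", |x: i64| x * 2);
    ctx.add_function("add", |a: i64, b: i64| a + b);
    println!("{:?}", ctx.call("add", &[2, 3]));
    println!("{:?}", ctx.call("double", &[21]));
}
```

A real implementation would convert from the interpreter's own value type rather than `i64`, but the registration ergonomics come from the same trick.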


This is really cool. I've been building a GUI system in Rust that features an expression language[0] — and we ruled out using CEL a while ago because the canonical Go CEL interpreter would cost several megabytes of runtime footprint (e.g. in a WASM bundle.)

Haven't measured the footprint of cel-rust yet, but I expect it's orders of magnitude smaller than cel-go. The Go runtime itself is the culprit with cel-go.

This Rust implementation may let us port to CEL after all, while maintaining Pax's <100KB wasm footprint. Nice work!

---

[0] https://www.pax.dev


> canonical Go CEL interpreter would cost several megabytes of runtime footprint (e.g. in a WASM bundle.)

At first blush Go seems incompatible with WASM in terms of maintaining Go's strengths—the two runtimes have different memory models and stack representations. I'm curious why one would choose Go if WASM were a target to begin with.


Because very few people porting things to WASM understand the model more than they understand the ABI they are compiling code to.

Asm is a dead art.


I don't know why people don't just use Lua. It's extremely small, can be embedded with only the Lua modules you want to include (e.g. you do not need to provide file system access, networking, etc., which makes it similar to CEL, just better), is written in standard C so it can be compiled for every possible target, is very mature, and is completely open source.


The problem is that Lua is Turing complete, so you can write programs that don't halt or which take an input-dependent duration to halt. Example:

    while true do
      print("meow?")
    end

In contrast, each CEL expression has a maximum depth which directly determines how long it takes to execute. (More precisely: by calculating the maximum costs up the expression tree from leaves to root.)


Lua is a VM and instruction limits can easily be imposed to kill programs like this. (This does not apply to LuaJIT)
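To make the idea of an instruction limit concrete, here is a small sketch in Rust. It is a toy stack machine invented for this example, not Lua's VM or its API; `Op`, `RunError`, and `run` are hypothetical names. The interpreter counts every executed instruction and aborts once the budget is spent, so even a `while true` loop terminates.

```rust
#[derive(Debug, Clone, Copy)]
enum Op {
    Push(i64),
    Add,
    Jump(usize), // unconditional jump: enough to express `while true`
    Halt,
}

#[derive(Debug, PartialEq)]
enum RunError {
    BudgetExceeded { executed: u64 },
    StackUnderflow,
    OutOfBounds,
}

// Runs `program`, refusing to execute more than `max_ops` instructions.
fn run(program: &[Op], max_ops: u64) -> Result<Option<i64>, RunError> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;
    let mut executed = 0u64;
    loop {
        // The budget check happens before every instruction.
        if executed >= max_ops {
            return Err(RunError::BudgetExceeded { executed });
        }
        let op = *program.get(pc).ok_or(RunError::OutOfBounds)?;
        executed += 1;
        match op {
            Op::Push(v) => {
                stack.push(v);
                pc += 1;
            }
            Op::Add => {
                let b = stack.pop().ok_or(RunError::StackUnderflow)?;
                let a = stack.pop().ok_or(RunError::StackUnderflow)?;
                stack.push(a.wrapping_add(b));
                pc += 1;
            }
            Op::Jump(target) => pc = target,
            Op::Halt => return Ok(stack.pop()),
        }
    }
}

fn main() {
    let finite = [Op::Push(2), Op::Push(3), Op::Add, Op::Halt];
    println!("{:?}", run(&finite, 100));

    // Equivalent of `while true do end`: killed once the budget runs out.
    let forever = [Op::Jump(0)];
    println!("{:?}", run(&forever, 1_000));
}
```

Note that the runaway program is only stopped after it has already consumed its whole budget, and whatever side effects it performed up to that point have already happened.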


The point here is that you know (at maximum) how many instructions will be executed before running it, and so if it exceeds the limit, you can avoid running it altogether, instead of killing it midway through its execution, leaving things in an indeterminate/corrupt state.
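A rough sketch of what "knowing the maximum before running it" can look like, written in Rust. This is a toy expression tree, not CEL's actual grammar or cost model; `Expr`, `max_cost`, `admit`, and `sample` are hypothetical names, and every node is assumed to cost one unit. The bound is computed from the leaves up to the root, and an expression over the limit is rejected without ever being evaluated.

```rust
// Toy expression tree with no loops and no user-defined functions.
enum Expr {
    Const(i64),
    Var(&'static str),
    Add(Box<Expr>, Box<Expr>),
    Lt(Box<Expr>, Box<Expr>),
    // Ternary: only one branch runs, so the bound takes the costlier one.
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}

// Worst-case number of node evaluations, computed bottom-up.
fn max_cost(e: &Expr) -> u64 {
    match e {
        Expr::Const(_) | Expr::Var(_) => 1,
        Expr::Add(a, b) | Expr::Lt(a, b) => 1 + max_cost(a) + max_cost(b),
        Expr::If(c, t, f) => 1 + max_cost(c) + max_cost(t).max(max_cost(f)),
    }
}

// Decides, before any evaluation, whether the expression may run at all.
fn admit(e: &Expr, limit: u64) -> bool {
    max_cost(e) <= limit
}

// Builds the tree for: x + 1 < 10 ? 1 : 2 + 3
fn sample() -> Expr {
    Expr::If(
        Box::new(Expr::Lt(
            Box::new(Expr::Add(
                Box::new(Expr::Var("x")),
                Box::new(Expr::Const(1)),
            )),
            Box::new(Expr::Const(10)),
        )),
        Box::new(Expr::Const(1)),
        Box::new(Expr::Add(
            Box::new(Expr::Const(2)),
            Box::new(Expr::Const(3)),
        )),
    )
}

fn main() {
    let e = sample();
    println!("worst-case cost: {}", max_cost(&e));
    println!("admitted with limit 10: {}", admit(&e, 10));
    println!("admitted with limit 8: {}", admit(&e, 8));
}
```

This toy has no lists or strings. A realistic estimator would also need bounds on the sizes of the inputs, since operations over collections cost more the larger the collection is.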


You don't. Even non-Turing complete languages can run for an arbitrary length of time. In general Turing completeness is never a relevant property in the real world.


One of the selling points of this language is claimed to be that it executes in constant time.


Well, they claim "linear time". Constant time would be impossible.

But anyway they achieve that by being much more restrictive than just "not Turing complete", e.g. you can't define functions.

It certainly sounds like an attractive feature, but how much benefit is it really to restrict a language so much that it will probably run quickly enough compared to just setting a timeout or computation limit? The only advantage I can think of is that you get some kind of computation constraints that don't depend on the data... But this is a pretty niche requirement.


It is a niche, yes. However, that niche is about adding some customizability to certain parts of a program, where Lua would be overkill. Moreover, hardening Lua, which is doable but not trivial, may be beyond the ability, interest, or simply time available to the developer. If you don't need the extra features Lua provides, why include them?


> If you don't need the extra features Lua provides, why include them?

The whole point of a feature like this is that you don't know what will be required. It's going to be pretty annoying when a user does find they need something that's impossible with CEL and they can't do it because the devs think they'll write slow code.

Lua is not very nice IMO, but I've used Rhai successfully in the past. It even has an operation limit feature already:

https://rhai.rs/book/safety/max-operations.html


Honestly, I've ruled it out for 1-based indexing. Why deal with that mistake when you don't have to?


It's not popular enough. I love Lua (except for 1-indexing) but I can't use a language if nobody else uses it


The games industry isn't nobody, that is how Lua got its fame, being a common scripting language, before Unreal and Unity became the only two engines most people know about.


Lua has been around for ages. I've never heard of CEL before today.


Lua is in the running for most-used language, second only to, possibly, JavaScript.

Few people write just Lua (although professionally speaking, that was me for several years) but many people write some Lua. It adds up.


Very interesting, and looks extremely flexible.

I had needed a small interpretive environment that would be highly controlled, used in a proprietary configuration templating solution for parametrizing various values.

I wrote https://github.com/ayourtch/aycalc - very rudimentary by default with just basic arithmetic, but easy to give different security guarantees, depending on the context - the references to functions and variables can be either separate from each other or share the space, also it’s easy to special-case the handling for both variables and functions.

The entire source code for the library is just around 400 lines, so I thought it might be a different enough type of beast to mention, in case someone finds it useful.


CEL is becoming a "standard" implementation found in a lot of places. I generally like it, but loathe how much it's been growing the YAML Engineering (DevOps) space. We need better options like CUE and Starlark that can scale, and CEL feels more like duct tape in a lot of places. You kind of need CEL-in-config when the values are dynamic and processed within another system. In time, I expect that if something like GitHub Actions supported CUE natively, we could remove the `when: "CEL expression"` even with the dynamic values from previous steps. Eventually CUE's evaluator will be smart enough to know when / how to order sub-values.

- https://cel.dev

- https://github.com/google/cel-spec


There is a visual editor which exports CEL: https://github.com/react-querybuilder/react-querybuilder


Very cool! I'm actually mostly interested in the Import + Export functionality. Would be great to have that as a library without the gui part.


Probably a silly and useless idea, but what if there was an ORM that was designed around CEL?


Not a silly or useless idea at all. I've seen a custom-built ORM that parsed a CEL-like syntax into a SQL query using a query builder. It was pretty nifty. The use case was to allow users to craft arbitrary queries on a data-driven application.
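As a sketch of how such a translation might work (this is not the ORM described above, just an illustration in Rust), the snippet below turns an already-parsed filter tree into a parameterized SQL `WHERE` fragment. `Filter`, `CmpOp`, `Value`, `to_sql`, and `sample_filter` are hypothetical names. It assumes field names are checked against an allow-list because they are interpolated into the SQL text, while values are only ever emitted as `?` placeholders.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

#[derive(Debug, Clone, Copy)]
enum CmpOp {
    Eq,
    Lt,
    Gt,
}

// Already-parsed filter expression (parsing is out of scope here).
#[derive(Debug)]
enum Filter {
    Cmp(String, CmpOp, Value),
    And(Box<Filter>, Box<Filter>),
    Or(Box<Filter>, Box<Filter>),
    Not(Box<Filter>),
}

// Renders `f` as SQL text; bound values are appended to `params`
// in the same left-to-right order as their `?` placeholders.
fn to_sql(f: &Filter, allowed: &[&str], params: &mut Vec<Value>) -> Result<String, String> {
    match f {
        Filter::Cmp(field, op, value) => {
            // Field names end up in the SQL string, so reject unknown ones.
            if !allowed.contains(&field.as_str()) {
                return Err(format!("unknown field: {}", field));
            }
            let op = match op {
                CmpOp::Eq => "=",
                CmpOp::Lt => "<",
                CmpOp::Gt => ">",
            };
            params.push(value.clone());
            Ok(format!("{} {} ?", field, op))
        }
        Filter::And(a, b) => {
            let l = to_sql(a, allowed, params)?;
            let r = to_sql(b, allowed, params)?;
            Ok(format!("({} AND {})", l, r))
        }
        Filter::Or(a, b) => {
            let l = to_sql(a, allowed, params)?;
            let r = to_sql(b, allowed, params)?;
            Ok(format!("({} OR {})", l, r))
        }
        Filter::Not(a) => Ok(format!("NOT ({})", to_sql(a, allowed, params)?)),
    }
}

// Tree for: age > 30 && (city == "Oslo" || !(age < 18))
fn sample_filter() -> Filter {
    Filter::And(
        Box::new(Filter::Cmp("age".into(), CmpOp::Gt, Value::Int(30))),
        Box::new(Filter::Or(
            Box::new(Filter::Cmp("city".into(), CmpOp::Eq, Value::Text("Oslo".into()))),
            Box::new(Filter::Not(Box::new(Filter::Cmp(
                "age".into(),
                CmpOp::Lt,
                Value::Int(18),
            )))),
        )),
    )
}

fn main() {
    let mut params = Vec::new();
    let sql = to_sql(&sample_filter(), &["age", "city"], &mut params);
    println!("{:?}", sql);
    println!("{:?}", params);
}
```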


From the CEL GitHub page:

> The required components of a system that supports CEL are:

> The textual representation of an expression as written by a developer. It is of similar syntax to expressions in C/C++/Java/JavaScript

Ok

> A binary representation of an expression. It is an abstract syntax tree (AST).

> A compiler library that converts the textual representation to the binary representation. This can be done ahead of time (in the control plane) or just before evaluation (in the data plane).

> A context containing one or more typed variables, often protobuf messages. Most use-cases will use attribute_context.proto

> An evaluator library that takes the binary format in the context and produces a result, usually a Boolean.

Why? All of these sound like implementation details to me, some of which I prefer not to have, such as the necessity for binary representation.


CEL is super cool, it's really great to have a quick and easy way to add a filter parameter to every list API on your server. This project should add an easy way to take a list/iterator of something that implements Serde Serialize, and filter based on a CEL expression.


In my mind it has some similarity to

https://en.wikipedia.org/wiki/Object_Constraint_Language

insofar as there is an expression language inside of OCL. OMG uses OCL in many parts of standards that can use that functionality.


Cool! I made something remotely similar, a library for complex arithmetic that dynamically evaluates user-defined expressions. Since it only has pre-defined functions, the compiler can do pervasive constant folding.
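For anyone unfamiliar with constant folding, here is a minimal sketch in Rust. It is not the library mentioned above: `Expr` and `fold` are hypothetical names, and it uses real numbers (`f64`) rather than complex ones to stay short. Any subtree whose operands are all constants is replaced by its value ahead of evaluation, while subtrees that mention a variable are left alone.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(f64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Folds constant subtrees bottom-up and returns the simplified tree.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Add(a, b) => match (fold(*a), fold(*b)) {
            (Expr::Num(x), Expr::Num(y)) => Expr::Num(x + y),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(a, b) => match (fold(*a), fold(*b)) {
            (Expr::Num(x), Expr::Num(y)) => Expr::Num(x * y),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Numbers and variables are already as simple as they can be.
        leaf => leaf,
    }
}

fn main() {
    // (2 * 3) + x * (1 + 1)
    let e = Expr::Add(
        Box::new(Expr::Mul(Box::new(Expr::Num(2.0)), Box::new(Expr::Num(3.0)))),
        Box::new(Expr::Mul(
            Box::new(Expr::Var("x".to_string())),
            Box::new(Expr::Add(Box::new(Expr::Num(1.0)), Box::new(Expr::Num(1.0)))),
        )),
    );
    println!("{:?}", fold(e));
}
```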


Is there an equivalent of CEL for the Arrow ecosystem?

In particular is there a spec for what expressions are admissible for predicates or transformations for example here (https://arrow.apache.org/docs/python/generated/pyarrow.datas...) or in Substrait?


I guess this is what I'm looking for:

https://substrait.io/spec/specification/


Does anyone know if recursive CTEs are addressed? I didn't find that while on my phone.


Man, I almost built this exact thing about a year ago, and while I would have used it, I just didn't have enough of a use case to justify investing the time.

This is awesome work.


Is CEL just a buzzword/certification gatekeeping for typed data?!


No



