Diesel: A Safe, Extensible ORM and Query Builder for Rust (diesel.rs)
291 points by steveklabnik on Feb 5, 2016 | 162 comments

Why is

an improvement over

    SELECT * FROM posts WHERE user_id = 1;

If you use this, it nails the database schema into your Rust code, because this needs to know the table formats. If someone adds a new field to a row, you have to rebuild all your Rust programs that use the database. Now your deployment process just got much more complicated.

It's a neat hack, but seems an overly complex way to do something that's not that complicated. Unless this has some huge performance gain over rust-postgres[1], it's probably too much.

If rust-postgres is too slow, it would be worth having an optional generic for rust-postgres to help with unmarshalling returned rows. Rust-postgres returns row objects from which fields can be extracted. Converting that to a Rust struct is the only operation which needs to go fast.

[1] https://github.com/sfackler/rust-postgres
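A rough sketch of what such an unmarshalling helper could look like, using only the standard library. Everything here is hypothetical and for illustration only: `RawRow` is a stand-in for a driver's row object (it is not the rust-postgres API), and `FromRow`, `Post`, and `load_all` are invented names.

```rust
use std::collections::HashMap;

/// Stand-in for a driver's row object: column name -> raw text value.
/// (Hypothetical; a real driver exposes typed getters instead.)
pub type RawRow = HashMap<String, String>;

/// Implemented once per struct; this is the only code that knows the row layout.
pub trait FromRow: Sized {
    fn from_row(row: &RawRow) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
pub struct Post {
    pub id: i32,
    pub user_id: i32,
    pub title: String,
}

impl FromRow for Post {
    fn from_row(row: &RawRow) -> Option<Self> {
        // A missing column or an unparsable value yields None.
        Some(Post {
            id: row.get("id")?.parse().ok()?,
            user_id: row.get("user_id")?.parse().ok()?,
            title: row.get("title")?.clone(),
        })
    }
}

/// Generic over any struct that knows how to build itself from a row.
/// Returns None if any single row fails to convert.
pub fn load_all<T: FromRow>(rows: &[RawRow]) -> Option<Vec<T>> {
    rows.iter().map(T::from_row).collect()
}
```

Note that the column names are still checked only at runtime here, which is the part Diesel claims to move to compile time.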

On its own, the code representation is not an improvement. Where it shines is when you want to compose result sets.

For example, in ActiveRecord (the Ruby library that this looks very much influenced by), given:

  posts = Post.where(user_id: 1)
...you can now do things like:

  recent_posts = posts.order(created_at: :desc).limit(100)

  tagged = recent_posts.where('tag in (?)', ['hacker', 'news'])
and then you can extract data:

or mutate it:

  tagged.update_all(author_id: 2)

...And so on. By encapsulating a result set as something that can be augmented with new query clauses (where, limit, select, etc.), you can incrementally build queries, pass them to other functions, and store them as member variables so that they can be reused as "scopes" across multiple calls.

If the query were just a string, this sort of thing becomes awkward, verbose, brittle, and generally not type-safe.
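To make the composition point concrete in Rust terms, here is a minimal sketch using only the standard library. The `Query` type and its methods are hypothetical names made up for illustration; this is neither Diesel's nor ActiveRecord's API, and values are carried as plain strings to keep it short.

```rust
/// Hypothetical immutable query description. Each method returns a new value,
/// so a base query can be kept around and reused as a "scope".
#[derive(Clone, Debug)]
pub struct Query {
    table: String,
    filters: Vec<String>,
    params: Vec<String>,
    order: Option<String>,
    limit: Option<u64>,
}

impl Query {
    pub fn table(name: &str) -> Self {
        Query {
            table: name.to_string(),
            filters: Vec::new(),
            params: Vec::new(),
            order: None,
            limit: None,
        }
    }

    /// Adds a `column = ?` predicate; the value travels as a bind parameter.
    pub fn filter_eq(&self, column: &str, value: &str) -> Self {
        let mut q = self.clone();
        q.filters.push(format!("{} = ?", column));
        q.params.push(value.to_string());
        q
    }

    pub fn order_by(&self, clause: &str) -> Self {
        let mut q = self.clone();
        q.order = Some(clause.to_string());
        q
    }

    pub fn limit(&self, n: u64) -> Self {
        let mut q = self.clone();
        q.limit = Some(n);
        q
    }

    /// Renders the SQL text plus the bind parameters, in order.
    pub fn to_sql(&self) -> (String, Vec<String>) {
        let mut sql = format!("SELECT * FROM {}", self.table);
        if !self.filters.is_empty() {
            sql.push_str(" WHERE ");
            sql.push_str(&self.filters.join(" AND "));
        }
        if let Some(order) = &self.order {
            sql.push_str(" ORDER BY ");
            sql.push_str(order);
        }
        if let Some(n) = self.limit {
            sql.push_str(&format!(" LIMIT {}", n));
        }
        (sql, self.params.clone())
    }
}
```

With this, `Query::table("posts").filter_eq("user_id", "1")` can be stored once, and later callers can add `.order_by("created_at DESC").limit(100)` without touching or re-parsing the original.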

The problem with this approach is that you are forced to work in a subset of SQL due to the type system mismatch of SQL and the host language.

A "proper" solution to this problem requires language changes (as Microsoft did with C# 3.0).

While it's true that solving this at the language level would be even better, it's not required. All you need is an extensible AST data model.

For example, ActiveRecord doesn't have support for window functions, but its underlying AST (a library called ARel) allows you to extend it, which at one point I did, allowing me to write something like this:

  things = Arel::Table.new('things')
  things.project(Arel::Nodes::NamedFunction.new(:sum, [things.count]).over('my_window')).
which generates something like:

  select sum(count(*)) over my_window
  from things
  window my_window as (order by name rows unbounded preceding)
(It's been a while since I looked at this particular code, which is no longer in production use, so it might not be correct.)

In other words, I could continue to express whatever I wanted of SQL, at the language level — no type mismatches.

The SQL builder code isn't pretty, to be sure, but it's something you typically encapsulate into a lower-level module rather than writing in application code. And as it's highly analogous to the SQL code it generates, it's not like it's that much more complicated. (ARel isn't a shining example of how to do it right, either. I expect Diesel implements the AST aspect more cleanly.)

Edit: I see what you're saying about typed record types. I was thinking more about the grammar, not the shape of the data.

Yes, I should have been clearer that I was talking about a static typing context. In a dynamic language the problem is much easier, as you are basically only limited by your language's metaprogramming facilities.

But isn't the same true for a statically typed language with sufficiently powerful macros? I don't know if Rust falls into that category, having never used it, but I don't think it makes a difference whether the metaprogramming happens during compile time or runtime.

Yeah, if they're talking about what I think they're talking about, yes, something like LINQ could be implemented with macros (compiler plugin variant, I think) in Rust.

> basically only limited by your language's metaprogramming facilities

I'd like to point out that strong static typing does not preclude powerful metaprogramming! Nim's macro system is extremely powerful because of the great interplay between it and the type system.


You should take a closer look at what Diesel is doing with its query builder. We actually do attempt to support the full power of every supported backend (of course there are still holes which need to be filled), with proper type guarantees which map to the semantics of Postgres. The main difference is that Diesel can check your queries for correctness at compile time.

The problem is that joins and projections give rise to new record types. The only way to support this in a type safe and natural way is if you have anonymous records in the host language, which Rust doesn't. On top of this there is some other metaprogramming stuff that is necessary in order to facilitate the DSL embedding.

Look at what Microsoft did with LINQ. You can't pull that off in Rust no matter how much you want to.

This can be accomplished with procedural macros.

This can be accomplished without procedural macros, too. ;)

Nevertheless, the goal here is to build an ORM for Rust. Why waste time describing what you think it can't do?

SQLAlchemy is a Python ORM that matches SQL almost 1:1.

The main reasons I like this approach are:

* it makes it way easier (and safer) to reuse query parts compared to string composition/interpolation

* the ORM takes care of differences between the backends, so it is easier if I have to port the application to another RDBMS.

That's normally done in SQL with JOIN, or in some cases, temporary tables.

The ORM model of building a query dynamically over multiple steps in multiple contexts gives you a lot more flexibility and more resilient code. You can share code that builds queries for multiple tables that share some, but not all, fields or relation patterns much more easily. And you can do it all without making multiple calls to the database.

See Sandi Metz' recent piece, The Wrong Abstraction [1]. I prefer a SQL template approach such as Yesql [2]. The trouble with an ORM like ActiveRecord is a lack of control of when and how the query is performed.

1. http://www.sandimetz.com/blog/2016/1/20/the-wrong-abstractio...

2. https://github.com/krisajenkins/yesql

That's a problem with ActiveRecord and not with ORMs in general. Plenty of ORMs are explicit about how and when queries are performed. Check out SQLAlchemy for an example.

In that very specific example, you are right -- the benefits are pretty minimal (mostly just the deserialization). However, you're actually able to represent the relationship between those two models however you want, but still deal with relationships between types, and not what the exact join clause is.

Additionally, SQL is extremely hard to compose. In my applications, I very rarely have some specific one off query that I want to run, I have multiple predicates that I want to be able to mix and match easily. String munging is fragile, while this actually lets me build reasonable domain abstractions.

To your final point "Unless this has some huge performance gain over rust-postgres"

Diesel performs 33% faster than rust-postgres, and will be competitive with if not faster than idiomatic C.

> Additionally, SQL is extremely hard to compose.

I've seen almost everyone who doesn't use SQL directly say this, but I always feel it's the exact opposite. SQL is easy to write once you do it enough. I've gotten to the point where I compose zero SQL in any application's code and, instead, do it all inside stored procedures (which do not generate dynamic SQL). Then my code just calls stored procedures, my database users for the web application never have more access than executing a specific subset of stored procedures, and if there is a bug found in a stored procedure I can deploy them separately from the web application.

I'm guessing it's just different philosophies but almost every time I've used something that abstracts an existing interface to another, completely different interface, I always end up running into major issues.

"Compose" here doesn't mean "to write". It means to build up a database query in multiple steps, gradually adding (or removing) restrictions or relations dynamically in response to input. So one method might add a certain restriction that will get translated to a part of the where clause. Another might change which fields are being selected for or joined on or add transformations to the field. Meanwhile, the developer doesn't have to worry about special cases, or whether you need an AND or an OR, or how to go about wrapping a field in the SELECT clause after the fact. These methods might be in totally different contexts in the underlying code, and it would not be possible to make that style of query composition possible without abstracting away the SQL syntax. And that's exactly what an ORM does for you.

Oh, I know what compose means. Having written a ton of stored procedures for large-scale web applications, I've never found the need to dynamically build such queries. Instead I focus on what I need a specific query to do, provide the options to a stored procedure or function (depending on the RDBMS), and that's it. Much like Unix with its simple programs that do very niche things, I look at procedures much the same way.

Don't get me wrong, I see the simplicity in it from the ORM side; I've just never had a good experience with ORM tools. Having worked directly against many RDBMSs and key-value stores, at least for me, I don't see much value in them.

But anything that helps get a product out the door more power to ya, in my opinion.

What do you do when you want e.g. user controlled sorting and filtering?

I just implemented this recently in my job, but it was even tougher than that; I was joining dynamically created tables, and the number of joins depended on criteria. I ended up building my own AST to represent filters, and compiling to SQL. Every extra table join hurts performance - typically filters are implemented as joins against a filtered sorted limited subquery returning ids, so fewer different table joins means less data to pull in when evaluating filters even when all the data is needed for the results page.

Point being, I had to compose complex dynamic queries, and SQL was not a particularly pleasant language to target directly. Abstractions (like ASTs) help with composition.

You can use a CASE-statement for each possible filter and apply it only if the filter value(s) are non-trivial.

    -- Parameter :filter
        table.column = :filter
This has zero cost in the case where :filter is unset.

The filters are predicates, not values.

> What do you do when you want e.g. user controlled sorting and filtering?

Sorting is easy. The stored procedure just accepts a sorting flag; add it into your query and you're done. Filtering though, I could think of a dozen types of things you may want to filter on and I still may not hit your use case, so it's hard to say how to do it without knowing what data is stored in the database, how it's stored, and which RDBMS you're using.

I've primarily done this type of work in MS SQL where I've built some pretty complex queries but I was always able to replace dynamically generated SQL with SQL that could generate a cacheable query plan.

Not saying there won't be edge cases and I don't know how your filtering worked so I don't have any type of satisfying answer for you. I would just be surprised if it wasn't doable somehow (I've seen some crazy shit in SQL queries :D)

The job I have in mind was basically implementing a subset of Excel's sorting and filtering capabilities into a paginated table view on a SPA web page.

Sorting has a couple of wrinkles. When Excel sorts rows, it changes the in-memory order; its sort is also stable. That means if you sort by column C, then B, then A, you're effectively doing "order by a, b, c". Seems simple enough. But that's not the only wrinkle in this problem.
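The stable-sort equivalence is easy to check in Rust, whose `sort_by_key` on slices is documented as stable. A small sketch; the `Row` struct and both function names are made up for the example:

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub a: u32,
    pub b: u32,
    pub c: u32,
}

/// Sorts three times with a stable sort: by c, then by b, then by a,
/// the way a user clicking column headers would.
pub fn sort_successively(mut rows: Vec<Row>) -> Vec<Row> {
    rows.sort_by_key(|r| r.c);
    rows.sort_by_key(|r| r.b);
    rows.sort_by_key(|r| r.a);
    rows
}

/// Sorts once on the composite key (a, b, c), like `ORDER BY a, b, c`.
pub fn sort_composite(mut rows: Vec<Row>) -> Vec<Row> {
    rows.sort_by_key(|r| (r.a, r.b, r.c));
    rows
}
```

Because each later pass keeps the relative order of rows that tie on its key, both functions return the same ordering for any input.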

Not all sortable columns are in the same table. If you do something naive like "select x.* , y.* from x join y [...] where [...] order by x.a, y.b limit 10 offset 20000", you're doing too much work - you're sorting a wide result set. It's normally better to do your pagination in a subquery and join against it, like this:

    select x.*, y.* from x join y [...] join (
      select x.id from x join y [...] where [...]
      order by x.a, y.b limit 10 offset 20000
    ) f on f.id = x.id
    order by x.a, y.b
Depending on the body of the where clause, you want to join the minimum set of tables to include just enough data to evaluate the page of ids - every join has a cost.
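A sketch of how that query shape could be assembled programmatically, in Rust with only the standard library. The function `paginated_sql` and its parameters are hypothetical; the tables `x` and `y` come from the example above, and the join conditions (elided as `[...]` above) are left to the caller. The fragments are assumed to be trusted SQL produced by the program itself, never raw user input.

```rust
/// Builds the "page of ids first, wide rows second" query shape.
/// `narrow_joins` should list only the joins the filter and sort need;
/// `wide_joins` lists everything the result page displays.
pub fn paginated_sql(
    wide_joins: &str,
    narrow_joins: &str,
    where_clause: &str,
    order_by: &str,
    limit: u64,
    offset: u64,
) -> String {
    // Inner query: touches as few tables as possible and returns only ids.
    let page_of_ids = format!(
        "SELECT x.id FROM x {} WHERE {} ORDER BY {} LIMIT {} OFFSET {}",
        narrow_joins, where_clause, order_by, limit, offset
    );
    // Outer query: fetches the wide rows for just those ids and re-applies
    // the ordering, since a join does not preserve the subquery's order.
    format!(
        "SELECT x.*, y.* FROM x {} JOIN ({}) f ON f.id = x.id ORDER BY {}",
        wide_joins, page_of_ids, order_by
    )
}
```

Keeping `narrow_joins` separate from `wide_joins` is the point: the expensive sort and offset run over the narrow result, and only one page of ids is widened.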

The body of the where clause is another story, it's pretty much entirely arbitrary. The user may choose to filter on any column, just like Excel autofilter. Creating good indexes up front isn't feasible - you don't know how many columns to include in any one index, they take time to create and they're not free in disk storage either. And the filtered columns span multiple tables; the final result set may contain several hundred columns across all these tables.

So what I did is take the criteria - the filter clause, which might look something like this:

    (and (or (eq (field 'x.f1') "foo")
             (eq (field 'y.f2') "bar"))
         (between (field 'm.timestamp') "date1" "date2"))
And turn it into a where clause. I analyze the filter tree to determine the minimum set of joins required to access all the fields needed by the filter and sorting criteria. I can then put that into a subquery to join against.
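For illustration, here is roughly what that filter tree and its two passes could look like in Rust. This is a sketch under assumptions, not the code described above: the `Filter` enum and its methods are invented names, values are plain strings, and empty `And`/`Or` groups are not handled.

```rust
use std::collections::BTreeSet;

/// Hypothetical filter tree mirroring the s-expression above.
pub enum Filter {
    And(Vec<Filter>),
    Or(Vec<Filter>),
    Eq { field: String, value: String },
    Between { field: String, low: String, high: String },
}

impl Filter {
    /// Renders a WHERE-clause fragment, pushing values onto `params` so they
    /// are sent as bind parameters rather than spliced into the SQL text.
    pub fn to_sql(&self, params: &mut Vec<String>) -> String {
        match self {
            Filter::And(parts) => Self::join(parts, " AND ", params),
            Filter::Or(parts) => Self::join(parts, " OR ", params),
            Filter::Eq { field, value } => {
                params.push(value.clone());
                format!("{} = ?", field)
            }
            Filter::Between { field, low, high } => {
                params.push(low.clone());
                params.push(high.clone());
                format!("{} BETWEEN ? AND ?", field)
            }
        }
    }

    /// Renders each child and wraps the group in parentheses.
    fn join(parts: &[Filter], op: &str, params: &mut Vec<String>) -> String {
        let rendered: Vec<String> = parts.iter().map(|p| p.to_sql(params)).collect();
        format!("({})", rendered.join(op))
    }

    /// Collects the table prefixes ("x" from "x.f1") the filter touches:
    /// the minimum set of tables the id subquery has to join.
    pub fn tables(&self, out: &mut BTreeSet<String>) {
        match self {
            Filter::And(parts) | Filter::Or(parts) => {
                for p in parts {
                    p.tables(out);
                }
            }
            Filter::Eq { field, .. } | Filter::Between { field, .. } => {
                if let Some((table, _column)) = field.split_once('.') {
                    out.insert(table.to_string());
                }
            }
        }
    }
}
```

For the example tree, `to_sql` produces `((x.f1 = ? OR y.f2 = ?) AND m.timestamp BETWEEN ? AND ?)` with four parameters, and `tables` reports `m`, `x`, and `y`. Field names are assumed to come from a known schema, not from user input.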

There are more wrinkles. To give a good experience, we need to populate a dropdown of potential filter values, like Excel autofilter. Potential filter values need to be filtered themselves - we only display filter values that are potentially visible given the current filter. They also need to be searchable, because there may be millions of them. And they may include joins, like when you want to filter by tags or some other 1:N style relation, like row:tags. There are a bunch of different shapes to the different queries, but there are common composable underlyers to them.

All the above runs on mostly unindexed tables - we're relying on pure query compute performance within the designed data limits (less than 3 million rows). Sub-millisecond query response times aren't the goal here; we're supporting the user doing reasonably complex data manipulation operations with worst case performance measured in single-digit seconds.

(This is all in MySQL, because reasons that include business model constraints.)

So you don't even build dynamic queries in the stored procedures themselves (e.g. with string concatenations and EXECUTE)?

That's where I usually get fed up with stored procedures, as pseudo-Ada ain't a nice language for list and string manipulation. I'd rather do that in a better-suited language, and that quickly leads to query builders…

Stored procedures are no more composable than SQL strings are. Now it's certainly easy to say that I haven't used SQL directly. But I did spend years working that way (though of course this statement shouldn't give you much confidence by itself).

Ultimately an ORM can help you to build more maintainable and resilient (to changing market conditions) software than direct SQL. Diesel in particular can catch mistakes that SQL strings cannot.

I do understand why a lot of people have reservations about ORMs. I share a lot of them. I've maintained Active Record for about 2 years now, and learned a lot from doing so (note: This does not mean that I like AR or want this to be anything like it).

TL;DR: Your points aren't wrong. Diesel is different. Maybe give it a shot. ;)

> Stored procedures are no more composable than SQL strings are[...]Ultimately an ORM can help you to build more maintainable and resilient (to changing market conditions) software than direct SQL.

I couldn't disagree more. Creating the separation between SQL DB and web application allows you to maintain, deploy, fix and upgrade things independently and it's easy (maybe not quite as easy as an ORM would make it but not really harder either; you're going to have to know how SQL works anyway if you want to effectively optimize and secure your data).

Not only that, but you can provide far better security controls: you can limit certain users' access to certain procedures, never grant direct access to tables, and so on. And if you don't do any dynamic SQL in the procedures themselves, you never have to worry about SQL injection; unless there is a critical vulnerability in the SQL system itself, an injection attack is going to be impossible.

Yeah you can separate your code to make it easier to deploy separately to get a similar benefit and yeah your code can prevent SQL injections too but it's so much more work, more possible points of failure and the differences between databases can really bite you in the ass when you do everything generically. SQL is fun :)

> TL;DR: Your points aren't wrong. Diesel is different. Maybe give it a shot. ;)

Yeah, I'm not saying NEVER use ORMs; they're great for creating things quickly that don't need to scale. But after my experience with them I'm highly skeptical of using them outside of that type of use case. Of course you can scale ORMs, but I've never seen it done well.

I think we're violently agreeing in a lot of ways. :)

I'm curious what you mean by scale. Are you referring to performance, or code size? If you're referring to performance, you'll probably find it interesting that we out-perform or are competitive with idiomatic C.

By scale I mean performance, though I'm not sure the application's language matters as much as the SQL that's being generated and run against the database. I've just never had good luck with ORMs generating the best, most optimized SQL for the query I'm trying to run, and I've known a few people who used an ORM until they started getting heavy traffic, tested out talking directly to the RDBMS, found it to be more performant, and ended up dropping the ORM.

Don't get me wrong, ORMs can certainly be performant. I just haven't seen, and don't know, anyone who used one in a manner that scaled to thousands of concurrent users and, at the same time, didn't have a lot of pain trying to do that with the ORM.

I think this conversation shows a mismatch between two different philosophies about how to use the database. The traditional way of doing things you've described involves multiple different applications connecting to the database. Data access is restricted by locking their database users down to specific APIs exposed by stored procedures. Non-trivial logic lives in the database in order to present these APIs.

A lot of systems lately are being built in a different way, that is, only one application connects to the database. This application presents an API (ReST, protobufs, thrift) for other apps that need that data. It handles access control and business logic, using the DB as a backing store. That means you don't have to worry about database user access control, because you don't have more than one user.

I don't have a strong opinion about either of these, but the latter way of working is how we do things at my current job (reasonably large payments software company), and it does work quite well. One advantage I appreciate is that there isn't arbitrary business logic sitting in the database where it can be changed on the fly - I wouldn't consider that a feature.

I'll make a stronger statement - stored procedures are generally not composable in production.

At least in Postgres they act as an optimization fence. In practice this means you can use them as an abstraction layer only in very limited ways where they happen to fit into a decent execution plan.

SQL is hard to programmatically compose. At the very least, you have to use a Lang/SQL.

That's a new term to me - lang/SQL?

I guess he means a procedural language for SQL, like PL/Python, PL/V8, etc., so PL/

Oh god, ick. I've used T-SQL and PL/SQL procedurally and hate them with the burning passion of many suns.

I think somebody needs to actually justify the need for composable SQL. With examples. It's far from obvious.

In my experience, outside of extension code the need for composable SQL is very limited. Most use cases can be satisfied by Rails 2 scopes.

Now, if composability was free - sure it's great. But it's not free.

So, on one hand:

* it would be nice to have infinitely composable SQL

On the other hand:

* it would be nice to just have the actual SQL with all the features, well understood syntax and semantics that cost decades and billions of dollars to develop

* and no wrapper-on-top-of-SQL limitations and translation bugs, and need for training

With these language-on-top-of-SQL you don't get both. In fact you get severe compromises in the second group.

User-generated data in a user-defined schema, where the user wants an Excel-autofilter-like sorting and filtering experience over rowsets that range from 10 thousand to 3 million rows, and 2 columns to 150 columns.

Columns may be sorted and filtered individually or in combination. Filters may be based on value set membership or range criteria, comparisons with fields in the data or via a join to other tables, or based on the presence of an entry in a join table (think: tags or labels, and you want to filter by presence or absence of tag).

(This is what I spent the past couple of months implementing. The simplest way of composing SQL is via union / except / intersect, but it's also the worst performing; doesn't work great past a few hundred thousand rows. And you still need to compose together the filter conditions into a where clause.)

Rails 2 scopes are literally the case for composability...

Yes, but a very limited one.

I'd love to see a solution with tradeoffs made the other way:

  * syntax and semantics 99% like underlying SQL
  * support for 99% of underlying DB functionality
  * traditional areas where ORMs add value:
    * deserialization into convenient data structures
    * casting, quoting and escaping on the way in and out
    * using stored association metadata for concise joins in the query

Every single one of those bullet points is a primary design goal of Diesel (caveat: Our semantics generally map to PG, because they apply well to other backends and are the least nonsensical)

   Diesel performs 33% faster than rust-postgres, and will be
   competitive with if not faster than idiomatic C.
Interesting. What makes it faster than rust-postgres?

Short version: Our compile time guarantees allow us to omit a lot of runtime checks that you would otherwise have.

Long version: http://bikeshed.fm/49

Longer version: I'm planning to do a really in-depth write-up for the website soon. I'll also be going into more detail during my talk at the SF Rust meetup; there's a link to the live stream towards the bottom of the thread.

I'm interested if some of those guarantees can be used upstream in rust-postgres?

As far as I saw, Diesel is essentially shipping with its own PostgreSQL driver.

It's possible. I tried to build on rust-postgres originally, but its design basically makes it impossible to abstract over. We also have fundamentally different views on how `ToSql` should work.

I'd love to find out what made it impossible to abstract over?

Please include in your larger talk!

The short version is that its refusal to box things means that you have to have the connection, the statement, and the cursor all as local variables in the same lexical scope (you almost always want to abstract away the statement and cursor), plus its use of trait objects for serialization and deserialization. I might go into it briefly in my talk, but I don't like to rag on other people's work so I probably won't say much on it.

(Sorry for the double post, I think I replied to my own comment and not yours!)

And rusqlite (or libsqlite-3) doesn't have those issues?

I don't think you're ragging on rust-postgres, although I'd probably attempt to change rust-postgres first :P

> And rusqlite (or libsqlite-3) doesn't have those issues?

rusqlite has the same issues. libsqlite doesn't since it's just C bindings. Same as pq_sys. Ironically, rusqlite makes the same design choices as rust-postgres, but it's actually wrapping a heap allocated object anyway, so it gains nothing by doing so...

> I'd probably attempt to change rust-postgres first

I did, too. It turned out to be a pretty major rewrite.

Perhaps the question should be rather:

Why is

an improvement over

    SELECT * FROM posts WHERE usre_id = 1;

... to which the answer is probably more obvious.

Hah I love this answer. ^_^

>> it nails the database schema into your Rust code

On the contrary, the purpose of an ORM is to let your code be a schema for your database. This is my personal use case for an ORM; I want persistence but am not overly concerned with the database. There are plenty of use cases [1] where a schema update from outside is unlikely or impossible. In these cases, there is a very low cost for (by way of an ORM) tightly coupling your code to your schema.

Also, an ORM lets somebody else worry about sanitisation and database-specific code. It can abstract away the choice of database, another very valuable feature. I think that the example you quote is actually a poor demonstration of the value of an ORM. The other ones on the linked page are far better :)

[1] A program running on only one machine, like an Android app or a music player on Linux desktops etc

If you change your schema without changing your client code, your program is broken - you just haven't found out about it yet. The fact that changing schema becomes a compile-time error with an ORM is a feature, not a bug.

Well, for one thing, the "SELECT *" query will typically yield a runtime-typed flat list of fields/values, accessed via a generic database access API.

Whereas, as an application programmer, I'd rather deal with structures that make sense in my domain of interest, and have the marshalling code from SQL results handled for me, with compile-time checking for correctness. When that's coupled with a database schema reverse engineering tool (not sure if this project provides one), any change to the database results in compile errors, which let me quickly hunt down the places in my code that are affected and need changing.

For one thing, I believe SELECT * is generally considered harmful (I certainly view it as such), and without wildcard you would need to update the query regardless of the method you use to construct it.

Yeah, I've been wondering if I should note somewhere in the examples that we don't actually execute `SELECT *`, we do something more reliable. I felt like listing every column would bury the lede though.

Sorry, I wasn't at all criticizing the example, I was replying to the parent poster. Your library looks awesome :)

Oh I know, I didn't think you were. Just stating a random thought I've had about the examples on the site. XD

I generally agree. Are there frameworks or db access libraries that just offer some helpers around plain ole SQL? Seems like every framework/ORM invents its own query DSL.

In the .NET world we have Dapper (https://github.com/StackExchange/dapper-dot-net), which lets us write queries in plain old SQL. It's a very useful little library. I believe StackOverflow use it for their data access.


  var posts = connection.Query<Post>("select * from posts where user_id = @UserId", param: new { UserId = 1 });

I really like Dapper - allows you to write "raw" SQL but does the tedious bit of mapping to/from objects in a nice way.

In Java I really liked MyBatis for this, and I think fits what you are asking.

"If someone adds a new field to a row, you have to rebuild all your Rust programs that use the database."

Are you sure? Why do you say that?

Read the Diesel build process, especially the section on "migrations".[1]

[1] http://diesel.rs/guides/getting-started/

This isn't true at all. You can add as many fields as you want. You'll need to recompile to use them, of course.

If anyone is interested in hearing from the creator, Sean Griffin will be talking about Diesel at the next Bay Area Rust Meetup on February 18th [1], which will be live streamed and eventually archived on Air Mozilla [2]:

[1]: http://www.meetup.com/Rust-Bay-Area/events/219697075/

[2]: https://air.mozilla.org/bay-area-rust-meetup-february-2016/

He also did a good podcast interview about Diesel and about Rust (and what makes a Rust ORM special with compile time magic) a week or so ago:


FYI It looks like the in-person slots for that meetup are already fully booked.

Sean Griffin, the maintainer of Rails' ActiveRecord, has (along with others) been working on this ORM for a while now. It's not a port of ActiveRecord, but a look at what an ORM that takes advantage of Rust's strengths would look like.

It makes heavy usage of Rust's type system features, particularly zero-sized types, to provide a large degree of safety with no overhead. Sean has mentioned several times that he even believes that he is faster than literal SQL strings, due to pre-compiling them in some way, but that's a bit out of my league.
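To give a flavor of the zero-sized-type idea, here is a minimal standard-library sketch. It is written in the spirit of the description above and is not Diesel's actual implementation: `Integer`, `Text`, `Column`, `UserId`, `Title`, `AsSql`, and `eq_filter` are all hypothetical names.

```rust
use std::mem::size_of;

/// Marker types standing in for SQL types; they carry no data.
pub struct Integer;
pub struct Text;

/// A column is described entirely at the type level.
pub trait Column {
    type SqlType;
    const NAME: &'static str;
}

/// Unit structs: values of these types occupy zero bytes at runtime.
pub struct UserId;
impl Column for UserId {
    type SqlType = Integer;
    const NAME: &'static str = "user_id";
}

pub struct Title;
impl Column for Title {
    type SqlType = Text;
    const NAME: &'static str = "title";
}

/// Says which Rust value types may be compared against which SQL type.
pub trait AsSql<SqlType>: ToString {}
impl AsSql<Integer> for i32 {}
impl<'a> AsSql<Text> for &'a str {}

/// Builds a `column = ?` predicate. The bound on `V` rejects mismatched
/// value types when the program is compiled, so no runtime check is needed.
pub fn eq_filter<C: Column, V: AsSql<C::SqlType>>(_column: C, value: V) -> (String, String) {
    (format!("{} = ?", C::NAME), value.to_string())
}

/// Reports the runtime size of a column value.
pub fn column_size() -> usize {
    size_of::<UserId>()
}
```

Here `eq_filter(UserId, 1i32)` compiles, while `eq_filter(UserId, "one")` is a compile error because `&str` has no `AsSql<Integer>` impl. The column value itself is zero bytes, so the type information costs nothing at runtime.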

He also wants to have features like "type-check all of your queries at compile time and then query the database to make sure that the schemas all line up" and other such compile-time safety shenanigans.

Honest question: When would it ever really make sense to use Rust on the server for a service that interacts with a SQL database heavily enough to justify using such a library?

Wouldn't waiting on the network eat up any possible performance gain you could expect from Rust vs something like the JVM or Go runtime? Before modern day JavaScript JITs, relational database systems were the epitome of dynamic runtime systems. There's so much performance variability in the query planner, network, disk caching, etc, to make garbage collection a rounding error. Zero-cost abstractions are just not a particularly useful thing in this context from what I can tell.

Even if you do have genuinely CPU-bound tasks, why wouldn't you use Rust for those modules and call it via FFI from whatever is doing most of the boring business logic? If it's just about static-type-check-all-the-things, I just can't get excited about it vs something like Go.

Short version: Because Rust's type system is a joy to use, and the reason I made Diesel was to try and push it forward in higher level contexts. I'm finding myself more productive in Rust than I am in any other language.

Long version: http://bikeshed.fm/49

In my limited experience using both Go and Rust for hobbyist web scraping, I haven't found either language to have a significant edge in development difficulty. Go's most salient edge over Rust would be in the more numerous and mature libraries.

So if Rust is not substantially easier or harder than Go, when Rust catches up to Go in the library ecosystem, I can see why people might want to use Rust. Also, it's nice to have better language support for functional strategies.

I think that Rust is substantially harder than Go. At least extrapolating from my personal experience and the 6+ highly-skilled people I've watched learn both Go and Scala simultaneously.

I just started learning Rust and its type system is much simpler than Scala. What makes it powerful is some well-designed somewhat orthogonal features (generics, type traits, the ownership system, algebraic data types, macros).

I didn't pick up Rust as quickly as Go. However, that's not really surprising, because Go is approximately the lowest common denominator of modern statically typed languages minus generics. (In a good and bad way.)

At any rate, I started learning Rust ~2 weeks ago and I am already writing small libraries, programs, etc. for my daily work.

Although that took a bit more time than Go, it's also a much more pleasing language to write in (the lack of sum types or generics in Go is very annoying) and you get more safety guarantees.

> it's also a much more pleasing language to write in (the lack or sum types or generics in Go is very annoying)

Interesting anecdote: Last year, for a course I had to write the assignments in Go. As a Rustacean, I grumbled about the lack of sum types a lot, and made various hacks via interfaces writing unidiomatic Go code to simulate them. My code design was also very Rust-y and unidiomatic.

This year, I'm TAing the same course, and there's a similar assignment. One student's submission was eerily similar to my design (by now I knew that it was unidiomatic), down to the hacks used. It also contained a comment saying "missing Rust's enums" (my code contained a similar comment to explain the hacks). At first I thought it may have been copied from my code (which can be found if you're looking for it), but it was different enough in other areas. I dug a bit and found that the student was indeed a rustacean.

It's rather interesting that two rustaceans used the exact same hacks and same overall design when asked to code in Go.

Surprisingly, I don't miss generics too much in Go. Interfaces used to make me cringe due to the extra runtime cost (since I knew Rust before learning Go, and such solutions would make me cringe in Rust), but once I got used to Go I learned to use them properly. Named interfaces are pretty easy to use and let you structure the code decently well, just like proper generics. They may have a runtime cost, but Go doesn't put as much focus on that as Rust.

On the other hand, easy-to-use non-hacky sum types are something I miss from Go all the time.
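For anyone who hasn't used them, here is a minimal sketch of the kind of sum type being missed. The `Shape` enum and `area` function are made-up names for illustration, not code from the course assignment mentioned above.

```rust
// Hypothetical example type: a value is exactly one of these variants,
// and each variant carries only the data that makes sense for it.
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    Empty,
}

// `match` has to cover every variant. Deleting one of the arms below
// turns into a compile error instead of a runtime surprise.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
        Shape::Empty => 0.0,
    }
}
```

Calling `area(&Shape::Rect { width: 2.0, height: 3.0 })` gives `6.0`. The closest Go equivalent is usually an interface plus a type switch, which is roughly the hack described above, and there the compiler does not check that every case is handled.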

I think you're right--Rust is harder to learn than Go--but I also think that doesn't really matter, since the important thing is whether people fluent in both languages can write code faster in Rust or Go. I have not seen many if any instances in which people who actually know Rust write code more slowly in Rust than they would in other languages (modulo compiler performance). The borrow check is, like, less than 5% of the errors I see, and the errors are at this point not significantly harder to fix than a misspelled variable name.

> people fluent in both languages can write code faster in Rust or Go

As someone belonging to that group, I agree. Short programs are easy to write in Go. If I need to write some utility script I would do it in Go (or more likely Python). However, if I need to write a whole new thing I find it very easy to design in Rust in a modular way.

Similarly, if I have a Go and a Rust codebase in front of me, and I need to make changes, it takes longer to work with the Go codebase for nontrivial changes. But this is less stark a difference than the above.

I will say that "modulo compiler perf" is an issue. I was tinkering with the Go compiler and the fact that I could try things out quickly was a big boon. However, I had to spend time figuring out which interface goes where (the documentation doesn't show interfaces implemented on types, you have to figure it out by looking at the source -- this is something that could be fixed though), so it might even out.

But yeah, fixing errors is not hard in Rust (nor do I often get borrow errors, usually it's other stuff), and in fact I find it easier to do than Go because Rust provides extensive errors with suggestions as well as showing the erroring thing inline, whereas Go just points out a file and line number (Again, this could be fixed).

Re: modulo compiler performance

My productivity has been greatly improved by restricting myself to `cargo check` when writing code. It only does type and borrow checking which is much faster than doing all of the LLVM passes (or whatever rustc does after those passes, on that subject I am ignorant).

Note that even for a large codebase, the Go compiler is faster than Rust's up to typecheck. Not much faster, but when I use Go I get errors immediately, whereas in Rust I often context-switch while waiting for errors (both for large codebases). Some planned improvements to Rust may fix this.

Fair enough, and being honest I haven't tried Go yet so I have no basis for comparison between the two compilers.

I've found that when using Atom with linter/linter-rust installed and set up with cargo check I rarely context switch while waiting for errors/warnings to be highlighted (and this is on a 1.6GHz Xeon machine at work). But maybe I'm just OK staring blankly for slightly longer than others :).

Well, I work with really large codebases in Rust (Servo or Rust itself), so the twenty seconds to a compile error are a lot. Whereas when hacking on Go's compiler it's much, much less.

I wouldn't generally compare Rust against Scala, as Rust's type system is less ambitious and the language itself has far fewer concepts.

On the other hand, if you're already using Rust, why switch to another language just to talk to the database?

There are much simpler standard ways to talk to a database from Rust:



This is an experimental approach. It might or might not be useful, but you don't have to use it.

That's valid, but ORMs have been pretty standard for years now too. Most mature languages have either ORMs or some database abstraction libraries, so it's good to see Rust getting those now.

> Most mature languages have either ORMs or some database abstraction libraries

And a lot of them really suck, too, because it's a hard problem. I'll take query builders or raw SQL over ORMs any day of the week.


It seems like the intersection of "good use case for Rust" and "needs to use SQL" will also overlap with "minimal SQL needs", hence the preference for simpler libraries.


Mentioned below where?

If you're talking about query string construction performance, I'm not sure I've ever seen even a Ruby application where query string building was a bottleneck vs the network.

Sorry, as you replied I had deleted my comment, because I decided I'd rather not get into an argument about this :) I will give you the link though:

https://news.ycombinator.com/item?id=11045700 <- "Diesel performs 33% faster than rust-postgres, and will be competitive with if not faster than ideomatic C. "

Forgive me, I didn't intend to be so argumentative. I genuinely want to understand. I'm trying to figure out if there are whole categories of use cases that I just don't have any exposure to.

As a side note: I keep trying Rust over and over again, and just can't get into it. I think my brain just doesn't work that way. For my hobby project that needs super low level perf, I've moved a codebase from C, to Rust, and now to http://terralang.org/ and I'm finally quite satisfied with how the project is going.

Naw, it was my response that was more so than I wanted it to be. No worries. :)

Rust focuses on "zero-cost abstractions", that is, if you had to implement a feature yourself, you couldn't do it any better.

So this library, while more complex than a straightforward binding, has more features that are useful, like increased safety, while also being faster. A win-win. And very much in line with Rust's overall philosophy.

It's a bit late here, but feel free to DM me in our Gitter room and I can give you a full explanation tomorrow.

For me, Rust's compiler and type system are much, much better than Go's. Rust is still very far from Go right now, but I like the "if it compiles, it works" property, which is not a certainty in Go.

Workers that consume tasks and write state to a db system can easily have lots of work that Rust's performance guarantees would help with.

Hopefully such systems won't be doing a fair amount of dynamic query generation... vs simple pre-baked CRUD operations.

I think it's hard to tell whether a language (by itself, not counting the libraries) will be good for something until you try. Rust has enough going on that some creative people might find some really useful things it can do in surprising spaces.

Think about JavaScript and Node.js. Who would have predicted that?

Even if we were to assume that GC is better than manual memory management for servers (something I think is not really true), this presupposes that the non-memory-management-related parts of Go's type system are better than those of Rust. I disagree with that for many reasons.

What about Go gets you excited where Rust doesn't?

In the context of my original post: 1) garbage collection, which isn't a big deal for most server applications and 2) simplicity, where the complexity of Rust only buys me marginal performance of questionable utility for the sorts of use cases where I'd be talking to a SQL database.

But the sorts of use cases where you'd be talking to a SQL database do benefit from generics and ADTs. For example, generics are good to avoid having to write for loops over and over, which is pretty pointless for application code (and, I'd argue, for systems code as well--for loops are just bad). Pattern matching is nice for high-level app logic, which often has the flavor of "if A, do X; if B, do Y; else do Z". Generics enable try!, which is a lot nicer than C-style error handling. And so on.
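A rough sketch of what that looks like in practice. Everything here (`Action`, `parse_action`, `parse_all`, `describe`) is a hypothetical name invented for the example. It uses `?`, the operator that later replaced the `try!` macro, and it relies on `Result<T, E>` being an ordinary generic type.

```rust
// Hypothetical request type, invented for this example.
#[derive(Debug, PartialEq)]
enum Action {
    Show(u32),
    Delete(u32),
    List,
}

// Turn a raw id into a number, mapping the parse error to a message.
fn parse_id(raw: &str) -> Result<u32, String> {
    raw.parse::<u32>()
        .map_err(|e| format!("bad id {:?}: {}", raw, e))
}

// Parse "show 3", "delete 7" or "list".
// `?` returns early with the error, which is what `try!` did.
fn parse_action(input: &str) -> Result<Action, String> {
    let mut parts = input.split_whitespace();
    match (parts.next(), parts.next()) {
        (Some("show"), Some(id)) => Ok(Action::Show(parse_id(id)?)),
        (Some("delete"), Some(id)) => Ok(Action::Delete(parse_id(id)?)),
        (Some("list"), None) => Ok(Action::List),
        _ => Err(format!("unrecognized input: {:?}", input)),
    }
}

// Generic over any source of strings, with no hand-written loop:
// collecting into a `Result` stops at the first error.
fn parse_all<'a, I>(inputs: I) -> Result<Vec<Action>, String>
where
    I: IntoIterator<Item = &'a str>,
{
    inputs.into_iter().map(parse_action).collect()
}

// "if A, do X; if B, do Y; else do Z", written as a match.
fn describe(action: &Action) -> String {
    match action {
        Action::Show(id) => format!("showing post {}", id),
        Action::Delete(id) => format!("deleting post {}", id),
        Action::List => "listing posts".to_string(),
    }
}
```

None of this is performance-sensitive code, which is the point: the generics and the matching are there for the app logic, not for speed.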

Diesel's selling point isn't actually the performance aspect. It's the safety and productivity.

We're not going to settle the dynamic vs static debate in this thread, so I'll refrain from commenting on safety and productivity :-)

Anyway, I don't mean to dump on your work. It seems like it's actually quite a nice implementation. I'm just questioning 1) the size of the audience and 2) Whether or not the audience who thinks they want this should actually want this.

If the size of your audience is your overriding concern, you shouldn't be writing your ORM in Go or Rust.

Memory is a critical resource for scalability, and GCs are not good at managing it.

I like writing Rust for more reasons than CPU efficiency.

In fact, that's behind: strong memory safety, Option/no-null (probably lumps in with "strong memory safety") and of course generics and the type system at large.

The speed is honestly a nice benefit. (I'm not being sassy, I don't have speed intensive applications and still reach for Rust).

I know the ".rs" TLD is common to Rust projects, but maybe add "for Rust" to the title?

I had no idea that .rs was conventionally for Rust projects, but I'm also not too bothered by having to figure it out.

I'm not sure it's convention but it's common just due to matching the file extension.

It's technically for the Republic of Serbia. I only chose it because .io was taken by a Python library (which I wish I'd checked before deciding on the name).

It's also a pain in the ass to register because you have to have a Serbian tax ID

Naming something "Diesel" is problematic, even though the name is well known because of Rudolf Diesel, the German inventor of the Diesel engine and Diesel fuel.

There is an Italian clothing company, "Diesel S.p.A.", that tries to protect its brand by suing almost anyone using "Diesel".

For example a German video game called "DieselStormers": due to a successful trademark lawsuit by Diesel S.p.A., who owns the trademark "Diesel" in relation to video games, Black Forest Games had been required to drop the title DieselStormers. http://www.eurogamer.net/articles/2015-09-30-dieselstormers-...

Such greedy companies act very similar to patent trolls. Well, there are sadly several other common words that are tainted by such "brands".

Edit: In 2015/2016 the Volkswagen (VW) Diesel scandal tainted the "Diesel" name too: http://autoweek.com/car-news/vw-diesel-scandal

Choosing Diesel as a name in 2016 is not exactly smart, greedy companies or not.

Really nice home page: it clearly explains what the product is for and how it works. I really wish more tools had landing pages like this; most of the time I spend a lot of time just figuring out what the hell a new tool is about. Great job!

I am a big fan of typography but I have to draw the line at ligatures in code samples. It's just not appropriate for type that actually has a compelling reason to be monospaced =)

Mind posting a screenshot of what you're seeing? All our code samples are `font-family: monospace`

It's being inherited.

  /* application.css:529 */
  body {
      -webkit-font-feature-settings: "kern", "liga", "pnum";
      -ms-font-feature-settings: "kern", "liga", "pnum";
      font-feature-settings: "kern", "liga", "pnum";
      /* ... */
  }

I think they're referring to the "fi" in filter:


^ This is what I'm seeing. Sorry it took so long :(

This is pretty cool, makes me want to take a look at Rust.

I don't quite get why nobody simply creates function bindings that have a query template string on one side, that is represented as a function that creates a parameterized query under the covers, and a solid type that represents a record that is returned? I mean instead of jumping through all the binding chains.. just have a string and type... call the generated function based on the string, return an array/stream/whatever of <type>

I'd rather use template string generated promises in node, tbh, than any ORM.

That is our `sql` function

Very cool. I like that it (as far as I can see) is actually built on top of an SQL AST.

This is one of the shortcomings of ActiveRecord. The SQL building used to be extremely ad-hoc, and though ARel was eventually written to fix this problem, the integration between ARel and ActiveRecord is not very fluid (and ARel itself has some design issues; if I remember correctly from the last time I worked with it, it's not immutable, which can be very confusing when chaining expressions).

Yeah, I'm planning on killing Arel at some point. I actually want to extract `Relation` to a gem entirely, as it's really not an adequate query builder for what we need in AR.

Also random fun story about Arel not being immutable. We had a bug in 4.1 where basically PG bind params are ordered, but Relation can't actually know the order ahead of time (e.g. because a Relation can be used in a from clause). Turned out actually fixing it was somewhat hard (the "real" fix was eventually https://github.com/rails/arel/commit/590c784a30b13153667f8db...). But as a hack, basically someone took advantage of the fact that `Arel::Nodes::BindParam < SqlLiteral` and `SqlLiteral < String`, and Strings are mutable. So they just ignored the lack of a proper map function, and fiddled with the strings. >_>

Anyway, yeah our AST is fun because Rust's compiler lets us set it up so that we can walk it as many times as we need basically for free (because it gets monomorphised and the walking ultimately gets inlined and optimized away)
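To make the monomorphisation point concrete, here is a toy version of the idea. The names (`Expr`, `Column`, `Literal`, `Equals`, `And`, `render`) are invented for this sketch and are not Diesel's actual types; it also inlines values rather than using bind parameters, purely to keep it short.

```rust
// Hypothetical mini query AST. Every node is its own type, so a whole
// query is one big nested generic type known at compile time.
trait Expr {
    // Append this node's SQL text to `out`.
    fn to_sql(&self, out: &mut String);
}

struct Column(&'static str);
struct Literal(i32);
struct Equals<L, R>(L, R);
struct And<L, R>(L, R);

impl Expr for Column {
    fn to_sql(&self, out: &mut String) {
        out.push_str(self.0);
    }
}

impl Expr for Literal {
    fn to_sql(&self, out: &mut String) {
        out.push_str(&self.0.to_string());
    }
}

impl<L: Expr, R: Expr> Expr for Equals<L, R> {
    fn to_sql(&self, out: &mut String) {
        self.0.to_sql(out);
        out.push_str(" = ");
        self.1.to_sql(out);
    }
}

impl<L: Expr, R: Expr> Expr for And<L, R> {
    fn to_sql(&self, out: &mut String) {
        out.push('(');
        self.0.to_sql(out);
        out.push_str(" AND ");
        self.1.to_sql(out);
        out.push(')');
    }
}

// Generic, so the compiler emits one copy per concrete tree type.
// All calls are statically dispatched; there are no trait objects.
fn render<E: Expr>(expr: &E) -> String {
    let mut out = String::new();
    expr.to_sql(&mut out);
    out
}
```

With this, `render(&And(Equals(Column("user_id"), Literal(1)), Equals(Column("published"), Literal(1))))` produces `(user_id = 1 AND published = 1)`. Because the tree's shape is part of its type, the optimizer is free to inline the whole walk, though how much it actually does is up to the compiler.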

How does this reconcile the async IO problem?

I couldn't figure out from cursory reading how "PgConnection" works or how it would be compatible with other IO-related code.

This is still one of Rust's major shortcomings. Async IO, particularly async database and network IO, is still pretty much nonexistent in any practical sense. Some libraries are in the works, but for the time being, I'm still passing over Rust for any real work.

It's very close to landing in hyper, which should then percolate up through the rest of the ecosystem, at least, HTTP-wise.

Do you know of any resource that would explain the current status of running Rust as a web service endpoint?

I'm thinking of questions such as:

- embedded HTTP server vs using Apache or nginx (e.g. Node and Go seem to favor not using third-party servers, at least for simple cases)

- preferred concurrency model? (async vs thread-based vs coroutines vs...)

- deployment issues? (single binary with everything inside vs parts that need to be predeployed on the server beforehand)

- third-party connectivity? (Diesel seems like a really fantastic way to query a db, but what about other storage such as S3, Mongo, Redis, queue systems, etc.)

- current framework status?

I know it sounds like a lot, but ORMs are usually the major pain point for me when looking at a new technology (it's one of the main reasons I'm not using Go, for example), and now that Diesel seems to answer all my needs, I'm REALLY looking forward to starting to use Rust. I just need those questions answered, and I think I'm not the only one.

I don't know of one, so let me write up something short:

  > embedded http server vs using apache or nginx 
For now, put nginx in front of things, at least if you're using one of the non-rotor frameworks. That's because they are synchronous for now. This will change soonish, more below.

  > preferred concurrency model?
If you're using a framework, this is handled for you already. If you're building something custom, you probably want to use something like mio to make it asynchronous. Current frameworks are all using synchronous IO, but work is underway to switch that, which will then percolate upwards through the ecosystem.

  > deployment issues?
By default, everything is statically linked except glibc, so you make sure the glibc versions are compatible and then copy the binary.

  > third-party connectivity?
Wonderful or nonexistent, depending. We're still youngish, so there are gaps. Support ranges from "provided by the vendor itself" (mongo) to "nonexistent" (some random NoSQL store that might be popular amongst a certain group but for which nobody has written a binding yet).

  > current frameworks status?
The two most well-known are Iron and Nickel. Crates.io uses its own homegrown framework due to age. The author of Diesel may or may not be working on a new one to go with it...

> The author of Diesel may or may not be working on a new one to go with it...

IIRC, he hints at working on one (I think with Yehuda Katz?) in the podcast where he talked about Diesel:


I'm not sure I entirely understand the question, but we ultimately end up hooking into libpq, and right now the queries are synchronous. As we look towards features like associations, this will actually have to be true to some extent, since if we want the return type to be `(User, Vec<Post>)`, we actually need to allocate the entire result set we're reading into memory to implement `next()`.

Would it be desirable or possible to instead return a lazy collection that only fetches perhaps 10 results and fetches more as you iterate?

It's impossible. We can't guarantee the order of the result set, so we'd have to iterate the entire collection just to implement `next`
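A sketch of why, using plain Rust with no database involved. The `User` and `Post` structs and the `group_posts` function are hypothetical stand-ins, and this is not how Diesel implements associations; it only shows the shape of the problem.

```rust
use std::collections::BTreeMap;

// Hypothetical row types; a real join would load these from the database.
#[derive(Debug, Clone, PartialEq)]
struct User {
    id: i32,
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Post {
    id: i32,
    user_id: i32,
    title: String,
}

// Group flat (user, post) join rows into one entry per user.
// Rows for the same user can show up anywhere in the input, so the first
// `(User, Vec<Post>)` is only known to be complete once every row is read.
fn group_posts<I>(rows: I) -> Vec<(User, Vec<Post>)>
where
    I: IntoIterator<Item = (User, Post)>,
{
    let mut grouped: BTreeMap<i32, (User, Vec<Post>)> = BTreeMap::new();
    for (user, post) in rows {
        grouped
            .entry(user.id)
            .or_insert_with(|| (user, Vec::new()))
            .1
            .push(post);
    }
    // Results come out ordered by user id because of the BTreeMap.
    grouped.into_iter().map(|(_, pair)| pair).collect()
}
```

If the rows were guaranteed to arrive sorted by user, a lazy version could yield a user as soon as the next user's first row appeared. Without that guarantee, the last row of the result set might still belong to the first user.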

An FYI to anyone responsible for this site, or in contact with them, the visited and unvisited link colors in the getting started guide[1] are very close to the background color, which makes them hard to read. Specifically, the link to dotenv and the CLI.

1: http://diesel.rs/guides/getting-started/

Thanks. The designer is taking another pass at it soon, I didn't expect to have so much traffic. I'll bump up the color on the links until then.

This would be really awesome if someone used this with a Rust framework being used in production.

It would be really cool if this had an easy integration with PostGIS that uses simple syntax to query location.

When I'm using a Node framework with Postgres I have to use a NoSQL db for playing with coordinates, as PostGIS syntax is not something I am very fond of.

FWIW making sure it's easy to add extensions like this is part of the core design of Diesel. For example, this adds full text search support (it's slightly out of date) https://github.com/sgrif/diesel_full_text_search/blob/master...

If you're interested in adding postgis support, ping me in our Gitter room. I'd be happy to help.

How does this compare to Elixir's Ecto?

I'm having Jose on The Bikeshed (bikeshed.fm) next month to compare notes!

In the getting started guide, it looks like they use an as-yet-non-existent Post struct to create the migration SQL. Where is the migration CLI getting the struct's fields from if it hasn't yet been written in that part of the tutorial? Am I missing something obvious?

The CLI just generates the file structure. We fill in the SQL manually as part of the guide, and then create the structs immediately after that.

What I get for reading too quickly! Thanks for clarifying.

It wasn't super clear, anyway. Fixed in https://github.com/sgrif/diesel.rs-website/pull/5 :)

You're supposed to write that SQL yourself, I believe. I'll file an issue to clear this up either way.

I like how the logo uses a red gas can. Diesel fuel actually goes in a yellow gas can.

I think it's green, at least in the US. XD

nope, diesel is yellow. never seen a green fuel can.

For plastic 'cans' in the UK, we use green for unleaded petrol. Red was for leaded petrol and diesel is black.

I'll take your word for it. I'm no gas can expert. :)

Why did a NewPost struct have to be created to actually add a new Post? Is it generally not possible to have the result type of a query be used to update the same records?

Update yes. It's possible to use the same struct for create, too if you really want, but you'd need to wrap `id` in an `Option`.

Generally we're designed for the idea that you'll have a separate struct for read, create, and update. We also try and design for the latter two mapping roughly to web forms, as this is how most large applications end up wanting to be structured from my experience.

The main thing I'm trying to avoid is lumping all your logic in a single place, and ending up in many of the same rabbit holes that Active Record has led us to (which is at least partially my fault)
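Roughly, the trade-off looks like this. This is plain Rust without any Diesel derives, and the `insert`, `url_for` and `url_for_either` helpers as well as the `PostOrDraft` struct are hypothetical, with a `Vec` standing in for the table.

```rust
// What a query returns: the row exists, so `id` is always present.
#[derive(Debug, PartialEq)]
struct Post {
    id: i32,
    title: String,
    body: String,
}

// What an insert needs: no `id`, because the database assigns it.
#[derive(Debug)]
struct NewPost<'a> {
    title: &'a str,
    body: &'a str,
}

// The single-struct alternative: `id` has to be optional for inserts,
// so every reader must handle a `None` that never occurs after saving.
#[derive(Debug)]
struct PostOrDraft {
    id: Option<i32>,
    title: String,
    body: String,
}

// Stand-in for an INSERT: the "table" is a Vec and ids count up from 1.
fn insert(table: &mut Vec<Post>, new_post: &NewPost<'_>) -> i32 {
    let id = table.len() as i32 + 1;
    table.push(Post {
        id,
        title: new_post.title.to_string(),
        body: new_post.body.to_string(),
    });
    id
}

// With the read struct, code that needs the id just uses it.
fn url_for(post: &Post) -> String {
    format!("/posts/{}", post.id)
}

// With the combined struct, the same code has to return an Option too.
fn url_for_either(post: &PostOrDraft) -> Option<String> {
    post.id.map(|id| format!("/posts/{}", id))
}
```

The extra struct costs a few lines up front, while the `Option` version spreads that cost over every function that touches an id.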

I guess that makes sense. The types should be able to ensure you don't have any errors even with so many "redundant" types, but my first thought is of how much boilerplate will result. Like, I would guess most record types will be very similar between create, update, and retrieve.

Also, what about upsert?

> The types should be able to ensure you don't have any errors even with so many "redundant" types

Correct. Sorry, I should have been clearer in my last comment. The reason you'd want a separate type for insert is because wrapping your id in an `Option` would be painful for anything except insert.

> but my first thought is of how much boiler plate will result.

Yeah, if you're coming from Active Record (which is the other project I maintain), we'll probably seem a little bit verbose. My goal here is to be 80% as nice to get started with, but nudge users towards significantly more maintainable code as their product grows.

> I would guess most record types will be very similar between create, update, and retrieve.

Early on, yes. 2 years in, not so much.

> Also, what about upsert?

This is something I'd like to add support for in the future. This comment made me realize I didn't have a tracking issue, so here it is: https://github.com/sgrif/diesel/issues/196

We went through a somewhat similar process at an old job. Entities don't have an ID before they've been saved, but do after. Other things can change over their lifecycle too. At one point, we had separate types for each stage in the lifecycle, a bit like your different types for insert and read, but that makes it impossible to write any kind of code which is stage-agnostic - so for example you need a separate to-string function for each stage.

We were working in Scala, and I had a go at solving this using a higher-kinded type for the ID and other changing fields:


The idea is that if you say that the ID is of type 'T of String, where T is some subtype of Option', then you can bind T to None for entities that haven't been saved, Some for ones which have, and Option where you have a mixture. You then have one type definition, with which you can write stage-agnostic code, but static guarantees about whether there's an ID.

You can go further and use type constraints to enforce the order of lifecycle stages: if your object can only be saved to the database once it's been given a timestamp, say, you can make sure that the ID type can only be Some if the timestamp type is also Some.

So maybe when Rust gets HKTs it'll be time for a new release!

This type of bound can be expressed with a trait and associated type already in Rust, fwiw.
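Something along these lines, as a minimal sketch. The `Stage` trait, the marker types, and the `Entity` struct are all hypothetical names chosen for the example.

```rust
use std::fmt::Debug;

// The lifecycle stage decides the type of the `id` field.
trait Stage {
    type Id: Debug;
}

struct Unsaved;    // never has an id
struct Saved;      // always has an id
struct MaybeSaved; // a mixture of both

impl Stage for Unsaved {
    type Id = ();
}

impl Stage for Saved {
    type Id = i64;
}

impl Stage for MaybeSaved {
    type Id = Option<i64>;
}

// One type definition, parameterised by stage.
struct Entity<S: Stage> {
    id: S::Id,
    name: String,
}

// Stage-agnostic code: one to-string function for every stage.
fn describe<S: Stage>(entity: &Entity<S>) -> String {
    format!("{} (id: {:?})", entity.name, entity.id)
}

// Only accepts unsaved entities and hands back a saved one.
fn save(entity: Entity<Unsaved>, assigned_id: i64) -> Entity<Saved> {
    Entity { id: assigned_id, name: entity.name }
}

// Only compiles for saved entities, so there is no Option to unwrap.
fn id_of(entity: &Entity<Saved>) -> i64 {
    entity.id
}
```

Passing an `Entity<Unsaved>` to `id_of` is a type error, while `describe` works on all three stages. The timestamp-before-id ordering from the comment above could be encoded the same way with a second associated type, though that is left out here.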

> Early on, yes. 2 years in, not so much

I guess it's because I'm coming from KDB ETL work when I think about database stuff that I don't share this perspective. But in KDB, the database has no notion of non-null columns or auto-ids or anything.

Can you just use stock-standard parametrised SQL queries or do you have to use the ORM to build the SQL statement?

We have a SQL literal AST node, but you lose most of the benefits of the library that way.

I would really like to know this, too!

Safe from what?

Type safe, memory safe.

How is a strongly typed language, such as CL, less safe?

Edit: let's paraphrase - how do excessive safety nets for construction workers make buildings safer?

A strongly typed language without implicit coercions is not more "unsafe".

Banning heterogeneous containers and conditionals, and maybe-wrapping everything in order to check some constraints at compile time and catch simple errors (which would also be caught at run time in a strongly typed language), does not qualify for this "safe" meme.
