

Arel, a composable relational algebra for Ruby - raganwald
http://github.com/rails/arel

======
xal
It's important to point out that Rails 3 is built on top of this. It was
started as a Google Summer of Code project. It's an amazing piece of
engineering.

------
dmnd
So this is Linq for Ruby. Using this in place of ActiveRecord's querying API
will be great.

One nice thing about Linq though is that most (all?) of the methods on
IEnumerable have the same semantics as those on IQueryable. Ruby has a pretty
nice Enumerable, but the method names in Arel don't match up. Map becomes
project, select is where, etc.

It's convenient to be able to use the same syntax to query an in memory object
collection that you use to query a database (or any other kind of abstract
collection).

~~~
davidmathers
Map and project aren't the same thing. Map can't be part of the algebra
because it breaks the closure property. Project is a single, specific
function.

I agree with your sentiment though. It looks like Nick took names from SQL
where available ("where") and from relational algebra otherwise ("project"). I
would rather the names came from ruby first.

~~~
dmnd
I don't know much (anything) about relational algebra or breaking closure
properties, so I'm glad my sentiment was clear despite that error.

Is project different from map because it can only output a subset of the
attributes that go in, whereas an arbitrary thing can come out of map?

~~~
davidmathers
Exactly. Project is just an operator that takes a table and gives you a table
with a different number of attributes. All the relational operations take and
return table values (aka relation values), which is what makes the algebra
closed under operations.

SQL select actually combines project and map (with select...as) by letting you
define maps for the individual attributes of the table. That's fine since you
can't break the table structure. You just can't map on tables or table rows.
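
To make the closure point concrete, here is a minimal sketch in plain Ruby. The `Relation` class and its `project` and `where` methods are hypothetical toy code written for this illustration, not Arel's actual API. It shows that project and where always hand back a relation, so they keep chaining, while `map` can return anything at all.

```ruby
# Hypothetical toy relation: attribute names plus rows stored as hashes.
# This is an illustration only, not how Arel represents relations.
class Relation
  attr_reader :attributes, :rows

  def initialize(attributes, rows)
    @attributes = attributes
    @rows = rows.uniq # a relation is a set, so duplicate rows collapse
  end

  # project: keep only the named attributes; the result is again a Relation.
  def project(*names)
    unknown = names - attributes
    raise ArgumentError, "unknown attributes: #{unknown.inspect}" unless unknown.empty?

    Relation.new(names, rows.map { |row| row.slice(*names) })
  end

  # where: keep rows matching the block; the result is again a Relation.
  def where(&predicate)
    Relation.new(attributes, rows.select(&predicate))
  end
end

users = Relation.new(
  [:id, :name, :city],
  [
    { id: 1, name: "amy", city: "Oslo" },
    { id: 2, name: "bob", city: "Oslo" },
    { id: 3, name: "cat", city: "Rome" }
  ]
)

# Closed: each step returns a Relation, so operators keep composing.
cities = users.where { |row| row[:id] < 3 }.project(:city)

# Not closed: Array#map can return anything, here a plain array of strings.
shouted = users.rows.map { |row| row[:name].upcase }
```

Here `cities` is still a `Relation` and could be projected or filtered again, whereas `shouted` is just an array of strings with no table structure left to operate on.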

------
pvg
It looks interesting and clever (and a practical way to learn about relational
algebra) but the reason similar interesting and clever solutions haven't
become very popular over the years is that the field of applications where
they have a significant advantage is very narrow. Often these are rapid-
prototyping development tools and places where there's a need to generate very
complex queries in a way compatible with different relational sources.
Typically, though, the queries are not that complex and don't need to be
generated dynamically so this sort of thing becomes just an exquisitely
designed layer for more bugs to hide in.

~~~
fizx
On the contrary, having a real data structure that represents the query you
want your ORM to execute removes potential bugs! The alternative/old method of
generating SQL was a bunch of kludgy string concatenation.
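
A rough sketch of what "a real data structure that represents the query" could look like, in plain Ruby. The `Query` class below is hypothetical and much simpler than Arel (conditions are raw strings here, where a real implementation would presumably use structured nodes and handle quoting). The point it illustrates is that queries become values you can extend without touching the original, and SQL text is produced in exactly one place.

```ruby
# Hypothetical immutable query value; not Arel's real classes.
class Query
  attr_reader :table, :columns, :conditions

  def initialize(table, columns: ["*"], conditions: [])
    @table = table
    @columns = columns.freeze
    @conditions = conditions.freeze
  end

  # Each refinement returns a new Query and leaves the receiver untouched.
  def project(*cols)
    Query.new(table, columns: cols, conditions: conditions)
  end

  def where(condition)
    Query.new(table, columns: columns, conditions: conditions + [condition])
  end

  # The only place where SQL text is assembled.
  def to_sql
    sql = "SELECT #{columns.join(', ')} FROM #{table}"
    sql += " WHERE #{conditions.join(' AND ')}" unless conditions.empty?
    sql
  end
end

base   = Query.new("users")
adults = base.where("age >= 18")
names  = adults.where("active = 1").project("name")

puts base.to_sql
puts adults.to_sql
puts names.to_sql
```

Because `base` and `adults` are never mutated, they can be reused as building blocks for other queries, which is awkward to do safely when the query only exists as a partially concatenated string.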

~~~
pvg
There's also the old/current method of writing SQL by hand. And you're going
to have a hard time convincing me that adding another layer of code to
anything usually removes bugs. More code = more bugs. Not always, but just
about.

~~~
raganwald
I wonder if that depends on your world view? If you are "thinking in SQL,"
adding a layer of abstraction forces you to think of SQL and then solve the
puzzle of "Which Arel incantation produces the desired result?"

Whereas if you find a way to think in algebra, Arel implements the abstraction
and removes the problem of "Which SQL incantation produces the desired
result?"

You're right that every abstraction introduces some problems. It can be a win if
the abstraction's mental model is congruent to your thinking or to the problem
space.

~~~
pvg
I think it's really more a matter of experience than a worldview. In a
previous life, I used to work on a (commercial) product with similar
capabilities. I'm also not the sort of database superhero that can just spew
optimal SQL effortlessly. In fact, I do tend to think at a level closer to
relational algebra.

I've just come to find such abstraction layers less useful in the typical web
app. They seem to be more applicable in an enterprise app setting where
portability is important, control over the schema might be limited or
non-existent, and the performance and scalability requirements are more predictable.
In a web app, the structural complexity of the data is often low or the data
is not well-representable in relational form (note the rise in popularity of
non-relational stores). Performance requirements are harder to predict. Adding
a relational algebra abstraction layer in that context often doesn't add
enough value to justify the increase in complexity - you still end up having
to understand the entire depth of the stack while getting little benefit from
the extra capabilities it offers beyond the warm fuzzy feeling of using
something pretty neat.

------
mst
Shame they appear to have confused how GROUP BY works. They're correct that -

SELECT users.*, count(photos.id)
FROM users
LEFT OUTER JOIN photos ON users.id = photos.user_id
GROUP BY photos.user_id

won't do what they want. In fact, it isn't even a valid query outside of MySQL
that I can think of.

What -will- do the right thing is:

SELECT users.id, users.name, count(photos.id)
FROM users
LEFT OUTER JOIN photos ON users.id = photos.user_id
GROUP BY users.id, users.name

(which in the conceptually similar perl thing I'm working on at the moment
you'd represent just by asking for a set of user objects with
$user->photos->count eager loaded, but anyway ...)
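
The difference between the two groupings can be checked in memory. The sketch below uses plain Ruby arrays with made-up sample data (three users, only one of whom has photos) to imitate the left outer join and then group it both ways. It only mimics the semantics and does not run any SQL.

```ruby
# Made-up sample data: only amy has photos.
users = [
  { id: 1, name: "amy" },
  { id: 2, name: "bob" },
  { id: 3, name: "cat" }
]
photos = [
  { id: 10, user_id: 1 },
  { id: 11, user_id: 1 }
]

# LEFT OUTER JOIN: each user paired with each matching photo, or with nil if none.
joined = users.flat_map do |user|
  matches = photos.select { |photo| photo[:user_id] == user[:id] }
  matches.empty? ? [[user, nil]] : matches.map { |photo| [user, photo] }
end

# GROUP BY users.id, users.name: one group per user.
# count(photos.id) skips NULLs, so users without photos get 0.
by_user = joined
  .group_by { |user, _photo| [user[:id], user[:name]] }
  .map { |(id, name), pairs| [id, name, pairs.count { |_user, photo| photo }] }

# GROUP BY photos.user_id: users without photos all share the NULL (nil) key.
by_photo_owner = joined.group_by { |_user, photo| photo && photo[:user_id] }

p by_user
p by_photo_owner.keys
```

Grouping on the users columns keeps one row per user, including those with a count of zero. Grouping on `photos.user_id` folds bob and cat into a single nil group, so there is no longer a well-defined user row to return for it.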

------
raganwald
Avdi Grimm also pointed me to: <http://github.com/dkubb/veritas>

