
Pony ORM - Use Pure Python to Speak to Your Data - pykello
http://ponyorm.com/
======
kozlovsky
Hi, I'm Alexander Kozlovsky, one of Pony ORM authors. I'll try to answer some
questions

> why such an obsession with "pure" Python?

With this statement we are trying to say Pony is not just to offer some syntax
sugar. The end goal of Pony ORM is to understand the semantic of each Python
generator which can be translated to database query and then do translation
accordingly.

The direct benefit of this is that generator query are more high-level then
resulted SQL, and can be much easier to understand and refactor. The simplest
example is an automatic converting of attribute path traversing such as
grade.student.group.department.name to a minimal set of SQL joins. Pony also
can use attribute path in reverse direction, from “one” to “many”, such as
department.groups.students.gpa, which in this case resulted in a multiset of
all ‘gpa’ values for this department. Also you can see examples of query
optimization in the documentation, for example when the subquery is converted
to LEFT JOIN with GROUP BY, where it can give performance benefit.

There are more distant benefits of the semantic translation. In future, we
plan to add support of denormalized database schemas. Such denormalization
will take place on a physical level (i.e. in the database) while on the
logical level (in Python) all queries and logic will remain the same. But
first of all we are going to add the migration support.

Another future plan is to add support of MongoDB. This is a distant plan, but
I think Pony architecture will allow this. After that, migration of PonyORM-
based project from relational databases to noSQL and vice versa will be
possible.

> An ORM like Django's offers a higher level of abstraction

I disagree with this. I think, any Django query can be written in PonyORM much
more concisely, and there are easy-to-understand PonyORM queries which cannot
be expressed in Django easily.

Also, Pony ORM completely eliminates N+1 select problem.

------
benatkin
This has a split licensing model. <http://ponyorm.com/license-and-
pricing.html>

~~~
bru
>Free for non-commercial usage: Creative Commons Attribution-NonCommercial 3.0
License

Beware! CC licenses are _not_ for software, and it actually makes it
incompatible with GPL. Alexander and Alexey, please see
[http://wiki.creativecommons.org/Frequently_Asked_Questions#C...](http://wiki.creativecommons.org/Frequently_Asked_Questions#Can_I_apply_a_Creative_Commons_license_to_software.3F)

~~~
Flimm
The NonCommercial part makes it _not_ open source, and _not_ libre software.
Both the OSI and the FSF clearly state that FLOSS software has to allow
commercial use for it to qualify.

------
kijin
I'm not very familiar with the culture of the Python community so this might
come across as offending, but why such an obsession with "pure" Python?

On one extreme, of course, there are spaghetti apps that mix PHP, HTML, CSS,
JS, shell, and SQL in the same file. We all know and hate those apps. But is
there any reason to jump to the other extreme and turn as much as possible
into "pure" $LANGUAGE ?

As soon as your query gets moderately complicated, you're still littering
Python with SQL keywords like order_by(), desc(), select(), and commit(). The
GROUP BY section of the documentation just reads like a rough translation of
SQL into Python, the only difference being the syntax. It's like watching a
first-week ESL student try to construct sentences in English.

The documentation advertises that "Pony allows any programmer to write complex
and effective queries against a database, even without being an expert in
SQL." I don't think this is true for any ORM I've seen so far, whether in
Python or in any other popular language. Beyond a certain level of complexity,
you need to know SQL in order to write complex queries. But if you already
know SQL, why translate long GROUP BY ... HAVING queries into Python only to
have the ORM translate it back into SQL? Why are we trying so hard to avoid
writing SQL? What are we going to do next? Write a library that translates
pure Python into Lua scripts for your Redis server?

I like ORMs because they simplify frequent tasks, like grabbing a dozen items
from the database and filtering them by a couple of columns. I also like them
because they often come with caching and effective protections against SQL
injection attacks. But I also think that purity is overrated. Both web apps
and native apps are already a mixture of several different languages, both on
the frontend and on the backend. Don't be afraid to add SQL to your belt, it's
just another language.

(By the way, why is there an order_by() method and a separate orderby()
method?)

~~~
ronnix
> I'm not very familiar with the culture of the Python community so this might
> come across as offending, but why such an obsession with "pure" Python?

I think it refers to the implementation, meaning the library has no C modules
and is written 100% in Python.

This means it should be able to run on all Python implementations, including
Jython, IronPython, PyPy, Google App Engine...

~~~
amalashkevich
I'm Alexey Malashkevich, one of the Pony ORM authors. For now Pony ORM runs on
CPython only because there is no access to frames in other implementations.
"Pure Python" means that you can write queries in term of objects using Python
generators.

~~~
ronnix
OK, thanks. But isn't it the point of all ORMs? Why the emphasis on pure?

~~~
amalashkevich
Pure in this case means that you can use Python syntax in order to query a
database. This way a database query looks identical to an iteration over a
list of Python objects.

------
megaman821
I really like the style of this. I remember seeing LINQ years ago and
wondering if something like it was possible in Python. I was disappointed when
it looked like it couldn't be done, but generating an AST then translating to
SQL is smart.

It seems from this thread not many people appreciate the parity of:

    
    
        list(o for o in Item if o.price>3)
    

when Item is a normal in-memory Python iterable, and:

    
    
        select(o for o in Item if o.price>3)
    

when Item is a row in a database.

After just helping someone get started that was new to Python and Django, it
was weird helping them learn list comprehensions and generators and then
having to teach such a divergent form of syntax to work with the Django ORM
even though the a lot of the concepts should be similar.

~~~
d0mine
you might mean "when Item is a table in a database" instead of "a row in a
database". `o` would be the row.

------
bayesianhorse
It will be very interesting to see, if this can be a commercial success, and a
good example to learn from.

In competition with Django ORM and Sqlalchemy, both of which are open source
and battle tested, they will have to do some cutting edge marketing to justify
the cost "per process".

The idea is great. We might see something like this pop up in Sqlalchemy or
Django real soon...

~~~
falsedan
SQLAlchemy comes with SQLSoup: not as 'pure' but the abstraction overhead is
low and the interface is very similar

------
luckystarr
How does it translate generator expressions? Some kind of byte-code magic?

~~~
yangyang
It appears to build an AST from the code object on the generator, and then
translate to SQL.

[https://github.com/ponyorm/pony/blob/orm/pony/orm/decompilin...](https://github.com/ponyorm/pony/blob/orm/pony/orm/decompiling.py)

[https://github.com/ponyorm/pony/blob/orm/pony/orm/asttransla...](https://github.com/ponyorm/pony/blob/orm/pony/orm/asttranslation.py)

------
estebank
This definitely reminds me of LINQ (never having used it).

As much as I dislike things like this that translate a language into another,
but not quite _right_ , as you have one syntax trying to mimic a completely
different one, you _do_ (or could) get the benefit of the possibility of
autocompletion, syntax highlighting and lint checks.

I know that adding something like LINQ in Python is completely out of the
question, as you'd have a special syntax case, and I agree with that.

For some time I've wondered how hard would it be to reverse traverse the
Python AST to find literal strings that are being passed to a dbapi2
interface, and syntax highlight/lint/autocomplete the SQL on those strings.
This gets there by bypassing the "Language is in a string" issue.

~~~
jimmaswell
I've used LINQ very often and I can say it's been superb. I can't compare it
to sql because I don't really know sql though.

~~~
taude
You're pretty lucky that you haven't had to know SQL to have good luck.

Having used Linq2SQL for a few years, I don't like this style of coding.
Often, to optimize a query, we've had to look at the generated SQL and then
reverse engineer it back to Linq to get the SQL that we wanted. While certain
queries were really easy in Linq2SQL, the more complex ones were extremely
difficult. I had to still understand and write the underlying SQL, then I had
to figure out how to "translate" that into the Linq2SQL. In the long run, all
time savings were nullified. I see the same with a DSL-type of wrapper like
this.

Additionally, almost all good backend devs speak SQL, I can take a Java-person
and without the Linq2SQL stuff, he could understand our backend, work on
queries, etc. The Linq just added another layer of complexity for someone to
think about.

(Granted, there's a database layer that we could truly put a lot of that stuff
in, if our queries got really complex)

~~~
jimmaswell
I haven't used linq2SQL or anything, just LINQ on collections in .NET.

A team I'm a part of (language is C#) had to get rid of a library that handled
SQL and do the SQL manually because the library was making a lot of
inefficient calls. Sounds like a similar problem.

------
mh-
unfortunately the confusing licensing circumstances will probably hamper some
of the interest in this.

i'm not opposed to commercial licensing, it's just that the existing
combination of options doesn't offer clear legal guarantees.

if you go to the website, the split-licensing is apparent - yet has issues
with, what seems like, improper use of Creative Commons licensing.

if you go to the github repo, the only license presented is AGPLv3.

are people who find the repo on github directly at risk of violating the
license?

~~~
amalashkevich
Thanks for pointing this out, we are working on the licensing part now. We are
not going to put anyone at risk of the licence misuse. Eventually we are going
to have a multiple licensing model allow using Pony ORM for free for non-
commercial and open source projects, but in the same provide a commercial
license as well.

> are people who find the repo on github directly at risk of violating the
> license?

Nope, why? Using it under AGPL is a valid option.

------
nbevans
.NET has had this since 2006. LINQ to SQL and Entity Framework both support
LINQ expressions. The problem is that more and more .NET devs are shunning it
when working their relational database. There is an adage that if you're
working with SQL then you should just be writing directly in SQL. Everything
else is second rate.

~~~
JonoW
As long as an ORM allows you to break out and right custom optimised SQL when
needed, then I see no reason to write _all_ your SQL by hand. ORMs are great
for covering basic, repetitive queries, but there are times where it's not
producing what you need. Of course you then loose the portability feature of
ORMs, but I think this is acceptable. The biggest advantage for me isn't
generation of SQL, it's mapping data back into your domain model/DTOs.

------
aleyan
From the sample on the front page

    
    
      SELECT "c"."id"
      FROM "Customer" "c"
        LEFT JOIN "Order" "order-1"
          ON "c"."id" = "order-1"."customer"
      GROUP BY "c"."id"
      HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
    

is translated into

    
    
      select(c for c in Customer
               if sum(c.orders.price) > 1000)
    

How does the python version handle nulls? They are explicitly coalesced into
0s in SQL. In python sum([1,numpy.NaN]) evaluates to nan. I think this code
will do something wrong.

I am not sure what flavor of SQL is beind used, but why are the table names
and the field names surrounded by double quotes? They make the query look
unnecesierily hairy.

Good job otherwise. Always glad to see people contributing to the community.

~~~
kozlovsky
No, it is the other way, Python is translated to SQL, not vice versa ;)

According to SQL standard, aggregate functions must skip NULL values. The
problem is with queries which return no rows. In this case, intuitive expected
result of SUM aggregate function is zero, but, because of some quirk in SQL
standard, SUM must return NULL if no rows were found.

Without coalesce function, result of SUM will be returned as None if no rows
were found. Coalesce replaces this surprising None with zero.

~~~
aleyan
Duh, of course. This is what happens when I don't read the explanation and
just grab on to the first pieces of code I see.

Thanks.

------
tom_usher
I've enjoyed using Peewee[1] as a lightweight ORM for Python - mainly for
scripting data manipulation on legacy databases but it does the job nicely.
How does Pony ORM compare?

[1] <https://github.com/coleifer/peewee>

~~~
kozlovsky
I have not used Peewee, but at first sight the most visible difference is the
query syntax - Peewee uses method chaining, whereas Pony ORM decompiles Python
AST. I think that many queries can look simpler in Pony than in Pewee.

For example, if I want to retrieve all pairs of different users with the same
name, in Pony this query would look as follow:

    
    
        select((u1, u2) for u1 in User for u2 in User if u1.name == u2.name and u1 != u2)
    

In Peewee, this query probably would look something like this (didn't test
it):

    
    
        User2 = User.alias()
        User.select(User, User2).join(User2).where((User.name == User2.name) & (User.id != User2.id))
    

It’s probable the matter of taste, but I like Pony syntax better. Another
example is taken from the Peewee doc:

    
    
        staff_users = User.select().where(is_staff=True)
        Tweet.select().where(~(Tweet.user << staff_users))
    

In Pony, this query would look as follow:

    
    
        select(t for t in Tweet if t.user not in (u for u in User if u.is_staff))
        

Also, this simple version is possible, don't know why it is not as simple in
Peewee

    
    
        select(t for t in tweet if not t.user.is_staff)
    

Also, I don’t know if Peewee supports IdentityMap and optimistics
transactions.

On there other side, it seems Peewee already has migration support, whereas in
Pony migrations are not implemented yet, this is the next task.

------
agentgt
Its so much easier to do this in something like Python, Scala, Ruby... or just
about any language other than Java. I built my own ORM (but uses Immutable
objects) in Java that uses a fluent DSL. Is it good... no it sucks compared to
what Scala and even .NET offer.

But then I thought screw that... SQL is a pretty good language so why not just
make writing real SQL easier:
[https://github.com/agentgt/jirm/tree/master/jirm-core#sql-
pl...](https://github.com/agentgt/jirm/tree/master/jirm-core#sql-placeholder-
parser)

------
cjauvin
I have solved a similar problem in a similar way
(<https://github.com/cjauvin/little_pger>) by creating a very thin wrapper
just above psycopg2, allowing to assemble SQL queries with kwargs and plain
data structures.

------
ludwigvan
A little off topic, but the visual editor reminded me of
<http://dbpatterns.com/> <https://github.com/fatiherikli/dbpatterns> Maybe you
could join forces.

~~~
amalashkevich
Thank you, will check this out.

------
csears
It looks very cool, and as others said, it immediately reminded me of LINQ.
But I wonder if there is too much magic going on for Pony to ever become
widely adopted by the Python community. Regardless, nice work!

------
tnuc
What is the purpose of this?

If you set up the an SQL Alchemy properly you can achieve the same thing.

If you wish complain about django then yes this sort of fixes it but it may
break more than it fixes.

SQL Alchemy is free and better.

~~~
kozlovsky
SQL Alchemy is fundamentally based on a relational model and translates
queries to SQL. Pony ORM is based on an entity-relationship model, which is
logically independent from SQL. Currently Pony translates queries to SQL only,
but in the future noSQL support is also planned.

------
coolsunglasses
This looks really nice. Korma serves as an interesting halfway point for those
that use Clojure.

<http://sqlkorma.com/>

------
Siecje
What intrigues me is the Diagram Editor. Why else would someone use this over
SQLAlchmey, Storm or even Django's ORM?

------
swah
Ok, you use this and now you can't even use solutions from Bill Karwin book...

------
daGrevis
This seems very close to what PHP does. The difference is that PHP doesn't
have so many ways to express itself so it's all done with method-chaining.

In my opinion, this is the way to write the core of an ORM and it should allow
to use this abstraction layer too! It's somewhat between SQL queries as
strings and ORM-level, I think.

~~~
kijin
PHP doesn't have a built-in ORM. Which framework/library are you talking
about?

~~~
daGrevis
Not anything super specific. Kohana, for example, has ORM that uses that-alike
way to built queries. And, as far as I know, it's ORM is built on top of that.
I think that ORMs should have something like that in them instead of SQL
queries as strings.

~~~
est
I think the selling point of Pony ORM is that you can write variable and
literals instead of string.

Django:

    
    
        Item.objects.filter(price__gt=3)
    

Pony ORM:

    
    
        select(o for o in Item if o.price>3)
    

Peewee:

    
    
        Item.select().where(Item.price>3)
    

Kohana:

    
    
        $Item->where('price', '>', 3)->find();
    

You see, the __gt or '>' here is kind of stupid.

