

Why I Like ZODB - reinhardt
http://plope.com/Members/chrism/why_i_like_zodb

======
YZF
I'm sorry but I hate ZODB with a passion.

\- It does not scale.

\- Poor support for replication or sharding.

\- It is slow. Really slow. Un-pickle every time you get something from cache.

\- It is error prone. Forget a commit or forget to handle conflict errors and
you're in big trouble.

\- No interoperability. Want to write a service in C++ to access your db?
You're out of luck.

\- As your system grows you'll have conflicts all over the place.

\- Some server side stuff needs to have your objects, e.g. if you want to do
conflict resolution.

\- Migrations/change to schemas are painful, once you change your object
you're no longer going to be able to de-serialize it.

\- You have to roll your own if you want change notifications.

So if you're just looking to persist some Python objects in a small system,
great. Otherwise I'd stay away from this.

~~~
e12e
Note: I'm being brief, not trying to be snarky -- I'd love to hear what's
behind your statements.

> \- It does not scale.

What does this mean? I'm not saying you're wrong, but without some context,
that doesn't say a whole lot.

I recall one big consulting company dropping Plone (and zodb) some five years
ago because zodb/plone scaled to about 10,000 documents, and using a custom
index solution based on lucene they managed to scale to around 100,000
documents for their cms -- but they ended up needing something else for "their
biggest" clients. Can't find the link or remember the company now (I believe
it was a German design shop). But it's the only story I've heard about zodb
not scaling for its typical use case.

> \- Poor support for replication or sharding.

Are you aware that ZRS is now free and open source?

[http://www.zope.com/products/x1752814276/Zope-Replication-
Se...](http://www.zope.com/products/x1752814276/Zope-Replication-Services)

> \- It is slow. Really slow. Un-pickle every time you get something from
> cache.

You have to marshal structures that you load in a different way too -- is
this really something specific to zodb? Are you saying unpickle is slower than
other ways to marshal python objects?

> \- It is error prone. Forget a commit or forget to handle conflict errors
> and you're in big trouble.

As opposed to not handling transaction errors with a postgres backend?

> \- No interoperability. Want to write a service in C++ to access your db?
> You're out of luck.

Well, it's an object database. The only other one I can think of off the top
of my head that people are actually using is Gemstone. You could of course
wrap zodb in an XML/JSON API -- but yes, I don't think interop with other
languages is a good fit for zodb.

> \- Some server side stuff needs to have your objects, e.g. if you want to do
> conflict resolution.

> \- Migrations/change to schemas are painful, once you change your object
> you're no longer going to be able to de-serialize it.

This is a problem I'm constantly running into with Plone and a more or less
well understood set of third party add-ons. I really think the smalltalk image
approach is better (if you have the "data" you also have the "behaviour" \--
with zodb you might have a serialization of a complex class, but not the
ability to marshal it).
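
To make the schema-change complaint a bit more concrete, here is a minimal
sketch using plain `pickle` rather than ZODB itself. The `Document` class and
its `tags` attribute are hypothetical. It only covers the easy case (an
attribute added after data was stored); since pickle records the module and
class name, a renamed or moved class is a harder problem that this does not
address.

```python
import pickle


class Document:
    """Hypothetical class; imagine `tags` was added after data was stored."""

    def __init__(self, title, tags=None):
        self.title = title
        self.tags = list(tags or [])

    def __setstate__(self, state):
        # pickle calls this with the stored attribute dict instead of
        # running __init__, so defaults for newer attributes go here.
        state.setdefault("tags", [])
        self.__dict__.update(state)


# Fake a record written by the older class, which had no `tags`.
old = Document.__new__(Document)
old.__dict__["title"] = "Why I Like ZODB"
blob = pickle.dumps(old)

# Loading goes through __setstate__, which fills in the missing attribute.
doc = pickle.loads(blob)
print(doc.title, doc.tags)
```

Without the `__setstate__` hook the object would still load, but touching
`doc.tags` would raise `AttributeError`.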

~~~
YZF
I wasn't aware ZRS is now free/open source. I would need to look at what that
brings to the table but it's unlikely to change my views. I'll check it out
though and thanks for the heads up!

Ignoring ZRS: as your number of clients and transactions goes up you're still
bottlenecked on a single server. That's what I mean by "doesn't scale". For
various reasons (e.g. objects can refer to each other) you're basically stuck.
A scalable database provides various means of growing as your load grows, and
ZODB does not.

There are a few problems with pickling. First of all it is slow. Under some
assumptions there are faster ways of marshalling in Python. Secondly your
granularity of access is the entire object. You can't just get a certain field
out of a large object. Thirdly because objects in the client cache are pickled
you are spending a whole lot of time serializing/de-serializing them when you
don't really need to do that. In one of my applications that happens to
account for 80% of the execution time.
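
A rough illustration of the granularity point, using plain `pickle` as a
stand-in (this is not ZODB's cache code, and the `record` dict is made up):
to read one small field, the whole stored object has to be rebuilt first. The
timing is only indicative and will vary by machine.

```python
import pickle
import timeit

# A stand-in for one large stored object: a small field next to a big one.
record = {"title": "Why I Like ZODB", "body": "x" * 1_000_000}
blob = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)


def read_title(stored):
    # There is no way to pull a single field out of the pickle;
    # the entire record is deserialized before the lookup happens.
    return pickle.loads(stored)["title"]


title = read_title(blob)
seconds = timeit.timeit(lambda: read_title(blob), number=20)
print(f"{title!r}: {len(blob)} bytes unpickled per read, {seconds / 20:.6f}s each")
```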

I haven't spent a lot of time with SQLAlchemy but I think an ORM that maps
well to some performant database is a better approach in Python.

~~~
mcdonc
It's best to keep object records small when using ZODB. This does mean you
need to do some planning about object structure. Objects that inherit from
"persistent.Persistent" are kept as separate records in ZODB, so you can break
up a large object into several smaller ones by attaching persistent attributes
to another object. If you just make a big structure out of nonpersistent
objects, ZODB will have all the downsides of plain old pickle (like slow
loading time for large objects) indeed, but its entire purpose is to allow you
to not do this.
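
The idea of separate records can be sketched with nothing but the standard
library. This is a toy model, not ZODB's API: `Node`, `Ref` and `ToyStore` are
hypothetical names, and `Node` merely plays the role of a class inheriting
from "persistent.Persistent". It relies on pickle's `persistent_id` /
`persistent_load` hooks so that a reference to another `Node` is stored as an
id instead of being inlined.

```python
import io
import pickle


class Node:
    """Hypothetical stand-in for a persistent class."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)


class Ref:
    """Placeholder for a Node whose record has not been loaded yet."""

    def __init__(self, store, oid):
        self.store, self.oid = store, oid

    def get(self):
        return self.store.load(self.oid)


class _SplitPickler(pickle.Pickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store

    def persistent_id(self, obj):
        # Nodes become references to their own record; everything else
        # is pickled inline as usual.
        return self.store.save(obj) if isinstance(obj, Node) else None


class _RefUnpickler(pickle.Unpickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store

    def persistent_load(self, pid):
        return Ref(self.store, pid)


class ToyStore:
    def __init__(self):
        self.records = {}  # oid -> pickled attribute dict
        self._oids = {}    # id(node) -> oid
        self._keep = []    # keeps nodes alive so id() values stay unique

    def save(self, node):
        if id(node) in self._oids:  # already stored (also breaks cycles)
            return self._oids[id(node)]
        oid = len(self._oids)
        self._oids[id(node)] = oid
        self._keep.append(node)
        buf = io.BytesIO()
        _SplitPickler(buf, self).dump(vars(node))
        self.records[oid] = buf.getvalue()
        return oid

    def load(self, oid):
        state = _RefUnpickler(io.BytesIO(self.records[oid]), self).load()
        node = Node.__new__(Node)
        node.__dict__.update(state)
        return node


store = ToyStore()
doc = Node(title="Why I Like ZODB", body=Node(text="x" * 1_000_000))
root_oid = store.save(doc)

loaded = store.load(root_oid)            # reads only the small record
print(loaded.title, len(store.records[root_oid]))
print(len(loaded.body.get().text))       # the big record is read on demand
```

Loading the root touches only its own small record; the large `body` record is
read when it is actually asked for. If both were plain nonpersistent objects
they would end up in one big pickle.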

------
zopyx01
The ZODB is, and always was, a Python pickle store first and foremost. The
question of what makes a "database" can be answered in different ways. In the
Zope world, functionality like indexes and query languages is application-level
functionality built on top of the ZODB. The ZODB has turned out to be a
database solution for many large-scale projects (up to several hundred GB).
There are options for sharding (mount points) and replication (ZRS,
RelStorage).

The ZODB is unlikely to be the solution you would use nowadays for "big data",
but keep in mind that the ZODB is already 15 years old and has served many
people in professional solutions for more than a decade. We know of many
businesses still using the ZODB in mission-critical applications. But yes, the
ZODB is an object store and not an RDBMS. It's a completely different beast,
and as always: use the right tool for each project. The ZODB was already
"NoSQL" by the end of the 90s. No need to hate the ZODB; it is just another
database option, and in some cases I would still use the ZODB today over
tinkered garbage database solutions like MongoDB with its braindead
replication and sharding model.

------
rhizome31
I had to work with ZODB on a couple of projects and overall it has been a
rather painful experience. The inability to run ad hoc queries and the need to
manage indexes and maintain consistency at the application level have been a
source of annoying bugs for me. Not worth it IMHO.

~~~
chrismorgan
One of the problems people may experience when trying out something like ZODB
is trying to use it as they would use a relational database management system
rather than as they would manage a normal object structure inside their
program. Things like list comprehensions and careful design of the structure
of the objects can improve things a lot, but I will agree that sometimes the
fact that you can't as easily run ad hoc queries is sad. Also very often you
simply won't _need_ to use an index, and for a very large range of problems
the graph nature of ZODB (compared with the tabular nature of the RDBMS, where
you'll need indices to avoid full-table scans) is liberating.

Certainly there are some types of problems which ZODB won't work well with,
but overall I quite enjoyed the experimental project I made early this year in
Pyramid with ZODB. Combine it with traversal-based URL generation and you get
very interesting results.

I think it's the sort of tool that I'd recommend people at least _try_ ; once
they've broadened their horizons, then they can go back to using their RDBMS
when they wish to. (In this it's just like Pyramid's traversal; many
developers have forgotten that pattern matching is not the only good way of
routing URLs.)

~~~
davidkhess
I've used Durus a lot (a simpler version of ZODB) and indexing was always the
biggest issue I ran into. Normal object associations don't need traditional
indexing but object lookup based on attribute values gets pretty painful
without some indexing support.
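
As a small sketch of what that looks like in practice (plain Python, with a
hypothetical `User` class and a `by_city` dict standing in for whatever index
structure an application would really use; ZODB applications typically reach
for the BTrees package or a catalog, which is not shown here):

```python
from collections import defaultdict


class User:
    def __init__(self, name, city):
        self.name = name
        self.city = city


users = [User("ann", "Oslo"), User("bob", "Lima"), User("cy", "Oslo")]

# Without an index: every lookup by attribute value is a full scan.
oslo_scan = [u.name for u in users if u.city == "Oslo"]

# With a hand-rolled index: attribute value -> list of objects.
by_city = defaultdict(list)
for u in users:
    by_city[u.city].append(u)
oslo_indexed = [u.name for u in by_city["Oslo"]]


def move(user, new_city):
    # The application has to keep the index consistent on every change;
    # forgetting this step is where the annoying bugs come from.
    by_city[user.city].remove(user)
    user.city = new_city
    by_city[new_city].append(user)


move(users[1], "Oslo")
print(oslo_scan, oslo_indexed, [u.name for u in by_city["Oslo"]])
```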

------
ryanobjc
ZODB is kind of neat, but how does it fare with non-python client libraries? I
note the phrase "you can store anything that is pickleable" so I am thinking
not?

~~~
chrismorgan
Correct, ZODB is exclusive to Python.

------
wslh
If you like ZODB (like me), you might also like DyBASE:
[http://www.garret.ru/dybase.html](http://www.garret.ru/dybase.html)

BTW I used ZODB to add persistence and acknowledgment to the Python Queue
class: [http://blog.databigbang.com/adding-acknowledgement-
semantics...](http://blog.databigbang.com/adding-acknowledgement-semantics-to-
a-persistent-queue/)

------
davidkhess
If you are interested in object-oriented databases for Python check out Durus:
[https://www.mems-exchange.org/software/DurusWorks/](https://www.mems-
exchange.org/software/DurusWorks/)

Its design is based on ZODB but with a number of simplifications and no
dependencies on Zope.

~~~
kilink
In what way does ZODB depend on Zope? It has a dependency on zope.interface,
which is pretty minimal, and seems to be widely used outside of the Zope
Community (e.g., Twisted).

~~~
davidkhess
Whoops. That was a presumption on my part.

------
alextingle
Urgh! Isn't Zope dead yet?

~~~
reinhardt
ZODB != Zope

