
Customizing class creation in Python - joeyespo
https://snarky.ca/customizing-class-creation-in-python/
======
andy_ppp
I don't see why you'd need this sort of indirection. It's a very strange way
of hiding behaviour that people currently understand clearly. Should
metaclasses really be able to magic up decorators out of thin air, and why is
importing a decorator and using it explicitly bad?

As for __init_subclass__(), it literally allows you to mess with things when
subclassing objects in totally unexpected ways!

"Adjusting how classes are created can be very difficult to debug and so
should only be used when you have a really legitimate use-case."

I'd say never use it - if anyone has a good reason to do magic like this I'd
like to see it!

~~~
joshuamorton
Take a look at SQLAlchemy/django's ORMs.

Personally I have two uses. One is in proprietary code, but the idea is
basically to use Python to generate JSON while composing the JSON pieces
through natural Python flows. (Metaclasses specifically allowed me to reverse
the flow of creating objects, so that visually, code wouldn't look like

    
    
        inner1 = JsonObject(config)
        inner2 = JsonObject(config)
        outer = JsonObject(config, children=[inner1, inner2])
    

but instead

    
    
        class outer(JsonObject):
            inner1 = JsonObject(config)
            inner2 = JsonObject(config)
            _config = config
    

ish. From this alone it's not clear why this is important, but with highly
nested structures, having the Python code visually mirror the output in terms
of nesting is really nice.
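A minimal sketch of one way that declarative form could be wired up. This is not the proprietary code; `JsonObject`, `Declarative`, and the `_config` convention are all assumptions made up for illustration, with a metaclass that sweeps the class body for child objects:

```python
class JsonObject:
    """A plain node holding a config and a list of child nodes."""
    def __init__(self, config=None, children=None):
        self.config = config
        self.children = children or []

class JsonMeta(type):
    def __new__(mcls, name, bases, namespace):
        # Collect every JsonObject declared in the class body, in
        # definition order, and turn the class into a parent node.
        children = [v for v in namespace.values() if isinstance(v, JsonObject)]
        cls = super().__new__(mcls, name, bases, namespace)
        cls.as_object = JsonObject(namespace.get('_config'), children)
        return cls

class Declarative(metaclass=JsonMeta):
    pass

# The class body now visually mirrors the nesting of the output:
class Outer(Declarative):
    _config = {"id": 1}
    inner1 = JsonObject({"id": 2})
    inner2 = JsonObject({"id": 3})
```

Here `Outer.as_object` is a `JsonObject` whose `children` are `inner1` and `inner2`, so serializing it top-down reproduces the nesting you see in the source.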

The second is at
[https://github.com/gtagency/pyrostest/blob/master/pyrostest/...](https://github.com/gtagency/pyrostest/blob/master/pyrostest/rostest_utils.py).
This isn't strictly necessary, but "The documentation states that all test_*
methods in a RosTest subclass spin up roscore" is a lot simpler and safer than
"if you forget to add `RosTest.setUp()` and `RosTest.tearDown()` at the
beginning and end of all of your setUp and tearDown methods, your tests will
fail (and further, future invocations of tests may fail until you run
something like `killall -9 roscore && killall -9 rosmaster` in a shell,
because you've unintentionally broken your environment)."

Generally speaking, metaclasses allow you to build better user interfaces for
developers and avoid the repetitive patterns you encounter in "enterprisey"
code. Specifically, they let you make interfaces much more declarative than
you otherwise could (i.e. the SQLAlchemy example of "this is a table and here
are its columns" vs. "my table is a function that takes in some stuff and
magic happens within it").

~~~
andy_ppp
Personally I find a language with macros easier to understand and far more
powerful for such things, and it's not a load of special cases that are still
being added to (in this case __init_subclass__(), added in 3.6).

Fine about metaclasses for some things, but you haven't covered why you'd use
the two examples in the article.

~~~
joshuamorton
I actually wanted __prepare__() in production, because it would have allowed
me to override the class dict with an OrderedDict, which was preferable to
just sorting the components as a way of making output JSONs deterministic,
but you do what you can.

As for __init_subclass__(), if you're alright with a DSL that does weird
things with the class line, it can be useful. As an example applied to
SQLAlchemy, they could do this (in pseudo-SQLA):

    
    
        class MyTable(Table, primary_key='uid'):
            uid = Column(int)
    

which would enforce via the API that there can only be one primary key.
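A sketch of how that class line could actually work with `__init_subclass__`. The `Table`, `Column`, and `primary_key` names are made up to match the pseudo-SQLA above, not real SQLAlchemy API:

```python
class Column:
    def __init__(self, type_):
        self.type_ = type_

class Table:
    def __init_subclass__(cls, primary_key=None, **kwargs):
        # Keyword arguments on the class line arrive here (Python 3.6+).
        super().__init_subclass__(**kwargs)
        if primary_key is not None and primary_key not in vars(cls):
            raise TypeError("unknown primary key column: %r" % primary_key)
        cls._primary_key = primary_key

class MyTable(Table, primary_key='uid'):
    uid = Column(int)
```

Because `primary_key` is a single keyword argument validated at class-definition time, the API itself makes it impossible to declare two primary keys, and a typo fails loudly when the class is defined rather than at query time.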

~~~
andy_ppp
I don't actually understand either example which kinda proves my point ;-)

~~~
joshuamorton
How do you mean?

So for __prepare__(), imagine you have a class whose entire job is to be
serialized to some external format. I'll use JSON, but it could be CSV, YAML,
whatever. Heck, maybe you're doing code generation and it's a group of
functions that will be placed into another file.

Say I have an instance of this class:

    
    
        class JsonObject(Serializable):
            key = "value"
            second_key = "second_value"
    

This gets serialized to

    
    
        {
            JsonObject: 
            {
                key: "value",
                second_key: "second_value"
            }
        }
    

Except that sometimes what you get is

    
    
        {
            JsonObject: 
            {
                second_key: "second_value",
                key: "value"
            }
        }
    

Minor, but important, difference. Python stores object attributes in a dict,
and Python's dicts are unordered, so when serializing, the order in which
things are printed is undefined. That means that instead of just using a
normal diffing tool, you need to write some JSON differ that parses and
compares the JSON, and you lose the ability to do side-by-side comparisons.
So you want deterministic, ordered output generation.

Now, to be clear, you could model your api like this:

    
    
        output = JsonObject()
        output.append(key="value")
        output.append(second_key="second_value")
    

And that works well for this simple example, but as soon as you start nesting
things, it gets confusing, so just assume that for reasons you want this DSL
for code generation.

You have 3 options:

1. Create some determinism: your serialize function looks something like this
(pseudo-Python):

    
    
        def serialize(self):
            for k, v in self.attrs.items():
                write(jsonify(k, v))
    

A fix is really easy:

    
    
        def serialize(self):
            for k in sorted(self.attrs):
                write(jsonify(k, self.attrs[k]))
    

Not bad, but there are a few problems: you can't customize the output order,
everything now needs to be comparable, and it's a smidge slower, especially
for really big objects (remember: you're writing a DSL for generating large
serializable things, so there's a good chance you'll want some way to
autogenerate large quantities of data to be serialized).

2. Add an `_order` attribute to your class; then your serialize method
becomes

    
    
        def serialize(self):
            for k in self._order:
                write(jsonify(k, self.attrs[k]))
    

Well, now you have to forward-declare everything, which is kinda annoying;
you're populating your namespace with crap (what if your generated
json/python/whatever needs an `_order` attribute!); and if you ever forget to
update your order attribute, your stuff doesn't work right.

3. Replace your class's dict with an OrderedDict. Now, you've done some dark
magic to do this, but you don't need to forward-declare, your users control
the output order naturally in a way they expect, and you don't have to sort a
bunch of things every time you want to serialize any data. (Admittedly, I
think Python 3.6 voids this issue by making the class dict an OrderedDict
anyway, but that's technically an implementation detail.)
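For completeness, here's a sketch of option 3 using `__prepare__`. The `Serializable` base and `serialize` method are assumptions matching the running example, not an actual library; on Python 3.6+ the class namespace is already ordered, so this mainly matters on older versions:

```python
from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes against this mapping instead of a
        # plain dict, so definition order is captured.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Remember definition order, skipping dunder machinery.
        cls._order = [k for k in namespace if not k.startswith('__')]
        return cls

class Serializable(metaclass=OrderedMeta):
    @classmethod
    def serialize(cls):
        # Emit attributes in the order they were written in the class body.
        return {k: getattr(cls, k) for k in cls._order}

class JsonObject(Serializable):
    key = "value"
    second_key = "second_value"
```

`JsonObject.serialize()` always yields `key` before `second_key`, with no forward declarations and no sorting pass, which is exactly the trade-off described above.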

Does that make sense?

------
luhn
> aside: please don't abuse collections.namedtuple to make a simple Python
> object; the class is meant to help porting APIs that return a tuple to a
> more object-oriented one, so starting with namedtuple means you end up
> leaking a tuple API that you probably didn't want to begin with

I haven't heard this before. I subclass namedtuples regularly—whenever I need
a simple immutable object. I haven't had any problems personally. (Except when
pickling them. I would not recommend that.)

Can any experienced Pythonistas weigh in? Is this bad practice?

~~~
joshuamorton
You'll get surprises such as your subclass being indexable when you (probably)
don't want that.

~~~
lgas
What problem would that cause? Wouldn't it only be a problem if you attempted
to index into it, which you wouldn't do (on purpose) if you didn't think it
was indexable? And if you did try to index into it thinking that it wasn't
indexable, it would be a bug, which is the same as when it's indexable...

~~~
joshuamorton
Exactly. You probably don't want this thing that is essentially a namespace
to be indexable, but you index into it anyway, which is a bug; yet it quacks
like a tuple, so your mistake propagates silently, which is a _bad thing_.

It's more permissive than one would expect. Although the bigger issue is what
dragonne mentioned.
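A quick illustration of how the tuple API leaks through a namedtuple subclass; the `Point` class here is a hypothetical example:

```python
from collections import namedtuple

class Point(namedtuple('Point', ['x', 'y'])):
    """Intended as a simple immutable namespace."""
    pass

p = Point(x=1, y=2)

# All of these work silently, even if you only ever meant p.x and p.y:
p[0]         # indexing succeeds instead of raising
p == (1, 2)  # compares equal to a bare tuple
x, y = p     # unpacks like any tuple
```

None of these raise, so code that indexes or unpacks the object by mistake keeps working until the field order changes out from under it.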

------
carapace
> Adjusting how classes are created can be very difficult to debug and so
> should only be used when you have a really legitimate use-case.

...and you never have a really legitimate use-case. Seriously, please don't
use this sort of magic in production code that you expect other people to use
and depend on. I'm looking at _you_, Django project. Bloody cowboys.

