
Python 3.7: Introducing Data Classes - ingve
https://blog.jetbrains.com/pycharm/2018/04/python-37-introducing-data-class/
======
reaperhulk
As noted in the PEP, data classes are a less fully featured stdlib
implementation of what attrs already provides. Unless you’re constrained to
the stdlib (as those who write CPython itself of course are), you should
consider taking a look at attrs first.

[http://www.attrs.org/en/stable/](http://www.attrs.org/en/stable/)

~~~
adamc
Does attrs support type hints? I didn't see it in a quick skim...

One thing the stdlib implementation has going for it: better naming. attr.ib()
is not exactly crystal-clear.

~~~
nicwolff
You can `from attr import attrib, attrs` and use those instead of `attr.ib()`
and `@attr.s`.

------
metalliqaz
Raymond Hettinger had a pretty good presentation on Data Classes and how they
relate to things like named tuples and a few recipes/patterns. It was linked
on Reddit[0] but it looks like the video has been removed from YouTube. His
slides are online[1], though.

[0]
[https://www.reddit.com/r/Python/comments/7tnbny/raymond_hett...](https://www.reddit.com/r/Python/comments/7tnbny/raymond_hettinger_python_37s_new_data_classes/)

[1]
[https://twitter.com/i/web/status/959358630377091072](https://twitter.com/i/web/status/959358630377091072)

------
tommikaikkonen
I love using attrs, like the idea of bringing something similar to the
standard library, but strongly disagree with the dataclasses API. It treats
untyped Python as a second class citizen.

This is what I'd prefer

    
    
      from dataclasses import dataclass, field
    
      @dataclass
      class MyClass:
        x = field()
    

but it produces an error because fields need to be declared with a type
annotation. This is the GvR recommended way to get around it:

    
    
      @dataclass
      class MyClass:
        x: object
    

You could use the typing.Any type instead of object, but then you need to
import a whole typing library to use untyped dataclasses. I highly prefer the
former code block.
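
A minimal sketch of the two behaviours described above, assuming Python 3.7+ (or the backport). The class name `Untyped` and the variable names are illustrative only. A `field()` without an annotation is rejected when the decorator runs, while annotating with `object` is accepted and nothing is checked at runtime.

```python
from dataclasses import dataclass, field, fields

# A field() with no annotation: the decorator refuses to build the class.
try:
    @dataclass
    class Untyped:
        x = field()
except TypeError as exc:
    error = str(exc)
else:
    error = None

# The workaround: annotate with `object` (no typing import needed).
@dataclass
class MyClass:
    x: object

instance = MyClass("anything")
field_names = [f.name for f in fields(MyClass)]

print(error)
print(instance, field_names)
```

The annotation here only marks `x` as a field; it does not restrict what can be stored in it.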

There's a big thread discussing the issue on python-dev somewhere. Also some
discussion in
[https://github.com/ericvsmith/dataclasses/issues/2#issuecomm...](https://github.com/ericvsmith/dataclasses/issues/2#issuecomment-353903594)

Anyway, it's not a huge issue—attrs is great and there's no reason not to use
it instead for untyped Python.

~~~
sleavey
Yeah, it seems strange to force people to use type hints when it has had such
a mixed reception. I really tried to use type hints with a new project a few
months ago, but ended up stripping it all out again because it's just so damn
ugly. I wish it were possible to _fully_ define type hints in a separate file
for linters, and not mix it in with production code. It's kind of possible to
do it, but not fully [1], and mixing type hints inline and in separate files
is in my opinion even worse than one or the other.

[1] [https://stackoverflow.com/questions/47350570/is-it-possible-...](https://stackoverflow.com/questions/47350570/is-it-possible-to-separate-all-type-hinting-checking-infrastructure-in-python)

~~~
rezistik
I've always wanted a programming UI similar to RapGenius's UI. With
annotations and docs being opened in a form panel.

------
Rotareti
It's great that we have simple/clean declarations for NamedTuples and
(Data)classes now. But I wonder why they chose two different styles for
creating them. This for NamedTuples:

    
    
        from typing import NamedTuple
    
        class Foo(NamedTuple):
            bar: str
            baz: int
    

and this for DataClasses:

    
    
        @dataclass
        class Foo:
            bar: str
            baz: int

~~~
metalliqaz
When you write it that way it makes me wonder why there isn't a DataClass type

~~~
joshuamorton
The short answer is that the only way to do what dataclasses do as a base
class is via python metaclasses, and you can only have one metaclass. So this
way, you can dataclassify something that inherits from a metaclass.
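
As a rough illustration of that point (a sketch, not taken from the PEP): `abc.ABC` already brings its own metaclass, and the decorator can still be applied to a subclass without any conflict. `Shape` and `Rect` are made-up names.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Shape(ABC):  # ABC supplies a metaclass (ABCMeta)
    @abstractmethod
    def area(self) -> float:
        ...


@dataclass  # a decorator, so the existing metaclass is left untouched
class Rect(Shape):
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height


r = Rect(2.0, 3.0)
metaclass_name = type(Rect).__name__

print(r, r.area(), metaclass_name)
```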

------
aserafini
I love that Peter Norvig left an improvement to the __post_init__ method in
the comments section of the JetBrains blog. I wonder if he uses PyCharm?

------
weberc2
I'm happy to see data classes. I think something like this exists in 3.6:

    
    
        class Person(typing.NamedTuple):
            name: str
            age: int
    

But I don't think it supports the __post_init__; however, constructors have no
business doing parsing like this anyway, so unless I'm missing something,
deriving from `typing.NamedTuple` seems strictly better than `@dataclass`
insofar as it seems less likely to be abused.

~~~
mixmastamyk
Tuples are read-only.
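
A small sketch of the difference, with hypothetical class names `PersonTuple` and `PersonData`:

```python
from dataclasses import dataclass
from typing import NamedTuple


class PersonTuple(NamedTuple):
    name: str
    age: int


@dataclass
class PersonData:
    name: str
    age: int


t = PersonTuple("Ada", 36)
d = PersonData("Ada", 36)

d.age = 37  # dataclass instances are mutable by default

try:
    t.age = 37  # NamedTuple fields cannot be reassigned
except AttributeError:
    tuple_is_read_only = True
else:
    tuple_is_read_only = False

older = t._replace(age=37)  # builds a new tuple instead of mutating
```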

~~~
weberc2
Ah, of course. Good point. I tend to write things in an immutable style, so I
don't usually pay attention to this.

~~~
silviogutierrez
I see immutability as a feature.

There _are_ use cases for mutability, but it should be opt-in, not opt-out.
So I'm not loving the fact that "frozen" (the mutability param for data
classes) defaults to False.
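
For reference, opting in looks roughly like this (a sketch; `Point` is just an example name):

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)

try:
    p.x = 10  # assignment is blocked on frozen instances
except FrozenInstanceError:
    rejected = True
else:
    rejected = False

# With the default eq=True, frozen instances are also hashable.
lookup = {p: "some label"}
```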

~~~
icebraining
Python is very much not immutable by default, it'd be weird to subvert it in a
specific case.

------
std_throwaway
Coming from C++ it feels really weird that you can simply assign
instance.new_name = value from anywhere without properly declaring it
beforehand. You also never really know what you get or if somebody modified
your instance members from the outside.

~~~
BerislavLopac
I can only imagine how weird it must seem that you can override methods of
instance objects and even classes, or even replace a whole class of an
instance with another.

    
    
        >>> class Foo:
        ...     def bar(self):
        ...         print('foo')
        ... 
        >>> class Baz:
        ...     def bar(self):
        ...         print('baz')
        ... 
        >>> f = Foo()
        >>> f
        <__main__.Foo object at 0x7fa311e7a278>
        >>> f.bar()
        foo
        >>> f.__class__ = Baz
        >>> f
        <__main__.Baz object at 0x7fa311e7a278>
        >>> f.bar()
        baz

~~~
alkonaut
Does that work even if the types had fields? What about if the fields had a
different total size? What if Baz had no parameterless constructor (i.e. only
had a constructor that guaranteed arg > 0 for example)?

Is this like an unsafe pointer cast where “you are responsible, and it will
likely blow up spectacularly if you don’t know what you are doing” or is it
something safer that will magically work e.g with types of different size?

~~~
existencebox
Inline:

- Does that work even if the types had fields? Yup!

- What about if the fields had a different total size? Totally fine!

- What if Baz had no parameterless constructor (i.e. only had a constructor
that guaranteed arg > 0 for example)? Then you throw an exception when you
call the constructor.

- Is this like an unsafe pointer cast where “you are responsible, and it will
likely blow up spectacularly if you don’t know what you are doing” or is it
something safer that will magically work e.g. with types of different size?

Mostly the former, but if you're coming from a strongly typed compiled
language, it may feel like a bit of the latter too, since if you don't run
into any obvious runtime incompatibilities, it'll all "just seem to work" even
if the underlying classes are 200% different.

(Disclaimer, I've been using python for ~decade now but am still always
nervous to speak authoritatively about it, since I work with peers who are FAR
deeper in the actual implementation than I am, and I run the risk of being
subtly incorrect.)
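
A quick sketch of the "fields" case, assuming plain classes without `__slots__` on CPython. `Foo` and `Baz` here are stand-ins, not the classes from the earlier session. Instance attributes live on the object itself, so they survive the swap, and trouble only shows up when the new class's code expects something that was never set.

```python
class Foo:
    def __init__(self):
        self.a = 1


class Baz:
    def __init__(self, n):
        if n <= 0:
            raise ValueError("n must be positive")
        self.n = n

    def describe(self):
        return "Baz with n={}".format(self.n)


f = Foo()
f.__class__ = Baz  # plain attribute assignment; Baz.__init__ does not run

kept = dict(f.__dict__)  # the instance still carries Foo's attribute

try:
    f.describe()  # Baz's code expects self.n, which was never set
except AttributeError:
    blew_up_later = True
else:
    blew_up_later = False
```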

~~~
alkonaut
> Then you throw an exception when you call the constructor.

I was assuming that doing

    
    
       myobj.__class__ = whatever
    

Would not call any constructors?

~~~
existencebox
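You're correct, sorry, I was not clear. The actual value setting that you cite
never calls a constructor and will work pretty much regardless, as long as the
new value is a class with a compatible instance layout (CPython rejects
something like __class__ = "Q" with a TypeError).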

The problems would come later when you try to use any functionality of the new
__class__ in the manner of the old __class__ for which the new one is not
compatible. e.g.:

    
    
        class A:
            def go(self):
                print("A ran!")
    
        class B:
            def go(self):
                print ("B ran!")
    
        class C:
            def go(self, foo):
                print ("C ran!", foo)
    
        a = A()
        a.__class__ = B
        a.go()
    
        B ran!
        In [11]:
    
        a = A()
        a.__class__ = C
        a.go()
    
        ---------------------------------------------------------------------------
        TypeError                                 Traceback (most recent call last)
        <ipython-input-11-b67fe8fb94e1> in <module>()
              1 a = A()
              2 a.__class__ = C
        ----> 3 a.go()
    
        TypeError: go() missing 1 required positional argument: 'foo'

~~~
alkonaut
Thanks. The constructor thing can be illustrated similarly (found an online
compiler, I don't normally python...).

    
    
        class A:
            i = -1
            def __init__(self, num): 
                if num <= 0:
                    raise(Exception('noo!'))
            def go(self):
                print('A has value {}'.format(self.i))
    
        class B:
            def go(self):
                print ("B ran!")
    
        b = B()
        b.go()
    
        b.__class__ = A
        b.go()

------
foxhop
I have to be honest, coming from Python 2.3 (2004ish), I don't recognize "new"
Python anymore. I think it's mostly regarding type definitions.

~~~
gdwatson
It's relatively recent. IMHO Python 3.5 to 3.7 feel like the language is going
in a different direction than it did before -- type hints and the handling of
asynchrony in particular.

~~~
Rotareti
I've been using a lot of Python/JS/TypeScript in the last couple of years and it
seems like each new release brings them closer together.

~~~
patrickxie
which IDE do you use these 3 with?

~~~
Rotareti
VSCode

------
jernfrost
Looks like how a lot of languages already work out of the box. E.g. whenever I
create a data type in Julia I automatically have such a constructor.

Static languages such as Go and C already essentially let you do this through
initialization through braces.

------
Osmium
Link to the PEP:
[https://www.python.org/dev/peps/pep-0557/](https://www.python.org/dev/peps/pep-0557/)

------
fleetfox
While 3.7 is not here, back-port:
[https://github.com/ericvsmith/dataclasses](https://github.com/ericvsmith/dataclasses)

~~~
mixmastamyk
Also on PyPI.

------
kuon
I think it's because I'm working with elm right now, but this kind of thing
scares me:

    
    
        created: datetime
    

But

    
    
       if type(self.created) is str:
           self.created = dateutil.parser.parse(self.created)
    

So basically, the type annotation cannot be trusted.

~~~
bunderbunder
In Python, type annotations are (expressly) just a special type of comment
that's been given a regularized format for the sake of both human and computer
readability.
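
A minimal sketch of what that means at runtime (the `Article` class is made up for illustration): the annotation is recorded on the class, but nothing checks the value that is actually passed in.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Article:
    created: datetime


a = Article(created="2018-04-10")  # a str, despite the annotation

stored_type = type(a.created).__name__
print(stored_type)
print(Article.__annotations__)
```

A static checker such as mypy would be expected to flag the call, but the interpreter itself runs it without complaint.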

I don't know that it should scare you, but, like with many Python features,
it's something that should be approached in a sensible manner. The Python
philosophy is to leave you with the freedom to monkey around, and leave it to
you to decide whether you want to abuse that ability.

(Look at me, all talking like someone who didn't spend 2 hours diagnosing a
type error that a static language would have found and forced me to fix within
half a second. :P )

------
david-cako
This __post_init__ thing is shit IMO. Why does it make sense for the language
feature to allow you to completely disregard the type hint and fix it later?

I have to imagine that the type checkers would factor in anything that occurs
in __post_init__ when evaluating whether the class conforms to the type hints,
but it still feels like this python static typing stuff is drifting in the
wrong direction.

~~~
cpburns2009
I don't like `__post_init__` either. Wouldn't it make more sense to just
override `__init__` similar to:

    
    
        @dataclass
        class DataObj:
    
            some_date: datetime
    
            def __init__(self, **kw):
                kw['some_date'] = dateutil.parser.parse(kw['some_date'])
                super().__init__(**kw)

~~~
dragonwriter
To allow you to override init that way, @dataclass would have to create an
additional superclass with the generated init method, rather than generating
the init method in the class intended for use, which seems to have more
overhead.

~~~
ptx
You could attach the __init__ method to dataclass (as if it were a class) and
call it like this, couldn't you?

    
    
      dataclass.__init__(self, **kw)
    

Like in the olden days before super().

~~~
dragonwriter
From the description, the dataclass decorator _generates_ an __init__ method
for the decorated class, which seems different than using a common generic
__init__ method for all classes with the decorator.
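
This can be checked with a short sketch (class names `First` and `Second` are arbitrary): each decorated class ends up with its own `__init__` in its namespace, with a signature built from its fields.

```python
import inspect
from dataclasses import dataclass


@dataclass
class First:
    x: int


@dataclass
class Second:
    name: str
    size: int = 0


# Each class owns a separately generated __init__.
own_init = "__init__" in First.__dict__ and "__init__" in Second.__dict__
distinct = First.__init__ is not Second.__init__

# The generated signature mirrors the declared fields.
params = list(inspect.signature(Second.__init__).parameters)
print(params)
```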

------
breatheoften
Would be convenient if it also supported generating a to-dict method with
dunder hooks for customizing the translation ...
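
For what it's worth, the module does ship a `dataclasses.asdict()` helper. It is a function rather than a generated method, and as far as I know there is no dunder hook; the `dict_factory` argument is the customization point. A sketch with made-up classes `Author` and `Post` and a hypothetical `drop_title` factory:

```python
from dataclasses import asdict, dataclass


@dataclass
class Author:
    name: str


@dataclass
class Post:
    title: str
    author: Author


post = Post("Data Classes", Author("someone"))

plain = asdict(post)  # recurses into nested dataclasses


def drop_title(pairs):
    # dict_factory receives (field name, value) pairs for each dataclass.
    return {key: value for key, value in pairs if key != "title"}


trimmed = asdict(post, dict_factory=drop_title)
print(plain)
print(trimmed)
```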

------
czardoz
Looking forward to having ORMs support this way of defining models

------
axzak
It will be nice to have this but for data deserialization, which is the case
in the example, I would still use drf serializers or marshmallow schemas.

~~~
aldanor
Re: serialization, worth mentioning the cattrs project (works with attrs
classes, but would work the same way with stdlib dataclasses):

[https://github.com/Tinche/cattrs](https://github.com/Tinche/cattrs)

------
no_wizard
I just hope the `__post_init__` method catches on and becomes a regular python
dundermethod. I actually find it to be a very good quality in this new
implementation.

Example from the PEP 557:

    
    
      @dataclass
      class C:
           a: float
           b: float
           c: float = field(init=False)
    
           def __post_init__(self):
               self.c = self.a + self.b
    
    

[https://www.python.org/dev/peps/pep-0557/#post-init-processi...](https://www.python.org/dev/peps/pep-0557/#post-init-processing)

I could use this all over the place.

~~~
UncleEntity
Where would it be useful outside of an automagically generated __init__
method?

------
hartator
Is Python becoming more statically typed then?

~~~
jernfrost
Dynamic and static languages are fundamentally different. There is no more or
less.

I've tried to elaborate here why annotating variables with type doesn't make a
dynamic language static:

[https://medium.com/@Jernfrost/dynamically-typed-languages-ar...](https://medium.com/@Jernfrost/dynamically-typed-languages-are-not-what-you-think-ac8d1392b803)

You can't look at the presence of type information to determine if a language
is static or dynamic. What matters is when and how that type information is
used. In static languages expressions have types. In dynamic languages values
have types.

The implication is that you can't know the type of something in a dynamic
language until the expression has been evaluated at run time. With a static
language we can determine the type of every expression at compile time, which
requires full knowledge of the whole program. It is why dependencies become
much more problematic in static languages and why they are so poor as glue
languages.

~~~
staticassertion
> In static languages expressions have types. In dynamic languages values have
> types.

I don't think this is true. In a dynamic language all values, including
expressions, share a single type - a union of all possible types. In a static
language you can limit which possible types a value may hold. A dynamic
language is like if you cast every single type to Any in a static language.

In Python this is entirely possible with mypy - a static type system for the
Python language, which works through annotations.

> in static languages they are used to prevent the compilation of programs
> containing expressions where types don’t match up.

This is already the case with mypy in Python. It seems very much static. It's
like everything is statically determined to be Any, then you specify in
certain areas a more limited type.

In your case with Julia, the type annotations are static. Even if by default
Julia is dynamic, it allows for static annotations.

I disagree that there is no in-between. I think Julia is actually a great
example of an in-between. And, in fact, it has a name - gradual typing.

~~~
spiralx
> In a dynamic language all values, including expressions, share a single type
> - a union of all possible types. In a static language you can limit which
> possible types a value may hold. A dynamic language is like if you cast
> every single type to Any in a static language.

No, you're thinking of a weakly- versus strongly-typed language. Python is
dynamically and strongly typed - values have a definite type, but a variable
can be assigned a value of any type.
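
A tiny sketch of both halves of that claim (variable names are arbitrary):

```python
x = 1
first_type = type(x).__name__   # the value is an int

x = "one"                       # same name, now bound to a str
second_type = type(x).__name__

try:
    1 + "1"                     # no implicit int/str coercion
except TypeError:
    mixed_rejected = True
else:
    mixed_rejected = False

print(first_type, second_type, mixed_rejected)
```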

~~~
staticassertion
As RussianCow said, I think you're misreading my post. A Python variable, from
a type perspective, is a variable with a type Any/ a type that is a union of
all possible types.

Using annotations you can then specify the type.

Nothing to do with weak/ strong, which merely imply some level of implicit
casting.

------
SatvikBeri
Finally! Attrs is great, but I'm glad to see this in the standard library.

------
swalsh
Looks great, too bad my project will forever be in 2.7

~~~
theptip
Presumably (hopefully?) not in 2020, when it stops getting security fixes?

[https://legacy.python.org/dev/peps/pep-0373/](https://legacy.python.org/dev/peps/pep-0373/)

~~~
zenhack
It will probably be a fair bit longer than that before 2.7 stops being
relevant. 2.7 is still the default system Python on a lot of Linux distros,
which will be in vendor support for longer. It doesn't really matter if the
fixes are coming from the PSF or someone else - nice thing about FOSS.

RHEL/CentOS 6 is still reasonably widely used, and the system Python there is
2.6.

~~~
kstrauser
> 2.7 is still the default system Python on a lot of Linux distros

I will never understand why that matters. "ed" is the default system editor,
but I'm only "{apt-get,yum} install {vim,emacs}" away from having something I
actually want to use. That's the whole point of a distro. You don't have to
use Python 2.ancient just because /usr/bin/paleolithic is written with it.

~~~
jstarfish
Not all environments have unfettered access to the internet to download
whatever arbitrary packages the user decides they need that day. Sometimes
you're stuck with whatever the distro shipped with.

~~~
theptip
Most distros have been shipping Python3 for years, just not set as the target
for `/usr/bin/python`. You can run `/usr/bin/python3` currently, almost
anywhere.

And "software that has been gradually going EOL for the last 5 years" is not
"packages the user decides they need that day". I'd be very surprised if any
distros that ship with Python2 do not ship with Python3 in 2020, and I'd even
wager that most will default to python3 by then.

------
AlexCoventry
Reminds me a lot of the attrs module.

------
noobermin
I have used dictionaries for this for the longest time. The thing is I sort of
rely on the type being flexible, so what can I do?

~~~
kalessin
If you rely on your type being "flexible" I would argue that you have a design
problem.

~~~
noobermin
My working code beats your ideological purity.

~~~
kalessin
This was really coming from practical experience, but no offense taken.

~~~
noobermin
I asked for how this PEP could or could not be used for my use case, and your
response was "your use case is invalid" without knowing why I do the things I
do. No one should have to respond kindly to bullies.

~~~
weberc2
Calm down, no one is bullying you.

------
Kwpolska
attrs does a pretty good job, while also supporting older Python versions:
[http://www.attrs.org/en/stable/](http://www.attrs.org/en/stable/)

------
ianamartin
I'm piqued by one thing in this article. Using an object as a dictionary key.
What's the use for that? I don't think it's ever occurred to me to do that.

~~~
UncleEntity
Dynamic dispatching comes to mind.

    
    
      d = {Foo : do_something_with_foo,
           Bar : do_something_with_bar}
    
      d.get(type(x), default_function)(x)
    

Probably some more efficient way but honestly I've used this before instead of
writing a whole if elif chain.

--edit--

Makes python's lack of a switch operator sometimes less painful...
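
A self-contained version of the snippet above, for anyone who wants to run it. The handler bodies and the `dispatch` and `SubFoo` names are filler I made up:

```python
class Foo:
    pass


class Bar:
    pass


class SubFoo(Foo):
    pass


def do_something_with_foo(x):
    return "foo handler"


def do_something_with_bar(x):
    return "bar handler"


def default_function(x):
    return "default handler"


# Classes are hashable, so they can be used as dictionary keys.
d = {Foo: do_something_with_foo,
     Bar: do_something_with_bar}


def dispatch(x):
    return d.get(type(x), default_function)(x)


results = [dispatch(Foo()), dispatch(Bar()), dispatch(42), dispatch(SubFoo())]
print(results)
```

Note that the lookup is by exact type, so a subclass instance falls through to the default. `functools.singledispatch` in the standard library covers similar ground and does follow inheritance.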

~~~
ianamartin
Oh, that makes sense. Not sure why this question rubbed people the wrong way.
I was genuinely curious about when you would use an object as a key.

I guess my uses for dictionaries are pretty vanilla. But that’s a kind of
thing I do all the time. I set up dictionaries where the key is a possible
value from some operation or query, and the value is the function I want to
perform on that value.

This extends that concept. Thanks!

