Hacker News
Guide to Data Classes in Python 3.7 (realpython.com)
92 points by dbader 7 months ago | 46 comments



Data Classes are a cool feature, useful, and a superior alternative to namedtuple.

But while this was brewing, attrs ate their lunch. Data Classes just don't go far enough compared to what attrs [1] has provided for a while now. If there's a compelling reason to switch, I haven't found it yet.

Can I use Data Classes in a library supporting Python 2.7, 3.4-3.7? I don't think the answer is yes today.

[1]: https://github.com/python-attrs/attrs


The standard library is not going to have more advanced features than what the community can do, if only because of the release cycle.

Meanwhile, I was not a user of attrs before; rather, I used collections.namedtuple and then typing.NamedTuple. So data classes are a win for me and for other people like me who weren't using attrs. And you can continue to use attrs without being bothered at all.

In Raymond Hettinger's talk [0] at PyCon 2018, he even says that more features from attrs will likely get added, such as data validation [1]. And that his talk was the one chance for the developer (Eric Smith) to get praise, because everyone is going to instantly second guess every decision he made.

[0] https://www.youtube.com/watch?v=T-TwcmT6Rcw

[1] see 42:36 on future directions
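Until something like that lands, validation can be hand-rolled with the existing `__post_init__` hook. A minimal sketch, assuming a made-up `Temperature` class and bounds check (neither comes from the talk):

```python
from dataclasses import dataclass


@dataclass
class Temperature:
    """Hypothetical example class, not taken from the talk."""
    kelvin: float

    def __post_init__(self):
        # __post_init__ is called at the end of the generated __init__,
        # so every field has already been assigned at this point.
        if self.kelvin < 0:
            raise ValueError(f"kelvin must be >= 0, got {self.kelvin}")
```

This only checks values at construction time; later assignments such as `t.kelvin = -5` are not re-validated.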


> Data Classes are a cool feature, useful, and a superior alternative to namedtuple.

I'm curious, how is it "superior" to NamedTuples?
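A rough sketch of two differences that usually come up, using made-up `Point`/`Size` classes; whether they make data classes "superior" depends on the use case:

```python
from dataclasses import dataclass
from typing import NamedTuple


class PointNT(NamedTuple):
    x: int
    y: int


class SizeNT(NamedTuple):
    width: int
    height: int


@dataclass
class PointDC:
    x: int
    y: int


@dataclass
class SizeDC:
    width: int
    height: int


# Named tuples are still tuples, so instances of unrelated types
# compare equal whenever their values match.
nt_equal = PointNT(1, 2) == SizeNT(1, 2)

# The generated data class __eq__ only compares instances of the same class.
dc_equal = PointDC(1, 2) == SizeDC(1, 2)

# Data class instances are mutable by default; named tuples are not.
p = PointDC(1, 2)
p.x = 10
```

Here `nt_equal` is true and `dc_equal` is false. If immutability is the goal, `@dataclass(frozen=True)` gives it back.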



I haven't watched it yet but I saw that Raymond Hettinger presented on data classes at PyCon last week [0].

[0]: https://www.youtube.com/watch?v=T-TwcmT6Rcw


I thought it was pretty good. He focused mostly on why you'd use this over typing.NamedTuple or hand-rolled code. Then he talks about a couple of limitations and where it'll go in the future.


Here are the slides for that presentation:

https://twitter.com/raymondh/status/995693882812915712


I am happy that lightweight data classes are now in the stdlib -- hopefully this puts an expiration date on overuse of namedtuple, which has turned out to be kind of a bad idea. Being able to make sane lightweight classes in one-off one-file python scripts will be nice.

That said, on any package that I write that has third party dependencies anyway (read: all of them), I don't see a compelling reason to change from attrs.


You nailed it with "on any package that I write that has third party dependencies anyway". On my recent package I was going for zero dependencies (at py3.7+) so I used the backport of dataclasses.

Ultimately, I threw out dataclasses too. But if I didn't need __repr__, __post_init__ and __slots__ with default values, I'd probably have just stuck with dataclasses :)


The problems with dataclasses are:

1) they don’t support __slots__ and default values

2) type hints aren’t supported yet, and it doesn’t appear they’ll be added to the 3.6 backport

The former issue will be resolved (maybe in 3.8?) but it will require either a metaclass approach or the @dataclass decorator to build a separate class.
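A small sketch of the `__slots__` plus default value conflict, with hypothetical class names. A field default is stored as a class attribute, and a class attribute cannot share a name with a slot:

```python
from dataclasses import dataclass


@dataclass
class WithoutDefault:
    # Declaring __slots__ by hand works when no field has a default.
    __slots__ = ("x", "y")
    x: int
    y: int


def define_with_default():
    # The default is stored as a class attribute named "y", which collides
    # with the slot of the same name. Class creation itself fails, before
    # the @dataclass decorator ever runs.
    @dataclass
    class WithDefault:
        __slots__ = ("x", "y")
        x: int
        y: int = 0

    return WithDefault


try:
    define_with_default()
    error = None
except ValueError as exc:
    error = exc
```

After this runs, `error` holds the `ValueError` raised while building the class with the default.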

The good news is that dataclasses (the backport) work on PyPy 6.0.0.


On the __slots__, the article claims that these do: https://realpython.com/python-data-classes/#optimizing-data-...

EDIT: From PEP 557: https://www.python.org/dev/peps/pep-0557/#support-for-automa...

“Support for automatically setting __slots__?

At least for the initial release, __slots__ will not be supported. __slots__ needs to be added at class creation time. The Data Class decorator is called after the class is created, so in order to add __slots__ the decorator would have to create a new class, set __slots__, and return it. Because this behavior is somewhat surprising, the initial version of Data Classes will not support automatically setting __slots__. There are a number of workarounds:

- Manually add __slots__ in the class definition.

- Write a function (which could be used as a decorator) that inspects the class using fields() and creates a new class with __slots__ set.”
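A sketch of what that second workaround might look like. The name `add_slots` and the details below are illustrative assumptions, not an official API:

```python
import dataclasses
from dataclasses import dataclass


def add_slots(cls):
    """Return a copy of data class `cls` with __slots__ set (illustrative)."""
    if "__slots__" in cls.__dict__:
        raise TypeError(f"{cls.__name__} already specifies __slots__")

    namespace = dict(cls.__dict__)
    field_names = tuple(f.name for f in dataclasses.fields(cls))
    namespace["__slots__"] = field_names

    for name in field_names:
        # Defaults live on the class as attributes and would clash with the
        # slots. The generated __init__ keeps its own copy of each default.
        namespace.pop(name, None)

    # Drop the descriptors that belong to the old, non-slotted class.
    namespace.pop("__dict__", None)
    namespace.pop("__weakref__", None)

    # __slots__ must exist at class creation time, so build a new class.
    new_cls = type(cls)(cls.__name__, cls.__bases__, namespace)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls


@add_slots
@dataclass
class Point:
    x: int
    y: int = 0
```

Note that the decorator returns a different class object than the one it was given, which is the "somewhat surprising" behavior the PEP mentions.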


The article doesn't discuss slots and default values. Here's the actual reference implementation for dataclasses where it's discussed: https://github.com/ericvsmith/dataclasses/issues/28

Your second solution works fine (and is what I'd referenced). For your first solution, I'd again redirect you to my statement "they don’t support __slots__ and default values" and the above link.


It’s not my solution, just a direct quote from the PEP.


Another alternative to data classes is types.SimpleNamespace, available since Python 3.3:

https://docs.python.org/3/library/types.html#types.SimpleNam...
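For comparison, a quick sketch of SimpleNamespace in use (the attribute names are arbitrary). There is no declared set of fields, so it suits ad hoc records more than typed ones:

```python
from types import SimpleNamespace

# Attributes are passed as keyword arguments; nothing is declared up front.
point = SimpleNamespace(x=1, y=2)

# New attributes can be added at any time.
point.z = 3

# Two namespaces compare equal when their attributes match.
same = point == SimpleNamespace(x=1, y=2, z=3)
```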


Data classes was an interesting read and will definitely save some time in the future, but I think the __slots__ mention is where I got the most value from (big memory and access speed savings!).

Additional reading on __slots__: https://stackoverflow.com/questions/472000/usage-of-slots
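A rough way to see the memory side of this on CPython, using two made-up classes. Exact byte counts vary by Python version and platform, so treat this as a sketch rather than a benchmark:

```python
import sys


class Regular:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Slotted:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


regular = Regular(1, 2)
slotted = Slotted(1, 2)

# sys.getsizeof is shallow, so the per-instance __dict__ is counted separately.
regular_size = sys.getsizeof(regular) + sys.getsizeof(regular.__dict__)

# A slotted instance has no __dict__; its attributes are stored inline.
slotted_size = sys.getsizeof(slotted)
```

Comparing `regular_size` with `slotted_size` shows the per-instance overhead that the instance dictionary adds.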


This marks the beginning of the end for python as a dynamic language.


Python keeps evolving -- great! But also, consider support for a stable Python 2.7 LTS... you know, for more than five years from now?


If you need Python 2.7 support after 2020, consider asking a consulting company to provide it for you.

Python is maintained by volunteers. Providing support for more than 10 years is more than enough in my opinion. The community has moved on to 3, but I'm sure there will be companies doing consulting and fixing 2.x bugs for a fee.


No, you don't understand: they are evil people for no longer giving us free candy after they did so for 10 years. They can't be working on better new free candy now, that's even worse! Monsters!


I will answer you with a song:

    It's been a long road
    Getting from there to here
    It's been a long time
    But Py3 is finally here

    And I will see my dreams come alive at night
    I will touch the sky
    And Py2's not gonna hold me down no more
    No it's not gonna change my mind

In other words:

   Let it go, let it go...


Python 2.7 has been stable and supported for almost 8 years -- how much more long term do you want? Even releases of Java only seem to get about 6 years of support.


>Python 2.7 has been stable and supported for almost 8 years -- how much more long term do you want?

Forever? 2.x codebases run to millions of lines of code in companies, and they won't be converted or go away anytime soon. And those will need security updates and bug fixes.

Note that Google and Dropbox are still running tons of Python 2.x -- and those are two companies where Guido van Rossum himself has worked in the last 10 years.

Now consider the average company with tons of Python 2.x code. It's not going anywhere soon.

Heck, why is everyone surprised by this? Or is it just 20 year old first time pro devs that are surprised? The world still supports tons of Cobol and other "old" language code -- some running for 3-4 decades after a language went "out of fashion"...


> Forever?

You can have support forever.

You just can't have it __for free__ forever.

There are companies out there that will happily sell you the service.

You already get amazing tech for free, and two decades of support if you take the whole 2.x branch into consideration (e.g. 4 times the Ubuntu LTS). Complaining at this point is just insulting the community.

I find it infuriating. When the JS or Ruby community breaks stuff, they give a few weeks' notice and a few months to migrate. Nobody complains. Python gives 10 years, a lot of tooling and tutorials, and some people keep complaining. This is where being too nice is a problem.

In 2020, I'll triple my price for any work on 2.7. I'm done being fair to people with such ingratitude.


>In 2020, I'll triple my price for any work on 2.7. I'm done being fair to people with such ingratitude.

This doesn't make any sense. You're not running a charity. If you can get customers with the tripled price, then do triple it. If you can get customers with a 100x price, you'd be a fool not to 100x your price.

If, on the other hand, you can't get customers for triple the price, then tripling it will just make people go to someone else. You're not hurting anybody either way...


> This doesn't make any sense. You're not running a charity. If you can get customers with the tripled price, then do triple it. If you can get customers with a 100x price, you'd be a fool not to 100x your price

You say that because you see business as only a gateway to make money. But if you work in the libre community, you'll see that ethics and promotion of the libre is a very important part of it. It's also why FOSS people make less money.


>But if you work in the libre community, you'll see that ethics and promotion of the libre is a very important part of it

For how many people? Most major FOSS people I know work in large companies from Red Hat to IBM and Joyent. Heck, speaking of Python, Guido worked for Google and Dropbox.

And most of the others are volunteers.

Besides, I wouldn't call "tripling the price" when you don't like the client using an older version exactly "ethical".

Have you talked to the client? Do you know their costs and externalities for a 2.7 -> 3 rewrite of their existing (and working) code? It's not like they are capricious and want to use 2.7 out of malice.


Many companies have made a lot of money from the free work of volunteers, yet those exploiting companies are blameless, while proponents of a useful and stable technology are glibly mocked?


Cobol, Fortran 77, ANSI C and stuff like that lasted a long time -- but they didn't have batteries-included standard libraries, and there was no expectation that they would be maintained against things such as security threats. That just wasn't part of the design of a programming language at the time.

(Can a classic programming language specification have a security hole? Absolutely. Consider the case of gets() from C.)


Why 2.7? Why not 2.2? Why not 1.3? Why not 1.0?

At some point you need to require others to update. Or they can hire consultants like companies do for Cobol.


Python 2.7 is stable and useful now. It is a lot of C code implementing a scripting language. There are a lot of libraries that work well right now and that took hundreds of skilled person-years of work to build.


The same can be said of thousands of 16-bit applications that are no longer supported on 64-bit architectures. Python 2.7 isn't being deleted from the face of the Earth. You're just going to have to pay people to maintain your legacy codebase because, after 10 years of knowing it would become obsolete, you stuck with it.


>The same can be said of thousands of 16-bit applications that are no longer supported on 64-bit architectures.

And some of those are a big loss as well.

But most of them are obsolete and deprecated (because e.g. Office suites, OSes and gaming moved on) in a way that Python 2.x programs used in businesses are not.


>But most of them are obsolete and deprecated (because e.g. Office suites, OSes and gaming moved on) in a way that Python 2.x programs used in businesses are not.

Businesses that fail to adapt to changing markets/technology have nobody to blame but themselves when they've had 10 years to adapt or die. I struggle to find any sympathy for them outside of employees ringing the death bell who were ignored. Also, just to reiterate, 2.7 won't be going anywhere - businesses will need to accept it will cost them more to stick with it. That's the price paid for holding onto technical debt.


>Businesses that fail to adapt to changing markets/technology have nobody to blame but themselves when they've had 10 years to adapt or die.

What "die"? Some of the companies using Cobol or other such older technologies (e.g. Java 1.4) are among the most valuable on earth.

And why should they adapt perfectly fine working programs? Just for the sake of it, or because devs like to rewrite stuff?


>Some of the companies using Cobol or other such older technologies (e.g. Java 1.4) are among the most valuable on earth.

And they pay consultants and contractors in order to continue to do so. I'm not sure I'm seeing where the problem is? Want to stick with Python 2.7? Hire specialists. Problem solved. You just no longer get free labor from the maintainers of the language.


And FORTRAN, C, C++... all languages with ISO standards. But with Python, the current version of the main implementation is the standard.


Perl 5 runs well after... twenty five years?


And Python 2 has run for 18, if we're counting by major versions. Heck, as long as you're okay with little changes like booleans not looking like integers anymore, you can run your Python 1 code 24 years later.

But imagine if Python followed the example of Perl. Python 3 becomes a curious side project, everyone moves back to Python 2, which goes on for decades without new features. At last, Python can have all the popularity and relevance of... Perl 5?


Python 2 will not stop running in 2020. No bomb is going to explode. The dozen or so volunteers will just stop providing free work on bug fixes and security updates.


The problem with this logic is it could be extended indefinitely into the future.

Python 3.3 has been available since 2012. 2020 was already a deadline that had been pushed back at least once for 2.x. I understand volatility, but 5+ years in the software world is an eternity.


I guess we're getting off topic here but can't help myself.

> ...5+ years in the software world is an eternity.

There is no particular reason why we can't maintain stability. I can still run [some] Microsoft DOS programs from the '80s, though finally most of them need DOSBox to run. For the scientific programming community we need to be able to rely on relatively ancient code. This is why FORTRAN and Matlab are still used. For example the release notes for g77 at https://gcc.gnu.org/fortran/ say

> Legacy g77 code will compile fine in almost all cases

That's code from 1977 - 40 years ago! And the release notes are just nudging people towards the 1995 language standard, not declaring deprecation.


> There is no particular reason why we can't maintain stability. I can still run [some] Microsoft DOS programs from the '80s

In fact, there isn't, you are 100% correct. I don't think anyone on the python dev team thinks that somehow 2.7 "can't" be maintained.

Grab the 2.7 branch [1] and build and test it on your favorite OS. If it still works tomorrow or next year, or in the 2080s, congratulations! You got a working product and you apparently didn't need any support.

> That's code from 1977 - 40 years ago!

Python's popularity will ensure that some group, somewhere will champion 2.7. And maybe in ~2030 we'll still see active support.

[1] https://github.com/python/cpython/tree/2.7


But you can. You will be able to run Python 2 in 5 years if you want. Nobody is stopping you. The official community will just stop working on it. Just like DOS, which an alternative community picked up.


>The problem with this logic is it could be extended indefinitely into the future.

So? Why shouldn't it? The world still needs (and runs) untold millions of lines of Cobol.

It would cost billions to move it over -- and it will cost millions to move over legacy Python 2.x codebases.


If you have that cash, you can pay Continuum for commercial support and stop asking people to give you stuff for free forever.

Or, alternatively, you can start spending your weekends, for free, working on maintaining it.


> The world still needs (and runs) untold millions of lines of Cobol.

*shrug* The Python organization isn't responsible for delivering a Cobol toolchain. They've set a deadline, and it's open source, so there are bound to be some popular vendors out there that you can buy support from. Just like you buy support from your Cobol vendor.

Most people who consume python are consuming it from a downstream vendor like a linux distro anyways. Commercial distros will likely offer extended support packages.



