Ask HN: Good Python projects to read for modern Python?
239 points by turndown on Dec 27, 2021 | 66 comments
Django 4.0 projects would also be appreciated, or simply high quality Django projects.

Anything by Sebastian Ramirez (fastapi), Samuel Colvin (pydantic), or the encode team (starlette).

World class stuff.




Fastapi is far from world class in any category other than usage.

>Fast: Very high performance, on par with NodeJS and Go.

Is a downright lie

> Fastapi is far from world class in any category other than usage.

Imho FastAPI's weakest point is maintenance. The Github repo has 800+ open issues and 400+ open PRs. Tiangolo is the only user that constantly makes non trivial contributions. I just can't see that being sustainable in the long term if all those items are not actively triaged.

Yeah it is definitely in the wildly popular upswing phase. I'm hoping it attracts a crew of maintainers to take some of the burden off.

The one thing it has going for it is much of the heavy lifting is done by starlette, maintained by encode.io, which is actively funded. It's not trying to be rails, rather it's relying on other libraries and packages to add functionality.

The documentation is the worst I've ever seen. There is no API reference! Just a bunch of incohesive tutorials that hopefully overlap with what you're trying to achieve.

I agree, but as someone who has been working with Django for years, my standards are pretty high, and I'm opinionated when it comes to docs.

Check out my latest post on HN. Seemingly the creator and single maintainer is ignoring requests from developers to let other people maintain the project. I used FastAPI at work some time ago, but I can't see myself using it again anytime soon. Imho relying on someone's personal project for software that's going to be in production indefinitely is a big risk.

I've never seen a library that's remotely popular without an API reference. I don't think that's a high standard.

TIL FastAPI does not have an API reference. I can't find a roadmap either. That's... wild.

I didn't like the Django philosophy, so it was FastAPI vs Flask RESTful. People kept saying FastAPI was the next big thing, but nothing can beat Flask's documentation, and the FastAPI community and documentation felt like, "you ought to know that before you ask".

OP asked for good projects to read. I think the code quality is very high, the documentation is great, it's been a joy to use, and it's been more than fast enough for my purposes.

I didn't make that speed claim, though I am aware that tidbit is controversial.

Hrmm. But is the code pythonic? I’d say it is. That’s what the OP is looking for.

Do you have any opinion on whether it’s a good example of a modern Python project, as per OP’s request?

Back it up with some evidence

Not the parent, but evidence was pretty easy to find[0]. fastapi comes in around 250-300th place in most of the benchmarks. ¯\_(ツ)_/¯

[0] https://www.techempower.com/benchmarks

In the composite benchmark, fastapi is 75th, express (the most popular node framework) is 94th, Gin (a popular go framework) is 55th, NodeJS is 56th.

In the continuous benchmarks, that you can see here [1], taking the last one that has finished at the time of this post (run id 31abee0e-b0c5-414f-9288-216c48e3a70e): Fastapi is 81st, Express 107th, Gin 57th, NodeJS 63rd. In the fortunes part, plain Go is always higher than Gin but varies wildly, from ~15% to ~43%. FastAPI itself is around 7%

I think Express is still the most used JS framework, by a large margin, so claiming to be faster than NodeJS may be fair, as most people thinking about NodeJS will think Express. For Go, you have to be very generous with the "on par", to the point that it's almost meaningless in my opinion. Especially since frameworks are less popular in Go, people tend to use the standard library, which has great performance here.

[1]: https://tfb-status.techempower.com/

That chart shows fastapi and nestjs are 2 of the fastest libraries among dynamic languages. And nestjs is <20% faster than fastapi because fastify is <20% faster than starlette.

Not sure what this has to do with the library’s quality though, it is a high quality modern python library.

> That chart shows fastapi and nestjs are 2 of the fastest libraries among dynamic languages.

With only JavaScript, Perl, Lua, PHP, Python and TypeScript enabled, fastapi is 64th and nestjs is 73rd, out of 136 libraries. They have a score that's ~10 times lower than the fastest libraries.

You have to deselect platforms first off, but even then you will still see starlette and fastify on the list being compared to frameworks like nestjs and fastapi. The anomaly is the workerman/swoole-based PHP libraries, which are significantly faster (by 2-10x) than other dynamic-language frameworks. It's not that Express, Koa, FastAPI, etc. are slow; it's that PHP is an anomaly. This is the problem with this benchmark: it's very misleading.

This is not an anomaly. Workerman and Swoole are asynchronous PHP frameworks. Node is asynchronous by default, so Express/Koa/Fastify/NestJS are asynchronous. Starlette is an ASGI framework/toolkit, so it's asynchronous too. FastAPI is based on Starlette. The thing is that asynchronous PHP is way faster than asynchronous Python or asynchronous JS. There is nothing misleading about this. If you need performance, PHP is a better choice than Python or JavaScript.

If you have to deselect 20 default filters and then read through the frameworks in the list to see which ones are fully featured web frameworks in order to evaluate whether a library is suitably fast for your next project it is misleading. But again, I'm not sure what this has to do with it being a high quality modern python library.

> If you have to deselect 20 default filters and then read through the frameworks in the list to see which ones are fully featured web frameworks in order to evaluate whether a library is suitably fast for your next project it is misleading.

It is not. This benchmark is purely about speed, not about being "suitable for your next project". You're the one adding all these other constraints. You only need to deselect some filters when you're doing something like comparing dynamic languages, which is what you did, and which wasn't even the claim originally under discussion. I also doubt most people would call FastAPI "suitable for your next project" compared to something like Django or Flask that is already well known and maintained.

> But again, I'm not sure what this has to do with it being a high quality modern python library.

We're in a subthread discussing a claim on the FastAPI website that it's "Fast: Very high performance, on par with NodeJS and Go.", and how it's obviously wrong for Go, and possibly wrong for NodeJS.

NestJS is what I use for my website, https://www.codehawke.com. I have no idea if this provides any speed burst over fastapi, but I'll say it's a lot of code making Nestjs run. It can't be a good choice for basic static/serverless apps. These benchmarks are questionable.

I use it for a side project, and I had to disable a lot of its useful features to get anything resembling performance out of it. A lot of this seems to stem from Pydantic, though. Either way, it is in no way as fast as Go out of the box, and I also feel they should remove that line.

Evidence: asyncio based DB bindings are rare and not well supported.

My favorite way to find good projects to explore is using GitHub.com. I go to the Explore page and check out the trending projects for the week or day. https://github.com/trending/python?since=daily

I appreciate you providing a method for finding stuff, thanks.

I’ve worked on rich https://github.com/willmcgugan/rich including adding strict mypy typing https://mypy.readthedocs.io/en/stable/command_line.html#cmdo....

I found Will’s codebase easy to read and reason about, making it easy to help extend.

What's up with

    if __name__ == "__main__": some random code
at the end of module files? Never seen that, why is it used? I understand that it's a guard that only runs when the module is executed directly, but what is the code that gets put there for? For developers to quickly visually test stuff by running the module?

Here, for example https://github.com/willmcgugan/rich/blob/master/rich/color.p...

Yea it is there for devs to see different rich features in their console.

You can run some examples using the -m flag + the submodule:

  python -m rich.live
  python -m rich.markdown # this is actually a markdown to terminal cli
https://rich.readthedocs.io/en/stable/live.html https://rich.readthedocs.io/en/stable/markdown.html

It's a common pattern as others have said, but this specific version of it is poor practice. The code should be in a function instead of polluting the global namespace (otherwise 'console', 'table', and 'color' end up being shadowed if they're used elsewhere in the file). Keep it clean with e.g.:

    def main() -> None:
        ...  # demo code goes here, as locals rather than module globals
    if __name__ == "__main__":
        main()

That is not the case. The code within that block will not be in the namespace if that file is imported. Contrary to your example which will have the main function in the module namespace.
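For anyone who wants to check both claims, here is a rough sketch that executes two hypothetical module sources under different `__name__` values and lists the names each one leaves at module level. The module contents and the `run_module` helper are made up for illustration; they are not taken from rich.

```python
# Hypothetical module sources, kept as strings so both styles can be run side by side.
BARE_GUARD = '''
def render(text):
    return text.upper()

if __name__ == "__main__":
    console = "demo console"      # bound at module level when run as a script
    print(render("bare guard"))
'''

MAIN_FUNCTION = '''
def render(text):
    return text.upper()

def main() -> None:
    console = "demo console"      # local to main(), never a module global
    print(render("main function"))

if __name__ == "__main__":
    main()
'''


def run_module(source: str, name: str) -> set[str]:
    """Execute source as if it were a module called `name`; return the names it defined."""
    namespace = {"__name__": name}
    exec(compile(source, "<sketch>", "exec"), namespace)
    # Drop dunder entries such as __name__ and __builtins__.
    return {key for key in namespace if not key.startswith("__")}


if __name__ == "__main__":
    for label, source in [("bare guard", BARE_GUARD), ("main()", MAIN_FUNCTION)]:
        print(label, "imported:", sorted(run_module(source, "some_module")))
        print(label, "executed:", sorted(run_module(source, "__main__")))
```

Under these assumptions both comments hold up in part: on import the bare guard leaves nothing extra behind, but when the file is executed directly `console` becomes a module global, whereas the `main()` version always adds exactly one extra name, `main`.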

> "if that file is imported"

Static analysis doesn't care. The script can be executed directly, so it has to be treated as if it will be. The fact that the code behaves differently between prod and dev use is a smell, not a selling point.

I'm saying this in the context of writing modern Python with modern tools. In that context it's a poor practice.

That clearly prints out a reference of the colors, so yes, as shorthand for devs.

I think, in general, most FastAPI and Pydantic related libraries are heavily typed, use poetry, GitHub pipelines, black, isort, flake8 etc., so if you want to look at the ecosystem around a package I'll recommend a few here that have a smaller scope than the huge libraries Pydantic/FastAPI are. All packages listed below have all these things.

FastAPI-Azure-Auth [0] is a library to do authentication and authorization through Azure AD using tokens.

ASGI-Correlation-ID [1] is a package that utilizes contextvars to store information through the asyncio stack, in order to attach correlation/request ID to every log message from a request. Django-GUID [2] does the same for both sync and async Django.

Pydantic-factories [3] is an awesome library to mock data for your pydantic models.

[0] https://github.com/Intility/fastapi-azure-auth

[1] https://github.com/snok/asgi-correlation-id

[2] https://github.com/snok/django-guid

[3] https://github.com/Goldziher/pydantic-factories
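To make the contextvars idea above concrete, here is a minimal standard-library sketch of carrying a request ID through asyncio and stamping it on every log record. It is not the API or implementation of any package listed above; `request_id`, `RequestIdFilter`, `ListHandler`, `build_logger` and `handle` are all hypothetical names.

```python
import asyncio
import contextvars
import logging
import uuid

# Hypothetical names throughout; this is not the API of any library mentioned above.
request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current context's request ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id.get()
        return True


class ListHandler(logging.Handler):
    """Collect formatted messages in memory so the result is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


def build_logger() -> tuple[logging.Logger, ListHandler]:
    handler = ListHandler()
    handler.addFilter(RequestIdFilter())
    handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
    logger = logging.getLogger("correlation_demo")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger, handler


async def handle(logger: logging.Logger, name: str) -> str:
    rid = uuid.uuid4().hex
    request_id.set(rid)          # visible only inside this task's context
    logger.info("start %s", name)
    await asyncio.sleep(0)       # yield so the two requests interleave
    logger.info("end %s", name)
    return rid


async def main() -> tuple[list[str], list[str]]:
    logger, handler = build_logger()
    # gather() wraps each coroutine in a task, and each task gets its own context copy.
    ids = await asyncio.gather(handle(logger, "a"), handle(logger, "b"))
    return list(ids), handler.lines


if __name__ == "__main__":
    ids, lines = asyncio.run(main())
    print("\n".join(lines))
```

The point of the sketch is that each task sees its own value even though the two requests interleave, so no ID has to be passed down the call stack by hand.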

Gonna hijack and ask for some Django projects with a lot of forms, especially nested forms with a lot of relationships. I’d be curious to learn from other folks how to do forms, how the views work (update vs create), etc.

If you're running a startup, I recommend reading this guide to structuring Django projects: https://news.ycombinator.com/item?id=27605052

Not just Django and startups, that looks like a good guide on architecture overall!

NetBox[0] is a large project that does this.


Django itself.

Really, django is the framework to read if you are into Python.

I've been working with Django since version 0.96. Still a lot of love for it, but for the last several years it hasn't been my default for new projects. The Django codebase is far from what can be called modern. As an experiment, it's easy to open up some Django core modules side by side with other frameworks that were mentioned (FastAPI for one) and to see that the code looks so different to the point where a programmer with not a lot of experience with Python might think those are different languages altogether.

Much of Django's core has not changed in a very long time, so there would be no type annotations and other modern Python constructs.

FastAPI, pydantic, uvicorn, httpx would be in my list to answer OP's question.

> the code looks so different to the point where a programmer with not a lot of experience with Python might think those are different languages altogether

Some of us actually prefer the "old" style of Python. Type annotations are too "Java-ish" for me. I hate source code where a function's signature takes up more than half of the screen, and I hate making HTTP requests with three layers of nested `async with`. If you are into static typing or RAII, why not choose a real-deal language instead?

Probably read something that is related to your project. My moment of enlightenment was reading SymPy code. It was well explained and documented. Reading the code answered so many questions that you couldn’t get an answer on SO.

Peter Norvig’s notebooks and his course.

Appreciating that Norvig is much smarter than myself...

I find his style to be not great, too many code-golfing one-liners. Not everything has to be a list comprehension (or even a double list comprehension). Give me a boring loop with minimal action per line, please.

Norvig's approach of converting thoughts into a program is what you need to focus on. It is trivial to convert a list comprehension into a loop.
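As a quick illustration of that conversion, here is a made-up double list comprehension next to the equivalent "boring loop". Neither function comes from Norvig's notebooks; the names and data are hypothetical.

```python
def pairs_comprehension(xs, ys):
    """Double list comprehension: every (x, y) pair where the two values differ."""
    return [(x, y) for x in xs for y in ys if x != y]


def pairs_loop(xs, ys):
    """The same logic unrolled: the clauses nest in the order they are written."""
    result = []
    for x in xs:
        for y in ys:
            if x != y:
                result.append((x, y))
    return result


if __name__ == "__main__":
    print(pairs_comprehension([1, 2], [2, 3]))
    print(pairs_loop([1, 2], [2, 3]))
```

The `for` and `if` clauses of a comprehension read left to right in the same order as the nested statements, which is why the rewrite is mechanical.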

In "When is Cheryl's birthday?" [1], the way he converts the problem into functional units is insane. A couple of list comprehensions shouldn't scare you. Look above and beyond the details.

[1] https://github.com/norvig/pytudes/blob/main/ipynb/Cheryl.ipy...

I second this


I think Bottle.py is quite interesting. It is a single-file library (around 4k lines) and has no dependencies other than the Python Standard Library. https://github.com/bottlepy/bottle/blob/master/bottle.py

The other is Tornado. https://github.com/tornadoweb/tornado

+1 for Tornado.

I used it early on when the docs were not there (IIRC), and spent some time reading the code. Very readable. Not sure if pythonic or not.

Tornado is an absolute feat of engineering. Shout-out to @bdarnell.

peewee the ORM is also single file.

Not perfect source code, but it accomplishes the task well.

Here are a few that haven't been mentioned yet:

- PDM: A modern Python package manager with PEP 582 support[0]

- Spleeter: Deezer source separation library including pretrained models[1]


[0]: https://github.com/pdm-project/pdm

[1]: https://github.com/deezer/spleeter

I'd recommend a project from work, Geostore[1]. Highlights:

- 100% test coverage (with some typical exceptions like `if __name__ == "__main__":` blocks)

- Randomises test sequence and inputs reproducibly

- Passes Pylint with max McCabe complexity of 6

- Passes `mypy --strict`

- Formatted using Black and isort

[1] https://github.com/linz/geostore
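The thread doesn't say how Geostore does its reproducible randomisation, so the following is only a generic sketch of the idea: derive both the test order and the random inputs from a single seed, so that a failing run can be replayed exactly. `shuffled_order` and `random_inputs` are hypothetical helpers, not Geostore code.

```python
import random


def shuffled_order(seed: int, test_names: list[str]) -> list[str]:
    """Return the tests in a pseudo-random order fully determined by the seed."""
    rng = random.Random(seed)    # private generator; global random state is untouched
    order = list(test_names)
    rng.shuffle(order)
    return order


def random_inputs(seed: int, count: int = 5) -> list[int]:
    """Generate test inputs that can be regenerated later from the same seed."""
    rng = random.Random(seed)
    return [rng.randint(0, 1000) for _ in range(count)]


if __name__ == "__main__":
    seed = 12345  # in practice the seed would be printed with the test report
    print(shuffled_order(seed, ["test_a", "test_b", "test_c"]))
    print(random_inputs(seed))
```

Logging the seed alongside the results is what makes a randomised failure reproducible: rerunning with the same seed gives the same order and the same inputs.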

The telethon library [1] is a very interesting case. It's fully async, implements the weird binary mtproto protocol, has optional C components for speedup and is partly auto-generated from protocol specs.

It is an incredibly complex piece of code, but is immensely rigidly structured, typed and makes heavy use of OO to remain readable and manageable.

So instead of perfect hypothetical textbook code, I think this is a very instructive realistic project.

[1] https://github.com/LonamiWebs/Telethon

I don't think a lot of it exists yet, to be honest. Python 3.10's been a game changer when it comes to new patterns and idioms, and I don't think the dust has settled yet.

Why is Python 3.10 a game changer? (Honest question)

I think OP is referring to new features like pattern matching [1] that's been sorely missed in python for a while.

[1]: https://www.python.org/dev/peps/pep-0634/

Structural pattern matching is a big one. Similar to how type hinting gave us new patterns and libraries like Pydantic, FastAPI, etc, I suspect there will be new libraries and patterns that lean on structural pattern matching heavily. 3.10 is also a culmination of additional features from 3.8 and 3.9 that were pretty lackluster on their own, but in combination with features from other point versions, you can do some cool things that you couldn't do as elegantly before.

I've recently added support for Django 4.0 in https://github.com/Bearle/django_private_chat2

The code is tested & known to work well

The EDX platform is open-source, it's a complex Django project: https://github.com/edx/edx-platform

I think that gpiozero might be worth a look: https://github.com/gpiozero/gpiozero/


Really any recent project by Encode: https://github.com/encode

What’s your goal?

Source code is not to be read like a book. If you read it, you must do so with a goal, context and purpose.

Is there another way to read books? Only looking at the pictures isn't reading.
