What did Python have relative to other languages that inspired these developers to devote so much energy into their projects?
They could have chosen PHP, Ruby, Node, Go, Java, etc.
If I think back to ~2012 I remember that compared to other languages Python had:
1) Decent package management and "virtualization" (via virtualenv)
2) Nice balance of expressiveness and strictness
3) It was just fun to write; hard to explain, but it was the vibe
4) Decent enough performance for most things
5) Lots of developer tooling for debugging and stuff
It felt like the least messy of the scripting languages.
- Everyone knows all the issues PHP had, no need to list it all here
- Ruby was cool but there were a thousand ways to do the same thing
- Node was inheriting that weird web language with all its quirks
- Go was just in its infancy
- Java was the kitchen sink
(Mind you, the likes of Zope and CherryPy are still out there and under development. You just don't hear about them much.)
Up until about 2010, web dev in Python was converging hard on Django. It took Flask and similar small frameworks to bring back variety, and all of those had to be better than Django in at least some narrow sense, if only in being much lighter, to get any uptake.
It’s an attractive language for people who want to integrate existing packages into a solution instead of creating everything from scratch.
Poorly written software that correctly implements science/physics tends to be more valuable than expertly written software that encodes a science/physics defect.
At the time the scientific packages were extremely heavy and best avoided in production environments. Some people did it anyway but removed it once they needed to scale. Simply installing these packages was a huge pain and blocked one's ability to provision new machines, for example.
PyPI was indeed a big benefit. Ruby Gems are also what made Ruby so competitive at the time.
You’re right, scale is absolutely an issue for companies that sell software or that are very large.
Also important to consider that most companies use software internally instead of selling b2b or b2c. Fewer than 500 businesses in the US employ more than a few thousand employees* (https://fortune.com/fortune500/2020/search/?f500_%20employee...) and there are over 17.5 million businesses (https://www.naics.com/business-lists/counts-by-company-size/).
* I may be wrong in my assumption that the Fortune 500 companies are the largest employers
There's a balance, and it's more delicate than people give it credit for these days. Don't get me wrong though... over-engineering and/or premature optimization is just as bad. But there is an opposite extreme as well.
Ruby doesn't have primitive data types. Everything, including strings, is just an object. This level of purity is quite nice and "cleaner" in a conceptual sense. Until you encounter a large codebase with hundreds of monkey patches. Or, something I saw a lot, the extending of "base" data types such as String.
I must admit that my memory is a little hazy here, but I remember that it was impossible to find language tooling that allowed me to "jump to definition" of anything I encountered. Ruby was so free-form with so much ambiguity that at the time it just didn't exist. I'm sure this has been resolved by now. In Python we had "Jedi" and other tools that worked really well for this purpose.
You'd be browsing a Ruby codebase, see `some_str.foo`, and wonder:
1) Is foo a method or a property? In Ruby a function call doesn't need parentheses if it takes no arguments.
2) Where is foo? I don't remember it being part of the standard library. Where is it defined?
3) If I do find foo, did something else monkey patch it? How would I know?
4) If I need to pass an argument into foo, should I factor it out of String? At what point am I overloading a base class like String with too much functionality?
These are very real questions and the answers are important. Come across enough of these scenarios (remember this is just one example w/ Ruby) and you eventually give up trying to understand the codebase at that level. Everything becomes a black box, everything is magic.
Do these footguns exist in Python? Sort of. You can't extend primitive types and monkey patching doesn't fit so cleanly into a normal program (think of it as "friction"). There was less ambiguity in Python's syntax. And the language community promoted a list of idioms which was helpful for discouraging bad practices.
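To make that "friction" concrete, a minimal sketch (stdlib only; `shout` is a made-up example):

    # Built-in types are closed in Python: the Ruby habit of reopening
    # String to add methods has no direct equivalent.
    try:
        str.shout = lambda self: self.upper() + "!"
    except TypeError as e:
        print(e)  # e.g. "can't set attributes of built-in/extension type 'str'"

    # The idiomatic routes are a plain function or an explicit subclass,
    # both of which "jump to definition" tooling handles fine.
    def shout(s: str) -> str:
        return s.upper() + "!"

    print(shout("hello"))  # HELLO!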
These things may seem subtle but they made a pretty big difference at the time.
> especially things like numpy, pandas, scipy, etc
These libraries were generally avoided when building web services, APIs, etc. The context of this discussion is Flask, and the engineer(s) who built it were generally working in the world of live services. Dropbox, Reddit, and other YC companies made heavy use of Python for live-service-type software. Data analytics stuff definitely existed, but that was a different context.
The reason numpy and friends were avoided was due to the complexity of installing them. The dependency list was enormous and some of those dependencies needed to be compiled on the fly. Scientific packages also typically shipped with very large datasets to support whatever complex computing they were doing (think training data).
The 'scientific community' played little part in the creation of the frameworks discussed here. If it played any part, it was that Python was one of the first languages the creators picked up at university.
If you were doing something nontrivial where the interesting parts weren't the kinds of things Rails/ActiveSupport addressed, those factors alone (before even considering language/library features) made it a lot more of an uphill climb with Ruby.
Java was (and still is) insanely heavyweight
Ruby was tied to the hip with Rails despite being more than capable on its own and being much more expressive than Python (IMO). Python is faster though.
PHP was (and still is) PHP
Walmart picked up Node.js in 2012. Not for their core stuff obviously but they were running a bunch of internal services with it.
LinkedIn was evaluating it alongside Ruby (EventMachine) and Python (Twisted). They ended up picking Node for some of their services.
Highly concurrent green-thread-ish type stuff was all the rage. Go and Node were the front runners. Python was the conservative choice. Ruby was the hipster choice :P
Not a popular opinion, but Python had Google.
Those changes have largely been incremental and/or questionable.
Python has async now, which is cool, but a single blocking sync call or poorly optimized algorithm anywhere in the stack can stall the entire application.
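A minimal sketch of the failure mode (stdlib only): one sync call starves every other coroutine on the loop.

    import asyncio
    import time

    async def bad_handler():
        time.sleep(2)  # blocking sync sleep: freezes the whole event loop

    async def normal_handler(i):
        await asyncio.sleep(0.1)  # should finish in ~100ms
        print(f"request {i} done")

    async def main():
        # nothing prints until time.sleep() returns, 2 seconds later
        await asyncio.gather(bad_handler(),
                             *(normal_handler(i) for i in range(3)))

    asyncio.run(main())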
Type hints are moving slowly in the right direction, but they’re still really cumbersome and mypy is still a pain to use (publishing type hints, recursive types, loading type hints, etc). Last I tried it (admittedly it’s been a year since I was deep in the Python game) it was still pretty beta quality.
Package management still sucks.
Deployment still sucks. Yeah, Docker helps, but it also makes some things worse, like local dev and build times and so on. And there aren’t many good solutions for distributing CLIs and other apps to users—some of the zip things work okay but they don’t close over the runtime, stdlib, or shared objects. Moreover, Python dependency zips can be huuuge—we were busting the 250MB lambda limit (compressed—this was back when lambda had a limit, pretty sure they upped/removed it since) meanwhile a comparable Go binary weighed in at 6MB.
Performance still sucks. Yeah, the 3.11 improvements will feel big to a Python programmer, but 40% (made up number) faster than 1000X slower than Go is still really slow (yeah, “just write the slow parts in C/Numpy/multiprocessing/whatever” works once in a while, but very often it doesn’t and it’s basically impossible to know from the project’s outset whether all of its bottlenecks will be amenable to that particular optimization).
So to summarize areas where Python has been left in the dust (probably an incomplete list):
* Package management
* Static type system
Great point about package management. Go already had it figured out. Node was busy figuring it out (npm shrinkwrap; still sucked but less so).
Any time I talked to people about Python the conversation was about 2to3 and core team drama lol
For the types of applications built with Django, async is actually more of a hindrance than a benefit. Async I/O can create massive back pressure in a distributed system.
A simple example would be a web service that gets a massive influx of traffic (organic or DDoS, doesn't matter). If the web service is using async I/O to manage database connections with an unlimited connection pool (often the default), then it will happily accept all incoming requests and push all of that pressure onto the database. The database gets overwhelmed because it's inherited all of that traffic. The database starts refusing connections, not just to the web service but to any service that connects to it. And boom, you have a system-wide outage.
Obviously the database should guard against this kind of overwhelming traffic but that's often thought about last.
Diagnosing the problem becomes a lot harder since your bottleneck is further downstream. It gets messy.
Synchronous services on the other hand create a nice throttle. Your Django app is never going to achieve the kind of throughput that your database will. So, by design, your Django app gets overwhelmed first: it simply stops accepting new connections once all its workers are busy (or it runs out of memory).
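For what it's worth, an async service can recreate that throttle, but only by capping concurrency explicitly. A minimal sketch with asyncpg's bounded pool (the DSN, table, and handler are made up):

    import asyncpg  # assumed dependency

    async def make_pool():
        # max_size caps open connections; excess requests queue inside
        # the app instead of piling onto Postgres.
        return await asyncpg.create_pool(
            dsn="postgres://localhost/app",  # hypothetical DSN
            min_size=5,
            max_size=20,
        )

    async def handle_request(pool, user_id):
        # acquire() waits (asynchronously) when the pool is exhausted,
        # recreating the throttle a sync worker model gives you for free.
        async with pool.acquire() as conn:
            return await conn.fetchrow(
                "SELECT * FROM users WHERE id = $1", user_id)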
That being said I still think it makes sense for Django to support it. Not for a chat app, but say... web sockets. You may want a websocket connection that pushes notifications back to a single-page-app. The notifications will contain a bunch of contextual data that the Django framework provides easy access to (think ORM, templates, etc). Obviously you could build the websocket thing as a separate service but then you lose all the goodies that Django provides to you.
"Delete node-modules and try again" seems to be all to common in our developer chat system
All that said, I strongly believe that solving the right problem and working in a language that more of the dev team / company are comfortable with deploying / supporting / integrating with are way more important. I've seen far too many times what happens when a team at a BigCo "goes rogue" and wants to use something nonstandard, then has to re-solve dozens of solved problems because of that decision. It's never been worth it in my experience.
See the inflection date? End of 2007. Guess when that comic was published.
Async support was clunky at best... The Django way of doing things has traditionally been to either use a queueing system like Celery or multi-threading, but I find first-class async like in Node is much easier for many typical web tasks (database I/O and calling external services).
I also faced issues with libraries for menial tasks. My setup was Django + React and JWT was poorly supported - the most popular auth library needed quite a bit of manual patching in code to make it all work. This was very surprising... I walked away with the feeling that some of the libs for what should be common use cases nowadays are not "production-grade".
DRF, while powerful and well-documented, also had a decent number of quirks for things that weren't all that far out at the edges.
Raw deployment on PaaS like Heroku and Elastic Beanstalk (tried both during the project, ended up settling on the latter) without Docker or K8s, also a bit messier than I'd have liked at times, but ultimately OK.
Things I really liked: models and the ORM (again the occasional niggle at the edges but good 98% of the time), most of DRF, Python itself, Python types, the admin panel, management commands, the custom commands "framework", testing.
But as it stands, with my qualms above, I think for the next project that deserves a full-batteries framework, I might give something like .NET or Spring + Kotlin a chance.
- Async... The Django way of doing things... Celery -> it's a background task system which is like bringing a bazooka to kill a fly for what you wanted to do (I assume proxying the request to an external API).
- database I/O and calling external services -> Use gunicorn's gevent worker type, which brings up pseudo-threads.
- JWT was poorly supported -> There are at least 5 battle-tested libraries that you can use to implement JWT alone (simplejwt comes to mind first)
- the most popular auth library needed quite a bit of manual patching in code to make it all work -> Which one? There are a few that are standard, and most are easy to customize via global settings.
- Raw deployment on PaaS like Heroku and Elastic Beanstalk -> that's more on the platform rather than Django itself
- DRF -> I agree, DRF has its own way of doing things and deviating from them can give you massive headaches.
We actually used it for things that were close to its purpose. Mainly scheduled background tasks. However, from what I could see in the Django world, sometimes there's also a temptation to use it for heavy CPU- or network-bound tasks.
> Use gunicorn's gevent worker type, which brings up pseudo-threads
My point still stands, more complex than async in Node.js, C# or Spring for example.
> There are at least 5 battle-tested libraries that you can use to implement JWT alone (simplejwt comes to mind first)
Exactly what we used: dj-rest-auth (the supported one) with simplejwt plugin. And we encountered problems on a very simple use-case: simply sending the JWT to the client. I had to patch some code found in an obscure GitHub issue for the repo, which I cannot find right now, otherwise I would link to the issue itself.
> that's more on the platform rather than Django itself
Yes, mostly, I don't blame Django itself, but support for deployment is important for me and at the end of the day I do factor it in.
> DRF -> I agree, DRF has its own way of doing things and deviating from them can give you massive headaches.
I would add, sometimes the way of doing things was not that clear... It was not always clear whether the convention was to annotate some data or do some intermediate computations in the models or in the serializers, and opinions on the internet varied wildly. But overall that comes with the territory in software development, again I don't think DRF itself was to blame, it does many things extremely well, and does offer many clear conventions.
Which one did you use?
As to Flask, I would love to see it reach the milestone that Openbox has achieved of "being done" and only really being updated for bug+security fixes or base language feature updates. As a "micro framework", it has perfectly filled its necessary niche.
Both pessimistic and incorrect.
You can observe the vast number of libraries across all ecosystems that remain in use, as reliable libraries, long after active development has wound down. Not to say some niche software doesn't suffer the fate of "nothing new, so it's dead", but that's not the norm.
Web frameworks are never feature-complete, per se. There are always new technologies and workflows to support. Frameworks are usually considered dead when the development halts in exhaustion or in transition to another project (as individuals, the development team members drift to these modalities), not when they are "done" by some finite collection of features.
I've learned a lot of different languages and done various types of projects, but whenever I get free time to start a personal project that has a web component in my favorite language at the time, I'm always a little bit sad I inevitably have to build out something that was just included in Django a decade ago.
When your business application reaches a point of "non-triviality" (for lack of a better term) you'd (hopefully) realize that you need to ditch Node ASAP or face growing pains.
To understand why requires some historical context.
During the time when Node was becoming popular most scripting languages had weird bottlenecks at the request level.
PHP: When a request was received, your whole application would load into memory. You know how an application might read a config file on startup? Yea, every single request would load that file. There was no concept of "global variables across all requests". That's one OS process per request. Sometimes process forking was used to make that faster but it didn't matter that much because your entire application needed to load for every request to be handled.
Python: You'd use a toolkit like uWSGI to launch N number of processes when your server started. The processes would stay up and continue to serve requests. You'd avoid the overhead of initializing your app on every request, but you were still locked to one process per request because of the GIL.
Ruby: Basically the same as Python.
Java, C++, Go, etc: Too slow to develop in, too complicated for rapid iteration. Type systems scary. Weak dynamic typing fun. Productivity was king and you needed some time in the hard languages to git gud.
Okay so the fundamental problem with PHP/Python/Ruby was that you needed to tie up a single OS process per request. Let's say you wrote an HTTP API that would fetch the current time from time.gov or something. While your HTTP handler was fetching data from time.gov that OS process would be STUCK. It's sitting around and doing nothing while holding your whole application state in memory just for that one request you're serving.
This wasn't actually a problem, not really. You could serve a ridiculous amount of web requests using this model. Computers are fast! Until... you needed concurrency... like a chat app. Because in a chat app you need to keep a TCP connection open for every single user in the system. 500k active users? 500k OS processes just sitting there doing nothing with all of your application state in memory. Not a good fit for PHP/Python/Ruby where every open request was tying up all those resources.
So how would you solve it in C++/Java/etc? Easy, you'd use non-blocking IO. With some fancy connection pooling, threading, and use of the epoll_wait syscall (or whatever) you could handle all 500k users with a SINGLE OS process. That's because your chat app is really just routing bytes between client applications. Most of the time is spent in I/O, not in your application.
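To illustrate, a toy version of that single-process model using Python's stdlib selectors module (a thin wrapper over epoll and friends); like the early Node demos, it's just a byte router:

    import selectors
    import socket

    sel = selectors.DefaultSelector()  # epoll/kqueue under the hood

    def accept(server):
        conn, _ = server.accept()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, echo)

    def echo(conn):
        data = conn.recv(1024)
        if data:
            conn.sendall(data)  # just routing bytes, no business logic
        else:
            sel.unregister(conn)
            conn.close()

    server = socket.socket()
    server.bind(("localhost", 9000))
    server.listen()
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, accept)

    # One OS process, one loop, arbitrarily many open connections.
    while True:
        for key, _ in sel.select():
            key.data(key.fileobj)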
Except non-blocking IO is not easy. Threading is not easy. Connection pooling is not easy. Type systems, code compilation, etc etc are not easy.
All of the early demos of Node (that I saw) were basically chat apps. Applications with very little business logic that spend most of their time doing IO. It was more of a glorified router of bytes...
But then people realized that CPUs are actually really fast and you could start modeling basic web apps. All you're doing in most web apps is pulling some data out of a DB, decorating that data, and pushing it out of NodeJS. Single threading is no problem. So people started building on it... and building on it... until they needed to scale for real.
Some unfortunate souls didn't realize any of this and built complex business applications in Node. They struggled a lot with the single-threaded nature of this environment. That's when you started seeing things like Node's cluster module, which basically forks worker processes under the hood to distribute some of that CPU load. Over time better alternatives came along.
So the reason you don't see big complex frameworks in Node is because Node doesn't need them. Using a big complex framework means you're shoving way too much business logic into a technology that wasn't designed for it.
Express and Koa are basically peak Node "frameworks". They're effectively just syntax sugar for cleanly routing your requests somewhere else. Need something more than that? Look elsewhere or you'll regret it later!
EDIT: Fixed a bunch of typos. Did not expect to write so much.
When dealing with plain JS, all of this typing has to live in the developer's head. Too hard and too unreliable.
TypeScript seems to have its heart in the right place though.
And then you had PHP, where you didn’t need any middleware and could just drop a file into your webroot and it just worked.
When I see a project this well kept I'm likely to assume there's a plan and it is being efficiently executed on.
When a project is a huge backlog of unattended issues and PRs, it is much more likely that progress is slow and there's duplicated effort.
And it's no dig on Flask specifically; you can play this game with Django too: large Django projects end up half-implementing ActiveJob and ActiveSupport, but worse.
This is an honest question; I don't know much about Python async yet.
To me the biggest advantage of FastAPI is the excellent integration with type hinting. It permeates the entire framework and makes things really productive.
When I saw how you could use Pydantic models to define the schema of your request and response body, and then have that autogenerated into documentation - there was no going back. https://fastapi.tiangolo.com/tutorial/body/
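Roughly the pattern from that tutorial (Item and the route are illustrative):

    from typing import Optional

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Item(BaseModel):
        name: str
        price: float
        description: Optional[str] = None  # optional field

    @app.post("/items/")
    async def create_item(item: Item):
        # FastAPI has already validated the JSON body against Item by the
        # time this runs; the same model feeds the auto-generated /docs.
        return item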
But I would like to stress that, though it has displaced Flask in my workflow, I think of FastAPI as a spiritual successor to Flask, and would never say an ill word about Flask.
And I do miss having such industrial grade documentation for my framework. I miss it real bad.
With a proper async framework, you will keep accepting all the requests as they come -- a single node can take thousands of those requests and time out on them gracefully, as they are just lightweight entries in the event loop, and you can have lots of those compared to # of worker threads.
Of course, a fully async framework requires all your code to be fully async, which in practice means minimizing the number of dependencies, since it's hard to trust them. And there are problems with SDKs, for instance on GCP, where Google keeps lacking async support (!) for most of their offerings. So with the "hybrid" async you get from Flask you can still choose to use sync code for some things and async for others.
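A minimal sketch of that hybrid, assuming Flask >= 2.0 installed with the async extra (`pip install "flask[async]"`); the routes are made up:

    import asyncio
    from flask import Flask

    app = Flask(__name__)

    @app.route("/sync")
    def sync_view():
        return "plain sync view"  # untouched legacy code keeps working

    @app.route("/async")
    async def async_view():
        # Flask runs the coroutine to completion for this request, so you
        # can await async-only SDKs without converting the whole app.
        await asyncio.sleep(0.1)
        return "async view"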
pydantic validation throwing a 422 with what’s wrong before my endpoint ever gets hit is a fucking superpower.
As a side note, not sure if it's just me, but I feel a sort of "tip jar effect" with projects that have no open PRs/issues. I feel like I'd be far less willing to submit one if there's no others there. Like all eyes would be on me if I were to do so. Something about adding your issue to the pile just feels a little more welcoming. Just me?
It just does an incredible job of staying out of the way and never having become some bloated beast which ends up causing problems due to some misplaced voracious appetite for eating as many batteries to include as possible.
Yet both repositories have closed the same number of issues during the last month, but FastAPI only merged 2 PRs while Flask has merged 21.
Popularity probably plays into this a lot, but it's also just very clear that whoever is doing the management of Flask, is doing an excellent job! Kudos.
It's unfortunate, because FastAPI is one of the best libraries I know of otherwise in terms of capabilities for time spent writing code. Nowadays, though, when recommending this lib at work, I need to add "but pray you don't hit a bug or missing docs, or you're on your own".
On top of that it definitely is in need of more top-down organization, delegation, and coordination.
Flask will always be my go-to framework, but I am looking forward to actually sitting down and learning FastAPI and NestJS with TypeScript.
FastAPI for my use case doesn't offer anything new, but the hype is deafening, and Typer CLI seems like a good way to build CLI projects.
And NestJS is something I should learn simply because I should. I didn't enjoy Django, but I need to learn a "professional" backend framework and also TypeScript.
I wish Flask could be a little bit more batteries-included on that front. It wouldn't hurt the lightness of the project to have automatic OpenAPI documentation; none of the current options are viable...
That's why I switched to FastAPI, but I miss @app_context and Flask-Security...
Flask with another batteries included framework is a nice combination. In my opinion, they both serve distinct use cases.
What didn't you enjoy about Django?
Compared to Flask there was a bit too much boilerplate involved in Django to essentially build very barebones APIs. My use case for a backend framework involves processing data on command and serving that data. I need a few API endpoints that trigger other Python files. I build internal-use APIs. So Flask, particularly with Flask-RESTful, is perfect for me.
I am not the intended user of Django. I was part of a few Django projects and I saw people investing more in configuring the framework right than in the actual functionality of the application. Even though they were building an MVP, the initial investment in foundation-building with hopes of scalability for a proof-of-concept project just feels wrong to me.
Take a look at them yourself:
I am quite disgusted by self-congratulatory tweets like that, TBH.
On tickets where the community posted questions and asked for help, he just shut people down.
That was the first issue I clicked on, and if that's being rude I wish every maintainer was as rude.
I don’t blame these people for that kind of behavior though, I feel like shrugging your shoulders and saying “If you want that feature, pay to sponsor the project” is a legitimate response. I just think that actively shutting down discussion is going a little bit too far.
One of the first things that struck me reading the tutorial was the use of "g" which seems to stand for "global" and is used to access things like the DB via "g.db".
This seems similar to using $GLOBALS['db'] in PHP which would be a code smell and I think it's impossible as of PHP 8.1.
If I saw this in a PHP Framework I would assume it was made by an amateur. Can anyone explain if this is common practice in Python and perhaps why it's different and not a worry?
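(For reference, `g` is scoped to a single request/app context rather than being a true process-wide global; the tutorial pattern looks roughly like this:)

    import sqlite3
    from flask import Flask, g

    app = Flask(__name__)

    def get_db():
        # One connection per request: a concurrent request gets its own
        # `g`, so nothing is shared across requests.
        if "db" not in g:
            g.db = sqlite3.connect("app.db")  # hypothetical DB file
        return g.db

    @app.teardown_appcontext
    def close_db(exc):
        # Runs when the request's app context ends.
        db = g.pop("db", None)
        if db is not None:
            db.close()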
There's not much we can do about it I think. It's the blessing and curse of Flask.
For example you can combine Flask + Flask-Classful + Flask-Marshmallow + Flask-JWT-Extended to build an API in a really really nice way IMO, but OpenAPI specs are lacking because these are 4 distinct tools that are being glued together.
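As a rough sketch of that glue (using plain marshmallow for brevity; UserView, UserSchema, and the secret are made up):

    from flask import Flask, jsonify
    from flask_classful import FlaskView
    from flask_jwt_extended import JWTManager, jwt_required
    from marshmallow import Schema, fields

    app = Flask(__name__)
    app.config["JWT_SECRET_KEY"] = "change-me"  # hypothetical secret
    jwt = JWTManager(app)

    class UserSchema(Schema):
        id = fields.Int()
        name = fields.Str()

    class UserView(FlaskView):
        decorators = [jwt_required()]  # auth from Flask-JWT-Extended

        def index(self):
            users = [{"id": 1, "name": "ada"}]  # stand-in for a DB query
            return jsonify(UserSchema(many=True).dump(users))

    UserView.register(app)  # Flask-Classful wires up GET /user/

    # Works nicely together, but nothing here emits an OpenAPI spec.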
If generating OpenAPI specs is a top priority someone would need to bundle together the major components of an API (routing + serialization / validation + authorization) into 1 library in such a way that it's better than combining a few smaller libraries together. But if you're looking to build such a large library then it kind of goes against Flask's philosophy of being a micro-framework where you bring your own batteries.
If you want full blown batteries included for an API based app then you're probably better off using Rails (even for building APIs) or Django with DRF.
I'll admit it's not a great outcome tho. I really do think Flask-Classful + Flask-Marshmallow + Flask-JWT-Extended is a great combo, but as someone who has been using Flask for 7+ years, has courses on Flask, and has built a ton of apps with it: when a client wants to build an API-driven app where OpenAPI specs are important, I've gone with other solutions. Thankfully most of the apps I build return HTML, and Flask is still great for building non-API-based apps.
If you're doing any kind of database / external network calls, won't you gain a huge number of requests per second?
I ask partly because on one of my projects, we have a fastapi app, but it's not using async, and I have been toying with the idea of converting it to async. It will take some work though because it uses libs that don't support async.
I thought the advantages would be worth it because we hit postgres and/or redis on all requests.
For any other use case you are better off going with the simplicity of a sync framework like Flask. For databases you are way worse off with async; see why from the Python database god himself: https://techspot.zzzeek.org/2015/02/15/asynchronous-python-a...
Not sure what libs you use, you'll probably need to switch libraries, it's not common for one lib to support both sync/async. I'm using aiohttp for http/websockets and asyncpg for postgres, running on uvloop as the base event loop.
Something which is trivial with asyncio: I receive 10 consecutive heavy websocket requests, I spawn a task to handle each one of them, and they get served back out of order as Postgres returns results. With no locking or messing with threads.
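A minimal sketch of that pattern with aiohttp + asyncpg (the message format and query are made up):

    import asyncio
    import asyncpg
    from aiohttp import web

    async def handle_message(ws, pool, msg_id):
        async with pool.acquire() as conn:
            # stand-in for a heavy query; finishes at a random time
            row = await conn.fetchrow(
                "SELECT pg_sleep(random()), $1::int AS id", msg_id)
        await ws.send_json({"id": row["id"]})  # replies land out of order

    async def websocket_handler(request):
        ws = web.WebSocketResponse()
        await ws.prepare(request)
        pool = request.app["pool"]  # assumed to be set up at startup
        async for msg in ws:
            # one task per message: no threads, no locks
            asyncio.create_task(handle_message(ws, pool, int(msg.data)))
        return ws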
Another benefit is knowing for certain your code won't be interrupted unless you call "await"; this greatly simplifies using shared state without requiring locks.
Only if they can be parallelized. I'm not sure if I would choose Flask if I had such a complex application.
[ed: the lack of async probably means little for how good flask is for a new project - it's a great little framework].
However, with the birth of FastAPI, I would be hard-pressed to name a greenfield project/situation where I would recommend Flask over FastAPI. And the support for async in FastAPI has nothing to do with that.
Also see Quart https://github.com/pgjones/quart (which is Flask re-implemented with async-await) and Quart-Schema, https://github.com/pgjones/quart-schema for swagger docs.
Wontfix is the new completed, from what I can tell.