One thing I’d add to this conversation, though I’m certain it’s already been stated: As many have mentioned, there is a large subset of the user base that uses Python for applied purposes in unrelated fields that couldn’t care less about more granular aspects of optimization. I work as a research assistant for international finance faculty and I would say that compared to the average Hackernews reader, I’m technologically illiterate, but compared to the average 60-80 y/o econ/finance faculty member, I’m practically a Turing award winner.
Most of these applied fields are using Python and R as no more than data gathering tools and fancy calculators, something for which the benefits of other languages just aren't justified.
The absolute beauty of Python for what I do is that I can write code and hand it off to a first year with a semester of coding experience. Even if they couldn’t write it themselves, they can still understand what it does after a bit of study. Additionally, I can hand it off to 75-year-old professors who still send fax memos to the Federal Reserve and they’ll achieve a degree of comprehension.
For these reasons, Python, although not perfect, has been so incredibly useful.
I just want to add to this: I had this exact same experience when working with journalists and other programmers from non-technical backgrounds.
You’ll find everyone from philosophy PhDs to biologists to journalists who use pandas because it’s so easy to learn and work with. It’s amazing how you can become productive in Python/pandas without any experience or even a basic understanding of programming, because of how accessible Jupyter, Colab and the blogs/docs on pandas are.
The other thing people don’t talk about is that a lot of these organizations can hire a CS student part time or a full time software engineer/data engineer/data scientist who can optimize their scripts once they are written. Pretty much any software engineer can read and debug python code without needing to learn python. So for example, I know some engineers working in genomics who have turned biologist-written scripts that take several days to run in python into scripts that take hours or minutes to run by doing basic optimizations like removing quadratic algorithms from the script or applying pyspark or dask to add parallelism.
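For what it's worth, the "removing quadratic algorithms" fix is often as mundane as swapping a list membership test for a set. A hypothetical before/after, with made-up names, just to illustrate the kind of change:

def flag_known_genes_slow(records, known_genes_list):
    # list membership is a linear scan, repeated for every record: O(n*m)
    return [r for r in records if r["gene"] in known_genes_list]

def flag_known_genes_fast(records, known_genes_list):
    known = set(known_genes_list)   # one-time conversion, then O(1) lookups
    return [r for r in records if r["gene"] in known]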
The fact that python can be used as a bridge between technical and non-technical people is amazing and I think it has provided a better bridge between these groups than SQL was ever able to provide.
I couldn’t agree more. And I must say, now that it’s being used as a bridge between technical and nontechnical talent it’s becoming ever more vital from a career perspective. Most people recognize the value of fundamental coding skills and if you’re even just above average at coding in a non-CS field, you seem magnitudes more valuable than you really are. In both industry and research, ears immediately perk up when they realize I have a background in economics but competencies in coding beyond the standard regressions in R that everyone does in econometrics. It’s hilarious because as mentioned prior, I’m rather pathetic compared to most people on this forum.
Yeah, Python is widely used where I work for just that. The "hierarchy" of tools looks somewhat like this - from most to least technically competent users:
1) Languages like Python / R / Julia / etc. + SQL
2) PowerBI, Tableau, or similar tools
3) Excel
The number of users of those tools will be the inverse, with Excel being number 1.
If you're competent using the "stack" above, you could probably work as an analyst anywhere - given that you can pick up domain knowledge.
I hate to admit that I very often start the python repl to just do some simple calculations. I always have multiple terminals open so instead of opening a calculator I just use python in one of the terminals.
Agreed. Python's REPL has basically totally replaced my usage of Emacs calc as a desk calculator, mainly because it is always there and if I don't know the big-brain closed-form solution for something like compound interest, I can just write a loop and figure it out that way.
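For example (my own illustration, not a quote from anyone above), the "just write a loop" version of compound interest looks like this:

balance = 1000.0
for year in range(10):          # 10 years of 5% compounding
    balance *= 1.05
print(balance)                  # ~1628.89

print(1000.0 * 1.05 ** 10)      # the closed-form answer, same result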
This is a really good line, the VAST VAST majority of programming in the world is done in Excel by people who would be horrified if you told them they were programming.
And I wouldn't be surprised if a large number of python programmers would say they're not programming, it's just scripting.
I also use a Python REPL as an alternative to Excel or SQL. I find myself just downloading the data as a CSV and then quickly cooking up some pandas to get a graph or aggregate some stats; it’s just so much quicker and easier imo.
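A rough sketch of that workflow (file and column names are invented):

import pandas as pd

df = pd.read_csv("sales.csv", parse_dates=["date"])
monthly = df.groupby(df["date"].dt.to_period("M"))["revenue"].sum()
print(monthly.describe())
monthly.plot(kind="bar")        # quick-and-dirty chart (needs matplotlib)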
I’ve migrated to the tidyverse for most of my EDA and plotting - I’ve found dplyr and ggplot to be noticeably more expressive. Pandas always added a ton of friction for me.
It’s still my choice for quick, non-graphical analysis when I’m on a remote machine.
A bit off topic, but what would you use for data "mangling"? Like joining csvs on complex conditions, cleaning tables etc. Pandas seems to be the wrong tool for this, but I still often find myself using it as in contrast to something like Excel, my steps are at least clearly documented for future use or verification.
If you asked this question 6 or 8 years ago the answer would be it depends on the volume of data (10s of gb, 100s of gb etc.) and I could give you just a single tool that would help you in most cases.
Today honestly most tools are pretty capable, pandas is a great choice and if you have really high volumes of data you might try koalas (spark) or polars.
Honestly the biggest design considerations for data science today are things external to your project: what do you and others on your team know, what tools does your company already have set up, what volume of data are you processing, what are your SLAs, who or what else needs to run this script/workflow, what software do you need to integrate with, how often does it need to be processed, how are you going to assure the quality of your data, and what tools are you using for reporting?
I tend to use pandas and SQLite for most use cases cause I can cook up a script in 2 hours and be done. I just code it interactively in a notebook, and most people are able to work on a pandas or SQLite script productively if it needs to be maintained, even if they don't know Python. If it's a large volume of data, a rapid schedule (minutes, seconds), or tight SLAs on quality or processing time, then I start to consider whether PySpark, Apache Beam, Dask or BigQuery might be a good fit.
So it really just depends but for most people who are processing < 100 GB on a 1+ day schedule or ad hoc I would recommend just using pandas or tidyverse in R and getting really good at writing those scripts fast. Today you’ll get the most mileage out of those two tools.
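As a rough illustration of the pandas + SQLite combo (file and table names are invented; SQL does the join/aggregate, pandas handles anything fiddly afterwards):

import sqlite3
import pandas as pd

conn = sqlite3.connect("warehouse.db")
pd.read_csv("orders.csv").to_sql("orders", conn, if_exists="replace", index=False)

summary = pd.read_sql_query(
    "SELECT customer_id, COUNT(*) AS n_orders, SUM(amount) AS total "
    "FROM orders GROUP BY customer_id",
    conn,
)
print(summary.sort_values("total", ascending=False).head())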
This is a letter to the general community: please stop writing these scripts in perl and bash one liners. That one off script you thought would only be used once or twice at this nonprofit has been in continuous use for 12 years and every year a biologist or journalist runs your script having no idea how it actually works. Eventually the script breaks after 8 years and some poor college student interns there and has to figure out how perl works, what your spaghetti is doing and eventually is tasked with rewriting it in python as an intern project (true story).
I think your complaint isn't really about perl and bash. It's about knowing your audience.
When writing code that will be used by a particular sort of user base, the code should be written in whatever way best suits that user base. If your users are academics, researchers, journalists, etc. -- yes, avoid anything with complex or obscure semantics like perl or bash.
But if your code is going to be used by programmers or people who are already comfortable with perl/bash/whatever, those tools may be just the ticket.
He has a valid point, though. I've seen (and written!) one-liners that were so complex that nobody, not even devs, could deal with them without decoding them first.
They aren't technically "spaghetti", but they are technically impenetrable.
I argue that one-liners like that aren't good for anybody, dev or otherwise.
I don't see why that's something to be ashamed of. I frequently pop open a Ruby on Rails console for this purpose. (Basically ruby's repl + libraries and language extensions.)
From time to time, yes. Ideally I would also have a Jupyter notebook running at all times, but in the end it mostly comes down to vanilla Python because that's installed on everything I am using.
I've seen this too. Python has supplanted what used to be done in a spreadsheet entirely, even the custom VBA macro stuff that was once a high-level spreadsheet. Python with/plus viz is a more enjoyable experience than trying to wrangle some general purpose spreadsheet into doing this stuff. And it's relatively portable and transferable, which are major advantages over spreadsheets.
I'm one of Python's biggest critics (to me it's a Monkey's Paw of software development), but I think this is exactly the appropriate situation to use it. It's great for one-off fancy calculations and system scripts, ideally with no dependencies and/or a short lifetime.
> to me it's a Monkey's Paw of software development
This piqued my curiosity. I've worked with Python on and off for the last ~20 years, and while I'm not a fanboy or apologist, and use other tools when appropriate, there's also a reason it remains in my toolbox and sees regular use while many other tools have come/gone/been replaced by something better.
Can you share an example scenario where it's a Monkey's Paw? My suspicion is that this is more of an org issue than a tech issue?
Dependency management/tooling. Python (philosophically) treats the whole system as a dependency by default, in contrast with other modern languages that operate at the project/workspace level. This means it's very hard to isolate separate projects under the same system, or to reproducibly get a project running on a different system (even the same OS, because the system-wide state of one machine vs the next matters so much).
People work around these issues with various kludges like virtual environments, Docker (just ship the whole system!), and half a dozen different package managers, each with their own manifest format. But this is a problem that simply doesn't exist in Go, JavaScript, Rust, and others.
For code that never needs anything except the standard library, or for a script that never needs to be maintained or run on a different machine, Python is fine. Maybe even nice. But I've watched my coworkers waste so many hundreds of developer-hours just trying to wrangle their Python services into running locally, managing virtual environments, keeping them from trampling on each other's global dependencies, following setup docs that don't work consistently, and fixing deployments that fail every other week because the house is built on sand.
Virtualenvs and requirements files have been a thing in Python for ages.
I’ve used tons of languages and while not the best, Python dependency management and project isolation is decent. IMO certainly better than JavaScript.
It's decent if you've been in the loop enough to use it. It's not built-in. It's a good practice, for sure, but it not being built-in at the language level makes it insanely easy for a newcomer to just... Not use virtualenvs at all.
In contrast to Javascript/Node.js/NPM/Yarn/whatever-you-want-to-call-server-js, which maintains a local folder with dependencies for your project, instead of installing everything globally by default.
Heck, a virtual env is literally a bundled Python version with the path variables overridden so that the global folder is actually a project folder, basically tricking Python into doing things The Correct Way.
It's been said, quite correctly, that Python is the second best language for everything.
I feel that it has recently - like many really mature platforms - become very much like the elephant from that old apocryphal story [0]. It is being used for many different purposes, with very different requirements and needs, with users being so focused on their own use that anything outside that is considered "bloat" and "waste".
When it comes to slightly less simple use cases involving parallelism and concurrency, Python and its imperative kin start falling quite short of basic needs that are easily satisfied by FP languages like OCaml, Haskell, Racket, Common Lisp, Erlang and Elixir, or by Rust/Go.
But even if the code is single-threaded and not hampered by GIL limitations, Python tends to be super slow imho; also, debugging dynamic, imperative, stateful Python after a certain codebase size (>10k LOC) gets extremely painful.
A lot of these problem spaces can get away with single-threaded performance because maybe they're generating a report or running an analysis once a day or at an even slower frequency. I work in a field where numerical correctness and readability are important for prototyping control algorithms (I work on advanced sensors), and Python satisfies those properties for our analysis and prototyping work.
When we really want or need performance we rewrite the slow part in C++ and use pybind to call into it. For all the real implementations that run in a soft real time system, everything is done in C++ or C depending on the ecosystem.
Just because you say it doesn't make it true. It's not that painful, or painful at all really. Good abstractions and planning make writing and maintaining a Python codebase easy, just like in any language.
>As many have mentioned, there is a large subset of the user base that uses Python for applied purposes in unrelated fields that couldn’t care less about more granular aspects of optimization.
Nobody cares about this that much. Even a straight up software developer in python doesn't care. The interpreter is so slow that most optimization tricks are irrelevant to the overall bottleneck. Really optimizing python involves the FFI and using C or C++, which is a whole different ball game.
For the average python developer (not a data scientist) most frameworks have already done this for you.
Python keeps growing in number of users because it’s easy to get started, has libraries to load basically any data, and to perform any task. It’s frequently the second best language but it’s the second best language for anything.
By the time a python programmer has «graduated» to learning a second language, exponential growth has created a bunch of new python programmers, most of whom don’t consider themselves programmers.
There are more non-programmers than programmers in this world, and they don’t care - or know - about concurrency, memory efficiency, or L2 cache misses due to pointer chasing. These people all use Python. This seems to be a perspective missing from most hackernews discussions, where people work on high performance Big corp big data web scale systems.
What worries me, though, is that the features that make Python quite good at prototyping make it rather bad at auditing for safety and security. And we live in a world in which production code is prototyping code, which means that Python code that should have remained a quick experiment – and more often than not, written by people who are not that good at Python or don't care about code quality – ends up powering safety/security-critical infrastructures. Cue in the thousands of developer-hours debugging or attempting to scale code that is hostile to the task.
I would claim that the same applies to JavaScript/Node, btw.
I sometimes think about what Python would be like if it were written today, with the hindsight of the last thirty years.
Immutability would be the default, but mutability would be allowed, marked in some concise way so that it was easy to calculate things using imperative-style loops. Pervasive use of immutable instances would make it impossible for libraries to rely on mutating objects a la SQLAlchemy.
The language would be statically type-checked, with optional type annotations and magic support for duck typing (magic because I don't know how that would work.) The type system would prioritize helpful, legible feedback, and it would not support powerful type-level programming, to keep the ecosystem accessible to beginners.
It would still have a REPL, but not everything allowed in the REPL would be allowed when running code from a file.
There would be a strong module system that deterred libraries from relying on global state.
Support for at least one fairly accessible concurrency paradigm would be built in.
I suspect that the error system would be exception-based, so that beginners and busy people could write happy path code without being nagged to handle error values and without worrying that errors could be invisibly suppressed, but there might be another way.
I think free mutability and not really needing to know about types are two things that make the language easier for beginners.
If someone who's not familiar with programming runs into an error like "why can't I change the value of X" that might take them multiple hours to figure out, or they may never figure it out. Even if the error message is clear, total beginners often just don't know how to read them and use them.
They provide longer term advantages once your program becomes larger but the short term advantages are more important as a scripting language imo
The type system I want would just be a type system that tells you that your code will fail, and why. Pretty much the same errors you get at runtime. Hence the need for my hypothetical type system to handle duck typing.
I don't think mutability by default is necessary for beginners. They just need obvious ways of getting things done. There are two places beginners use mutability a lot. The first is gradual transformation of a value:
line = "The best of times, the worst "
line = line.strip()
line = line[:line.find(' ')]
This is easily handled by using a different name for each value. The second is in loops:
word_count = 0
for line in lines():
    word_count += num_words(line)
I think in a lot of cases beginners will have no problem using a map or list comprehension idiom if they've seen examples:
word_counts = [num_words(line) for line in lines]
# or word_counts = map(num_words, lines)
word_count = sum(word_counts)
But for cases where the immutable idiom is a bit tricker (like a complicated fold) they could use a mutable variable using the mutability marker I mentioned. Let's make the mutability marker @ since it tells you that the value can be different "at" different times, and let's require it everywhere the variable is used:
word_count @= 0
for line in lines():
    word_count @= word_count + num_words(line)
Voila. The important thing is not to mandate immutability, but to ensure that mutability is the exception, and immutability the norm. That ensures that library writers won't assume mutability and rely on it (cough SQLAlchemy cough), and the language will provide good ergonomic support for immutability.
It's a common claim that immutability only pays off in larger programs, but I think the mental tax of mutability starts pretty immediately for beginners. We're just used to it. Consider this example:
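Something like the classic aliasing gotcha (my reconstruction of the kind of example meant here):

a = [1, 2, 3]
b = a              # b is another name for the same list, not a copy
b.append(4)
print(a)           # [1, 2, 3, 4] -- wait, why did a change?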
Beginners shouldn't have to constantly wrestle with the difference between value semantics and reference semantics! This is the simplest possible example, and it's already a mind-bender for beginners. In slightly more complicated guises, it even trips up professional programmers. I inherited a Jupyter notebook from a poor data scientist who printed out the same expression over and over again in different places in the notebook trying to pinpoint where and why the value changed. (Lesson learned: never try to use application code in a data science calculation... lol.) Reserving mutability for special cases protects beginners from wrestling with strange behavior from mistakes like these.
Julia is both dynamic and fast. It doesn’t solve all issues but uniquely solves the problem of needing 2 languages if you want flexibility and performance.
Exception-based error handling - and its extensive use in the standard library - is the fundamental design mistake that prevented Python from becoming a substantial programming language.
Coupled with the dynamic typing and mutability by default, it guarantees Python programs won't scale, relegating the language to the role of a scratchpad for rough drafts and one off scripts, a toy beginner's language.
I have no idea why you say that it's a scratchpad or a toy language considering that far more production lines of code are getting written in Python nowadays than in practically any other language, with the possible exception of Java.
But that's the same with Excel: massive usage for throwaway projects with loose or non-existent requirements or performance bounds that end up in production. Python is widely used, but not for substantial programming in large projects - say, projects over 100 kloc. Python hit the "quick and dirty" sweet spot of programming.
This is absolutely not true. I’ve made my living working with Python and there’s an astounding number of large Python codebases. Instagram and YouTube alone have millions of lines of code. Hedge funds and fintechs base their entire data processing workflows around Python batch jobs. Django is about as popular as Rails and powers millions of websites and backends.
None of those applications are toys. I have no idea where your misperception is coming from.
I guess I'm more than a little prejudiced from trying to maintain all sorts of CI tools, web applications and other largeish programs somebody initially hacked together in Python in an afternoon and which grew to become "vital infrastructure". The lack of typing bites you hard, and the optional typing that has been shoehorned into the language is irrelevant in practice.
All sorts of problems would simply have not existed if the proper language was used from the beginning, as opposed to the one where anyone can hack most easily.
We still live in a world where many outward facing networked applications are written in C. Dynamic languages with safe strings are far from the floor for securable tools.
However, I hope that these C applications are written by people who are really good at C. I know that some of these Python applications are written by people who discovered the language as they deployed into production.
That’s a measure of programming prowess, not the actual security concern at hand.
If the masterful C developer still insists on using a language that has so many footguns, and a weird culture of developers pretending that they’re more capable than they are, then their C mastery may well not be worth much against someone throwing something together in Python, which will at the very least immediately bypass the vast majority of vulnerabilities found in C code. Plus, my experience with such software is that the sort of higher-level vulnerabilities that you’d still see in Python code aren’t ones that the C developer has necessarily dealt with.
A popular opinion in game development is that you should write a prototype first to figure out what works and is fun, and once you reach a good solution, throw away that prototype code and write a proper solution with the insights gained. The challenge is that many projects just extend the prototype code to make the final product, and end up with a mess.
Regular software development is a lot like that as well. But you can kind of get around that by having Python as the "prototyping language", and anything that's proven to be useful gets converted to a language that's more suitable for production.
What audits need most is some ability to analyze the system discretely and really "take it apart" into pieces that they can apply metrics of success or failure to (e.g. pass/fail for a coding style, number of branches and loops, when memory is allocated and released).
Python is designed to be highly dynamic and to allow more code paths to be taken at runtime, through interpreting and reacting to the live data - "late binding" in the lingo, as opposed to the "early binding" of a Rust or Haskell, where you specify as much as you can up front and have the compiler test that specification at build time. Late binding creates an explosion of potential complexity and catastrophic failures because it tends to kick the can down the road - the program fails in one place, but the bug shows up somewhere else because the interpreter is very permissive and assumes what you meant was whatever allows the program to continue running, even if it leads to a crash or bad output later.
Late binding is very useful - we need to assume some of it to have a live, interactive system instead of a punchcard batch process. And writing text and drawing pictures is "late binding" in the sense of the information being parsed by your eyes rather than a machine. But late binding also creates a large surface area where "anything can happen" and you don't know if you're staying in your specification or not.
There are many examples, but let's speak for instance of the fact that Python has privacy by convention and not by semantics.
This is very useful when you're writing unit tests or when you want to monkey-patch a behavior and don't have time for the refactoring that this would deserve.
On the other hand, this means that a module or class, no matter how well tested and documented and annotated with types, could be entirely broken because another piece of code is monkey-patching that class, possibly from another library.
Is it the case? Probably not. But how can you be sure?
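To make the concern concrete, a toy sketch (class and function names invented, not from any real library):

class Account:
    def __init__(self, balance):
        self._balance = balance            # "private" by convention only

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

# ...meanwhile, in some dependency you never read:
def _withdraw_no_checks(self, amount):
    self._balance -= amount                # the guard is silently gone

Account.withdraw = _withdraw_no_checks     # every Account everywhere now behaves differently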
Another (related) example: PyTorch. Extremely useful library, as we have all witnessed for a few years. But that model you just downloaded (dynamically?) from Hugging Face (or anywhere else) can actually run arbitrary code, possibly monkey-patching your classes (see above).
Is it the case? Probably not. But how can you be sure?
Cue in supply chain attacks.
That's what I mean by auditing for safety and security. With Python, you can get quite quickly to the result you're aiming for, or something close. But it's really, really, really hard to be sure that your code is actually safe and secure.
And while I believe that Python is an excellent tool for many tasks, I am also something of an expert in safety, with some experience in security, and I consider that Python is a risky foundation to develop any safety- or security-critical application or service.
There's also the argument that at a certain scale the time of a developer is simply more expensive than time on a server.
If I write something in C++ that does a task in 1 second and it takes me 2 days to write, and I write the same thing in Python that takes 2 seconds but I can write it in 1 day, the 1 day of extra dev time might just pay for throwing a more high performance server against it and calling it a day. And then I don't even take the fact that a lot of applications are mostly waiting for database queries into consideration, nor maintainability of the code and the fact that high performance servers get cheaper over time.
If you work at some big corp where this would mean thousands of high performance servers that's simply not worth it, but in small/medium sized companies it usually is.
Realistically something that takes 1 second in C++ will take 10 seconds (if you write efficient python and lean heavily on fast libraries) to 10 minutes in python. But the rest of your point stands
I spend most of my time waiting on IO, something like C++ isn't going to improve my performance much. If C++ takes 1ms to transform data and my Python code takes 10ms, it's not much of a win for me when I'm waiting 100ms for IO.
With Python I can write and test on a Mac or Windows and easily deploy on Linux. I can iterate quickly and if I really need "performance" I can throw bigger or more VPSes at the problem with little extra cognitive load.
I do not have anywhere near the same flexibility and low cognitive load with C++. The better performance is nice but for almost everything I do day to day completely unnecessary and not worth the effort. My case isn't all cases, C++ (or whatever compiled language you pick) will be a win for some people but not for me.
And how much code is generally written that actually is compute heavy? All the code I've ever written in my job is putting and retrieving data in databases and doing some basic calculations or decisions based on it.
Code is "compute heavy" (could equally be memory heavy or IOPs heavy) if it's deployed into many servers or "the cloud" and many instances of it are running serving a lot of requests to a lot of users.
Then the finance people start to notice how much you are paying for those servers and suddenly serving the same number of users with less hardware becomes very significant for the company's bottom line.
The other big one is reducing notable latency for users of your software.
Damn! Is the rule of thumb really a 10x performance hit between Python/C++? I don’t doubt you’re correct, I’m just thinking of all the unnecessary cycles I put my poor CPU through.
Outside cases where Python is used as a thin wrapper around some C library (simple networking code, numpy, etc) 10x is frankly quite conservative. Depending on the problem space and how aggressively you optimize, it's easily multiple orders of magnitude.
FFI into lean C isn't some perf panacea either, beyond the overhead you're also depriving yourself of interprocedural optimization and other Good Things from the native space.
Of course it depends on what you are doing, but 10x is a pretty good case. I recently re-wrote a C++ tool in python and even though all the data parsing and computing was done by python libraries that wrap high performance C libraries, the program was still 6 or 7 times slower than C++. Had I written the python version in pure python (no numpy, no third party C libraries) it would no doubt have been 1000x slower.
It depends on what you're doing. If you load some data, process it with some Numpy routines (where speed-critical parts are implemented in C) and save a result, you can probably be almost as fast as C++... however if you write your algorithm fully in Python, you might have much worse results than being 10x slower. See for example: https://shvbsle.in/computers-are-fast-but-you-dont-know-it-p... (here they have ~4x speedup from good Python to unoptimized C++, and ~1000x from heavy Python to optimized one...)
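If you want to feel that gap on your own machine, here's a toy comparison (my own example; exact numbers will vary):

import time
import numpy as np

xs = np.random.rand(10_000_000)
xs_list = xs.tolist()

t0 = time.perf_counter()
total = 0.0
for x in xs_list:                    # sum of squares in a plain Python loop
    total += x * x
t1 = time.perf_counter()

t2 = time.perf_counter()
total_np = float(np.dot(xs, xs))     # the same computation, vectorized: the loop runs in C
t3 = time.perf_counter()

print(f"python loop: {t1 - t0:.2f}s   numpy: {t3 - t2:.4f}s")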
Last time I checked (which was a few years ago), the performance gain of porting a non-trivial calculation-heavy piece of code from Python to OCaml was actually 25x. I believe that performance of Python has improved quite a lot since then (as has OCaml's), but I doubt it's sufficient to erase this difference.
And OCaml (which offers productivity comparable to Python) is noticeably slower than Rust or C++.
It really depends on what you're doing, but I don't think it is generally accurate.
What slows Python down is generally the "everything is an object" attitude of the interpreter. I.e., when you call a function, the interpreter first has to create an object for the thing you're calling.
In C++, due to zero-cost abstractions, this usually just boils down to a CALL instruction preceded by a bunch of PUSH instructions in assembly, based on the number of parameters (and call convention). This is of course a lot faster than running through the abstractions of creating some Python object.
> What slows Python down is generally the "everything is an object" attitude of the interpreter
Nah, it’s the interpreter itself. Due to it not having JIT compilation, there is a hard performance ceiling it cannot surpass even in theory (as opposed to things like PyPy or GraalPy).
I don't think this is true: Other Python runtimes and compilers (e.g. Nuitka) won't magically speed up your code to the level of C++.
Python is primarily slowed down because of the fact that each attribute and method access results in multiple CALL instructions since it's dictionaries and magic methods all the way down.
Which can be inlined/speculated away easily. It won’t be as fast as well-optimized C++ (mostly due to memory layout), but there is no reason why it couldn’t get arbitrarily close to that.
How so? Python is dynamically typed after all and even type annotations are merely bolted on – they don't tell you anything about the "actual" type of an object, they merely restrict your view on that object (i.e. what operations you can do on the variable without causing a type error). For instance, if you add additional properties to an object of type A via monkey-patching, you can still pass it around as object of type A.
A function/part of the code is executed, say, a thousand times; the runtime collects statistics showing that object ‘a’ was always an integer, so it might be worthwhile to compile this code block to native code with a guard on whether ‘a’ really is an integer (that’s very cheap). The speedup comes from not doing interpretation, but from taking the common case and making it natively fast; in the slow branch, the complex case of “the + operator has been redefined”, for example, can be handled simply by the interpreter. Python is not more dynamic than JavaScript (hell, Python is strongly typed even), which hovers around the impressive 2x-native performance mark.
Also, if you are interested, “shapes” are the primitives of both Javascript and python jit compilers instead of regular types.
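A very hand-wavy way to picture the guard idea in Python itself - this is a toy, not how PyPy or any real JIT is actually built:

def make_specialized_add(generic_add):
    def add(a, b):
        # guard: cheap check that we're still in the common, observed case
        if type(a) is int and type(b) is int:
            return a + b                 # fast path, no dynamic dispatch
        return generic_add(a, b)         # bail out to the general machinery
    return add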
> it's a VM reading and parsing your code as a string at runtime.
Commonly it creates the .pyc files, so it doesn't really re-parse your code as a string every time. But it does check the file's dates to make sure that the .pyc file is up to date.
On debian (and I guess most distributions) the .pyc files get created when you install the package, because generally they go in /usr and that's only writeable by root.
It does include the full parser in the runtime, but I'd expect most code to not be re-parsed entirely at every start.
Importing is really slow anyway. People writing command-line tools have to defer imports to avoid huge startup times from loading libraries that are perhaps needed just by some functions that might not even be used in that particular run.
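The usual workaround looks something like this (a generic sketch, not from any particular tool):

def convert_to_parquet(path):
    import pandas as pd                  # heavy import deferred until this subcommand actually runs
    pd.read_csv(path).to_parquet(path + ".parquet")

def main():
    # argument parsing, --help, etc. stay fast because the heavy libraries
    # are not imported at module load time
    ...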
That is true, but there are relatively few real-world applications that consist of only those operations. In the example I mentioned below, there were actually some parts of my Python rewrite that ended up faster than the original C++ code, but once everything was strung together into a complete application those parts were swamped by the slow parts.
Most of the time these are tight arithmetic loops that require optimisation, and it's easy to extract those into separate compiled Cython modules without losing overall cohesion within the same Python ecosystem.
At some point, every engineer has heard this same argument but in favor of all kinds of dubious things such as emailing zip files of source code, not having tests, not having a build system, not doing IaC, not using the type system, etc.
I'm sure Rust was the wrong tool for the job in your case but I find this type of get shit done argument unpersuasive in general. It overestimates the value of short-term delivery and underestimates how quickly an investment in doing things "the right way" pays off.
If you're dealing in areas with short time limits then Python is great, because you can't sell a ticket for a ship that has sailed. And I've seen "the right way" which, again, depending on the business, may result in a well designed product that is not what's actually needed (because people are really bad at defining what they want).
What's brilliant with Python compared to other hacky solutions is that it does support tests, type hints, version control and other things. It just doesn't force you to work that way. But if you want to write stable, maintainable code, you can do it. That means you can write your code without types and add them later. Or add tests later once your prototype has been accepted. Or, whenever something goes wrong in production, fix it and then write a test against that.
Oh, and I totally agree you should certainly try to "do things the right way", if the business allows it.
It is hard to believe that Python is objectively that much more productive than other languages. I know Python moderately well (with much more real world experience in C#). I like Python very much but I don't think it is significantly more productive than C#.
This. C#, Java, or even newcomers such as Kotlin/Go are in the same ballpark due to REPL/Jupyter support alone, let alone when you consider the ecosystem.
If you are in a lab (natural science lab) or anywhere close to data, I bet you it is much more productive, even more so when you have to factor in that the code might be exposed to non-technical individuals.
The thing is that the short term is much easier to predict what you're going to need and where the value is, and in the long term you might not even work on this codebase anymore. Lot of incentives to get things done in the short term.
The business owner (whoever writes the checks) prefers get shit done over "the right way". Time to completion is a key factor of the payoff function of the devs work.
The entire point of doing things the right way is that you end up delivering more value in the long term, and "long term" can be as soon as weeks or even days in some cases.
Business owners definitely prefer less bugs, less customer complaints, less support burden, less outages, less headaches. Corner cutting doesn't make economic sense for most businesses and good engineering leadership doesn't have much trouble communicating this up the chain. The only environment where I've seen corner cutting make business sense is turd polishing agencies whose business model involves dumping their mistakes on their clients and running away so the next guy can take the blame.
Try the travel/event booking business (which I'm in) - and no, people don't dump their mistakes on the next guy here - to the contrary, the "hacky" Python solutions are supported for years and teams stay for decades (although a decade ago we had not yet discovered how great Python was).
What business owners actually don't like at all is how long it takes traditional software development to actually solve problems - which then don't really fit the business after wasting a few years of resources... and the dumping and running away is worse in Java and other compiled software. With Python you can at least read the source in production if the team ran away...
> the dumping and running away is worse in Java and other compiled software. With Python you can at least read the source in production if the team ran away...
Java (and dotnet, the two big "VM" languages) is somewhat of a strange example for that; JVM bytecode is surprisingly stable and reverse engineering it is reasonably easy unless the code was purposely obfuscated - a bad sign in any language anyway.
> underestimates how quickly an investment in doing things "the right way" pays off.
What time horizon should a startup optimize delivery for? Minutes, hours, days, weeks? Say you're a startup dev in a maximalist "get shit done now" mindset so you're skipping types, tests, any forethought or planning so you can get the feature of the week done as fast as possible. This makes you faster for one week but slower the week after, and the week after, and the week after that.
Say a seed stage startup aims for 12 months runway to achieve some key outcomes. That's still a marathon. It still doesn't make sense to sprint the first 200 meters.
> coworkers who churn out shiny new things at 10x the speed
Sounds like a classic web-dev perspective; my customers hate when we ship broken tools because it ruins their work, new feature velocity be damned. We love our borrow checker because initially you run at 0.5x velocity but post-25kSLOC you get to run at 2x velocity, which continues to mystify managers worldwide.
With Python, testing, good hygiene and a bit of luck you can write code that is maybe 99% reliable. It is very, very hard to get to (100-eps)% for eps < 0.1% or so. Rust seems better suited to that.
Anything else, especially if there isn't a huge premium on speed, meh - Python is almost always sufficient, and not in the way.
I use the same combo: lots of Python to analyse problems, test algos, process data, etc. Then, once I settle on a solution but still need more performance (outside GPU's), I go to rust.
I'm simulating an audio speaker in real time. So I do the data crunching, model fitting, etc. in Python and this gives me a good theoretical model of the speaker. But to be able to run the simulation in real time, I need lots of speed, so Rust makes sense there (moreover, the code I have to plug that into is Rust too, so one more reason :-)). (Now tbh, my real-time needs are not super hard, so I can avoid a DSP and a real-time OS :-))
I don't need rust specifically. It's just that its memory and thread management really help me to continue what I do in python: focusing on my core business instead of technical stuff.
My most successful career epiphany was realizing that everyone -- my customers, my boss, etc -- was happier if I shipped code when I thought it was 80% ready. That long tail from 80-100% generates a lot of frustration.
It's just an application of the Pareto principle. That last 20% of work to make perfect software costs a lot of time. Customers (and by extension, management) do not care how pretty your code is or how perfect your test coverage is (unless your manager is a former developer, then they might have more of an opinion); they care most that you ship it. Bugs are a minor irritation compared to sitting around waiting for functionality they need, as long as you're responsive in fixing the bugs that do come up.
Thanks. I thought that is what you meant, but another possible take was that the last 20% is actually important. Getting something 80% finished is fast, and then the long tail to get it to 100% is frustrating for everyone because the work, in theory, is finished. I think that can happen as well.
Of course there are at least three dimensions to discuss here: internal quality, external quality and product/feature fit. Lower quality internal code eventually leads to slower future development and higher turnover as no one wants to work with the crappy code base. Lower external quality (i.e. bugs) can lead to customers not liking your product. Interestingly the relationship between internal and external quality is not as direct as one might think. Getting features out the door more quickly (at the expense of other things) can help with product fit. Essentially, like most things, this is an ongoing optimization problem and different approaches are appropriate for different problem domains.
That is interesting. I went in the other direction :)
I am tired of having to refactor shiny new things churned out at 10x the speed and that keep breaking in production. These days, if given a choice, I prefer writing them in Rust code, spending more time writing and less time refactoring everything as soon as it breaks or needs to scale.
When the pointer chasing (sometimes) comes in handy, is once you have a successful business with a lot of data and/or users, and suddenly the cost of all those EC2 instances comes to the attention of the CFO.
That's when rewriting the hot path in Go or Rust or Java or C or C++, can pay off and make those skills very valuable to the company. Making contributions to databases, operating systems, queueing systems, interpreters, Kubernetes etc. also fall into that category.
But yeah if you are churning out a MVP for a new business, yeah starting with Python or Ruby or Javascript is a better bet.
(Erlang/Elixir is also an interesting point in the design space, as it's very high level and concise, but also scales better than anything else, although not especially efficient for code executing serially. And Julia offers the concision of Python with much higher performance for numerical computing.)
Or there are programmers who write both. Something that I want to write once, have run on several different platforms, handle multi-threading nicely, and never have to think about again? Rust. Writing something to read in some data to unblock an ML engineer or make plots for management? Definitely not Rust, probably python. Then you can also churn out things at 10x the speed, but by writing the tricky parts in something other than python, you don't get dragged back down by old projects rearing their ugly heads, so you outpace the python-only colleagues in the long-term.
Programming is secondary to my primary duties and only a means for me to get other things done. I'm in constant tension between using Python and Rust.
With Python I can get things up and going very quickly with little boilerplate, but I find that I'm often stumbling on edge cases that I have to debug after the fact and that these instances necessarily happen exactly when I'm focused on another task. I also find that packaging for other users is a major headache.
With Rust, the development time is much higher for me, but I appreciate being able to use the type-system to enforce business logic and therefore find that I rarely have to return to debug some issue once I have it going.
It's a tough trade-off for me, because I appreciate the velocity of Python, but Rust likely saves me more time overall.
If you're 'tired of chasing pointers', Rust's a lot closer to (and I'd argue better than) Python than say Go - it'll tell you where the issue is and usually how to fix it; Go will just blow up at run time. (Python (where applicable) will do something unexpected and wrong but potentially not error (..great!))
I completely agree - but you say that like it's a bad thing. I work as a developer alongside data scientists, who might have strong knowledge of statistics or machine learning frameworks rather than traditional programming chops.
For the most part they don't need to know about concurrency, memory efficiency etc, because they're using a library where those issues have been abstracted away.
I think that's what makes python ideal - its interoperability with other languages and library ecosystem means less technical people can produce good, efficient work without having to take on a whole bunch of the footguns that would come from working directly in a language like c++ or Rust.
But this is a false dichotomy. The space of options isn't C++/Rust or Python. There are languages which attempt to give the best of both worlds, e.g. Julia.
> they're using a library where those issues have been abstracted away.
I work in Python, and while libraries like numpy have certainly abstracted away some of those issues, there's still so much performance left on the table because Python is still Python.
Oh, I'm familiar with numba and while it certainly helps, it has plenty of its own issues. You don't always get a performance gain, and you only find this out at the end of a refactoring. Your code can get less readable if you need to transport data in and out of formats that it's compatible with (looking at you, List()).
To say nothing of adding yet another long dependency chain to the language (Python 3.11 is still not supported even though work started in August of last year).
I do wonder if the effort put into making this slow language fast could have been put to better use, such as improving a language with Python's ease of use but which was built from the beginning with performance in mind.
I've rewritten real world performance critical numpy code in C and easily gotten 2-5x speedup on several occasions, without having to do anything overly clever on the C side (ie no SIMD or multiprocessing C code for example).
Did you rewrite the whole thing or just drop into C for the relevant module(s)? Because the ability to chuck some C into the performance critical sections of your code is another big plus for Python.
But... pretty much any language can interoperate with C; its calling conventions have become the universal standard. I mean, I still remember at $previousJob when I was deprecating a C library and carefully searched for any mention of the include file... only to discover that a whole lot of Fortran code depended on the thing I was changing, and I had just broken all of it (since Fortran doesn't use include files the same way, my search for "#include <my_library" didn't return any hits, but the function calls were there none-the-less).
Julia, to use the great-great-grand-op's example, seems to also have a reasonably easy C interop (I've never written any Julia, so I'm basing this off skimming the docs, dunno, it might actually be much more of a pain than it looks like here).
I’ve done the same but moved from vanilla numpy to numba. The code mostly stayed the same and it took a couple hours vs however long a port to C or Rust would have taken.
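For context, the change is often literally just a decorator on the hot loop. A minimal sketch (my own example, not the actual code I ported):

import numpy as np
from numba import njit

@njit                                    # compile this function to native code
def pairwise_min_dist(points):
    n, d = points.shape
    best = np.inf
    for i in range(n):
        for j in range(i + 1, n):
            dist = 0.0
            for k in range(d):
                diff = points[i, k] - points[j, k]
                dist += diff * diff
            if dist < best:
                best = dist
    return best

print(pairwise_min_dist(np.random.rand(500, 3)))   # plain loops like this are exactly what numba speeds up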
For a package whose pitch is "Just apply one of the Numba decorators to your Python function, and Numba does the rest." a few hours of work is a long time.
A 2-5x speedup is not a lot; I would say it is not worth rewriting from Python to C if you don't have an order of magnitude improvement.
Because if you compare that benefit against the cost of the rewrite from Python to C, the cost of maintaining/updating the C code, and possible C footguns like manual memory management, etc. - then there is no benefit left.
I highly doubt that numpy can ever be the bottleneck. In a typical Python app there are other things, like I/O, that consume resources and become the bottleneck before you run into numpy's limits and can justify a rewrite in C.
I haven't personally run into IO bottlenecks so I have no idea how you would speed those up in Python.
But there's two schools of thoughts I've heard from people regarding how to think about these bottlenecks:
1. IO/network is such a bottleneck so it doesn't matter if the rest is not as fast as possible.
2. IO/network is a bottleneck so you have to work extra hard on everything else to make up for it as much as possible.
I tend to fall in the second camp. If you can't work on the data as it's being loaded and have to wait till it's fully loaded, then you need to make sure you process it as quickly as possible to make up for the time you spend waiting.
In my typical python apps, it's 0.1-20 seconds of IO and pre-processing, followed by 30 seconds to 10 hours of number crunching, followed by 0.1-20 seconds of post processing and IO.
2-5x speedup barely seems worth re-writing something for, unless we're talking calculations that take literally days to complete, or you're working on the kernel of some system that is used by millions of people.
> For the most part they don't need to know about concurrency [...]
In my opinion, this is the part that Go got mostly right. Concurrency is handled by the runtime, and held behind a very thin veil. As a programmer you don't really need to know about it, but it's there when you need to poke at it directly. Exposing channels as a uniform communication mechanism has still enough footguns to be unpleasant, though.
In an ideal world, I should be able to decorate a [python] variable and behind the scenes the runtime would automatically shovel all writes to it through an implicitly created channel. Instead of me as a coder having to think about it. Reads could still go through directly because they are safe.
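A rough sketch of the idea in today's Python - the automatic, decorator-driven version I'm wishing for doesn't exist, so this is the hand-rolled equivalent:

import threading, queue

class SharedValue:
    def __init__(self, initial):
        self._value = initial
        self._writes = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        while True:
            update = self._writes.get()      # writes are applied one at a time
            self._value = update(self._value)

    def write(self, update):
        self._writes.put(update)             # any thread can enqueue a write

    def read(self):
        return self._value                   # reads go through directly

counter = SharedValue(0)
counter.write(lambda v: v + 1)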
If I could have Python syntax and stdlib, with Go's net/http and crypto libraries included, and have concurrency handled transparently in Go-style without having to think about it, that would be pretty close to an all-wishes-come-true systems language. Oh, and "go fmt", "go perf" and "go fuzz" as first-class citizens too.
Someone else in this thread brought up the idea of immutable data structures as a default. I wouldn't mind that. Python used to have frozenset (technically it still does but I haven't seen a performance difference for a while), so extending the idea of freeze()/unfreeze() to all data types certainly has appeal.
In fact, the development of computing has always been built on ever-higher levels of abstraction - just think of assembly language and punch-tape programming; those days are not that long past.
> without having to take on a whole bunch of the footguns that would come from working directly in a language like c++ or Rust.
Don't forget the footguns of working with developers who do those things. Ask them to do something simple and you get something complex and expensive after months of back and forth about what is wanted. You're likely to get a whole framework for a one-off SQL query.
I hear it being said already, "You're using software developers wrong!" Well, maybe software developers shouldn't be so hard to use?
> maybe software developers shouldn't be so hard to use?
This whole take assumes bad intention on both sides. Nobody's job is easy in this situation. Leadership's job is to set everyone up for success. If things go off the rails and end up with months of back and forth leading to nobody being happy despite good intentions and honest effort, then the problem lies with leadership.
Sure thing! Footguns might be the wrong word, and I know as a low-level language Rust is insanely safe, but for a high-level developer its type system is gonna mean spending a lot of time in the compiler figuring out type errors, at least initially. That might not be a traditional footgun, but if you're just trying to, I dunno, build a CRUD API or something, it's gonna nuke your development time.
Please don't read this as "Rust is difficult and bad", I definitely don't think it is! But it's a low-level language, and working with it means dealing with complexity that for some tasks just might not be relevant.
I agree, but for something like the CRUD app example I made bringing in pydantic or something would solve that. Rust's type system is a lot stricter because it's solving problems in a space that doesn't touch a lot of Python developers.
>and they don’t care - or know about - concurrency, memory efficiency, L2 cache misses due to pointer chasing.
Also if I (a programmer) want to write really really fast code I'm probably reaching for tools like tensorflow, numpy, or jax. So there's not much incentive for me to switch to a more efficient language when as near as I can tell the best tooling for dealing with SIMD or weird gpu bullshit seems to be being created for python developers. If you want to write fast code do it in c/rust/whatever, if you want to write really fast code do it in python with <some-ML-library>.
For a very specific definition of the word "fast" at least.
> Also if I (a programmer) want to write really really fast code I'm probably reaching for tools like tensorflow, numpy, or jax. So there's not much incentive for me to switch to a more efficient language when as near as I can tell the best tooling for dealing with SIMD or weird gpu bullshit seems to be being created for python developers. If you want to write fast code do it in c/rust/whatever, if you want to write really fast code do it in python with <some-ML-library>.
Rather unfortunately, my current bugbear is that Pytorch is... slow. On the CPU. One of the most common suggestions for people who want stable diffusion to be faster is, wait for it, "Try getting a recent Intel CPU, you'll see a real uplift in performance".
This despite the system only keeping a single CPU core busy. Of course, that's all you can do in Python most of the time.
(You can also use larger batch sizes. But that only partially papers over the issue, and also it uses more GPU memory.)
Your OS, the linear algebra libraries themselves, much of the user-facing software that you use (latency sensitive rather than throughput sensitive), image/video encoding/decoding, most of the language runtimes that you use, high volume webservers, high volume data processing (where your data is not already some nice flat list of numbers you're operating on with tensor operations), for some examples.
Really, for almost any X, somebody somewhere has to do X with strict performance requirements (or at very large scale, so better perf == savings)
Most of these python libraries are only fast for relatively large and relatively standard operations in the first place. If you have a lot of small/weird computations, they come with a ton of overhead. I've personally had to write my own fast linear algebra libraries since our hot loop was a sort of modified tropical algebra once.
They asked for examples of non-numpy/tf/jax use cases and I gave some, including my own experience. No disagreement; HPC Python in practice is heavily biased towards numpy and friends.
Your comment is super interesting because it suggests Python has evolved in a direction opposite to the Python Paradox - http://www.paulgraham.com/pypar.html
Whereas before you could get smarter programmers using Python, now because of the exponential growth of Python, the median Python programmer is likely someone with little or no software engineering or computer architecture background who is basically just gluing together a lot of libraries.
Neat observation. I wasn't doing much programming in 2004, but, I'm guessing 2004 Python would be like today's Rust. People learn it because they love it.
I think more so Rust than even Python in 2004, since Rust has a pretty steep learning curve and does require a non-trivial amount of dedication to learn it.
> It’s frequently the second best language but it’s the second best language for anything.
This myth wasn't even true many years ago, it certainly isn't true today. You can build a mobile app, game, distributed systems, OS, GUI, Web frontend, "realtime" systems, etc in Python, but it is a weak choice for most of those things (and many others) let alone the second best option.
The saying does not mean that in a rigorous evaluation Python would be second best out of all programming ecosystems for all problems.
The saying means that for any given problem, there is a better choice, but second best is the language you know which has all of the tools to get the job done, so the answer is probably just a bunch of pip installs, imports, and glue code.
It’s kind of like “the best camera is the one you have with you” — it’s a play on the differing definitions of “best” to highlight the value of feasibility over technical perfection.
When I switched from PHP to Python years ago I had the same feeling as the OP, then it became the third best, then the fourth, then situational when object-orientation makes sense, then for just scripting, and now... unsure beyond a personal developer comfort/productivity preference. TUIs and GUIs built on Python on my machine seem to be the first things to have issues during system upgrades because of the package management situation.
Anything that doesn't require high performance that is. Is there any 3D game engine for python yet? I guess Godot has gdscript which is 90% python by syntax, but that doesn't quite count I think.
You won't get high performance out of Python directly, but there are a lot of Python libraries that use C or a powerful low level language underneath. The heavy lifting in so much of machine learning is CUDA, but most people involved in ML are writing Python.
Sure, but that's not really python per se. One could also call C++ libraries from java via JNI and pretend java is super fast.
If people write program logic in python it will run at python speeds. Otherwise you're not really writing python, like nobody says some linux native program is bash because it happens to be launched from a bash script.
> Sure, but that's not really python per se. One could also call C++ libraries from java via JNI and pretend java is super fast.
But that's how every scripting language obtains good-not-just-decent performance. A strong culture of dropping down to C for any halfway-important library is why PHP's so hard to beat in real-world use, speed-wise (whatever its other shortcomings).
Java is super fast though; it almost never uses JNI because it doesn't need it, unlike Python. It uses JNI for integrating with the C world (e.g. OpenGL bindings).
Python isn't a joke either. I'm a full-on programmer who started with C and branched out to several other languages, and I'd still pick Python for a lot of new tasks, even things that aren't little scripts. Or NodeJS, which has similar properties but has particular advantages for web backends.
I’ve been a Python developer for 15 years, and Python might have been the second best language for anything when I started my career, but there are so many better options for just about any domain except maybe data science. Basically for any domain that involves running code in a production environment (as opposed to iterating in a Jupyter notebook) in which you care about reliability or performance or developer velocity, Python is going to be a pretty big liability (maybe it will be manageable if you’re just building a CRUD app atop a database). Common pain points include performance (no you can’t just multiprocess or numpy your way out of performance problems), packaging/deployment, and even setting up development environments that are reasonably representative of a production environment (this depends a lot on how you deploy to production—I’m sure lots of people have solved this for their production environment).
I'm a bit surprised to see this article on GitHub blog, it feels more like something from dev.to - looking at the surface, with little actual insights.
Most of the provided reasons behind Python's popularity are true also for other languages - portable, open source, productive, big community. This can be also said about PHP, Ruby, or Perl back in 2000s. Why isn't Perl as popular as Python?
I don't think it's all about readability or productivity, but about the tools that were built over the last 30 years and used in academia; now, with the boom in ML/AI/Data Science, they made Python an obvious choice for the new generation of tools and applications.
Imagine that the boom in ML/AI didn't happen - would Python be #1 language right now?
I don't think there is a single reason, but it sure didn't help that the community self-destructed by trying to make an entirely new language after version 5 and still call it Perl. It took a lot of years to resolve that nonsense, and in the meantime many people moved on.
It also does not help that Perl is a creative language, useful but very much open to many different interpretations. Hiring a perl guy and expecting them to read someone else's code is a crapshoot. The upside of Python's strong cultural opinions on coding style is that it's easier for one developer to pick up someone else's code.
> Imagine that the boom in ML/AI didn't happen - would Python be #1 language right now?
Probably not. But it wouldn't be perl, either. Javascript most likely. But the core usage of python for scripting was never predicated on ML popularity, so it would still be a pretty commonly used language. And Javascript has many annoying warts too, so I think plenty of people would still choose to write django apps instead of node, whether ML existed or not.
As commented somewhere else in this thread, Python was clearly more ergonomic than Perl, and had a lot of mindshare exactly for this reason. I remember when Python was new and not the professional choice; Perl was the choice for that niche at the time. Now I still don't see a contender for a language where speed doesn't matter. Ruby has some Perlisms that really make it weird, PHP is tied to the web and equally weird, and those $s and @s are really bad for normal people. Python wins clearly when teaching somebody programming.
I’d say that Ruby and even Perl are a lot nicer for scripting than Python (due to the extremely low-effort unix interop). Python can do it but it’s a whole lot more verbose and difficult for a beginner to learn than “anything inside a pair of backticks is run as a system command and you can interpolate variables”.
Python was friendlier for beginners than Ruby the first time I took a real stab at learning to code during a CNY holiday in 2008, but it wasn’t about the language itself. Ruby was harder then because many of the popular libraries and many of the tutorials were written by people who considered Windows support as an afterthought. It’s hard to express how frustrating it was to have my vacation days ticking down, hitting issues in one tutorial after another and having people suggest I install linux on a VM (a process where I hit still more snags).
People learning Python and PHP didn’t hit that hurdle. I ended up learning Flash on my Asus laptop a couple of years later and getting my start that way and not coming back to Ruby until six years later when I was a much more experienced dev.
Perl was significantly more popular at one point, but it slowly lost traction while Python gradually gained traction over the years.
Better ecosystem for numeric computing is definitely a big reason for the success of Python, but the question is why Python gained a foothold in that niche in the first place. I think it is because Python is just a lot more accessible to people with different backgrounds. Perl really grew out of shell scripting as a supercharged alternative to Bash and Awk, but retained many of their quirks for familiarity. Python on the other hand grew out of research in teaching programming to beginners.
This is a strange article. It's got the talking point about Python that we were hearing about 10 years ago - "tired of those pesky curly brackets in Java, try this new language you might not have heard of: Python!". Who reading the GitHub blog has not heard of Python?
Also, that snippet used in the "What is Python commonly used for" section is strange:
import antigravity
def main():
    antigravity.fly()
if __name__ == '__main__':
    main()
It's overly verbose (especially given the example just above about how you don't need a main function in Python) and refers to an insider joke/Easter egg. I can't see that it's going to convince anyone to try Python, only make them feel that they're on the outside of a joke.
It then ends with what seems like it might have been the point of the article, an advert for Copilot. It seems the way to get started writing Python is to write a short comment and then spam <TAB> and let the AI auto-complete your project.
(Also, and perhaps less importantly, looking at the author's GitHub profile I can't see a single instance of Python there. Though I'm not doing a deep-dive as that feel overly picky and there's plenty of private contributions that could well be Python.)
I read the article and had the same feeling that it's a fluff piece without substance. If you have to compare anything, compare it with the vibrant JVM ecosystem. Using the same tired argument of `System.out.println()` shows the author has no original idea. Python is great but the JVM is no lackey, it is a marvelous piece of battle tested engineering.
In the end it is just an ad for Github products and not worthy of being on HN frontpage.
>It's got the talking point about Python that we were hearing about 10 years ago - "tired of those pesky curly brackets in Java, try this new language you might not have heard of: Python!"
That was a talking point closer to 20 years ago, at this point.
> Who reading the GitHub blog has not heard of Python?
The GitHub blog became a strange place recently(-ish). It went from a factual blog describing fancy new GitHub features and interesting technical stuff (as it was a decade ago) to mostly a place full of incoherent marketing fluff like this post (with some real technical content interspersed).
Curly brackets or brackets at all aren't mentioned by name in the article - is that your interpretation of the first code example?
IMO, that example is there to show that there's less boilerplate required in python compared to java when doing the same thing. And, in particular, none of that boilerplate really matters to what you want to do - print hello world - emphasizing the point of python being simple.
The section that explains why python is good for AI talks about pybrain, a library that seems to date from 10 years ago. I’m pretty well versed in most ml frameworks and never heard of it. Last update to the website looks to be 2010. Weird to feature that and PyTorch as examples of ML libraries. No mention of sklearn which is vastly more popular
Yes, I know. But some people still have the reflex from Python 2 and feel bitter when the error message says "I know what you want, and I'm not giving it to you."
Eh, they're just a lot of ways to say "path dependence". Scripting languages are basically the same exact technology with respect to each other. In the alternate universe where numpy and scipy are, let's say, numruby and sciruby, wouldn't we be here asking why Ruby keeps growing?
That's not a sales pitch for python, it's a sales pitch for the concept of a scripting language; it's like saying "you should really buy a Ford, it comes with four wheels".
I admit I have a blind spot for Python, because I use PHP in my day job, so when I need to do some scripting, I mostly use PHP. But admittedly Python is a lot friendlier than some alternatives (Perl, Shell scripts etc.), and more universal than others (PHP being mostly used for web dev), so that's why people choosing a scripting language for their tool/library tend to choose Python.
I use Python all the time both for my personal stuff and for some side-projects at work, so this isn't a dunk on Python, but honestly it feels like a circular thing: it's popular because it's popular.
I wouldn't say it's friendlier than the alternatives, Perl and Shell scripts sure, but not when compared to Javascript, Ruby or Lua.
Now, if you're talking about libraries, support, etc. then sure, Python wins hands down, but that doesn't make it a better language in itself. I'd say Ruby and Lua are a little bit better as languages.
But then again, I don't care much for the language in itself, so Python is enough for most of my use cases.
I read "it's popular because it's friendlier". PHP, Javascript or R are popular, but are not friendlier. I find their error messages way worse for the beginner, when you need it more. Third party code is too "clever" for the beginner to read and learn, because it seems to be two languages: the one you are learning in the tutorials, and the other idiom that is used in the serious libraries. As a beginner you are hit with this feeling that you are far, far away from writting an useful thing.
In my job I've seen some beginners starting with R, and quickly hating it because they don't feel they can do much on their own beyond copy-pasting and then modifying the examples and the tutorials. And if the changes go too far, everything collapses with cryptic errors. When you show them Python as an alternative, pointing out that they shouldn't use it over R for statistics and graphics, they like that they can build ideas from scratch. That beginner is hooked for life.
I think Lua was always seen as a bit obscure, and not enough people invested in the language to write useful utilities. It has a solid C foreign function interface, and the compiler is quite fast, which leaves me puzzled about why it never gained traction. I think it's an embedded scripting language in the majority of use cases (e.g. NeoVim, LuaLaTeX, scripting in some game engines).
The story of Ruby is altogether different: they made the fatal mistake of not defining a C foreign function interface in the standard, otherwise I imagine we'd be seeing numerical computation and ML libraries with a Ruby interface today. Still, Ruby lives on in Metasploit, and in Sorbet and Crystal.
> I think Lua was always seen as a bit obscure, and not enough people invested in the language to write useful utilities. It has a solid C foreign function interface, and the compiler is quite fast, which leaves me puzzled about why it never gained traction. I think it's an embedded scripting language in the majority of use cases (e.g. NeoVim, LuaLaTeX, scripting in some game engines).
- Lua's standard library is so weak that it makes most other batteries-not-included languages look like they have large, robust, and helpful standard libraries.
- It's got a bit of the quirkiness and gotcha-ability of JavaScript, but without being a language that's impossible to avoid thanks to its capture of a mega-popular platform, which is what propelled JavaScript to ubiquity despite being kinda shit and unpleasant to work with.
- Tooling's not as good as many other languages.
(FWIW sometimes I write Lua regardless, because it's the right tool for the job)
> The story of Ruby is altogether different: they made the fatal mistake of not defining a C foreign function interface in the standard, otherwise I imagine we'd be seeing numerical computation and ML libraries with a Ruby interface today.
Luck of libraries and initial userbase are certainly involved in success, but not all scripting languages are equal. I mean we could add bash to that list then.
In fact I'd argue that python enjoying the success it has, despite probably the worst handling of a version bump in any language (2->3), is a testament to its popularity.
I think one reason for Python growing popularity is because it's become the default tool in some domains whether it is the best tool or not.
This week our Director ordered a total rewrite of two years of work in Python. His rationale: it's what everyone else uses in this space. No reason specific to our use case, just simply to follow the herd. I realise that a large community translates into easy hiring and rich ecosystems, but I despise the mentality as it promotes a monoculture.
... and Python _became_ the default tool because the de facto developer consensus (after years of competing languages) is that an interpreted language should
= be usable, and
= provide a set of data structures that an educated programmer *expects to find when scripting*.
Python literally sucked less than the alternatives.
Python gets introduced to students, so for many people it's the first language they learn. Half the programming community have less than 5 years of experience. I question their ability to evaluate suckage, lol.
What are you rewriting from? At director level focus is usually more on things like how easy is it to staff / get support for something. Python is strong here - you can find programmers globally who can do pretty well with it.
Yes. The entire reason I like using Java at my day job is that the rest of the company uses it the most and supports it well. I would never use Java on my own, but that's a different situation.
It reads as a little bit of a tautology. "People are using it because people use it". I get that from a hireability standpoint it's a real thing to consider, but the statement doesn't say anything about whether or not Python is actually a good language to use
> This week our Director ordered a total rewrite of two years of work in Python
WTF? Unless your system is originally written in a proprietary language that literally no one outside your company knows, I'll say it's a good sign that you need to change team (or change job). Don't work under a director like that.
Sorry if it sounds too cynical, but that's probably the intended effect of a rewrite from what the team knows into Python: staffing changes. A bunch of the (expensive) old guard will leave and they can be replaced with cheap grads, who all know Python.
I have been a heavy Python user now about 15 years, but for me now I'm increasingly reaching for modern JavaScript and particularly TypeScript to do the things I would have traditionally done with Python.
ES modules, fat arrow expressions, and all the other nice new syntax and library features have made the language so much more pleasant to use. In many ways the ergonomics of TypeScript in particular are far superior to Python now, and I find that really surprising.
I haven't yet found a replacement for Django (particularly the ORM/Admin/Forms combination) it is just so incredibly productive to use. So, I don't think I will be moving off it, but it's certainly not the growth language for me anymore.
I am going this way too. I recently tried to onboard a few contractors with limited python experience. It took several deep sessions to work out why they couldn't get an environment set up. After 10+ years I've never come across the install certificates command. There's still no way to get a dev environment with a specific version of Python working with one command that works reliably everywhere. pyenv isn't even packaged for Linux, you have to install from source.
This, plus Typescript's superior type system makes it very tempting for application development.
However I'll probably wait until there's a really good Jupyter Notebook equivalent...
Oh, I wouldn't say TypeScript is more pleasant than Python, but it is more pleasant than old JavaScript. And I increasingly prize the productivity I have with TypeScript, and some of its ergonomics, over Python.
I still love Python. Just feel like I'm cheating on it with this new younger model...
Same. I analyzed a little why I find JS (I don't like TS) easier to deal with for some tasks:
- Particular support for web frontends or backends, both very broad categories.
- JS concurrency is easier to deal with. Focused entirely on promises with the nice async/await syntax on top, unlike Python which slapped on too many different ways to do this.
- By far easier package management and imports. Python's is so annoying that any project you download is gonna have you spin up a Docker container for it.
- ES6 added a lot of array/etc managing that JS was lacking before.
- Freeform objects with {key: value} syntax are convenient, despite maybe seeming weird at first. Python OOP somehow got really complicated over the years.
- Inline functions (I use fat-arrow but the regular way is also fine). I never got why Python didn't let you do a real inline def, even though inline functions are so common in function-oriented programming (small example below).
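To illustrate that last point with a toy example of my own: a Python lambda can only hold a single expression, so anything with statements needs a separate named def, where JS would let you write the body inline.
numbers = [3, 1, 2]

# Fine: a lambda with a single expression.
numbers.sort(key=lambda n: -n)

# Not allowed: statements inside a lambda, e.g.
#   numbers.sort(key=lambda n: print(n); -n)   # SyntaxError

# So even a throwaway callback with two statements needs a name.
def noisy_key(n):
    print("comparing", n)
    return -n

numbers.sort(key=noisy_key)
print(numbers)  # [3, 2, 1]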
"Python 2 is unsupported by the Python Foundation since 2020-01-01" according to Debian docs. Ubuntu defaulted to Py2 until around then. So I'd say that's when the drama ended, then again it's probably not the last I've seen of Py2.
If you're looking for a Django replacement, check out Adonis. Laravel in PHP was heavily inspired by Django but is much better and Adonis is an exact clone of Laravel. I think you'd like it.
The article assigns significance to the language when the real work is done by the libraries. And many good libraries inspire even more good libraries. At that moment the language no longer matters. But it matters in the beginning when people have to create the initial environment and this is what should've been underscored. Python lets you do so many things in so many ways that minor mistakes do not matter and everyone has the freedom to experiment, achieve results, discover that they've done something stupid, but at that moment you already feel confident to do it again, but better.
This is right in the spirit of: The Python community is a bunch of people learning programming together.
Someone else made this remark once, when the introduction of type annotations was discussed as the python community discovering the benefits of static typing.
I'd been in the Python community for around 10 years, spoke at several conferences, wrote libraries; contributed to CPython, Openstack, and others. I've been a technical reviewer for a book on distributed computing in Python. A big part of my career was built on this language, its community, and ecosystem. I'd say a big part of Python's slow-and-steady success is its community. Going to Pycon US, Pycon CA, and Pycons around the world -- the user groups that the foundation funds with Pizza money and mentorship programs: it's a fairly unique experience.
Another major factor is how it serves as a glue language in the scientific community. Python provides a relatively simple programming language that powers complex, powerful libraries like NumPy, SciPy, PyTorch, SymPy, etc. It's like a scripting language for engineering and scientific computing tools, sort of like what Javascript does for browsers, Node, and Deno, etc.
I don't do much Python programming these days but I owe it a huge debt! May it continue to flourish and grow.
The syntax, I think, is also the least likely to scare off newbie programmers. Personally, I came to really dislike the Python syntax, but it reads rather well, is very unintimidating, and easily supports the kinds of things that newbies will be doing.
Not to rag on Python, but I found the indent-sensitivity of Python can make it challenging to do the sort of basic things that C-family languages make fairly simple. Lambdas in Python suck in contrast to how lambdas and anonymous functions work in other languages.
But yeah, Python has proven itself to be a great language for a wide variety of applications. It makes a lot of sense for the scientific community probably because things like heavy object orientation matter a lot less in those cases. The community doesn't discourage people from just writing functions, unlike other communities that are hellbent on making everything a method of some class structure and being explicit about that.
I started in Python, as a Biologist (hooray for Jupyter-lab, coding so visually, in small steps, with output just there is so great when starting to learn Python). I ventured into other languages every now and then. For example I tried to make an Android app in Kotlin that gets info from some API. I expect something like this but with more brackets everywhere:
import requests
data = requests.get('https://some.api/get_some_json').json()
some_value_I_want = data['dig']['into']['nested_structure']
I didn't get it to work at all, I was lines and lines of code into the program when I didn't even get to see any returned values.
Also, say I want to make a nice plot of some tabular data (.tsv), I do:
import pandas as pd
import seaborn as sns
data = pd.read_csv('some_file.tsv', sep='\t')
data = pd.melt(data, value_vars=['s', 'max_vms'], id_vars=['sample', 'process', 'N'])
g = sns.FacetGrid(data=data, col='process', row='variable', hue='sample', sharey=False, margin_titles=True)
g.map(sns.barplot, 'sample', 'value', order=data['sample'].unique())
Boom, what a plot (or large grid of plots actually), so much information. Can anyone show me how to do this in some other language (but R)? Idk, maybe I'm just not so smart, I just learned to program at 35 after always being a biologist, but any venture into any language has me thinking: Why does this have to be so complicated?
(Btw, I'm putting 4 spaces in front of the code, why is it not rendered as code?)
When I just started I used Python to read, sort and transform images and output them to a PowerPoint file. One can use CSS-like syntax to format the slides. Boom, a PPT with 120 slides with images and data from some Excel file that convinced a lot of people, with a lot of data, that fluorescent images next to H&E stains + metadata can be nice. Is it the best way to present such data? Meh. But man was it cool and easy and time saving.
This is just a property of having libraries. If you've got the same libraries in Java then the code looks identical except that you stick the word "var" in front of lines 4 and 6 and you don't have named arguments.
Python does have a great data processing ecosystem. But that isn't really a property of the language.
Creating a new venv, installing a few needed libraries that you know the names of with a simple command and writing a quick low-ceremony script that uses those libraries is frictionless in Python. Doing that repeatedly in nearly every other major language is not as easy.
Some languages have a good integrated tool chain (Rust, Go) but are not as approachable and forgiving.
Some are approachable but lack the friction-free story for locating and installing 3rd party dependencies.
The most similar language to Python is Ruby, not Rust or Go, and Ruby is better at the things you've listed. Which would bring us back to the previous point that it's not actually about ease of use.
Someone else identified Ruby's fatal flaw as not having good C tooling, and I think that's probably accurate.
Ruby has odd, once-unique syntax that looks strange to anyone raised on C-like syntax, including js and java. Python is similar, simply removing redundant braces. So no, Ruby was not better at the things most folks care about, i.e. being easy to learn.
Things that are familiar are easier to learn; that's just a fact for someone raised on math and English notation and then shaped by industry exposure. Ruby being an odd duck did it no favors.
Python definitely found footing based on its pseudocode readability. It's gotten a bit worse recently with too many colons and features, but it had already made it by then.
Your examples don't depend on the language but on the libraries. You can be as simple and concise in most languages, provided the libraries let you, instead of needing to add boiler plate.
But what if your web request fails? Do you deal with it in time or use some_value_I_want which might contain junk?
Ok, then I'm not programming. Call it anything you want, I call it "Being productive." Or, "Saving on time spent clicking around in Excel." or "Automating the boring things." And I find it to be quite pleasant.
Maybe it also doesn't help that out there, in the C++, Rust, Javascript, Go world I'm going to run into people with usernames like "ihatepython".
Sometimes I wish for "block" function in HN for such trolls (I really hope it's no-life troll working for 5 rubles per comment and not a real person with genuine hateful opinions like that).
It is however what a lot of people who 'program' at work do every day. Python actually lets working professionals solve real problems they actually have at work quicker and easier than any other programming language.
Depends what you are trying to teach. If you're trying to teach computer science and programming fundamentals, then sure. If you're trying to teach people how to get 'real work' done quickly and efficiently then Scheme will only get in the way and slow people down.
For example when I've taught programming, one of the tasks I taught fairly beginner programmers was to grab some satellite images between certain dates, try to detect if there is a forest fire, measure the spread of the fire and plot the spread on a map.
With python (and its excellent libraries) this is quite quick and easy, and most people are up and running and hacking around with their programs in pretty short order. They find it really cool and inspiring and makes them quickly realise that programming could be something useful in their day to day job. Trying to start with chapter 1 of SICP and Scheme and working up from there to solving the above problem would probably lead to most of these people giving up on programming very quickly. That being said the few people that made it all the way through that would no doubt be much much better programmers because of it.
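For a sense of what that exercise looks like, here's a deliberately naive sketch of my own (made-up file names, not the actual course material), assuming the satellite scenes have already been downloaded as ordinary image files; a real version would pull imagery from an API and use proper band math rather than a brightness threshold.
import glob

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

burned_area = []
dates = []
for path in sorted(glob.glob("scenes/2023-*.png")):   # hypothetical downloaded scenes
    img = np.asarray(Image.open(path).convert("L"), dtype=float)
    mask = img > 200                  # crude "looks burnt/hot" threshold
    burned_area.append(int(mask.sum()))  # spread measured in flagged pixels
    dates.append(path)

plt.plot(range(len(dates)), burned_area, marker="o")
plt.xlabel("scene index (date order)")
plt.ylabel("flagged pixels")
plt.title("Crude fire-spread estimate over time")
plt.show()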
This is an under-discussed point, at least from my biased opinion. It seems like the Python documentation ecosystem is the best, and has been building fantastic tooling for over 10 years now.
Eh, you massively overestimate the importance of performance.
For the vast majority of use cases, performance just isn't a priority. Doubly so for Python, that shines for simple automation, command line applications, and perhaps some serveless computing.
Being easy to write, having a good ecosystem of libraries, and being widely known is typically good enough. I wouldn't use Python to write a robust backend server side application, mostly because the language doesn't lend itself well for it.
Eh, you make incorrect assumptions about me. I'm stating a fact why Python is used - the data science ecosystem in Python thrives because of well-written libraries _written in C_ under the hood AND an easy-to-use language that writes like pseudocode.
If it was too slow, we'd be doing all of this in Java, C#, or maybe in C/Fortran. But because of some early design decisions (Guido being on the matrix-sig helped), the history behind Numeric/Numarray and finally NumPy and SciPy being based on those efforts allowed it to thrive.
> it's the only way a tragically slow language like Python can keep up.
Those were your words, not mine. I need not make any assumptions.
I just replied listing use cases where Python shine due to its strengths, performance being mostly irrelevant. I didn't even mention data science.
And although it's beyond the point, if I was to use Python, why should I care in which language a library was written? If the language allows libraries written in other languages, this is actually a nice feature.
> - it’s believed to be beginner friendly compared to other languages. I’m not really sure why - maybe the whitespace?
Good bait. I'll take it.
- dynamic, weak (really "duck") typing, meaning users don't have to worry about conversion between things. Want to print() a dictionary of whatever? Sure!
- no semicolons to terminate a statement. End of line, that's it
- rich standard library, so you can actually get going on things without having to go on a quest to find the right library for your thing. JSON? It's right there. Argument parsing? argparse. HTTP? http.client works... (quick stdlib sketch after this list)
- also yeah the "pseudocode" thing. Python is light on extraneous syntax.
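Here's roughly what "it's right there" means in practice, as a tiny stdlib-only sketch of my own (nothing here is from the article): argument parsing and JSON handling with zero installs.
import argparse
import json

# Pretty-print a JSON file: argparse and json ship with the interpreter.
parser = argparse.ArgumentParser(description="Pretty-print a JSON file")
parser.add_argument("path", help="path to a .json file")
parser.add_argument("--indent", type=int, default=2)
args = parser.parse_args()

with open(args.path) as f:
    data = json.load(f)

print(json.dumps(data, indent=args.indent, sort_keys=True))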
Now eliti^H^H^H advanced programmers will frown on many of these same things that make it beginner-friendly...
Dynamic typing means that complex designs have to be careful with their APIs, lest you get tangled up in deep type errors if you're not careful (I particularly hate the "mix-in" pattern). No semicolons mean, uh, long statements need to be escaped? The standard library is "where libs go to die" because they can't evolve as much. "Significant whitespace!? What is this, COBOL!?!11"
> it’s believed to be beginner friendly compared to other languages. I’m not really sure why - maybe the whitespace?
Experienced programmers tend to forget what the experience was like as a beginner. But Python grew out of actual usability research into teaching programming to beginners. Logo is another language with the same background, but it never escaped that niche. The success of Python is that it is attractive to beginners but remains a powerful tool as the programmer becomes more experienced.
Whitespace is definitely a factor - or rather the lack of redundant braces. Beginner programmers seem to really struggle with the lack of correspondence between the visual structure and the logical structure in most languages. Even experienced programmers get tripped up by erroneous indents. Python just solves this once and for all.
The lack of type annotations is also a factor. For beginners this is just an additional layer of complexity.
Scheme is great for teaching computer science students, but if you are just a regular Joe researcher wanting to get the job done, you want to write "2 + 2" like everybody else in the world, not "(+ 2 2)"
Most languages are designed to attract experienced programmers, e.g. by having C-like syntax which they may already be familiar with. But it has a cost for beginners.
The classic "public static void main" example is also getting old. It's been so long since I last wrote it, that I actually had to copy/paste it from the article. I just generate a new project from the "Spring initializer" plugin and jump straight into the real work.
If we want to measure developer productivity we should try to compare actual real world usage that goes beyond "hello world" using Notepad. The author did provide more examples in the article, but I think we should just retire the basic minimal hello world example for these types of discussions.
If their argument is that Python is simpler for quick scripts and programs, I agree. The same applies to ML / AI where Python has lots of great tools. Once you start looking at other areas (Django and FastAPI), there are many alternatives based on other languages where you can be just as productive.
Most of the time consuming work is spent doing business logic where the differences between languages doesn't matter as much. They all have advantages and disadvantages. Personally I prefer static typing for larger projects and teams.
If their argument is that Python is better in general, they need to provide better arguments.
A third point is that python is the default scripting and plugin language in a lot of popular commercial and professional applications. A lot of people I know who learned python did so to automate or extend applications they used for their 'real' work, like ArcGIS, FME, Rhino/Grasshopper, Revit/Dynamo etc.
1. Near zero boilerplate. Python's boilerplate is usually no more than setting up a class and then calling a method. Often you can skip the class and just call it directly. This is probably the biggest strength; to give you an example, my standard test for a language's approachability is "how much work do I need to put in to get a very simple JSON file from a web URL" (nothing fancy like POST, just an HTTP GET). With python, a call to urllib.request.urlretrieve and then a call to json.loads are all you need (quick sketch after this list). In Java, you need to manually implement a bunch of boilerplate code that looks horrible, need to think about the size of your response in memory, and often need to pass in configuration options that should be a case of "works by default" but aren't, because the standard is ages old so it needs to be manually toggled on. Part of that is that the Java stdlib consists largely of reference implementations rather than actual implementations, which means that anyone who wants to implement something in Java will usually end up falling back to the stdlib interfaces, which in turn suffer from being stdlib interfaces; a lot of "should really be on by default" expectations aren't on by default since the stdlib is where code goes to die, so you instead start piling on features.
2. Interop with C. I hold the opinion that C is a great language that doesn't work very well once you get to any form of scaling. Python allowing developers to take the slowest parts of their code and writing it in C to speed it up is one of the easiest speed gains to make and it avoids the biggest bumps that come with using C as your primary language in terms of project structuring.
3. Library support is, as you say, good. If you need it, there's probably a package for it. PyPI is dependency-wise kind of a disaster but if you know how to set up requirements.txt, it works really well. Most libraries ship with "sane" defaults too.
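For what it's worth, here's roughly what that JSON-from-a-URL test looks like with just the stdlib (my sketch, using urlopen instead of urlretrieve so there's no temp file; the URL is a placeholder):
import json
from urllib.request import urlopen

# One GET, decode the body, parse it; nothing to install.
with urlopen("https://api.example.com/things") as resp:
    things = json.loads(resp.read().decode("utf-8"))

print(things)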
All of these combine into a language that's easy to prototype in and easy to expand.
It's important to keep in mind that Python's biggest successes aren't in the speed-focused, low-memory environments where every speed gain is necessary. Its success lies in conventional desktops and servers, which have much more processing power and often have more leniency in being a bit slower.
With python you can write something in 3 hours that would take a day in another language, at the cost that instead of being lightning fast and done in 10 seconds, you need to wait 30 seconds. That's an issue for some environments, but in 99% of cases that's not a problem.
That isn't to say the language is perfect (no language is), but speed of development at the cost of slightly slower execution time is the main reason why python got popular.
>"how much work do I need to put in to get a very simple JSON file from a web URL" (nothing fancy like POST, just an HTTP GET). With python, a call to urllib.request.urlretrieve and then a call json.loads are all you need.
In C# you just have to do this:
var things = await httpClient.GetFromJsonAsync<List<Thing>>("url");
It doesn't. It's a standalone (static) helper method that uses HttpClient to perform the request and feeds the response body into a json parser. It's an extension method [0] which means that there's syntactic sugar so that you can write client.GetFromJsonAsync() and the compiler transforms it into the actual static method call, HttpClientJsonExtensions.GetFromJsonAsync(client).
Mostly because JSON is one of the most common formats used when sending over data. I think this practice started with Python's requests (which is my real answer as to what you should use in python, but I wanted to focus on the stdlib), which has a json() method on the Response object for convenience.
Most languages nowadays tend to implement some variation of this specific convenience because it's just one of the most frequently needed things; setting up a separate parser and then calling it might be the "cleaner" option, but it's also more boilerplate and the industry has largely moved to try and avoid that.
I agree comparing hello world code in Python and Java is a bit pointless. Hello worlds might be much shorter, but at scale, this becomes less relevant. Also, comparing to a language that hates change is unfair, if you compare it to C#, which is improving over time, you'll see the hello world also takes one line (but is 13 characters longer, so Python still wins!)
> if you compare it to C#, which is improving over time, you'll see the hello world also takes one line
Well yes, it does now, since C# 10 or so, but in Python it has been a one-liner since the beginning, 30 years ago, and over the decades it has built a following. And even with recent efforts to cut down on boilerplate, C# still has more cryptic syntax than Python.
I'd argue C# is more maintainable in the long run due to the static typing, but there is no way it is as accessible to beginners.
I remember the evolution of Python. When I started with my brand new Ubuntu installation, when Ubuntu was also very new, there were two languages, Perl and Python. Because Perl was more popular at the time I checked out some books in the library and wrote some scripts. But already then the consensus was that even though the libraries were lacking (!), it was the better, cleaner language. Another big push was the bad enterprise guys that like static typing (C, C++, Java) vs. the hipster dynamic language guys. Some tenets were broken: speed is not as important as language concepts that enable fast iteration. Agile was new, and Python was very agile. There was no real alternative to Python at that time that was as sane and beautiful. That's the reason why we now have such a nice variety of libraries: it is just a sane language to build on, if speed is not your business. Ruby was a contender, but not that much better in the end. Now Javascript has kind of taken the edge away from Python; even if it weren't for the browser and the web, Python would dominate much more. Javascript as a language is worse than Python because of its cruft, so I don't see a big rewrite of ML tooling into Javascript. Maybe Typescript.
Perl was very popular and I think Python got its boost from being seen as the more sane replacement for it. At that time nothing else really offered to solve the pain of Perl in the same way except Ruby, and Ruby was even slower.
I learned Perl first and found it incredibly useful just because of regexps, but Perl programs got messy as they got larger. Python regexps are just that bit less convenient, and that trivial amount matters, but in Python I feel that once you get to writing functions and perhaps even classes it accelerates far ahead of Perl, and you can write understandable, maintainable large programs in Python.
You're not forced to dip into the OO side of things though and I've heard people suggest to me that Python is a language where no rules apply and no structure exists. They feel that they can dive in without the care they would have to take in Java or C++. I think they are very wrong but you can certainly write terrible code if you want and perhaps this "all-things-to-all-people" aspect of it is part of the success.
Well said. The core to Python’s success, for me, has always been its system for containing code inside modules and for importing and exporting symbols between those modules.
Understandable code is maintainable code. Abstraction makes comprehension possible — imagine if every call to print were an inline block of assembly instead! — and Python’s modules provide a very quick and easy way to break up large code into small modules.
It's interesting that people keep claiming that indentation-based blocks makes Python easier to learn and read. I've been teaching programming to students in banking, finance and insurance for a few years, using Python, and my experience with them is the opposite. They have been struggling with that a lot. They didn't pay too much attention to white spaces and tabs, probably because they are "invisible", and couldn't see why a statement was not executed at the end of the block, because it was not indented correctly. For me, it was obvious, because as a professional programmer, I've been trained to pay attention to details. But for most of them, it made no sense. Explicit block markers are much easier to teach.
On the other hand, back in college I was a TA for an intro to programming course that used Java. This:
> They didn't pay too much attention to white spaces and tabs, probably because they are "invisible"
can get so much worse than people imagine, when the language doesn't enforce it. It was that experience that made me lean towards python being a good introductory language, to help people get used to correctly indenting code in general.
> Explicit block markers are much easier to teach.
The point of communication is to express something in a way that the listener understands it. Simply expressing something that is comfortable or normal to the speaker is not useful for either party if the goal is communication.
You're right. What matters is that explicit block markers are easier to learn (from the student's perspective). Being easier to teach (from the teacher's perspective) doesn't matter as much.
Sure, but without the white space defined blocks it is entirely on you to teach them about the importance of formatting their code to ensure readability.
My point was to express that your students need to learn to communicate with the programming language in a way that the language understands, not in a way that is comfortable for them as the “speaker”. You as the teacher should be setting that expectation.
Code is still completely unreadable with explicit block markers, if it is without appropriate indentation and newlines. That's like trying to read minified javascript. And python forces you to have these, to some extent.
I used to think the same, but go fmt totally changed my mind on this. Most languages now have a code formatter that is integrated in all popular text editors and IDEs.
IMO, the thing about beginners and relevant white space is that it forces them to learn how to indent their code, not that it makes it any easier to learn the language.
Python is the ideal glue layer for low-level languages, now mostly Fortran/C/C++ but there are already many projects exposing Rust libraries through Python, which for me is a great combo of performance and usability.
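As a toy illustration of the glue idea (my own sketch, stdlib-only, assuming a POSIX system where the C math library can be located): load a compiled C library and call into it directly. The same pattern, or nicer wrappers like cffi, pybind11, or PyO3, is how the heavy lifting usually gets delegated.
import ctypes
import ctypes.util

# Locate and load the C math library (POSIX; on Windows this lookup differs).
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0, computed in C, driven from Python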
Sure, but that's because there was a good library available, little to do with Python vs C++.
If there was an equally good C++ image library available then it would have been equally simple in C++.
Of course the thing is that, in general, there may well NOT have been such a C++ library, and this is what makes languages such as Java and Python popular - because of the breadth of libraries available.
For C++ image manipulation I've found libvips to be decent, but the point still stands that for a given problem there may not be decent libraries available.
Tangent: Something interesting (and frustrating) ... I went to the LinkedIn assessments to take the Python assessment and over half the questions were specifically about Numpy, it's API and matrix math. Which for me and what I generally do has nothing to do with "Python" and I was quite surprised to find that in the questions.
In my search for a senior Python developer I have encountered dozens and dozens of resumes whose Python experience consists almost exclusively of Numpy and data entry. Not at all the skill set I’m looking for.
As a pretty experienced python dev who has never worked professionally with Django / Flask or Numpy, I have a hard time finding job postings without those seemingly hard requirements.
It's been split between three areas - writing mathematical code that wouldn't really benefit from numpy, at least not at first (think engineering design codes - step-by-step calculations where the output has to be verifiable by a human; charts and graphs aren't the focus (I'm not at this company anymore)), writing testing infrastructure for a legacy client/server application that was designed around the time Ethernet was invented (well before I was born), and writing library code for my QA team to write tests for said application.
Most of these domains are sort of document-oriented - for the math stuff, the hard part is just defining the model, and there is only one "thing" to operate on. For testing, the units of work are test cases and steps, which _could_ be database entries, but work better in practice as a document that a non-technical QA team member could edit by hand. Results are fed to a SaaS that keeps all the historical test result data.
Not that I couldn't pick up either one of these tools and use them (I'm a mechanical engineer by training and got into Python because I didn't like using Matlab/Octave), I've just never needed them professionally. But that doesn't get me past the resume filters :(
Are you calling the job 'senior python developer'? I haven't used any other language (as the primary one anyway) professionally (and I'm open to that remaining the case) but I probably wouldn't even open such a job description.
It's a fine article and makes many of the standard points. But I would expect a slightly more data driven analysis from GitHub given the large amount of data available to them. For example, what percentage of first repositories (new GitHub users) are in Python? What percentage of GitHub Python repositories use a data science library? Etc.
They don't mention that Python makes it really easy to interoperate with other languages via the subprocess module. So you can run Javascript web code via node, wait for it to finish, then move on to something else. Or you can launch a C++ process that can do real parallelism and wait for the results, avoiding many issues with the Python GIL.
Also, Python makes it easy to work in different programming paradigms, it's as easy to write functional-type code as it is to write object-oriented code, both approaches are supported. This makes it a great prototyping system if you want to rewrite the code later in some other language like C++ to get some performance improvements.
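A small sketch of that interop pattern (mine; "script.js" and "./crunch" are placeholder names for a Node script and a compiled C++ binary you'd already have):
import json
import subprocess

# Run some JavaScript under node and grab whatever it prints.
node = subprocess.run(["node", "script.js"],
                      capture_output=True, text=True, check=True)
print("node said:", node.stdout.strip())

# Launch a C++ worker that does the truly parallel part, then parse its output.
cpp = subprocess.run(["./crunch", "--threads", "8"],
                     capture_output=True, text=True, check=True)
results = json.loads(cpp.stdout)
print(results)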
> Python makes it really easy to interoperate with other languages via the subprocess module
Launching processes and interacting with them through stdin and stdout is a bare minimum, and really a flag against the few languages that make it hard, instead of a relevant feature.
> Python makes it easy to work in different programming paradigms
As long as you stay in imperative, non-pure, structured code with optional OOP and first-class functions. Again, that's not much in the way of multi-paradigm. This is the set of things that works well with the imperative model.
I just wish that I enjoyed using it. I've been doing a fair amount of Python work over the years because it's been required by my employers, but to be honest, I kind of hate it.
I think the thing I hate the most about it is that white space is significant. It's like we travelled back in time to the early days, and picked up one of the bad things about them and brought it back. I find that makes it more difficult for me to read, and more difficult for me to write (from a mechanical type-in-the-code point of view).
I have other niggling issues with it, but if the white space thing weren't an issue, I doubt the other issues would bother me enough to complain about them.
Slashdot called and want their comment back, haha! Whitespace blocks significantly reduce redundancy and are quite elegant imnsho.
If you’re having trouble with them, use a programmer's editor with indentation guides like notepad++ or geany. These are considered basic features these days.
Other simple things which reduce Python annoyances are tools like pyflakes and the blue formatter.
> If you’re having trouble with them, use a programmers editor with indentation guides like notepad++ or geany.
Yes, of course. Python makes that pretty much essential, which is one of my complaints about Python. A language that requires a special editor is a language that is deficient, IMO.
There are a ton of other, more minor, aspects of Python that makes it unpleasant for me. It's not just the whitespace thing. That's just the one that grinds my gears the most, because it's the one that slows me down the most.
But note what I haven't said here -- I haven't said that Python sucks. I only said that I hate using it, despite being pretty fluent in it and having used it for years.
It’s not required to use a basic editor from the early 90s, no… but it can help on a big codebase. Why handicap yourself? I make patches once in a while with nano/micro without issue. Maybe your functions are just too long, dunno.
A lot of folks can’t survive today without a giant IDE, while you advocate not using something downright tiny in comparison. Do you hate working with jpeg or blender files because they require software? Rather move bits with a magnetic needle and steady hand?
Admit I was put off by it for an hour or so as well. One spring day in 2001? That afternoon I realized it was a masterstroke and didn’t write another line of c, perl, or java by choice for almost two decades.
Two ways to delimit blocks is redundant. Either one indents already or the project is a disaster.
Again, just to make sure I'm being very clear, note what I said in my original comment. I wish that I enjoyed Python. I did not say, and don't assert, that Python is bad or that nobody else should enjoy it.
I just wish that I did.
> A lot of folks can’t survive today without a giant ide while you advocate not to use something tiny in comparison
I'm not advocating anything. I'm stating my personal preferences. But I also make sure that I don't rely on a giant IDE for any language. I use a couple of different ones at work, but I don't use one at home (even though my hobby projects are no less complex), because I've found that using IDEs encourages me to engage in poor programming practices. I am not saying that nobody should use IDEs or that they make people worse programmers. I'm speaking for myself. I want to deeply understand the languages I use.
> Do you hate working with jpeg or blender files because they require software? Rather move bits with a magnetic needle and steady hand?
Of course not. Those are not human-readable data collections and need a tool to make them understandable by a human. A programming language is supposed to be directly understandable by a human, though, and if you need a tool to make it so, that strikes me as a failure of the design of the language.
> Two ways to delimit blocks is redundant. Either one indents already or the project is a disaster.
Eh, each to his own. Yes, there is a redundancy there, but I think it's a redundancy that brings value and reduces error.
Not sure how well-known but many 3D scene formats are text, especially early ones. Similar to svg conceptually, which many are familiar with. At a certain complexity, writing it by hand is no longer practical and software support approaches necessity.
This is very true. I do 3D printing, and am also very familiar with the fact that STL files are text files. I even edit them by hand from time to time -- but I still use software to manipulate them, for obvious reasons.
It boils down to "the right tool for the job", of course, and like every developer, I choose my tools with an eye toward the goals I want to accomplish. That my goals and yours aren't 100% aligned is to be expected. We're different people. And that also means that at times, a tool that is appropriate for your use case may not be appropriate for mine. And vice versa.
It isn't redundant though because without delimiting symbols for a code block you lose the ability to have your code autoformatted in certain situations. Here's a trivial example to illustrate the point:
def example():
    x = 5
print("Hello world")
What's the mistake here? Depending on whether the print is part of the function, it should either be indented or have a newline before it. The point is you (and any formatting tool) can't know what the horizontal alignment of this code should be just by examining the vertical line order. You can only determine this by knowing (or reanalyzing) the semantics of the code. During a refactor where you're moving around lots of code, this can be a significant PITA. However, in the JS example,
function example() {
let x = 5
console.log("Hello world")
}
it's unambiguous what the mistake is because you can determine the correct formatting entirely from the line order, without having to know anything about the code's semantics.
The first example is a syntax error, which must be fixed, and takes a second. Not a PITA, just a part of normal day to day refactoring.
I see how a formatter could help you out in this specific situation. However you are trading typing redundant characters every few seconds, and readability every minute, to avoid an issue that happens once or twice a day, or once or twice a week on a mature project.
In other words we generally don’t significantly reorganize code nearly as much as reading, tweaking, adding features etc.
Readability is definitely a strength of Python, not a weakness. A lot of this is because of reduction of required notation.
The code runs so it's most definitely not a syntax error. May be against PEP styling rules, but it is valid Python code.
>which must be fixed, and takes a second.
And that was my earlier point. The responsibility to get the code in a state where it is formatted AND runnable falls entirely on you, as this work cannot be entirely delegated to software when the syntax uses significant indentation. And the fixing of the code would require you to reanalyze the code's semantics before you would even know what the appropriate fix is. Of course it would only take a second for a trivial example like the one I used, but real Python codebases aren't going to be trivial.
>typing of redundant characters
It actually doesn't require any additional typing as compared to Python. In modern editors the closing brace is automatically added when you type the opening brace. So you simply type the opening brace and hit enter, just as you would type the colon and hit enter in Python. Even the space between the closing parenthesis and opening brace in the function header is added automatically by formatters.
>readability per minute
Seems pretty subjective but I don't think there's a significant difference in readability between Python and braced languages, or even between Python and languages that use block delimiters other than braces, like Ruby. A lot of readability comes down to personal familiarity with a language.
>we generally don’t significantly reorganize code nearly as much as reading, tweaking, adding features etc.
Totally agree, and I think this is one of the big problems of software development. People generally don't want to make big reorganizational changes to code and instead prefer to change the code only through additions. As a result, legacy projects tend to accrete layers of cruft over time whether it is necessary or not. I'm not a Jonathan Blow fanboy by any means, but I recently saw this clip which I think makes a good point.
https://www.youtube.com/watch?v=ubWB_ResHwM
$ python3
Python 3.10.6 (...snip...)
>>> def example():
...     x = 5
...   print("Hello world")
  File "<stdin>", line 3
    print("Hello world")
    ^
IndentationError: unindent does not match any outer indentation level
Sort of an odd video, haha. But I liked the guy... think we could be friends. ;-)
Its idea is sort of neither here nor there, however, regarding whitespace blocks. That we "rent" more often than own is just a reality of the system we find ourselves in. (For example, cheap products sell more than expensive ones, and that is expected and ok.)
I don't want to optimize my projects around large refactors, since I only do them a few times per project but will read the code often.
That's because the REPL has the additional restriction of requiring a definition (or any top-level indented block) to end with a blank line. This restriction does not apply when running code from a file.
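To make that concrete, here's a minimal sketch of the file-based case, assuming the snippet above is saved verbatim as, say, example.py (a filename I'm making up just for illustration):

    # example.py -- the ambiguously formatted snippet from above, saved to a file
    def example():
        x = 5
    print("Hello world")

Running python3 example.py just prints "Hello world" once: the def ends at the first line that dedents back to its own level, example() is never called, and the dedented print runs as ordinary module-level code. No error, and no blank-line requirement.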
I see now. The issue was that pasting the snippet into a terminal included the extra level of indentation from the post (code has to be indented to render as a pre block). This changed the result.
Because its easy.
Because it has an interpreter.
Because it has notebooks.
Because it has a huge set of libraries & easy library imports.
Because schools are teaching it.
Because a lot of cloud native tools support it / prefer it.
Well yes.
But if you pump out millions of graduates with almost exclusively python experience, guess what they write in production once they have jobs?
Try dealing with your corporate security team, who are detecting out-of-date Python versions.
For the most part these update without much hassle, but now you've "broken" your researchers' code. It takes them a while to fix this, and then the security folks find out that Python packages also need to be updated...
I used Python for the last 10 years or so for all sorts of cross-platform command line scripting stuff, but watching Python3's progress I have the impression that this simple use case is no longer their main focus (I sometimes wonder if there's any focus or vision at all tbh). After having tinkered with Deno for the last 2 weeks or so I must say that this is indeed the 'better Python' (for me at least, and not because of Typescript - the language is more or less just an implementation detail for writing command line utilities - but because of Deno's approach to package management).
Python has been my favorite for products and scripting. I was advocating for Python's dominance back when the world was after Java. But now I'm more inclined towards JavaScript. And I agree with you, Deno is taking the right approach (as opposed to Node.js).
Having JavaScript on both the front end and the back end can reduce resource requirements as well.
I wish I could wave a magic wand and replace all the Python in the world with JavaScript.
The languages are practically equal in terms of features. They're both dynamically typed, with layered-on crutches available to make the runaway dynamism less painful. They both have weird footguns and ugly syntax and annoying design flaws, but these are different for each. So if you're forced to use both languages, it's an endless pain in the ass to remember which stupid runtime error can happen in JS but not Python and vice versa.
So JS can replace Python... But Python can't replace JS simply because it's unavoidable on the front-end. Since you must have JS code anyway, why use a language that's functionally very similar but different enough in syntax and API that you'll be writing bugs all the time? Just stick with JS and start drinking like me.
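One classic example of the kind of footgun I mean on the Python side (JS has its own equivalents):

    # A mutable default argument is evaluated once, at definition time,
    # so state silently leaks between calls.
    def append_item(item, bucket=[]):
        bucket.append(item)
        return bucket

    print(append_item(1))  # [1]
    print(append_item(2))  # [1, 2]  (surprising if you expected a fresh list)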
Since I have an all-powerful magic wand that rewrites history, I guess I could use it to make Python the default language in Netscape Navigator 2.0 and then everything written in JS since 1997 would be in Python instead...
But I don't actually want to do that. The Python design philosophy isn't really compatible with loading embedded, sandboxed user programs over the network. A hypothetical Python-Netscape would have crashed and burned with security holes like ActiveX, IMHO.
Somewhat agree. In fact for a recent project I began writing in JS for this reason but quickly ran into opinions from team mates with no JS exposure asking "Why wouldn't you use Python?", so I did. But, I think JS as a Python replacement is problematic because:
1. Some probably usable version of Python is already on your computer. Not true for JS.
2. Forced use of async in JS library code, even though Node.js doesn't require it.
Here's how out-of-the-loop I am: I don't know the command-line method for invoking the JS runtime on my computer! What is it? I regularly image my laptop, so I'd like to know what the system JS command line is. (I use Linux & macOS.)
The CLI runtime that 99% of people use is Node.js, available via the same package manager where you'd get Python 3.
Its dependency management story is not great, but in general better than the horrible mess on the Python side.
On macOS there's a system runtime called JavaScriptCore.framework. It's actually a very good and useful engine. If you're writing native code, you could link to that. But of course it doesn't provide any of the Node.js API, just the JS standard library which is tiny. So the most typical use case would be to embed a JS API into your application.
Github is primarily a website for developers, right? Who is the blog for?
This article is garbage. It reads like SEO trash and I'm not sure why Github feels the need to publish it.
It's just a list of common Python talking points. Points that the author displays a poor understanding of.
The article doesn't really attempt to explain why python keeps growing but the facts listed vaguely suggest that it "keeps growing" (not sure when it was supposed to die) because it is Good and Popular. Which makes sense, I guess.
If the title was '10 Cool Facts About Python (a Programming Language!)' I would probably find it a little bit funny.
A lot of Java's verbosity isn't so much from the language, but from when the code was written. It happened to come to popularity at a time when big over-engineered Gang of Four-style design was hot. So you get a lot of code written in this over-engineered fashion where half the classes have names that end in DelegateFactoryFacadeMessengerImpl.
Some of it is the language too, but modern Java can definitely be reasonably terse. Record classes have done a lot for the language, as have lambdas and streams.
Java is still good for several use cases that were popular at the time, when much more code had to be written by hand. In the enterprise world you now have much better no- or low-code tools. But in large teams all of those public/private/protected keywords are quite nice; totally unnecessary for a small codebase, but Java, with its great runtime and verbose patterns, is great for web server applications built by large teams. I would still use it as the foundation for a tech company, but not for the Fortune 500 company that just needs some customization. There is no alternative where you can find the people and get the same stability. Maybe Rust, if you have the clout as a company; otherwise it's too expensive.
For its warts, Java was supported by a company with a lot of smart engineers who were working to provide a “batteries-included”, cross-platform language ecosystem and Sun was really invested in its success. I’d love to see some of these successor languages get the same kind of corporate sponsorship, but I don’t think it’s profitable.
Protected data was never as useful to me as package-only data. The former is only relevant with inheritance hierarchies, whereas the latter is more open while also restricting access only to other classes in your com.example.concern package. Extremely useful for keeping big teams on the right track.
* Isn’t testable because it runs the sys.exit() call unconditionally on import, so there is no way to import the module and call main() from test code.
* Fails with an error, because main has a required argument that isn't supplied in the unconditional main() call inside the sys.exit() call.
* Isn’t following a typical Python idiom by returning a value from a function and having it be unconditionally 0; if the only options are a fixed return and an exception somewhere, the convention is to return None (which doesn’t require an explicit return).
* Since a normal exit is the default response of ending the main program, sys.exit() is unnecessary here; “Exit with 0 if main() returns successfully, and with an abnormal exit on exceptions in main” is achieved by just calling main() and having nothing after it in the main program.
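Putting those points together, a minimal sketch of the shape they suggest (the body of main() is just a placeholder):

    def main():
        # do the actual work; raise an exception if something goes wrong
        print("Hello world")

    if __name__ == "__main__":
        main()  # guarded, so importing this module from a test does not run it

The module can be imported and main() called from test code, main() takes no required arguments, and a normal return exits the process with status 0 without any explicit sys.exit().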
> I would argue that there is nothing simple about the last if-statement and will probably be very confusing for new developers.
The only thing that isn’t perfectly straightforward is “what is __name__”, but once you know how __name__ is defined…
OTOH, that example is much more complicated than needed.
As a simple executable script that would do the same thing (if antigravity.fly() actually did anything), all that is needed is:
import antigravity
antigravity.fly()
The rest is unnecessary boilerplate to provide a module that can operate either as a library or a script, which is superfluous here. Moreover, since the actual functionality it is demonstrating is all in an import hook, all you actually need (the rest just produces an error message) is:
import antigravity
Easy to learn yet powerful.
Easy to read yet very concise.
Anytime you need something complicated, it's a pip and an import away.
Couple of downsides though:
- slow
- does not handle multiprocessing nicely.
If the Python team ever manages to fix these two (not easy, or it would have happened a while ago) to bring it to, say, the level of Go, Python would be the absolute killer language.
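On the multiprocessing point: the usual workaround today is process-based parallelism, which works but carries pickling and startup overhead. A minimal sketch:

    from multiprocessing import Pool

    def square(n):
        return n * n

    if __name__ == "__main__":  # needed on platforms that spawn rather than fork
        with Pool(processes=4) as pool:
            # arguments and results are pickled to and from the worker processes
            print(pool.map(square, range(10)))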
I think people embrace Python because they think it's easy to learn and use. While that might be true, the real complexity lies not in learning and using a language (unless it's Rust) but in learning the libraries, frameworks, idioms, dos, don'ts, tooling.
How difficult would it be to transpile a Python program into a more performant language using an LLM? This would solve so many issues. Write code in Python and convert it into C++/Rust/assembly for performance.
> How difficult would it be to transpile a Python program into a more performant language using an LLM?
It'd be a lot easier (and more reliable) just to use a traditional compiler, rather than an LLM imitating a compiler.
The problem is that if you maintain the full scope of Python semantics, there is usually no advantage to this. Compiling parts of a Python codebase, with restricted semantics, can give you significant gains in some use cases, and there are tools (in Python!) that do that already.
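Numba is one example of that restricted-semantics approach (just picking one for illustration; the comment above doesn't name a specific tool): decorate a numeric function and the hot loop gets JIT-compiled to machine code.

    import numpy as np
    from numba import njit  # assumes numba is installed

    @njit
    def total(xs):
        s = 0.0
        for x in xs:  # this tight loop is compiled, not interpreted
            s += x
        return s

    print(total(np.arange(1_000_000, dtype=np.float64)))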
I think the idea is that a person can read a project in python and re-implement it more efficiently in C++. The LLM could do the same thing. However, I don't think LLMs are quite that powerful.
So impressive to see all trading solutions/bots made with Python! What do you think about Julia in this field: I see it as a viable alternative even if it lacks the huge amount of Python libraries?
Optimistic that we'll see maybe a 2x performance improvement over the next 5 years, yeah. Optimistic that Python will catch up to any of the 'fast' scripting languages, no.
I switched to Python last year because it has an excellent built-in standard library. No more having to download random crappy libraries that download even more crappy libraries.
I remember when Twitter rewrote the code in Java because Ruby wasn't suited for the task. Probably this will happen to many large Python code bases in the future.
Perhaps, but it seems the shelf-life of anything AI/ML related (huge part of Python usage) is so short that it'll become obsolete before it ever becomes worth rewriting.
Dare I say, the article is quite a shallow take... but maybe that's all there is - the apparent ease and the few industries that have been built on top of it?
Getting python bootstrapped is less easy than getting make bootstrapped, plus python isn't a GNU tool so I don't see them adding a dependency on it to all their packages.
GNU Make is really far more sophisticated than people give it credit for and it's not that easy to replace it.
Well as I pointed out, it does happen e.g. in Meson.
Make doesn't do any of the things you mention - it just works out what targets need to be rebuilt and the code to do it is in the rules that you have to write in a shell language like bash.
I admit you can use builtin implicit rules for C and get away with that up to a point but only a fairly low point.
python is far harder to use than bash when it comes to running processes and doing things with them and doing the file-system manipulation that is usually wanted when building.
There are modules that make it easier, but so far as I've seen it's a crap choice. Python also takes a long time to start up, so there is a cost to running a "clean" environment for every build step. OTOH, if you use the same interpreter for the whole build, you can introduce all sorts of ordering problems, where a build runs on a different machine and doesn't work because it's executing build tasks in a different order.
If you really want to build by "writing a script" (groan - because that's the age old horrible solution) why bother with python?
>python is far harder to use than bash when it comes to running processes and doing things with them and doing the file-system manipulation that is usually wanted when building.
Can easily be solved by having libraries that do, which is a far better solution than hacking together and chaining five different obscure Unix tools to do string manipulation.
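A minimal sketch of what I mean, using only the standard library (the file names and the gcc invocation are made up for illustration):

    import shutil
    import subprocess
    from pathlib import Path

    build_dir = Path("build")
    build_dir.mkdir(exist_ok=True)

    # run one build step, failing loudly if it fails
    subprocess.run(["gcc", "-O2", "-o", str(build_dir / "app"), "main.c"], check=True)

    # the kind of file shuffling you'd otherwise do with find/mv
    for log in Path(".").glob("*.log"):
        shutil.move(str(log), build_dir / log.name)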
>you can introduce all sorts of ordering problems when a build runs on a different machine and doesn't work because it's executing build tasks in a different order.
Surely it is better than make? Even if make is "better" because it "just works", this is basically Hyrum's Law waiting to be unleashed.
>If you really want to build by "writing a script" (groan - because that's the age old horrible solution) why bother with python?
Because I personally cannot decipher make, and I think this is the experience of many developers as well. And how many make derivatives are there? It's basically an arcane spell that needs to be chanted with every project (configure -> make -> make install).
Funny enough, this sort of sentence, i.e. one with a missing word, is a very common ML training task for large language models like GPT-3: the model's objective is to predict the correct word from a set of possible answers.
Python’s “simplicity” and “beauty” are always so over-emphasized. Maybe compared to something like C or Fortran or R… But it’s a pretty messy, complex language.
The growth is actually too fast. I was a Python expert. Now it's not a big deal any more because the language is so easy. I've hit interviews where they didn't want me to use Python because it's too easy.
I branched out to C++ and other languages to stay ahead of the curve. Python is becoming like English: required to know, and not really a deep skillset employers are looking for now.