Python finally offloads some batteries (lwn.net)
84 points by jwilk on March 17, 2022 | 60 comments



Good, but let's talk about the other side of things: Features missing from Python's standard library.

Here are some features I miss regularly and I believe should be part of the standard library:

- support for modern compression algorithms (e.g. brotli, zstandard)

- support for modern hashing algorithms (e.g. argon2, bcrypt)

- support for parsing ISO 8601 date times with "Z" as timezone designator

- a TTL-cache implementation (like cachetools offers)

- support for reading/writing YAML
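
For the ISO 8601 item, a minimal sketch of the usual workaround, assuming the input is otherwise something `datetime.fromisoformat()` accepts. `parse_iso_z` is a made-up helper name, not a stdlib function. As far as I know, Python 3.11 and later accept the trailing "Z" directly, so this only matters on older versions:

```python
from datetime import datetime, timezone

def parse_iso_z(text: str) -> datetime:
    """Parse an ISO 8601 timestamp, treating a trailing 'Z' as UTC."""
    # Older fromisoformat() rejects 'Z', so rewrite it as an explicit offset.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)

parsed = parse_iso_z("2022-03-17T12:30:00Z")
```

The result is a timezone-aware `datetime` with a zero UTC offset.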


> Features missing from Python's standard library.

PHP's base_convert() equivalent.
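
A rough sketch of what such a helper could look like, assuming non-negative integers and bases 2 to 36 only. The name `base_convert` is borrowed from PHP for illustration; this does not attempt to mirror PHP's behaviour for invalid input (it simply raises `ValueError`):

```python
import string

# '0'-'9' followed by 'a'-'z': enough digits for bases 2..36.
DIGITS = string.digits + string.ascii_lowercase

def base_convert(number: str, from_base: int, to_base: int) -> str:
    """Convert a non-negative integer string between bases 2..36."""
    if not (2 <= from_base <= 36 and 2 <= to_base <= 36):
        raise ValueError("bases must be in the range 2..36")
    value = int(number, from_base)  # int() already parses bases 2..36
    if value < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    if value == 0:
        return "0"
    out = []
    while value:
        # Peel off the least significant digit in the target base.
        value, remainder = divmod(value, to_base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))
```

Half of this already exists as `int(s, base)`; only the "integer back to an arbitrary base" direction has to be written by hand.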


The best way to talk about this is to make a PEP about it.


PEP 594 – Removing dead batteries from the standard library - https://news.ycombinator.com/item?id=30673597 - March 2022 (90 comments)


It’s interesting to see which modules were set for deprecation but ended up being un-deprecated (de-deprecated?):

  colorsys
  fileinput
  getopt
  optparse
  wave
https://peps.python.org/pep-0594/#modules-to-keep

Seems like the maintainers have been generous in deciding which batteries are not dead yet.


i mostly agree with those choices.

colorsys: a small handful of conversion functions you need every now and then

fileinput: quick and dirty boilerplate for when you want to accept input from files or stdin

getopt: a fine, well understood command line parser

optparse: it's a simpler more broken argparse that will eventually tell you "i'm sorry dave..." and people should stop using it

wave: great way to create or process pcm audio data without a whole framework
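
To illustrate the wave point, a small sketch that writes one second of a sine tone as 16-bit mono PCM into memory. The sample rate, frequency and amplitude are arbitrary choices for the example, and `sine_wav_bytes` is a made-up name:

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000   # samples per second (arbitrary for this sketch)
DURATION = 1.0       # seconds
FREQUENCY = 440.0    # Hz

def sine_wav_bytes() -> bytes:
    """Return a mono 16-bit PCM WAV file containing a sine tone."""
    n_frames = int(SAMPLE_RATE * DURATION)
    # Half of full scale, so the values stay well inside the int16 range.
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * FREQUENCY * i / SAMPLE_RATE))
        for i in range(n_frames)
    )
    frames = b"".join(struct.pack("<h", s) for s in samples)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
    return buf.getvalue()

data = sine_wav_bytes()
```

Writing `data` to a file with a `.wav` extension should give something any audio player can open.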


I disagree that optparse is more broken than argparse. As someone who wanted to add some (trivial) features to argparse, I quickly backed away from that code [1]. To be honest, I was really surprised how needlessly complex that particular code is.

[1] - https://github.com/python/cpython/blob/main/Lib/argparse.py


Argparse is also broken. It's not possible to create an argument that captures all remaining arguments even though the docs say it is.

There has been an open bug for this for many years.


Link to the bug?


https://bugs.python.org/issue17050

Apparently after 7 years they decided it can't be fixed and updated the docs.


I'm baffled by the resolution of this bug. Removing the docs doesn't help anybody.

In my experience, argparse.REMAINDER works reasonably well, and apparently I'm not alone; there's >100 packages in Debian that use it:

https://codesearch.debian.net/search?q=%5Cbargparse%5B.%5DRE...
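
For reference, a sketch of the pattern that tends to work: REMAINDER placed after at least one ordinary positional. It follows the shape of the example from the older argparse documentation. The failure mode from the linked bug is not reproduced here:

```python
import argparse

parser = argparse.ArgumentParser(prog="PROG")
parser.add_argument("--foo")
parser.add_argument("command")
# REMAINDER collects everything after the preceding positional,
# including tokens that look like options.
parser.add_argument("args", nargs=argparse.REMAINDER)

ns = parser.parse_args(["--foo", "B", "cmd", "--arg1", "XX", "ZZ"])
```

Here `ns.command` is `"cmd"` and `ns.args` holds the untouched tail, which is the typical "wrapper script passes the rest to a subcommand" use.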


i suppose they are both broken, but i claim optparse is "more broken" because it lacks the nargs="+" and nargs="?" functionality, and doesn't support positional arguments.

it's also valid to suggest that a plate of spaghetti is better than a cauldron full (thus favoring optparse).
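
A short sketch of the argparse features mentioned above (the argument names and the `demo.log` default are made up for the example):

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# Positional argument that needs one or more values.
parser.add_argument("files", nargs="+")
# Option with an optional value: bare --log uses const, absent uses default.
parser.add_argument("--log", nargs="?", const="demo.log", default=None)

with_flag = parser.parse_args(["a.txt", "b.txt", "--log"])
with_value = parser.parse_args(["a.txt", "--log", "out.log"])
without = parser.parse_args(["a.txt"])
```

With optparse, the positional part would have to be pulled out of the leftover argument list by hand.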


Thank you. I thought I was crazy for feeling it was too complex.


precated :)


comprecated?


Resurrected


What are you supposed to use instead of urllib? Seems like an oversight to write all about how you considered removing it without saying whether there is a replacement.


From the proposal to remove urllib linked in the article:

https://lwn.net/ml/python-dev/CABqyc3wGDmdnDjrhYh0SxT_Tgr5M8...

  From:       Victor Stinner <vstinner-AT-python.org>
  To:         Python Dev <Python-Dev-AT-python.org>
  Subject:    [Python-Dev] It's now time to deprecate the stdlib urllib module
  Date:       Sun, 06 Feb 2022 15:08:40 +0100
  Message-ID: <CABqyc3wGDmdnDjrhYh0SxT_Tgr5M8Za3JTw4CUapnSOVQ-ci3A@mail.gmail.com>
  
  Hi,
  
  I propose to deprecate the urllib module in Python 3.11. It would emit
  a DeprecationWarning which warn users, so users should consider better
  alternatives like urllib3 or httpx: well known modules, better
  maintained, more secure, support HTTP/2 (httpx), etc.


Those don't come with python though, so the standard library wouldn't have any similar functionality. And both of the suggestions have a direct dependency on urllib.

Edit: So who maintains urllib, a dependency for the "better" libraries, if you push it out of python core?
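
For context, a small sketch of the kind of thing `urllib.parse` provides in the standard library today. Which parts of urllib the third-party clients actually import is not checked here, and the URL is made up:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Build a query string; spaces are encoded as '+' by default.
query = urlencode({"q": "dead batteries", "page": 2})
url = "https://example.org/search?" + query

# Split the URL into components and decode the query again.
parts = urlsplit(url)
params = parse_qs(parts.query)
```

None of this touches the network; it is only the URL handling part of the package.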


Yep. They kind of hint at the end that the real solution is to actually fix urllib. Removing it is a mistake IMO, even if it's for "security reasons" but especially if it's for a bad API. I'd rather it be included and kind of broken than removed.


What is the alternative to “cgi”?


You don't have to find an alternative. Just copy the parts you need into your own code: https://github.com/python/cpython/blob/main/Lib/cgi.py


There seems to be overlap with wsgiref, and that ships with Python. There's a CGIHandler class that can use stdin/stdout.

Most of the examples show spinning up and listening to a port, but you can google around to find examples that work with just stdin/stdout.


Since there's no dead simple examples:

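  #!/usr/bin/env python3
  from wsgiref.handlers import CGIHandler

  def app(environ, start_response):
      # Minimal WSGI app: send status and headers, then return the body as bytes.
      start_response('200 OK', [('Content-Type', 'text/html')])
      return [
          b"<html><head><title>foo</title></head><body>bar</body></html>\n"
      ]

  if __name__ == '__main__':
      # CGIHandler takes the request from the CGI environment/stdin and writes to stdout.
      CGIHandler().run(app)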
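
If you also need form parameters, which is what people mostly used the cgi module for, a sketch of the same idea using `urllib.parse.parse_qs` on the query string. The `name` parameter is hypothetical, and this only covers GET-style query strings, not multipart uploads:

```python
from html import escape
from urllib.parse import parse_qs

def app(environ, start_response):
    # QUERY_STRING comes from the CGI/WSGI environment; default to empty.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Escape user input before putting it into HTML.
    body = f"<html><body>Hello, {escape(name)}!</body></html>\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

It can be served the same way as the example above, by passing `app` to `CGIHandler().run()`.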


wsgi, or more realistically AWS Lambda or other modern FaaS platforms that run even simpler python scripts than cgi.

You're missing the point (and the hundreds of comments in the three years this pep has been open and debated)--the cgi library never would have been included in python if it were proposed today. It doesn't meet the bar for including in the core just like modules to handle json-rpc, grpc, etc. aren't included today. If you want cgi support pull in a high quality third party library to do it.


Honestly it's kinda disappointing that the CGI library would not be included in Python if proposed today. Why would wsgiref get included then?

I get the arguments for a small standard library in theory, but given how absolutely nightmarish it is to deal with python requirements in a "will work every time" way, I don't know where we end up with the Python standard library in general.

Not making a slippery slope argument for the list of packages in this change. Just like, even when languages like Rust are built from the ground up with good package management, everything requires 1000 different packages and it's a whole thing.

So if you don't have a great way of shipping dependencies, and on top of that the standard library is in "please don't let me get bigger", it makes me a bit sad.


This is an argument for good package management. There’s no way around it. You can’t add everything anyone could ever hypothetically need into the standard library, just because Pip is rubbish.

(I mean, I don’t care. I use Rust. I’m just arguing on behalf of the poor people who are still abused by their employers, or wacky enough to voluntarily choose to use Python.)


I use Rust and Python, and I don't believe that Rust's strategies regarding standard libraries are great either. And let's not talk about Javascript "in order to use CSS with Webpack you need to download this 20 line package with its own release cycle"-style nonsense.

I do believe that package management solutions are needed. But I also think that a big standard library is A Good Thing, especially in a universe where some random micro-package will just lose its maintainer and cause 100 knock-on effects.


> But I also think that a big standard library is A Good Thing, especially in a universe where some random micro-package will just lose its maintainer and cause 100 knock-on effects.

Standard libraries have the same problems; in fact, part of the reason these are being removed from Python's stdlib is...they don't have maintainers.


What I have had a lot of is that infra fixes (like "oh this is now a keyword in the latest Python release, there's a trivial fix") just end up not happening and then your packages are all in limbo.

I understand not having maintainers for actual bug fixes of tricky things. But the "maintainer disappearing" scenario has almost always been just for keeping things running. And in particular people show up with patches! Just well... some arbitrary person was a maintainer, is no longer there, and now there are a bunch of people who would love to do most of the work.

I think that "lack of succession strategies" is what really gets these packages, rather than the lack of willing maintainers. Combine that with deep dependency trees and you have a lot of stuff that needs to happen.

I get it though. And I maintain some packages, but I don't get involved in standard library shenanigans cuz I imagine there's a lot of friction there. I just like that things will probably be there for a while if they're in the standard library.


A large "standard platform" may be a good thing, but it should definitely be decoupled from the language proper. (And personally I would say Javascript is doing it better than most languages on the library management front).


Eh, it's hard to talk about JavaScript because of all the different layers. My personal take is: JavaScript itself (ECMAScript) is an ugly and flawed but clearly generally productive language; Node is a work of art; NPM is horrendous with near-to-no redeeming qualities. TypeScript redeems lots of the problems with JavaScript, and I'm hopeful that Deno will ameliorate a lot of the problems with NPM (Yarn is good for what it is, but barely covers 3% of the necessary changes).

Also, an honorary mention for ESBuild (and the lesser-known SWC, extremely similarly optimal but eking out a small win due to Rust's advantages over Go). Both of them are terrific improvements over the extant tooling.


Oh man. I will miss ossaudiodev. I used it for a random project many years ago.


That's the point; you used it "many years ago".


This is stupid. I use cgi all the time. My cheap Hetzner single core vps can fork a Linux process 10k times a second, so I don't want to hear about forking being too slow until Django can serve pages that fast.

Cgi is still the simplest way to get a basic web script running, without worrying about lingering state between page hits, etc. Doesn't Linus always shut down such discussions regarding the Linux kernel with "we don't break userspace"? Python should learn something from that.
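
For anyone who hasn't seen one, a sketch of a bare CGI script that doesn't import the cgi module at all, assuming the web server sets the usual CGI environment variables. The `who` parameter and the `build_response` name are made up:

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs

def build_response(environ) -> str:
    """Build a full CGI response: headers, a blank line, then the body."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    who = params.get("who", ["nobody"])[0]
    method = environ.get("REQUEST_METHOD", "GET")
    body = f"<p>Requested by {escape(who)} via {escape(method)}</p>\n"
    # In CGI, headers and body are separated by an empty line.
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body

if __name__ == "__main__":
    # Each request is a fresh process, so there is no state to clean up.
    sys.stdout.write(build_response(os.environ))
```

The process starts, prints one response and exits, which is where the "no lingering state" property comes from.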


Interesting, I’ve never seen the cgi module used before. It’s good to know it exists.

Here’s an example for those interested: https://dzone.com/articles/python-simple-http-server-with-cg...


Yep, if you want to throw a ghetto GUI onto your python script, easiest way is to run CGIHTTPServer (now renamed in py3), write some 1990's style basic HTML, and point a browser at it. Use for customer facing sites? No. But for your internal status checking script or personal photo tagger, it's low fuss, low dependencies, and completely practical.


It's not forking the process that's a concern, it's what happens next. How slow is Django to start up? Do you want to pay that penalty for every request?


Django as a cgi does sound painful, but that's because it's huge. Most cgi scripts I've written are quite simple. And they rarely have to serve 100s of requests per second. And of course it's possible to optimize the startup time, e.g. by dumping an already-initialized version of the script (even django). Emacs has done that forever. It wouldn't surprise me if there is some way to do it in Python.

I remember reading somewhere that amazon.com was originally implemented as CGI's written in C. And, Perl cgi scripts served the whole web for many years, on servers much slower than what we have now.


Which Commodore format is being removed? Can't find it in the list of PEPs.



The batteries are the strong part of Python. If you remove the batteries, then you can easily dump python for something else.


Not sure I agree with that. The strong parts of Python are that it is semantically simple and easy to pick up.

Removing some batteries just means a couple more pip installs


Does anyone actually have normal users run Python programs? I thought programming was intended so that users could run your programs. But if a regular non-programming user wants to use a Python program, they need a whole tutorial on how to install and run it.

Put everything in the standard library, or nothing at all, it doesn't really matter. Either way you still have to set up a custom environment and install a bunch of extra stuff and learn "how to run Python programs".


Lots of people use python apps without knowing it. I don't get where the idea that it's complicated comes from. You run them the same way as everything else. For example if you use anything redhat-based you're running a python app (https://github.com/rpm-software-management/dnf). If you're running any desktop environment, you're likely using at least one python app or something with python scripting embedded.


Not sure what this is really on about, you can easily run a python program opaquely to a user by packaging it as an executable… or simply putting #!/usr/bin/env python3 at the top of the script and then running `chmod +x script.py` on Linux.

But to answer your question, yes, millions of people run Python programs without ever knowing anything about Python.
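
One stdlib option for the "package it" route is `zipapp`, which is my substitution here; other packagers are not covered. A sketch, assuming a throwaway one-file app created in a temp directory (the names `hello_app` and `hello.pyz` are made up):

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

def build_and_run() -> str:
    """Package a one-file app into a .pyz archive and run it once."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "hello_app"
        src.mkdir()
        # __main__.py is the entry point a zipapp archive executes.
        (src / "__main__.py").write_text('print("hello from a packaged script")\n')
        target = pathlib.Path(tmp) / "hello.pyz"
        # The interpreter argument becomes the shebang line of the archive.
        zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")
        # Run via the current interpreter so the sketch does not depend on PATH.
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

output = build_and_run()
```

The user still needs a Python interpreter installed, so this is closer to the shebang approach than to a real standalone executable.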


> Does anyone actually have normal users run Python programs?

To give one example: Every Dropbox customer does, since their client is written in Python


I was hoping this article was going to talk about Python reducing its high energy usage relative to energy-efficient languages like Rust and C/C++.


I often wonder how many businesses actually need code to operate. Back in school we had a database design class where the prof asked us to design a system for a user base of 50 something users per year. We were all dumbfounded when the prof explained that our react/postgres/python stacks were overkill and that pen and paper or excel would’ve sufficed.


reminds me of an interview I had a couple days ago where the interviewers wanted me to design an app for a 50 person building. I suggested a monolithic server side web app that serves html with minimal JS and they were expecting load balancers, multiple az, microservices, react, sharded db. lmao


> urllib too?

Spoiler: no


How can you have urllib2 without urllib? That would be silly


Urllib2 was folded into urllib (as urllib.request and urllib.error) in python 3.


Thank $deity there's an 'urllib3' that's mostly used from 'requests' to add confusion back.


What's the exact motivation for this?

Just ask for more money, no reason Dropbox or Amazon can't pay for development. I'm worried this is going to break some use cases, not everyone has network access to install random pip packages


> not everyone has network access to install random pip packages

Lucky that the lack of network access will prevent these new versions of Python just as well.


Airgap install of Python is quite straightforward. I normally scp the Python source and altinstall on target server.

But airgap install of pip packages with myriad dependencies can be annoying.

So I agree with parent comment that having standard packages makes things easier.
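
For what it's worth, the usual two-step pip workflow for this, sketched as argv lists you could hand to `subprocess.run()`. The `requirements.txt` and `wheelhouse` names are placeholders, and the function name is made up:

```python
import sys

def offline_install_commands(requirements="requirements.txt", wheel_dir="wheelhouse"):
    """Return (download_cmd, install_cmd) as argv lists for subprocess.run()."""
    pip = [sys.executable, "-m", "pip"]
    # Step 1, on a machine with network access: fetch packages plus dependencies.
    download = pip + ["download", "-r", requirements, "-d", wheel_dir]
    # Step 2, on the air-gapped machine: install only from the copied directory.
    install = pip + ["install", "--no-index", "--find-links", wheel_dir, "-r", requirements]
    return download, install

download_cmd, install_cmd = offline_install_commands()
```

The annoying part is that the download machine has to match the target's platform and Python version, otherwise the fetched wheels may not be installable.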


They don’t provide more money. The PSF is mostly funded by PyCon (approximately $1 million), and they had to cancel one of them (also, PyCon is run by volunteers). Also, it has lots of contributors who are volunteers. If people wanted these packages to remain, someone needed to actually do the work of maintaining them.


Then they won’t be able to - and in any case won’t be forced to - install this new version of Python. This is silly. If we rejected any change because 3 people might have some farfetched problem, there would never be any progress.

The reality is that it’s acceptable for a few people in absurdly rare edge cases to have a bit of trouble, when it’s justified by much greater gain for everyone else.


There's an explanation in the PEP (linked a couple times already).



