What’s New in Python 3.8 (python.org)
833 points by supakeen on Oct 14, 2019 | 367 comments



As a developer who has primarily developed applications in Python for his entire professional career, I can't say I'm especially excited about any of the "headlining" features of 3.8.

The "walrus operator" will occasionally be useful, but I doubt I will find many effective uses for it. Same with the forced positional/keyword arguments and the "self-documenting" f-string expressions. Even when they have a use, it's usually just to save one line of code or a few extra characters.

The labeled breaks and continues proposed in PEP-3136 [0] also wouldn't be used very frequently, but they would at least eliminate multiple lines of code and reduce complexity.

PEP-3136 was rejected because "code so complicated to require this feature is very rare". I can understand a stance like that. Overcomplicating a language with rarely-used features can definitely create problems. I just don't see why the three "headline" features I mentioned are any different.

[0]: https://www.python.org/dev/peps/pep-3136/


> The "walrus operator" will occasionally be useful, but I doubt I will find many effective uses for it.

The primary one I want is

    if m := re.match(...):
        print(m.group(1))
and

    while s := network_service.read():
        process(s)
both of which are clearer and less error-prone than their non-walrus variants.

The other one that I would have found useful an hour ago is in interactive exploration with comprehensions. I frequently take a look at some data with [i for i in data if i[something].something...], and being able to quickly give a name to something in my conditional, as in {i: x for i in data if (x := i[something])}, helps maintain the focus on data and not syntax. Obviously it will get rewritten to have clearer variable names, at the least, when it gets committed as real code, and almost certainly rewritten to be a normal non-comprehension for loop.

Like comprehensions, I expect the walrus operator to be valuable when used occasionally, and annoying if overused. There's no real language-level solution to bad taste. In Python 3 you can now do [print(i) for i in...], and I occasionally do at the REPL, but you shouldn't do it in real code and that's not an argument against the language supporting comprehensions.


Coming from Perl, I used to want this badly, but then I thought that there's absolutely nothing wrong with

    m = re.match(...)
    if m is not None:
        pass
Now, I wonder what you meant by saying the single-line version is less error-prone, because I don't think it is. I believe they're exactly the same in this regard, except for the bizarre case where someone bastardizes the code by putting some irrelevant lines between the assignment and the comparison, obfuscating the logic.

Also, as I tend to use `if m is not None:` rather than just `if m:` (because I typically don't want to run all that subtle `__bool__` magic), it would - subjectively - look less pretty in the one-line version: `if (m := re.match(...)) is not None:`.

Use in the generator expressions to create aliases/shortcuts and avoid repetition is great, though. I love this use case.


The issue, in my opinion, is when you want something like this:

    while m := f(many, args):
        # do stuff with m
Now, if you're writing this in Python 3.7, you often end up with some code duplication:

    m = f(many, args)
    while m:
        # do stuff with m
        m = f(many, args) # duplicate
Or something like this:

    while True:
        m = f(many, args)
        if not m:
            break
        # do stuff with m
Personally, I consider the last version to be the most elegant, but still something of an anti-pattern. So I welcome this new operator.


I literally wrote your last example twice last week to hash files in 4k chunks. Those 4 lines would be reduced to 1 with the walrus operator. I welcome it, as well.

Edit: 1 line, not 2 (excluding the "do stuff")
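For reference, the 3.8 version of that chunked-hashing loop might look like this (a sketch I wrote for illustration, assuming SHA-256 and 4k chunks):

    import hashlib

    def file_digest(path):
        h = hashlib.sha256()
        with open(path, 'rb') as f:
            while chunk := f.read(4096):  # read and test in a single line
                h.update(chunk)
        return h.hexdigest()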


This comes up in recursive descent implementations as well https://gist.github.com/cellularmitosis/53913ad229bad5d0a3bb...


Why not something like this:

  def f_iter(many, args):
      while True:
          m = f(many, args)
          if m:
              yield m
          else:
              raise StopIteration
  ...
  for m in f_iter(many, args):
      # do stuff with m
This way you’re isolating all the initialization logic, error handling, etc. And you can focus on your domain logic in your client code.


Python has built-in facilities for that; there is no need to write a wrapper yourself. The two-argument form of iter(callable, sentinel) keeps calling the callable until it returns the sentinel:

   from functools import partial
   with open('mydata.db', 'rb') as f:
       for block in iter(partial(f.read, 64), b''):
           process_block(block)


Instead of raising StopIteration you could also just return from the function. You could also make it a bit shorter if you swap the if branches.

  def f_iter(many, args):
      while True:
          m = f(many, args)
          if not m:
              return
          yield m


Instead of raising StopIteration you have to return from the function, otherwise you'll just get a RuntimeError starting with Python 3.7.


You're right. The [changelog] for Python 3.7 states:

> [bpo-32670]: Enforce [PEP 479] for all code. This means that manually raising a StopIteration exception from a generator is prohibited for all code, regardless of whether ‘from __future__ import generator_stop’ was used or not.

[changelog]: https://docs.python.org/3.7/whatsnew/changelog.html#id115

[bpo-32670]: https://bugs.python.org/issue32670

[PEP 479]: https://www.python.org/dev/peps/pep-0479/


And less reasonably, something similar for regular expressions:

    for m in filter(bool, [re.match(...)]):
        ...
;-)


So I should write a whole other function that iterates over a list instead of being happy that := exists?


Yes! No pain, no gain!


The last version clearly exposes that you have an infinite loop. This is something hidden by the `while(evaluate)` expression.

Also, as far as pure form goes, I would have preferred `while (evaluate) as x:`, out of establishing a parallel with existing syntax, but that's not very important.


On the contrary: the loop isn’t fundamentally infinite, it’s just an artefact of the language that you have to write it that way. The walrus lets you put the termination where it belongs.


I don't see how the walrus has any significance with regard to `while` loop termination. It's the nature of the evaluated expression that makes the difference; more precisely, whether it returns something that evaluates as False.

In that sense, while loops over expressions always require knowledge of the expression to understand the semantics. On the other hand, something like:

    while True:
       with x as y:
           print(y)
would not require understanding of x to understand the intent. Therefore it is not semantically equivalent to:

    while x as y:
        print(y)
Note that in both cases, `:=` is strictly equivalent to `as` as it is defined for `with`. That is a matter of personal preference.


That last one isn't an anti-pattern. That's the pattern the PEP-315 discussion concluded was best for do-while.

https://www.python.org/dev/peps/pep-0315/#notice


Think about if you match multiple patterns:

    for line in f:
      if (m := pat1.search(line)) is not None:
        ... do stuff ..
      elif (m := pat2.search(line)) is not None:
        ... do other stuff ..
      elif (m := pat3.search(line)) is not None:
        ... do something else ..
In older Python that's:

    for line in f:
      m = pat1.search(line)
      if m is not None:
        ... do stuff ..
      else:
        m = pat2.search(line)
        if m is not None:
          ... do other stuff ..
        else:
           m = pat3.search(line)
           if m is not None:
             ... do something else ..
I think the newer version makes it clear that it's supposed to be a simple elif chain, where all branches follow the same structure.

There are other ways to structure it, but the alternatives I can think of also have their own cumbersome complexities.


If actions are different, I tend to put them in functions (independent testable pieces, yay!) and iterate over a mapping:

    LINE_ACTIONS = (
        (re.compile("pattern 1"), do_stuff),
        (re.compile("pattern 2"), do_other_stuff),
        (re.compile("pattern 3"), do_something_else),
    )
    ...
    for pattern, action in LINE_ACTIONS:
        m = pattern.search(line)
        if m is not None:
            action(m)
            break
Or even use a method registry pattern that would auto-populate LINE_ACTIONS by just declaring the functions:

    @action("pattern 1")
    def do_stuff(m):
       ...
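A minimal sketch of such a registry decorator (my illustration; it assumes the LINE_ACTIONS list and handler names from the snippets above):

    import re

    LINE_ACTIONS = []

    def action(pattern):
        # Compile the pattern and register the decorated handler for it.
        def register(func):
            LINE_ACTIONS.append((re.compile(pattern), func))
            return func
        return register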
Alternatively, I might just break the processing into a function:

    def _process(line):
        m = pat1.search(line)
        if m is not None:
            ... do stuff ...
            return

        m = pat2.search(line)
        if m is not None:
            ... do stuff ...
            return

        m = pat3.search(line)
        if m is not None:
            ... do stuff ...
            return

    _process(line)
Of course, this depends on the purpose. Could be completely inadequate in some situations.


Indeed, I've also used this pattern.

I've found that I don't like using it. As you write, it's "inadequate in some situations", which I consider part of the cumbersome complexities I mentioned.

For example, do_stuff() and do_other_stuff() may need to share variables, and do_something_else() might need the line number to report an error while the others don't.

This can be handled with shared state/nonlocal, and by passing in more parameters to the generic handler API, but these add complexity.

Or, different parts of the file may have different line dispatch processing (eg, a header block followed by a data block followed by a footer block) where one of the handlers must indicate the transition to a different processor.

Also, function dispatch in CPython is slow. While the regex tests are also slow, it can also be important to consider the (using a hypothetical number) 5% overhead for dispatching over inline code.


That's... lovely. I propose making this the standard example of why you might want to use the walrus operator.


If you only have one or two regexes like that then there is no difference in readability. If you have ten in a row, though, then suddenly your function no longer fits on the screen and it's more difficult to understand what's going on.


I mean, what you really want in this case is Perl's =~ operator, as well as the implicit $_ variable.


IMHO the whole `re.match` is a design flaw in stdlib.

re.match() should always return a match object, with .group(1) returning None when there is no match. Then we could write one-liners more easily, without the walrus operator.
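A rough sketch of what that could look like, using a hypothetical null-object wrapper (the names here are mine, not stdlib):

    import re

    class _NoMatch:
        # Stands in for a failed match: falsy, and all groups are None.
        def __bool__(self):
            return False
        def group(self, *args):
            return None

    def match(pattern, string):
        m = re.match(pattern, string)
        return m if m is not None else _NoMatch()

    print(match(r'(\d+)', 'abc').group(1))  # None, instead of an AttributeError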


Sometimes you just check for a match (e.g. of the entire string) and define no groups.

The problem is not in re.match but in None being a relatively poor Maybe / Optional (still way better than the C-style `null` of many other languages).


Could just have been

    if re.match(...).matched:
But yeah, agree that Optional is what I want here.

    if let Some(m) := re.match(...)
        m.group(1)...

    m = re.match(...)
    m.group(1) # AttributeError because I haven't unwrapped the Optional
In current Python, that last line sometimes raises at runtime.


> Sometimes you just check for a match

re.match vs re.search


re.search is different. re.match matches from the beginning of the string, while re.search matches from anywhere in the string.


I think you mean .group(0) as .group(1) can already return None:

    >>> import re
    >>> pat = re.compile("(A+)|(B+)")
    >>> m = pat.search("HUBBLE")
    >>> m.group(1) is None
    True
    >>> m.group(2)
    'BB'
    >>> m.group(0)
    'BB'
I don't see how using your proposed change would help.


I mean re.search() should always return a MatchObject, not None.


Looking at your wish for exploratory evaluations in for-comprehensions, I was reminded of how Clojure does this. I found it quite elegant.

The inside of the for-braces allows keyword-based sub-clauses. There's:

- ":when" which will suppress output for elements not matching a filter expression,

- ":while" which will stop when elements stop matching a filter, and

- ":let" which will let you bind some new values in mid-loop.

I rarely need these features, but when I do I find them really really helpful. Maybe someday someone will consider doing something similar in Python.


Your second use case is actually already covered in Python, assuming you have some kind of sentinel value you can break on:

  for s in iter(network_service.read, ''):
      process(s)


I think there are a lot of use cases for some kind of dedicated "if-with-outputs" syntax, where you have some extra variables available inside the if-block if the condition matched. Such a syntax could cover a lot of the problems of double execution but also prevent hard-to-understand code like some walrus constructs.

... I've got no idea how the syntax could look though.


How would this hypothetical syntax differ from the walrus operator? I.e. what is hard to understand with the walrus operator that wouldn't be with this new hypothetical syntax and why?


I think it could avoid cases where you "abuse" the walrus operator and make code harder to understand than necessary. E.g. things like:

  if (value := func()) and another_condition:
    ...
Or:

  if another_condition or (value := func()):
    ...
Or:

  x = 1
  my_list = [x := x + 1, x := x - 1]
etc.


Ah ok, so you'd essentially allow it in fewer contexts? I would definitely have supported that, although I do think combining walrus with other boolean expressions in if and while is super useful and not particularly hard to understand.


It comes up somewhat rarely but it is useful at times. But of those rare times, it is exceedingly rare to require any more complexity than you've shown above, such as having two assignments in one statement.

That was the reasoning given for using ":=" instead of "as", to allow more complexity. I still think it was a mistake.


> while s := network_service.read():
>     process(s)

Watch out with this one though. In this case it's good, but if you receive numbers from such a function, it will break the loop not just on None but on 0 as well. Easy to forget.

In many cases you'll need to do:

    while (s := ....) is not None:


IMHO walrus operator goes against the zen of python.

https://www.python.org/dev/peps/pep-0572/#differences-betwee...

https://www.python.org/dev/peps/pep-0572/#relative-precedenc...

Even the examples in the spec show how unintuitive and "unpythonic" this is. Explicit is better than implicit.

IMHO adding features to the language to save 1 line of code in the 10% of cases where you need it (I agree that there's the occasional case where walrus will save you more than 1 line) is just bloat.

I am not a big proponent of Go, because it has its own flaws, though the language is indeed very simple and its creators try to keep it simple.

IMO Python was very readable, super simple, and intuitive, and it should stay that way, though recent releases show that Python is giving in to feature bloat.

EDIT:

> Try to limit use of the walrus operator to clean cases that reduce complexity and improve readability.

Facepalm.


I've been using Python since 1.5 and I feel like the language itself has been feature-complete for some time. In the 2.0 era, the big pain points were async and unicode; as demonstrated by, well, how painful Twisted's unicode was. ;)

But now if our big pain point is assigning a variable to len(), then two lines later evaluating that variable and printing it out, Python is just trying to find stuff to add.

I miss the size and simplicity of old Python. :/


Yeah. I don't like this obsession that so many developers have with adding features without end. I think it's mostly so that they can have something to do; don't think about the bigger problems, just have something to do so that they can fill out their timesheets and submit it at the end of the week.


Well the zen of python also states:

- Complex is better than complicated.

Using the walrus operator makes the code more complex, but less complicated.


I vehemently disagree - these walrus vars are still in scope outside of the condition block. And on top of that, there are special rules for these walrus vars which are not at all obvious.

Reference: https://www.python.org/dev/peps/pep-0572/#scope-of-the-targe...


It also states one line before:

> Simple is better than complex.

I understand that nobody will force me to use this feature, but as I said before, even the spec and the write-up tell you how confusing and... complex this feature is.


> It also states one line before

I know, and I believe it sits right after that line for a reason. Writing only simple code is not enough: if you write everything in a simple manner, your code will most likely be complicated. That's exactly why the next line is there - complex is better than complicated.

I see this feature the same way as comprehensions: scary and complex at first, but once you learn it, you'll wish everyone would use it.


> As a developer who has primarily developed applications in Python for his entire professional career, I can't say I'm especially excited about any of the "headlining" features of 3.8.

Python is a fairly old, mature language.

What features would you have been especially excited about?


The f-strings from 3.6 are a (relatively) recent feature that I have absolutely loved. I'd go so far as to say they are my favorite feature introduced by Python 3.

I'm also looking forward to PEP-554 [0], which allows for "subinterpreters" for running concurrent code without removing the GIL or incurring the overhead of subprocesses.

[0] https://www.python.org/dev/peps/pep-0554/


Sounds like PEP-554 is actually included in 3.8 on a "provisional" basis, which I guess just means they reserve the right to change the API. That's a killer feature for me, thanks for pointing it out!


Correction: So the PEP said it'll be in 3.8, but apparently it's been postponed to 3.9.


As someone familiar with the Python C extension API, I somewhat doubt that it'll make 3.9 either. Disclaimer: I'm only familiar with CPython as a user of the extension API; I have no idea on how the Python team plans to address the challenges I'm mentioning here, and what their current progress is.

The current extension module API encourages global state, e.g. types (and objects too) are allocated statically in a global C variable. For example, there is a global C variable `_Py_NoneStruct` that is the Python `None` value, and extension modules are accessing this variable directly.

Every use of this object needs to adjust its reference count, and that reference count is directly stored within the C global variable.

`_Py_NoneStruct` is currently even exposed in the PEP 384 stable ABI. Existing extension module binaries are commonly directly touching `_Py_NoneStruct.ob_refcnt` without any synchronization. Breaking the PEP 384 compatibility promise is fundamentally unavoidable here.

One of two things must happen: Alternative one: All refcount operations must be made atomic for thread-safety. These are really really common in the Python interpreter, but atomic operations are expensive on modern CPUs (especially if there's contention). But multiple threads using the value `None` would be quite common in Python code, so I doubt you'd gain any speed even at today's core counts -- in fact I'd expect the constant inter-CPU-communication for the refcounts to make everything slower than just using a single core with today's Python!

So alternative two: ensure no Python objects are shared between the subinterpreters. That's the plan. But that also means it's a breaking change for extension modules. And it's not just an ABI change (which would be handled by merely recompiling against the new headers). Any extension modules that do not yet support PEP 489 are already incompatible with subinterpreters, so that will take quite a bit of work until the ecosystem is upgraded. But there will probably also be some other breaking API changes. I think type objects are currently shared across subinterpreters, and those are frequently defined as a `PyTypeObject` global variable in extension module code. Also, if every subinterpreter has its own GIL, extension modules calling `PyGILState_Ensure()` will have to specify which subinterpreter they will be using, so that the appropriate lock can be acquired.

My prediction: 3.9 may have the basic functionality, but it still won't be able to run on multiple cores concurrently. That will take a bunch of more work and breaking changes, and will likely be released as Python 4.0.

There will be another slow upgrade process ("my dependencies must upgrade before I can") until the Python ecosystem is multi-subinterpreter-compatible. But at least this one only affects extension modules.


f-strings are great. Much nicer than ".format". I hope in a future iteration of the language all strings will be f-strings by default, avoiding the need to prefix them with a silly "f".


That would of course be an absolute disaster, given that any user input could easily leak internal state and/or break your program.


I meant for inline strings (i.e., typed by the programmer). This is an inoffensive change. The only possible "accident" is when the programmer wants to write "{x}" literally instead of the value of x. This is such an exceptional case that it may be best handled by forcing the curly brackets to be escaped.

If anything, user-input strings must be treated as tainted whatever the case.


A quick look at the Python stdlib shows some of the breakage that would occur:

1) existing uses of .format() would break:

   aifc.py: raise Error('marker {0!r} does not exist'.format(id))
2) existing uses of "%" formatting would break:

    argparse.py: result = '{%s}' % ','.join(choice_strs)
3) many regular expressions would break:

    uuid.py: if re.fullmatch('(?:[0-9a-f][0-9a-f]-){5}[0-9a-f][0-9a-f]', value):
4) existing strings which contain uuids would break:

    uuid.py: >>> x = uuid.UUID('{00010203-0405-0607-0809-0a0b0c0d0e0f}')


Then you would run into the same problem Kotlin has with its raw strings – the brackets need to be escaped, but raw strings disable escaping, so you'd have to write r"{}" as r"{'\{\}'}" which is quite ugly and not very raw at all.


It's not an exceptional use case at all. f-strings are pretty much useless for any non-trivial formatting:

1) customizable templates (can't allow the user to access arbitrary variables)

2) i18n (can't allow the translator to access arbitrary variables)

3) complex values being passed to .format() instead of some variable


That would break any existing strings that use curly braces, including most legacy use of str.format and docstrings that include code examples with dictionaries. It would be a big backward compatibility issue.


Explicit is better than implicit.


Sure, the curly brackets are explicit enough. What do you mean?


I wouldn’t mind cpython performance improvements over language features. Specifically startup time. Python CLI tools can easily take 300-600ms to start. Now, try to use them for scripting: a single script doing a bunch of calls to Python CLIs already has several seconds of runtime penalty.


I completely agree, but you'll notice similar startup overheads in almost all non-native languages: Java, Ruby, Node.js.

I wonder if there is anything that can really be done, outside of something like Cython.


> I wonder if there is anything that can really be done, outside of something like Cython.

I compile my cli tools with nuitka [0], the resulting binaries take half the time to start. I find the difference quite notable.

[0]: https://github.com/Nuitka/Nuitka


The interpreter loads reasonably fast, takes around 20-40ms. And it's even faster if "import site on initialization" is disabled.

But startup time increases an order of magnitude once you start importing 'requests' and other common packages.

Perhaps an option to turn imports lazy...
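The stdlib already has most of the building blocks for that; the importlib docs include a lazy-loading recipe along these lines:

    import importlib.util
    import sys

    def lazy_import(name):
        # Defer executing the module until an attribute is first accessed.
        spec = importlib.util.find_spec(name)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        loader.exec_module(module)
        return module

    requests = lazy_import('requests')  # near-instant; the real import happens on first use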


Though not a language feature per se, I would love it if the language would take on the packaging ecosystem and deliver a ground-up approach that isn't just a kludge on a kludge on a kludge.


It's not great, but it also isn't terrible. The problem with Python packaging is that there were many ways to do it, and people are confused. What's worse, there are tons of articles (including from PyPA) that provide bad information.

If you use setuptools and place all your configuration declaratively in setup.cfg it is not that bad.


Unfortunately, that'd mean a package manager for C, Fortran, etc. as well, since so many Python packages have non-Python dependencies.

I've started using Docker for ensuring prod behaves the same as dev. Going from Mac dev to Linux prod would otherwise cause trouble.


Sounds like Conda.


Conda can't fix some problems, like Python 3.7 wanting ncurses >6, but a dependency wanting ncurses 5.9


I've heard people recommend poetry on here as something that is good for package management, but I haven't tried it. Does anyone know of something like that, but also can build your application into a Docker container as well?


Into a docker container as well? The package manager isn't really related to docker though?

Does your container expose ports or need volumes? Does it need gunicorn or uwsgi in front of it? What about system packages? None of those (except maybe the last one) are really in the scope of the package manager.


The main use case is I have a project with a setup.py that has the normal stuff in it, and there's a main entry point in one of the files. I then make a dockerfile that installs that package in a container and runs the main file as the entry point. Ignoring things like ports, it would be nice to emit a standard dockerfile like that, since it's very common.


You might like portage. We have USE flags (which are ./configure --stuff), slots (so you can install multiple versions in a clean way), and any kind of dependency tree you can think of. When I am thinking of upgrading the system python, I can enable the next version and let things simmer for testing (building for py3.7 and 3.8 for example) before I actually throw the switch (eselect python set N).


While not official, doesn't conda solve this problem? (Except for not containing the latest versions).


I'd really like a way to refer to a variable in an outer scope (like global... but not just for global variables).



In the same vein, they could add syntax to distinguish introducing a variable from setting the value of an existing variable.

Python's heuristics for that are pretty annoying.


I think that feature has exactly zero chances of getting adopted, but that's probably one of my top Python annoyances too.

If it's any consolation, the scope of a variable isn't determined by heuristics, it's just that the rules are (IMO) kinda bad. Basically, if a variable is assigned to (in addition to `=` and friends, `import`, `class` and `def` are also assignments in disguise) in what Python calls a "block" (which is not like a C-style block; rather a `class` body, `def` body, or the top level), it is considered to belong to that block (with the notable exception of the name for a caught exception in an `except Exception as e: ...`, see [1]). If you want it to refer to a variable outside the block, you need to use `global` or `nonlocal` as appropriate. The full gory details are in [2]

1: https://docs.python.org/3/reference/compound_stmts.html#the-...

2: https://docs.python.org/3/reference/executionmodel.html#nami...
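A two-function demonstration of that assignment rule (my own minimal example):

    x = 10

    def bump():
        x += 1  # UnboundLocalError: the assignment makes x local to bump()

    def bump_global():
        global x  # opt back into the module-level binding
        x += 1

    bump_global()  # x is now 11
    bump()         # raises UnboundLocalError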


Yes, I meant the rules are bad. I called them heuristics, because I assume they came up with the rules as a heuristic to what would be the most useful behaviour.

Python didn't use to have lexical scoping, so the rules are retrofitting around that.


Yeah, it all seems a bit hacky. Especially the bit where a caught exception bound to a variable is cleared at the end of the `except` clause. I'm pretty sure it's because if you didn't you'd get a reference cycle (exception object -> stack trace -> function invocation record -> function locals -> exception object), which makes CPython's GC sad.


... and what prevents you from doing that? You mean you'd want to reassign them?


Yeah the fact that a new release is boring is a good thing.


Multi-line/multi-expression lambdas.

"end" statement which would enable automatic indentation.

"switch" statement instead of "if/elif/.../else" hell.


Oh yes, to all of them!

Correctly done multi-line lambdas/expressions could bring Python's expressivity to where ECMAScript is nowadays, where you can write both foo = function(...) and function foo().

Perl 6 has a great switch statement, called "given" https://docs.perl6.org/language/control#given


A null coalescing operator


There was a now-deferred PEP for 3.8 https://www.python.org/dev/peps/pep-0505/


I would honestly like an option to "compile" python to check for typing inconsistencies with the type hinting introduced in Python 3. Also, the multithreading story is kinda shitty


> I would honestly like an option to "compile" python to check for typing inconsistencies

You can use mypy for that: https://github.com/python/mypy
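A quick illustration of the workflow (hypothetical file name, output abridged):

    $ cat example.py
    def greet(name: str) -> str:
        return 'Hello, ' + name

    greet(42)

    $ mypy example.py
    example.py:4: error: Argument 1 to "greet" has incompatible type "int"; expected "str"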


Pattern matching. Quasiliterals. PSF support for Poetry or PyPy.


I would love it if they added "?" to the list of valid characters for names, so I could write something like `valid?()` instead of `is_valid()`.


just use "_P", a fairly-widely-recognized substitute from the Lisp world.


As a python programmer I would have no idea what `valid_P` is supposed to mean. Valid Pascals?


P is shaped like a question mark, though I think it was retronymed to stand for "Predicate"


I think "predicate" is the original meaning of that postfix.

Emacs Lisp, at least, can use question marks in names without batting an eyelash.

I see no reason that would not have been true for Lisps generally since their birth.


The suffix is just "p" in Common Lisp[1], so I think the "predicate" explanation is more likely to be correct than the visual analogy to the question mark.

[1] http://www.lispworks.com/documentation/HyperSpec/Body/f_list...


Or even better, just `valid?` Oh, wait .... that's Ruby.


Ruby to the rescue!


I want to build executables that I can pass around without also sending an entire environment. Crazy that’s not a native feature.


Honestly? Performance and the possibility of making Pypy the reference implementation.


proper threads.


There have been many times when I've wished python had something like the walrus operator.

In particular, without the walrus operator you can't write an if-elif chain of regex comparisons. The best way to do that that I've been able to find is with a bunch of nested if-else statements.


And working with queues/stacks is much more pleasant (which I seem to do a lot...).


I actually found an immense use for the walrus - I used it to hack Python into doing pattern matching that is way more readable than without it:

https://github.com/eveem-org/panoramix

(source code for Eveem.org, which is arguably the best decompiler for Ethereum smart contracts out there. You can see a lot of pattern matching in pano/simplify.py, and I found no way to do it without extending the language via the walrus while maintaining the readability)


Just a suggestion...

I went to your repo, and it wasn't clear to me what Panoramix actually is or does. So I checked out Eveem.org, and even on the about page, it wasn't clear to me what Eveem was.

But with that context, coming back to your GitHub page, I was able to get the gist of it.

It might be helpful to have a section above the installation section that gives a little context / tells about the project.

And in your example section -- again, I literally don't know anything about decompilation so maybe I'm not your target demographic... but if I understand the gist of it, kitties is an example function or binary. It might be better to use something more concrete (and provide the context).


Thanks for the feedback! My first open-source project. I will improve this :)


After a quick read through, I personally find this syntax much harder to read using operator overloads than it would have been using a series of comparison functions. Tilde is a very obtuse operator and I don't know that this really buys you anything.

Feels like a case of preferring cleverness over readability/usability/maintainability.


Python is adding features; maybe they want to be as expressive as Perl when they grow up. However, that comes at the price of decreased readability.


I actually had a series of comparison functions initially, but those weren't that readable.

To give you an example, let's look at this statement:

if exp ~ ('mask_shl', int:size, int:offset, -offset, :e): ...

It is the same as writing: if type(exp) == tuple and len(exp) == 5 and exp[0]=='mask_shl' and type(exp[1])==int and type(exp[2])==int and exp[3] == -exp[1]: size, offset, e = exp[1], exp[2], exp[4] ...

If you can figure out syntax that makes this sort of match more readable, I'll gladly use it :)


Fixing formatting for above:

    if exp ~ ('mask_shl', int:size, int:offset, -offset, :e): ...
vs

    if type(exp) == tuple and len(exp) == 5 and exp[0]=='mask_shl' and type(exp[1])==int and type(exp[2])==int and exp[3] == -exp[1]:
        size, offset, e = exp[1], exp[2], exp[4] 
        ...


Exactly the road C++ tries to go down. How's that for a ringing endorsement?


C++ is one of the most successful languages.


Despite its flaws, not due to them.

Same can be said about e.g. Cobol.


Despite the flaws, and because of being in the right place at the right time.

Cobol also was a good idea back in the day. And some of the ideas in Cobol we now recognize as bad _had_ to be tried to gain that recognition.


False.

There have been dozens of working C++ competitors over the years, each one technically as good as C++.

They all failed because their authors are good at writing languages, but very bad at knowing how languages are used in the real world.

The C++ committee really knows what it's doing, you're not going to do anything useful in this space without actually understanding the problem domain.

TL;DR - yes, C++ is complex, but that is because programming is complex. Your attitude of "programming is hard, let's go (language) shopping" will not displace C++.


C++ made a number of... unfortunate decisions early in its development, which resulted in a much worse language than it could have been. (Hindsight is 20/20, of course.)

I totally agree that the C++ of the 21st century is way better than the C++ of 1998, and it keeps getting better. The problem is that maintaining backwards compatibility requires keeping a number of footguns in place, and some of them are cornerstones of the language :(

I have high hopes for Rust becoming the more sane competitor in the space traditionally occupied by C++. Its authors seem to understand the problem domain from a very practical standpoint, and are proficient in C++, to begin with. They also strive to apply reasonable design principles, and some actual math hopefully will keep the core of the language from logical pitfalls.

Another example of hitting a sweet spot is Go. I dislike many of Go's language decisions. (Some are definitely great, though, e.g. the whole OOP approach lifted from Oberon.) But its authors definitely understand both the problem domain and the target audience.

This is what I mean by talking about success despite the flaws. C++ has both flaws and merits, and merits outweighed the flaws. But the flaws are still painful.


How much of that success can be attributed to the features that were added in the last 10 years?


Rust: Am I a joke to you?


Pretty good?


There is a similar trick that you can use without the walrus operator. I wrote the code many years ago and I am reproducing this from memory, so the code below may be a little off. Define this class:

    class Any:
        # Compares equal to anything, and records the value it was compared against.
        def __eq__(self, other):
            self.value = other
            return True
Then you can use an instance of it as the hole in a pattern. My use case was performing peephole optimizations in a simple 6502 assembler. For example, if the LDA operation (load value from memory location into accumulator) occurs twice in a row, the first load is redundant. So I did this test:

    x, y = Any(), Any()
    if L[i:i+2] == [('LDA', x), ('LDA', y)]:
        L[i:i+2] = [('LDA', y.value)]
(Note that the last line changes the length of the list, so it may be slow on large lists.)

I was going to write a blog post about this back then, but I never did. If anyone is interested, I can write it now.


But that breaks normal equality tests for your type?


Wouldn’t parent’s code simply copy into Any every time it’s compared in a list?

It is some spooky action at a distance.


Probably. I found it a bit confusing in any case.


You should link to a snippet that actually uses := rather than to the top level project.


Ah yes, sorry:

https://github.com/eveem-org/panoramix/blob/bc55be84a9a6abb1...

It uses "~" custom operator, but implementing it would not be possible without ":=" operator (see /tilde directory).


This is super interesting! I've never seen codecs used to introduce language features like this... Do you have any good references for getting started with using codecs?

Also, I wonder if there's an easy way to implement this in a jupyter notebook, without fiddling with the kernel...

(issues of whether it's a good idea aside, definitely seems useful to know about.. )


Thanks! The idea for using codecs is not mine - found it on StackOverflow somewhere. Googling for "how to add custom language statements python" should give you some results, this included :)

Also, in my case I had to use codecs because a new operator causes the AST parser to fail. But if you have some kind of syntax that is compatible with the Python AST, then you can implement it even more easily.

You can google Python AST to get a lot of useful information :)


I thought regex matching was one of the strongest arguments for it.


Perl devs have been using the walrus expression for decades, but we just called it "assigning a variable with local scope in an expression".

  $ perl
    $a = "foo";
    if ( my $a = "bar" ) { print "$a\n" }
    print "$a\n"
  bar
  foo


The reason why assignment expressions initially were not allowed in Python, and why the "walrus" had to be introduced as a distinct operator, is that languages using a single equals sign for assignment inside expressions make it easy to create bugs by typing "=" instead of "==".


I find it strange to sacrifice syntactical simplicity for bug prevention in a language that needs deep testing anyway because it's neither nil- nor type-safe. But maybe I'm just too used to inline "=".


A partial solution was already introduced with `with x as y`.


Yup, C and many other languages also have it:

    if (ptr = strchr(address, ':'))
        port = atoi(ptr + 1);


It doesn't do scoping, so not really the same.


it can in C++ (when specifying auto or a type). I thought it did in C as well, but apparently not.


As someone who has used Python for over 10 years and C, Perl, etc for over 20, I've been craving the walrus operator forever.

I love the walrus operator with a love that is unholy.


I wonder what you think of "for else" loops.

(although I think "else" should have been a different keyword)


Not OP, but I've probably used them 2 or 3 times in the past decade or so and each time it made for a nice solution to the problem I was facing.


Every time I come across a use case where the "for else" construct would be applicable, I decide against using it because the "else" keyword is very unintuitive to people not familiar with this rarely used construct... so yeah, I too wish it was a different keyword :)


maybe we will get "The 3to4 tool will automatically adapt regretfully-named-keywords when converting your sources to Python 4."


PEP-3136 was proposed and discussed more than 10 years ago in the 2->3 process. The Python community was very different then. As an example, Python 3 was also the version that removed function argument tuple unpacking (PEP 3113), to which my reaction is basically WTF.


Is the WTF about efficiency, or about some esoteric semantics?


> Over complicating a language with rarely-used features can definitely create problems.

Agreed.

There are a number of (full) languages that compile to the Python AST [0] (I'm especially fond of http://hylang.org), but they're very different from Python. It would be interesting to see smaller variations of standard Python implemented in an interoperable way like these languages do.

[0] https://github.com/vindarel/languages-that-compile-to-python


Not all releases have to be groundbreaking, especially in a mature and stable language. Cleaning up things, putting other things in place for major releases - that's commendable too.

I did not follow all the conversation around the typing system, but isn't the whole point of it to enable optimizations in the future? I find it exciting.


So Python is finally waking up to the advantage of everything being an expression, as in Ruby and Clojure.


> Same with the forced positional/keyword arguments

I also thought "why bother?" at first, until I learned this is already a feature in Python, but only for C functions. So of course it makes sense to level the playing field.


The audit hooks are going to allow some really interesting things, especially around security controls on executing Python code. There is hope for sandboxed Python execution!
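For instance, PEP 578 lets you register a hook that observes (or vetoes, by raising) sensitive runtime events; a minimal sketch:

    import sys

    def audit_hook(event, args):
        # Called for every audit event raised by the runtime and stdlib.
        if event == 'open':
            print('open:', args)  # args is (path, mode, flags)
        elif event == 'exec':
            raise RuntimeError('exec/eval blocked by policy')

    sys.addaudithook(audit_hook)

    open('/tmp/example.txt', 'w').close()  # triggers the 'open' event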


In my opinion it was a mistake to reject the labelled break PEP


I take this as a positive sign of Python's maturity.


...and the threat of Just Another Vict'ry Announcement (JAVA) as the need to add "sizzle" to the next version looms.


Rock and a hard place, no? Either you keep adding to a language and people keep using it, or you stop adding to a language and people call it a dead language and stop using it.

Most devs don't understand that software can be done and still be useful.


Well, you could also try to remove from the language.

Like how they removed the `print` keyword when going from Python 2 to Python 3.

But removing is harder than adding.


Right. Maybe I should have said, "... keep changing a language..."


Mature languages don’t introduce random new ugly syntax. I miss Guido already.


Actually, Guido approved the walrus operator. (In response to the justified backlash, instead of canceling the feature, he resigned as BDFL, which is not what anyone wanted)


I stand corrected.


Guido fought to have the "random new ugly syntax" (which is anything but bad) included.


  def f(a, b, /, c, d, *, e, f):
Wow: IMHO that is a very ugly syntax for defining a function with 6 parameters, although it looks to be a path dependency on previous decisions (the existing `*` marker, and the `/` notation already used in CPython's documentation for C functions), and clearly there was much discussion of the need for the functionality and the compromises: https://www.python.org/dev/peps/pep-0570/
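For reference, here is how that signature behaves at the call site (my own examples):

    def f(a, b, /, c, d, *, e, f):
        pass

    f(1, 2, 3, 4, e=5, f=6)          # OK: c and d may be positional or keyword
    f(1, 2, c=3, d=4, e=5, f=6)      # also OK
    f(a=1, b=2, c=3, d=4, e=5, f=6)  # TypeError: a and b are positional-only
    f(1, 2, 3, 4, 5, 6)              # TypeError: e and f are keyword-only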

It amazes me to see how certain features make it into languages via community/committee (octal numbers in JavaScript - arrgh!).

One thing I find really difficult to deal with in languages is the overloading of different semantic usages onto the same symbols, which itself is a result of limiting ourselves to the ASCII symbols that can be typed. I don't recall written mathematics having the issue as badly (although I haven't had to write any for a long time!).


Yes, "path dependence" is a good way to describe it. (And Python has been my favorite language for 16+ years now)

For https://www.oilshell.org/ , which has Python/JS-like functions, I chose to use Julia's function signature design, which is as expressive as Python's, but significantly simpler in both syntax and implementation:

Manual:

https://docs.julialang.org/en/v1/manual/functions/index.html...

Comparison:

https://medium.com/@Jernfrost/function-arguments-in-julia-an...

Basically they make positional vs. named and required vs. optional ORTHOGONAL dimensions. Python originally conflated the two things, and now they're teasing them apart with keyword-only and positional-only params.

A semicolon in the signature separates positional and named arguments. So you can have:

    func f(p1, p2=0, ...args ; n1, n2=0, ...kwargs)
So p2 is an optional positional argument, while n1 is a required named argument.

And then you don't need `*` and `**` -- you can just use ... for both kinds of "splats". At the call site you only need ; if you're using a named arg / kwargs splat.

----

Aside from all the numeric use cases, which look great, Julia has a bunch of good ideas in dynamic language design.

The multiline strings in Julia appear to strip leading space in a way that's better than both Python multiline strings and here docs in shell.

Also Julia has had shell-like "f-strings" since day one (or at least version 1 which was pretty recent). Python has had at least 4 versions of string interpolation before they realized that shell got it right 40 years ago :)


But the Julia syntax cannot represent arguments that can be passed either positionally or by name, right?


Right, it can't - unless of course you just make two variables (one positional and one named) and choose one inside the function.

Because of Julia's multiple dispatch paradigm, positional arguments are special compared to named arguments because they decide what method to dispatch to (in Julia, f(x) is equivalent to x.f() in an object-oriented language, and this is extended to all positional arguments). That means that if you declared f(a, b=nothing, c=nothing), calling f(1, c=1) would dispatch to f(Int64, Nothing, Int64), while if you declared f(a; b=nothing, c=nothing), the same call would dispatch to f(Int64). In Julia named arguments are effectively a way to pass more arguments without complicating the dispatch rules, and since there is only one way to call a function (outside of optional arguments, which appear to the end user as another implementation of a function) there is no ambiguity in where it dispatches to.

So basically, every language has its own quirks, which its syntax decisions usually reflect, and Julia's scenario is fundamentally different from Python's.


Yeah that's true, but I consider it a feature and not a bug.

There was a rule in Google's style guide that said to only pass optional arguments with names, and required args without names. It basically enforced this separation while being slightly less flexible, because you can't have optional positional args or required named args.

Tens of millions of lines of code was written by thousands of programmers in that style, and it caused no problems at all. On the contrary it made code more consistent and readable.


I’d argue that written maths is worse with its overloading of the Greek alphabet. Although in the case of maths I’d argue that it’s a result of how slowly we write, with the programming equivalent usually being a variable name (autocompleteable).


Yes, math is a much bigger culprit. Including for operator overloading.

Their saving grace is that humans are more intelligent interpreters.

If you ever try to translate a human-readable proof, even a fairly formal and rigorous one, into a computer proof assistant like Agda or Coq, you can see all the little ways humans cut corners.


> which itself is a result of limiting ourselves to the ASCII symbols that can be typed

I think we're ready for programming languages using some visually good Unicode characters, instead of overloading `[]{}!@#$%^&*()-_/` for everything!


There are programming fonts that use ligatures to convert >= to ≥, or -> to →, so the source code remains in ASCII but symbol sequences show as unique Unicode characters.

A next step could be for common dev environments to actually convert symbol/key sequences to operators (how do APL programmers do it?)

Avoiding ambiguity and semantic overloading of ASCII symbols would surely help beginners (if also given a UI that clearly exposes ways to enter the new symbols). I always find one-letter operators extremely strange too, like u"word", s/foo/bar/, etc.

It seems a shame we can type in hundreds of Unicode symbols on a mobile virtual keyboard, but not readily on a physical keyboard.

JavaScript has supported Unicode for a long time, but the core language doesn't use it at all.


> There are programming fonts that use ligatures to convert >= to ≥

It's a neat hack, but that's the opposite of what I want. It means I can't look at source code and know what characters make it up, which is the primary reason I'm still putting up with plain text files (and mostly ASCII) for source code at all.

Out of habit, I occasionally type some control-letter combination which is a valid command in Emacs, but in Xcode (and other native Mac apps) inserts an invisible character, which happens to be invalid in Swift. 10 minutes later, I get a compilation error pointing at a line of code that looks perfectly valid. Frustrating!


The Mac has provided easy access to symbols since 1984, with nice mnemonics like these:

    option-=  ≠
    option-<  ≤
    option->  ≥
    option-/  ÷
    option-o  ø
    option-w  ∑
I don't know why other OSes haven't adopted something similar. The first couple of these were even idiomatic in HyperCard's scripting language.


That's like visual puns, and is limited to whatever symbols seemed important to the developer at the time.

How do you discover how to type ß, °, «, ‡ etc?

Android's gboard uses phonetics (long press [s] key for ß), symbolic similarity (long press [*] key for ‡) and visual similarity (long press [<] key for «) which is guessable for some symbols, but isn't discoverable for others (you can search for emoji by name, but not symbols).


The Mac has had, as a standard feature for decades, an on-screen keyboard to allow you to explore the results of different key combinations. It’s a great tool.


I should note that, alongside it, is a useful search tool that allows you to find the character you’re looking for if you’d rather just search and don’t care to learn the key combination (if there is one).


Cool! Where do I find it?


In Mojave, and it’s pretty similar across versions IIRC: open keyboard preferences in system settings and check the “Show keyboard and emoji viewers in menu bar” box.

That will replace the language flag icon in your top bar with an odd icon with the command key embedded.

The second option in that menu is the keyboard viewer. You can show that and drag on a corner to make it as large as you want. Dynamically changes the keyboard as you hold down modifier keys.


It's easy on the Mac to have a custom or purpose-built keyboard layout that makes sense for the user or context. It's a little harder with X (mostly due to uncoöperative DEs) but still doable.


Self-reply for Kwpolska: Adding layouts to /usr/share/X11/xkb/symbols/ is straightforward (but is necessarily system-wide, and requires root), and I do that. However, for the popular desktop environments, it's like pulling teeth to have that layout treated as first-class with respect to switching and settings.

(On MacOS, putting a .layout file in ~/Library/Keyboard\ Layouts is enough.)


I've hacked /usr/share/X11/xkb/symbols/xx files to have custom keyboard layouts, DEs are not able to mess with it.


X has Compose, which originated slightly earlier (the key, that is, not the X11/xkb implementation), and is a tolerable alternative. e.g. [Compose / =] ↦ [≠]


> nice mnemonics like these:

> option-w ∑

This would grate on me every single time I had to use it. It's an S. Put it on option-s.


For context opt s is ß so the decision wouldn’t be easy if you want to include both.


What are á and à?

Putting double S on option-s just makes option-w for Sigma more grating. If they were going by shape, it'd be on option-b, which is equally stupid. But since it's on S, we can conclude that... someone at Apple knows German, but nobody can even name a Greek letter? That Germans are right and Greeks are wrong? What?


I don’t think it’s that extreme, but German has higher precedence than Greek. If you need to include both characters, you have to pick one or the other, and for whatever reason this configuration happened. We might never know the true reason behind the decision, but stupidity is unlikely to be the answer. Most seemingly stupid decisions (technical or otherwise) make at least a certain amount of sense within their proper context, and it’s easy to talk down on a decision when you don’t run the risk of being judged for it.


I repeat:

> What are á and à?

The concept of variations on a common letter wasn't exactly new at the time. If you're supporting German and Greek, it seems safe to say you're also supporting French.


Windows has a keyboard called US-International with extra symbols. Also, note that those key combinations might not exist in other keyboard layouts (replaced with local keys).


I like to use characters like those as custom operators whenever I can. :)


> I think we're ready for programming languages using some visually good Unicode characters

Some already do: https://docs.julialang.org/en/v1/base/math/#Base.:!=


I'm experimenting with this in V [0]

  if a ≠ b {
is allowed along with

  if a != b {
vfmt (kinda like gofmt) will be able to convert != to ≠.

So far the response has been mixed.

Not everyone is open to this change.

[0] https://vlang.io


Many languages allow that. Scala allows characters in the math symbols (Sm) and other symbols (So) Unicode categories as identifiers for functions and variables, and APL has a rich notation that relies on many single character operators.


I’ve been looking at making up a second keyboard with a bunch of useful characters on it (similar to Tom Scott’s unicode keyboard) and learning Perl 6 to make use of it. I’m thinking that’s going to be my Winter Project/New Year’s resolution this year.


Oh god, this. I want a language where, as I type, != gets replaced with a proper ≠ sign, along with proper symbols for AND, OR, and NOT.




Check out Lean Prover's online editor: https://leanprover.github.io/live/latest/

They borrow syntax from LaTeX math-mode to allow symbol entry, such that you can type "\ne" and as soon as you hit the space after the "e", you get ≠ instead.

The language plugin for Visual Studio Code does the same thing.

Lean Prover's not exactly a programming language, but:

    def foo (a b : ℕ) : ℕ → ℕ → ℕ :=
        λ a b, a + b
Still looks decent.

I think it has to be all or nothing though, since if it's optional the odds of getting decent editor support for it is low.


A lot of programming fonts have ligatures for this.


What bothers me is that the / is separated by commas, which makes it seem like the function takes one more parameter than it actually does (even if that is not the case technically). I haven't looked too carefully here though.


The point of that syntax is to offer more control for how functions, particularly in C-extensions, are invoked.

It will be a "good problem" to have the skillz to make this useful.


This got missed from the release announcement, but now there's `functools.singledispatchmethod`,[1] as the class method sibling to `functools.singledispatch`.[2] This allows you to overload the implementation of a function (and now a method) based on the type of its first argument. This saves you writing code like:

    def foo(bar):
        if isinstance(bar, Quux):
            # Treat bar as a Quux

        elif isinstance(bar, Xyzzy):
            # Treat bar as an Xyzzy

        # etc.
I understand runtime type checking like that is considered a bit of a Python antipattern. With `singledispatch`, you can do this instead:

    from functools import singledispatch

    @singledispatch
    def foo(bar):
        ...  # fallback for types without a registered implementation

    @foo.register
    def _(bar: Quux):
        ...  # Quux implementation

    @foo.register
    def _(bar: Xyzzy):
        ...  # Xyzzy implementation
With `singledispatchmethod`, you can now do the same for methods, where the type of the first argument after self/cls drives the dispatch, based on its annotation (or on the type passed to the `register` method). You could mimic this behaviour using `singledispatch` in your constructor, but this syntax is much nicer.
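For illustration, a sketch along the lines of the example in the functools docs:

    from functools import singledispatchmethod

    class Negator:
        @singledispatchmethod
        def neg(self, arg):
            raise NotImplementedError('Cannot negate a ' + type(arg).__name__)

        @neg.register
        def _(self, arg: int):
            return -arg

        @neg.register
        def _(self, arg: bool):
            return not arg

    Negator().neg(5)     # -5
    Negator().neg(True)  # False (bool wins over int: most specific type)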

[1] https://docs.python.org/3/library/functools.html#functools.s...

[2] https://docs.python.org/3/library/functools.html#functools.s...


The issue with this and `singledispatch` is that they no longer support pseudo-types from the `typing` module [1] so you can't use them with containers of type `x`, e.g. `List[str]`, or protocols like `Sequence`.

[1] https://bugs.python.org/issue34498


Yeah, this one is really annoying... it required a lot of rewriting to make some old code work with 3.7.


I don't understand why anyone would want singledispatch. Instead of having the function defined in one place where you can look it up, now the function is potentially scattered all over the place. (I'm not talking hypothetically. I've had the 'pleasure' of working on a codebase where different singledispatch cases of the same function were defined in different files!)


Because if you need to branch based on type more than a few times in your function it can get pretty hard to read. Am I right that the basis of your complaint is that it's now harder to find all the members of (what you'd call in C++) the "overload set" for a particular function? If so I can see your point.


Yes, that's my point. And it doesn't really simplify the function itself, it just rearranges it, in the same way you can break up a complex painting into jigsaw pieces and say, "Look, each piece is simple!"


I think the answer to the question "should I apply X refactoring technique to this function Y?" obviously has to depend on both X and Y. There's clearly a trade-off here. If I have a free function foo(x) that I want to work differently depending if x is type A or type B, splitting that means breaking up foo, sure, but because I can only ever call foo with an x of type A or type B (unless I'm converting some As to Bs halfway through foo or something awful like that) it might be useful to see all of the "A" logic in one place and all of the "B" logic in another place.


I can definitely imagine some places where this replaces type-checking, but it still seems like a bit of an unfortunate anti-pattern to me, since it's really a sort of C/C++ style function prototype match.

My immediate thought is that it's going to be hard for PyCharm to reliably point me to a function definition.


>> My immediate thought is that it's going to be hard for PyCharm to reliably point me to a function definition.

I am certain PyCharm is going to special-case these decorators in their next release.


Nice. I missed this feature only a few weeks ago. Good to know it's landed!


Few years ago I wrote this thing: https://github.com/adh/py-clos

It can also do dispatch on multiple arguments and on equality and has dispatch cache implemented as C extension (from my rudimentary measurements it seems that dispatch with cache hit is actually slightly faster than normal CPython method call).


The expansion of f-strings is a welcome addition. The more I use them, the happier I am that they exist

https://docs.python.org/3/whatsnew/3.8.html#f-strings-suppor...


I saw that and was like "Oh, that'd be a handy feature for a lot of other programming languages too." But the more I think about it, the more I'd rather have a feature that takes a list of expressions and converts it into a dict where the keys are the expression text and the values are the values of those expressions. Basically, syntactic sugar for this:

  { k: eval(k) for k in ('theta', 'delta.days', 'cos(radians(theta))') }
This would trivially subsume the f'{user=}' syntax for the example given: just print out the dictionary. But it'd also be useful for: filling template dictionaries; printing out status pages for HTTP webservers; returning multiple variables from a function; flattening out complex data structures; creating dispatch tables out of local functions.

You could even have a syntax like locals('theta', 'delta.days') and keep it familiar.
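A rough sketch of that sugar as a plain function (a hypothetical `grab()` helper; only safe with trusted, hard-coded expression strings, since it uses eval):

    import inspect

    def grab(*exprs):
        # Evaluate each expression string in the caller's frame and
        # map expression text -> value.
        frame = inspect.currentframe().f_back
        return {e: eval(e, frame.f_globals, frame.f_locals) for e in exprs}

    theta = 0.5
    grab('theta', 'theta * 2')  # {'theta': 0.5, 'theta * 2': 1.0}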


Pretty trivial to do with proper macros in a language. Julia has macros for printing variables and values, like `@debug(var)`.

There's an Elixir library with macros to make a map (dictionary) using the variable names passed in [1]:

    iex> import ShorterMaps
    ...> name = "Chris"
    ...> id = 6
    ...> ~M{name, id}
       %{name: "Chris", id: 6}
Though whether it's a good idea or not is another question. ;) If you want to do the type of programming you're talking about you should try Elixir/Julia/Clojure(/Rust?)... or any number of other languages with macros.

1: https://github.com/meyercm/shorter_maps


Can't you build your own function using format()? f-strings are just syntactic sugar for format() so presumably you can


This makes clean string interpolation so much easier to do, especially for print statements. It's almost hard to use python < 3.6 now because of them.


One thing I vastly dislike with them is that they are still evaluated eagerly when used in logging calls, while arguments passed "the old way" are only formatted if the record is actually emitted.


What do you mean the old way?

Will it print what foo is if I type logger.info(f"{foo=}")?


logging.info("%s %s", name, country): the string isn't formatted if the logging level is set to WARNING, while with your example the f-string is evaluated and then discarded.
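In other words (a small sketch):

    import logging

    logging.basicConfig(level=logging.WARNING)
    log = logging.getLogger(__name__)

    name = "world"
    log.info("hello %s", name)   # lazy: dropped at WARNING level before any formatting
    log.info(f"hello {name}")    # eager: the f-string is evaluated, then discarded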


How would you fix the issue without breaking existing semantics?

The way I see it, it's a necessary price to pay, and I just continue to use c-style forming strings in logging.


The clean way would be to introduce laziness. But yeah, can't do that in Python.


Why didn’t Python ship with the opposite functionality as well? Parsing instead of formatting. Given a string and a format string, return a list of variables (or a dictionary).



Are you talking about parsing f-strings? What would that look like?

As far as I understand, the following:

    print(f"your name is {name}")
is roughly equivalent to:

    print(f"your name is " + format(name))
What is there to parse?


For example, to checkpoint a model, I would save it as “ckpt-{epoch_number}-{val_loss}”. Given this file name and the original format string, I would like to recover the epoch number and validation loss variables back.

From: ckpt-8-0.300 To: epoch=8, val_loss=0.300


I think this is what you want:

https://github.com/r1chardj0n3s/parse
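A quick sketch of its usage (assuming `pip install parse`):

    from parse import parse

    result = parse("ckpt-{epoch:d}-{val_loss:f}", "ckpt-8-0.300")
    result["epoch"]     # 8
    result["val_loss"]  # 0.3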


This is the release that contains the controversial assignment expressions, which sparked the debate that convinced GvR to quit as BDFL.

They didn't select my preferred syntax, but I'm still looking forward to using assignment expressions for testing my re.match() objects.

I haven't used 3.8.0 yet, but I hope it's a good one, because 2020 is the year that 2.7 dies and there will be a lot of people switching.


I'm stoked about the walrus operator. Ever since I heard it was being added I've grumbled when writing code that would have been clearer with it.

Of course I have to ask, what was your preferred syntax?


Not OP, but maybe it's "as":

  if re.match(pattern, string) as m:
      #use m
Seems a bit more Pythonic, as "as" is already used like this with "with".

Either one would be fine with me and useful.


It's already used in imports as well! It was the better choice; I don't know which arguments convinced them otherwise.


The arguments were based around also supporting complex multiple assignments and being able to use them anywhere, not just in if… and while…, list comps, etc.

Of course, neither of those things has been seen in the wild or in production code since. Some folks complain even the simple case above is less readable.

"as" was the Pythonic choice, rather than the C/Pascalic one.


I think I remember seeing something about the "as" syntax. Wish I could find what I read, IIRC some arguments convinced me that the walrus operator was an improvement.



Unfortunately, none of the reasons listed against are very accurate.


yep, that's the one!


> I'm still looking forward to using assignment expressions for testing my re.match() objects.

The fact that this is the go-to example that everybody is using in justifying the introduction of the assignment expression convinces me that the real problem lies with the re module's API.

(To be clear: I will also be using assignment expressions for this case, but I don't think assignment expressions are really in line with the overall design of Python.)


re's API sucks but I'm looking forward to safe dictionary access with assignment as well. No more

    if my_map.get(key) is not None:
        ...  # do something with my_map[key]
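With an assignment expression that becomes (`do_something` standing in for whatever you do with the value):

    if (val := my_map.get(key)) is not None:
        do_something(val)  # no second my_map[key] lookup needed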


Here the "better" solution (imo) would be support for pattern matching:

    match my_map.get(key):
      case None:
        ...  # do thing (or nothing, i.e., pass)
      case Some(val):
        ...  # do other thing
Scala seems to have figured out how to get pattern matching over arbitrary data (i.e., not statically-defined algebraic datatypes). I'd like to see this come to Python.

(Normally I'd fight for pattern matching to always provide static type safety guarantees, but in Python it seems completely reasonable to omit such checks.)

EDIT: But you're absolutely right that this is a place where assignment expressions will be used regularly, so thank you for pointing that out!


>They didn't select my preferred syntax

I don't write much python so there's probably something obvious I'm missing, but I don't see why they didn't use "=". Is there some significant difference between assignment expressions and assignment statements that makes it worth having distinct syntax?


Yes, there is.

    if x = 1:
Has been the source of many errors in many languages. Forcing assignment to be more than a 1 character difference from the equality operator prevents this.


I may be missing something, but := is still only a 1 character difference, isn't it?


:= is basically the "yes, I really mean it" version, whereas bare = might just be a simple typo.


It depends on the definition of edit distance used; they both have an unweighted Levenshtein distance of 1 from “==” (# of inserts, deletions, or substitutions), but “:=” has an LCS distance of 2 vs “=” with 1.

Perhaps more importantly, “=” and “==” are more visually similar than “==” and “:=” and also easier to mistakenly type for each other.


I should have said single-character additions or deletions. == and = have a Levenshtein distance of 1; == and := is 2. You can't simply make a typo and change an equality check into an assignment.

So, wiki defines Levenshtein distance as including substitution, but I was taught, or at least remember, it as only additions and deletions?


It's the same topology as long as string lengths are equal


It is, but many intro programmers (and non-intro programmers!) often forget to use == instead of = for comparison, because in the real world, = more often implies equality, rather than assignment. So a novice might mistakenly type `if x=1:` but they would be unlikely to accidentally type `if x:=1:`.


Makes you wonder why we switched from = is equality and := is assignment like it was in the ALGOL family.


> Is there some significant difference between assignment expressions and assignment statements that makes it worth having distinct syntax?

Given that Python is statement-oriented, yes, having statements visually distinct from similar expressions is important.

It's also important to avoid making the equality operator and the assignment operator visually similar or easy to typo one for the other, which is arguably the bigger need for “:=” vs “=”, since “=” and “==” are quite similar and easy to accidentally mistype for each other.


The ability to accidentally assign when you mean to compare leads to bad bugs. Avoiding "yoda-style" C code in Python is the major reason.


It seems if you use '=' it leads to dangerous programming bugs[1], whereas ':=' somehow prevents those bugs?

[1] https://effbot.org/pyfaq/why-can-t-i-use-an-assignment-in-an...


It's so that something like this remains invalid syntax, instead of an unpleasant accident.

  x = False
  if x = True:
      sadness()


It's much easier to leave off one = when you meant == than it is to accidentally type :=


Because of the luddites at RedHat, Python2.7 isn't actually dead until 2024. It's infuriating.

https://access.redhat.com/solutions/4455511


It's almost as if the Python Software Foundation isn't capable of killing the Python 2 language unilaterally without the Python community's consent.


[flagged]


It's almost as if programming languages live forever as long as there's production code that uses them. See: COBOL.


Yeah, it’s super unfortunate how the existence of py27 security patches magically stops all py3 users from doing any work. It’s sort of how the existence of c prevents anyone from using rust.


[flagged]


Python3 was a bad idea. Guido himself acknowledged it. There was no point in breaking everything.

The split was initiated by core developers, not by the community.


Sorry, not good enough. The decision was made, good or bad, 11 years ago. Everyone had plenty of time to come to grips with that reality, anyone still lingering on Python2 is bad at their job.

The bad decision now is to allow the split to continue, and RedHat is allowing that to happen due to greed and ignorance.


It’s a business play. If companies don’t want to move off 2.x and are willing to pay a Software vendor to backport security fixes so that they can CYA, so be it.


Easy. Don't buy products that are obviously still using python2. Problem should solve itself in some time. Hard to tell sometimes, but looking at dependencies and plugin languages is a good way.


The only software I've written in Python that's been sold was written against 2.5 and has long been out of my control. We sold the source to the sole client (it was a financial services migration tool for a very specific domain during a joint venture; I can't divulge any more).

I no longer work at the company, but I'd wager that the client wouldn't have been willing to pay $500/hr (the rate my company billed me out at for support, features, etc.), 10 years ago, to have me port the app and the handful of 3rd-party dependencies to Python 3.

Moving from 2 to 3 took a while. I moved once the libs I used moved. For an "app" developer, the migration was easy once my dependencies were ready. Most of the changes were straightforward. The print statement becoming a function was easy. str becoming Unicode and introducing bytes has been a headache. I still have issues from time to time with text encoding, especially with encoded text in SQL Server (looking at the default CP-1252 for US English). Another one that still trips me up is needing a newline='' argument when opening a CSV file.
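For reference, a minimal sketch of the csv docs' recommendation:

    import csv

    # Open with newline='' so the csv module controls line endings itself;
    # otherwise you can get blank rows on Windows.
    with open("data.csv", "w", newline="") as f:
        csv.writer(f).writerow(["epoch", "loss"])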

That said, I love Python 3. It's mostly about forgetting/relearning old syntax, which is going to happen with anything you've been using for decades that undergoes a significant change. For reference, I started with Python over 15 years ago, at version 2.2.


What is your take w.r.t. other backend languages, if you have worked with any?


Fortunately, Tauthon is a community-supported fork of Python 2.7, so we may continue using the best version indefinitely.


You'll be laughed out of future interviews if you mention your interest in that project.


Some people use software as a tool and aren't zealots about versions or constantly chasing updates.


Yeah, and even those people would snort derisively at a project designed to keep 2.7 alive like this.

"Getting shit done" doesn't happen if you can't move forward eventually. I'd have substantial concerns hiring someone who can't get over Python 2.7.


I think it's less that someone "can't get over Python 2.7" and more "why should I?". Nothing in 3 except bytes/string handling is compelling. Lots of 2 libraries haven't been ported. There is no legal reason to move and no other time pressure.

I would argue that people have been "getting shit done" with C for decades, despite newer shinier flavors of the month popping up, so your argument doesn't hold water to me.


If you want other people to make a change then it's on you to make a convincing argument for why the new thing is an improvement, not just go on tirades about how people should get with the times.

Personally, I lost faith in the Python core team because of the Py3 migration. Yes, 3.x now has a bunch of nice features that 2.x didn't, but almost none of them actually depend on 3.0's breakage (as proven by Tauthon).

If you want people to follow you through a break-the-world migration then you need to motivate why it is needed and why it couldn't be done incrementally, not try tempt them with a bunch of unrelated carrots that are bundled together with the breaking change.


> If you want people to follow you through a break-the-world migration then you need to motivate why it is needed and why it couldn't be done incrementally

They did. I remember looking through this, and I remain convinced that the core developers were correct and there was no way to fix Unicode handling incrementally.


Add the u-sigil for unicode (as they did), add the b-sigil for bytestrings (as they eventually did), and then go through a regular deprecation cycle for sigil-less strings (rather than releasing a 3.0 where the u-sigils were removed completely). Maybe at some point re-add sigil-less strings as an alias for u-strings, but I'd rather have old stuff break with a clear message than have a bunch of weird side bugs.

Do the same for the type names.


I've done that; this isn't the first time I've talked about this problem. HN is not a safe space, there is no room to convince anyone of anything on here, so tirades are all that's left.

We're well past your argument. It's been 11 years, it is no longer reasonable to hold your particular grudge. Get on board with the modern Python or get the hell out of the conversation.


the emotional content outweighs the objective situation here; also humorous since Luddites are commonly misunderstood per https://en.wikipedia.org/wiki/Luddite


Not misunderstood; luddites (lowercase l, or at least the generalized form) means something different than Luddites (capitalized L).


I'm fairly certain the lowercase-l version means the same, but the Luddites are remembered as being anti-technology, not pro-labour.


A luddite, generally, is one who is anti-technology. A Luddite, specifically, is a member of a 19th century movement to prevent automation from taking their job.

It's insanely pedantic to try and point out the difference, given the ease with which one can find the generic definition.


Buried in the notes for the `typing` module:

> “Final” variables, functions, methods and classes. See PEP 591, typing.Final and typing.final(). The final qualifier instructs a static type checker to restrict subclassing, overriding, or reassignment:

> pi: Final[float] = 3.1415926536

As I understand it, this means Python now has a way of marking variables as constant (though it doesn't propagate into the underlying values as in the case of C++'s `const`).

The equivalent Java:

    final float pi = 3.1415926536;
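And a minimal Python sketch (checked by mypy and friends, not enforced at runtime):

    from typing import Final

    pi: Final[float] = 3.1415926536
    pi = 3.0  # mypy flags this reassignment; CPython itself runs it happily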


The thing is, it is only enforced in mypy... code that modifies a Final object (pi = 3 for example) later on will run fine.


Correct- as usual the annotation indicates intent, but is not enforced at runtime.


I don't see the point of `final` without an optimizing compiler. Name mangling is sufficient for stashing references to avoid accidental side-effects of overriding.


You can guard yourself against overriding in the same module this way, too.

For name mangling, I think you need to start your variable with underscores, and then it won't be accessible for reading outside the module either?


For modules you can access the variable just fine, but for classes you need to use the mangled name:

  $ cat >foo.py
  __FOO = 1
  class Foo:
      __FOO = 2
  $ cat >bar.py
  import foo
  print(foo.__FOO)
  print(foo.Foo()._Foo__FOO)
  $ python3 bar.py 
  1
  2


Nope, you don't need to start the name with underscores.

    class Foo:
        bar = 'public'
        __bar = bar # stashed


Real Python have a great post out today going through most of the changes: https://realpython.com/python38-new-features/

They have a lot of code samples and examples about how and when to use the new features. Personally, I love the new debugging support in f-strings.


Too bad I'm losing good articles like this one because of those patronizing, click-baity newsletters that they started sending out a year or two ago.


They appear to have toned down their marketing efforts recently. You no longer get bombarded with clickbait pop ups or alerts.


> The typing module incorporates several new features:

> A dictionary type with per-key types.

Ah, I've been waiting for this. I've been able to use Python's optional types pretty much everywhere except for dictionaries that are used as pseudo-objects, which is a fairly common pattern in Python. This should patch that hole nicely.
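For instance (adapted from the PEP 589 examples):

    from typing import TypedDict

    class Movie(TypedDict):
        title: str
        year: int

    movie: Movie = {"title": "Blade Runner", "year": 1982}
    movie["year"] = "1982"  # a type checker flags this: int expected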


IMO a better fix is to use http://attrs.org or dataclasses to replace the dicts entirely.


Agreed--if you're going to go through the trouble of adding detailed type hints to describe a dict, you're like 90% of the way to a dataclass with better usability.


And then use a library like json-syntax[1] to translate those into JSON.

[1]: https://pypi.org/project/json-syntax/


This looks interesting. We use cattrs, which has worked well for us but has a long-delayed 1.0 release.


One of the features I'm looking forward to using is the kwargs support in dataclasses. Honestly, I find it so much more useful and intuitive than the 3.7 positional-only approach.


I am actually pretty sad that TypedDict has made it out of typing_extensions in its current state.

It was a major missed opportunity to provide a properly duck-typed Dict type, which by definition would allow untyped key-value pairs to be added to a "partially-typed" dictionary.

Gradual typing in general is a massive win for many kinds of real-world problem solving, but when you make it as hard as Python has to introduce partial types to a plain data object, you're leaving a lot of developers out in the cold.

I love MyPy and the static type hints since 3.6, but structural subtyping is superior and so obviously more Pythonic than nominal, yet support for structural subtyping keeps lagging behind.


Arguably one of the most hotly debated features is assignment-as-expression through the := or walrus operator.

Quite happy with the new SyntaxWarning for identity comparison on literals and missing commas :)

Especially neat is also the `python -m asyncio` shell, which allows you to run top-level awaits in your REPL instead of needing to start a new event loop each time!
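Roughly (banner abbreviated):

    $ python -m asyncio
    asyncio REPL 3.8.0 ...
    >>> import asyncio
    >>> await asyncio.sleep(1)  # no asyncio.run() / event-loop boilerplate needed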


IPython has supported top-level async for a while too, plus many other features!


I like the additions of the f-strings and walrus operator, but I find myself wishing for a breaking release that removes the old features the new ones cover. Python's philosophy was to have one obvious way to do something, but the current situation is quite inconsistent: Python 3 has something like 4-5 ways to format strings, and with the addition of the walrus operator we have (I understand the differences between := and =, but still) two different syntaxes for variable assignment.

I understand that Python can't break all kinds of code (the 2->3 conversion is still a pain), but I still imagine a Python-esque language without all the warts that Python has from its 'organic' growth.


That's a good use case for a linter. Ban outdated constructs in your code, but still allow you to depend on things that use them. Beats another 2->3 split again.


Yeah, and I already use linters to remove all outdated constructs; but... It's a pity that "There's only one way to do it" doesn't work in 2019 :-(


It was never "there's only one way to do it".

It is: "There should be one-- and preferably only one --obvious way to do it."

The core of that statement is "There should be one obvious way to do it" - there's no "only" there, it just says that when you need to do something, there should be an/some obvious way to do it. Then, preferably, that should be the only obvious way — though of course that doesn't preclude there being many other less-obvious ways.

With string interpolation we've certainly now got multiple ways to do it; that doesn't violate the principle. I agree that at least two of those are "obvious" (f-strings and .format()), and you could argue that %-interpolation is obvious too — but none of that is in violation of the principle that there should be an obvious way to do it, just of the preference that there's only one.

Finally it's worth remembering where this idea came from: it was in contrast to the perl mantra of "there's more than one way to do it", and a reaction to the resultant confusion frequently experienced when reading someone else's perl code — this was python saying "we're not perl, we value clarity and comprehension". I think that much of that problem was bound up in perl's syntax choices, and that in the cases in python where there's more than one way to do it (say, string formatting, dataclasses/attrs/namedtuples/etc.), it's usually pretty obvious what machinery is actually being used. When was the last time you looked at a line of python and said "I have no idea what the fuck is going on here?" That was a frequent experience with perl in the heady days of the late 1990s.


Perhaps Python itself could warn you that certain constructs are deprecated and will be removed in future versions.

And you could 'import something from future' to opt-in to making those warnings errors right now.


That's a pretty good idea, deprecating & warning + providing automatic conversion utilities + using 'future' to make them errors... But that won't happen :-(


And then import from past to remove the warnings if you really want to use those constructs without noise. In the next version, the feature would be removed, or would raise a deprecation warning even when used with from past.


Some deprecation warnings have recently been enabled for a developer's main source files, though not in third-party libraries, which would be frustrating for the end user.

I forget the exact details though.


I do not know exactly what caused this, but my internal project test-suite got a much appreciated 5% speedup! Thank you Python team!


Nice! My test suite got around 20% faster when going from 3.6 to 3.7. That’s almost like buying a whole new computer.


If you care about performance, maybe you should try PyPy (the alternative Python implementation).


I try PyPy every 3 months. It has improved greatly, and for some periods I migrated to it. But for this particular project, most of the time is spent inside lxml, pandas, scikit-learn and other extensions. CPython is actually faster than PyPy for this project. Maybe GraalVM / GraalPython can improve on this use case.


Like, if GraalPython ever gets released.


> Added new multiprocessing.shared_memory module

Hard to understand why this is so far down. This is fantastic news!


Ugh, I hate assignment expressions. I liked that they were missing from Python.

I've been coding in algolesque languages for 20 years and hiding assignments inside of expressions instead of putting them on the left like a statement has always tripped me up.


But, but, but... now you don’t need to write that extra line of code! It’s going to make everything sooooo much better, code will practically write itself now.

I’ll just leave this here:

“There should be one—and preferably only one—obvious way to do it.”


At this point it's about as true as G not being evil.

Which is fine by me, it was a silly idea to begin with. What you really want is separated concerns that compose well, obvious here doesn't mean anything over there.


Well there is now one obvious way to do most things that are targeted by this change, using the walrus operator.


> The list constructor does not overallocate the internal item buffer if the input iterable has a known length (the input implements __len__). This makes the created list 12% smaller on average. (Contributed by Raymond Hettinger and Pablo Galindo in bpo-33234.)

Wow. I believe it's a stated goal of CPython to prefer simple maintainable code over maximum performance. But relatively low-hanging fruit like this makes me wonder how much overall CPython perf could be improved by specialising for a few more of these.
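A quick way to observe it, if you're curious (exact byte counts vary by platform and Python version):

    import sys

    sys.getsizeof(list(range(1000)))          # sized input: exact allocation in 3.8
    sys.getsizeof([x for x in range(1000)])   # appends one at a time, still overallocates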


I don't mind the walrus operator, but holy cow, the positional-only parameter syntax is completely unintelligible... what the hell were they thinking?!


It really is not supposed to be used unless you are wrapping a C library. Or at least that’s the original reasoning, I think. Let’s hope it doesn’t get abused.


Shared Memory is the star of this release IMHO.


Is shared memory one of those things that I should try to access/use immediately, or wait for someone to write a wrapper library around, due to the large number of edge cases/strangeness that might occur?


Why not both? You get a great feel for what the edge cases are and how they occur, and thus a better understanding of why decisions in "wrapper libraries" were made, when these edge cases bite you.


Mostly because I don't want to be up at 3am on a Saturday night debugging production code...


Damn... this is rather confusing: https://www.python.org/dev/peps/pep-0572/

I really like the walrus operator, but I didn't realize how many "you shouldn't do this" and "this is too hard already" cases exist where they discourage using the walrus operator.


It's interesting to see Python's gradual acceptance to TMTOWTDI.


The new assignment expression is great, and I wish that an optional chaining operator (`?.`) and Elvis operator (`?:`) would make it to Python one day, too. I would love to write:

    v = obj?.prop1?.prop2 ?: "default"
instead of long if conditions:

    v = obj.prop1.prop2 if obj and obj.prop1 and obj.prop1.prop2 else "default"
A PEP for Python 3.8 existed but has been deferred: https://www.python.org/dev/peps/pep-0505/
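In the meantime, one way to approximate it is with getattr() defaults (a sketch; note that `or` tests truthiness, not just None):

    v = getattr(getattr(obj, "prop1", None), "prop2", None) or "default"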


I gave a talk on this at the PyBay conference this summer, and more recently at the ChiPy meetup in Chicago last week.

Presentation: https://docs.google.com/presentation/d/1a3Zoav7NmeN_gXjGcV_l...

Video from PyBay: https://www.youtube.com/watch?v=OtdQN24Z5MA

It covers assignment expressions in some depth, as well as highlights elsewhere.


Looks like Python is trying to out-C++ the C++ language with kitchen-sink bolt-ons.


> In this example, the assignment expression helps avoid calling len() twice:

    if (n := len(a)) > 10:
       print(f"List is too long ({n} elements, expected <= 10)")
Um, no it doesn't?:

    a1 = [10, 20, 30]
    n1 = len(a1)
    if n1 > 2:
       print(f'{n1} is greater than two')


Yes, but your example is less efficient than the new code.

Remember: every variable is an assignment into a dict, and every lookup a query into a dict. Reducing name lookups and assignments can yield good speedup in tight loops. For instance, caching os.path.join, os.path.split into local names can significantly speed up tight loops iterating over a filesystem. For example, os.path.split is potentially 5 dictionary lookups. Checking locals, nonlocals and globals for os. Then another to find path, and a final one for split. And this happens at runtime, for every invocation.
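For example, a common micro-optimization along those lines:

    import os

    def join_all(root, names):
        join = os.path.join  # bind once: a local lookup is an array index,
                             # not a chain of dict probes on every iteration
        return [join(root, name) for name in names]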


Local variables aren’t stored in a dict, likewise when a class has defined __slots__. Globals, modules and usual classes do use dicts internally, but locals do not. So from the efficiency standpoint it’s (almost?) the same. I haven’t checked the bytecode, maybe there is some slight difference.


I checked the bytecode: in the walrus version you get [...DUP_TOP, STORE_FAST, ...]; in the non-walrus you get [...STORE_FAST, LOAD_FAST, ...]. Besides that they're identical. I imagine DUP_TOP is faster than LOAD_FAST, but either way I feel this is useless micro-optimization at its finest.


In cpython:

DUP_TOP:

            PyObject *top = TOP();
            Py_INCREF(top);
            PUSH(top);
            FAST_DISPATCH();
TOP:

            (stack_pointer[-1])
LOAD_FAST:

            PyObject *value = GETLOCAL(oparg);
            if (value == NULL) { /* throw */ }
            Py_INCREF(value);
            PUSH(value);
            FAST_DISPATCH();
GETLOCAL:

            (fastlocals[i])
So looks like either way, you have an array access of something definitely in cache, a Py_INCREF, a PUSH, and a FAST_DISPATCH. The walrus operator saves you a null-check, but that check is probably skipped right over by the branch predictor, as it always throws. I'd bet the performance is indistinguishable, but I'd be interested to see for real.


It has been a while since I've looked at the implementation details. How does locals() work, then? It does return a dict. Slots are definitely an edge case I did not address; I honestly don't know how name lookup works in any version of Python.

Edit: I realize that local name lookup doesn't need to follow the result of the locals() built-in function.


I'm not a huge fan of the := operator for Python, for clarity and "one obvious way" reasons, but the draw here is the saved line of code; that is the "help".


New "f-strings support = for self-documenting expressions" but you cannot use f-strings as doc string ;)


The := operator is the assignment operator in Object Pascal, just saying; remembering my old days. BTW, not so old for everyone -> https://news.ycombinator.com/item?id=15490345


I stepped away from Python for about a year, and now I'm coming back to it. I hardly recognize the language. I'm not happy about this at all.

I don't really have a point, except that Python 3 feels like a moving target.


I feel the same way as you do. For me, which version of an interpreter I'm using should be the kind of issue I only need to worry when solving extremely specific, deep-level problems. Python 3+ breaks this pact too often for my taste.

Considering this f-string example taken from another announcement:

   f"Diameter {(diam := 2 * r)} gives circumference {math.pi * diam:.2f}"
This is valid Python 3.8, but it's not valid in Python 3.7 (no walrus operator). And removing the walrus operator still doesn't work in Python 3.5 (no f-strings). On top of that, other comments already mention how f-strings have lots of weird corner cases anyway.

The entire point of Python in my circle of friends was that it made programming easy. Instead, I feel more and more in need of those "It works in my machine!" stickers. And good luck solving these issues if you are not a full-time programmer...


I'm not sure why you frame this as an issue with Python 3. This has always been the case, even in the 2.x days every release added new features, and if you ran code using them in an older version it wouldn't work.

The minor releases are always backward-compatible, so just run the latest version and everything will work.


The issue is that the ecosystem as a whole tends to follow the tip of the version chain. This means that any human-oriented Python stuff (e.g. documentation/tutorials/code-review/etc...) requires you stay up to date with the language changes.

Asyncio was the worst culprit, as it essentially introduces an inner-platform with its own dataflow semantics, but at least there the upsides were large and tangible.


I don't understand why the walrus operator is needed at all. Why not allow this to work:

    if m=whatever():
        do_something
Why create a new assignment operator? For all the talk about making code not confusing, etc., some of these decisions sure seem nonsensical.

Oh, OK, it confuses passing arguments into a function by name? Really? C'mon.

Also, I can't understand the nearly religious rejection of pre and post increment/decrement (++/--) and, for the love of Picard, the switch() statement.

I enjoy using Python but some of these things are just silly. Just my opinion, of course. What do I know anyhow? I've only been writing software for over thirty years while using over a dozen languages ranging from machine language (as in op codes) to APL and every new fad and modern language in between.

As I watch languages evolve what I see is various levels of ridiculous reinvention of the wheel for very little in the way of real gains in productivity, code quality, bug eradication, expressiveness, etc. Python, Objective-C, Javascript, PHP, C# and a bunch of other mutants are just C and C++ that behave differently. Sure, OK, not strictly true at a technical level, but I'll be damned if it all doesn't end-up with machine code that does pretty much the same darn thing.

The world did exist before all of these "advanced" languages were around and we wrote excellent software (and crappy software too, just like today).

What's worse is that some of these languages waste a tremendous amount of resources and clock cycles to do the same thing we used to do in "lower" languages without any issues whatsoever. Mission critical, failure tolerant, complex software existed way before someone decided that the switch() statement was an abomination and that pre and post increment/decrement are somehow confusing or unrefined. Kind of makes you wonder what mental image they have of a programmer, doesn't it? Really. In my 30+ years in the industry I have yet to meet someone who is laid to waste, curled-up into a fetal position confused about pre and post increment/decrement, switch statements and other things deemed too complex and inelegant in some of these languages and pedantic circles.

Geez!

</rant off>


Is Python now becoming the new C++?


This is fair criticism. I've worked with Python for many years now, and one of the things that attracted me was that there was always a fairly static "Pythonic" way to do things. The language has now grown so much that 80% of the people only use 20% of the language, but not always the same 20%, making it difficult to read other people's code. And all the syntactic changes contribute to fragmentation (not every project can have its Python interpreter upgraded regularly).

There's something to be said for a more stable, if less elegant, language.


I would argue that it's becoming the new Perl. I mean, look at this:

   f"Diameter {(diam := 2 * r)} gives circumference {math.pi * diam:.2f}"
If this is not write-only code, I don't know what is.


Disappointing that Python still has no support for a sorted container in its standard library, akin to C++'s `set` or `map` classes.


https://docs.python.org/3.0/library/bisect.html

> This module provides support for maintaining a list in sorted order without having to sort the list after each insertion.


Problem with bisect is that bisect.insort() insertion is O(n), whereas C++ set.insert() is O(log n).
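For example:

    import bisect

    data = [10, 20, 30]
    bisect.insort(data, 25)  # finds the slot in O(log n) comparisons,
                             # but shifting the tail makes the insert O(n)
    # data == [10, 20, 25, 30]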


It is somewhat strange that Python does not have a binary tree in the standard library; I couldn't find any discussion on the topic either. It might be a nice contribution.

Edit: it was brought up a few times on the python-ideas mailing list, but it seems it just died a silent death there. https://mail.python.org/archives/list/python-ideas@python.or...


I suspect it's because you rarely need them in Python because the existing built-ins (list, tuple, dict, set) usually work well enough for a given job, or you're already using e.g. Pandas or something.

FWIW: https://github.com/calroc/xerblin/blob/master/xerblin/btree....


With respect, I think you're wrong about that.

> The module [bisect] ... uses a basic bisection algorithm to do its work.

~ https://docs.python.org/3.0/library/bisect.html

> Binary search runs in logarithmic time in the worst case, making O(log n) comparisons...

~ https://en.wikipedia.org/wiki/Binary_search_algorithm

edit: Unless you mean moving the data?


What's the value of a sorted map over an ordered map?

I don't think I've ever cared to iterate over map values based on the alphanumeric ordering of their keys.


In Haskell we use sorted maps as persistent data structures. Persistent in this context means that the insert-method returns a new map and the old map is still around, if you need it.

It's basically copy-on-write. The old and new map share all but O(log n) data.

You could probably do something like that with an unsorted map, but it's a good fit for a sorted one.


Sure, but fix your comparison metric as "timestamp of insertion" (and forget for a moment that this isn't technically pure; you can fiddle to make it pure), and you still get all the CoW niceness. But this is internal to the mapping type: my keys and values don't need to be comparable or orderable, only hashable. I'm given an ordering, the ordering is arbitrary, but I don't care, because CoW/persistent hash maps are mostly an implementation detail or optimization.


Could you please explain in some more detail how that would work, and especially how (logical) timestamp of insertion would help at all?


So one way to think of a hash table/map is as a set of <key, value> tuples. Imagine we extend that to <insertion timestamp, key, value>. All of your conventional mapping methods (get/has/put) work on the key and value and ignore the timestamp, but anything that relies on iteration takes advantage of the timestamp[1] and iterates in timestamp order. (In fact, the way this is normally done is to keep a sparse array/map of <key, pointer> and a dense array of <timestamp, value>. Whenever you insert a new key/value, you append the value to the end of the dense array (so that array stays dense); the key is hashed as normal, but the value stored in the hash table is a pointer into the dense array where the timestamp/value pair lives.) So internally, get is

    return (map[hash(key) % map_size])->value
or approximately that (I haven't written C in a while).

Then iteration over objects in the array is consistent: you just iterate over the dense array. Removal from the dense array is done by tombstoning, and possibly eventual compaction. This is essentially how python's current hash table implementation works.

IIRC, if your table can assume only insertions, this actually becomes really, really nice as a persistent data structure, since you can replace the backing vector with a backing linked list and you only really lose out on iteration speed. Then you further extend that by swapping the linked list for a tree, and you share the entire backing structure.

[1]: And you can use an increment-only mutation counter instead of a timestamp to make this pure.


Correct me if I am wrong: take everything you described and remove the (logical) timestamp, and all you'd be losing out on would be the iteration in insertion order?

So how does the timestamp have any impact on persistence?

(We agree on basically everything you write.)

> [1]: And you can use an increment-only mutation counter instead of a timestamp to make this pure.

Agreed. That's what I'd call a logical timestamp or a logical clock.

> IIRC, if you're table can assume only insertions, this actually becomes really, really nice as a persistent data structure, since you can replace the backing vector with a backing linked list, and you only really lose out on iteration speed. Then you further extend that by swapping the linked list to a tree, and you share the entire backing structure.

That still doesn't tell you anything at all about how you make the hashtable itself persistent.

One simple way would be to just introduce an arbitrary order on your hashes, eg compare them as if they were ints, and then stick them in an ordered map container. Of course, that's just a round-about way to impose an arbitrary order on your keys. Works perfectly well, it's just not too interesting.

To come back to your original question:

> What's the value of a sorted map over an ordered map? > I don't think I've ever cared to iterate over map values based on the alphanumeric ordering of their keys.

In the cases we talked about making the keys comparable is only a means to an end, and any arbitrary order will do. That's the most common use case. Iterating over the keys in insertion order is also often useful.

In practice, I did come across some use cases that make genuine use of the order of keys. Eg when I was interested in (automatically) keeping track of the highest or lowest key or quantiles.

You can use a sorted map as a priority queue easily. A min heap would give you O(1) access to the minimum item, but if you actually want to pop it, it's O(log n) anyway. A sorted map as tree gives you O(log n) for basically all operations. For most uses that's either good enough, or doesn't even make a difference at all compared to a heap. Using your sorted map as a priority queue gives you arbitrary priority updates in O(log n) for no extra implementation complexity.

Sorted maps also came in handy when I was implementing geometric algorithms where I wanted to sweep a scanline over points. Or for divide and conquer algorithms over space.

A sorted map is also useful as a basic datastructure to build step functions on top of. (https://en.wikipedia.org/wiki/Step_function) A simple application is solving the infamous skyline problem.

When you want to store your data structure, having similar keys close together can help with compression. Google's sorted string tables (SSTables) show that principle quite well.


But it does have a method to sort containers in its standard library, `sorted`, that C++ doesn't have. And it's trivial to use it to sort lists, sets, and dicts...


C++ also has the `sort()` function that allows you to sort any unsorted container. But that's not a replacement for a sorted container like `set` or `map` though. Because `set` or `map` allows you to insert elements at O(log n) runtime. If you have to sort every time you insert using the `sort()` or `sorted()` functions, the run time becomes O(n log n).


All of this looks like syntactic sugar. I would expect more work on the internals. Python has a lot of awkward edge cases in the standard library; in a lot of cases None is a valid output for not-done states. Also, I hate multithreaded programming in Python 3: it has all the drawbacks of C with the addition of the GIL.


I'm still migrating from 2.7...to 3.5


Does the Python community care about concurrency at all? I haven't seen anything new in terms of concurrency in a while. I might be wrong about this, but the walrus operator seems like a gateway to writing obfuscated, Perl-esque code.


3.8 brings shared memory to multiprocessing: https://docs.python.org/3/library/multiprocessing.shared_mem...
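A minimal sketch:

    from multiprocessing import shared_memory

    shm = shared_memory.SharedMemory(create=True, size=16)
    shm.buf[:5] = b"hello"
    # Another process can attach to the same block with:
    #   shared_memory.SharedMemory(name=shm.name)
    shm.close()
    shm.unlink()  # free the segment once every process is done with it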


There were some changes to asyncio in this release, but they were pulled at the last moment so they can better align with Trio.


It would be nice if the Python web site would be mobile friendly.


Anyone else most excited about PYTHONPYCACHEPREFIX? :-P


Not sure about most excited, but I am looking forward to setting PYTHONPYCACHEPREFIX to a location that isn't in a volume mounted into my Docker container for development.


+1


Can I chain the walrus operator? All examples I've seen were part of if/while blocks.

Something like:

    r1 = foo(x:=bar)
    r2 = baz(x)


Assignment expression is something I miss from JavaScript which I used a lot to strip lines of code. Great to see in Python.


Updates on `f-string` are quite fun. :)

`f'{username=}'` gives `"username='tolgahanuzun'"`

I like it.


Any idea how long it will take before it's an option with pyenv?


Still no pattern matching and no immutable vars :-(


Interesting! Thanks for the share!


man do I wish they would make multiprocessing easier


Subinterpreters are coming in 3.9, which should fill most of the same roles as multiprocessing but without the complexities and edge cases of multiple real OS processes.

https://www.python.org/dev/peps/pep-0554/


According to https://www.python.org/dev/peps/pep-0554/#provisional-status they should already be present in 3.8 for evaluation.



