I do understand the readability question in theory, but if you open the code in basically any text editor, syntax highlighting resolves it instantly.
I do feel like having to explain which expressions are in fact not allowed in f-strings seems more fraught than potentially having hard-to-read f-strings. This sounds like something where a Pylint rule could get us most of the way there by establishing a convention that most projects will follow for readability.
As someone points out in the comments, it is unfortunate that quotes are the only delimiter which doesn’t differentiate between opening and closing, unlike (), <>, {}, [], …, and that is what leads to all the parsing woes.
I wonder why the default for ASCII string quotation was not
` is a dead key in many keyboard layouts. On my keyboard, I need to press shift + key left of backspace, followed by space to get it. And if I type something else I get an annoying bell sound.
Python 2 used the backtick for repr. I had almost forgotten this.
Python people also testified about all the book typesetting issues with this, and other reasons, and said backtick would never again have a use in Python.
In many (most?) fonts, "'" is straight while "`" is slanted, so that looks terrible. Also, traditionally the double quote (") is the standard quote delimiter, because it is visually more prominent than the single quote, but there is only one kind of it in ASCII. Typographic quotes (“like this”) would arguably be best, but for most people it wouldn't be obvious how to enter them; and also different natural languages have different conventions here. Once we go outside of ASCII, you also have «this» and「this」and《this》and so on.
The f-string annoyance that dwarfs all others is forgetting the bloody f, and the associated forehead slap when the console says "the answer to your big calculation is {answer}" (sic). Unfortunately I'm not sure there's any clever language solution to that.
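For anyone who hasn't been bitten by this yet, here is a minimal illustration of the mistake. The variable names are made up for the example.

```python
answer = 42

# Missing the f prefix: the braces are ordinary characters, nothing is substituted.
plain = "the answer to your big calculation is {answer}"

# With the prefix, {answer} is evaluated right where the literal appears.
formatted = f"the answer to your big calculation is {answer}"

print(plain)      # the answer to your big calculation is {answer}
print(formatted)  # the answer to your big calculation is 42
```

Both lines are perfectly valid Python, which is exactly why the interpreter can't warn about the first one.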
It's not a language solution, but an AST based linter perhaps? PyCQA might have one already; they definitely have a "consider using f-strings" lint you can turn on.
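A rough sketch of what such an AST-based check could look like, using only the standard `ast` module. This is not how any existing linter does it; the function name `find_suspect_strings` and the regex heuristic are my own assumptions.

```python
import ast
import re

# Heuristic (an assumption): a brace pair wrapping something identifier-like.
PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_.]*\}")


def find_suspect_strings(source):
    """Return sorted (line, text) pairs for plain strings that look like f-strings."""
    tree = ast.parse(source)

    # Literal pieces that already belong to an f-string are fine; remember them.
    inside_fstring = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.JoinedStr):
            for child in ast.walk(node):
                inside_fstring.add(id(child))

    suspects = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, str)
            and id(node) not in inside_fstring
            and PLACEHOLDER.search(node.value)
        ):
            suspects.append((node.lineno, node.value))
    return sorted(suspects)


sample = '''
answer = 42
print("the answer is {answer}")
print(f"the answer is {answer}")
template = "{} items".format(3)
'''

print(find_suspect_strings(sample))  # [(3, 'the answer is {answer}')]
```

It will produce false positives for strings that are meant to be passed to `.format(name=...)` later, so a real rule would need to look at how the string is used.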
I don't know the history here, but it seems like Python could have saved a lot of pain by jumping straight to this solution (which has been in Ruby for upwards of 20 years), instead of going via two solutions that seem strictly worse ("%s" interpolation and "{}".format()).
That doesn't really work; both of the other systems date back literal decades. F-Strings were implemented in 2015 as a completely new thing, inspired by other languages.
In order to implement F-Strings, and due to limitations in the CPython language parser, F-Strings used to have a separate parser. Because of this, there were restrictions on which Python expressions could be embedded in an F-String. In 2020, CPython replaced its parser with a much more capable one. Now they are moving the F-String parsing to be part of the CPython parser proper, which allows them to lift many of the restrictions within F-Strings.
"Formalising F-Strings" isn't a particularly good title; "Lifting the restrictions in F-Strings" would be better.
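One of the best-known restrictions was reusing the enclosing quote character inside the braces. As far as I know, that was lifted by PEP 701 in Python 3.12. Here is a small probe you can run on any Python 3 to see what your interpreter accepts. The helper name is made up, and the candidate is kept as plain source text so the file itself parses everywhere.

```python
import sys

# Source text of an f-string that reuses the same quote type inside the braces.
candidate = 'f"{names["first"]}"'

# The traditional workaround: switch quote type inside the braces.
workaround = "f\"{names['first']}\""


def accepted_by_this_interpreter(source):
    """Return True if the running interpreter can compile `source` as an expression."""
    try:
        compile(source, "<candidate>", "eval")
        return True
    except SyntaxError:
        return False


print(sys.version_info[:2])
print("same quotes inside braces:", accepted_by_this_interpreter(candidate))
print("alternating quotes:       ", accepted_by_this_interpreter(workaround))
```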
Seriously, python was born in 1991, 4 years BEFORE Java existed. And it barely had any funding until 2010.
Hindsight is 20/20, but this is not how tech works. You are limited by knowledge, resources, politics, legacy, compatibility commitments and so on.
Not to mention you can simply get it wrong.
Would have been funny if the whole world had moved away from Unicode and we had gotten stuck with it forever. Usually, you know something is a standard only when people complain you adopted it too late because everyone else did.
Like Java, which used XML for all serialization and got stuck with it for 15 years while we all moved to JSON. Or utf-7, which is still used by IMAP because they made the wrong call.
Yes, utf-7. What, you thought utf-8 was the only Unicode standard? No, no, no. There are utf-7, utf-8, utf-16 (JS uses that) and utf-32 (C# uses that) in use. And more exist that were abandoned.
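To make the difference between these encodings concrete, here is a quick sketch that encodes the same one-character string with each of them, using codec names from Python's standard library. The sample character is an arbitrary choice.

```python
text = "é"  # a single code point, U+00E9

# Same character, four different byte representations.
encoded = {
    codec: text.encode(codec)
    for codec in ("utf-7", "utf-8", "utf-16-le", "utf-32-le")
}

for codec, raw in encoded.items():
    print(f"{codec:10} {len(raw)} bytes  {raw!r}")
```

The utf-7 result only contains ASCII bytes, which is the property that made it attractive for 7-bit mail transports in the first place.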
People seem to be under the illusion that this stuff is easy, or obvious. Sure. In 2022.
Everything about text has been a fight to the death to this day, for all programming languages. This includes formatting.
We have "%s" because... of printf() in C. Let that sink in.
> People seem to be under the illusion that this stuff is easy, or obvious. Sure. In 2022.
I mentioned that Ruby had this for "upwards of 20 years." I was vague on the timeline because I didn't know for sure how long it had been. After reading your message I looked into it and it looks like Ruby supported "#{expr}" since at least Ruby 0.95, in 1995: https://github.com/nobu/ruby-1.0/blob/master/sample/tkdialog...
This is not to mention Perl, awk, shell, and Tcl, which had syntax like $FOO for even longer.
Ruby's solution existed before all of the following PEPs, which added different varieties of string formatting to Python:
- PEP 215 – String Interpolation (2000) (superseded by PEP 292)
- PEP 292 – Simpler String Substitutions (2002), added string.Template()
- PEP 3101 – Advanced String Formatting (2006), added str.format()
- PEP 498 – Literal String Interpolation (2015), added f''
I get that language design is hard, and it's difficult to have the final answer right away. But it's not as if this idea was unknown or untested when Python introduced other home-grown solutions. When Python proposed f strings in 2015, it imitated a design that had been in Ruby for almost 20 years at that point.
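For reference, here is the same message written in each of the styles that actually shipped (PEP 215 was superseded and never landed as such). The variable names are just for illustration.

```python
from string import Template

name, count = "world", 3

# printf-style formatting, inherited from C
percent_style = "hello %s, %d items" % (name, count)

# PEP 292: string.Template
template_style = Template("hello $name, $count items").substitute(name=name, count=count)

# PEP 3101: str.format()
format_style = "hello {}, {} items".format(name, count)

# PEP 498: f-strings
fstring_style = f"hello {name}, {count} items"

print(percent_style, template_style, format_style, fstring_style, sep="\n")
```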
> There are utf-7, utf-8, utf-16 (JS uses that) and utf-32 (C# uses that) in use. And more exist that were abandoned.
JavaScript is just UCS-2 that exposes UTF-16 internals (e.g. surrogates) for your application logic to process if you want to do anything correct with the string (other than pass it around unmodified). Maybe it's "abandoned" in the sense that UTF-16 nominally supersedes it, but yea, it's actually just UCS-2. Lots of stuff from the early 1990s used UCS-2 (e.g. NTFS).
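A small sketch of what "surrogates" means here, done in Python since it can show the UTF-16 code units explicitly. The emoji is an arbitrary code point outside the Basic Multilingual Plane.

```python
import struct

ch = "\U0001F600"  # one code point above U+FFFF

# Python 3 counts code points, so this is one "character".
print(len(ch))  # 1

# In UTF-16 the same code point needs two 16-bit code units (a surrogate pair).
raw = ch.encode("utf-16-le")
high, low = struct.unpack("<2H", raw)
print(hex(high), hex(low))  # 0xd83d 0xde00

# A character inside the BMP fits in a single code unit.
bmp_units = len("A".encode("utf-16-le")) // 2
print(bmp_units)  # 1
```

A language that indexes strings by 16-bit unit reports a length of 2 for that emoji, and slicing between the two units leaves you with half a character.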
This also goes to your point about living with "wrong" decisions from the past due to compatibility and inertia. Nobody knew we needed more than 2^16 code points (bet most of you don't know about the Kangxi Dictionary...)
It's difficult to say that .format and %s are both strictly worse. (I think "{}".format() is strictly better than %s.) Both are a bit orthogonal to f-strings, because both are lazily applied: f-strings evaluate locals immediately, but .format can be called later. That is, concretely, f-strings are almost equivalent to `"".format(**locals())`, but f-strings fail for cases where, for example, the string to be formatted itself is not available at "compile" time.
Is the .format sublanguage notably stronger than the %-one? I know in C, for example, the printf sublanguage can be used to do Turing-complete things, but I didn't think that was possible in either in Python.
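A quick sketch of the lazy-versus-immediate distinction. The function and variable names are invented for the example.

```python
# A template can be defined (or loaded from a file, a database, ...) long
# before the values exist, and filled in whenever we choose.
template = "hello {name}"
later = template.format(name="Ada")

# An f-string is evaluated on the spot, using whatever is in scope right now.
name = "Grace"
eager = f"hello {name}"
name = "Linus"  # rebinding afterwards does not change `eager`


def greet():
    name = "Ada"
    # For plain variable names, these two give the same result.
    return "hello {name}".format(**locals()), f"hello {name}"


print(later, eager, greet())
```

The "almost" matters: f-strings accept arbitrary expressions inside the braces, which `.format(**locals())` does not.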
> Is the .format sublanguage notably stronger than the %-one?
Yes, it allows arbitrary attribute traversal as well as broad indexing.
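A few examples of what that looks like in practice. `Point` and `data` are made-up names for the illustration.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
data = {"key": "value"}

# Attribute traversal inside the replacement field
attrs = "{0.x},{0.y}".format(p)

# Indexing by key (note: no quotes around the key) and by position
by_key = "{d[key]}".format(d=data)
by_index = "{0[1]}".format([10, 20])

# The % sublanguage can look up a mapping key, but cannot reach into attributes
percent_key = "%(key)s" % data

print(attrs, by_key, by_index, percent_key)  # 3,4 value 20 value
```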
The % sublanguage is a lot more restrictive and doesn’t suffer from the arbitrary reading of the C version (if you format an object that wasn’t passed in, you get an error, not a read off of the stack or a register).
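To illustrate the "you get an error" part, here is a short sketch of what happens when a %-format asks for more than it was given.

```python
# Two placeholders, one argument: Python raises instead of reading garbage.
try:
    "%s and %s" % ("only one",)
except TypeError as exc:
    positional_error = str(exc)

# Same story for a named placeholder that is missing from the mapping.
try:
    "%(missing)s" % {}
except KeyError as exc:
    named_error = repr(exc)

print(positional_error)  # not enough arguments for format string
print(named_error)       # KeyError('missing')
```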
The walrus got heat. F-string got heat. match/case got heat.
Unicode in python 3 got 15 years of trolling.
Every single time you make a proposal, you have to make a compromise between:
- what the infrastructure and style of the language allows
- the resources you have
- the desire for staying compat
- the desire for staying modern
- the pressure of people wanting a new toy
- the pressure of people hating any kinda change
You cannot flip a switch and "make the language better".
Agreed. There are things that languages must get right at the very beginning or be plagued by inconveniences forever. This is one of them. I got a number of bugs and test failures because I forgot an f before a string with interpolated values.
Python is rooted in the 80s. Maybe Guido didn't get the idea because maybe his background was only with languages without interpolation. Example: C has argument replacement with printf("%d", 42) and no interpolation.
It's interesting to note that the creator of C and his well known friend also created m4 in the 70's, and it's very similar to "string interpolation", maybe even taken to the extreme.
I was actually program-wise "raised" by The C Programming Language in the early 90's, and I only found out about m4 a month ago.
I think I would keep the f-prefix if I were to do a Python dialect from scratch with lessons learned. Making { a special character in any string seems excessive and prone to user injected rogue values that are hard to catch in all places of a large system.
Having the slightly safer types of strings be the default seems prudent.
They could have saved even more pain by doing print() and unicode strings straight from the start, and by adding strict typing, and by finding a better solution for def __hack__(), but here we are.
The one that shared an era with pokemon version 1, on the original black and white gameboy.
Because it was so clear from the beginning, even if emoji didn't exist, CPUs weren't multi-core, Ruby hadn't been invented yet, and people were sharing source code on floppy disks.
> The first two might be perfectly understandable to the parser, but as a human reader, they make it more complicated and error-prone to work out which quotes delimit the f-string and which do not.
As a fellow human reader, I think the examples are fine? I'm used to bash having `"$(basename "$val")"`, or XML having `<a><b><a></a></b></a>`, or Python having `))))` at the end of an expression due to its weird hate of standardized public methods.
If you train yourself to see a { within an f-string to mean that you should enter a new parsing context stacked upon the previous one, then it's entirely clear.
i haven't heard anyone quibble about nesting complex expressions in js template strings. just let it be. people will write ugly code no matter what the language allows, you can't enforce it
I am fine with whatever restrictions Rust puts on `{foo}` in things like `println!()`. Just being able to interpolate a simple variable or constant is a huge improvement.