Formalizing Python F-Strings

rtpg · on Jan 14, 2023

I do understand the readability question in theory, but it is a thing where if you open it up in basically any text editor colors resolve it instantly.

I do feel like having to try and explain what expressions are in fact not allowed in f-strings seems more fraught than potentially having hard to read f-strings. Sounds like something where a Pylint rule could get us most of the way here by establishing a convention that most projects will follow for readability

Jakob · on Jan 14, 2023

As someone points out in the comments, it is unfortunate that quotes are the only delimiter which doesn’t differentiate between opening and closing, leading to all the parsing woes, unlike

  (), <>, {}, [], …

I wonder why the default for ASCII string quotation was not

  `string'

mongol · on Jan 14, 2023

` is a dead key in many keyboard layouts. On my keyboard, I need to press shift + key left of backspace, followed by space to get it. And if I type something else I get an annoying bell sound.

kzrdude · on Jan 14, 2023

Python 2 used the backtick for repr. I had almost forgotten this.

Python people also testified about all the book typesetting issues with this, and other reasons, and said backtick would never again have a use in Python.

toxik · on Jan 14, 2023

Let’s compromise then and have ‘foo’, now nobody can write it!

layer8 · on Jan 14, 2023

In many (most?) fonts, "'" is straight while "`" is slanted, so that looks terrible. Also, traditionally the double quote (") is the standard quote delimiter, because it is visually more prominent than the single quote, but there is only one kind of it in ASCII. Typographic quotes (“like this”) would arguably be best, but for most people it wouldn't be obvious how to enter them; and also different natural languages have different conventions here. Once we go outside of ASCII, you also have «this» and「this」and《this》and so on.

keybored · on Jan 14, 2023

Good idea assuming that backtick is not used for anything else already.

The only problem with backticks is that European keyboards hate them if you have dead keys enabled.

ejolto · on Jan 14, 2023

In Scandinavian languages we differentiate between opening and closing quotes. Too bad they are not supported by Python: « open and close ».

henrydark · on Jan 14, 2023

that's what it is in m4

tragomaskhalos · on Jan 14, 2023

The f-string annoyance that dwarfs all others is forgetting the bloody f, and the associated forehead slap when the console says "the answer to your big calculation is {answer}" (sic). Unfortunately I'm not sure there's any clever language solution to that.

layer8 · on Jan 14, 2023

Proper syntax highlighting should solve this.

cricalix · on Jan 14, 2023

It's not a language solution, but an AST based linter perhaps? PyCQA might have one already; they definitely have a "consider using f-strings" lint you can turn on.
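
A rough sketch of what such an AST-based check could look like, using only the standard `ast` module. The function name `find_suspect_strings` and the regex heuristic are invented for this illustration; this is not an existing PyCQA rule.

```python
import ast
import re

# Heuristic: a plain string containing {identifier} may be an f-string
# whose "f" prefix was forgotten.
PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def find_suspect_strings(source):
    """Return (line, text) pairs for plain string literals that look like f-strings."""
    tree = ast.parse(source)

    # The literal pieces of a real f-string are also Constant nodes
    # (children of JoinedStr), so remember them and skip them below.
    fstring_parts = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.JoinedStr):
            for part in node.values:
                fstring_parts.add(id(part))

    suspects = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, str)
            and id(node) not in fstring_parts
            and PLACEHOLDER.search(node.value)
        ):
            suspects.append((node.lineno, node.value))
    return suspects


sample = '''
answer = 42
print("the answer to your big calculation is {answer}")
print(f"the answer is {answer}")
'''

suspects = find_suspect_strings(sample)
for line, text in suspects:
    print(f"line {line}: possible missing f prefix: {text!r}")
```

A check like this would also flag legitimate `str.format()` templates such as `"{name}".format(name=...)`, so a real linter would need extra context to avoid false positives.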

brennvin · on Jan 14, 2023

What if the compiler issued a warning if the f was missing and let you suppress the warning with an r?

setr · on Jan 14, 2023

Then it might as well be fstrings by default, rstrings for raw.

brennvin · on Jan 14, 2023

Good idea! If we do away with f strings entirely the problem does not occur.

haberman · on Jan 14, 2023

I don't know the history here, but it seems like Python could have saved a lot of pain by jumping straight to this solution (which has been in Ruby for upwards of 20 years), instead of going via two solutions that seem strictly worse ("%s" interpolation and "{}".format()).

samwillis · on Jan 14, 2023

That doesn't really work, both of the other systems date back literal decades. F-Strings were implemented in 2015 as a completely new thing, inspired by other languages.

In order to implement F-Strings, and due to limitations in the CPython language parser, F-Strings used to have a separate parser. Because of this there were restrictions on what Python expressions could be embedded in an F-String. In 2020 CPython replaced its parser with a much more capable one. Now they are moving the F-String parsing to be part of the CPython parser proper, which is allowing them to lift many of the restrictions within F-Strings.
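
One such restriction can be probed from Python itself. The sketch below assumes that reusing the enclosing quote character inside the braces is a representative example, and that the lifted restrictions correspond to PEP 701 in Python 3.12. The helper name `parses` is made up for this illustration.

```python
import sys


def parses(expr_source):
    """Return True if `expr_source` is accepted by this interpreter's parser."""
    try:
        # Compiling only checks syntax; the names inside need not exist.
        compile(expr_source, "<demo>", "eval")
        return True
    except SyntaxError:
        return False


# Inner quotes differ from the outer quotes: fine on every f-string-capable version.
MIXED_QUOTES = '''f"{d['key']}"'''
# Inner quotes reuse the outer quote character: rejected by the old separate parser.
SAME_QUOTES = '''f"{d["key"]}"'''

print(sys.version_info[:2], "mixed:", parses(MIXED_QUOTES), "same:", parses(SAME_QUOTES))
```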

"Formalising F-Strings" isn't a particularly good title, "Lifting the restrictions in F-String would be better".

BiteCode_dev · on Jan 14, 2023

Yeah, why did we even have that assembly stuff?

We should have had rust from the start!

Seriously, python was born in 1991, 4 years BEFORE Java existed. And it barely had any funding until 2010.

Insight is 20/20 but this is not how tech works. You are limited by knowledge, resources, politics, legacy, compatibility commitment and so on.

Not to mention you can simply get it wrong.

Would have been funny if the whole world moved out of unicode and we got stuck with it forever. Usually, you know something is a standard only when people complain you adopted it too late because everyone else did.

Like Java that used XML for all serialization and got stuck with it for 15 years while we all moved to JSON. Or utf-7 that is still used by IMAP because they made the wrong call.

Yes, utf-7. What, you thought utf8 was the only unicode standard? No, no, no. There are utf-7, utf-8, utf-16 (JS uses that) and utf-32 (C# uses that) in use. And more exist that are abandoned.

People seem to be under the illusion that this stuff is easy, or obvious. Sure. In 2022.

Everything about text has been a fight to the death to this day, for all programming languages. This includes formatting.

We have "%s" because... of printf() in C. Let that sink in.

haberman · on Jan 14, 2023

> People seem to be under the illusion that this stuff is easy, or obvious. Sure. In 2022.

I mentioned that Ruby had this for "upwards of 20 years." I was vague on the timeline because I didn't know for sure how long it had been. After reading your message I looked into it and it looks like Ruby supported "#{expr}" since at least Ruby 0.95, in 1995: https://github.com/nobu/ruby-1.0/blob/master/sample/tkdialog...

This is not to mention Perl, awk, shell, and Tcl, which had syntax like $FOO for even longer.

Ruby's solution existed before all of the following PEPs, which added different varieties of string formatting to Python:

    - PEP 215 – String Interpolation (2000) (superseded by PEP 292)
    - PEP 292 – Simpler String Substitutions (2002), added string.Template()
    - PEP 3101 – Advanced String Formatting (2006), added str.format()
    - PEP 498 – Literal String Interpolation (2015), added f''

I get that language design is hard, and it's difficult to have the final answer right away. But it's not as if this idea was unknown or untested when Python introduced other home-grown solutions. When Python proposed f strings in 2015, it imitated a design that had been in Ruby for almost 20 years at that point.
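
For comparison, here is a small sketch of the four styles side by side, producing the same output. The variable names are arbitrary.

```python
from string import Template

name, count = "world", 3

# printf-style % formatting, inherited from C
percent_style = "hello %s, %d times" % (name, count)

# PEP 292: string.Template with $-placeholders
template_style = Template("hello $name, $count times").substitute(name=name, count=count)

# PEP 3101: str.format()
format_style = "hello {}, {} times".format(name, count)

# PEP 498: f-strings, evaluated where the literal appears
fstring_style = f"hello {name}, {count} times"

results = {percent_style, template_style, format_style, fstring_style}
print(results)
```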

hnfong · on Jan 15, 2023

> There are utf-7, utf-8, utf-16 (JS uses that) and utf-32 (C# uses that) in use. And more exist that are abandoned.

Javascript is just UCS-2 that exposes UTF-16 internals (eg. surrogates) for your application logic to process if you want to do anything correct with the string (other than pass it around unmodified). Maybe it's "abandoned" in the sense that UTF-16 nominally supersedes it, but yea, it's actually just UCS-2. Lots of the stuff in early 1990s used UCS-2 (eg NTFS).

This also goes to your point about living with "wrong" decisions from the past due to compatibility and inertia. Nobody knew we needed more than 2^16 code points (bet most of you don't know about the Kangxi Dictionary...)
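
To make the surrogate point concrete, here is a small Python sketch showing how a code point above 2^16 turns into two UTF-16 code units. The helper name `utf16_code_units` is made up for this illustration, and the two sample characters are arbitrary choices.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of `text` as hex strings."""
    data = text.encode("utf-16-be")  # big-endian, no byte-order mark
    count = len(data) // 2
    return [hex(unit) for unit in struct.unpack(f">{count}H", data)]


bmp_char = "\u5b57"         # U+5B57, fits in one 16-bit unit
astral_char = "\U00020000"  # U+20000 (CJK Extension B), needs a surrogate pair

bmp_units = utf16_code_units(bmp_char)
astral_units = utf16_code_units(astral_char)

print(bmp_units)         # one code unit
print(astral_units)      # high surrogate followed by low surrogate
print(len(astral_char))  # Python 3 counts code points, so this is 1
```

A UCS-2-style API that counts 16-bit units would report a length of 2 for the second character, which is the kind of leak described above.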

joshuamorton · on Jan 14, 2023

It's difficult to say that .format and %s are both strictly worse (I think "{}.format()" is strictly better than %s), but both are a bit orthogonal to f-strings because both are lazily applied. F-strings evaluate locals immediately, but .format can be called later. That is, concretely, f-strings are almost equivalent to `"...".format(**locals())`, but f-strings fail for cases where, for example, the string to be formatted itself is not available at "compile" time.
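
A minimal sketch of that difference. The function names and the template text are invented for this example.

```python
def greet_now(name):
    # The f-string is evaluated right here, using names visible at this point.
    return f"hello {name}"


def greet_later(template, name):
    # The template is ordinary data; it could come from a config file or a
    # translation catalog and is only filled in when format() is called.
    return template.format(name=name)


def greet_with_locals(name):
    # Roughly what the f-string does, spelled with format(); unlike an
    # f-string, this cannot see globals or evaluate arbitrary expressions.
    return "hello {name}".format(**locals())


runtime_template = "goodbye {name}"  # not known when greet_later was written
print(greet_now("Ada"))
print(greet_later(runtime_template, "Ada"))
print(greet_with_locals("Ada"))
```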

masklinn · on Jan 14, 2023

> I think "{}.format()" is strictly better than %s

It’s not, because it’s way too powerful.

A user-provided printf string is not much of an issue, a user-provided str.format one very much is, as it provides significant traversal capabilities.

joshuamorton · on Jan 14, 2023

Is the .format sublanguage notably stronger than the %-one? I know in C, for example, the printf sublanguage can be used to do Turing-complete things, but I didn't think that was possible with either in Python.

masklinn · on Jan 14, 2023

> Is the .format sublanguage notably stronger than the %-one?

Yes, it allows arbitrary attribute traversal as well as broad indexing.

The % sublanguage is a lot more restrictive and doesn’t suffer from the arbitrary reading of the C version (if you format an object that wasn’t passed in you get an error not a read off of the stack or a register).

I don’t think format is anywhere near turing complete, but it allows significant information querying and retrieval (https://lucumr.pocoo.org/2016/12/29/careful-with-str-format/).

Both still allow for resource exhaustion but that’s what it is.
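
A small sketch of the traversal difference. The `Config` and `Event` classes and the secret value are hypothetical stand-ins for application objects.

```python
class Config:
    def __init__(self):
        self.secret = "hunter2"


class Event:
    def __init__(self):
        self.name = "login"
        self.config = Config()


event = Event()

# Imagine this template string arrives from an untrusted user.
user_template = "{e.config.secret}"

# str.format follows the attribute chain and exposes the secret.
leaked = user_template.format(e=event)

# The %-style mapping form treats the whole parenthesised text as one
# dictionary key, so the same trick fails with a KeyError.
try:
    "%(e.config.secret)s" % {"e": event}
    percent_traversed = True
except KeyError:
    percent_traversed = False

print(leaked, percent_traversed)
```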

pmontra · on Jan 14, 2023

You can use eval to transform a string to an f-string and evaluate it. Of course format is better in that case.

https://stackoverflow.com/questions/44757222/transform-strin...
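
For illustration only, one way the eval approach could be sketched. The helper name `render_like_fstring` is made up, and this is not necessarily the code from the linked answer. Since `eval` executes arbitrary code, it is only defensible with fully trusted templates.

```python
def render_like_fstring(template, namespace):
    """Evaluate `template` as if it were an f-string literal.

    WARNING: eval() runs arbitrary code; never pass untrusted templates.
    """
    # repr() yields a valid quoted literal; prefixing "f" makes it f-string source.
    return eval("f" + repr(template), {}, dict(namespace))


template = "the answer is {answer * 2}"
rendered = render_like_fstring(template, {"answer": 21})

# The safer alternative when only plain substitution is needed:
formatted = "the answer is {answer}".format(answer=42)

print(rendered)
print(formatted)
```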

thijsvandien · on Jan 14, 2023

You would be surprised how much hate f-strings got when they were first proposed/implemented. :)

BiteCode_dev · on Jan 14, 2023

That's something most people don't realize.

The walrus got heat. F-string got heat. match/case got heat.

Unicode in python 3 got 15 years of trolling.

Every single time you make a proposal, you have to make a compromise between:

    - what the infrastructure and style of the language allows
    - the resources you have
    - the desire for staying compat
    - the desire for staying modern
    - the pressure of people wanting a new toy
    - the pressure of people hating any kinda change

You cannot flip a switch and "make the language better".

smitty1e · on Jan 14, 2023

I have yet to ever need the walrus.

BiteCode_dev · on Jan 14, 2023

I used it 3 times yesterday (usually for re.match).

But I've been coding 15 years without it.
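
A typical shape of that usage, as a sketch. The function name `parse_version` and the version pattern are invented for this example.

```python
import re


def parse_version(line):
    """Return (major, minor) if `line` starts with something like 'v3.12', else None."""
    # The walrus binds the match object and tests it in a single expression.
    if (m := re.match(r"v(\d+)\.(\d+)", line)) is not None:
        return int(m.group(1)), int(m.group(2))
    return None


print(parse_version("v3.12 released"))
print(parse_version("no version here"))
```

Without `:=`, the same code needs a separate `m = re.match(...)` line before the `if`.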

pmontra · on Jan 14, 2023

Agreed. There are things that languages must get right at the very beginning or be plagued by inconveniences forever. This is one of them. I got a number of bugs and test failures because I forgot an f before a string with interpolated values.

Python is rooted in the 80s. Maybe Guido didn't get the idea because maybe his background was only with languages without interpolation. Example: C has argument replacement with printf("%d", 42) and no interpolation.

henrydark · on Jan 14, 2023

It's interesting to note that the creator of C and his well known friend also created m4 in the 70's, and it's very similar to "string interpolation", maybe even taken to the extreme.

I was actually program-wise "raised" by The C Programming Language in the early 90's, and I only found out about m4 a month ago.

sakjur · on Jan 14, 2023

I think I would keep the f-prefix if I were to do a Python dialect from scratch with lessons learned. Making { a special character in any string seems excessive and prone to user injected rogue values that are hard to catch in all places of a large system.

Having the slightly safer types of strings be the default seems prudent.

pydry · on Jan 14, 2023

.format has fewer escaping headaches.

tgv · on Jan 14, 2023

They could have saved even more pain by doing print() and unicode strings straight from the start, and by adding strict typing, and by finding a better solution for def __hack__(), but here we are.

BiteCode_dev · on Jan 14, 2023

Sure. From version 1 even.

The one that shared an era with pokemon version 1, on the original black and white gameboy.

Because it was so clear from the beginning, even if emoji didn't exist, CPUs had no cores, Ruby hadn't been invented yet, and people were sharing the source code on floppy disk.

its-summertime · on Jan 14, 2023

> The first two might be perfectly understandable to the parser, but as a human reader, they make it more complicated and error-prone to work out which quotes delimit the f-string and which do not.

as also a human reader, the examples are fine?, I'm used to bash having `"$(basename "$val")"`, or XML having `<a><b><a></a></b></a>`, or python having `))))` at the end of an expression due to having a weird hate of standardized public methods

ssl232 · on Jan 14, 2023

If you train yourself to see a { within an f-string to mean that you should enter a new parsing context stacked upon the previous one, then it's entirely clear.

8n4vidtmkvmk · on Jan 14, 2023

i haven't heard anyone quibble about nesting complex expressions in js template strings. just let it be. people will write ugly code no matter what the language allows, you can't enforce it

BiteCode_dev · on Jan 14, 2023

To be fair the JS community is not as picky about this because it has lived in a state of spaghetti code since the beginning.

It's fair to want to keep readability in mind in the design of the language. That's one of the things that has made Python so popular since its inception.

acjohnson55 · on Jan 14, 2023

Life would be so much easier if quotes were directional, like brackets. It's a damn shame this has never been changed on an industry level.

keybored · on Jan 14, 2023

I am fine with whatever restrictions Rust puts on `{foo}` in things like `println!()`. Just being able to interpolate a simple variable or constant is a huge improvement.