
New string formatting in Python - eatonphil
https://zerokspot.com/weblog/2015/12/31/new-string-formatting-in-python/
======
scrollaway
String interpolation is one of those features where I'm really confused about
the direction the Python team wants to take. It's implicit and magic, and yet
another non-obvious way to do string formatting.

What's more, it's being hailed as a plus for localization, _which it isn't_.
Localizers should never, ever deal with string interpolation - anything past
what .format() does is essentially untranslatable.

~~~
hyperpape
Why is it implicit and magic? Looking at this post, and the PEP, it seems like
interpolation is actually pretty close to .format in semantics, but
syntactically simpler. There are a few bits that I didn't quite follow in the
PEP, so I may just be missing the issue, however.

~~~
saurik
The important thing for localization is to allow the strings to be swapped at
runtime by loading them from some kind of database in a way where the
arguments are at least numbered, if not named, so as to let them appear in
different orders for different languages. The interpolation syntax does not
seem, to me, to allow for this: it is an "implicit" syntax offered during parsing
to take a constant string and immediately build it out of parts, as opposed to
the "explicit" % operator or .format methods: it is extremely clear with these
two models how one would load a template that has the strings in a different
order, or which used a different subset of the strings.

    
    
        >>> '%(one)s %(two)s'%{"one":"hello", "two":"world"}
        'hello world'
        >>> '{one} {two}'.format(one="hello", two="world")
        'hello world'
    

The PEP that is looking at this problem has been deferred: at the end of the
document they kind of indicate that they didn't really think about the i18n
problem correctly and so don't actually have a good solution to present, and
are going back to think about it more.

[https://www.python.org/dev/peps/pep-0501/](https://www.python.org/dev/peps/pep-0501/).
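
To make the runtime-swapping point concrete, here is a minimal sketch. The `CATALOG` dict and the `render` helper are hypothetical names standing in for whatever database or message catalog would actually supply the strings; the only real API used is `str.format`.

```python
# Hypothetical message catalog; in practice this might be loaded at runtime
# from a translation file or a database rather than hard-coded.
CATALOG = {
    "en": "{count} files were deleted by {user}",
    "de": "{user} hat {count} Dateien entfernt",
}


def render(locale, **values):
    """Look up the template for a locale and fill in the named fields."""
    return CATALOG[locale].format(**values)


english = render("en", user="alice", count=3)
german = render("de", user="alice", count=3)
print(english)
print(german)
```

Because the fields are named, each translation is free to put them in a different order, which a literal baked into the source at parse time cannot offer.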

~~~
sametmax
If you need to localize, you still have format(). You just happen to have a
shorthand to do it for the 80% case, just like you have @decorator as a
shortcut for decorated = decorator(decorated) or [x for x in z] as a classic
loop alternative.

Yes, having so many solutions is not ideal, especially when it comes to teaching
the language. However, it would be foolish to avoid improving the usability of
Python just to avoid having "one more way to do it".

I do wish they'd deprecate Template though. It's more than useless.

~~~
mixmastamyk
string.Template is the one used for i18n, it is simple for the end user so has
a use.
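
For anyone who has not used it, a small sketch of `string.Template` from the standard library. The template text and values are made up for illustration.

```python
from string import Template

# Hypothetical translated template, as a translator might receive it.
template = Template("$user deleted $count files")

full = template.substitute(user="alice", count=3)

# safe_substitute leaves unknown placeholders in place instead of raising.
partial = template.safe_substitute(user="alice")

print(full)
print(partial)
```

The `$name` placeholders carry no format specs or attribute access, which is arguably what makes them easy to hand to non-programmers.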

------
mixmastamyk
From another post:

    
    
        "{a} {b} {a}".format(a=a, b=b)
        "{a} {b} {a}".format(**locals())
    

Compared to this:

    
    
        f"{a} {b} {a}"
    

Sorry that's about 1000% better. This should have been the one way to do it,
originally. It isn't magic either, rather a simple compile-time transformation
to existing format syntax. There's nothing new to remember besides a large
reduction in noise.
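
A rough way to see this for yourself, assuming Python 3.6 or later. This is only a sketch: it shows that the two spellings give the same text and that the parser handles the f-string as a structured node, not that the bytecode is literally a `.format()` call.

```python
import ast

a, b = "spam", "ham"

# The f-string and the explicit .format() call produce the same text.
via_fstring = f"{a} {b} {a}"
via_format = "{a} {b} {a}".format(a=a, b=b)

# The parser turns an f-string into a JoinedStr node, not a plain string
# constant, so the pieces are known before the code ever runs.
tree = ast.parse('f"{a} {b} {a}"', mode="eval")
node_type = type(tree.body).__name__

print(via_fstring, via_format, node_type)
```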

~~~
schmichael
I'm in the "wish it was more explicit" camp. Would a fmt function be that
terrible?

    
    
        fmt("{a} {b} {c}", a=a, b=b, c=c)
    

If you want to save typing, maybe use :a instead of {a}. Or ?a would have made
plain old ? a nice positional variant:

    
    
        fmt("?a ? ?", b, c, a=a)
    

The main benefit of the fmt function is that it requires no syntax changes to
the language and is trivially provided by a third party library for all past
versions of Python.
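
As a minimal sketch of what such a library function could look like: `fmt` here is a hypothetical helper, and it assumes the existing `{}` placeholder syntax rather than the `:a` or `?a` variants, so it is just a thin wrapper over `str.format`.

```python
def fmt(template, *args, **kwargs):
    """Hypothetical helper: a thin wrapper over str.format().

    Uses the existing {} placeholder syntax, so no new parsing is needed
    and the function works on any Python version that has str.format().
    """
    return template.format(*args, **kwargs)


a, b, c = 1, 2, 3
named = fmt("{a} {b} {c}", a=a, b=b, c=c)
mixed = fmt("{a} {} {}", b, c, a=a)
print(named)
print(mixed)
```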

That being said this ship has sailed. I guess I just take a more conservative
approach to syntax changes than most.

Update: a bit sad to see my votes fluctuating wildly on this post. Please
don't use votes to support or disagree with me: that's not what they're for.
Please vote only based on whether you find this relevant.

~~~
mixmastamyk
It is explicit, in that one must use the f'' prefix. There are no syntax
changes as u'', b'', and r'' already exist. Positionals make bugs more likely.

~~~
schmichael
It is a syntax change as before f"" was a syntax error:

    
    
        >>> f"foo"
          File "<stdin>", line 1
            f"foo"
                 ^
        SyntaxError: invalid syntax

~~~
schmichael
> ... though similar to adding a function that didn't exist before.

This is not true and exactly the distinction I'm trying to make:

Adding new packages, functions, objects, etc. can all be backported to older
versions and alternative implementations. They also require no updates to
ASTs, linters, syntax highlighters, static analysis tools etc.

Adding new syntax is backward incompatible (unless it's added as a from
__future__ import in new releases of old versions) and requires changes to all
tools that parse Python syntax (the interpreter, ASTs, linters, transpilers, etc).

~~~
mixmastamyk
Ok, true, though I would argue that it isn't entirely new syntax but rather a
variation of an existing one.

It is a shame that linters will have to add a letter to their grammar also,
but I argue that the everyday usability and readability for millions will
outweigh this drawback.

------
richard_todd
The motivation for PEP-0498 given in the article was the difference in
verbosity between these two lines:

    
    
      "{} {}".format(a, b)
      "%s %s" % (a, b,)
    

That's not very convincing. I was hoping this article would make a good case
for interpolated strings, since it's starting to feel like Python is having an
identity crisis. Type annotations especially took me by surprise, but string
interpolation is another good example of an addition that doesn't feel like
Python (imho, anyway).

~~~
jdnier
The more common case for me is to name the parameters. Especially with longer
variable names, this gets tedious fast. Given

    
    
        >>> very_long_var_name_1 = 'spam'
        >>> very_long_var_name_2 = 'ham'
    

compare

    
    
        >>> # Explicit but tedious and doesn't help readability:
        >>> print('{very_long_var_name_1}: {very_long_var_name_2}'.format(
        ...       very_long_var_name_1=very_long_var_name_1,
        ...       very_long_var_name_2=very_long_var_name_2))
        spam: ham
    

with

    
    
        >>> # Explicit but somehow feels dirty:
        >>> print('{very_long_var_name_1}: {very_long_var_name_2}'.format(
        ...       **locals()))
        spam: ham
    

and

    
    
        >>> # Still fits on one line. I think f prefix makes intent clear.
        >>> print(f'{very_long_var_name_1}: {very_long_var_name_2}')
        spam: ham

~~~
toyg
Thing is, **locals() "feels dirty" but does exactly the same thing as f''.
Except f'' does it implicitly, hiding dirt under the carpet. One day, that
dirt will turn something into a bug; which is _exactly_ why the Zen says
"explicit is better than implicit".

The best way imho would be:

    
    
        vars = {'short1': very_long_var_name1, 
                'short2': very_long_var_name2}
        print('{short1} {short2}'.format(**vars))
    

Easy and extremely unlikely to ever include the wrong variable.

~~~
spicyj
Except a linter or static analyzer can easily be taught to understand f"", but
it's essentially impossible to reason statically about locals().
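
A small sketch of the difference, assuming Python 3.6 or later. `names_in_fstring` is a hypothetical helper, not part of any linter; it only uses the standard `ast` module.

```python
import ast


def names_in_fstring(source):
    """Return the variable names referenced in a piece of source code."""
    tree = ast.parse(source, mode="eval")
    return sorted({node.id for node in ast.walk(tree)
                   if isinstance(node, ast.Name)})


fstring_names = names_in_fstring('f"{first} {last!r:>10}"')

# With **locals() the call site only shows a call to locals(); which keys
# the template needs is not visible without also parsing the template string.
locals_names = names_in_fstring('"{first} {last}".format(**locals())')

print(fstring_names)
print(locals_names)
```

The f-string exposes `first` and `last` directly in the syntax tree, while the `locals()` version exposes nothing but the name `locals`.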

------
Walkman
I have mixed feelings about this. This will be the FOURTH way of formatting
strings in Python. The other formattings will never go away.

~~~
bhaak
What's the fourth way? The article only mentions three.

~~~
nothrabannosir
Not the op but I'm guessing just using + (which calls .__str__(), iirc?)

~~~
richardwhiuk
+ doesn't call __str__ internally.

e.g. the following is an error.

    
    
      a = 4
      "a: " + a
    

You have to do:

    
    
      "a: " + str(a)

------
monkmartinez
I'll just leave this here:
[https://www.python.org/dev/peps/pep-0020/](https://www.python.org/dev/peps/pep-0020/)

"There should be one-- and preferably only one --obvious way to do it."

There are a lot of warts on that snake, but I still really like using Python.

~~~
aexaey
PEP 20 _sounds_ amazing, but when this only way ends up being way too
verbose/ugly, you are kind of stuck with it. Compare:

    
    
        #python
        import re
        m = re.search('(a.+)(d.+)', 'abcdef')
        if m:
          print(f"{m.group(1)},{m.group(2)}")
    
        #perl
        if('abcdef' =~ /(a.+)(d.+)/){
          print "$1,$2";
        }

~~~
JoshTriplett
I don't know why Python re match objects don't support indexing; a __getitem__
method would allow m[1] and m[2] instead (or m['name'] for named groups).

That aside, though, I don't consider brevity alone the most critical criterion
for a programming language; that way lies APL. _Expressiveness_ , yes, but not
at the expense of clarity.
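
On the indexing point: as far as I know, match objects did gain `__getitem__` in Python 3.6, so on those versions the example shrinks to something like the sketch below. The group name `tail` is just an illustrative choice.

```python
import re

m = re.search(r"(a.+)(?P<tail>d.+)", "abcdef")

# On Python 3.6 and later, indexing a match is the same as calling group().
by_index = m[1]
by_name = m["tail"]
joined = f"{m[1]},{m[2]}"
print(by_index, by_name, joined)
```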

------
Rangi42
Also not a fan. The two existing methods of string formatting are sufficient:

    
    
        "%s %s %s" % (a, b, a)
        "{} {} {}".format(a, b, a)
    

If you want placeholders to match variable names, you can do:

    
    
        "{a} {b} {a}".format(a=a, b=b)
        "{a} {b} {a}".format(**locals())
    

So this is just unnecessary (especially since the "f" prefix is easier to miss
than a "format" method):

    
    
        f"{a} {b} {a}"
    

And _this_ is downright obfuscated—putting operators inside of string
literals:

    
    
        f"{a + ' ' + b + ' ' + a}"

~~~
eugenekolo2
I certainly agree about your last statement, I have no clue what that would do
without reading the spec. I'd guess an error, but apparently not.
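
For reference, a short sketch of what it does on Python 3.6 or later; the variable values are made up.

```python
a, b = "spam", "ham"

# The expression between the braces is evaluated like any other Python
# expression, and its result is then formatted into the string.
obfuscated = f"{a + ' ' + b + ' ' + a}"
plain = f"{a} {b} {a}"
arithmetic = f"{len(a) * 2}"
print(obfuscated, plain, arithmetic)
```

So the concatenation version gives the same result as the plain one, just less readably.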

------
c3534l
At the very least, it sure looks a hell of a lot better. And using symbols and
single-character abbreviations for types (in a dynamically typed language,
mind you) and an overloaded % which has the extra syntactical rule that it
only takes a single argument and so multiple replacements have to be done by
throwing them all into a tuple is FAR more intuitive and readable than the way
most languages do string formatting. Plus the alternatives, that require you
bounce back and forth between the string and the variable it is replacing
rather than read left to right the way humans are meant to read strings of
text is a terrible design. Something as simple as string formatting should not
rely on arcane and nuanced rules which are more or less arbitrary.

You've all been programming in C-like languages for far too long to realize
what a horrible design string formatting is. You can argue over "explicitness"
all you want, the new way is easier to learn, easier to read, makes more
intuitive sense, requires learning fewer rules, and is close enough to the
format string method that they work well together.

------
losvedir
As an outsider to the python community and full time ruby dev, this
"controversy" baffles me every time it comes up. Lightweight string
interpolation is _obviously_ better! I think you all have string Stockholm
Syndrome or something.

The one counter argument that makes sense to me is that in general we _shouldn't_
be doing easy string interpolation, since that way lies SQL injection,
XSS, etc, and should instead rely on a stronger type system with binary text
blobs, HtmlStrings, SqlStrings, etc, with automatic escaping into and out of
the data type.
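
A minimal sketch of that injection risk, in Python since that is the language under discussion. It uses the standard `sqlite3` module with an in-memory database; the table, the row, and the hostile input are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

# Hypothetical hostile input.
first_name = "nobody' OR '1'='1"

# Interpolation splices the input into the SQL text itself.
unsafe_sql = f"SELECT name FROM users WHERE name = '{first_name}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# A bound parameter keeps the input as data, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (first_name,)
).fetchall()

print(unsafe_rows)
print(safe_rows)
```

The interpolated query returns every row, while the parameterized one returns none.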

But then that's not the case with Python now. If you're only trying to stick
this string inside that string in a quick and dirty manner, I totally don't
understand the reticence folks have to something the way ruby does it: "Name:
#{first_name}".

~~~
scrollaway
If it's _obviously_ better, would you care to share some better arguments than
"people who disagree have stockholm syndrome"?

Don't get me wrong, I'd _like_ Python to have better, more obvious, more
concise string formatting. However the last time we had this discussion, it
was about str.format() and how it was going to be awesome and don't worry
modulo-formatting will go away.

Turns out it did not; modulo formatting is still there because _why would it
be removed_. This is history repeating itself - are you actually _baffled_
that some people learn from past mistakes?

~~~
mixmastamyk
All you have to do is try it once or twice to enjoy the readability and lack
of positional mismatch bugs.

None of the other techniques are going away. This time yes, no one is naive
enough to think so.

~~~
scrollaway
"Try it!" is not an objective argument (I have tried it extensively eg. in
Bash and I neither like nor hate it), and positional mismatch is not a concern
when you use name-based syntax, which interpolation forces you into due to its
nature.

~~~
mixmastamyk
I've made numerous posts in this discussion, so not going to copy them again
here. The previous name-based syntax comes with significant redundancy and
noise.

As a ruby dev posted here, it is obviously better in most respects in most
common cases.

------
sauere
Not a fan. We already have 2 ways of formatting strings in Python, we really
should not bring in a third one.

It may be true that

    
    
        "{} {}".format(a, b)
    

is a bit verbose, but it is crystal clear and clean. Just remember the Python
Zen: "There should be one-- and preferably only one --obvious way to do it."
_and_ "Explicit is better than implicit."

~~~
mixmastamyk
It is the same thing as .format without all the noise.

~~~
Grue3
No, it's the same thing as .format, except limited to literals, so basically
useless (hardcoding literal strings is bad for i18n).

~~~
mixmastamyk
Useless... have you never used a shell? It's quite handy.

~~~
Grue3
If they're adding features purely for the convenience of python shell users,
how about, I don't know, remove the need for significant whitespace (by adding
"end block" statement). Because Python shell is a major pain to use when you
try to copy-paste some code into it and it doesn't like the indentation
levels. On the other hand I can copy-paste whatever into my Lisp REPL and it
will run just fine. Or allow several statements per lambda. Saving a few
keystrokes writing ".format" doesn't even register on the same scale of
annoyance.

~~~
mixmastamyk
I've been using Python heavily since about 2000.

Approximately 2.5 times since then has whitespace been a problem, which I
fixed in under 10 seconds each time. Yet the readability gains from removal
of block delimiters in that time frame are uncountable.

> Saving a few keystrokes writing ".format" doesn't even register on the same
> scale of annoyance

It isn't just .format, it is:

    
    
        .format(long_variable_name1=long_variable_name1, 
                long_variable_name2=long_variable_name2)
    

This is a huge win that should have happened long ago.

~~~
Grue3
But you don't need to use long variable names in Python shell! Even one-letter
vars are perfectly acceptable.

~~~
mixmastamyk
When I wrote "shell" I was thinking of interpolation of bash, etc.

------
syrusakbary
Quite useful! If you want to have string interpolation in Python 2.6+ just use
[https://github.com/syrusakbary/interpy](https://github.com/syrusakbary/interpy)

~~~
kseistrup
“They” could put string interpolation in the __future__ module:

from __future__ import string_interpolation

:)

------
soheil
Am I crazy, or has PHP had this feature since forever? String templates can
significantly improve programming speed, especially for those of us who are
keyboard impaired. Not having to remember how many %s's you need or if one of
them should be %d and so on may seem like nitpicking to most, but personally
it makes a day and night difference.

~~~
MereInterest
PHP has had it forever. Given PHP's reputation, I'm not sure if that is a good
argument for this feature. Rather than '%s' and '%d', why not just use '{}'
and '.format'? That way, each argument is converted to a string appropriately.
If it becomes confusing with many arguments, it is easy to name the parameters
'{myparam}'
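
A quick sketch of the difference, with made-up values:

```python
values = ("apples", 3, 2.5)

# {} calls format() on each argument, so no type code is needed.
with_braces = "{} {} {}".format(*values)

# %s also accepts anything, but %d rejects a non-number.
with_percent = "%s %s %s" % values
percent_d_error = None
try:
    "%d" % "apples"
except TypeError as exc:
    percent_d_error = type(exc).__name__

named = "{fruit}: {count}".format(fruit="apples", count=3)
print(with_braces, with_percent, named, percent_d_error)
```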

------
hisham_hm
> That actually looks pretty nice but as Python 3.6 is slated for release in
> another 12 months you will have to wait a little longer.

...Or just do it in 28 lines of Lua:

[http://hisham.hm/2016/01/04/string-interpolation-in-lua/](http://hisham.hm/2016/01/04/string-interpolation-in-lua/)

This is a nice showcase of how Lua's metamechanisms can be applied to do
things that often require new features in other languages.

~~~
kedean
I'd argue that it's not a good thing that you can just disregard block scoping
willy-nilly from within unprivileged code.

~~~
hisham_hm
If you are in a context where you are concerned about "privileged vs.
unprivileged" code, you have to sandbox way more than that, and Lua provides
you many mechanisms to control that (more than any other mainstream scripting
language). Disabling the "debug" library which I used there for traversing
block scopes is the very first thing one does when sandboxing in Lua.

~~~
kedean
At which point you can no longer use this fmt construct, so what was the
reason for having it?

~~~
hisham_hm
You can, privately within the module.

------
pvdebbe
I for one welcome the shorthand interpolation. String formatting is the thing
we all do all day long (slightly exaggerated pretty much).

Now if python would begin to support immutable values by default then I'd be
most content, and Python complete enough.

------
jdnier
+1 for [https://pyformat.info/](https://pyformat.info/) (mentioned in
article), with some nice practical examples of equivalent %s and .format()
formatting.

------
rshaban
_There should be one-- and preferably only one --obvious way to do it._

Hoping for a decision from the core team – having both f"" and .format is a
pretty clear deviation from this principle.

~~~
noobermin
Well, having both % and .format was already a clear deviation from it.

~~~
btmiller
Though I think the communication has been pretty clear that developers should
stop using % formatting in favor of .format when moving to Python 3. Not so
much a deviation, just a migration. Now that there's another alternative,
there definitely needs to be some clear communication about what's going to be
idiomatic going forward.

~~~
warbiscuit
Except that they went and made .format() useless for bytes... which made
everyone have to hang on to % for both 2/3 compatibility cases, and for all
the cases where bytes templates were actually needed (all kinds of low-level
wire protocols and file storage formats).
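
A small sketch of the bytes situation, assuming Python 3.5 or later, where %-formatting on bytes is available (PEP 461). The header layout is a made-up example, not a real protocol.

```python
# bytes objects have no .format() method in Python 3.
has_bytes_format = hasattr(bytes, "format")

# %-formatting on bytes works on Python 3.5 and later (PEP 461).
# The header layout here is a made-up example, not a real protocol.
header = b"%s %d\r\n" % (b"LEN", 42)

# The str route needs an explicit encode step.
via_str = "{} {}\r\n".format("LEN", 42).encode("ascii")
print(has_bytes_format, header, via_str)
```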

------
sandGorgon
The rift between Python 3 and Python 2 seems to be a fallout of the
one-true-way philosophy. In fact, it almost stands to reason that Python 3.6
ought to be Python 4 if one-true-way needs to be upheld.

If Python 3.6 is going to introduce multiple ways to do the same thing, there
is no good reason to not merge Python 2 and 3 together and have both sets of
behavior co-exist with each other (__future__ or __past__).

In one shot, you break the Berlin wall of Python.

~~~
sametmax
Apparently you haven't been using Python 3 much. It reduces the number of
ways to do stuff a lot.

    
    
        - class stuff(object) vs class stuff;
        - range vs xrange;
        - itertools.izip vs zip;
        - itertools.imap vs map;
        - itertools.ifilter vs filter;
        - dict.items vs dict.iteritems vs dict.viewitems;
        - dict.values vs dict.itervalues vs dict.viewvalues;
        - dict.keys vs dict.iterkeys vs dict.viewkeys;
        - __cmp__ vs __eq__ + __gt__;
        - sorted(cmp) vs sorted(key);
    

They didn't fallout from the philosophy. They have been pragmatic and tried to
balance the language design : gaining modern features vs making a robust base
vs pleasing the legacy crowd. It. Is. Very. Hard.

And yes, we would all prefer to have fewer ways to format. Would it mean I would
prefer NOT to have f-strings? Certainly not, it's a great feature. We can't
live in the past just because the present doesn't live up perfectly to the
ideal we have.

Real life is not ideal.

But merging Python 2 and 3 ?

With the string model completely reworked, that would be apocalyptic. Most
people don't realize how deep the unicode change has been.

I have been developing in and teaching Python 2 and 3 for years. The amount of problems
linked to UnicodeDecodeError dropped by 90% after the switch.

Not because Python 2 model didn't work.

Because nobody understands text.

Most devs don't understand what text is. They just want to format strings.
That's what Python 3 helps to do, and it does it well.

Mixing both would be like mixing olive oil and vanilla ice cream. Great on their
own, but use them together and you'll get a terrible meal.

~~~
sandGorgon
At no point am I arguing whether python 3 is better or worse than python 2.
That is an uninteresting conversation because the fact remains that less than
30% of all software is in python 3. Even when Google released Tensorflow, it
was in Python 2 (and they employ core developers of Python).

Do you seriously think anybody is thinking of dropping python 2 support by
2020? That will only create a fork. None of the core frameworks have upgraded
in a decade. Look at Flask for example. So yes, I haven't been using Python 3
much - there has been no reason to.

The point I'm trying to make is how to get everyone on the same page. The
reason Python 3 was API-incompatible was the core tenet of one-true-way.
All the functions you mentioned may be superior, but unless you give
people a way to mix and match both in the same source, you will not have
adoption.

Or do you think Python 3 adoption has been successful ?

------
bootload
_"PEP-0498 tries to improve this situation by offering something that has
been common to other languages like Ruby, Scala and Perl for quite some time:
Interpolated strings."_

P3 _'.format'_ is fine; the only problem I have is forgetting the last ')' and
vim picks this up. Is interpolation that good to introduce another way of
doing things?

------
blubb-fish
> by offering something that has been common to other languages like Ruby,
> Scala and Perl for quite some time

So funny ... the most prominent language known for this stuff is PHP ... and
missing :D

------
such_a_casual
I never understood why I had to write (count=count, apples=apples, name=name)
when a logical default would suffice.

~~~
Walkman
I suppose you mean to take the variable names of the arguments without using
keyword arguments? That's not possible because .format() is just a simple
method call on str/unicode objects.

~~~
such_a_casual
>That's not possible

That statement is against my religious beliefs. Apparently we can stick an f
in front of the string, but we can't have default values that work with
locals()?

~~~
mixmastamyk
That would mean the formattee would have to dig thru its surrounding
namespace. This solution is more elegant, the string is transformed into the
equivalent format call at compile time, making the implementation tiny.
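
To show what "digging through the surrounding namespace" would look like in practice, here is a sketch. `fmt_locals` is a hypothetical helper, and it relies on frame inspection, which is CPython-oriented and not something I would treat as portable.

```python
import inspect


def fmt_locals(template):
    """Hypothetical helper that formats using the caller's local variables.

    Relies on frame inspection, which is CPython-oriented; on other
    implementations inspect.currentframe() may return None.
    """
    caller = inspect.currentframe().f_back
    return template.format(**dict(caller.f_locals))


def greet():
    count, name = 3, "alice"
    return fmt_locals("{name} has {count} apples")


greeting = greet()
print(greeting)
```

It works, but the callee has to reach into a frame it does not own, and a static tool cannot tell which locals the template uses.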

~~~
such_a_casual
Elegant for who?

Adding a new syntax that doesn't exist in any language.

vs.

Using sensible default parameters.

\-----------------

An argument for why my suggestion would be bad:
[https://www.python.org/dev/peps/pep-0498/#no-use-of-
globals-...](https://www.python.org/dev/peps/pep-0498/#no-use-of-globals-or-
locals)

~~~
mixmastamyk
> Adding a new syntax that doesn't exist in any language.

Not true, look into the new interpolation features in C#, Scala, JS, Swift,
nim, etc.

------
evrial
Just because simply one way isn't enough.

