
Ruby-like string interpolation in Python - est
https://github.com/syrusakbary/interpy
======
berdario
If the only thing you want is to use the {identifier} syntax, without having
to repeat `identifier` more than once, the simplest alternative is probably:

    
    
        >>> package = "foo"
        >>> whatever = "bar"
        >>> "Enjoy {package}".format(**locals())
        'Enjoy foo'
    

(disclaimer: I wouldn't actually use locals() like this in real code)

Also, if this thing is truly mangling bytecode, it's not portable between
different python versions

But I'm not quite sure about that: I only skimmed the codebase, and that work
seems to be done by interpy_untokenize, which boils down to some string
mangling

Also, having expressions (or worse, statements... like it would be in ruby
since there's no difference there) evaluated when evaluating a string is quite
bad (this is not Haskell, and thus we cannot have guarantees that side effects
won't happen)

Nice hack, btw

~~~
rraval
Note that `locals()` actually does funky things around enclosing scopes.

Consider the following Python2/3 code:

    
    
        def outer():
            x = 1
            def inner():
                print(x)
                print("x = {x}".format(**locals()))
            inner()
    
        outer()
    

This actually prints the right thing when run:

    
    
        1
        x = 1
    

However, if you remove the `print(x)` line, neither Python 2 nor 3 hoists
the enclosing `x` into `locals()` (since the compiler can't see a single usage
of that variable inside `inner`), resulting in a `KeyError`:

    
    
        Traceback (most recent call last):
          File "blah.py", line 7, in <module>
            outer()
          File "blah.py", line 5, in outer
            inner()
          File "blah.py", line 4, in inner
            print("x = {x}".format(**locals()))
        KeyError: 'x'
    

Python 3 has the `nonlocal` keyword that you can use to indicate that you're
using a variable from an enclosing scope so that it's properly introduced into
`locals()` but Python 2 doesn't have this facility.
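
A minimal sketch of the three cases side by side, assuming CPython 3 (the helper names `unused`, `referenced` and `declared` are made up for illustration):

```python
def outer():
    x = 1

    def unused():
        # x is never mentioned here, so it is not captured by this function
        return sorted(locals())

    def referenced():
        str(x)  # merely reading x makes it a free variable of this function
        return sorted(locals())

    def declared():
        nonlocal x  # the declaration alone is enough to capture x
        return sorted(locals())

    return unused(), referenced(), declared()


print(outer())
```

Only the last two functions should report `x` among their locals.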

~~~
wylee
That doesn't seem funky to me, given that `x` in your example is in fact not
local to `inner`. It may seem surprising, but it's consistent with how Python
handles locals in general.

~~~
rraval
I called it funky because `x` is never local to `inner`, but gets an entry
inside `locals()` if `inner` uses it implicitly or explicitly with `nonlocal`.

Non-funky behaviour would be to have an `enclosing()` that walks up the outer
functions and returns the union of their `locals()`. Alternatively, a `vars()`
which expands to all variables in scope (respecting LEGB) would be best given
the context of variable interpolation.

------
rpcope1
This is certainly interesting, but I think it kind of breaks with the "only
one obvious way to do things," which Ruby tends not to follow (and with
regards to string formatting, Python doesn't either :P ). Does this buy you
anything beyond trying to ramrod Ruby syntax into Python? If this truly
compiles down to byte code that's doing string concatenation in the Python VM,
the built-in string formatting library tends to be a non-trivial amount
faster, beyond trying to write Ruby in a language that isn't Ruby.

~~~
methodover
Well... String formatting already sorta breaks the "only one obvious way to do
something" rule. I dunno about you, but we have string formatting all over the
place in our code. I would wager that every single module in our codebase has
something that looks like this:

    
    
      cache.add_message("Hey, your {zoop} is {boop}".format(zoop=zoop, boop=boop))
    

That call to "format" just kind of sucks -- it repeats what's already
pretty obvious by looking at the string. It's especially crappy when you have
strings that need lots of variables. We've taken to doing this recently, which
I'm usually okay with:

    
    
      cache.add_message("Hey, your {} is {}".format(zoop, boop))
    

The downside of that is you have to make sure that the order of the arguments
matches exactly with the order of the empty brackets. It's kinda error
prone... But generally not that big of a deal.

We could also do this...

    
    
      cache.add_message("Hey, your %s is %s" % (zoop, boop))
    

Or this...

    
    
      cache.add_message("{1} alert! Hey, your {0} is {1}".format(zoop, boop))
    

So yeah, I would argue that string formatting in Python is ALREADY in a kinda
nasty place. There's ALREADY a bunch of ways to do it, and it all just kinda
sucks.

IMO, Ruby-style string formatting is probably the nicest I've seen. If it were
in Python, it absolutely would be THE way to do string formatting, I bet.

    
    
      cache.add_message("Hey, your {zoop} is {boop}")
    

So much nicer.

~~~
dragonwriter
Actually, I kind of like the existing Python ways better in many respects than
the Ruby one, since they let you separate format strings from the parameters
passed to them, and reuse format strings more easily (which is especially
useful if you want to move string literals out of code and into resources,
because then you can include format strings in that as well.)

~~~
stouset
You can do this in Ruby too, with sprintf. It's almost as if there are
multiple, complementary ways of accomplishing similar tasks, depending on your
needs.

~~~
moe
You don't even need sprintf, Ruby supports Python-style interpolation, too:

    
    
      bar='batz'  
    
      "foo #{bar} #{0.5+0.5}"
       => "foo batz 1.0"
    
      "foo %s %.2f" % [bar, 1.0]
       => "foo batz 1.00"

~~~
stouset
% is an alias for sprintf. :)

------
philh
String interpolation is one of the things that I've never understood about
python. I'm not a massive fan of Ruby's particular syntax for it[1], but not
having any syntax at all feels like such a massive oversight. And then I
realize that it's probably a deliberate omission and that just seems _really
really weird_ to me.

So I'm happy to see this, even if I'm probably too conservative to use it in
my day job. And I didn't know about the coding: thing, and it looks like this
method could also be used on my other python-wtf, which makes me even happier.

(My other python-wtf is that there really ought to be nicer syntax for a['b'].
For a while I thought that a::b would be nice, but then I remembered that that
could be a slice, so it can't be parsed reliably. a$b is probably my next
choice. Or even require that kind of slice to be written with a space or
something, like "a: :b".)

[1] Requiring braces even for a simple variable name seems like a poor
decision. There's a little-known language called Haxe which IIRC gets it
right: you can embed variables with just "hello $foo", or expressions with
"your score is ${kills-deaths}". I get that Ruby allows unusual characters in
variable names, and it's not obvious whether "is this yours, #name?" means
#{name?} or #{name}?. But I'd rather have that potential for confusion than
force the braces even when there's no ambiguity.

~~~
toupeira
There's also a very well-known language called PHP that does the same ;-)

~~~
philh
Are you sure?
[https://php.net/manual/en/language.types.string.php#language...](https://php.net/manual/en/language.types.string.php#language.types.string.parsing)
suggests that it can only interpolate a few specific types of expression, but
can't do e.g. arithmetic or simple function calls. (It seems you can
interpolate a variable named by a function call, but you can't interpolate a
function call itself.)

------
ForHackernews
You can already _almost_ do this in Python, with the native string format
operation:

    >>> name = "Foo Bar"
    >>> age = 25
    >>> "Hi, my name is {name} and I'm {age} years old.".format(**locals())
    "Hi, my name is Foo Bar and I'm 25 years old."

Arguably, this is an abuse of `locals()`, but it gets you very nearly the same
kind of use-variables-in-strings-with-curly-braces functionality.

Edit: HN markdown doesn't seem to let you escape the star italics operator. To
be clear, you have to double-star splat locals().

~~~
maxerickson
Repeating my similar comment, str.format_map takes a dictionary directly
(available since 3.2).
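
For illustration, a small sketch of that on Python 3.2 or later (the function name `introduce` is just an example):

```python
def introduce(name, age):
    # format_map takes the mapping itself, so no ** unpacking is needed
    return "Hi, my name is {name} and I'm {age} years old.".format_map(locals())


print(introduce("Foo Bar", 25))
```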

------
ekimekim
I wrote a version of this some time ago:

[https://github.com/ekimekim/pylibs/blob/master/libs/interpol...](https://github.com/ekimekim/pylibs/blob/master/libs/interpolate.py)

It's not quite as natural as your one:

    
    
        def foo(x):
            print interpolate("Hello, {x}")
    

Though I do actually prefer having the explicit formatting call there so I
know _when_ the interpolation is being performed. Side effects and all that.
In a perfect world, this is the syntax I'd prefer:

    
    
        def foo(x):
            print "Hello, {x}".format()
    

ie. a format() without args defaults to "all variables accessible in the
current scope". I wouldn't actually want it to support arbitrary python the
way ruby does, I find the .format() syntax flexible enough.

(Also, my current implementation is for locals only. It wouldn't be hard to
extend to globals, but would suffer the "nonlocals won't be captured" problem
described in other comments here no matter what)

(EDIT: Also, it relies on sys._getframe, which is CPython specific)
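
To give a rough idea of the technique, here is a minimal sketch of a frame-based `interpolate`. It is not the code from the linked module, just an assumption about how such a helper can be written, and it has the same limits: CPython only, and enclosing-scope variables that the caller never touches are not visible.

```python
import sys


def interpolate(template):
    # Look one frame up the stack to find the caller's namespace.
    caller = sys._getframe(1)
    scope = dict(caller.f_globals)
    scope.update(caller.f_locals)  # locals shadow globals
    return template.format_map(scope)


def greet(x):
    return interpolate("Hello, {x}")


print(greet("world"))
```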

------
BetaMechazawa
I guess I'm missing the point. Why not use "Welcome %s to %s" % (who, place)

~~~
PythonicAlpha
This special syntax should be discouraged for any strings that will be
translated in the future. Different languages have different word order, so
the order of the arguments can change. With this kind of positional syntax,
you will be in big trouble very soon!

~~~
nostrademons
For anything translated, you'd use the (admittedly gross, fixed in Python3)
"Welcome %(who)s to %(place)s" % { 'who': who, 'place': place }. The Python3
version is "Welcome {who} to {place}".format(place=place, who=who), or if you
want to be un-idiomatic and unsafe,
"Welcome {who} to {place}".format(**locals())
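
As a sketch of why named fields help translators (the `TEMPLATES` table and the German wording are made up for the example), each language can place the fields in whatever order it needs:

```python
# Hypothetical message catalogue: the same fields, in a different order per language.
TEMPLATES = {
    "en": "Welcome {who} to {place}",
    "de": "Willkommen in {place}, {who}",
}


def welcome(lang, who, place):
    # The call site stays the same no matter how the translation orders the fields.
    return TEMPLATES[lang].format(who=who, place=place)


print(welcome("en", "Ann", "Berlin"))
print(welcome("de", "Ann", "Berlin"))
```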

------
mcbetz
Can anyone explain the creators' reasoning behind the existing string
interpolation in Python? I never thought it was cumbersome, on the contrary I
liked the verbosity of "My {name} is, I am {years} old".format(name=name,
years=years) and I think there are good reasons (beyond dragonwriter's
reusability argument).

~~~
aetherson
So, like, just to point out, your example would be, with name Adam and age 10:

    
    
      "My Adam is, I am 10 old."
    

Correcting your example:

    
    
      "My name is {name}, I am {years} years old".format(name=name, years=years)
    

So to throw that out, that one line includes the word "name" four freaking
times, and years four freaking times. You say you like the verbosity of it.
Why? Would you like this format yet more?

    
    
      "My name is {name=name}, I am {years=years} years old".format(name=name, years=years)
    

If not why not? It's yet more verbose.

I think that the reason that other people like the non-verbose format of:

    
    
      "My name is {name}, I am {years} years old" # assuming the presence of local variables "name" and "years"
    

Is that, well, it's pretty obvious what's going on here, and repeating name
and years a bunch more times does not, it seems, make it any more clear what's
going on.

A reasonable argument might be that:

    
    
      "My name is {name}, I am {years} years old".format()
    

Is more clear about what's going on. But repeating the variable names is not
particularly elucidating.

~~~
mcbetz
Well, I did not want to defend the way .format() works now. That's why I asked
for the reasoning. Your reasonable argument sounds reasonable to me, I would
certainly use it. But then my initial question is even more important: Why is
this not the one Python way. What did the initial designers think?

------
mkesper
How is this optimized if it compiles to a + "" + b?

~~~
mesozoic
It's not

~~~
andybak
Are we sure?:
[http://stackoverflow.com/questions/3055477/how-slow-is-pytho...](http://stackoverflow.com/questions/3055477/how-slow-is-pythons-string-concatenation-vs-str-join)

~~~
Luyt
Some of the examples in that SO post are outdated. However, list joining is
faster than string concatenation, but not by much. Assembling a 110 MB string:

    
    
        from timeit import Timer
        try:
            from StringIO import StringIO
        except ImportError:
            from io import StringIO
    
        nr = 1200000
        data = "The Quick Brown Fox Jumps Over The Lazy Dog: Woven silk pyjamas exchanged for blue quartz.\n"
    
        # construct a list first, then join
        def dolist():
            s = []
            a = s.append
            i = 0
            while i < nr:
                a(data)
                i += 1
            s = "".join(s)
            print("%s chars (joined list)" % len(s))
    
        # string concatenation fest
        def dostr():
            s = ""
            i = 0
            while i < nr:
                s += data
                i += 1
            print("%s chars (string concatenation)" % len(s))
    
        # use a string as a file
        def dostringio():
            buf = StringIO()
            w = buf.write
            i = 0
            while i < nr:
                w(data)
                i += 1
            s = buf.getvalue()
            print("%s chars (cStringIO)" % len(s))
    
        if 1:
            tlist = Timer("dolist()", "from __main__ import dolist")
            print("the joined list took %.2f seconds" % tlist.timeit(2))
    
            tstr = Timer("dostr()", "from __main__ import dostr")
            print("the concatenation fest took %.2f seconds" % tstr.timeit(2))
    
            tlist = Timer("dostringio()", "from __main__ import dostringio")
            print("the cStringIO approach took %.2f seconds" % tlist.timeit(2))
        else:
            @profile
            def callall():
                # For use with a profiler (eg, kernprof.py/lineprof)
                for i in range(2):
                    dolist()
                    dostr()
                    dostringio()
            callall()
    
        Result:
    
        (user@air) /Users/user/Prj/python $ python3 stringplakbenchmark.py
        109200000 chars (joined list)
        109200000 chars (joined list)
        the joined list took 1.12 seconds
        109200000 chars (string concatenation)
        109200000 chars (string concatenation)
        the concatenation fest took 1.76 seconds
        109200000 chars (cStringIO)
        109200000 chars (cStringIO)
        the cStringIO approach took 1.45 seconds
    
        (user@air) /Users/user/Prj/python $ python2.7 stringplakbenchmark.py
        109200000 chars (joined list)
        109200000 chars (joined list)
        the joined list took 0.99 seconds
        109200000 chars (string concatenation)
        109200000 chars (string concatenation)
        the concatenation fest took 1.33 seconds
        109200000 chars (cStringIO)
        109200000 chars (cStringIO)
        the cStringIO approach took 5.21 seconds
    
        (user@air) /Users/user/Prj/python $ python2.6 stringplakbenchmark.py
        109200000 chars (joined list)
        109200000 chars (joined list)
        the joined list took 0.95 seconds
        109200000 chars (string concatenation)
        109200000 chars (string concatenation)
        the concatenation fest took 1.39 seconds
        109200000 chars (cStringIO)
        109200000 chars (cStringIO)
        the cStringIO approach took 5.54 seconds

------
languagehacker
Something about this seems patently unpythonic. Also, the way the code is
included reminds me of monkey-patching, which is a Ruby behavior I like to
leave at the door when I'm coding in Python.

------
est
Another gem is pyxl from Dropbox

[https://github.com/dropbox/pyxl](https://github.com/dropbox/pyxl)

Reminds me of JSX/E4X/XML Islands, etc.

~~~
alanfranzoni
Interesting and clever approach (even though, as I see, pyxl came first at
that).

Just a doubt: how do I specify source file encoding if the coding string is
now hijacked for interpolation purposes? Is there a default, fixed encoding
(which I hope is not iso-8859-1, python's own default)?

~~~
est
> Is there a default, fixed encoding

For interpy, I think it assumes utf-8 by default.

[https://github.com/syrusakbary/interpy/blob/master/interpy/c...](https://github.com/syrusakbary/interpy/blob/master/interpy/codec/register.py#L21)

------
b6fan
The `codecs.register` approach is interesting. I wish Ruby had something
similar. Unfortunately I could not find such a thing in Ruby.

~~~
est
Yup that's the point of my submission, too bad HN edited my title so it looks
like a boring trick. It allows you to transform your code without significant
speed loss.

Think of possibilities like

# coding: JIT

or

# coding: inline-C

etc.
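
A toy sketch of the mechanism, assuming Python 3. The codec name `toy_unless` and its rewrite rule are invented for this example, and it only shows the decode step. Wiring it into real imports would additionally need the codec to be registered before the source file is read, which is the part interpy handles.

```python
import codecs


def toy_decode(data, errors="strict"):
    # Decode as UTF-8 first, then rewrite the made-up "unless" keyword.
    raw = bytes(data)
    text = raw.decode("utf-8", errors)
    return text.replace("unless ", "if not "), len(raw)


def search(name):
    # Codec search functions receive the requested encoding name
    # and return a CodecInfo, or None if the name is not theirs.
    if name == "toy_unless":
        utf8 = codecs.lookup("utf-8")
        return codecs.CodecInfo(name="toy_unless", encode=utf8.encode, decode=toy_decode)
    return None


codecs.register(search)

print(b"unless done: work()".decode("toy_unless"))
```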

~~~
maxerickson
The typical way to do that is to reuse the interpreter infrastructure. This
hack could be done with the ast:

[https://docs.python.org/3.5/library/ast.html#ast.NodeTransfo...](https://docs.python.org/3.5/library/ast.html#ast.NodeTransformer)
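
A rough sketch of what that could look like, assuming Python 3.8+ (the `Interpolate` class and `run_interpolated` helper are hypothetical names). It wraps every string literal containing a brace in a `.format_map(locals())` call, which is deliberately naive:

```python
import ast


class Interpolate(ast.NodeTransformer):
    def visit_Constant(self, node):
        # Rewrite "text {name}" into "text {name}".format_map(locals()).
        if isinstance(node.value, str) and "{" in node.value:
            locals_call = ast.Call(
                func=ast.Name(id="locals", ctx=ast.Load()), args=[], keywords=[]
            )
            call = ast.Call(
                func=ast.Attribute(value=node, attr="format_map", ctx=ast.Load()),
                args=[locals_call],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node

    def visit_JoinedStr(self, node):
        # Leave f-strings untouched; their pieces are not ordinary literals.
        return node


def run_interpolated(source):
    tree = Interpolate().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<interpolated>", "exec"), namespace)
    return namespace


ns = run_interpolated('name = "world"\ngreeting = "Hello, {name}"\n')
print(ns["greeting"])
```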

~~~
est
AST is cool, but you have to comply with Python's syntax.

With the coding approach, you can invent any wild syntax.

~~~
maxerickson
I guess my first approach to that would not involve overloading the import
statement.

------
BerislavLopac
There are also Python template strings:
[https://docs.python.org/2.7/library/string.html#template-str...](https://docs.python.org/2.7/library/string.html#template-strings)
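
A short example of how they look in use (the values are made up):

```python
from string import Template

# $name for simple identifiers, ${name} where the boundary needs to be explicit
template = Template("Welcome $who to ${place}")
message = template.substitute(who="Ann", place="Oslo")
print(message)
```

Only plain names are substituted, not expressions.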

------
volent
Looks fun but it doesn't seem to support unicode in python2.

