
Navigating the Postmodern Python World - freyrs3
http://www.stephendiehl.com/posts/postmodern.html
======
bsaul
i've been coding in python professionally for 4 years now, and i'm currently
working on the biggest project i've ever worked on. to me, being scared of
changing the signature of a function because the static analyser will not be
able to spot all the places i've used this function is a real problem ( along
with incomplete autocomplete ). i do unit test everything, but i'd like to
keep unit tests for things a computer cannot theoretically do.

since i'm still in the early phase of the project, i know that python's
expressiveness is an edge, but i'm looking right now at what's going to be the
"definitive" language i'm going to rebuild my product in for the next 3 to 4
years.

Python badly needs optional typing. really. i'm pretty sure that would solve
both the speed and tooling issues. right now, for me, it starts to become
unsuitable as soon as you reach 5-10k lines of code and a team of 2.
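
For illustration, a minimal sketch of what optional type declarations can look like, using Python 3 function annotations. The `Invoice` class and `total_due` function are hypothetical names, and the annotations are only hints here: the interpreter does not enforce them, so a separate static checker would be needed to catch a mismatched call site.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Invoice:
    # Hypothetical domain object, used only for this example.
    amount_cents: int
    paid: bool = False


def total_due(invoices: List[Invoice], discount_cents: int = 0) -> int:
    """Sum the unpaid invoices, minus an optional discount."""
    unpaid = sum(inv.amount_cents for inv in invoices if not inv.paid)
    return max(unpaid - discount_cents, 0)


# The declared signature is available to tools through introspection.
print(total_due.__annotations__)
```

Because the declarations are optional, untyped code keeps working unchanged; only the functions you annotate become checkable.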

~~~
spamizbad
I'm surprised you're hitting that boundary at just 5-10K lines. I work on a
team of 6* with about 60,000 logical Python statements (not including tests)
and we aren't running into any language- or framework-induced roadblocks.

And this isn't even a case of "If you do everything perfect like me,
[Language] works great!" - We only have 45% code coverage, and our code
architecture, in certain places, is really sub-optimal (caused by us, not the
language).

When I'm about to make a major change to a function (different inputs or
different outputs) I'll always start by grepping first to get an idea of what
I'm about to get myself into, write or adjust tests for the new 'signature',
change the function, and code/debug/test until it works. My team is in the
loop before my code is committed to a shared branch.
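
As a complement to grep, here is a rough sketch of finding call sites with the standard-library `ast` module. `find_calls` and the sample source are hypothetical; it only matches calls by name, so it will miss calls made through aliases, `getattr`, or other dynamic dispatch.

```python
import ast


def find_calls(source: str, func_name: str):
    """Return (line, column) pairs for calls to func_name in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        # Plain call: func_name(...)
        if isinstance(target, ast.Name) and target.id == func_name:
            hits.append((node.lineno, node.col_offset))
        # Attribute call: something.func_name(...)
        elif isinstance(target, ast.Attribute) and target.attr == func_name:
            hits.append((node.lineno, node.col_offset))
    return sorted(hits)


SAMPLE = """\
import billing

def checkout(cart):
    total = compute_total(cart)
    return billing.compute_total(cart, tax=True)
"""

for line, col in find_calls(SAMPLE, "compute_total"):
    print(f"line {line}, column {col}")
```

Unlike a text search, this skips matches inside comments and strings, though it still cannot tell two unrelated functions with the same name apart.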

* We have other responsibilities besides the Python code.

Edit: If your team is stepping all over each other at such a low LOC, my hunch
is that it's related to a communication problem within the team, your code
being too tightly coupled, or there not being enough architecture planning
(too much is bad, but so is too little).

~~~
bsaul
I guess it really depends on what you're building. Just to give you a taste
here's the kind of thing i'm doing:

- comparing tree-structured, sqlalchemy-managed objects, generating diffs,
then applying those diffs to the trees and persisting everything. I'm using
sqlalchemy's declarative approach.

That diff applying is performed in a celery background task, reusing my flask
configuration.

So, in the worst case, i have to deal at the same time with:

- business logic on a bunch of SQLAlchemy ORM objects (declarative approach)
- a Flask request context
- a celery task context
- and a sqlalchemy session

At that point, i'm changing the signature of a function that takes pieces of
those three parts to perform some business logic. Now i'm telling you that the
IDE (PyCharm, the best one) and python's "compile phase" don't give you a CLUE
about what you're doing.

You're dealing with so much "magic" that it becomes unmanageable. You don't
need that many lines of code to reach that point.

EDIT : you've got 6 people working on 60K LOC spread on 8 discrete apps. That's
about the same amount of isolated LOC per person as me (a person having to
deal with a group of 5-10K LOC)

~~~
aidos
I'm not sure the complexity breaks down that way. More people + more code
means things need to be better organised because things might change under you
without warning (or you may have to work with an unfamiliar part of the
system).

How is your app structured? I'm guessing that Flask is the top level glue and
everything else is scattered around the Flask app. That's the general approach
(in most modern MVC frameworks) and I think it's also the root cause of
complexity.

Celery and Flask (and sql alchemy too) should really be asides to the main
codebase. The code should be layered, with discrete libraries handling
different parts of the system. If you have 6k loc that all cross reference one
another then you have problems in any language. Presumably there are a number
of different components in there. Each should stand on its own with as simple
an api as possible. As ever, too much coupling is going to make it impossible
to reason about your code.

If you're about to change a signature for a function, it should already be
fairly obvious as to where it is called from. If not, you need to ask yourself
why. What is this function that's so fundamental to the system that it could
be called by any module? Why is it buried in another module and being accessed
from elsewhere?

My current app has about 4k loc in python and the same again in js (angular).
It's broken into dozens of parts that I only connect where needed through a
simple api.

At the core is a sort of image processing library (that itself contains lots
of different components). On top of that is a system that works with the image
processing. Above that another system that interacts with the data models,
uses the system below and farms out processing to picloud (though could use
celery). Finally, the Flask layer just provides a web interface to talk to the
system that handles that business processing. I can tap into any of those
layers to drive them. The point is that I can operate at a high level without
needing to consider any details of deeper parts of the system.

These are the layers of abstraction that make a system understandable and stop
it from being brittle.
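
A toy sketch of that layering, with every name hypothetical: a core function that knows nothing about the web, a business layer that uses it, and a thin adapter standing in for the Flask view. No Flask import is needed to exercise the lower layers.

```python
from typing import Dict, List


# Core layer: pure logic, no framework imports.
def brightness(pixels: List[int]) -> float:
    """Mean value of a flat list of grayscale pixels."""
    return sum(pixels) / len(pixels) if pixels else 0.0


# Business layer: builds on the core layer only.
def classify_image(pixels: List[int], threshold: float = 128.0) -> str:
    return "light" if brightness(pixels) >= threshold else "dark"


# Web layer: a thin adapter that would sit inside a Flask view.
def handle_request(payload: Dict[str, List[int]]) -> Dict[str, str]:
    return {"label": classify_image(payload.get("pixels", []))}


print(handle_request({"pixels": [200, 220, 180]}))
```

Each layer only calls the one directly below it, so a signature change in `brightness` has exactly one caller to update.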

~~~
bsaul
The problem lies in the interfaces. Components, even when they are
independent, expose interfaces. Those interfaces act as a contract between
the component and its users.

Python needs a way to make those interfaces automatically verifiable.

It's even worse once you start to use big libraries. If you're using top level
functions, then maybe your IDE can help you, but as soon as you're dealing
with magical properties or parameters, that becomes a mess.

Take for example the "desc" magical function in SQLAlchemy, on things like
"order_by" on relationships. That's extremely useful and clever, but i'd
really like python to give me some "ok, you're not doing things wrong" message
as I'm typing.

Even better, once i enter a relationship declaration, it should give me a list
of all the parameters i can use, along with the values those parameters
accept. This way i wouldn't have to check the documentation every time i'm
writing one. It could also let me discover new things as i'm typing ("hey
what's that property doing ? that looks interesting..."). Autocompletion is
another way of discovering APIs.

But for that, you need type declarations.

EDIT : as for my api, it's really nothing fancy. It's structured in three big
parts "admin / common / public". They each have their "model / business /
service" layers, and each have their modules. Only the service layer is
impacted by flask. I have some "utils" modules for very low-level stuff (json
serialization, etc). Flask configuration is used a bit everywhere, because i
want my api to have only one configuration file. Nothing special, really.

~~~
boothead
It's not a total solution but I find the zope.interface and zope.component
libraries really good for this. I believe that zope.component provides some
stuff to build unit tests that verify if an interface is correctly
implemented/provided too.
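
As a rough standard-library stand-in for the same idea (this is `typing.Protocol`, not the zope.interface API), a unit test can check that an object provides the expected methods. `Repository`, `InMemoryRepository`, and `Broken` are hypothetical names. Note that `isinstance` against a runtime-checkable protocol only checks that the methods exist, not their signatures.

```python
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class Repository(Protocol):
    """The contract a storage component is expected to provide."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class InMemoryRepository:
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class Broken:
    # Missing put(), so it does not satisfy the contract.
    def get(self, key: str) -> Optional[str]:
        return None


print(isinstance(InMemoryRepository(), Repository))
print(isinstance(Broken(), Repository))
```

The two `isinstance` checks are the kind of assertion that could live in a unit test for each component that claims to provide the interface.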

------
jefftchan
(off-topic)

Love the contact page [1] of this website. What are some other effective
filters?

[1]
[http://www.stephendiehl.com/pages/hire.html](http://www.stephendiehl.com/pages/hire.html)

~~~
duggan
Hah, clever. Bonus points if you don't need to run the code to figure it out
(since the cipher is trivial and the ciphertext isn't particularly well
obfuscated in the Haskell version).

~~~
alanctgardner2
The best part is that it's still his name at a common domain. I didn't bother
with the code, but you can pick the common letters and match the positions to
be pretty confident. The hardest part is his middle initial.

edit: To be more on-topic, Dwolla had a pretty challenge for their hackathon
last year: [http://venturebeat.com/2012/06/29/dwolla-etsy-join-up-for-
ny...](http://venturebeat.com/2012/06/29/dwolla-etsy-join-up-for-nycs-first-
ecommerce-hack-day/)

~~~
brdrak
Funny, I also didn't need the code to translate the scrambled address. I knew
his name so that right away gave me most of it:

    tufqifo.n.ejfim@hnbjm.dpn
    stephen.?.diehl@?????.???

since m=l in the name, I assumed the same for the domain; considering it's the
most popular email service, gmail.com was my first guess.

since n=m in the domain name, I assumed the same for the middle initial, which
gave away the whole thing.
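
The substitution described above (m stands for l, n stands for m) amounts to shifting each letter back by one. A small sketch, using a hypothetical `unshift` helper and an unrelated sample string:

```python
import string


def unshift(text: str, shift: int = 1) -> str:
    """Move each lowercase letter back by `shift`, leaving other characters alone."""
    letters = string.ascii_lowercase
    decoded = []
    for ch in text:
        if ch in letters:
            decoded.append(letters[(letters.index(ch) - shift) % 26])
        else:
            decoded.append(ch)
    return "".join(decoded)


print(unshift("ifmmp.xpsme@fybnqmf.psh"))
```

Punctuation such as `.` and `@` passes through untouched, which is why the shape of the address stays recognizable in the scrambled form.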

~~~
alanctgardner2
That's what I said? edit: I realize it might not have been entirely clear, I
was trying to be a bit discreet ;)

------
wes-exp
Lisp has had metaprogramming, optional type declarations, speed, and DSLs
since at least the '80s. This isn't some magic new technology just introduced
by the shiny new languages cited by the author.

[http://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule](http://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule)

~~~
foldr
Lisp's optional type declarations don't provide any compile-time guarantees
though, they're just hints. (Although some implementations do use them to make
compile-time checks.)

~~~
wes-exp
Depends what you are going for. If the point of type declarations is speed, I
think most high-quality CL implementations will take advantage of a type
declaration. If the point is compile-time type safety, yeah, that's a more
obscure feature. Of course you can always use check-type for run-time type
safety though.

~~~
foldr
Well, in the context of refactoring, it's presumably compile-time safety
that's relevant.

------
ak217
The article doesn't mention recent/ongoing improvements to PyPy in the "speed"
section, and doesn't mention concurrent.futures in the "asynchronous
programming" section. Seems incomplete to me.
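
For reference, a minimal sketch of the standard-library `concurrent.futures` API mentioned here. The `slow_square` function is a hypothetical stand-in for blocking work such as a network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(n: int) -> int:
    """Stand-in for a blocking call."""
    time.sleep(0.05)
    return n * n


with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves input order, even though the calls overlap in time.
    results = list(pool.map(slow_square, range(5)))

print(results)
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` keeps the same interface for CPU-bound work, though a script doing so should keep the pool under an `if __name__ == "__main__":` guard.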

~~~
Permit
I take issue with the idea that you can count PyPy as a +1 for Python's speed
without also counting it as a -1 for libraries. Especially considering SciPy
and NumPy are not yet supported by PyPy, which means you lose out on a lot of
libraries that depend on them (for example, scikit-learn).

You can't say that Python has both speed and great libraries. It has one or
the other. Hopefully this will change at some point and I'll be able to reap
the benefits of both.

~~~
MostAwesomeDude
It is not up to PyPy to support non-pure-Python libraries; it is up to those
libraries to fix themselves by evicting their C and FORTRAN balls and chains.

~~~
dbecker
The Zen of Python states that "practicality beats purity."

And the authors of numpy, scipy, pandas, etc seem to agree that it would be
impractical to rewrite these libraries in the way you are suggesting.

~~~
MostAwesomeDude
"Special cases aren't special enough to break the rules."

------
leephillips
"there are variety of technologies encroaching on Python’s niche"

It wasn't clear to me what in particular the author thinks is Python's
"niche", so I didn't understand the point of the article as an article,
although the content was interesting.

~~~
dkarl
I think Python's niche is understood to be as a readable, concise, batteries-
included language suitable for scripting, prototyping, and application
development on a small-to-medium scale (and large scale for some people, but
that's controversial), and serving as a convenient interface to C and C++
libraries in all those roles. Python differentiates itself from similar
languages in the niche by elevating simplicity, explicitness, and readability
over conciseness and expressiveness. That's the conventional wisdom, anyway.

I'd say the biggest challenge to Python in that niche is the emergence of type
inference for statically typed languages. Not too long ago, the difference
between a statically typed language and Python was

    
    
        List<Foo> foos = new ArrayList<Foo>();
        foos.add(new Foo(2));
        foos.add(new Foo(4));
    

versus

    
    
        foos = [ Foo(2), Foo(4) ]
    

Now, a static language might look more like this:

    
    
        val foos = List(Foo(2), Foo(4))
    

As far as I know, there still isn't a statically typed language that matches
Python's simplicity and low barrier to entry, but I say that as someone who
knows little about Go.

~~~
hellrich
Try Groovy!

    
    
      def foos = [ new Foo(2), new Foo(4) ]

~~~
dkarl
I've been writing a lot of Groovy recently, and it feels like Scala without
type checking. That's not the worst thing in the world, but I'd rather just
use Scala. With Groovy, I get a lot of the expressiveness of Scala, but the
lack of static type checking combined with Groovy doing its best to be magical
is a recipe for bugs. For example, in Groovy, you can overload methods
accidentally without a warning, which can create a hell of a surprise when
somebody passes a null value. Rather than throw an error, Groovy dispatches
the call to one of the methods even if neither one is more specific than the
other.

There's also this classic bug:

    
    
        def printOne(Collection c) {
            if (c.empty) {
                print("Collection is empty")
            } else {
                print(c.iterator().next())
            }
        }
    

Can you spot the bug? This code works for all Collections... except Maps. If c
is a Map, Groovy translates c.empty to c.get("empty"). Constantly having to be
on my toes to avoid stuff like that is a pain.

------
banachtarski
This might seem very petty, but I avoid python just because of the dichotomy
between Python 2 and Python 3. For me, there are plenty of other scripting
languages at my disposal, and for things like numpy, or twisted, I have better
alternatives too.

~~~
dagw
Honest question. Outside of matlab and octave (both of which fall down in a
bunch of other obvious areas), what scripting language offers anything close
to an alternative to numpy/scipy?

~~~
banachtarski
I use R, Mathematica, Haskell, Julia, or my own CAS that I built.

~~~
bjz_
What do you think of Mathematica? I've been toying with the idea of using it
for sketching out ideas related to my computer graphics work, but it's a big
investment. :/

~~~
banachtarski
I found its utility is primarily in dynamically controlled visualizations, for
example, when I wanted to see how residues were distributed in the C^2 plane
for a certain class of complex functions.

I'm not certain what sort of graphics work you are considering, but if it
involves a lot of nonlinearity or higher level functions, it may be a good
fit.

------
pjlegato
The "postmodern" conceit is a bit of a stretch; link bait to lure in readers
with liberal arts degrees.

~~~
sp332
Or perl hackers. "I think there's still a big streak of Modernism running
through the middle of computer science, and a lot of people are out of touch
with their culture. On the other hand, I'm not really out to fight Modernism,
since postmodernism includes Modernism as just another valid source of ideas."
[http://www.wall.org/~larry/pm.html](http://www.wall.org/~larry/pm.html)

~~~
berntb
Don't go Steve Yegge and pretend you didn't understand that speech was at
_least_ 50% joking... :-)

~~~
sp332
pjlegato mentioned postmodern conceit and linkbait, it's right on-topic ;)

------
lifeisstillgood
I like the hacker-ish approach to extending (abusing) a much loved language to
bring in new ideas.

But I am wary, despite being a python bigot myself, of using one language for
all these things. At a very early point down this road it is simply better to
pay the cost of adding a new platform and using clojure to make my DSL.

In the end, for production, there is a fine line between bending and breaking.

------
k_bx
Really impressed by the examples with parametrized types (but I'd still rather
switch to Haskell for those :).

------
xrt
I should think OCaml or SML should be added to the mix. Neither has Async
built in, but the other points are handled quite well.
[http://ocaml.org/description.html](http://ocaml.org/description.html)

------
ekanna
Another main strength of golang is no runtime dependency, making it highly
distributable. It also cross-compiles across different operating systems and
different cpu architectures. This should also be mentioned in the article!

------
parkour
"it's simply that is that the" wat

