Python, as Reviewed by a C++ Programmer (sgh1.net)
83 points by ingve 241 days ago | 76 comments



For me the biggest downside of dynamically typed languages like Python and JavaScript is that a small error in passing a function's parameters can propagate very far in a complex system. E.g., if you mistakenly interchange two consecutive function parameters, you could spend several hours debugging the effect several modules down. This is an error that a C++ compiler would catch right away.


C++ is not always able to detect common parameter errors, especially for things like implicit conversions of numbers. Arguably Python’s option of keyword arguments makes certain classes of parameter errors go away, too.
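To illustrate the keyword-argument point, here is a minimal Python sketch. The function `transfer` and its parameters are hypothetical, made up for illustration. The bare `*` makes every parameter after it keyword-only, so the two account arguments cannot be silently swapped by position.

```python
def transfer(*, source: str, destination: str, amount: float) -> str:
    """Hypothetical function: the bare * forces callers to name each argument."""
    return f"{amount:.2f} from {source} to {destination}"


# Argument order no longer matters because each value is named at the call site.
a = transfer(source="checking", destination="savings", amount=10.0)
b = transfer(amount=10.0, destination="savings", source="checking")
assert a == b

# A positional call is rejected immediately instead of propagating downstream.
try:
    transfer("savings", "checking", 10.0)
except TypeError as exc:
    error_message = str(exc)
```

The trade-off is more verbose call sites, which is usually acceptable for functions that take several parameters of the same type.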

Several years ago I remember someone who (perhaps unwisely) wrote a C++ function with many floating-point values as parameters and a single default "bool". At some point the function was extended to have another float. The compiler did not catch calling code that failed to add an extra float; it simply allowed the formerly-flag arguments to be assigned to the new float argument instead. In fact, implicit conversion of parameters is such a common C++ problem that I occasionally define wrapping types for the sole purpose of being parameter types: then the compiler has to trip over them when they are used incorrectly and callers have to do some kind of explicit action to indicate what their values really are.
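The wrapper-type trick has a rough Python analogue in `typing.NewType`, sketched below. `Width`, `Height`, and `area` are made-up names. Unlike the C++ version, nothing is enforced at runtime (the wrappers simply return their argument); the mix-up is only caught by a static checker such as mypy.

```python
from typing import NewType

# Distinct names for the checker; at runtime both are plain floats.
Width = NewType("Width", float)
Height = NewType("Height", float)


def area(width: Width, height: Height) -> float:
    """Hypothetical function whose two float parameters are easy to swap."""
    return width * height


# Callers must state explicitly what each value is.
result = area(Width(3.0), Height(2.0))

# A static checker should reject both of these, although they would run:
# area(Height(2.0), Width(3.0))
# area(3.0, 2.0)
```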


Those implicit conversion rules (largely inherited from C) are one of the bits of C++ that bit me the most when I was a primary C++ dev.


The C++ compiler would catch it only if the incorrect data was of a different type. Type specification and hinting only help with the more egregious mistakes.


Type hints can catch most of these mistakes.


There are solutions for this: Flow (flowtype) for JS or mypy for Python (preferably with Python 3.5+).


The article doesn't mention the type annotations described in PEP 484, which CPython has supported since Python 3.5.

It is possible to add annotations to your code like

    class Foo:
        x: int
        # ...

and then run type checks using `mypy`, so that, for the annotated code, the probability of the "passes the checks => won't crash at runtime" case is much higher.
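A slightly fuller sketch of the same idea follows. The `Account` class and `deposit` function are made-up names, not taken from the article. Annotations are not enforced at runtime; a checker such as mypy reads them ahead of time and would be expected to reject the commented-out call, because a `str` is passed where a `float` is declared.

```python
class Account:
    # Hypothetical class; the annotations are read by the type checker only.
    def __init__(self, name: str, balance: float = 0.0) -> None:
        self.name = name
        self.balance = balance


def deposit(account: Account, amount: float) -> float:
    """Add amount to the account and return the new balance."""
    account.balance += amount
    return account.balance


acct = Account("alice")
new_balance = deposit(acct, 25.0)

# A static checker should flag the next line (str passed where float is expected).
# Without the checker, it only fails at runtime, with a TypeError inside deposit().
# deposit(acct, "25")
```

Assuming the code is saved to a file, running `mypy` on that file performs the check without executing anything.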


Sure, but most Python code isn't annotated, right?

So this might be helpful for code you wrote, but not anywhere it interfaces with other people's code (which, in my opinion, is probably the most important place to check for type errors).


The older method is docstrings (reST and similar) like here: https://github.com/kennethreitz/requests/blob/master/request...

IDEs like PyCharm can use those and other methods to find most typing issues. In practice this is very effective for most "boring" Python code, but if you get really clever it falls over completely.


True, but very popular packages are usually battle tested and type errors are very rare in practice in those packages. Most bugs I've seen are not because of the lack of a static type system.


Plus, many packages dynamically add members, stomp named parameters by wrapping stuff with kwargs, etc.
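A small sketch of what stomping named parameters can look like. `logged` and `connect` are hypothetical names. The wrapper accepts anything via `*args, **kwargs`, so a misspelled keyword is not rejected at the wrapper's boundary; it only surfaces when the wrapped function is finally called. Without `functools.wraps`, the wrapper also hides the original signature from introspection.

```python
import inspect


def logged(func):
    """Hypothetical decorator that forwards everything without checking it."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@logged
def connect(host, port=5432, timeout=10):
    return (host, port, timeout)


# The visible signature is now the wrapper's catch-all, not connect's.
visible_signature = str(inspect.signature(connect))

# A misspelled keyword ("timout") passes through the wrapper and only fails
# when the real function is reached.
try:
    connect("db.example", timout=5)
except TypeError as exc:
    failure = str(exc)
```

With several such layers stacked, the eventual error can show up far from the call site that introduced the typo.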


So you're saying monkey patching is bad? I agree that what you're saying about kwargs is possible, but in about 6 years of Python I've never encountered it in any of the code that I've worked on.


Or specifically to his example,

    from typing import Dict

    # Subscript takes a comma-separated pair of types, not a slice.
    Foo = Dict[int, str]

    class Klass:
        def __init__(self, name: str, balance: float = 0.0):
            """Construct a Klass."""
            self.myfoos = {}  # type: Foo

I _think_ that's right... close anyway


Interesting, but it's mostly a few observations about interpreted vs compiled and static vs dynamic typing. And a bit about having a rich collection of 3rd party libraries.

The article wouldn't have been much different had it been C++ vs Ruby/Perl/etc.


What is a "C++ programmer?"

On any given day, I poop you not, I'm writing either C#, Java, C++ or any number of scripting languages (Perl, Bash, Python, JavaScript).

Do people seriously only use a single language these days!?


>Do people seriously only use a single language these days!?

Yes. In what bizarro world would one assume it's not so? By thinking that all development in the world resembles the company/work they do?

Tons of programmers only write C# or Java on Windows, and don't touch any shell scripting at all.

Tons of front-end developers only write JavaScript.

Tons of statistics people only use R or 1-2 other statistics packages.

Tons of systems, drivers, embedded and OS people only write C or C++.

Tons of game devs only write C++.

Tons of native mobile devs only write Swift, Objective-C or Java.

Some of them do occasionally need to write a script to get something done, but it's hardly as if they juggle multiple languages.

And of course larger companies have specialized programmers for different domains, not some guy doing C++ AND admin stuff in shell AND some front-end JS etc.


The main product I work on is written in C++, so I spend most of my day looking at/writing C++ and also learning as much C++ as I can. I do use other languages for some minor tasks (Python and C# for some automated test stuff and devops-ish kind of work), but I don't use those languages as much as C++, nor do I invest as much effort into learning the ins and outs of those languages. So I consider myself a C++ programmer because it's what I put effort into.


Some people are far more specialized in what they work on.

- Apple's Swift codebase is 54.8% C++ [0]

- Microsoft's CNTK codebase is 57.6% C++ [1]

- Bitcoin's codebase is 68.6% C++ [2]

- XBMC's codebase is 84.7% C++ [3]

If you work on certain types of projects, like ones making the tools that others use to do their projects, then you are typically working in only 1-2 languages at a time.

[0] https://github.com/apple/swift

[1] https://github.com/Microsoft/CNTK

[2] https://github.com/bitcoin/bitcoin

[3] https://github.com/xbmc/xbmc


Why do you jump between multiple languages all the time?


Not OP, but I've worked for an ad agency in a software-dev role.

Their primary code base is written in Java, monitoring scripts in Python/Ruby, and tracking pixel code in JavaScript, so I would have days in which I coded in all 3, albeit for separate use-cases.


The ad agency couldn't afford different developers for the various systems?


Different tools for different tasks. Not gonna use C to analyze some text files...


Best tool for the job.


For what it's worth, Python has been making inroads in the HPC world as well. In that context it seems like Python is being used as the glue to tie highly optimized, custom-written C/C++ libraries together. http://lorenabarba.com/publications/ http://dl.acm.org/citation.cfm?id=3019084&CFID=927535872&CFT... http://dl.acm.org/citation.cfm?id=2830170&CFID=927535872&CFT...

It's happening enough that there has been a series of popular conferences for it, PyHPC http://www.dlr.de/sc/desktopdefault.aspx/tabid-11992/21071_r...


This reads like it was written 10-15 years ago. But apparently it's new.


That's the same feeling I had when I was writing the article!


Looks like the website hit its bandwidth limit. Google Cache link for those who still want to read it: http://webcache.googleusercontent.com/search?q=cache:Z6cJ7kw...


There are package managers for C++, such as Conan. Some IDEs like CLion have a certain degree of integration with it.

https://www.conan.io

Then, some systems cannot afford to have garbage collection, because it usually runs in an unpredictable amount of time. Such systems are for example real-time systems. In those cases a language like C++ is a good fit.


One thing that is a blessing and a curse is the use of indentation for code blocks.

http://www.secnetix.de/olli/Python/block_indentation.hawk


I read that, and it absolutely does not try to make the point that indentation is a curse. Maybe it is to some (not to me!), but you would need to cite something else. This is just an explanation of how it works.


Good post but I think he'd have a bit more appreciation for Python if he tried writing in a functional style as opposed to object oriented.


It's incredibly close to my feelings after moving from C# to JavaScript. Especially the bit about leaning heavily on the compiler.


Rust is better than C++ at “if it compiles, it works” and has a great package manager, so as far as the topics highlighted here go, Rust seems to be the best of both worlds.


One area where Rust falls rather short is the missionary zeal of some people in the community.


The Rust Evangelism Strike Force is a huge turnoff to the language. It's gotten to the point where we can't have reasoned discussions about C++ or C, let alone about their merits, without a newly enlightened user coming in and trying to direct the conversation to Rust.

It's really okay to have a future where C++ and Rust are both popular, widely used languages. I'm happy to acknowledge the merits of Rust as a language and even talk about them without inserting it into every conversation about low-level computing.

I'm sure the language designers and core developers don't actively endorse this behavior, but somewhere along the way it became a core part of the culture. This world domination shit has to stop.


Their entire selling point is that C is dangerous and Rust is safe. It sort of creates an adversarial relationship.

That said, while it might be annoying to serious C programmers, they aren't really the intended audience. It's new programmers that matter. Or people who bounce between languages anyway.


> "The Rust Evangelism Strike Force"

Instant classic


I think it was coined by http://n-gate.com (mentioned in HN a few times last month), but I can't be sure.

What I do know is that a) it's hilariously on point b) Google search now autocompletes on it for me.


FWIW, the behaviour is not just not endorsed, but actively disendorsed, e.g. the code of conduct (which applies to the Rust-team-run venues) includes "Respect that people have differences of opinion and that every design or implementation choice carries a trade-off and numerous costs. There is seldom a right answer.", and /r/rust has a "No zealotry" rule. Both of these are enforced in their respective spaces, and both of these have existed for years: they're not a reaction to the recent RESF meme.

It's... harder/less-defensible to enforce codes of conduct/rules like these in third-party spaces, but Rust people still do often call out incorrect assertions/ridiculous comments. Someone not familiar with the community may not recognise that the people doing the calling out are actually "prominent" voices.


This is like the evangelists' attempt to distinguish "safe rust" from "unsafe rust": it's a distinction without difference. "The community" isn't just the inner circle of official Rust team members and evangelists and their particular forums where they can enforce any policy as aggressively as they like. For better or worse, "the Rust community" will necessarily be bigger than that if/as Rust grows.

My view is that Rust risks a similar track as Haskell if the community continues to grow in the way it has recently. It's not a healthy community, outside of the circles you note. Quite the opposite--I view it rather as being fairly toxic.


I don't understand what exactly you're disagreeing with. I was strengthening "I'm sure the language designers and core developers don't actively endorse this behavior", and also mentioned how those people do try to keep discussions truthful and relevant outside those spaces.


So true.

I want Rust to take off, as even bounds checking at compile time would save me a lot of trouble.

I don't know if I'll ever be able to deal with the community, though.

On the other hand, I've never seen people get as passionate about C++ as I have about Python and Rust, so maybe that says something about the benefits, even if it's just as beginners see them.


I think you see passionate people because they've dealt with all the pain around C++. That said, I don't agree with the GP. As much as I like Rust, there are still a ton of areas that keep Rust from being used today: library selection, hiring (even though it's hard to hire for C++, it's easier than for Rust), and a few other things.

That said, for all my personal/greenfield stuff I've been using Rust and I'm really happy with it.


There's a lot of people passionate about it because it's the first low-level language that's reasonably approachable to people who mostly code in Python/Ruby/etc - you rarely ever have to pull out a debugger, for example.


Haha, oh you definitely have to pull out the debugger. Rust doesn't save you from logic errors, control-flow issues, and FFI gone wrong.

That said I get your point about using the debugger less.


Honestly, my programs tend to be simple enough that logic errors are reasonably rare, and the ones that do pop up can be resolved by println debugging. :)


> Honestly, my programs tend to be simple enough that logic errors are reasonably rare

That says more about the particular programs you write than about the language, frankly. Writing trivial bug-free programs is easy in any language. The trick is writing complex bug-free programs.

> println debugging

This is something you should only need to do if you have a wretched debugger or if you're working on a bug where the presence of the debugger alters its behavior.


> Writing trivial bug-free programs is easy in any language.

Not really. Writing programs which don't crash the first few times I run them in a dynamic language is surprisingly hard. Just having a compiler prevents having to get to that point - and then Rust ensures that I handle potential errors correctly and avoid segfaults.

> This is something you should only need to do if you have a wretched debugger or if you're working on a bug where the presence of the debugger alters its behavior.

Or if I can't be arsed configuring gdb with breakpoints for everything from the command line. I can add a println!(), save my file, and watchman will recompile and run my program in a much more reasonable (to me) workflow. I assume that I can somehow integrate gdb into my editor, but that involves a pile of learning new things.

The so-called "trivial" app I'm most of the way through writing right now is an OpenID Connect and SCIM server that has had to implement a decent chunk of functionality (JWTs, key fetching and caching, juggling between Postgres and Redis, etc.) from scratch. But for most of the particularly fiddly parts, I've managed to have the type system ensure that I either outright can't have logic errors or they're detected and the program panics, which is rather nice.


> Writing programs which don't crash the first few times I run them in a dynamic language is surprisingly hard.

Writing correct programs in JS is difficult, I agree. Writing correct trivial programs is easy, though.

> Or if I can't be arsed configuring gdb with breakpoints for everything from the command line. I can add a println!(), save my file, and watchman will recompile and run my program in a much more reasonable (to me) workflow. I assume that I can somehow integrate gdb into my editor, but that involves a pile of learning new things.

That's not a third alternative, that's just gdb being wretched, as it's always been.


I find that within the Rust community people are amazing, it's just at the boundaries that I've found problems.


Seems naive for Rust to try to persuade you to use their language when the right way is to launch under the aegis of a vertically integrated corporate platform. You need an HP, Sun, MS, Apple or Google foisting you on developers to have any shot at mainstream success.


Interesting perspective on some of Python's features, but I hate to see this "Python is neither call by value nor call by reference" silliness perpetuated with a link to the original silly blog post. I'm surprised a C++ programmer was caught by that obviously incorrect assertion.


Claiming that Python, C#, Java, etc. are call by value, where the 'value' is the value of the pointer, does not help. It only muddies things. There really ought to be a different name for such call semantics, and there is: it's called call by object. In C++ lingo, call by const pointer comes closest: one cannot change what the pointer points to, but the pointed-to object need not be const.


No, no, no, there is no "call by object." It is precisely this mistaken idea which muddies the waters, as you put it. Virtually all modern processors are stack-based architectures. All arguments to functions are passed on the stack or in registers. The only things that can be placed onto the stack or into a register are numbers. Those numbers can represent either an actual value or an address in memory. The former is call by value, the latter is call by reference. All high-level languages that allow calling functions or methods with arguments ultimately have to use this facility to pass those arguments. Python is most definitely call by reference.


Couple of things:

1. "Call/pass by XXX" terminology usually refers to language semantics, not underlying implementation details. In that respect, the call-stack mechanism is largely orthogonal here.

2. According to this oft-cited article (in the context of Java), passing an address under the hood is not sufficient to make a language call-by-reference. http://javadude.com/articles/passbyvalue.htm - AFAIK Python would "fail" the "Litmus test".

(EDIT - clarify what's not call-by-ref)


In Java, this code will print the string representation of Lassie:

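  // Assumes a Dog class with a Dog(String name) constructor and a toString().
  static Dog toby = new Dog("Toby");
  
  public static void replaceDogWithToby ( Dog d )
  {
    d = toby;  // rebinds only the local copy of the reference
  }
  
  public static void main ( String[] args )
  {
    Dog d = new Dog("Lassie");
    replaceDogWithToby(d);
    System.out.println(d);  // still Lassie
  }

In Pascal, this will print Toby: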

  var
    toby:tDog;
  
  procedure replaceDogWithToby ( var d:tDog );
  begin
    d := toby;  { var parameter: assigns to the caller's variable }
  end;

  var
    lassie, d:tDog;
  begin
    {code to initialize toby and lassie here}
    d := lassie;
    replaceDogWithToby(d);
    writeln(getDogName(d));
  end.

The first is call by object (or call by sharing, in my book). The second is call by reference. The behavior is totally different.
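For comparison, a rough Python version of the same litmus test (the `Dog` class here is a made-up stand-in). Python behaves like the Java snippet: rebinding the parameter inside the function leaves the caller's variable untouched.

```python
class Dog:
    def __init__(self, name):
        self.name = name


toby = Dog("Toby")


def replace_dog_with_toby(d):
    # Rebinds the local name d only; the caller's variable is not affected.
    d = toby


d = Dog("Lassie")
replace_dog_with_toby(d)
result = d.name  # the caller still sees Lassie
```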


I haven't done Java in a long time, so I won't try to debate the specific language-level semantics with you (the same is true of Pascal, though it has been even longer). However, assuming that your examples run and behave as you describe, what is going on in the Java example is that 'Dog d' is passed as a reference to a reference, and then the value pointed to by the argument, i.e. the address of "toby," is replaced with the address of "lassie."


You got it backwards. The Java snippet will print "Lassie". In Java, object names are references to objects, and passing them to functions copies the references, so of course modifying the formal parameter (d from replaceDogWithToby()) doesn't modify the actual parameter (d from main()).


"Variables in Python are object references. When you call a Python function the arguments are copies of the references to the original object." -- I personally wouldn't worry about what to call it(though call by reference would be my pick too), in the end I just visualize it as being similar to how assignment works in Python.


The confusion comes from the difference between references in Python and C++. In Python you can reassign a reference to a different object any time you want. In C++ you can't, and any attempt to do so modifies the referenced object instead. That fundamental difference makes it hard to understand the mechanics of Python using the terminology of C++.


Yes. But. Language designers go out of their way typically to hide this underlying truth, by for example implementing "call by value" when the type of the value is "Customer" or whatever.


It's specifically because high level language semantics often mask the underlying implementation that knowing that implementation is necessary for truly understanding what's happening. The situation you describe is call by reference, where the value passed on the stack or in a register is a reference to a customer object. You could pass a customer object by value (in languages that allow it, such as C++) by copying the structure verbatim onto the stack and addressing it there.


It may be implemented as call-by-reference, but the semantics are not the same as what one thinks of when he hears that phrase.


In what way are they different?


I'm mostly experienced with Lua's reference handling, so beware.

Usually there is one global hash table and a local hash table. Everything (every function, every object, every string) is stored once, depending on the calling situation. So passing a reference to an object is passing the reference to that one global object, until you change it or make a local copy of it. That sounds really strange, and in a way it is. It also means that your list of objects will not contain duplicates, and that no useless work is done to copy data.

The model basically borrows from the concepts of lazy loading and flat copies: do work only when needed. It's super difficult to explain to new users without a CS background.


In a classic call-by-reference, changes you make to the passed parameter are seen by the caller when the function returns. That doesn't happen consistently with Python.
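A minimal sketch of that inconsistency, using made-up function names. Mutating the passed object is visible to the caller, while rebinding the parameter (including `+=` on an immutable value such as an `int`) is not.

```python
def append_item(items):
    # Mutates the object the caller also refers to: the change is visible.
    items.append(4)


def rebind_list(items):
    # Rebinds the local name to a new list: the caller sees nothing.
    items = items + [4]


def increment(n):
    # ints are immutable, so += creates a new object and rebinds the local name.
    n += 1


numbers = [1, 2, 3]
append_item(numbers)
after_mutation = list(numbers)   # snapshot after the in-place change

rebind_list(numbers)
after_rebinding = list(numbers)  # snapshot after the rebinding attempt

count = 10
increment(count)
```

Here `after_mutation` has the extra element, `after_rebinding` is identical to it, and `count` is unchanged.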


I think you're just using "classic" to mean the behavior of languages like C++. Under the hood there are only two ways this mechanism can function, and I don't think it helps to mix high level language semantics in when trying to explain it, as those semantics are always expressed in one of those two ways.


I'm using "classic" to refer to the way things were done when the terms were first invented. Old languages like C or Fortran.

Python uses references implicitly without ever defining the term, so it's essential to talk about the high level semantics in this case. A Python reference (implied) differs greatly from a C++ reference in that it can be rebound with the assignment operator. If you're thinking about references in C++ terms then Python will confuse you. It's important to call out the differences.


With anything immutable.


Processors and architectures have nothing to do with this. This is a matter of semantics. The problem stems from different communities appropriating the same English word to mean subtly different things, hence my clarification "in C++ lingo ..."


Scott's book "Programming Language Pragmatics" calls it "call by sharing" IIRC. BTW, that book describes like 6 or 7 modes of parameter passing. Languages like Ada have some quite original modes.


Python is "pass reference by value", in C++ semantics.


Of course, because that is the only way to implement call by reference, i.e. to pass the reference as a value.


It's possible to pass a reference by reference.


In which case the argument is still a value, containing a pointer to a value, which is a pointer to the object.


In C semantics, that's about it.




