Static typing means you know the type of something at compile time. Dynamic typing means you don't. The upside to dynamic typing is that the compiler doesn't have to prove much about your program; the downside is that it can't really prove much of interest either. Strong vs weak typing is about how much your language is willing to fudge types in order to make an operation succeed without errors -- be it the intended result or not. Both static and strong typing tend to surface errors sooner. Both dynamic and weak typing tend to require less boilerplate to make your program do things.
Python doesn't know the types of things in advance (dynamically typed), but it is very picky about which types are allowed to interact (strongly typed). Even in cases where the answer is obvious, like str.join, you still have to make everything strings: "".join([1, 2, 3]) will fail.
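A quick way to see that strictness, as a snippet you can paste into a REPL:

```python
# Python won't implicitly stringify the ints for you:
try:
    ", ".join([1, 2, 3])
except TypeError as exc:
    print(exc)  # sequence item 0: expected str instance, int found

# You have to convert explicitly:
print(", ".join(str(n) for n in [1, 2, 3]))  # 1, 2, 3
```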
To demonstrate why these are orthogonal, consider the other cases.
Haskell is strongly and statically typed. A program that adds a string and a number together will be rejected at compile time because the compiler can prove it's invalid.
C is weakly and statically typed. A program that combines a string and a number may be accepted by the compiler, which interprets the number as, I dunno, a wchar_t or something. Another example of weak typing is how C allows (heck, encourages) pointer arithmetic and untagged unions/structs.
"It’s virtually impossible to statically analyze"
"In Python a good amount of errors are discovered at runtime"
> The first glaring issue is that I have no guarantee that is_valid() returns a bool type.
You can name functions ridiculous, unhelpful and misleading things in any language. The very fact that you know to expect a bool undermines the point: clearly, despite the absence of static types, a human managed to make a pretty reasonable assumption about what a piece of code was about to do. Just because the compiler can check that is_valid returns a string doesn't mean that's magically a useful thing to do.
> The fact that I cannot, in the absence of perfect test coverage, verify the syntactical correctness of someone’s code is a big red flag.
Syntax errors are caught by the compiler. Semantic errors aren't caught by the best type systems. At best, type systems are a way to encode a bit of the semantics so that the compiler can prove things about it.
> To improve static type checking in Python, developers made libraries like jedi, pytype, mypy, and toasted-marshmallow, but they are often not used in most projects because they do not belong to the standard toolchain
I cannot imagine the author actually used jedi and believes this. That's... not how jedi works. I also don't know what to do with the argument "you have an optional static type system, but people don't use it because it's annoying, therefore we should use a mandatory static type system" other than shrug.
I like types! I think there are great arguments for types. I like the idea of being able to have a type system prove that certain bugs can't happen. This adds nothing but FUD and no new insights.
In my experience this is roughly as true of statically typed languages as it is of Python. The errors the compiler helps find before running tests are just not the ones that matter much. And if compiler errors are cryptic enough, it can be a huge pain that you can't simply run the code on a toy input to see what's going on. The whole pitch that static typing and compilation will save you time by forcing more correctness prior to testing is mostly a myth in practice. It happens occasionally, but not nearly often enough to justify all the advocacy, and the benefits of being able to debug against desired behavior, and simply not care about strict safety in edge cases you can guarantee you won't hit, are really tangible. You can't easily dismiss those benefits.
Most likely a char or a wchar_t*. Yeah, random pointers.
In fact, C and C++ are both strongly typed, but they are both memory unsafe.
However, C++ provides enough static typing to make it easy to write memory safe (but single threaded) programs.
Also, tools like valgrind get you 99% of the benefit of memory safety, which is good enough for debugging. C++ lets you override the allocator, and build in inexpensive (fast enough for production) checks for most use after frees, memory consumption profilers / leak detection/ etc.
I’d argue that python is memory unsafe in practice, since we get segfaults from third party dependencies in python about as often as we see them in c/c++.
Crucially, C and C++ provide enough tools to get static type safety on the error handling path.
Errors like ‘assert “” is not None’ and missing field exceptions in python error handling code are frequent, but unheard of in C/C++.
If you use std::string, adding an integer does something well-defined (rtfm). Adding an int to a char widens the result to the int type (foiling attempts to use it like a char without a downcast, if you use -Wall -Werror). Adding an int to a char* does pointer arithmetic, which can be used safely, but there are better string / buffer types you can implement in an afternoon, so do that instead.
You are refuting my point but provide no evidence and implicitly redefine several terms in a way contrary to common usage.
Is there a definition you can pin down? Because C sure isn't strongly typed in the Liskov, Jackson or modern usage senses of the term.
> Also, tools like valgrind get you 99% of the benefit of memory safety, which is good enough for debugging. C++ lets you override the allocator, and build in inexpensive (fast enough for production) checks for most use after frees, memory consumption profilers / leak detection/ etc.
Would you say that de facto C/C++ has a memory safety problem? If it does, why?
(Also, not germane to my point.)
> I’d argue that python is memory unsafe in practice, since we get segfaults from third party dependencies in python about as often as we see them in c/c++.
Do you have extraordinary evidence to go with this extraordinary claim, or are you just talking about your own anecdata?
> If you use std::string, adding an integer does something well-defined (rtfm).
My argument was not that it is poorly defined. Weak typing doesn't mean UB.
Strong and weak typing are poorly defined terms.
I’d argue that type errors (not including memory safety errors) should not occur at runtime in a strongly typed language, or at least that a “stronger typed” language would admit fewer classes of those errors in practice.
I’d argue that de facto, modern C/C++ does not have a memory safety issue. Operating systems like OpenBSD and many other secure, hardened network daemons provide existence proofs that are not available for languages like python.
In contrast, I’d argue that python has severe type safety issues.
I’m not sure why evidence that “strongly typed” python in practice suffers from all sorts of typing errors that “weakly typed” C/C++ avoids is not relevant to your point.
I also don’t understand why you dislike the string examples, which explains the (strongly typed) semantics of the operators you mentioned in your comment.
By “strongly typed” do you mean that all operators return the (single type) of their parameters?
I don’t know of any languages that enforce such a thing.
Perhaps you mean “strongly typed languages only include operators that are to my taste”? That’s surely not a useful definition.
I work as a research assistant in a PL research lab and have talked with many PL researchers who have been in the field for a long time. They all use the terminology consistently, and it all lines up with what lvh has been saying.
I agree that the terms are poorly defined, because there's not a single definition which can be used as a criterion for evaluation. But that Python is more strongly typed than C seems to be a common sentiment, at least among people that I explicitly asked about it. Much C code practically relies on the weakness of the type system to function, whereas Python cannot be deceived in the same way.
> I’m not sure why evidence that “strongly typed” python in practice suffers from all sorts of typing errors that “weakly typed” C/C++ avoids is not relevant to your point.
Typing errors are evidence of a strong typing system...
> I also don’t understand why you dislike the string examples
Your string example relied on the weak typing of C to work. The conversions are implicit, aren't they? This is a hallmark of a weakly-typed language.
> I’m not sure why evidence that “strongly typed” python in practice suffers from all sorts of typing errors that “weakly typed” C/C++ avoids is not relevant to your point.
> I also don’t understand why you dislike the string examples, which explains the (strongly typed) semantics of the operators you mentioned in your comment.
I have no idea what evidence you’re referring to, or what examples I supposedly dislike.
To any readers that made it this far, and are hoping for something precise, here is a minimally useful definition of “strongly typed”. I’ve never heard of a broader definition of “strongly typed”:
The language’s type system must be embeddable in a lattice, and at runtime it is guaranteed that functions aren’t passing the top element around and then using it without an explicit downcast.
C/C++ meet this definition, since you need to cast void* before dereferencing it, and void isn’t a valid type for a variable. Also, the only union types are defined at compile time.
Python does not meet this definition because functions can have multiple return types due to control flow, which means you can create call sites that return things like lub(integer, string), and then compose them until you have something of type “top”, and there’s no way to statically guard against usages of such a type.
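A sketch of the kind of call site being described (the function name is hypothetical):

```python
def parse(s):
    # Two branches, two return types: the static "type" of parse(s)
    # would be the least upper bound of int and str.
    if s.isdigit():
        return int(s)
    return s

values = [parse("42"), parse("hello")]
# Compose a few of these and the best you can say statically is "object"
# (the top element); only a runtime check recovers the concrete type:
print([type(v).__name__ for v in values])  # ['int', 'str']
```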
In C++, the only thing passed around at runtime is void*. The pointer is then used as a concrete type without any explicit downcast.
Any conversation about strong typing should not involve the word "static". Nothing about strong typing is "static". It is entirely a runtime concept. Functions potentially having multiple return types is entirely an issue of static typing. Not strong typing.
This is obvious because you can annotate python code with type information. This makes it strongly typed, but not statically typed, even though it makes all unions compile-time defined.
In other words, if I take python with type annotations (which is statically typed) and remove the annotations, you claim this would make the language weakly typed, but this makes no sense. There would be no difference between strength and dynamism.
A better definition might be that a language is weakly typed if it allows transformation from type A to type B, where A and B are unrelated (i.e. neither inherits from the other), without a call to a constructor. Python does not meet this definition: the only way to cast is via a constructor call.
C and C++ do: void* allows arbitrary transforms. Java will throw a ClassCastException, but can only do so at runtime, and will do so at the bad cast.
C and C++ will keep chugging along until the miscast object is misused. That's weak typing.
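To illustrate the constructor-call criterion on the Python side, a minimal sketch:

```python
# Conversion happens only through an explicit constructor call,
# which builds a new object rather than reinterpreting the old one:
n = int("42")
assert n == 42

# There is no non-constructor route between unrelated types:
try:
    result = "42" + 1
except TypeError:
    result = "refused"
print(result)  # refused
```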
I also think Go (and languages like it), designed to scale, use the cloud, and be safe and fast (performance close to C), is the solution and the future. Here's a quote from Ian Lance Taylor: "Go was deliberately built from the start to support large scale programs implemented by hundreds or thousands of different programmers. Those kinds of programs are written at Google, and Go was designed to be used to write programs at Google."
More from Ian on this topic here - https://www.quora.com/Will-the-Golang-code-become-unmaintain...
A few core things:
The GIL problem is frustrating. I generally just try to avoid writing threaded code and instead execute calls asynchronously or run parallel instances and work with Celery or something similar. For web applications, this tends to work well enough for me personally but this is definitely a valid issue.
Python2 vs Python3 is sort of a done deal at this point, in my opinion. I was a big holdout on moving to 3 for a while but, for the past couple of years, I haven't had more than 1 or 2 cases where I had serious issues. I don't have to deal with legacy codebases though.
On a larger note, I would say that this all points to the fact that it is easy to do things the "wrong way" in Python. The accessibility of the language is great, but it is not without its issues if you go in expecting it to be a fenced-in playground with batteries included (to mix metaphors).
I also think that the whole "drop down to C argument" is sometimes viewed as a cop-out when criticizing python, but I personally believe that being proficient at high-level and low-level languages, knowing their strengths and weaknesses, and transitioning between them when it makes sense is stronger than just using purely high OR low-level. It's sort of like having multiple gears on a car: you need all of them, and they are all useful and most opportune in different scenarios.
In terms of progress I love what .NET has achieved and found that libs like Nancy  are good to explore.
Disclaimer: I am not working as a professional programmer anymore so I look for fun and "move faster" tools.
This is a false dichotomy -- there are actually many different ways to package a Python module, including source distributions (sdists). These are just the two most well-known types of built distributions.
> The numpy module is a wheel, i.e. it is a source distribution that depends on a handful of system libraries — gfortran, blas, lapack, atlas — to compile properly.
This is not correct: a wheel is specifically not a source distribution; it is a built distribution. The dependency on platform-level libraries has nothing to do with what type of distribution it is.
NumPy publishes a lot of wheels for a given release (https://pypi.org/project/numpy/#files) each of which is platform-specific, since they are pre-built (pre-compiled) for a given platform.
In the event that a built distribution isn't available for your platform, that's when pip falls back on the source distribution, which requires the build step (hence the dependency on gcc or clang).
> This is unsettling if you’re coming from a Java or C/C++ world and are expecting access modifiers
It's unsettling to a lot of people, but so what? I still don't see a solid use case for where member access (not "privacy") is a good idea. Python is one of the only languages to get access modifiers right (convention), imo and I don't even like Python.
You could implement some "privacy" with a lambda closure (which gets increasingly complex), but that's one more technique that lambdas are ill-suited for.
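A sketch of that closure-based "privacy" (all names hypothetical). Note it needs nested defs rather than lambdas, since a lambda body can't contain the nonlocal statement, which is part of why lambdas are ill-suited here:

```python
def make_counter():
    count = 0  # effectively private: reachable only via the closures below

    def increment():
        nonlocal count
        count += 1
        return count

    def value():
        return count

    return increment, value

inc, val = make_counter()
inc()
inc()
print(val())  # 2 -- and there is no outside handle on `count` itself
```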
> lambda closure
Just FYI (not intended to be mean or anything), in formal PL theory this phrasing is a little off. A closure is the pairing of a function (often an anonymous "lambda") with the environment it was defined in. Python's lambdas actually are closures: they capture variables from the enclosing scope. What bites people is that the capture is late-bound, which is the source of the common pitfall where lambdas created in a loop or comprehension all end up seeing the final value of the loop variable.
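For reference, the pitfall usually cited here is late binding: the lambda does capture the enclosing variable, but by reference rather than by value, so lambdas made in a comprehension all see that variable's final value:

```python
funcs = [lambda: i for i in range(3)]
print([f() for f in funcs])  # [2, 2, 2], not [0, 1, 2]

# The usual fix snapshots the current value with a default argument:
funcs_fixed = [lambda i=i: i for i in range(3)]
print([f() for f in funcs_fixed])  # [0, 1, 2]
```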
Hammer is viewed as a ubiquitous [sic] tool; however, its design limits its potential as a reliable and high performance drill. Unfortunately, not every carpenter is aware of its limitations.
For what it's worth, '[sic]' doesn't necessarily mean "that was spelled incorrectly", only, literally, "thus": i.e., "that's the way it was in the original; don't blame me." As such, it can be used to indicate misspellings or 'misconceptualisations' ("I know that it's the wrong word, but it's the one that was there"), as nmyk (https://news.ycombinator.com/item?id=17484468) indicates.
Self-aware clickbait is the worst kind of clickbait.
"Look at me using the word hate misleadingly, hate is such a strong word". Is it though? This is like the title-writing version of /r/madlads. :-)
In fact, long before the web, authors used fancy titles for articles that you had to buy the magazine or newspaper even to read, so those titles had no advertising-like value for attracting unsuspecting readers.
Were you self aware when using a double negative?
In practice it doesn't really show up: the cases where code is actually CPython-specific usually imply all the work is being done in highly optimized C bits. Hence, those programs, despite being run on CPython, aren't CPU-bound.
Finally: while this FUD is really pernicious because it used to be hard/impossible to get scientific code like numpy to run on PyPy, that hasn't been the case in a long time. Right now you can just pip install numpy and it'll just work.
The python 2 vs 3 issue isn't really an issue if you're starting with the language. There was a time where lots of big libraries were only supported by python 2, but that's definitely not the case any more.
OTOH the build system is definitely a common pain point, and the GIL forces you to use other services (e.g. redis + multiprocessing) to achieve horizontal scaling which is unfortunate. And of course python's crappy performance is something anyone looking at languages should be aware of.
It's a clusterfuck of half-baked, non-standard, spaghetti code approaches.
The Zen of Python says "There should be one-- and preferably only one --obvious way to do it." Python package management makes that into a joke.
So being a professional Python/Django coder, I was aware of all these issues. And yet, those issues do not seem to stop the Python train from moving forward.
A couple of issues to note in his article: Everything he says is true, but the conclusions are mostly false.
1. Optimizing databases and data access are the number one scalability issue. It's the hardest thing going from a monolith to a microservice architecture. Goroutines, threads, etc. don't solve that for you.
2. Threads are terrible in any language (except maybe Erlang), and reliable code cannot be built on primitive thread constructs. Goroutines are good enough, sure; so is Python's asyncio. But not raw threads. I've seen production systems that no one could debug because 3000 threads ran amok: deadlocks, race conditions, etc. And you can't unit test threaded code effectively either.
3. All software has multiple build systems. C/C++ has autotools, CMake, SCons, Waf. Java has Ant or Maven or Gradle or whatever. Go is sane for the most part, until you get to dependency management: glide, govendor, etc.
4. Strongly typed, statically typed systems (Java, C++) usually aren't. Usually people rely upon that as a crutch to think they don't have to worry about types, until they see NullPointerExceptions (java) or segfault (C/C++) in production.
5. Python code is only as reliable as the person who wrote it. But that's true of C/Java/C++ etc. People have to understand where the pitfalls of a language are, and that's usually documented in coding standards or style guides.
- large and inconsistent stdlib api,
- extremely wordy but somehow still vague documentation
- overwrought build/packaging system (although tooling has improved)
- endless runtime gotcha debugging in production
- almost all the baggage of OO boilerplate but none of the benefits of type-checking.
- half-baked functional paradigms
Although I would add the incredibly slow process start time to the list as well.
I took over a smallish half-completed web application written in python a couple of years ago. Nothing major - simple business-y analytics/data viewing app with some fancy charts and tables etc - perhaps a couple of thousand lines of code interfacing to backend database unique to our business. It had some tests, but nowhere near 100% coverage. I needed to finish it off and then get it into production.
Long story short, trying to understand someone's half-completed python and make changes to it was a HUGE nightmare for precisely the reason the article outlines: python expects you to remember all the minutiae of all the code you are calling. If you can't remember, then you need to suck it up and pick your way through the code, mentally keeping track of what is going on at every stack frame.
So instead of thinking about the actual problem you're trying to solve, python forces you to keep track of all the little bullshit that you shouldn't need to care about. And even then, you can't be 100% sure you got it right until you exercise that code (either through 100% test coverage, or, shudder, at run-time at 3am on a Sunday when a user in Japan logs on).
So far from being productive and "easy to use", python for us was a nightmare of low productivity and frustration in only a very small application being worked on by just 2 or 3 remote developers. I can't imagine the levels of pain and suffering for larger projects and teams!
The "solution" is apparently extensive, specially crafted comments that specifically explain the types used (and that of course need to be kept up to date). I find this fairly odious: why not just use a typed language if you need to put the types in the comments?!
I know now that Python is looking at type hints as part of the language (PEP 484 - https://www.python.org/dev/peps/pep-0484/) that should help a lot.
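For the curious, a minimal sketch of what PEP 484 annotations look like (function and names hypothetical). The annotations change nothing at runtime, but a checker like mypy can flag a bad call before the code ever runs:

```python
from typing import List

def mean(values: List[float]) -> float:
    return sum(values) / len(values)

print(mean([1.0, 2.0, 3.0]))  # 2.0
# mean("oops") still only fails at runtime under plain CPython,
# but a static checker such as mypy rejects the call site up front.
```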
Still, it's not all bad. As a result of the pain of this application (which incidentally was damn slow in production due to the processing we had to do on the data in the python backend before sending it to the browser), we decided to learn golang and use that for future work instead, and have not looked back.
Golang has the same feeling of fast-paced "throw things at the screen" productivity, but is obviously typed (so you know about errors before your users do) and orders of magnitude faster.
I have seen python code written this way. It's not a joy to work on. A function returns this type of class or that other type of class based upon its input, and the two are not interchangeable, so you have to refactor a bunch of code if the type changes underneath you.
The typical pattern is a method should return something of one kind of class only or raise an exception. And then newer tools like PyCharm can help you out here by knowing what the functions are going to return.
His outburst is an emotional manifestation of the different sources of frustration I've had working with Python that I've tried to document.
A common theme I've seen in responses is that many claim that I've mistaken Python as a weakly-typed language countering that it is indeed a strongly-typed language -- obviously an argument of semantics. I'll define a "strongly-typed" language as one where it is trivial to identify the type signature and definition of any function or variable in a program. In Python, this is non-trivial.
FYI, this article is also cross-posted on:
Regardless of language, x -> a + 1 is not going to work if a never has a value. In many cases handling “a doesn’t have a value” is the wrong way to address the problem, because the problem should be solved as “figure out why the data collection didn’t work, and propose a strategy for handling our dataset when either a value is missing or is provably wrong.”
As an example one might assume that a video feed has exactly as many frames per second as it says on the tin (eg: 24fps will have exactly 24 frames for each second of video) but in reality there will be missing frames due to data corruption, or even a mismatch in clock speeds meaning that over 100 seconds you have 2398 frames instead of 2400. How do you handle the missing two frames? The problem is there are no frames missing, it’s just that they were never there to start with, and frame 1237 of another source has no direct equivalent in the slightly-faster stream. You might alter the clock for processing, duplicate one or two frames, use interpolated frames, or otherwise process the various sources so that you get a consistent synthesised data stream with which to do further processing.
Strong typing doesn’t help solve real problems.
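The frame-rate juggling described above can be sketched as a nearest-frame resampler (all names hypothetical; a real pipeline would likely interpolate rather than duplicate frames):

```python
def resample_frames(frames, src_fps, dst_fps):
    """Map each target frame time to the nearest source frame,
    duplicating or dropping frames as the rate mismatch demands."""
    duration = len(frames) / src_fps
    n_out = round(duration * dst_fps)
    out = []
    for k in range(n_out):
        t = k / dst_fps                                   # target timestamp
        src_idx = min(round(t * src_fps), len(frames) - 1)
        out.append(frames[src_idx])
    return out

# 2398 frames over 100 seconds (effective 23.98fps), resampled to 24fps:
frames = list(range(2398))
out = resample_frames(frames, 2398 / 100, 24)
print(len(out))  # 2400 -- two frames end up duplicated
```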
But seriously, the points raised by the author are valid. However despite this Python is still a pleasure to use. In addition the author fails to mention some newer things, which can help with some of the problems Python has:
1. Python Type annotations
2. Please use the concurrent futures package for multithreading: https://docs.python.org/3/library/concurrent.futures.html
3. For distributed concurrency please use Dask: https://dask.pydata.org/
4. Please use pipenv for package management:
5. For performance, have you tried numba:
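As an illustration of point 2 (the work function is a stand-in, not anyone's real code), concurrent.futures keeps the thread plumbing out of sight:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_length(word):
    # stand-in for real I/O-bound work (a network call, a file read, ...)
    return len(word)

words = ["alpha", "beta", "gamma"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fetch_length, words))  # preserves input order
print(results)  # [5, 4, 5]
```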
Also there are very few packages actually using Numba. It seems that more people like to talk about possibly using it than actually use it.
for x in some_iterator():
    a = x
# Does a exist here or not?
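A runnable version of the question, as a hypothetical helper:

```python
def last(items):
    for x in items:
        a = x
    # `a` (and `x`) leak out of the loop, but only exist if the body ran:
    return a

print(last([1, 2, 3]))  # 3

try:
    last([])
except NameError:  # UnboundLocalError, a subclass of NameError
    print("`a` was never bound")
```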
But I agree with the author's central point that the language has a number of issues.
He mentions the python2 vs python3 issues, but doesn't discuss what is to me the biggest problem of all that: there is no standard simple way to identify within the code what version of python it needs or is written for. Coming originally from the perl world, that has always seemed like a significant oversight.
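The usual (if unsatisfying) workarounds are a runtime guard at import time, sketched below, or declaring `python_requires` in the packaging metadata; neither is a standard in-file marker like Perl's "use v5.something;":

```python
import sys

# Fail fast with a clear message instead of a mysterious SyntaxError
# somewhere deep in the code:
if sys.version_info < (3, 6):
    raise RuntimeError("this program requires Python 3.6 or newer")
```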
Nor does he mention several of my other biggest issues with the language, such as the arbitrary assignment of some operations to global functions (e.g. a=[3,2,1]; len(a)) while others are method calls (e.g. a.reverse()) which return nothing. And then there's list(reversed(a)) and all the iterator awkwardness ...
Explaining to students learning programming for the first time why neither of these do what they expect is never fun.
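The asymmetry is easy to demonstrate:

```python
a = [3, 2, 1]
print(len(a))             # 3 -- a global function
print(a.reverse())        # None -- a method that mutates in place
print(a)                  # [1, 2, 3]
print(list(reversed(a)))  # [3, 2, 1] -- reversed() yields an iterator
```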
Who needs a __str__ method for listreverseiterator anyway? They cherry-picked the worst of the Java world: unreadable names and batteries excluded. Yuck!
Some of the largest-scale enterprise distributed systems, as in, at the telecoms level, are written in erlang, which is dynamically typed, and highly reliable.
All Erlang variables are immutable; it's not just a default.
This isn't right. Weak/strong typing is on a different axis than static/dynamic typing.
Python is strongly typed. There's no implicit conversion among types (which is one marker of weak-typedness). For proof, consider:
>>> 1 + '3'
TypeError: unsupported operand type(s) for +: 'int' and 'str'
In a weakly-typed language, if your function expects an X and you give it a Y, well, it'll do its best to make do.
Many people think that because Python is dynamically typed (i.e. you can pass anything anywhere without compile-time restrictions), that it must also be weakly typed. This is simply not the case. If I write a function which expects an integer and I give it a string, I'm going to have a bad time despite the fact that the code "compiles" and only fails at runtime.
Edit: I've read further, and the author actually uses their lack of knowledge on this subject to portray the Python community as being willfully misleading in their promotion of the language:
C is statically typed, but it is also weakly typed. You can treat any value as any type if you so wish, either by explicitly casting or by removing checks for implicit casts during compilation.
I'm going to finish reading this article, but... I don't think I respect this author very much, based on what I've seen so far.
Edit 2: While the author's points would be valid by themselves (GIL, build toolchain, dynamic typing), the over-abundance of negative rhetoric shows that the author was merely interested in writing a smear article of Python, essentially. I think it's a pretty weak article overall, and would have been aided by a more fair comparison with notes about why some of these design decisions were made (or at least a less-slanted writing style).
Edit: I was going to mention the author's confusion, but OP did it in their edit.
Edit 2: The author is "Staff Software Engineer at Tesla" ? And he doesn't know C pointers ?
As an aside, I work in PL research (as an assistant) and have never talked to a researcher who disagreed with the assessment I gave above. I'm not saying you're wrong (because you aren't), but rather that there is at least enough of a distinction that the author shouldn't have conflated them to the extent that they did.
No doubt your typical Math Ph.D. type doesn't want a language with a steep learning curve to implement their expertise, so what to do?
Perhaps everyone could at least switch to Go, a language which (as I understand it) provides some basic static type safety, good performance, etc.
I spent six months working in Python a while back and I'll be intent to avoid it from now on.
I know that many libraries and Python standard lib itself predate this PEP but I don't understand why Python 3, which broke backward compatibility anyways, did not fix that at least in the standard library.
If I had a dollar for every time a bug was introduced in Python code due to incorrect indentation during code refactoring, I would retire already.
However, what is the cost of addressing them? Slower development (core lang and ecosystem). Less use in niche fields. Less flexibility to adapt to new challenges.
Python is a rusty toolbox that's been put together over 27 years.
You know why it's still around? Because people can use it to do the things they want to do.
Fixing all the warts would give us the language we need now... in about 15 years. Which would by then be useless.
- The indentation-based syntax: it can easily trip you up when moving big blocks of code around. If you want to put an if/else block somewhere else, hope you get a syntax error, because you can instead get something that seemingly works but does not do what it's supposed to do. Also, sometimes just finding where you are can be difficult.
- Dogmatism: a big thing in the community. They'll wait 20+ years to add string interpolation, a feature common in many similar languages. They'll wait 20+ years to start realising that having expression equivalents of statements is useful.
- Breaking syntax: new syntax is added in what seems to be minor releases, so you either avoid new syntax to be compatible w/ all the 3.* interpreters, or you need to expressly avoid certain interpreter versions. And there's no equivalent of Perl's "use v5.something;" so you need to write code juggling multiple supported versions.
Still I think it is a nice language to have in your toolbox for the vastness of the stdlib and the available 3rd party packages.
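To make the indentation point in the first bullet concrete, here's a small sketch (names hypothetical) of a block move gone wrong, where a one-level dedent is still valid syntax but changes behaviour:

```python
events = []

def send_page(user):
    events.append(("page", user))

def log_event(user):
    events.append(("log", user))

def notify(user, urgent):
    if urgent:
        send_page(user)
        log_event(user)      # guarded, as intended

def notify_after_move(user, urgent):
    if urgent:
        send_page(user)
    log_event(user)          # silently dedented: now runs unconditionally

notify("alice", urgent=False)
notify_after_move("bob", urgent=False)
print(events)  # [('log', 'bob')] -- the moved version logged anyway
```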