Pyre: Fast Type Checking for Python (facebook.com)
640 points by jimarcey 4 months ago | 516 comments



It's interesting how static type systems are slowly gaining more adoption now by being bolted on to popular dynamic languages. It feels like a weird approach compared to using a language that was designed from the ground up to have a strong static type system, but it's practical in terms of easing people into it.


Java's verbosity made us all hate type systems in the early 2000s, so many of us migrated to dynamic languages such as Python and Ruby in the mid 2000s that allowed us to work fast and loose and get things done.

After about 10 years of coding in a fit of passion, we ended up with huge monolithic projects written in dynamic languages that were extremely brittle.

Fortunately, languages with type inference (Rust, Golang, OCaml, Scala, etc.) started becoming the answer to our problems. (We also collectively decided that Microservices were another solution to our woes, though that cargo cult is being questioned.)

So we have a decade of code and packages written in Python and JavaScript that work well for our use cases (Data Science, MVP web apps/services, Database integration, etc.) and that are hard to give up. Often because alternatives aren't available yet in the statically typed languages (Hurry up, Rust!).

There is often a lot of friction to get new languages introduced. I love Rust, but I don't think I can introduce it into our Golang/NodeJS/Javascript environment anytime soon.


> Java's verbosity made us all hate type systems in the early 2000s, so many of us migrated to dynamic languages such as Python and Ruby in the mid 2000s that allowed us to work fast and loose and get things done.

You may be overgeneralizing, depending on whom you mean by the term "we".

More often than not, the code I've written has to be very well trusted when deployed. For me, "getting things done" means getting to effective and trustworthy code ASAP. Static type systems have been invaluable for that work.


I interpreted the comment as "Java's verbosity made us think all static type systems were also verbose". Which I know is what a lot of people still think ("but the lack of REPL!", "but the boilerplate!", "but the elegant code", disregarding that other statically typed languages have all these features and more).


Even Java has some of those now, to some extent.


Especially so in Scala and Clojure.


I don't really want to get into a flamewar about static vs dynamic. I'm a polyglot, I use several languages with multiple flavors of type system, I think that most options have at least a couple things to recommend them.

However, the grandparent has a point: Java's type system makes static typing much more painful than it needs to be. I didn't start working in Java until fairly recently, and it was only then that I started to understand how many fans of dynamic languages could say that static typing mostly just gets in the way. But, if the only static language you've spent much time with is Java... now I get it. Java's type system mostly just gets in the way.


How does Java's type system get in the way?


It's statically typed, but with basically no type inference (the diamond operator is something, I guess, but not much). So you end up putting a lot of time into manually managing types. That creates friction when writing code, since you need to remember the name of the return type of every method you call in order to keep the compiler happy. Worse, it creates friction when refactoring code, since any change that involves splitting or merging two types, or tweaking a method's return type, ends up forcing a multitude of edits to other files in order to propitiate the compiler. I've seen 100-file pull requests where only one of those file changes was actually interesting.

Then, to add insult to injury, its static type system is very weak. A combination of type erasure, a failure to unify arrays with generic types, and poor type checking in the reflection system means that Java's static typing doesn't give you particularly much help with writing well-factored code compared to most other modern high-level languages.

Speaking of generics, the compiler's handling of generics often leaves me feeling like I'm talking to the police station receptionist from Twin Peaks. Every time I have to pass "Foo.class" as an explicit argument when the compiler should already have all the information it needs to know what type of result I expect, I cry a little inside.

Long story short, if I could name a type system to be the poster child for useless bureaucratic work for the sake of work, it would be Java's.


Some fair points... some comments and one question:

1. Java 10 has type inference so that should improve your first point going forward to some degree. That said, I would also say type system syntax != type system.

2. Compared to what other modern high-level languages? Also a slight moving of the goalposts, but which other modern high-level languages with some market adoption?

3. Agree with passing `Foo.class` or type tokens around. Very annoying.


It's not powerful enough at the same time as being overly verbose.

After using python for a bit I came back to Java and tried to do all sorts of things that just were not easy without a lot of faff.


What would you consider to be a good static typing system?


C#'s, if you're looking for a Java-style language done right.

Objective-C's is also interesting, for one that makes some different design tradeoffs. Being optionally static with duck typing, for one.

Any ML variant, if you want to see how expressive a language can get when static typing is treated as a way for the compiler to work for you, not you working for the compiler.


ReasonML has it just right imho


I think that "Java bad" isn't all of the story of the move away from static type checking. Type checking in programming probably originated, and certainly featured prominently, in selection of representation of values. I think it's a valid insight of the dynamic camp that for most purposes we don't care how things are represented so long as we know how to work with them. What's often missed is that types can be a powerful tool for talking about other things, too.


Statically typed languages are harder to test. So if you do cover 100%, dynamic is not so bad. However, well-built static languages reduce the things that need to be tested in the first place, like non-nullability in Kotlin and Swift.


I have to ask: what makes you think that statically typed languages are harder to test? My experience is precisely the opposite. Large testing codebases can benefit hugely from the increased refactorability. In addition, the types help to explicitly define the contracts you need to test.


I think what dep_b refers to is that, in dynamic languages, you usually have an easier time injecting mocks and doubles. In a statically typed language, it's usually much harder to inject mocks for IO, network, clock, etc., unless the original code has already been written to afford that (e.g. that whole Dependency Injection mess in Java).


That's exactly what I meant, thanks


Java's verbosity made us all hate type systems in the early 2000s so many of us migrated to dynamic languages such as Python, Ruby in the mid 2000s that allowed us to work fast and loose and get things done.

This was actually a replay of what happened with Smalltalk versus C++ in the 80's and 90's, which was a part of the history of how Java came about. And even that was a replay of what happened with COBOL and very early "fourth generation" languages (like MANTIS) from a decade before that!


I don't know this history. Would you mind expanding on it, or giving a link where I could learn more?


C++ is an utterly complex language. Java appeared as an option to simplify programming compared with all the bureaucracy of C++: no need to manage every bit of memory (GC), no multiple inheritance, no templates, no multi-platform hell, a big library included, etc.


Smalltalk was not available to most programmers back then, it needed an expensive machine with a lot of memory and the implementations were very expensive. Apps were also much smaller, so the disadvantages of C were less pressing.


Smalltalk was not available to most programmers back then, it needed an expensive machine with a lot of memory

I was programming back then. It ran just fine on fairly standard commodity hardware from the time 486 stopped being "high end." Also, at one point the entire American Airlines reservations system was written in Smalltalk and running on 3 early 90's Power Mac workstations.

the implementations were very expensive.

More or less true. At one point there were $5k and $10k per-seat licenses.

Apps were also much smaller, so the disadvantages of C were less pressing.

There was a major DoD project that let defense analysts do down-to-the-soldier simulations of entire strategic theaters. (So imagine this as an ultra-detailed, ultra-realistic RTS.) They did this as a competition with 3 teams, one working in C++, one in another language I can't recall, and one in Smalltalk. The Smalltalk group was so far ahead of the other two, there was simply no question. That was a complex app. There were countless complex financial and logistics apps.

So, small apps? Not so much.


It was already clear by the mid 90's.

Bare bones C vs something like Turbo Pascal 6.0 with Turbo Vision framework on MS-DOS 5.0.


Verbosity is usually the worst argument against a language. Your coding efficiency is not limited by how fast you can type. I've been using Kotlin recently which is basically just Terse Java and it's very nice but hasn't turned my world upside down.


Verbosity absolutely hurts comprehension of code. It's easy to hide hugs in code that seems to be just boilerplate. It also means that, given fixed screen real estate, you can see less of the actual logic of the code at a time.


Absolutely this.

For whoever tells me verbosity isn't a limitation of a language: find me the single incorrect statement in a 100 line function vs a 10 line function.

And no cop outs with "I use {bolt-on sub-language that makes parent language more concise}" (that's not a mainstream language then) or "Well, you can just ignore all the boilerplate" (bugs crop up anywhere programmers are involved).

Or give me an ad absurdum concise counterexample with APL. :P

Ultimately language verbosity is mapped directly to proper abstraction choice. In that the language is attempting to populate the full set of information it needs, and can either make sane assumptions or ask you at every turn.


The fact that even the pythonistas are now adopting types suggests that verbosity is much less of a concern than a bunch of spaghetti code that cannot be tested, understood, or refactored. You have to squint really, really hard to think that the people who chose type-less languages over Java ten years ago actually made the right choice. Personally, when diving into a codebase, its "verbosity" has never been an actual issue. Nor has lack of "expressive power." Of much greater concern is how well modularized it is, how well the modules are encapsulated, and how well the intentions of the original authors were captured. Here verbosity and types in particular have been absolutely invaluable. I suspect in the end this is why serious development at scale (involving many programmers over decades) will never occur in "highly expressive" languages like Lisp and, to a lesser extent, Ruby etc. It is simply not feasible.


As I dive deeper and deeper into this thread, it looks like people are confusing "verbosity" with "it-has-a-type-system".

Java (5, 6) wasn't verbose just because of types. Java was verbose because the language, and everything surrounding it, was verbose. It was difficult to read Java at times because the language had been gunked up with AbstractFactorySingletonBeans. FizzBuzz Enterprise Edition is a joke that is only funny, and simultaneously dreadful, in the Java world. However, despite being relatively more complex, Rust is far less verbose than Java, even though Rust is more powerful with regards to types. "Hello World" in Rust is 3 lines with just 3 keywords. The Java version has 12 keywords.

Engineers ten years ago weren't choosing Ruby/Python over Java because of static typing. They didn't choose Java because it was a relative nightmare to read and write.


Lambdas saved the language. Java 6 was the kingdom of nouns. You couldn't pass statements or expressions, so instead you had to create classes that happen to contain what you really wanted. Async code was nearly unreadable because the code that matters is spread out and buried.


This was said in other threads under the article, but we've definitely made huge strides in more efficient typing.

The general narrative of "early Java typing hurt development productivity" to "high throughput developers (e.g. web) jumped ship to untyped languages" to "ballooning untyped legacy codebases necessitated typing" to "we're trying to do typing more intelligently this go around" seems to track with my experience.

Generics, lambdas, duck typing, and border typing through contracts / APIs / interfaces being available in all popular languages drastically change the typing experience.

As came up in another comment, I believe the greatest pusher of untyped languages was the crusty ex-government project engineer teaching a UML-modeling course at University.

To which students, rightly, asked "Why?" And were told, wrongly, "Because this is the Way Things Are Done." (And to which the brightest replied, "Oh? Challenge accepted.")


I really think what saved Java is the really good tooling. These nice modern IDEs almost program for you.


10 years ago I was writing Java and still am today, alongside other languages.

I will never choose Python/Ruby for anything other than portable shell scripts.


15 years ago I wrote Python code for a living. Then about 9 years of Java. The last four years have been exclusively Python. I'm never going back to Java; it has nothing I want.


What's your job while using Python?


Each to his own I guess.


(There’s only one keyword in Rust’s hello world, “fn”.)


And I think 3 in Java's? public, class and static.


Does "void" count?


That seems to be a keyword, yes. So 4.


Python types are optional, and have adequate inferencing. Anywhere you think it's too verbose to use types, you don't have to. In Java, you must use types even if you believe they are just boilerplate. That's an essential difference.
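To sketch what "optional" means in practice (the function here is invented for illustration), annotated and unannotated code can live side by side, and a checker like mypy or Pyre will still infer the locals:

    from typing import List

    # Annotated at the boundary: callers and the checker know the contract.
    def average(values: List[float]) -> float:
        total = 0.0  # locals stay unannotated; the checker infers their types
        for v in values:
            total += v
        return total / len(values)

    # Completely untyped code is still fine in the same module.
    def quick_script():
        print(average([1.0, 2.5, 4.0]))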


I keep a mental list of the qualities of good code, "short" and "readable" are on the list. I've sometimes wondered whether "short" or "readable" should be placed higher on the list, and I eventually decided that short code is better than readable code because the shortness of code is objectively measurable. We can argue all day over whether `[x+1 for x in xs]` is more or less readable than a for-loop variant, but it is objectively shorter.

Of course, it's like food and water, you want both all the time, but in a hard situation you prioritize water. Likewise, in hard times, where I'm not quite sure what is most readable, I will choose what is shortest.


> I eventually decided that short code is better than readable code because the shortness of code is objectively measurable

I can debug sophisticated algorithms code that is readable and explicit far more easily than short and concise. Anyone that tells you otherwise has never had to debug the legacy optimization algorithms of yesteryear (nor have they seen the ample amount of nested parens that comes from a bullshit philosophy of keeping the code as short as possible).


All arguments about computer languages will always end up in disagreement, since every person in that argument does programming in an entirely different context.

Short is good when the average half-life of your code is less than a month.

When you're writing something for 10 years and beyond - it makes sense to have something incredibly sophisticated and explicit.

Otherwise it doesn't, since the amount of time it takes me to comprehend all of the assumptions you made in all of those nested for loops is probably longer than the lifetime of the code in production.

List comprehensions have a nice, locally-defined property in Python: they will always terminate.


Only if you iterate over a fixed-length iterable.

    [x for x in itertools.count()]
will never terminate.


It will terminate as soon as it runs out of memory.


By this definition, python has a nice locally defined property that it will always terminate ;)


No; this will never-ever terminate:

    (x for x in itertools.count())


Obligatory Dijkstra: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."


That's actually Brian Kernighan. Dijkstra would have never advocated debugging to begin with.


So you must be a fan of obfuscated C contests?

The main reason list comps in Python were given so much praise is because of how they are (were?) more efficient than loops. I personally find a series of generator expressions followed by a list comp more readable than a three-level list comprehension, although the latter is more concise.


If you reliably generate the boilerplate and throw it away, you can ignore it (and you've changed which language you're really using). If it's at all possible for a human to permanently edit the boilerplate, well now it can be wrong, so you have to start reviewing it.


A valid point. I didn't mention it above to stay concise, but the question then becomes:

If you can reliably generate boilerplate, AND it's non-editable, then why is it required by the language in the first place?

If it is editable, then it collapses back down into review burden.

I think this is where "sane, invisible, overridable defaults" shines. Boilerplate should be invisible in its default behavior. BUT the language should afford methods to change its default behavior when necessary.


> no cop outs with "I use {bolt-on sub-language that makes parent language more concise}" (that's not a mainstream language then)

Why is "I use {bolt-on} that makes {parent language} more concise" a cop out? The bolt-on could be a macro language or an IDE that does collapsing of common patterns. If it makes it easier to find a bug in a 100-line function in the parent language, or to not generate those bugs in the first place, then the {bolt-on} isn't a cop out.


Because I believe language stability is proportional to number of users.

Would I use a new transpiler for a toy personal project? Absolutely! Would I use it for an enterprise codebase that's going to live 10-15 years? No way!

If you accept that every mapping is less than perfect (e.g. source -> assembly, vm -> underlying hardware, transpiler source -> target), then it follows that each additional mapping increases impedance.

And impedance bugs are always of the "tear apart and understand the entire plumbing, then fix the bug / add the necessary feature" variety.

When I'm on a deadline, I'm not going near that sort of risk.


I see "transpilers" as being on a continuum ranging from IDE collapse comments and collapse blocks at one end, to full code generation syntax macros at the other end. There's a sweet spot in the middle where the productivity gains from terser code outweigh the impedance risk.


With a proper type system you can often trade away the verbosity through type inference. Still, I'd argue that even if you couldn't, the extra 'verbosity' you take on from writing types in a language with a strong type system (Haskell, Rust, Scala, Ocaml, etc) is actually paid back in spades for readability. Particularly because you can describe your problem space through the types, and enforce invariants.

It's really just the 'static' type systems that only prevent the most pedantic of type errors where the argument holds any merit.


It's all just trade-offs. "Verbosity" is too abstract to argue by itself because it comes in so many different flavors and spectrums.

For example, the Elm lang eschews abstractions for boilerplate, deciding that the boilerplate is worth the simplicity. And I've come to appreciate its trade-offs. For example, coming back to Elm after a long break, I've not had to spend much time recredentializing in it like I had to with an old PureScript project.

On the other end of the spectrum, there's something like Rails that avoids boilerplate at the high price of having to keep more complexity in your head.


I personally love introducing lots of hugs in my code.


Write hugs, not bugs!


Not if the verbosity is providing extra information. Just taking away types isn't conciseness, it's hiding complexity. Tracing Java in IntelliJ is always trivially easy. Tracing JavaScript is Kafkaesque. Python is in between.


The problem of language verbosity is not about writing code, it's about reading code.

Writing code is rarely problematic. Usually when you sit down to write code, you have a clear idea of what you want to do. Or if you don't, you have the advantage of being intimately aware of what you're writing.

Once your project becomes large enough that you can't hold it all in your head at once, reading code becomes supremely important.

Any time you write code that interacts with other parts of your project in any way, you will need to understand those other parts in order to ensure you do not introduce errors. That very frequently means being able to read code and understand it correctly.

There's a saying that issues in complex systems always happen at the interfaces.


I see language verbosity as an advantage when reading foreign code for maintenance.

Much better than deciphering some hieroglyphs.


That's a strawman. Concise code != hieroglyphs.

In fact, if you need hieroglyphs to keep your code concise that's a deficiency of the language. This is why we want expressive languages in the first place.


The size of your code is the number one predictor of bugs. The more code you have, the more bugs you probably have. Smaller code bases have fewer bugs. Verbosity means more code.

This is why very terse dynamic languages like Clojure often have relatively low bug counts despite a lack of static checks.

Some interesting reading on this topic:

https://cacm.acm.org/magazines/2017/10/221326-a-large-scale-...


From the very article you referenced: "One should take care not to overestimate the impact of language on defects. While the observed relationships are statistically significant, the effects are quite small. Analysis of deviance reveals that language accounts for less than 1% of the total explained deviance." That's a tiny effect size.

Moreover, Clojure is fairly compact, but isn't really a "terse" language. Consider APL and J as examples of truly terse languages. Programs written in them are generally horrible to read and maintain (at least for an average programmer who didn't write the original code!). So there might be some relationship between verbosity and quality, but the relationship is far more complex than "verbosity -> more code -> more bugs." Otherwise we'd all be building our mission-critical software in APL.

Plus, there are numerous well-known cases of bugs caused because a language provides a terse syntax, where redundant syntax would have prevented the problem. E.g., unintentional assignment in conditional expressions ("=" vs. "=="), and the "if (condition) ; ..." error in C-like languages. I've personally made similar errors in Lisp languages which are about as terse as Clojure, e.g., writing "(if some-cond (do-this) (and-that))" instead of "(if some-cond (progn (do-this) (and-that)))".

Redundancy in a programming language is often a safety rail: it shouldn't be casually discarded in the name of brevity for its own sake.


Personal experience tells me that size is probably a proxy for "this code is mathematically written". If you have even a vague idea of the math of the code you're writing, the code tends to be both shorter and have fewer bugs. But, I'd be wary of turning that around to a blanket endorsement of terseness. Terse code often needs to be restructured when requirements change. Restructuring takes more time and also risks adding new bugs. Then there are problems with readability during debugging and understanding interfaces.


> Terse code often needs to be restructured when requirements change.

It depends if the code is terse because of the programmer’s approach, or just naturally terse because of the language.


Well, there are bugs and there are bugs. With some, it is easy to find the offending bit of source code and spot the bug right away. Others may take days to localize and fix.

Bug count is not the whole story.


> Your coding efficiency is not limited by how fast you can type

Verbosity hurts reading, not typing. Think of reading an essay that takes hundreds of pages to make an argument that could have been written in a single paragraph.


That's simply not true, unless you're talking assembly-level of detail.

High-level language constructs can hide details in ways that make them harder to read, not easier to read. Ask anyone that has had to read a very complicated SQL statement about how long they had to look at various bits of the statement in order to understand exactly what was going on, and you'll get an idea of what I'm talking about (obviously, that person could be you, also).

In contrast, anyone can very easily understand for or while loops in any language without thinking about them at all. You can read them like you read a book.

It's simply a matter of fact that, unless the hidden details are very simplistic, abstract concepts with no edge cases, terseness hinders readability.

As for things like identifiers, all I can say is that developers that use super-short, non-descriptive identifiers because they think it helps readability are doing themselves, and anyone that reads their code, a grave disservice. They either a) don't understand how compilers work, or b) are beholden to some crazy technical limitation in the language that they're using, ala JS with shorter identifiers in an effort to keep the code size down before minifiers came on the scene.


>It's simply a matter of fact that, unless the hidden details are very simplistic, abstract concepts with no edge cases, terseness hinders readability.

No. Using the correct abstractions helps readability.

I'll agree with you that a complicated SQL statement may not be a good thing to use, but it also probably isn't the right abstraction.

Compare, on the other hand, using numpy/matlab/Julia vs. something like ND4J.

It's the difference between `y = a * x + b` in the former, and `y = a.mmul(x).add(b)`. Granted the ND4J version isn't terrible, but I used an older similar library in Java that only provided static methods, so it was `y = sum(mmul(a, x), b)`, which is fine when you're only working with two operations, but gets really ugly really fast when you want to do something remotely more complicated.

And I'll even note that all three of these are already highly abstracted. If you want to drop all semblance of abstraction, keep in mind that `y = a * x + b` works equally well if `a` is a matrix and `x` a vector, or `a` a vector and `x` a scalar, and separately it doesn't matter if `b` is a vector or a scalar. They'll get broadcast for you.

Overly terse code does indeed hinder readability. But so does overly verbose code. It's much more difficult to understand what is happening in

    outputValue = sumVectors(
        multiplyColumnVectorByMatrix(
            inputValue, weightsMatrix),
        broadCastScalar(
            biasValue, size(inputValue)))
than it is in `out = weights * in + bias`, even though the second is significantly more terse.


I don't know, your example seems perfectly understandable and easy to read to me. The naming is pretty good and descriptive, so anyone can understand what's going on pretty quickly.

That doesn't mean that a DSL or higher level language feature might not be better (the operations are pretty clear and not prone to edge cases, as I said before), but as far as "big problems" go, I find that example to be a pretty minor one.


My example was small but illustrative. If instead of implementing a single linear transform, you're implementing a whole neural network, or a complex statistical model or something, it will be much easier to grok the 10 line implementation than the 150 line one.

That means less surface area for typos, bugs, etc. This compounds if you ever want to go back and modify existing code.


> That's simply not true, unless you're talking assembly-level of detail.

Modern language design (of the past few decades) seems to disagree with you, with a couple of exceptions. This debate involves a degree of subjectivity, of course, but the claim that reducing verbosity and boilerplate hinders readability is generally false. The consensus seems to be the contrary. Even Java -- late in the game -- is adopting features to reduce its verbosity and improve its expressivity.

High-level language constructs and idioms only "hide" unnecessary detail; i.e. the detail where the only degree of control you have is the possibility to introduce bugs. You learn these idioms just like you learned what an if-else or a for loop was.


> Your coding efficiency is not limited by how fast you can type.

It absolutely is, at least for me. Granted, the majority of time spent during programming is on thinking rather than typing, but any time spent typing detracts from time that could have been spent on thinking instead. Whenever I type a line that is too long, I tend to lose focus on what I am thinking and get bogged down by language details. Besides, typing the same useless thing again and again (like repeating a long type name) frustrates people, and frustrated people have a harder time concentrating.


I have a hard time believing 80% of time is spent writing code vs thinking of the right solution.

That ratio alone is enough to not worry about typing efficiency imo.


You either did not read my comment or understand it, because I said the exact opposite.


As a matter of preference, the more verbose a language is, the less likely I'm motivated to learn it. Why should I have to type extra stuff to do the same things I can do in other languages? If the compiler can handle it, I shouldn't have to type it.

Java never stuck with me because of that, same with trying to learn Objective C. But languages like Swift, Go, Ruby, Python hit the sweet spot.


I wouldn't agree with the notion that Kotlin is just terse Java - it has a lot of things that don't exist in the Java world.

If you're just using it as a more terse version of Java, then I can understand why you're not seeing much of a change in your experience.


I can see how you might say that but... what about...

    std::vector<std::pair<std::size_t, std::complex<float>>> data = SomeMethod();
vs

    auto data = SomeMethod();


Adding types to local variables is quite useless and should be considered redundant, non-value-adding verbosity. The main driver for typing is at interface boundaries, so you know what types of input another function expects.


No, it's not useless. Often I want to know the type of an intermediate binding without looking up all the functions/exprs that transformed an input into it. Often an editor plugin helps with this, but otherwise it's a real pain to understand some parts (this is a problem with Rust, the LSP server doesn't help with this, but for OCaml, Merlin works beautifully).


I don't know if I agree with this. If you want to communicate an interface for maintainability, you'll declare your type so you know what you're dealing with in the future.


I'd call that first example nothing but noise. It's probably a bit better with a type alias, but knowing that SomeMethod returns an array of uint/complex pairs doesn't really tell me anything.

If anything, this just shows me bare type information alone isn't useful without accompanying documentation. For example if `SomeMethod` was renamed `CalculateAllEnemyLocations`, it might make sense, but then I get most of the relevant information from the method name.

In other words, you have a bad api, and type information sort of, but not really, makes up for that. But that just means that you're ignoring the real problem.


It doesn't matter if the more verbose version is better; we'll still use the short version because we are lazy. When programming, once you have figured out what to do, it's basically just typing in all the instructions. So typing (and reading) friendliness does matter! If it didn't matter, we would still program in machine code. Also there's some abstraction, where 3 lines of JavaScript would need 300 lines of machine code or Java :P


The problem with verbosity is not how fast you can write but how easily you can read the language. Verbosity with no information gain is distracting.


> I love Rust, but I don't think I can introduce it into our Golang/NodeJS/Javascript environment anytime soon.

Rings especially true for my shop as well. I had to introduce Rust or Go, and went with Go. Seeing the mild pushback I get from Go's type system makes me especially glad I didn't choose Rust.

... though, in some cases, Rust+Generics would be easier than Go.


> Rings especially true for my shop as well. I had to introduce Rust or Go, and went with Go. Seeing the mild pushback I get from Go's type system makes me especially glad I didn't choose Rust.

Another possibility is that Go's static types feel like a lot of ceremony for too little benefit. That was one of my biggest issues with Java, and though Go's lighter it also provides fewer benefits… By contrast, Rust is in the camp of requiring a higher amount of ceremony but providing a lot of benefits from it (sometimes quipped as "if it compiles it works").


Why do you think Go's type system has "less benefit" than Java's type system?


That's my personal, anecdotal feeling as well. Go feels a bit more like I'm yelling at the compiler "you know what I mean, why can't you just do it?!" whereas with Rust it's more "ah, I see, you have a point".


Java has generics for one.

Failed generics with type erasure, sure, but definitely better than Go, which has nada.


Type erasure on its own isn't necessarily a bad thing; Haskell implements type erasure also.


I guess this was your point, but the problem is how Java does type erasure. With Haskell, type erasure is an implementation detail, but with Java it leaks into the compile-time type checker. For instance, you can't have both these overloads:

    public void foo(List<String> list)
    public void foo(List<Integer> list)
This just wouldn't compile.


The problem there is that the equivalent Haskell would dispatch statically, whereas this is dynamic dispatch in java


Not today, but in the future it might, depending on how the work on value types will turn out.


The difference is that Haskell has very little RTTI, so you can't really see the missing types at runtime. For Java it's much more noticeable, because runtime type manipulations are way more common.


Well, Go's type system is a butt of many jokes.


>> (We also collectively decided that Microservivces were another solution to our woes though that cargo cult is being questioned)

I wouldn't really say that, I think it's more that we all discovered huge monoliths don't work in an "agile cloud" environment where you have 10 teams deploying on their own cadences with no coordination (which you have with on premise binary delivery, or waterfall, or when you have implicit coordination by operations because they have to build out the physical infra). Further, I think modularity has become much bigger in the past 15-20 years as more and more people contribute to open source, more problems become "solved", and languages/domain spaces mature. Whether microservices are the best solution to those observations is still up for debate, but cargo cult or not I doubt many engineers these days would use a magical wand to go back to monoliths even if they live in microservice hell right now.


For me it was less about the verbosity and more about the overuse of patterns and general overengineering present in many (most?) Java APIs. Java doesn't strike me as being much more verbose than Go, but the differences in the API designs make a huge difference in how it feels to work with the language.


Go can be terse if everything is an interface{} and you ignore errors. But production quality Go is huge because there are so many things the compiler won't help with. I want a language that would generate the same boilerplate Go code that other people spend actual time writing and reading.


Actually I went the other way around.

The performance problems dealing with Tcl at a startup in 2000 made me never ever again use a programming language without JIT/AOT support for production code.

To this day, Java, .NET and C++ stacks are my daily tools.


I don't think lack of type inference is the primary reason for the unattractiveness of Java compared to dynamic languages. The main reason is the poor expressiveness of Java's types themselves.

You can't even have tuples, nor tuple-like named types or named records. You have to make a class every time, and OOP discipline tells you to hide data in it and make "behavior" public (this approach is definitely not for everything, so "beans" with getters and setters became hugely popular). The ubiquity of hashmaps in dynamic languages is a huge relief after that.

Scala has a reputation of being "rubified Java" rather than "FP for Java" because of the hugely improved expressiveness of its types (the presence of data types).


It's just as possible to have huge, monolithic, and highly brittle projects written in languages with stronger typing support.

The only difference is that you get to eliminate a class of trivial annoyances.


The big difference is that there are tools that allow automatic and guaranteed safe refactors for such languages. For instance, I can't guarantee that something as simple as renaming a method won't cause runtime errors in a dynamic language.


Agreed, and I would not claim it makes no difference, just that it eliminates only one category of brittleness from a project.

Others — poor testing, over-coupling, external dependencies, and the broad category of anti-patterns — are still available to us.


In type safe languages your tools can do refactorings like extract class/extract interface (reduce coupling), create mocks automatically based on type information (helps testing), etc.

Why not let the compiler eliminate a whole class of problems and let the automated tools help you with guaranteed safe refactors?


I get both of these in Python (extract class is provided by a good IDE, and `mock.create_autospec(OBJECT_TO_MOCK)` creates a mock object which raises an exception if called incorrectly and can do a lot of other powerful things).
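A minimal sketch of that autospec behaviour (the `EmailSender` class is made up for illustration):

    from unittest import mock

    class EmailSender:
        def send(self, to, body):
            ...  # the real implementation would talk to the network

    # create_autospec copies the real class's method signatures, so the
    # mock rejects calls that don't match them.
    sender = mock.create_autospec(EmailSender, instance=True)
    sender.send("alice@example.com", "hi")   # matches the real signature
    sender.send.assert_called_once()

    try:
        sender.send("too", "many", "arguments", "here")
    except TypeError as exc:
        print("rejected bad call:", exc)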


Until you try to mock anything related to the boto3 library provided by AWS....

All of the functionality is something like....

s3client = boto3.resource("s3")

The IDE has no idea what s3client is. Since I've just started using Python and mostly to do things with AWS, is this considered "Pythonic"?

Btw, After using PHP, Perl, and JavaScript, and forever hating dynamic languages, Python made me change my mind. I love Python for simple scripts that interact with AWS and event based AWS lambdas.


My Python IDE handles all the "rename a function" burden for me.


yeah, trivial annoyances like

   AttributeError: 'NoneType' object has no attribute 'method'
when you just don't expect it.


Right - this trivial annoyance, the Python equivalent of a NullPointerException, is not actually prevented by the static type system in Java and some other popular static languages. (Kotlin does prevent it, though!)
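For what it's worth, this None-propagation case is exactly what the new checkers aim at. A minimal sketch (the `find_user` function is invented), where a checker like mypy or Pyre flags the unguarded attribute access while plain Python only fails at runtime:

    from typing import Optional

    class User:
        def __init__(self, name: str) -> None:
            self.name = name

    def find_user(user_id: int) -> Optional[User]:
        # Returns None when no user exists, which is where the trouble starts.
        return User("alice") if user_id == 1 else None

    def greet(user_id: int) -> str:
        user = find_user(user_id)
        # A checker reports that `user` may be None here; at runtime this is
        # the AttributeError from the comment above.
        return "Hello " + user.name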


Just to clarify, I'm using trivial here in the technical sense.

YMMV on what counts as trivial to you, of course, after three days straight tearing your hair out!


If your type system only prevents trivial errors then it's not sufficiently 'strong'.


"Just as possible" is a pretty strong claim, given you're effectively saying that types have no effect on brittleness or contrarily, robustness.


> We also collectively decided that Microservivces were another solution to our woes though that cargo cult is being questioned

I'm genuinely curious (and probably absurdly naive), but can you explain why you believe that microservice architecture is being questioned, what alternatives there are and why they are better?


Even java has type inference now.


It's because you believe the pros of type checking always surpass the cons. But the Python user base is very diverse, with a huge difference in tastes, skills, goals, time and constraints.

That's why you can code in imperative, OO or functional and not just one paradigm.

That's why you can choose between threads, processes and asyncio, or callbacks vs await.

And of course, declaring your types or not.

This allows Python to be suitable for geography, data analysis, scripting, web dev, sysadmin, machine learning, UI, pentesting, etc.

Python is never the best language at anything. But it's a damn good language at most things. It's an invaluable powerful versatile toolbox because it gives you the margin to adopt the style that fits your problem instead of forcing one on you.

It also has the benefit of not getting out of fashion precisely because of that.


>Python is never the best language at anything. But it's a damn good language at most things. It's an invaluable powerful versatile toolbox because it gives you the margin to adopt the style that fits your problem instead of forcing one on you.

I think this so much. The best programming language for me isn't necessarily the best at anything, and it shouldn't have to be. But it should be 80% of the best at everything. And that in itself is possibly even harder to achieve.

I really wish Ruby was the case here. But clearly Python is taking this title.


I think the dominant reason Ruby lost out was that Python won the educational market; there was something seductive about forcing children to use good indentation. The final blow came when the difficult 1.8.5 -> 2.0 transition made the act of installing Ruby a challenge; of course this happened to Python not too long after with the 2.7 -> 3 transition.


Frankly, Ruby lost due to the lack of libraries compared to Python. The latter had a few years of head start.

Anecdotally, I've heard the syntax may have been the cause. In those early days, Ruby was too much like Perl. If you didn't like Perl's syntax and philosophy, Python was a good alternative. If you were the type of person who grokked Perl, then why go to Ruby when you have all of CPAN at your disposal?


Precisely. Ruby was such a hit with rails they never bothered creating an ecosystem outside of the web.

JS doesn't have this problem because it has the terribly unfair advantage of a monopoly of the most popular platform in the world.

But Ruby did not, and the server was open to all. And so when competition arrived on the backend, people chose a language that you could use for the web and something else.


If only Ruby had as strong an ecosystem as Python. But right now it's Rails 95% of the time and the documentation sucks most of the time.


I hate that Rails came to dominate the Ruby language in most people's minds. It's such a nice dynamic language on its own that you can use anywhere you use Python (if the libraries exist for it).

And there were minimal frameworks like Sinatra when everyone was running to Node to use Express because of Rails, ignoring that Rails wasn't the only Ruby game in town.


IIRC, Express was actually originally based on Sinatra.

When I was doing a ruby server stuff I was using Sinatra and when I switched to JS (for various reasons) I looked specifically for Sinatra like libraries and found Express.


> Python is never the best language at anything. But it's a damn good language at most things. It's an invaluable powerful versatile toolbox because it gives you the margin to adopt the style that fits your problem instead of forcing one on you.

This is meaningless bullshit. I could say the same thing about Java - tons of battle-tested high-level libraries, and a variety of frontends that all compile onto the same runtime. Just like Python, you can write imperative or functionally, and have your choice of working in a hand-holding high-level framework or down close to the API.

It's a programming language, if it's not a powerful+versatile toolbox it's doing something wrong. Now, if we look at the cases where a programming language is obviously not fit-for-purpose, the common thread is the inability to scale to larger codebases without the programmer effort becoming exponential.

And that's where Python fails, because it's missing a critical piece of the toolbox - type checking. It makes maintaining code vastly more difficult, because you don't have any compile-time checking about how your code is being used. That makes it brittle and difficult to refactor.

> It also has the benefit of not getting out of fashion precisely because of that.

PHP also never goes out of style. Should we all be taking cues for language design from PHP?

I don't think Python is that bad. I write Python code when it's appropriate. But it's not the language I'd choose to write a large, long-lasting codebase in, either.


So has the Python community collectively abandoned "there's only one way to do it"?


I find it funny that everyone misremembers this. The zen states that there should be one, and preferably only one, obvious way to do it. Not only one way to do it, but one obvious way, and if you have one obvious way you probably don't need more than one.

There's always been more than one way to do things: string formatting, loops/map-filter/comprehensions, even importing.

The idea is that for whatever you're doing, the language should provide an obvious way to accomplish that. Sometimes that means having competing ways of doing it, so that distinct but similar problems both have obvious solutions.


> I find it funny that everyone misremebers this.

Probably because the Zen is also right about most people not being Dutch.


That (and the Zen of Python in general) was always intended to be primarily about the development of the language itself and its core libraries. Of course it carried over a fair bit, but I don't think it was ever really a thing enough to say that the community abandoned it.


The zen is a target, but Python fails to reach it very often. After all, duck typing is implicit, there are many ways to format, and the logging library is very much nested.

We are not monks, we live in the real world.


I love type checking and have a huge disdain for dynamic languages.

But I also recognize that I work on codebases that live a long time and change and get refactored A LOT. I can't imagine a statistical researcher being effective if he were forced to write his one-off experiments in something like C#, even though it's almost the best language for me and my tasks.


Precisely.

And the opposite is true. I'm now working on the first code base so huge that every time I open a file, I annotate all the lines I touch.

That's a beautiful thing.


Programming languages are too complicated to distill down to one feature; it won’t make sense if you look in isolation.

Python became popular after strong typing was known but before academic research into more advanced systems had entered the mainstream. For many programmers, that experience meant slow compilers, unhelpful error messages requiring ugly syntax to fix, recreating a lot of practical things which were built in to Python, etc. We don’t know how many of them would have preferred something with e.g. Rust-level tooling and capabilities if that had been available.


I learned Caml Light and SML in 1996.

Our university had quite a few courses where ML was the goto language for project assignments.

In those days Python was hardly used, still trying to get adoption.

By 2000 most Python projects were all about Zope, which I was surprised to find out is still around.


I think there’d be an interesting study about why that happened, and why languages like Perl peaked and declined so quickly.

Anecdotally, I noticed a lot of people who missed some feature from Lisp, C, etc. but generally decided that Python made everything else enough easier that they didn’t mind very much. There’s probably an interesting discussion of language usability in that.


Engineering culture played a big role. Python was quite conservative. Additions to the language had to be proposed in a PEP, implemented, and demonstrated to carry their weight. Perl was extended quite haphazardly by comparison.

Perl 5 was abandoned largely because extending it became so painful, and Perl 6 was this crazy waterfall design process for the first few years. Perl had mostly lost all momentum, by the time Perl 6 development started to get traction.

My experience was that discourse around Python also tended to be more civil and humane, and that came from the top (Guido & Tim.) I think that played a role in or was otherwise somehow connected to the better engineering discipline.


> Perl 5 was abandoned

It's all relative. Perl 5 is more active now than at any time in its history.

Python has expanded 20x in the last 20 years. So Python has in relative terms eaten Perl's proverbial lunch and overshadows Perl so much that it's easy to think Perl just died. But it's actually successfully scavenging and growing nicely in the shadows despite the 20 year long drumbeat of pronouncements of its death.

From my perspective it does look like extending Perl 5's core is painful but that hasn't stopped it growing to a half million lines of change a year (and many millions a year in the upper reaches of the CPAN river and tens of millions further downstream) and folk writing ever more powerful pan ecosystem tools as, eg, described in this report from a couple days ago:

http://blogs.perl.org/users/preaction/2018/05/cpan-testers-a...

(That all said, I'm personally more focused on Perl 6 which is a new ball game even while it's also an extension of the old one in the sense that it cohabits in the same ecosystem and culture.)


Perl 5 is not abandoned, it's just currently not fashionable. There's nothing I can do server-side in any of the other important dynamic languages that I can't do in Perl.

Over the last couple of weeks I've been training people who've been stuck in svn for far too long to use git. My go to three sentences to help orient them has been "git is like perl. It's extremely useful and there's nothing you can't do with it. The problem is there's lots of different ways of achieving the same thing."

I'm not sure your argument that 'top down decisions are good' is valid. And the whole 'whitespace is syntactically meaningful' thing in Python gives me the heebie-jeebies. On the other hand, when I decide I need to learn more maths, Python and sympy are the tools I reach for.


Pining for the fjords, eh?


I think it was one of the first (possibly the first) to introduce serious package management and that had a lot to do with its sudden popularity. Suddenly developers could build upon each other's work really easily. That was definitely the jesustech of its day.

Perl's weak type system and cryptic syntactic muck probably had a lot to do with its decline.


One of Perl 5’s big flaws was not having an object system in the language. Different people wrote different modules on CPAN, which was deservedly popular in those days, but it often meant you’d have to kludge interfaces between multiple third party systems and other things people hacked together.

That and the syntax were why I added Perl to my bash policy that any program big enough to require scrolling would be ported to Python. Usually the richer standard library also meant that the new version ended up considerably smaller, too.


Perl 5 does have an object system, one that was based in no small part on Python.

About the biggest difference is the built-in constructor `bless` is simplistic, and should be wrapped in a method.

  class Person:
      def __init__(self, name):
          self.name = name
      def sayHi(self):
          print 'Hello, my name is', self.name

  p = Person('Swaroop')
  p.sayHi()
In Perl 5

  #!/usr/bin/perl

  use v5.12;
  use warnings;
  use feature 'signatures';
  no warnings 'experimental';

  # Allows for Python style constructor syntax
  sub Person {
    Person->new(@_);
  }

  package Person {
    # this could be added to a base class instead
    sub new($class, @args){
      my $obj = bless {}, $class;
      $obj->__init__(@args);
      $obj;
    }

    sub __init__($self, $name){
      $self->{name} = $name;
    }
    sub sayHi($self){
      say 'Hello, my name is ', $self->{name}
    }
  }

  my $p = Person('Swaroop');
  $p->sayHi();
From a structural point of view, there is almost no difference. What little semantic difference there is could be put into a module.

Not that I would write it that way when Moose-like class modules exist

  # Allows for Python style constructor syntax
  sub Person($name){
    # use the default constructor
    Person->new( name => $name );
  }

  package Person {
    use Moo;
    no warnings 'experimental'; # subroutine signatures

    has name => ( is => 'ro' );
    
    sub sayHi($self){
      say 'Hello, my name is ', $self->{name}
    }
  }

  my $p = Person('Swaroop');
  $p->sayHi
Moose was based on an early design for classes in Perl 6. (and is apparently so good there are implementations of Moose in Python and Ruby)

  class Person {
    has $.name;
    method sayHi(){
      say 'Hello, my name is ', $.name
    }

    # Allows for Python style constructor syntax
    submethod CALL-ME($name){
      self.new( :$name )
    }
  }

  my \p = Person('Swaroop');
  p.sayHi();
Whenever you are rewriting code, you will see ways of making it simpler and more concise. So there are probably just as many instances where if you translated from Python to Perl 5 it would come out shorter. (Or even Python⇒Python, Perl5⇒Perl5)


I don’t really think it’s true. Haskell was released one year before Python.


Please note that I said mainstream. Haskell took a long time to mature into its modern form, add the general features a mainstream language needs, stable performance & memory usage, etc. Even now it’s generally considered one of the more difficult languages to learn and something of a niche commercially.

That’s not a slight against the language — they were focused on other goals — but for a long time the conventional wisdom was that it was interesting for CS research and learning but not normal application development whereas you could get quite a lot of work done reasonably in Python at least one decade, probably two, earlier.


Python is strongly typed. Now it has static types enforced at runtime. It’s good to be precise when there are so many quirks to expressing and checking types.


> static types enforced at runtime

I'm sorry to break it to you, but that's an oxymoron. The whole idea of static types is that they don't need to be validated at runtime.

Python has a static type system which is gradual (ie. allows for untyped code) and is separate from the strongly but dynamically typed semantics of Python at runtime. Now, the static type system tries to mirror the dynamic semantics wherever it can, but it's still a completely separate beast. In other words, you don't enforce static types at runtime - you enforce dynamic types at runtime as usual and have an option of additionally using static type system. By the time you run the code, the static types are mostly gone. The fact that the static types reflect the dynamic type system produces an illusion that the static types remain, but they don't.

If what you said was true, the following code would not work:

    x: int = 0
    x = "0"
    print(x)
but it's still valid Python code which runs just fine.


Not enforced at runtime.

The Python VM does nothing with the computed type hints. They're only useful for 3rd party tools that want to check the types. PyCharm does it natively. mypy and pyre are command line tools you can use standalone or plug into an editor (VSCode integrates very well with mypy).

But that's it.


> Now it has static types enforced at runtime.

That's not how it works. Type annotations are ignored at runtime. And to be pedantic, even if they were evaluated at runtime, they would be dynamic types, not static types. :p

The type annotations exist to make the code easier to read and to enable certain kinds of static analysis (including a type checker).


There's disagreement on that point and I think it's worth pondering. Python has more typing than some similar languages but less than popular static languages. I think there's an argument that the value curve for type checking has thresholds and Python hits a point where a fairly large percentage of people feel that going further starts costing more in overhead than it saves, with many projects being small enough that it's not an especially significant source of problems.


Huh, I thought type annotations were unused at runtime. How can I enable runtime type checking?


Even Java, which introduced generic types long after the language itself had become standardized and popular, cannot keep generic type information after compilation due to backwards compatibility.

Don't quote me on it, but I believe C# doesn't have this problem.


But the compiler checks for this by default, so you'll get a warning unless you really break the type system.


There are libraries that do magic to do this. Annotations are accessible at runtime, so you can do things like add an implicit import hook that decorates every function or class with a decorator that validates the annotations.
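
A minimal sketch of the decorator half of that, leaving out the import-hook machinery (all names are made up, and it only handles plain classes, not things like List[int]):

    import functools
    import inspect

    def enforce_annotations(func):
        # Validate each argument against its annotation at call time.
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                expected = func.__annotations__.get(name)
                if isinstance(expected, type) and not isinstance(value, expected):
                    raise TypeError('%s should be %s, got %s' % (
                        name, expected.__name__, type(value).__name__))
            return func(*args, **kwargs)
        return wrapper

    @enforce_annotations
    def greet(name: str) -> str:
        return 'Hello, ' + name

    greet('Swaroop')  # fine
    greet(42)         # TypeError at call time, purely because of the annotation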


You can't! Or, I guess you could do something hacky with the #!.


Python is not "strongly typed". Please don't repeat this phrase near management folk. Python just gives good error messages when it crashes at runtime, e.g. "expected an Int but got a String". This is better than a segfault, but it's still a crash.

Call it "strongly tagged" if you like, but "type" unqualified should mean static type. Even "dynamic type" is a marketing hijack of the word type.

Professor Robert Harper takes the same position: https://existentialtype.wordpress.com/2011/03/19/dynamic-lan...


Python is strongly typed. It is also dynamically typed, and many people get the two confused.


"getting the two confused" implies that your terminology is standard, which it isn't. It is also quite common to use the term "type" to refer exclusively to properties that are checked statically.

Arguably "strong dynamic typing" is a property of libraries, not languages -- e.g. int conceptually has a method like:

    def __add__(self, other):
        if not isinstance(other, int):
            raise TypeError(...)
        return _unchecked_add_ints(self, other)
It's implemented in C for efficiency, but that's basically the semantics. Critically, this doesn't magically extend to user-written libraries, so unless you actually write all of that boilerplate nothing but a handful of methods in the standard library can be claimed to be "strongly typed". I've written code with a bunch of asserts in methods like that after determining that a large percentage of bugs were type errors that were silently doing the wrong thing. Without something like mypy, Python is untyped by any reasonable definition; it provides no support for users to actually work with types in any meaningful way.


I think you got the user-written part the wrong way around. It is strongly typed because unless you define the operators/methods you explicitly want to enable, they don't exist. For example if you try "object() + object()", you get an exception. You can implement the addition in a subclass, but that's explicit then.

Compare to JS where "new Object() + new Object()" results in a string by default.


You only get an exception if and when something in the bowels of your implementation bangs into a low-level operation that has one of these checks in it. For example:

    >>> import subprocess
    >>> subprocess.call([2])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/subprocess.py", line 172, in call
        return Popen(*popenargs, **kwargs).wait()
      File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
        errread, errwrite)
      File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
        raise child_exception
    AttributeError: 'int' object has no attribute 'rfind'
In the case of `object() + object()`, that check is in `object.__getattribute__`

My point is that the checks are a property of the definitions of those (possibly built-in) classes; the language doesn't provide any facility to talk about types as such. Nothing in the language knows that the argument to subprocess.call should be a list of strings, it just happily executes until it hits what is essentially:

    >>> (2).__getattribute__('rfind')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'int' object has no attribute 'rfind'
And `__getattribute__` throws an exception. This is the best case scenario. Worst case it never hits something that these primitive types are aware is not ok, and it just does the wrong thing. There's no way to specify typing invariants in a way the language understands -- you just have to put in the manual check yourself. This is what I mean when I say strong types aren't part of the language.

But if you add static types to the mix (say via mypy), something actually checks what the types involved in the code you wrote should be, and you'd get something like this:

    error: List item 0 has incompatible type "int"; expected "Union[bytes, str, _PathLike[Any]]"
...which is what mypy reports when run on that code. Python isn't ever checking the type of the argument to subprocess.call, not even at runtime. If you're lucky, eventually you hit some code that has an explicit sanity check in it and it raises an exception.

There's an interesting point in the design space that you see with julia[1], and also with mechanisms like racket's contracts[2], where the types aren't checked statically, but they are checked at runtime, unlike the example above with subprocess, where you only get the error deep in the implementation when something actually explodes. I think you could sensibly say that the "strength" is actually a property of Julia, rather than its libraries, but in Python the "strength" isn't part of the language per se.

The only difference with a "weakly typed" language is that those basic libraries have a much more footgun-like design -- again, not really about the language per se.

[1]: https://docs.julialang.org/en/stable/manual/types/ [2]: https://docs.racket-lang.org/reference/contracts.html


In a weakly typed language, that 2 would be implicitly converted to a string. So yes, python is absolutely strongly typed, as opposed to js which is not.


There's nothing stopping the implementation from doing that, if the authors of the stdlib thought it made sense:

    def __add__(self, other):
        if isinstance(other, str):
            return str(self) + other
        ...
Again, my point is that the distinction is mostly library design.


Indeed, but you have to explicitly do that. The language doesn't do it for you.

Implicit type coercion is the factor that defines weak typing. Python doesn't do implicit type coercion. Therefore it is not weakly typed. Libraries have nothing to do with it.

Yes you could, but you could do the same in Java by having a method take in Object and cast. Which by the way is exactly what's happening in the python code.


I am interpreting the term "library" somewhat broadly -- my notion is that int isn't conceptually any more core to the language than e.g. the numpy matrix classes. It's built-in for speed, but (apart from some syntactic sugar for literals), it's semantically just another class. In this light, the distinction is happening in the code that defines the int class, not in something deep in the language semantics.

The call to str isn't really a cast (which doesn't really have a counterpart in a language without static types); you're calling the constructor to the str class, which somewhere in it has a code path where it formats an int as a string (probably by way of calling the argument's __str__ method).


Sure, but int being core or not is totally irrelevant to whether or not the language attempts to convert things for you.

If int were auto-coerced to string that would be one thing, but in weak languages, everything is coerced to everything whenever it might make a modicum of sense, even in user-defined types.

Literals become strs, objects become strs, strs become ints, arrays and objects can be combined willy-nilly, leading to unintuitive results.

And since your object inherits from one of those things, you are stuck with that too. The language forces you into weakness.

Yeah, okay, you can jump through hoops to make your API kind of weakly typed in Java or Python, but it probably won't work in general: the attempts to coerce will fail because the language doesn't know how to interpret them, since the two types are incompatible. And unfortunately for your argument, not every type can be a library. Object has to be provided by the language itself.

So the question becomes, can you freely coerce between the base object type and another without loss of information. If yes, weak. If not, strong. Python: strong. Js: weak.


    > Object.prototype.valueOf = function() { throw "I will not be coerced!" }
    > 4 + {}
    Thrown: I will not be coerced!
This doesn't seem fundamentally different from overriding getattr. What you've got in Js is basically a handful of pre-defined functions with some syntactic sugar, which are bone-headedly written to go out of their way to not report problems. But this really isn't any different than higher level libraries like nodemailer, whose attempts to be smart have been the source of vulnerabilities before:

https://sandstorm.io/news/2017-03-02-security-review

The impact is the same, and your options for mitigation are the same.

Also, fwiw, Python has some questionable decisions in the basic "libraries" too:

    >>> 4 * '5'
    '5555'


>This doesn't seem fundamentally different from overriding getattr.

Right, but you can't override object.__getattribute__ in python. That's basically where the difference lies.

In python, `object`s don't naturally coerce, and you cannot force them to coerce. In js they do and you can.

Of course in any language you can write a badly behaved object that implements an interface that it really shouldn't, or in a way that is unexpected, but that isn't weak typing.

>Also, fwiw, Python has some questionable decisions in the basic "libraries" too:

Right, but even this isn't weak typing. It might be an interface you disagree with, but its very explicitly written to work this way. Compare that, again, to a weakly typed language like JS where you can multiply a string by anything, and you'll get NaN instead of a type error. In python on the other hand, `4.3 * '5'` gets you a type error, because you can't have a 4 and 3 tenths character string.

Basically, the difference between what python does and what js does is that python says

`* ` is an operator implemented on a left hand and right hand object. So for `l * r`, I first attempt to see if the left hand object knows how to multiply itself by the right hand object (`l.__mul__(r)`), and if that works, great. Otherwise, I check if the right hand can multiply itself by the left (`r.__rmul__(l)`), and if that works, great. Otherwise, these don't match, so throw an exception. This is strong typing, where every object decides, for itself, which other objects it is compatible with. You can certainly write an object which is compatible with anything, but up to you. You get to decide which objects you are compatible with, you're not forced into an all or nothing situation.
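
A rough sketch of that protocol (Meters is made up here; real CPython signals "I don't know this type" with the NotImplemented sentinel rather than raw exceptions, but the shape is the same):

    class Meters:
        def __init__(self, value):
            self.value = value

        def __mul__(self, other):
            # I only know how to be scaled by plain numbers.
            if isinstance(other, (int, float)):
                return Meters(self.value * other)
            return NotImplemented  # "not my problem, ask the other operand"

        __rmul__ = __mul__  # so 3 * Meters(2) falls back to the right-hand side

    print((Meters(2) * 3).value)  # 6
    print((3 * Meters(2)).value)  # 6, via __rmul__
    # Meters(2) * 'x' -> TypeError, because neither operand claims the other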

JS on the other hand says `* ` is an operator defined over things of type number, so to make `* ` work, I will implicitly convert the left and right hand sides to numbers (by repeatedly calling `valueOf`, and then multiply whatever I get from that. The language decides what you are compatible with, and how. You can't just be compatible with numeric types, if you want that, you're gonna be compatible with arrays and objects too, and there's nothing you can do about it.

The difference, to put it more simply, is that python (and strongly typed languages in general) doesn't have valueOf. You're confusing strong typing + operator overloading with weak typing, and those aren't, at all, the same.

The difference becomes clear when you take your example (with the no-coercion object) and try to add an array and an int. The Object's error gets thrown. In python, it would be possible to have a list's __add__ handle an int by appending it, but extend the list when given another list. In JS, that's impossible. That's because what python and Js are doing is different. Python defers to each type for how it wants to handle other types, but allows them to say "nope, this isn't valid". Js doesn't do that: either you are incompatible with anything ever, or you are compatible in ways neither you nor the other object you are working with can control.


The key thing I'm arguing is that + * / etc. aren't special, besides some very superficial syntactic support. Your description of python's `*` basically boils down to:

    def multiply(l, r):
        if hasattr(l, '__mul__'):
            return l.__mul__(r)
        else:
            raise TypeError(...)
Besides the syntax (and, again, the performance implications of the implementation), there really is nothing special about '+', '*', '/' etc. They're just functions. And the implementation above is no more or less a design decision than the implementation of int.__mul__, which does the aforementioned thing with strings. They very much could have implemented `multiply` as:

    def multiply(l, r):
        while not isinstance(l, [dict, list, int, float, str, bytes, ...]):
            l = l.value_of()
        while not isinstance(r, ...):
            r = r.value_of()
        if type(l) is dict:
            l = 'I love buggy software'
        ...
Thankfully the designers of the python builtins had a bit more taste than that. But, ultimately, just like any stand-alone function in a library, if you don't like its behavior, you can't just override a method on your own classes to make it operate differently on them, unless the function specifically makes use of that method. Your only real option is to call a different function.

And unfortunately, the design decisions made for those operators don't extend to anything else written in the language. By default, if I define a function that only makes sense to call with an argument of requests.Session, and somebody (possibly me) passes it an xml.etree.ElementTree.Element, Python will happily chug along, with no way of knowing that this makes no god damn sense, until something goes wrong who knows where. And there is no declarative way for me to tell it otherwise; I have the same set of options available to me as in javascript.

I think it's at least coherent to claim that the built-in syntax makes these operators more than just functions, but... meh? It seems like a fairly trivial difference at that point. Even if you grant that these are really part of the language in a non-trivial way, it feels rather like saying that Go has generics because of the built-in slices, maps, channels, etc., or early Java, because of arrays. It's just not the same thing as a mechanism that actually extends to the rest of the language in a meaningful way. If "strong typing" is to mean something about the language, it should apply to more than a handful of built-in operators. Python just doesn't have a mechanism that could be considered language-level support for any kind of typing (again, ignoring mypy and related stuff).


> And unfortunately, the design decisions made for those operators don't extend to anything else written in the language.

Yes, it applies to a number of built in functions and other operators beyond the mathematical ones.

And there's no reason they can't extend to non-built-ins. When you implement a method or function, you can adopt a similar special-method-based protocol for its arguments, and define the appropriate special methods (monkey patching existing classes, if necessary) for the acceptable types.
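
A rough sketch of what that can look like (render/__render__ are made-up names, not any standard protocol):

    class Invoice:
        def __init__(self, total):
            self.total = total

        def __render__(self):
            return 'Invoice: %s' % self.total

    def render(obj):
        # Dispatch on the special method, the same way the built-in operators do.
        meth = getattr(type(obj), '__render__', None)
        if meth is None:
            raise TypeError('%s does not support render()' % type(obj).__name__)
        return meth(obj)

    class LegacyReport:  # stands in for a class from some other library
        pass

    LegacyReport.__render__ = lambda self: 'legacy report'  # monkey patch to opt in

    print(render(Invoice(42)))     # Invoice: 42
    print(render(LegacyReport()))  # legacy report
    # render(3.14) -> TypeError: float does not support render()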

> If "strong typing" is to mean something about the language, it should it should apply to more than a handful of built-in operators.

I suppose there is a difference between a language with a weakly typed core (where you cannot avoid the risks of weak typing without isolating the core), one with a strongly typed core that doesn't force user code to be strongly typed (where Python sits), and one that forced strong typing (which probably had to be static, as well).


> I suppose there is a difference between a language with a weakly typed core (where you cannot avoid the risks of weak typing without isolating the core), one with a strongly typed core that doesn't force user code to be strongly typed (where Python sits), and one that forced strong typing (which probably had to be static, as well).

Yeah, fwiw I do think that the distinction being made with javascript vs. python is meaningful, it just isn't really about the language per se, but rather the design of "built-in" functions, which often don't need to be built in as far as their semantics are concerned.

In my comment to the sibling, I reiterate the reference to julia/racket -- you could have a language feature that does dynamic checks at the call site. IIRC, Haskell's GHC has a flag to basically compile type errors down to runtime exceptions (I don't know that anybody uses it, though).


>The key thing I'm arguing is that + * / etc. aren't special,

We're in more or less full agreement as to this. Operator overloading is a syntactic choice that is totally independent of weak or strong typing.

    def multiply(l, r):
        if hasattr(l, '__mul__'):
            return l.__mul__(r)
        else:
            raise TypeError(...)
    
No, this is completely different than how python handles things, in a really important way. It's

    def multiply(l, r):
        try:
            return l.__mul__(r)
        except (AttributeError, TypeError):
            return r.__rmul__(l)
This doesn't look that different, but it absolutely is hugely different than the example you provide. Consider two types, `int` and `MyCustomType`. `int` has no concept of `MyCustomType`, given that I just came up with `MyCustomType`, and `int` has been a part of the standard library for over a decade.

If multiply were implemented as you suggest, we could do `MyCustomType * int` and it would succeed, but `int * MyCustomType` would fail. This again, because `MyCustomType` knows how to deal with an `int`, but the reverse isn't true, so if you only rely on the lhs implementation, you get weirdness.

As a language designer you have a few options for handling the `int * MyCustomType` case.

1. You have the language semantics say "when your type is multiplied by, we implicitly convert it to something compatible with multiplication, first by calling toString, then by taking the length of the resulting string. We also do the same to the rhs to make sure it can multiply. In other words, `multiply` is

    def multiply(l, r):
        return len(str(l)).__mul__(len(str(r)))
This is weak typing. Note that while `* ` will always work (because everything gets a toString method (or valueOf)), the library writer has very little control over what happens.

2. You defer to each object. Each object says "this is what I can do". If either object believes itself to be compatible with the other object, great! If not, you raise some sort of error. This can either be done via capabilities/interfaces (which gets you duck typing), or via strict inheritance/class names (which python does in the standard library perhaps more than it should).

Multiply in python can't work the way you suggest because multiply is both more powerful and less broken than your suggestion. What you suggest cannot handle the example of MyCustomType unless multiply already understands how to deal with something that emulates what I want MyCustomType to do.

If, for example, multiply were implemented the way you suggest, something like

    def multiply(l, r):
        if isinstance(l, int) and isinstance(r, Iterable):
            return [x for x in r for y in range(l)]
that works well and good, until I define a type `ndarray`, which is absolutely an iterable, but for which I actually want multiply to broadcast. The language doesn't know that, so with weak typing I end up getting out a much longer list. But with strong typing, I can control how things work.

So to be very clear, I think if you believe that `multiply` could have been implemented the way you believe, you have a core misunderstanding about how python works, because python absolutely could not work the way you describe.

>And unfortunately, the design decisions made for those operators don't extend to anything else written in the language. By default, if I define that I intend to only make sense to call on with an argument of requests.Session, and somebody (possibly me) passes it an etree.xml.ElementTree.Element, Python will happily chug along, with no way of knowing that this makes no god damn sense, until something goes wrong who knows where.

Right, and as strong typing suggests, as soon as you attempt to treat an Element like a Session, you'll get an error (either Type or Attribute, depending). The language won't attempt to make things work, or try to hold out for a while by, for example, making attribute access return `undefined` instead of blowing up as soon as something goes wrong.

The difference is that strong typing makes the error happen as soon as you misuse an object (but not before, because that's not possible). Weak typing delays that as long as possible.

Strong typing works the same way with any function. `len` is a good example. The basis of strong vs. weak typing is essentially that the language doesn't allow things that don't intend to comply with (implicit or explicit) interfaces to comply with them in unexpected ways. You can always make something that intends to comply with an interface badly. But if I make something, in a strong language, you can't make it behave in ways that it "shouldn't". Another way of putting this might be simply that weak(er) languages have Object implement a number of interfaces (such as Comparable, Number, String, Iterable, which are all implemented, badly, by everything in js), whereas stronger languages do not (in python, object implements only String and Hashable). Further, stronger languages have more interfaces. For example, python separates "number" from "multipliable" in a way that JS does not.

But "strong" typing is totally and completely independent of "static" typing. Strong vs. weak is totally and completely independent of static vs. dynamic. Many people consider C to be a statically typed, but weakly typed language, because you can do things like cast a string to a struct willy nilly. You can't do that in python or Java. And note that what this means is that you can take an existing object and cast it to be something totally unlike what it should be. I can give you "Hello world" and you can pass that to a function that expects an struct{int, int, int}[4]. Something will happen. Similarly, in js you can pass a string into a function that expects an int, and something will happen. You try that in Python and you'll get a type error. You try that in Java, even if you do bad evil things like cast everything up to object so that it compiles, and you'll get runtime errors.

That's the difference.


> This doesn't look that different, but it absolutely is hugely different than the example you provide.

The salient point about the example I provided was that it's implemented in Python -- I appreciate the correction regarding the actual semantics, and I agree that's a somewhat better design, but I also think the difference is irrelevant to the point I was making: that these operators aren't magic beyond their syntax.

> The difference is that strong typing makes the error happen as soon as you misuse an object (but not before, because that's not possible). Weak typing delays that as long as possible.

Sure, but my complaint is that if the contract for a function says it only accepts a Session, then passing it an Element is in and of itself an attempt to use an Element like a Session. This is the key point -- Python has absolutely no knowledge or understanding of the intended type of the function's arguments, and so as you say, it is impossible for it to know to signal an error here. For strong typing to be a language feature there would have to be some way of capturing that sort of thing. Instead, we have specific methods like int.__mul__ etc. in classes that ship with python that check their arguments and raise exceptions. That's good library design, but it's not a property of the language itself.

Note that catching the error at the call site is not as ambitious as catching it at compile time -- see the references I made in earlier comments to racket and julia, which can do some level of dynamic checking for user-defined functions/types.


The semantic difference between your multiply and mine is vital. Yours can only get generic through coercion (weak types), mine can be generic without coercion (strong types).

What you're describing in Julia is more static types, not stronger types.

The more static a language, the earlier it will raise a type error. The weaker a language, the easier it is to violate an interface with no error at all.

Catching the type error earlier is being static. Having a type error at all is being strong. That's the difference.

Python made design decisions (no undefined, explicit instead of implicit varargs, operator overloading via interface rather than inheritance, etc.) which make the language more likely to raise type errors instead of allowing the successful misuse of types.

You're trying to make strong typing another flavor of static typing, and it's a totally independent concept.


When you talk about "Strong types," what actually constitutes a type in this terminology?

Re: the semantic difference, my point is that whatever the semantics of this multiply are, it's not really "part of the language" in any deep sense -- it is just another function. Given that, the semantics of multiply is entirely irrelevant to questions about the semantics of the language.

Re: "trying to makes strong typing another flavor of static typing" I absolutely am not. The "static" in static typing very much means "before the entire program is run" not just a vague "early."


>Re: the semantic difference, my point is that whatever the semantics of this multiply are, it's not really "part of the language" in any deep sense -- it is just another function. Given that, the semantics of multiply is entirely irrelevant to questions about the semantics of the language.

Well but that's only half true. That's like saying that the semantics of `int` aren't relevant to JS because it's just an object and you can implement it however you want. Except that strong vs. weak typing is quite literally a question of "what are the semantics of int in JS". You can't discuss weak vs. strong types without discussing the language's actual semantics, and not the semantics of some DSL you can define on top of the language. They're all Turing complete.

>When you talk about "Strong types," what actually constitutes a type in this terminology?

The types that can be strong or not are exactly the same types that can be static or not. They're just two independent dimensions upon which a language can vary its semantics.

>I absolutely am not. The "static" in static typing very much means "before the entire program is run" not just a vague "early."

Fine, stop confusing "strong vs. weak typing" with "runtime type checking". Strong vs. weak typing is about coercion. Runtime type checking is about type checking.

I really think that thinking in terms of interfaces is the easiest way to conceptualize this. Object in js implements a whole host of interfaces that everything then inherits from. The same is untrue in python. This leads to implicit coercion according to those (wrongly implemented interfaces).

Much like you can consider dynamic typing to simply be static typing where every object is of type Any, you can consider weak typing in the limit to be strong typing where every object implements every interface (not just claims to, but actually does, for some value of "actually"). So these arguments about "well the implementations don't matter" don't hold water. The implementations (specifically of Object) are the difference.


We seem to have hit some upper limit on the nesting depth in a thread on hackernews; there's no reply button for me on your latest comment[0] so I'm replying here.

This seems to be the crux of the disagreement:

> Except that strong vs. weak typing is quite literally a question of "what are the semantics of int in JS". You can't discuss weak vs. strong types without discussing the language's actual semantics, and not the semantics of some DSL you can defined on top of the language. They're all Turing complete.

To some extent it's a definitional issue; like I said in an earlier comment, I think you can sensibly argue that because + / * etc are privileged with special syntax they are "part of the language." If you chose to define it that way, then yes, the semantics of int, +, etc. matter. If you're willing to treat the syntactic sugar as unimportant, then you can just wrap the bad library api with a better one:

    var TypeErr = {};
    var AttrErr = {};

    var mul = function(l, r) {
      if(typeof(l) === 'number' && typeof(r) === 'number') {
        // both are built-in numbers; use built-in *.
        return l * r;
      }
      try {
        if(typeof(l) !== 'object' || !('__mul__' in l)) {
          throw AttrErr
        }
        return l.__mul__(r)
      } catch(e) {
        if(e !== TypeErr && e !== AttrErr) {
          throw(e)
        }
        if(typeof(r) !== 'object' || !('__rmul__' in r)) {
          throw AttrErr
        }
        return r.__rmul__(l)
      }
    }

    // Javascript doesn't actually have ints, just floats, but there's a
    // common trick with the bitwise operators to get them; let's abstract it out:
    var int = function(n) {
      return {
        _value: n,
        __mul__: function(r) {
          return int(mul(this._value, r._value)|0)
        },
        __rmul__: function(l) {
          return int(mul(l._value, this._value)|0)
        },
      }
    }

    console.log(mul(7, 2))
    console.log(mul(int(4), int(2)))
    console.log(mul(4, "hello")) // this throws AttrErr.
It's not really any different (again, if you discount the privileged syntax) than using requests instead of the mess that is urllib, or any other instance of "that API is terrible, let's use a different library."

> Fine, stop confusing "strong vs. weak typing" with "runtime type checking". Strong vs. weak typing is about coercion. Runtime type checking is about type checking.

Run-time checking is a requirement to avoid coercion (assuming you also don't have static checking) -- everything is ultimately just bits, so if the check never actually occurs, it just "does something." Granted, you don't get active, willful coercion like in Javascript, but ultimately if the implementation of int.__mul__ didn't do some kind of run-time checking, you would just get some garbage number when you called it. If you don't have any checking, you have coercion. Probably not object -> string, but quite likely object -> god-knows-what-memory-corruption. This is the nature of much of what you describe as C's "weak-typing" -- no runtime checks. The only difference between dereferencing a NULL pointer in C and doing None.foo in python is a run-time check.

> I really think that thinking in terms of interfaces is the easiest way to conceptualize this.

I agree. And my argument as to why strong-typing is not a feature of python-the-language is that it doesn't provide any declarative way for me to extend that property to my own interfaces; if I have some higher-level interface that doesn't fall right out of the existing Python libraries' interfaces, I have to do all of the same kind of work that I did in the above snippet of Javascript.

I think the concept of strong vs. weak typing that you're describing is coherent, but only (a) as a property of libraries, not languages, which is my argument, or (b) you assert that the built-in syntax is central. I think the latter is defensible.

Note that I am not saying that design choices of libraries that ship with the language don't matter, or even that they don't matter more than something used less often. People use these libraries every day, because they are there. The best thing Python has going for it is its ecosystem, and if you're e.g. evaluating it as a tool to use, splitting hairs like this over part-of-language vs. library probably doesn't make a ton of sense.

[0]: https://news.ycombinator.com/item?id=17071558


>It's not really any differrent (again, if you dscount the privileged syntax) than using requests instead of the mess that is urllib, or any other instance of "that API is terrible, let's use a different library."

Sure, but all you've really done is defined a new type system distinct from that of JS or python [1]. I'll agree your type system is different from JS's. Again, the fact that you can write a new language, with different semantics, within JS is not and should not be surprising.

>Run-time checking is a requirement to avoid coercion (assuming you also don't have static checking)

Ish. Duck typing avoids both of these, unless you consider the extreme that the existence or lack of any specific method defines an interface, and every object implements some subset of these interfaces, but that's not a particularly useful abstraction.

That is to say `try: lhs.method(rhs.value), except: TypeError` avoids both coercion and type checking.

>I agree. And my argument as to why strong-typing is not a feature of python-the-language is that it doesn't provide any declarative way for me to extend that property to my own interfaces; if I have some higher-level interface that doesn't fall right out of the existing Python libraries' interfaces, I have to do all of the same kind of work that I did in the above snippet of Javascript.

Huh? I'm going to challenge you to give an example, because I don't think you'll be able to.

>I think the concept of strong vs. weak typing that you're describing is coherent, but only (a) as a property of libraries, not languages, which is my argument, or (b) you assert that the built-in syntax is central. I think the latter is defensible.

To be clear, (a) is a valid interpretation, but only if you consider the base `Object` type to be a library (or more broadly, the base type). This isn't a realistic assumption. As soon as you write your own object, you've created a new language with similar syntax but distinct semantics. Basically yes, it's absolutely possible to implement a strongly typed language within a weakly typed language. But it's also possible to implement a statically typed language within a dynamically typed one. That doesn't mean that the outer language isn't weakly typed. It is. You're just able to implement a stricter DSL in it, which, again, is really a different language.

As a similar example, you can write Java code where everything is declared as Object. This doesn't make Java dynamically typed. It does however make your bit of code not statically checked. Similarly, you can write stuff within JS that is, within a fence, strongly typed, ish. You have to stop using a lot of the language's built in syntax and methods, so are you really writing JS any more? The syntax is similar, but the semantics are not.

But there are a lot of languages with similar (or even identical) syntax and different semantics. Python2 and Python3 are a good example.

[1]: https://joshuamorton.github.io/2015/10/04/types/


> But its also possible to implement a statically typed language within a dynamically typed one.

Not as a simple set of library functions that you import. You need to actually write an offline tool, because if the checking is done by library code then it's already too late -- the program is already running. Contrast the implementation above which is just a function, and still composes with the rest of the language without any assistance, and doesn't do anything that's terribly out of the ordinary for a function to do. Static typing is a language property, strong typing is a library property.

> As soon as you write you're own object, you've created a new language with similar syntax but distinct semantics

This seems like a uselessly broad criterion for what constitutes a language. If you're just using the same everyday mechanisms you do writing any program, I don't think you can claim you're using a different language, without diluting the meaning of the term beyond all utility.

> I'm going to challenge you to give an example, because I don't think you'll be able to.

The trivial example is keeping units straight. Merely having an attribute __mul__ does not adequately capture the interface of multiplication in the presence of units. Unless I specifically write the kind of boilerplate with a bunch of checks like my JavaScript example, it will just silently do the wrong thing.
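
A tiny sketch of that (Dollars and Euros are made up, and I'm using __add__ rather than __mul__ just for brevity): without an explicit isinstance check, nothing stops the units from silently mixing:

    class Dollars:
        def __init__(self, amount):
            self.amount = amount

        def __add__(self, other):
            # Naive: just grab .amount off whatever we were handed.
            return Dollars(self.amount + other.amount)

    class Euros:
        def __init__(self, amount):
            self.amount = amount

    print((Dollars(3) + Euros(5)).amount)  # 8 -- no error anywhere, silently wrong

The fix is exactly the manual boilerplate under discussion: check isinstance(other, Dollars) and raise (or return NotImplemented) otherwise; the language itself has no opinion.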

In some sense Python's object does satisfy every interface, with a default implementation of throw AttributeError/TypeError. This may seem trivial, but it is important in that overriding one of these isn't different from overriding valueOf, in order to get a different implementation for your object (one that throws).


There is no such thing as "strong typing", so no language has that feature. Perhaps you mean it's type safe in some dynamic sense, but strong and weak are not qualifiers that have any kind of rigorous definition in type systems.


Usually what is meant by weak vs. strong typing is whether there are many or only a few implicit conversions between types, which undermine the type system.

Imagine an esoteric language which automatically converts any type into any other type. For the conversion, the language will pick the default value for that type or, if there is no default value, it will pick null. How useful would the type system be in this case?


The confusion is no doubt intentional in order to make dynamic languages sound safer than they really are. There really is only one type: PyObject.


This statement wouldn't be helpful even if it wasn't incorrect. You clearly don't like Python, but accusing other people of malice for not sharing your personal taste is not going to accomplish whatever you believe it will.


This is about being open and honest to stakeholders about Python's strengths and weaknesses. Sure, I won't ever get the terminology used in your community to change, but at least you will know that there are people who find it rather uncomfortable and disingenuous.


Disingenuous is lying about people’s motives to make your opinion seem more like a truth.

It’s okay not to understand Python or typing as well as you think but that should be a chance to learn rather than your cue to accuse people of malice for giving the world free software.


Go read the link I posted by Professor Robert Harper before accusing me or anyone else of not understanding type theory. Whether you like it or not, technology does compete for mindshare. Haskell and OCaml are also free software.


Prof. Harper didn’t mention Python. He definitely didn’t accuse people of dishonest advocacy. Stop using his name to support your comments.

You could try to do something positive like talk about how Haskell does something neat which Python doesn’t support. I can think of several examples which are far more interesting than arguing over whether your personal definition of “strong typing” is universally shared.


> He definitely didn’t accuse people of dishonest advocacy.

Have you actually read it?

"Like most marketing tactics, it’s designed to confuse, rather than enlighten, and to create an impression, rather than inform."


You really aren’t willing to avoid personal attacks, are you? It’s not doing anything productive.

The main thing I took away from that post was his belief that the static / dynamic divide was a false dichotomy. I do agree, however, that I was incorrectly remembering it too charitably: he and you are both willing to distract by accusing other people of acting in bad faith without bringing evidence to support such a strong accusation.


I missed off a smiley, nothing personal was meant by it. Some of your posts have appeared abrasive too. Let's just agree to disagree.


>This is better than a segfault, but it's still a crash.

Strong vs weak types is orthogonal to whether there's a crash or not. The crash is because the types are dynamic (vs static) which is a different axis than strong/weak.


There is no static/dynamic axis for types. A dynamic "type" as you call it, is really a runtime tag. All languages are static languages. "Dynamic" and "dynamically typed" are IMHO, marketing terms. There's nothing that "dynamic" Python can do that I cannot do in a typed language using runtime tags, string keys and variants.


Static/dynamic and weak/strong typing are about the language semantics and the runtime. You can't reduce the description the way you do, because for every language we end up with: there's nothing I can do in X that I can't do in untyped ASM, therefore everything is untyped. Or go down to either lambda calculus or logic gates. Either works.

What the language provides matters. For code like "1"+1 you have 4 main options:

- strong static - doesn't compile/start due to type error

- weak static - returns "11" and expression is known to result in a string at compile time

- strong dynamic - crash/exception at runtime

- weak dynamic - returns "11" at runtime, but the type is attached to the value, not the original expression
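
For what it's worth, stock CPython 3 lands in the strong dynamic bucket (exact message wording varies by version):

    >>> "1" + 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can only concatenate str (not "int") to str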


Your classification is based on a trivial expression involving primitive types. What would most Python code do if one tried to append a SQL query to a filepath? Since both will likely be represented by strings, it will return garbage, "weak" in your definition. Whether it crashes or not in any particular implementation is not a feature of the language.


>Your classification is based on a trivial expression involving primitive types.

It's not "his classification", just his example.

The classification is established terminology in Computer Science.

>Since both will likely be represented by strings, it will return garbage, "weak" in your definition.

It's not his definition, and it's not weak in this case. Weak is not about getting garbage or not, it's about the semantics of a behavior. It could be that the result of the weak behavior is exactly what one wanted (e.g. to add a string to an SQL statement -- I might be adding together SQL statements and text about them to produce a book on SQL with my script).

In any case, the SQL statement is a string in Python, not its own type, so the result will not be weakly checked, it will be strongly checked as what it is (a string) -- Python will do what it should.


>The classification is established terminology in Computer Science

No. It is terminology used by the dynamic language communities, not by academics.



That looks like a random list of papers. Which one are you claiming defines or even discusses "strong" or "weak" typing? AFAIK those terms have no real definition in type theory.


You're moving goalposts.

You said academics don't use the terms - you were given examples that do so.

None of them need to define it - if you're interested, you can do that research.

Those terms don't have definitions in type theory and that's ok. Systems defined in the type theory deal with very explicit, theoretical analysis that usually maps well only to static-strong implementations.


Yes goal posts have been moved, but not by me. Some of those papers may informally use these terms for static types, but it is certainly not to describe characteristics of dynamic languages. AFAIK, there is no classification in computer science that describes dynamic languages in this way. This has come from the dynamic language communities. None of those papers demonstrate that it is acceptable to call Python "strongly typed", in fact quite the opposite, as the technology discussed is not available in Python. Well, not yet anyway...


How about a type error, which is what actually happens?

    from pathlib import Path
    p = Path(".")
    print("some random string" + p)
If you store typed data as generic strings, of course, it’ll work but no language has an AI which will prevent you from being lazy about that.


Perhaps filepath was a bad example for Python, as it has assertions. Zenhack actually made the same point and explained it better than me. Your runtime assertions (what you call "strong dynamic types") are a feature of the libraries you are using and a few built-in primitives. Python does not really have any facilities to talk about types. Using strings as in my example, would be "weak typing" in your terminology, but as you said, this is possible in any language, even Haskell. Therefore I fail to see how strong/weak is a language property. We are therefore left with static and "dynamic types", which Robert Harper argues should be statically typed and statically unityped.


>Therefore I fail to see how strong/weak is a language property.

That much is a given...



From your Wikipedia link:

"Generally, a strongly typed language has stricter typing rules at compile time, which imply that errors and exceptions are more likely to happen during compilation"


That's often a good direction.

When developing, you don't necessarily know about your types yet. And you might not even want to think about them. It's easier to shove in assumptions because things will change as you begin to understand what you want to build as you spend time writing the software.

It's not until later when the code has partially stabilised that you can start spotting what data structures are robust enough to be close to final and actually do benefit from being blessed by typing, locking down the inputs and outputs.

Traditionally, in a dynamically and/or weakly typed language, when the program grows large enough those parts could have been rewritten in a statically and strongly typed language. But if the dynamic language also offers some, even just basic tools for static typing then it all becomes much easier.


> When developing, you don't necessarily know about your types yet.

This is absolutely not true for myself and my coworkers. We think in types first. Typically we write out high-level functions with type signatures and make sure they all fit together and everything type-checks. Then, we'll fill in the detail and actually implement the functions. Often this process is recursive and we need to repeat it until we get small functions that are either easy to implement or available in a library. We have a type-based search tool to facilitate finding such library functions. Sometimes there are problems and alternative abstractions that only come to light when filling in the details. When a large re-organisation of code is necessary, types again help to get this right.
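
That same workflow transposes to Python with a checker like mypy or Pyre; a rough sketch with made-up names:

    from typing import List

    class Report:
        ...

    def fetch_rows(query: str) -> List[dict]:
        raise NotImplementedError  # body comes later

    def summarise(rows: List[dict]) -> Report:
        raise NotImplementedError  # body comes later

    def run(query: str) -> Report:
        # The checker already verifies these pieces fit together
        # before a single body has been written.
        return summarise(fetch_rows(query))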

Personally, I could never build a non-trivial piece of software in a dynamic language. Perhaps I just don't have the brain power to track the types manually in my head.


I develop making heavy use of behavioral tests, REPL and sanity checks.

I don't worry too much about tracking types in my head because with a test and REPL I can quickly and trivially spin up and inspect to the minutest level of detail almost any line of code in a runtime state, use autocomplete, etc. and write reliable snippets of code with a near instantaneous feedback loop - getting instant feedback not just on an object's type and its properties, but on what it actually does when it is run.

Usually when developers used to a statically typed language switch to a dynamically typed language changing their style of development does not occur to them simply because the economics are so different to what they are used to - they rely heavily on IDEs with autocomplete, etc. (which, in statically typed language provides instantaneous feedback) and are used to long compile/test feedback loop and REPLs that are poor to non-existent.

In practice this means a lot of them don't think to go looking for potentially better forms of instantaneous feedback that are more readily available in dynamically typed languages but will kvetch because the one they are used to is suddenly unavailable... ;)


Static types mean that you never have to write behavioral tests. If you're writing behavior tests, you're just writing a very verbose subset of the same guarantees that types offer instantly.

Moreover, running code snippets in a repl only helps if your codebase is small and you're designing it mostly alone. 3 months later, when someone needs to add a new feature and they call your function with incorrect assumptions (sending a dictionary with missing keys, for example), then you'll have something that can work in the happy path, but doesn't work around the edge cases. I've seen behavioral tests try to address this, but again, that's just an inferior form of typing.


REPLs are not confined to dynamically typed languages. Using a better programming language, you won’t have any need for a lot of your tests.


>Using a better programming language you won’t have any need for a lot of your tests.

I know of no type system that can act as a reasonable substitute for a well-written integration test. How exactly is your type system going to provide a reasonable assurance that my code will work as expected with an upgraded database, for example?

They can substitute sanity checks and poorly written low level unit tests that should really have been sanity checks. I've seen nothing more impressive than that.

I've heard rumors of the amazing things you can do with haskell's type system and the old chestnut that "if it compiles it works" but in practice the one piece of software I use in haskell is, at best, averagely buggy.


Agreed 100%. For quick one-off scripts and coding interviews, I have no qualms with using python. But maintaining api input/output guarantees in a large piece of software w/o explicit typing sounds like a huge pain to me.


I spent the first 12 years of my career doing 80% C and C++ and a little Perl on the side and the last 10 years doing C# with a little PHP. Of course I've had to do some JavaScript. My experience with dynamic languages completely turned me off of them.

But recently, I was forced to learn Python. I have to admit that Python is joy to use for small scripts and event driven AWS Lambda functions, but I would still use C# for any large projects.

I can't put my finger on why Python+Visual Studio Code is such a joy to use compared to my previous experience with dynamic languages.


That's kind of the thing, though: Folks are using python to bang out trivial pieces of software all the time. Dynamic types are great for that.

Occasionally, those trivial pieces of software become non-trivial, in which case, having the ability to refactor with type hints is a good thing vs. a complete rewrite in a strongly typed language.


> Folks are using python to bang out trivial pieces of software all the time. Dynamic types are great for that.

And I think that's ok, because (imho) 90%-to-95% of the code written right now falls under the "trivial" label.


Some large investment banks are writing millions of lines of non-trivial code in Python. Perhaps because management heard it was "strongly typed".


Or perhaps because they invested a lot of research and figured Python was the best tool for what they set out to do.

You really should stop with the FUD in this thread, it's unconstructive and getting really annoying. We get it, you don't like Python.


It's not FUD if someone posts an opposing viewpoint. My point was that some of the terminology used in this community is rather disingenuous. Python has a lot of strengths, but let's also be honest about its weaknesses.


So far you haven’t had a good track record of making statements which are factually correct and detailed enough to discuss. If you want to contribute anything of value to the conversation try making a longer comment which explains in detail precisely what you believe is a problem and why it matters so much that you’re willing to accuse a popular open source community of making false claims.

So far I’ve seen one concrete example from you (the SQL/file path one) which is either the same in every other language (if you store them as generic strings) or prevented by Python’s type system (if you use non-generic types).


I posted a link by Professor Robert Harper, I guess you didn't read it. This has been a discussion about terminology, so I'm not sure you can claim anybody is factually incorrect.


That guess would be incorrect. I would suggest that you stop disparaging other people and try to clearly articulate the point you are trying to make in enough detail to be discussed.


As I said, the link I posted contains the detail. If you've read it then perhaps you'd care to comment on it instead of posting successive personal attacks. I suggest that you not take my opinions (or Harpers) so personally.


Let's be real, probably 99% of projects just use a language that the main developer is already familiar with. Social and economic realities often trump technical merit, which is why inferior technology has such inertia long past the time when better replacements are available.


Banks also do lots of critical business decisions in Excel, sanity is not always their strongest point.


s/Occasionally/Frequently/

I made a temporary kludge around 10 months ago whose time to live was supposed to be a few weeks. Guess what happened to that? I am definitely looking forward to retiring it mind you.


"It's not until later when the code has partially stabilised that you can start spotting what data structures are robust enough to be close to final"

That's the argument I made in "How ignorant am I, and how do I formally specify that in my code?"

http://www.smashcompany.com/technology/how-ignorant-am-i-and...

"2.) the other group of programmers start off saying “There is a great deal that I do not know about this particular problem, and I am unwilling to commit myself to absolute statements of truth, while I’m still groping in the dark.” They proceed with small experiments, they see where they fail, they learn about the problem, they discover the solutions, and they add in type information incrementally, as they feel more confident that they understand the problem and possess a workable solution. These are the dynamic type advocates."


> But I would assert that it is misleading to pretend that you understand the system at the beginning. I want to be honest about how ignorant I am when I begin to work on a problem that I have never encountered before.

But are you honest with yourself? When working "in the unknown", the basis of your code is nevertheless a set of assumptions. You might not spell them out, but they are there. Static types do not equate pretense of understanding - they are those assumptions made explicit, and more importantly, universal (and not existential like tests). They help reasoning about code (esp with larger codebases/teams) and it is easier to adapt and extend (well, refactor) them in a safe way as the code and with it your knowledge about the problem develops.


To take a blob of JSON and ram into MongoDB doesn't reveal much bias. If we later decided "This JSON needs to have, at a minimum, 4 keys, including user_id, import_id, source, and version" then that is the first layer of runtime checks we can add, or type hints, or Schema with Clojure. And that becomes version 1. If we later say "We also need to add created_at and processed_bool" then that becomes version 2. The lack of the 4 keys in version 0 reveals my ignorance, just as the lack of the extra 2 keys reveals my ignorance in version 1. There will always be ignorance, there will always be things I don't know, the important thing is to make that as explicit as possible. Adding explicit types in version 0 suggests that I have a confidence that I do not have, and in fact, I eagerly wish to communicate the opposite of confidence in that early code.
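
Those incremental hints can mirror the versioning directly; a rough sketch (the field types are guesses, and TypedDict comes from typing_extensions on older Pythons):

    from typing import Any, Dict
    from typing_extensions import TypedDict

    # Version 0: total ignorance -- any JSON-ish blob goes through.
    def store_v0(doc: Dict[str, Any]) -> None:
        ...

    # Version 1: we now know four keys have to be there.
    class ImportDocV1(TypedDict):
        user_id: str
        import_id: str
        source: str
        version: int

    def store_v1(doc: ImportDocV1) -> None:
        ...

    # Version 2: two more fields as the understanding grows.
    class ImportDocV2(ImportDocV1):
        created_at: str
        processed_bool: bool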


Hmm. From my POV, nothing in your example is actually incompatible with a process using explicit static types.

As long as I'm writing code without any assumption regarding the JSON structure, i.e. "I'm gonna take strings and push them around", there won't be anything other than string-based type signatures. As soon as I've reasoned enough about the problem that additional assumptions are manifest enough to influence future code (e.g. code that presumes your version 1 properties), the type signatures are extended.

The main thing I see different: I strive to make those assumptions explicit before writing code based on them, and to make them explicit in a universal manner. This has nothing to do with "The Truth". Quite the contrary, one should always be aware that code as a whole and typing information especially are only a model based on assumptions.

"Using static typing" doesn't mean "Let's build an army of interfaces and classes as large as we can come up with in our brainst...erm,analysis sessions". That's a rookie mistake. I dunno...your experience might be primarily in overspecified "SeaOfNouns"-Java projects, but the problems associated with those have nothing to do with static typing itself.

And to be frank: Statements like "Thus you are seeing my exact level of ignorance, and my exact level of certainty." scream "hubris in action" to me. No: What I see is what you think your exact level of ignorance and certainty are. And I don't find that kind of information particularly helpful.


That's odd, I consider myself to be firmly in group 2. I'm often fine with stating things I don't know. Yet I'm a big advocate of static type systems.

Starting with types allows you to play around with data models of the problem space quickly before committing a lot of code to provide a solution. At least that's my take on this.


To contrast this, types help a LOT for me when developing something. The last thing you know are the implementation details of what you're doing, so I usually start by sketching out what I want to do with some simple ADTs (algebraic data types, e.g. union/sum types and product types, just to give something tangible). Then I create a couple of functions I think make sense, give them an undefined body for now, and line the types up between the inputs and outputs.

Now, here's where you usually hear people praise dynamically typed languages: I want to change the design, because I'm still prototyping. How do I do this? Just change the types (ADTs)!

After changing my design, the compiler will now let me know where my functions don't make sense anymore. Contrast this with when I dev in a dynamic language: I end up running my program again at this step, finding out that one function over here didn't like that I'm now passing a string down instead of an int, that another function was expecting a boolean but now I'm returning the result instead, etc.
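A rough Python analogue of that workflow (dataclasses and a union playing the role of the ADTs, NotImplementedError standing in for undefined, and a checker like mypy or Pyre playing the compiler; all the names here are made up for illustration):

    from dataclasses import dataclass
    from typing import Union

    # Sketch the data model first: a small sum type over two product types.
    @dataclass
    class Card:
        number: str

    @dataclass
    class Invoice:
        iban: str

    Payment = Union[Card, Invoice]

    # Stub the functions next; the bodies stay "undefined" while the
    # signatures are lined up against each other.
    def charge(payment: Payment) -> bool:
        raise NotImplementedError

    def receipt(payment: Payment, succeeded: bool) -> str:
        raise NotImplementedError

    # Change the design (add a third Payment case, change charge's return
    # type) and the checker points at every signature that no longer fits.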

The saying that static languages are not great for prototyping needs to die. They are magnificent, you just need to have a conversation with the compiler instead of fighting it.

Of course, this all assumes a language with a sophisticated enough type system, e.g. Haskell, OCaml, Idris, PureScript, Elm (somewhat), Rust (depending on your domain), etc.


Haskell has a defer-type-errors option (GHC's -fdefer-type-errors) so that you can defer type decisions while developing.


For me this matches not only the ease of onboarding new users, but also what experts want over the lifespan of a project.

So often I start out just by banging on something. My domain model isn't well defined, I'm very tolerant of errors, and I really value quick exploration and a fast iteration cycle. If a project starts to mature, though, I start wanting more things that aid long-term maintainability. More type checking. More controlled error handling. More constraints on what code is ok.

Maybe a good example is checked exceptions. Everybody hated those in Java, because they imposed a discipline that's only really valuable in certain situations. In contrast, something like Ruby or Python lets you have a single high-level error handler to start, and gradually tighten things up. Java's type system similarly pushed a lot of people away, because it made them pay up-front costs for down-the-road benefits they might never need. Hopefully this will be a similar thing, where projects big enough to benefit from type discipline will add it as they go.


It's about having the right tool for the job.

Type checking is all about complexity management... you don't need it if you're hacking on a weekend project, but when you have hundreds of developers on a million-line codebase, it can save you tons of time.

Dynamic languages have their advantages too (productivity, REPLs, etc.)

Why NOT combine the two features? Just because it's not built into the core of the language doesn't mean you can't benefit from the complexity management aspect if your job is to manage the code complexity in a large developer workforce.


> Type checking is all about complexity management... you don't need it if you're hacking on a weekend project

I see this argument a lot but I don't buy it. You might not need Java's SimpleBeanFactoryAwareAspectInstanceFactory but I don't believe putting in types hinders quick prototype development; I think, in general, it helps it. Especially when combined with good tooling and a good IDE.


I code in mostly static languages, and even when I'm doing 1000-line Python projects I miss types.

It especially helps in refactoring things quickly without needing a special IDE environment. It also lets me skip an entire class of unit tests that boil down to type checking.
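For instance, a (made-up) test like the one below exists only to pin down types; once the signature says the same thing and a checker enforces it, the test can go:

    from typing import List

    def tags_for(user_id: int) -> List[str]:
        return []  # stand-in body for the example

    # Without the annotation, this kind of test tends to accumulate:
    def test_tags_for_returns_list_of_strings():
        result = tags_for(42)
        assert isinstance(result, list)
        assert all(isinstance(tag, str) for tag in result)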


>Especially when combined with good tooling and a good IDE.

Quite a few people, myself included, don't use IDEs.


Many of the "dynamically typed language" features can be had in statically typed languages as well: there are statically typed languages that have REPls for example. Your point about having higher productivity in the beginning and then retrofitting types after that still stands though.


> Dynamic languages have their advantages too (productivity, REPLs, etc.)

I've seen the productivity argument a lot, so I'm curious about this perspective. What does "productivity" mean here? Shipping code? Because if you ship code that doesn't work, did you really ship it? If you're writing a small throwaway sub-1000-line utility for internal use, I can see that, but if you're getting paid to ship code what is the definition of productivity?


FWIW, static languages have added dynamic typing features as well:

https://docs.microsoft.com/en-us/dotnet/csharp/programming-g...


But I suspect whereas the static type additions to Python will tend to spread throughout your entire system (the more you annotate, the more useful the feature is), the dynamic features of static languages tend to be used in specific places for specific reasons (the less you use the feature and the more you encapsulate it so the dynamism doesn't "escape", the better it works with the program as a whole).


This was pretty much true when I was last using C#. You would find specific places where you need "an object" (e.g. maybe you want to LINQ over some data in a method), so you could create it dynamically, but if you wanted that object to go anywhere outside that scope you had to have a concrete definition.


Had a similar thought, but around C++.

auto types and lambda functions are the ones that come to my mind. I am sure there are more features making their way into modern C++ versions.


Those are completely different concepts. You're confusing dynamic typing with type inference.

Dynamic typing: C# "dynamic" keyword

Type inference: C# "var" keyword, C++ "auto" keyword


auto types really don't have anything to do with dynamic typing, in fact I would argue type deduction is an important tool in making static typing more attractive. An example of dynamic typing in C++ would be std::any (although you're still limited by what you can do with the contained object of course, for better or worse).


I think describing these as static type systems is a bit limiting. I prefer to think of them as static code analyzers. They don't change anything about the representations in memory, but they do validate invariants about the code. I actually hope we start seeing systems that validate invariants that are not type-based, like "SQL Injection Safe SQL Query" and "Range between 10-100".


> I actually hope we start seeing systems that validate invariants that are not type-based, like "SQL Injection Safe SQL Query" and "Range between 10-100".

Both of those can be types in a rich enough type system; the latter is a fairly common example for dependent types, for instance.
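In Python terms you can only approximate this today; as a sketch, a NewType with a single validating constructor gets partway there, with the 10-100 invariant still enforced at runtime rather than by the checker (a dependently typed language could check it statically):

    from typing import NewType

    # Distinct type to the checker, but the range invariant lives in
    # the one place values are allowed to be constructed.
    Percentage = NewType("Percentage", int)

    def make_percentage(n: int) -> Percentage:
        if not 10 <= n <= 100:
            raise ValueError("expected a value between 10 and 100")
        return Percentage(n)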


Agreed, it's like saying C++ has a borrow checker like Rust because memory sanitization tools exist.


Stanza (http://lbstanza.org/) is one of the more interesting recent languages I've seen - it is designed from the ground up to be gradually typed, which is the end goal of projects like Pyre.


This looks at first sight like a cross between Python and LISP without parentheses (not saying that in a bad way). I'm actually glad that they chose this syntax, because a lot of people get hung up on the parentheses.

Have you used this for anything? If so, how was the experience?


No, I've just written a couple of toy programs, but it's on the short list of languages I keep an eye on because it looks like they hit a sweet spot in language design, implementation and ergonomics, and it might be exactly what I need for some future project. (D and Pony are a couple of others in that camp.)


It's because dynamic typing is easier for small programs and beginners, but sometimes the language becomes popular (e.g. Python or JavaScript) and people start trying to write large robust programs in it. Then you need static typing.

I initially thought optional / gradual typing (a la Dart 1) was mad, but when you think about it in this context it makes sense. However, Dart 2 has now gone back to static typing, so who knows...


Define large.

I've built video streaming websites in Python with user accounts, uploads, votes, comments, many filters and hundreds of thousands of files, serving half a million users a day. The server cost is around a fifth of the generated revenue.

Most projects will never even reach this size, not to mention remotely approaching Google's size (from 2010).


Large as in lines of code and number of authors. Not how many people it serves - why would that matter?

Something like 10k lines of code or three authors seems to be about the limit where dynamic typing becomes really painful.


Strict static typing incurs a cost at development time whether you need it or not. Being able to develop quickly with dynamic typing and then add static type checks afterwards is a huge productivity boost, really giving you the best of both worlds.


It's doubly interesting since Python does not have types.

It has built-in classes, unlike every other popular language out there.


And, ironically, statically typed languages like C++ are moving to more "dynamic" (actually, inferred) type systems via 'auto'.


Inferred is very different from dynamic.


I'm not sure the static type system thing is real. Ultimately, for 90% of workflows you wind up reserializing into a very much non-static data type (e.g. JSON).


I don't see how that matters -- it seems to indicate a misunderstanding of what type systems are. Type systems obviously have boundaries whether you're interchanging with another type-safe system or not, that's irrelevant.

Every program has to deserialize data like JSON into types and enforce its assumptions about that data whether you have static analysis tooling at compile-time or not.


It absolutely matters: type systems don't exist in a pure universe; they are important because we use them to serve human needs. A good type system pushes typing concerns out to the outer boundary of the system. For that you don't need static typing, you need strong typing. Refactoring in a statically typed system can be an incredibly tedious process with tons of boilerplate. At the other extreme, if you allow duck typing and monkeypatching, then debugging becomes a problem. There's a middle ground that a lot of very good and productive programming languages occupy.


You're not going to find many people who agree that static typing makes refactoring harder. The classic issue with large, coupled, dynamically typed systems is how difficult it is to refactor them without runtime errors.

You also have to be more precise when you hand-wave about the benefits of "strong typing"; you can read this very comment section to see that nobody agrees on what that means.


>A good type system pushes out typing concerns to the outer boundary of of the system.

No, that would be a BAD type system. A good type system puts type concerns at the very core of the language and program, where they belong.

A program is a proof, and types are elements (like axioms) in this proof.


By 'type concerns' I mean coercing the real world into the type system. You want that to be at the edge.

IMO, treating a program as a proof is somewhat impractical, as you can see when very pure functional programming paradigms struggle with universal state monads.
