It's interesting how static type systems are slowly gaining more adoption now by being bolted on to popular dynamic languages. It feels like a weird approach compared to using a language that was designed from the ground up to have a strong static type system but it's practical in terms of easing people into it.
Java's verbosity made us all hate type systems in the early 2000s so many of us migrated to dynamic languages such as Python, Ruby in the mid 2000s that allowed us to work fast and loose and get things done.
After about 10 years of coding in a fit of passion we ended up with huge monolithic projects written in dynamic languages that were extremely brittle.
Fortunately languages with type inference (Rust, Golang, OCaml, Scala, etc) started becoming the answer to our problems. (We also collectively decided that microservices were another solution to our woes, though that cargo cult is being questioned.)
So we have a decade of code and packages written in Python and JavaScript that work well for our use cases (Data Science, MVP web apps/services, Database integration, etc) that is hard to give up. Often because alternatives aren't available yet in the statically typed languages (Hurry up Rust!).
There is often a lot of friction to get new languages introduced. I love Rust, but I don't think I can introduce it into our Golang/NodeJS/Javascript environment anytime soon.
> Java's verbosity made us all hate type systems in the early 2000s so many of us migrated to dynamic languages such as Python, Ruby in the mid 2000s that allowed us to work fast and loose and get things done.
You may be overgeneralizing, depending on whom you mean by the term "we".
More often than not, the code I've written has to be very well trusted when deployed. For me, "getting things done" means getting to effective and trustworthy code ASAP. Static type systems have been invaluable for that work.
I interpreted the comment as "Java's verbosity made us think all static type systems were also verbose". Which I know is what a lot of people still think ("but the lack of REPL!", "but the boilerplate!", "but the elegant code", disregarding that other statically typed languages have all these features and more).
I don't really want to get into a flamewar about static vs dynamic. I'm a polyglot, I use several languages with multiple flavors of type system, I think that most options have at least a couple things to recommend them.
However, the grandparent has a point: Java's type system makes static typing much more painful than it needs to be. I didn't start working in Java until fairly recently, and it was only then that I started to understand how many fans of dynamic languages could say that static typing mostly just gets in the way. But, if the only static language you've spent much time with is Java. . . now I get it. Java's type system mostly just gets in the way.
It's statically typed, but with basically no type inference (the diamond operator is something, I guess, but not much). So you end up putting a lot of time into manually managing types. That creates friction when writing code, since you need to remember the name of the return type of every method you call in order to keep the compiler happy. Worse, it creates friction when refactoring code, since any change that involves splitting or merging two types, or tweaking a method's return type, ends up forcing a multitude of edits to other files in order to propitiate the compiler. I've seen 100-file pull requests where only one of those file changes was actually interesting.
Then, to add insult to injury, its static type system is very weak. A combination of type erasure, a failure to unify arrays with generic types, and poor type checking in the reflection system means that Java's static typing doesn't give you particularly much help with writing well-factored code compared to most other modern high-level languages.
Speaking of generics, the compiler's handling of generics often leaves me feeling like I'm talking to the police station receptionist from Twin Peaks. Every time I have to explicitly pass "Foo.class" as an explicit argument when the compiler should already have all the information it needs to know what type of result I expect, I cry a little inside.
Long story short, if I could name a type system to be the poster child for useless bureaucratic work for the sake of work, it would be Java's.
Some fair points... some comments and one question:
1. Java 10 has type inference so that should improve your first point going forward to some degree. That said, I would also say type system syntax != type system.
2. Compared to what other modern high-level languages? Also slight changing of goal posts, but what other modern high-level language that has some market adoption?
3. Agree with passing `Foo.class` or type tokens around. Very annoying.
C#'s, if you're looking for a Java-style language done right.
Objective-C's is also interesting, for one that makes some different design tradeoffs. Being optionally static with duck typing, for one.
Any ML variant, if you want to see how expressive a language can get when static typing is treated as a way for the compiler to work for you, not you working for the compiler.
I think that "Java bad" isn't all of the story of the move away from static type checking. Type checking in programming probably originated, and certainly featured prominently, in selection of representation of values. I think it's a valid insight of the dynamic camp that for most purposes we don't care how things are represented so long as we know how to work with them. What's often missed is that types can be a powerful tool for talking about other things, too.
Statically typed languages are harder to test. So if you do cover 100% with tests, dynamic is not so bad. However, well-built static languages reduce the things that need to be tested in the first place. Like non-nullability in Kotlin and Swift.
I have to ask: what makes you think that statically typed languages are harder to test? My experience is precisely the opposite. Large testing codebases can benefit hugely from the increased refactorability. In addition, the types help to explicitly define the contracts you need to test.
I think what dep_b refers to is that, in dynamic languages, you usually have an easier time injecting mocks and doubles. In a statically typed language, it's usually much harder to inject mocks for IO, network, clock, etc., unless the original code has already been written to afford that (e.g. that whole Dependency Injection mess in Java).
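To make the mock-injection point concrete, here is a minimal Python sketch using the standard library's `unittest.mock`. The helper names `seconds_since` and `elapsed_with_fake_clock` are hypothetical, invented for illustration; the point is only that the clock can be swapped out even though the code under test was never written with an injection seam.

```python
import time
from unittest import mock


def seconds_since(start):
    # Hypothetical code under test: reads the wall clock directly,
    # with no parameter or interface through which a fake could be passed.
    return time.time() - start


def elapsed_with_fake_clock(start, fake_now):
    # Replace time.time only for the duration of the block.
    # The original function is restored when the block exits.
    with mock.patch("time.time", return_value=fake_now):
        return seconds_since(start)
```

In a statically typed language the equivalent usually requires the clock to be passed in as a parameter or interface up front, which is the design cost the comment above is describing.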
> Java's verbosity made us all hate type systems in the early 2000s so many of us migrated to dynamic languages such as Python, Ruby in the mid 2000s that allowed us to work fast and loose and get things done.
This was actually a replay of what happened with Smalltalk versus C++ in the 80's and 90's, which was a part of the history of how Java came about. And even that was a replay of what happened with COBOL and very early "fourth generation" languages (like MANTIS) from a decade before that!
C++ is an utterly complex language. Java appeared as an option to simplify programming compared with all the bureaucracy of C++: no need to manage every bit of memory (GC), no multiple inheritance, no templates, no multi-platform hell, a big library included etc.
Smalltalk was not available to most programmers back then, it needed an expensive machine with a lot of memory and the implementations were very expensive. Apps were also much smaller, so the disadvantages of C were less pressing.
> Smalltalk was not available to most programmers back then, it needed an expensive machine with a lot of memory
I was programming back then. It ran just fine on fairly standard commodity hardware from the time 486 stopped being "high end." Also, at one point the entire American Airlines reservations system was written in Smalltalk and running on 3 early 90's Power Mac workstations.
> the implementations were very expensive.
More or less true. At one point there were $5k and $10k per-seat licenses.
> Apps were also much smaller, so the disadvantages of C were less pressing.
There was a major DoD project that let defense analysts do down-to-the-soldier simulations of entire strategic theaters. (So imagine this as an ultra-detailed, ultra-realistic RTS.) They did this as a competition with 3 teams, one working in C++, one in another language I can't recall, and one in Smalltalk. The Smalltalk group was so far ahead of the other two, there was simply no question. That was a complex app. There were countless complex financial and logistics apps.
Verbosity is usually the worst argument against a language. Your coding efficiency is not limited by how fast you can type. I've been using Kotlin recently which is basically just Terse Java and it's very nice but hasn't turned my world upside down.
Verbosity absolutely hurts comprehension of code. It's easy to hide bugs in code that seems to be just boilerplate. It also means that given a fixed amount of screen real estate you can see less of the actual logic of the code at a time.
To whoever tells me verbosity isn't a limitation to a language: find me the single incorrect statement in a 100 line function vs a 10 line function.
And no cop outs with "I use {bolt-on sub-language that makes parent language more concise}" (that's not a mainstream language then) or "Well, you can just ignore all the boilerplate" (bugs crop up anywhere programmers are involved).
Or give me an ad absurdum concise counterexample with APL. :P
Ultimately language verbosity is mapped directly to proper abstraction choice. In that the language is attempting to populate the full set of information it needs, and can either make sane assumptions or ask you at every turn.
The fact that even the pythonistas are now adopting types suggests that verbosity is much less of a concern than a bunch of spaghetti code that cannot be tested, understood, or refactored. You have to squint really, really hard to think that the people who chose type-less languages over Java ten years ago actually made the right choice. Personally, when diving into a codebase, its "verbosity" has never been an actual issue. Nor has lack of "expressive power." Of much greater concern is how well modularized it is, how well the modules are encapsulated, and how well the intentions of the original authors were captured. Here verbosity and types in particular have been absolutely invaluable. I suspect in the end this is why serious development at scale (involving many programmers over decades) will never occur in "highly expressive" languages like Lisp and, to a lesser extent, Ruby etc. It is simply not feasible.
As I dive deeper and deeper into this thread, it looks like people are confusing "verbosity" with "it-has-a-type-system".
Java (5,6) wasn't verbose just because of types. Java was verbose because the language, and everything surrounding it, was verbose. It was difficult to read Java at times because the language had been gunked up with AbstractFactorySingletonBeans. FizzBuzz Enterprise Edition is a joke that is only funny and simultaneously dreadful in the Java world. However, despite being relatively more complex, Rust is far less verbose than Java, even though Rust is more powerful with regards to types. "Hello World" in Rust is 3 lines with just 3 keywords. The Java version has 12 keywords.
Engineers ten years ago weren't choosing Ruby/Python over Java because of static typing. They didn't choose Java because it was a relative nightmare to read and write.
Lambdas saved the language. Java 6 was the kingdom of nouns. You couldn't pass statements or expressions, so instead you had to create classes that happen to contain what you really wanted. Async code was nearly unreadable because the code that matters is spread out and buried.
This was said in other threads under the article, but we've definitely made huge strides in more efficient typing.
The general narrative of "early Java typing hurt development productivity" to "high throughput developers (e.g. web) jumped ship to untyped languages" to "ballooning untyped legacy codebases necessitated typing" to "we're trying to do typing more intelligently this go around" seems to track with my experience.
Generics, lambdas, duck typing, and border typing through contracts / APIs / interfaces being available in all popular languages drastically changes the typing experience.
As came up in another comment, I believe the greatest pusher of untyped languages was the crusty ex-government project engineer teaching a UML-modeling course at University.
To which students, rightly, asked "Why?" And were told, wrongly, "Because this is the Way Things Are Done." (And to which the brightest replied, "Oh? Challenge accepted.")
15 years ago I wrote Python code for a living. Then about 9 years of Java. The last four years was exclusively Python. I'm never going back to Java, it has nothing I want.
Python types are optional, and have adequate inferencing. Anywhere you think it's too verbose to use types, you don't have to. In Java, you must use types even if you believe it is just boilerplate. That's an essential difference.
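A small sketch of what "optional" looks like in practice, assuming a separate checker such as mypy is run over the code (Python itself does not enforce annotations at runtime). The function names are hypothetical and exist only for illustration.

```python
from typing import List


def total_length(words: List[str]) -> int:
    # Annotated at the boundary, so a checker can verify callers.
    count = 0  # no annotation needed: a checker infers int from the literal
    for word in words:
        count += len(word)
    return count


def total_length_untyped(words):
    # The same logic with no annotations at all; still valid Python.
    return sum(len(word) for word in words)
```

Both functions behave identically at runtime; the annotations only matter to the checker and to the reader.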
I keep a mental list of the qualities of good code, "short" and "readable" are on the list. I've sometimes wondered whether "short" or "readable" should be placed higher on the list, and I eventually decided that short code is better than readable code because the shortness of code is objectively measurable. We can argue all day over whether `[x+1 for x in xs]` is more or less readable than a for-loop variant, but it is objectively shorter.
Of course, it's like food and water, you want both all the time, but in a hard situation you prioritize water. Likewise, in hard times, where I'm not quite sure what is most readable, I will choose what is shortest.
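For concreteness, the two variants being compared look roughly like this. It is only a sketch: the function names are made up for illustration, and which one reads better is exactly what is under debate.

```python
def increment_all_comprehension(xs):
    # The short form quoted above.
    return [x + 1 for x in xs]


def increment_all_loop(xs):
    # The explicit for-loop variant: same result, more lines.
    result = []
    for x in xs:
        result.append(x + 1)
    return result
```

Both return the same list for the same input; only the amount of text differs.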
> I eventually decided that short code is better than readable code because the shortness of code is objectively measurable
I can debug sophisticated algorithms code that is readable and explicit far more easily than short and concise. Anyone that tells you otherwise has never had to debug the legacy optimization algorithms of yesteryear (nor have they seen the ample amount of nested parens that comes from a bullshit philosophy of keeping the code as short as possible).
All arguments about computer languages will always end up in disagreement, since every person in that argument does programming in an entirely different context.
Short is good when the average half-life of your code is less than a month.
When you're writing something for 10 years and beyond - it makes sense to have something incredibly sophisticated and explicit.
Otherwise it doesn't, since the amount of time it takes me to comprehend all of the assumptions you made in all of those nested for loops is probably longer than the lifetime of the code in production.
List comprehension has a nice, locally-defined property in python: it will always terminate.
Obligatory Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
The main reason list comprehensions in Python were given so much praise is that they are (were?) more efficient than loops. I personally find a series of generator expressions followed by a list comp more readable than a three-level list comprehension, although the latter is more readable.
If you reliably generate the boilerplate and throw it away, you can ignore it (and you've changed which language you're really using). If it's at all possible for a human to permanently edit the boilerplate, well now it can be wrong, so you have to start reviewing it.
A valid point. I didn't mention it above to stay concise, but the question then becomes:
If you can reliably generate boilerplate, AND it's non-editable, then why is it required by the language in the first place?
If it is editable, then it collapses back down into review burden.
I think this is where "sane, invisible, overridable defaults" shines. Boilerplate should be invisible in its default behavior. BUT the language should afford methods to change its default behavior when necessary.
> no cop outs with "I use {bolt-on sub-language that makes parent language more concise}" (that's not a mainstream language then)
Why is "I use {bolt-on} that makes {parent language} more concise" a cop out? The bolt-on could be a macro language or an IDE that does collapsing of common patterns. If it makes it easier to find a bug in a 100-line function in the parent language, or to not generate those bugs in the first place, then the {bolt-on} isn't a cop out.
Because I believe language stability is proportional to number of users.
Would I use a new transpiler for a toy personal project? Absolutely! Would I use it for an enterprise codebase that's going to live 10-15 years? No way!
If you accept that every mapping is less than perfect (e.g. source -> assembly, vm -> underlying hardware, transpiler source -> target), then it follows that each additional mapping increases impedance.
And impedance bugs are always of the "tear apart and understand the entire plumbing, then fix bug / add necessary feature" kind.
When I'm on a deadline, I'm not going near that sort of risk.
I see "transpilers" as being on a continuum ranging from IDE collapse comments and collapse blocks at one end, to full code generation syntax macros at the other end. There's a sweet spot in the middle where the productivity gains from terser code outweigh the impedance risk.
With a proper type system you can often trade away the verbosity through type inference. Still, I'd argue that even if you couldn't, the extra 'verbosity' you take on from writing types in a language with a strong type system (Haskell, Rust, Scala, OCaml, etc) is actually paid back in spades for readability. Particularly because you can describe your problem space through the types, and enforce invariants.
It's really just the 'static' type systems that only prevent the most pedantic of type errors where the argument holds any merit.
Not if the verbosity is providing extra information. Just taking away types isn't concise, it's hiding complexity. Tracing Java in IntelliJ is always trivially easy. Tracing JavaScript is Kafkaesque. Python is in between.
It's all just trade-offs. "Verbosity" is too abstract to argue by itself because it comes in so many different flavors and spectrums.
For example, the Elm lang eschews abstractions for boilerplate, deciding that the boilerplate is worth the simplicity. And I've come to appreciate its trade-offs. For example, coming back to Elm after a long break, I've not had to spend much time recredentializing in it like I had to with an old PureScript project.
On the other end of the spectrum, there's something like Rails that avoids boilerplate at the high price of having to keep more complexity in your head.
The problem of language verbosity is not about writing code, it's about reading code.
Writing code is rarely problematic. Usually when you sit down to write code, you have a clear idea of what you want to do. Or if you don't, you have the advantage of being intimately aware of what you're writing.
Once your project becomes large enough that you can't hold it all in your head at once, reading code becomes supremely important.
Any time you write code that interacts with other parts of your project in any way, you will need to understand those other parts in order to ensure you do not introduce errors. That very frequently means being able to read code and understand it correctly.
There's a saying that issues in complex systems always happen at the interfaces.
In fact, if you need hieroglyphs to keep your code concise that's a deficiency of the language. This is why we want expressive languages in the first place.
The size of your code is the number one predictor of bugs. The more code you have, the more bugs you probably have. Smaller code bases have fewer bugs. Verbosity means more code.
This is why very terse dynamic languages like Clojure often have relatively low bug counts despite a lack of static checks.
From the very article you referenced: "One should take care not to overestimate the impact of language on defects. While the observed relationships are statistically significant, the effects are quite small. Analysis of deviance reveals that language accounts for less than 1% of the total explained deviance." That's a tiny effect size.
Moreover, Clojure is fairly compact, but isn't really a "terse" language. Consider APL and J as examples of truly terse languages. Programs written in them are generally horrible to read and maintain (at least for an average programmer who didn't write the original code!). So there might be some relationship between verbosity and quality, but the relationship is far more complex than "verbosity -> more code -> more bugs." Otherwise we'd all be building our mission-critical software in APL.
Plus, there are numerous well-known cases of bugs caused because a language provides a terse syntax, where redundant syntax would have prevented the problem. E.g., unintentional assignment in conditional expressions ("=" vs. "=="), and the "if (condition) ; ..." error in C-like languages. I've personally made similar errors in Lisp languages which are about as terse as Clojure, e.g., writing "(if some-cond (do-this) (and-that))" instead of "(if some-cond (progn (do-this) (and-that)))".
Redundancy in a programming language is often a safety rail: it shouldn't be casually discarded in the name of brevity for its own sake.
Personal experience tells me that size is probably a proxy for "this code is mathematically written". If you have even a vague idea of the math of the code you're writing, the code tends to be both shorter and have fewer bugs. But, I'd be wary of turning that around to a blanket endorsement of terseness. Terse code often needs to be restructured when requirements change. Restructuring takes more time and also risks adding new bugs. Then there are problems with readability during debugging and understanding interfaces.
Well, there are bugs and there are bugs. With some, it is easy to find the offending bit of source code and spot the bug right away. Others may take days to localize and fix.
> Your coding efficiency is not limited by how fast you can type
Verbosity hurts reading, not typing. Think of reading an essay that takes hundreds of pages to make an argument that could have been written in a single paragraph.
That's simply not true, unless you're talking assembly-level of detail.
High-level language constructs can hide details in ways that make them harder to read, not easier to read. Ask anyone that has had to read a very complicated SQL statement about how long they had to look at various bits of the statement in order to understand exactly what was going on, and you'll get an idea of what I'm talking about (obviously, that person could be you, also).
In contrast, anyone can very easily understand for or while loops in any language without thinking about them at all. You can read them like you read a book.
It's simply a matter of fact that, unless the hidden details are very simplistic, abstract concepts with no edge cases, terseness hinders readability.
As for things like identifiers, all I can say is that developers that use super-short, non-descriptive identifiers because they think it helps readability are doing themselves, and anyone that reads their code, a grave disservice. They either a) don't understand how compilers work, or b) are beholden to some crazy technical limitation in the language that they're using, à la JS with shorter identifiers in an effort to keep the code size down before minifiers came on the scene.
>It's simply a matter of fact that, unless the hidden details are very simplistic, abstract concepts with no edge cases, terseness hinders readability.
No. Using the correct abstractions helps readability.
I'll agree with you that a complicated SQL statement may not be a good thing to use, but it also probably isn't the right abstraction.
Compare, on the other hand, using numpy/matlab/Julia vs. something like ND4J.
It's the difference between `y = a * x + b` in the former, and `y = a.mmul(x).add(b)`. Granted the ND4J version isn't terrible, but I used an older similar library in Java that only provided static methods, so it was `y = sum(mmul(a, x), b)`, which is fine when you're only working with two operations, but gets really ugly really fast when you want to do something remotely more complicated.
And I'll even note that all three of these are already highly abstracted. If you want to drop all semblance of abstraction, keep in mind that `y = a * x + b` works equally well if `a` is a matrix and `x` a vector, or `a` a vector and `x` a scalar, and separately it doesn't matter if `b` is a vector or a scalar. They'll get broadcast for you.
Overly terse code does indeed hinder readability. But so does overly verbose code. It's much more difficult to understand what is happening in
I don't know, your example seems perfectly understandable and easy to read to me. The naming is pretty good and descriptive, so anyone can understand what's going on pretty quickly.
That doesn't mean that a DSL or higher level language feature might not be better (the operations are pretty clear and not prone to edge cases, as I said before), but as far as "big problems" go, I find that example to be a pretty minor one.
My example was small but illustrative. If instead of implementing a single linear transform, you're implementing a whole neural network, or a complex statistical model or something, it will be much easier to grok the 10 line implementation than the 150 line one.
That means less surface area for typos, bugs, etc. This compounds if you ever want to go back and modify existing code.
> That's simply not true, unless you're talking assembly-level of detail.
Modern language design (of the past few decades) seems to disagree with you, with a couple of exceptions. This debate involves a degree of subjectivity, of course, but it's generally not the case that reducing verbosity and boilerplate hinders readability. The consensus seems to be the contrary. Even Java, late in the game, is adopting features to reduce its verbosity and improve its expressivity.
High-level language constructs and idioms only "hide" unnecessary detail; i.e. the detail where the only degree of control you have is the possibility to introduce bugs. You learn these idioms just like you learned what an if-else or a for loop was.
> Your coding efficiency is not limited by how fast you can type.
It absolutely is, at least for me. Granted, the majority of time spent during programming is on thinking rather than typing, but any time spent typing detracts from time that could have been spent on thinking instead. Whenever I type a line too long, I tend to lose focus on what I am thinking, and get bogged down by language details. Besides, typing the useless thing again and again (like repeating a long type name) frustrates people, and frustrated people have a harder time concentrating.
As a matter of preference, the more verbose a language is, the less likely I'm motivated to learn it. Why should I have to type extra stuff to do the same things I can do in other languages? If the compiler can handle it, I shouldn't have to type it.
Java never stuck with me because of that, same with trying to learn Objective C. But languages like Swift, Go, Ruby, Python hit the sweet spot.
Adding types to local variables is quite useless and should be considered redundant and non value adding verbosity. The main driver for typing is at interface boundaries so you know what types of input another function expects.
No, it's not useless. Often I want to know the type of an intermediate binding without looking up all the functions/exprs that transformed an input into it. Often an editor plugin helps with this, but otherwise it's a real pain to understand some parts (this is a problem with Rust, where the LSP server doesn't help with this, but for OCaml, Merlin works beautifully).
I don’t know if I agree with this. If you want to communicate an interface for maintainability you’ll declare your type so you know what you’re dealing with in the future.
I'd call that first example nothing but noise. It's probably a bit better with a type alias, but knowing that SomeMethod returns an array of uint/complex pairs doesn't really tell me anything.
If anything, this just shows me bare type information alone isn't useful without accompanying documentation. For example if `SomeMethod` was renamed `CalculateAllEnemyLocations`, it might make sense, but then I get most of the relevant information from the method name.
In other words, you have a bad api, and type information sort of, but not really, makes up for that. But that just means that you're ignoring the real problem.
It doesn't matter if the more verbose version is better, we'll still use the short version because we are lazy. When programming, and you have figured out what to do, it's basically just typing in all the instructions. So typing (and reading) friendliness does matter! If it didn't matter we would still program in machine code. Also there's some abstraction, where 3 lines of JavaScript would need 300 lines of machine code or Java :P
The problem with verbosity is not how fast you can write but how easily you can read the language. Verbosity with no information gain is distracting.
> I love Rust, but I don't think I can introduce it into our Golang/NodeJS/Javascript environment anytime soon.
Rings especially true for my shop as well. I had to introduce Rust or Go, and went with Go. Seeing the mild pushback I get from Go's type system makes me especially glad I didn't choose Rust.
... though, in some cases, Rust+Generics would be easier than Go.
> Rings especially true for my shop as well. I had to introduce Rust or Go, and went with Go. Seeing the mild pushback I get from Go's type system makes me especially glad I didn't choose Rust.
Another possibility is that Go's static types feel like a lot of ceremony for too little benefit. That was one of my biggest issues with Java, and though Go's lighter it also provides fewer benefits… By contrast, Rust is in the camp of requiring a higher amount of ceremony but providing a lot of benefits from it (sometimes quipped as "if it compiles it works").
That's my personal, anecdotal feeling as well. Go feels a bit more like I'm yelling at the compiler "you know what I mean, why can't you just do it?!" whereas with Rust it's more "ah, I see, you have a point".
I guess this was your point, but the problem is how Java does type erasure. With Haskell type erasure is an implementation detail, but with Java it leaks into the compile-time type checker. For instance, you can't have both these overloads:
public void foo(List<String> list)
public void foo(List<Integer> list)
The difference is that Haskell has very little RTTI, so you can't really see the missing types at runtime. For Java it's much more noticeable, because runtime type manipulations are way more common.
>> (We also collectively decided that microservices were another solution to our woes though that cargo cult is being questioned)
I wouldn't really say that, I think it's more that we all discovered huge monoliths don't work in an "agile cloud" environment where you have 10 teams deploying on their own cadences with no coordination (which you have with on premise binary delivery, or waterfall, or when you have implicit coordination by operations because they have to build out the physical infra). Further, I think modularity has become much bigger in the past 15-20 years as more and more people contribute to open source, more problems become "solved", and languages/domain spaces mature. Whether microservices are the best solution to those observations is still up for debate, but cargo cult or not I doubt many engineers these days would use a magical wand to go back to monoliths even if they live in microservice hell right now.
For me it was less about the verbosity and more about the overuse of patterns and general overengineering present in many (most?) Java APIs. Java doesn't strike me as being much more verbose than Go, but the differences in the API designs make a huge difference in how it feels to work with the language.
Go can be terse if everything is an interface{} and you ignore errors. But production quality Go is huge because there are so many things the compiler won't help with. I want a language that would generate the same boilerplate Go other people spend actual time on writing and reading.
The performance problems dealing with Tcl at a startup in 2000 made me never again use a programming language without JIT/AOT support for production code.
To this day, Java, .NET and C++ stacks are my daily tools.
I don't think lack of type inference is the primary reason Java is unattractive compared to dynamic languages. The main reason is the poor expressiveness of Java's types themselves.
You can't even have tuples, nor tuple-like named types or named records. You have to make a class every time, and OOP discipline tells you to hide data in it and make "behavior" public (this approach is definitely not for everything, so "beans" with getters and setters became hugely popular). The ubiquity of hashmaps in dynamic languages is a huge relief after that.
Scala has a reputation as "rubified java" rather than "FP for java" because of the hugely improved expressiveness of its types (presence of data types).
The big difference is that there are tools that allow automatic and guaranteed safe refactors for such languages. For instance, I can't guarantee that something as simple as renaming a method won't cause runtime errors in a dynamic language.
In type safe languages your tools can do refactorings like extract class/extract interface (reduce coupling), create mocks automatically based on type information (helps testing), etc.
Why not let the compiler eliminate a whole class of problems and let the automated tools help you with guaranteed safe refactors?
I get both of these in python: extract class is provided by a good IDE, and `mock.create_autospec(OBJECT_TO_MOCK)` creates a mock object which raises an exception if called incorrectly and can do a lot of other powerful things.
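To make the autospec point concrete, here is a minimal sketch using the standard library's `unittest.mock`. The `Mailer` class and its `send` method are made up for illustration and stand in for whatever object you want to mock.

```python
from unittest import mock


class Mailer:
    """Hypothetical class standing in for the object being mocked."""

    def send(self, to, subject):
        raise RuntimeError("would talk to a real server")


# create_autospec copies the real signatures onto the mock.
fake = mock.create_autospec(Mailer, instance=True)

# A call that matches the real signature is accepted and recorded.
fake.send("a@example.com", "hi")
fake.send.assert_called_once_with("a@example.com", "hi")

# A call that does not match the real signature is rejected.
try:
    fake.send("a@example.com")  # missing 'subject'
except TypeError as exc:
    signature_error = exc
else:
    signature_error = None
```

The check happens when the mock is called, so it still only catches mistakes on code paths your tests actually exercise.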
Until you try to mock anything related to the boto3 library provided by AWS....
All of the functionality is something like....
s3client = boto3.resource("s3")
The IDE has no idea what s3client is. Since I've just started using Python and mostly to do things with AWS, is this considered "Pythonic"?
Btw, After using PHP, Perl, and JavaScript, and forever hating dynamic languages, Python made me change my mind. I love Python for simple scripts that interact with AWS and event based AWS lambdas.
Right - this trivial annoyance, the Python equivalent of a NullPointerException, is not actually prevented by the static type system in Java and some other popular static languages. (Kotlin does prevent it, though!)
> We also collectively decided that Microservivces were another solution to our woes though that cargo cult is being questioned
I'm genuinely curious (and probably absurdly naive), but can you explain why you believe that microservice architecture is being questioned, what alternatives there are and why they are better?
It's because you believe the pros of type checking always surpass the cons. But the Python user base is very diverse, with a huge difference in tastes, skills, goals, time and constraints.
That's why you can code in imperative, OO or functional and not just one paradigm.
That's why you can choose between threads, processes and asyncio, or callbacks vs await.
And of course, declaring your types or not.
This allows Python to be suitable for geography, data analysis, scripting, web dev, sysadmin, machine learning, UI, pentesting, etc.
Python is never the best language at anything. But it's a damn good language at most things. It's an invaluable powerful versatile toolbox because it gives you the margin to adopt the style that fits your problem instead of forcing one on you.
It also has the benefit of not getting out of fashion precisely because of that.
>Python is never the best language at anything. But it's a damn good language at most things. It's an invaluable powerful versatile toolbox because it gives you the margin to adopt the style that fits your problem instead of forcing one on you.
I agree with this so much. The best programming language for me isn't necessarily the best at anything, and shouldn't need to be. But it should be 80% as good as the best at everything. And that in itself is possibly even harder to achieve.
I really wish Ruby was the case here. But clearly Python is taking this title.
I think the dominant reason why ruby lost out was because python won the educational market; there was something seductive about forcing children to use good indentation. The final blow came when the difficult 1.8.5 -> 2.0 transition made the act of installing ruby a challenge, of course this happened to python not too long after with the 2.7 -> 3 transition.
Frankly, Ruby lost due to the lack of libraries compared to Python. The latter had a few years of head start.
Anecdotally, I've heard the syntax may have been the cause. In those early days, Ruby was too much like Perl. If you didn't like Perl's syntax and philosophy, Python was a good alternative. If you were the type of person who grokked Perl, then why go to Ruby when you have all of CPAN at your disposal?
Precisely. Ruby was such a hit with rails they never bothered creating an ecosystem outside of the web.
JS doesn't have this problem because it has the terribly unfair advantage of a monopoly of the most popular platform in the world.
But Ruby did not; the server was open to all. And so when competition arrived on the backend, people chose a language that you could use for the web and something else.
I hate that Rails came to dominate the Ruby language in most people's minds. It's such a nice dynamic language on its own that you can use anywhere you use Python (if the libraries exist for it).
And there were minimal frameworks like Sinatra when everyone was running to Node to use Express because of Rails, ignoring that Rails wasn't the only Ruby game in town.
IIRC, Express was actually originally based on Sinatra.
When I was doing Ruby server stuff I was using Sinatra, and when I switched to JS (for various reasons) I looked specifically for Sinatra-like libraries and found Express.
> Python is never the best language at anything. But it's a damn good language at most things. It's an invaluable powerful versatile toolbox because it gives you the margin to adopt the style that fits your problem instead of forcing one on you.
This is meaningless bullshit. I could say the same thing about Java - tons of battle-tested high-level libraries, and a variety of frontends that all compile onto the same runtime. Just like Python, you can write imperative or functionally, and have your choice of working in a hand-holding high-level framework or down close to the API.
It's a programming language, if it's not a powerful+versatile toolbox it's doing something wrong. Now, if we look at the cases where a programming language is obviously not fit-for-purpose, the common thread is the inability to scale to larger codebases without the programmer effort becoming exponential.
And that's where Python fails, because it's missing a critical piece of the toolbox - type checking. It makes maintaining code vastly more difficult, because you don't have any compile-time checking about how your code is being used. That makes it brittle and difficult to refactor.
> It also has the benefit of not getting out of fashion precisely because of that.
PHP also never goes out of style. Should we all be taking cues for language design from PHP?
I don't think Python is that bad. I write Python code when it's appropriate. But it's not the language I'd choose to write a large, long-lasting codebase in, either.
I find it funny that everyone misremembers this. The zen states that there should be one, and preferably only one, obvious way to do it. Not only one way to do it, but one obvious way, and if you have one obvious way you probably don't need more than one.
There's always been more than one way to do things. String formatting, loops/map-filter/comprehensions, even importing.
The idea is that for whatever you're doing, the language should provide an obvious way to accomplish that. Sometimes that means having competing ways of doing it, so that distinct but similar problems both have obvious solutions.
That (and the Zen of Python in general) was always intended to be primarily about the development of the language itself and its core libraries. Of course it carried over a fair bit, but I don't think it was ever really a thing enough to say that the community abandoned it.
The zen is a target, but python fails to reach it very often. After all, duck typing is implicit, there are many ways to format strings, and the logging library is very much nested.
I love type checking and have a huge disdain for dynamic languages.
But I also recognize that I work on codebases that live a long time and change and get refactored A LOT. I can't imagine a statistical researcher being effective if he were forced to write his one-off experiments in something like C#, although it's almost the best language for me and my tasks.
Programming languages are too complicated to distill down to one feature; it won’t make sense if you look in isolation.
Python became popular after strong typing was known but before academic research into more advanced systems had entered the mainstream. For many programmers, that experience meant slow compilers, unhelpful error messages requiring ugly syntax to fix, recreating a lot of practical things which were built in to Python, etc. We don’t know how many of them would have preferred something with e.g. Rust-level tooling and capabilities if that had been available.
I think there’d be an interesting study about why that happened, and why languages like Perl peaked and declined so quickly.
Anecdotally, I noticed a lot of people who missed some feature from Lisp, C, etc. but generally decided that Python made everything else enough easier that they didn’t mind very much. There’s probably an interesting discussion of language usability in that.
Engineering culture played a big role. Python was quite conservative. Additions to the language had to be proposed in a PEP, implemented, and demonstrated to carry their weight. Perl was extended quite haphazardly by comparison.
Perl 5 was abandoned largely because extending it became so painful, and Perl 6 was this crazy waterfall design process for the first few years. Perl had mostly lost all momentum, by the time Perl 6 development started to get traction.
My experience was that discourse around Python also tended to be more civil and humane, and that came from the top (Guido & Tim.) I think that played a role in or was otherwise somehow connected to the better engineering discipline.
It's all relative. Perl 5 is more active now than at any time in its history.
Python has expanded 20x in the last 20 years. So Python has in relative terms eaten Perl's proverbial lunch and overshadows Perl so much that it's easy to think Perl just died. But it's actually successfully scavenging and growing nicely in the shadows despite the 20 year long drumbeat of pronouncements of its death.
From my perspective it does look like extending Perl 5's core is painful but that hasn't stopped it growing to a half million lines of change a year (and many millions a year in the upper reaches of the CPAN river and tens of millions further downstream) and folk writing ever more powerful pan ecosystem tools as, eg, described in this report from a couple days ago:
(That all said, I'm personally more focused on Perl 6 which is a new ball game even while it's also an extension of the old one in the sense that it cohabits in the same ecosystem and culture.)
perl5 is not abandoned, it's just currently not fashionable. There's nothing I can do server-side in any of the other important dynamic languages that I can't do in perl.
Over the last couple of weeks I've been training people who've been stuck in svn for far too long to use git. My go to three sentences to help orient them has been "git is like perl. It's extremely useful and there's nothing you can't do with it. The problem is there's lots of different ways of achieving the same thing."
I'm not sure your argument that 'top down decisions are good' is the reason is valid. And the whole 'whitespace is syntactically meaningful' thing in python gives me the heebie-jeebies. On the other hand when I decide I need to learn more maths, python and sympy is the tool I reach for.
I think it was one of the first (possibly the first) to introduce serious package management and that had a lot to do with its sudden popularity. Suddenly developers could build upon each other's work really easily. That was definitely the jesustech of its day.
Perl's weak type system and cryptic syntactic muck probably had a lot to do with its decline.
One of Perl 5’s big flaws was not having an object system in the language. Different people wrote different modules on CPAN, which was deservedly popular in those days, but it often meant you’d have to kludge interfaces between multiple third party systems and other things people hacked together.
That and the syntax were why I added Perl to my bash policy that any program big enough to require scrolling would be ported to Python. Usually the richer standard library also meant that the new version ended up considerably smaller, too.
Perl 5 does have an object system, one that was based in no small part on Python.
About the biggest difference is the built-in constructor `bless` is simplistic, and should be wrapped in a method.
class Person:
    def __init__(self, name):
        self.name = name
    def sayHi(self):
        print('Hello, my name is', self.name)
p = Person('Swaroop')
p.sayHi()
In Perl 5
#!/usr/bin/perl
use v5.12;
use warnings;
use feature 'signatures';
no warnings 'experimental';
# Allows for Python style constructor syntax
sub Person {
    Person->new(@_);
}
package Person {
    # this could be added to a base class instead
    sub new($class, @args){
        my $obj = bless {}, $class;
        $obj->__init__(@args);
        $obj;
    }
    sub __init__($self, $name){
        $self->{name} = $name;
    }
    sub sayHi($self){
        say 'Hello, my name is ', $self->{name}
    }
}
my $p = Person('Swaroop');
$p->sayHi();
From a structural point of view, there is almost no difference. What little semantic difference there is could be put into a module.
Not that I would write it that way when Moose-like class modules exist
# Allows for Python style constructor syntax
sub Person($name){
    # use the default constructor
    Person->new( name => $name );
}
package Person {
    use Moo;
    no warnings 'experimental'; # subroutine signatures
    has name => ( is => 'ro' );
    sub sayHi($self){
        say 'Hello, my name is ', $self->{name}
    }
}
my $p = Person('Swaroop');
$p->sayHi
Moose was based on an early design for classes in Perl 6.
(and is apparently so good there are implementations of Moose in Python and Ruby)
class Person {
    has $.name;
    method sayHi(){
        say 'Hello, my name is ', $.name
    }
    # Allows for Python style constructor syntax
    submethod CALL-ME($name){
        self.new( :$name )
    }
}
my \p = Person('Swaroop');
p.sayHi();
Whenever you are rewriting code, you will see ways of making it simpler and more concise. So there are probably just as many instances where if you translated from Python to Perl 5 it would come out shorter. (Or even Python⇒Python, Perl5⇒Perl5)
Please note that I said mainstream. Haskell took a long time to mature into its modern form, add the general features a mainstream language needs, stable performance & memory usage, etc. Even now it’s generally considered one of the more difficult languages to learn and something of a niche commercially.
That’s not a slight against the language — they were focused on other goals — but for a long time the conventional wisdom was that it was interesting for CS research and learning but not normal application development whereas you could get quite a lot of work done reasonably in Python at least one decade, probably two, earlier.
Python is strongly typed. Now it has static types enforced at runtime. It’s good to be precise when there are so many quirks to expressing and checking types.
I'm sorry to break it to you, but that's an oxymoron. The whole idea of static types is that they don't need to be validated at runtime.
Python has a static type system which is gradual (ie. allows for untyped code) and is separate from the strongly but dynamically typed semantics of Python at runtime. Now, the static type system tries to mirror the dynamic semantics wherever it can, but it's still a completely separate beast. In other words, you don't enforce static types at runtime - you enforce dynamic types at runtime as usual and have an option of additionally using static type system. By the time you run the code, the static types are mostly gone. The fact that the static types reflect the dynamic type system produces an illusion that the static types remain, but they don't.
If what you said was true, the following code would not work:
x: int = 0
x = "0"
print(x)
but it's still valid Python code which runs just fine.
The Python VM does nothing with the computed type hints. They're only useful for 3rd party tools that want to check the types. PyCharm does it natively. mypy and pyre are command line tools you can use stand alone or plug into an editor (VSCode integrates very well with mypy).
That's not how it works. Type annotations are ignored at runtime. And to be pedantic, even if they were evaluated at runtime, they would be dynamic types, not static types. :p
The type annotations exist to make the code easier to read and to enable certain kinds of static analysis (including a type checker).
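A small sketch of what "ignored at runtime" means in practice. The function `double` is a made-up example; the point is that the annotations are kept as metadata on the function object but nothing consults them when the function is called.

```python
def double(n: int) -> int:
    return n * 2


# The annotations are stored as plain metadata on the function object...
hints = double.__annotations__

# ...but the interpreter does not check them at call time, so passing a
# str "works" and silently does string repetition instead of arithmetic.
result = double("ab")
```

A static checker such as mypy would flag the `double("ab")` call; the interpreter itself has no opinion about it.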
There's disagreement on that point and I think it's worth pondering. Python has more typing than some similar languages but less than popular static languages. I think there's an argument that the value curve for type checking has thresholds and Python hits a point where a fairly large percentage of people feel that going further starts costing more in overhead than it saves, with many projects being small enough that it's not an especially significant source of problems.
Python is not "strongly typed". Please don't repeat this phrase near management folk. Python just gives good error messages when it crashes at runtime, e.g. "expected an Int but got a String". This is better than a segfault, but it's still a crash.
Call it "strongly tagged" if you like, but "type" unqualified should mean static type. Even "dynamic type" is a marketing hijack of the word type.
"getting the two confused" implies that your terminology is standard, which it isn't. It is also quite common to use the term "type" to refer exclusively to properties that are checked statically.
Arguably "strong dynamic typing" is a property of libraries, not languages -- e.g. int conceptually has a method like:
def __add__(self, other):
    if not isinstance(other, int):
        raise TypeError(...)
    return _unchecked_add_ints(self, other)
It's implemented in C for efficiency, but that's basically the semantics. Critically, this doesn't magically extend to user-written libraries, so unless you actually write all of that boilerplate nothing but a handful of methods in the standard library can be claimed to be "strongly typed". I've written code with a bunch of asserts in methods like that after determining that a large percentage of bugs were type errors that were silently doing the wrong thing. Without something like mypy, Python is untyped by any reasonable definition; it provides no support for users to actually work with types in any meaningful way.
I think you got the user-written part the wrong way around. It is strongly typed because unless you define the operators/methods you explicitly want to enable, they don't exist. For example if you try "object() + object()", you get an exception. You can implement the addition in a subclass, but that's explicit then.
Compare to JS where "new Object() + new Object()" results in a string by default.
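For the Python side of that comparison, a minimal sketch. `Money` is a hypothetical user-defined class; it shows that `+` only exists where a class explicitly defines it, and that the class can still refuse operands it doesn't understand.

```python
# Plain objects define no '+', so this raises instead of coercing.
try:
    object() + object()
except TypeError as exc:
    plain_error = exc


class Money:
    """Hypothetical user type that opts in to '+' explicitly."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented  # decline; Python then raises TypeError
        return Money(self.cents + other.cents)


total = Money(150) + Money(250)

# Mixing with an int is still rejected, because neither side accepts it.
try:
    Money(150) + 1
except TypeError as exc:
    mixed_error = exc
```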
You only get an exception if and when something in the bowels of your implementation bangs into a low-level operation that has one of these checks in it. For example:
>>> import subprocess
>>> subprocess.call([2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/subprocess.py", line 172, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
AttributeError: 'int' object has no attribute 'rfind'
In the case of `object() + object()`, that check is in `object.__getattribute__`
My point is that the checks are a property of a definition of those (possibly built-in) classes; the language doesn't provide any facility to talk about types as such. Nothing in the language knows that the argument to subprocess.call should be a list of strings, it just happily executes until it hits what is essentially:
>>> (2).__getattribute__('rfind')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'rfind'
And `__getattribute__` throws an exception. This is the best case scenario. Worst case it never hits something that these primitive types are aware is not ok, and it just does the wrong thing. There's no way to specify typing invariants in a way the language understands -- you just have to put in the manual check yourself. This is what I mean when I say strong types aren't part of the language.
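To illustrate the "manual check" I mean, here is a rough sketch. The function `run` is hypothetical and is not how subprocess is implemented; it just shows the boilerplate you have to write yourself if you want the error at the boundary rather than deep inside.

```python
def run(args):
    """Hypothetical wrapper that validates its input up front."""
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        raise TypeError(f"args must be a list of str, got {args!r}")
    # Stand-in for the real work; here we only join the arguments.
    return " ".join(args)


joined = run(["echo", "hi"])

# The bad argument is rejected immediately, with a message about the
# caller's mistake rather than about a missing 'rfind' attribute.
try:
    run([2])
except TypeError as exc:
    boundary_error = exc
```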
But if you add static types to the mix (say via mypy):
If it were actually checking what should be the types involved in the code you wrote you'd get something like this:
error: List item 0 has incompatible type "int"; expected "Union[bytes, str, _PathLike[Any]]"
...which is what mypy reports when run on that code. Python isn't checking the type of the argument to subprocess.call ever, not even at runtime. If you're lucky, eventually you hit some code that has an explicit sanity check in it and it raises an exception.
There's an interesting point in the design space that you see with julia[1], and also with mechanisms like racket's contracts[2], where the types aren't checked statically, but they are checked at runtime, unlike the example above with subprocess, where you only get the error deep in the implementation when something actually explodes. I think you could sensibly say that the "strength" is actually a property of Julia, rather than its libraries, but in Python the "strength" isn't part of the language per se.
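You can approximate that design point in Python with a decorator, though it is opt-in library code rather than a language feature. This is only a sketch under stated assumptions: `checked` and `repeat` are names I made up, and it only handles annotations that are ordinary classes (no generics, unions, or string annotations).

```python
import functools
import inspect


def checked(func):
    """Check plain-class annotations on every call (hypothetical helper)."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # unannotated parameter: nothing to check
            # Only ordinary classes such as int or str are handled here.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{func.__name__}: {name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@checked
def repeat(text: str, times: int) -> str:
    return text * times


good = repeat("ab", 2)

# Without the decorator, repeat(2, 3) would quietly return 6.
try:
    repeat(2, 3)
except TypeError as exc:
    call_site_error = exc
```

The error now names the offending parameter at the call, instead of surfacing (or not surfacing at all) somewhere inside the implementation.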
The only difference with a "weakly typed" language is that those basic libraries have a much more footgun-like design -- again, not really about the language per se.
In a weakly typed language, that 2 would be implicitly converted to a string. So yes, python is absolutely strongly typed, as opposed to js which is not.
Indeed, but you have to explicitly do that. The language doesn't do it for you.
Implicit type coercion is the factor that defines weak typing. Python doesn't do implicit type coercion. Therefore it is not weakly typed. Libraries have nothing to do with it.
Yes you could, but you could do the same in Java by having a method take in Object and cast. Which by the way is exactly what's happening in the python code.
I am interpreting the term "library" somewhat broadly -- my notion is that int isn't conceptually any more core to the language than e.g. the numpy matrix classes. It's built-in for speed, but (apart from some syntactic sugar for literals), it's semantically just another class. In this light, the distinction is happening in the code that defines the int class, not in something deep in the language semantics.
The call to str isn't really a cast (which doesn't really have a counterpart in a language without static types); you're calling the constructor to the str class, which somewhere in it has a code path where it formats an int as a string (probably by way of calling the argument's __str__ method).
Sure, but int being core or not is totally irrelevant to whether or not the language attempts to convert things for you.
If int was auto coerced to string that would be one thing, but in weak languages, everything is coerced to everything when it might make a modicum of sense, even in user defined types.
Literals become strs, objects become strs, strs become ints, arrays and objects can be combined willy-nilly, leading to unintuitive results.
And since your object inherits from one of those things, you are stuck with that too. The language forces you into weakness.
Yeah okay you can jump through hoops to make your api kind of weakly typed in Java or python, but well it probably won't work in general, because the attempts to coerce will probably fail because the language doesn't know how to understand them, because the two types are incompatible. And unfortunately for your argument, not every type can be a library. Object has to be provided by the language itself.
So the question becomes, can you freely coerce between the base object type and another without loss of information. If yes, weak. If not, strong. Python: strong. Js: weak.
> Object.prototype.valueOf = function() { throw "I will not be coerced!" }
> 4 + {}
Thrown: I will not be coerced!
This doesn't seem fundamentally different from overriding getattr. What you've got in Js is basically a handful of pre-defined functions with some syntactic sugar, which are bone-headedly written to go out of their way to not report problems. But this really isn't any different than higher level libraries like nodemailer, whose attempts to be smart have been the source of vulnerabilities before:
>This doesn't seem fundamentally different from overriding getattr.
Right, but you can't override Object.__getattr__ in python. That's basically where the difference lies.
In python, `object`s don't naturally coerce, and you cannot force them to coerce. In js they do and you can.
Of course in any language you can write a badly behaved object that implements an interface that it really shouldn't, or in a way that is unexpected, but that isn't weak typing.
>Also, fwiw, Python has some questionable decisions in the basic "libraries" too:
Right, but even this isn't weak typing. It might be an interface you disagree with, but it's very explicitly written to work this way. Compare that, again, to a weakly typed language like JS where you can multiply a string by anything, and you'll get NaN instead of a type error. In python on the other hand, `4.3 * '5'` gets you a type error, because you can't have a 4 and 3 tenths character string.
Basically, the difference between what python does and what js does is that python says
`* ` is an operator implemented on a left hand and right hand object. So for `l * r`, I first attempt to see if the left hand object knows how to multiply itself by the right hand object (`l.__mul__(r)`), and if that works, great. Otherwise, I check if the right hand can multiply itself by the left (`r.__rmul__(l)`), and if that works, great. Otherwise, these don't match, so throw an exception. This is strong typing, where every object decides, for itself, which other objects it is compatible with. You can certainly write an object which is compatible with anything, but that's up to you. You get to decide which objects you are compatible with, you're not forced into an all or nothing situation.
JS on the other hand says `* ` is an operator defined over things of type number, so to make `* ` work, I will implicitly convert the left and right hand sides to numbers (by repeatedly calling `valueOf`), and then multiply whatever I get from that. The language decides what you are compatible with, and how. You can't just be compatible with numeric types, if you want that, you're gonna be compatible with arrays and objects too, and there's nothing you can do about it.
The difference, to put it more simply, is that python (and strongly typed languages in general) doesn't have valueOf. You're confusing strong typing + operator overloading with weak typing, and those aren't, at all, the same.
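A minimal sketch of the Python half of that description. `Meters` is a hypothetical class; returning `NotImplemented` is how an object tells the interpreter "I don't know this operand, ask the other side".

```python
class Meters:
    """Hypothetical user type that only agrees to be scaled by real numbers."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        if isinstance(other, (int, float)):
            return Meters(self.value * other)
        return NotImplemented  # decline anything else

    # For 3 * Meters(2): int declines first, then this method is tried.
    __rmul__ = __mul__


left = Meters(2) * 3
right = 3 * Meters(2)

# Neither operand accepts the other, so Python raises instead of coercing.
try:
    Meters(2) * "3"
except TypeError as exc:
    refused = exc

# The built-in case from the discussion behaves the same way.
try:
    4.3 * "5"
except TypeError as exc:
    builtin_refused = exc
```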
The difference becomes clear when you take your example (with the no-coercion object) and try to add an array and an int. The Object's error gets thrown. In python, it would be possible to have a list's __add__ handle an int by appending it, but handle another list by extending. In JS, that's impossible. That's because what python and Js are doing are different. Python defers to each type for how it wants to handle other types, but allows them to say "nope, this isn't valid". Js doesn't do that, either you are incompatible with anything ever, or you are compatible in ways neither you nor the other object you are working with can control.
The key thing I'm arguing is that + * / etc. aren't special, besides some very superficial syntactic support. Your description of python's `*` basically boils down to:
Besides the syntax (and, again, the performance implications of the implementation), there really is nothing special about '+', '*', '/' etc. They're just functions. And the implementation above is no more or less a design decision than the implementation of int.__mul__, which does the aforementioned thing with strings. They very much could have implemented `multiply` as:
def multiply(l, r):
    while not isinstance(l, (dict, list, int, float, str, bytes, ...)):
        l = l.value_of()
    while not isinstance(r, ...):
        r = r.value_of()
    if type(l) is dict:
        l = 'I love buggy software'
    ...
Thankfully the designers of the python builtins had a bit more taste than that. But, ultimately, just like any stand-alone function in a library, if you don't like its behavior, you can't just override a method on your own classes to make it operate differently on them, unless the function specifically makes use of that method. Your only real option is to call a different function.
And unfortunately, the design decisions made for those operators don't extend to anything else written in the language. By default, if I define a function that I intend to only make sense to call with an argument of type requests.Session, and somebody (possibly me) passes it an xml.etree.ElementTree.Element, Python will happily chug along, with no way of knowing that this makes no god damn sense, until something goes wrong who knows where. And there is no declarative way for me to tell it otherwise; I have the same set of options available to me as in javascript.
I think it's at least coherent to claim that the built-in syntax makes these operators more than just functions, but... meh? It seems like a fairly trivial difference at that point. Even if you grant that these are really part of the language in a non-trivial way, it feels rather like saying that Go has generics because of the built-in slices, maps, channels etc, or early Java, because of arrays. It's just not the same thing as a mechanism that actually extends to the rest of the language in a meaningful way. If "strong typing" is to mean something about the language, it should apply to more than a handful of built-in operators. Python just doesn't have a mechanism that could be considered language-level support for any kind of typing (again, ignoring mypy and related stuff).
> And unfortunately, the design decisions made for those operators don't extend to anything else written in the language.
Yes, it applies to a number of built in functions and other operators beyond the mathematical ones.
And there's no reason they can't to even non-built-ins. When you implement a method or function, you can adopt a similar special-method-based protocol for it's arguments, and define the appropriate special methods (monkey patching existing classes, if necessary) for the acceptable types.
> If "strong typing" is to mean something about the language, it should apply to more than a handful of built-in operators.
I suppose there is a difference between a language with a weakly typed core (where you cannot avoid the risks of weak typing without isolating the core), one with a strongly typed core that doesn't force user code to be strongly typed (where Python sits), and one that forced strong typing (which probably had to be static, as well).
> I suppose there is a difference between a language with a weakly typed core (where you cannot avoid the risks of weak typing without isolating the core), one with a strongly typed core that doesn't force user code to be strongly typed (where Python sits), and one that forced strong typing (which probably had to be static, as well).
Yeah, fwiw I do think that the distinction being made with javascript vs. python is meaningful, it just isn't really about the language per se, but rather the design of "built-in" functions, which often don't need to be built in as far as their semantics are concerned.
In my comment to the sibling, I reiterate the reference to julia/racket -- you could have a language feature that does dynamic checks at the call site. IIRC, Haskell's GHC has a flag to basically compile down any type errors to exceptions (I don't know that anybody uses it though).
This doesn't look that different, but it absolutely is hugely different than the example you provide. Consider two types, `int` and `MyCustomType`. `int` has no concept of `MyCustomType`, given that I just came up with `MyCustomType`, and `int` has been a part of the standard library for over a decade.
If multiply were implemented as you suggest, we could do `MyCustomType * int` and it would succeed, but `int * MyCustomType` would fail. This is, again, because `MyCustomType` knows how to deal with an `int`, but the reverse isn't true, so if you only rely on the lhs implementation, you get weirdness.
As a language designer you have a few options for handling the `int * MyCustomType` case.
1. You have the language semantics say "when your type is multiplied by, we implicitly convert it to something compatible with multiplication, first by calling toString, then by taking the length of the resulting string. We also do the same to the rhs to make sure it can multiply. In other words, `multiply` is
This is weak typing. Note that while `*` will always work (because everything gets a toString method (or valueOf)), the library writer has very little control over what happens.
2. You defer to each object. Each object says "this is what I can do". If either object believes itself to be compatible with the other object, great! If not, you raise some sort of error. This can either be done via capabilities/interfaces (which gets you duck typing), or via strict inheritance/class names (which python does in the standard library perhaps more than it should).
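For reference, this is roughly how option 2 looks in Python. It is only a sketch: `MyCustomType` here is a toy wrapper around a number that I made up, but `__mul__`, `__rmul__` and `NotImplemented` are the real hooks the `*` operator uses.

```python
class MyCustomType:
    """Toy wrapper that knows how to multiply itself by an int."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        if isinstance(other, int):
            return MyCustomType(self.value * other)
        # Tell the interpreter "I can't handle this"; it will then ask the
        # other operand, and raise TypeError if nobody can.
        return NotImplemented

    def __rmul__(self, other):
        # Called for `other * self` when `other` does not know MyCustomType.
        return self.__mul__(other)


left = MyCustomType(3) * 2    # handled by MyCustomType.__mul__
right = 2 * MyCustomType(3)   # int gives up, MyCustomType.__rmul__ takes over

try:
    MyCustomType(3) * object()
    failed = False
except TypeError:
    failed = True             # neither side claimed compatibility
```

`int` never had to learn about `MyCustomType`, and nothing was coerced along the way.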
Multiply in python can't work the way you suggest because multiply is both more powerful and less broken than your suggestion. What you suggest cannot handle the example of MyCustomType unless multiply already understands how to deal with something that emulates what I want MyCustomType to do.
If, for example, multiply were implemented the way you suggest, something like
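from collections.abc import Iterable

def multiply(l, r):
    # Hard-coded rule: int * iterable means "repeat the iterable l times".
    if isinstance(l, int) and isinstance(r, Iterable):
        return [x for _ in range(l) for x in r]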
that works well and good, until I define a type `ndarray`, which is absolutely an iterable, but for which I actually want multiply to broadcast. The language doesn't know that, so with weak typing I end up getting out a much longer list. But with strong typing, I can control how things work.
So to be very clear, I think if you believe that `multiply` could have been implemented the way you believe, you have a core misunderstanding about how python works, because python absolutely could not work the way you describe.
>And unfortunately, the design decisions made for those operators don't extend to anything else written in the language. By default, if I define a function that I intend to only make sense to call with an argument of requests.Session, and somebody (possibly me) passes it an xml.etree.ElementTree.Element, Python will happily chug along, with no way of knowing that this makes no god damn sense, until something goes wrong who knows where.
Right, and as strong typing suggests, as soon as you attempt to treat an Element like a Session, you'll get an error (either a TypeError or an AttributeError, depending). The language won't attempt to make things work, or try to hold over for a while by, for example, making attribute access return `undefined` instead of blowing up as soon as something goes wrong.
The difference is that strong typing makes the error happen as soon as you misuse an object (but not before, because that's not possible). Weak typing delays that as long as possible.
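A small sketch of what that looks like in practice. `FakeSession` and `add_auth_header` are hypothetical stand-ins (I'm avoiding a dependency on requests here); `xml.etree.ElementTree.Element` is the real standard library class.

```python
import xml.etree.ElementTree as ET


class FakeSession:
    """Hypothetical stand-in for requests.Session, with just a headers dict."""

    def __init__(self):
        self.headers = {}


def add_auth_header(session, token):
    # Written with a Session in mind, but nothing here says so to Python.
    session.headers["Authorization"] = f"Bearer {token}"
    return session


ok = add_auth_header(FakeSession(), "abc")

element = ET.Element("item")
try:
    add_auth_header(element, "abc")
    error = None
except AttributeError as exc:
    # The call itself was accepted; the error surfaces at the first line
    # that actually treats the Element like a Session.
    error = exc
```

Note that both sides of this argument are visible here: the call is not rejected up front, but the first misuse raises instead of quietly producing an `undefined`-style value.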
Strong typing works the same way with any function. `len` is a good example. The basis of strong vs. weak typing is essentially that the language doesn't allow things that don't intend to comply with (implicit or explicit) interfaces to comply with them in unexpected ways. You can always make something that intends to comply with an interface badly. But if I make something, in a strong language, you can't make it behave in ways that it "shouldn't". Another way of putting this might be simply that weak(er) languages have Object implement a number of interfaces (such as Comparable, Number, String, Iterable, which are all implemented, badly, by everything in js), whereas stronger languages do not (in python, object implements only String and Hashable). Further, stronger languages have more interfaces. For example, python separates "number" from "multipliable" in a way that JS does not.
But "strong" typing is totally and completely independent of "static" typing: strong vs. weak is a separate axis from static vs. dynamic. Many people consider C to be a statically but weakly typed language, because you can do things like cast a string to a struct willy nilly. You can't do that in python or Java. And note that what this means is that you can take an existing object and cast it to be something totally unlike what it should be. I can give you "Hello world" and you can pass that to a function that expects a struct{int, int, int}[4]. Something will happen. Similarly, in js you can pass a string into a function that expects an int, and something will happen. You try that in Python and you'll get a type error. You try that in Java, even if you do bad evil things like cast everything up to object so that it compiles, and you'll get runtime errors.
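You can check the claim about `object` directly. This is just a quick sketch; `raises_type_error` is a helper I made up for the demonstration.

```python
plain = object()

# The two "interfaces" every Python object gets for free.
text = str(plain)
digest = hash(plain)


def raises_type_error(operation):
    """Return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


# Everything else has to be opted in to; a bare object refuses all of these.
checks = {
    "len": raises_type_error(lambda: len(plain)),
    "iter": raises_type_error(lambda: iter(plain)),
    "ordering": raises_type_error(lambda: plain < plain),
    "addition": raises_type_error(lambda: plain + 1),
    "int": raises_type_error(lambda: int(plain)),
}
```

Every entry in `checks` comes out True, which is the "object does not implement Comparable, Number or Iterable" point from above.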
> This doesn't look that different, but it absolutely is hugely different than the example you provide.
The salient point about the example I provided was that it's implemented in Python -- I appreciate the correction regarding the actual semantics, and I agree that's a somewhat better design, but I also think the difference is irrelevant to the point I was making: that these operators aren't magic beyond their syntax.
> The difference is that strong typing makes the error happen as soon as you misuse an object (but not before, because that's not possible). Weak typing delays that as long as possible.
Sure, but my complaint is that if the contract for a function says it only accepts a Session, then passing it an Element is in and of itself an attempt to use an Element like a Session. This is the key point -- Python has absolutely no knowledge or understanding of the intended type of the function's arguments, and so as you say, it is impossible for it to know to signal an error here. For strong typing to be a language feature there would have to be some way of capturing that sort of thing. Instead, we have specific methods like int.__mul__ etc. in classes that ship with python that check their arguments and raise exceptions. That's good library design, but it's not a property of the language itself.
Note that catching the error at the call site is not as ambitious as catching it at compile time -- see the references I made in earlier comments to racket and julia, which can do some level of dynamic checking for user-defined functions/types.
The semantic difference between your multiply and mine is vital. Yours can only get generic through coercion (weak types), mine can be generic without coercion (strong types).
What you're describing in Julia is more static types, not stronger types.
The more static a language, the earlier it will raise a type error. The weaker a language, the easier it is to violate an interface with no error at all.
Catching the type error earlier is being static. Having a type error at all is being strong. That's the difference.
Python made design decisions, like no undefined, explicit instead of implicit varargs, operator overloading via interface rather than inheritance, etc., which make the language more likely to raise type errors instead of allowing the successful misuse of types.
You're trying to make strong typing another flavor of static typing, and it's a totally independent concept.
When you talk about "Strong types," what actually constitutes a type in this terminology?
Re: the semantic difference, my point is that whatever the semantics of this multiply are, it's not really "part of the language" in any deep sense -- it is just another function. Given that, the semantics of multiply is entirely irrelevant to questions about the semantics of the language.
Re: "trying to make strong typing another flavor of static typing" I absolutely am not. The "static" in static typing very much means "before the entire program is run" not just a vague "early."
>Re: the semantic difference, my point is that whatever the semantics of this multiply are, it's not really "part of the language" in any deep sense -- it is just another function. Given that, the semantics of multiply is entirely irrelevant to questions about the semantics of the language.
Well but that's only half true. That's like saying that the semantics of `int` aren't relevant to JS because it's just an object and you can implement it however you want. Except that strong vs. weak typing is quite literally a question of "what are the semantics of int in JS". You can't discuss weak vs. strong types without discussing the language's actual semantics, and not the semantics of some DSL you can define on top of the language. They're all Turing complete.
>When you talk about "Strong types," what actually constitutes a type in this terminology?
The types that can be strong or not are exactly the same types that can be static or not. They're just two independent dimensions upon which a language can vary its semantics.
>I absolutely am not. The "static" in static typing very much means "before the entire program is run" not just a vague "early."
Fine, stop confusing "strong vs. weak typing" with "runtime type checking". Strong vs. weak typing is about coercion. Runtime type checking is about type checking.
I really think that thinking in terms of interfaces is the easiest way to conceptualize this. Object in js implements a whole host of interfaces that everything then inherits from. The same is untrue in python. This leads to implicit coercion according to those (wrongly implemented) interfaces.
Much like you can consider dynamic typing to simply be static typing where every object is of type Any, you can consider weak typing in the limit to be strong typing where every object implements every interface (not claims to, but actually does, for some value of actually). So these arguments about "well the implementations don't matter" don't hold water. The implementations (specifically of Object) are the difference.
We seem to have hit some upper limit on the nesting depth in a thread on hackernews; there's no reply button for me on your latest comment[0] so I'm replying here.
This seems to be the crux of the disagreement:
> Except that strong vs. weak typing is quite literally a question of "what are the semantics of int in JS". You can't discuss weak vs. strong types without discussing the language's actual semantics, and not the semantics of some DSL you can define on top of the language. They're all Turing complete.
To some extent it's a definitional issue; like I said in an earlier comment, I think you can sensibly argue that because + / * etc are privileged with special syntax they are "part of the language." If you choose to define it that way, then yes, the semantics of int, +, etc. matter. If you're willing to treat the syntactic sugar as unimportant, then you can just wrap the bad library api with a better one:
var TypeErr = {};
var AttrErr = {};
var mul = function(l, r) {
  if(typeof(l) === 'number' && typeof(r) === 'number') {
    // both are built-in numbers; use built-in *.
    return l * r;
  }
  try {
    if(typeof(l) !== 'object' || !('__mul__' in l)) {
      throw AttrErr
    }
    return l.__mul__(r)
  } catch(e) {
    if(e !== TypeErr && e !== AttrErr) {
      throw(e)
    }
    if(typeof(r) !== 'object' || !('__rmul__' in r)) {
      throw AttrErr
    }
    return r.__rmul__(l)
  }
}
// Javascript doesn't actually have ints, just floats, but there's a
// common trick with the bitwise operators to get them; let's abstract it out:
var int = function(n) {
  return {
    _value: n,
    __mul__: function(r) {
      return int(mul(this._value, r._value)|0)
    },
    __rmul__: function(l) {
      return int(mul(l._value, this._value)|0)
    },
  }
}
console.log(mul(7, 2))
console.log(mul(int(4), int(2)))
console.log(mul(4, "hello")) // this throws AttrErr.
It's not really any different (again, if you discount the privileged syntax) than using requests instead of the mess that is urllib, or any other instance of "that API is terrible, let's use a different library."
> Fine, stop confusing "strong vs. weak typing" with "runtime type checking". Strong vs. weak typing is about coercion. Runtime type checking is about type checking.
Run-time checking is a requirement to avoid coercion (assuming you also don't have static checking) -- everything is ultimately just bits, so if the check never actually occurs, it just "does something." Granted, you don't get active, willful coercion like in Javascript, but ultimately if the implementation of int.__mul__ didn't do some kind of run-time checking, you would just get some garbage number when you called it. If you don't have any checking, you have coercion. Probably not object -> string, but quite likely object -> god-knows-what-memory-corruption. This is the nature of much of what you describe as C's "weak-typing" -- no runtime checks. The only difference between dereferencing a NULL pointer in C and doing None.foo in python is a run-time check.
> I really think that thinking in terms of interfaces is the easiest way to conceptualize this.
I agree. And my argument as to why strong-typing is not a feature of python-the-language is that it doesn't provide any declarative way for me to extend that property to my own interfaces; if I have some higher-level interface that doesn't fall right out of the existing Python libraries' interfaces, I have to do all of the same kind of work that I did in the above snippet of Javascript.
I think the concept of strong vs. weak typing that you're describing is coherent, but only (a) as a property of libraries, not languages, which is my argument, or (b) you assert that the built-in syntax is central. I think the latter is defensible.
Note that I am not saying that design choices of libraries that ship with the language don't matter, or even that they don't matter more than something used less often. People use these libraries every day, because they are there. The best thing Python has going for it is its ecosystem, and if you're e.g. evaluating it as a tool to use, splitting hairs like this over part-of-language vs. library probably doesn't make a ton of sense.
>It's not really any different (again, if you discount the privileged syntax) than using requests instead of the mess that is urllib, or any other instance of "that API is terrible, let's use a different library."
Sure, but all you've really done is defined a new type system distinct from that of JS or python [1]. I'll agree your type system is different from JS's. Again, the fact that you can write a new language, with different semantics, within JS is not and should not be surprising.
>Run-time checking is a requirement to avoid coercion (assuming you also don't have static checking)
Ish. Duck typing avoids both of these, unless you consider the extreme that the existence or lack of any specific method defines an interface, and every object implements some subset of these interfaces, but that's not a particularly useful abstraction.
That is to say `try: lhs.method(rhs.value), except: TypeError` avoids both coercion and type checking.
>I agree. And my argument as to why strong-typing is not a feature of python-the-language is that it doesn't provide any declarative way for me to extend that property to my own interfaces; if I have some higher-level interface that doesn't fall right out of the existing Python libraries' interfaces, I have to do all of the same kind of work that I did in the above snippet of Javascript.
Huh? I'm going to challenge you to give an example, because I don't think you'll be able to.
>I think the concept of strong vs. weak typing that you're describing is coherent, but only (a) as a property of libraries, not languages, which is my argument, or (b) you assert that the built-in syntax is central. I think the latter is defensible.
To be clear, (a) is a valid interpretation, but only if you consider the base `Object` type to be a library (or more broadly, the base type). This isn't a realistic assumption. As soon as you write your own object, you've created a new language with similar syntax but distinct semantics. Basically yes, it's absolutely possible to implement a strongly typed language within a weakly typed language. But it's also possible to implement a statically typed language within a dynamically typed one. That doesn't mean that the outer language isn't weakly typed. It is. You're just able to implement a stricter DSL in it, which again.
As a similar example, you can write Java code where everything is declared as Object. This doesn't make Java dynamically typed. It does however make your bit of code not statically checked. Similarly, you can write stuff within JS that is, within a fence, strongly typed, ish. You have to stop using a lot of the language's built in syntax and methods, so are you really writing JS any more? The syntax is similar, but the semantics are not.
But there are a lot of languages with similar (or even identical) syntax and different semantics. Python2 and Python3 are a good example.
There is no such thing as "strong typing", so no language has that feature. Perhaps you mean it's type safe in some dynamic sense, but strong and weak are not qualifiers that have any kind of rigorous definition in type systems.
Usually what is meant by weak and strong typing is that there are a lot or few implicit conversions between types which undermine the typesystem.
Imagine an esoteric language which automatically converts any type into any other type. For the conversion the language will pick the default value for that type or if there is no default value it will pick null. How useful would the typesystem be in this case?
This statement wouldn't be helpful even if it wasn't incorrect. You clearly don't like Python but accusing other people of malice for not sharing your personal taste is not going to accomplish whatever you believe it will.
This is about being open and honest to stakeholders about Python's strengths and weaknesses. Sure, I won't ever get the terminology used in your community to change, but at least you will know that there are people who find it rather uncomfortable and disingenuous.
Disingenuous is lying about people’s motives to make your opinion seem more like a truth.
It’s okay not to understand Python or typing as well as you think but that should be a chance to learn rather than your cue to accuse people of malice for giving the world free software.
Go read the link I posted by Professor Robert Harper before accusing me or anyone else of not understanding type theory.
Whether you like it or not, technology does compete for mindshare. Haskell and OCaml are also free software.
Prof. Harper didn’t mention Python. He definitely didn’t accuse people of dishonest advocacy. Stop using his name to support your comments.
You could try to do something positive like talk about how Haskell does something neat which Python doesn’t support. I can think of several examples which are far more interesting than arguing over whether your personal definition of “strong typing” is universally shared.
You really aren’t willing to avoid personal attacks, are you? It’s not doing anything productive.
The main thing I took away from that post was his belief that the static / dynamic divide was a false dichotomy. I do agree, however, that I was incorrectly remembering it too charitably: he and you are both willing to distract by accusing other people of acting in bad faith without bringing evidence to support such a strong accusation.
>This is better than a segfault, but it's still a crash.
Strong vs weak types is orthogonal to whether there's a crash or not. The crash is because the types are dynamic (vs static) which is a different axis than strong/weak.
There is no static/dynamic axis for types. A dynamic "type" as you call it, is really a runtime tag. All languages are static languages. "Dynamic" and "dynamically typed" are IMHO, marketing terms. There's nothing that "dynamic" Python can do that I cannot do in a typed language using runtime tags, string keys and variants.
Static/dynamic and weak/strong typing is about the language semantics and the runtime. You can't reduce the description the way you do, because for every language we end up with: there's nothing I can do in X that I can't do in untyped ASM, therefore everything is untyped. Or go down to either lambda calculus or logic gates. Either works.
What the language provides matters. For code like "1"+1 you have 4 main options:
- strong static - doesn't compile/start due to type error
- weak static - returns "11" and expression is known to result in a string at compile time
- strong dynamic - crash/exception at runtime
- weak dynamic - returns "11" at runtime, but the type is attached to the value, not the original expression
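As a quick illustration of the "strong dynamic" row, here is what Python does with that expression. This is only a sketch, and the variable names are mine.

```python
text = "1"
number = 1

try:
    result = text + number
except TypeError as exc:
    # Python refuses to guess whether "11" or 2 was intended.
    result = f"TypeError: {exc}"

# The programmer has to pick a meaning explicitly.
as_text = text + str(number)    # string concatenation
as_number = int(text) + number  # integer addition
```

The program starts fine (dynamic), but the expression itself raises at runtime rather than coercing (strong).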
Your classification is based on a trivial expression involving primitive types. What would most Python code do if one tried to append a SQL query to a filepath? Since both will likely be represented by strings, it will return garbage, "weak" in your definition. Whether it crashes or not in any particular implementation is not a feature of the language.
>Your classification is based on a trivial expression involving primitive types.
It's not "his classification", just his example.
The classification is established terminology in Computer Science.
>Since both will likely be represented by strings, it will return garbage, "weak" in your definition.
It's not his definition, and it's not weak in this case. Weak is not about getting garbage or not, it's about the semantics of a behavior. It could be that the result of the weak behavior is exactly what one wanted (e.g. to add a string to an SQL statement -- I might be adding together SQL statements and text about them to produce a book on SQL with my script).
In any case, the SQL statement is a string in Python, not its own type, so the result will not be weakly checked, it will be strongly checked as what it is (a string) -- Python will do what it should.
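Here is a rough sketch of both halves of that argument. The query and path values are invented; `PurePosixPath` is the real class from the standard library's pathlib.

```python
from pathlib import PurePosixPath

query = "SELECT * FROM users"

# Both values are plain strings, so concatenation is legal and Python
# does exactly what str + str is defined to do.
path_as_text = "/var/data/"
combined = path_as_text + query

# Give the path its own type and the same mistake is rejected.
path = PurePosixPath("/var/data")
try:
    path + query
    rejected = False
except TypeError:
    rejected = True
```

So whether the mistake is caught depends on whether the program models the path as its own type, which is the same in any language that lets you store everything as a string.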
That looks like a random list of papers. Which one are you claiming defines or even discusses "strong" or "weak" typing? AFAIK those terms have no real definition in type theory.
You said academics don't use the terms - you were given examples that do so.
None of them need to define it - if you're interested, you can do that research.
Those terms don't have definitions in type theory and that's ok. Systems defined in the type theory deal with very explicit, theoretical analysis that usually maps well only to static-strong implementations.
Yes goal posts have been moved, but not by me. Some of those papers may informally use these terms for static types, but it is certainly not to describe characteristics of dynamic languages. AFAIK, there is no classification in computer science that describes dynamic languages in this way. This has come from the dynamic language communities. None of those papers demonstrate that it is acceptable to call Python "strongly typed", in fact quite the opposite, as the technology discussed is not available in Python. Well, not yet anyway...
Perhaps filepath was a bad example for Python, as it has assertions. Zenhack actually made the same point and explained it better than me. Your runtime assertions (what you call "strong dynamic types") are a feature of the libraries you are using and a few built-in primitives. Python does not really have any facilities to talk about types. Using strings as in my example, would be "weak typing" in your terminology, but as you said, this is possible in any language, even Haskell. Therefore I fail to see how strong/weak is a language property. We are therefore left with static and "dynamic types", which Robert Harper argues should be called statically typed and statically unityped.
"Generally, a strongly typed language has stricter typing rules at compile time, which imply that errors and exceptions are more likely to happen during compilation"
Even Java, which introduced generic types long after the language itself had become standardized and popular, cannot keep generic type information after compilation due to backwards compatibility.
Don't quote me on it, but I believe C# doesn't have this problem.
There are libraries that do magic to do this. Annotations are accessible at runtime, so you can do things like add an implicit import hook that decorates every function or class with a decorator that validates the annotations.
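A bare-bones sketch of the decorator half of that idea (not the import hook). `check_annotations` and `repeat` are names I made up; the sketch only handles annotations that are plain classes and skips everything else, which real libraries do not.

```python
import functools
import inspect


def check_annotations(func):
    """Validate arguments against simple class annotations at call time."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = signature.parameters[name]
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue  # *args/**kwargs are out of scope for this sketch
            expected = param.annotation
            if expected is param.empty or not isinstance(expected, type):
                continue  # unannotated, or something like List[int] or a string
            if not isinstance(value, expected):
                raise TypeError(
                    f"{func.__name__}() argument {name!r} must be "
                    f"{expected.__name__}, not {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_annotations
def repeat(text: str, times: int) -> str:
    return text * times
```

Without the decorator, `repeat(3, 2)` would quietly return 6; with it, the call raises a TypeError at the call site.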
When developing, you don't necessarily know about your types yet. And you might not even want to think about them. It's easier to shove in assumptions because things will change as you begin to understand what you want to build as you spend time writing the software.
It's not until later when the code has partially stabilised that you can start spotting what data structures are robust enough to be close to final and actually do benefit from being blessed by typing, locking down the inputs and outputs.
Traditionally, in a dynamically and/or weakly typed language, when the program grows large enough those parts could have been rewritten in a statically and strongly typed language. But if the dynamic language also offers some, even just basic tools for static typing then it all becomes much easier.
> When developing, you don't necessarily know about your types yet.
This is absolutely not true for myself and my coworkers. We think in types first. Typically we write out high-level functions with type signatures and make sure they all fit together and everything type-checks. Then, we'll fill in the detail and actually implement the functions. Often this process is recursive and we need to repeat it until we get small functions that are either easy to implement or available in a library. We have a type-based search tool to facilitate finding such library functions. Sometimes there are problems and alternative abstractions that only come to light when filling in the details. When a large re-organisation of code is necessary, types again help to get this right.
Personally, I could never build a non-trivial piece of software in a dynamic language. Perhaps I just don't have the brain power to track the types manually in my head.
I develop making heavy use of behavioral tests, REPL and sanity checks.
I don't worry too much about tracking types in my head because with a test and REPL I can quickly and trivially spin up and inspect to the minutest level of detail almost any line of code in a runtime state, use autocomplete, etc. and write reliable snippets of code with a near instantaneous feedback loop - getting instant feedback not just on an object's type and its properties, but on what it actually does when it is run.
Usually when developers used to a statically typed language switch to a dynamically typed language, changing their style of development does not occur to them simply because the economics are so different to what they are used to - they rely heavily on IDEs with autocomplete, etc. (which, in statically typed languages, provide instantaneous feedback) and are used to long compile/test feedback loops and REPLs that are poor to non-existent.
In practice this means a lot of them don't think to go looking for potentially better forms of instantaneous feedback that are more readily available in dynamically typed languages but will kvetch because the one they are used to is suddenly unavailable... ;)
Static types mean that you never have to write behavioral tests. If you're writing behavior tests, you're just writing a very verbose subset of the same guarantees that types offer instantly.
Moreover, running code snippets in a repl only help if your codebase is small and you're designing it mostly alone. 3 months later, when someone needs to add a new feature and they call your function with incorrect assumptions (sending a dictionary with missing keys for example) then you'll have something that can work in the happy path, but doesn't work around the edge cases. I've seen behavioral tests try to address this, but again, that's just an inferior form of typing.
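To make the missing-keys case concrete, here is a small sketch of the difference between passing a raw dictionary and passing a declared type. `Order` and `total_with_tax` are hypothetical names, and the tax rate is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    amount: float


def total_with_tax(order: Order, rate: float = 0.25) -> float:
    return order.amount * (1 + rate)


complete = Order(**{"order_id": 1, "amount": 10.0})
total = total_with_tax(complete)

try:
    # A payload with a missing key fails at construction, not deep inside
    # some later function on an unhappy path.
    Order(**{"order_id": 2})
    caught = False
except TypeError:
    caught = True
```

Note that a dataclass only checks that the fields are present at runtime; checking that `amount` really is a float would need a static checker such as mypy or extra validation code.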
>Using a better programming language you won’t have any need for a lot of your tests.
I know of no type system that can act as a reasonable substitute for a well written integration test. How exactly is your type system going to provide a reasonable assurance that my code will work as expected with an upgraded database, for example?
They can substitute sanity checks and poorly written low level unit tests that should really have been sanity checks. I've seen nothing more impressive than that.
I've heard rumors of the amazing things you can do with haskell's type system and the old chestnut that "if it compiles it works" but in practice the one piece of software I use in haskell is, at best, averagely buggy.
Agreed 100%. For quick one-off scripts and coding interviews, I have no qualms with using python. But maintaining api input/output guarantees in a large piece of software w/o explicit typing sounds like a huge pain to me.
I spent the first 12 years of my career doing 80% C and C++ and a little Perl on the side and the last 10 years doing C# with a little PHP. Of course I've had to do some JavaScript. My experience with dynamic languages completely turned me off of them.
But recently, I was forced to learn Python. I have to admit that Python is a joy to use for small scripts and event driven AWS Lambda functions, but I would still use C# for any large projects.
I can't put my finger on why Python+Visual Studio Code is such a joy to use compared to my previous experience with dynamic languages.
That's kind of the thing, though: Folks are using python to bang out trivial pieces of software all the time. Dynamic types are great for that.
Occasionally, those trivial pieces of software become non-trivial, in which case, having the ability to refactor with type hints is a good thing vs. a complete rewrite in a strongly typed language.
It's not FUD if someone posts an opposing viewpoint. My point was that some of the terminology used in this community is rather disingenuous. Python has a lot of strengths, but let's also be honest about its weaknesses.
So far you haven’t had a good track record of making statements which are factually correct and detailed enough to discuss. If you want to contribute anything of value to the conversation try making a longer comment which explains in detail precisely what you believe is a problem and why it matters so much that you’re willing to accuse a popular open source community of making false claims.
So far I’ve seen one concrete example from you (the SQL/file path one) which is either the same in every other language (if you store them as generic strings) or prevented by Python’s type system (if you use non-generic types).
I posted a link by Professor Robert Harper, I guess you didn't read it. This has been a discussion about terminology, so I'm not sure you can claim anybody is factually incorrect.
That guess would be incorrect. I would suggest that you stop disparaging other people and try to clearly articulate the point you are trying to make in enough detail to be discussed.
As I said, the link I posted contains the detail. If you've read it then perhaps you'd care to comment on it instead of posting successive personal attacks. I suggest that you not take my opinions (or Harper's) so personally.
Let's be real, probably 99% of projects just use a language that the main developer is already familiar with. Social and economic realities often trump technical merit, which is why inferior technology has such inertia long past the time when better replacements are available.
I made a temporary kludge around 10 months ago whose time to live was supposed to be a few weeks. Guess what happened to that? I am definitely looking forward to retiring it mind you.
"2.) the other group of programmers start off saying “There is a great deal that I do not know about this particular problem, and I am unwilling to commit myself to absolute statements of truth, while I’m still groping in the dark.” They proceed with small experiments, they see where they fail, they learn about the problem, they discover the solutions, and they add in type information incrementally, as they feel more confident that they understand the problem and possess a workable solution. These are the dynamic type advocates."
But I would assert that it is misleading to pretend that you understand the system at the beginning. I want to be honest about how ignorant I am when I begin to work on a problem that I have never encountered before.
But are you honest with yourself? When working "in the unknown", the basis of your code is nevertheless a set of assumptions. You might not spell them out, but they are there. Static types do not equate to a pretense of understanding - they are those assumptions made explicit, and more importantly, universal (and not existential like tests). They help reasoning about code (esp. with larger codebases/teams) and it is easier to adapt and extend (well, refactor) them in a safe way as the code, and with it your knowledge about the problem, develops.
To take a blob of JSON and ram it into MongoDB doesn't reveal much bias. If we later decided "This JSON needs to have, at a minimum, 4 keys, including user_id, import_id, source, and version" then that is the first layer of runtime checks we can add, or type hints, or Schema with Clojure. And that becomes version 1. If we later say "We also need to add created_at and processed_bool" then that becomes version 2. The lack of the 4 keys in version 0 reveals my ignorance, just as the lack of the extra 2 keys reveals my ignorance in version 1. There will always be ignorance, there will always be things I don't know, the important thing is to make that as explicit as possible. Adding explicit types in version 0 suggests that I have a confidence that I do not have, and in fact, I eagerly wish to communicate the opposite of confidence in that early code.
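For concreteness, here is a rough sketch of what those layered runtime checks might look like in Python. The key names come from the comment above; `REQUIRED_KEYS` and `check_blob` are hypothetical names made up for illustration, not anyone's actual code.

```python
import json

# Required keys per schema version (hypothetical table).
# Version 0 requires nothing at all: we admit we know nothing yet.
REQUIRED_KEYS = {
    0: set(),
    1: {"user_id", "import_id", "source", "version"},
    2: {"user_id", "import_id", "source", "version", "created_at", "processed_bool"},
}


def check_blob(raw: str, schema_version: int) -> dict:
    """Parse a JSON blob and verify it has the keys required by schema_version."""
    blob = json.loads(raw)
    if not isinstance(blob, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS[schema_version].difference(blob)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return blob
```

Each new version only adds an entry to the table, so the history of what was learned about the data stays visible in one place.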
Hmm. From my POV, nothing in your example is actually incompatible with a process using explicit static types.
As long as I'm writing code without any assumption regarding the JSON structure, i.e. "I'm gonna take strings and push them around", there won't be anything else than string based type signatures.
As soon as I've reasoned enough about the problem that additional assumptions are manifest enough to influence future code (e.g. code that presumes your version 1 properties) the type signatures are extended.
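A minimal sketch of that progression using Python type hints, assuming a checker such as mypy is run over the code. The class and function names are hypothetical, and the field types (`str`, `int`, `bool`) are guesses for illustration since the thread only names the keys.

```python
from typing import TypedDict


def store_raw(blob: str) -> str:
    """Version 0: no assumptions about structure, just push a string around."""
    return blob


class ImportV1(TypedDict):
    # Version 1: the four keys we have now decided we depend on.
    user_id: str
    import_id: str
    source: str
    version: int


class ImportV2(ImportV1):
    # Version 2: extends version 1 with the two keys learned about later.
    created_at: str
    processed_bool: bool


def describe(record: ImportV1) -> str:
    """Code that presumes the version 1 keys says so in its signature."""
    return f"{record['source']}:{record['import_id']}"
```

Nothing here claims more knowledge than the code already relies on: `store_raw` stays string-typed until some function actually needs a key.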
The main thing I see different: I strive to make those assumptions explicit before writing code based on them, and to make them explicit in a universal manner. This has nothing to do with "The Truth". Quite the contrary, one should always be aware that code as a whole and typing information especially are only a model based on assumptions.
"Using static typing" doesn't mean "Let's build an army of interfaces and classes as large as we can come up with in our brainst...erm,analysis sessions". That's a rookie mistake. I dunno...your experience might be primarily in overspecified "SeaOfNouns"-Java projects, but the problems associated with those have nothing to do with static typing itself.
And to be frank: Statements like "Thus you are seeing my exact level of ignorance, and my exact level of certainty." scream "hubris in action" to me.
No: What I see is what you think your exact level of ignorance and certainty are. And I don't find that kind of information particularly helpful.
That's odd, I consider myself to be firmly in group 2. I'm often fine with stating things I don't know. Yet I'm a big advocate of static type systems.
Starting with types allows you to play around with data models of the problem space quickly before committing a lot of code to provide a solution. At least that's my take on this.
To contrast this, types help a LOT for me when developing something. The last thing you know are the implementation details of what you're doing, so I usually start by sketching out what I want to do, with some simple ADTs (algebraic data types, e.g. union/sum types and product types just to give something tangible). Then I create a couple of functions I think make sense, give them an undefined for now in their function body, and line the types up from the inputs and outputs.
Now, here's where you usually hear people praise dynamically typed languages—I want to change the design, because I'm still prototyping. How do I do this? Just change the types (ADTs)!
After changing my design, the compiler will now let me know where my functions don't make sense anymore. Contrast this with when I usually dev in a dynamic language: I end up running my program again at this step, finding out that one function over here didn't like that I'm now passing a string down instead of an Int, another function was expecting a boolean, but now I'm returning the result instead, etc.
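The comment is about Haskell-style languages, but the same workflow can be roughly approximated in Python with dataclasses, a `Union`, and a checker such as mypy standing in for the compiler. This is only a sketch under that assumption; `Circle`, `Rect`, `Shape`, `area` and `render` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Circle:  # product type with one field
    radius: float


@dataclass(frozen=True)
class Rect:  # product type with two fields
    width: float
    height: float


# A sum type: a Shape is either a Circle or a Rect.
Shape = Union[Circle, Rect]


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    return shape.width * shape.height


def render(shape: Shape) -> str:
    # Body left "undefined" for now; only the signature has been designed.
    raise NotImplementedError
```

If `Rect` later loses `height` or `Shape` gains a third variant, the checker can point at the functions whose bodies no longer line up, without running the program.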
The saying that static languages are not great for prototyping needs to die. They are magnificent, you just need to have a conversation with the compiler instead of fighting it.
Of course, this all assumes a language with a sophisticated enough type system, e.g. Haskell, OCaml, Idris, PureScript, Elm (somewhat), Rust (depending on your domain), etc.
For me this not only matches ease of onboarding new users, but also what experts like over the lifespan of a project.
So often I start out just by banging on something. My domain model isn't well defined, I'm very tolerant of errors, and I really value quick exploration and a fast iteration cycle. If a project starts to mature, though, I start wanting more things that aid long-term maintainability. More type checking. More controlled error handling. More constraints on what code is ok.
Maybe a good example is checked exceptions. Everybody hated those in Java, because they imposed a discipline that's only really valuable in certain situations. In contrast, something like Ruby or Python lets you have a single high-level error handler to start, and gradually tighten things up. Java's type system similarly pushed a lot of people away, because it made them pay up-front costs for down-the-road benefits they might never need. Hopefully this will be a similar thing, where projects big enough to benefit from type discipline will add it as they go.
Type checking is all about complexity management... you don't need it if you're hacking on a weekend project, but when you have hundreds of developers on a million-line codebase, it can save you tons of time.
Dynamic languages have their advantages too (productivity, REPL's, etc.)
Why NOT combine the two features? Just because it's not built into the core of the language doesn't mean you can't benefit from the complexity management aspect if your job is to manage the code complexity in a large developer workforce.
> Type checking is all about complexity management... you don't need it if you're hacking on a weekend project
I see this argument a lot but I don't buy it. You might not need Java's SimpleBeanFactoryAwareAspectInstanceFactory but I don't believe putting in types hinders quick prototype development; I think, in general, it helps it. Especially when combined with good tooling and a good IDE.
I code in mostly static languages, and even when I'm doing 1000 line python projects I miss types.
It especially helps in refactoring things quickly without needing a special IDE environment. It also lets me skip an entire class of unit tests that boil down into type checking.
Many of the "dynamically typed language" features can be had in statically typed languages as well: there are statically typed languages that have REPLs, for example. Your point about having higher productivity in the beginning and then retrofitting types after that still stands though.
> Dynamic languages have their advantages too (productivity, REPL's, etc.)
I've seen the productivity argument a lot, so I'm curious about this perspective. What does "productivity" mean here? Shipping code? Because if you ship code that doesn't work, did you really ship it? If you're writing a small throwaway sub-1000-line utility for internal use, I can see that, but if you're getting paid to ship code what is the definition of productivity?
But I suspect whereas the static type additions to Python will tend to spread throughout your entire system (the more you annotate, the more useful the feature is), the dynamic features of static languages tend to be used in specific places for specific reasons (the less you use the feature and the more you encapsulate it so the dynamism doesn't "escape", the better it works with the program as a whole).
This was pretty much true when I was last using C#. You would find specific places where you need "an object" (e.g. maybe you want to LINQ over some data in a method) so you could dynamically create it but if you wanted that object to go anywhere outside of scope you had to have a concrete definition.
auto types really don't have anything to do with dynamic typing, in fact I would argue type deduction is an important tool in making static typing more attractive. An example of dynamic typing in C++ would be std::any (although you're still limited by what you can do with the contained object of course, for better or worse).
I think describing these as static type systems is a bit limiting. I prefer to think of them as static code analyzers. They don't change anything about the representations in memory, but they do validate invariants about the code. I actually hope we start seeing systems that validate invariants that are not type based like "SQL Injection Safe SQL Query" and "Range between 10-100".
> I actually hope we start seeing systems that validate invariants that are not type based like "SQL Injection Safe SQL Query" and "Range between 10-100".
Both of those can be types in a rich enough type system; the latter is a fairly common example for dependent types, for instance.
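Python's type hints are nowhere near dependent types, but a weaker version of the idea can be sketched with `NewType` plus a checking constructor: the range check itself happens at runtime, and the static type only records that the check has been done. `BoundedInt`, `bounded` and `set_volume` are hypothetical names for illustration.

```python
from typing import NewType

# A distinct static type; at runtime a BoundedInt is just an int.
BoundedInt = NewType("BoundedInt", int)


def bounded(n: int) -> BoundedInt:
    """The only intended way to obtain a BoundedInt: checks 10 <= n <= 100."""
    if not 10 <= n <= 100:
        raise ValueError(f"{n} is outside 10-100")
    return BoundedInt(n)


def set_volume(level: BoundedInt) -> str:
    # A checker such as mypy rejects set_volume(5): a plain int is not a BoundedInt.
    return f"volume={level}"
```

The same pattern would apply to a "safe SQL query" type whose constructor is the only place that builds query strings.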
[stanza](http://lbstanza.org/) is one of the more interesting recent languages i've seen - it is designed from the ground up to be gradually typed, which is the end goal of projects like pyre.
This looks at first sight like a cross between Python and LISP without parentheses (not saying that in a bad way). I'm actually glad that they choose this syntax, because a lot of people get hung up on the parentheses.
Have you used this for anything? If so, how was the experience?
no, just wrote a couple of toy programs. but it's on my short list of languages i keep an eye on because it looks like they hit a sweet spot in language design, implementation and ergonomics, and they might be exactly what i need for some future project. (d and pony are a couple of others in that camp)
It's because dynamic typing is easier for small programs and beginners, but sometimes the language becomes popular (e.g. Python or JavaScript) and people start trying to write large robust programs in it. Then you need static typing.
I initially thought optional / gradual typing (a la Dart 1) was mad but when you think about it in this context it makes sense. However now Dart 2 has gone back to static typing so who knows...
I've built video streaming websites with user accounts, uploads, votes, comments, many filters and hundreds of thousands of files, handling half a million users a day, in Python. The server cost is around a fifth of the generated revenue.
Most projects will never even reach this size, not to mention remotely approaching google size (from 2010).
Strict static typing incurs a cost at development time whether you need it or not. Being able to develop quickly with dynamic typing and then add static type checks afterwards is a huge productivity boost that really gives you the best of both worlds.
I'm not sure the static type system thing is real. Ultimately for 90% of workflows you wind up reserializing into a very much not-static-datatype (JSON, e.g.)
I don't see how that matters -- it seems to indicate a misunderstanding of what type systems are. Type systems obviously have boundaries whether you're interchanging with another type-safe system or not, that's irrelevant.
Every program has to deserialize data like JSON into types and enforce its assumptions about that data whether you have static analysis tooling at compile-time or not.
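As a small illustration of that boundary, here is a sketch in Python where untyped JSON is turned into a typed value in exactly one place. The `Vote` record and its fields are hypothetical, chosen only to make the example concrete.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    user_id: int
    score: int


def parse_vote(raw: str) -> Vote:
    """Boundary: untyped JSON comes in, a checked Vote comes out, or we raise."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    user_id = data.get("user_id")
    score = data.get("score")
    if not isinstance(user_id, int) or not isinstance(score, int):
        raise ValueError("user_id and score must be integers")
    return Vote(user_id=user_id, score=score)
```

Everything downstream of `parse_vote` can take a `Vote` and skip re-checking the shape of the data, whether or not a static checker is in use.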
It absolutely matters, type systems don't exist in a pure universe, they are important because we use them to interact with human needs. A good type system pushes out typing concerns to the outer boundary of the system. For that you don't need static typing, you need strong typing. Refactoring in a statically typed system can be an incredibly tedious process with tons of boilerplate. At the other extreme, if you allow duck typing and monkeypatching then you wind up in a situation where debugging becomes a problem. There's a middle ground that a lot of very good and productive programming languages occupy.
You're not going to find many people who agree that static-typing makes refactoring harder. The classic issues with large coupled dynamically-typed systems is how difficult refactoring is without runtime errors.
You also have to be more precise when you hand-wave about benefits of "strong typing" as you can read this very comments section to see how nobody agrees on what that means.
By 'type concerns' i mean coercing the real world into the type system. You want that to be at the edge.
IMO, treating a program as a proof is somewhat impracticable, as you can see when very pure functional programming paradigms struggle with universal state monads.
I don't see a contradiction here. (speaking as a Python and Haskell user):
1 - dynamic languages, while granting great freedom, make it easy to shoot yourself in the foot with simple typing bugs, so we cope by adding type checking
2 - many (most?) static type systems are not expressive enough to type some programs, so people embed dynamic solutions to build the stuff they need.
Haskell is pretty decent at typing most stuff I want, but it makes me think about it up front, which is nice sometimes, but not always. Python lets me build stuff quickly and work out the design as I go, but it's hard to make sure everything works as intended. I wish there was something that let me bodge things together, but then constrain them with types when I need to...
Type hints aren't checked or enforced by the compiler. They're really just special-syntax comments to help the developer (or IDE, as the case may be). I wouldn't call this "type checking".
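A short demonstration of the point that the interpreter itself ignores the hints (an external checker such as mypy would flag the call, but nothing stops it at runtime). `double` is a made-up function for illustration.

```python
def double(n: int) -> int:
    return n * 2


# The interpreter does not check the annotation, so this call runs fine
# and repeats the string instead of doubling a number.
result = double("ab")  # "abab"

# The hints are only stored as metadata for tools to read.
hints = double.__annotations__  # {'n': <class 'int'>, 'return': <class 'int'>}
```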
I'm using the type annotations from `typing` in a Python 3.5 project I'm working on, but only as a form of documentation. Unfortunately mypy doesn't work for me – the project heavily uses a namedtuple-style sum type library, and I haven't found a way to make the types it generates work with mypy.
(iirc mypy has builtin support for namedtuple, but doesn't really support other types generated at runtime)
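For reference, the namedtuple form that checkers do understand is the statically declared one from `typing`. A minimal sketch, using the functional syntax since it also works on Python 3.5; `Point` and `norm_squared` are hypothetical names.

```python
from typing import NamedTuple

# Field names and types are spelled out statically, so a checker can see them.
Point = NamedTuple("Point", [("x", float), ("y", float)])


def norm_squared(p: Point) -> float:
    return p.x * p.x + p.y * p.y
```

Types that a library assembles at runtime from arbitrary data have no such static declaration for the checker to read, which is presumably where the trouble described above comes from.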
Beanshell is another example for Java, and did it before Apache Groovy did. Beanshell added features like optional typing and terse property syntax to Java, then Groovy extended Beanshell by adding closures to Java. Java has made these scripting languages less relevant lately because of more recent features like lambdas and inferred types.
For 2, I should have said "programs written in statically-typed languages". Oops.
Examples are most text editors: emacs (elisp), Sublime Text (Python), Vim (VimScript), Atom (JavaScript). Many games and game engines: World of WarCraft (Lua), Unreal (UnrealScript), SCUMM, etc.
I'm guessing you meant these hypotheses to seem paradoxical (i.e., dynamic code bases are rewritten in static languages to improve maintainability; static code bases are rewritten in dynamic languages to improve maintainability), but I don't think they are.
Dynamic codebases are rewritten in static languages to improve maintainability and code quality; programs written in static languages include an interpreter as a feature that supports user-defined programs, e.g., a browser's JS engine or emacs's text editor, and also for specialty domains where code quality is less important than developer velocity and the static language of choice is unnecessarily painful (games).
Assuming I didn't misrepresent your thesis, does this seem more or less accurate to you?
No, I actually didn't intend them to be paradoxical.
My feeling is that statically-typed languages and dynamically-typed languages are each better suited for certain kinds of problems. The former gives you better performance, long-term maintainability, and scales to larger programs. The latter gives you a faster iteration loop, is easier to learn, and is easier to dynamically load.
I think what happens is that any sufficiently large program covers so many different requirements that eventually different parts of the program are best implemented in different styles of language.
If you start statically-typed, you eventually get bogged down by the slow compile-reload cycle, need to dynamically load user-defined code, or want to express parts of the program in something higher level or more declarative. So you end up embedding a scripting or configuration language.
If you start dynamically-typed, eventually your program grows to the point that you either can't maintain it or it isn't fast enough, so you end up trying to bolt a type system onto your existing code or rewriting parts of it in a statically-typed language.
I think the costs for the latter are usually higher than the costs for the former, so I tend to lean towards starting statically-typed, but that's also because I'm comfortable using types.
Ah, thanks for clearing that up. I forgot compilation can take a while. I used to write C++, but the only compiled language I've worked seriously with in the last several years has been Go, which tends to compile in about the same time it takes a Python program to load its imports.
UnrealScript is explicitly said to be both statically typed and inspired by Java a lot[0][1]. It just had lots of language features dedicated to the game, like networking, state machines, being preemptively multitasked, integration with the editor, etc.
Lua is of course popular and the topic of using it in games has been done to death: Love2D, Grim Fandango, Don't Starve, Cry Engine, Garry's Mod, etc.
Eve Online supposedly uses Stackless Python and I've seen Vampire the Masquerade Bloodlines (early Source engine) use Python.
Valve's wiki lists Squirrel, Lua, Python and Gamemonkey and they are all dynamic but I think Squirrel is the top language there.
I've also seen Angel Script in HPL1 (the Penumbra series by frictional) but it's another static scripting language and it markets this as a feature.
Unity of course has its C# 'scripts' which are static and I found the idea of scripting in such a behemoth of a language very novel when I first saw it (yes, UnityScript has dynamic capabilities but it has static ones too and most of the users seem to be with C#).
SCUMM (and Z machine and similar) is a very weird case because it seems more like a script in the sense of a scene play (and the S might stand for script but they were naming utilities after gross stuff like CYST, MMUCUS, etc. so it might be shoehorned), not actual programming language, it's tied to the engine/game/genre/assets way more strongly than even Unreal Script was but I didn't do anything with SCUMM so I can't actually say for sure what would happen if you wrote something that didn't make sense (like tell a table to walk somewhere or reference an object or animation or something else that doesn't exist).
Many Naughty Dog games (including Crash Bandicoot games on PS1 which is frankly insane) also used their own Lisp inspired by Scheme called GOOL and then GOAL but I'm not sure how dynamic it was.
There's also a subtle line between embedding a language to make yourself scriptable and making yourself a library written in mix of C, C++ and that language and there is an interesting article on that[3].
GIMP also includes (or used to?) something related to Scheme/Lisp and has a system in place to support multiple scripting languages at once and calls between any of them called Procedural DataBase. Adobe Lightroom includes Lua too.
Id tech engines apparently used another language/technique each major engine version and I don't know the specifics of it at all.
I'm quite interested in the topic of extending applications (and specifically games) in various scripting languages, plugin systems, etc. but I'd not say all static programs grow dynamic scripting capabilities since AngelScript, Pascal Script (:D), Unreal Script and C# are all static 'scripting' languages and scripting in general is about 'scripting' (as in - small or big bits of code that are easy to change and safe to experiment with, modifying a larger program in small or big ways), not necessarily dynamic programming (e.g. you could in theory even viably have Lua scripts in a larger Python program or even AngelScript scripts in a Python program to really mix it up and script a dynamic program with a static scripting language). You could also extend a C or C++ program with traditional 1990s/2000s style 'plugins' written in almost anything that can get the required ABI out into a dll/so and load, unload, reload, etc. those at runtime without stopping the main program.
C# has had a dynamic type built in for years now. It's literally a dynamically typed language embedded within the statically typed C#, which grew out of the work on gradual type systems.
That's not why the dynamic type was created, that was just one side benefit. The dynamic type grew out of the IronPython effort which grew into the DLR in order to better support dynamically typed languages on the CLR, by having reusable polymorphic inline caching for efficient dispatch.
COM integration with the CLR has existed since the beginning, which I used quite extensively many years ago, they just took advantage of the DLR to make it easier.
Still given that it was already available as Variant on Visual Basic before .NET was even an idea, it hardly has anything to do with "which grew out of the work on gradual type systems.".
Visual basic is hardly the first language with a dynamic type, and the DLR is clearly descended from Self, Smalltalk and other dynamically typed languages and started with Iron Python:
I use mypy[0] for PEP-484 type checking on a regular basis in large projects.
PyCharm[1] supports PEP-484 type checking as well and displays errors in the editor.
Mypy is the standard, but too slow to integrate into editors (more appropriate for CI). PyCharm built its own implementation for this reason, but of course it's proprietary. Pyre's trying to get typecheck integration into other editors.
Why do you think it's too slow? I have it integrated with Emacs' flymake and it works pretty well. It takes less than a second to run mypy and display its results (with red underline or whatever else you have configured). From my experience, Visual Studio with F# takes similar time to type-check and show errors.
Well, I am starting to write some TypeScript with tslint configured to run on save, and it takes way more than 1 second for a small toy project.
Even just building this project with Typescript compiler takes more than 1 second. And I don't think it is that bad because at least the process is async.
Interesting. I write TypeScript in PyCharm and I get feedback as soon as I write (PyCharm performs save automatically). I don't know how they implemented it that it's so fast.
Well, PyCharm is already doing a lot of parsing for syntax highlighting and autocompletion and other goodies. Their checker implementation probably just takes the AST PyCharm already has available to it, which saves the round-trip of saving to a file and having an outside checker open and re-parse the file.
Ok, I don't know myself, because I stopped looking for the next editor years ago, so I'd like to ask: do you know how long it takes, on average, for IDEs and editors to highlight a type error?
I've had to disable mypy a couple of times for specific parts that it doesn't understand. Very annoying how it breaks on non-trivial stuff because that'd be the use case where you'd want it most.
I've never had it prevent a problem. I like statically/strongly typed languages for this reason and putting optional type annotations into a language is never going to work out imo.
I upvoted because I had a similar experience; however, your conclusion doesn't follow. Python's borked type checking doesn't mean optional type annotations are fundamentally not going to work out.
In particular, if Python added support for Go-like interfaces and recursive types, it wouldn't fall over for ~99% of nontrivial cases.
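For what it's worth, later versions of the `typing` module (Python 3.8 and up, via PEP 544) gained `Protocol`, which is structurally typed in roughly the Go-interface sense. A minimal sketch, with `Closer`, `TempFile` and `shut` as hypothetical names:

```python
from typing import Protocol


class Closer(Protocol):
    # Anything with a matching close() method satisfies this interface.
    def close(self) -> None: ...


class TempFile:
    # Does not inherit from Closer; it just has a close() method of the right shape.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shut(resource: Closer) -> None:
    resource.close()
```

A checker accepts `shut(TempFile())` without any explicit declaration that `TempFile` implements `Closer`. Whether this covers the nontrivial cases mentioned above is a separate question.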
I just tried both mypy and pyre for the first time. Pyre immediately found two actual issues, but mypy found a thousand possible ones. Maybe I just need to configure mypy better.
What’s interesting about this announcement is that they not only announced a type checking tool, they are also open sourcing a tool that will apply a patch and annotate your code with types for you. I tried a demo of this at the Instagram booth at pycon yesterday. They basically had a package that would run your tests for you, and using the coverage that you created, it could infer the types while executing your code. Now if the point of type annotations is purely to check for bugs that are avoided in statically typed languages it seems like you should be able to just combine the two tools and catch type based bugs without ever applying the annotation.
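To give a feel for the general idea of inferring types by watching code run, here is a toy decorator that records the types it sees. This is not how the announced tool works internally (I don't know its internals); `record_types`, `observed` and `add` are all made up for illustration.

```python
import functools
import inspect
from typing import Any, Callable, Dict, Set

# Maps "function.argument" to the set of type names seen at runtime.
observed: Dict[str, Set[str]] = {}


def record_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap func so that every call logs the types of its arguments and result."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = inspect.signature(func).bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            key = f"{func.__name__}.{name}"
            observed.setdefault(key, set()).add(type(value).__name__)
        result = func(*args, **kwargs)
        observed.setdefault(f"{func.__name__}.return", set()).add(type(result).__name__)
        return result

    return wrapper


@record_types
def add(a, b):
    return a + b


# Stand-in for "running your tests": exercise the function a couple of times.
add(1, 2)
add(1.5, 2)
```

After the two calls, `observed["add.a"]` holds both `int` and `float`, which is the kind of evidence a tool could turn into a suggested annotation. The quality of the result is bounded by what the tests actually exercise.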
I think the type checking community also has quite a ways to go when it comes to asynchronous code. It becomes quite worthless to check that a future is being returned by a function, but it could be more helpful to know what that future will eventually yield.
Surely `await f()` should be a type error if `f` doesn't return a future and checking for that seems useful. I am not up to speed on Python's types but in most type systems you can do something like `f : Future<Int>` which _will_ tell you that the future will eventually yield an `Int`.
It's a language that lets you start small, and progress a lot with the complexity of your code. You can be productive in Python in 3 days if you know another language. But you can still learn useful new Python things 10 years after you started.
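In Python's case, a sketch of what this looks like with `async def` and ordinary annotations: the declared return type is what the awaited call yields, and a checker such as mypy tracks it through `await`. `fetch_count` and `main` are hypothetical names.

```python
import asyncio


async def fetch_count() -> int:
    # Declared as returning int; calling it produces an awaitable of int.
    await asyncio.sleep(0)
    return 3


async def main() -> int:
    count = await fetch_count()  # a checker infers count: int
    return count + 1


total = asyncio.run(main())  # 4
```

Forgetting the `await` would leave `count` as a coroutine object, and `count + 1` is then something a checker can reject before the code runs.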
The progress curve is very sane.
And in the same way, your project may start small, then you add docstrings, classes, modules, packaging, unittests, infrastructure... And at some point, you may want types.
I use types on maybe 10% of my code in Python. It's great it's not mandatory. And it's great it's here when I benefit from it.
I write Python in my day job and I _always_ start with types. I find they make even prototype code so much easier to reason about. It's just nice to be able to tell at a glance what a function returns, what args it takes, etc., and I don't have to master Sphinx's cryptic syntax (which I still haven't managed after using it for 5 years). And I experience this benefit even for code that I wrote yesterday without even running the type checker. And it's only going to get better with editor integration.
Without types, if I have the code `a = foo(x=1)` then I have to hunt down the source file for `foo()`, which likely just returns `bar(x)`, so I have to hunt down the source file for `bar()` to figure out what the hell its return type is, and so on and so forth. With types, I just look at the type signature for `foo()` and I'm good to go (and again, editor integration means that I don't even need to look up `foo()` at all!).
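A tiny sketch of that situation with annotations in place. The bodies of `foo` and `bar` are invented here; only the call `a = foo(x=1)` comes from the comment.

```python
from typing import List


def bar(x: int) -> List[str]:
    return [str(x)] * x


def foo(x: int) -> List[str]:
    # The return type is visible right here, without opening bar().
    return bar(x)


a = foo(x=1)  # ["1"]
```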
> I write Python in my day job and I _always_ start with types.
It depends on the project. If you write a lot of flask/django code, writing the types is not really worth it except for a few functions/methods.
> Without types, if I have the code `a = foo(x=1)` then I have to hunt down the source file for `foo()`
No, you just hover over the function and get the help() out of it in most frameworks and libs. Again, if your code is mainly using a well-defined, documented and popular framework/lib/api, that's not a big deal. And it certainly doesn't require YOU to add types.
Or if you write a program that is contained in 1 to 5 files, tops. No use for types.
Or if you are writing your program in jupyter.
And you most likely copy a snippet from the doc anyway. After all, if you see that the function you want to use returns an AbstractTranscientVectorServiceFactory object, you can't do much with the information without the doc anyway.
But let's be real, most functions in Python are named pretty explicitly, and return things that you expect like "iterable of numbers" or "file like object yielding utf8".
Types are particularly useful when you are in a big project with a lot of custom code, or in a domain that is either very complex or that you don't master very well. They are a good form of safety net and documentation at the same time.
But they come at a cost and it's a good thing to be able to choose.
> No, you just hover over the function and get the help() out of it in most frameworks and libs. Again, if your code is mainly using a well-defined, documented and popular framework/lib/api, that's not a big deal. And it certainly doesn't require YOU to add types.
If the help or comments tell you the types of arguments and return values, that is just static typing in the form of comments. Even more verbose than language level type annotations, yet much harder to parse for editors/IDEs/linters.
> No, you just hover over the function and get the help() out of it in most frameworks and libs.
There are no editors that can reliably deliver type information. The ones that come closest (PyCharm and YouCompleteMe) are otherwise very poor editors and resource hogs (the former) or nearly impossible to set up correctly (the latter). And neither is of any use when I'm reviewing someone's code on Github or debugging over ssh.
> Or if you write a program that is contained in 1 to 5 files, tops. No use for types.
Types are definitely _useful_ for small programs, but you're right that they're not _necessary_ to the degree that they are for larger programs.
> And you most likely copy a snippet from the doc anyway.
This... is not the kind of programming I do. Although it may explain a lot about our difference of opinion on the topic.
> But let's be real, most functions in Python are named pretty explicitly, and return things that you expect like "iterable of numbers" or "file like object yielding utf8".
Ah yes, the get_values_as_iterable_of_numbers() function. :) Let's be real, virtually _no_ functions are named this way, and even if they were, the typing syntax is surely a better way of conveying this information. More importantly, it still doesn't tell us anything about the types or number of the arguments. Lots of popular Python libraries (Pandas, Matplotlib) take ridiculous combinations of arguments (two strings and an int sometimes, but other times it's an int and a float and a file handle, and if it's called on a Tuesday in April then you can omit all int args). If they had to appease a type checker, these libraries would certainly be much more usable.
> But they come at a cost and it's a good thing to be able to choose.
There are tradeoffs in many areas of programming, but typing is a clear win. Specifically, what is the cost you mention? The type checker won't let your type documentation grow stale? Or perhaps it makes it hard to write clumsy signatures like those in Pandas and Matplotlib?
I'll also learn because the language is constantly evolving and there are always new toys to play with :-)
I have been programming in Python for 24 years now. It is my preferred language, but only recently have I been able to go back to developing in it. It is impressive how many new idioms there are to learn.
Yeah, it just needs to work on its tooling and library story. Last time I tried, the OCaml toolchain was such a big pain that the Reason community basically told me to give up and compile to Node. And then there are relatively few native-ReasonML libraries, so you have to figure out how to integrate with OCaml which usually means learning how to _read_ OCaml without clawing your eyes out (I'm only sort of joking).
I'm rooting for Reason, but it has a few nontrivial hills to climb before it's practical.
That's encouraging to hear. Unfortunately I don't do much frontend web development, but hopefully if the frontend web-dev story catches on then the backend will follow.
If the question is "why choose Python" or "why choose JavaScript", then I think you're missing the point. Python is simple and beautiful and highly productive to work in. JavaScript is ostensibly the only language that you can use in the browser without a compilation step (to JavaScript). Just because typing is added separately doesn't negate all of the other benefits of these languages. Hell, if all programmers cared about was type safety we'd all be writing Rust for every project.
> If the question is "why choose Python" or "why choose JavaScript", then I think you're missing the point. Python is simple and beautiful and highly productive to work in.
Well, it depends on the problem! If you’re working with a lot of data structures, or with bytes and serialization, compiled system languages with static typing are going to be productivity boosts over python. Languages aren’t everything, but there are definitely poor language/problem space fits.
> If you’re working with a lot of data structures, or with bytes and serialization, compiled system languages with static typing are going to be productivity boosts over python.
Python has fantastic, expressive and productive libraries for those kinds of problems. Unless you have some performance issue, I can't see how you are going to be faster than that.
I think languages like Go are changing that. Most of the statically typed languages have been very verbose and often lower level than people like. If you are building a new web service, Python, Ruby or JS are very attractive as they help to get something out of the door very quickly. I think Go hits the right spot here between verbosity and safety/expressiveness, and a lot of new startups are choosing Go for things they would have built in Python or Node. I think this will only improve as more dynamic languages start adopting static typing and most static languages start adopting things like type inference, etc.
You make an excellent point here that I don't see articulated very often. The Landscape of Popular Languages, let's say C/C++/Java/Python/Ruby/JavaScript, has a big gap in the middle. You have good choices between:
1. lower level "systems" oriented languages with static types, usually compiled, like C++. You get lots of flexibility and direct access to primitives. Static types help wrangle big codebases. Generally suited to large projects.
2. higher level "scripting" oriented languages with dynamic types, usually interpreted, like Python. Writing code for most tasks is easier. You give up some stuff you'd want for projects like operating systems or databases. Most projects aren't operating systems or databases, so that's usually a good tradeoff. Generally suited to small projects.
The problem is that lots of projects are medium-ish. You set out to build your web service backend or whatever, it would be a pain in the ass to write in C++, so you use Python. Getting it working is quick and easy. A few months later it's a big, complex piece of software and working on it in Python is a pain in the ass. You can't win. What you really wanted was a language that's "easy to write" like Python, but with static or optional types, maybe better thread handling, and at this point the interpreter isn't doing much for you so it might as well be compiled. There are tons of cases where you just want a "better C" or "Python but faster and with static types", and for the longest time the Landscape of Popular Languages just had a giant hole there.
We needed that space filled and Go delivered. I'm usually very critical of Go, but I can't hate on it for being the wrong kind of language. It's definitely the right kind of language for these "goldilocks" problems that aren't too high or too low level, too big or too small. Part of being a good programmer is understanding that languages are tools, and you need to pick one that fits your problem. Go deserves all the success and praise it's gotten for being a language that fits actual problems.
These gradual type checking frameworks allow migrating massive code bases incrementally. Doing a complete big-bang rewrite into a different language isn't feasible (for the obvious software engineering and business reasons).
So you can introduce a type system in a module, but you can’t write a new version of that module in a different language? To me these seem comparable in scope.
They are in no way comparable in scope. Introducing a type system gradually doesn't break any compatibility, within the module or outside the module. It doesn't require you to understand what the legacy code is doing. Rewriting in a different language also means different libraries, different glue code, different interfaces, different everything.
I've been adding TypeScript types to a bunch of JavaScript code recently, and it really is much safer and faster than trying to move to a different language. As long as you are only adding types, you don't have any chance of introducing a bug, since the types just compile away, and plenty of real-world code (especially untyped code) tends to have subtle details that you might overlook if you're reading it for the first time.
Another thing to keep in mind is that type annotations can be done incrementally. You can't take 1000 Python files and port them to Go one by one, but you can add types a little bit at a time with a completely usable codebase at every intermediate step.
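A minimal sketch of what such an intermediate step might look like, assuming a gradual checker such as mypy (which, by default, does not check the bodies of unannotated functions). The function names are hypothetical and only illustrate annotated and not-yet-annotated code living side by side in a working module.

```python
from typing import List


def parse_ids(raw: str) -> List[int]:
    """Already annotated: a checker can verify both this body and its callers."""
    return [int(part) for part in raw.split(",") if part.strip()]


def summarize(ids):
    """Not annotated yet: still runs exactly as before, and can be typed later."""
    return {"count": len(ids), "max": max(ids) if ids else None}


# The module stays fully usable at this intermediate step.
report = summarize(parse_ids("1, 2,3"))
```

Annotating `summarize` later is a separate, small change; nothing about the runtime behavior needs to move in lockstep.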
The rigor and concepts and generics and interfaces required to properly use statically typed languages at scale are not always necessary. It's nice for it to be a "warning mode" when you want it, instead of having to design and build that from the beginning.
I guess it comes down to a difference of philosophy. Never in my life have I ever felt compelled to shove an object of the completely wrong type into a function and see what happens. In the overwhelming majority of cases, the answer is going to be "it crashes", so there is no reason to do this.
The point of type-checking is that if the function needs to work on a duck, then there is no point to ever pass it anything except a duck. In fact, the compiler should not even let you pass it something that's not a duck, because it's so pointless. That's literally the start and end of static typing, and if that's "rigorous" then yes, the whole point of static typing is to introduce this very basic level of rigor into your codebase. Because it's not going to work regardless of whether it passes the compiler.
Pretending interfaces do not exist does not actually make them go away. There is an interface there whether you explicitly enumerate it or not... even duck typing will fail if you try to call duck functions on something that is not a duck. Dynamic typing is not magic, it's the equivalent of passing everything around as Object or String in a static language. And that's an anti-pattern.
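To make the "the interface exists whether you write it down or not" point concrete, here is a small sketch using `typing.Protocol` (Python 3.8+). The class and function names are invented for illustration. The protocol only spells out what the function body already requires.

```python
from typing import Protocol


class Quacker(Protocol):
    """The interface make_noise() relies on, written down explicitly."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack"


class Apple:
    pass


def make_noise(bird: Quacker) -> str:
    # Needs .quack() no matter what the annotation says.
    return bird.quack()


sound = make_noise(Duck())

# Without a checker, the mismatch is only discovered when the call runs.
try:
    make_noise(Apple())
    apple_result = "worked"
except AttributeError:
    apple_result = "AttributeError"
```

A static checker that understands protocols would reject `make_noise(Apple())` before the program runs; dynamically, the same requirement surfaces as an `AttributeError`.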
If you just want something to compile, you can pass in null-values of the appropriate type.
Now: there is a valid complaint that Java in particular really embraces the architecture-astronaut philosophy where everything is an overly-abstracted AbstractSingletonProxyFactoryBean (a convenient superclass for FactoryBean types that produce singleton-scoped proxy objects!). But usually it's fairly simple to wall that badness off from your actual business logic.
>The point of type-checking is that the function needs to work on a duck, then there is no point to ever pass it anything except a duck.
If I know I'm only going to pass ducks, and the software is quick and dirty and non-critical enough that it doesn't matter if I accidentally pass it a goose, then I don't want to have a general contractor standing next to me developing saying "make sure that's a duck!"
> If I know I'm only going to pass ducks, and the software is quick and dirty and non-critical enough that it doesn't matter if I accidentally pass it a goose, then I don't want to have a general contractor standing next to me developing saying "make sure that's a duck!"
There is no situation that is so "quick and dirty and non-critical" that you would want an invalid function invocation that could never, ever possibly succeed.
Again, if you just want a dummy call while you're refactoring so that things compile, pass in nulls in those parameters. But there is literally no reason to ever pass totally invalid but real data to a function.
I'm literally at a loss for any situation where you would ever pass an apple to a function that expects a duck, and think that's not just a valid construct, but a positive one that a language should encourage. It just boggles the mind how stupid that is.
Guess people just hate nulls that much, even as a parameter to a dummy stub, that they're willing to give away type safety.
If you really, really, really want to do a construct like that in a typed language, you can always just do "myFunc((Duck) myApple)" to pass the apple as an instance of type Duck, but if a duck is not an apple then that's guaranteed to fail at runtime, just like in the dynamically typed language (because dynamic typing is not magic).
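The closest Python analogue is probably `typing.cast`, sketched below with made-up names. One difference from the Java-style cast described above is that `cast` performs no runtime check at all: it only silences the checker, so the failure shows up when the duck method is actually called.

```python
from typing import cast


class Duck:
    def quack(self) -> str:
        return "Quack"


class Apple:
    pass


def my_func(duck: Duck) -> str:
    return duck.quack()


my_apple = Apple()
# cast() returns its argument unchanged; it only tells the checker "trust me".
pretend_duck = cast(Duck, my_apple)

try:
    my_func(pretend_duck)
    outcome = "worked"
except AttributeError as exc:
    outcome = f"failed at runtime: {exc}"
```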
If I am writing something quickly, and it's a small project or test, and I might be testing out whether I want to pass a single value or a list in my workflow, I don't want the rails on constantly. I don't want to guarantee safety by ctrl+H-ing my "str" to "Iterable" constantly. Move quickly, break things, get the product out. At least, when the product is a quick and dirty tool and the options are "spend 30 minutes to get it built" or "don't build it".
> The point of type checking is that it has to be rigorous, it doesn't do anything for you if it doesn't check types.
I don't always need to check types. I don't need to know that isReady returns a bool. It clearly returns a bool. This idea that type checking has to be absolute exists because in statically typed languages it does, not because that's a foregone conclusion.
> Static typing is extremely useful for knowing what things return - names are not.
Only if you completely ignore convention (that "is"-prefixed functions return bools) then, sure, names are not useful. But if you just want to break conventions, then all of this is moot because you are guaranteed to do stupid stuff outside of that.
Even if you don't want to break conventions, you might do it by accident. If you don't have a compiler (or a static checker of some sort), there's no one there to keep you consistent.
You are assuming that your usage of isReady will be flawless every time. What if you call a function updateUser("username", isReady()) but it turned out you remembered the order of arguments wrongly. Then the type checker will tell you as you can't pass a bool to a string argument and vice versa. This type of error is super common in %-string formatting.
It won't help if all the arguments have the same type, unfortunately, but at least it catches something. Just like unit tests, they won't catch everything, so the more checks we have and the earlier we run them, the less risk there is that something slips through to production.
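A small Python sketch of the argument-order mistake described above. The functions `is_ready` and `update_user` are hypothetical stand-ins for the `isReady` and `updateUser` in the comment, and the parameter names are assumptions.

```python
def is_ready() -> bool:
    return True


def update_user(username: str, active: bool) -> str:
    state = "active" if active else "inactive"
    return f"{username}: {state}"


correct = update_user("alice", is_ready())

# Arguments swapped by mistake. Python runs this without complaint and
# produces nonsense; a type checker would flag both arguments
# (bool passed where str is expected, and str where bool is expected).
swapped = update_user(is_ready(), "alice")
```

At runtime the swapped call quietly succeeds, because any non-empty string is truthy and any value can be formatted into an f-string. That is exactly the kind of bug that is cheap to catch statically and annoying to find later.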
With Flow, you just gradually add typings and fix the resulting type errors. It's way easier to slowly integrate types file by file rather than rewriting everything.
Moreover I think that at the time there were no statically typed languages that targeted the web.
There are many reasons. Some positive ones are: unnecessary complexity for the initial project, prototyping, exploratory analysis for machine learning/data science, etc. Some negative ones are: fear/ignorance about complexity, incorrect approach to testing (e.g. preference for many fine-grained unittests that effectively act as pseudo-compilers and type checkers under the mistaken belief the code is being tested), incorrect understanding of the tooling/operational support required, etc.
Sometimes you find yourself with a legacy codebase. Other times you find yourself with a team that bucks hard at the thought of a statically typed language. Politics is hard.
I'm using Flow but I'd recommend TypeScript. I love Flow but TypeScript just seems to be more seriously maintained and less buggy. I've been waiting on some bug fixes for 2 years (like all the object destructuring and spread issues[1]), but they prioritize the private internal Facebook roadmap before anything else (for example, they improve performance in almost every patch, but that's useful only for FB and its millions of lines of code). So yeah, support is more or less nonexistent and it's frustrating. When you find a bug, you have to rewrite your code in another way to work around it…
On the other hand, I don't agree with some of the comments here. For example, Flow is not terminal only at all, I never use the terminal to run Flow. The editor integration is totally fine, especially in Atom and VSCode.
JS with flow is way better than JS without flow. It prevents a whole class of errors, and you can basically opt-in and opt-out of the type system at any time if you're having trouble with it too. I wrote a blog post about it a while back: https://www.aria.ai/blog/posts/why-use-flow.html
That being said, while I started out using Flow, TypeScript just has way more community adoption and better tooling. So at this point, I usually recommend TS over Flow to most people. But using either of them is way better than writing just regular JS code.
TypeScript has way better editor support, and is faster on large projects. Flow is mostly terminal only, and required a really annoying syntax in its config file to exclude all node modules.
That being said, Flow is better than no type checks, and it was half a year ago that I looked, so stuff might have improved.
You can run Pyre without using watchman if your editor-of-choice supports LSP -- the two modes are basically complementary. You can also run `pyre check` in non-incremental mode, though that may be slower depending on the size of your codebase.
So you're saying the LSP portion checks the file(s) you're currently editing, whereas the watchman (also incremental) check or the non-incremental check do the whole project?
Edit: for the active file / LSP part, I mean it shows errors mainly for the active file.
Exactly - the LSP portion of pyre works for files you have open in an editor, but might miss changes due to a rebase or files you edit on the terminal. The watchman integration is there to make sure that pyre's aware of changes outside your editor.
Edit: pyre will show type errors for all files in your repository, not just the ones you have open.
Ah, that's nice. So you can change the type of a function in one file that you are editing, and pyre will show you all the other places that are now using the wrong types.
Someone should do a proper analysis of MyPy vs PyType vs Pyre. What features they support, how fast they run, what tooling there is around them, etc. But yeah, competition is definitely nice, as it'll lead to innovation and hopefully cross-project inspiration. All 3 are open-source so that's awesome.
Interesting - I also save very frequently (Cmd+S every few characters), and I think watchman is the first piece of software that has allowed me to work on a large remote folder seamlessly (I use it through Nuclide). I tried various options before that (using vim or emacs directly on the remote server, mounting the remote drive in various ways, etc). None of them come even close in my experience (slow sync, disconnects, etc).
Asciinema has the fatal flaw that it is dependent on their server. When they go under so do your recordings. There are other options out there that can handle playback of traditional console recordings without such dependencies.
Python is a very good language for academic use. It is a more featureful FOSS system that can be used for RAD or MATLAB-like purposes. Lack of types is not a serious problem for small codebases. But when the codebase passes the magical 10,000-line mark, type problems start to become serious. I hate type annotations in comments; they are not maintainable.
Either you create a strictly modularized codebase that respects the lack of types and the 10,000 LOC speed limit, or you change languages. Golang is not fancy but it is really pragmatic and could appeal to the dynamic language crowd.
The other choices are Java/D/Scala. Or one can use Python as a glue language (superpowered C written modules used by Python). This is the problem/solution with Lisp too. If you go dynamic, you should be very careful, but there are gains. Don't forget to write tests unless you intend for a Matlab style experience.
My personal experience shows that you need many separate codebases to achieve 30m LOC, in other words many small projects of the 10.000 LOC code limit.
This is the main trading and risk management platform for a bank. A central team of around 150 people build the core platform libraries and underlying infrastructure. Then business-aligned teams build applications on top. The code is in one huge mono-repo.
Why do you feel type annotations in comments are not maintainable?
The type checker obviously has to be run in continuous integration so it's not like the annotations are going to get outdated just because they are in comments. (Actually you shouldn't see them as comments, they are as much code as the rest of your code)
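For readers who haven't seen the comment form being discussed, here is a minimal sketch of the PEP 484 type comment syntax next to the equivalent inline annotations. The function names are made up for illustration.

```python
def repeat(text, times):
    # type: (str, int) -> str
    """Comment-style annotation, usable on code that must also run on Python 2."""
    return text * times


def repeat_inline(text: str, times: int) -> str:
    """The same signature using Python 3 inline annotations."""
    return text * times
```

A checker that supports type comments reads both signatures the same way, so a call like `repeat(3, "ab")` would be reported in either style. As long as that check runs in CI, the comment form cannot silently drift out of date.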
I like Rust as much as the next HN enthusiast, but there's a reason there's not a lot of Rust code actually out in the wild, or many jobs available, or libraries with more than one creator/maintainer... and it's because, of all the words you could use to describe Rust, the last one on that list would be "easy." Anything non-trivial in Rust is a lesson in extreme patience.
This doesn't work on Windows? I'm getting an error installing it:
> python --version
Python 3.6.1
> python -m pip install pyre-check
Collecting pyre-check
Could not find a version that satisfies the requirement pyre-check (from versions: )
No matching distribution found for pyre-check
Looking at the published package, they're binary wheels and only published for linux/AMD64 and OSX 10.13/AMD64. I guess it's manual installation for everybody else.
Wow. Do you know how they even build this thing? The instructions I see just keep saying pip install everywhere. I don't see any setup.py/configure/Makefile/etc.
OT: I thought the docs for pyre-check.org looked really nice. Didn't look like a Jekyll or Sphinx template. Looked at the source code and it's another FB library, Docusaurus, which uses React and Markdown: https://docusaurus.io/
Had been briefly discussed a few months ago on HN [0] but I guess I missed it. Always looking for some variety in static site/docs deployment!
I'm curious what the proponents of dynamic typing over static typing think about the recent push towards adding type systems to the two biggest dynamic languages out there: Python and JS.
It seems to be a pretty clear case of "failure by success". Python, JS, and PHP were so good for developing applications that many people wrote huge ones in them -- including companies that made billions of dollars like Google, Facebook, and Instagram. GMail and Google Maps broke new ground in terms of JavaScript applications (using the Closure compiler, which is probably almost 15 years old now.)
At the start, those languages tended to be used for 100-line, 1000-line projects. They work really well in those domains. 5K lines works fine for single person projects too.
Once you get to 10K- and 100K- line projects, and you have 10+ people on a project, types start to make sense. You can't change 10K lines of code at once, so you might as well have some rigidity.
Also note that you can do a lot more in 10K lines of Python than 10K lines of Java, C or C++ -- and in the 90's, when those languages came up, those were basically your choices.
IMO it’s that people coming from static languages to dynamic languages want them.
I honestly can’t think of anybody in my career who’s been comfortable with dynamic languages that had a desire to move to static. It’s such an impediment to the entire programming style that it doesn’t naturally happen.
On the flip side, I know plenty of static typing people who don’t seem to think they can function without it.
No matter what your preferred language, Python and JavaScript are almost unavoidable and because of that I think you see a lot of stuff like this brought to the table to help make people more comfortable.
Python for sysadmin, math, ML, etc. Javascript for browser.
You see a lot of the same thing in the Elixir community with its gradual type system. I can’t tell you how many times I’ve seen the discussion from people who want it to be statically typed, but neglect that all of the guarantees from it go out the window with distributed nodes.
> I honestly can’t think of anybody in my career who’s been comfortable with dynamic languages that had a desire to move to static. It’s such an impediment to the entire programming style that it doesn’t naturally happen.
Really? Anecdotally, I have the exact opposite experience: I know many devs (myself included) who had always used dynamically typed languages but got "hooked" on static typing after trying it in a decent language (Swift in my case), but I can't name a single person who went the other way.
In my experience, if you use a modern statically typed language with a solid type system, after the initial learning curve, the type system stops being an impediment to probably 95% of the code you write. In return, you get some really nice guarantees about your code, and what would be large and tedious refactors in a dynamically typed language now become a breeze. And if you bolt static typing onto a dynamically typed language, you get the best of both worlds: safety guarantees by default, and the full expressiveness of the underlying language when you need it.
Personally, I'm excited for these tools because I write code in dynamically typed languages every day, but I strongly value the benefits of static typing. This way, I can have my cake and eat it, too.
I really think a lot of it depends on your stack. On the web application side, I find static typing to be entirely an impediment - usually because the types that I really care about are already defined at the database level.
Every statically typed language I've worked with on the web just ends up forcing you to duplicate the same already defined structure in multiple places, writing a ton of extra code that provides marginal benefit but creates a major negative impact on productivity.
For many other areas, especially on phones, embedded devices or desktop software static types will make a lot more sense.
I think languages where the primary focus is data exchange the benefits are less pronounced. That's just my experience though.
But it's still incredibly beneficial to annotate the edges of your API. Not only does that quickly catch mistakes where you break the API contract, but it also serves as built-in documentation on the shape of the data you are serving, especially if you are using a dynamic format like JSON. And that's the beauty of having static types in a language like Python: you don't have to annotate everything, and can be selective about where you apply static typing to get the most bang for your buck.
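A rough sketch of what "annotating the edges" might look like in Python, using `TypedDict` (Python 3.8+) to describe a JSON-shaped response. The payload fields and the `_load_row` helper are assumptions made up for this example; the internal helper is deliberately left unannotated.

```python
from typing import List, TypedDict


class UserPayload(TypedDict):
    """Shape of the JSON object this endpoint promises to return."""

    id: int
    name: str
    tags: List[str]


def _load_row(user_id):
    # Internal helper, left untyped on purpose (stands in for a DB lookup).
    return {"id": user_id, "name": "alice", "tags": ("admin", "staff"), "pw_hash": "x"}


def get_user(user_id: int) -> UserPayload:
    """API edge: annotated, so the contract is documented and checkable."""
    row = _load_row(user_id)
    return {"id": row["id"], "name": row["name"], "tags": list(row["tags"])}
```

Only the boundary function carries types here. If someone later renames a key or returns an extra field in the literal, a checker would flag the mismatch against `UserPayload` without the rest of the module needing annotations.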
> On the web application side, I find static typing to be entirely an impediment - usually because the types that I really care about are already defined at the database level.
I disagree here. Take one example, Slack. They publish "types" for their API that they don't adhere to, since every endpoint of their API that is supposed to return that "type" is returning something else. That could've been solved with a type system or schema validation which in itself is a weak form of a type system.
Types make sense if you're not using OO, so for pure functional python I can see how they would be useful.
In my projects I have dozens of classes that are nearly strings but add functionality not present in strings. Trying to type check those would turn into insanity pretty quickly.
>I honestly can’t think of anybody in my career who’s been comfortable with dynamic languages that had a desire to move to static. It’s such an impediment to the entire programming style that it doesn’t naturally happen.
I'm someone who would be described as a pythonist. I learned CS in python, had a quick foray into Java (which I'm not particularly fond of), and then have done the vast majority of my coding, both personal and professional, in python. I'm a huge proponent of type hinting.
Reasons for this:
- It's not really an impediment. Quality code should already have APIs notated with argument and return types. Converting docstrings to mypy is easy.
- It's optional. You have some weird super dynamic magic nonsense. Cool, annotations are optional. Don't include them on your metaclass-generating decorator function. Being able to opt out of the safety guarantees easily is really really useful for those cases where you do want to abuse dynamism.
- It's super useful. I catch bugs faster now. I write less buggy code. Refactoring is much, much easier (mypy highlights the lines where I'm now doing bad attribute accesses etc.). Some of these things can be provided by a good IDE, but I'm often not in an IDE, and this way I can run it as a pre-commit hook.
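As a rough illustration of the first bullet, here is the same hypothetical function documented with types in its docstring and then with annotations. The docstring layout is just one common convention, not something the comment above specifies.

```python
from typing import List


def scale_documented(values, factor):
    """Scale each value by factor.

    :param values: list of float
    :param factor: float
    :rtype: list of float
    """
    return [v * factor for v in values]


def scale(values: List[float], factor: float) -> List[float]:
    """Scale each value by factor."""
    return [v * factor for v in values]
```

The information is identical in both versions; the annotated one is simply in a place where a checker such as mypy can read it and verify it against the callers.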
Agreed that for sysadmin work it's maybe not as helpful. For math and ML, I think it is. There are issues that make numpy/tensorflow really difficult to typecheck internally, but there's active work on that front as far as I know.
When I first started programming seriously (which happened to be early node.js, before any of the type systems), the lack of types was aneurysm inducing. That, combined with the tendency for libraries in dynamically typed languages to have less than ideal object documentation is very frustrating. It seems impossible to keep track of data flow in any decent sized project, and otherwise tedious.
I don't know whether this is an unpopular opinion (seems like it might be?) but I would never tell anyone to start with a dynamically typed language.
Could you elaborate on that last point regarding elixir and why type systems don't work for it? I couldn't find anything when I did a cursory search on google.
I like to use a cellular metaphor for programs, where you convert incoming data to either internal representations or errors as soon as possible, and delay converting to external representations as long as possible on the way out back into the "unfiltered world", but inside the cell things are considered "clean" and trusted. Type systems fit this point of view very naturally, because they can define that "clean view" of the world and enforce that you do the conversions early, because otherwise you won't have the target types. However, a "cell" has to be local, within a context where for instance it can count on having a stable view of the types and what they can do. (This being Erlang/Elixir, we can talk about upgrading running processes, but in that case, you have to provide a conversion function as well, though the default and common one is the identity conversion. Still, there is a conversion process.)
Once you have a system with multiple contexts, which you can't help but have if you're crossing machines but even just across OS processes you ought to behave as if you have a separate context, the type system is less helpful. You can't take a message from a foreign node of type X!foreign and simply assume it's of type X!local, because you do not know in the general case whether the foreign and local concept of X is the same thing. Perhaps they are different versions. Perhaps they aren't even the same program. At a sufficiently large scale or with sufficiently bad luck, perhaps the foreign node is actively lying to you in an attempt to hack you.
You end up having to serialize from the foreign node to the local one, and then deserialize on the local node, in order to communicate. Especially in a world where nodes can upgrade their code without so much as dropping their TCP connection to you or something. So there is a point of view from which having strong types at the language level isn't all that helpful to this process, because even having strong types doesn't let you escape from these issues. You can do all sorts of automated stuff to try to escape from the code boilerplate aspects of this problem, but you can not fully wrap the semantic problem of "this message, no matter how hard we try to abstract this away for you, may be invalid by local standards".
How much this matters to you depends on how distributed your system is. If you've got a 3-node Erlang/Elixir cluster that you fully control all the code for, you might be able to get away with just ignoring the issue. Like many other programming issues, at small enough scales it doesn't matter and you can just ignore it. But as you scale up it becomes an inevitable problem you must program for; consider something like an AWS API for S3 or something, served by a few different servers and consumed by thousands of clients using every possible version of every SDK Amazon has ever put out and who knows how many home-grown implementations. Types can define the communication contract at that level, but provide zero guarantees to a server about what it's going to receive.
So basically, when interacting with the outside world at scale, it's very hard to make any sort of guarantees about the kind of data you're working with. And since you have to validate the shape of your data already, you're doing the job of the type system yourself, which makes it less useful to have one. Did I get that right?
Aren't these issues only at the boundaries where you're receiving data from the outside world? Wouldn't it still be helpful to have a type system/static analysis at the local level?
"And since you have to validate the shape of your data already, you're doing the job of the type system yourself, which makes it less useful to have one."
I'd say it's more like you don't get the imaginary benefit of being able to skip that checking.
And yes, a type system can absolutely help you with this, especially in terms of enforcing that you do this conversion (since the incoming byte[] stream is not going to be any of the types you actually want, even at the brute "int" or "char" level, let alone user-defined types). But there is certainly a sense in which this doesn't especially help you with dealing with remote nodes.
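A minimal Python sketch of the "convert at the cell wall" idea from this subthread: raw bytes from a foreign node become either a trusted internal type or an explicit error, as early as possible. The message format (JSON with `order_id` and `quantity`) and all names are assumptions invented for the example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Internal, trusted representation: only parse_order() creates these."""

    order_id: int
    quantity: int


class InvalidMessage(ValueError):
    """Raised when a foreign message does not meet local expectations."""


def parse_order(raw: bytes) -> Order:
    """Boundary conversion: untrusted bytes in, Order or InvalidMessage out."""
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise InvalidMessage(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidMessage("expected a JSON object")
    fields = {}
    for name in ("order_id", "quantity"):
        value = data.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise InvalidMessage(f"{name} must be an integer")
        fields[name] = value
    return Order(**fields)


def total_quantity(orders: list) -> int:
    # Inside the "cell": no re-validation, the Order type is the guarantee.
    return sum(order.quantity for order in orders)
```

The types do not remove the need for the runtime check at the boundary, which is the point made above. What they add is a way to ensure that code inside the boundary only ever sees values that went through it.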
It looks like it's already been answered pretty thoroughly. The cell analogy works really well.
I tend to describe it more akin to REST vs WSDL regarding the necessity of exchanging contracts between WSDL participants when there is a change to adhere to the structure.
I think there are advantages and disadvantages to developing a project in a statically typed language just like there are for a dynamically typed one. For a lot of projects, Python makes sense because of the very deep and rich set of packages out there. It's also much easier to make something quickly if you're not fighting the compiler or a type checker.
But it's not binary and there are plenty of good reasons to have static type checking. I personally went through an exercise where I applied type annotations to a large Python project and found a good amount of bugs just by type checking.
I think having the flexibility to add type annotations later, only apply to parts of a code base, and allow some violations is a great middle ground and makes Python a much more attractive option for large code bases.
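A small sketch of what "only apply to parts of a code base" can look like. The function names are made up; the two annotated functions are what a checker would verify, while the unannotated one is left alone (mypy, for instance, does not check the bodies of unannotated functions by default).

```python
from typing import Dict, List, Optional


def find_user_id(users: Dict[str, int], name: str) -> Optional[int]:
    """Annotated: the Optional return type tells callers that None is possible."""
    return users.get(name)


def describe(users: Dict[str, int], name: str) -> str:
    user_id = find_user_id(users, name)
    # A type checker would flag building the string without this None check.
    if user_id is None:
        return "unknown user"
    return "user #" + str(user_id)


def legacy_report(users, names):
    """Unannotated legacy code: can keep running as-is until someone types it."""
    return [describe(users, n) for n in names]
```

Forgetting the `None` case is the kind of bug that annotating an existing project tends to surface.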
I doubt most of us really care. In fact, working in gradually typed systems is probably what I like the most.
I am however pretty unconvinced by the superiority of static typing. Being able to let the compiler verify the types for me in compartmentalised pieces of my software is very nice and all that, but for me I doubt it has caught many bugs.
I think a lot before writing code, and what comes out usually works on the first or second try.
A huge difference is that these languages don't force you to use type checking everywhere. You can use it just in the more complex pieces of your source code. It would be nice to see how tools, like refactoring, would handle it.
Good question [I am the 'Pieter' on that blog post]. There will be a talk at PyCon that covers parts of this. It comes down to two reasons:
1) Performance. We needed something that would consistently work quickly on Instagram's server codebase (currently at several million lines).
2) We are building deeper semantic static analysis tools on top of Pyre. We've built some of these tools for Hack/PHP already, so following the Hack type checker's architecture is the best way for us to achieve this.
> 1) Performance. We needed something that would consistently work quickly on Instagram's server codebase (currently at several million lines).
That's not really an answer to why you didn't work on mypy, at least to an outsider to the decision making process. Are you saying that you discovered it's just not possible to scale mypy (or at least not without extensive work / more work than building your own solution?)
I think (2) is the overriding concern: we're getting really great results with the static analysis tools we've built internally on top of the Hack type checker. Building a similar tool on top of mypy would've required fairly invasive changes; we decided to use the Hack type checker infra instead.
Full disclosure: I worked on the Hack type checker briefly, a long time ago :)
The same reason everyone has a deep learning platform. It’s about developer mindshare and industry dominance rather than honestly thinking they’ll make a better framework starting from scratch rather than improving someone else’s.
Apple actually sort of does have a backend framework: WebObjects. They just seem to have deprecated it in favor of other solutions. Still, I've noticed at least iTunes Connect is still using it. And for what it's worth, Swift is going to have more server related stuff coming to the standard library (https://swift.org/server-apis/), and I wouldn't be surprised if Apple starts adopting this eventually as well as it continues to improve.
This is a great development for Python. The language services ecosystem was in the ICU, but perhaps now it's just on life support. VS Code, my editor of choice, supposedly has a pretty good Python experience. That was not my experience; slow! Just pulling tensorflow into a project would introduce serious save lag. Lag that's pretty much omnipresent to a lesser degree. Save and wait for syntax error updating... And wait... Pylint and Jedi were just not up to the task. A speedy language server is just what the doctor ordered.
Still, the annotation/inference experience falls apart completely when interfacing with libraries like SQLAlchemy or boto3. Supposedly writing custom plugins is the way, but that never gets done! How will pyre handle stuff like this, or is it just asking too much? Perhaps a combination of better tooling and better library authoring, eschewing the temptation of metaprogramming and dynamism, to play well with type hints will be necessary?
I'm excited for these new tools and will give it a play. However I honestly hope I never have to work in a python-forward environment again.
Anyone know how this project's capabilities compare to the PEP-484 checking built in to IDEA/PyCharm? That implementation goes "above and beyond", inferring types based on use when there are no explicit annotations. I'm wondering if other projects are doing that kind of thing.
We haven't yet released a type inference feature, but have been building and using one internally. It helps by adding type annotations directly, based on what the type checker is able to infer. We're hoping to release this feature in the future.
It’s because it is built on top of the infrastructure that was introduced for Hack.
The system first starts by allocating a large area in shared memory (through a call to mmap). That area is very large, but because most of it won’t be written to, it’s ok in practice.
After that, the program forks as many times as there are cores. Each of those forked processes is called a “worker”. The first program is now the master.
The master and the workers communicate through pipes, but that is only used for synchronization. The lion’s share of the data goes through that shared memory that was mmapped at the beginning.
There are 2 main things shared in that area. 1- atomic hashtables of serialized OCaml objects (you can define as many as you like with a functor) 2- an atomic table of dependencies.
Each worker can read and write to those tables without locks (but only the master can remove from them).
It turns out that that setup works well in practice. Serializing/deserializing costs are mitigated by a cache of deserialized values for each core. And this way each core can manage its own memory (the GC does not need to scan the shared memory).
Because that setup was working well in practice, it was reused by Flow and now Pyre. We refer to that setup internally as the “Hack infrastructure”.
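For readers who want to see the shape of that setup, here is a toy sketch in Python rather than OCaml. It is not the Hack/Pyre code: there is no hashtable or dependency table, only a fixed slot per worker, and `run_workers` is a made-up name. It assumes a Unix system (it uses `os.fork`). What it keeps from the description is the order of operations: map shared memory first, fork the workers, move data through the mapping, and use a pipe only to signal completion.

```python
import mmap
import os
import struct

SLOT = struct.Struct("<q")  # one signed 64-bit result slot per worker


def run_workers(chunks):
    """Fork one worker per chunk; each stores sum(chunk) in shared memory."""
    if not chunks:
        return []
    # Anonymous shared mapping, created before forking so workers inherit it.
    shared = mmap.mmap(-1, SLOT.size * len(chunks))
    read_fd, write_fd = os.pipe()  # synchronization only, no payload

    pids = []
    for index, chunk in enumerate(chunks):
        pid = os.fork()
        if pid == 0:  # worker process
            try:
                os.close(read_fd)
                # Each worker writes only to its own slot, so no lock is needed.
                SLOT.pack_into(shared, index * SLOT.size, sum(chunk))
                os.write(write_fd, b"x")  # tell the master we are done
            finally:
                os._exit(0)
        pids.append(pid)

    # Master: wait for one "done" byte per worker, then reap the children.
    os.close(write_fd)
    done = 0
    while done < len(chunks):
        data = os.read(read_fd, len(chunks) - done)
        if not data:  # all write ends closed before every worker reported
            break
        done += len(data)
    os.close(read_fd)
    for pid in pids:
        os.waitpid(pid, 0)

    results = [SLOT.unpack_from(shared, i * SLOT.size)[0] for i in range(len(chunks))]
    shared.close()
    return results
```

Because every worker is a separate process with its own heap, no garbage collector ever has to scan the shared region, which is the property the description above relies on.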
You can have multi-threaded OCaml (our server actually has two threads), but only one can run at a time currently. We solve this problem by having a multi-process architecture where workers only communicate through a lock-free hash table in shared memory.
I'm glad to see more mind-share for static type checking in the Python community. At the moment, types aren't as popular in the Python world as they are in the JavaScript world where you can find TypeScript type definitions for most libraries.
I hope adding types to your library becomes the norm in Python as well, because as it stands, very few Python libraries have types defined. This means you can only have type guarantees in the code you write and not in that of your dependencies.
Somewhat off-topic: Does anyone else feel the need to avoid products purely because of their maker regardless of quality? I like some Nestle products but avoid them because of their maker. I am also avoiding GraalVM for the same reasons. Some things we can't avoid of course (especially those with momentum or without equal), but if I can avoid things built by companies I don't like, I will (digitally and in meatspace). Am I unreasonable?
Probably, but there is a lot to be said for having a moral sense.
I will never use Oracle products or allow them to bleed into my infrastructure for any reason for example.
It's possible that this has saved my company a lot of money, it's also possible that it was completely unjustified, but given the trend of the company in question- I believe it was a positive decision. Make choices, the best you can in the moment, you can usually revisit them later.
Remember that “quality” does not merely mean “works right now at this very moment”, but also should mean “is safe to build on and count on to still be there and developed in a reasonable direction for many years to come”.
There are many companies which should not get an ounce of that kind of trust, regardless of how useful their products are right now; very many simply can’t be trusted, for a myriad of different reasons.
I'd be happy to be corrected on this, but the documentation seems a bit... bare? I've used mypy, and I'm interested in types more complicated than `foo() -> int`. I'd like to see how pyre handles generic stuff, unions etc... Maybe that's all rolled into "we support PEP X", but it'd be nice to see some examples.
(I might have somehow missed the full docs - in that case, if anyone has a link, I'd be grateful.)
Thanks! I'll dig into it later, as I've been looking for a tool like this, and it so happens that the details are interesting to me :) (I built my own namedtuple-style library for sum types, go figure). It didn't seem like the easiest reading though - that could make some potential users bounce... Are you working on docs that expose these features in a friendlier way?
edit: Also, was I correct in that it's basically "The stuff that's specified in the PEP and works in mypy should work"?
On mobile web, pressing back from the note sends me to Facebook's homepage, not to hn. Anyone else notice this? This is on top of the pop-up modal prodding me to add a shortcut to Facebook on my home screen. These are the kind of annoyances and "tactics" that make me dislike Facebook and want to stop using them.
Aside from that, pyre looks like a super useful tool. Hopefully it has fewer rough edges than mypy.
I'll be honest -- we've been using Flow for JavaScript on our team for a while and we can't get over a lot of the problems we run into with Flow. We have been considering moving to Typescript. I can't really see how adding a static type checker to Python will be any better.
I know it's probably better than nothing but if the same problems carry through, it'll be a pain.
Python’s type checking story is pretty painful right now, partly because the typing library has a lot of confusing corners, MyPy (the reference type checker implementation) is pretty buggy, the rules are limiting (no support for interfaces/protocols and no recursive types), and magical code like SQLAlchemy simply can’t be type-stubbed.
These are all tractable problems, so I don’t see a theoretical problem; however without more info on your flow issues I can’t make a good comparison.
I'm curious. Why the name "pyre" and how did it get that name? I get the "py" at the beginning to refer to python. What dead things are symbolically (or otherwise) being burned here?
Edit: I did skim through the comments here and also skimmed through the official site and the GitHub repo. I didn't find anything about this name.
I've tried it back when we were using PySpark and it did what I expected, but I am not the heaviest consumer of Jupyter notebooks to be able to say if it's 80/20 or 100% of what one would expect
$ pip install pyre-check
Collecting pyre-check
Could not find a version that satisfies the requirement pyre-check (from versions: )
No matching distribution found for pyre-check
PEP-484 Type Hints are for Python 3.5+, sorry if you are running on Python 2.x like the majority of the Google Python codebase does.
Update: for the downvoters, the PEP lists Python 3.5 as the version here: https://www.python.org/dev/peps/pep-0484/. Yes, it does have a Python 2.7 section, but after Python 2.7 is discontinued I'm pretty sure the Python core team won't support type hints on 2.x. On top of that, the type hint syntax suggested for 2.x (as a comment) is different from the official 3.x syntax.
> Some tools may want to support type annotations in code that must be compatible with Python 2.7. For this purpose this PEP has a suggested (but not mandatory) extension where function annotations are placed in a # type: comment.
> You can use a comment-based function annotation syntax and use the --py2 command-line option to type check your Python 2 code. You’ll also need to install typing for Python 2 via pip install typing.
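For anyone who hasn't seen the comment-based syntax those two quotes refer to, a minimal sketch (the function and variable names are made up). The annotations live in `# type:` comments, so the interpreter ignores them and the same file runs on both 2.7 and 3.x; on 2.7 the `typing` import needs the backport mentioned in the quote above.

```python
from typing import List, Optional


def find_index(items, target):
    # type: (List[str], str) -> Optional[int]
    """Return the position of target in items, or None if it is absent."""
    for position, item in enumerate(items):
        if item == target:
            return position
    return None


names = ["ada", "grace"]  # type: List[str]
```

Whether a given checker version still offers a Python 2 mode is worth verifying before relying on it.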
I fully endorse this. We've been using mypy on our Python2/Python3 project successfully. It is helpful for the type checking, but I'm most interested in giving good data to IDEs for code completion.
Actually it works quite well (using comments for type annotations) on Python 2.7 (or mixed 2.7-3.6, as we are currently using in my company) code bases.
Supporting an entire new AST and type system (pattern matching) is a large investment. It can be made if there's enough of a userbase out there to justify it. That doesn't seem to be the case. Not yet at least.
If you're going to go to all the effort of declaring types, why not use a language that can take advantage of that information to get faster and more efficient code?
what i don't understand about stuff like this (and honestly typescript as well) is if there aren't definitions for type schemas then this doesn't work right? like none of these tools will introspect or something and figure out which types i really mean without me annotating with some kind of crazy sum/tuple type definition (e.g. foo() -> Object{a:b,c:d} or whatever it is in typescript). maybe i'm wrong and there's something different here?
Instagram actually has a program for stubbing existing code called MonkeyType. It requires running your code to collect the data, and then after your run finishes you can then apply the new inferred types to your program.
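To illustrate the general idea of collecting types by running the code, here is a toy sketch. It is not MonkeyType's implementation or API; `record_types`, `OBSERVED`, and `suggest_signature` are made-up names, and it only handles positional arguments.

```python
import functools
from collections import defaultdict

# Hypothetical registry: function name -> set of (argument types, return type).
OBSERVED = defaultdict(set)


def record_types(func):
    """Decorator that notes the runtime types seen on each call."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        arg_types = tuple(type(a).__name__ for a in args)
        OBSERVED[func.__name__].add((arg_types, type(result).__name__))
        return result
    return wrapper


def suggest_signature(name):
    """Render one stub-like line per observed call shape, sorted for stability."""
    return sorted(
        "({}) -> {}".format(", ".join(arg_types), ret)
        for arg_types, ret in OBSERVED[name]
    )


@record_types
def double(x):
    return x * 2


double(2)
double("ab")
```

After those two calls, `suggest_signature("double")` lists one shape for the int call and one for the str call. The inferred types are only as good as the code paths that were actually exercised.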
Even without you explicitly defining types, there is already quite a bit such tools can help you with since they usually come with type annotations for the standard library (or equivalent) of their languages.
pytype [https://github.com/google/pytype] does do type inference (disclaimer - i work on it). if you supply an explicit type annotation it will be checked for correctness, and if not we trace through the control flow graph to figure out the types of everything as best as we can (which does sometimes mean falling back to Any).
we are working on an extension that will add the inferred types as annotations to your code.
PyCharm, and perhaps other tools/editors, does introspect the code. It's not too hard to work out what protocols or interfaces an argument requires, e.g. if you use it in a for loop it needs to be iterable, if you index or slice it then it needs to support that, if you add it to an integer, etc.
It's not about the concrete types but instead about what protocols it supports.
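A short sketch of "protocols, not concrete types", written with explicit annotations instead of IDE inference. It assumes Python 3.8+ for `typing.Protocol`; `SupportsGetItem`, `total`, and `first` are made-up names.

```python
from typing import Any, Iterable, Protocol, runtime_checkable


@runtime_checkable
class SupportsGetItem(Protocol):
    """Anything that can be indexed, regardless of its concrete class."""
    def __getitem__(self, index: Any) -> Any: ...


def total(values: Iterable[int]) -> int:
    # The for loop is the only requirement: any iterable of ints will do.
    result = 0
    for value in values:
        result += value
    return result


def first(items: SupportsGetItem) -> Any:
    # Indexing is the only requirement here.
    return items[0]
```

A list, a range, and a set all satisfy `total`; a set does not satisfy `first`, because it cannot be indexed, and that is the distinction an editor can infer from how an argument is used.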
Seriously? Another user posted "can't read this; can we get a non-facebook link please" - is that substantive?
Or, are you just picking on my comment because I phrased my concern differently?
Also, "a lot of unsubstantive comments" is false. Two of my recent comments created large sub-threads. Two others were downvoted simply because people didn't agree with my opinion.
So, I think it's fair to say that you're ignoring plenty of "unsubstantive" comments and basically picking on people who you disagree with.
The link is to the announcement of the project. The announcement is a public Facebook post, which seems reasonable given that Facebook developed the project.
It has built in classes that used to be types 15 years ago, but no types. [0]
90% of the holy war between strongly and weakly typed in python would be resolved if we just removed all references for types from Python, and renamed TypeError to UnsupportedOperand.
PEP 484 [1] is about class hints, and this software builds on top of that. That has its place, but it's an ugly hack that should not become a main feature of the language.