
I mean, Python is the de facto beginner language and also used for professional software development (and of course intermediate data science work). Are you suggesting this is an unwise or unstable equilibrium?



I'm glad you bring that up. I think Python is the closest we have to a "universal language" (even so, it still has some limitations).

I think it works well for beginners because the language itself is so consistent and they have put a lot of effort into avoiding corner cases and "gotchas". And I think it works for professional uses because of third party library support.

To answer your question: I'm not suggesting that at all. I'm honestly not entirely sure how Python balances it seemingly so well. Given the lack of focus in the industry towards "intermediate" programmers and use cases, my slight fear is that Python will be shoehorned into one direction or the other.

Even if the language itself isn't, it does feel like the use-case-complexity gap is growing exponentially, at times.

And not just with Python. Seemingly, you're either a complete beginner learning conditionals and for-loops or you're scaling out a multi-region platform serving millions of users with many 9's of uptime.


Python does this so well because of the extremely full featured and fairly easy to use C API. Advanced programmers can write extension modules for the interpreter and provide APIs to their C libraries via Python, give their types and functions a basically identical syntax to MATLAB and R, and bang, statisticians, engineers, and scientists can easily migrate from what they already know how to use, pay no performance penalty, but do it in a language that also has web frameworks and ORMs. You can do machine learning research and give your resulting predictive models a web API in the same language.

This gets badly underappreciated. I've been working in Python for a while and honestly, I hate it. I wish I could use Rust for everything I'm doing. I can't stand finding so many errors at runtime that should be caught at build time in a language with static type checking.
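A small illustration of the kind of error meant here (the function name and values are made up for the example): the bug sits in a rarely taken branch, so nothing complains until that branch actually runs.

```python
def describe(count):
    # The common path works fine.
    if count < 100:
        return "count: " + str(count)
    # Bug: str + int. Nothing checks this line ahead of time in plain
    # Python; it only fails when this branch is actually executed.
    return "large count: " + count


print(describe(5))  # the buggy branch is never reached here

try:
    describe(1000)
except TypeError as exc:
    print("failed only at runtime:", exc)
```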

But I also recognize the tremendous utility in having a language that can be used for application development but also for numerical computing where static typing isn't really needed because everything is some variant of a floating-point array with n dimensions. Mathematically, your functions should be able to accept and return those no matter what they're doing. All of linear algebra is just tensor transformations and you can model virtually anything that way if you come from a hard engineering background. Want to multiply two vectors? Forget about looping. Just v1 * v2. It will even automatically use SSE instructions. Why is that possible? The language developers themselves didn't provide this functionality. But they provided the building blocks, in the form of a C API and operator overloading, that allowed others to add features for them.
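To make the operator-overloading half of that concrete, here is a minimal standard-library sketch. `Vec` is a hypothetical toy class, not NumPy's implementation; the point is only that `v1 * v2` is something a library can define, and a real array library would run the loop in compiled code behind the same syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec:
    """Hypothetical toy vector, standing in for a real array type."""
    items: tuple

    def __mul__(self, other):
        # Elementwise product. Here the loop is plain Python; an extension
        # module would do the same work in C behind this same operator.
        if not isinstance(other, Vec) or len(other.items) != len(self.items):
            return NotImplemented
        return Vec(tuple(a * b for a, b in zip(self.items, other.items)))


v1 = Vec((1.0, 2.0, 3.0))
v2 = Vec((4.0, 5.0, 6.0))
print(v1 * v2)  # Vec(items=(4.0, 10.0, 18.0))
```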

So the complaints you typically see about dynamic languages simply don't matter. No static typing? Who cares? Everything is a ndarray. Syntax is unreadable? Not if you're coming from R or MATLAB because the syntax is identical to what you're already used to using. Interpreted languages are slow? Not when you have an interface directly to highly optimized BLAS and ATLAS implementations that have been getting optimized since the 50s and your code is auto-vectorized without you needing to do anything. GIL? It doesn't matter if you can drop directly into C and easily get around it.

Meanwhile, it's also still beginner friendly!

EDIT: I should add, editable installs. That's the one feature I really love as a developer. You can just straight up test your application in a production-like environment, as you're writing it line by line. No need to build or deploy or anything. Technically, you can do this with any interpreted language, but Python builds this feature directly into its bundled package manager.


Great rundown! It's love/hate for me too. Python is the worst language for scientific computing, except for all the others. I think Julia's going to take the crown in a few years though, once the libraries broaden out and they figure out how to cache precompiled binaries to get the boot time down. With Python, it's not so much that you get to write C, you have to write C to get performance. I'll be interested to see whether Julia takes off for applications besides heavy numerical stuff. That seems to be the Achilles' heel of languages designed for math/science applications -- it's easier to write scientific packages inside a general-purpose language than vice versa.


This is hands down the best description I've seen of why so many of us persist in using Python despite the language or runtime. I do hope that more alternative language ecosystems will begin to thrive in the numerical space and that we'll see more ergonomic facilities for generating high performance code from within Python itself.


TL;DR: Right now Python is almost always easier for numeric Python beginners than Rust is for numeric Rust beginners, and also more productive. I just don't see Python's ease and productivity advantages remaining if Rust can catch up with Python's ecosystem and toolchain. But we'll have to wait and see if that will happen. If and when Rust is actually (slightly) more friendly to the numeric computing beginner and much more productive in some numeric/scientific contexts than Python, Python loses its current intermediate language position. Especially if similar improvements happen in other domains.

> Python does this so well because of the extremely full featured and fairly easy to use C API. Advanced programmers can write extension modules for the interpreter and provide APIs to their C libraries via Python, give their types and functions a basically identical syntax to MATLAB and R, and bang, statisticians, engineers, and scientists can easily migrate from what they already know how to use, pay no performance penalty, but do it in a language that also has web frameworks and ORMs. You can do machine learning research and give your resulting predictive models a web API in the same language.

You know what's better than "the extremely full featured and fairly easy to use C API": If your language can itself compete with C/C++ for writing the libraries you need. The only advantage Python has over Rust regarding library ecosystem is the first-mover advantage and that Rust makes obvious how terrible the C API is which means people often invent new Rust libraries rather than reuse the old C libraries. The only advantages Python has over Julia are first-mover and that I doubt Julia-native libraries can truly match highly optimized C/C++/Rust libraries performance-wise in most situations where performance actually even matters.

> But I also recognize the tremendous utility in having a language that can be used for application development but also for numerical computing where static typing isn't really needed because everything is some variant of a floating-point array with n dimensions.

* Some numeric use-cases need to work with more than floating points. Maybe 2D (complex), 4D, or 8D numbers, maybe dollars, maybe durations. You lose all units of measurement and they are often valuable. In Python you cannot indicate "this cannot contain NaN/None".

* In an N-D array, N is a (dependent) type, so is the size, so is the shape. Julia got this right but last time I checked it had a nasty tendency to cast the dependent types to `Any` when they got too complex. Imagine if you could replace most `resize` calls with `into` calls and have the compiler verify the few cases where you still need resize. In Rust several libraries already use dependent types for these sorts of uses, but the lack of important features that are only now starting to approach stable (const generics, GATs) makes them very unergonomic to work with.

* I see a lot of complex types that should've been a single column in a dataframe get represented with the full complexity of multi-indexes. Yuck! Not only more complex, but far less expressive and more error prone. I haven't yet seen Rust go the extra step and represent a struct as a multi-index to get the best of both worlds, but it's what I would love and Rust definitely has the potential. It's just not a priority yet as we are still implementing the basics of dataframes first.

* Things get even more interesting when you throw in machine learning. As a master's degree student, it took me months (mostly during the summer vacation, so I wasn't going to ask my promoter for help) to figure out that the reason I was getting bogus results was a methodological mistake that should have been caught by the type system in a safe language with a well-designed ML library. Here the issue is more "safe" and "well-designed library" than "statically typed", but a powerful type system is required, and it would catch the error in milliseconds instead of hours if it is static rather than dynamic.
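As a rough illustration of the units and NaN points above, here is about the closest plain-Python approximation. `Dollars` and `Seconds` are hypothetical types invented for this sketch. Note that both checks fire only when the offending line runs, which is the limitation being described.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dollars:
    amount: float

    def __post_init__(self):
        # Reject NaN at construction; this is a runtime check, not a static one.
        if math.isnan(self.amount):
            raise ValueError("Dollars cannot be NaN")

    def __add__(self, other):
        # Only Dollars + Dollars is meaningful.
        if not isinstance(other, Dollars):
            return NotImplemented
        return Dollars(self.amount + other.amount)


@dataclass(frozen=True)
class Seconds:
    value: float


total = Dollars(2.5) + Dollars(4.0)
print(total)  # Dollars(amount=6.5)

try:
    Dollars(1.0) + Seconds(30.0)
except TypeError as exc:
    print("unit mismatch caught, but only at runtime:", exc)
```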

> Forget about looping. Just v1 * v2. It will even automatically use SSE instructions.

Many languages have operator overloading and specialization or polymorphism to enable optimizations. In Rust this is again just a case of libraries providing an optimized implementation with an ergonomic API.

> So the complaints you typically see about dynamic languages simply don't matter. No static typing? Who cares? Everything is a ndarray.

Nope. Everything is not just an ndarray. That often works well enough. But when numeric computing gets more complex, you really want a lot more typing.

> Not when you have an interface directly to highly optimized BLAS and ATLAS implementations that have been getting optimized since the 50s and your code is auto-vectorized without you needing to do anything.

Many of those decades-old optimizations are irrelevant or even deoptimizations on modern hardware and with modern workloads. The optimizations need maintenance. In C/C++, optimizations are very expensive to maintain; in Rust we can not only leapfrog outdated optimizations but also maintain optimizations much more cheaply.

Also, as we move into more and more datasets that are medium/large/big (and therefore don't fit into RAM), we're getting more and more optimizations that are very hard to make work over the FFI boundary with Python. The fastest experimental medium data framework at the moment is implemented in Rust and has an incredibly thick wrapper that includes LLVM as a dependency (of the wrapper), since it's basically hot reloading Python and compiling it to a brand new (library-specific) DSL at runtime to get some of the advantages of AOT compilation and to try to recover some of the optimizations that would otherwise be lost across the FFI boundary. Note that this means you now need a very expensive optimized compile on every run, rather than once per release build of the program, though I guess you can do some caching. Note also that it means the maintenance cost of the wrapper quite likely dwarfs the maintenance cost of the library implementation, which is not a good situation to be in.

The fastest Python framework currently in production for medium/large data is probably Dask, but to achieve that performance you need to know quite a bit about software engineering and the Dask library implementation, do quite a bit of manual calculation for optimal chunk sizes, optimal use of reindexing, optimal switching back and forth with numpy, etc., and avoid all the many gotchas where something you expect would work crashes instead and needs a very complex workaround.

In Rust, you can have a much faster library where the compiler handles all of the complexity for you and where everything you think should work actually does work, and that library is already available (though not yet production ready).
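For a sense of what "manual calculation for optimal chunk sizes" can look like, here is a back-of-the-envelope sketch in plain Python. The helper name, the dataset dimensions, and the 128 MiB target are all assumptions chosen for illustration, not Dask's API or a recommendation.

```python
import math


def rows_per_chunk(n_cols, itemsize_bytes, target_chunk_bytes):
    """Rows that fit into one chunk of roughly target_chunk_bytes."""
    bytes_per_row = n_cols * itemsize_bytes
    return max(1, target_chunk_bytes // bytes_per_row)


# Assumed example: 10 million rows x 50 columns of 8-byte floats,
# aiming for chunks of about 128 MiB (an arbitrary target).
n_rows, n_cols, itemsize = 10_000_000, 50, 8
rows = rows_per_chunk(n_cols, itemsize, 128 * 1024 * 1024)
n_chunks = math.ceil(n_rows / rows)
print(rows, n_chunks)
```

The arithmetic itself is trivial; the complaint is that the user has to know to do it, and to redo it whenever the shape, dtype, or available memory changes.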

> Meanwhile, it's also still beginner friendly!

* Is it? I admit its code (but definitely not its toolchain) is marginally better for programming novices (and that margin is important). Importantly, remember that novices don't need to learn about borrow checking/pointers/whatever in Rust either, and that by the time they're that advanced, they need to worry about it in Python as well, but the language provides no tools to help them, so instead of learning concepts, they learn debugging. Rust is lagging in teaching novices mostly due to the lack of a REPL and less truly novice-friendly documentation, IMHO.

* But give Rust a good REPL and a more mature library ecosystem and I cannot imagine Python being any more beginner friendly than Rust for numeric computing. (When "everything is just an ndarray of floats" is good enough, the Rust would look identical to the Python, except for the names of the libraries used, but provide better intellisense and package management. When "just an ndarray of floats" isn't good enough, Rust would have your back and help the beginner avoid stupid mistakes Python can't help with, or express custom domain types that Python cannot express or at least cannot express without losing the advantages of the library ecosystem.)

Don't get me wrong. Right now Python is almost always easier and also more productive for numeric computing. I just don't see it remaining that way if Rust can catch up with its ecosystem and toolchain. But we'll have to wait and see if that will happen.

I can also think of several other domains where Rust is potentially better suited as an intermediate language than the competitors:

* Rust arguably is already there in embedded if you can get access to and afford a Cortex-M. But I think it might actually be capable of beating MicroPython in ease on an Arduino one day. (At least for programmers not already experts in Python.) I won't go into my reasoning since this is already getting long. One day embedded Rust might also compete with C in terms of capabilities (the same or better for Rust) and portability (probably not as good as C on legacy hardware but possibly better on new hardware).

* I think Rust is already a better language than Go for cloud native infrastructure except for its long compile times, and it seems like an increasing number of cloud native infrastructure projects also feel that way. In the meantime new libraries like `lunatic` might be an indication that one day Rust might be able to compete with Go in terms of ease of writing front-ends for beginners.

* Looking at what happens in the Rust game libraries space, I think Rust can definitely be a great intermediate language there one day. It already has a library that aims to take on some beginner gamedev/art libraries in languages like JS/Processing/Go and, at the same time, it has several libraries aiming to be best in class for AAA games.


Python abounds with corner cases and gotchas. It may have fewer than JS/Perl, but that really isn't saying much. It may hide them until a test or real-world use shows you you've stepped on them but that's not always a good thing.
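One well-known example of a gotcha that stays hidden until a second call exposes it is the mutable default argument. The function names below are made up for the sketch.

```python
def add_tag(tag, tags=[]):
    # Gotcha: the default list is created once, when the function is
    # defined, and is then shared by every call that omits `tags`.
    tags.append(tag)
    return tags


first = add_tag("a")
second = add_tag("b")
print(second)           # ['a', 'b'], not ['b']
print(first is second)  # True: both names point at the same list


def add_tag_fixed(tag, tags=None):
    # Common workaround: use None as the sentinel and build a fresh list.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```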



