
Besides the existence of __getattr__, __add__, etc. that other people mentioned, there's also:

- A Python runtime has to support threads + shared memory, while a JS one doesn't. JS programs are single-threaded (w/ workers). So in this sense writing a fast Python interpreter is harder.

- The Python/C API heavily constrains what a Python interpreter can do. There are several orders of magnitude more programs that use it than use v8's C++ API. For example, reference counts are exposed with Py_INCREF/DECREF. That means it's much harder to use a different reclamation scheme like tracing garbage collection. There are thousands of methods in the API that expose all sorts of implementation details about CPython.

Of course PyPy doesn't support all of the API, but that's a major reason why it isn't as widely adopted as CPython.

- Python has multiple inheritance; JS doesn't

- In Python you can inherit from builtin types like list and dict (as of Python 2.2). In JS you can't.

- Python's dynamic type system is richer. typeof(x) in JS gives you a string. type(x) in Python gives you a type object which you can do more with. And common programs/frameworks make use of this introspection.

- Python has generators, Python 2 coroutines (send, yield from), and Python 3 coroutines (async/await).

In summary, it's a significantly bigger language with a bigger API surface area, and that makes it hard to implement and hard to optimize. As I learn more about CPython internals, I realize what an amazing project PyPy is. They are really fighting an uphill battle.
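To make a few of the points in the list above concrete, here is a minimal sketch. The class names (Celsius, Lazy, Stack) and the describe helper are made up for illustration; only the hooks themselves (__add__, __getattr__, subclassing builtins, type()) come from the discussion.

```python
class Celsius(float):
    """Subclass of a builtin type that also overloads an operator."""

    def __add__(self, other):
        # The interpreter has to look up __add__ on the type at run time.
        return Celsius(float(self) + float(other))


class Lazy:
    """__getattr__ is called only when normal attribute lookup fails."""

    def __getattr__(self, name):
        return f"computed:{name}"


class Stack(list):
    """Inherits from the builtin list and adds one method."""

    def push(self, item):
        self.append(item)


def describe(x):
    # type(x) returns a type object, not a string, so it can be inspected.
    t = type(x)
    return t.__name__, [base.__name__ for base in t.__mro__]


total = Celsius(20.0) + 1.5
stack = Stack()
stack.push(1)
print(type(total).__name__, Lazy().anything, describe(stack))
```

An optimizer cannot assume that `a + b` on floats or `obj.attr` on an instance does the obvious thing, because any of these hooks may be in play.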

> Python has multiple inheritance; JS doesn't

All these specific examples are true statements, yet isn’t Common Lisp even more dynamic, and doesn’t it often have even better optimizing compilers?

Lisp has multiple inheritance and also multiple dispatch, and SBCL beats CPython by a country mile in every performance comparison I’ve seen.

> SBCL beats CPython

I always find it ironic that the CMUCL Lisp compiler (upon which SBCL was based) was called 'the python compiler', had machine-code generation in 1992, and that CMUCL sports native multithreading that is largely lock-free.


>SBCL beats CPython by a country mile in every performance comparison I’ve seen.

This. Lisp is at least 10x faster in the worst case; it can be made to run even faster...

Hm that's a good question, not sure. Do Common Lisp implementations have a "core" that multiple inheritance and multiple dispatch can be desugared to? Or are those features "axiomatic" in the language?

If it's the former, I would say that optimizing a small core is easier than optimizing a big language. Python's core is 200-400K lines of C and there are a lot of nontrivial corners to get right.

I was surprised when looking at Racket's implementation that it's written much like CPython. IIRC it was more than 200K lines of C code. Some of that was libraries but it's still quite big IMO. I would have thought that Racket, as a Scheme dialect, would have a smaller core.

AFAIK Racket is not significantly faster than Python; it's probably slower in many areas. Maybe it's just that SBCL put a focus on performance from the beginning?

(I looked at Racket since I heard they are moving to Chez Scheme, which also has a focus on performance.)

I expect Racket to be faster than CPython:


Note that the old C runtime of Racket has been rewritten to use Chez Scheme. The work is not completely done - but it is getting close.

Talk by Matthew Flatt on the rewrite (almost a year old): https://www.youtube.com/watch?v=t09AJUK6IiM

>Do Common Lisp implementations have a "core" that multiple inheritance and multiple dispatch can be desugared to?

Yes, that's the Meta Object Protocol. It isn't in the standard, yet most Lisp implementations have it, and now you can use it in a portable way as well.

>If it's the former, I would say that optimizing a small core is easier than optimizing a big language. Python's core is 200-400K lines of C

Common Lisp's "core" (that means, not including "batteries") is considerably more involved and complex than Python's. Creating a new CL implementation is a big deal.

> (I looked at Racket since I heard they are moving to Chez Scheme, which also has a focus on performance.)

To wit, Chez Scheme was several generations into improving its native code compilation abilities before Python even existed:

History of Chez Scheme (2006):


A while ago someone made this Reddit post:


The user was comparing some Ackermann computations using Python and GNU Common Lisp (GCL), finding the performance about the same.

But he wasn't compiling the Lisp! So this compared GCL's Lisp raw AST interpreter to Python byte-code.

Any future "super speed" Python efforts would probably do well to build on the amazing work that PyPy has done in teasing apart an optimization-friendly subset of the language in the form of RPython, and building the rest of it in that language.

Like, focus on further optimizing the RPython runtime rather than starting from scratch.

That doesn't really make sense -- there is no "RPython runtime". There is a PyPy runtime written in RPython.

RPython isn't something that's exposed to PyPy users. It's meant for writing interpreters that are then "meta-traced". It's not for writing applications.

It's also not a very well-defined language AFAIK. It used to change a lot and only existed within PyPy.

I'm pretty sure the PyPy developers said that RPython is a fairly unpleasant language to write programs in. It's meant to be meta-traceable and fast, not convenient. It's verbose, like writing C with Python syntax.

Why does RPython exist? It seems to be a subset of Python that can be optimized.

Why not use a faster language to write the interpreter, like C?

This is not meant to be a hostile question, I am just confused as to why PyPy exists

The PyPy interpreter is written in RPython, but is a full Python interpreter with a JIT. When you compile PyPy, it generates C files from RPython sources, which are then compiled with a normal C compiler into a standalone binary.

RPython is both a language (a very ill-defined subset of Python... pretty much defined as "the subset of Python accepted by the RPython compiler"), and a tool chain for building interpreters. One benefit of writing an interpreter in RPython is that, with a few hints about the interpreter loop, it can automatically generate a JIT.

Basically because it can be "meta-traced", and C can't (at least not easily).

The whole point of the PyPy project is to write a more "abstract" Python interpreter in Python.

VMs written in C force you to commit to a lot of implementation details, while PyPy is more abstract and flexible. There's another layer of indirection between the interpreter source and the actual interpreter/JIT compiler you run.

See PyPy's approach to virtual machine construction


This sentence explains it best:

Building implementations of general programming languages, in particular highly dynamic ones, using a classic direct coding approach, is typically a long-winded effort and produces a result that is tailored to a specific platform and where architectural decisions (e.g. about GC) are spread across the code in a pervasive and invasive way.

Normal Python and PyPy users should probably pretend that RPython doesn't exist. It's an implementation detail of PyPy. (It has been used by other experimental VMs, but it's not super popular.)

> optimization-friendly subset of the language in the form of RPython, and building the rest of it in that language.

I'm assuming 99% of normal python users are not using anything outside of the RPython subset, correct?

RPython's restrictions [1] are quite strict - I'd say it's more likely that 99% of normal Python does use features that aren't supported by RPython.

[1] https://rpython.readthedocs.io/en/latest/rpython.html

The libraries they depend on probably are, though.

>Python has multiple inheritance; JS doesn't

I don't think multiple inheritance is a performance issue. A class's resolution order is resolved when it's defined (using C3: https://en.wikipedia.org/wiki/C3_linearization), and after that it's only a matter of following it, like Javascript's prototype chain.
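As a small illustration of the point above, the resolution order of a diamond hierarchy is available as a precomputed tuple on the class. The class names A, B, C, D are arbitrary placeholders:

```python
class A:
    def who(self):
        return "A"


class B(A):
    def who(self):
        return "B"


class C(A):
    def who(self):
        return "C"


class D(B, C):
    pass


# C3 linearization is stored on the class when it is defined.
mro_names = [klass.__name__ for klass in D.__mro__]
print(mro_names)

# Method lookup just walks that tuple and takes the first match.
print(D().who())
```

Here the order is D, B, C, A, object, so `D().who()` finds the method on B.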

No, basically everything is dynamic in Python. Both objects and types are mutable after definition:


    m1 Sub
    m2 C
    Changed type of object:
    m1 C
    m2 C
    m1 Sub
    m2 C
    Changed superclass of type:
    m1 Sub
    m2 unrelated
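The snippet that printed the output above is not preserved here. A minimal sketch that would print the same lines, assuming hypothetical classes C, Sub and Unrelated (names guessed from the output, not taken from the original post):

```python
class C:
    def m1(self):
        return "m1 C"

    def m2(self):
        return "m2 C"


class Sub(C):
    def m1(self):
        return "m1 Sub"


class Unrelated:
    def m2(self):
        return "m2 unrelated"


lines = []


def show(obj):
    # Record what each method resolves to right now.
    lines.append(obj.m1())
    lines.append(obj.m2())


obj = Sub()
show(obj)

lines.append("Changed type of object:")
obj.__class__ = C          # an existing instance gets a new type
show(obj)

other = Sub()
show(other)                # Sub itself is unchanged so far

lines.append("Changed superclass of type:")
Sub.__bases__ = (Unrelated,)   # the class gets a new parent
show(other)

print("\n".join(lines))
```

Both assignments are legal in CPython as long as the object layouts are compatible, which is the case for plain classes like these.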

Objects and types are mutable after definition, but that's no more severe than what you can do in Javascript. Assigning to .__class__ is like assigning to .__proto__, and assigning to a class's .__bases__ is more or less like assigning to a prototype's .__proto__.

The resolution order is calculated when it's defined. It's calculated again whenever you assign to __bases__ (or a superclass's __bases__). But it's not calculated every time it's used, which means there's no significant performance penalty to multiple inheritance unless you're changing a class's bases very often.

Metaclasses can override the MRO calculation, which we can abuse to track when it's recalculated: https://pastebin.com/NdiA12Ce

    Defining Baz
    ! Computing MRO
    Instantiating Baz
    Accessing attribute
    Changing Baz's bases
    ! Computing MRO
    Accessing attribute

Doing ordinary things with the class or its instances doesn't trigger any calculation related to multiple inheritance. You only pay for that during definition or redefinition. So there's no performance problem there compared to Javascript.
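The pastebin code is not reproduced here. A sketch of the same trick, assuming hypothetical names TrackingMeta, Foo and Bar (only Baz appears in the output above), could look like this:

```python
events = []


class TrackingMeta(type):
    def mro(cls):
        # CPython calls this hook whenever it (re)computes the MRO.
        events.append("! Computing MRO")
        return super().mro()


class Foo:
    attr = "foo"


class Bar:
    attr = "bar"


events.append("Defining Baz")


class Baz(Foo, metaclass=TrackingMeta):
    pass


events.append("Instantiating Baz")
baz = Baz()

events.append("Accessing attribute")
before = baz.attr

events.append("Changing Baz's bases")
Baz.__bases__ = (Bar,)

events.append("Accessing attribute")
after = baz.attr

print("\n".join(events))
```

The hook fires once at definition and once on the `__bases__` assignment, and not on instantiation or attribute access.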

I do agree that basically everything is dynamic in Python. But some things are more dynamic than others.

Hm yeah I see what you mean. I don't know the details of how v8 deals with __proto__, but I can see in theory they are similar.

Though I think the general point that Python is a very large language does have a lot to do with its speed / optimizability. v8 is just a really huge codebase relative to the size of the language, and doing the same for Python would be a correspondingly larger amount of effort.

I don't know the details but v8 looks like it has several interpreters and compilers within it, and is around 1M lines of non-test code by my count!

v8 was written in the ES3 era. And I knew ES3 pretty well and Python 2.5-2.7 very well, which were contemporary. I'd guess at a minimum Python back then was a 2x bigger language, could be even 4x or more.

I agree, I was just nitpicking one detail. I made a similar comment: https://news.ycombinator.com/item?id=20953496

> reference counts are exposed

IIUC this is also true for PHP but HHVM has/had some interesting techniques to deal with it, like pairing up and cancelling out reference count operations, and bulk changing the reference count before taking a side exit or calling a C function.

You can definitely inherit from Array and Map in JavaScript.
