
I'm just a lowly performance-obsessed dev who uses things like node, php, python, etc. I run very high traffic applications and spend a lot of time buying and building my own servers to try to eke out every ounce of performance.

So can someone who knows about linux kernel internals explain the impact of this research? I read the abstract and some of the paper and it sounds very promising - like we may get significantly more performance out of our cores for multi-threaded or multi-process applications.




You will probably get significantly more performance out of your cores for multi-threaded or multi-process applications if you stop using node, php, python, etc. and use something that's more performance oriented.


Well, one might also get more performance per dollar if they programmed FPGAs, DSPs or GPUs instead of CPUs, and one might also get more performance per dollar if they designed their own hardware. (I do the latter for a living.)

However, "performance of Node, PHP and Python" is a sensible goal in its own right, and I disagree with the implication of your comment, and that of sister comments, that it is not a sensible goal. There's a lot of useful code written in Node, PHP and Python, and moreover, this might remain true for a long while because "something more performance oriented" is likely to be less programmer productivity oriented in that a correct, easy to use program or library will take more time to get done. Also, Node and Python specifically can be damn fast (numpy for instance is unbeatable for large linear algebra programs, because it uses optimized libraries under the hood, etc. etc.)

And some things simply can't be done in a satisfactory fashion in anything but a dynamic language, any more than you can get Apache to run on a GPU. "Dynamic" is a feature, not just a source of overhead.

So "a performance-obsessed scripting language developer" is a perfectly fine way to describe oneself IMO.


Aren't all three of those languages single-threaded? So the only way to distribute work is to run one copy per core and distribute on a per-request basis?


Python, at least, uses actual OS-level threads. However, it also uses a global interpreter lock (GIL), so only one thread can execute Python code at a time.

But when writing Python modules in C, you have control over acquiring and releasing the GIL, so before starting some long-running operation, you can give up the lock.

Node, AFAIK, uses several OS-level threads under the hood for disk I/O. And with PHP, a web server probably will run multiple threads for handling requests concurrently.

So the impact might not be as big as for performance-oriented code in C/C++, but it is not necessarily nil, either.
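
To put the GIL point in concrete terms, here's a rough sketch in plain CPython (nothing framework-specific assumed): four threads running pure-Python bytecode take roughly four times as long as one, while four threads whose work releases the GIL finish in about the time of one. time.sleep stands in below for blocking I/O or a C extension that wraps its work in Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS.

    import threading
    import time

    def cpu_bound(n=5_000_000):
        # Pure-Python loop: holds the GIL, so threads running this are
        # serialized by the interpreter.
        total = 0
        for i in range(n):
            total += i
        return total

    def gil_releasing(seconds=1.0):
        # time.sleep drops the GIL, so several of these overlap.
        time.sleep(seconds)

    def timed(target, count=4):
        threads = [threading.Thread(target=target) for _ in range(count)]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.perf_counter() - start

    print("4 CPU-bound threads:     %.2fs" % timed(cpu_bound))
    print("4 GIL-releasing threads: %.2fs" % timed(gil_releasing))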


With Python, numpy for instance will use multiple threads under the hood, even though the calling Python code might be single-threaded, and numpy's execution is completely unaffected by the GIL. Incidentally, to get Python code to run really fast, you'll have to offload most of the heavy lifting into libraries in any case. But it's still a Python program, in that most of the source lines - especially the ones unique to the program as opposed to library code - will still be in Python, and in that in some cases you couldn't have written the program in a "more performant" language and gotten either the performance or the flexibility. (Go for instance doesn't have operator overloading, nor can it be used to implement linear algebra as efficiently as C/assembly; so a pure Go program doing linear algebra will be both slower and uglier than a Python program using numpy. A Go program using a C/assembly library will be very marginally faster than said numpy program, and just as ugly as the pure Go program.)

Also, in my understanding TFA applies to multiple processes just as much as multiple threads.
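
A small sketch of what that looks like in practice (assuming numpy is installed and linked against a multi-threaded BLAS such as OpenBLAS or MKL, which most builds are): the matmul below may already fan out across cores inside the library, and because large numpy operations release the GIL, even Python-level threads doing this work can genuinely overlap.

    import threading
    import numpy as np

    a = np.random.rand(2000, 2000)
    b = np.random.rand(2000, 2000)

    def heavy_matmul():
        # The actual work happens in compiled BLAS code, which releases
        # the GIL, so these threads are not serialized by the interpreter.
        a.dot(b)

    threads = [threading.Thread(target=heavy_matmul) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

The calling code is still a Python program in every meaningful sense; the parallelism lives in the library.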


A recent alternative is to write the entire program in Julia. It is a lot less ugly than numpy, and so performant that most code (other than steadfast libraries such as BLAS and LAPACK) is written in Julia itself. http://julialang.org/


>Aren't all three of those languages single-threaded? So the only way to distribute work is to run one copy per core and distribute on a per-request basis?

To clarify: languages are never, by definition, single-threaded. The reference implementations largely are.

Some background: in Python's case, PyPy supports STM, which removes the global interpreter lock while retaining backwards compatibility with existing code.

So the answer to your question is no: none of the three is inherently single-threaded.


Facebook's HHVM runs php interpreters as threads.


I need to hire developers that can work on these applications. For specific tasks, like c10k, node works incredibly well, and where it falls short, nginx frequently comes to the rescue with a similar architecture.

But you're always going to win this argument by suggesting a lower-level solution, until we arrive at hand-coding assembly optimized for specific bare metal.


I'd suggest giving Elixir a try if you're trying to build things for extremely high concurrency; it will handle M:N scheduling across multiple CPUs for you automatically (and across multiple machines slightly less automatically).


This research applies to 'NUMA' systems: commonly, servers with multiple physical CPUs that each have a connection to their own memory banks. A CPU can access the other CPU's memory by requesting it, but that takes time, so the process scheduler has to take that into account, usually by keeping processes somewhat affixed to the node where they were started.
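
As a rough, Linux-only illustration of that affinity idea, you can do by hand what the scheduler tries to do automatically and pin a process to one node's cores. The CPU numbers below are made up; which cores belong to which NUMA node depends on the machine (lscpu or /sys/devices/system/node/ will tell you).

    import os

    # CPUs this process is currently allowed to run on.
    print("before:", os.sched_getaffinity(0))

    # Pin the current process (pid 0 = self) to CPUs 0-3, assuming here
    # that those happen to be the cores of NUMA node 0.
    os.sched_setaffinity(0, {0, 1, 2, 3})

    print("after:", os.sched_getaffinity(0))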

Off-topic, but high-performance long-running processes are mainly programmed in C, C++ & Java. Maybe stuff like Rust and Swift in the future. Fortran if you are doing mathematical computation, but then you'd probably already be using it if you needed it.

For what I estimate you mean by high traffic on PHP or node systems across multiple servers, you probably want to look at Elixir and its Phoenix web framework. It's more appropriate for responsiveness (as in low latency), and has less boilerplate than Java. |> http://www.phoenixframework.org/docs/overview


> I'm just a lowly performance obsessed dev

> who uses things like node, php, python, etc.

pick one


Actually there is causality there. I think he became performance obsessed AFTER he picked node, php and python, because performance became a problem.


You're more likely to write poor code that contains inefficiencies than to have V8 or PyPy become your performance bottleneck.



