
Pony Performance Cheatsheet - spooneybarger
https://www.ponylang.org/reference/pony-performance-cheatsheet/
======
CJefferson
I'm disappointed (in Pony) that one suggestion is to use a tuple instead of a
class, and in general to avoid making little classes.

One often overlooked benefit of C++ (and I am sure other languages) is that I
can wrap one or two ints in a class, add some constructors and little member
functions, and be fairly confident the whole lot will get inlined and compiled
away.
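A minimal sketch of that idea in Rust, for concreteness (the `Meters` newtype is hypothetical): the wrapper adds no space over the raw integer, and its little methods are trivially inlined away by LLVM.

```rust
// Hypothetical newtype: one u32 wrapped in a struct with small methods.
#[derive(Copy, Clone, PartialEq, Debug)]
struct Meters(u32);

impl Meters {
    fn new(v: u32) -> Self { Meters(v) }
    fn plus(self, other: Meters) -> Meters { Meters(self.0 + other.0) }
    fn value(self) -> u32 { self.0 }
}

fn main() {
    // No space overhead over the raw u32; the methods above compile
    // down to the same code as plain integer arithmetic.
    assert_eq!(std::mem::size_of::<Meters>(), std::mem::size_of::<u32>());
    let total = Meters::new(2).plus(Meters::new(3));
    assert_eq!(total.value(), 5);
}
```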

~~~
trishume
Indeed, this is a good document for Pony (and a lot applies to other languages
like OCaml as well!) but it shows what programming in a language like Rust or
C++ that has many zero-cost abstractions gets you.

In Rust I can _always_ use wrappers, aliases and union types, and idiomatic
error handling. And since the abstractions are either zero-overhead or will
reliably be optimized away by LLVM, I don't have to worry about sacrificing
performance.

~~~
pjmlp
This is not 100% true; just look at Google's efforts to optimize string
allocation in Chrome.

Just because a language offers zero-cost abstractions doesn't mean
performance comes for free; one needs to know how to use them properly.

And be clever enough to choose libraries that are also designed with
performance in mind.

~~~
nine_k
In Rust or C++, you can make the abstractions zero-cost with some effort. In
many other languages, you can't, no matter the effort.

~~~
pjmlp
Yes, but that "some effort" also has a development cost, which might be
worthwhile or not, depending on the use case.

Going a bit off-thread, Ada, Object Pascal, Active Oberon, D, Nim, Modula-3,
Swift also offer such capabilities.

C# also has plans on its roadmap to adopt more features from System C# and
Midori, especially when coupled with .NET Native and CoreRT.

------
infradig
I only got as far as the string concatenation example. That languages like
Pony, C++, and no doubt many others make programmers jump through these
performance hoops for common-case scenarios is unfortunate.

~~~
beagle3
The problem is inherent in the imperative style.

The programmer SPECIFICALLY ASKED for all the partial results, and it turns
out that doing what was asked is sub-optimal, to the tune of O(n) allocations
of O(n) each and O(n^2) runtime as opposed to the optimal O(1) allocations of
O(n), and O(n) runtime.

It is extremely hard to make the compiler transform the sub-optimal version to
the optimal one in the general case. Sure, one can special case strings (or
std::string, or whatever) and do that without changing the semantics --
Python, for example, does that with string appends. However, none of Python,
C++ or Rust will do that for a user-defined type.

The right thing is to make sure that the idiomatic way to concatenate strings
is, in fact, efficient. Python does that by making the idiom
"separator.join(string_list)", which is as efficient as it can be.
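Rust's standard library ships the same idiom; a quick sketch:

```rust
fn main() {
    let parts = ["usr", "local", "bin"];
    // Like Python's separator.join(string_list): size the result once,
    // allocate once, then copy the pieces in.
    assert_eq!(parts.join("/"), "usr/local/bin");
}
```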

There's a lot of performance gains to be made by transforming unreasonable
code to reasonable code, which is why compilers still do it; and unreasonable
code just as often results from mountains of abstractions as it does from
programmer unawareness. But IMHO our goal should be that the norm is that
programmers are expected to write reasonable code, not that compilers are able
to optimize unreasonable code.

~~~
wruza
>"separator.join(string_list)", which is as efficient as it can be.

If you store strings in an array, it is not efficient to concatenate them at
all. Strings could be implemented internally as arrays of strings (rope-like),
so that none of the (insert, delete, append, prepend) operations would hurt
performance so much. Editors actually do that; your text buffer is not a
single string that memmoves every time you type a char.
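A toy illustration of the editor trick alluded to here, in Rust: the two-stack variant of a gap buffer keeps a hole at the cursor, so typing or deleting a character is O(1) rather than shifting the whole tail of one contiguous string. (Hypothetical, byte-oriented, and minimal for brevity.)

```rust
// Two-stack gap buffer: text left of the cursor in one Vec, text right
// of the cursor in another Vec stored reversed.
struct GapBuffer {
    before: Vec<u8>, // bytes left of the cursor
    after: Vec<u8>,  // bytes right of the cursor, reversed
}

impl GapBuffer {
    fn new() -> Self { GapBuffer { before: Vec::new(), after: Vec::new() } }
    fn insert(&mut self, b: u8) { self.before.push(b); } // O(1) amortized
    fn delete(&mut self) { self.before.pop(); }          // backspace, O(1)
    fn move_left(&mut self) {
        if let Some(b) = self.before.pop() { self.after.push(b); } // O(1)
    }
    fn to_string(&self) -> String {
        let mut out = self.before.clone();
        out.extend(self.after.iter().rev());
        String::from_utf8(out).unwrap()
    }
}

fn main() {
    let mut buf = GapBuffer::new();
    for b in b"helx" { buf.insert(*b); }
    buf.delete();      // backspace the stray 'x' in O(1)
    buf.insert(b'o');  // buffer now reads "helo"
    buf.move_left();   // cursor between "hel" and "o"
    buf.insert(b'l');  // fix the typo without moving the tail
    assert_eq!(buf.to_string(), "hello");
}
```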

Strings as they are today almost everywhere are simply of bad design.

~~~
beagle3
Depends on your use case. If you use a rope or tree representation (like most
editors do these days), random access is O(log n) at best, in many
implementations typically O(sqrt n) or even O(n), whereas concatenation string
is O(1).

I don't think that it is bad as a general compromise between editor buffers
(lots of inserts, deletes, and concats) and symbols (immutable after
creation), the vast majority of which are under 200 chars (file names, URLs,
actions, text fields).

Sure, life would be better if every standard library supported a variety of
string types for specific uses. But they don't, and I suspect that it is
mostly due to cognitive overload (that is, it is a PEBKAC, not a technical
issue).

~~~
fnord123
Not to disagree, but you can barely perform random access on a UTF-8 string.
You need to explode it out to UTF-16 or UTF-32, which isn't what most
languages have built in. Rust and Go largely work with UTF-8, while C and C++
love them byte arrays (not sure I've even seen std::wstring in the wild).

~~~
burntsushi
This is misleading. UTF-16 doesn't actually provide random access, because
codepoints outside the basic multilingual plane are encoded with two UTF-16
code units (4 bytes). UTF-32 guarantees 4 bytes for every codepoint, which is
quite wasteful, but even then, random access by codepoint is generally a bad
idea because codepoints and graphemes aren't synonymous.
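A quick check of the surrogate-pair point in Rust (U+1D11E, the musical G clef, sits outside the BMP):

```rust
fn main() {
    // One codepoint, but two UTF-16 code units (a surrogate pair)
    // and four UTF-8 bytes.
    let clef = '\u{1D11E}';
    assert_eq!(clef.encode_utf16(&mut [0u16; 2]).len(), 2);
    assert_eq!(clef.len_utf8(), 4);
    // An ASCII char is one code unit either way, so indexing UTF-16
    // by code unit is not indexing by codepoint.
    assert_eq!('a'.encode_utf16(&mut [0u16; 2]).len(), 1);
}
```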

> but you can barely perform random access on a utf8 string.

This really isn't true, or at least isn't a problem in practice. If you need
indices into UTF-8 strings, then you can record them by decoding the string.
This is sufficient for most string-related algorithms except for the "give me
the first N characters" variety, which actually turns out to be a relatively
good thing, since "give me the first N characters" should require applying
Unicode's grapheme algorithm, which is never amenable to random access in
UTF-8, UTF-16, or UTF-32.
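The record-indices-while-decoding idea, sketched in Rust: decode once, keep the byte offset of each codepoint, and later lookups into the UTF-8 buffer are plain O(1) byte indexing.

```rust
fn main() {
    let s = "héllo"; // 'é' occupies two UTF-8 bytes
    // One O(n) decoding pass records the byte offset of every codepoint.
    let offsets: Vec<usize> = s.char_indices().map(|(i, _)| i).collect();
    assert_eq!(offsets, vec![0, 1, 3, 4, 5]);
    // Slicing at a recorded offset is a valid O(1) operation afterwards.
    assert_eq!(&s[offsets[2]..], "llo");
}
```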

~~~
fnord123
> random access by codepoint is generally a bad idea because codepoints and
> graphemes aren't synonymous.

That's a good point.

>If you need indices into UTF-8 strings, then you can record them by decoding
the string.

Sure but the context is that I was responding to "If you use a rope or tree
representation (like most editors do these days), random access is O(log n) at
best, in many implementations typically O(sqrt n) or even O(n), whereas
concatenation string is O(1).".

Decoding the string is O(n); not O(1) (if GP meant "concatenation string" as
in a string where the text is concatenated into a single buffer - if GP meant
"concatenating a string in a rope or tree is O(1)" then I'm off on a wild
tangent).

~~~
beagle3
But you only need to decode it once (or otherwise receive those indices
without even decoding), whereas random access into a rope/tree is always
O(log n) or O(n).

Use case is everything.

~~~
fnord123
>Use case is everything.

amen.

------
panic
They talk about profiling at the end -- it would be nice to see numbers for
how much faster or slower each of the code snippets is. It's hard to tell how
much you should care about "boxing machine words", for example, especially
when the faster version makes your code less maintainable. Is this only
something to worry about if you have millions of items in your array?

------
nerdponx
This sounds like a good argument for having macros in the language.

Also, how hard is it for the compiler to optimize these cases? Why are they
zero-cost in C++ and not in Pony?

------
bhauer
This is a great summary of performance-oriented tips that are in many cases
generally applicable, even beyond the specific context of Pony.

