Perl 11 (perl11.org)
304 points by totalperspectiv 23 days ago | 324 comments

My perspective is that I'm glad that they are entertaining themselves, but they aren't going to succeed.

It is easy to run a subset of Perl 5 faster than Perl 5. But when you add in all of the features, beating Perl 5 on speed becomes hard.

It always was the dream to integrate Perl 5 and Perl 6. But cleaning up the internals of Perl 5 until ordinary developers could work on it would be a several-month project that only a handful of people in the world have the expertise to do.

And even if you did that, how do you reconcile Perl 5 reference counting with Perl 6 true garbage collection? The two do not cooperate very well, and a lot more Perl 5 code than you would think relies on things like reliable timing of destruction. On this issue the community has generally been divided into people who think that this won't be doable, and people who think that some day a smart solution will be produced. To date, no smart solution has emerged.
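To make the "reliable timing of destruction" point concrete, here is a minimal sketch in Python rather than Perl. It assumes CPython, which combines reference counting with a separate cycle collector; the `Resource` class and `demo` function are made up for illustration. Without a cycle the destructor runs at the moment the last reference goes away, which is the behaviour a lot of Perl 5 code leans on with DESTROY. With a cycle, destruction waits for a collector pass.

```python
import gc


class Resource:
    """Records when it is destroyed so the timing can be observed."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.partner = None  # optionally points at another Resource

    def __del__(self):
        self.log.append(f"destroyed {self.name}")


def demo():
    log = []
    gc.disable()  # keep the cycle collector from running on its own
    try:
        # Case 1: no cycle. The reference count hits zero at `del`,
        # so the destructor runs immediately (CPython behaviour).
        a = Resource("a", log)
        del a
        log.append("after del a")

        # Case 2: a reference cycle. The counts never reach zero, so
        # nothing is destroyed until a collector pass finds the garbage.
        b = Resource("b", log)
        c = Resource("c", log)
        b.partner, c.partner = c, b
        del b, c
        log.append("after del b, c")
        gc.collect()
        log.append("after gc.collect()")
    finally:
        gc.enable()
    return log


events = demo()
print(events)
```

The order in which `b` and `c` are finalized inside the collector pass is not something to rely on, which is exactly the kind of guarantee that gets lost when moving from pure reference counting to a tracing collector.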

Also Perl 6 was a wonderful dream to entertain people, but its real world adoption is... limited. Like it or not, Perl 6 is effectively here and is named Python 3. No, it doesn't look like the Perl 6 people thought it would, but sometimes that is life.

I'm generally in agreement with everything you state. I do think, however, that with the appropriate amount of engineering, one could build an incremental JIT compiler that sped up common base idioms meaningfully while executing the wacky stuff using the old perl VM. (I.e. more like Psyco, as opposed to a full reimplementation such as PyPy.)

Yes, there's a whole slew of things you can't do in that case - the old guts and data structures would permeate it. It would be a bit of a horrible maintenance burden. And it's generally not worth it (anymore). There are simply better options available.

I played around with something like that, wow, the better part of a decade ago. The approach was to go from the OP tree back to something that had a few of the execution specifics undone such that it resembled an AST a bit more. Then find sections that could be compiled by the partial implementation and convert those. Tracing facilities were there just at a function level. We'd settled on optional type annotations using a keyword plugin that would populate a structure attached to the CV (function struct) with that information as it went through regular perl compilation. Of course, this falls short of any sort of interesting compiler work.

Anyway, my point is that I think it'd be doable for a sufficiently motivated and skilled (and funded) team of folks, but there's very little economic incentive to do so.

Several misunderstandings. we are not implementing a subset of perl5, we are continuing the development of perl5, which stopped around 2002. there are many more features in cperl than in perl5. perl5 is stripping its features, because they are not able to fix the bugs. cperl on the other hand fixes them, and constantly adds new features.

we don't entertain ourselves for fun, we do it for the technical necessity, because p5p is not able to, not willing, left the path to perl6 already, and broke up all cooperation. we need to fixup all the damage done by p5p, because nobody else is doing it.

perl6/rakudo is a nice concept but not going anywhere. the architecture is just too broken to be realistically fast enough in production. but who knows. moar is nice. and we've seen python 3 in production, and compared to that perl6 is a godsend.

cperl will not switch to a gc soon, neither will it provide proper lockless threading. simple goals first, like an object system, types, ffi, async/await, regexp, match, hashtable, inlining, jit, symbol table, ...

cperl easily beats perl5 in performance, but is investing it into better security checks. it's the only language with proper Unicode support already. with the jit and the inliner it will match php7. i.e. not 20x slower, just 10x slower, which is fine.

it's not a dream, it works and is used in production. it is also recommended in production over perl5.

rperl already matches C++ performance.

Can I install cperl via plenv maybe? I'd like to test out production code on it and my prime number crunching benchmark.

https://perlbrew.pl/ can do it.

    $ perlbrew available | grep cperl

    $ perlbrew version
    perlbrew  - App::perlbrew/0.84

I'd be interested in this as well. I hadn't looked into cperl in depth until today and it's starting to seem appealing after reading a few of the author's posts on reddit and the perl11 blog.

> the architecture is just too broken to be realistically fast enough in production

How so?

The startup overhead, the nqp overhead.

The startup overhead is caused by the idea to have the stdlib in perl6 source code (similar to mature languages like common lisp), and that the compiler will be good enough to produce good enough code. But the sigs of all methods need to be dumped somehow into the binary, and this is done in a beyond-naive way. Emacs and Common Lisp solved this problem pretty well by dumping a native image of the compiled code, java has a good native class layout, but rakudo is just slow.

The nqp overhead is when you look at the compiled moar code and compare the bytecode to a fast vm. It's still bad code. moar itself is fine, but the rakudo and nqp layers are not. Same for the new jit template idea, which does not scale. At all. There are so many people with so many bad ideas who are not willing to listen to more experienced devs, so I stopped caring also. It's not worth it, esp. after the parrot debacle. But perl6 people are generally very nice, competent and open, in total opposite to p5p.

perl11 exists because the perl community is full of wacky crazy people who like to shoot for the moon.

Realistically, though, it's not something anybody actually writing perl5 or perl6 cares about or expects to achieve anything. Basically 'urbit with $ signs'

Yeah... I was going to say urbit is urbit with dollar signs, but they are some kind of crypto-token signs, not dollars...

I'm pretty sure cperl is doing exactly that though: http://perl11.org/cperl/STATUS.html

cperl is a trainwreck that only exists because the author forked perl5 because he was removed from the maintenance list because he couldn't stop insulting people.

Don't bother pretending it'll ever be a viable project - periodically he sends patches trying to force perl5 modules to change their code because his type checking code doesn't work, then follows it up with a volley of insults when the maintainers go "huh?"

cperl is not a trainwreck, only perl5 is. cperl exists because the perl5 maintainers are totally incapable of doing their job, and risking the jobs of thousands of fellow perl5 developers. Most of them already switched to something else in the last years. cperl on the other hand implemented most missing features from the last 15 years in 2-3 years, fixed the worst mistakes, and is 10-30x faster and better developed than perl5. perl5 is dead, it is not going anywhere. You had your chance, but you failed. Now the only thing you have left is doing the culture war thing, throwing around CoC accusations. Having no faith, insulting the devs.

Typechecking errors? There are occasional errors because of adding more strictness and warnings from perl6 (strict names, hashpairs, ...) and some internal test and bignum modules are typed for 2x faster performance and to catch typical errors at compile-time. There's an API, and the types reflect that. The problem is that the implementations and the users don't care about the API at all, and neither about typechecks catching these errors.

There are no volleys of insults to any maintainers at all. You still don't get the difference between necessary technical and professional criticism and personal attacks. In fact p5p is throwing around personal attacks and insults all the time. E.g. you are one of the main examples of immature racism in your very public YAPC talks, accusing all Germans of being Assholes (in allcaps) for years. Thought about that once? Why do you think you were not accepted as pumpkin and again the technically most incapable person was elected?

p5p had their chance to do anything with the language in the last 15 years, they had a proper design and spec and sister languages doing the same. They did nothing, all attempts failed and they blocked all improvements from outside. There's no proper management, no process. Either you are committer, then you can do what you want, or not, then you may not do anything. Well, in fact there are only 4-6 bad apples at the top. The rest is doing good, but silent. But the TPM board is protecting the bad apples, promoting the most incompetent, they are even collecting the worst of them. Only if you managed to completely fail a huge project you are the perfect member for the board. Only the most unsuccessful culture warriors are lining up there.

perl5 is not recommended to be used in anything serious anymore. There are dozens of serious bugs and design errors not fixed, and errors and destruction being added every new release. I have no time to file all the CVEs, look at the cdelta's. 90% of the decisions are wrong. The new code of conduct is being misused to silence valid technical criticism, because perl5 is now a religion, and you may not distrust the leaders. This was literally the explanation.

(rurban is the author of cperl and technically brilliant)

I did, indeed, have a couple of slides in my talks where I said "if you think somebody's an asshole, you could be wrong, maybe they're actually just german" - referencing the direct german style of communication, which I personally rather like.

Taking this exactly backwards and then calling it "immature racism" is ... precisely the sort of thing I was referring to, as is deriding everybody you've had technical disagreements with as "unsuccessful culture warriors".

I find this a shame, but we've discussed it more than once over the years and given I haven't convinced you yet, I find it unlikely I ever will.

I'm German and I find that funny.

It's not that I cannot be offended by stupid stereotyping and attacks on Germans (Erik Naggum had some really vile posts in that regard), but the above is not even on my top ten list of things bothering me today.

Number one is cooking quinces to death (I just wanted to blanch them a bit and forgot to turn the stove lower), so still pretty inconsequential.

The aggravating thing to me is it wasn't even an attack, it was part of a talk about "assholes, idiots, whiners and trolls" that basically said "none of these people are necessarily that, here's how to look at things differently and maybe end up getting on with them instead".

Every other german who rendered a comment on said talk thought it was hilarious and clearly got that I sympathised due to being pretty blunt myself (oh gods east coast american middle managers, how easy they are to offend ...)

My commiserations on the unplanned demise of your quinces.

Why is that even a YAPC talk?

I've heard points similar to the ones rurban makes from mlehmann, who also got tired of p5p and maintains his own perl5 fork.


It was a keynote because open source is made of people and encouraging helpers and contributors is how open source projects prosper.

Points taken! Thanks for adding some insight.

Your slide was simply:

We were not amused but didn't cry wolf as it would happen nowadays with a CoC. I would never use such terms, but perl5 leaders are very keen to throw such accusations around all the time because it conveniently distracts from the technical arguments you need to avoid. You kept with this slide for several years, but since then deleted all youtube instances.

I'm not a german btw. but my mother was born there, so I felt for them.

Actual slides: https://shadow.cat/resources/slides/matt-s-trout/2011/yapc-n... - and I did use part of these slides once more, but that's not exactly "several years".

If anybody's confused why there's a downwards arrow on one of the slides it was so I could stand underneath it to proclaim myself loud, blunt and obnoxious at the start of the 'assholes' section because this was a community hacking talk and I was identifying myself as part of the group I was about to talk about.

I don't even have access to any of the relevant bits of youtube.

This is just getting silly now :(

I've met plenty of direct-talking Germans who wouldn't be mistaken for assholes.

Well, yes.

"Some people who initially appear to be in group X may turn out to actually be in group Y" does not, at all, require that all of group Y may appear to be in group X. Logic doesn't work that way :)

The natural successor of perl5 was supposed to be ruby, at least in the early 2000s. Python is nothing like perl (by design), but got adopted for the same tasks nonetheless.

> Python is nothing like perl (by design)

Which is ironic because python and perl are extremely similar. I like to say that "python is perl for bureaucrats". I've been doing a bit of python for (?:recre|educ)ational purposes recently due to its superior maths libraries. I can't say I see the advantages, but that's probably just me.

> Which is ironic because python and perl are extremely similar.

Would you mind expanding on this a bit? It's been a while since I did perl, but I remember it being fairly distinct from python, other than being a dynamically-typed "interpreted" language.

The feature set and type of problems that they are good at are similar. The personality of the language is not. But the translation is easy.

What is weird about Perl is that it is intentionally designed to use the part of our brain that parses syntax to encode a bunch of information. So $ means "the", @ means "these", and so on. The linguistic analogies are carried surprisingly far, and There Is More Than One Way To Do It is considered a feature.

By contrast Python tries to have a very simple and regular syntax. It has well-organized libraries. And aims to have one simple and right way to do it. (Though why their one way to handle opening external processes should be so Byzantine..well I digress.)

But both of them live in the space of dynamic interpreted languages with similar kinds of introspection, similar native data structures and so on. Both take heavy advantage of the fact that between decent scalars, arrays, and hashes you can write algorithms for almost any problem which will perform within a constant factor of a theoretically ideal data structure. Therefore given the same problem it is natural to write the same solution in about the same way with very similar characteristics.

If you know one, learning to write the other isn't hard and the mental translation remains easy.

Oh, there are differences. Some things are easier to express directly in Perl. Like backticks. Python can let objects be used as keys in dictionaries. Perl's optional variable declarations catch a lot of minor bugs. Python's iterators/generators take a lot of work to replicate in Perl. (Yes, there are CPAN modules that make it doable. Have we agreed on which one is right?)
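For readers who only know the Perl side, a minimal Python sketch of two of the features mentioned above: objects as dictionary keys and generators. The `Point` class and `fibonacci` function are illustrative names, not taken from any project in this thread.

```python
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Point:
    """frozen=True makes instances immutable and hashable, so usable as keys."""
    x: int
    y: int


def fibonacci():
    """A generator: yields Fibonacci numbers lazily, one at a time, forever."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Objects as dictionary keys: lookup works by value equality, not identity.
labels = {Point(0, 0): "origin", Point(1, 0): "unit x"}
print(labels[Point(0, 0)])

# Generators: take only as many values as needed from an infinite sequence.
first_eight = list(islice(fibonacci(), 8))
print(first_eight)
```

The key only has to be hashable and comparable for equality; a mutable object with a value-based hash would be a bug waiting to happen, which is why the sketch freezes the dataclass.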

But on the whole for most code, you tackle it in the same way, with equivalent data structures and naturally organize it similarly using the same strategies.

Compare and contrast with, say, Java.

The fact that they are so close puts them in a natural competition. And there, well, Perl is not popular for new projects while Python is one of the most popular languages out there. All the cool new integrations are made available in Python first and Perl is an afterthought. If they are too similar to co-exist without competition, it is clear which one is going to win. And it isn't Perl.

> Python can let objects be used as keys in dictionaries.

In Perl 6 object hashes allow you to do the same (https://docs.perl6.org/language/hashmap#index-entry-object_h...). If you're only interested in checking for existence, you can use a Set (https://docs.perl6.org/type/Set). If you want to just count the number of occurrences, you can use a Bag (https://docs.perl6.org/type/Bag).

> Perl's optional variable declarations catch a lot of minor bugs.

In Perl 6, variable declarations are obligatory by default.

> Python's iterators/generators take a lot of work to replicate in Perl.

In Perl 6 there is an easy to use interface for creating iterators (https://docs.perl6.org/type/Iterator)

All of these features are built-in, so no module loading is necessary.

I do not think of Perl 6 as Perl, despite the naming and people involved. I believe that it is as easy to migrate from Perl 5 to another scripting language as it is to migrate to Perl 6.

Furthermore the last I heard, Perl 6 performance was a disaster story. And libraries + real world adoption is kinda thin on the ground. Therefore other scripting languages (particularly JavaScript and Python) are more compelling targets to migrate to.

Has this picture brightened in the last year or so?

> I do not think of Perl 6 as Perl, despite the naming and people involved. I believe that it is as easy to migrate from Perl 5 to another scripting language as it is to migrate to Perl 6.

These beliefs are shared by most users of both languages, which makes bringing up Perl 6's capabilities in a Perl vs Python discussion confusing.

I would say so, yes. For instance, basic object creation in Perl 6 is now faster than basic object creation in Perl 5. Both can be improved with some tricks that experienced developers know, such as not using accessors in Perl 5 but directly looking up keys in the underlying hash. Similar tricks can be performed in Perl 6, and Perl 6 is then still faster. Introducing Moo or Moose on the Perl 5 side will only make Perl 5 slower.

Making things faster in Perl 6 has been the main focus of development in the past 3 years.

See also our most recent teaser for the 6.d release: https://i.redd.it/et514ut106u11.jpg which shows that the use of set operators has become almost 50x faster in that period.

> Python can let objects be used as keys in dictionaries

You can do this in perl too with a core (since perl 5.004) module. https://metacpan.org/pod/Tie::RefHash.

Their type systems, object model and reflection capabilities are very alike.

It's been a long while since I've developed anything serious in Perl, but the main difference I find between Python and Perl, aside from the syntax, is the value/reference split: Python makes every (passed) value a reference implicitly, Perl uses references explicitly, that can affect an architecture in important ways.
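A small Python sketch of the implicit-reference behaviour described here, with made-up names (`add_default_tag`, `original`, `snapshot`). Passing a dict to a function passes a reference to the same object, so a mutation inside the function is visible to the caller; an explicit copy is needed to get value-like behaviour.

```python
import copy


def add_default_tag(record):
    """Mutates the caller's dict: `record` is another name for the same object."""
    record["tags"].append("default")
    return record


original = {"name": "job", "tags": []}
alias = add_default_tag(original)

# The function returned the very same object it was given.
print(alias is original)
print(original["tags"])

# An explicit deep copy decouples the two structures.
snapshot = copy.deepcopy(original)
add_default_tag(snapshot)
print(original["tags"], snapshot["tags"])
```

In Perl 5 the equivalent sharing only happens when a reference is taken and passed deliberately, which is why the difference can change how a data-heavy program is structured.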

Indeed. I managed to program Python a few years before really taking note of that difference. Couldn't understand why my data structures kept being corrupted. Ahem.

FWIW, in Perl 6, references do not exist at the language level. At the implementation level, you could argue that everything is passed by reference. Perhaps you may find https://opensource.com/article/18/8/containers-perl-6 enlightening.

Agreed. ruby is really nice. I've worked some time on a better ruby VM, potion by _why.

but in the end ruby had the same struggles as python and perl, and basically killed themselves. it's much more racist over there than in perl. php was the only major scripting vm which had some proper progress, but they started from a very poor stdlib, so the old sins are still hurting them.

It's more worrying that an absurdly badly designed language such as Javascript took over, and Dart was put back. And that the Lua vs Luajit fights were not resolved. This hurts everybody in the long run, because luajit would have been the best of all.

Perl5 is a great language. For internet and system related tasks it is great! It was created long time ago and evolved over the years.

One thing is true, it had a lot of advantage over other languages for a lot of years. However, now in 2018, other languages have also evolved and gained a lot of traction. One example is javascript/node which is now being used more and more. For a long time, a lot of frontend focused developers didn't really want to try backend development... especially because backend development was usually done in another language which was not javascript. So some never really attempted to do backend development. However when node was released, that opened a lot of new opportunities for the front end developers.

There are a lot of languages these days and many of them are capable enough to do many types of projects.

The important thing is, some languages have a bigger concentration of more capable engineers. One of these languages is perl. The perl programmers are excellent. They have real close experience with the kernel and systems and usually have great experience in complex backend tasks. Many are not as great with front end development. Some are.

Some projects can be done and deployed by 1 single perl developer. The same project if done in java for example might require more than 1 java developer.

The microsoft language creates a bubble and their developers sometimes don't know universal technical terms, they only know the microsoftish terms. I.e. they might not know what a "web server" is, but they know what "IIS" is. They might not know what a "hash" is, but they know what a "dictionary" is. Perl developers use universal terms.

Amazingly, javascript, especially ES6 is looking more and more like perl. Those that know both will agree sometimes it reminds them of perl.

There are many projects people don't know about that use perl behind the scenes. But of course there are not as many perl programmers as there are in other languages and that's one reason you don't hear much about these projects.

Companies embrace more and more the modern perl development. Java, microsoft, node, python, ruby, etc are more popular of course. Especially because a lot of banks use java and they push java via universities. In the past I have seen labs with SUN machines at universities that teach Java. Microsoft also gives students free software licenses via universities. That's another reason Microsoft and Java are very popular.

The thing about perl which people really don't appreciate is that it scales. From a throwaway oneliner to a few hundred thousand lines of code [1]. But achieving that coherently takes a depth of knowledge of the language[2] and a reluctance to reinvent the wheel.

My main problems with perl are that I haven't found any normal programming activity I can't do with it, and that it's sucked enough of my time up over the past decade that people pay me and give me good working conditions in which to use it.

[1] But if you get to "a few hundred thousand lines of code" it's likely that there's some significant architectural mistake in your code. Architectural mistakes are not a perl-specific thing. Perl's biggest strength is also its biggest weakness - it's flexible.

[2] Depth of knowledge of a language is something that seems to be quite unfashionable in the industry at the moment, and some of perl's design decisions seem to be optimised for a style of knowledge that's also unfashionable.

Could you elaborate on point 2 there? I think that's an interesting point. I think it fits with a theme I've been mulling over regarding more people being tool users than tool builders.

One example I can think of is this: if you write perl using 'canonical' expressions, your code will often get compiled to optimized function calls. But if you use non-canonical expressions to perform the same operation, the compiled code won't be optimized and will run a lot slower.

An example would be using a map expression instead of a for loop. The map expression will typically run faster, even though its body is pretty much the same as the for loop's body.

No. map should be much slower than a for loop, because map has to cons the result list, while the for loop only does it for side effects, but ignores all the return values. If map is faster, then there's something wrong with the loop optimizer or loop run-time overhead.
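The structural difference being argued about can be sketched in Python (not Perl, and saying nothing about actual Perl timings). The helper names `run_as_map` and `run_as_for` are made up for illustration. Both forms call the action once per item; only the map-style form has to allocate and fill a result list.

```python
def run_as_map(items, action):
    """List-building form: every return value is stored in a fresh list."""
    return [action(item) for item in items]


def run_as_for(items, action):
    """Side-effect form: each return value is dropped as soon as it is produced."""
    for item in items:
        action(item)


seen_by_map, seen_by_for = [], []

# `list.append` returns None, so `... or x * x` records x and then yields x * x.
map_result = run_as_map(range(5), lambda x: seen_by_map.append(x) or x * x)
for_result = run_as_for(range(5), lambda x: seen_by_for.append(x) or x * x)

print(map_result)   # the collected return values
print(for_result)   # None: nothing was collected
print(seen_by_map == seen_by_for)
```

Whether the extra list costs anything measurable depends on the interpreter; a runtime that notices the result is unused could skip building it, which is roughly the optimization question raised in this subthread.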

In cperl there are many loop optimizations which are not in perl5. I'm pretty sure "for" is faster than "map" there.

FWIW, in Perl 6 there is no real difference between a `map` and a `for` loop: in the case of the `for` loop, the final value of each iteration is simply discarded.

Most decent computer languages are heavily inspired by theoretical linguistics. So education tends towards that. Perl is inspired by practical linguistics. It's a total mess like English and amazingly expressive like English. This doesn't fit well with standard computing paradigms, but it does make perl a decent language for people from non-standard backgrounds.

> perl a decent language for people from non-standard backgrounds

I would argue "for any background". Why would you have to think like a computer to be able to program? Don't we have computers to do that?

A lot of people from "standard backgrounds" have the theoretical linguistics approach baked in to their literacy. Perl on the other hand has a much easier time adapting to varying thinking styles. Here's a comment I made about a former colleague's code earlier today:

> one thing people like about perl is it can shoehorn into pretty much any mental model. [ colleague ]'s model is quite solid but also pretty mental

There's something to be said for enforcing a certain kind of mental model. Java and C# do a terrible job of this. Python does a reasonable job. Ruby is just perl done differently - seems to make the "mental" part a bit easier. Personally aside from perl the programming I've most enjoyed is arduino flavoured C, but that's just for fun. I do perl for fun and profit.

Ironically, the universal terms for ‘hashes’ are ‘map,’ ‘associative array’ or, at the worst, ‘table.’ Even ‘dictionary’ makes sense and is definitely used outside of MS.

Hashes are completely different things, the use of them is an implementation detail and not necessary.

OK, I'll bite. How do you get O(1) lookup without using a hashtable?

For a small number of short keys hash tables perform worse than a simple linear scan. 127 entries is a good cutoff but it depends on the design.
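A minimal sketch of the linear-scan alternative, in Python purely to show the structure. `SmallMap` is a hypothetical class, and in CPython the built-in `dict` (implemented in C) will beat it; the claimed advantage applies to compiled implementations where a handful of short keys fit in a cache line or two and no hash has to be computed.

```python
class SmallMap:
    """Tiny associative array backed by parallel lists and a linear scan."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _index(self, key):
        # Compare against each stored key in turn; no hashing involved.
        for i, existing in enumerate(self._keys):
            if existing == key:
                return i
        return -1

    def __setitem__(self, key, value):
        i = self._index(key)
        if i >= 0:
            self._values[i] = value      # overwrite an existing entry
        else:
            self._keys.append(key)       # append a new entry
            self._values.append(value)

    def __getitem__(self, key):
        i = self._index(key)
        if i < 0:
            raise KeyError(key)
        return self._values[i]

    def __len__(self):
        return len(self._keys)


point = SmallMap()
point["x"] = 1
point["y"] = 2
point["x"] = 10
print(point["x"], point["y"], len(point))
```

Lookup cost grows linearly with the number of keys, so a real implementation would switch to a hash table (or something else) once the map grows past whatever cutoff its benchmarks suggest.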

O(1) is completely missing the point of hidden constant factors. For many applications, like the one used in javascript, ruby or perl - map as objects with 3-7 short keys - hash tables are totally overblown.

For compile-time known keys you better create a perfect hash table, where the fastest could be a switch of pre-compiled word-aligned memcmp statements. http://blogs.perl.org/users/rurban/2014/08/perfect-hashes-an... Haven't done that yet with SIMD instructions, which should be much faster.
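To illustrate the idea of a perfect hash for a fixed key set, here is a rough Python sketch. It is not the memcmp-switch technique from the linked post; it uses a simpler, well-known approach of searching for a seed under which a seeded hash maps every key to a distinct slot. The function names, the use of SHA-256 as the seeded hash, and the `KEYWORDS` list are all assumptions made for the example.

```python
import hashlib


def slot(key, seed, size):
    """Seeded hash of `key`, reduced to a slot index in [0, size)."""
    data = seed.to_bytes(4, "little") + key.encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "little") % size


def find_perfect_seed(keys, max_tries=100_000):
    """Try seeds until all (distinct) keys land in different slots."""
    size = len(keys)
    for seed in range(max_tries):
        if len({slot(k, seed, size) for k in keys}) == size:
            return seed
    raise RuntimeError("no collision-free seed found")


def build_table(keys):
    """Precompute the seed and a table with exactly one key per slot."""
    seed = find_perfect_seed(keys)
    table = [None] * len(keys)
    for k in keys:
        table[slot(k, seed, len(keys))] = k
    return seed, table


def contains(key, seed, table):
    """One hash plus one comparison; no collision handling is needed."""
    return table[slot(key, seed, len(table))] == key


KEYWORDS = ["if", "else", "while", "for", "return"]
SEED, TABLE = build_table(KEYWORDS)
print(all(contains(k, SEED, TABLE) for k in KEYWORDS), contains("foo", SEED, TABLE))
```

The seed search is the expensive part and happens once, ahead of time. That is why this only pays off for key sets known in advance or for mostly-read maps, as the comment says.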

For mostly read maps (like e.g. for mysql applications) it's mostly faster to run-time compile a shared lib with a perfect hash table. Compilers are very fast with data-only.

For many other use cases a judy tree, compressed patricia tree, ctrie, HAMT, CHAMP or even a btree may be faster than a typical hash table. Esp. considering cache oblivious architectures. The modern swiss table doesn't look like an old-style hash table anymore, more like a patricia tree.

E.g. almost everything is faster than the default C++ hash table.

That's not the issue. If you have a Dictionary interface that says it provides you with O(1) lookup, you don't care if it uses a hashtable or magic dust to do it.

My point is that, since a hashtable is the only way to provide the desired performance characteristics, it's not unreasonable at all to say "I want a hashtable", instead of "I want a Dictionary implementation with O(1) lookup". The former is more succinct.

I get ya, but I think you just described abstraction. Sure, if we're speaking low-level, we might as well use "hashtable".

When someone says "hashtable" in the context of writing software, they're most often specifying desired performance characteristics, not the specific implementation details. This is how Perl does it, for example.

Also, there are Map implementations that don't have these performance characteristics, e.g. a TreeMap in Java which internally uses a red/black tree and thus provides logarithmic lookup and insertion. If I'm writing something that needs to do a lot of efficient lookups but only write "Map" in the spec, there's danger that the wrong thing will be used, and performance will suffer. It really does need a hashtable, not just a generic lookup interface.

In theory, I don't think hash tables are O(1): even if you ignore the atrocious worst-case performance and just look at average performance you still have to make the size of the hash depend on the number of entries, which brings in a factor of log(n), which is exactly what tree-based algorithms have.

In practice, the performance of a hash table definitely does depend on the number of entries because of the cache hierarchy and stuff like that.

I could easily be wrong about the theory: there are, of course, lots of different kinds of hash table and different ways you might formalise the meaning of "O(1)".

Hashtables are typically considered amortized O(1) for insert and retrieve. The atrocious worst-case performance doesn't happen in common implementations because the hashtable gets enlarged before or when it occurs (and these resizes still allow for amortized O(1) performance).
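The amortization argument can be checked with a little arithmetic. The sketch below simulates only the bookkeeping of a table that doubles whenever an insert would push it past a load limit, counting how many existing entries get rehashed along the way. The function name and the parameters (initial capacity 8, load limit 0.75) are assumptions, not taken from any particular implementation.

```python
def copies_during_growth(n, initial_capacity=8, max_load=0.75):
    """Insert n items into a doubling table and count rehashed entries."""
    capacity = initial_capacity
    size = 0
    moved = 0
    for _ in range(n):
        if size + 1 > capacity * max_load:
            moved += size      # every existing entry moves to the new table
            capacity *= 2
        size += 1
    return moved


for n in (10, 1_000, 100_000):
    moved = copies_during_growth(n)
    # Total rehash work stays below 2 moves per inserted item.
    print(n, moved, moved / n)
```

Because each resize doubles the capacity, the sizes at which rehashing happens form a geometric series whose sum stays below 2n under these parameters. Spread over n inserts, that is a constant extra cost per insert, which is what "amortized O(1)" means here.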

And logarithmic growth in the key size is definitely not as bad as logarithmic growth in the number of nodes in a tree to traverse, because going from calculating e.g. a key size of 28 to 29 bits is not a big deal at all, but going from 28 to 29 levels in your tree means an entirely new memory lookup to grab that one node. The hash calculation is always cheap, whereas traversing a tree thrashes caches -- you only need one memory lookup in a hashtable, whereas you need logN in a tree. Those lookups are orders of magnitude more expensive time-wise than the simple integer math involved in calculating the hash.

You are definitely right for most hash tables. Determining the right bucket is O(1) but there will be collisions. Open hashing solves it by creating a linked list within buckets which you will need to traverse. Closed hashing solves it by probing, you might test several buckets before finding the right entry. In both cases, you get O(log(n)) as the average lookup performance.

You actually get O(1) with perfect hash functions, but doing so requires choosing a hash function by the data set. This is normally only feasible for precalculated hash tables, not those that are being filled dynamically.

The average lookup performance on a hashtable is definitely not O(logN) -- it's amortized O(1). Just Google for "hash table performance" and observe that all the results are saying O(1), not O(logN). I'd like to request some reputable sources from you showing O(logN).

The modal number of buckets you have to look into (or the modal size of the linked list) is 1. If it's more than 1, then your hash table is too full, and you need to enlarge it. This is admittedly an expensive operation, but it still amortizes out to O(1). And if you know in advance how many elements will be in a hashtable (this is common), then you never even need to resize.

It's important to distinguish between graphs of performance as measured on real hardware and theoretical results from computational complexity theory. It might be clearer to say "near-constant time over the region of interest" if you're talking about the former because "O(1)" has a precise meaning and refers only to what happens in the limit. Real programs usually only work for "small" n: they stop working long before the size of the input reaches 2^(2^(2^1000)), for example. But whether the algorithm is O(1) or not depends only on what happens when the input is bigger than that.

Another way of putting it: if you modify your practical program so that it runs on a theoretical machine that handles any size of input then you will probably find that another term appears in the execution time, one with a "log" in it, and that term eventually dominates when n gets big enough.

Yes, I am being a bit pedantic. But "O(1)" is mathematical notation and you have to be pedantic to do maths properly.

It should be obvious that hash table lookups are above O(1) as long as collisions exist. Tweaking the load factor can reduce the probability of collisions, but that's merely a constant factor to the lookup complexity. As to whether O(log(n)) is a good approximation for the average, we can certainly discuss that. It depends a lot on the algorithm, but https://probablydance.com/2017/02/26/i-wrote-the-fastest-has... shows (first graph) that the reality is typically even above logarithmic for some reason.

> It should be obvious that hash table lookups are above O(1) as long as collisions exist.

This doesn't follow at all. If the average number of lookups is 2 instead of 1, then it's O(2), which is O(1) because it's just a constant factor. The number of collisions needs to grow proportionally to the total number of elements being stored, which does not happen because the hashtable itself is also grown to keep the average number of collisions constant and low.

A hashtable that is 80% full has the same small number of collisions regardless of whether the hashtable has a thousand elements in it or a billion. Meanwhile, a BST would have to make 10 lookups in the first case and 30 in the second case -- that's logarithmic for you.
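The claim about a fixed load factor can be checked with a small simulation. This is only a sketch under stated assumptions: a toy separate-chaining table, random integer keys, and a helper name (`average_probes`) made up for illustration. It counts key comparisons rather than measuring time, so cache effects are deliberately left out.

```python
import random


def average_probes(n, load_factor=0.8, seed=0):
    """Average key comparisons for a successful lookup in a toy
    separate-chaining table holding n keys at the given load factor."""
    rng = random.Random(seed)
    buckets = max(1, int(n / load_factor))
    table = [[] for _ in range(buckets)]

    # Distinct random integer keys; hash() of a small int is the int itself.
    keys = rng.sample(range(10**12), n)
    for key in keys:
        table[hash(key) % buckets].append(key)

    total = 0
    for key in keys:
        chain = table[hash(key) % buckets]
        total += chain.index(key) + 1  # comparisons needed to reach the key
    return total / n


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        print(size, round(average_probes(size), 3))
```

Under uniform hashing the textbook expectation for chaining is about 1 + load_factor/2 comparisons, independent of n, whereas a balanced BST needs on the order of log2(n) comparisons.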

Your link shows lots of real performance figures which are dominated by CPU cache misses and other non-intrinsic factors. That's what's responsible for the reduction in performance on larger data sets, not anything intrinsic to hashtables themselves. If you look at the aforementioned first graph, the best hashmap performs at worst around 10 ns up through ~200K elements, whereupon it blows out the fastest CPU cache and then takes a maximum of around 25 ns at the largest size tested (~700M?). Factoring out the cache issue, it definitely does not look super-logarithmic to me. It might not even be logarithmic; I'd need to see raw data to know for sure (it's hard to tell by estimating from the graph).

There are some options. For example, if the indexes are known to be within a certain range, a plain array will do better. For tables below a certain size, an array of key/value tuples might also perform better. An implementation could even switch between the underlying approaches depending on the data at hand, choosing between hash tables and something else depending on what should perform better.

Yes, the "hash" part is an implementation detail. The performance is also an implementation detail by the way. Often, tables will optimize for a different criterion such as memory efficiency. Lookup will be less efficient then. And, as another person noted, you cannot really expect O(1) from a hash table anyway.

Lookup in a trie is O(L) where L is the length of the key; not that different if you take into account that to lookup into a hashtable, you also have to compute the hash of the key.
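To make the O(L) point concrete, here is a minimal trie sketch that reports how many steps a lookup took. The class and method names are hypothetical, and each node is just a Python dict (itself a hash table), so this only illustrates the step count, not real-world performance.

```python
class Trie:
    """Toy trie: one nested dict per character of the key."""

    _END = object()  # sentinel marking "a value is stored at this node"

    def __init__(self):
        self.root = {}

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node[self._END] = value

    def lookup(self, key):
        """Return (value, steps); raises KeyError if the key is absent."""
        node = self.root
        steps = 0
        for ch in key:
            node = node[ch]  # one step per character of the key
            steps += 1
        return node[self._END], steps


trie = Trie()
trie.insert("perl", 5)
trie.insert("perl6", 6)
trie.insert("python", 3)
value, steps = trie.lookup("perl6")  # steps equals len("perl6")
```

The number of steps depends on the key length, not on how many keys are stored, which is the same shape of cost as hashing an L-character key.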

Actually Perl seems to be used quite a lot at Microsoft.

Look at page 3 in this publication: https://www.microsoft.com/en-us/research/wp-content/uploads/...

Perl was considered the de facto scripting language in Windows until they settled on PowerShell.

cough VBScript cough

I also remember quite a lot of Perl scripts in the WDK (Windows Driver Kit) back then. I was surprised that there were so many Perl hackers at Microsoft.

How come? It didn’t ship with WSH but (only) JScript and VBScript did.

De facto, not de jure.

Perl’s COM module was excellent and made Windows very scriptable. This was back in the 90s.

I saw the light[tm] too late.

My only real dealing was ~2001 and fixing shitty cgi-bin-webscripts in Perl. PHP4 was a huge improvement over that :)

As the years went on and I did more admin work I've come to really enjoy having perl5 available basically anywhere - the power of python with the availability of shell scripts.. but without the hassle. I'm also kinda sad they didn't write puppet in Perl but chose Ruby. I bet it would have prevented all my "too slow" and "needs too much RAM" problems with it...

My memory's over a decade old here, but as I recall there was an early Perl-based proto-Puppet that Luke wrote and used in his consulting roles in the early 2000s. This was after splitting from cfengine and moving towards a declarative model.

For better or worse Ruby really does seem a good fit for the puppet use case. A flexible DSL for expressing config intent, coupled with a very extensible/inheritance based parse & apply model.

There are no “universal” terms. You used hash as an example, that’s already less universal than calling it a dictionary - it’s a hash map/table, and I’ve only ever heard this term from Perl developers.

That's because Perl developers knew Lisp and functional languages beforehand and therefore avoided the misleading term "map", which has been an operator to apply a function to a vector (=array) or list since the late '50s. "A map" vs. "to map" is misleading.

"table" is shorter than "dictionary" and also describes the structure better. A dictionary is mostly a book, or a list of keys (/usr/share/dict). prolog also used the term "table" or "tabling" for its caching of backtrackings, what we would call "memoize" or memoization via a hash table.

> perl developers knew lisp and functional languages

It's a practical certainty that the vast majority do not, and its design (up to Perl 5, at least) smacks of total Lisp ignorance.

"Map" is used in mathematics as a kind of synonym for "function", especially from some arbitrary set to another one. If you assign the domain < 1, 2, 3, 4 > to the range < 3, 2, 4, 1 >, that's a map. We can draw it as two bubbles containing labeled dots, and draw arrows between the bubbles.

This sense of "map" is widely used in computing. For instance "keyboard scan codes get mapped to characters" or "a coordinate in the abstract canvas get mapped to the viewport, which is mapped to the screen" or "the temperature sensor is mapped to GPIO-s 13 to 17."

The "map" in Lisp's mapcar is exactly this sense. The map is the function that is being applied: mapcar is using the verb sense: take the car of every cell through the function: i.e. map each domain value to the range. This is called "projection" or "mapping"; "map" is just the verb describing what happens to each individual element.
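Both senses of "map" fit in a few lines of Python. This is just an illustrative sketch, using a dict for the noun sense (the 1→3, 2→2, 3→4, 4→1 example above) and the built-in `map` for the verb sense; the variable names are made up.

```python
# Noun sense: a map from the domain <1, 2, 3, 4> to the range <3, 2, 4, 1>.
mapping = {1: 3, 2: 2, 3: 4, 4: 1}

# Verb sense: take every element of a list through a function, like mapcar.
image = list(map(mapping.get, [1, 2, 3, 4]))

# The function being applied does not have to be backed by a table.
squares = list(map(lambda x: x * x, [1, 2, 3]))
```

Here `mapping.get` is the function being applied, so the table and the operation end up sharing a name only because one is the noun and the other the verb.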

Yes. If Larry knew how a LISP is implemented, many internals would have looked better. the symtab, the optree, the stack and esp. the lexicals. Just compare to the ruby internals.

But at least he knew the terminology, and got the concepts right. In opposition to php and python which failed miserably.

"I did dabble in LISP more than I did FORTRAN and it looks like oatmeal with toenail clippings."

Java always had hash maps as one of their map implementations. (I did use both languages years ago, so maybe that's why I noticed)

For what it's worth, the three main Java map implementations are HashMap, which does what it says on the tin; LinkedHashMap, which has some additional pointers to maintain insertion order (and thus has performance that suffers by a constant factor); and TreeMap, which is really a red/black tree and thus offers logarithmic lookup time instead of constant lookup time (i.e. it's not a hashtable at all).

Whenever I need to write a tiny script that tests the bounds of what is easily done with just awk and Bourne shell, the tool that I next reach for nowadays is Python. A few years back it was Perl5.

I'm not exactly sure why I started reaching for Python instead. It's so much easier with perl to do things like backticks to call some utility, or to use system() and easily deal with its stdout and stdin, and I never have to do a Google search to figure out how to use a regular expression to process the output (whereas I almost always need to bring up the doc page for Python's re).
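For anyone who keeps looking this up, a rough Python analogue of Perl backticks plus a regex match might look like the sketch below. The `run` helper is hypothetical, and the child command is just the current Python interpreter printing a made-up line, so the example stays self-contained.

```python
import re
import subprocess
import sys


def run(cmd):
    """Rough analogue of Perl backticks: run a command, return its stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Stand-in for a real utility: have the interpreter print a sample line.
output = run([sys.executable, "-c", "print('load average: 0.42, 0.36, 0.30')"])

# Pull the first number out of the output, like a Perl =~ m/.../ match.
match = re.search(r"load average: ([\d.]+)", output)
first_load = float(match.group(1)) if match else None
```

It is more ceremony than `my ($load) = `uptime` =~ /.../;`, which is more or less the point being made above.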

I think what it ultimately comes down to is tooling and the availability of good libraries. I call "python3 -m venv venv", do an I'm Feeling Lucky google search for "XXX library inurl:pypi", and the sky is the limit with how easily I can make a script that does something relatively complicated. Comparing this with perl5 isn't even fair. You might find a cpan module that does what you want, and it might still be maintained, and you maybe can be bothered to figure out cpanminus, but anyone who has used perl in the last 5 years will agree, it's nowhere near as easy as it is with python.

Something that makes me optimistic about newer languages is that tooling is improving a great deal. It also makes me somewhat pessimistic about newer languages, though, if tooling is the primary aspect of a language that makes it popular.

I've never seriously tried perl6 but I've seen enough of the language features to know that I was really excited about it, at least at one point. I hope that some derivative of perl can rise from the ashes, because I'm too stupid to write actual lisp programs, but perl allows me to emulate what I imagine lisp programmers must feel.

I have two telling experiences with Perl and Python. Way back in the day, I had a job doing web development in Perl. I had no say in the matter; it was the only choice available. And I used Perl enough to actually like it! I mean, I understand why people who like Perl really like Perl.

But just a few years after that and I could hardly even read Perl code let alone try and code in it again.

Python was the opposite; I had never coded in Python, but I ported a Python product SDK from Linux to Windows in a few weeks. I could read it. I could write it. I made more mistakes with indenting than anything else. Even to this day, I don't know Python but I can muddle my way through and be reasonably productive.


My organization maintains a couple of Perl web apps for internal purposes (one dating back to 1997). We do some occasional work/improvements/modernizations on these. Mentally switching into "perl" mode to work on these projects requires tons more effort than switching between our other Bash/PHP/Puppet/Python codebases. And there are several sections of the code I've shied away from modernizing because they use esoteric perl operators that I don't use and don't really understand.

Not to mention that only a few members of our organization are really "fluent" in perl, and teaching it to the newbies feels almost like a waste of resources when they can be more productive faster using python which has ample modern documentation and simpler syntax/conventions and well defined coding standards.

> Mentally switching into "perl" mode to work on these projects requires tons more effort than switching between our other Bash/PHP/Puppet/Python

That's surprising to me. Bash, PHP and perl seem to be closer to each other than any of them are to python. For example, the use of sigils ($var) for variable names in Bash/PHP/perl.

Bash / PHP / Perl superficially look like each other. Python code looks different.

But Perl has many strange operators and conventions that do not do as expected, or are foreign to the point of being unreadable to those who don't use Perl regularly. Additionally, Perl can be made concise (one-liner) to the point of being unreadable. It's been said that Perl is a write-only language for a reason.

Contrast with Python: I've had six-year-olds tell me what a program does after only a few minutes' introduction to Python syntax. Though to be fair, Python's list comprehensions are in fact just as unreadable as Perl is.

I always make a point of documenting any of the more arcane parts of perl with a reference.

Both Bash and PHP use a single sigil ($) to denote variables. But in Perl, sigils also kind of denote the type of the variable (Scalar, array, hash, subroutine, typeglob). I say kind of because it's not quite that simple.

PHP is actually closer to compiled languages like Java and C# than it is to Perl, Bash, or Python. It's mostly a static language with dynamic typing.

How sigils are used in Perl 6, and how that compares to Perl 5: https://opensource.com/article/18/9/using-sigils-perl-6

I would agree. Maybe it's more because of what people are used to than the objective "distance" between them.


1. Does Python have the same breadth of existing code as CPAN? 2. The whole tabs-vs-spaces thing creates a lot of hard-to-see syntax errors. If I wanted to program in a language like that, I'd use FORTRAN IV/66.

> But just a few years after that and I could hardly even read Perl code let alone try and code in it again.

Precisely this.

The Perl Cookbook (and Perl Book cookbook section) was an absolute godsend in the time before the web.

However, the fact that my Perl Book and Perl Cookbook fell apart from so much usage shows how the language never really stuck in my brain in spite of continuous low-level usage.

I lost my copies of those books because I rarely needed them. Once I wrote it once, it stuck in my mind.

Everyone has a language or two that really "clicks" with them. For me, one of them is Perl. Python, on the other hand, is one of those languages that won't ever stick in my mind. I don't know why. I just can't do Python.

I picked up Golang in about a day.

I've tried Python about 4 times over about as many years and it has gone from something I wanted to learn to something I wish never existed.

There are 2 things to grok that people rarely explain to Python newbies, and which probably prevent some from building good intuition of things:

1. EVERYTHING is a DICTIONARY (including objects, despite some magic happening behind, objects are just dicts of data and functions, that are just values)

2. VARIABLES are just LABELS for things, and they only exist and can be movable around in one scope (`x = smth` inside a local scope will either create a new local label x, or move the local label x that already existed; there is no way to "move"/"create" a label outside the current scope, you can just use it to access what it's pointing to, except ugly hackery that falls back to manipulating interpreter's guts as if they are just dicts and that nobody does in real code)
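A short sketch of both points, with the caveat that "everything is a dictionary" is a simplification: it holds for ordinary class instances like the hypothetical `Point` below, not for every built-in type.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# 1. An ordinary instance keeps its attributes in a dict.
p = Point(1, 2)
p.__dict__["z"] = 3  # same effect as p.z = 3
attrs = dict(vars(p))  # snapshot of the instance's attribute dict

# 2. Variables are labels: two labels can point at the same object...
a = [1]
b = a
b.append(2)  # visible through both labels

# ...and assignment inside a function only moves a *local* label.
counter = 0


def bump_local():
    counter = 1  # new local label; the module-level one is untouched
    return counter


local_value = bump_local()
```

After this runs, the module-level `counter` is still 0, while `a` and `b` both see the appended element because they label the same list.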

Once you grok this the language is quite nice and intuitive. The next step is grokking:

A. operator overloading - this is something people like to avoid, but any math/physics code would easily become unreadable without this sprinkled around libraries like NumPy

B. the fact that the `=` operator can be overloaded - this can really f up your mental models of things, but again, it can make math/sciency code more readable when used carefully and the people in this field (ab)use it a lot (imho it should have never existed in the language, but we all have opinions)

C. everything can behave like a function (which btw are first class things in Python) by just implementing __call__, another apparent trivia but it changes the shape of APIs a lot (eg. "have class instance also behave like a function and have it passable as an argument to something that expects a callback")

D. multiple inheritance - yeah, it exists in Python and it's maybe a bit ugly, but in fact it's what allows you to implement Mixins and other sugared-compositional patterns, actually reducing your total use of inheritance and the length of inheritance chains a lot (IMO multiple inheritance is the only thing that saves inheritance from itself, and you shouldn't ever have I. without MI. but people seldom agree with me on this :P)
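Points A, C and D can be sketched together. All class names here are made up for illustration; the mixins are deliberately tiny and are not a recommendation for how to structure real code.

```python
class Scale:
    """C: an instance that behaves like a function via __call__."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


class ReprMixin:
    """D: reusable behaviour meant to be mixed in, not used on its own."""

    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
        return f"{type(self).__name__}({fields})"


class EqMixin:
    def __eq__(self, other):
        return type(self) is type(other) and vars(self) == vars(other)


class Vector(ReprMixin, EqMixin):  # multiple inheritance, mixin style
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):  # A: operator overloading
        return Vector(self.x + other.x, self.y + other.y)


doubled = list(map(Scale(2), [1, 2, 3]))  # instance passed as a callback
total = Vector(1, 2) + Vector(3, 4)
```

`Vector` itself has no parent "is-a" chain; it only picks up `__repr__` and `__eq__` from the two mixins, which is the compositional use of multiple inheritance described in D.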

To be honest I still profoundly dislike Python as a language, but when most of your job is reading other people's code... you come to appreciate it more than other dynamic langs :)

In perl (almost) everything is a dictionary too. Except when you need it not to be. Great source of getting things done, or footguns. Here's an implementation of something vaguely useful in perl that is mostly not a dictionary (except where it is for state management purposes) - https://metacpan.org/source/ZARQUON/Array-Circular-0.002/lib...

A. reminds me of JS objects.

B. is to be expected, and I think it might be useful; e.g. we prevent accidental copying of singletons in C++ via operator= (T &other) = delete;. I'd guess you could do something similar in python.

To top that: Yesterday I learned that you can legally overload the unary operator & in C++. Smuggling that into a large code base seems like a good way to mess with your colleagues ;)

> everything in Python is a dictionary

This is new to me, but probably will be useful.

Everything in R is a vector (or if you like, array). This is what makes R fast if your objective is not program development, but dissemination of information.

I wonder what should be said of Perl, but the only thing I can think of is every input should rather be a string.

The assignment (=) operator cannot be overloaded. Thus the PEP 572 controversy. https://www.python.org/dev/peps/pep-0572/

I was referring to __setitem__: you end up with expressions like `m[...] = ...` where practically speaking the meaning of `=` is no longer obvious. In some contexts it can end up meaning pretty unintuitive things (think pandas view into dataframes etc.). But, yeah, most languages have something like this since it's pretty useful, and only way to get away from these ambiguities is using purely immutable data which can be a no-go performance wise when you're working with references to multi-gigabyte structures...
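A toy illustration of the distinction, assuming a made-up `Column` class (this is not how pandas or NumPy implement it): plain `name = value` only rebinds a label and cannot be intercepted, while `obj[key] = value` is a method call in disguise.

```python
class Column:
    """Toy container whose item assignment also accepts a boolean mask."""

    def __init__(self, values):
        self.values = list(values)

    def __gt__(self, threshold):
        # Comparison returns a mask, loosely in the style of array libraries.
        return [v > threshold for v in self.values]

    def __setitem__(self, key, value):
        if isinstance(key, list):  # boolean mask: assign where flag is True
            self.values = [
                value if flag else old for flag, old in zip(key, self.values)
            ]
        else:  # plain integer index
            self.values[key] = value


col = Column([1, 5, 10])
col[col > 4] = 0  # calls __setitem__ with a mask
col[0] = 7  # calls __setitem__ with an index

alias = col
alias = None  # plain assignment: just rebinds the label, no hook runs
```

The two `col[...] = ...` lines look alike but do quite different things, which is the "meaning of `=` is no longer obvious" problem in miniature.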

I think the big insight there is that some libraries are popular despite unintuitive design.

I had the same experience. Perl clicked for me, but not Python. A few years ago, I switched to Go for performance and it has clicked as well. Go is a rather sparse language relative to Perl (and TIMTOWTDI), so I'm a bit surprised that it was a natural switch.

Perhaps you would benefit from a good teacher. I've found that some topics I find confusing when taught by one person are obvious when taught by another.

I had a 3-day Perl training course with an excellent teacher (17 years ago). I do not code much in Perl, but when someone in the building has a bug in a Perl script, they come to see me. I think the Perl books are not very good for a beginner because they do not focus on the most useful parts. The ideas behind the syntax make it easy to remember (for your whole life) once you are past the first steps of learning.

Everything in Python is a dictionary. That is what I needed to be told to start to grok it.

More or less the same as Perl and JavaScript.

Having used both languages for maybe 7 years each, you can write unreadable code in both. I have seen some absolute disasters in Python; it's really about the skill and motivation of the developer to produce readable code.

Perl has a lot more syntax, which makes it harder for people who don't know Perl, but if you know the language I don't think it's inherently worse.

In response to your last sentence.

Just wanted to mention that Ruby was highly inspired by Perl, particularly its regular expressions and it has a variety of perlisms. Kind of surprised you went to python and not ruby.

Your lisp mention reminds me of this: http://www.randomhacks.net/2005/12/03/why-ruby-is-an-accepta...

I just wish ruby had the same quality of scoping and OO that perl5 has, but it ended up with the same "variables magically pop into existence for the entire function" footgun as python and a horribly restrictive OO that also requires me to repeat myself (accessor/constructor etc.) in a way that after about an hour sends me screaming back to perl

Edit: Look, this is a genuine opinion, I've chatted to ruby core devs about this and they couldn't find ways to achieve the same things I do with perl5 as elegantly in ruby (Steve Klabnik and I have an open conversation about how to find a kata where we can compare this stuff properly), I'm 100% happy to discuss this but blindly downvoting because you disagree is depressing - tell me why you think I'm wrong instead and I'll do my best to reply constructively

Not sure who downvoted you, sounds like you have good points.

Do you have some links/examples on that scoping & how you can avoid accessor/constructor in perl?

Scoping - 'my' is lexical/block scoping, same as javascript's 'let' (thank you! steal more things please!) so

  use feature 'say';
  use strict;
  use warnings;
  my $code = do {
    my $count = 0;
    sub { ++$count }
  };
  say $code->(); # 1
  say $code->(); # 2
and note that if I e.g. typoed $count inside the subroutine I'd get a compile time error (and it goes out of scope at the end of the 'do { ... }' block). Note that the 'state' keyword provides an easier way to do this, but it's the easiest demo of closures I can think of, though also there's fun to be had with methods:

  my $method = sub ($self, @args) { $self->foo(3, @args) };
  $object->$method(1, 2); # same as $object->foo(3, 1, 2);
Constructor/accessor wise:

  package MyClass;
  use Moo;
  has foo => (is => 'ro', required => 1);
  has bar => (is => 'rw');
  has baz => (is => 'lazy', builder => sub { expensive_default() });
then means that

  MyClass->new;
will throw an exception telling you foo is required

  MyClass->new(foo => 1);
will work fine,

  $obj->foo; # returns the value of foo
  $obj->foo(3); # throws an exception - foo is ro
  $obj->bar; # returns the value of bar if any
  $obj->bar(26); # sets bar
and (if you didn't pass a value for it to the constructor)

  $obj->baz; # runs expensive_default() once and then retains the value
Hopefully that's a start, this is all typed off the top of my head on my first post-9am pint of coffee so my apologies for any omissions and/or errors.

Also, in languages like Java, C++, or C#, you can use bare braces to define a new scope, no keyword necessary. It's not done so much anymore because modern coding standards tend to favor shorter functions for clarity and unit testability, but it's always still an option.

You can do bare braces in perl5 too, I used the 'do { ... }' block because that's an expression rather than a statement so I could return the subroutine -

  my $code;
  {
    my $count = 0;
    $code = sub { ++$count };
  }
would've been equivalent but I always try and do assign-once-at-declaration-time where possible because mutable state is my mortal enemy.
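Since Python's scoping came up earlier in the thread, here is a hedged sketch of the same counter closure in Python, next to the function-level scoping behaviour being complained about. The function names are invented for the example.

```python
def make_counter():
    count = 0  # lives in make_counter's scope, captured by the closure

    def increment():
        nonlocal count  # without this, `count += 1` would target a new local
        count += 1
        return count

    return increment


code = make_counter()
first, second = code(), code()


def function_scope():
    # No block scoping: names bound inside the loop body stay visible
    # for the rest of the function.
    for i in range(3):
        last = i
    return i, last


leaked = function_scope()
```

There is no equivalent of `my` limiting `last` to the loop body, so the only way to get a fresh scope is a nested function, as in `make_counter`.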

what's the js equivalent to your example?

   let code = () => {
     let count = 0;
     return ++count
   }
doesn't do the trick

JS doesn't have the do {} expression form so

  let code;
  {
    let count = 0;
    code = () => ++count;
  }
or let-over-lambda it and

  let code = (() => {
    let count = 0;
    return () => ++count;
  })();

Oh man, you responded the same as me at the same time.

> I'm not exactly sure why I started reaching for Python instead. It's so much easier with perl to do things like backticks to call some utility, or to use system() and easily deal with its stdout and stdin, and I never have to do a Google search to figure out how to use a regular expression to process the output (whereas I almost always need to bring up the doc page for Python's re).

The thing I find easier in Perl than in the other languages I might reach for when doing a lot of quick scripting tasks is declaring nested data structures. It's easy in Perl because often you don't have to declare them--you just use them. And that means that you can often make major changes to how your data structures work with just small source changes.

For tasks where you need to process data from logs or text files or similar, but don't actually quite know what it is you need to be doing with the data so need a good bit of exploring/flailing to get your bearings, and then maybe a few iterations trying different approaches to figure out what structures you need, and you don't have a lot of time to think about it up front--in other words, data analysis improvisation, Perl shines.

Same here. I do mainly Python development - Django docs made things way easier than the Perl frameworks that I tried.

But for quick and dirty scripting tasks I still pull out Perl as the extra syntax makes things so much easier - backticks and built-in regular expressions, for example.

And I have still never seen anything as intuitive to use as a Perl hash tied to a Berkeley DB file for simple persistence to disk.

I agree that Perl's autovivification is a timesaver, although I can usually hack together what I want with Python's defaultdict.
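For reference, the defaultdict hack usually looks something like this sketch. The log lines and the `nested_counter` helper are made up for illustration.

```python
from collections import defaultdict


def nested_counter(depth):
    """Build a defaultdict that autovivifies `depth` levels of keys,
    with integer counters at the innermost level."""
    if depth == 1:
        return defaultdict(int)
    return defaultdict(lambda: nested_counter(depth - 1))


# Hypothetical log lines: host, method, status.
lines = [
    "web01 GET 200",
    "web01 GET 404",
    "web02 POST 200",
    "web01 GET 200",
]

counts = nested_counter(3)
for line in lines:
    host, method, status = line.split()
    counts[host][method][status] += 1  # intermediate dicts appear on demand
```

The catch is the same as in Perl: merely reading `counts["web03"]` creates that key, and unlike Perl the nesting depth has to be decided up front.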

For "data analysis improvisation" though, I don't know. A good REPL is a key part of that, and I'm not sure that Perl has anything as good as the ipython cli or jupyter notebook.

I'm exactly the same way about Perl vs Python. After years of using Perl 5 and due to what I also think is the simplicity of the language, I know I could do most small tasks more easily in Perl.

But Python feels more like "the future". When I need to write a script-like solution, I reach for Python because I need to practice with it and because I figure that the modules will be better maintained. You're right, though. I always need to look up simple things like system calls and often have to remind myself of exactly how the re module works.

But I slog through it. Don't get me started on the Python 2 -> 3 glacial migration, though. That almost makes me want to switch back to writing my scripts in Ruby.

> I'm not exactly sure why I started reaching for Python instead.

Oh, I am quite sure. Looks like Perl6 changed that, but it's because I was never sure that data structure should be a reference to a dictionary of references to scalars, or a dictionary of scalars, or a dictionary of reference to scalars, or a reference to a dictionary of scalars.

Anyway, next time I go for a quick script, I should go for Haskell, not Python, nor Perl (5 or 6, whatever). But I will probably go for Python, because it's painless enough and I don't know Haskell's shell integration by heart. Anyway, I will go for Haskell before I try Perl6, which is a nice measure of the problem they have in getting mainstream again: the language varies, but I'm far from the only person thinking this way.

> Anyway, next time I go for a quick script, I should go for Haskell, not Python, nor Perl (5 or 6, whatever). But I will probably go for Python, because it's painless enough and I don't know Haskell's shell integration by heart.

I've started using Rust for short scripting tasks; it helps me get acquainted with the stdlib & ecosystem, and `cargo script` works really neatly, seamlessly allowing dependencies to be added to a single-file script with no issue. Plus if you want to scale the script out to a proper project you can just take and promote the generated cargo project.

Yeah, I've had a similar experience in Scala-land. Spinning up a JVM to run a small script feels icky, but it's actually well worth it for the sake of not switching languages.

The kind of quick scripts I use Perl for involve picking interesting bits out of text files. Honest question: Would you use Haskell for that? If so, why? And how?

I'm with you here. It's not that Haskell isn't a good language, but that scripting languages are really good at scripting :)

Haskell does have a good REPL and being able to distribute a binary when finished is really nice.

You can also run Haskell in interpreted mode, so it is actually a viable choice for scripting.

It's viable, but certainly not designed for that sort of thing like Perl, Python, Powershell, Awk, & Bash. Many of those languages have built in niceties for scripting such as Perl's myriad of command line switches and well, everything about Powershell. They're optimized for shorter programs where Haskell is optimized for helping you write higher quality code.

Try the turtle library, after using it it would certainly feel for you like it was designed for scripting, it’s such a joy to use!

Are you kidding? Haskell is the language for picking stuff out of text files. Unless Perl got a complete transformation in the 5 to 6 change, it isn't even in the same league.

Could you be a bit more specific about what makes Haskell the language for this? I'm not sure it's going to be more comprehensible than Perl, for those who don't write Haskell full-time. I'm not sure it's going to be easier to read through the lines of an input file than it is in Perl, and I'm not sure it's going to be easier to apply regexes to lines than it is in Perl.

Is one of those assumptions wrong? Can you state why it's wrong?

Or do I need to adjust my sarcasm detector?

You could argue that Perl 6's grammars are a complete transformation. Please see https://docs.perl6.org/language/grammar_tutorial for more information. Note that Perl 6's grammars are used to parse Perl 6 code itself, so it's powerful enough to build complete language compilers out of.

Cool, Perl6 gained a native context-free parser. Yeah, that gets it in the league, I would have to compare it with Python parser libraries to decide what's better.

But no, Haskell and Prolog are still leaders here, with Haskell currently having the lead.

If it's not already on your radar, you should have a look at the Shelly library on Hackage.

Oh, I know about it. And it looks great.

It's basically the reason the next language I'll try on the shell is Haskell. But it takes looking at the documentation, so I'll leave it to try when I'm not in a rush. For shell scripting, that may take a long time.

> I hope that some derivative of perl can rise from the ashes, because I'm too stupid to write actual lisp programs, but perl allows me to emulate what I imagine lisp programmers must feel.

Well, there is Clojure. You can script it with `clj` and https://github.com/l3nz/cli-matic, and/or one of the ClojureScript interpreters. It has sane library management, so there is no PITA installing anything. You have JVM libraries, so you have all you need. We find it pretty handy.

Ruby has all of those perl features you mention, without a lot of the strange quirks from perl.

> without a lot of the strange quirks of perl

And instead with a lot of the strange quirks of Ruby!

/s, only kind of. I love Ruby, but it does have a few quirks that I see confuse some people coming from other languages when they start writing non-trivial Ruby code (literally everything is an object being a big one: “what do you mean a class is an object?! modules too?!”).

However, these quirks are much, much easier to learn, understand, and possibly eventually come to like, rather than some (many) of the quirks that exist in Perl.

> what do you mean a class is an object

I think that's pretty standard for object orientated languages. I can't think of any where it isn't the case off the top of my head. Is it not the case in Perl's object orientated support?

And why wouldn't it be an object? Seems more surprising for classes to be an exception.

It means they lack a metaobject protocol.

People coming from Javaville or C#land definitely don't have that kind of abstraction (and yes, the usual rebuttal is that they're procedural, not OO).

Classes are objects in Java. You can even create new classes at runtime in Java or use java.lang.reflect.Proxy to make existing objects implement different interfaces.

I once did this in a class project, including even manually creating the bytecode at runtime. Ironically, it's probably the most understandable and maintainable of the crazy metaclass-style reflection I've done (although that's probably mostly due to the fact that the scope was much more limited: I was only mirroring a set of interfaces into a different namespace to work around some stupid limitation in a different library).

You can, yes, but not nearly in the same way.

> what do you mean a class is an object

> People coming from Javaville or C#land definitely don't have that kind of abstraction

What do you mean? A class is an object in both Java and C#.

Not exactly.

What you see as the Class type in Java, at least, is simply a representation of a class. It can tell you about a class, but it doesn't represent an actual thing you can manipulate, change, or use. I can't add a method to that class, or add a field, or manipulate its instances beyond basic method lookup and dispatch.

In something like Ruby or Smalltalk, a Class is an object. It can be changed live, altered, adapted, cloned, or otherwise used just like another object; and the application will change its behavior accordingly, right there on the spot.

I'd recommend reading up on prototype inheritance, as that is the model generally used in the latter languages.

You’re telling me that class objects in Java are immutable while they’re mutable in Ruby. That’s orthogonal to whether they’re objects.

> I'd recommend reading up on prototype inheritance

Thanks but I’ve already literally got my PhD in metaprogramming in Ruby and object model implementation in Java. Ruby doesn’t use anything like prototype inheritance.

It's more than that... It's not even immutable, not really. There are some things, like privacy of access, that can be mutated by manipulating Java class objects.

That said, maybe the difference here is how often the janky metaprogramming of classes interferes with day to day programming; a Java programmer might have never seen this stuff, more so than a Ruby one?

Yeah, the correction re: prototype inheritance is a good one.

That said, there is a qualitative difference in what we call 'classes' in those two environments, which is part of why I think there's confusion. Class in Java is a fairly different concept than a Class in Smalltalk, Ruby, or JS.

Java has Class which is an Object.

Honest question: Can you write

    Class foo = String;
in Java? If so I guess classes really are objects, if not I'd say "there are some objects that represent (or wrap) some classes somehow."

I think what you mean is that classes are referenced by a separate namespace in Java and aren’t in Ruby.

That’s completely orthogonal to whether they’re objects or not.

You can keep fiction and non-fiction in separate bookshelves if you want but when you get either off the shelf it’s still a book.

I guess I'm not sure what "object" means in Java, then.

In languages I'm used to, an object is a type of value, and values are things that you can store in (or refer to with) variables.

The only vaguely object-like feature of Java classes I can think of is that you can look up (static) methods/attributes on them with a dot.

Or is there more? You can't call `String.toString()`, so `String` isn't an instance of something that inherits from `Object`. I'm at a bit of a loss here... There obviously exist "class objects", but they're not normally what people refer to when they talk about classes. I'd say Java classes are almost purely lexical constructs (though I could be wrong there -- again, I don't know much about Java...)

> In languages I'm used to, an object is a type of value, and values are things that you can store in (or refer to with) variables.

Class theStringClass = String.class;

Class[] myArrayOfClasses = new Class[1];

myArrayOfClasses[0] = theStringClass;


> You can't call `String.toString()`

String.class.toString() => "class java.lang.String"

theStringClass.toString() => "class java.lang.String"

myArrayOfClasses[0].toString() => "class java.lang.String"

somethingThatReturnsAClass().toString() => "class java.lang.String"

> `String` isn't an instance of something that inherits from `Object`

Class theObjectClass = Object.class;

theObjectClass.isAssignableFrom(theStringClass) => true

> I'd say Java classes are almost purely lexical constructs

They aren't - here's the source code for the Class class.


Here's the source code for the toString method we were calling


    Class foo = String.class;
(Though `Class` is actually a generic type, so you'd want to include the type parameter in actual code.)

Sure, I'd say that fits my second suggested wording -- `String.class` is an object that represents or wraps the `String` class.

How do you tell the difference between an object that is the String class, and an object that represents the String class? If you can call .toString on it, and you can compare it to other classes, and you can call .newInstance, then how is it more of a representation than being the actual thing?

> How do you tell the difference between an object that is the String class, and an object that represents the String class?

Hah, I think "an object that is the String class" begs the question :-). I think that `String.class` is an object because you can do objecty things with it (call methods, assign it to a variable etc) and `String` is not an object because you can't do any of those things with it.

To me they're not the same kind of thing, though they're intimately, inseparably and exclusively related. I personally would say even if

  new String()
was just syntactic sugar for

, and all other "bare" uses of `String` were similarly shorthands for operations on `String.class`, I'd still hesitate to say that `String` itself was an object.

My logic is probably less useful than yours, though. I'd say "`A` is a `class`, but `a` is an instance of `Class`" and in your system where `A` and `a` are "the same thing" there's no difference between being a `class` and being an instance of `Class`. Because the things appear in different namespaces (as you say), and are related one-to-one, we could as well say they are two ways to refer to "the same thing", with different operations possible "depending on the phrasing."

Shrugs, it doesn't come naturally to me, but it's at least a self-consistent system.

In java you have to call the special and awkward .newInstance rather than being able to use standard "new". Whereas in e.g. Python I can equally well do

    >>> str("foo")
    >>> x = str
    >>> x("foo")
and the "builtin" str and "userspace" x are equally first-class.
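To spell out the Python side a bit more, here is a minimal sketch (the `Point` class and the `make` helper are made up for illustration): a class, builtin or user-defined, is just a value you can pass around, call, and even modify after the fact.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

def make(cls, *args):
    """Instantiate whatever class object we are handed."""
    return cls(*args)

s = make(str, 42)        # builtin class passed as a plain value
p = make(Point, 1, 2)    # user-defined class, same call path

# The class itself is an ordinary object: it has a type and can be changed live.
Point.norm1 = lambda self: abs(self.x) + abs(self.y)
```

Note that `p` was created before `norm1` was attached and still picks it up, because method lookup goes through the class object at call time.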

Sure, the ergonomics are different. One is a statically compiled language and one is dynamic. But it's still an object.

If I write

    new String("foo");
in Java, how do I get an object that can take the place of String in that expression? String.class has some relationship with the "String" in the code I quoted, but it's not the same thing; the thing that I apply new to is not a thing I can call methods on, and the thing that I can call methods on is not a thing I can apply new to.

Not to mention the nightmare of ruby versions and packages.

I quite like the language but my biggest drawback to using ruby for anything is versioning and packages.

Honestly curious: can you give an example of what kind of issues you run into? Asking because I basically never have versioning or package issues with Ruby. And on the rare occasions when I do, there's usually a good Stack Overflow answer or Github issue.

Huh, with Bundler, Ruby was one of the first language ecosystems that actually got its act together on those points. It is so much more mature than python in this regard, for example.

I would agree; however, bundler & rvm are far easier to set up and use compared to Python’s virtualenv (and the few other alternatives frequently brought up). I’m not sure there is even an equivalent for the Perl ecosystem, as I gave learning Perl an actual effort a while ago (no idea why) and all its quirks and weird syntax immediately turned me away and I’ve never looked back.


I haven't used it much, or in a while, but I thought Ruby had its package story down.

Python packaging is pretty bad: requirements.txt is a pretty ad hoc list and pipenv doesn't seem ready for prime time (haven't tried poetry yet).

> \s, only kind of, I love Ruby but it does have a few quirks that I see confuse some people coming from other languages when they start writing non-trivial Ruby code (literally everything is an object being a big one — “what do you mean a class is an object?! modules too?!”).

That's also true in python

    >>> type(enum)
    <class 'module'>
Also ruby Procs aren't objects

And unlike Smalltalk &&/and ||/or are not methods

It's also slower, which is quite a pity because I really like the syntax compared to Python. It has proper end statements for example.

We still use Perl5 heavily for web and backend services development (Mojolicious). If you keep your code tidy it's maintainable, so it all boils down to how you write your code. Needless to say, my tiny scripts are written in Perl. I sometimes even use it from the command line to calculate dates in the future or do something that is too complicated with sed and awk.

> because I'm too stupid to write actual lisp programs

Even PG admits that Lisp (in particular CL) is an "atrocious"[0] scripting language for Unix, in part because the text-processing applications that you speak of aren't its forte. It's not that you're "too stupid", it's that it's harder than it needs to be to write scripts that do useful things on filesystems and text files using Lisp in a way that is compatible with multiple CL implementations. You'd still be Googling plenty and Lisp sure doesn't have the wealth of libraries that Python or Perl does.

On the other hand, I'd recommend you learn some form of lisp anyway for the aha moment when you implement a domain specific language with macros and it scratches some itch of yours. Try Ruby, too, for the lispiest of those three scripting languages.

[0]: http://www.paulgraham.com/popular.html

That may be true in CL, but there is a scripting dialect of Lisp called TXR Lisp. It's documented in one big manual page (just like GNU Awk, Bash, and other classic programs). If you can't find what you're looking for there, either it doesn't exist, or you need help from StackOverflow or the TXR Mailing list.

TXR has decent capabilities for text processing, plenty of POSIX functions built right in, and such.

There is a decent port of TXR to Windows, which uses a fork of the Cygwin DLL called Cygnal. Cygnal provides more Windows-like behaviors in areas where stock Cygwin is too assertive with POSIX conventions in ways that make Cygwin programs confusing to Windows users.

How about Guile? I've only seen it used as an embedded language (in C) or using the repl.

For me, also with quite a bit of experience with Perl and Python, the main reason to use Python over Perl is readability. Especially when it comes to code written or modified by less experienced engineers.

Programmers tend to think that writability is the most important when it comes to coding, while readability is actually far more important.

I agree that Python's re module is harder to write with than regexps in Perl or Ruby. But honestly it's not that much harder, and usually using it leads to clearer, more explicit and more readable code.
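A small sketch of what I mean by "more explicit" (the log format, `LOG_LINE` and `parse` are invented for this example): the pattern is compiled once under a name, and named groups replace Perl's positional `$1`, `$2`.

```python
import re

# Compiled once and given a name; the groups are named instead of numbered.
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<msg>.*)"
)

def parse(line):
    """Return the named fields of a log line, or None if it does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None

record = parse("2018-09-01 ERROR disk full")
```

It is more typing than a Perl one-liner, but the reader never has to guess what the second capture group was.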

I lean to Python when writing anything more than a few lines that I can see extending, but Perl (perl -pe '<code here>') still rules for command line pipes where I need a bit more power than what the standard Unix utilities provide.

Perl is no longer my first choice for scripting. Once upon a time perl was preinstalled on almost every Linux box and was a default package on most distributions; now you can't assume that. Also it is a bit difficult to use cpan - often the C code for a module does not compile for some reason during installation and you have to go and figure out why; on the other hand I never had any problems with pip. (Though I still don't quite like Python)

On the other hand bash has maps so it is possible to use that more for scripting.

Bash has a horrible syntax. Beyond a simple for loop I would reach for Perl.

But it has set -x, which really helps to see what the script is doing step by step. I don't know how to do that easily with other languages.

Put "how to trace a X script" (where X is a language name) into the search engine of your choice.
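For Python, for instance, the standard library ships a `trace` module (`python -m trace --trace script.py` prints lines as they run), and you can roll your own with `sys.settrace`. A rough sketch, where `tracer`, `traced` and `work` are just illustrative names:

```python
import sys

traced = []

def tracer(frame, event, arg):
    # Record every line executed in frames created while tracing is on.
    if event == "line":
        traced.append((frame.f_code.co_name, frame.f_lineno))
    return tracer  # keep receiving events for this frame

def work():
    x = 1
    y = x + 1
    return y

sys.settrace(tracer)
result = work()
sys.settrace(None)
```

After this runs, `traced` holds one entry per executed line of `work`, which is roughly the information `set -x` gives you for a shell script.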

Shell scripts have so many footguns. Python is almost as terse, takes less time to write (for me anyway), and is way more maintainable.

What Linux distros don't include Perl by default? I certainly haven't come across any...

Well I did not do a lot of research for this post, but many installations that I have to deal with come from some image where perl has not been installed.

I just use perl5. Python is too hard to use for perl tasks. Separate venv for a script is horrible overkill. I also use Fedora Linux, where everything is integrated and just works.

> Separate venv for a script is horrible overkill

...and god help you when you write code/use a library that depends on language features not available by default, without pragmas, on the system Perl (5).

There is local::lib, and if you don't use RHEL or some other OS with ancient Perl it works just fine. Or just install what you need from the OS if you have root. Or use plenv and cpanm and whichever Perl version you want to create a local environment.

> Or just install what you need from the OS if you have root.

The availability of libraries is not the problem. The availability of language features is. Manually replacing the OS version of a ubiquitously-depended-on language as root seems like a very poor idea. The last time I tried that with Perl, I broke GNU Parallel (whose Perl code is a fascinating read, by the way!).

> Or use plenv and cpanm and whichever Perl version you want to create a local environment.

I agree; that's the right approach (or Perlbrew or any other portable-Perl installer). GP was responding to someone who argued against using venv, which is the equivalent of plenv.

I was arguing about installing Perl modules from the distribution's repository as root, not replacing the system interpreter. The other solution is system wide module install with cpanm. Perl5 has plugin "language features" available via modules such as Moo, Role::Tiny, MOP, autobox, Clone etc.

Whenever I need to write a tiny script that tests the bounds of what is easily done with just awk and Bourne shell, the tool that I next reach for nowadays is TXR or TXR Lisp.

I made it myself.

One of the conscious goals of TXR is to have a scripting language in the POSIX environment (without forgetting about MS Windows too) that is acceptable for people who would rather be coding in Lisp. (Speaking of awk: TXR has it: it's a Lisp macro.)

TXR saves coders from Python, Perl, Ruby, Tcl, Lua and whatever else.

TXR has no ecosystem or libraries; everything comes with the program. If it's not described in the manual, it doesn't exist; stop looking and write it yourself.

This kind of thing is forbidden in the TXR project: https://xkcd.com/1987/

Here is my unrelated to the article perl memory...

I fell in love with perl at school because it seemed easier than sh/awk/sed/etc... After I got a job I still used perl for everything. Perl5 came along and made a bunch of really needed improvements [1]

My problem was that at work they used rs6000 machines running AIX for everything and perl5 didn't work on AIX. So I couldn't use the latest toys. So I took time out and fixed dynamic linking so perl5 worked correctly on AIX. http://web.mit.edu/darwin/src/modules/perl/perl/ext/DynaLoad...

Then later I got an email from Redhat offering me shares of stock when they went public. This was because of my perl contribution. That was clearly some shady scheme so I declared that email spam and threw it away.

1) But yes perl5 is quite a bit more complicated than perl4 and that bothered a lot of people.

I'm always worried when a system seems to have no drawbacks. The strengths of Perl 5 and Perl 6 and the speed of C? That sounds strictly harder than either writing Perl 6, or the decades of work put into optimising Perl 5.

What are the limitations which make this achievable?

It's a little bit ridiculous, but you could actually have this if you transpiled perl to rust :)

The amount of perl (5) that relies on 'eval', and invoking the interpreter and compiler at runtime (in various degrees), is . . . . massive.

To claim that switching to a "traditional" compiled language would resolve those issues in the language that is perhaps the most comfortable with lapsing into "well, I can't metaprogram it, so just take this string/block/function-like of code and interpret/run it, tell me if it works, and give me the result" is hubristic to say the least.

None of that should be taken as condemning that tendency in Perl. I think that its tight interpreter/runtime coupling and feedback loop is what makes it uniquely powerful in many cases.
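The pattern I'm describing, sketched in Python rather than Perl just to keep it short (`try_eval` is a made-up helper, and emptying `__builtins__` here is only to keep the example small, not a real sandbox):

```python
def try_eval(source, env=None):
    """Compile and run a string of code; report whether it worked and the result."""
    try:
        # The compiler and interpreter are both invoked here, at run time.
        return True, eval(source, {"__builtins__": {}}, env or {})
    except Exception as exc:
        return False, exc

ok, value = try_eval("a * 2 + 1", {"a": 20})
bad, error = try_eval("a * ")   # not even valid syntax
```

Any ahead-of-time compiler that wants to support this has to ship a parser and an evaluator inside the compiled program.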

Yeah, the entire thing is rubbish.

I'll just keep writing perl5 that solves customers' problems, and use the rust integration we now have to write fast safe code when I need that.

Yeah but then how do you achieve what perl 11 sets out to be then?

I'm not really trying to push this perl->rust transpilation idea (it was just a silly thought, people seem to have taken it very very seriously), but maybe v0.1 can just not support eval, and then see where they go from there?

Either way it seems like perl 11 is going to have to solve this issue of whether or not they have to bring interpreter & runtime with them as well if they want to support C/C++ speed and `eval`.

eval is not evil per se. eval is mostly a necessary feature to do macros, i.e. compile-time evaluation in proper syntax (called BEGIN blocks in perl). The only problem with eval is its slow internal implementation, compared to a proper language. Two big stacks of structs (~10 words) on the heap, compared to one word on the stack. longjmp for exceptions. For perl being a 20x slower scripting language it's fine though.

rust on the other hand is an even more hype driven language than perl6. Their documentation is full of hype and lies. perl6 is at least honest about its limitations, but rust is hyping about its safeties and performance while there's none. rust has no type safety (unsafe keyword), memory safety (refcounting. only stack or external heap allocations, both are unsafe) and fearless concurrency safety (requiring manual locks. dangerous and unsafe). perl6 is much safer than this, and parrot had even proper lockless concurrency and safety. rust is a nice language, I use it also, but never trust liars and hype-driven development. For proper performance and safeties use pony. ATS is also fine, and there are some new parallel languages cropping up. For slow and safety alone there are also many nice copying languages around, like Go, Erlang, Scala, Akka or Lisp.

c/c++ speed is an explicit non-goal for perl 11. We have no chance to catch up to javascript, only to php7. But even this is not easy because the perl5 API is as fucked up as the ruby and python API, exposing its bloated inner structs instead of functions only. You only get proper performance by optimizing those fat structs to single words, "unboxing" data, slimming down the code and runloop. With an exposed API like that based on structs you are doomed to be 10x slower. Additionally there are so many exposed globals in the API, that proper threading is almost impossible.

gperl has no eval and was about 200x faster. The goal of gperl was to add proper eval but it's not easy, and he already went away from perl5 to go and java, as most asians did recently. There's spvm though, a new asian perl5 attempt to create a proper VM.

The goal of perl11 is simply to add most perl6 features to the perl5 codebase without breaking backcompat. This is what perl5 should have done, but refused to do and is not able to do.

> c/c++ speed is an explicit non-goal for perl 11.

That's not the message sent by the start of the webpage ( http://perl11.org/ ) at the moment:

> Perl 11 is currently a philosophy with 3 primary tenets:

> 1. Pluggability Of Perl On All Levels

> 2. Reunification Of Perl 5 & Perl 6

> 3. Runtime Performance Of C/C++ Or Faster

Maybe the page should be corrected?

That's from rperl. rperl transpiles perl to C++, so it's correct in this context only, but not for the others. He added that goal just yesterday :)

One beast will rise from the desert, the other from the sea. When they meet, they will fuse to form the prophesied CamelCrab, and you will know the end is nigh.

Could you though? I mean the semantics are so different that the code you generate is probably going to need to be the equivalent of Perl's VM at that point anyway.

In fact, I asked about this:


So, in short, the answer is no. Rust currently lacks a clean way to do this.

The semantics are hard to translate also. I'm doing a little lang in rust and it is HARD. However, is it possible? Yes.

But I think rust needs to be able to run from a shared lib and compile to memory, like


I think you could -- you'd have to just translate all those semantics down to rust as well. I can't think of any perl semantics/sugar that can't be compiled out. Even the famous default variable ($_) could be naively transpiled out with just a variable binding.

Also, if I'm understanding the original post right, they want the VM itself to be swappable -- that with the desire to reach C speed makes me think that they want to make it optional altogether.

Your reasoning is kind of like saying we only need a 6502 and a large bank of ram because, hey, that is turing equivalent, and a modern x86 with the same amount of ram is turing equivalent, so it should be complete.

Imagine a perl program which asks the user for input. Based on the user's input, the perl program does "do a.pl" or "do b.pl". Hell, maybe the user's input is actually executable code that is thrown in the mix too.

Then the original program does "print $a + $b;". What should the rust transpiler do? Are they floats? Are they ints? Or strings? Do they have values already?

Yes, you could create a transpiled version that used hashes to look up symbols, and based on what the hash returned it would execute code to add two (boxed) ints, or floats, or float and int, or throw an error, but then what you've done is rewritten the perl interpreter in rust.
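Roughly what that generated code ends up looking like, sketched in Python for brevity (the `symbols` table, `to_number` and `dynamic_add` are hypothetical, and the string-to-number coercion is much cruder than Perl's real rules):

```python
# Hypothetical runtime for the transpiled program: every variable lives in a
# symbol table and every "+" goes through a type-checking helper.
symbols = {}

def to_number(value):
    """Coerce a boxed value to a number (simplified, not Perl's exact rules)."""
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        for convert in (int, float):
            try:
                return convert(value)
            except ValueError:
                pass
        return 0  # simplification: non-numeric strings count as 0
    raise TypeError(f"cannot add value of type {type(value).__name__}")

def dynamic_add(name_a, name_b):
    """What `$a + $b` becomes when nothing is known at compile time."""
    a = symbols.get(name_a, 0)   # undefined variables default to 0
    b = symbols.get(name_b, 0)
    return to_number(a) + to_number(b)

symbols["a"] = "3"    # might have been set by a.pl, b.pl, or the user's input
symbols["b"] = 4.5
result = dynamic_add("a", "b")
```

Every one of those lookups and type checks happens on each addition, which is exactly the work the interpreter was doing in the first place.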

> Your reasoning is kind of like saying we only need a 6502 and a large bank of ram because, hey, that is turing equivalent, and a modern x86 with the same amount of ram is turing equivalent, so it should be complete.

Yes, this is literally the case, I didn't say it would be easy or that it should be done. Maybe I should have been more clear, but it was just a silly joke, lighten up.

Also, analogizing perl and c/c++ style languages to the modern x86 and 6502 is a bit hyperbolic.

If I'm understanding your point correctly, you're saying that dynamic interpretation (via code loading or `eval`) is an important perl feature you want to retain. In a world in which someone was crazy enough to write a perl->rust transpiler, they could just not support eval for v0.1, and figure out how they want to deal with it in the future. One option is to compile in a runtime + interpreter that could be used if the code being built uses eval.

Any other difficulties you can foresee? Or areas where there's a 6502-x86 style feature disparity?

Having eval is a part of the language, and without it you aren't transpiling perl, just perl-- or whatever you want to call it.

That doesn't make sense, the cost incurred by dynamic languages is not just due to the interpreter (otherwise that would be easy to solve with a JIT or, as you point out, static compilation). Rather the problem is that since these languages are by nature dynamic you can't make many of the assumptions that you do in static languages. You cannot know what a value passed as parameter is going to hold until the function is actually called, you cannot inline method calls because they could be redefined dynamically etc...
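A tiny illustration of the inlining problem, in Python since it has the same property (`Shape` and `total_area` are made-up names):

```python
class Shape:
    def area(self):
        return 1

def total_area(shapes):
    # A static compiler would love to inline Shape.area into this loop...
    return sum(s.area() for s in shapes)

shapes = [Shape(), Shape()]
before = total_area(shapes)

# ...but any code, loaded at any time, is allowed to rebind the method.
Shape.area = lambda self: 10
after = total_area(shapes)
```

The same call site gives a different answer after the rebinding, so an inlined copy of the old `area` would simply be wrong unless the compiler also emits a guard and a fallback path.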

For this reason JIT actually makes more sense than static compilation since at least then you'll potentially be able to use dynamic runtime information to better optimize the code.

perl6, cperl, rperl are all gradually typed, and if so compile to native unboxed ints and floats. If untyped the cperl jit will observe run-time types and switch to the unboxed variant or the whole basic block then, if profitable. moar (perl6) ditto.

Same with the object system. A native class has the same layout as a C struct. No costly refcounting and indirections, just a word with the data or pointer. Single op assignments, reads and stores. Easy to ffi back and forth. Fast objects.

You can inline method calls on the tracing jit when it's called often enough with the right types. Static methods can be inlined at compile-time already, but that's not merged yet. A few bits are still missing in the inliner.

The static compiler benefits greatly from types also, but the type inference is not that bad either, just a bit unstable (B::CC). Many modules cannot be compiled statically with this optimizing compiler yet. The normal static compiler (B::C) is stable though and has been used in production for decades. cperl just adds many memory optimizations for that use case.

If you properly type your sigs and vars, the static compiler will fly. If not, the jit has more work to do. The biggest problem is only the non-cooperation from p5p. They explicitly disabled types in sigs (~4 lines of code) in their part, so nobody can use it because it breaks code. python, ruby, java went with annotations as comment. But this is lame. perl5 had the chance to do it properly when they introduced sigs (using the perl6 syntax or the syntax from its cpan modules) but blew it on purpose.

Or just transpiling into C...

True -- I thought rust just because it's what I write in lately (and some of its benefits over raw C) but this would be equally true, C/C++ would make just as much sense.

BTW over in the land of things-that-maybe-shouldn't-be-things:


I don't think you can transpile Perl. Maybe a subset. https://www.perlmonks.org/index.pl?node_id=663393

RPerl is a Perl 5 transpiler. We started with a low-magic subset of Perl 5, and are now slowly moving into the medium-magic parts of Perl 5 like regular expressions and Moo(se) and databases etc.

See B::C and B::CC, the compilers to static C. Also called the perl-compiler.

B::C just jumps into the libperl run-time, but B::CC even tries to optimize away many slow run-time functions, because it knows much more about the code and data.

perlcc compiles your perl code into native code and runs it. It's very stable, because I'm maintaining it, and it's used in production.

you can't do this without type information

Nor could you with Perl to Rust...

The "Pluggable VMs" goal is just a bad idea IMO. The more generic you make an interface, the more you ruin it. Tight integration may make certain things harder, but it allows better control over the pathways for performance optimizations, and keeps the implementation simple which inherently increases maintainability. And in practice, only one of these VMs is going to actually be used by 97% of people, and the other 3% can do the extra legwork themselves in a maintained fork if it's really that important/beneficial to their niche.

Agreed. This was mostly pep talk, parrot inspired. We don't have the luxury of a modern clean functional-only API, where you can simply switch VM's. But think of Amazon. Demand a properly designed API and stick to it. Version it. Hide the structs and details. That was the idea at least.

Most of the work is the stdlib anyway, not the VM. A proper VM is max 5k src loc, but a stdlib is 10x more nasty work.

But to have an actual small, fast, thread-safe VM was a good goal nevertheless. We got it both small and fast, and then also big, slow and thread-safe, but not all together at the same time.

Nowadays I would only consider tvmjit (luajit with lisp oo), p2 (potion: lua with ruby-like mop and jit) or pony (non-blocking only, fast and safe) useful for a VM. A lisp (like chez) not really anymore, the days of pointer chasing are gone. POSIX is not going anywhere on multicore. It needs to die. (I've told that L4, but they still like to support blocking for POSIX backcompat, and have optional timeout args for everything.) .NET and jvm only for the worst case, but they got all the libs and the good GC's.

The "pluggability of Perl at all levels" is a goal meant to move us away from the existing Perl 5 core design, which is mostly C macros and is essentially unmaintainable. Even if we never truly achieve pluggability on ALL levels, we must at least work TOWARD that goal, in order to achieve the other big plans such as reintegration of Perl 5 + Perl 6, and C/C++ performance, etc.

Smalltalk has been able to achieve the goal of pluggable VMs. If you check the Squeak project, you can choose VMs according to your desired performance/stability.

I think the Perl 6 developers figured that out a few years ago, and ended up implementing MoarVM, instead of using or building a more generic VM. One can implement other languages in MoarVM, but it's been built for Perl 6, so it fits Perl 6 best. Maybe it isn't a super "tight" integration, but it's much tighter than something like Perl 6 on the JVM could ever be.

It's more very tight semantic alignment across the entire stack of abstractions. Which is actually the important part, things like a "string" being the same conceptual object from top to bottom.

It's a silly idea.

Fortunately, none of the people actually involved in progressing perl5 or perl6 have anything to do with this moonshot.

It's a silly idea only as afterthought.

perl6 is still hooked onto it, Larry Wall constantly confirms the validity. They went through some successful VM's already, so why not. Esp. since the rakudo/nqp layer needs a lot of work.

- They have three layers in its main implementations now: moar (prev. parrot) - nqp - rakudo, with jvm or v8 as other backends for nqp, and tvmjit as possible replacements for the moar/nqp layer, and p2 and niecza as standalone impl's.

perl5 being destructed is lucky to have cperl as failover replacement. Pluggable only on the XS API and CPAN layer.

Personally when I saw JVM as a VM for this I lost interest. JVM is not how you make something as fast as C/C++. If you want to make it that fast you should be running it directly on a C/C++ code base. That's pretty much how Python, PHP, and other scripting languages work.

So the sight of pluggable VM backends, which includes JVM, somehow struck you as a deadly limitation?

Funny, most other languages dream of having their core implementation runnable on multiple VMs. Otherwise why does every successful language end up with a separate-and-nearly-equal JVM implementation?

As someone that started coding when C and C++ compilers code quality was seen as too slow for writing AAA games, this kind of comment is always funny somehow.

That's not true anymore with the new JVM tech

The third goal in particular looks like a bit of a moonshot... they offer little technical information for how they’d actually achieve it.

What, "Runtime Performance Of C/C++ Or Faster" ? Obviously they can't rival the speed of "C/C++" so they're probably going to go with "Faster", which isn't a real language so it's going to be easy to beat.

I’m not familiar with the details but I’d wager they’re referring to JIT compilation which can exceed precompiled native code as it can (in theory) account for actual usage patterns.

That was a fun theory 15-20 years ago. We've now got piles of evidence it doesn't work that way. JITs can accelerate slow languages, at a stiff RAM penalty, but they still remain noticeably slower than faster languages.

Claims to the contrary at this point will require concrete demonstration. Starting from the union of Perl 5 and Perl 6 is just about the worst starting point for that goal I can imagine. (Not necessarily because "Perl sucks", but because any such solution to that problem is probably going to require writing the language with that goal from day 1, not taking existing languages as a given. Or, like LuaJIT, cutting things that don't fit. And between Perl 5 and Perl 6, there are sooooo many "things"....)

I'm with you about the overall point that Perl5/6 are not going to get C/C++ speed.

However, on the specific point, why do you think the JIT forces high memory usage? In the case of Java, my perception is that it's the pervasive use of boxed types everywhere and a standard library that's past its sell-by-date that drives bloated memory usage.

Maybe my sense of things is a little skewed by my work: I work on server processes that use gigs of memory. There, the resources for the JVM/JIT itself usually use vastly less memory than the java objects contained in the heap (probably 10x, and this is for a monolith with thousands of classes).

The JVM uses the memory to optimize the garbage collection, so it is a tradeoff between memory usage and speed (favoring the latter).

It's the pervasive requirement of a dynamically typed language, not the JIT per se.

And yet Perl 6 has been dropping their performance floor by enormous factors, with a lot of room for improvement still.

Everyone always shrugged at me when I explained that Perl 6 was designed to be able to be fast, with the ability to outstrip Perl and even other languages.

And now that it's happening? They are still shrugging, but it's the uncomfortable shrug of someone trying to shake off an inconvenient truth.

"And yet Perl 6 has been dropping their performance floor by enormous factors, with a lot of room for improvement still."

Considering the literally-unusable performance that Perl 6 has had within fairly recent memory, getting a lot faster hasn't been that impressive.

Plus, goals aren't results. I'll believe Perl 6 is natively fast when somebody actually shows it outcompeting, say, Rust, on some non-trivial task, written in the native idiom. No fair just sitting there in a loop and adding integers, which is easy-mode for a JIT. Show me a real program, that I didn't have to write in a magical subset of Perl 6 to get the performance (a problem JavaScript has: "fast JIT'ed JavaScript" is a mysterious and ever-changing subset, where your performance can be tanked at any moment by the smallest of changes), and don't tell me about how it's 100x faster than last year, show me how it's faster than Rust now. Or, heck, just show me beating Go. I'll believe it when I see it. And I will believe it when I see it. I have no problem with that. But I've seen the whole "as good as or better than C" claims a lot over the years, and the people making those claims are batting nearly 0.000. (Rust is pretty much the only language with a chance.)

You seem to believe in the existence of a bunch of people who have somehow wrapped their identity around hating Perl 6. That's not it; what you're seeing is that the Perl 6 community burned through their goodwill literally years ago, and now people are just exasperated when the topic comes up. If it helps you to identify as some sort of persecuted minority, hey, go nuts [1], it's a very popular option nowadays, but the Perl 6 community is years past the point where mocking the potential customers into trying it is going to do any good. You're going to need hard evidence... really, really hard evidence, probably rather a lot of it, more than you'll probably think is fair but such is the hole the Perl 6 community has dug itself into over the years... to convince people, not mockery.

[1]: Actually I think it's a terrible and very unhealthy idea; you'll find none of the collected ancient wisdom of humanity will tell you that constantly nursing a persecution complex is the way to wisdom or happiness. But such is the zeitgeist of the era.

> You seem to believe in the existence of a bunch of people who have somehow wrapped their identity around hating Perl 6.

You only have to go to PerlMonks to find them. It's not that hard.

> the Perl 6 community burned through their goodwill literally years ago

That may be so, but most Perl 6 people from that era have either left or reverted to lurking mode. The current Perl 6 community mostly consists of people who became active during the last implementation attempt, based on 6model and MoarVM, and who are focused on results.

> You're going to need hard evidence

Working on it :-). https://twitter.com/zoffix/status/1045623538345017344 shows an object creation benchmark compared to Perl 5 (Perl 6 having become faster than Perl 5), and a "real world" benchmark that's been running for 4+ years: https://tux.nl/Talks/CSV6/speed.log

Yes, it's still not as fast as one would like it to be. Still, as said before, Rakudo on MoarVM is designed to be able to support runtime optimizations. If you like to find out more about how these optimizations are done, check out Jonathan Worthington's blog: https://6guts.wordpress.com

> I'll believe Perl 6 is natively fast when somebody actually shows it outcompeting, say, Rust, on some non-trivial task, written in the native idiom.

If it beats Java then it's fine. I do not expect a VM language to be faster than machine code. Rust is also overhyped. I did some of my own benchmarks and Go and D turned out faster than Rust, or at least their maps/associative arrays are faster than Rust's HashMap. Also, I won't use stuff that is not in the stdlib in the benchmarks.

> I did some of my own benchmarks and Go and D turned out faster than Rust, or at least their maps/associative arrays are faster than Rust's HashMap. Also, I won't use stuff that is not in the stdlib in the benchmarks.

Rust's HashMaps are intentionally slow by default. You could use a different hash function with them if you want different properties. That means writing your own if you don't want a known-good implementation, of course, but this is expected and not really representative of Rust generally, which uses associative arrays pretty infrequently.

Correct me if I'm wrong, but isn't Perl 6 still considerably slower than Perl 5?

As a rule of thumb, that's probably still accurate. But there are specific areas where Rakudo on MoarVM already outperforms the Perl5 interpreter (eg an object creation benchmark was mentioned elsewhere).

I suspect the scale will eventually tilt in Perl6's favour, but it has been slow going...

A lot of them are probably shrugging because they have 0 interest in Perl, and you telling them more about Perl is not going to change their mind.

I personally could not care less about Perl 6’s speed, I will never use it anyways.

> JITs can accelerate slow languages, at a stiff RAM penalty, but they still remain noticeably slower than faster languages.

The Julia language would like a word.

It compiles to native machine code via LLVM, which is the same backend that a lot of statically typed languages use, and bar its very first run, it can run blindingly fast: I've been using it instead of Python at work for Data Science things and it's been great.

As I understand it, Julia's current implementation is not a JIT in any comparable sense to the JVM. Julia's JIT does not use any runtime information to optimize code; it's literally just statically compiling each function on-demand as they are encountered. You would get equivalent runtime performance by having a separate up-front compiler step, as C has. And actually, this sort of implies that Julia doesn't do inlining, the mother of all optimizations, which would mean that you'd be sacrificing quite a lot of runtime performance compared to an up-front compiled implementation.


Actually Julia is aggressively inlined because it's easy to statically determine function calls at compile time...that's why the libraries are able to use such nice abstractions.

It's already ahead of time compilable, just needs some work to emit smaller binaries and cover some corner cases

Good to know, thanks.

> As I understand it, Julia's current implementation is not a JIT in any comparable sense

Going straight to machine code via baseline compilation is a common JIT technique. Until a couple of years ago, V8 compiled everything.

Baseline compilation is fine, that wasn't what I was referring to. In the context of this thread, the long-awaited promise of JIT is the ability to profile running code to dynamically determine the hot portions and re-optimize on the fly, and do this continually for the entire runtime of the program. This is what people mean by "things that are impossible for a static compiler to do" (though PGO is a decent approximation), and the basis of all discussions that begin with "actually, Java can be faster than C". This is not to say that Julia does not satisfy the definition of a JIT, only that using Julia as an example of this principle appears to be mistaken. Julia is not fast because it dynamically re-optimizes, it's fast because it uses a mature C compiler backend.
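To make the "profile, find the hot portion, re-optimize on the fly" loop concrete, here is a toy Python sketch. It is only an illustration of the idea, not how the JVM or any real JIT is implemented: `Tiered`, `HOT_THRESHOLD`, and the hand-written "optimized" version are all hypothetical stand-ins, and a real JIT would generate the fast version itself from profiling data.

```python
# Toy model of counter-based tiering. All names here are hypothetical.
HOT_THRESHOLD = 3  # assumed number of calls before a function counts as "hot"


class Tiered:
    """Run a slow version until it proves hot, then switch to a fast one."""

    def __init__(self, slow, make_fast):
        self.slow = slow            # baseline (think: interpreted) version
        self.make_fast = make_fast  # stand-in for the optimizing compiler
        self.calls = 0              # profiling counter, updated while slow
        self.fast = None            # filled in once the function is hot

    def __call__(self, *args):
        if self.fast is not None:
            return self.fast(*args)  # hot path: no more profiling overhead
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # "Recompile": later calls use the optimized version.
            self.fast = self.make_fast()
        return self.slow(*args)


def sum_to_slow(n):
    # Baseline version: a plain loop over 0..n.
    total = 0
    for i in range(n + 1):
        total += i
    return total


def make_sum_to_fast():
    # Stand-in for generated code: closed form, valid for n >= 0.
    return lambda n: n * (n + 1) // 2


sum_to = Tiered(sum_to_slow, make_sum_to_fast)
results = [sum_to(10) for _ in range(5)]
print(results, sum_to.calls)
```

The counter and the swap are the overhead the skeptics in this thread are pointing at: the slow calls pay for profiling, and the payoff only arrives if the function stays hot afterwards.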

Julia does not have a tracing JIT right now, but does use runtime information in order to JIT specializations based on (often inferred) method argument types.
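A rough way to picture "JIT specializations based on method argument types" is a cache keyed by the runtime types of the arguments. The Python sketch below is only an analogy under that assumption: `specializing` and `compile_for` are made-up names, and no real code generation happens. It just shows that each distinct type signature triggers one compile step and later calls reuse the result.

```python
# Analogy only: a specialization cache keyed by argument types.
# `specializing` and `compile_for` are hypothetical, not Julia's API.

def specializing(generic):
    cache = {}     # tuple of argument types -> specialized callable
    compiled = []  # type signatures that triggered a "compile"

    def compile_for(types):
        # A real JIT would emit machine code tailored to `types` here.
        # This sketch only records the event and reuses the generic body.
        compiled.append(types)
        return generic

    def dispatch(*args):
        key = tuple(type(a) for a in args)
        fn = cache.get(key)
        if fn is None:  # first call with these types: compile once
            fn = cache[key] = compile_for(key)
        return fn(*args)

    dispatch.compiled = compiled
    return dispatch


@specializing
def add(a, b):
    return a + b


add(1, 2)      # compiles the (int, int) specialization
add(3, 4)      # reuses it
add(1.5, 2.5)  # compiles a separate (float, float) specialization
print(add.compiled)
```

Note that the only runtime information used is the argument types at first call; nothing is re-profiled or re-optimized later, which is the distinction being drawn in this subthread.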

For your point to be a disagreement, you'd first have to agree that Julia is a "slow language", that is sped up by a JIT. Would you agree with that? Or is it an example of a sort of language that I referenced, one that was designed to perform well from the beginning that may use some technology that looks like a JIT (though as others pointed out, not necessarily really the same)?

I didn’t mean to say I think it’s possible in this particular case for arbitrary code (particularly for something as wild as Perl!). I was merely explaining what I think is the logic behind the approach.

The only circumstance I can see where a true JIT will beat out native code is when there are many possible paths of execution such that all may not be possible to evaluate at compile time. In that case it’s not even really interpreted vs compiled so much as static vs adaptive.

This is said every single time "JIT" is mentioned, but I've never heard an argument for how that might be possible. As far as I can tell you have two problems: (1) your heuristic has to be so damn fast that it compensates for the runtime analysis time, which is non-trivial for JIT but trivial for AoT since its runtime analysis cost is 0; (2) you need to tell us a story about how this heuristic can be so specialized that I cannot simply write my program, profile it on some training data (a bunch of program inputs), and optimize (gcc and llvm can do this trivially). I never saw any code solving (1) or (2). Do we have any evidence of JIT being faster than AoT? No, I see no such evidence. Even the most cutting-edge JIT compilers like LuaJIT or the JVM are lightyears behind gcc, llvm, rustc etc... Do we have any evidence that we can perform more optimizations at runtime than at compile time? No, I don't see any evidence. JIT was a nice, powerful, avant-garde idea that could change everything, but I don't think it turned out to be the superstar we all thought it'd be.

>This is said every single time when "JIT" is mentioned but I've never heard an argument how that might be possible

I'll explain it but you have to pretend that we are in the 90s, so before you continue reading click on this link:


"Modern" CPUs achieve some of their speed by executing multiple instructions in parallel, using a pipeline: the first stage fetches an instruction from memory, hands it to the second stage that decodes it but while the decoding is taking place the first stage will have already started fetching the next instruction from memory. Conditional jump instructions of course ruin everything, because you need to know the result of their evaluation, before you can decide which instruction is the "next" one.

"Modern" CPUs work around this by always assuming that the jump is never taken and then, if it turns out that the jump does get taken, rolling back the partial work that they did.

As it turns out the vast majority of conditional jumps in a program always go the same way, i.e. any given conditional jump is either always taken or never taken. If the compiler knew which way the condition went it would be possible to lay out the program in a way that jumps are almost never taken, for maximum performance.

A static compiler can't do this but with a JIT you can run the program in bytecode a bunch of times and then use the information you gathered to lay it out in the best way possible.

All that I've said is 100% true and empirically verifiable. The reason this didn't work out is that in the early 2000s all x86 CPU manufacturers started adding this specific optimization directly inside the CPU. They started keeping branch counters and using them to guess which way conditional jumps were more likely to go.
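For anyone who wants to see the difference between "always assume the jump is not taken" and "keep a counter per branch", here is a small Python simulation. It uses the textbook 2-bit saturating counter as a stand-in; that choice is an assumption for illustration, and real CPU predictors are considerably more elaborate. The function names and the example branch history are made up.

```python
# Toy comparison of two branch prediction strategies on one branch.
# The 2-bit saturating counter is a textbook scheme used for illustration.

def static_not_taken(outcomes):
    """Mispredictions when every branch is assumed not taken."""
    return sum(1 for taken in outcomes if taken)


def two_bit_counter(outcomes, state=0):
    """Mispredictions for a single 2-bit saturating counter.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move toward 3 on a taken branch, toward 0 otherwise.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Hypothetical history of a loop's backward jump: taken 99 times,
# then not taken once when the loop exits.
loop_branch = [True] * 99 + [False]

print(static_not_taken(loop_branch))  # misses on every taken jump
print(two_bit_counter(loop_branch))   # misses only while warming up and at exit
```

The counter learns the direction of the branch after a couple of iterations, which is the same information a profiling JIT would have used to lay the code out, only gathered in hardware.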

There are other optimizations that a JIT compiler can do and a static compiler can't, but the story is similar. Big x86 manufacturers can do almost anything a JIT compiler can do and that's why the technology is essentially obsolete.

There's still some value in distributing a single binary that executes at near-native speed everywhere, but that's basically it.

A JIT can do all the optimisations that a static compiler can, and then on top of that it can do additional optimisations that a static compiler can't.

> A JIT can do all the optimisations that a static compiler can

As long as it has <100ms of compilation time.

Why do all JITs need to run in <100ms?

If you run your trading app once a day, all day, and warm it up for an hour first, and want max speed, why do you care if it takes a few seconds to compile?

If that is the scenario that you care about, then you can use profile guided optimization in statically compiled languages.

Clang has instructions on how to do so here: https://clang.llvm.org/docs/UsersManual.html#profile-guided-...

But that only gives you info from previous runs. A JIT profiles the current run. That's strictly more information than an AOT compiler has.

If you're running code all day long and profiling it dynamically the whole time, then you're going to pay for it. There is overhead to JIT, and there is overhead to profiling. There is no overhead for a statically compiled program, no matter how you profile it.

I still need evidence that my real-time trading app running an entire gcc inside it makes it faster. I have no problem with JIT in theory; everything you say makes sense. I've never seen this trend in practice though. Either the optimizations you're talking about can be used too rarely, or they are so complex that most JITs don't implement them, or they are so expensive that they don't give any net gain.

Well, it doesn't actually make sense in theory. While JIT theoretically can profile a current run it also adds a JIT related runtime overhead. It needs to waste a lot of resources on both profiling and recompilation to make use of any new information. That's always going to be strictly slower than PGO techniques, with basically hand picked optimizations for each application and zero overhead.

Yes, that's strictly more information. You still better collect it quickly, as you are compiling your functions each time they run.

Besides, people really don't like that warming-up period. But it's not the current bottleneck.

Anyway, JIT has great potential. Just not for desktop software, or network services, or your trading app. It can be great for scientific computing, for example. But that's potential; currently it's not any good in practice.

Well, the Android team decided that the AOT compilation introduced in Android 5 was a mistake, and they went back to a mix: a first-level interpreter handwritten in Assembly => JIT + PGO => AOT + PGO (taken from the JIT) when idle.

Also the large majority of modern Windows software is actually written in .NET, with small C++ pieces.

It's true that AOT compilers can get most of the benefits of JIT compilers by leveraging profile guided and link-time optimization. In principle, we could get rid of shared libraries, ship LLVM bitcode instead and statically compile everything, enabling cross-library optimizations - but so far, we tend to not do that...

LuaJIT comes close to pulling this off in some cases (and does in a select few), but obviously not across the board in general use.

This is an easy answer which can be found on the homepage of both the Perl11.org and RPerl.org websites. But sure, I can re-explain it for those who didn't bother to read the basic source material! ... RPerl compiles Perl 5 into C++, which can then be further compiled into a binary or shared object or bytecode or whatever you like with your own favorite C++ compiler. Yes, you can consider RPerl to be a "transpiler" if you like the term. The resulting C++ code has virtually no runtime overhead, and thus runs at least at the speed of native C++. RPerl also offers automatic parallelization via integration with the PLUTO polycc compiler collection, which can often provide faster-than-normal-C++ performance.

