Control-flow and computation optimizations have enabled the use of higher-level abstractions with little or no performance penalty, but at the same time it's almost unheard of for a compiler to automatically perform (or even facilitate) the data-structure transformations that are daily bread and butter for programmers doing performance work: AoS->SoA conversion, compressed object references, shrinking fields based on range analysis, flattening/denormalizing data that is used together, converting cold struct members to indirect lookups, compiling different versions of the code for different call sites based on input data, etc.
It's baffling, considering that everyone agrees memory access and cache footprint are the primary performance bottlenecks today, to the point that experts recommend treating on-die computation as free and counting only memory accesses in first-order performance approximations.
Unfortunately, no public compiler seems to be available at this time.
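To make the complaint concrete, here is what the AoS -> SoA conversion looks like when done by hand in Rust, with a hypothetical `Particle` type; the point above is precisely that no compiler will write this transformation for you:

```rust
// Array-of-Structs: one contiguous struct per particle.
struct ParticleAoS {
    pos: [f32; 3],
    vel: [f32; 3],
    mass: f32,
}

// Struct-of-Arrays: one contiguous array per field. A loop that only
// touches `pos` now streams through memory with no wasted cache lines.
struct ParticlesSoA {
    pos: Vec<[f32; 3]>,
    vel: Vec<[f32; 3]>,
    mass: Vec<f32>,
}

impl ParticlesSoA {
    // The mechanical conversion a compiler could in principle automate.
    fn from_aos(aos: &[ParticleAoS]) -> Self {
        ParticlesSoA {
            pos: aos.iter().map(|p| p.pos).collect(),
            vel: aos.iter().map(|p| p.vel).collect(),
            mass: aos.iter().map(|p| p.mass).collect(),
        }
    }
}
```

Doing this manually means every access site changes from `particles[i].mass` to `particles.mass[i]`, which is exactly the churn a layout keyword would absorb.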
As I don't have much experience in this area: Can you elaborate on some use cases for layouts other than AoS and SoA?
A lot of my concerns could in theory be addressed in user space, but the same can be said of any language feature, which sort of defeats the purpose of a keyword. The point of the keyword is that it can inspect the fields of a type and decide how many arrays of what size to allocate. Here are some issues that would be annoying to implement in user space if AoS/SoA were merely an orthogonal keyword (e.g. with limited operator overloading and without Turing-complete templates or macros):
* Alignment and padding so certain fields can be loaded into specific registers. This affects the layout of memory differently in each mode.
* Multiple buffers for some fields. A heavily used field (say, position) could have several buffers that are rotated in SoA mode (e.g. render-positions, last-frame-step, next-frame-step, allowing for reader/writer systems) and are simply an array of fields in AoS mode, whereas other fields retain their singular arity.
* Concurrency concerns. Partitioning arrays in SoA so cores don't fight over the lower caches. Aligning AoS on cache lines so whole objects can be locked at once without causing cache problems with neighboring objects.
* Sparse fields. If a field is generally empty, space is still allocated for it; avoiding that costs either a dictionary rather than an array in SoA mode, or a cache miss from chasing a pointer.
* But to answer the actual question, it comes down to data structures: if I wanted to iterate across objects in an arbitrary quad-tree node and its children, there are efficient ways to lay out memory so that that and similar useful operations are fast (e.g. minimal cache misses). Yes, I could put the data structure in a different array, but then I am looking up values in the data array randomly (e.g. cache misses). Many useful data structures have efficient schemes for laying out their values and their structure in the same memory.
To elaborate on the last one more: maybe I have a structure with 4 fields, two of which are keys that can be structured (e.g. a string name, and an x,y pos), and two of which are actual data members. Regardless of `AoS` or `SoA`, I would have to write custom data structures to maintain string (trie) and pos2 (quad-tree) indices, which would also be doing slow lookups. Alternatively, if I could set the layout of an array to `Trie.name` (Trie using the name field), I would be saying: lay out this container to be as fast as possible (e.g. minimal cache misses) for string-based lookups by making that the primary layout, without having to change any code that uses it (you could even go the next step with something like `SoA | Trie.name | QuadTree.pos`, allowing the reordering of a single line of code to affect the performance of all code referencing it, while the semantics of the line act like normal bit flags).
tl;dr: Think database indices on columns, but improving runtime performance by rearranging the data itself into the index so as to better fit the algorithm and CPU requirements (e.g. rather than more abstract decisions in databases, where the model of a piece of hardware isn't usually a concern).
Ctrl+F for "data structure". It's actually the reason LLVM exists at all: http://llvm.org/pubs/2002-06-AutomaticPoolAllocation.html http://llvm.org/pubs/2003-04-29-DataStructureAnalysisTR.html http://cgo.org/cgo2004/papers/06_76_lattner_c.pdf
Contrast with Scala where the programmer doesn't have the ability to reason about memory AFAIK:
It would be very interesting to hear what happened to these LLVM research directions. I guess the default answer is "nobody wanted to fund it after the publications got written"...
Things may be different now that Moore's Law no longer holds and many programs are cache-bound, but any current research also needs to take into account GPUs, distributed computing, and deep learning. There might be room for new research, but the research would likely be as much about auto-vectorization and minimizing trips between memory spaces as about local optimizations like AoS -> SoA conversion.
One should be able to specify a compile time macro that controls memory layout.
It really changes how you do basic things like sorting. In the standard AoS approach in C-like languages you swap entire structures around the array. In an SoA approach in APL-like languages you generate a list of indices that would put the data in sorted order, then apply it to each column. I've written code to do this in C++ for high-performance systems a number of times, and it works great, but it is definitely a different way of thinking about things.
Does this require anything more than transposing some struct member accesses? IIRC, Futhark already does that kind of conversion.
Take the sorting example I gave. In a column-oriented program you would generate a list of indices and apply them to each column. That has very good cache locality, as you work on each column independently. Also, you tend to manipulate that index list instead of the data, and after everything is done, filtered, and whatever, then apply it to the data.
If you did things the AoS way you would be bouncing around in memory going from column to column.
Dealing with SoA and column-oriented data is more than just a storage decision.
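A minimal Rust sketch of the index-list sort described above (the column names and data are made up for illustration):

```rust
// Compute a permutation of indices that would put `keys` in sorted
// order, without touching the data itself.
fn sort_permutation(keys: &[i32]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..keys.len()).collect();
    idx.sort_by_key(|&i| keys[i]);
    idx
}

// Apply that permutation to any column independently. Each call
// streams through one contiguous array, which is the locality win.
fn apply_permutation<T: Clone>(col: &[T], perm: &[usize]) -> Vec<T> {
    perm.iter().map(|&i| col[i].clone()).collect()
}
```

With keys `[3, 1, 2]` the permutation is `[1, 2, 0]`, and applying it to a parallel names column `["c", "a", "b"]` yields `["a", "b", "c"]`; in the AoS style you would instead be swapping whole structs and bouncing between columns on every comparison.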
Is there a non-straightforward conversion that isn't? It seems to me like any SoA->AoS transformation is going to have pretty much the same effect on locality.
Not at all. In a SoA system you want to operate on each attribute individually, but in an AoS system you operate across structs. Besides improved cache locality, it keeps loops small so they run out of the µop/loop-stream cache, among other things like better packing of data, etc.
If you haven't worked on a system like this, it can sometimes be difficult to see how different things can be. It's literally APL vs C.
I guess that's why author didn't mention it.
More generally, the history of computing is littered with remains of plans involving "this can work, assuming we build a sufficiently smart compiler..." - you really need a language that is proactively friendly to these concepts to make the implementation straightforward and robust so language users can lean on it.
Let's say I have a Vec<User>. Perhaps it would be more efficient to have a tuple of: Vec<User::Name>, Vec<User::Address>, Vec<User::Siblings>...
That is, can we design a language where the fields of a type can be analyzed by a container, and deconstructed to be able to implicitly reconstitute an object from its fields?
This is what entity component systems try to do, but some of it can be clumsy, and you have to build these entities and components with the system at the beginning. Could we achieve more if we baked some of this into the language and allowed more "introspective" generic types?
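A hand-written sketch, in Rust, of what such an introspective container could generate for the hypothetical `User` type above; today a derive macro or ECS framework has to emit this boilerplate for you:

```rust
// The original AoS-style type the programmer writes.
struct User {
    name: String,
    address: String,
    siblings: Vec<String>,
}

// What an introspective `Vec<User>` could expand to: one Vec per field.
#[derive(Default)]
struct UserSoA {
    name: Vec<String>,
    address: Vec<String>,
    siblings: Vec<Vec<String>>,
}

impl UserSoA {
    // Deconstruct an object into its field columns.
    fn push(&mut self, u: User) {
        self.name.push(u.name);
        self.address.push(u.address);
        self.siblings.push(u.siblings);
    }

    // Implicitly reconstitute an object from its fields, as asked above.
    fn get(&self, i: usize) -> User {
        User {
            name: self.name[i].clone(),
            address: self.address[i].clone(),
            siblings: self.siblings[i].clone(),
        }
    }
}
```

The interesting language-design question is making `push`/`get` fall out of the field list automatically, so client code never sees the columnar representation.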
An SQL database is not just an object store.
Using column-oriented data is more than just a storage decision. It affects other parts of the language, and especially your thought process and how you use the language.
Anybody who has used systems that are column-oriented (SoA) understands this. And those that haven't all seem not to understand how much it changes things.
See, for example, this article on game design: http://gameprogrammingpatterns.com/data-locality.html
Rust does at least get static/dynamic dispatch right in this respect but I'd kill for SoA/AoS structure flipping.
It's really hard to do justice explaining how amazing modules are. They capture the essence of abstraction incredibly well, giving you plenty of expressive power (alongside an equally powerful type system). Importantly, they compose; you can write functions from modules to modules!
(This is even more impressive than you think: modules have runtime (dynamic) AND compile time (static) components. You've certainly written functions on runtime values before, and you may have even written functions on static types before. But have you written one function that operates on both a static and a dynamic thing at the same time? And what kind of power does this give you? Basically, creating abstractions is effortless.)
To learn more, I recommend you read Danny Gratzer's "A Crash Course on ML Modules". It's a good jumping off point. From there, try your hand at learning SML or Ocaml and tinker. ML modules are great!
So, if anything, what I'd like to see is a dependently typed language whose type language is a first-order, computation-free (no non-constructor function application) subset of the value language.
"We show how Damas/Milner-style type inference can be integrated into such a language; it is incomplete, but only in ways that are already present in existing ML implementations."
Furthermore, OCaml essentially already has first-class modules - just the syntax is much more awkward than is possible with 1ML.
In ML, you can have multiple modules (structures) conforming to a common interface (signature), but their abstract type components are in general not identifiable - with good reason. For instance, let's say you have an API for heaps:
(* For algorithmic reasons, it is useful to distinguish between
* heaps whose underlying data structure is a tree (e.g. pairing
* heaps) and heaps whose underlying data structure is a forest
* (e.g. binomial heaps). I will only use the former in this
* example. *)
signature TREE_HEAP = sig
  type key                                 (* abstract *)
  type 'a entry = key * 'a
  type 'a heap                             (* again abstract *)
  val pure  : 'a entry -> 'a heap
  val head  : 'a heap -> 'a entry
  val tail  : 'a heap -> 'a heap option
  val merge : 'a heap * 'a heap -> 'a heap
end

signature ORD_KEY = sig
  type key
  val <= : key * key -> bool
end

functor PairingHeap (K : ORD_KEY) : TREE_HEAP =
  (* blah blah blah *)

functor BrodalOkasakiHeap (K : ORD_KEY) : TREE_HEAP =
  (* blah blah blah *)

structure H1 = PairingHeap (IntKey)
structure H2 = BrodalOkasakiHeap (IntKey)
But, can you use multiple implementations of the TREE_HEAP in the same code? (I think so, but it's been a long while.) Can you use two versions of BrodalOkasakiHeap in the same code? (Probably not.) Can module A use one version of BrodalOkasakiHeap while module B uses another, in the same program?
> Can you use two versions of BrodalOkasakiHeap in the same code?
ML doesn't have a notion of “version”, but of course you can provide two different implementations of the same API if you so desire. However, these implementations, while interchangeable (a client can swap either implementation with the other, as long as this is done consistently), are not compatible (a client can't treat both implementations as if they were the same).
For example, the difference between a minor version increment and a micro version increment is whether client code implements an interface provided by the module or uses an implementation coming from the module. (If you use an interface that has had methods added, you won't know about those methods; but if you implement the interface, you have to provide implementations for those methods.)
So, for example, if you are using "modules" A and B, where A uses a module C and B uses C' (which has a bumped micro number), you won't have a problem simultaneously using an instance c from C (returned by A) and an instance c' from C' (returned by B).
On the other hand, if you create an instance c (as from C) and pass it to C' through B, things don't work out. (Execution blows up. Did I mention that the rest of Java's modularization story sucks?)
The same is true of functional programming. Pure functional is fine. Pure imperative is fine. Both in the same language get complicated. (Rust may have overdone it here.)
More elaborate type systems may not be helpful. We've been there in other contexts, with SOAP-type RPC and XML schemas, superseded by the more casual JSON.
Mechanisms for attaching software unit A to software unit B usually involve one being the master defining the interface and the other being the slave written to the interface. If A calls B and A defines the interface, A is a "framework". If B defines the interface, B is a "library" or "API". We don't know how to do this symmetrically, other than by much manually written glue code.
Doing user-defined work at compile time is still not going well. Generics and templates keep growing in complexity. Making templates Turing-complete didn't help.
See "Architectural Mismatch: or, Why It's Hard to Build Systems out of Existing Parts".
Yes, we are not good at it, but we need to do it. Very often, the dominant paradigm is not appropriate for the application at hand, and often no single paradigm is appropriate for the entire application.
For example, UI programming does not really fit call/return well at all.
> attaching software unit A to software unit B .. master/slave
Case in point: this is largely due to the call/return architectural style being so incredibly dominant that we don't even see it as a distinct style, with alternatives. I am calling it 'The Gentle Tyranny of Call/Return'.
If you want to study that, look at ROS, the Robot Operating System. ROS is a piece of middleware for interprocess communication on Linux, plus a huge collection of existing robotics, image processing, and machine learning tools which have been hammered into using that middleware. The dents show. There's so much dependency and version pinning that just installing it without breaking an Ubuntu distribution is tough. It does sort of work, and it's used by many academic projects.
In a more general sense, we don't have a good handle on "big objects". Examples of "big objects" are a spreadsheet embedded in a word processor document or an SSL/TLS system. Big objects have things of their own to do and may have internal threads of their own. We don't even have a good name for these. Microsoft has Object Linking and Embedding and the Component Object Model, which date from the early 1990s and live on in .NET and the newer Windows Runtime. These are usually implemented, somewhat painfully, through the DLL mechanism, shared memory, and inter-process communication. All this is somewhat alien to the Unix/Linux world, which never really had things like that except as emulations of what Microsoft did.
"Big object" concepts barely exist at the language level. Maybe they should.
Starting a separate 'Erlang process' for all async stuff, for example, seems so wonderfully simple to me compared to the async mess I find in JS, and applying various patterns(?) to that (Task, GenServer, Supervisor) still provides a lot of freedom without incompatibility.
Please correct me if I'm wrong though. I'm still in the research phase so I haven't even written much Elixir/Erlang yet...
In that case, a model à la Erlang may make total sense, but it creates quite a high overhead in terms of runtime, deployment, and impedance mismatch with the rest of the ecosystem. There are use cases of course, and interesting possible solutions like Pony are slowly appearing, but we still do not have a good solution here.
For a pure "server", "distributed systems", or "machine dedicated to your software" scenario, you are right in my opinion. Add on top of that that Erlang knows its limits and provides you with ways to easily interact with languages with different paradigms.
It may make sense to keep these different paradigms isolated but easy to talk between them and exchange.
Perhaps my programming language vocabulary is the limiting factor here, but I understand “pure functional” to refer to a non-sequential computation (no ordering, just a pure transformation) and “pure imperative” to be monadic computation in Haskell (a sequence of steps, executed one after the other). I don’t see why these two could be considered incompatible — indeed, monadic computations make little sense without pure functions to transform the values inside the monad.
Can you clarify?
If you think about it, so many of our problems are a direct result of representing software as a bunch of files and folders full of plaintext.
Our "fancy" editors and "intellisense" only goes so far.
Language evolution is slowed down because syntax is fragile and parsing is hard.
A "software as data model" approach takes a lot of that away.
You can cut down so much boilerplate and noise because you can have certain behaviours and attributes of the software be hidden from immediate view or condensed down into a colour or an icon.
Plaintext forces you to have a visually distracting element in front of you for every little thing. So as a result you end up with obscure characters and generally noisy code.
If your software is always in a rich data model format your editor can show you different views of it depending on the context.
So how you view your software when you are in "debug mode" could be wildly different from how you view it in "documentation mode" or "development mode".
You can also pull things from arbitrary places into a single view at will.
Thinking of software as a "bunch of files stored in folders" comes with a lot of baggage and a lot of assumptions. It inherently biases how you organise things. And it forces you to do things that are not always in your interest. For example, you may be "forced" to break things into smaller pieces more than you would like because things get visually too distracting or the file gets too big.
All of that is an arbitrary side effect of this ancient view of software, and it will immediately go away as soon as you treat AND ALWAYS KEEP your software as a rich data model.
Hell, all of the problems with parsing text and ambiguity in syntax and so on will also disappear.
Every attempt that I've seen, e.g. Lamdu, Subtext, any "visual app builder", all fail miserably at delivering ANY benefit except for extremely simple programs -- while at the same time taking away most of the useful tools we already have, like "grep", "diff", etc. Sure, they can be re-implemented in the "rich data model", perhaps even better than their textual ancestors - but the thing is that they HAVE to be re-implemented, independently for each such "rich data model", or you can't have their functionality at all -- whereas a 1972 "diff" implementation is still useful for 2017 "pony", a language with textual representation.
Regarding your example, the "breaking things into smaller pieces" was solved long ago by folding editors (I used one on an IBM mainframe in 1990, and I suspect Emacs already had it at the same time; it did for sure in 1996).
The problems with "parsing and ambiguity" are self-inflicted, independent of whether the representation is textual. Lisp has no ambiguity; Q (the K syntax sugar) has no ambiguity. Both languages eschew operator precedence, by the way, because THAT is the real issue that underlies modern syntax ambiguities.
I've been waiting for that amazing "software as a data model" approach to show a benefit for almost 30 years now (There's been an attempt nearly every year I looked). Where it has (e.g. Lisp, Forth), it's completely orthogonal to the textual representation.
How do you do this with a text driven program? Typically, there are so many files and unspecified build systems, so you can't view your program as a proper tree as in Lamdu. You can't even treat compilation as a pure function. You can't take a snapshot of an arbitrary program because effects are not isolated. You might not even be able to set breakpoints without breaking the program. If it's machine code you might not even have debug symbols. It's such a mess, and it's different on every platform.
So you claim there is no benefit to be had. I claim the existing solutions are already failing. If one solution gets enough of these things right, there is no reason why it wouldn't succeed just because text-driven. (But it is also possible to do it correctly with text)
If the program code is always to be consistent, it cannot be text, because text allows you to input invalid programs. So how is text not a leaky abstraction?
If you haven't watched https://vimeo.com/36579366 , I highly recommend it. Chris Granger, after watching this, created http://www.chris-granger.com/2012/02/26/connecting-to-your-c... which you can download, and applies to ClojureScript, which is (tada!) text based.
Also, I had a chance to use a time-traveling virtual machine (which works independently of how your code was produced) back in 2007, and while I haven't had a chance to use them recently, I know that source-aware time-traveling debuggers have made huge strides.
> So you claim there is no benefit to be had. I claim the existing solutions are already failing.
These claims are not contradictory. I agree that existing solutions are failing in various ways, but I still disagree that the "code as rich data" would provide benefit.
> If one solution gets enough of these things right, there is no reason why it wouldn't succeed just because text-driven. (But it is also possible to do it correctly with text)
So, if possible to do correctly when text driven, why is text driven a problem?
> If the program code is always to be consistent, it cannot be text, because text allows you to input invalid programs. So how is text not a leaky abstraction?
All programming abstractions are leaky in one way or another, and text is no different. In my opinion, the leak that it is possible, during design time, to have an invalid program (which will be rejected as such prior to execution) is not a real problem. The requirement that "program code always be consistent" is onerous, and, in my opinion, harmful.
When I write code, I often start with sketches that cannot work and cannot compile while I think things through, and then mold them into working code. Think of it as converting comments to code. If an editor didn't let me do that (and it likely wouldn't provide any help, because I'm writing comment prose and not "rich data" code...), I'd open a text editor to write my plans.
I will restate my belief in a different way: Any benefit from "code as data" that cannot be applied to a textual representation, only ever manifests in extremely simple cases. I offer as support for this belief, the history of visual and rich data tools over the last 30 years, none of which manages to scale as well as text based processes, most of them scaling significantly worse (which is often explained away by critical mass, but there are enough successful niche tools that I think this explanation is unacceptable).
See "Always bet on text" (http://graydon2.dreamwidth.org/193447.html).
What is the state of the art in editing software that's continuously stored in a syntax tree instead of in unparsed text? (I say "continuously" because I assume that letting me save a file that doesn't parse, even temporarily, breaks most of these benefits.) How do you, for instance, represent merge conflicts, and how do I resolve them?
This is completely new to me so maybe there are great answers here that I just haven't heard of.
But... it makes debugging harder without even more tooling. And debugging is where most of the time in software is spent... which probably explains the origin of the problem.
Especially when you need to reverse engineer code provided by a vendor.
If it's truly the "silver bullet" I'd still say I expect to see some evidence of that in some domain relatively quickly, because it is supposed to be the silver bullet after all. But it would take time to discover that.
The downside to adding an additional feature is that you are much more likely to introduce leaky abstractions (even things as minor as syntactic sugar). Your language has more "gotchas", a steeper learning curve, and a higher chance of getting things wrong or not understanding what is going on under the hood.
For this reason, I have always appreciated relatively simple homoiconic languages that are close-to-the-metal. That said, the universe of tools and build systems around these languages has been a growing pile of cruft and garbage for quite some time, for understandable reasons.
I envision the sweet spot lies at a super-simple system language with a tightly-knit and extensible metaprogramming layer on top of it, and a consistent method of accessing common hardware and I/O. Instant recompilation ("scripting") seamlessly tied to highly optimized compilation would be ideal while I am making a wishlist :)
Typically, I feel these sorts of ideas appeal to simpler foundations and achieve richness through rich additions to a simple foundation that can be shown to be consistent and compatible with that foundation in a thorough (proven) way. It's just a different ballgame.
Although, again, UI is another issue.
On the DEC VAX, overflow could be set to trap, based on a bit mask at the beginning of each function, but that was not the case on the PDP-11.
Nobody used that feature. I once modified the C compiler for the VAX to make integer overflow trap, and rebuilt standard utilities. Most of them trapped on integer overflow.
If you want arithmetic to wrap, you should have to write something like
n := (n + 1) mod (2^32);
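For what it's worth, Rust's standard integer API already works roughly this way: overflow on plain `+` panics in debug builds, and wrapping must be requested explicitly:

```rust
// Explicitly wrapping increment: the spelled-out "(n + 1) mod 2^32".
fn next_wrapping(n: u32) -> u32 {
    n.wrapping_add(1)
}

// Or make overflow a recoverable error instead of silent wraparound.
fn next_checked(n: u32) -> Option<u32> {
    n.checked_add(1)
}
```

The syntactic cost is exactly what the comment asks for: the rare case where wraparound is intended becomes visible at the call site.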
Because it's either a solved problem if you just use infinite precision numbers, or it's encompassed by other things the article lists, like refinement types/extended static checking and errors.
 I've been told the Value plugin is better for these safety issues, but I haven't figured out how so.
I was mildly curious why Graydon didn't mention my current, mildly passionate affair, Pony (https://www.ponylang.org/), and its use of capabilities (and actors, and per-actor garbage collection, etc.). Then, I saw,
"I had some extended notes here about "less-mainstream paradigms" and/or "things I wouldn't even recommend pursuing", but on reflection, I think it's kinda a bummer to draw too much attention to them. So I'll just leave it at a short list: actors, software transactional memory, lazy evaluation, backtracking, memoizing, "graphical" and/or two-dimensional languages, and user-extensible syntax."
Which is mildly upsetting, given that Graydon is one of my spirit animals for programming languages.
On the other hand, his bit on ESC/dependent typing/verification tech. covers all my bases: "If you want to play in this space, you ought to study at least Sage, Stardust, Whiley, Frama-C, SPARK-2014, Dafny, F∗, ATS, Xanadu, Idris, Zombie-Trellys, Dependent Haskell, and Liquid Haskell."
So I'm mostly as happy as a pig in a blanket. (Specifically, take a look at Dafny (https://github.com/Microsoft/dafny) (probably the poster child for the verification approach) and Idris (https://www.idris-lang.org/) (voted most likely to be generally usable of the dependently typed languages).)
Perhaps the core problem is that no matter how you slice it, the nature of being an actor-based system means that cross-actor calls are simply more runtime and cognitively expensive than a conventional function call, and across the sum total of the system, that ends up adding a lot of drag. It may be the case that "cheap concurrency" added value, and actors actually net removed value, but the sum total was positive in the end. (I'm spitballing a bit here, to be clear.)
To really see the cheap concurrency in play, Haskell is a great place to play. It has a ton of mechanisms for concurrency and ways of composing them together, which kind of makes it an interesting cage match environment for what concurrency primitives are useful. Actors would be really easy in Haskell, and they basically go unused. I suspect this means something.
Actors in Erlang seem more emergent to me than fundamental. To first order, Go and Erlang use similar concurrency strategies, with cheap green threads and channels for communication. The main difference is that Go has channels as separate objects that can be passed around (leaving the goroutine anonymous), while Erlang fuses the channel into the process. In this respect, they both have similar levels of "actor-ness" in my mind. The biggest upside that I can see to channels being separate is that a goroutine can have multiple channels, they can be sent around (though so can PIDs), etc. This matters a lot because the channels in Go are statically typed, while Erlang/Elixir's dynamic typing makes this less meaningful, since you can send anything on the same channel.
Of course, I suppose if one defines an actor as a sequential threaded process with a built-in (dynamically-typed) channel, then Erlang is actor-based, so maybe I contradicted myself. In Erlang/OTP, the actor part is important to the supervision-tree error handling strategy, which I think is the biggest upside, but it's not obvious to me that you need the channel and the process totally fused together to handle that.
Specifically in response to your "cheap concurrency good, actors bad" thought, my hypothesis is that actors are at least the more natural strategy in a dynamically typed language, which is why they seem to work well in Erlang/Elixir but don't see much use in languages like Haskell (or Scala, it seems, where I at least thought the Akka actors were kind of awful). Meanwhile, channels seem to fit better with static typing, though I can't quite put my finger on why.
First, fair question, and I like the nature of your follow-on discussion and musings.
IMHO, one of the lessons of Go for other languages is just how important the culture of a language can be. I say that because on a technical level, Go does basically nothing at all to "solve" concurrency. When you get down to it, it's just another threaded language, with all the concurrency issues thereto. An Erlanger is justified in looking at Go's claim to be "good at concurrency" and wondering "Uh... yeah... how?"
And the answer turns out to be the culture that Go was booted up with moreso than the technicals. When you have a culture of writing components to share via communication rather than sharing memory, and even the bits that share memory to try to isolate those into very small elements rather than have these huge conglomerations of locks for which half-a-dozen must be taken very carefully to do anything, you end up with a mostly-sane concurrency experience rather than a nightmare. Technically you could have done that with C++ in the 90s, it's just that nobody did, and none of the libraries would have helped you out.
That did not directly bear on your question. I mention that because I think that while you are correct that Erlang is technically not necessarily actor-oriented, the culture is. OTP pushes you pretty heavily in the direction of actors. Where in Go a default technique of composing two bits of code is to use OO composition, in Erlang your bring them both up as actors using gen_* and wire them together.
"my hypothesis is that actors are at least the more natural strategy in a dynamically typed language, which is why they seem to work well in Erlange/Elixir, but don't see much use in languages like Haskell"
I can pretty much prove that they don't: https://github.com/thejerf/suture It's the process that needs monitoring, and that process may have 0-n ways to communicate. But that's not a criticism of Erlang, as I think that's actually what it does and it just happens to have a fused message box per process.
An intriguing hypothesis I'll have to consider. Thank you.
> I can pretty much prove that they don't: https://github.com/thejerf/suture It's the process that needs monitoring, and that process may have 0-n ways to communicate. But that's not a criticism of Erlang, as I think that's actually what it does and it just happens to have a fused message box per process.
One, very neat library. Two, while I agree this proves the point that the actor model is not needed in the language to build a process supervisor, I think that your Go Supervisor looks a lot like an actor, at least in the way Erlang/Elixir uses them. From what I can see, the Supervisor itself works by looping over channel receives and acting on them. The behavior of the Supervisor lives in a separate goroutine, and you pass around an object that can send messages to this inner behavior loop via some held channels. So basically the object methods provide a client API and the inner goroutine plays the role of a server, in the same separation of responsibilities that gen_* uses.
If we squint a little bit, actors actually look a lot like regular objects with a couple of specific restrictions: all method calls are automatically synchronized at the call/return boundaries (in Go, this is handled explicitly by the channel boundaries instead), no shared memory is allowed, and data fields are always private. I'm sure this wouldn't pass as a formal description, but it seems like a pragmatically useful framing.
I agree that Go is less actor-oriented than Erlang/Elixir, but given how often I've seen the pattern you used in the Supervisor (and it's one I have also naturally reached for when writing Go), I'd argue that "Actor" is a major Go design pattern, even if it doesn't go by that name. The difference, then, is how often one pulls out the design pattern. I think the FP aspect pushes Erlang/Elixir in that direction more, as this "Actor" pattern has a second function there -- providing mutable state -- that Go allows more freely.
This discussion has really made me think, thanks. I think you're right that actor-like features are valuable and that the Actor Model in its everything-is-an-actor form is not itself the value (or even a positive).
Wasn't he the original designer of Rust and employed at Mozilla?
Surprised that this move completely went under my radar
In general the Rust and Swift teams are very friendly with one another, and have shared several devs.
I've been using Typescript a lot recently with union types, guards, and other tools. It's clear to me that the type system is very complex and powerful! But sometimes I would like to make assertions that are hard to express in the limited syntax of types. Haskell has similar issues when trying to do type-level programming.
Having ways to generate types dynamically and hook into typechecking to check properties more deeply would be super useful for a lot of web tools like ORMs.
It's a Standard ML variant that includes serialization of types and support for distributed programming.
Protobuf, Cap'n Proto, Flatbuffers, and anything else with a similar design gives you the ability to verify that two versions of the protocol are "compatible" insofar as:
- If a new version of the program reads a message from an older version which is missing some newly-added field, that field will be assigned a default value or "null", but the rest of the message will parse fine.
- If an old version of the program reads a message from a newer version which has some new fields that the old version doesn't know about, it will ignore those fields, but can parse the rest of the message just fine.
Of course, just because you've verified that the fields line up doesn't prove that the old and new versions of the program are actually compatible. The interpretation of the protocol could change in an incompatible way without any change to the protocol itself. Verifying compatibility at that level is an AI-complete problem.
But in practice, it works very well. Enabling safe incremental, rolling updates of components in a distributed system is exactly what Protobuf was designed to do and has been doing for >15 years at Google.
Here is some more info about Nim's effect system: https://nim-lang.org/docs/manual.html#effect-system
"The next big step for compiled languages?"
I kinda see where Graydon is coming from. I have this broken half-assed Pony parser combinator thing staring at me right now.
I am interested in genuine and objective replies of course.
(Yes your joke is probably very funny and I am sure it's a novel and exciting quip about the state of affairs in 2006 when wordpress was the flagship product)
However, PHP doesn't have a module system, for better or worse. Namespaces merely deal with name collisions and adding prefixes for you. Encapsulation only exists within classes and functions, not within packages. PHP has no concept of namespaced variables, only classes, functions and constants. Two versions of the same package cannot be loaded simultaneously if they occupy the same namespace.
There has been some relatively recent discussion about having, for example, intra-namespace visibility modifiers (e.g. https://externals.io/message/91778#92148). PHP may yet grow modules. The thing is, though, all sorts of things are suggested all the time. Many PHP 7 features had been suggested many years before (https://ajf.me/talks/2015-10-03-better-late-than-never-scala...). The reason PHP 7 has them is people took the initiative to implement them.
Namespaces also provide something similar to "type theoretic modules", but only their most prosaic, simple, and unimportant function: namely, namespacing.
Does PHP have a different flagship product now?
PHP has other significant projects built on it of course, but WordPress is a behemoth nothing can hope to beat.
PHP is probably a great language now. I used it in the PHP 4 and 5 days. Coming from Perl, that was particularly painful.
I also think that eventually we may see a truly common semantic definitional layer that programming languages and operating systems can be built on. It's akin to the kinds of metastructures used as the basis for many platforms today, but with the idea of creating a true über-platform.
Another futuristic idea I had would be a VR projectional programming system where components would be plugged and configured in 3d.
Another idea might be to find a way to take the flexibility of advanced neural networks and make it a core feature of a programming language.
(I also own & admire his Basic Category Theory for Computer Scientists; I've yet to brave his Advanced Topics in TaPL - I'm sure it's very good, but the contents list intimidates me.)
For type systems your best bet is to learn languages with advanced typing features. Start with the friendly ones (TypeScript, Python with mypy, Rust, Haskell) and look at Idris for dependent types.
And then reread this post, and ask a lot of questions (reddit, stackexchange, quora).
He even addresses the changes to the runtime:
I think the above post indicates that he is extremely proud of how Rust turned out.
"After memory safety, what do you think is the next big step for compiled languages to take?"
Early Rust was quite different: not as low-level, had a GC, green threads, other stuff. Some info here: https://news.ycombinator.com/item?id=13533701
Curious and can't find anything: what's the most complex golang program out there, and how long does it take to compile?
I'm aware of TLA+ and (Jay Misra's) Unity (https://en.wikipedia.org/wiki/UNITY_(programming_language)), but I haven't seen any other uses of temporal logics. (The lengths people will go to in order to get away from box and diamond...)
Maybe the way Haskell does STM is relatively efficient. However, you cannot port that to C/C++/Java/C#, because they do not provide the type system to make it safe.
Maybe Rust can deliver. A quick search gives me rust-stm. All the disclaimers in the README look discouraging, though. For example, "Calls to atomically should not be nested", but isn't that the whole point of composable STM?
Repeating transactions is a no-go, because it does not mix well with side effects. Thus, we need a lock-everything-required scheme. You do not want to expose this in the type system, because it explodes everywhere and gets really tiresome. Thus you need a whole-world analysis to find out what is "required" for a transaction. That breaks modular compilation and is inconvenient in general. Overall, I don't see how this could work, even at a very high level.
Second, the problems with impurity you listed are exactly what Haskell solves. "You cannot port that to x/y/z language" shouldn't be counted against Haskell.
We are talking about transactions here, but you're worried about thunks and memory allocations, and comparing this to Java? It seems like you've simply decided Haskell is categorically slow, regardless of the context we're talking about.