
Don't complain that Chinese is ugly and unreadable just because you speak English as your native tongue.

Technically, the above is a snippet of C++ put into an APL variable "rth" but there's so much more to it than that, and so much more to the design that you're missing.

The design and choice of aesthetic in the compiler is a very intentional one that is arguably one of the main issues that has caused me to rewrite the compiler so many times over the years and has led to this massive code adjustment.

There are very good reasons that the compiler is written in the style that it is, and you cannot compare it to other projects' style guides.

Keep in mind that this compiler is designed to run natively on the GPU in a fully data-parallel fashion.

One major issue that I had to address, and that I discuss a little bit in a thread above, is the malleability of the code base. It's critically important to this project that I be able to adapt and alter the compiler rapidly. For example, I recently had to rewrite the entire backend due to a shift in some underlying core technology. This shift led to a shrinkage of about 2000 lines of code because the underlying supporting libraries were a better fit to what I needed than what I was using previously. But I might not have been willing or able to make this change if I didn't have confidence that the rewrite would be swift. Indeed, it took only two months to rewrite the backend from scratch, add more new features, improve robustness, and so forth. The code also got cleaner.

This obsessive need to be highly adaptable leads me to the desire to have exceptionally "disposable" code. The cost for replacing or deleting code should be as low as possible.

This has a few follow ups. In order to achieve the above, I need to ensure that I understand the ramifications of deleting code as readily as possible as quickly as possible. This basically means that I need to be able to squeeze as much of the compiler into my head as possible, and what doesn't fit, I need to be able to "see" and "read" as quickly and as readily as possible.

The compiler is designed so that I can see as much as possible with as little indirection as possible, so that when I see a piece of code I not only know how it works in complete detail, but how it connects to the world around it, and every single dependency related to it in basically one single half screen full of code (usually much less than that) without any jumps, paging, scrolling or any movement. It means that I can completely understand the ramifications of any edit I make in nearly complete detail without any dereferencing or indirection. There are one or two places where there are some helper utilities which are on a different page, but these are part of the "domain vocabulary" which is basically in my mental cache any time I'm working with that code. I keep these "helpers" to a minimum, so that they can fit with anything else I want and not waste mental space in my head. Too many helpers lead to a failure to understand the complete macro picture and thus defeat my ability to delete code.

In order to make the code more readable, it has to be highly consistent and idiomatic. I take this to an extreme level. This code is highly regular and predictable, to an almost obsessive degree. I do this by enforcing a style discipline on the code that allows me to eliminate the use of a host of abstractions, further paring down the complexity of the programming language in which I'm working and allowing me to think in the same mental plane at all times.

The idea of semantic density is critical to this point. The semantic density of the APL code I'm using to solve the problem is at a certain rate. I maintain a consistent density rate by choosing my variable names in such a way that they visually align with the expressivity per character of the built in primitive symbols. This means that the cadence when reading the code is maintained. The "universal" naming scheme allows me to take any given name and know exactly its purpose, parentage, place, and use in the compiler without adding any additional cognitive overhead of inheritance syntax, datatypes, classes, or anything more than a name.

The C++ code above is written the way it is to allow it to stylistically align with the semantic density of the APL code. This means that I can jump between the runtime and the compiler portions of the code with minimal mental shifts between the two, because the style and approach are similar. The code can be "read" in much the same way with minimal change. I am intentionally prioritizing internal semantic and stylistic consistency over satisfying the popular expectations of how C or APL code should look. I believe the internal consistency within the project contributes more strongly to the day-to-day readability and hackability of the project.

Furthermore, I strongly restrict my use of programming language features. This simplifies self-hosting, but it is primarily a means of maintaining stylistic and cognitive power. Since I know how I need to think about my problem "compilation on the GPU" in order to make it go, I can restrict myself to a paradigm that only allows me to think in this way. I choose a paradigm that is also exceptionally expressive to allow me to be productive as well. By selecting the right core paradigm, I can eschew further programmatic abstractions since they contribute nothing and only cost something.

One way in which I do this is to write the core of the compiler with only one or two syntactical conventions, and only one main programming method: function composition. The entire core of the compiler is a single point-free (almost), data-flow, data-parallel expression. Names provide the anchor points of the "macro" level ideas, but the language is expressive enough that I need very few other anchor points. Instead, I use only function composition over the core primitives with a syntax known as "trains" to create the mental effect of working with normal expressions when in reality I create new functions with every line in the core compiler (which is 90 lines or so). By restricting myself to only writing in this style, the mental effect works. If I had to switch between expression level and trains/point-free style in the code, it would be much less readable. But because I can now treat my point-free programs as ordinary expressions for all intents and purposes, it actually simplifies my cognitive load, as there is only one thing to think about: function composition.
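
For readers who have not met trains: in APL, a two-function train (f g) applied to x means f (g x), and a three-function train (f g h) applied to x means (f x) g (h x). The following is only a rough Python sketch of that idea, not code from the compiler; the names atop, fork, mean and spread are made up for illustration.

```python
from operator import sub, truediv


def atop(f, g):
    """2-train (f g): apply g to the argument, then f to the result."""
    return lambda x: f(g(x))


def fork(f, g, h):
    """3-train (f g h): apply f and h to the argument, combine with g."""
    return lambda x: g(f(x), h(x))


# Roughly the classic APL average train (+/ ÷ ≢): sum divided by count.
mean = fork(sum, truediv, len)

# Roughly (⌈/ - ⌊/): maximum minus minimum.
spread = fork(max, sub, min)

print(mean([1, 2, 3, 4]))   # 2.5
print(spread([3, 9, 4]))    # 6
```

Neither definition names its argument; each new function is built purely by composing existing ones, which is the "only function composition" style described above.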




By keeping the code as visible (read, small) as possible, I see more code and can better reason at a macro level. To scale this down into the micro level of dealing with individual compiler passes, I replace all the traditional programming paradigms with others in a sort of 1 for 1 exchange. In this way, I develop a new set of idiomatic programming methods that are so concise, they can begin to be read as we read and chunk English phrases. By doing so, it becomes actually easier to just write out most algorithms, because the normal name for such an algorithm is basically as long as the algorithm itself written out. This means that I start to learn to chunk idioms as phrases and can read code directly, without the cost of name lookup indirection. I can get away with this because I've made reusability and abstraction less important (vastly so) because I can literally see every use case of every idiom on the screen at the same time. It literally would take more time to write the reusable abstraction than it would to just replace the idiomatic code in every place. It's a case of the disposability of code reaching a point that reusability is much less valuable.

This means that in those cases where reuse is valuable, it's very valuable, and it comes to the fore and you can see it as the critical thing that it is. It doesn't get drowned in otherwise petty abstractions that assist reusability, since we don't need that anymore.

Furthermore, if I write my code correctly, there is very, very little boilerplate in the compiler. Almost none. This means that every line is significant. It means that you don't get the fun of feeling like you're accomplishing something by typing in lots of excess boilerplate, but it does mean that you have no wasted architecture. Because rewriting the architecture is so trivial, basically everything now becomes important, and you don't have petty bookkeeping code around. You know that everything is important, and there are no superfluous bits.

The result, as mentioned elsewhere, is code that is getting continuously simpler, rather than continuously more complex. The code is getting easier to change over time, not harder. The architecture is getting simpler and more direct and easier to explain. Because it costs so little to re-engineer the compiler, I can do so constantly, resulting in little to no technical debt.

This is an intentional synergistic choice of a host of programming techniques, styles, disciplines, and design choices that enables me to program this way. Give up one of them and you start to break things down. It allows for a highly optimized code base that has all of the desirable properties people wish their code bases had, and it scares people. I think that's a good thing. Because I don't want people to see this codebase as just another thing. I want them to see that this is something truly different. How can I get away with no module system? How can I get away with no hierarchy? How can I get away with having everything at the top-level, with almost no nested definitions? How can I get away with writing a compiler that is not only shorter, but fundamentally simpler from a PL standpoint than standard compilers of similar complexity by using only function composition and name binding? How can I get a code base that has more features but continues to shrink?

By chasing smaller code. :-)

I assure you, and I'll make good on this in another reply here, I could get you up and running on understanding the code and how it works faster than just about any other compiler project out there. In the end, one of the goals I want for this compiler is for people to say, "Woah, wait, that's it? That's trivially simple." The more I can push people to think of my compiler as so trivial as to be obvious, the more I win. The compiler really is so dirt simple as to shock any normal compiler writer.

But to make it that simple, I have to do things in ways that people don't expect, because people expect complexity and indirection, they expect unnecessary layers for "safety" and they expect code that needs built in protections because the code is too complex to be obviously correct.

I'm pushing the other direction. If you can see your entire compiler at one go on a standard computer screen, what sort of possibilities does that open up? You can start thinking at the macro level, and simply avoid a whole host of problems because they are obviously wrong at that level. When you aren't afraid to delete your entire compiler and start from scratch, what sort of possibilities does that open up to you?


First, please let me apologize for my ill-considered and rude comment... cringe.

Thank you for explaining. Wow, so much to chew on here. The naming conventions and trains sound really interesting. I can see how having a lot of the code visible on one screen would be a fantastic advantage. Again thanks for writing this up. Obviously I didn't find your code transparent at first glance, but clearly if one takes the time to understand what you are doing, the approach has its benefits. I look forward to reading more of what you post. And you've got me intrigued about APL.


Your comments reminded me of this anecdote about Arthur Whitney:

"The k binary weighs in at about 50Kb. Someone asked about the interpreter source code. A frown flickered across the face of our visitor from Microsoft: what could be interesting about that? “The source is currently 264 lines of C,” said Arthur. I thought I heard a sotto voce “that’s not possible.” Arthur showed us how he had arranged his source code in five files so that he could edit any one of them without scrolling. “Hate scrolling,” he mumbled."

Source: http://archive.vector.org.uk/art10500700

I suspect his code looks a lot like the J incunabulum:

http://keiapl.org/rhui/remember.htm#incunabulum


It does. Furthermore, he's "simplified" APL in K to require less infrastructure, with fewer primitives, and the like. Combined with some clever, and some would argue, devious programming practices, he's able to keep things pretty small. I don't know if the interpreter is still that small, though. If someone reminds me, maybe I can talk about scrolling. :-)

Since I believe Whitney wrote the J incunabulum, I suspect that it looks very similar. The code is actually quite simple and straightforward if you take the time to read it.


Could you write a blog post (it probably needs several) about the code style, architecture and design of your compiler and the idioms that you talk about? I love the idea of keeping a project's code base so small, by leveraging concise idioms, that everything fits in a meat-bag head, but I have no idea how one goes about achieving that in practice. (Learning APL to get some pearls of wisdom would be fine.)


It's something I've been working on for a while, but because the architecture is under constant flex, it's actually more valuable to be able to know how to "experience" or discover the architecture in the compiler code itself than to have a separate document to follow, since it's very easy for that document to get out of date quickly. I am building up a set of documents that discuss some of the core idioms and ideas though, and I hope to have something come of this live session that I can maybe put into an interactive document that people can work with.


The little essay you've given us in these two HN comments is one of the most brilliant things about programming I've ever read.


Two things I want to say/ask.

1. What happens if you get sick? You say this is a project in production and there is money on the table (I assume not only yours). What if you get sick and are unable to work for 3 weeks or 6 months? Don't you think that this code is very hard to grasp for someone else, who would have to temporarily work in your position?

2. It is weird that you wrote such a long essay, spanning two comments, but it has so few examples from the actual code. Usually when people explain stuff they go between the abstract concepts and how they are materialized in the code. Here you only explain the idea behind writing it and how it makes you feel/operate/gain flexibility and performance, but the closest thing to actual code information I've got from it is that it has compiler passes and that it has a C++ runtime in a string variable. Just a thought, what do you think about that?


At this point, if I get sick, the code doesn't move much. If I were permanently disabled, then someone else could take over. I have people contribute bugs, tests, and other things fairly often. If you had to temporarily work on the code base and weren't familiar with the background of the project, I would say you'd be lost. It's just not the sort of thing where you can start tweaking things here and there so easily, because almost everything that needs changing is a matter of addressing architectural or serious questions that require you to really understand the project. Because of the way the code is written, there's basically no "code monkey" type work. That means that you only do meaningful work, but it also means that only people who are knowledgeable architects can work on the code. You can imagine the same thing in other code bases. Imagine that you didn't need any of your lower-level programmers anymore for work because there was nothing for them to do. Now imagine how the bus factor changes on the code when only your chief architects are necessary for working on that code base. That's very nice in one dimension, but it does create quite a different picture.

You're right about the code examples. I figured that people were already posting some code snippets. I wanted to give the big ideas rather than any specifics. The reason for this is basically that if you take any single line of code out of context, it's a bit hard to explain why I'm doing the things that I'm doing. It's very much a macro design, which is why I am offering the live session to go through. It's sort of, but not quite, an "all or nothing" thing. If you let me sit down with you and go through the entire code base, then I can explain how it all fits together and why things are the way they are, but if you just take a single piece of code out, you're missing the picture.

If I took a single compiler pass out, for instance, you'd have between 1 and 12 lines of code to look at. I could explain a few features, but how would I explain that when you look at this piece of code you're able to see it entirely in context? Well, I can't, because the code is completely out of context at that point. Or what about demonstrating how the naming conventions exhibit structure-informative regularity? Again, I can't, because that's a visual design element of the code. It's something you have to "see" by looking at the whole painting as it were.

The naming convention is actually a great example. Out of context, there's apparently no rhyme or reason to it. But in context, it forms a key component of the visual regularity and continuity throughout the code. The names are an important part of how you can see the structure of the code. They help to orient you in the big pie. But if I were to quote a single line here, there's no pie to look at, no sky to navigate by. It's just a single constellation. By analogy, it does less good to say, here's the Big Dipper, it's useful. But why? Because it's easy to find amidst the context of stars and its shape helps you to find the North Star. But on its own it doesn't seem as valuable. At that point it is just another constellation. The same thing happens with this code.

So I'll go through and explicate it all in detail in the live session, where I can provide the "painting" and workflow in its entirety so people can see how it works. Then you can see how my comments here match up with the code.


Something that might be worthwhile to consider is the fact that someone who wants to make a change only needs to look at a small program instead of a large program.

In the large program case, the programmer feels like they can cross-cut it, install some duplication, and yes: get their change done faster, but at a cost of making the program bigger.

But in the small-program case, you only pay the cost of learning the codebase when you add a new programmer to it -- something that happens very infrequently. Your program stays small, and you gain all the benefits therein (faster, fewer bugs, and so on).


This is really admirable stuff and I share this kind of goal even though I'm not working in APL style at this time, though I understand the appeal of shifting in that direction as more of the code gets abstract - and it necessarily should be so abstract if you're trying to maximize the simplicity. I believe most codebases suffer from prematurely abstracting with the easy stuff built into the source language (classes, generics, etc.), and then not having the abstraction they really need when it's necessary, and being too tangled up to build it.

The only problem is that I don't know where to start if I wanted to study what you're doing and take notes. Those millions of lines of changes are still lurking in the background as building blocks for an overall understanding.


The live session would be the first start, obviously, but you can also see the Publications area of the README:

https://github.com/arcfide/Co-dfns#publications

Some of that deals with the micro and some with the macro level ideas, but there are some key elements in those that will be necessary to appreciate the whole thing.


> Don't complain that Chinese is ugly and unreadable just because you speak English as your native tongue.

That's a great counterargument, and one I fully agree with. I've noticed that over the years, there has been a growing trend of promoting "readable, maintainable, clean, insert-fashionable-adjective-list-here code" which really amounts to a lowest-common-denominator, dumbing-down perspective of how software should be written. From that perspective, code that someone does not immediately understand is "bad", seemingly regardless of how much (or little) knowledge that someone possesses. I think this is ultimately a harmful trend.

The opposing view, which appears to be largely a minority in more mainstream language communities but dominates in others like APL and Asm, is that programming languages are essentially like human languages: they need to be learned, are not necessarily "easy" or "familiar", and this learning and eventual mastery is wholly beneficial to their use. As with human languages, it is not expected nor a problem that a beginner will immediately understand code written by a more advanced user. Instead, the beginner progresses by learning the language and eventually becoming an advanced, "literate" user. This can be summed up in one sentence: "The code is unreadable because you are not yet qualified to read it." ;-)


Taking an example from the parent:

> rth,←' array v=array(z.s,zs.v.type());v(0)=zs.v(0);\',nl

I don't know APL or the Rth variant, but this goes against most standard style guides. There may be good reasons for it, but they are not obvious to an external viewer.

- What are 'v' and 'vs'? Is this quickly obvious from context (to anyone but the author) (where are the comments?)

- Why are they single characters? (Non-descriptive variables are almost always a code smell.)

- Why does it do multiple things on one line? I think this is a limitation of use of Rth? Normally this sort of whitespace compression is verboten, even in functional languages like Lisp.

I think the parent has a good point about this being very difficult to understand code, and the OP has confused terseness with quality. If this code followed traditional coding styles it would be easier for new people to understand what the hell is going on, and would probably be 3+ times longer in LoC. But who the hell uses LoC as a valid metric anyway? Besides the worst sorts of manager, of course...


> but they are not obvious to an external viewer

That's precisely the point. The whole philosophy of APL is that it's not supposed to be obvious to anyone who doesn't (yet) know it. However, seeing as the character set is still Latin, it's not hard to guess at what it does even if you don't know the language.

- I don't see 'vs' in the snippet, but would guess V stands for Vector.

- Even in all but the most anal "standard style guides", single-character names are normal for temporary/limited-scope variable names.

- You could likewise ask why Chinese words aren't separated by spaces, or why English words need to be. It's a different language with its own grammar and style.

I don't know APL either, but at least I make an effort to see their perspective on the language, because it is clear that there are people who are highly proficient at working with code like this. (Likewise, I would guess that experienced APL'ers probably find more "traditional" languages like C, Java, Python, etc. "unreadably" verbose.)


V for Vector is an appropriate, but not the only, interpretation for that letter.

In the case of this compiler, I take an opposite convention. Most single character names are globally meaningful, and their meaning rarely, if ever changes across the whole compiler. As names get longer, they progressively represent more local elements. This is done in a way that reveals that nesting structure, but is also done because over time I realized that it was harder to remember from one patch to the next what the local variables were meant to do, rather than the global variables, which were almost always the same all the time and were much more likely to be in mental cache. Therefore, I used more "information", that is, more characters, for local names that I would more likely forget the meaning of later, than for global names that were universal and almost always on my mind.

And yes, I did try doing this compiler in many, many other styles, including C, C++, Nanopass Scheme, ML, Java, Cleanroom, traditional APL style, and so on and so forth. They were all unreadably verbose and difficult to work with and very hard to make forward progress on.


> There may be good reasons for it, but they are not obvious to an external viewer.

But isn't the real question whether that obviousness is more important than ease of comprehension and maintenance for someone who does have the required skills to work on the code?

To a child who has just learned squares and square roots and who has never encountered TeX, the expression $e^{i\theta}=\cos\theta+i\sin\theta$ is probably just line noise. To a practising mathematician, it is immediately recognisable and a useful tool. Obviously the difference is that the experienced mathematician has learned the underlying concepts and the notation to represent them. The result is that while the teenager might be learning double-angle formulae by rote for their trigonometry exam in a few years, the experienced mathematician could use their more powerful tool to derive those formulae or any variations on the theme in moments whenever they need them. Their greater skill and understanding makes them much more capable.
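
As a worked illustration of that derivation (standard algebra, added here as an example rather than quoted from anyone in the thread), square both sides of the identity and compare the result with the same identity at angle $2\theta$:

$$e^{2i\theta} = \left(e^{i\theta}\right)^2 = (\cos\theta + i\sin\theta)^2 = \cos^2\theta - \sin^2\theta + 2i\sin\theta\cos\theta$$

$$e^{2i\theta} = \cos 2\theta + i\sin 2\theta$$

Matching real and imaginary parts gives both double-angle formulae at once:

$$\cos 2\theta = \cos^2\theta - \sin^2\theta, \qquad \sin 2\theta = 2\sin\theta\cos\theta$$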

There are certainly reasonable arguments for making the code for some projects accessible to new developers, but doing that isn't free if it also means compromising some aspect of that code for current developers. It's a trade-off, and sometimes requiring new developers to have a certain level of skill and understanding before they can work on a project is OK.


Well put. The important thing is to see what the tradeoffs actually are. Unseen tradeoffs often look like obvious wrongness.


Thanks for the comments. I would encourage you to attend or watch the recording of the live session once it is done, as it will give a more thorough answer than I can give here as to why. I've talked a little about the motivations for this in my above longer two comments, but briefly here:

1. It matters how much code you can see and work with at a time.

2. This code is using research level new algorithms for solving certain problems in the core compiler that are not a part of the standard toolbox of most programmers, and so, even if you did have the code laid out differently, it wouldn't necessarily be any easier. An early version of one of these new algorithms has been published and is listed in the set of publications:

https://github.com/arcfide/Co-dfns#publications

3. As mentioned above, keeping this code at the same semantic density as the core compiler allows you to read this code in much the same way that you can read the compiler code, making it easier to jump back and forth whenever you want throughout any point in the compiler and not adjust to multiple languages. This applies to the use of single characters.

4. The letters v, z, and s are all semantically meaningful to those working in APL, but they are also globally standardized "for the most part" throughout the compiler so that you know what we are talking about whenever the letter "z" shows up anywhere in the compiler. There are some exceptions, but these are obvious, and local, and can be that way because the use site and definition site are within a couple of lines of each other. So, yes, it's quickly obvious from the context.

5. I have tried many, many times to introduce comments. They almost never help, and almost always hurt.

6. If I were to change the style, it might be easier to understand what a given snippet is doing, but this wouldn't aid in your ability to understand what the whole compiler is doing or how it all fits together. It would be a false sense of understanding because it would be divorced from the broader context. The point of designing it this way is to encourage you to focus on the macro level design issues, and allow you to see the whole context of the whole compiler more easily.


x,←y simply appends y to x; x,←y,nl just appends y to x, then adds a newline. '' is just a string containing C code. I think you're making this more complicated than it is.
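
For anyone who reads Python more easily than APL, the quoted line can be paraphrased roughly as below. This is a sketch under assumptions: nl is taken to be a newline, as the comment describes, and the second append is a made-up placeholder rather than a line from the compiler.

```python
nl = "\n"   # assumption: nl holds a newline

rth = ""    # the variable accumulating the C++ runtime source as text

# APL:  rth,←' array v=array(z.s,zs.v.type());v(0)=zs.v(0);\',nl
# The C++ is opaque string data here; the trailing backslash is part of it.
rth += " array v=array(z.s,zs.v.type());v(0)=zs.v(0);\\" + nl

# A further append just extends the same buffer (placeholder content).
rth += "// next line of generated source" + nl

print(rth, end="")
```

The single-letter names inside the string belong to the C++ text, not to the APL line that stores it.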


I would be more sympathetic to this argument if the code was visibly a collaboration.

I am perfectly willing to believe that I could reduce the size of my code by a factor of 10, maybe even 100, if I was willing to give up the constraint of making it maintainable independently of myself. I think that would be a poor tradeoff to make in most cases.


You have a great point but I would state it in a positive way. What sort of system could a small team build if more than one programmer (let's say 3 or 4) could maintain the intimate familiarity with a small codebase, and consequent hyper-productivity, that arcfide is describing?


You're clearly very enamoured with this approach; I'm not. I've seen it before (as arcfide is reminding us, APL has been around for many decades; I find the Forth philosophy similar too) and I think it's a dead end, a seductive trap. You can't build for single-programmer productivity and then retrofit maintainability afterwards.

More generally I think choosing tools based on small examples is a big systematic bias affecting the industry; I can absolutely understand why people do it (because who has time to compare large systems) but I think it holds us back, and I think this particular programming style games that metric even more heavily than most, meaning people falsely attribute advantages to these languages that don't exist in the real world. I think the scepticism a lot of people are showing here is very healthy and frankly I'm surprised you don't share it.


You can go through the Dyalog meetings and see how APL scales up and down along the spectrums.

I'm glad you think my compiler is a small system. The problem I'm solving is one that people said was simply too difficult and impractical to pursue. If I have made it so simple as to be dismissed as trivial, then that's good. :-)

I'm happy to walk you through the compiler in the live session and let you decide for yourself just how maintainable it would be if you had to pick it up. But this code base has been designed with maintainability in mind from the beginning.

How big is a big system? You've called this a small system, but it's a compiler with commercial backing/funding that compiles a language used in production systems, and is, to my knowledge, the only compiler able to express core compilation algorithms in an efficient manner on the GPU. It's rapidly moving to the self-hosting point, and at that point we will have a complete compiler that compiles a real language that runs completely and entirely on the GPU, from parser to generator.

To give you an idea of this task: a basic scan primitive implemented efficiently on the GPU, in the neatest and cleanest code that I know of published in the literature, is 100 lines of code. If you compressed it, you could probably fit it into 50 - 70 lines of code. That's for one simple operation that takes anyone a single line of C code to write.
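
To make the contrast concrete, here is a small Python sketch of an inclusive sum scan written three ways. It is only an illustration: the Hillis-Steele formulation is a textbook data-parallel scan simulated on the CPU, not the published GPU code referred to above and not the compiler's own implementation, and the function names are invented for this example.

```python
from itertools import accumulate


def scan_serial(xs):
    """The 'single line of C' version: one running total, left to right."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out


def scan_hillis_steele(xs):
    """Inclusive scan in about log2(n) rounds.

    Within a round every element is computed independently from the
    previous round's values, which is what makes the scheme data parallel.
    """
    out = list(xs)
    step = 1
    while step < len(out):
        prev = out
        out = [prev[i] + prev[i - step] if i >= step else prev[i]
               for i in range(len(prev))]
        step *= 2
    return out


data = [3, 1, 4, 1, 5]
print(list(accumulate(data)))     # [3, 4, 8, 9, 14]
print(scan_serial(data))          # [3, 4, 8, 9, 14]
print(scan_hillis_steele(data))   # [3, 4, 8, 9, 14]
```

Even in this toy form the parallel version is several times longer than the serial loop, before any of the concerns of a real GPU implementation are addressed.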

This project has taken a real compiler (it's not a C++ compiler, of course) and is putting it on the GPU. Is this a small system?

I would put it in the realm of the sort of problem that can only be meaningfully solved by simplification.

However, this isn't the only code base around. There's another company who has a larger team of APLers who maintain over 1 million lines of APL code in production. At that scale they have to make different design choices than I do, but they also say, if they can do it in APL, they do, and they wish they could do everything in APL. They are one of the only groups, to my knowledge, who has been able to see a net gain in value from implementing a static type system on top of APL's core. So, in terms of scalability, yeah, maybe you need something more (like a static type system) as your code grows, but if you manage to need 1 million lines of APL for your problem, then you're in a good place.

Still, just come to the live session and we can discuss all of the issues that you see with maintainability. If you can see a way to make the code simpler and easier to reason about at a macro level, I'll be all for it!


> I'm glad you think my compiler is a small system. The problem I'm solving is one that people said was simply too difficult and impractical to pursue. If I have made it so simple as to be dismissed as trivial, then that's good. :-)

I figure anything being done by one person is necessarily that trivial. Maybe you're doing the work of 100 people. Maybe the work of 1000. But you can't scale arbitrarily far; at some point you'll hit your limit. The amount of work one programmer can do is, ultimately, O(1).

> they also say, if they can do it in APL, they do, and they wish they could do everything in APL.

Fair enough; where I'm working there's a rather different view of the APL parts of our codebase.

> Still, just come to the live session and we can discuss all of the issues that you see with maintainability. If you can see a way to make the code simpler and easier to reason about at a macro level, I'll be all for it!

I can't/don't do audio/video/"live" I'm afraid (and if that's the only way you can explain the code then that itself reflects badly on its maintainability). I'll read a transcript with interest.

I do think the value of conciseness is real and underrated - at the same time it's very possible to overestimate it if you're looking right at the transition point between a project being small enough to keep in your head at once and being too large for that, because if your project is very close to that line then you can reap huge gains from small conciseness improvements but not in a way that scales. I once looked at implementing a lot of the APL operators (Scala supports unicode identifiers and has a very flexible syntax, so you can actually get pretty close). But I've found that, at least in the context of a large codebase moving incrementally (and I firmly believe that's the one that ultimately matters, for the reasons above), the conciseness gain isn't worth the cost of not having clear English names for all the operations. Indeed I now try to move away from symbols and short names in general as much as possible.


There are some good points here. I'm fine with reduction in constant factors when it comes to productivity. I personally find that those constant factors are the bigger issue in day to day work anyways.

And part of the problem is that people aren't thinking of the whole context and picture when they talk about conciseness. Conciseness is not the end goal in and of itself. It is a design constraint that creates pressures to help mentally force you to achieve the really important things. You can't blindly chase conciseness. You have to have a reason for choosing conciseness and fit that within a broader aesthetic with the big picture always kept in mind.

I think this is one of the reasons that people balk at APL and fail to integrate it into their code bases. We don't teach people how to design or write "poetic" style code. I believe one famous author (Perlis? Kay?) talked about it as "lyrical" programming. It's not something we teach in schools, and it's not something that most people ever learn intentionally. So, people see APL and they get the takeaway that the reason this code is so neat is that it is so "short." So they often do one of two things. They try to make their own code short, or they see the APL operators and think, "I could do that in language X" (I did this at first with Scheme). But as you've discovered, that fails. It doesn't really seem to work well when a lot of people try it. But why? Why do we get such great results in some cases, and such poor ones in others? Why does it seem so hard to have "APL in Language X?" There are two or three major ideas that I think contribute to this.

The first is what makes APL code unique. It is not its shortness. The tersity of APL is just the surface characteristic of a set of synergistic design aesthetics that result in short code that can actually be useful. Yes, it is short, but it's not shortness for shortness' sake. That brevity of expression is a part of a bigger picture. Iverson described some of how they saw this aesthetic himself in his "Notation as a Tool of Thought" Turing Award lecture. However, I think he misses a few things that were just "obvious" to him. One of those ideas would be the concept of the design tension between generality and specialization.

The second idea is this concept of semantic density. One of the reasons, I believe, that APL code is unique, is because of the ability to remain largely at a single abstraction level and be so productive with minimal abstractions. The regularity of semantic density is really important. It allows you to make your variable name choices higher impact than they might otherwise be. But this semantic density issue means that by definition, inserting some small bit of APL code into a larger code base is fundamentally breaking semantic density. It's easier to deal with uniformly verbose code or uniformly terse code than a mixture of the two together.

Finally, the contribution of APL style coding is not primarily one of semantics, but one of design. That's why it doesn't work to just implement APL semantics with terse naming in some other language and expect it to work. The semantics and domain and abstractions available to the APL programmer are a part of the whole, not the whole itself.

So, I'm not surprised that you struggled with integrating APL style programming with other styles. It's a case of trying to have your cake and eat it, too. It's also a reason that some groups can come to loathe the APL code base, because very often they are used to doing it another way, and the APL code base interferes with their approach. Of course, it's always possible to write bad APL code, too. But the style I'm talking about isn't strictly about just APL.

What most people encounter with integrating APL into other code bases, IME, is a fear of deleting their code. There's incremental development, and then there is glacial development. When I've worked with a few people on code integration, one of the things that trips them up is that they tend to think so much at the micro level, that they don't get any benefits from APL, because they don't understand how to leverage it. What they end up doing is trying to fit APL into their architecture, because they are too afraid to change the way that they do something. They want to try little "piecemeal" integration here and there. But that doesn't work, because having some mysterious three line piece of APL surrounded by hundreds of lines of other code doesn't work nearly as well. You get almost none of the benefits of APL style coding and all the potential social issues.

You can always write verbose, old style code in APL if you want, and then it will integrate fine, but you are just writing Python in APL then, and your gains will be minimal, at best.

Instead, you have to shift your granularity of integration. You have to think of replacing whole, independent units in your code base at a single time. This sounds scary when you first mention it, because people are imagining some massive change in the system. However, if you can take a single unit or module in your system at a time that is truly independent of the rest, then you can choose to have a separate style for that code and a separate aesthetic without incurring the same costs that you would have elsewhere. It would be like having part of your system written in Python and the other written in Haskell. Sure, you've two languages to work with, but as long as you're not switching back and forth all the time in the middle, you can focus on doing good design for each of the languages.

If this is done right, then the selected code base should go from a large piece of code to a very very small piece of code, and the time it takes to maintain that code should be minimal, requiring maybe one or two people to work on maybe a hundred lines of code instead of thousands or tens of thousands or something like that.

And really, to do it really right, this group needs to be working directly with customers on this code. Then you start to see the wins in APL style.

So, the shift in APL style programming is as much aesthetic and cultural as it is semantics and technical. If the methods aren't fundamentally forcing a change in the way you think about your code and the fundamental complexity of your code, then you're not using APL correctly.


> The second idea is this concept of semantic density. One of the reasons, I believe, that APL code is unique, is because of the ability to remain largely at a single abstraction level and be so productive with minimal abstractions. The regularity of semantic density is really important. It allows you to make your variable name choices higher impact than they might otherwise be. But this semantic density issue means that by definition, inserting some small bit of APL code into a larger code base is fundamentally breaking semantic density. It's easier to deal with uniformly verbose code or uniformly terse code than a mixture of the two together.

I don't see how this is unique. There's a similar idea of keeping each function at a single semantic level in e.g. Clean Code. (The connection to a consistent level of terseness isn't there, but in my experience it isn't true; lines that use terse expressions of common concepts and verbose expressions of unusual concepts are more readable than lines that are uniformly at either level).

> Instead, you have to shift your granularity of integration. You have to think of replacing whole, independent units in your code base at a single time. This sounds scary when you first mention it, because people are imagining some massive change in the system. However, if you can take a single unit or module in your system at a time that is truly independent of the rest, then you can choose to have a separate style for that code and a separate aesthetic without incurring the same costs that you would have elsewhere. It would be like having part of your system written in Python and the other written in Haskell. Sure, you've two languages to work with, but as long as you're not switching back and forth all the time in the middle, you can focus on doing good design for each of the languages.

You can make that change piecemeal though - I've done so repeatedly, for various pairs of languages, and also for quite radical stylistic shifts within a Scala codebase. Yes, you have to define a border and gradually expand it rather than replacing random lines in the middle of other things, but it's very doable.

> this group needs to be working directly with customers on this code

Doing that is such a huge win in any language that it could easily explain all the advantages you're claiming for APL.

I find the poetry-of-code stuff unconvincing, and the "if it didn't work for you you must be doing it wrong" even less convincing. If we've ended up doing APL wrong then it's not for want of trying - the right way of doing it must be hard to communicate to people, which I think comes right back to my original issue. I'm very skeptical of anything that claims the only way to try it is a big migration and a huge raft of integrated changes - that very conveniently makes it hard to fairly compare the direct advantages of the thing itself, and ensures that anyone in a position to compare has already made a substantial investment/commitment to the thing.


It's a fair point you make. Regarding semantic density, what you talk about is density maintenance at a single point, that is, the density of a single function. I'm not saying that APL is unique in that respect. I'm saying that APL seems uniquely well suited to remaining at the same semantic density throughout the entire code base, and that the semantic density found in APL code more readily, IME, handles the shifts in common versus uncommon concepts at the same density level than I have found in other languages.

You're right that you can change piecemeal, but without knowing your specific case, it's hard to say what you focused on integrating, and thus, whether you were just integrating the language or integrating a style.

If you're willing to talk more about this, feel free to email me, I would love to see more details of your experiences trying to integrate and get more details on your project. This is a part of my future research agenda, and so gathering information on what has worked and didn't work for you, and making an attempt to understand all of that in your case would be very valuable.

As for the issue with big migration, I absolutely agree that it makes it hard to study. However, that's where academic research can come in. I'm specifically taking a look and building a set of research studies to isolate specific factors related to the APL phenomenon and understanding their relevance and how they play a role in the HCI of programming languages and developer culture. There's almost no one doing these kinds of studies, so the main problem at this point is that we have anecdotes, but not a lot of large, well done studies to deal with the issues. I hope to change that.

As for working with customers, are you aware of any other large projects that directly involved customers in their programming? That actually expect their customers to read source code, when the customers are not programmers themselves? I'd be interested in seeing the results of such an effort in other languages, as the only places where I've seen this level of customer integration used for non-computer science/programmer type customers (that is, those with no programming background) is in APL.

I think a big problem with people trying to work with APL code is partly training. There is very little out there on how to do "good APL" code. You get trained all through your development career on how to work with traditional programming languages and what are considered best practices there, but it is not at all clear or obvious to me that the best practices that you learn are actually the right ones once you start moving far from the traditional programming language bent. I would include not only APL in this, but Prolog and Agda, for instance.

I would even go so far as to say that some best practices in other languages are anti-patterns in APL. If your team has made a concerted effort to avoid and educate developers to avoid anti-patterns in APL and they have actually worked at changing that aspect inside of your development team(s), I would be very interested to learn more about this. It's hard to find good case studies in this material in the wild, and so if you'd be willing to share some of yours, I'd be very interested in getting good data.

Please contact me by email (arcfide@sacrideo.us) if you would be willing to discuss further.


> You can't build for single-programmer productivity and then retrofit maintainability afterwards

Perhaps my use of the word 'maintain' was confusing. I'm not suggesting that one programmer write such a system and others then take over maintaining it. I'm suggesting that 3 or 4 programmers write (and maintain) such a system together and all be intimately familiar with it.


Sure, I get that. I just think a language needs to be built from the ground up to allow multi-programmer collaboration, and that there are few if any valuable lessons to be taken from what works in the single-programmer case.


You're asserting that this isn't multi-programmer friendly. I'll agree that it's not "code monkey" friendly, but I disagree that it is not oriented towards multiple programmers. And the APL language has almost all the features you would expect from a modern multi-paradigm language, including branching, control structures, recursion, exceptions, objects, frameworks, interfaces to other languages, and so on and so forth.

But APL was designed from the beginning to enable human communication. I would argue that almost all programming languages fail to be a good human medium of communication. The evidence I give in support of this assertion is what you see when you look at how people write when they think the computer won't need to see the code, such as in academic publications on computer science. Almost all of the people who implement their ideas in one language or another fail to include the entire code in their papers, and they usually include some mathematical notation and diagrams to explain their ideas instead. They may include some small snippets of code, but they rarely if ever include the full code. Dan Friedman being an exception that proves the rule, if you will.

If you then take a look at how APLers communicate when they have ideas, you see code all the time, all day long. The APL community is the only one I've seen that regularly can write complete code and talk about it fluently on a whiteboard between humans without hand waving. Even my beloved Scheme programming language cannot boast this. When working with humans on a programming task, almost no one uses their programming languages as the primary communication method between themselves and other humans outside of the presence of a computer. That signals to me that they are not, in fact, natural, expedient tools for communicating ideas to other humans. The best practices utilized in most programming languages are, instead, attempts to ameliorate the situation to make the code as tractable and as manageable as possible, but they do not, primarily, represent a demonstration of the naturalness of those languages to human communication.


Academia is its own thing with its own incentives. I wouldn't generalise from what happens in academic papers.

When I see people communicating in (my part of) the industry they use pseudocode, which is often described as looking like python. They use if anything fewer symbols (and more space) than a real programming language. They do indeed elide parts of the code - often things like error handling.

To my mind that says: we should use languages in which code looks like pseudocode/python (this idea was suggested in http://paulgraham.com/hundred.html , though he takes it in a different direction). And we should look for ways to elide in real code the parts that people like to elide when talking about programs: to e.g. have "ambient" error handling that's more-or-less invisible most of the time, without sacrificing the safety advantages of checking error cases (this is why I'm interested in e.g. effect systems).


I'd be very surprised if your industry really did use complete pseudocode and only elided error handling. On the other hand, you're sort of assuming in your conclusion that pseudocode is the "better way" for languages because that's what people use, but you're leaving out the initial bias. I would argue that if you made current industrial languages more like pseudocode, you'd probably do better, yes, but it's a local maximum derived from an assumption of what the end result will be.

In other words, people use pseudocode because it's close to the code they intend to write and represents their current notational expectations. It's an enforcement of legacy methods of thinking.

But many people have admitted that there is a problem with writing pseudocode style programming for modern hardware performance, where taking advantage of parallelism is important.

Furthermore, I would argue that academia is relevant because it's one of the few places where the ideas are more important than the executable. If the ideas are communicated clearly, then you've succeeded. If we really want to program for the human, then we want our programs to be focused on the communication of ideas, and not machine-focused. And the reality is that if you take the machine away, and focus on human-to-human communication, without any "industrial" bias (expectation of machine execution), then rigorous idea communication is almost always pictorial, visual, and ideographic. Furthermore, the notations that people develop and have developed over time to communicate ideas never end up looking like mainstream programming languages. As people work with ideas, math notation is the quintessential notation for communicating human ideas rigorously. It is highly evolved for human consumption, and manipulation, rather than machine-focused.

I believe there have also been some studies on how people describe processes without any computing background, and it's inevitable that many of the core "serial" programming concepts are not "natural" in human thought, but a very acquired taste.

Again, I would be surprised if you put a bunch of industry or non-industry professionals up to a white board and had them illustrate their ideas rigorously to one another on just that whiteboard, that they would naturally gravitate to any real programming language. And I doubt strongly that they would actually continue to use pseudocode at scale on the whiteboard.


> I'd be very surprised if your industry really did use complete pseudocode and only elided error handling. On the other hand, you're sort of assuming in your conclusion that pseudocode is the "better way" for languages because that's what people use, but you're leaving out the initial bias. I would argue that if you made current industrial languages more like pseudocode, you'd probably do better, yes, but it's a local maximum derived from an assumption of what the end result will be.

Error handling was one example - I see concerns like serialization, permissions, transactionality commonly elided, and I look for better ways to handle them in programming languages as well.

> I would argue that academia is relevant because it's one of the few places where the ideas are more important than the executable. If the ideas are communicated clearly, then you've succeeded.

Maybe. That assumes that the successful papers (and successful academics) are those that communicate ideas clearly. I'm not convinced.

> the reality is that if you take the machine away, and focus on human-to-human communication, without any "industrial" bias (expectation of machine execution), then rigorous idea communication is almost always pictorial, visual, and ideographic.

Not my experience at all - if anything I'd say visual aspects tend to be a marker of less rigorous communication.

> Furthermore, the notations that people develop and have developed over time to communicate ideas never end up looking like mainstream programming languages. As people work with ideas, math notation is the quintessential notation for communicating human ideas rigorously.

Mathematics is one such notation; "legalese" is another, and philosophical terminology a third. I'm wary of generalising too much from mathematical notation alone.


> Not my experience at all - if anything I'd say visual aspects tend to be a marker of less rigorous communication.

I would point to the field of combinatorics, the traditional proofs of both the ancient Chinese mathematicians as well as those of the West, both of which took on various elements of geometry and spatial reasoning for a significant number of their proofs when other tools were not yet available. The development of algebra I see as a chiefly visual and ideographic one, even tangible or malleable one. The development of UML diagrams another. Flow charts another. We have the abacus and Chinese counting sticks, as well. And finally, while poetry is not specifically rigorous, it is efficient in a way that few other communication methods are. And we find a great deal of "visual cue" elements in that field. In physical sciences and statistics, visualization is a very important tool. Mathematical notation itself is largely spatial and visual at scale.

As for legalese, I would argue that legalese is perhaps well designed for experts to be complete, but not for clarity. Comprehensiveness is different than clarity of rigor. And as for philosophy, vocabulary is not enough. And you'll note that some of the best notational systems to arise came from the philosophy departments in working on logical systems. Those are all usually notationally represented using ideographic, rather than natural language forms. And even some Eastern philosophers who wrote very verbosely tended to make their arguments from visualizations in the mind to make their point.

Musical notation, again, has evolved into a spatial, visual notation. A large number of traditional writing systems were ideographic, including ones we now consider alphabetic/phonetic.


Codebase and its terseness rarely matters. Understanding business processes that govern why the code exists is usually much more important than the code itself.

After that familiarity of code comes first. And by familiarity I mean: common patterns, common solutions, ability to bring new people into the fray.

Small terse languages tend to breed long-running small teams with an insanely high bus factor.


I disagree a little bit, but agree with you in part. You're drawing a distinction between codebase tersity and business processes.

See my other reply here about Direct Development. One of the better ways to do APL development is to write your codebase in such a way that you don't just onboard other developers, you onboard the customers as well. You don't just talk about business processes and then have some IT team turn that into code, the customers themselves work off the code together with the developers directly on solving the issues of business process. The result is that the code literally reflects the needs and expressions of "why" because that's the document or artifact in which the customers write their business processes down. And at that point, tersity matters. It's important that semantic density of the code keeps computer science-y stuff away from the customer, and brings the domain vocabulary of the customer to the fore in the code itself. You can't do this if you have your code littered with words and vocabulary that the customer has to parse and work with. The words that the customer sees, i.e., the variable names, should, ideally, only come from their own domain. In this way they can follow the code and see how the data flows.

That's the way to make business processes the core component of everything. You're reducing the development cycle down to a singular point, or very close to that.

There are a number of papers on this, but this idea of "user pair programming" has been successfully used in a number of APL projects, and is often the basic interaction/development methodology through the community.

And I would argue that the most important thing about the code base is that it accurately expresses the solution to the needs of the user, that it is capable of shifting with the needs of the user as quickly as the needs of the user change, and, finally, that it be as easy as possible to verify with the user that the code in fact is an accurate solution to their needs.

You could say I care more about whether the users can get what they need out of the code, including verifying that the code is what they intend, than about how easy it is to onboard other developers who are separate from the customer. That's best achieved by making the code easier for the user to read, rather than easier for the broad range of developers to read.


I agree short/dense/simple/linear code has huge benefits that most programmers haven't experienced, simply because it is so hard to create (especially in some languages). Your code is both impressive and inspiring.

What additionally interests me is the combination of points-free style and the kind of data structures you're processing in an array-biased language, could you give an insight on what that is like to work with?

In particular, I presume from your description, and only a conceptual familiarity with APL, that most or all of this code is "functional", i.e. all data structures exist as values passed between the composed functions, and nowhere else (no globals or similar). I'd love to hear more about the predominant data structures and what shape they take.

Somewhere else you mention Quad-XML, which seems to be a way to represent trees as arrays, with each element pre-fixed with its depth. I presume you use this for the AST? What kinds of operations are simpler on these arrays, and which are harder, compared to tree data structures used in other languages? For example, addressing the Nth child from a parent could be harder, since you have to search past the other children? I could imagine that operations like "set all fields X of the tree to Y" are a lot easier since no tree traversal is required.
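To make the question concrete, here is a small Python sketch of the depth-prefixed idea as I understand it. It is an assumption on my part, not the actual Quad-XML layout or the compiler's code: the example tree, the column names and both helper functions are hypothetical. The tree is stored in preorder as flat columns, one of which holds each node's depth.

```python
# Hypothetical tree in preorder; depth[i] is the depth of node i.
#   module
#     fn
#       arg
#       body
#     fn
#       body
depth = [0, 1, 2, 2, 1, 2]
kind = ["module", "fn", "arg", "body", "fn", "body"]


def children(depth, parent):
    """Indices of the direct children of `parent`.

    Requires scanning forward through the parent's whole subtree, which is
    why 'Nth child' is not a constant-time lookup in this layout.
    """
    result = []
    for i in range(parent + 1, len(depth)):
        if depth[i] <= depth[parent]:
            break  # we have left the parent's subtree
        if depth[i] == depth[parent] + 1:
            result.append(i)
    return result


def rename_kind(kind, old, new):
    """Whole-tree update as one pass over a flat column, no traversal."""
    return [new if k == old else k for k in kind]
```

For this tree, `children(depth, 0)` gives `[1, 4]`, while `rename_kind(kind, "body", "block")` touches every matching node in one flat pass.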

Does your ability to quickly refactor rely on this functional nature?


I've structured the points free style so that it's basically like working with any expression, I just am working with expressions that build functions instead of expressions that build values. The compiler is very functional in style, and the entire core of the compiler is just a single data-flow graph if you get right down to it. I'll discuss this more in the live session, but they operate in the Nanopass style over a mostly monotonically growing (along the field axis) matrix representation of the AST whose core "columns" correspond to the core columns of the Quad-XML format. Rather than a single "attributes" column I flatten the attributes column into multiple main columns, but otherwise it's the same, and the "Xml" function in my compiler helps to convert to Quad-XML format and serialize the AST for those who want to store intermediate AST results.

The challenging part, which is part of what the research is on, is doing tree transformations in an efficient manner. Because this is a pointerless representation, you have to be careful in how you design the structure to ensure that you maximize locality and parallelism. If you read my "Key" paper in the publications:

https://github.com/arcfide/Co-dfns#publications

You'll see how I manage one of the most tricky elements. By using the techniques I describe in that paper, I can perform arbitrary computation over any group of sub-trees in the AST selected by their parent-child relationships in a data-parallel fashion. This is largely accomplished by converting that depth vector in the Quad-XML format into a "path matrix" which allows for computing and reasoning about the parent child relationships of any two arbitrary nodes in the AST without reference to any other parts of the tree, without pointers. I can then optimize that path matrix representation for either ease of construction or for performance over certain types of common operations.
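As a rough illustration only, here is one way a depth vector can be turned into such a matrix, sketched in Python. This is a simplified reading and the encoding in the paper may differ in its details; the function names and the example tree are hypothetical. Each node gets one row, and column k holds a running count of the nodes seen so far at depth k, zeroed out beyond the node's own depth. In preorder, that count identifies the node's ancestor at depth k, so two rows are enough to decide an ancestor relationship.

```python
def path_matrix(depth):
    """Build one row per node from a preorder depth vector.

    Row i, column k (for k <= depth[i]) is the ordinal of the most recent
    node at depth k, which in preorder is node i's ancestor at that depth
    (or node i itself). Columns deeper than the node are zero.
    The running per-depth count is a plus-scan over a one-hot encoding of
    the depths, so it can be computed in a data-parallel way; here it is
    written as a plain loop for readability.
    """
    width = max(depth) + 1
    counts = [0] * width
    rows = []
    for d in depth:
        counts[d] += 1
        rows.append([counts[k] if k <= d else 0 for k in range(width)])
    return rows


def is_ancestor(rows, depth, a, b):
    """True if node a is a proper ancestor of node b.

    Only the two rows are consulted: no pointers and no walk over the
    rest of the tree.
    """
    da = depth[a]
    return depth[b] > da and rows[a][: da + 1] == rows[b][: da + 1]


# Hypothetical tree: a root with two children; the first child has two
# children of its own, the second has one.
depth = [0, 1, 2, 2, 1, 2]
rows = path_matrix(depth)
```

Because every row is self-contained, a question like "which nodes sit under this parent" becomes a comparison of row prefixes that can be done for all nodes at once.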

That's really one of the most significant elements and sort of blows the whole problem wide open and actually makes it possible to do what I'm doing so easily.

Once I replace the standard recursive transformation idioms with this new set of Key/Path Matrix idioms, I use the Nanopass architecture to allow refactoring. Nanopass is a style of compiler construction that builds on the idea of functional programming. So, yes, the compiler itself is very very functionally oriented, and that's a very big part of the refactorability of the code. But also, since I have done so in a way that results in so few variable names, that's also a major component, and it means that code is often highly "independent" and can be removed or deleted easily.


Thanks, read your paper, that answers most of my questions. Fun to see such a completely different way of working with trees. Agree that point-free makes refactoring almost trivial.. one thing it shares with Forth style languages :)


As for what is harder and what is not, it's not really so much a matter of easier and harder. By replacing all of the normal techniques with equivalent ones, it's more just programming in a different style that nets more benefits. I wouldn't say it's fundamentally easier, because the hardest part of any problem is reasoning about the problem itself, but I do think that you get a lot of benefits for writing in this way that is not really harder than the traditional methods, either. There's a sort of 1 for 1 replacement of traditional programming techniques with new techniques. The new techniques solve similar problems, but are more parallel, and more concise.


> Don't complain that Chinese is ugly and unreadable just because you speak English as your native tongue.

This is an argument from analogy, and with a certainty nearing 100% it doesn't apply to programming languages.

If you want to take this argument all the way through... Why not use Japanese instead?

It's ugly, it's unreadable, it takes countless hours to master the language, the grammar, the writing system. In the end you arrive... at yet another language[1]. Which may or may not express some things that English can't. By the time you've mastered Japanese, you'll have achieved near perfection and all your goals in English :)

[1] I speak four and currently am in the process of learning a fifth language (Russian, Romanian, Turkish, English, Swedish). I can say with some "expertise" that you can't make direct comparisons between natural and computer languages.


Because some languages are better tools of thought than others for certain disciplines. Linguists have demonstrated that language itself shapes the way in which people approach and see problems.

While I could have chosen Japanese, it wouldn't suit the purpose as well.

Moving this into the domain of programming languages, the point is that working with APL as a notation fundamentally changes the way you see problems. It facilitates a style of thinking and working with code that promotes the ends I'm working on.

I'm sorry that you don't like analogy, but analogy is important to me, and I'll have to throw another one in here. To me, it's more about constraints that shape design than anything else. I'm not writing "English" style stuff in Japanese or Chinese, I'm writing, say, Chinese Poetry in Chinese. It would be exceptionally difficult to achieve the same result in English. Could you literally express the same content? Sure, maybe, but it would depend on what you counted as important. And the result would be very very very verbose indeed, and would no longer have the value that the same thing in Chinese poetic style would have.

In PL, I could have written the same compiler in CUDA, and I would guess that it would take at least 10,000 lines of code or more to make it work.

I could try to write it in any other number of methods, and I would argue that not only would the results have been more ugly, they would have been much less maintainable. I could have implemented the same literal algorithms with the same literal content, but to get the Human factors that I want, I would have a very very hard time of it.

This is a big problem I see in the PL community. We don't, as a whole, understand or care to study the impact of notation on our thinking. It's all just "syntax" and we can choose what we like. But that's not really true. Just because I could write an APL library in Scheme does not mean I will be able to duplicate the efforts of APL in Scheme. Going the other way, I've for a long time tried to imagine some way of getting syntactic abstraction a la Scheme's syntax-case into APL. I can't find a way that wouldn't basically be useless, because the human factors involved change the game.

So yes, you can translate Chinese poetry into English, and no one will consider the translation to be as good. Now, what most people are suggesting is that you can create Chinese poetry by starting from English and writing the same thing there. There are a host of reasons why that doesn't work.

It's not just about meaning/algorithm/semantics, but about the experience of working with the code day to day and how that affects your ability to work in your problem domain. The design of this compiler enables better, simpler collaboration, and more adaptability and flexibility than other designs I have tried. It obviates certain documentation burdens within the team that is involved in working on this code. It simplifies deployment, maintenance, and all sorts of other things.

Perhaps the biggest boon to working in this way within the team involved in this code is that everyone, from the managers down to the users of the compiler, is able to discuss and have conversations about making things happen, handle bug reports, and deal with architectural design issues, all without anything else in front of them except for the compiler code itself. We don't need other documentation, we don't need diagrams. We can all literally, from the top to the bottom level, work off the one single code artifact. Everyone can get what they need from that single code base, without needing extra levels of "human documents," because the code itself is a sufficient entity for human discussion.



