From the project:
...rth,←' A zs;A rs=scl(r.v(0));rr##mf(zs,rs,p);if(c==1){z.v=zs.v;R;}\',nl
rth,←' array v=array(z.s,zs.v.type());v(0)=zs.v(0);\',nl
rth,←' DO(c-1,rs.v=r.v(i+1);rr##mf(zs,rs,p);v(i+1)=zs.v(0))z.v=v;)\',nl
rth,←' DL(zz,if(rr##scl){rr##df(z,l,r,p);R;}\',nl  ...
No. And commit messages like "Hopefully that does it." No again.

 By keeping the code as visible (read: small) as possible, I see more code and can better reason at a macro level. To scale this down to the micro level of dealing with individual compiler passes, I replace all the traditional programming paradigms with others in a sort of one-for-one exchange. In this way, I develop a new set of idiomatic programming methods that are so concise that they can begin to be read the way we read and chunk English phrases. By doing so, it actually becomes easier to just write out most algorithms, because the normal name for such an algorithm is basically as long as the algorithm itself written out. This means that I start to learn to chunk idioms as phrases and can read code directly, without the cost of name-lookup indirection. I can get away with this because I've made reusability and abstraction vastly less important: I can literally see every use case of every idiom on the screen at the same time. It literally would take more time to write the reusable abstraction than it would to just replace the idiomatic code in every place. It's a case of the disposability of code reaching a point where reusability is much less valuable.
This means that in those cases where reuse is valuable, it's very valuable, and it comes to the fore and you can see it as the critical thing that it is. It doesn't get drowned in otherwise petty abstractions that assist reusability, since we don't need those anymore.
Furthermore, if I write my code correctly, there is very, very little boilerplate in the compiler. Almost none. This means that every line is significant. It means that you don't get the fun of feeling like you're accomplishing something by typing in lots of excess boilerplate, but it does mean that you have no wasted architecture. Because rewriting the architecture is so trivial, basically everything now becomes important, and you don't have petty bookkeeping code around.
You know that everything is important, and there are no superfluous bits.
The result, as mentioned elsewhere, is code that is getting continuously simpler, rather than continuously more complex. The code is getting easier to change over time, not harder. The architecture is getting simpler, more direct, and easier to explain. Because it costs so little to re-engineer the compiler, I can do so constantly, resulting in little to no technical debt.
What enables me to program this way is an intentional, synergistic choice of a host of programming techniques, styles, disciplines, and design choices. Give up one of them and you start to break things down. It allows for a highly optimized code base that has all of the desirable properties people wish their code bases had, and it scares people. I think that's a good thing, because I don't want people to see this codebase as just another thing. I want them to see that this is something truly different. How can I get away with no module system? How can I get away with no hierarchy? How can I get away with having everything at the top level, with almost no nested definitions? How can I get away with writing a compiler that is not only shorter, but fundamentally simpler from a PL standpoint than standard compilers of similar complexity, by using only function composition and name binding? How can I get a code base that has more features but continues to shrink?
By chasing smaller code. :-)
I assure you, and I'll make good on this in another reply here, I could get you up and running on understanding the code and how it works faster than just about any other compiler project out there. In the end, one of the goals I have for this compiler is for people to say, "Woah, wait, that's it? That's trivially simple." The more I can push people to think of my compiler as so trivial as to be obvious, the more I win.
The compiler really is so dirt simple as to shock any normal compiler writer.
But to make it that simple, I have to do things in ways that people don't expect, because people expect complexity and indirection, they expect unnecessary layers for "safety", and they expect code that needs built-in protections because the code is too complex to be obviously correct.
I'm pushing in the other direction. If you can see your entire compiler at one go on a standard computer screen, what sort of possibilities does that open up? You can start thinking at the macro level, and simply avoid a whole host of problems because they are obviously wrong at that level. When you aren't afraid to delete your entire compiler and start from scratch, what sort of possibilities does that open up to you?
 First, please let me apologize for my ill-considered and rude comment... cringe.
Thank you for explaining. Wow, so much to chew on here. The naming conventions and trains sound really interesting. I can see how having a lot of the code visible on one screen would be a fantastic advantage. Again, thanks for writing this up. Obviously I didn't find your code transparent at first glance, but clearly if one takes the time to understand what you are doing, the approach has its benefits. I look forward to reading more of what you post. And you've got me intrigued about APL.
 Your comments reminded me of this anecdote about Arthur Whitney:
"The k binary weighs in at about 50Kb. Someone asked about the interpreter source code. A frown flickered across the face of our visitor from Microsoft: what could be interesting about that? “The source is currently 264 lines of C,” said Arthur. I thought I heard a sotto voce “that’s not possible.” Arthur showed us how he had arranged his source code in five files so that he could edit any one of them without scrolling. “Hate scrolling,” he mumbled."
I suspect his code looks a lot like the J incunabulum:
 It does. Furthermore, he's "simplified" APL in K to require less infrastructure, fewer primitives, and the like. Combined with some clever, and some would argue devious, programming practices, he's able to keep things pretty small. I don't know if the interpreter is still that small, though. If someone reminds me, maybe I can talk about scrolling. :-)
Since I believe Whitney wrote the J incunabulum, I suspect that it looks very similar. The code is actually quite simple and straightforward if you take the time to read it.
 Could you write a blog post (it probably needs several) about the code style, architecture, and design of your compiler and the idioms that you talk about? I love the idea of keeping a project code base so small, by leveraging concise idioms, that everything fits in a meat-bag head, but I have no idea how one goes about achieving that in practice. (Learning APL to get some pearls of wisdom would be fine.)
 It's something I've been working on for a while, but because the architecture is under constant flex, it's actually more valuable to be able to know how to "experience" or discover the architecture in the compiler code itself than to have a separate document to follow, since it's very easy for that document to get out of date quickly. I am building up a set of documents that discuss some of the core idioms and ideas though, and I hope to have something come of this live session that I can maybe put into an interactive document that people can work with.
 The little essay you've given us in these two HN comments is one of the most brilliant things about programming I've ever read.
 Two things I want to say/ask.
1. What happens if you get sick? You say this is a project in production and there is money on the table (I assume not only yours). What if you get sick and are unable to work for 3 weeks or 6 months? Don't you think that this code would be very hard to grasp for someone else who had to temporarily work in your position?
2. It is weird that you wrote such a long essay, spanning two comments, but it has so few examples from the actual code. Usually when people explain stuff they go back and forth between the abstract concepts and how they are materialized in the code. Here you only explain the idea behind writing it and how it makes you feel/operate/gain flexibility and performance, but the closest thing to information about the code that I've got from it is that it has compiler passes and a C++ runtime in a string variable. Just a thought; what do you think about that?
 At this point, if I get sick, the code doesn't move much. If I were permanently disabled, then someone else could take over. I have people contribute bugs, tests, and other things fairly often. If you had to temporarily work on the code base and weren't familiar with the background of the project, I would say you'd be lost. It's just not the sort of thing where you can start tweaking things here and there so easily, because almost everything that needs changing is a matter of addressing architectural or otherwise serious questions that require you to really understand the project. Because of the way the code is written, there's basically no "code monkey" type work. That means that you only do meaningful work, but it also means that only people who are knowledgeable architects can work on the code. You can imagine the same thing in other code bases. Imagine that you didn't need any of your lower-level programmers anymore because there was nothing for them to do. Now imagine how the bus factor changes when only your chief architects are necessary for working on that code base. That's very nice in one dimension, but it does create quite a different picture.
You're right about the code examples. I figured that people were already posting some code snippets. I wanted to give the big ideas rather than any specifics. The reason for this is basically that if you take any single line of code out of context, it's a bit hard to explain why I'm doing the things that I'm doing. It's very much a macro design, which is why I am offering the live session to go through it. It's sort of, but not quite, an "all or nothing" thing. If you let me sit down with you and go through the entire code base, then I can explain how it all fits together and why things are the way they are, but if you just take a single piece of code out, you're missing the picture.
If I took a single compiler pass out, for instance, you'd have between 1 and 12 lines of code to look at.
I could explain a few features, but how would I explain that when you look at this piece of code you're able to see it entirely in context? Well, I can't, because the code is completely out of context at that point. Or what about demonstrating how the naming conventions exhibit structure-informative regularity? Again, I can't, because that's a visual design element of the code. It's something you have to "see" by looking at the whole painting, as it were.
The naming convention is actually a great example. Out of context, there's apparently no rhyme or reason to it. But in context, it forms a key component of the visual regularity and continuity throughout the code. The names are an important part of how you can see the structure of the code. It helps to orient you in the big pie. But if I were to quote a single line here, there's no pie to look at, no sky to navigate by. It's just a single constellation. By analogy, it does little good to say, here's the Big Dipper, it's useful. But why? Because it's easy to find amidst the context of stars and its shape helps you to find the North Star. But on its own it doesn't seem as valuable. At that point it is just another constellation. The same thing happens with this code.
So I'll go through and explicate it all in detail in the live session, where I can provide the "painting" and workflow in its entirety so people can see how it works. Then you can see how my comments here match up with the code.
 Something that might be worthwhile to consider is the fact that someone who wants to make a change only needs to look at a small program instead of a large program.
In the large-program case, the programmer feels like they can cross-cut it, install some duplication, and yes: get their change done faster, but at the cost of making the program bigger.
But in the small-program case, you only pay the cost of learning the codebase when you add a new programmer to it -- something that happens very infrequently. Your program stays small, and you gain all the benefits therein (faster, fewer bugs, and so on).
 This is really admirable stuff and I share this kind of goal even though I'm not working in APL style at this time, though I understand the appeal of shifting in that direction as more of the code gets abstract - and it necessarily should be so abstract if you're trying to maximize the simplicity. I believe most codebases suffer from prematurely abstracting with the easy stuff built into the source language (classes, generics, etc.), and then not having the abstraction they really need when it's necessary, and being too tangled up to build it.
The only problem is that I don't know where to start if I wanted to study what you're doing and take notes. Those millions of lines of changes are still lurking in the background as building blocks for an overall understanding.
 The live session would be the first start, obviously, but you can also see the Publications area of the README: https://github.com/arcfide/Co-dfns#publications
Some of that deals with the micro and some with the macro level ideas, but there are some key elements in those that will be necessary to appreciate the whole thing.
 > Don't complain that Chinese is ugly and unreadable just because you speak English as your native tongue.
That's a great counterargument, and one I fully agree with. I've noticed that over the years, there has been a growing trend of promoting "readable, maintainable, clean, insert-fashionable-adjective-list-here code" which really amounts to a lowest-common-denominator, dumbing-down perspective of how software should be written. In that perspective, code that someone does not immediately understand is "bad", seemingly regardless of how much (or little) knowledge that someone possesses. I think this is ultimately a harmful trend.
The opposing view, which appears to be largely a minority in more mainstream language communities but dominates in others like APL and Asm, is that programming languages are essentially like human languages: they need to be learned, are not necessarily "easy" or "familiar", and this learning and eventual mastery is wholly beneficial to their use. As with human languages, it is neither expected nor a problem that a beginner will not immediately understand code written by a more advanced user. Instead, the beginner progresses by learning the language and eventually becoming an advanced, "literate" user. This can be summed up in one sentence: "The code is unreadable because you are not yet qualified to read it." ;-)
 Taking an example from the parent:
> rth,←' array v=array(z.s,zs.v.type());v(0)=zs.v(0);\',nl
I don't know APL or the Rth variant, but this goes against most standard style guides. There may be good reasons for it, but they are not obvious to an external viewer.
- What are 'v' and 'vs'? Is this quickly obvious from context (to anyone but the author)? Where are the comments?
- Why are they single characters? (Non-descriptive variables are almost always a code smell.)
- Why does it do multiple things on one line? I think this is a limitation of the use of Rth? Normally this sort of whitespace compression is verboten, even in functional languages like Lisp.
I think the parent has a good point about this being very difficult code to understand, and the OP has confused terseness with quality. If this code followed traditional coding styles it would be easier for new people to understand what the hell is going on, and would probably be 3+ times longer in LoC. But who the hell uses LoC as a valid metric anyway? Besides the worst sorts of manager, of course...
 > but they are not obvious to an external viewer
That's precisely the point. The whole philosophy of APL is that it's not supposed to be obvious to anyone who doesn't (yet) know it. However, seeing as the character set is still Latin, it's not hard to guess at what it does even if you don't know the language.
- I don't see 'vs' in the snippet, but would guess V stands for Vector.
- Even in all but the most anal "standard style guides", single-character names are normal for temporary/limited-scope variable names.
- You could likewise ask why Chinese words aren't separated by spaces, or why English words need to be. It's a different language with its own grammar and style.
I don't know APL either, but at least I make an effort to see their perspective on the language, because it is clear that there are people who are highly proficient at working with code like this. (Likewise, I would guess that experienced APL'ers probably find more "traditional" languages like C, Java, Python, etc. "unreadably" verbose.)
 V for Vector is an appropriate, but not the only, interpretation for that letter.
In the case of this compiler, I take the opposite convention. Most single-character names are globally meaningful, and their meaning rarely, if ever, changes across the whole compiler. As names get longer, they represent progressively more local elements. This is done in a way that reveals the nesting structure, but it is also done because over time I realized that it was harder to remember from one patch to the next what the local variables were meant to do than what the global variables meant, since those were almost always the same and were much more likely to be in mental cache. Therefore, I used more "information", that is, more characters, for local names whose meaning I would more likely forget later than for global names that were universal and almost always on my mind.
And yes, I did try doing this compiler in many, many other styles, including C, C++, Nanopass Scheme, ML, Java, Cleanroom, traditional APL style, and so on and so forth. They were all unreadably verbose, difficult to work with, and very hard to make forward progress on.
 > There may be good reasons for it, but they are not obvious to an external viewer.
But isn't the real question whether that obviousness is more important than ease of comprehension and maintenance for someone who does have the required skills to work on the code?
To a child who has just learned squares and square roots and who has never encountered TeX, the expression $e^{i\theta}=\cos\theta+i\sin\theta$ is probably just line noise. To a practising mathematician, it is immediately recognisable and a useful tool. Obviously the difference is that the experienced mathematician has learned the underlying concepts and the notation to represent them. The result is that while the teenager might be learning double-angle formulae by rote for their trigonometry exam in a few years, the experienced mathematician could use their more powerful tool to derive those formulae or any variations on the theme in moments whenever they need them. Their greater skill and understanding makes them much more capable.
There are certainly reasonable arguments for making the code for some projects accessible to new developers, but doing that isn't free if it also means compromising some aspect of that code for current developers. It's a trade-off, and sometimes requiring new developers to have a certain level of skill and understanding before they can work on a project is OK.
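As a worked illustration of the derivation mentioned above (a sketch that uses nothing beyond Euler's formula as quoted), squaring both sides and comparing real and imaginary parts yields the double-angle formulae:

```latex
\begin{align*}
e^{i2\theta} &= \left(e^{i\theta}\right)^2 = (\cos\theta + i\sin\theta)^2 \\
             &= \cos^2\theta - \sin^2\theta + 2i\sin\theta\cos\theta, \\
e^{i2\theta} &= \cos 2\theta + i\sin 2\theta, \\
\text{so}\quad \cos 2\theta &= \cos^2\theta - \sin^2\theta,
\qquad \sin 2\theta = 2\sin\theta\cos\theta.
\end{align*}
```

The same move with a cube instead of a square gives the triple-angle formulae, which is the "variations on the theme" point.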
 Well put. The important thing is to see what the tradeoffs actually are. Unseen tradeoffs often look like obvious wrongness.
 x,←y simply appends y to x; x,←y,nl just appends y to x, then adds a newline. '' is just a string containing C code. I think you're making this more complicated than it is.
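For readers who don't know APL, here is a rough Python analogue of that append idiom. It is only a sketch: `append_line` is a hypothetical helper, the names `rth` and `nl` are borrowed from the quoted snippet, and the C fragments are invented for illustration rather than taken from the project.

```python
# Hypothetical stand-ins for the names in the quoted snippet:
# `rth` accumulates the runtime source text, `nl` is a newline.
nl = "\n"


def append_line(rth: str, fragment: str) -> str:
    """Rough analogue of `rth,←fragment,nl`: append the fragment, then a newline."""
    return rth + fragment + nl


# Build up a small piece of C source one line at a time.
# These fragments are made up for the example.
rth = ""
rth = append_line(rth, "int add(int a, int b) {")
rth = append_line(rth, "  return a + b;")
rth = append_line(rth, "}")
```

Each statement only grows one string, which is why a long run of such lines reads as "C code stored in a variable" rather than as APL logic.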
 I would be more sympathetic to this argument if the code was visibly a collaboration.
I am perfectly willing to believe that I could reduce the size of my code by a factor of 10, maybe even 100, if I was willing to give up the constraint of making it maintainable independently of myself. I think that would be a poor tradeoff to make in most cases.
 You have a great point but I would state it in a positive way. What sort of system could a small team build if more than one programmer (let's say 3 or 4) could maintain the intimate familiarity with a small codebase, and consequent hyper-productivity, that arcfide is describing?
 You're clearly very enamoured with this approach; I'm not. I've seen it before (as arcfide is reminding us, APL has been around for many decades; I find the Forth philosophy similar too) and I think it's a dead end, a seductive trap. You can't build for single-programmer productivity and then retrofit maintainability afterwards.
More generally I think choosing tools based on small examples is a big systematic bias affecting the industry; I can absolutely understand why people do it (because who has time to compare large systems?) but I think it holds us back, and I think this particular programming style games that metric even more heavily than most, meaning people falsely attribute advantages to these languages that don't exist in the real world. I think the scepticism a lot of people are showing here is very healthy and frankly I'm surprised you don't share it.
 You can go through the Dyalog meetings and see how APL scales up and down along the spectrums.
I'm glad you think my compiler is a small system. The problem I'm solving is one that people said was simply too difficult and impractical to pursue. If I have made it so simple as to be dismissed as trivial, then that's good. :-)
I'm happy to walk you through the compiler in the live session and let you decide for yourself just how maintainable it would be if you had to pick it up. But this code base has been designed with maintainability in mind from the beginning.
How big is a big system? You've called this a small system, but it's a compiler with commercial backing/funding that compiles a language used in production systems, and is, to my knowledge, the only compiler able to express core compilation algorithms in an efficient manner on the GPU. It's rapidly moving to the self-hosting point, and at that point we will have a complete compiler that compiles a real language and runs completely and entirely on the GPU, from parser to generator.
To give you an idea of the task: the neatest and cleanest published code that I know of for a basic scan primitive implemented efficiently on the GPU is 100 lines. If you compressed it, you could probably fit it into 50 - 70 lines of code. That's for one simple operation that takes anyone a single line of C code to write.
This project has taken a real compiler (it's not a C++ compiler, of course) and is putting it on the GPU. Is this a small system?
I would put it in the realm of the sort of problem that can only be meaningfully solved by simplification.
However, this isn't the only code base around. There's another company that has a larger team of APLers who maintain over 1 million lines of APL code in production. At that scale they have to make different design choices than I do, but they also say that if they can do it in APL, they do, and they wish they could do everything in APL.
They are one of the only groups, to my knowledge, that has been able to see a net gain in value from implementing a static type system on top of APL's core. So, in terms of scalability, yeah, maybe you need something more (like a static type system) as your code grows, but if you manage to need 1 million lines of APL for your problem, then you're in a good place.
Still, just come to the live session and we can discuss all of the issues that you see with maintainability. If you can see a way to make the code simpler and easier to reason about at a macro level, I'll be all for it!
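To make the scan comparison above a little more concrete, here is a small Python sketch. It is not the compiler's code and it is not GPU code. It contrasts the serial "one line" prefix sum with a step-doubling formulation (Hillis-Steele style, a textbook technique swapped in here for illustration; the published GPU implementation referred to above may well use something else). The function names are hypothetical.

```python
from itertools import accumulate
from typing import List


def scan_serial(xs: List[int]) -> List[int]:
    """Inclusive prefix sum, the serial 'one line' version."""
    return list(accumulate(xs))


def scan_step_doubling(xs: List[int]) -> List[int]:
    """Inclusive prefix sum in the step-doubling (Hillis-Steele) style.

    Each pass adds in the element `stride` positions to the left.
    Within a pass every position is computed independently from the
    previous pass, which is the property a parallel machine could
    exploit; here the passes simply run one after another.
    """
    out = list(xs)
    stride = 1
    while stride < len(out):
        prev = out  # read only from the previous pass
        out = [
            prev[i] + prev[i - stride] if i >= stride else prev[i]
            for i in range(len(prev))
        ]
        stride *= 2
    return out
```

Even this toy version needs explicit passes and a stride where the serial version needs neither; a real GPU implementation adds memory layout, synchronisation, and block-level details on top, which is where the extra lines go.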
 > I'm glad you think my compiler is a small system. The problem I'm solving is one that people said was simply too difficult and impractical to pursue. If I have made it so simple as to be dismissed as trivial, then that's good. :-)
I figure anything being done by one person is necessarily that trivial. Maybe you're doing the work of 100 people. Maybe the work of 1000. But you can't scale arbitrarily far; at some point you'll hit your limit. The amount of work one programmer can do is, ultimately, O(1).
> they also say, if they can do it in APL, they do, and they wish they could do everything in APL.
Fair enough; where I'm working there's a rather different view of the APL parts of our codebase.
> Still, just come to the live session and we can discuss all of the issues that you see with maintainability. If you can see a way to make the code simpler and easier to reason about at a macro level, I'll be all for it!
I can't/don't do audio/video/"live" I'm afraid (and if that's the only way you can explain the code then that itself reflects badly on its maintainability). I'll read a transcript with interest.
I do think the value of conciseness is real and underrated - at the same time it's very possible to overestimate it if you're looking right at the transition point between a project being small enough to keep in your head at once and not, because if your project is very close to that line then you can reap huge gains from small conciseness improvements, but not in a way that scales. I once looked at implementing a lot of the APL operators (Scala supports unicode identifiers and has a very flexible syntax, so you can actually get pretty close). But I've found that, at least in the context of a large codebase moving incrementally (and I firmly believe that's the one that ultimately matters, for the reasons above), the conciseness gain isn't worth the cost of not having clear English names for all the operations.
Indeed I now try to move away from symbols and short names in general as much as possible.
 > You can't build for single-programmer productivity and then retrofit maintainability afterwards
Perhaps my use of the word 'maintain' was confusing. I'm not suggesting that one programmer write such a system and others then take over maintaining it. I'm suggesting that 3 or 4 programmers write (and maintain) such a system together and all be intimately familiar with it.
 Sure, I get that. I just think a language needs to be built from the ground up to allow multi-programmer collaboration, and that there are few if any valuable lessons to be taken from what works in the single-programmer case.
 You're asserting that this isn't multi-programmer friendly. I'll agree that it's not "code monkey" friendly, but I disagree that it is not oriented towards multiple programmers. And the APL language has almost all the features you would expect from a modern multi-paradigm language, including branching, control structures, recursion, exceptions, objects, frameworks, interfaces to other languages, and so on and so forth.
But APL was designed from the beginning to enable human communication. I would argue that almost all programming languages fail to be a good medium of human communication. My evidence for this assertion is how people write when they think the computer won't need to see the code, such as in academic publications on computer science: look at what they use in the paper. Almost all of the people who implement their ideas in one language or another fail to include the entire code in their papers, and they usually include some mathematical notation and diagrams to explain their ideas instead. They may include some small snippets of code, but they rarely if ever include the full code. Dan Friedman being an exception that proves the rule, if you will.
If you then take a look at how APLers communicate when they have ideas, you see code all the time, all day long. The APL community is the only one I've seen that regularly can write complete code and talk about it fluently on a whiteboard between humans without hand waving. Even my beloved Scheme programming language cannot boast this. When working with humans on a programming task, almost no one uses their programming language as the primary communication method between themselves and other humans outside of the presence of a computer. That signals to me that those languages are not, in fact, natural, expedient tools for communicating ideas to other humans.
The best practices utilized in most programming languages are, instead, attempts to ameliorate the situation to make the code as tractable and as manageable as possible, but they do not, primarily, represent a demonstration of the naturalness of those languages to human communication.
 Academia is its own thing with its own incentives. I wouldn't generalise from what happens in academic papers.
When I see people communicating in (my part of) the industry they use pseudocode, which is often described as looking like python. They use if anything fewer symbols (and more space) than a real programming language. They do indeed elide parts of the code - often things like error handling.
To my mind that says: we should use languages in which code looks like pseudocode/python (this idea was suggested in http://paulgraham.com/hundred.html , though he takes it in a different direction). And we should look for ways to elide in real code the parts that people like to elide when talking about programs: to e.g. have "ambient" error handling that's more-or-less invisible most of the time, without sacrificing the safety advantages of checking error cases (this is why I'm interested in e.g. effect systems).
 I'd be very surprised if your industry really did use complete pseudocode and only elided error handling. On the other hand, you're sort of assuming in your conclusion that pseudocode is the "better way" for languages because that's what people use, but you're leaving out the initial bias. I would argue that if you made current industrial languages more like pseudocode, you'd probably do better, yes, but it's a local maximum derived from an assumption of what the end result will be.
In other words, people use pseudocode because it's close to the code they intend to write and represents their current notational expectations. It's an enforcement of legacy methods of thinking.
But many people have admitted that there is a problem with writing pseudocode-style programs for modern hardware performance, where taking advantage of parallelism is important.
Furthermore, I would argue that academia is relevant because it's one of the few places where the ideas are more important than the executable. If the ideas are communicated clearly, then you've succeeded. If we really want to program for the human, then we want our programs to be focused on the communication of ideas, and not machine-focused. And the reality is that if you take the machine away, and focus on human-to-human communication, without any "industrial" bias (expectation of machine execution), then rigorous idea communication is almost always pictorial, visual, and ideographic. Furthermore, the notations that people develop and have developed over time to communicate ideas never end up looking like mainstream programming languages. As people work with ideas, math notation is the quintessential notation for communicating human ideas rigorously.
It is highly evolved for human consumption and manipulation, rather than being machine-focused.
I believe there have also been some studies on how people without any computing background describe processes, and they consistently find that many of the core "serial" programming concepts are not "natural" in human thought, but a very acquired taste.
Again, I would be surprised if you put a bunch of industry or non-industry professionals up to a whiteboard, had them illustrate their ideas rigorously to one another on just that whiteboard, and found that they naturally gravitated to any real programming language. And I doubt strongly that they would actually continue to use pseudocode at scale on the whiteboard.
 > I'd be very surprised if your industry really did use complete pseudocode and only elided error handling. On the other hand, you're sort of assuming in your conclusion that pseudocode is the "better way" for languages because that's what people use, but you're leaving out the initial bias. I would argue that if you made current industrial languages more like pseudocode, you'd probably do better, yes, but it's a local maximum derived from an assumption of what the end result will be.
Error handling was one example - I see concerns like serialization, permissions, and transactionality commonly elided, and I look for better ways to handle them in programming languages as well.
> I would argue that academia is relevant because it's one of the few places where the ideas are more important than the executable. If the ideas are communicated clearly, then you've succeeded.
Maybe. That assumes that the successful papers (and successful academics) are those that communicate ideas clearly. I'm not convinced.
> the reality is that if you take the machine away, and focus on human-to-human communication, without any "industrial" bias (expectation of machine execution), then rigorous idea communication is almost always pictorial, visual, and ideographic.
Not my experience at all - if anything I'd say visual aspects tend to be a marker of less rigorous communication.
> Furthermore, the notations that people develop and have developed over time to communicate ideas never end up looking like mainstream programming languages. As people work with ideas, math notation is the quintessential notation for communicating human ideas rigorously.
Mathematics is one such notation; "legalese" is another, and philosophical terminology a third. I'm wary of generalising too much from mathematical notation alone.
 > Not my experience at all - if anything I'd say visual aspects tend to be a marker of less rigorous communication.
I would point to the field of combinatorics, and to the traditional proofs of both the ancient Chinese mathematicians and those of the West, both of which took on various elements of geometry and spatial reasoning for a significant number of their proofs when other tools were not yet available. The development of algebra I see as a chiefly visual and ideographic one, even a tangible or malleable one. The development of UML diagrams is another. Flow charts are another. We have the abacus and Chinese counting sticks, as well. And finally, while poetry is not specifically rigorous, it is efficient in a way that few other communication methods are. And we find a great deal of "visual cue" elements in that field. In physical sciences and statistics, visualization is a very important tool. Mathematical notation itself is largely spatial and visual at scale. As for legalese, I would argue that legalese is perhaps well designed for experts to be complete, but not for clarity. Comprehensiveness is different from clarity or rigor. And as for philosophy, vocabulary is not enough. And you'll note that some of the best notational systems to arise came from the philosophy departments in working on logical systems. Those are all usually notationally represented using ideographic, rather than natural language, forms. And even some Eastern philosophers who wrote very verbosely tended to make their arguments from visualizations in the mind to make their point. Musical notation, again, has evolved into a spatial, visual notation. A large number of traditional writing systems were ideographic, including ones we now consider alphabetic/phonetic.
 A codebase and its terseness rarely matter. Understanding the business processes that govern why the code exists is usually much more important than the code itself. After that, familiarity of code comes first. And by familiarity I mean: common patterns, common solutions, ability to bring new people into the fray. Small terse languages tend to breed long-running small teams with an insanely high bus factor.
 I agree short/dense/simple/linear code has huge benefits that most programmers haven't experienced, simply because it is so hard to create (especially in some languages). Your code is both impressive and inspiring. What additionally interests me is the combination of point-free style and the kind of data structures you're processing in an array-biased language. Could you give an insight on what that is like to work with? In particular, I presume from your description, and only a conceptual familiarity with APL, that most or all of this code is "functional", i.e. all data structures exist as values passed between the composed functions, and nowhere else (no globals or similar). I'd love to hear more about the predominant data structures and what shape they take. Somewhere else you mention Quad-XML, which seems to be a way to represent trees as arrays, with each element prefixed with its depth. I presume you use this for the AST? What kinds of operations are simpler on these arrays, and which are harder, compared to tree data structures used in other languages? For example, addressing the Nth child from a parent could be harder, since you have to search past the other children? I could imagine that operations like "set all fields X of the tree to Y" are a lot easier since no tree traversal is required. Does your ability to quickly refactor rely on this functional nature?
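To make the question above concrete, here is a minimal Python sketch of a depth-prefixed tree stored as parallel columns. It is not Co-dfns code and not the actual Quad-XML layout: the column names (`depth`, `kind`, `name`), the sample tree, and both helper functions are assumptions made up for illustration. It shows why a whole-tree field update is a single flat pass, while finding the Nth child needs a forward scan past earlier siblings.

```python
# Hypothetical example tree, stored in preorder as parallel columns.
# depth[i] is the nesting level of node i; a node's children follow it
# directly and have depth one greater than the node itself.
depth = [0, 1, 2, 2, 1, 2]
kind = ["Module", "Func", "Var", "Num", "Func", "Var"]
name = ["m", "f", "x", "1", "g", "y"]


def rename_all(kind, name, target_kind, new_name):
    """Set the name of every node of one kind: a flat pass, no traversal."""
    return [new_name if k == target_kind else n for k, n in zip(kind, name)]


def nth_child(depth, parent, n):
    """Return the index of the n-th (0-based) child of `parent`, or None.

    The subtree of `parent` ends at the first later node whose depth is
    not greater than depth[parent], so we scan forward until then and
    count the nodes that sit exactly one level below the parent.
    """
    d = depth[parent]
    seen = 0
    for i in range(parent + 1, len(depth)):
        if depth[i] <= d:
            break  # left the parent's subtree
        if depth[i] == d + 1:
            if seen == n:
                return i
            seen += 1
    return None
```

With this layout, `rename_all(kind, name, "Var", "v")` touches every `Var` node without knowing anything about the tree shape, whereas `nth_child(depth, 0, 1)` has to walk past the whole first subtree before it reaches the second child.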
 I've structured the point-free style so that it's basically like working with any expression; I am just working with expressions that build functions instead of expressions that build values. The compiler is very functional in style, and the entire core of the compiler is just a single data-flow graph if you get right down to it. I'll discuss this more in the live session, but the passes operate in the Nanopass style over a mostly monotonically growing (along the field axis) matrix representation of the AST whose core "columns" correspond to the core columns of the Quad-XML format. Rather than a single "attributes" column I flatten the attributes column into multiple main columns, but otherwise it's the same, and the "Xml" function in my compiler helps to convert to Quad-XML format and serialize the AST for those who want to store intermediate AST results. The challenging part, which is part of what the research is on, is doing tree transformations in an efficient manner. Because this is a pointerless representation, you have to be careful in how you design the structure to ensure that you maximize locality and parallelism. If you read my "Key" paper in the publications: https://github.com/arcfide/Co-dfns#publications you'll see how I manage one of the most tricky elements. By using the techniques I describe in that paper, I can perform arbitrary computation over any group of sub-trees in the AST selected by their parent-child relationships in a data-parallel fashion. This is largely accomplished by converting that depth vector in the Quad-XML format into a "path matrix" which allows for computing and reasoning about the parent-child relationships of any two arbitrary nodes in the AST without reference to any other parts of the tree, without pointers.
I can then optimize that path matrix representation for either ease of construction or for performance over certain types of common operations. That's really one of the most significant elements and sort of blows the whole problem wide open and actually makes it possible to do what I'm doing so easily. Once I replace the standard recursive transformation idioms with this new set of Key/Path Matrix idioms, I use the Nanopass architecture to allow refactoring. Nanopass is a style of compiler construction that builds on the idea of functional programming. So, yes, the compiler itself is very, very functionally oriented, and that's a very big part of the refactorability of the code. But also, since I have done so in a way that results in so few variable names, that's also a major component, and it means that code is often highly "independent" and can be removed or deleted easily.
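As a rough illustration of the depth-vector-to-path-matrix idea described above, the Python sketch below builds one row per node listing that node's ancestors by depth level. This is a simplified reading, not the construction from the Key paper: the function names, the -1 padding, and the sequential loop are assumptions chosen for clarity, whereas the compiler is described as working with such a structure in a data-parallel fashion.

```python
def path_matrix(depth):
    """Build a path matrix from a preorder depth vector.

    Row i holds, for each depth level k <= depth[i], the preorder index
    of node i's ancestor at level k (with node i itself at level
    depth[i]). Levels below the node are padded with -1.
    """
    width = max(depth, default=-1) + 1
    current = [-1] * width  # most recent node seen at each level
    rows = []
    for i, d in enumerate(depth):
        current[d] = i
        for k in range(d + 1, width):
            current[k] = -1  # deeper levels belong to an earlier subtree
        rows.append(list(current))
    return rows


def parent_of(paths, depth, i):
    """Parent index of node i, or -1 for the root."""
    return paths[i][depth[i] - 1] if depth[i] > 0 else -1


def is_ancestor(paths, depth, a, b):
    """True if a is a proper ancestor of b.

    Only the rows of the two nodes involved are consulted; no other part
    of the tree is visited and no pointers are followed.
    """
    return depth[a] < depth[b] and paths[b][depth[a]] == a


# Hypothetical depth vector for a six-node tree in preorder.
example_depth = [0, 1, 2, 2, 1, 2]
example_paths = path_matrix(example_depth)
```

Because each row is self-contained, a question such as "do these two nodes share a parent?" becomes a comparison of two rows, which is the property that makes grouping nodes by parent possible without walking the tree.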
 Thanks, read your paper, that answers most of my questions. Fun to see such a completely different way of working with trees. Agree that point-free makes refactoring almost trivial.. one thing it shares with Forth style languages :)
 As for what is harder and what is not, it's not really so much a matter of easier and harder. By replacing all of the normal techniques with equivalent ones, it's more just programming in a different style that nets more benefits. I wouldn't say it's fundamentally easier, because the hardest part of any problem is reasoning about the problem itself, but I do think that you get a lot of benefits from writing in this way, which is not really harder than the traditional methods either. There's a sort of 1 for 1 replacement of traditional programming techniques with new techniques. The new techniques solve similar problems, but are more parallel, and more concise.
 > Don't complain that Chinese is ugly and unreadable just because you speak English as your native tongue.
This is argument from analogy, and with a certainty nearing 100% it doesn't apply to programming languages. If you want to take this argument all the way though... Why not use Japanese instead? It's ugly, it's unreadable, it takes countless hours to master the language, the grammar, the writing system. In the end you arrive... at yet another language[1]. Which may or may not express some things that English can't. By the time you've mastered Japanese, you'll have achieved near perfection and all your goals in English :)
[1] I speak four and currently am in the process of learning a fifth language (Russian, Romanian, Turkish, English, Swedish). I can say with some "expertise" that you can't make direct comparisons between natural and computer languages.
 Because some languages are better tools of thought than others for certain disciplines. Linguists have demonstrated that language itself shapes the way in which people approach and see problems. While I could have chosen Japanese, it wouldn't suit the purpose as well. Moving this into the domain of programming languages, the point is that working with APL as a notation fundamentally changes the way you see problems. It facilitates a style of thinking and working with code that promotes the ends I'm working on. I'm sorry that you don't like analogy, but analogy is important to me, and I'll have to throw another one in here. To me, it's more about constraints that shape design than anything else. I'm not writing "English" style stuff in Japanese or Chinese, I'm writing, say, Chinese Poetry in Chinese. It would be exceptionally difficult to achieve the same result in English. Could you literally express the same content? Sure, maybe, but it would depend on what you counted as important. And the result would be very very very verbose indeed, and would no longer have the value that the same thing in Chinese poetic style would have. In PL, I could have written the same compiler in CUDA, and I would guess that it would take at least 10,000 lines of code or more to make it work. I could try to write it in any number of other ways, and I would argue that not only would the results have been more ugly, they would have been much less maintainable. I could have implemented the same literal algorithms with the same literal content, but to get the human factors that I want, I would have a very very hard time of it. This is a big problem I see in the PL community. We don't, as a whole, understand or care to study the impact of notation on our thinking. It's all just "syntax" and we can choose what we like. But that's not really true. Just because I could write an APL library in Scheme does not mean I will be able to duplicate the efforts of APL in Scheme.
Going the other way, I've for a long time tried to imagine some way of getting syntactic abstraction a la Scheme's syntax-case into APL. I can't find a way that wouldn't basically be useless, because the human factors involved change the game. So yes, you can translate Chinese poetry into English, and no one will consider the translation to be as good. Now, what most people are suggesting is that you can create Chinese poetry by starting from English and writing the same thing there. There are a host of reasons why that doesn't work. It's not just about meaning/algorithm/semantics, but about the experience of working with the code day to day and how that affects your ability to work in your problem domain. The design of this compiler enables better, simpler collaboration, and more adaptability and flexibility than other designs I have tried. It obviates certain documentation burdens within the team that is involved in working on this code. It simplifies deployment, maintenance, and all sorts of other things. Perhaps the biggest boon to working in this way within the team involved in this code is that everyone, from the managers down to the users of the compiler, is able to discuss and have conversations about making things happen, handle bug reports, and deal with architectural design issues, all without anything else in front of them except for the compiler code itself. We don't need other documentation, we don't need diagrams. We can all literally, from the top to the bottom level, work off the one single code artifact. Everyone can get what they need from that single code base, without needing extra levels of "human documents." Because the code itself is a sufficient entity for human discussion.
 Snarky dismissals are not ok on Hacker News, especially not when they're advocating an entirely conventional and dare I say middlebrow position. When faced with something unconventional, the reaction we're hoping for from HN users is first to pause, and then to reflect. If after pausing and reflecting you want to argue that the conventional position is right, you'll be able to do that thoughtfully and with some sense of nuance.
 Point taken.
 in defence of the parent's snarkiness, this code is disgusting. imagine being presented with this and tasked with maintaining this. or adding a language feature. i'm certain the author could do it without much effort, but this code is as short as to be obfuscated - i have had more understanding from ioccc entries than this. code exists as a common language for humans to understand and collaborate. this code is nightmare-ish.
 There are quite a few assumptions in your comment that you could investigate if you wanted to. That might be more interesting than just being disgusted.
 actually the opening comment hit home pretty concisely, i implore you to read the code in question, the examples raised, and to come back and tell me that you would be happy to work with this code base. the author themselves states that they are responsible for the majority of commits. this should be a red-flag, itself.
 It is very common for large APL projects to require only a single person to maintain most of the source for a large portion of its life. While this doesn't help the bus factor much, it's not the red-flag it would be in other languages.
 Sounds like the reason is nobody else will touch it. Once the author is gone, you end up scrapping the project.
 This is a commercially funded project, there are other people reading and working with the code. There's just rarely a reason for them to commit any changes.
 Why don't you just point those assumptions out? I haven't the slightest clue what "assumptions" you are talking about and how to go about investigating them. The code is disgusting, not a single variable has a meaningful name.
 One assumption is that code should make sense to anyone who doesn't know the language it's written in. We wouldn't apply that to C, or to English for that matter, so why APL? Another assumption is that short variable names always make a program more obscure. Another assumption is that concision makes a program less readable. This assumption runs so deep that it's hard to even see it, but consider: one concise line of code is harder to read than one verbose line, but much easier to read than a million verbose lines. That is, in the time it would take me to understand a million lines of verbose code, I could learn the language of a concise program from scratch, work with its short (and at first obscure) codebase long enough to understand it, and still have orders of magnitude to spare. At some point on the code size curve, the tradeoffs change dramatically. I don't know the OP's program but I can tell you that the APL culture is a mature, sophisticated, and beautiful approach to programming that occupies a different local optimum than most programmers are used to. To react to this not just with disagreement but outrage ("disgusting"!) is quite interesting. There's something threatening about encountering an approach to one's area of expertise that is so radically different, based on such different assumptions, as to be outright alien. Our reflex is to dismiss it forcefully. But if you can catch yourself doing that and stay with the unfamiliar long enough to get over the "disgust" response, the reward is magical.
 A more familiar example to many on HN of the same regrettable response would be the way React was first received. Despite having an interesting approach to solving some common practical problems in web development, and despite the underlying theory being tried and tested in other contexts, some people seemed to reject it just because it didn't maintain a clear separation of HTML, CSS and JS code, which in their minds was a deal-breaker even though there's no objective reason that such a separation is necessary.
 They have exceptionally meaningful names, they're just not English ones.
 Sorry, could you provide a quick/rough English translation of one of those lines? I'm unfamiliar with APL or how Chinese characters are written with Latin symbols, so it would be useful for 'seeing' the syntax.
 I can provide even better. If you look at the Publications: https://github.com/arcfide/Co-dfns#publications you can read the "Key" paper that walks you through one of the core data structures in the compiler and how it works. If you see the "rn" binding in the compiler code in "e.cd" you'll see that code put into the first compiler pass of the project. If you want to see more than that, I can email you a copy of an updated version of that paper which is currently under review (not likely to see publishing this round). It includes descriptions of the implementations of the "lf" and "fe" compiler passes.
 Not disagreeing, but for the sake of discussion: Do you care to elaborate?
 In defense of the code: It disgusts you. I find it very pleasant to read. That means it is not disgusting. I wouldn't ordinarily want to make this personal. But you, who cannot be bothered with grammar, punctuation, or the shift key, want to make a statement about this code. You must understand it is a personal statement more about yourself than the code you are looking at. Here is someone who does something you admit you cannot do, and you say don't do that! Don't do the things I cannot do! What a missed opportunity! If I see someone do something I cannot do I say how do you do that? Why do you do that?
 I was tempted to agree after a glance at this thread but thought this looked like APL. Checking on the project confirmed it's an APL variant. APL is some weird-looking stuff I've never even tried to learn as I believed I didn't need it in place of languages + libraries that do similar jobs well with familiarity. The previous threads on HN about APL had similarly-weird code that the APL vets showing up thought was anywhere from fine to beautiful. This tells me we can't judge the quality of APL-like code unless we've dug into that paradigm and know what good, APL-like code looks like. Like other paradigms that are really different. Are you an experienced user of array-oriented programming languages? If so, what specific things about the code were bad other than the shortened names someone else mentioned?
 Did the above code snippet remind you of APL or was it something else? This is actually a bit of an important "research" oriented question to me and is actually relevant to the design of the Co-dfns compiler.
 Remember I said I don't do APL. It reminded me of things I've seen in APL-oriented submissions or comments. All of it looked equally ugly to the uninitiated. The main point is that I couldn't tell the difference between a bad or good example since I don't do APL. So, dang's point stands that we shouldn't be quick to dismiss it without evidence it's bad. That's a decent point in general but especially for stuff really outside the norm such as APL-derived languages.
 So, as I mention above, part of the intentional design of that code snippet is to "feel" like APL. The fact that you went to that somehow is a good thing, actually. While it might have seemed equally ugly, if it felt in the same "class" as APL, that's at least in the right direction, because part of the design of the style of this code is to mirror the semantic and stylistic densities of APL to improve the reading transition between APL code and C-style code.
 That's what I was saying. That it was so similar was clear. Past that, cant say cuz no experience in it.
 Are you going to articulate your objection to that code or just sneer at it unconstructively?
 There's a lot of people jumping on the bandwagon of down-voting anyone who dares to criticise the code, but I'm going to give it a go anyway. There's a reason why readable and beautiful code is favoured: it's so that anyone else that opens the source and tries to understand it doesn't have a difficult time, and therefore, anyone that tries to contribute doesn't have a difficult time either. Looking at the project's Github page, I can see that there's no contributor even coming anywhere close to the project owner. Whether that's because of the obscurity of the codebase or for another reason, I can't comment. However, it does stand that the project owner is the only real contributor, and so the minimum that he himself has to consider is if he can understand the code. Having said that, looking at the code does make me cringe. I'm sorry if that offends anyone but it is what it is: the code is not very nice to look at. It seems as though it has been engineered to be as obfuscated and shrunken as possible, without any regard for readability. I mean just the file names themselves: was there really any need for single-letters? Now the author claims that (and a lot of other people agree with him on this) it is not for the purpose of what I outlined above, but rather, as mentioned before, so that he can understand it all easily and rapidly modify it. Whether or not that's the case I do invite you to consider the fact that the post that we are all replying to is somewhat bragging about the extremely small size of the codebase. Personally, I think this sort of code would fit in rather well on a code-golfing forum or something similar, not on a production system. Then again, it is a personal project so ¯\_(ツ)_/¯
 Thanks for the reply, and let me just say that you take criticism really well. I still disagree with the premise that the code is clean, beautiful or readable, but I could concede that this may be due to me not having an in-depth understanding of it. You speak about a small code base being a metric for a simple code base, and while this is true sometimes, it starts to fail as a metric when the code is more obfuscated than concise. This is the kind of code I'd see in a webshell, not a compiler. I'll try and make it to your live session if I can; I look forward to it. P.S. I don't know if this was intended or not, but you really came across as trying to make everything seem more complicated than it actually is in the "sentence" where you went over the architecture. Dumping a load of complex-sounding words in a giant sentence just sounds as if you're trying to justify your design decisions by further obscuring understanding of the project. I'm fairly sure this isn't your intention since you are hosting a live session, but it's just how it looks. :P
 I look forward to convincing you of the simplicity of the code base. :-) The sentence was a bit of a tongue-in-cheek sort of rhetoric. In particular, if you look up most of those words in the relevant domains, they're all standard practice ideas that are well understood around their various parts. Importantly, none of the words or anything said in the sentence is really complicated, and if you were familiar with all of those ideas, then it would be an easy sentence. However, I wrote it in a way so that it appears to be obtuse and a bit ridiculous on the face of it. Basically, attempting a bit to mirror the code itself. The sentence very concisely and neatly describes the architecture of the compiler, but only if you know what you're reading. One of the biggest issues with reading the compiler is that most people will have a "part" of the picture based on their backgrounds. If you have written compilers before, the overall compiler design and the strategies at a macro-level used for it will make perfect sense, but you'll balk at the data-parallel programming style. If you are an APL programmer you'll be very familiar with the basic tricks being used in the compiler and you'll easily be able to see at a micro level what's happening with the code, but not having the background in Programming Language Design (such as you might receive at Indiana University's PL course path), the overall design and the intent of the whole system won't be intuitive to you. This means that likely for anyone new coming into the code, there will be parts of the code base which feel very foreign to them. Of course, I don't know of a way of doing something new without making people learn a thing or two to understand it. Fortunately I'm not the only one in the world that has experience in all of these areas. And even better, the code is straightforward enough that once you do learn the basic skills, it's easy to work with.
But there are precious few APL/Array oriented implementers out there, and probably less than a handful of compiler writers out there that specialize in this area of languages. I hope that will change and that I can convince people that this approach is actually a very neat and easy one to work with. But that's not easy to do.
 Sorry! I just realized that I forgot to answer the question about file names. The filenames themselves are a bit of a cultural homage to historical APL development. They are a little bit of a part of my push to stay small, because if I go beyond 26 or so files, I'm in trouble. But it's also a little bit of a "self documenting" element. There's a famous example of the style of C coding that I'm doing here from the author Arthur Whitney, the K developer. He famously whipped up a little J interpreter prototype that was about a page of code and Kenneth Iverson spent some time studying that code to understand its structure and layout and found it interesting. Whitney famously tended to write software in a very ascetical style and just used single letter names for his files. The use of single letter names in the files here is a bit of an inside joke, referencing back the style of programming of Arthur Whitney, signaling a bit of a historical "stylistic" or artistic connection, while at the same time being the first "alert" to the programmer that they are likely to see something along the lines of Whitney style C code inside of the files. It serves both as a chuckle to the APL community as well as a documentation of how you might want to prepare your mind before reading the code.
 For those who haven't seen it before, the J Incunabulum:
 Oh, and on another note, I've found that it's mostly programmers and computer scientists who struggle the most with the code. I've tried this style of programming out with high school students with little to no programming background, and they were able to pick it up and use it to do more in 12 hours than most students in an entry-level undergraduate course did in the first half of their semester.
 > There's a lot of people jumping on the bandwagon of down-voting anyone who dares to criticise the code
Please omit such offensive/defensive rhetoric from your posts to HN. It adds no information and is bad for conversation. The problem here isn't "daring" to criticize, it's rejecting the unfamiliar. This is like traveling to a new country and complaining because they cook everything wrong and say everything wrong. Unfamiliarity is relative; it's not a property of the thing you're reacting to. Same with readability: it's relative to the reader. In some contexts this is obvious. If you don't know German, you wouldn't reject a German text as unreadable or poorly written. But in other contexts, when we unconsciously assume or were taught that there's only one valid way to do something, we react with shock and distaste at work that violates known conventions. Such work may in fact be organized around different conventions for reasons we don't yet see. Good conversation across such boundaries requires a bit of distance from our own assumptions. Programming is like the world of art this way. There are countless examples in art history of sharp departures from convention provoking shock and distaste, and people saying things like "There's a reason why readable and beautiful [art] is favoured". Riot police famously had to be called to the early shows of the Impressionists, yet the beauty of their paintings is obvious to us now.
 > This is like traveling to a new country and complaining because they cook everything wrong and say everything wrong.
> If you don't know German
> Programming is like the world of art
Argument from analogy? Please omit such rhetoric from HN posts
 Sorry if that offended you, it wasn't my intention; I tried to keep my comment fair to both sides. Let me however just say that what you're doing in your reply is attacking the straw man. You're setting up a version of my argument and then attacking that, instead of responding to my argument directly. I'll respond to your comment anyways. Code is art, just as you alluded to in your response. And from a solely artistic point of view, there's a certain glee and wonder at seeing short and smart code. When I browse codegolf on SE, I never fail to be amazed at the frankly fucking brilliant solutions some people come up with. Having said that, my main point was that code like that, in my opinion, does not belong in a proper project. I get the point of it being practical for a single person, but that codebase is above and beyond what is reasonable. It's simply not nice code. You're saying that I should not think that the code is not nice because I don't understand it, but you seem to be missing the main point in your flowery metaphors and analogies of art: art is subjective. You might find that code to be beautiful in its own way, and I'm sure that's justified to you, but I do not.
