I think a lot of the reason people have difficulty is that you need to read the macros first; jumping into the middle won't make any sense. The macros are very mnemonic and there are only 5 of them. For example, I found the DO macro to be particularly elegant because it expresses "repeat something n times", and one of the features of array languages which makes them terse and powerful is this implicit repetition.
This isn't even unusual "I want to write K in C" style, it's specifically being annoying: https://bitbucket.org/ngn/k/src/master/m.c. I mean, for goodness' sake, it's pivoting the stack in x86_64 assembly to implement _start!
does suggest zeroing the top-most frame (%ebp) and making sure the stack is 16-byte aligned might also be worth doing. (Possibly it’s already 16-byte aligned because you jmp to main instead of calling it, though?)
that's a good read, thanks.
afaik zeroing %rbp here plays no role other than to assist debuggers.
i'm not sure if alignment makes any measurable difference. if it does, it'll be worth doing.
i jmp to main() because i wrote it and i know it never returns. it exit()-s.
> afaik zeroing %rbp here plays no role other than to assist debuggers.
Yep, it marks function frame boundaries. If you don't particularly care, you may want to consider compiling with -fomit-frame-pointer, and the compiler will return this register to the pool of general-purpose ones it can use.
> i'm not sure if alignment makes any measurable difference. if it does, it'll be worth doing.
It will if you end up emitting any SSE instructions, because they will cause loads and stores that need to be aligned.
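Putting the pieces together, here is a minimal sketch of a libc-free x86_64 Linux entry point (illustrative only, not ngn/k's actual code; kmain is a made-up name whose body just exits):

    /* build with: gcc -nostdlib -static start.c */
    __asm__(
        ".global _start\n"
        "_start:\n"
        "  xor %rbp, %rbp\n"   /* zero the frame pointer: marks the outermost frame for debuggers */
        "  mov %rsp, %rdi\n"   /* pass the initial stack pointer; argc/argv/envp live there */
        "  and $-16, %rsp\n"   /* defensively re-align the stack to 16 bytes */
        "  jmp kmain\n"        /* jmp, not call: kmain never returns */
    );

    void kmain(long *sp){
        long argc = sp[0];     /* argc sits at the initial %rsp, argv just above it */
        (void)argc;
        __asm__ volatile("mov $60, %eax\n xor %edi, %edi\n syscall");  /* exit(0) */
    }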
across architectures: you're right, as of now it's x86_64 only
across operating systems: it works on linux and freebsd as-is (thanks bakul), and with a few changes on windows+libc (thanks ktye) - https://github.com/ktye/i/tree/master/_/ngn though i personally never use windows
Thanks, especially for the fastai style link. I am currently finishing a book [1] that uses the Hy language (hylang) which is a Lisp wrapper on Python. I am going to revisit my code examples, informed by Jeremy’s ideas. Hy is really just a simple mapping of Python to Lisp (syntax very much like Clojure) so most of his style ideas carry over.
Yes, you need to understand a bit of APL to grok the style, but I don't know why it bothers people so much. I'm guessing because it involves people actually using the macro system for something interesting. Somehow lisp gets a pass.
Personally I think stuff like the DO macro (F/Fj in this code base https://bitbucket.org/ngn/k/src/master/k.h) should be used everywhere, to the point of being part of the C standard. If compilers were aware of it, there are significant optimizations they could make. It's dirt simple, saves considerable typing, and eliminates lots of potential for error.
Where i,j,k are the first, second and third indices. So in a fully nested example (see the sketch below), i takes values 0..2, for each i value j takes values 0..3, and for each j value k takes values 0..4.
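For concreteness, a sketch of how such nested DO macros can be defined and used; this assumes the conventional Whitney-style definition and is my illustration, not the exact defines from the thread:

    /* nested Whitney-style DO macros; I is the usual integer typedef */
    #include <stdio.h>
    typedef long long I;
    #define DO(n,x)  {I i=0,_i=(n);for(;i<_i;++i){x;}}
    #define DOj(n,x) {I j=0,_j=(n);for(;j<_j;++j){x;}}
    #define DOk(n,x) {I k=0,_k=(n);for(;k<_k;++k){x;}}

    int main(void){
        /* i: 0..2, j: 0..3, k: 0..4, as described above */
        DO(3, DOj(4, DOk(5, printf("%lld %lld %lld\n",i,j,k))))
        return 0;
    }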
I just noticed two missing defines. Sorry about that. Fixed below.
I did some testing and it seems like this generates much less code, at least with clang (with -O3 -fomit-frame-pointer), than geocar's much briefer solution!
I think the one which allows me to learn "setf's destination is the first parameter" and forget the rest is easier by far. setf is glorious macrology: It allows people to write arbitrarily complicated datatypes and have assignment just work, reliably.
But the problem with this code isn't the macros. It's the fact the author types like they're being charged by the character and taxed for whitespace. The macros are amazing to the extent they allow this sort of thing. Blaming them for it is wrong, but judging by the only standard that matters, readability, this is not a good coding style.
Your example neatly demonstrates that your definition of readability is based on familiarity. What does variable assignment have to do with the Contents of Address Register, anyway?
In the absence of familiarity, minimizing the number of moving parts and factoring out repeated patterns still has intrinsic value for aiding understanding.
> Your example neatly demonstrates that your definition of readability is based on familiarity. What does variable assignment have to do with the Contents of Address Register, anyway?
I explained myself right below my examples.
> In the absence of familiarity, minimizing the number of moving parts and factoring out repeated patterns still has intrinsic value for aiding understanding.
It's the fact the author types like they're being charged by the character and taxed for whitespace.
I dunno about you, but I have very little working memory in my head, just a few items, and only one focus of attention with a narrow field of view.
I can’t read that style of code, but open a traditional codebase and part of the work is here, using lots of symbols to achieve a tiny amount of work, while the next piece of work is four screens away, again using a lot of symbols to achieve a small amount of work. Calling that readable is like declaring legalese the most readable kind of English because it bloats so big it must be readable.
> judging by the only standard that matters, readability, this is not a good coding style
What about composable idioms and conventions which let you do a lot of data transforms with a little attention and time?
What about a small amount of code to maintain and refactor? If this style saved you ten lines it might not help but where’s the cutoff that you’d start to be interested - if it saved you 50% of the work? 75%? 99.999999%?
If “bug count per line of code is constant” this ought to be an exceptionally low bug style, right?
What about having less code to demonstrate correctness of?
What about having little code to throw away and rewrite if you don’t like it, leading to a faster refactoring cycle time and a better expression of your intent after a set time? Or leading to readability simply being less important because the cost of rewriting 50 lines from scratch is so much lower than rewriting 10,000 lines from scratch?
Or if you’re writing documentation and there’s an order of magnitude less code to write about?
Or if you should write as many lines of test code as real code, and there’s an order of magnitude less code to test?
The whole style isn’t just “golf the character count”, but “collapse down the size of everything so one human attention covers a lot more of it, like zooming out to get a higher overview”.
Crossing down from several files to one file is a phase transition; so is crossing from one long file to one screen.
> I dunno about you, but I have very little working memory in my head, just a few items, and only one focus of attention with a narrow field of view.
Right. Which is why readable, mnemonic names are so important. Single-character names are good for array indexing, because their scope is so small you can see the whole region they're used in a fraction of a screen (usually) but variables which get used over a larger fraction of the codebase need names which explain what and why they are.
> What about composable idioms and conventions which let you do a lot of data transforms with a little attention and time?
Great. That doesn't preclude meaningful variable names.
> What about a small amount of code to maintain and refactor? If this style saved you ten lines it might not help but where’s the cutoff that you’d start to be interested - if it saved you 50% of the work? 75%? 99.999999%?
Great. That doesn't preclude meaningful variable names.
> If “bug count per line of code is constant” this ought to be an exceptionally low bug style, right?
If that's true to an unlimited extent, IOCCC programs should be nearly self-debugging, with how short they are.
> What about having less code to demonstrate correctness of?
Testability is about each bit of code doing one thing, not doing ten things in one line. You can test one thing. If you try to test ten things at once, you have 1024 possible ways for that test to come out, assuming Pass/Fail is completely binary, as opposed to real testing, which is more like Pass/Fail/WTF.
> What about having little code to throw away and rewrite if you don’t like it, leading to a faster refactoring cycle time and a better expression of your intent after a set time?
Throwing code away doesn't hurt my feelings.
> Or leading to readability simply being less important because the cost of rewriting 50 lines from scratch is so much lower than rewriting 10,000 lines from scratch?
I can't rewrite what I can't read. I would struggle to rewrite code I struggle to read.
> Or if you’re writing documentation and there’s an order of magnitude less code to write about?
Writing documentation based on LoC count is deranged.
> The whole style isn’t just “golf the character count”, but “collapse down the size of everything so one human attention covers a lot more of it, like zooming out to get a higher overview”.
And we circle back around: Lisp macros allow me to zoom out like that by letting me factor out common patterns more easily, like collapsing all possible "assignment pattern" instances to a setf invocation. I don't see the guts of how assignment is done for that particular kind of data when I'm focused on the other bits of the codebase. It's out of focus for me then, and it collapses neatly into one little form.
But, and here's the big sticking point for me, I can do all that without variable names which look like line noise.
> If that's true to an unlimited extent, IOCCC programs should be nearly self-debugging, with how short they are.
This is "they all look the same to me" prejudice. IOCCC code relies on abusing undefined behaviour and clever tricks and edge cases, terse code like the OP link and the array languages relies on composing high level abstractions and using them over and over in learnable patterns. That aside, IOCCC programs wouldn't be "self-debugging" with this argument, they would "have fewer bugs". It shouldn't be too hard to accept that a given program written in 5k lines, 50k lines, and 500k lines would take different amounts of debugging effort. Whether it's a significant enough trade off of harder-to-code vs saves-much-debugging is a better question.
> Throwing code away doesn't hurt my feelings.
And your feelings are the only potential cost involved in rewriting code, there's no time involved?
> I can't rewrite what I can't read.
Of course you can? This is so common it's a programming trope: "this is garbage code, nobody can tell what it's doing, let's rewrite it". Although I'm not sure that argument works in my favour - Netscape's rewrite of Navigator which dragged on for years; was that because they /couldn't/ read the code to rewrite it, or because they couldn't rewrite it better while being bug-for-bug compatible?[1] Even then, writing one function and thinking "I could do this better" and rewriting it, then thinking "I could do this better" and rewriting it again, is a cycle which gets much worse the longer the code to rewrite is.
> Writing documentation based on LoC count is deranged.
But also necessary, unless you either a) admit that the extra LoC isn't saying anything important (so you have a poorly expressive language which forces you to write lots of boilerplate that isn't worth talking about) or b) you want to leave more of the code undocumented.
> Lisp macros allow me to zoom out like that by letting me factor out common patterns more easily
"My way of writing dense composable code which outsiders hate and consider unreadable is great tho!", lol. LISP goes towards climbing an abstraction tower of code-which-writes-code, to get more done with less code, rather than simply /writing less code/. But this way, at least you agree that "less code is better" by factoring out common patterns? So why don't we push for common patterns which are shorter and achieve more, in normal languages (like we changed for loops to map in popular languages, eventually).
> But, and here's the big sticking point for me, I can do all that without variable names which look like line noise.
Exactly what variable names do you think would make the code in the OP link "readable"?
> Great. That doesn't preclude meaningful variable names.
It rather does; it's possible to give functions names, but when you have small functions and compose them together, it quickly gets to the stage where there's no meaningful English name to give them which helps clarify what's happening. In an APL example:
    ⊂ ⍝ left shoe encloses (nests) an array into a scalar
    , ⍝ ravel unravels an array into a vector
    ⊃ ⍝ right shoe (NARS2000 APL) discloses a nested vector into a matrix
The pattern
⊃,⊂ 1 2 3
is common, it turns a 1D vector of 1 2 3 into a 2D matrix where the first row is 1 2 3. What would you name the pattern ⊃,⊂ that clarifies what it does and doesn't get in the way?
Once you've decided on that name, what do you name the data which goes into it and comes out of it, in a way that usefully explains what state it's in, without getting in the way? "matrix_of_grade_indices_into_data_array"? But you can see that from the way it's ⊃,⊂⍋data, those words are what that combination of symbols says, only more precisely.
By the time you've written and read all those English words, you've hidden what the code does and covered it over with much less precise verbiage. The pattern is common because you can do different things to a 2D matrix than to a 1D vector, and the pattern is so short that you can inline it when you need it.
A version with the pattern and its inputs spelled out in long English names isn't clearer either. And you can say I'm making up stupid variable names to obscure it, but what variable names wouldn't obscure it? How doesn't this terse style lead away from "English word variable names"?
The C style code which I can't read: I strongly suspect that what it does is this kind of code-block combining, not IOCCC style compiler abuse. In that case it won't help anyone with the skill to work on it to rename "x" to "pointer_to_first_function_argument_in_backwards_compatible_adjusted_scope", because by the point where such a person understands the code, they can see that's what "x" is, because that's what "x" always is, and besides there's only a page of code to see it in, in the same way that a loop index variable is easily read without needing a long name in "for (int i = 0" code.
One subtle feature of oK is that it supports the creation of mutable closures. "f.d", where "f" is a function and "d" is some dictionary, will attach the dictionary to the function as if it were the global scope. You can then later use ".f" to extract that dictionary from a function to inspect it. This mechanism allows you to violate referential transparency (which is why Arthur doesn't like it), but you get a powerful mechanism for sandboxing code which can also permit writing code in an "OOP" style, if desired. Here's the feature in action:
    f:{a+::x}.[a:10]
    {[x]a::.`a+x}
    f 30
    40
    f 30
    70
    f 30
    100
    .f
    [a:100]
Naturally, if you have not explicitly re-bound the closure of a function, .f gives you access to that function's natural scope:
oK also supports unlimited lexical closure, which is a bit unusual among K interpreters; usually they either eschew closure entirely for simplicity or (as in k3) allow functions to close over a single scope.
i changed it to "telemetry". i'm not really sure what to call it. they are upfront about it in the clickwrap eula but the data is not anonymized (see section 1.5a - in k ".z.u" means username) and you are not allowed to obstruct connectivity. it's not quite spyware either, as it happens with your consent.
> it's not quite spyware either, as it happens with your consent.
All the famous spyware I'm aware of which you had to have initiated the action to install includes an EULA which does mention the data collection. That doesn't make it any less spyware.
i'm not trying to purposefully copy [Arthur's] style. i've always been trying to write shorter code - there are public traces of how that evolved in ngn/apl and in dyalog's ride. i acknowledge that after seeing some of arthur's code, including b, my mental threshold of what is considered acceptable dropped significantly.
once you accept certain principles about the code, for instance that it's more important to be able to hold it at once in your head than to be able to explain it to the uninitiated, this style becomes more efficient and more pleasant
...
if you seek simplicity, this style is to a large extent discovered (as opposed to invented)
complicated things can be arranged in many ways. simple in few. that's why it always looks like entropy is increasing in the physical world :)
A fun and somewhat illuminating exercise in C is to simply try minimizing the number of semicolons in your programs. Pick any comfortable, well-defined task and spend a few hours refining an implementation.
Fewer semicolons means fewer statements. Apart from simplifying the structure of ones code overall, this also drives writing in a somewhat more functional style. You will find yourself taking advantage of the comma and ternary operators more, if you don't already. Macros are another tool for decreasing repetition. So is expressing programs in a "data-oriented" style, using lookup tables instead of explicit conditionals. If something is used only once, inlining it can reduce the number of distinct statements.
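A toy instance of the exercise (my example, with made-up names): counting vowels. The statement version spends four semicolons; the expression-oriented version spends one, using a ternary instead of an if; the last replaces the conditional with a lookup table.

    #include <string.h>

    /* statement-heavy: four semicolons */
    int vowels_loop(const char *s){
        int n = 0;
        while (*s) {
            if (strchr("aeiou", *s)) n++;
            s++;
        }
        return n;
    }

    /* expression-oriented: one semicolon, ternary instead of if
       (toy: recursion depth grows with the string length) */
    int vowels_expr(const char *s){
        return *s ? !!strchr("aeiou", *s) + vowels_expr(s + 1) : 0;
    }

    /* data-oriented: a table instead of the conditional */
    static const unsigned char V[256] = {['a']=1,['e']=1,['i']=1,['o']=1,['u']=1};
    int vowels_tbl(const char *s){
        return *s ? V[(unsigned char)*s] + vowels_tbl(s + 1) : 0;
    }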
You could argue that the license of the code, in this case AGPL, is irrelevant. I don’t think the source code is any easier to read than disassembled output.
You could say a similar thing about Japanese text if you don’t read Japanese - “copyright isn’t relevant, who would ever want to copy this gibberish?”.
As people in this and similar threads mention - it is easier to read; in fact, it is easier to read than even the preprocessed output, if you are familiar with the style and concepts.
What a disappointing comment. This is like saying MIT/GPL/AGPL in programming languages most people don't understand is irrelevant, since most people wouldn't wanna read/copy/edit code anyway. By that logic, there is no value in using an open/free license for niche languages.
Even if you don't understand this language/style, that doesn't mean a user won't want to refer to the source code, or that someone won't want to contribute to the project.
For the record, I'm more open to this style now so I'd like permission to take a few steps back from that statement in 2009 (and probably most of my others that year!).
Coding in a terse style really does have some benefits. You can understand more program flow at one time and you can easily see larger patterns.
i'd encourage people to compile from source. the binary in "downloads" was compiled without the "-march=native" flag, in order to support older cpus, so it's a bit slower. also, i don't intend to update it regularly - i'll probably remove it in a couple of days.
So it uses memory mapped files for all operations? So whether the file is 1 MB or 1 TB, it'll read what it can via streaming? Sorry, not knowledgeable here. Sounds neat though.
1:x uses mmap, so it would return instantly no matter how large a file you give it
0:x uses 1:x and then it splits the content into lines. unfortunately splitting requires copying, so you'd be limited by the amount of ram (let's ignore "swap").
i don't use mmap for writing/amendment yet. i'll be working on it.
the way modern hardware works is like this: every process run by the os has its own view of the (typically 48-bit) address space. a process can request from the os that a part of that address space be "backed" by a certain file. this means that every time the process touches (i.e. tries to read or write) a virgin memory page there (usually page=4k, always aligned), the os will be automatically notified and will make sure to fill it in with actual content from the file, before the process even knows. from that point on, the page will occupy physical ram. if the os is low on memory later, it may decide to free up the page and return it to its previous state.
in effect, data from disk (or any disk-like storage) can be "streamed" while your program uses ordinary array indexing. the word "streaming" though implies reading from start to end in order (which is additionally sped up by prefetch, but that's a different story..); memory mapping is more general - it allows random access.
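A minimal sketch of that mechanism from C, assuming a Linux/BSD-style environment (error handling omitted): the mmap call returns at once, and touching one byte in the middle faults in only that page.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv){
        int fd = open(argv[1], O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        char *a = mmap(0, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);  /* instant, any size */
        long mid = st.st_size / 2;
        printf("byte %ld: %d\n", mid, a[mid]);  /* random access: one page faulted in */
        munmap(a, st.st_size);
        close(fd);
        return 0;
    }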
Assuming you are terminating strings with a NUL, for 0: it may be advantageous to R/W mmap with copy-on-write (i.e. MAP_PRIVATE). Then you can overwrite LF or CRLF with a NUL.
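A sketch of that suggestion (not ngn/k's actual code): MAP_PRIVATE with PROT_WRITE works even on a read-only fd, and the dirtied pages are copy-on-write, so the NULs never reach the file on disk.

    #include <sys/mman.h>

    /* map a file privately and split it into NUL-terminated lines in place */
    char *map_and_split(int fd, long n){
        char *a = mmap(0, n, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (a == MAP_FAILED) return 0;
        for (long i = 0; i < n; i++)
            if (a[i] == '\n') a[i] = 0;  /* only pages actually written get copied */
        return a;
    }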
ianal. to the best of my knowledge the project is clean from a legal perspective. ofc, you don't have to be doing something wrong to become a target of copyright trolling.
Yeah, the reason oK was a toy was because John Earnest works for a company that has their own fork of the K3 source, if I remember correctly, which puts him in an area that won't end with one of Kx's harsh lawsuits getting dismissed immediately.
(Not that it'd be a bad thing legally if he made a high-performance interpreter: the reason GNU's programming guidelines are so archaic is partially because everyone implementing GNU early on had seen the UNIX source code, and they could get around getting sued by writing esoterically.)
This must be the famous write-once-read-never coding style I've heard so much about, a bit like my FINO list of bookmarks, where projects like this one are always welcome.
This is precisely explained by the quote elsewhere in the thread:
> once you accept certain principles about the code, for instance that it's more important to be able to hold it at once in your head than to be able to explain it to the uninitiated, this style becomes more efficient and more pleasant
The Whitney style relies mostly on C macros, while IOCCC (and especially golfed) entries tend to exploit very forgiving operators. It is likely that you can actually understand the thing with only proper renaming. I can confidently say this because my IOCCC entry had won in the Best Short Program category ;-)
I'm still not sure that this terse style in C improves reading. I agree that lots of concise idioms present in the APL family of languages do improve reading (once you've picked them up), and I do like lots of short functions more than a number of long functions (another contrast to IOCCC, where you don't see many functions). But those languages are tailor-made to maximize this aspect. You for example don't think much about loop invariants in APL and friends because they are mostly implicit, and in turn you get very good refactorability. Would it improve other non-array languages? I doubt it.
It’s just a different style. After using kdb+/q for aoc2019, I quite like the no space one-line style. I find the code easier to read and remember. I think most people who have tried an APL derived language also agree.
It’s important to remember that many/most cultural norms are not optimal; they are arbitrary. So don’t immediately discount something for looking weird.
> I think most people who have tried an APL derived language also agree.
That part I have some experience with, and I think that's not the case. I was taught APL (in a CS course) along with a hundred or so other students, and it was nearly universally disliked; IIRC, it only resonated with some of the EE people. The department's administration was even written in APL, and nobody wanted to touch it. After finishing my assignments, I've never wanted to look at it again. So I, for one, doubt the style is universally likable.
> So I, for one, doubt the style is universally likable.
I agree; it's definitely a niche thing. It takes quite a while to get into it. Even if you understand the basics, if you don't like 'puzzling with code', you won't be idiomatic. And most people don't like puzzling with code; they want to get things done. You need to do quite a lot with it to start recognising the idioms, but once you do it's a blast. And it makes both reading and writing quite easy. That's why I think doing something like the Advent of Code in one of these languages will help a lot; I had to quit this year because a project ate up the time I had reserved for aoc19. The exercises I did, though, were cool to solve in k.
Hah, this is ironically bittersweet for me. Sadly I disliked J and K because they didn't scan like APL to me. Programming after APL in so-called modern languages seems like wordy drudgery in comparison.
While I wouldn’t write C like that, that’s how Arthur Whitney and others write code. The entire kdb+/q codebase is structured like that. And considering the success and insane performance of the system, I believe there is something to it.
The point of this style isn’t obfuscation, it’s clarity and compactness. The compactness boosts the signal-to-noise ratio and allows the programmer to retain more information in working memory. I haven’t programmed C like that and wouldn’t, but I definitely understand and respect the theory behind it.
I have on the other hand written q like that, at first skeptical of the style. But after a couple days, I wouldn’t have it any other way. Being able to fit your entire program on a single screen has pretty profound productivity benefits. It also makes IDEs unnecessary.
For other code structured like this, check out Kona, J or kdb+/q.
In general, I don’t think refusing to believe is a very useful attitude. See: Chesterton’s Fence.
I didn’t tolerate it before I started using APL and K. And now my C, while not entirely ArthurWhitnious, is closer to this kind of code.
When I first saw examples of how terse K and APL code gets, my thought was “obviously, the language was designed to solve this specific example or it wouldn’t be this terse”. But after the 100th or so significantly different example, one can’t NOT admit that the problems are simple, and common practices are just incredibly bloated... which leads some (me for example) to adopt a much terser style.
Indeed. It is rather common to see K criticized as a "domain-specific language", but empirically it seems that K's domain is rather large.
One of the reasons I wrote iKe was to demonstrate that the same primitives which are well-suited to constructing databases and financial models can be applied to computer graphics. With little more than bindings that supply appropriate IO capabilities, you can even write games in K:
Learning q has changed almost everything I believe about what good programming practice should be. Functional Haskell or Scala code that I used to consider to be highly elegant, I now think unnecessarily verbose. And don’t even get me started on Java or Python...
There are plenty of other programmers just like you who can't read this. Most of them spend their time "programming" with stackoverflow on one screen and sublimetext (or whatever else is cool these days) on another and copy and paste between them. They write things like left-pad and are celebrated in stars and followers. You and you alone can decide whether you want to be more like them, or more like something else.
However, when I see someone doing something I cannot do, I want to learn more. That's how I get better. I think there are other people out there like that who might just need a little bit of encouragement, and that's why I reply: Not to convince you, but to convince them it's okay to learn something that few people understand (but those that do think is really cool!)
> There are plenty of other programmers just like you who can't read this. Most of them spend their time "programming" with stackoverflow on one screen and sublimetext (or whatever else is cool these days) on another and copy and paste between them.
I think that your second paragraph is great, but this first paragraph, which comes across as a snide judgement of your parent as an SO-reliant trend-follower, is unwarranted. Your parent may have made too hasty a declaration of unreadability, but that tells you only what kind of code saagarjha wants to read, not, except indirectly at best, what kind they write.
People who follow trends are like other people who follow trends, at least in that way. They will never see something new so long as they keep doing what everyone else does.
I think this is what you really want to ask: “This style of programming is totally foreign to me. Clearly this is intentional — I am curious as to why you chose it. Can you explain?” :-)
Understanding why someone else makes a different choice can be illuminating. Given that the author of the original k has written at least 5 or 6 versions of the language interpreter from scratch every time in this style, he must be doing something right. Right?
> Given that the author of the original k has written at least 5 or 6 versions of the language interpreter from scratch every time in this style, he must be doing something right. Right?
Not necessarily, but others have pointed out reasons why one might write code this way.
> I've spent a fair amount of time digging my teeth into it
I tend to label code that makes me do that "unreadable"; it's not that I literally can't read it (I routinely deal with worse) but it requires more effort than I consider reasonable for the medium it is presented in.
it's totally full of unwritten assumptions and the same header is usually parsed in several places slightly differently. it's a spooky action at a distance playground.
But it conveys far more information per page than you get in other styles, so obviously a page will take you more time to read + understand. It's far more content-dense. If you change the style, you spread the same content over more pages & files, and personally that takes me more time to read + understand (depending on the functionality, much more).
It is plain to any developer that this fits all definitions of "unreadable" and honestly I think in this case it falls on you to explain why you think it isn't.
I was genuinely curious how this can be considered clear, readable code that someone would like to write, maintain and support since it flies in the face of all the usual conventions. I will not enumerate them here, you know them.
Honestly I think that one of three things is going on here:
1. you can look at m.c, identify a line like "A1(a1,atnv(tX,1,A_(x)))A2(a2,atnv(tX,2,A_(x,y)))A3(a3,atnv(tX,3,A_(x,y,z)))A2(aA,atnv(tA,2,A_(x,y)))A2(aa,atnv(ta,2,A_(x,y)))" and easily understand what it means in the context of the ngn/k interpreter
or
2. you understand the code on the same level the rest of us do (can parse it but need to manually take notes on what everything means) ... but you want people to think you belong to group #1
or
3. you are trolling, in which case good job I guess since I spent like 5 minutes typing this comment out
The variables x,y,z are used to describe the first, second and third arguments. This is a common convention, so you get used to it.
tX, tA and ta are type enums.
A1(f,b) is just a 1-ary function f with the body b; A2 2-ary, and A3 3-ary. A_(f...) is a list of A (box) objects; literally (A[]){f1, f2, ... fn} to avoid an extra definition.
atnv is a mnemonic for allocate of (t)ype, le(n)gth, (v)alues. This is not dissimilar from Arthur's ktn(t,n) function, except it also fills the buffer with the values (memcpy).
These are allocation wrappers for the rest of the interpreter.
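To make those conventions concrete, here is a rough sketch of the shape of such definitions, with stand-in types and a toy atnv body; the real ngn/k definitions differ.

    #include <stdlib.h>
    #include <string.h>

    typedef void *A;                               /* stand-in for the boxed-value type */
    enum { tX, tA, ta };                           /* stand-ins for the type enums */

    A atnv(int t, long n, A *v){                   /* allocate of (t)ype, le(n)gth, (v)alues */
        A *p = malloc(n * sizeof(A));              /* toy body: ignores t, copies the values */
        memcpy(p, v, n * sizeof(A));
        return p;
    }

    #define A_(...)  ((A[]){__VA_ARGS__})          /* inline array of boxes, no extra definition */
    #define A1(f,b)  A f(A x){return b;}           /* 1-ary function f with body b */
    #define A2(f,b)  A f(A x,A y){return b;}       /* 2-ary */
    #define A3(f,b)  A f(A x,A y,A z){return b;}   /* 3-ary */

    /* so a line like A2(aa,atnv(ta,2,A_(x,y))) declares a dyadic aa
       that allocates a 2-element list of type ta holding its arguments */
    A2(aa, atnv(ta, 2, A_(x, y)))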
I've tried to write some about why this style is better on HN previously:
I don't let it bother me though. Occasionally you meet people who can be convinced to take a certain leap of faith, and then you get to have very interesting conversations about making software that you can't have otherwise.
Perhaps you've heard this said of other things as well: That learning lisp makes you a better programmer even if you do not use lisp. This thing here, is kind of like that.
I've read and appreciate your previous posts. They did get me thinking why programming is the one technical discipline which is actively hostile to notation/shorthand.
On the other hand, I think that there are two aspects of this style that need to be separated - the decision to use extremely concise identifiers has nothing to do with the performance / size / correctness characteristics of the code. There is something else to the style that might be applied to get those, and perhaps that would be more important to teach than the ability to read 'notation-like' code (for lack of a better name).
I do understand the claim that the identifier style is what helps you understand and review the code much better, so it may be an enabler.
Thanks for giving this an actual shot. I appreciate it's alien tech, but it's something I've grown quite fond of over the last few years and I enjoy talking about it.
> They did get me thinking why programming is the one technical discipline which is actively hostile to notation/shorthand.
Oh they're not the only one. Look at maths papers from the last ten years versus just twenty years ago: So much more verbose! So hostile to brevity!
The benefit of a good notation is that it lets you keep the solution in your head.
The downside of a good notation is that it keeps people who don't understand the notation from contributing to the solution.
My conjecture: People who can't learn the notation can't contribute for other reasons.
My second conjecture: This shit isn't as hard as people think it is.
> the decision to use extremely concise identifiers has nothing to do with the performance / size / correctness characteristics of the code.
This is not true! When I wrote the first kOS kernel, I wrote it with newlines and tabs because I didn't know any better. When I rewrote it in this style, I noticed a lot of code duplication that I could only see because the kernel now fit on a screen. This made my program smaller (the original kernel was around 50kb; the new one was around 30kb), which left more room in cache for programs, making them faster as well. I was also able to see bugs that were only visible because I could see the caller and the callee at the same time, and correct them. It all adds up!
Since then, I've had numerous opportunities to see similar effects in other programs. Learning how to use notation has made me a better programmer.
Readability is defined by the fact that anyone who understands the language shouldn't need to "know how to read it".
Conventions, style, standards, etc. are explicitly created to lower the barrier to entry for understanding and maintaining a piece of code. Maximising the number of potential users who can quickly interpret and be productive with said code is the whole point of the concept of readability.
Unless you are claiming a specific advantage for the stylistic choices over simply taking more care to write more conventionally readable code, I don't understand your objection to the term unreadable.
"unreadable" in common parlance of the software world is not taken to mean literally impossible to parse but simply, not optimised for ease accessibility for the maximum number of people in the shortest amount of time. I'm afraid that means that the fact there is debate over this to such an extreme suggests this is at the very least, not very readable.
As with everything, there is a balance required. Unreasonable lengths should not have to be undertaken to maximise code readability. I do wonder what advantages you attribute to this style which you think justifies any potential trade-off in readability? What is gained for such a potential expense?
> Readability is defined by the fact that anyone who understands the language shouldn't need to "know how to read it".
Perhaps then this isn't C, but a language compatible with (at least a few) C compilers.
This language is something you can learn to read as effortlessly as you read other languages (perhaps more so), you simply haven't learned how to read this yet.
If you want to suggest readability is defined in that way and no other, how can someone who does not know this language make a statement about its readability?
I'm not sure I'm convinced with this line of thinking, because it's incredibly exclusionary, just watch:
> I do wonder what advantages you attribute to this style which you think justifies any potential trade-off in [me being unable to read it]? What is gained for such a potential expense?
Why would anyone want your input on something you do not understand?
Why would anyone write in Chinese if you cannot understand Chinese?
What is the point in this kind of definition of "readability"?
Does that make sense? Surely it's much more useful to talk instead about what the capability of the programmer is that can read and write and maintain code written this way, instead of the possible missed opportunities of the programmer that cannot?
You're a bit off, because this is C and I know how to write it. I can read and write English pretty well, too, but I have come across literature that is unreadable in that language as well.
Your comment about this being like unreadable English is very astute and got me thinking. To me, if someone says "m.c" is nice and readable, it's a bit like someone claiming Finnegans Wake is. For anyone who is not aware, this is what Finnegans Wake looks like:
"""
riverrun, past Eve and Adams, from swerve of shore to bend of bay, brings us by a commodius vicus of recirculation back to Howth, Castle and Environs.
Sir Tristram, violer d'amores, fr'over the short sea, had passencore rearrived from North Armorica on this side the scraggy isthmus of Europe Minor to wielderfight his penisolate war; nor had topsawyer's rocks by the stream Oconee exaggerated themselse to Laurens County's giorgios while they went doublin their mumper all the time....
"""
The difference is that I don't think anyone makes the claim that Finnegans Wake is readable or easy to work with. Everyone's pretty much agreed that this is a very unusual style that takes an enormous amount of time and effort to appreciate and understand. The style is fine; if it's someone's "thing" then great, more power to them. But what gets me is people pointing to this style of coding and feigning ignorance or surprise that others could find it troublesome, and making a big point of saying "Well it's easy for me, I guess you're just not smart" when confronted about it.
> But what gets me is people pointing to this style of coding and feigning ignorance or surprise that others could find it troublesome and making a big point of saying "Well it's easy for me, I guess you're just not smart" when confronted about it
Hey listen, I understand that other people find it troublesome.
The thing is that I think it's something people can learn if they want, and I think there are benefits to doing so.
I only object to the label "it's unreadable" because it demands that the author (or someone else) make it "readable" as the cure, when the right fix is to fix the programmer: to give them the experiences needed to be able to read it.
That's admittedly not very easy for the novice reader; certainly it would be much better for them to write less densely and use a smaller vocabulary, and that's why, just as with English, we start novice readers with less dense study materials that utilise a smaller vocabulary.
However it's a mistake to think that just because business English is a bit more dense than Goodnight Moon, that it's unreadable, and it's a worse one to think its lack of accessibility is without some value to the writer.
I should think people would want to seek out that value in exactly the same way.
> I only object to the label "it's unreadable" because it demands the author (or someone else) make it "readable" to cure, when the right fix is to fix the programmer: to give them the experiences needed to be able to read it.
Though I disagree with the claim of unreadability, for the reason you've mentioned—readability is a matter of conventions, never an absolute—I don't think it really 'demands' anything. If I were to call this a non-standard C coding style, then I don't think you or anyone would disagree; but that is just a statement of fact, not a demand to standardise it. I think a claim of unreadability can be read charitably in the same way, as a statement of the fact that "people used only to the standard C programming style will find this difficult to read", not any demand or even request that it be otherwise.
Because I'm not speaking to someone who speaks that way.
Arthur Whitney once wrote: One day I asked him, "Roger, do you do math in English or Cantonese?" He smiled at me and said, "I do it in Cantonese because it's faster and it's completely regular."
The point you have made repeatedly is that other programmers should aspire to use the style you prefer, even if they currently find it difficult to read.
In this comment you make the opposite point: that one should write in a style readers will most easily understand as they currently are.
You may think, because one example is of code and the other of prose, that those points are not in conflict, but they are.
Speaking of literature, have you read Edwin A. Abbott's 1884 book "Flatland: A Romance of Many Dimensions"? Some of the comments here make me think of Flatlanders!
That is a great comparison! There are people who can't understand heavily-spoken Scots English, African-American Vernacular English or Cajun Vernacular English, and while all of those are English, they're different dialects.
All are proper English, as any linguist will tell you, but Scots English is more or less mutually unintelligible with the others.
AAVE is relatively understandable because of pop culture, but someone without previous exposure to it would be confused in an environment that heavily uses it.
And if you got sat in front of a Cajun family, despite the relative amount in common with them, linguistically, you'd be bewildered at some of their heavily French-influenced vocabulary and structure.
Snide remarks are dull, and HN's guidelines explicitly recommend against cheap humor. Every time K is posted to HN someone makes this exact style of comment in an attempt to be funny, but it's entirely unoriginal.
There's good reason as to why C written by APL programmers looks as it does, and shallowly dismissing it does a disservice to everyone.
There are people who value brevity and minimal amount of code above everything else. Arthur Whitney, the author of K, doesn't even use scrolling — everything he reads or writes has to fit on a single page.
Brevity should not be an excuse for obscurity and illegibility. I get it that you assign a short name to central data structures, but all those two character identifiers...
As this thread and post prove, quite a lot of people have no issue reading + understanding it. One of the goals is to fit a lot of functionality on 1 screen, which helps, especially when you get older (but it always helped): no scrolling, no searching, no trying to find in which abstraction, in which of the 1000s of files of 'readable' code, things are stored. You open 1 file, and yes, it will take you (a lot) more time to read the same length (in lines) of, let's say, Java or C#, but in those languages you will often need 10s, 100s or even 1000s of such pages (in many different files, because that's 'readable') to write the same functionality idiomatically. So people who have the patience to read and understand this will actually be more productive, in my experience: I claim that it will take most people (a lot) more time to read and understand this code if it were written in a traditional way in C or whatever, with proper variable names, spread over many nicely created .h and .c files, than it takes someone who has done this before to understand and work on it. And that gets more obvious when you actually start using APL/K/J, which allow even more expressiveness and let you compress more functionality into fewer characters.
However it does not naturally scale to teams. But each way of working has its own strengths and weaknesses; if you are aiming for large teams working together on large projects, I would not recommend this. That is not because of readability, but because the code is so terse that even with 2 people you would be changing the same code a lot of the time, so it's more efficient to do it alone or in pair programming (which does work well, also remotely, in an APL-like language in my experience).
Think of it like it's math. If you have to explain the value of e every time you use it, the value of your work to onlookers diminishes, and it can no longer be held in the mind of any reader.
Fastai has a guide on how to write Whitney-style code in Python [2].
There is a relevant thread on HN [3].
[1] https://github.com/kevinlawler/kona/wiki/Coding-Guidelines#t...
[2] https://docs.fast.ai/dev/style.html
[3] https://news.ycombinator.com/item?id=19481505