antirez's comments | Hacker News

C has no problems splitting programs in N files, to be honest.

The reason FB (and I, for what it's worth) often write large single-file programs (Redis was split after N years of being a single file) is that with enough programming experience you know one very simple thing: complexity is not about how many files you have, but about the internal structure and conceptually separated module boundaries.

At some point you mainly split for compilation time and to orient yourself better in the code, instead of having to seek around a very large mega-file. Pointing the finger at a well-written program just because it's a single file strongly correlates with not being a very expert programmer.


The file granularity you chose was at the perfect level for somebody to approach the source code and understand how Redis worked. It was one of my favorite codebases to peruse and hack. It’s been a decade and my memory palace there is still strong.

It reminded me how important organization is to a project and certainly influenced me, especially applied in areas like Golang package design. Deeply appreciate it all, thank you.


Reminds me of one time when I was pair programming and the other chair said “let’s chop this up, it’s too long”, and when I queried the motivation (because I didn’t think it was too long), it was something like, “I’m very visual, seeing the file tree helps me reason about internals”. Fair enough, I thought at the time, whatever makes us more productive together.

On reflection, however, I’m unsure how that goes when working on higher-order abstractions or cross-cutting concerns that haven’t been refactored, and it’s too late to ask.


I split to enforce encapsulation by defining interfaces in headers based on incomplete structure types. So it helps me with the conceptually separated module boundaries. Super fast compilation is another benefit.

> complexity is not about how many files you have, but about the internal structure and conceptually separated module boundaries.

You probably don't need this, but ...

https://www.lelanthran.com/chap13/content.html


If this had been available in 2010, Redis scripting would have been JavaScript and not Lua. Lua was chosen based on the implementation requirements, not on the language ones... (small, fast, ANSI-C). I appreciate certain ideas in Lua, and people love it, but I was never able to like Lua, because it departs from a more Algol-like syntax and semantics without good reasons, for my taste. This creates friction for newcomers. I love friction when it opens new useful ideas and abstractions that are worth it: if you learn Smalltalk or FORTH and are lost for some time, that's part of how those languages are different. But I think for Lua this is not true enough: it feels like it departs from what people know without good reasons.

I don't love a good deal of Lua's syntax, but I do think the authors had good reasons for their choices and have generally explained them. Even if you disagree, I think "without good reasons" is overly dismissive.

Personally though, I think the distinctive choices are a boon. You are never confused about what language you are writing because Lua code is so obviously Lua. There is value in this. Once you have written enough Lua, your mind easily switches in and out of Lua mode. Javascript, on the other hand, is filled with poor semantic decisions which, for me, cancel out any benefits from syntactic familiarity.

More importantly, Lua has a crucial feature that Javascript lacks: tail call optimization. There are programs that I can easily write in Lua, in spite of its syntactic verbosity, that I cannot write in Javascript because of this limitation. Perhaps this particular JS implementation has TCO, but reading the release notes, I doubt it.

I have learned as much from Lua as I have from Forth (Smalltalk doesn't interest me), and my programming skill has increased significantly since I switched to it as my primary language. Lua is the only lightweight language that I am aware of with TCO. In my programs, I have banned the use of loops. This is a liberation that is not possible in JS or even C, where TCO cannot be relied upon.

In particular, Lua is an exceptional language for writing compilers. Compilers are inherently recursive and thus languages lacking TCO are a poor fit (even if people have been valiantly forcing that square peg through a round hole for all this time).

Having said all that, perhaps as a scripting language for Redis, JS is a better fit. For me though Lua is clearly better than JS on many different dimensions and I don't appreciate the needless denigration of Lua, especially from someone as influential as you.


> For me though Lua is clearly better than JS on many different dimensions and I don't appreciate the needless denigration of Lua, especially from someone as influential as you.

Is it needless? It's useful specifically because he is someone influential: someone might say "Lua was antirez's choice when making Redis, and I trust and respect his engineering, so I'm going to keep Lua as a top contender for use in my project because of that", so him being clear on his choices and reasoning is useful in that respect. And if you think he has a responsibility to be careful about what he says because of that influence, that same influence is also a reason he should definitely explain his thoughts on it, then and now.


Formally JavaScript is specified as having TCO as of ES6, although for unfortunate and painful reasons this is spec fiction - Safari implements it, but Firefox and Chrome do not. Neither did QuickJS last I checked and I don't think this does either.

ES is now ES2025, not ES6/2015. There are still platforms that don't even fully implement enough to shim ES5 completely, let alone ES6+. Portions of ES6 require buy-in from the hosting/runtime environment that isn't even practical for some environments... so I feel the statement itself is kind of ignorant.

> I think the distinctive choices are a boon. You are never confused about what language you are writing because Lua code is so obviously Lua. There is value in this.

This. And not just Lua: having a different kind of syntax for scripting languages or very high-level languages signals that it is something entirely different, and not C as in a systems programming language.

The syntax is also easier for people who don't intend to make programming their profession, but simply want to get something done. In the old days people would design simple programming languages for beginners: the ActionScript/Flash era, and even HyperCard before that. Unfortunately the industry is no longer interested in that, and if anything intends to make everything as complicated as possible.


>Lua is the only lightweight language that I am aware of with TCO.

Scheme is pretty lightweight.



Tcl needs a special command for tail calls though, instead of it Just Working (tm). It's kind of awkward.

Which scheme implementation? Guile?

All of them.

To elaborate, the scheme spec requires tco.

> Lua has a crucial feature that Javascript lacks: tail call optimization.

I'm not familiar with Lua, but I expect TCO to be a feature of the compiler, not of the language. Am I wrong?


You’re wrong in the way in which many people are wrong when they hear about a thing called “tail-call optimization”, which is why some people have been trying to get away from the term in favour of “proper tail calls” or something similar, at least as far back as R5RS[1]:

> A Scheme implementation is properly tail-recursive if it supports an unbounded number of active tail calls.

The issue here is that, in every language that has a detailed enough specification, there is some provision saying that a program that makes an unbounded number of nested calls at runtime is not legal. Support for proper tail calls means that tail calls (a well-defined subgrammar of the language) do not ever count as nested, which expands the set of legal programs. That’s a language feature, not (merely) a compiler feature.

[1] https://standards.scheme.org/corrected-r5rs/r5rs-Z-H-6.html#...


Thank you for the precise answer.

I still think that the language property (or requirement, or behavior as seen from within the language itself) that we're talking about in this case is "unbounded nested calls", and that the language spec doesn't (shouldn't) assume that such a property will be satisfied in a specific way, e.g. by switching the call to a branch, as TCO usually means.


Unbounded nested calls as long as those calls are in tail position, which is a thing that needs to be defined—trivially, as `return EXPR(EXPR...)`, in Lua; while Scheme, being based around expressions, needs a more careful definition, see link above.

Otherwise yes. For instance, Scheme implementations that translate the Scheme program into portable C code (not just into bytecode interpreted by C code) cannot assume that the C compiler will translate C-level tail calls into jumps and thus take special measures to make them work correctly, from trampolines to the very confusingly named “Cheney on the M.T.A.”[1], and people will, colloquially, say those implementations do TCO too. Whether that’s correct usage... I don’t think really matters here, other than to demonstrate why the term “TCO” as encountered in the wild is a confusing one.

[1] https://www.plover.com/misc/hbaker-archive/CheneyMTA.html


Cheney on the MTA is a great paper/algorithm, and I'd like to add (for the benefit of the lucky ten thousand just learning about this) that it's a pun on a great old song: Charlie on the MTA ( https://www.youtube.com/watch?v=MbtkL5_f6-4 ). The joke is that in both cases it will never return, either because the subway fare is too high or because you don't want to keep the call stack around.

Why do you think that?

I sort of see what you are getting at but I am still a bit confused:

If I have a program that, based on the input given to it, runs some number of recursive calls of a function, and I have two compilers for the language, can I compile the program using both of them, no matter what the actual program is, if compiler A has PTC and compiler B does not? As in, is the only difference that you won't get a runtime error if you exceed the max stack size?


That is correct, the difference is only visible at runtime. So is the difference between garbage collection (whether tracing or reference counting) and lack thereof: you can write a long-lived C program that calls malloc() throughout its lifetime but never free(), but you’re not going to have a good time executing it. Unless you compile it with Fil-C, in which case it will work (modulo the usual caveats regarding syntactic vs semantic garbage).

I think features of the language can make it much easier (read: possible) for the compiler to recognize when a function is tail call optimizable. Not every recursion will be, so it matters greatly what the actual program is.

It is a feature of the language (with proper tail calls) that a certain class of calls defined in the spec must be TCOd, if you want to put things that way. It’s not just that it’s easier for the compiler to recognize them, it’s that it has to.

(The usual caveats about TCO randomly not working are due to constraints imposed by preexisting ABIs or VMs; if you don’t need to care about those, then the whole thing is quite straightforward.)


If the language spec requires TCO, I think you can reasonably call it part of the language.

It wouldn't be the first time the specs have gone too far and beyond their perimeter.

C's "register" variables used to have the same issue, and even "inline" has been downgraded to a mere hint for the compiler (which can ignore it and still be a C compiler).


inline and register still have semantic requirements that are not just hints. Taking the address of a register variable is illegal, and inline allows a function to be defined in multiple .c files without errors.

IIRC, ES6+ includes TCO, but no actual implementation/engine has implemented it.

I don't think you're wrong per se. This is a "correct" way of thinking of the situation, but it's not the only correct way and it's arguably not the most useful.

A more useful way to understand the situation is that a language's major implementations are more important than the language itself. If the spec of the language says something, but nobody implements it, you can't write code against the spec. And on the flip side, if the major implementations of a language implement a feature that's not in the spec, you can write code that uses that feature.

A minor historical example of this was Python dictionaries. Maybe a decade ago, the Python spec didn't specify that dictionary keys would be retrieved in insertion order, so in theory, implementations of the Python language could do something like:

  >>> abc = {}
  >>> abc['a'] = 1
  >>> abc['b'] = 2
  >>> abc['c'] = 3
  >>> abc.keys()
  dict_keys(['c', 'a', 'b'])
But the CPython implementation did return all the keys in insertion order, and very few people were using anything other than the CPython implementation, so some codebases started depending on the keys being returned in insertion order without even knowing that they were depending on it. You could say that they weren't writing Python, but that seems a bit pedantic to me.

In any case, Python later standardized that as a feature, so now the ambiguity is solved.

It's all very tricky though, because for example, I wrote some code a decade ago that used GCC's compare-and-swap extensions, and at least at that time, it didn't compile on Clang. I think you'd have a stronger argument there that I wasn't writing C--not because what I wrote wasn't standard C, but because the code I wrote didn't compile on the most commonly used C compiler. The better approach to communication in this case, I think, is to simply use phrases that communicate what you're doing: instead of saying "C", say "ANSI C", "GCC C", "Portable C", etc.--phrases that communicate what implementations of the language you're supporting. Saying you're writing "C" isn't wrong, it's just not communicating a very important detail: what implementations of the compiler can compile your code. I'm much more interested in effectively communicating what compilers can compile a piece of code than pedantically gatekeeping what's C and what's not.


Python’s dicts for many years did not return keys in insertion order (since Tim Peters improved the hash in iirc 1.5 until Raymond Hettinger improved it further in iirc 3.6).

After the 3.6 change, they were returned in order. And people started relying on that, so at a later stage this became part of the spec.


There actually was a time when Python dictionary keys weren't guaranteed to be in the order they were inserted, as implemented in CPython, and the order would not be preserved.

You could not reliably depend on that implementation detail until much later, when optimizations were implemented in CPython that just so happened to preserve dictionary key insertion order. Once that was realized, it was PEP'd and made part of the spec.


Are you saying that Lua's TCO is an accidental feature due to the first implementation having it? How accurate is that?

What? No, I'm definitely not saying that.

I'm saying it isn't very useful to argue about whether a feature is a feature of the language or a feature of the implementation, because the language is pretty useless independent of its implementation(s).


> as my primary language

I'd love to hear more how it is, the state of the library ecosystem, language evolution (wasn't there a new major version recently?), pros/cons, reasons to use it compared to other languages.

About tail-calls, in other languages I've found sometimes a conversion of recursive algorithm to a flat iterative loop with stack/queue to be effective. But it can be a pain, less elegant or intuitive than TCO.


Lua isn't my primary programming language now, but it was for a while. My personal experience on the library ecosystem was:

It's definitely smaller than many languages, and this is something to consider before selecting Lua for a project. But, on the positive side: With some 'other' languages I might find 5 or 10 libraries all doing more or less the same thing, many of them bloated and over-engineered. But with Lua I would often find just one library available, and it would be small and clean enough that I could easily read through its source code and know exactly how it worked.

Another nice thing about Lua when run on LuaJIT: extremely high CPU performance for a scripting language.

In summary: A better choice than it might appear at first, but with trade-offs which need serious consideration.


Yeah, you can usually write a TCO-based algorithm differently without recursion, though it's often a messier implementation... In practice, with JS, I find that if I know whether I'm going to wind up more or less than 3-4 calls deep, I'll optimize or not, to avoid the stack overflow.

Also worth noting that some features in JS may rely on application/environment support and may raise errors that you cannot catch in JS code. This is often fun to discover and painful to try to work around.


Re: TCO

Does the language give any guarantee that TCO was applied? In other words, can it give you an error if the recursion is not in tail-call form? Because I can imagine writing a recursion and relying on it being TCO-optimized when it's not. I would prefer if a language had some form of explicit TCO modifier for a function. Is there any language that has this?


At least in Lua the rule is simply 'the last thing a function does', which is unambiguous. `return f()` is always a tail call and `return f() + 1` never is.

What about:

return 1 + f()

?


No, the last thing is the +, which can't run till it knows both values. (Reverse Polish notation is clearer, but humans prefer infix operators for some reason.)
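
To make the tail-position rule concrete, here is a minimal sketch in plain Lua (the function names are just illustrative): only the bare `return f(...)` form is a proper tail call and runs in constant stack space, while anything that still has work to do after the call keeps its frame alive.

  -- Proper tail call: the call is the whole return expression,
  -- so Lua reuses the current frame and the depth is unbounded.
  local function countdown(n)
    if n == 0 then return "done" end
    return countdown(n - 1)
  end

  -- NOT a tail call: the + still runs after the recursive call
  -- returns, so every level keeps its frame and deep input overflows.
  local function sum_to(n)
    if n == 0 then return 0 end
    return n + sum_to(n - 1)
  end

  print(countdown(10000000))   -- fine with proper tail calls
  -- print(sum_to(10000000))   -- would overflow the stack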

Although it’s a bit weird, Able Forth has the explicit word ~

https://github.com/ablevm/able-forth/blob/current/forth.scr

I do prefer this as it keeps the language more regular (fewer surprises)


Sounds a bit like Clojure's "recur". https://clojuredocs.org/clojure.core/recur

Scala has the @tailrec annotation which will raise a warning if the function can’t be TCO’d

C, with [[clang::musttail]]

JS has required proper tail calls (PTC) for a decade now. Safari's JavaScriptCore and almost every implementation except V8/SpiderMonkey (and the now defunct Chakra) has PTC.

v8 had PTC, but removed it because they insisted it MUST have a new tail call keyword. When they were shot down, they threw a childish fit and removed the PTC from their JIT.


> More importantly, Lua has a crucial feature that Javascript lacks: tail call optimization. There are programs that I can easily write in Lua, in spite of its syntactic verbosity, that I cannot write in Javascript because of this limitation. Perhaps this particular JS implementation has tco, but I doubt it reading the release notes.

> [...] In my programs, I have banned the use of loops. This is a liberation that is not possible in JS or even c, where TCO cannot be relied upon.

This is not a great language feature, IMO. There are two ways to go here:

1. You can go the Python way, and have no TCO, not ever. Guido van Rossum's reasoning on this is outlined here[1] and here[2], but the high level summary is that TCO makes it impossible to provide acceptably-clear tracebacks.

2. You can go the Chicken Scheme way, and do TCO, and ALSO do CPS conversion, which makes EVERY call into a tail call, without language user having to restructure their code to make sure their recursion happens at the tail.

Either of these approaches has its upsides and downsides, but TCO WITHOUT CPS conversion gives you the worst of both worlds. The only upside is that you can write most of your loops as recursion, but as van Rossum points out, most cases that can be handled with tail recursion, can AND SHOULD be handled with higher-order functions. This is just a much cleaner way to do it in most cases.

And the downsides to TCO without CPS conversion are:

1. Poor tracebacks.

2. Having to restructure your code awkwardly to make recursive calls into tail calls.

3. Easy to make a tail call into not a tail call, resulting in stack overflows.

I'll also add that the main reason recursion is preferable to looping is that it enables all sorts of formal verification. There's some tooling around formal verification for Scheme, but the benefits to eliminating loops are felt most in static, strongly typed languages like Haskell or OCaml. As far as I know Lua has no mature tooling whatsoever that benefits from preferring recursion over looping. It may be that the author of the post I am responding to finds recursion more intuitive than looping, but my experience contains no evidence that recursion is inherently more intuitive than looping: which is more intuitive appears to me to be entirely a function of the programmer's past experience.

In short, treating TCO without CPS conversion as a killer feature seems to me to be a fetishization of functional programming without understanding why functional programming is effective, embracing the madness with none of the method.

EDIT: To point out a weakness to my own argument: there are a bunch of functional programming language implementations that implement TCO without CPS conversion. I'd counter by saying that this is a function of when they were implemented/standardized. Requiring CPS conversion in the Scheme standard would pretty clearly make Scheme an easier to use language, but it would be unreasonable in 2025 to require CPS conversion because so many Scheme implementations don't have it and don't have the resources to implement it.

EDIT 2: I didn't mean for this post to come across as negative on Lua: I love Lua, and in my hobby language interpreter I've been writing, I have spent countless hours implementing ideas I got from Lua. Lua has many strengths--TCO just isn't one of them. When I'm writing Scheme and can't use a higher-order function, I use TCO. When I'm writing Lua and can't use a higher order function, I use loops. And in both languages I'd prefer to use a higher order function.

[1] https://neopythonic.blogspot.com/2009/04/tail-recursion-elim...

[2] https://neopythonic.blogspot.com/2009/04/final-words-on-tail...
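
To ground the comparison between loops, accumulator-passing tail recursion, and higher-order functions, here is a small Lua sketch (the `fold` helper is made up for illustration, not a standard library function); all three compute the same sum, and only the middle one depends on proper tail calls:

  -- plain loop
  local function sum_loop(t)
    local acc = 0
    for i = 1, #t do acc = acc + t[i] end
    return acc
  end

  -- tail-recursive: the accumulator is threaded through the call,
  -- and `return sum_rec(...)` is a proper tail call
  local function sum_rec(t, i, acc)
    i, acc = i or 1, acc or 0
    if i > #t then return acc end
    return sum_rec(t, i + 1, acc + t[i])
  end

  -- higher-order: the iteration pattern lives in `fold`,
  -- the per-element logic in a small anonymous function
  local function fold(t, init, f)
    local acc = init
    for i = 1, #t do acc = f(acc, t[i]) end
    return acc
  end
  local function sum_fold(t)
    return fold(t, 0, function(a, x) return a + x end)
  end

  local t = {1, 2, 3, 4}
  print(sum_loop(t), sum_rec(t), sum_fold(t))  --> 10  10  10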


EDIT 3: Looking at Lua's overall implementation, it seems to be focused on being fast and lightweight.

I don't know why Lua implemented TCO, but if I had to guess, it's not because it enables you to replace loops with recursion, it's because it... optimizes tail calls. It causes tail calls to use less memory, and this is particularly effective in Lua's implementation because it reuses the stack memory that was just used by the parent call, meaning it uses memory which is already in the processor's cache.

The thing is, a loop is still going to be slightly faster than TCOed recursion, because you don't need to move the arguments to the tail call function into the previous stack frame. In a loop your counters and whatnot are just always using the same memory location, no copying needed.

Where TCO really shines is in all the tail calls that aren't replacements for loops: an optimized tail call is faster than a non-optimized tail call. And in real world applications, a lot of your calls are tail calls!

I don't necessarily love the feature, for the reasons that I detailed in the previous post. But it's not a terrible problem, and I think it at makes sense as an optimization within the context of Lua's design goals of being lightweight and fast.


> In my programs, I have banned the use of loops.

Rather, you no longer see what they're doing clearly.


There's value in both implicit and explicit loops.

Some highly recursive programming styles are really just using the call stack as a data structure... which is valid but can be restrictive.


How so?

I suppose if you don’t understand recursion.


No offence but it seems that all of the people replying to this comment are essentially screaming into the void, or if anything at each other.

I scrolled most of this subthread and GP seems not to be replying to any of the replies they got.


> I do think the authors had good reasons for their choices and have generally explained them

I'm fairly certain antirez is the author of redis


The word "authors" in that phrase refers to the authors of Lua, not Redis.

Pretty sure he's talking about Lua's authors.

I do not think your compiler argument in support of TCO is very convincing.

Do you really need to write compilers with limitless nesting? Or is nesting, say, 100.000 deep enough, perhaps?

Also, you'll usually want to allocate some data structure to create an AST for each level. So that means you'll have some finite limit anyway. And that limit is a lot easier to hit in the real world, as it applies not just to nesting depth, but to the entire size of your compilation unit.


TCO is not just for parse trees or ASTs, but in imperative languages without TCO this is the only place you are "forced" to use recursion. You can transform any loop in your program into recursion if you prefer, which is what the author does.

> it feels like it departs from what people know without good reasons.

Lua was first released in 1993. I think that it's pretty conventional for the time, though yeah it did not follow Algol syntax but Pascal's and Ada's (which were more popular in Brazil at the time than C, which is why that is the case)!

Ruby, which appeared just 2 years later, departs a lot more, arguably without good reasons either? Perl, which is 5 years older and was very popular at the time, is much more "different" than Lua from what we now consider mainstream.


We had a lot of problems embedding Ruby in a multithreaded C program as the garbage collector tries to scan memory between the threads (more details here: https://gitlab.com/nbdkit/nbdkit/-/commit/7364cbaae809b5ffb6... )

Perl, Python, OCaml, Lua and Rust were all fine (Rust wasn't around in 2010 of course).


I'm reviving _why's syck right now. Turns out my fork from 2013 was still the most advanced. It doesn't implement the latest YAML specs, with all of their new insecurities, which is a good thing. And it's much, much faster than the SAX-like libyaml.

But since syck uses the Ruby hashtable internally, I got stuck in the gem for a while. It fell out of their stdlib, and is not really maintained either. PHP had the latest updates for it. And Perl (me) extended it to be more recursion-safe, and added more policies (what to do on duplicate keys: skip or overwrite).

So the Ruby bindings are troublesome because of its GC, which with threading now requires a global VM instance. And because of having to use the Ruby alloc/free pairs.

PHP, Perl, Python, Lua, Io, Cocoa: all no problem. Just Ruby, because of its too-tight coupling. Looks like I have to finally decouple it from Ruby.


> Ruby, which appeared just 2 years later, departs a lot more, arguably without good reasons either?

I doubt we ever would have heard about Ruby without its syntax decisions. From my understanding its entire raison d'être was readability.


It's essentially Perl for people who don't like punctuation marks.

More like if Smalltalk and Perl had a prettier baby.

Pascal and Ada are Algol syntaxed relative to most languages.

> yeah it did not follow Algol syntax but Pascal's and Ada's

Not quite sure what you mean by that; all of Lua, Pascal, and Ada follow Algol's syntax much more closely than C does.


    def ruby(is)
      it = is 
      a = "bad"
      example()
      begin
        it["had"] = pascal(:like)
      rescue
        flow
      end
    end

I don't think you understand his point. Ruby has a different syntax because it presents different/more language features than a very basic C-like language; it's inspired by Lisp/SmallTalk, after all. Lua doesn't but still decided to change its looks a lot, according to him.

I read this comment, about to snap back with an anecdote how I as a 13 year old was able to learn Lua quite easily, and then I stopped myself because that wasn't productive, then pondered what antirez might think of this comment, and then I realized that antirez wrote it.

I think the older you are the harder Lua is to learn. GP didn't say it made wrong choices, just choices that are gratuitously different from other languages in the Algol family.

I’m tickled that one of my favorite developers is commenting on another favorite's work. Would be great if Nicolas Cannasse were also in this thread!

Lua syntax is pretty good for DSL (domain specific language) cases / configuration definitions.

For example Premake[1] uses Lua as it is - without custom syntax parser but with set of domain specific functions.

This is pure Lua:

   workspace "MyWorkspace"
      configurations { "Debug", "Release" }
   
   project "MyProject"
      kind "ConsoleApp"
      language "C++"
      files { "**.h", "**.cpp" }
   
   filter { "configurations:Debug" }
      defines { "DEBUG" }
      symbols "On"
   
   filter { "configurations:Release" }
      defines { "NDEBUG" }
      optimize "On"

In that sense Premake looks significantly better than CMake with its esoteric constructs. Having a regular and robust PL to implement those 10% of configuration cases that cannot be defined with "standard" declarations is the way to go, IMO.

[1] https://premake.github.io/docs/What-Is-Premake
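
For readers who haven't seen the trick: this works because Lua lets you call a function with a single string or table literal without parentheses, so `workspace "MyWorkspace"` is just `workspace("MyWorkspace")`. A rough sketch of how such an API could be wired up (the `current` table and these function bodies are illustrative guesses, not Premake's actual implementation):

  local current = {}   -- illustrative state, not Premake's real model

  local function workspace(name)
    current.workspace = { name = name, configurations = {}, projects = {} }
  end

  local function configurations(list)
    current.workspace.configurations = list
  end

  workspace "MyWorkspace"
  configurations { "Debug", "Release" }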


It sounds like you're trying to articulate why you don't like Lua, but it seems to just boil down to syntax and semantics unfamiliarity?

I see this argument a lot with Lua. People simply don't like its syntax because we live in a world where C style syntax is more common, and the departure from that seem unnecessary. So going "well actually, in 1992 when Lua was made, C style syntax was more unfamiliar" won't help, because in the current year, C syntax is more familiar.

The first language I learned was Lua, and because of that it seems to have a special place in my heart or something. The reason for this is because in around 2006, the sandbox game "Garry's Mod" was extended with scripting support and chose Lua for seemingly the same reasons as Redis.

The game's author famously didn't like Lua, its unfamiliarity, its syntax, etc. He even modified it to add C style comments and operators. His new sandbox game "s&box" is based on C#, which is the language closest to his heart I think.

The point I'm trying to make is just that Lua is familiar to me and not to you for seemingly no objective reason. Had Garry chosen a different language, I would likely have a different favorite language, and Lua would feel unfamiliar and strange to me.


GP is the creator of Redis. I would imagine he knows Lua well given that Redis has embedded it for around a decade.

In that case, my point about Garry not liking Lua despite choosing it for Garry's Mod, for seemingly the same reason as antirez, is very appropriate.

I haven't read antirez'/redis' opinions about Lua, so I'm just going off of his post.

In contrast I do know more about what Garry's opinion on Lua is as I've read his thoughts on it over many years. It ultimately boils down to what antirez said. He just doesn't like it, it's too unfamiliar for seemingly no intentional reason.

But Lua is very much an intentionally designed language, driven in cathedral-style development by a bunch of professors who seem to obsess about language design. Some people like it, some people don't, but over 15 years of talking about Lua to other developers, "I don't like the syntax" is ultimately the fundamental reason I hear from developers.

So my main point is that it just feels arbitrary. I'm confident the main reason I like Lua is because garry's mod chose to implement it. Had it been "MicroQuickJS", Lua would likely feel unfamiliar to me as well.


If I am remembering correctly, there was a moment where Garry was seriously considering using Squirrel instead of Lua. I think he experimented with JavaScript too.

I’m not sure it’s still the case but he modified Lua to be zero indexed and some other tweaks because they annoyed him so much, so it’s possible if you learned GMod Lua you learned Garry’s Lua.

Of course his heart has been with C# for many years now.


It wouldn't fix the issue of semantics, but "language skins"[1][2] are an underexplored area of programming language development.

People go through all this effort to separate parsing and lexing, but never exploit the ability to just plug in a different lexer that allows for e.g. "{" and "}" tokens instead of "then" and "end", or vice versa.

1. <https://hn.algolia.com/?type=comment&prefix=true&query=cxr%2...>

2. <https://old.reddit.com/r/Oberon/comments/1pcmw8n/is_this_sac...>


Not "never exploit"; Reason and BuckleScript are examples of different "language skins" for OCaml.

The problem with "skins" is that they create variety where people strive for uniformity to lower the cognitive load. OTOH transparent switching between skins (about as easy as changing the tab sizes) would alleviate that.


> OTOH transparent switching between skins (about as easy as changing the tab sizes) would alleviate that.

That's one of my hopes for the future of the industry: people will be able to just choose the code style and even syntax family (which you're calling skin) they prefer when editing code, and it will be saved in whatever is the "default" for the language (or even something like the Unison Language: store the AST directly which allows cool stuff like de-duplicating definitions and content-addressable code - an idea I first found out on the amazing talk by Joe Armstrong, "The mess we're in" [1]).

Rust, in particular, would perhaps benefit a lot given how a lot of people hate its syntax... but also Lua for people who just can't stand the Pascal-like syntax and really need their C-like braces to be happy.

[1] https://www.youtube.com/watch?v=lKXe3HUG2l4


Also consider translation to non-English languages, including different writing and syntax systems (e.g. Arabic or Japanese).

Some languages have tools for more or less straightforward skinning.

Clojure to Tamil: https://github.com/echeran/clj-thamil/blob/master/src/clj_th...

C++ to distorted Russian: https://sizeof.livejournal.com/23169.html


> transparent switching between skins (about as easy as changing the tab sizes)

One of my pet "not today but some day" project ideas. In my case, I wanted to give Python/Gdscript syntax to any & all the curly languages (a potential boon to all users of non-Anglo keyboard layouts), one by one, via a VSCode extension that implements a virtual filesystem over the real one which translates the syntaxes back & forth during the load/edit/save cycle. Then the whole live LSP keeps running in the background against the underlying real source files, and the same extension resurfaces that, with line-number matching etc.

Anyone, please steal this idea and run with it, I'm too short on time for it for now =)


I want to do the opposite: Give curly braces to all the indentation based languages. Explicit is better than implicit, auto format is better than guessing why some block of code was executed outside my if statement.

> I wanted to give Python/Gdscript syntax to any & all the curly languages (a potential boon to all users of non-Anglo keyboard layouts)

Neo makes it really easy to type those

https://neo-layout.org


People fight about tab sizes all the time though.

That's precisely the point of using tabs for indentation: you don't need to fight over it, because it's a local display preference that does not affect the source code at all, so everyone can just configure whatever they prefer locally without affecting other people.

The idea of "skins" is apparently to push that even further by abstracting the concrete syntax.


> you don't need to fight over it, because it's a local display preference

This has limits.

Files produced with tab=2 and others with tab=8 might have quite different results regarding nesting.

(pain is still on the menu)


I don't see why? Your window width will presumably be tailored to accommodate common scenarios in your preferred tab width.

More than that, in the general case for common C like languages things should almost never be nested more than a few levels deep. That's usually a sign of poorly designed and difficult to maintain code.

Lisps are a notable exception here, but due to limitations (arguably poor design) with how the most common editors handle lines that contain a mix of tabs and spaces you're pretty much forced to use only spaces when writing in that family of languages. If anything that language family serves as case in point - code written with an indentation width that isn't to one's preference becomes much more tedious to adapt due to alternating levels of alignment and indentation all being encoded as spaces (ie loss of information which automated tools could otherwise use).


I find it tends to be a structural thing. Tabs for indenting are fine; hell, I prefer tabs for indenting. But use tabs for spacing and columnar layout and the format tends to break on tab width changes. Honestly not a huge deal, but as such I tend to avoid tabs for layout work.

I love the idea of "tabs for indents, spaces for alignment", but I don't even bring it up anymore because it (the combination of the two) sets so many people off. I also like the idea of elastic tabs, but that requires editor buy-in.

All that being said, I've very much a "as long as everyone working on the code does it the same, I'll be fine" sort of person. We use spaces for everything, with defined indent levels, where I am, and it works just fine.


I completely agree, hence my point about Lisps. In terms of the abstraction a tab communicates a layer of indentation, with blocks at different indentation levels being explicitly decoupled in terms of alignment.

Unfortunately the discussion tends to be somewhat complicated by the occasional (usually automated) code formatting convention that (imo mistakenly) attempts to change the level of indentation in scenarios where you might reasonably want to align an element with the preceding line. For example, IDEs for C like languages that will add an additional tab when splitting function arguments across multiple lines. Fortunately such cases are easily resolved but their mere existence lends itself to objections.


Do you mean that files produced with "wide" tabs might have hard newlines embedded more readily in longer lines? Or that maybe people writing with "narrow" tabs might be comfortable writing 6-deep if/else trees that wrap when somebody with their tabs set to wider opens the same file?

One day Brython (Python with braces, allowing copy-pasted code to autoindent) will be well supported by LSPs and world peace will ensue

  SyntaxError: not a chance

VB.Net is mostly a reskin of C# with a few extras to smooth the transition from VB.

Lowering the barrier to create your own syntax seems like a bad thing though. C.f. perl.

JavaScript in 2010 was a totally different beast, standardization-wise. Lots of sharp corners and blank spaces were still there.

So, even if an implementation like MicroQuickJS existed in 2010, it's unlikely that too many people would have chosen JS over Lua, given all the shortcomings that JavaScript had at the time.


While you're not wrong that JS has come a long way in that time, it's not the case that it was an extremely unusual choice at the time - Ryan Dahl chose it for node in 2009.

Lua has been a wild success considering it was born in Brazil, and not some high wealth, network-effected country with all its consequent influential muscle (Ruby? Python? C? Rust? Prolog? Pascal? APL? Ocaml? Show me which one broke out that wasn't "born in the G7"). We should celebrate its plucky success which punches waaay above its adoption weight. It didn't blindly lockstep ALGOL citing "adooooption!!", but didn't indulge in revolution either, and so treads a humble path of cooperative independence of thought.

Come to think of it I don't think I can name a single mainstream language other than Lua that wasn't invented in the G7.


I appreciate your point, but Python was invented in .nl which wouldn't be G7 strictly speaking.

In the same vein Pascal was invented by Niklaus Wirth in Switzerland.

Out of interest, was Tcl considered? It's the original embeddable language.

In 1994 at the second WWW conference we presented "An API to Mosaic". It was TCL embedded inside the (only![1]) browser at the time - Mosaic. The functionality available was substantially similar to what Javascript ended up providing. We used it in our products especially for integrating help and preferences - for example HTML text could be describing color settings, you could click on one, select a colour from the chooser and the page and setting in our products would immediately update. In another demo we were able to print multiple pages of content from the start page, and got a standing ovation! There is an alternate universe where TCL could have become the browser language.

For those not familiar with TCL, the C API is flavoured like main. Callbacks take a list of strings argv style and an argc count. TCL is stringly typed which sounds bad, but the data comes from strings in the HTML and script blocks, and the page HTML is also text, so it fits nicely and the C callbacks are easy to write.

[1] Mosaic Netscape 0.9 was released the week before


Wasn't the original Redis prototype written in Tcl?

Yes, previously: https://news.ycombinator.com/item?id=35989909

The Redis test suite is still written in Tcl: https://news.ycombinator.com/item?id=9963162 (although more recently antirez said somewhere he wished he'd written it in C for speed)


LuaJIT’s C FFI integration is super useful in a scripting language and I’ve replaced numerous functions previously written in things like Bash with it.

it also helps that it has ridiculously high performance for a scripting language


+1 for the incredibly niche (but otherwise make-or-break) fact that PUC-Rio Lua is and likely always will be strict C89 (i.e. ANSI C). I think this was (and still is?) most relevant to gamedev on Windows using older versions of MSVC, which has until recently been a few pennies short of a full C99 implementation.

I did once manage to compile Lua 5.4 on a Macintosh SE with 4MB of RAM, and THINK C 5.0 (circa 1991), which was a sick trick. Unfortunately, it took about 30 seconds for the VM to fully initialize, and it couldn't play well with the classic MacOS MMU-less handle-based memory management scheme.


> If this had been available in 2010, Redis scripting would have been JavaScript and not Lua.

Thank god it wasn’t then.


I also strongly disliked Lua's syntax at first, but now I feel like the metatables and whatnot, and pcall and all that stuff, are kinda worth it. I like everything about Lua except some of the awkward syntax, but I find it so much better than JS. Then again, I haven't been a web dev in over a decade.

The only thing I dislike about Lua is the 1-indexing. I know they had reasons for it but it always caused issues.

I'm torn on this.

Initially I agreed, just because so many other languages do it that way.

But if you ignore that and clean-slate it, IMO, 1-based makes more sense. I feel like 0-based mainly gained a foothold because of C's bastardization of arrays vs pointers and the associated tricks. But most other languages don't even support that.

You can only see :len(x)-1 so many times before you realize how ridiculous it is.
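
For what it's worth, a small sketch of the idioms Lua itself offers, where the 1-based choice mostly stays out of the way:

  local t = {"a", "b", "c"}

  print(t[1])        -- first element
  print(t[#t])       -- last element, no len-1 dance
  t[#t + 1] = "d"    -- append

  for i, v in ipairs(t) do   -- iterate without touching indices at all
    print(i, v)
  end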


0-based has a LOT of benefits, whereas the reasoning for 1-indexing in Lua, if I recall, was to make the language more approachable to non-devs.

Having written a game in it (via LÖVE), the 1-indexing was a continued source of problems. On the other hand, I rarely need to use len-1, especially since most languages expose more readable methods such as `last()`.


Python got this right. Zero-based indexing combined with half-open slice notation means as a practical matter you don't see too many -1s in the code. Certainly far fewer than when I wrote a game in Löve for a gamejam, where screen co-ordinates are naturally zero-indexed, which has implications for everything onscreen (tile indices, sprites, ...)

I could live with 1-indexing, but a closed-range array unpack (slices) is quite a big toll and breaks a nice intuitive invariant.

Normally I'd say "it's never too late!", but that clearly would diverge and require an entirely new project, maintaining two bases for the same thing, etc.

Good to see you alive and kicking. Happy holidays


I’m always surprised people pick Lua when Pawn exists. I think I’d even still choose it over MicroQuickJS

https://www.compuphase.com/pawn/pawn.htm


I remember seeing this a long time ago and liking it, I just didn't have a use for it at the time. How does it stack up against LuaJIT for perf, memory, and threading? It also looks like it could be worth porting the compiler to Zig, which excels at both compiler writing and cross-platform tooling.

LuaJIT is best-in-class performance for a scripting language, by a huge margin. In very specific problems, it can outperform C. Surely anything else is going to come up lacking if you are only considering raw benchmarks.

My hunch is that the same is true of Wikipedia's choice of Lua for template scripting, made back in 2012.

https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists...


> it feels like it departs from what people know without good reasons.

Lua is a pretty old language. In 1993 the world had not really settled on C style syntax. Compared to Perl or Tcl, Lua's syntax seems rather conventional.

Some design decisions might be a bit unusual, but overall the language feels very consistent and predictable. JS is a mess in comparison.

> because it departs from a more Algol-like syntax

Huh? Lua's syntax is actually very Algol-like since it uses keywords to delimit blocks (e.g. if ... then ... end)


I've known for a very long time that C (and co.) inherited its syntax from Algol.

But only after a long time did I actually check what Algol looks like. To my surprise, Algol does not look anything like C to me.

I would be quite interested in the expanded version of “C has inherited syntax from Algol”

Edit: apparently the inheritance from Algol is a formula: lexical scoping + value-returning functions (expression-based) - parenthesitis. Only the last item is about the visual part of the syntax.

Algol's alternatives were: COBOL, Fortran, Lisp, APL.


The use of curly braces for delimiting blocks of code actually comes from BCPL.

Of course, C also inherited syntax from Algol, but so did most languages.


> consistent and predictable

That's what matters to me, not how similar Lua is to other languages, but that the language is well-designed in its own system of rules and conventions. It makes sense, every part of it contributes to a harmonious whole. JavaScript on the other hand.

When speaking of Algol or C-style syntax, it makes me imagine a "Common C" syntax, like taking the best, or the least common denominator, of all C-like languages. A minimal subset that fits in your head, instead of what modern C is turning out to be, not to mention C++ or Rust.


Is modern C really much more complicated than old C? C++ is a mess of course.

I don't write modern C for daily use, so I can't really say. But I've been re-learning and writing C99 more these days, not professionally but personal use - and I appreciate the smallness of the language. Might even say C peaked at C99. I mean, I'd be crazy to say that C-like languages after C99, like Java, PHP, etc., are all misguided for how unnecessarily big and complex they are. It might be that I'm becoming more like a caveman programmer as I get older, I prefer dumb primitive tools.

C11 adds a couple of nice things like static asserts which I use sometimes to document assumptions I make.

They did add some optional sections like bounds checking that seem to have flopped, partly for being optional, partly for being half-baked. Having optional sections in general seems like a bad idea.


If you don't have compiler restrictions, C23 is also a pleasure to write. `typeof`, `constexpr`, `#embed`, `nullptr`, attributes and all.

The big new thing in C11 was atomics and threading.

IDK about C11; but C99 doesn't change a lot compared to ANSI C. You can read The C Programming Language 2nd edition and pick up C99 in a week. It adds booleans, some float/complex math ops, an optional floating point definition and a few more goodies:

https://en.wikipedia.org/wiki/C99

C++ by comparison is a behemoth. If C++ died and, for instance, the FLTK guys rebased their libraries onto C (and Boost too, for instance), it would be a big loss at first, but Chromium and the like rewritten in C would slim down a bit, the complexity would plummet, and similar projects would use far less CPU and RAM.

It's not just about the binary size; C++ today makes even the Common Lisp standard (even with UIOP and some de facto standard libraries from Quicklisp) look pretty much human-manageable, and CL has always been a one-thousand-page standard with tons of bloat compared to Scheme or its sibling Emacs Lisp. Go figure.


C++ is a katamari ball of programming trends and half-baked ideas. I get why Google built Golang, as they were already pretty strict about which parts of the C++ sediment you were allowed to use.

Not Google actually, but the same people behind C, AWK and Unix (and 9front, which is "Unix 2.0"): it has a simpler C (no POSIX bloat there), and the compilers embody basically the philosophy of Golang (cross-compile from any arch to any arch, CSP concurrency...).

Also, the Limbo language is basically pre-Go.



No.

https://en.wikipedia.org/wiki/Alef_(programming_language)

https://en.wikipedia.org/wiki/Limbo_(programming_language)

https://en.wikipedia.org/wiki/Newsqueak

https://en.wikipedia.org/wiki/Communicating_sequential_proce...

https://doc.cat-v.org/bell_labs/new_c_compilers/new_c_compil...

It was amalgamated at Google.

Originally Go used the Ken Thompson C compilers for Plan 9. It still uses CSP. The syntax is from Limbo/Inferno, and probably the GC came from Limbo too.

If anything, Golang was created for Google by reusing a big chunk of Plan 9's and Inferno's design, in some cases even directly, as it shows with the concurrency model. Or the cross-compiling suite.

A bit like MacOS X under Apple. We all know it wasn't born in a vacuum. It borrowed Mach, the NeXTStep API and the FreeBSD userland and they put the Carbon API on top for compatibility.

Before that, the classic MacOS had nothing to do with Unix, C, Objective C, NeXT or the Mach kernel.

Mac OS X is to NeXT what Go is to Alef/Inferno/Plan 9 C. Just as every macOS user is using something like NeXTStep with the Macintosh UI design for the 21st century, Go users are using a similar, futuristic version of the Limbo/Alef programming languages with a bit of the Plan 9 concurrency and automatic cross-compilation.


It's wonderful how you tied those threads together to describe Go's philosophical origins. I'm having a great time exploring the links. And the parallel with NeXTSTEP is fascinating too; I've been interested in that part of software history since learning that Tim Berners-Lee created WorldWideWeb.app on the NeXTcube.

Not to mention the 1-based indexing sin. JavaScript has a lot of WTFs but they got that right at least.

This indeed is not Algol (or rather C) heritage, but Fortran heritage, not memory offsets but indices in mathematical formulae. This is why R and Julia also have 1-based indexing.

Pascal. Modula-2. BASIC. Hell, Logo.

Lately, yes, Julia and R.

Lots of systems I grew up with were 1-indexed and there's nothing wrong with it. In the context of history, C is the anomaly.

I learned the Wirth languages first (and then later did a lot of programming in MOO, a prototype OO 1-indexed scripting language). Because of that early experience I still slip up and make off by 1 errors occasionally w/ 0 indexed languages.

(Actually both Modula-2 and Ada aren't strictly 1 indexed since you can redefine the indexing range.)

It's funny how orthodoxies grow.


In fact zero-based has shown some undeniable advantages over one-based. I couldn't explain it better than Dijkstra's famous essay: http://www.cs.utexas.edu/~EWD/ewd08xx/EWD831.PDF

It's fine, I can see the advantages. I just think it's a weird level of blindness to act like 1 indexing is some sort of aberration. It's really not. It's actually quite friendly for new or casual programmers, for one.

I think the objection is not so much blindness as the idea that professional tools should not generally be tailored to the needs of new or casual users at the expense of experienced users.

Is there any actual evidence that new programmers really find this hard? Python is renowned for being beginner friendly and I've never heard of anyone suggesting it was remotely a problem.

There are only a few languages that are purely for beginners (LOGO and BASIC?) so it's a high cost to annoy experienced programmers for something that probably isn't a big deal anyway.


Pascal, frankly, allowed you to index arrays by any enumerable type; you could use Natural (1-based), or you could use 0..whatever. Same with Modula-2; writing it, I freely used 0-based indexing when I wanted to interact with hardware where it made sense, and 1-based indexes when I wanted to implement some math formula.

As I understand it Julia changed course and is attempting to support arbitrary index ranges, a feature which Fortran enjoys. (I'm not clear on the details as I don't use either of them.)

Let’s hope that they don’t also replicate ISO Fortran’s design flaws with lower array bounds, which contain enough pitfalls and portability problems that I don’t recommend their use.

I haven't used either language much myself and I thought the feature looked brilliant so I'd be very curious to know what sort of issues you ran into in practice.

> Lots of systems I grew up with were 1-indexed and there's nothing wrong with it. In the context of history, C is the anomaly.

The problem is that Lua is effectively an embedded language for C.

If Lua never interacted with C, 1-based indexing would merely be a weird quirk. Because you are constantly shifting across the C/Lua barrier, 1-based indices becomes a disaster.


And MATLAB. Doesn't make it any better that other languages have the same mistake.

Does it count as 0-indexing when your 0 is a floating point number?

Actually in JS array indexing is same as property indexing right? So it's actually looking up the string '0', as in arr['0']

Huh. I always thought that JS objects supported string and number keys separately, like lua. Nope!

  [Documents]$ cat test.js
  let testArray = [];
  testArray[0] = "foo";
  testArray["0"] = "bar";
  console.log(testArray[0]);
  console.log(testArray["0"]);
  [Documents]$ jsc test.js
  bar bar
  [Documents]$

Lua supports even functions and objects as keys:

  function f1() end
  function f2() end
  local m1 = {}
  local m2 = {}
  local obj = {
      [f1] = 1,
      [f2] = 2,
      [m1] = 3,
      [m2] = 4,
  }
  print(obj[f1], obj[f2], obj[m1], obj[m2], obj[{}])
Functions as keys is handy when implementing a quick pub/sub.
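
A minimal sketch of that pub/sub pattern (the `subscribe`/`publish` names are made up for illustration): using the handler function itself as the table key makes unsubscribing a single indexed delete.

  local subscribers = {}

  local function subscribe(fn)   subscribers[fn] = true end
  local function unsubscribe(fn) subscribers[fn] = nil  end

  local function publish(...)
    for fn in pairs(subscribers) do fn(...) end
  end

  local function on_event(msg) print("got:", msg) end
  subscribe(on_event)
  publish("hello")      --> got: hello
  unsubscribe(on_event)
  publish("ignored")    -- nothing printed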

They do, but strings that are numbers will be reinterpreted as numbers.

[edit]

  let testArray = [];
  testArray[0] = "foo";
  testArray["0"] = "bar";
  testArray["00"] = "baz";
  console.log(testArray[0]);
  console.log(testArray["0"]);
  console.log(testArray["00"]);

That example only shows the opposite of what it sounds like you’re saying, although you could be getting at a few different true things. Anyway:

- Every property access in JavaScript is semantically coerced to a string (or a symbol, as of ES6). All property keys are semantically either strings or symbols.

- Property names that are the ToString() of an unsigned integer below 2^32 - 1 are considered indexes for the purposes of the following two behaviours:

- For arrays, indexes are the elements of the array. They’re the properties that can affect its `length` and are acted on by array methods.

- Indexes are ordered in numeric order before other properties. Other properties are in creation order. (In some even nicher cases, property order is implementation-defined.)

  { let a = {}; a['1'] = 5; a['0'] = 6; Object.keys(a) }
  // ['0', '1']

  { let a = {}; a['1'] = 5; a['00'] = 6; Object.keys(a) }
  // ['1', '00']

There's nothing wrong with 1-based indexing. The only reason it seems wrong to you is because you're familiar with 0-based, not because it's inherently worse.

I'll refer you to Dijkstra's "Why numbering should start at zero": https://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831...

That's simply untrue. 1-based indexing is inherently worse because it leads to code that is less elegant and harder to understand (and slightly less efficient, but that's a minor factor).
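
One concrete case, sketched in Lua since it is the language under discussion (the function is illustrative, not from any library): any modular index arithmetic, such as advancing a ring buffer slot, has to shift down to 0-based, wrap, and shift back up.

  -- With 0-based indices this would simply be (i + k) % n.
  local function advance(i, k, n)
    return ((i - 1 + k) % n) + 1
  end
  print(advance(4, 1, 4))  --> 1 (wraps around)
  print(advance(1, 3, 4))  --> 4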

If you can't deal with off-by-one errors, you're not a programmer.

But with Lua all those errors are now off by two

Or 0. Lua is in superposition, ready for the quantum computing age.

Except for Date.

Lua having a JIT compiler seems like a big difference though. It has been a while since that got major updates, but it was probably relevant at the time?

Lua - an entire language off by one.

Sure, because the first element is at index 1, not zero. Ha

Redis' author also made jimtcl, so I don't think the lack of a small engine was the gap

You're replying to Redis' author.

What are the chances of switching to MQJS or something like it in the future?

> If this had been available in 2010, Redis scripting would have been JavaScript and not Lua.

This would have been a catastrophic loss. Lua is better than javascript in every single way except for ordinal indexing


I for one would be very interested in a Redbean[0] implementation with MicroQuickJS instead of Lua, though I lack the resources to create it myself.

[0] https://redbean.dev/ - the single-file distributable web server built with Cosmopolitan as an αcτµαlly pδrταblε εxεcµταblε


I think criticizing JavaScript has become a way of signaling "I'm a good programmer." Yes, good programmers ten years ago had valid reasons to criticize it. But today, attacking the efforts of skilled engineers who have improved the language (given the constraints and without breaking half of the web) seems unfair. They’ve achieved a Herculean task compared to the Python dev team, which has broken backward compatibility so many times yet failed to create a consistent language, lacking a single right way to do many things.

> But today, attacking the efforts of skilled engineers who have improved the language (given the constraints and without breaking half of the web) seems unfair.

I was criticising a thing not a person.

Also your comment implies it was ok to be critical of a language 10 years ago but not ok today because a few more language designers might get offended. Which is a weird argument to make.


I think he’s saying it’s a fundamentally improved language at this point?

Not OP, but the case can be made that it's still the same very ugly language of 10 years ago, with a few layers of sugar coating on top. The ugly hasn't gone anywhere. You still have to deal with it and suffer the cognitive burden.

> Not OP, but the case can be made that it's still the same very ugly language of 10 years ago, with a few layers of sugar coating on top.

Let's talk specifics. As it seems you have strong opinions, in your opinion what is the single worst aspect of JavaScript that justifies the use of the word "ugly"?


https://dorey.github.io/JavaScript-Equality-Table/

https://www.reddit.com/r/learnjavascript/comments/qdmzio/dif...

or anything that touches array ops (concatenating, map, etc…). I mean, better and more knowledgeable people than me have written thousands of articles about those footguns and many more.

I am not a webdev, I don't want to remember those things, but more often than I would wish, I have to interop with JS, and then I'd rather use a better behaved language that compiles down to JS (there are many very good ones, nowadays) than deal with JS directly, and pray for the best.


If type conversion and the new var declaration keywords are your top complaints about a language, I'm sorry to say that you are at best grasping at straws to find some semblance of justification for your irrational dislike.

> I am not a webdev, I don't want to remember those things, (...)

Not only is JavaScript way more than a webdev thing, you are ignoring the fact that most of the mainstream programming languages also support things like automatic type conversion.


> you are at best grasping at straws to find some semblance of justification for your irrational dislike.

You seem so emotionally involved that the whole point whooshed over your head. JS is a language that gives me no joy to use (there are many of those, I can put Fortran or SQL in there), and, remarkably, gives me no confidence that whatever I write with it does what I intend (down to basic branching that checks for nulliness/undefinedness, edge cases, etc). In that sense it's much worse than most of those languages that I merely dislike.

> Not only is JavaScript way more than a webdev thing, you are ignoring the fact that most of the mainstream programming languages also support things like automatic type conversion.

Again, you are missing the point. JS simply has no alternative for webdev, but it's easy to argue that, for everything else, there are better, faster, more expressive, more robust, … languages out there. The only time I ever have to touch JS is consequently for webdev.


Or good programmers understand why JS is bad?

Every programming language is an abomination depending on the perspective.

> Text generated by an LM is not grounded in communicative intent

This means exactly that no representation should exist in the activation states about what the model wants to say, and that there must be only single-token probabilistic inference at play.

Their model also requires the contrary: that the model does not know, semantically, what the query really means.

"Stochastic parrot" has a scientific meaning, and just by observing how the models behave it is quite evident that they were very wrong. But now we have strong evidence (via probing) that the sentence you quoted is not correct either, since the model knows the idea it wants to express also in general terms, and features about things it is going to say much later activate many tokens earlier, including conceptual features that are only relevant later in the sentence / concept expressed.

You are making the big error that is common in this context: extending the stochastic parrot into a model that is not scientifically isolated, one that can be made large enough to accommodate any evidence arriving from new generations of models. The stochastic parrot does not understand the query nor is it trying to reply to you in any way; it just exploits a probabilistic link between the context window and the next word. This link can be more complex than a Markov chain but must be of the same kind: lacking any understanding and any communicative intent (no representation of the concept / sentences that are required to reply correctly). How is it possible to believe in this today? And check for yourself what the top AI scientists today believe about the correctness of the stochastic parrot hypothesis.


> > Text generated by an LM is not grounded in communicative intent

> This means exactly that no representation should exist in the activation states about what the model wants to say, and that there must be only single-token probabilistic inference at play.

That's not correct. It's clear from the surrounding paragraphs what Bender et al mean by this phrase. They mean that LLMs lack the capacity to form intentions.

> You are making the big error that is common in this context: extending the stochastic parrot into a model that is not scientifically isolated, one that can be made large enough to accommodate any evidence arriving from new generations of models.

No, I'm not. I haven't, in fact, made any claims about the "stochastic parrot". Rather, I've asked whether your characterisation of AI researchers' views is accurate, and suggested some reasons why it may not be.


What happened recently is that all the serious AI researchers who were on the stochastic parrot side changed their point of view but, incredibly, people without a deep understanding of such matters, previously exposed to those arguments, are lagging behind and still repeat arguments that the people who popularized them would not repeat again.

Today there is no top AI scientist that will tell you LLMs are just stochastic parrots.


You seem to think the debate is settled, but that’s far from true. It’s oddly controlling to attempt to discredit any opposition to this viewpoint. There’s plenty of research supporting the stochastic view of these models, such as Apple’s “Illusion” papers. Tao is also a highly respected researcher, and has worked with these models at a very high level - his viewpoint has merit as well.

The stochastic parrot framing makes some assumptions, one of them being that LLMs generate from minimal input prompts, like "tell me about Transformers" or "draw a cute dog". But when input provides substantial entropy or novelty, the output will not look like any training data. And longer sessions with multiple rounds of messages also deviate OOD. The model is doing work outside its training distribution.

It's like saying pianos are not creative because they don't make music. Well, yes, you have to play the keys to hear the music, and transformers are no exception. You need to put in your unique magic input to get something new and useful.


Now that you’re here, what do you mean by “scientific hints” in your first paragraph?

I'm not involved in business decisions, and while I'm very AI positive I believe Redis as a company should focus on Redis fundamentals: so my piece has zero alignment with what I hope for the company.

This means your difficulty is not programming per se, but that you are working in a very suboptimal industry / company / system. With all due respect, you use programming at work, but true programming is the act of creating a system that you or your team designed and want to bring to life. Confusing the reality of writing code for a living in some company with what Programming with a capital P is produces a lot of misunderstanding.


Wow. I think that's a serious mistake. Maybe GitHub is no longer so great and snappy, but nowhere near enough to justify moving something that needs: 1. Money, 2. Exposure, to something obscure just because it's a bit better. It's Git with a UI anyway, there isn't such a large difference. I don't care about the fact that the post is harsh: it's the content that is broken from my POV. It is absolutely legit to do something like that, in theory, but when you are handling a project that - at this point - is also the chosen language of a non trivial amount of folks, you need to act not just following what you like, but what is better for the project in the long run, and it is very hard to see how going away from GitHub (the fucking big market of open source software in the main city plaza, to use the same tones as the post) is better for Zig.

What I think is better is, of course, not absolutely better, but let's zoom in on the root cause of this issue. It is the classical developer intolerance for tools that are not "as they wish/think", which is very common among technical people, but it is just a POV. This "tool oriented" workflow, where some little feature/customization matters so much in your life (instead of adapting a bit and not caring), is something I believe is a problem in our industry, and it also affects the design philosophy of many programmers, who end up too detail oriented. Coders spend the majority of their life in the terminal, not in GitHub. Checking issues / PRs there is not this Stranger Things Upside Down nightmare.

Another problem with that is that you know what you are leaving, but you don't really know what you will find in the new place. GitHub used to go down often in the early days. Now they may not be snappy, and unfortunately, like 99% of the web, they fell for this JavaScript framework craziness. But the site is always up, I bet it has disaster recovery and a serious backup policy, and so forth. Can you find this so obviously in other smaller places?


GitHub Actions are seriously broken and that alone is a technically sound enough reason to move: the sleep.sh bullshit has degraded the performance of our CI for a long time, as it regularly livelocks runner agents in an endless while(true) spin, after which they stop processing new jobs. The agent itself has poor platform support, also because it has a runtime dependency on .NET, and lately GH Actions started running jobs out of order, with the result that old jobs would starve and time out, causing PRs to turn up red for no real reason.

It's a real problem to run a project like Zig if your CI doesn't work. I guess we could have paid for an external CI service, but that as well would depend on GitHub APIs, so we would have gained what, a couple years? Given the current trajectory of GitHub I wouldn't trust them to maintain those APIs correctly for any longer than that (and as far as I know the current vibe-scheduling issues might already be reflected in the APIs that third party CI providers would use).

Let's not forget that "GitHub is an AI company now".


As a side note, people said that not posting anymore on Twitter and leaving Reddit was also a death sentence for Zig. Time has passed and we're still alive so far, while in the meantime both platforms have started their final journey towards the promised lands of the elves.

They won't get there tomorrow or next month, but I'm sure there was a time when people started moving from SourceForge to GitHub and somebody else remarked that they were doing something needlessly risky.

As far as we can tell Codeberg is a serious attempt at a non-profit code sharing platform and we feel optimistic enough about its future that we're willing to bet on it.


I hope the best for Zig, Loris. But even if Zig survives and prospers (I hope for both), I still believe this is not a sound decision nor the right attitude. I hope I'm wrong, but I wanted to share my reasoning with you. Here you are moving away from the open source marketplace AND from your main revenue stream. It's not similar to no longer posting to Twitter. A better parallel, in terms of potential outcome, would be no longer posting anything Zig related on Hacker News.


We've been directing people to use other means to donate for a few years now, so GH Sponsors is not our main source of income anymore (and hasn't been for a while). It's still a significant chunk, but it's also not going to go away overnight.

> A better parallel, in terms of potential outcome, would be no longer posting anything Zig related on Hacker News.

I've been thinking about this lately and in my experience (having seen the effect of HN posts in the past when Zig was smaller vs now) the community is already big and vibrant enough that an HN post alone doesn't make too much of a difference. To be clear, I don't think that HN is losing relevance (unlike all the other big platforms mentioned earlier in this conversation), but our situation has changed.

People now are more and more learning about Zig through cool Zig projects, not by looking at yet another superficial language comparison blog post, which is the kind of content that tends to get to the top of HN more often than not.

More generally, I think that your point about not pulling away from all the marketplaces of ideas is valid, but most of those marketplaces are not as good as they claim to be, and we have the luxury of running a project that has a strong community connected to it, meaning that we won't be starved of attention or contributors by moving away from GitHub.

This whole situation has an interesting parallel with what's happening in our community wrt chat platforms, if we happen to be at the same tech event in person I'll be happy to share with you all the details :^)


> It's still a significant chunk, but it's also not going to go away overnight.

You despise and are leaving GitHub, but intend to keep collecting money from their sponsorship feature/program? Sounds like they are doing something right then…


I had to scroll too far for someone to mention third-party CI. GitHub Actions free runners have always sucked, but the third-party runner ecosystem is really strong for those who can afford it. IMO the APIs are far better than the rest of the product - I suspect enterprise customers are strong-arming GitHub to keep them reliable. And there's always third-party CI like Tekton if Actions' YAML is too annoying.


Namespace fixed all of the issues I had with GitHub Actions. Ghostty uses them to build their project: https://namespace.so

Using Namespace made it clear how much cruft GitHub Actions has accumulated and how much performance they leave on the table. I regard GitHub Actions like Nix: weird configuration, largely shell-based, and the value you get out is commensurate with the investment you put in. But it works well enough.

But at the end of the day, GitHub Actions, like Nix, is just shell scripts. They're fairly portable. I like Namespace because they fixed the parts of CI that matter, like fast local caching versus GitHub's HTTP-based cache.

But I also don't hate this: I use GitHub for the pretty website and global search. Someone will mirror Zig for the search, and my terminal does not care where I clone the repository from. I think this is the new world we live in.

Someone will have to build the aggregator that indexes all repositories and makes them searchable, but that can ultimately be separate from hosting.


Speaking as the creator of one of the largest Zig projects, I agree with antirez


As the creator of one of the larger Zig projects, does this announcement undermine your confidence in Zig at all? Either with the language used or the decision in itself?


> undermine your confidence in Zig

Nah

I do think the combo of not using Twitter, not using Discord, and not using GitHub does make it a bit more challenging for Zig to become a mainstream programming language, Twitter being the least important amongst those. Hard to say how much it matters in practice, as things tend to win on their strengths and not on a lack of weaknesses.


> let's zoom on this issue root cause. It is the classical developer intolerance for tool that are not "as they wish/think", which is very common among technical people, but is a POV, I mean this "tool oriented" workflow, where this little feature/customization matters so much in your life (instead of adapting a bit and do not care), that I believe is a problem in our industry

k

Prospective Zig contributors can just adapt a bit and not care about the fact that the project isn't hosted on GitHub, then.

Right?


I was happy from the sidelines seeing the recent big Zig donations. But this sudden decision is a shock. Technical issues can be worked around (I wish/think), but leaving such a dominant platform? I don't know. For my small needs Forgejo works great, but for Zig, a project which I hope has a lot of mainstream success, I'm not convinced that Forgejo/Codeberg is the best fit (atm). Even GrapheneOS, which has very high standards, is (still) on GitHub; maybe Zig could brood (brüten) over it a bit longer before deciding if it is really time to leave?



Moving away from GitHub may end up killing the project.


They will spend less time on PRs from LLM spammers like you (for anyone who wonders, just Google his username and check the PRs made to OCaml/Zig/Julia), so if anything they freed up resources.


Some time ago based on the same set of ideas I made this:

https://makerworld.com/it/models/99219-olivetti-style-vase-m...

You can create fast-to-print objects that consume very little filament; however, having some kind of texture on the surfaces is absolutely needed for strength.


(Original author of the blog post here.)

That is a neat model. It seems like, unlike Printables, MakerWorld doesn't have a 3D preview on the website (or maybe you need to be logged in, or use an app or something)? As I have a Prusa printer, I have naturally fallen into using Printables (and Thingiverse back before Printables was launched).

I will have to download the model and take a proper look in the slicer or CAD program when I'm no longer on my mobile phone. (I will probably not print it, as I'm not currently in the need of a desk organiser, and I don't like wasting plastic.)

For the benefit of those who don't know (I assume you do): Texture absolutely helps rigidity in any part, since if the surface is already curved or creased in one direction it resists bending in other directions. (This is why a rolled up paper is stiffer than a sheet.)


Thank you for your comment! The model can also be found on Printables. I have both an MK4 and a Bambu Lab A1 :D Thanks.


Ah, I have a Mk3.9s, and definitely don't need more than one printer.

I looked at the model on Printables, and have a followup question: why does the slit go all the way through? You can make the first few layers print normally and solid (in fact this is very common), so I'm slightly confused as to why you didn't (it would probably increase the strength a bit).


It was an attempt at using less plastic: in vase mode there is this absurd thing where the first few solid layers often account for 30% or more of the whole object's weight! However, yes, it would be more robust that way.

I have a long love affair with vase mode and abusing it, and always saving time/weight is my main goal, like in this other case: https://makerworld.com/it/models/840291-vase-mode-gear-phone...

Your blog post made me think that we would almost need a specialized vase mode site for models of that kind :D Moreover, there is no reason why the top surface could not be closed with bridging. The slicers have a lot of odd limitations in the context of vase mode.


Agreed, I would like to be able to specify vase mode as a height range modifier in the slicer, so I could shift back to non-vase mode near the top.

If you want full control though, you might want to look at https://github.com/FullControlXYZ/fullcontrol (using python to directly generate gcode). Perhaps a bit over the top, but their examples are cool and show things that can't currently be done any other way. You could definitely switch back and forth between vase and normal with it.


Thanks, yep I know FullControl, it's cool but far from practical. I have the feeling their contribution would be a lot more impactful if it worked with OrcaSlicer.


The problem is how they are fine-tuned with human feedback that is not opinionated, so they produce some "average taste" that is very recognizable. Early models didn't have this issue, it's a paradox... Lower quality / broken images, but often more interesting. Krea & Black Forest did a blog post about that some time ago.


Oh yeah, funny enough, even though I'm a bit of an AI art hater I actually thought very early Midjourney looked good because it all had an impressionistic, dreamy quality.


I wonder if we'll get to the point where we train different personalities into an image model that we can bring out in the prompt and these personalities have distinct art/picture styles they produce.


Related: Redis has a keyspace notifications feature that does something similar and is not very well known, but when it is needed, it is really needed. We are thinking of extending this mechanism in new ways. Similarly, I have seen setups where the notifications arriving from Postgres/MySQL are used in order to materialize (and keep updated) a cached view in Redis. To me, it is interesting how certain teams rely on these kinds of mechanisms provided by database systems, while other teams like to do things on the client side to have full control, even in the face of having to reimplement some logic.
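
For those who never played with it, the gist is: you enable the event classes you care about and subscribe to the corresponding Pub/Sub channels. A minimal sketch with redis-cli (KEA enables all event classes, adjust to taste):

  redis-cli CONFIG SET notify-keyspace-events KEA
  redis-cli PSUBSCRIBE '__keyevent@0__:*'

After that, every command touching a key in DB 0 (SET, DEL, expirations firing, and so on) shows up as a message on the subscribed pattern, which is what makes the cached-view setup above possible without polling.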

