I like to rate programming language's features not by how much I use them when I'm in the given language, or how good they make me feel, but by how much I miss them when I'm in a different language, once I'm fluent in that language and writing in the native idiom. (This is important. If you're still trying to write X in Y, yes, you'll miss the features from X, but that's not a useful data point.)
By this metric, rather a lot of features turn out to be less important than they may seem at first. Many things are a zero on this scale that I think might surprise people still on their second or third language. From this perspective you start judging not whether a language has this or that exact feature that is a solution to a problem that you are used to, but whether it has a solution at all, and how good it is on its own terms.
So while sigils have a lot of company in this, they are also a flat zero for me on this scale. Never ever missed them. I did a decade+ of Perl as my main language, so it's not for lack of exposure.
(As an example of something that does pass this test: Closures. Hard to use anything lacking them, though as this seems to be a popular opinion nowadays, almost everything has them. But I'm old enough to remember them being a controversial feature. Also, at this point, static types. Despite my decades in dynamically typed languages, I now hate going back to them. YMMV.)
> So while sigils have a lot of company in this, they are also a flat zero for me on this scale. Never ever missed them. I did a decade+ of Perl as my main language, so it's not for lack of exposure.
I tend to miss one specific sigil (or pair of sigils): the @ and @@ sigils in Ruby, that mean "instance variable" and "class variable" respectively. Having identifier shadowing between stack-locals, and what Java would call "members" and "statics", be literally impossible, is just so nice. Especially when you get it "for free" in terms of verbosity, rather than needing to type `self.class.` or something.
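A minimal Ruby sketch of that point (the class and method names are made up for illustration): locals, instance variables, and class variables live in syntactically distinct namespaces, so a local can never shadow the other two.

```ruby
class Counter
  @@total = 0           # class variable (@@): shared by every instance

  def initialize
    @count = 0          # instance variable (@): per-object state
  end

  def increment
    count = @count + 1  # plain local; the sigil makes shadowing @count impossible
    @count = count
    @@total += 1
  end

  def count
    @count
  end

  def self.total
    @@total
  end
end

a = Counter.new
b = Counter.new
a.increment
b.increment
a.count        # => 1
Counter.total  # => 2
```

No `self.` or `self.class.` needed anywhere: the sigil alone says which namespace you mean.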
I also really quite like the interned-string-literal `:` sigil in Ruby/Elixir — though I'd be equally fine with the Prolog/Erlang approach of barewords being symbols and identifiers needing to be capitalized. As long as there's some concise syntax for interned strings, especially in the context of dictionary keys. Because otherwise people just won't use them, even when they're there in the language. (See: Java, Python, ES6.)
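For reference, a small Ruby sketch of why the `:` sigil matters for dictionary keys (the `config` hash and its keys are hypothetical): symbols are interned, so comparing them is identity comparison rather than a character-by-character scan.

```ruby
# Symbol keys in a hash literal; :host and :port are interned once.
config = { host: "localhost", port: 5432 }
config[:port]                         # => 5432

# Every :timeout literal denotes the very same object...
:timeout.equal?(:timeout)             # => true
# ...while two equal strings are usually two separate heap objects.
"timeout".dup.equal?("timeout".dup)   # => false
```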
Speaking of Elixir, the "universal sigil" ~ is kind of amazing. Define a macro sigil_h/2, and you can suddenly write ~h/foo/bar (or ~h[foo]bar, or whatever other delimiter works to best avoid the need for escaping), and foo and bar will be passed to sigil_h/2 as un-evaluated AST nodes to do with as you please. The language gives you ~w by default (which works like Ruby %w); but more interestingly, Regex literals in Elixir are just sigil_r.
> I tend to miss one specific sigil (or pair of sigils): the @ and @@ sigils in Ruby, that mean "instance variable" and "class variable" respectively. Having identifier shadowing between stack-locals, and what Java would call "members" and "statics", be literally impossible, is just so nice. Especially when you get it "for free" in terms of verbosity, rather than needing to type `self.class.` or something.
When I went from C++ to Python, the explicit "self" felt weird but over time, I felt it was much better. This became a lot more obvious in Rust. In C++ you get an implicit `this` variable and you get weird trailing keywords on functions to modify the `this` variable. Granted, these kinds of use cases won't be needed in every language. However, I also feel like sigils for this would be less understandable for someone unfamiliar with the language than explicit `self`. Something I judge a language on is how easy is the code to casually maintain by a group that is trying to get other stuff done.
The thing about Ruby is that it has uniform syntax — `a.b`, for any a and b, means "send the message :b to a." So `self.foo` (and `self.foo = bar`, too!) are possible to write, but these are always interpreted as message sends (to the :foo and :foo= methods, respectively), not as direct field accesses. What the "syntax-ness" of @ and @@ shows is that you're specifically breaking out of† the paradigm of "everything is a message send" to instead "just" access a field. It's what makes this make sense:
    def foo  # define a getter method
      @foo   # in terms of a field access
    end
How would you write that, if the field access was spelled `self.foo`? The language wouldn't be able to tell that you're not just recursively calling the getter!
---
† Though, technically, you're not breaking out of the paradigm; @foo is short for self.instance_variable_get(:@foo). It's message-sends all the way down, until you hit natively-implemented methods.
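A sketch of the full pattern (the `Session` class here is hypothetical): the getter and setter are ordinary methods, while the `@` sigil is the only way to reach the underlying field, which is what keeps the getter from recursing.

```ruby
class Session
  def user         # getter: the method :user...
    @user          # ...defined via the field; cannot recurse into itself
  end

  def user=(name)  # setter: the method :user=
    @user = name
  end

  def reset
    self.user = nil  # must be self.user= here; a bare `user = nil`
  end                # would just create a local variable
end

s = Session.new
s.user = "avery"  # a message send to :user=
s.user            # => "avery"
```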
> How would you write that, if the field access was spelled `self.foo`? The language wouldn't be able to tell that you're not just recursively calling the getter!
You can require parentheses for method calls, or put methods and fields in separate namespaces.
Elixir supports paren-free calls but the default linter and formatter won't let you use them except for a few whitelisted DSLs. I've never missed them.
In languages that require parens for method calls, not using them usually gets you a method handle. Which is still in conflict with a field reference — usually because methods are just function-pointer-typed static fields.
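Ruby sits on the other side of this trade-off: a bare name is already a call, and the handle has to be reified explicitly. A sketch (the `Greeter` class is invented):

```ruby
class Greeter
  def greet
    "hello"
  end
end

g = Greeter.new
g.greet                    # paren-free, but this is already a call => "hello"
handle = g.method(:greet)  # an explicit Method object when you want the handle
handle.call                # => "hello"
```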
As someone who infrequently had to touch Ruby code, this was maddening. Years later, I'm only now finding out what was going wrong and getting a better sense of what search terms to use.
As I said, I'm a big proponent of languages being approachable for those infrequent one-off cases. I've been burned by the challenge of updating the "handful" of Perl and Ruby scripts (and Perl was my first language). This is why I advocate against Lua and 1-indexing when the target audience is programmers and it isn't a "primary" language.
I also have to touch Ruby code from time to time, so when I found out I don't quite understand what "@" and "@@" mean (other parts, even blocks, were kinda more or less apparent), I... went and read the docs. Took me an hour or two but now I know what "@" and "@@" mean and actually think they're a pretty ingenious solution.
I write explicit `this.thing` in C# as well. It started from inheriting some coding standards / projects whose designers came straight from C and didn't do the idiomatic `_variable` thing for instance variables.
Now it's quite an entrenched habit and at this stage I'd prefer if the implicit access wasn't possible.
Thanks for putting into words something I've started to feel over time, but never conceptualized clearly.
I agree that closures pass the test - and I too remember when they weren't popular. I also remember what I did before learning about the very idea of first-class functions and closures: I simulated them with some ad-hoc means (like function pointers in C/C++, or passing strings to be eval()-ed in PHP, etc.).
This, I think, is a useful heuristic: the things likely to pass your test are the ones that people who don't have them (and don't know about them) still end up approximating anyway - meaning those things are natural solutions to some common problems.
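As a concrete illustration of what those ad-hoc means were approximating, here is the real thing in Ruby (a minimal sketch; `make_counter` is a made-up name): the lambda captures its defining environment, not a copy of it.

```ruby
def make_counter
  count = 0
  -> { count += 1 }  # the lambda closes over `count` itself
end

tick = make_counter
tick.call  # => 1
tick.call  # => 2  (state survives between calls; no object or global needed)
```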
I can think of a couple other things that pass your test:
- Functions in general. It's the basic organizational primitive in code; working without them is Not Fun.
- Lisp-style macros. There are many problems that would be best solved with some surgical code generation, and having that option built-in into the language makes all the difference. Most languages don't have this type of macros - but that doesn't mean they aren't needed. Having done enough Lisp macrology, I saw that in those other languages I've always been coping. Missing them without knowing what they are.
Hell, look no further than webdev - these days, major frameworks like React, and every other minor library, and even the language evolution itself, all depend on running an external macro processor / code generation tool as part of your build pipeline.
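Ruby is a good example of that coping: it has no Lisp-style AST macros, but runtime metaprogramming with `define_method` covers a slice of the same ground. A hedged sketch (the `Config` class and its attribute names are invented):

```ruby
class Config
  # Generate a getter/setter pair for each attribute name. This is
  # code writing code at class-definition time - not an AST macro,
  # but the usual approximation in macro-less languages.
  %w[host port user].each do |name|
    define_method(name) { instance_variable_get("@#{name}") }
    define_method("#{name}=") { |value| instance_variable_set("@#{name}", value) }
  end
end

c = Config.new
c.host = "db1"
c.host  # => "db1"
```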
A trivial but real one for me is being able to use non-alphanum chars freely in var and fn names. Being able to name a fn 'string->int' or especially something like 'valid?' seems very small but I really miss it in languages with more restrictions on names.
Although less encompassing than what you're talking about, I miss being able to name functions with a "?" at the end when they are a predicate; "isValid?".
That’s what the ‘is’ is for. If you have a ‘?’ character, you don’t need it. But tastes vary. I don’t even like the ‘is’.
1. if isValid()
2. if valid?()
3. if isValid?()
4. if valid()
Number 4 is nicest to my eyes. But I guess if the ‘?’ or ‘is’ (or both) is a promise to the user that the function is a true predicate, then I can see its utility.
'?' / 'is' get more useful with more complex predicate names than just 'valid', and they also help with certain corner cases in English. For example, what does the following code do:
if(widget.free()) { ... }
Does it 1) check if the widget is "free" (whatever that means in the widget domain), or 2) frees the widget and checks the outcome of that operation?
If I saw something like this while reading code, I'd pause and carefully check what exactly is going on here.
In fact, I was going to write "Option 2) resembles resource management patterns, for example memory management in C", but then I checked and noticed that free() in C does not return a value, so this pattern would not exist with malloc()/free() - in other words, despite doing a bit of C and a lot of C++ in the past two decades, I still tripped over this.
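Ruby's method-naming sigils resolve exactly this ambiguity by convention. A sketch with a hypothetical `Widget`: `?` marks a pure predicate, while `!` marks the "dangerous" (here, mutating) variant.

```ruby
class Widget
  def initialize
    @in_use = true
  end

  def free?   # `?`: a predicate; just answers the question
    !@in_use
  end

  def free!   # `!`: actually performs the operation
    @in_use = false
    self
  end
end

w = Widget.new
w.free?  # => false: merely asks whether the widget is free
w.free!  #           actually frees it
w.free?  # => true
```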
On that note, I'd love some kind of sigil for "asserting" functions - which are similar to checks, but instead of returning true/false, they ensure the argument is in the state described by the function name, or else they throw an exception. It's a pattern I've been using in exception-ful code to cut down on noise. For example:
    // Check if connected; if not, maybe run reconnection
    // logic or attempt some other form of recovery.
    // The only way control proceeds past this line is if
    // the session is connected; if it isn't and can't be,
    // an exception is thrown.
    EnsureConnected(session);

    if (IsSomething(session, arg1)) {
        // ... some code requiring a connected session
    }
    // ... more code requiring a connected session
It's not a big deal, but in some cases, that "Ensure" or "Assert" look weird, and I don't like inventing more synonyms for the same pattern.
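In Ruby the `!` suffix can carry this "asserting" role, which sidesteps the Ensure/Assert synonym problem. A sketch (the `Session` class and the error name are invented):

```ruby
class NotConnectedError < StandardError; end

class Session
  def initialize(connected)
    @connected = connected
  end

  def connected?
    @connected
  end

  # Either returns the session or raises, so any code after the
  # call may assume a live connection.
  def connected!
    raise NotConnectedError, "session is not connected" unless connected?
    self
  end
end

Session.new(true).connected!    # passes through, returns the session
# Session.new(false).connected! # would raise NotConnectedError
```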
> Does it 1) check if the widget is "free" (whatever that means in the widget domain), or 2) frees the widget and checks the outcome of that operation?
Just program in Esperanto! So 1) would be "umo.libera()" and 2) be "umo.liberigu()".
I can't believe I still remember the grammar after what, 10 years of complete disuse?
Thanks for putting into words something that was on the edge of my mind, but never quite graspable.
Two more examples (for me?) of features that I find you really miss in a language even if you’re fluent in the local idioms: First-class functions and pattern matching.
Passing functions as values is so nice and afaik most modern languages have that feature nowadays. But I remember when it used to blow people’s minds.
Pattern matching is something I’ve missed ever since having it in Haskell. Such an elegant solution to a problem that you have just often enough that the typical native approach feels clunky.
Dart 3 got very nice pattern matching [1]. And the next version of Java might introduce it (but likely it will still be behind a "preview" flag) as well.
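Ruby 3 is another example of the feature landing in a mainstream language; a sketch of its `case`/`in` form (the event shapes here are made up):

```ruby
def describe(event)
  case event
  in { type: "click", x: Integer => x, y: Integer => y }
    "click at #{x},#{y}"
  in { type: "key", key: String => k }
    "key #{k}"
  in [first, *rest]          # array patterns destructure too
    "list starting with #{first} (#{rest.size} more)"
  else
    "unknown"
  end
end

describe({ type: "click", x: 3, y: 4 })  # => "click at 3,4"
describe([1, 2, 3])                      # => "list starting with 1 (2 more)"
```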
Perl's regex handling. That's the one I miss in whatever other language I program in.
To be able to match AND get the matched substrings out in one line, makes this really succinct.
    if ( my ($hour, $min, $sec) = $someTimeString =~ /(\d\d):(\d\d):(\d\d)/ ) {
        if ($hour >= 12) {
            if ($min == 42) {
                doSomething();
            }
        }
    }
Whereas in Python or Ruby it's only a tiny extra step, but there's that semantic distance in one's head; parsing out and naming the parts of a regex on the same line is a convenience that, once you get used to it, you really miss.
    m = /(\d\d):(\d\d):(\d\d)/.match( someTimeString )
    hour = m[1]   # m[0] is the entire matched string, not the first group
    min  = m[2]
    sec  = m[3]
    if hour >= ...
But Ruby, Python, etc. can just extract the list too; it's just an extra method call on the same object (and perhaps there's destructuring too in modern Ruby/Python).
Matching & getting in most languages typically devolves into matching a regexp to a string, getting a MatchResult object, and then getting/iterating/checking/... on it.
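Ruby does have a close cousin of the Perl idiom, though it's little known: when a regexp *literal* with named captures is the left operand of `=~`, a successful match binds matching local variables in the enclosing scope. A sketch:

```ruby
time = "12:42:07"

# hour/min/sec become locals here: match and extract in one line.
if /(?<hour>\d\d):(?<min>\d\d):(?<sec>\d\d)/ =~ time
  puts "it's #{hour} o'clock" if min.to_i == 42
end
```

Note this only works with a literal regexp on the left; a regexp stored in a variable goes through the usual MatchData route.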
I switched from Perl to Python around 12 years ago. I do think the sigils make code a little faster to comprehend. A bare word in Python can be anything, whereas a variable will always start with a '$', '@', or '%'.
It's not a huge win, but I do think it's better than nothing.
As for missing things, I do miss Perl a lot. I missed the curly braces when I first started, and whitespace didn't feel right. Then, after maybe 6 months, I had to go back and do some Perl. I moved some blocks of code around, then got the dreaded missing-brace problem. I realized that was something I never got in Python, and I've been a fan of whitespace since then.
I like this approach - We use python for our backend and the things I miss the most from other languages are the protected/public/private keywords (Java), single-line `if condition: return` statements (js, ruby, etc.), and npm/yarn/package.json (js). I miss types too but it feels unfair to complain about that with Python.
MyPy isn't quite as nice as typescript, and it also has some trouble with Django. I know it CAN work, but does any language have a worse type system? Maybe Ruby.
For single-line conditionals - looks like you're right, from what I can tell. I mistakenly assumed the pep8 errors were actually interpreter errors. Thank you!
I believe that the reason you're having issues with Django is that Django lacks types on its external interfaces. This isn't an issue with MyPy. You're conflating language-level issues with code-level issues. Python's type hints are quite powerful, assuming you're using stuff that provides them. At this point, it has a better type system than Go (if only because it has sum types and option types), but it's opt-in.
> modern IDEs and editors give us all the type information we could want, and these tools made sigils obsolete.
I like to rate a programming language by how dependent the language is on some bloated IDE ("editor"). If I need an Eclipse or a Pycharm just to edit a file, something has gone wrong syntactically and systemically.
Sigils are semantic information about the code. Sigils do not reduce readability, they increase expressivity and comprehensibility. It isn't the characters themselves that are the problem -- we see the same notations for different purposes entering Python and DSLs such as Pandas.
sed and Perl scripts are very often extremely hard to decipher.
I've used both extensively, a long time ago when my workstation had a Sun Microsystems logo on it (yeah, I am that old), and I remember having problems reading my own scripts a few months later.
I agree with this and in most languages I don’t miss sigils but one language I do wish supported them is plpgsql.
The reason is that column names and function arguments overlap a lot, which can cause ambiguities when performing updates or selects. To become productive at plpgsql it’s a problem that you have to solve.
There are several approaches but the one I settled on is just to prefix all formal parameters with underscores.
The wish I have with plpgsql is that I could use $ instead since underscore is already heavily used as a word separator.
> Despite my decades of dynamic typed languages, I hate going back to dynamic languages anymore. YMMV.
Mine does vary - while static typing is helpful, it still (even with more advanced type systems) leads to boilerplate code that I dislike writing. In a compiler written in OCaml that I worked on for a bit, there were hundreds of lines of code dedicated to just stringifying variants. It could have been generated by a syntax transform (the newer tools for this are actually quite good), but that's another dependency and another cognitive overhead. In Kotlin, lack of structural types means that the rabid "clean architecture" fans create 3 classes for each piece of data, with the same 10 fields (names and types), and methods to convert between those classes - it requires 10x as much code for very little gain. Lack of refinement types makes the type systems mostly unable to encode anything relating to the number values, other than min/max values for a given type. There's reflection in Kotlin (not in OCaml though) that you can use, but then we're back to everything being an Object/Any and having runtime downcasts everywhere.
I think gradual type systems are a good compromise, for now at least. I'd prefer Typed Racket approach of clearly delineating typed and untyped code while generating dynamic contracts based on static types when a value crosses the boundary. Unfortunately, that's not going to work for existing languages, so the next best thing is something like TypeScript or mypy.
Of course, convenient, hygienic, Turing-complete-not-by-accident compile-time execution and macros would, to some extent, alleviate the problems a simplistic type system causes. Good examples are Haxe, Nim, Rust, Scala 3, etc. Without such features, though, I'm not willing to part with the runtime reflection and metaprogramming facilities provided by dynamic languages - the alternative is a lot more lines of code that need to be written (or generated), and I don't like that.
---
More to the topic: logic variables. The `amb` operator from Scheme, for example, or what Mozart/Oz has, or Logtalk, or Prolog of course. They're a powerful, incredibly succinct way of doing constraint solving without writing a solver (just state the problem declaratively and you're done - as close to magic as it gets). No popular language offers an internal logic DSL, although there are some external DSLs out there.
Also, coroutines. No more manual trampolining, no need for nested callbacks, the state of execution can be saved and resumed later mostly transparently. Lua has them built-in, Kotlin implements CPS transform in the compiler. Nowadays almost all popular languages provide them, mostly exposed as async/await primitives. Scheme and Smalltalk can implement them natively inside the language and did so for ages; it's nice to see mainstream languages catch up.
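Ruby's built-in `Fiber` is a compact illustration of what "retaining the state of execution" buys; a sketch of a Fibonacci generator:

```ruby
# The fiber's locals survive across resumes; no manual state struct.
fib = Fiber.new do
  a, b = 0, 1
  loop do
    Fiber.yield a    # suspend here, hand a value back...
    a, b = b, a + b  # ...and continue from this point on the next resume
  end
end

first_five = 5.times.map { fib.resume }  # => [0, 1, 1, 2, 3]
```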
REPLs. Not a language feature per se, but an implementation decision that has a lot of impact on productivity. It's relatively commonplace now - even Java has jshell - but most of the REPLs are pretty bad at executing "in context" of a project or module. Racket, Clojure, Common Lisp, Erlang, Elixir are gold standards, still unmatched, but you can get pretty far with Jupyter Notebooks.
Destructuring/pattern matching. It was carefully added in some simplified cases (mostly simply destructuring sequences) in many languages, then the support for wildcard and splicing was added, then support for hashes/dicts was added, and now finally Python has a proper `match` statement. I think more languages will implement it in the near future.
Some sort of coroutine solution is definitely on my list too. I'm actually not too passionate about which one it is, except for a distaste for async/await on the grounds the compiler ought to be able to do it for me. But generators, threads or actors cheap enough to use freely, coroutines, something that allows me to break out of the strictly hierarchical structured programming system and retain some degree of state within a function when I need to. It's possible to hack something together in a language lacking this, by moving all function state into a struct/object but all the manual scaffolding is painful and error prone.
So it's not a type; it's more of a language implementation detail that doesn't add value for the programmer.
Whereas with the Hindley–Milner type system, the types exist and add value without requiring all of the extra code. You can infer the type most of the time, and you get a nice strong type check at compile time.
Sigils seem like a step in the opposite direction from strong inferred types. You have to add a little bit of boilerplate, but it's not that strict, so its meaning can still be confusing.
Perl is the most confusing language I’ve ever used professionally, largely because of how it uses sigils sometimes as operators. The other source of confusion are the million and a half implicit variables that are context-specific. It’s the only language where even after ten years I still regularly have to google to do simple things like iterate over a hash. And half the time it doesn’t work because the hash is actually a ref so now you need an arcane syntax or it doesn’t work. In Perl all of the implementation details of the language become footguns for you to shoot yourself with.
> And half the time it doesn’t work because the hash is actually a ref
Perl still supports both "copy by value" and "copy by reference". For example:
my @copy = @original;
my %hash_copy = %orig_hash;
my @processed = foo_func(@copy);
my $ref = \@copy;
bar_func($ref); # modifies @copy in place
That's why it uses these sigils to distinguish between references, which are scalar values ("$"), and actual lists/dictionaries ("@" and "%"). Since Perl is also dynamically typed, if it weren't for the sigils it would be quite confusing to read "array[i] =" somewhere, not knowing whether it modifies an array created remotely or locally. Sigils communicate that, because it reads "$array[$i] =" for a local array or "$$array[$i] =" for a remote one.
In other languages like JavaScript or Python everything is basically a reference and, hence, you don’t quite need sigils there. However, on the flip side, you need to be more careful and constantly remind yourself of the fact you are dealing with references and not to accidentally modify the objects you get passed into your function.
> not knowing whether it modifies an array created remotely or locally
Yes, good language design allows the programmer to ignore details such as how an object was created or where it is stored.
Having to remember something like that violates so many design principles. Why would a programmer using the array need to know how it was created? It just adds unnecessary complexity to the code, making the programmer's job harder than it needs to be. It's accidental complexity ossified in the programming language.
> Having to remember something like that violates so many design principles.
While I am inclined to agree, this is the case for many languages; Python and JS, for example, IIRC. When you receive a dictionary or an object as a function argument, you have no write protection if you poke around inside of it. Perl makes it at least a bit more obvious.
In C++ you have constant reference signatures if you happen to use them but the syntax is also not exactly pretty.
In C you may only pass pointers to an array, which even happens implicitly. No write protection either.
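Ruby behaves the same way as Python and JS here, and its opt-in protection is `freeze`. A small sketch (the function and key names are invented):

```ruby
def sneaky_update(options)
  options[:retries] = 99   # mutates the caller's hash in place
end

config = { retries: 3 }
sneaky_update(config)
config[:retries]           # => 99; the callee reached back into our data

frozen = { retries: 3 }.freeze
# sneaky_update(frozen)    # would raise FrozenError
```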
That is fairly simple, and not a good example of something that is difficult in Perl. If you cannot keep that straight, I suspect the extent to which you really use Perl is limited.
Perl is hardly alone in the million and a half implicit variables. Python is a far worse offender on this point and probably equally so in the context specific meaning of a statement (is this a generator or a list). My guess is this gripe is a familiarity thing. Iterating a hash in perl isn't that complicated, neither is iterating a hashref but you do need to be aware of which it is - which you should be anyway so you know whether you're modifying it locally or for the caller.
That said, I'm also in the camp of "I don't care much for sigils."
Not to mention that this refactoring introduces a bug:
    $length = qw(burgers fries shakes);   # $length is "shakes", not 3
Because lists and arrays convert to scalars differently. A list is more of a syntactic construct and an array is a data structure. Confused the heck out of me my first few weeks of Perl.
> So it's not a type its more of a language implementation detail
Well, it's used to distinguish between values that were passed into a function via "copy by value" or "copy by reference". In that sense, sigils have nothing to do with types but rather with argument-passing semantics. Yes, with a good type system you can also communicate whether an array is passed by reference in the type signature of a function; then you wouldn't need sigils. But that's a secondary feature of static types.
However, you could also abolish "copy by value" altogether (as JS and Python do) and then you would need neither sigils nor types.
> Where as with the Hindley–Milner type system the types exist
Perl is a dynamically typed language. If it were using static types, it wouldn’t quite need sigils, yes, but then it also wouldn’t have been the Perl programming language…
Sigils do not really have anything to do with static types and shouldn’t be discussed in this context. Then they are also not as confusing.
BTW: This whole discussion of static vs. dynamic typing has become a bit tiresome over the years. It will never be settled. In the 90s everybody tended to hate static typing, and for good reason: it makes generic programming quite a bit more complicated than necessary. This was when all these dynamically typed scripting languages were invented, to enable programmers to write abstract code more easily by lifting the burden of constantly inventing composite types. "If it walks like a duck…"
Of course, everything is a trade off and with that approach you are prone to get more run time errors and, hence, people started test-driven development. Then developers started to hate writing tests (also for good reasons) and re-discovered static typing. Now people seem to be happy hacking ad-hoc types together — until they have to refactor other people’s programs and find it rather tricky because of the contagiousness of type signatures. When they spent enough weeks to rewrite type signatures in half of the program base they will long for the good ol’ scripting languages of the 90s again. And the cycle begins again.
The problem here isn't that Perl is a dynamically typed language, but that it uses pass-by-value semantics to an almost absurd degree. This is why it needs explicit references in the first place, whereas most other languages either use references for non-primitive types implicitly (Ruby, Python, anything on the JVM, &c.) or treat values and references uniformly (Go, for instance).
I was attempting to contrast the usefulness of sigils vs. static typing. While they're different things, they do serve similar purposes: they restrict a thing so one can reason about it. At a meta level they do similar things.
The reason I don't like Perl's sigils is that my head is not good at inferring meaning from something like a $ or % or \%. I prefer the way it's done in other languages, where you have to call a copy or deep-copy function. Not everyone would agree with that preference.
I think dynamic languages have their place. I'm personally more comfortable with the training wheels on in a statically typed language. But there are times where using a statically typed language is going to arrive at an overengineered solution. There are developers that can do amazing things in dynamic languages because they're good at catching their own errors and can leverage the flexibility of dynamic typing to their benefit.
> It makes generic programming quite a bit more complicated than necessary. This was when all these dynamically typed scripting languages were invented, to enable programmers to write abstract code more easily by lifting the burden of constantly inventing composite types. "If it walks like a duck…"
Perl 5 got sigils wrong: more complex usages hinted at incorrect human parsing of expressions. That design borrowed from natural language, so we should probably throw out the early part of the post. Now, the post says that in Raku:
@ says “Use me with an array-like interface”
% says “Use me with a hash-like interface”
& says “Use me with a function-like interface”
$ says “I won’t tell you what interface you can use, but treat me as a single item”
I don't use Raku nor used much of Perl5 (only enough to learn it's good for writing, not for reading). Sigils in Raku may be fine and better than not using them. I'll accept that.
However, I much prefer inferred static typing and referential transparency where everything produces a value and it's not material whether it's a precomputed value or something that will produce the value when 'pulled on'. The last part works well with pure functions and lazy evaluation. Until someone claiming benefits of sigils has used this alternative for large, long-lived code written and maintained by many, I'll leave sigils to Raku alone.
I think there's a piece of insight missing from the author's analysis of non-programmatic sigils. To wit, the sigils are only valuable when both parties deeply understand the information that the sigil is trying to convey. The "$framework at $dayjob" example illustrates this point. Programmers familiar with the use of sigils to indicate variables intrinsically grok this phrase, but it looks like gobbledygook to non-programmers. The email inbox example is similar. (I'd argue the hashtag/@-symbol example is a bit more complicated, because those symbols service important UX functions.)
I think this insight crystalizes the trade-off. I agree with the author that sigils are a powerful way of communicating useful information in a concise fashion. But does their inscrutability to non-expert users justify their existence? I'd argue it usually doesn't. Whenever I've had to pick up a language that uses a lot of sigils (or even just had to read source code in one of those languages if I don't use it daily), I always find the sigils require a bit of extra mental effort to process. It seems like other languages manage to express meaning in a way that is less burdensome to non-experts.
> because those symbols service important UX functions
As I read the post, I was thinking that #tags and @mentions are primarily about input, not reading. It's easier to just whack some #random #tags in your #sentences than to switch to a separate tag list input. Similarly, highlighting some text in order to apply the "mention" brush like we might with bold or italics would be strictly worse.
I might agree with you if it weren't for the fact that some of the most popular beginner languages use sigils:
- BASIC
- Shell scripting
- PHP
It’s also worth noting that all languages have special tokens to identify properties of the code. Eg why does a string need to be wrapped in quotation marks but integers do not? Why do single and double quotation marks behave differently in some languages? Why do function names behave differently if you pass () vs not including parentheses in some languages?
At the end of the day, if you want to learn to program then you are always going to have some degree of syntax that you just have to learn. Sigils aren't inherently hard, but some languages make them more abstract than others.
Another thing that's worth bearing in mind is that sigils solve a problem in languages that make heavy use of barewords, such as shells. Eg how do you know if foobar is a variable, function, keyword, parameter, etc. if your syntax is
echo foobar
This is why other languages then use quotation marks, parentheses, etc. But while that’s arguably more readable, it’s a pain in the arse for REPL work in a shell (I know because I’ve tried it).
> I might agree with you if it weren't for the fact that some of the most popular beginner languages use sigils
20 years ago I might've agreed with you. But I do not think that PHP, BASIC and shell scripting are popular beginner languages in 2023.
> It’s also worth noting that all languages have special tokens to identify properties of the code. Eg why does a string need to be wrapped in quotation marks but integers do not?
Quotation marks and especially parentheses after function calls don't fit TFA's definition of a sigil because they aren't at the beginning of the word and (arguably only in the latter case) don't communicate meta-information about the word.
> At the end of the day, if you want to learn to program then you are always going to have some degree of syntax that you just have to learn.
I'll agree with you that the line between sigils and general syntax/punctuation is a bit of a blurry one - where do you stop? Using my definition above, I think wrapping strings in quotation marks is a clear win because it fits our widely-held shared understanding that quotation marks demarcate and group a sequence of words. Single and double quotes behaving differently is unintuitive for the same reason while not conferring a corresponding benefit on experts.
> 20 years ago I might've agreed with you. But I do not think that PHP, BASIC and shell scripting are popular beginner languages in 2023.
PHP and shell scripting are still massively used in 2023 (eg https://madnight.github.io/githut/#/pull_requests/2023/1). You have a point about BASIC but it was the de facto standard for computers at a time when people didn't have the web to quickly look up problems and thus learning to code was much harder. Yet we (in fact I) managed just fine.
> Quotation marks and especially parentheses after function calls don't fit TFA's definition of a sigil because they aren't at the beginning of the word and (arguably only in the latter case) don't communicate meta-information about the word.
I didn't say they are sigils. I said they're tokens. My point was that removing sigils doesn't remove meta-information encoded in magic characters:
- You have `foobar()` where the parentheses denote "call the function rather than pass a function reference"
- "" == string which allows escaping and/or infixing vs '' which doesn't (other languages have different tokens for denoting string literals, like `` in Go)
- # in C and C++ introduces a preprocessor directive (eg a macro)
- // is a line comment in some languages. Others use #, or --
- Some languages use any of the following for multi-line comments: ```, /* */, and even {}. Whereas {} is an execution block in some other languages
My point is you have to learn what all of these tokens mean regardless of whether they sit as a prefix or not. The fact that they're a sigil doesn't change anything.
The real complaint people are making here is about specific languages, like Perl, overloading sigils to do magical things. That is a valid complaint but, in my opinion, it's a complaint against overloading tokens rather than sigils specifically. Much like a complaint about operator overloading doesn't lead to the natural conclusion that all operators are bad.
> don't communicate meta-information about the word.
We need to be careful about our assumptions about whether a token effectively communicates meta-information, because while I do agree that some tokens are more intuitive than others, there is also a hell of a lot of learned behaviour involved as well. And it's really hard to separate what is easier to understand from what we've just gotten so used to that we no longer give it a second thought.
This is a massive problem whenever topics about code readability come up :)
> I'll agree with you that the line between sigils and general syntax/punctuation is a bit of a blurry one - where do you stop?
shrugs...somewhere...? You can't really say there should be a hard line that a language designer shouldn't cross because it really depends on the purpose of that language. For example the language I'm currently working on makes heavy use of sigils but it also makes heavy use of barewords because its primary use is in interactive shells. So stricter C-like strings and function parentheses would be painful in a write-many, read-once environment (and I know this because that was my original language design -- and I hated using the shell with those constraints).
In a REPL environment with heavy use of barewords, sigils add a lot to the readability of the code (which is why Perl originally adopted sigils, and why AWK, Bash, PowerShell, etc all use them).
However in lower level languages, those tokens can add noise. So they're generally only used to differentiate between passing values vs references.
But this is a decision each language needs to make on a case by case basis and for each sigil.
There also needs to be care not to overload sigils (like Perl does) because that can get super confusing super quick. If you cannot describe a sigil in one sentence, then it is probably worth reconsidering whether that sigil is adding more noise than legibility.
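Bash itself is a handy illustration of that one-sentence test: the # character alone means a comment, length-of-string, or length-of-array depending on where it sits (this is bash-specific syntax, not POSIX sh):

```shell
arr=(one two three)
str="hello"
echo "${arr[0]}"   # ordinary expansion: one
echo "${#str}"     # '#' as length-of-string: 5
echo "${#arr[@]}"  # same '#' as length-of-array: 3
```

None of those three uses of # can be summarised in a single shared sentence, which is arguably where the noise starts outweighing the legibility.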
> Using my definition above, I think wrapping strings in quotation marks is a clear win because it fits our widely-held shared understanding that quotation marks demarcate and group a sequence of words. Single and double quotes behaving differently is unintuitive for the same reason while not conferring a corresponding benefit on experts.
Here lies the next problem for programming languages. For them to be useful, they need to be flexible. And as languages grow in age, experts in those languages keep asking for more and more features. Python is a great example of this:
- ''
- ""
- ''' '''
- """ """
- f""
...and lots of Python developers cannot even agree on when to use single and double quotes!
I tried to keep quoting simple in my own language but I ended up with three different ways to quote:
- '' (string literals)
- "" (strings with support for escaping and infixing)
- %() (string nesting, for when you need a string within a string within a string. It doesn't come up often but is useful for dynamic code. A contrived example might look like `tmux -c %(sh -c %(echo %(hello world)))` -- there are certainly better ways you could write that specific code, but you get the kind of edge case I'm hinting at)
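For contrast, the conventional shell version of that kind of nesting needs a different quoting style, and then backslash escaping, at every level -- which is exactly the pain a nestable quote like %() avoids:

```shell
sh -c 'echo hello world'                    # one level: fine
sh -c "sh -c 'echo hello world'"            # two levels: alternate quote styles
sh -c "sh -c 'sh -c \"echo hello world\"'"  # three levels: escaping begins
```

All three commands print "hello world", but each added level forces the writer to re-plan the whole quoting scheme.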
As much as languages do need to be easy to learn, they shouldn't sacrifice usability in the process. So it is a constant balancing act trying to make something easy to learn, yet also powerful enough to actually have a practical use. Not to mention the constant push and pull over verbosity, where some claim fewer characters (eg `fn` as a function keyword) improve readability because they declutter the screen of boilerplate, while others say terms like `function` are more readable because they are closer to executable pseudo-code. Ultimately you cannot please all of the people all of the time.
>Programmers familiar with the use of sigils to indicate variables intrinsically grok this phrase, but it looks like gobbledygook to non-programmers.
While I understand where you're coming from, I'd argue that programming-related concepts are all "gobbledygook to non-programmers", that's to be expected. Having something like (this is close to valid Raku but it's not)
Positional[Any] ages = [42, 38, 25];
doesn't make it any easier than
my @ages = [42, 38, 25];
unless you already have prior knowledge of arrays, assignments, types, etc.
> We had that problem at $day_job: our code was a mess, but everyone thought $framework would magically fix it.
Replace with "a" or "that one" and it's the same or even better.
The reality is that $dayjob (or %dayjob% etc) signals "hey I'm a programmer" and is less to type.
> I need to be able to quickly distinguish between the two types of labels so that I can notice emails that don’t have exactly one folder-label. This is a job for sigils.
Prefixing email labels is a workaround in this case because you don't need to notice emails that belong to two folders if your filtering system is good.
> As you’ve probably noticed, I’ve been saying “has an array-like interface” instead of “is an Array ”. That’s because – and this is a crucial distinction – Raku’s sigils do not encode type information. They only tell you what interface you can use.
> ...
> And Buf s aren’t Array s, nor are they any subtype of Array . But they can be used with an array-like interface, so they can be stored in an @ -siglied variable.
Sounds like a missed opportunity to type buffers as byte arrays...
We are past the point where sigils are useful for a modern compiler or machine interpreter. A limited ability to avoid collisions with "reserved words" ... would be more of a mis-feature than a feature.
The value of a sigil is for the other humans reading the code. And, I agree it can be quite valuable there.
> A limited ability to avoid collisions with "reserved words" ... would be more of a mis-feature than a feature.
While I don't necessarily disagree with you, there's something I'd like to mention.
C# allows using @ before a variable name which happens to be the same as a keyword to escape the name.
I don't remember the specifics, but it has saved me once when I wanted to create a class shaped exactly like a JSON object I was loading from an external source and one of the fields was a reserved word. For example, this works fine:
    // Requires Newtonsoft.Json (Json.NET) for JObject
    class MyClass
    {
        public int value;
        public string @default;
        public string @switch;
    }

    string json = "{\"value\": 8, \"default\": \"something\", \"switch\": \"on\"}";
    MyClass o = JObject.Parse(json).ToObject<MyClass>();
    Console.WriteLine("default={0}, switch={1}", o.@default, o.@switch);
I wish that more programming languages would include detailed rationales and justifications of their features like this in their documentation. While this blog post doesn't have the rigor of something like the Ada Rationale[1], it gives a real sense of why something might have been designed the way it was.
Maybe I just enjoy reading this sort of thing. But we are so often told to choose the "best tool", or find ourselves evaluating programming languages for all sorts of reasons. Being able to understand a language in its context, the choices it made in comparison to the alternatives, can only help with that.
When I read the rationale for a new early-stage language Austral[2], I found myself actually able to evaluate what I thought the language might be good for, how it might evolve while under the control of the same creator, whether it fit with my personal strengths, weaknesses, methods and aesthetics, etc.
By contrast, I was perusing the documentation for Odin, another new language, at around the same time. While I don't want to disparage the incredible amount of work it took to a) build a programming language and ecosystem and b) document it, I found myself wishing for a similar "rationale" document so I could actually compare Odin with Austral at a more abstract level than reading syntax.
Odin's "overview" begins with an example of the lexical syntax of comments, rather than what the principle "Striving for orthogonality"[3] in its FAQ actually means and how it is borne out in the language as designed, compared to other approaches that could have been taken.
Austral, by contrast, has a great section called "the cutting room floor" where the creator discusses different approaches to resource safety, the difficult tradeoffs to be made, and why Austral decided on possibly the severest of all the approaches. This isn't just philosophy; it tells me something useful about the tradeoffs involved in using the language.
Anyway, the OP helped me to understand very clearly that Raku's priorities and values are extremely different to my own, and that it would likely be a bad choice for me to invest time in.
Thanks for the Austral spec link! Programming languages come and go but I find it deeply interesting when language designers lay out the rationale behind possibly outlandish decisions in their programming languages. Case in point, Larry Wall and Perl-y languages [0]
>Sigils are why I gave up learning PERL. Everything I learned had a half life of 10 minutes.
Interesting! By "half life of 10 minutes", do you mean the language was changing too quickly under you or that it was difficult to remember the sigils?
I really like sigils for several reasons, and I do miss them when in non-sigil languages - imo raku made two big improvements over perl: avoiding changing the sigil when accessing an item, and having an unsigilled option for people who don't like them https://rakujourney.wordpress.com/2022/12/24/on-sigils/
Avoiding sigils is a good future proofing strategy. When you find out you need a new syntax it's nice to have a set of symbols that are guaranteed not to break existing code
The way you worded this reminded me about some language extensions for the Commodore 64 and 128 that hooked into the BASIC language tokenizer. I remember one published in RUN magazine, I think for additional graphics functions, that added new BASIC instructions that were all prefixed with @. Not only did this prefix invoke the new code while interpreting, it also namespaced the addition avoiding conflicts with existing BASIC code.
In Clojure (and I’m pretty sure Scheme), ! and ? are not sigils in the sense that they’re special syntax like, say, @ (which is a reader macro), or @ and @@ in Ruby: they just tend to be used by convention. ! and ? are no different than a or b.
They somewhat follow the OP's definition of sigil (in the yellow box), which doesn't say that sigils have to be interpreted in a special way, only that they convey some meaning to the programmer (although it's a bit unclear - communicate to whom, the programmer or the interpreter?). They are also not at the beginning of the word, but that seems to be a rather arbitrary requirement.
I perceive FORTH in a different way. The only sigil in FORTH is SPACE. Everything else is a letter.
Then you have Bjarne Stroustrup's "Generalized Overloading for C++2000", with which you can overload whitespace, uniquely redefining the meaning of space, newline, tab, and even the absence of space.
Personally, I have the exact opposite thoughts on sigils: they break my fluency in reading code.
Not all sigils are as bad, but to me it's as if there's a word from a foreign language italicized in an English sentence -- that means I'll have a reading pause there.
The dollar sign is among the worst, and curly braces are the lightest, with "@" being in the middle in terms of pausing reading.
Well that's a whole lot of talk with 0 concrete code examples...
One interesting thing: the author mentions VS Code as abolishing the need for Hungarian notation, which is funny - the entire VS Code codebase is written in Hungarian notation.
it’s always been a naughty pleasure that Matrix uses sigils both for ‘mainstream’ user IDs (@user:domain.com) and room aliases (#room:domain.com) as well as more developer focused things - $ for event IDs, ! for room IDs: https://spec.matrix.org/v1.7/appendices/#common-identifier-f...
I have a bad feeling this is due to me doing too much Perl as a child. But folks don’t seem to complain about it, especially since we also sprouted a proper URI scheme too.
> ...that communicates meta-information about the word.
He gives the example of `echo $USER`, where `$` is a sigil that communicates that `USER` is a variable, presumably with some contents. Thus, I'd wager `$` is a sigil in `$foo`.
I suppose "word" is the constraining factor there, I was thinking of > and # as sigils too, which--if you're willing to be a bit loose with what a "word" is--contradicts that they're unpopular.