If only more languages supported quote operators like Perl. They really do make certain things like this so much easier.
Non-interpolating quotes:
q/I wasn't surprised when he said "boo" and gave me $5/;
q@I wasn't surprised when he said "boo" and gave me $5@;
q!I wasn't surprised when he said "boo" and gave me $5!;
q(I wasn't surprised when he said "boo" and gave me $5);
q[I wasn't surprised when he said "boo" and gave me $5];
I think it was in ed already. The POSIX ed spec says: "Any character other than <space> or <newline> can be used instead of a slash to delimit the RE and the replacement."
R"/(I wasn't surprised when he said "boo" and gave me $5)/"
R"@(I wasn't surprised when he said "boo" and gave me $5)@"
R"|(I wasn't surprised when he said "boo" and gave me $5)|"
I also like how Raku has extended this with the Q operator, allowing one to specify what should be interpolated (scalars, escape sequences, closures, etc) also with arbitrary delimiters --
Q/I got $5 \sigh / # literal
Q:s/Hello $name/ # just scalars
Q:b/hello world\n/ # just backslash escape sequences
It does look nice, and I kind of miss the feature in other programming languages. I also maintained cperl-mode for many years, and honestly the editor support for this functionality never quite worked right. It is hard to parse, and even harder when you are doing a heuristic-based incremental parse. (The challenge with text editors is that you have to provide correct syntax highlighting even when the document contains parse errors that the compiler would have bailed out on.)
Because editor tooling is important to me, I'm willing to see an ugly "\"quoted\" string" from time to time, if it means that I get better tools as a result.
I do generally prefer Perl's q and qq, though Python's triple-quote and its raw form are often good enough. Python also inherited C's (mis-)feature of concatenating consecutive string literals automatically, which comes in handy in some cases.
['but', 'boy', 'am', 'I', 'glad', 'I', 'don\'t', 'have', 'to', 'read', 'stuff', 'like', 'that']
qw/but boy am I glad I don't have to read stuff like that/
If splitting a string is an important optimization point in your program, neither language is a good choice.
You fail to explain why other languages do not grow these vestiges. I can only assume the implicit explanation is that you believe them to be incompetently designed, whereas I believe the exact complement: it looks like a good idea, but it isn’t.
There are many good ideas that aren't in one programming language or another.
Does that actually mean they are bad ideas? No.
It only means that particular language didn't copy that particular idea.
There may be bad ideas that are in many languages, that doesn't somehow make them good.
---
Quote-words are used regularly in Perl because they are clear and useful.
for (qw' alpha beta charlie ') {
say
}
for (split '', 'alpha beta charlie') {
say
}
for ( 'alpha', 'beta', 'charlie' ) {
say
}
The `qw` emphasizes that we are dealing with `alpha`, `beta`, and `charlie`. The other ones have extra noise that is only there to satisfy the compiler.
Now imagine you have to add `delta` to the list.
With the `qw` you only have to press the spacebar and the letters `d e l t a`. You don't have to worry about whether you accidentally left off a `'`, because you didn't need to add one.
To be fair, the `split` would have the same benefit. But it is still more error-prone. For example I wonder how many people didn't notice that the first argument to `split` was an empty string when it should have been a string with one space in it.
I didn't even notice it, and I've been programming in Perl for decades.
Such things are called "syntactic sugar". They are common in other languages. Quote: «A construct in a language is called "syntactic sugar" if it can be removed from the language without any effect on what the language can do: functionality and expressive power will remain the same.»[0]
Another quote: «Data types with core syntactic support are said to be "sugared types." Common examples include quote-delimited strings, curly braces for object and record types, and square brackets for Arrays.»
It's the idea, rather than the implementation, which I'm advocating for. For a shell, it would likely be best to trigger with a reserved character (or two) that's much less likely to be encountered in normal usage.
Finding a reserved character (or even sequence) that doesn't have collisions or unintended behavior beyond what the semi-reserved ' and " can do will be hard (but not necessarily impossible :)! ).
In addition to what kbenson said, word characters [a-zA-Z0-9] usually aren't eligible as quotes anyway, otherwise how would qq'this has a q in it' parse (versus q/'this has a /).
Adding an additional :p modifier prevents the complaint about command not found:
$ # This string 'has single' "and double" quotes and a $
$ !:q
'# This string '\''has single'\'' "and double" quotes and a $'
# This string 'has single' "and double" quotes and a $: command not found
$
$ # This string 'has single' "and double" quotes and a $
$ !:q:p
'# This string '\''has single'\'' "and double" quotes and a $'
$
Bash treats the string "/dev/stdin" as magic, so that works even if /dev/stdin doesn't exist. However, bash (unlike ksh) spawns a subshell for $(</dev/stdin), so using cat is actually lighter. (Also, it's not clear to me why $(<&0) doesn't work in bash.)
For me, this escapes spaces with backslashes; the example in the article escapes them by quoting the whole string. Is this a difference in bash versions, or do these uses differ somehow?
(I replaced the single quotes with double so i could pass this to bash -c, but i get the same result if i use your code verbatim as a script)
It's plainly not the same, you can see that by looking at it. It encodes the same text, but it encodes it a different way.
And the difference is not negligible. With quotes, i can type or paste more into the middle, and the string is still valid. With escapes, i have to be careful to escape the added text. It's significantly more ergonomic to use quoted text.
Yes, the example in the article and the parent comments ought to be referred to as escaping, not quoting.
Most of the time, quoting with single quotes leads to something far easier on the eyes.
Here's an example function for quoting from stdin:
function bashquotesingle() {
    printf "'";
    sed "s/'/'\\\\''/g";
    printf "'";
}
If you intend to type or paste quotes, you still need some additional escaping. And on the other hand, most substrings of (non-quoted) escaped text are themselves properly escaped.
I learned recently that escaping characters in zsh/bash also works for parameter expansion:
# zsh using flags ${(flags)name}
% string="This is a string with \"\"\" and ''' and \"\" again ''. Also such stuff as & % # ;"
% echo $string
This is a string with """ and ''' and "" again ''. Also such stuff as & % # ;
% echo ${(q)string}
This\ is\ a\ string\ with\ \"\"\"\ and\ \'\'\'\ and\ \"\"\ again\ \'\'.\ Also\ such\ stuff\ as\ \&\ %\ \#\ \;
# bash using operators ${name@operator}
% string="This is a string with \"\"\" and ''' and \"\" again ''. Also such stuff as & % # ;"
% echo $string
This is a string with """ and ''' and "" again ''. Also such stuff as & % # ;
% echo ${string@Q}
'This is a string with """ and '\'''\'''\'' and "" again '\'''\''. Also such stuff as & % # ;'
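For anyone wondering whether the backslash-escaped form and the single-quoted form really encode the same text, here is a small round-trip sketch. It assumes bash 4.4 or later (for `${var@Q}`) and bash's builtin `printf %q`; the variable names are made up for illustration.

```bash
#!/usr/bin/env bash
original="It's a \"test\" with \$5 & a ; and  two spaces"

quoted_q=${original@Q}                      # single-quote style
quoted_printf=$(printf '%q' "$original")    # backslash-escape style for this input

# Both encodings are valid shell words; eval turns each back into the text.
eval "from_q=$quoted_q"
eval "from_printf=$quoted_printf"

printf '%s\n' "$quoted_q" "$quoted_printf"
[ "$from_q" = "$original" ] && [ "$from_printf" = "$original" ] && echo "round trip ok"
```

The two encodings look quite different, but both come back to the identical string, so the choice between them is about readability and editability rather than correctness.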
Wow, so easy then. I remember struggling so many times in the past when, e.g., iterating over filenames with unusual characters.
One thing I've figured is that in bash you can use $'These kinds of strings', without any variable expansion, but what you get is essentially what's present in most programming languages quote-wise. Example:
$ echo $'hey there, it\'s "double quotes", \'single quotes\', and some \\, \', ", $ chars'
hey there, it's "double quotes", 'single quotes', and some \, ', ", $ chars
Those strings also support things like \0 to get a null byte, and \uxxxx to get a Unicode character. This is useful for working with filenames and other things with spaces, quotes, and so on using find, xargs, etc. E.g.
find ... -print0 | while read -d $'\0' f; do ...; done
>Those strings also support things like \0 to get a null byte
WARNING: this is not true in bash!
You can have exactly one null byte in a bash string: the terminating null byte. Try this:
echo $'foo\0bar'
It prints "foo".
So practically you can't have null bytes in bash strings, as it will be mistaken for the terminating null of the underlying C string.
In your example read -d '' would work just the same; actually that's the idiomatic way to iterate on zero-delimited input (or xargs -0). Why does the empty string work? Because -d takes the first char of the string as the delimiter, which for empty C strings is the terminating \0 - this is how bugs become features.
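A minimal sketch of both points, assuming bash (for `read -d`, arrays, and process substitution). The sample names are invented, and `printf '%s\0'` stands in for `find ... -print0`.

```bash
#!/usr/bin/env bash
# A NUL inside $'...' ends the string early, so only "foo" survives.
truncated=$'foo\0bar'

# Collect NUL-delimited records into an array; read -d '' splits on NUL.
names=()
while IFS= read -r -d '' name; do
    names+=("$name")
done < <(printf '%s\0' 'plain.txt' 'has space.txt' "it's quoted.txt")

printf 'truncated string: <%s>\n' "$truncated"
printf 'read %d names\n' "${#names[@]}"
printf '<%s>\n' "${names[@]}"
```

`IFS=` and `-r` keep leading whitespace and backslashes in each record intact, which matters for exactly the kind of filenames this is meant to handle.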
zsh has quote-line by default bound to alt-' which will escape your current command line:
quote-line (ESC-') (unbound) (unbound)
Quote the current line; that is, put a ‘'’ character at the beginning and the end, and convert all ‘'’ characters to ‘'\''’.
Seeing the result of a modified history expansion is (or was) kind of a desired feature, so it's built-in just like the ability to escape stuff with :q
$ # here be quote ' " chars
(use :p to print but not execute)
$ !:q:p
'# here be quote '\'' " chars'
(use :s to modify – no :q here because we modify the already quoted text !!, unquoted is now !-2)
$ !:s/be/are/:p
'# here are quote '\'' " chars'
Is anyone else annoyed by the amount of built-in magic character strings in bash? There are times when I know a task is possible in a bash script but write it in Python (with no dependencies required and compatible with 2 and 3) because it's easier than looking up every random gotcha and running into issues later on when someone runs it in a directory with a space.
You must think about the teletype era. One character commands are economical, even if you have to consult a manual to remember them all. Anything printed on the paper was expensive. I remember stealing color ribbons from the account reserved for manual typists and secretaries. It was also customary to use the paper roll all 4 ways, when editing.
You can always take advantage of what I believe is the portable property of consecutive strings (without space) being concatenated together. Then you never need to escape anything in your scripts.
For instance, to produce a double quote inside single quotes, you can do this
echo "'"'"'"'"
That's three quoted strings next to each other, which produce
'"'
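The same concatenation trick, as a short sketch with illustrative variable names. The second assignment shows the more common use: stepping out of single quotes only for the one apostrophe.

```bash
#!/usr/bin/env bash
# Adjacent quoted pieces with no space between them form a single word.
dq_in_sq="'"'"'"'"                        # "'" then '"' then "'"
mixed='He said "hi" and it'"'"'s fine'    # double quotes only around the apostrophe

printf '%s\n' "$dq_in_sq" "$mixed"
```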
% cat /tmp/sh
var=variables
x=$(
cat <<EOT
This string has 'single' and "double" quotes and can interpolate '$var'
EOT
)
echo $x
% bash /tmp/sh
This string has 'single' and "double" quotes and can interpolate 'variables'
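If interpolation is not wanted, quoting the here-document delimiter switches it off. A small sketch of the two variants side by side, assuming bash 4 or later; the names `expanded` and `literal` are only for illustration.

```bash
#!/usr/bin/env bash
var=variables

# Unquoted delimiter: $var is expanded inside the here-document.
expanded=$(cat <<EOT
Both 'single' and "double" quotes, plus '$var'
EOT
)

# Quoted delimiter: the body is taken literally, so $var stays as typed.
literal=$(cat <<'EOT'
Both 'single' and "double" quotes, plus '$var'
EOT
)

# Quote the variables when printing so embedded whitespace is preserved.
printf '%s\n' "$expanded" "$literal"
```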
I wrote a bash function that leverages 'set -x' to capture the quoting of "$@" into a single bash env var, say $job, or into a temp file. I use it from time to time -- pretty sure it's not perfect, but it works well enough to be useful. Using it usually involves an 'eval'.
## key step:
out=$( (set -o xtrace; : "$@") 2>&1 )
# xtrace option shows us quoting for our args
## Cleanup $out
# only 1 of the next 4 'left trims' is expected to change $out:
out=${out#+++++ }
# if $FUNCNAME is eval'd three times there are five "+" chars
out=${out#++++ }
# if $FUNCNAME is eval'd twice there are four "+" chars
out=${out#+++ }
# if $FUNCNAME is eval'd there are three "+" chars
out=${out#++ }
# xtrace prepends the '++ '; ': ' is from our code shown above.
# We strip these 5 left most chars.
# Would be nice to support any level of evals, (hence any no.
# of '+' chars) [...]
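One way the "any number of '+' chars" wish might be handled is to strip the whole leading run with a pattern instead of one trim per depth. This is only a sketch, not the original function: `quote_args` and `args` are hypothetical names, PS4 is forced to '+ ' inside the subshell, and arguments are assumed not to contain newlines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: capture how xtrace would quote the arguments.
quote_args() {
    local out plus
    out=$( (PS4='+ '; set -o xtrace; : "$@") 2>&1 )
    plus=${out%%[!+]*}      # the leading run of '+' characters, however long
    out=${out#"$plus"}      # drop the '+' run
    out=${out# :}           # drop the ' :' left over from the no-op command
    out=${out# }            # drop the space before the first argument, if any
    printf '%s\n' "$out"
}

job=$(quote_args echo "two words" 'it'\''s')
printf '%s\n' "$job"

eval "set -- $job"          # the captured text parses back into the same arguments
args=("$@")
printf '<%s>\n' "${args[@]}"
```

The `${out%%[!+]*}` expansion removes everything from the first non-'+' character onward, which leaves just the prefix to strip, regardless of how deeply the call is nested.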
The only thing that trips me up more on becoming a true grey beard other than regex mastery is how and when to properly quote things in my command line incantations.
It is not dangerous, and does not perform the same operation (it is actually incorrect and incomplete: ":" and "#" swapped, and missing substitution words). I guess it was downvoted because it didn't explain? The bash manual does, though, under "HISTORY EXPANSION". Try this:
Type a command, but don't press enter, e.g.
echo 'hello';
Type
!#:s/hello/world/
Press enter. !# means the whole current command so far, and the s/a/b/ modifier replaces the first instance of "a" with "b".
Press up to get the command that ran:
echo 'hello'; echo 'world';
To see such history expansion things before they are run, press M-^ (probably Alt-Shift-6).
Interpolating quotes: as a rule of thumb, a single q gives a single-quoted string (normally non-interpolating), and two q's give a double-quoted string (normally interpolating).