Bracketed paste mode: allows the editor to determine that text came from a mouse paste instead of being typed in. This way, the editor can disable auto-indent and other things which can mess up the paste. Libvte now supports this!
Base64 selection transfer: this is a further enhancement which allows the editor to query or submit selection text to the X server. This allows editors to fully control the selection process, for example to allow the selection to extend through the edit buffer instead of just the terminal emulator's contents.
One patch of mine didn't take, but I think it's still needed: allow mouse drag events to be reported even if the coordinates extend beyond the xterm window frame. Along with this is the ability to report negative coordinates if the mouse is above or to the left of the window. Why would this be needed? Think of selecting text which is scrolled off the window. The distance between edge and the mouse controls the rate of selection scrolling in that direction.
BTW, it's fun to peruse xterm's change log. For example, you can see all the bugs and enhancements from Bram Moolenaar for VIM. http://invisible-island.net/xterm/xterm.log.html
Thomas Dickey maintains a lot of other software as well, in particular ncurses, vile and lynx: http://invisible-island.net/
(It does work in practice, but in-band signaling over a channel carrying complex data that receiver and sender interpret according to settings that do not appear in the protocol at all is, predictably, terrible.)
"This study parallels our 1990 study (that tested only the basic UNIX utilities); all systems that we compared between 1990 and 1995 noticeably improved in reliability, but still had significant rates of failure. The reliability of the basic utilities from GNU and Linux were noticeably better than those of the commercial systems."
I doubt there has been much improvement in those commercial Unixes; they are basically dead. (What would be the business case for fixing a userland utility on a commercial Unix?)
The maintainers of the free BSDs have been carrying that torch, but they don't believe in features.
Stepping into a BSD variant is like a trip back to the 1980's. Not exactly the real 1980's, but a parallel 1980's in which Unix is more robust---but the features are all rolled back, so it's just about as unpleasant to use.
I used Linux for more than a decade before switching to OpenBSD precisely because Linux developers believe in features to the point where how well they're implemented is no longer relevant.
The arrogant, know-it-all kids that we so lovingly nurtured in the Linux community grew up to be its forefront developers today. It shows.
Edit: I was hesitant to write this because it always leads to very unproductive results, but what the hell, I'll bite.
Systemd was the last straw for me, not because something something Unix philosophy (after all, OpenVMS disdained the Unix philosophy, and it worked better than Linux ever has) but because it's so bug-ridden that its interface's simplicity and elegance are next to useless.
Maintaining a non-trivial network & filesystem setup (e.g. I have a few dozen Qemu VMs, because writing software for embedded systems is like that) became a nightmare. It broke with every other update. Great if you're doing devops and this is expected and part of your job, terrible if you're doing actual development and you want an infrastructure that doesn't break between compiles.
I ragequit one afternoon, put FreeBSD on my workstation and OpenBSD on my laptop. I have not touched anything in my configuration in almost a year now and it works flawlessly. I don't think I've had it work for a whole month without having to fiddle with whatever the fuck broke in systemd, somethingsomethingkit or God knows what other thing bolted on top of the system via DBus. I can write code in peace now and that's all I want.
These are all great technologies. Systemd in particular was something I enthusiastically used at first, precisely because after Solaris' SMF -- XML-based as it is -- even OpenRC seemed like a step back to me. But, ya know, I'd actually want it to work.
I don't think it's a simple problem, and I don't think all the blame should be laid on Freedesktop.org, where a lot of good projects originated. I do think a lot could be solved by developers being a little more modest.
Thus you could go from bare kernel, to CLI to GUI in a layered manner (and fall back when a higher layer had issues).
With Dbus etc the CLI has been sidelined. Now you have a bunch of daemons that talk kernel at one end and dbus at the other.
Never mind that they tried to push a variant of dbus directly into the kernel. And as that failed, they are now cooking up another take that is yet again about putting some kind of DE RPC/IPC right in the kernel.
Unfortunately, there is a lot of weird interaction between all these processes. It's often badly (or not at all) documented, and what plugs where is extremely unclear. It's very messy, and it doesn't stand still long enough for someone to fix it. They just pile half-done stuff over more half-done stuff.
It's really unfortunate because the Linux kernel is, pragmatically, probably the best there is. It may not excel in specific niches (e.g. security), but overall, it does a lot of things better than, or at least about as well as, the BSDs, and it runs on systems where not even NetBSD boots.
One problem with message passing as such is that messages are like function calls, but you can't put a breakpoint into the system and see a call stack!
If we call a function f on object a, which calls b.g(), a breakpoint in b.g() tells us that something called a.f() which then called b.g(). If we send a message f to a, which then sends a message g to b, a breakpoint in b on the receipt of message g tells us diddly squat! The g message came in for some reason, evidently from a. Well, why did a do that? Who knows; a has gone off and is doing something else.
OpenVMS's days of glory more or less coincided with the Unix wars. Unix was brilliantly hacker-friendly, but a lot of basic things that we now take for granted in Linux -- virtual memory, high-speed I/O and networking -- were clunky and unstandardized. Others (like Files-11, VMS's excellent filesystem) were pretty much nowhere to be found on Unices (or, if they were, they were proprietary and very, very expensive). A Unix system included bits and pieces hacked together by people from vastly different institutions (often universities), and a lot of the design of the upper layers was pretty much ad hoc.
OpenVMS had been a commercial project from the very beginning. It had a very well documented design and very sound engineering principles behind it. I think my favourite feature is (well, technically, I guess, was) the DLM (Distributed Lock Manager), which was basically a distributed concurrent access system with which you could do concurrent access to resources (such as, but not only, files) in a clustered system. I.e. you could acquire locks to remote resources -- this was pretty mind-blowing at the time. You can see how it was used here: e.g. http://www3.sympatico.ca/n.rieck/docs/openvms_notes_DLM.html .
Also, the VAX hardware it ran on rocked. The whole thing was as stable and sturdy as we used to boast about Linux in comparison to Windows 98, except at a time when many Unices crashed if you did the wrong thing over NFS.
Sorry, but that's all BS. FreeBSD is definitely a modern system, has lots of features and is, emphatically, millions of times better than a 1980's Unix. Linux may have a larger community than the BSDs, but saying the BSDs are like stepping into the 1980's is rather disingenuous.
.. Without taking much time to configure it.
This is fair enough, but it's not what I want in a machine. OpenBSD might be "behind", but it feels complete, supported, sustainable and, most of all, very well thought out. FreeBSD is also exceedingly good, but makes trade-offs in how clean the implementation of the OS feels in order to keep up with Linux.
Or at least it feels like this to me. But to say the BSDs aren't modern is deluded; there's a reason they're known to have the fastest software networking stack in the world.
I found FreeBSD to be unusable simply in the command line environment. I was using only a text console login. I simply wanted a decent shell and editor.
Heck, FreeBSD wouldn't even scroll back with the de facto standard Shift-PgUp.
> mounts his drives for him
That amazing advancement in Unix usability can be achieved by something called the "automount daemon" which was introduced in the late 1980's in SunOS (the real, pre-Solaris one).
Tom Lyon developed the original automount software at Sun Microsystems: SunOS 4.0 made automounting available in 1988. [https://en.wikipedia.org/wiki/Automounter]
You basically just wrote a comment which paints a 1988 commercial Unix feature as a Linux frill that BSD people don't need.
FreeBSD has caved in and has autofs as of 10.1: https://www.freebsd.org/cgi/man.cgi?query=autofs&sektion=5
That was released in November 2014, only some 26 years after Sun rolled out the feature. Better late than never, I suppose.
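For reference, FreeBSD's autofs is driven by a small map file. A minimal sketch of /etc/auto_master (the exact entries and options are assumptions modeled on the stock defaults; see auto_master(5)):

```
# /etc/auto_master -- mountpoint, map, options
/net    -hosts    -nobrowse,nosuid   # NFS exports appear under /net/<host>
/media  -media    -nosuid            # removable media mounted on access
```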
If you don't like a command-line interface, install a desktop environment; if you want a different shell, install one; and if you want a different editor, again, install a different one.
Nothing you have written suggests FreeBSD is unusable. Apparently you prefer systems with this stuff already installed, which is fine, but it doesn't mean you should knock the BSDs because you are unwilling or unable to install a couple of new packages.
> If you want a different shell install one and if you want a different editor, again install a different one.
I didn't want to customize the FreeBSD environment because I was only using it to maintain a port of a specific program. I wanted that to build in the vanilla environment and not have any dependency on customizations.
Dealing with FreeBSD was just a hassle, even for the three or four minutes once in a while (at release time) to fire it up, pick up new code, build and go through the regression suite, then roll binaries.
The last straw was when I switched some parsing to reentrant mode, requiring a newer version of Flex than the ancient version on FreeBSD. There was no obvious way to just upgrade to a newer version without building that from sources. That's okay, but it means anyone else wanting to reproduce my steps would be forced to do the same. Everyone else has newer flex: no problem with the GNU-based toolchains on Mac OS, Solaris, and elsewhere. MinGW, Cygwin, you name it. On Ubuntu, old flex is in a package called flex-old, which is mutually exclusive with a package called flex.
I just said to heck with it; I'm just not going to actively support that platform.
Actually, that was the second to last straw. The BSD people also don't understand how compiler command line feature selection macros (cc -D_WHATEVER) are supposed to work.
If you don't have any feature selection, then you get all the symbols. The presence of feature selection macros acts in a restrictive way: intersection rather than union semantics. If you say -D_POSIX_SOURCE, it means "don't give me anything but POSIX source", so if you combine that with another such option, you get the set intersection, which is useless. I ended up using -D__BSD_VISIBLE, which is something internal that you aren't supposed to use (hence the double underscore); it has the effect of making traditional BSD functions visible even though _POSIX_SOURCE is in effect.
On GNU and some other systems, you just add -D_BSD_SOURCE and you're done: those identifiers are added to what you already selected.
This is how POSIX says feature selection works: see here:
"Additional symbols not required or explicitly permitted by POSIX.1-2008 to be in that header shall not be made visible [by _POSIX_C_SOURCE], except when enabled by another feature test macro."
Except when enabled by another feature test macro: they are additive, damn it!
The BSD stance is: "just don't put anything on the command line and you get all the library symbols; what's the problem?" (Like, some of them are specific to your OS and they clash with some of mine?)
We are still using X, still using terminals powered by control codes, etc.
Rob probably sees things like LANG and LC_ALL as bugs. His fix was UTF-8 everywhere, always. Where is Linux? Still in bag-of-bytes-o-rama.
The problems solved by LANG or LC_ALL are not solved by UTF8 alone. Even if you use UTF8 for all your input and output, there is still the question of how to format numbers and dates to the user and how to collate strings.
These things depend on country and language, sometimes even varying between different places in a single country (in Switzerland, the German-speaking parts use "." as the decimal separator, while the French-speaking part prefers ",").
These things are entirely independent of the encoding of your strings and they still need to be defined. Also, because it's a very common thing that basically needs to happen with every application, this is also something the user very likely prefers to set only once at one place.
Environment variables don't feel too bad a place.
This localization BS has spawned an entire race of nonsense, where, for example, CSV files are not actually CSV in some regions, because their values are not COMMA-separated (as the name implies), but semicolon-separated. And we, programmers, have to deal with it somehow, not to mention some obsolete Faildows encodings like CP1251 still widely used here in lots of tech-slowpoke organizations.
So: one encoding, one datetime format, one numeric format for the world and for the win. Heil UTF-8!
As we're talking encodings: the worst file I ever had to deal with combined, within one file, UTF-8, CP437 and CP850.
I guess they had DOS and Unix machines but no Windows boxes touching that file.
This is a problem that won't go away. Many developers are not aware of how character encoding, let alone Unicode, actually works and, what's the worst about this mess, many times, they can get away without knowing.
Humans find thousands separators useful. You're asking humans to give up useful things because they're hard to program.
That said, I idly wonder whether they could be implemented with font kerning. The bytes could be 123456.78, but the font could render it with extra space, as 123 456.78.
I don't know if it's possible with current font technology, and there are probably all sorts of problems with it even if it is, but it might be vaguely useful.
I agree though that this can (and should) be solved at font-rendering level, not at an application level.
See also paper sizes and electrical power outlets.
Your point's correct, but linefeed hasn't died: it's still the line ending on Unixes. Old Macs used carriage return; Windows uses carriage return plus line feed; Unix uses linefeed. I don't know what Mac OS X uses, because I stopped using Macs before it came out.
I use miles for the sport of running. This is because 1609 meters is close to 1600. Four laps around a standard 400 meter track is about a mile and everything follows from that. All my training is based on miles. I think of paces per mile. If I'm traveling abroad and some hotel treadmill is in kilometers and km/h, it annoys the heck out of me.
However, paradoxically, road signs and car speedometers in miles and miles/hour also annoy the heck out of me; though since I use miles for running, at least I'm no stranger to the damn things.
For laying out circuit boards, I use mils, which are thousandths of an inch: they are a subdivision which gives a metric air to an imperial measure. This is not just personal choice: they are a standard in the electronics industry. The pins of a DIP (the old-school large one) are spaced exactly 100 mils (0.1") apart, and the rows are 300 mils apart. So you generally want a grid in mil divisions. (The finer-pitched DIPs are 0.05" -- 50 mils.)
There is something nice about a mil in that when you're working with small things on that scale, it's just about right. A millimeter is huge. The metric system has no nice unit which corresponds to one mil. A micron is quite small: a mil is 25.4 microns. (How about ten of them and calling it a decamicron? Ha.)
Inches themselves are also a nice size, so I tend to use them for measuring household things: widths of cabinets and shelves and the like. Last time I designed a closet shelf, I used Sketchup and everything in inches.
Centimeters are too small. Common objects that have two-digit inch measurements blow up to three digits in centimeters.
Centimeters don't have a good, concise way to express the precision of a measurement (other than the ridiculous formality of adding a +/- tolerance). In inches, I can quote something as being 8 1/16 inch long. This tells us not only the absolute length, but also the granularity: the fact that I didn't say 8 2/32 or 8 4/64 tells you something: that I care only about sixteenth precision. The 8 1/16 measurement is probably an approximation of something that lies between 8 1/32 and 8 3/32, expressed concisely.
In centimeters, a measurement like 29 cm may be somewhat crude. But 29.3 cm might be ridiculously precise. It makes 29.4 look wrong, even though it may be the case that anything in the 29.1-29.5 range is acceptable. The 10X jump in scale between centimeters and millimeters is just too darn large. The binary divisions in the imperial system give you about 3.3 geometric steps inside one order of magnitude, which is useful. For a particular project, you can choose that it's going to be snapped to a 1/4" grid, or 1/8" or 1/16", based on the required precision.
So for these reasons, I have gravitated toward inches, even though I was raised metric, and came to a country that turned metric before I got here. (And of course, the easy availability of rulers and tape measures marked in inches, plus support in software applications, and the enduring use of these measures in trade: e.g. you can go to a hardware store in Canada and find 3/4" wood.)
P.S. And yes, my ruler is made from aluminium, not aluminum.
Both the words "aluminium" and "aluminum" are British inventions. Both derive from "alumina", a name given in the 1700's to aluminum oxide. That word comes from the Latin "alumen", from which the word "alum" is also derived.
"Aluminum" was coined first, by English chemist Sir Humphry Davy, in 1808. He first called it "alumium", simply by adding "-ium" to "alum" (as in, the elemental base of alum, just like "sodium" is the elemental base of soda), and then added "n" to make "aluminum". In 1812, British editors replaced Davy's new word with "aluminium", keeping Davy's "n", but restoring the "-ium" suffix which coordinated with the other elements like potassium.
North Americans stuck with Davy's original "aluminum".
In Slovakia, we have a nice word for it: hliník, derived from hlina (clay).
Also, how on earth is it a good idea to make the core string routines in the library be influenced by this cruft? What if I have some locale set up, but I want part of my program to just have the good old non-localized strcmp?
The C localization stuff is founded on wrong assumptions such as: programs can be written ignorant of locale and then just localized magically by externally manipulating the behavior of character-handling library routines.
Even if that is true of some programs, it's only a transitional assumption. The hacks you develop for the sake of supporting a transition to locale-aware programming become obsolete once people write programs for localization from the start, yet they live on because they have been enshrined in standards.
Can I really expect it to work if I set
How would the two encodings be used? How would they be used in a message consisting of both monetary and datetime?
Should the setting not be one for encoding (selected from a range of encodings), then settings for formatting and messages (selected from ranges of locales), then finally a setting for collation, which is both a locale and an encoding? Or is the Linux locale system simply using these as keys, so in reality there is no difference in LC_TIME whether you use encA or encB; it will only use the locale prefix en_GB?
Full month names would be encoded in encA. Currency symbols in encB. Is it a good idea? No.
>Should the setting not be one for encoding (selected from a range of encodings), then settings for formatting and messages (selected from ranges of locales), then finally a setting for collation which is both a locale and an encoding?
I would argue an encoding setting should not be there to begin with, or at most be application-specific, because that really doesn't depend on the system locale (as long as the characters used by the system locale can be represented in the encoding used by the application).
I was just explaining why LC_* should exist even on a strictly UTF-8 everywhere system. I never said storing the encoding in the locale was a good idea (nor is it part of the official locale specification - it's a posix-ism)
It's even worse when things assume that my date preferences reflect my unit preferences. I prefer standard units (feet, pounds, knots &c.) and British/Continental dates: I don't want to use French units, nor do I want to use American dates. And yet so much software assumes that it's all or nothing.
LANG and LC_ALL are the work of ISO C and POSIX; they are not the fault of Linux. Linux has these in the name of compliance; they were foisted upon the free world, essentially.
We aren't using punched cards
EDIT: people hate when I say this, which amuses me. The TTY must die !!!!
Being sight-impaired, I have to disagree strongly! The TTY is the only thing that lets me adjust the font size of all programs running in it without going through lots of trouble.
(BTW: didn't downvote your comment.)
The TTY must die.
Anyway, I have tried a lot of things over the years and nothing even comes close to using a text interface.
To name a few nuisances: controls moving outside of the screen, overlapping elements in web content, unreadable buttons, unclickable input fields, tiny fonts in menus, etc. Nothing of this happens with text interfaces.
Thanks for your input, though!
Or are you the type that does everything on a touchscreen? Because, judging from your logic, traditional computer controls must die too...
By your logic, I would be stranded at the side of the road wishing I had a spare tyre.
(Powershell ISE is something else .. once it actually loads)
Plan 9 is a strawman representative of "commercial Unix".
> Combine a few GNU core utils, and you have more code than the whole plan 9 kernel.
When you actually sit down and think of the cases that can occur, that translates into code. Handle this, handle that, and the KLOC's just pile up.
Speaking of kernel size, what makes a kernel code base big? Why, device drivers for lots of hardware.
The Sydney Olympics lighting system was Plan 9 based.
Inferno was used by NASA JPL projects.
Lucent uses a real-time version of Plan 9 in phone masts.
Coraid uses Plan 9 on their NAS servers.
Researchers at LANL and IBM use Plan 9 on Blue Gene and other supercomputers.
I have worked for two plan9 based companies - ok they didn't survive but we tried :)
The international plan9 conferences drew about 30 people. People from commercial enterprises used plan9 in their workflows. Plan9 was my desktop while building a successful recruitment website.
Literally halfs of dozens of research projects and ones of promotional installations served! Nearly threes of dozens attended conferences, at which twos of booths were no doubt tabled, perhaps both by you, one of the only persons who apparently used Plan 9 commercially.
I'm feeling nostalgic enough to go launch an inferno instance now just on principle.
Now I've downvoted all your comments in this thread, because they are unconstructive in both the negative and the positive directions. This is a fanboy-like attitude: you ignore the fact I explained above and attack other comments. You take quantity over quality.
BSDs and other systems have their user bases. Those may be small, but they exist. Both GNU/Linux and the BSDs are inferior to the ideal system where most legacy cruft would be gone, but in order to reach that ideal system we should develop the research projects, the ones with little-to-no use, e.g. Plan 9, or microkernels. The all-UTF-8 approach is perfect, but it can't easily propagate to the mainline if it is not tested for a long time in research projects while the ecosystem adapts. So we'd rather not attack them, but let them happen. They'll always be better than the mainline, but less adopted, and when they die, the good parts of them will propagate to GNU/Linux, BSD, etc. Take ZFS for example: it was developed on a Sun system and never widely adopted there, but it's now on FreeBSD and Linux (and btrfs applies the same concepts) for you to enjoy. Or the research in functional languages: many of those are not adopted, but many of their features are now propagating to mainstream languages.
Please become better informed: https://en.wikipedia.org/wiki/Unix_wars#BSD_and_the_rise_of_...
"BSD purged copyrighted AT&T code from 1989 to 1994. During this time various open-source BSD x86 derivatives took shape, starting with 386BSD, which was soon succeeded by FreeBSD and NetBSD."
BSD was an OS long before 1989; the open-source BSDs weren't new projects written from scratch, but were made possible by purging AT&T-copyrighted code from the code base.
Linux (the kernel) only started in 1991, from scratch. The GNU parts that go into a "GNU/Linux" --- the GNU C compiler and utilities from the GNU project --- started in 1984. But that is still later than BSD. 1BSD was released in 1978: [https://en.wikipedia.org/wiki/Berkeley_Software_Distribution...]
Seriously, it should be obvious that BSDs mean, in the context of my comment, modern BSDs. The GNU/Linux environment was practically usable before those were. Your comment is pure evil rhetoric.
I would still say it was a successful experiment.
What is your take on syntax highlighting in a plan9 world? I've read a list of questions somewhere about p9, and remember this one being asked and having a typically abrasive response, along the lines of "just don't". And I often have a lot of time for those kind of arguments - embrace minimalism. But I regularly (daily) find syntax highlighting to be super-useful for highlighting small errors. What's your take? It seems like a regex-ey problem. Could it be done in a way that was within the spirit of such a system?
There is ugliness in coreutils, but it is mature, functional, proven ugliness. A lot of it is even there for a reason.
It's not difficult to make an elegant toy in isolation.
Can I take the OpenBSD userland and untar, configure, build and run it in Cygwin? Nope. You have proven my point. Nobody uses the little SysV version.
One hint as to why the GNU version is so "long" and "messy":
/* System V machines already have a /bin/sh with a v9 behavior.
Use the identical behavior for these machines so that the
existing system shell scripts won't barf. */
bool do_v9 = DEFAULT_ECHO_TO_XPG;
And Unix Fifth Edition, imbued with the cleanliness of Ken Thompson's ghost? Yeah, that's lovely, but not only is it again, of limited utility and portability across idiosyncratic modern environments, but it's full of bugs dating to an era where simplicity was valued above handling all inputs. In 1983, crashing on a bad input wasn't even generally understood to BE a bug, much less extremely dangerous especially in core system utilities.
Does the option really belong in echo? Who knows, but it's certainly been useful to me.
UNIX fifth edition goes for absolute minimalism. Echo in Plan 9 is apparently used enough that it's worthwhile to optimize the number of times write is called. FreeBSD echo looks like someone just learned about writev. OpenBSD's seem like the sanest of the minimalists.
What's the takeaway for you?
For many people, from the writers of the Single Unix Specification to the Debian Policy people, the take-away was and is use printf, for nigh on 20 years now.
$ printf "This\tis an\nexample\n"
This	is an
example
$ printf "This is %-20s example\n" another
This is another              example
There is an RFC standard way of quoting URLs and addresses, namely angle brackets. HN doesn't implement it, though:
See? The closing > is included in the URL, stupidly.
The convention first appeared in [https://www.ietf.org/rfc/rfc1738.txt] in 1995, with Tim Berners-Lee as the top author.
That said, it's useful to have features like C escape sequences for control characters or arbitrary characters. That feature should be in the shell language. Someone needed it and hacked it into some version of echo. Others elsewhere didn't and so now it's implementation defined whether or not you get backslash processing.
In fact, I'd say keyboards are woefully out of date.
Specifically, I keep looking up † dagger (U+2020) and ‡ double-dagger (U+2021) for footnotes, black heart (U+2665) to be romantic, black star (U+2605) to talk about David Bowie's last album and ∞ to talk about actual non-finite entities.
I only found out recently that Ctrl+Shift+U and then typing the Unicode hexadecimal outputs these in Ubuntu, presumably all Linuxen. AltGr+8 is great for diaeresis while we're at it, so you can go all hëävÿ mëtäl really easily.
edit: black heart and star are not making it through, why Lord, why?!
$ clip lod
Right now my code just shells out to pbcopy on Mac, but you may be interested in pyperclip which provides cross-platform access to the clipboard.
† is AltGr-Shift-%, and ‡ AltGr-Shift-:
I'll never remember them :(
My choice for Compose is the right Windows key, which I think I eventually settled on because I use the left one in some keybinds (winkey+s for shell, etc.) and like you, I couldn't part with an alt or ctrl. I've often wondered what other folks tend to use.
To the grandparent: I'm sometimes amused by what Compose defines. There's ∞ (compose + 8 + 8), (compose + # + #), and oddly (compose + C + C + C + P). I think it may depend on system configuration, but I believe libX11 is responsible. (On my system, Arch, the key combinations appear to be documented under /usr/share/doc/libX11/i18n/compose for my locale.)
And it sucks that I have to use so much that I know the code point for it (1F4A9) off the top of my head. :-(
Edit: I'm definitely putting in U+1F4A9 (the PILE OF POO character), but apparently hacker news strips it out. I'm guessing it's filtering everything that has a symbol character class?
I am glad PILE OF POO does not work for you.
does (U+2603) snowman work?
edit: noooo, no snowman
Regarding the number of buttons, the Space Cadet had 100 buttons and no number pad, whereas most modern keyboards have 104 buttons. I suppose I could add a number pad to my design (117 buttons), but then I could also use that area for extra user-definable buttons (20 buttons in a 4x5 grid -> 120 buttons). The Space Cadet is a bit larger than most IBM-style keyboards, so more keys means more real estate; this is not to mention yet further divergence from the original Space Cadet keyboard's design.
Beyond hardware issues, there are software issues to resolve, like whether to include the macro functionality of the original. I can't find any documentation on how it worked, so I get to start fresh.
 = I really wanted to use some hall-effect switches, but nobody makes them anymore, because they are allegedly the most luxurious switches ever. I would probably have to tear apart an original Space Cadet keyboard to get some. Thus, I would probably just use Cherry MX switches
 = https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/Sp...
For keyboards with more keys (and yet no vendor-defined private HID usages), one can look at the keyboards available in Brazil. Some of the "multimidia" keyboards from the likes of Multilaser, C3 Tech, and Leadership have the 107-key Windows ABNT2 physical layout, with anywhere up to 20 further multimedia keys.
But these keyboards don't have keys engraved with more modifier types beyond the usual five.
My design may or may not contain the same keys, because I am not sure how many would want the original APL character set.
APL keyboards are available, so the market exists, but that doesn't mean much. I plan to have the microcontroller user-configurable and replaceable, so one could change which symbols are typeable with the same keyboard. As I expect to use UTF-8, this keyboard could be used to type any Unicode character.
 = https://en.wikipedia.org/wiki/APL_(codepage)
 = https://geekhack.org/index.php?action=dlattach;topic=69386.0...
This is mine:
I wrote a virtual terminal subsystem a while ago. I gave it keyboard layouts with the ISO 9995-3 common secondary group. No daggers, alas. But ISO 9995-3 does have pretty much all of the combining diacritical marks. <Group2> <Level3>+D05 is combining diaeresis. In practice I find myself not appreciating that as much as I appreciate being able to type U+00A7 as <Group2> <Level2>+C02.
The whole "the ISO 6429 C1 control code 'application program command'" thing is a bit surprising though. (I'm guessing this change doesn't actually avoid this directly? If you sent an APC it'd still do it, it's just that APC is multiple bytes in UTF-8, and hopefully a bit rarer?)
> Reinterpreting US-ASCII in an arbitrary encoding
This will likely work, or so I thought. The vast majority of encodings are supersets of ASCII, so reinterpreting ASCII text in them is valid. The only exception I know of is EBCDIC, and I've never seen it used. (Put differently, codecs that aren't supersets of ASCII are incredibly rare to encounter, so the above assumption usually holds.) (The reverse, reinterpreting arbitrary data as ASCII, does not work out as well.)
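A quick sketch of that claim in Python: bytes that are pure ASCII decode to the same text under any ASCII-superset codec, while an EBCDIC codec (cp500 here, one of the EBCDIC code pages Python ships) reads the very same bytes as different characters entirely.

```python
ascii_bytes = b"plain ASCII text"

# ASCII-superset codecs: identical bytes, identical decoded text.
for codec in ("ascii", "latin-1", "utf-8", "cp1252"):
    assert ascii_bytes.decode(codec) == "plain ASCII text"

# EBCDIC is the notable exception: letters live at different byte values,
# so the same bytes decode to something else.
ebcdic_view = ascii_bytes.decode("cp500")
assert ebcdic_view != "plain ASCII text"
```

The same holds in the other direction: decoding EBCDIC-encoded text as ASCII (or any ASCII superset) produces mojibake, which is the "reverse" case above.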
Though it is rather horrifying how easy it is to dump arbitrary data into a terminal's stream. Unix does not make this easy for the program. The vast majority of programs, I'd say, really just want to output text; yet they're connected to a terminal. Better still, imagine a program could say "I'm outputting arbitrary binary data", or even "I'm outputting application/tar+gzip"; the terminal would then know immediately not to interpret this input. And in the case of tar+gzip, it would have the opportunity to do something truly magical: it could visualize the octets (since trying to interpret a gzip as UTF-8 is insane); it could even just note that the output was a tar, and list the tar's contents like tar -t. If the program declares itself aware, like "application/terminal.ansi", then okay, you know: it's aware; interpret away.
But it doesn't, so it can't. Part of the difficulty is probably that the TTY is both input and output (not that the input couldn't also declare a mimetype or something similar). And the vast majority of programs don't escape their user input before sending it to a terminal; it's like one giant "terminal XSS" or "SQL injection for your terminal". And it is probably unreasonable to expect better; I don't really know of any good libraries for terminal I/O; most programs I see that do it assume the world is an xterm, write the raw bytes right there, and pray w.r.t. user input.
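I don't know of a standard library for this, but a minimal, hypothetical sanitizer is not hard to sketch: neutralize the C0 controls (except the tab and newline you actually want), DEL, and the C1 range, which is where ESC and the 8-bit CSI/OSC/APC introducers live, before echoing untrusted text.

```python
import re

# C0 controls minus tab (\x09) and newline (\x0a), plus DEL and the
# C1 range 0x80-0x9f (8-bit CSI/OSC/APC introducers).
_CONTROLS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f\x80-\x9f]")

def sanitize_for_terminal(text: str) -> str:
    """Replace dangerous control characters with visible \\xNN escapes."""
    return _CONTROLS.sub(lambda m: f"\\x{ord(m.group()):02x}", text)

# An OSC title-setting sequence embedded in a "filename" gets defanged:
hostile = "report\x1b]0;pwned\x07.txt"
print(sanitize_for_terminal(hostile))  # report\x1b]0;pwned\x07.txt, all printable
```

This only addresses the output side, of course; it does nothing about a program that legitimately wants to emit escape sequences of its own.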
catting the linux kernel's gzip into tmux can have consequences from "lol" to "I guess we need a new tmux session".
It was also just today that I discovered that neither GNU's `ps` nor `screen` supports Unicode, at least not for characters outside the BMP.
I figured that "ANSI" would give away that I wasn't being serious since it's not actually an encoding.
Well... if we will, why not? But the thing is that in 20-30 years we won't be able to invent any new writing systems that UTF-8 won't cover. Single-byte encodings were doomed because of their single-byteness. The same awaits two-byte encodings like UCS-2 (a.k.a. UTF-16BE): we already have extended code points for something that glamour hipsters call "emoji". A variable-length encoding will never become obsolete.
I think you underestimate humanity's aptitude at creating things that don't fit into well defined standards.
My (admittedly poorly stated) point wasn't that we shouldn't be moving everything over to UTF-8. I personally use it wherever possible just because it makes life easier. My point was that there are decades of things that use US-ASCII or another one of the overlapping but incompatible encodings because they were the RightThing™ to use at the time, and there's no way we're going to get rid of everything non-UTF-8 any time soon.
In 20-30 years we'll be saying "Why isn't everything in FutureText-64, it should be the only encoding. Why does anything else even exist?", and it'll be because we're saying the same about UTF-8 now.
But my main point is another: eliminate the whole zoo of single-byte and fixed-width encodings and leave one universal encoding. When (if ever) it's time to replace it, we'll do it all at once, instead of keeping those crazy iconvs everywhere.
For example, in a hypothetical alien language, a hypothetical character "rjou" would have a code 0x2300740457 (all the previous codes are exhausted). We can't express this with a single code, so actually we split it into 2-byte parts and write "#" (0x0023), joiner, "t" (0x0074), joiner and "ї" (cyrillic letter yi, 0x0457). As we have a joiner between these codes, we know that we must interpret and display them not as a "#tї" sequence but as a single alien "rjou" character. I think you get the idea.
UTF-16 can handle stuff above U+FFFF just fine, it encodes that with surrogate pairs. Are you thinking about UCS-2?
The 21-bit limit (U+10FFFF) for Unicode comes from the limits of UTF-16's surrogate pairs.
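The arithmetic, sketched in Python for concreteness: a surrogate pair carries 10 + 10 payload bits on top of a fixed U+10000 offset, which is exactly where the U+10FFFF ceiling comes from.

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    v = cp - 0x10000                        # 20 payload bits
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

# U+1F4A9 (PILE OF POO) becomes the pair D83D DCA9
assert to_surrogates(0x1F4A9) == (0xD83D, 0xDCA9)

# The highest encodable code point: all 20 payload bits set
assert 0x10000 + 0xFFFFF == 0x10FFFF

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM)
assert "\U0001F4A9".encode("utf-16-be") == b"\xd8\x3d\xdc\xa9"
```

UCS-2, by contrast, simply stops at U+FFFF, which is the distinction the parent comment is drawing.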
Wow, I'm surprised that the people whose buttons this pushes are able to make(1) an HN account, let alone have enough points to downvote.
Think about it. There is only one man page for xterm. If you type "man xterm" with no section number, you get that man page. If there existed an xterm(7) page, you'd still get the xterm(1) man page by default. So why the hell write the (1) notation every time you type the word xterm?
Man page section numbers are not useful or relevant, by and large, and mentioning them only adds noise to a paragraph.
Even stupider is when the worst of the Unix wankers write man page section numbers after ISO C function names. Example sentence: "Microsoft's malloc(3) implementation is found in MSVCRT.DLL". #facepalm#
Because the convention exists to define the type of the component. It's a handy convention, and I'm betting there are a few people reading this who have never used anything other than GNOME Terminal, so appending the section number immediately helps the reader to place the component; otherwise they'd have to look it up, etc.
(And how did I get to the situation in which I know what (1) means, yet I only know Gnome terminal and don't know what xterm is?)
(What about the fact that xterm(1) is also a hyperlink in the submitted page? You could change the anchor text to "xterm(foo)" and it would still navigate to the correct man page with one click.)