Things Every Hacker Once Knew (catb.org)
539 points by ingve on Jan 27, 2017 | 321 comments



I always thought it was a shame the ascii table is rarely shown in columns (or rows) of 32, as it makes a lot of this quite obvious. eg, http://pastebin.com/cdaga5i1

It becomes immediately obvious why, eg, ^[ becomes escape. Or that the alphabet is just 40h + the ordinal position of the letter (or 60h for lower-case). Or that we shift between upper & lower-case with a single bit.

esr's rendering of the table - forcing it to fit hexadecimal as eight groups of 4 bits, rather than four groups of 5 bits, makes the relationship between ^I and tab, or ^[ and escape, nearly invisible.

It's like making the periodic table 16 elements wide because we're partial to hex, and then wondering why no-one can spot the relationships anymore.
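For anyone who doesn't want to click through to the pastebin, a few lines of Python (my own quick sketch, not the linked table) print that 4x32 layout:

```python
# Render 7-bit ASCII as four columns of 32, so that a control code,
# its punctuation/digit, its upper-case and its lower-case character
# all share a row.
NAMES = {0: "NUL", 9: "TAB", 10: "LF", 13: "CR", 27: "ESC", 32: "SP", 127: "DEL"}

def cell(code):
    if code in NAMES:
        return NAMES[code]
    if code < 32:
        return "^" + chr(code + 64)  # caret notation for the rest of column 0
    return chr(code)

def ascii_table():
    rows = []
    for row in range(32):
        rows.append("  ".join(f"{c:3d} {cell(c):>3}"
                              for c in (row, row + 32, row + 64, row + 96)))
    return "\n".join(rows)

print(ascii_table())
```

Row 27 comes out as ESC, ';', '[', '{': the ^[-is-escape relationship at a glance.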


The 4-bit columns were actually meaningful in the design of ASCII. The original influence was https://en.wikipedia.org/wiki/Binary-coded_decimal and one of the major design choices involved which column should contain the decimal digits. ASCII was very carefully designed; essentially no character has its code by accident. Everything had a reason, although some of those reasons are long obsolete. For instance, '9' is followed by ':' and ';' because those two were considered the most expendable for base-12 numeric processing, where they could be substituted by '10' and '11' characters. (Why base 12? Shillings.)

The original 1963 version of ASCII covers some of this; a scan is available online. See also The Evolution of Character Codes, 1874-1968 by Eric Fischer, also easily found.


I stumbled across the history of the ASCII "delete" character recently: It's character 127, which means it's 1111111 in binary. On paper tape, that translates into 7 holes, meaning any other character can be "deleted" on the tape by punching out its remaining holes.

(It's also the only non-control ASCII character that can't be typed on an English keyboard, so it's good for creating WiFi passwords that your kid can't trivially steal.)
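The punch-out trick is just a bitwise OR; a one-liner (names mine) to convince yourself that punching the remaining holes turns any character into DEL:

```python
DEL = 0b1111111  # ASCII 127: all seven holes punched on the tape

def rub_out(ch):
    # Punching the remaining holes is a bitwise OR with DEL; whatever
    # was on the tape before, the result is always DEL.
    return ord(ch) | DEL

assert all(rub_out(chr(c)) == DEL for c in range(128))
```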


> It's also the only non-control ASCII character that can't be typed on an English keyboard

Don't count on it. There's a fairly long standing convention in some countries with some keyboard layouts that Control+Backspace is DEL. This is the case for Microsoft Windows' UK Extended layout, for example.

    [C:\]inkey Press Control+Backspace %%i & echo %@ascii[%i]
    Press Control+Backspace⌂
    127
    
    [C:\]
This is also the case for the UK keyboard maps on FreeBSD/TrueOS. (For syscons/vt at least. X11 is a different ballgame, and the nosh user-space virtual terminal subsystem has the DEC VT programmable backspace key mechanism.)


It's actually easier to add two spaces at both ends of the password :)


Wow, I never knew it was an actual character.


Sure, think of it this way: you're sitting at a terminal connected to a mainframe and press the "X" key; what bits get sent over the wire? The ones corresponding to that letter on the ASCII chart.

Now replace "X" with "Delete".


(too late for me to edit; took me a while to find online)

Another good source on the design of ASCII is Inside ASCII by Bob Bemer, one of the committee members, in three parts in Interface Age May through July 1978.

https://archive.org/details/197805InterfaceAgeV03I05

https://archive.org/details/197806InterfaceAgeV03I06

https://archive.org/details/197807InterfaceAgeV03I07


That Fischer paper does look interesting - Thanks!

I do understand that I've probably simplified "how I understand it" vs "how/why it was designed that way". This is pretty much intentional - I try to find patterns to things to help me remember them, rather than to explain any intent.


Yeah, there's not much 4-bit-ness that's an aid to understanding what it is today. One is that NUL, space, and '0' all have the low 4 bits zero because they're all in some sense ‘nothing’.


I started programming BASIC and assembly at 10 years old on a Vic-20, so I don't qualify as a wizened Unix graybeard, but I've still had plenty of cause to look up the ASCII codes, and I've never seen the chart laid out that way. Brilliant.


  > on a Vic-20
Which, weirdly, used the long-obsolete ASCII characters of 1963–1967, with '↑' and '←' in place of '^' and '_'.


PETSCII was a thing throughout the 8-bit Commodore line of products. It was based on the 1963 standard, but added various drawing primitives. I spent a lot of time drawing things with PETSCII for the BBS I ran from my bedroom.

https://en.wikipedia.org/wiki/PETSCII


Going on my deep (and probably fallible) memory, I remember seeing the ASCII set laid out like this on an Amoeba OS man page (circa 1990).

https://en.wikipedia.org/wiki/Amoeba_(operating_system)


Had one as well, how did asm work?


>"It becomes immediately obvious why, eg, ^[ becomes escape. Or that the alphabet is just 40h + the ordinal position of the letter (or 60h for lower-case). Or that we shift between upper & lower-case with a single bit."

I am not following - can you explain why ^[ becomes escape, or why the alphabet is just 40h + the ordinal position? I feel like I am missing the elegance you are pointing out.


If you look at each 7-bit character as being 2 bits of 'group' and 5 bits of 'character':

    00 11011 is Escape
    10 11011 is [
So when we do ctrl+[ for escape (eg, in old ansi 'escape sequences', or in more recent discussions about the vim escape key on the 'touchbar' macbooks) - you're asking for the character 11011 ([) out of the control (00) set.

Any time you see \r represented as ^M, it's the same thing - 01101 (M) in the control (00) set is Carriage Return.

Likewise, when you realise that the relationship between upper-case and lower-case is just the same character from sets 10 & 11, it becomes obvious that you can, eg, translate upper case to lower case by just doing a bitwise OR against 32 (0100000).

And 40h & 60h .. having a nice round number for the offset mostly just means you can 'read' ascii from binary by only paying attention to the last 5 bits. A is 1 (00001), Z is 26 (11010), leaving us something we can more comfortably manipulate in our heads.

I won't claim any of this is useful. But in the context of understanding why the ascii table looks the way it does, I do find four sets of 32 makes it much simpler in my head. I find it much easier to remember that A=65 (41h) and a=97 (61h) when I'm simply visualizing that A is the 1st character of the uppercase(40h) or lowercase(60h) set.
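The tricks above, as a runnable sketch (function names are mine):

```python
def ctrl(ch):
    # Keep only the low five 'character' bits: Ctrl selects the 00 group.
    return ord(ch) & 0b11111

def to_lower(ch):
    # Set the 32 bit (0100000): move from the 40h column to the 60h column.
    return chr(ord(ch) | 0b0100000)

assert ctrl('[') == 27       # ^[ is ESC
assert ctrl('M') == 13       # ^M is CR
assert to_lower('A') == 'a'  # upper and lower case are one bit apart
assert ord('A') == 0x40 + 1  # 'A' is the 1st character of the 40h set
```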


This single comment has cleared up so much magic voodoo. I feel like everything fell into place a little more cleanly, and that the world makes a little bit more sense.

Thank you!


I can't believe I've only just realised where the Control key gets its name from. Thank you!


The article linked mentions that the ctrl key (back then?) just clears the top 3 bits of the octet.


Awesome, yes this makes total sense. I'm glad I asked. Cheers.


Basically, modifier keys are just flags/masks: e.g. ESC is 00011011, [ is 01011011. CTRL just clears bit 6 (the second MSB), shifting the column without changing the row.

Physically it might have been as simple as a press-open switch on the original hardware, each bit would be a circuit which the key would connect, the SHIFT and CONTROL keys would force specific circuits open or closed.


if you press a letter and control, it generates the control character in the left-hand column.

the letters in the third column are A = 1, B = 2 etc: 40h + the position in the alphabet.

Awesome to see ^@ as null and laying it out this way makes it easier to see ^L (form-feed, as the article says: control-L will clear your terminal screen), ^G (bell), ^D, ^C etc etc
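All the ^X names fall out of one XOR; a small sketch (mine) of caret notation:

```python
def caret(code):
    # XOR with 40h maps a control code to the printable character two
    # columns to its right, which is how caret notation is spelled.
    assert 0 <= code < 32 or code == 127
    return "^" + chr(code ^ 0x40)

assert caret(0) == "^@"    # NUL
assert caret(7) == "^G"    # BEL
assert caret(12) == "^L"   # FF; clears the screen on many terminals
assert caret(27) == "^["   # ESC
assert caret(127) == "^?"  # DEL is the odd one out, mapping downward
```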


This is so that control characters (and shifted characters — see https://en.wikipedia.org/wiki/Bit-paired_keyboard) could be generated electromechanically. Remember a teletype of the era (e.g. Model 33) has no processing power.


There's a longer explanation on Wikipedia: https://en.wikipedia.org/wiki/Caret_notation


ESC is on the same row as [, just in another column. So Ctrl ends up being a modifier just like Shift, in that it changes column but not row.

The 40h offset is 2 columns' worth.


I made and printed out a nicely-formatted table that's adorning my office wall right now, from when I was trying to debug some terminal issues a while back (App UART->Telnet->Terminal is an interesting pipeline[1]). I was frustrated with the readability of the tables I could quickly find online, and they didn't have the caret notation that so many terminal apps still use (quick, what's the ASCII value for ^[ and what's its name?[2]).

Cool story bro, I know, but I meant to put the file online in response here, but I can't find the source doc anymore >_< Edit: actually, I found an old incomplete version as an Apple Numbers file. If there's interest I can whip it back up into shape and post it as PDF.

[1] For example, when a unix C program outputs "\n", it's the terminal device between the program and the user TTY that translates it into \r\n. You can control that behavior with stty. I know this is something ESR would laugh at being novel to me. On bare-metal, you have no terminal device between your program and the UART output, so you need to add those \r\n yourself.

[2] That's ESC "escape" at decimal 26/hex 1B, and you can generate it in a terminal by pressing the Escape key or Ctrl-[
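The \n-to-\r\n translation in footnote [1] is the tty's ONLCR output flag (the same thing `stty onlcr` / `stty -onlcr` toggles); a sketch using Python's standard termios module, Unix only:

```python
import sys
import termios

def onlcr_enabled(fd):
    # tcgetattr returns [iflag, oflag, cflag, lflag, ispeed, ospeed, cc];
    # the ONLCR bit lives in the output flags.
    oflag = termios.tcgetattr(fd)[1]
    return bool(oflag & termios.ONLCR)

# Only meaningful when stdout really is a terminal device.
if sys.stdout.isatty():
    print("newline -> CR/LF translation on:", onlcr_enabled(sys.stdout.fileno()))
```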


Consider just taking a photo of the table you found.

It's entirely possible that someone reading this thread will be able to source it.

But it's just a text table, right? It should be fairly trivial to reproduce it from a decent picture.

By the way, thanks for clarifying the existence and purpose of the now-in-kernel "terminal device." I've understood the Linux PTS mechanism and am aware of the Unix98 pty thing and all of that, but identifying it like that helps me mentally map it better.


The name that you need to know is line discipline.


Right... and that's the collective name for the terminal device's settings. I see, thanks!


Minor nitpick: isn't ESC decimal 27? I distinctly remember comparing char codes to 27 in Turbo Pascal to detect when user pressed Esc key... :)


LOL you're right. I guess I still managed to misread my table ^_^;


The 8x16 layout is more compact and fits on a single screen or book page, so it became the standard. You're absolutely right that the 4x32 layout makes the relationships more obvious. But once you've learned those relationships, you've lost the incentive to change the table.


Oh, so that's why C-m and C-j make a line break in Emacs, and C-g will make a sound if you press it one extra time.


^V will put DOS (and cmd.exe) into "next character is literal" mode, so "echo ^V^G" will make a beep. I think you needed ANSI.SYS loaded for it to work though (?).

Speaking of DOS, I've never forgotten that BEL is ASCII 7, so ALT+007 (ALT+7?) will happily insert it into things like text editors. I remember it showed up as a •; that's code page 437, which assigns printable glyphs to the control codes, and character 7's glyph is the bullet.


Amazing. That is a thing of beauty. Can't believe I've never seen it like that before.


Thanks. With a bit of deduction from your post, I just figured out why ^H is backspace.


And ^W deletes words.

Both shortcuts work in terminal emulators.

(As an aside ^W is much easier to input^Wtype as a "fake backspace" thingamajig^H^H^H^H^H^H^H^H^H^H^Hmnemonic than ^H is.)


I didn't realize that ^W did words (and I didn't know what "ETB" was either). But that's a useful one to know!


A lot of hardware still uses serial, and not just industrial stuff. Everything from sewing machines to remote controlled cameras.

If you work on embedded devices you will still encounter serial/RS-232 all the time. Often through USB-to-serial chips, which only adds to the challenge because they are mostly unreliable crap. Then there are about 30 parameters to configure on a TTY. About half do absolutely nothing, a quarter completely break the signal, giving you silence or line noise, and the final quarter only subtly break the signal, occasionally corrupting your data.

Still, there is nothing like injecting a bootloader directly to RAM over JTAG, running that to get serial and upload a better bootloader, writing it to flash and finally getting ethernet and TCP/IP up.
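For the curious, a sketch (mine) of what one byte looks like on the wire with the classic 8N1 settings those parameters describe: idle-high line, a low start bit, eight data bits LSB-first, no parity, one high stop bit.

```python
def frame_8n1(byte):
    # Bits on an 8N1 serial line, in time order:
    # start bit (0), eight data bits LSB-first, stop bit (1).
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1]

# 'A' = 0x41 = 0b01000001, so the data bits go out as 1,0,0,0,0,0,1,0
assert frame_8n1(ord('A')) == [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
```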


Arduino got so popular because it addressed those exact problems: reliable USB to serial built in, foolproof bootloader.

> Still, there is nothing like injecting a bootloader directly to RAM over JTAG, running that to get serial and upload a better bootloader, writing it to flash and finally getting ethernet and TCP/IP up.

I'm happy to have gotten mostly rid of this. Gone are the days of choosing motherboards based on the LPT support, and praying that the JTAG drivers would work on an OS upgrade.


> I'm happy to have gotten mostly rid of this. Gone are the days of choosing motherboards based on the LPT support, and praying that the JTAG drivers would work on an OS upgrade.

It's still there, and not going anywhere - the only thing that's past is LPT. Last time I used a Linux-grade Atmel SoC, it had a USB-CDC interface but the chain was still the same: boot from mask ROM, get minimal USB bootloader, load a bootstrap binary to SRAM, use that to initialize external DRAM, then load a flashing applet to DRAM, run it, use that to burn u-boot to flash, and then fire up u-boot's Ethernet & TFTP client to start a kernel from an external server and mount rootfs over NFS. Considering the amount of magic, it worked amazingly well. The whole shebang was packaged into a zip file with a single BAT to double-click and let it do the magic.

As for COM and LPT - FTDI and J-Link changed the embedded landscape forever, and thanks for that.


Get yourself a Black Magic Probe. It's JTAG/SWD on the device end, USB GDB server on the host end. https://hackaday.com/2016/12/02/black-magic-probe-the-best-a...

disclaimer: I'm friends with the BMP manufacturer.


Arduino got so popular because it does everything the Atmel STK does at a fraction of the price.


Yes! I'm currently working on a Windows based building access system dating back to the early 90s. Most of the codebase is in MFC C++. The hardware/firmware that runs the door controls used to talk RS232 to the Windows boxes, but now uses TCP/IP. STX is used as the start char for a msg. The system is used by some very prominent govt and biz orgs.


One pleasant upside of such systems is that the protocols are usually so simple you can reverse engineer them in a couple evenings. I can't count the times I ended up deriving the serial protocol from some undocumented, baroque source code (or sniffer dumps) and writing a quick & dirty implementation in a few dozen lines of Python.


That's exactly what I've done :) The checksum algorithm that suffixes each msg with a 2-byte check took a little bit of figuring out. I'm using tornado.tcpserver.TCPServer and tornado.iostream. Another plus is generating logs of the msg flow; useful for diagnostics, and for the after-the-fact specs you often need with crufty old systems.
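Not the parent's actual protocol, but a sketch of the general shape described (STX start char, payload, 2-byte check). The checksum here is a stand-in plain 16-bit sum, and the ETX terminator is my assumption; the real device's algorithm had to be reverse engineered.

```python
STX, ETX = 0x02, 0x03

def frame(payload: bytes) -> bytes:
    # STX + payload + ETX, suffixed with a 2-byte big-endian checksum
    # over everything that precedes it.
    body = bytes([STX]) + payload + bytes([ETX])
    return body + (sum(body) & 0xFFFF).to_bytes(2, "big")

def parse(msg: bytes) -> bytes:
    body, check = msg[:-2], int.from_bytes(msg[-2:], "big")
    if (sum(body) & 0xFFFF) != check or body[0] != STX or body[-1] != ETX:
        raise ValueError("bad frame")
    return body[1:-1]

assert parse(frame(b"OPEN DOOR 3")) == b"OPEN DOOR 3"
```

Deriving something like this from sniffer dumps is mostly a matter of spotting the fixed bytes and then guessing at the check algorithm.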


Just about every professional display or projector has a RS-232 control interface. In arenas, museums, airports, restaurants, any place you see higher end digital signage, you'll likely see a RS-232 serial cable connected to the display device for control and maintenance.


Definitely. Just this week actually I'm writing an interface to a _new_ RS232 device for work.


Serial is so easy to program for, and really has first class support on the most common/classic SOCs. USB is kind of tough TBH.


On Plan9 you can run TCP/IP over RS232 and RS232 over TCP/IP with ~zero effort too.

So your sensors can talk across the internet.


I wish Plan9 were practically usable (good graphics support, modern chipset support, etc); it's the lack of those types of critical features that limits its widespread use at the end of the day.

From `man socat', which runs on Linux:

  (socat PTY,link=$HOME/dev/vmodem0,raw,echo=0,wait-slave
  EXEC:'"ssh modemserver.us.org socat - /dev/ttyS0,nonblock,raw,echo=0"')

  generates a pseudo terminal device (PTY) on the client that can be
  reached under the symbolic link $HOME/dev/vmodem0. An application
  that expects a serial line or modem can be configured to use
  $HOME/dev/vmodem0; its traffic will be directed to a modemserver
  via ssh where another socat instance links it with /dev/ttyS0.


  import /n/kremvax/net /net.alt

Now kremvax' network stack is bound over my own; no need for NAT.


Okay that's really impressive.

By "bound over", do you mean tunneled, or "made available alongside"?


In Plan9 the file system is per process so I can choose to bind kremvax' network over mine in this shell window, then pentvax' network over mine in another.

I think I'm right in saying that if one has /net and /net.alt then /net is asked first and then /net.alt

I'm a bit rusty on the details

To make a connection a process opens /net/tcp/clone, writes a connection string, and gets back a response. If that response is a number then it opens /net/tcp/1/ctl and /net/tcp/1/data and reads/writes data to the data file and sees out-of-stream messages on ctl. To close the connection one closes /net/tcp/1/ctl

Everything is done via the 9P protocol. So if I can write code on my Arduino that understands 9P and has a way to send/receive that data (e.g. serial, or even SPI or JTAG) to a Plan9 machine locally, then the Arduino itself could just open a TCP connection on kremvax.
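A sketch of that dial sequence in Python, using the file names from the Plan 9 manuals (clone, ctl, data); it only does anything real on a system with a Plan 9 style /net actually mounted:

```python
import os

def ctl_connect(host, port):
    # The plain-text control message; Plan 9 addresses use host!port.
    return f"connect {host}!{port}"

def dial_tcp(netroot, host, port):
    # Reading the clone file allocates a connection and yields its
    # number; writing to the same fd drives that connection's ctl file.
    with open(os.path.join(netroot, "tcp", "clone"), "r+") as clone:
        n = clone.read().strip()
        clone.write(ctl_connect(host, port))
    # All traffic then flows through the connection's data file.
    return open(os.path.join(netroot, "tcp", n, "data"), "r+b")

assert ctl_connect("kremvax", 564) == "connect kremvax!564"
```

No sockets, no ioctls: the whole API is open/read/write on plain files, which is exactly why it layers over 9P so easily.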

Once you start layering these things on top of each other it gets a bit mind blowing and you get drunk on power. Then when you are forced back Linux or Windows you realise how dumb they are even with their fancy application software.


Okay, wow. That is really awesome.

And nice. You talk to the kernel via plain text. No ioctls! :D It's like they made an OS designed to make bash happy.

And I can sum up my conclusions of 'modern' Linux in one word: systemd. It's almost worse than Windows now. I'm not surprised this kind of thing doesn't work on Linux :P

...But I'm sad now, that I can't do all of this on Linux.

Hrm. Maybe someone should pull the Plan9 kernel-level stuff out and make a FUSE-based thing (or even a kernel driver!) that emulates all of this on Linux. There's already plan9port for the utilities... okay, they'd need to be modified back to being Plan9-ey again, but it could be really interesting.


Here's an irc bot written in shell

http://www.proweb.co.uk/~matt/chugly.rc

which shows some of the concepts.


That's almost alien. rc is quite different, wow.

It's really sad that this isn't available for Linux. :(

Because this just makes so many cool things possible.


Like I said, between the 100 of us we only scratched the surface of the power of that idea.

It is heartbreaking to know it sits there, unloved outside of our small community.


Very curious who you mean by "us." Are you part of a Plan9 user community? Interesting.

I remember firing Inferno up inside a Java applet a few years ago. I think something that might generate some interest is a) compiling Plan9 (kernel + rootfs) to JavaScript, and b) interfacing that boots-inside-JS environment to Web APIs for mouse, audio, graphics, etc. It would essentially be [all the work of] a new platform port (albeit one without a CPU architecture (!), and using JavaScript hardware/"peripherals" :P), but I think that making the effort needed to implement this would send a good message that Plan9 is built by people who are interested in keeping their work relevant, and that they think it's worth that (which of course it is ^^, but - you know, advertising). Yeah, it won't significantly sway the status quo (sadly), but I'd definitely argue that the lack of a JS port of Plan9 is - as crazy as it sounds - a real hole nowadays.

Here's the NetBSD kernel running in JS, although it doesn't have a proper shell/tty interface (it's just a proof-of-concept): https://ftp.netbsd.org/pub/NetBSD/misc/pooka/rump.js/

Browsers will never do raw TCP/IP due to security concerns and that is a fairly noteworthy roadblock. WebSockets goes over HTTP+TCP, and WebRTC data channels are SCTP. But if you wrote a Plan9 driver of some sort that handled WebSockets/WebRTC (WebSockets would be significantly easier) and let you talk 9P over that, and then wrote an appropriate driver (that listened on a local port) for native Plan9, you could talk from an in-browser Plan9 instance to a native instance running on real hardware or a VM.

I'm not sure how far plan9port goes in terms of 9P emulation, but it might be worth to figure out how to make the native (aka non-browser) side of the WebSocket/WebRTC<->9P transport work with plan9port as well as "real" native Plan9. That way users who are running just the one copy of Plan9 in browser can talk to it with plan9port tools. If plan9port can't already do that though (can it?) then maybe that would be asking a bit much (adding functionality to plan9port is a bit of a tall order).

This would make firing the system up and playing with it a whole lot easier, akin to how you can run DOS games from the Web Archive in-browser because they compiled DOSBox to JS.

While I expect that the most straightforward (and non-insane) path to doing this would be Emscripten, it may be a very fun research topic to make the Plan9 C compiler and/or the bytecode generator JIT directly compile/generate WebAssembly code :> - with a fully capable toolchain you could even compile the kernel directly to wasm, skipping the translation overhead of Emscripten!

WebAssembly is still in the early stages, although there is some active (albeit early) browser support for it floating around, and you can build and test things without having to recompile your entire browser AFAIK (yes, if I understand correctly, browsers already include alpha-stage wasm parsing code). I suspect your best bet is likely to be to skip the likely-outdated noise on websites and just chase the Chromium/Firefox/WebKit/Emscripten teams (particularly Emscripten, which is spearheading compile-to-wasm) for the latest advice.

It would be especially awesome for you to be able to immediately follow the "Chrome officially supports WebAssembly" HN post (maybe a few months away) with "Plan9 runs in-browser and compiles directly to wasm" :D - while most people will just punt and follow the "put your .js on your site, run it through $compiler and put the .wasm file next to it like this" tutorials, you'd probably get a bit of traffic and exposure from directly generating wasm. A LOT (LOT) of people - mostly hacker types - are going to want to know how wasm works, and... showing off Plan9 as the tech demo of the fact that you're directly generating wasm... well......

It's not yet perfect; see https://github.com/WebAssembly/design/blob/master/FAQ.md#is-.... At this point wasm specifies no way to talk to the DOM but only allows you to invoke high-speed code execution from JavaScript. It may be worth waiting for DOM access capabilities, although as that link mentions, it's already viable to compile language VMs to wasm, so OS kernels sound reasonable too.


Us - yes I am part of the plan9 community - though somewhat inactive at the moment - I'm doing a degree in Supply Chain Management so my focus is on that.

That's all well and good, porting Plan9 to JS, but it's the kernel services that make Plan9 different.

"Everything is a file" is the core concept. The rest is just things built on top of that.

We have a boot CD, it boots in Qemu, we have an emulated environment that runs on Linux/BSD in 9vx, we have mounting 9p in the Linux kernel, and we have plan9port on Linux/BSD.

If you want plan9 it's not hard to get!


I hope to explore Plan 9 in greater depth than I can right now too.

I agree that the kernel services are what make Plan 9 different. I think the "everything is a file" concept is something that has not been explored nearly as much as it could be.

Thanks for mentioning 9vx, I completely forgot about that project! Just downloaded it and fired it up (managed to figure out the correct invocation without needing the (nonexistent) manual :P). That would be a really interesting port to JS.

I agree that Plan9 is not at all hard to get at - but if you can run it just by visiting a website, a LOT more people will play with it. Even if they just fire it up, go "huh, neat" (or more likely "how on earth do I use it") and close it, that does increase platform exposure.

I can't help but remember all the DOS games that are directly playable on the Web Archive. Inferno used to run in-browser because it's the perfect tech demo of the platform. Plan 9 should be able to too IMHO.

At this point in time Web browsers are not the perfect environment for general-purpose x86 emulation, I can't argue that. If things were different we'd run everything inside the browser. But I think Plan9 is sufficiently resource-light, fast, and cleanly-designed enough that it would make an excellent candidate.


v9fs?


I seem to recall that Linux has an implementation of 9P these days. Can't say I have taken it for a spin though.


That's only for mounting 9P file systems; it doesn't offer the kernel services.


It wouldn't surprise me if RS-232 will still be around long after USB falls out of use.

Does anyone know the reasons why USB was not made backward compatible with RS-232? It would only take a very short negotiation to determine if both endpoints support USB.


I don't know, but my guess is that adding compatibility for RS232 voltage levels would add significant cost and complexity to the USB PHY, the thing that actually drives and measures voltages on the wire. USB was specifically designed to have a simple and cheap PHY so that it could be used in very cheap peripherals.

To expand a bit, the typical fully compliant RS232 setup has a special level translation IC (for example the MAX232) to deal with the huge voltage range and convert it to something more low voltage digital logic friendly. And that IC usually requires power supply levels that the typical cheap USB device would not have, which means it would have to add more circuits to generate them, adding even more cost to the hardware.

And the USB committee probably figured to be not fully compliant with RS232 would just confuse people, and damage hardware, so it was better to be not compliant at all.


It would have been really awesome if USB was compatible with TTL-level (5v, maybe 3.3v) "serial". This nonstandard variant of definitely-not-RS232 is everywhere.

But...

> And the USB committee probably figured to be not fully compliant with RS232 would just confuse people, and damage hardware, so it was better to be not compliant at all.

...you are sadly right.


USB controllers have to support 3.3 V anyway (on the D+/D- pair, not on the SS pairs), for supporting USB 1.x modes.


I used to work on x-ray equipment that still used serial/rs-232 ~4 years ago. I'm sure they are still around. I had to set up multiboot laptops that ran win 95/98/xp, because we couldn't virtualize the serial connection properly. And yes, the USB-to-serial rarely worked either.

Man that job was frustrating, but it sure was a lot of fun!


Isn't there something like, just one company in the world that makes 95% of all USB to Serial chips?


There are several. FTDI, Prolific, WCH (CH340)...others as well.

It's a mess at the moment, because there are unauthorized clones of both FTDI and Prolific, and both companies release drivers that purposefully don't work (or worse...brick them) on the clones. But, there's not really a way for the end buyer to know for sure they are buying the real thing.


The SiLabs CP2102N is a useful USB-to-serial bridge. It will talk to Linux with the standard Linux serial driver, although you need a free Windows program from SiLabs if you want to reconfigure it. (This is needed only for unusual applications.)

I use them because they'll go down to 45 baud for antique Teletype machines. They're popular for Arduino applications, and there are lots of cheap breakout boards with 0.100 pins for Arduino interfacing.


Interesting. I'd gotten the impression the FT232 was "rock solid" and filed them away as "they're good, use them", but that was before the bricking incident, and now I'm not really sure anymore.

I guess on the surface the big thing I really like is device differentiation. Do CP2102Ns have unique serial numbers, or can that free utility burn in info I can use to differentiate?

Going a bit deeper, can I bitbang with it?


You can set vendor ID, product ID, product string, serial string, release version, and max power requested. "Manufacturer string" is set to "Silicon Labs". After doing that, you can lock the device against further changes, if you want. This is all done via SiLabs "Simplicity Studio", which is a big IDE for their microcontrollers into which they wrapped up some of the device-specific tools for their simpler devices.

Thus, you can force the host machine to demand a device-specific driver if you need to. By default, it appears to the OS as a USB to serial port device. Linux and Windows recognize it as such, without special drivers. Linux mounts it starting at /dev/ttyUSB0; Windows mounts it starting at COM3.

No bit-banging, though; it doesn't have the hardware.


Ah, thanks! That's pretty cool.

Very nice that I can change the serial number! That's actually kind of better than the FTDI route, where the serial numbers are hardcoded; I get to use my own serial numbering scheme.

I kinda expected no bit-banging. FWIW, if I really needed that I could probably build something with an Arduino (or similar microcontroller), and there are probably devices out there that do offer that functionality. I've never practically needed it; it's just my catalyst.


I can confirm the FTDI drivers are total shit. We had to obtain hardware info under NDA, and write our own driver in order to get a reliable solution built using their parts.


It used to be "just" FTDI... but then other manufacturers and endless copycats jumped in, which resulted in a XKCD #927 situation. As if we didn't already have a USB-CDC standard...

(For an interesting side story, search for "ftdigate". At some point FTDI decided they were fed up with copycats and released a new driver pack (through automatic Windows Update) that bricked counterfeit chips. This led to a lot of angry people and a number of amusing situations, including someone jokingly submitting a patch to Linux to do the same.)


FTDI is not and never was especially big in the USB-to-serial market, and their chips tended to implement CDC in variously broken ways, not to mention that their Windows drivers are a major PITA. What they are big in is the market for "just add USB to this custom device", because they have supplied mostly NDA-free documentation since forever and produce various ready-made modules.


They also put in the legwork to get their drivers into nearly everything, which makes them great to use in a custom or small-batch product that has users who aren't technically oriented, or that won't be installed by a knowledgeable installer. If you already have a client-side install (like a software package) that can bundle the drivers for you, it's not a big deal.

And with the emergence of USB-CDC, it's not nearly as necessary as it used to be, since most modern OSes support that now.


And then they killed that convenience by literally bricking counterfeit chips. Most users have no way of knowing whether or not their chips are genuine - and, what's worse, most electronics manufacturers have no way of knowing either, because entire supply chains are rotten. These days we have our trusted supplier of FTDI chips but we still check every single shipment. We already had one supplier send us fake chips and then flat out admit they did it because "no one would pay premium to get the real thing".

We would... but since they already cheated, we had to switch suppliers.


Can you share which suppliers you've been burned by? We've seen counterfeit Prolific parts, but I'm not aware of similar problems with FTDI hardware.


I don't remember, unfortunately, as I wasn't the one handling logistics, I just spec'd out parts we needed. If anything, here's a bit of advice: don't ever assume your supply chain is 100% clean - more than once we've been sold thousands of dumpster-grade components by a seemingly very reputable company. And don't ever assume suppliers won't try a bait-and-switch; we've been bitten by this more times than I can count.


Can the bigger names like Digikey be relied on? Or do their suppliers sometimes give them bad parts, too?


My point is that even if your supplier plays fair, the manufacturer may change the design slightly and you either don't get the PCN or fail to understand the implications of a change. We had a case once where a manufacturer added a minuscule hole to the case of a relay, presumably a pressure vent. Our devices are potted in resin in order to withstand rough conditions; in this particular case the resin penetrated the relay and blocked the contacts. We were halfway through a production run before anyone noticed. Needless to say, the run had to be scrapped, because there's no economical way to remove the resin once it's hardened. All for a saving of a few cents per relay.

QA lesson: new batch of anything = tight quality control. When you go into manufacturing, it becomes a game of numbers. Manufacturing 10000 of whatever is statistically bound to generate some duds and failures.
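To put rough numbers on that game of numbers (the defect rate below is an invented example figure, not anyone's real yield):

```python
# With even a tiny per-unit defect rate, a 10,000-unit run is all but
# guaranteed to contain duds. The 0.2% rate is a made-up example figure.
p_defect = 0.002                                # hypothetical per-unit failure probability
n_units = 10_000                                # size of the production run

expected_duds = p_defect * n_units              # 20.0 units on average
p_at_least_one = 1 - (1 - p_defect) ** n_units  # near-certainty of at least one dud

print(expected_duds, p_at_least_one)
```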

Speaking of prototype/hobbyist amounts: stick to the big suppliers and you should be safe, but if something feels fishy, don't be shy to order the part from a different supplier. Generally, identifying counterfeit parts in prototype amounts can be tricky without prior experience because your manual soldering skills will always be sub-par to a pick'n'placer; same goes for ESD precautions, accidental shocks applied to the circuit, passerby finger-pokers and your curious boss.


Thanks so much for clarifying the specifics of what happened to you; this is one of those situations that's hard to imagine, making it that much harder to envisage exactly what can go wrong, where and how.

What I'm wondering now is whether it's possible to somehow financially insure a specific level of quality control (translation: sticking to the blueprints!) in a way that doesn't scare everyone off. I'm guessing the scrapped run had to be eaten on your side? :/


Whenever I see an inexperienced young hardware startup striving to build even just 1000 of some widget, I feel sorry for them. They often go twice over the planned budget in terms of both time and money... and only after several screwups do they hire some experienced guys to help them with the process.

The answer to your question? It's simply experience. Start small, try to build bigger things, and put in actual time and effort to learn. Don't cut corners; learn how the big industry does things (and especially: why). Don't guess tolerances and sizes: find the relevant standards. Read datasheets thoroughly and with comprehension. Ask your assembly house for guidance. Get a book or two on process & quality control.

The effort you invest will pay itself off sooner than you think. Not in revenue, mind you - but greatly reduced losses and delays.

PS. The world of manufacturing is wonderful. It's vastly different from programming, as it involves much more interaction with suppliers, vendors, teams and assembly line employees - but the feeling of holding a finished product in your hand is worth it.


Thanks very much for this info.

For a few years now I've wanted to build a handheld device that captures the essence of Lisp machines, Forth, and systems like the Commodore 64 and Canon Cat, in a portably accessible/usable form, wrapped in a highly pocketable but ruggedized enclosure similar to the old Nokias that lasted forever. I envisage it primarily as a teaching device and something people could hack on for fun, but the whole idea has never been especially practical or marketable. Now I know what it might be for (when I have a bit of money) - manufacturing education :) since device production has always been something I'm interested in and I do want some experience.

I also want to build a handheld device with a 2G+ baseband, secure boot, and an open-source firmware (perhaps seL4, most definitely not Android). The possibilities start with end-to-end encrypted SMS and trail off infinitely. I haven't really thought of what might be possible; I'm just stuck on the academic problem of secure boot - which is quite an issue, as not even Apple (just checked, $641B valuation right now) seem to be able to get this right: https://ramtin-amin.fr/#nvmedma. I'm saddened by the fact that all secure boot implementations seem to either be NDA-laden, based on security by obscurity, or both. I'm yet to find something I feel would be hard for even someone with a very very large pile of money (for arbitrary scaling of "very very large") to break. I realize that given infinite money everything is breakable, but current "secure" defenses seem to fall over much too readily IMO. (Eg, secure boot implementations have to have test modes; have these passed stringent code verification? A properly-formed legal case could subpoena any secure boot implementation's source code. This is assuming the likely-overly-idealistic case where there are no deliberate backdoors.)
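For what it's worth, the kernel of any secure-boot scheme is small: a key fixed at manufacture, and a check of the firmware image against it before jumping in. A toy stdlib-only sketch - HMAC stands in for the asymmetric signatures real implementations use (so this version, unlike the real thing, would ship the signing key on-device), and all names here are invented:

```python
import hashlib
import hmac

# Toy model of a boot ROM verifying a firmware image before running it.
# Real secure boot uses public-key signatures so the device holds only a
# verification key; HMAC is used here purely to keep the sketch stdlib-only.
BURNED_IN_KEY = b"\x13" * 32  # imagine this fused in at manufacture

def sign_firmware(image: bytes) -> bytes:
    """Tag an image; in the real scheme this happens at the factory."""
    return hmac.new(BURNED_IN_KEY, image, hashlib.sha256).digest()

def boot_ok(image: bytes, tag: bytes) -> bool:
    """The boot ROM's check: constant-time compare, then jump (or refuse)."""
    return hmac.compare_digest(sign_firmware(image), tag)

fw = b"known-good firmware image"
tag = sign_firmware(fw)
assert boot_ok(fw, tag)
assert not boot_ok(fw + b"\x00", tag)  # any tampering fails verification
```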


My advice? Start small. You might want to build a car, but in order to do this, you first need to build a dozen skateboards before moving on to bicycles. As for Secure Boot - it takes expertise in breaking things to build something unbreakable. I have broken commercial copy protection using nothing but an adjustable power supply and some assembly code - protection that, on paper, seemed "good enough". If you think you can build something secure without decades of experience, you haven't really understood how much power a determined engineer equipped with a fast FPGA wields over your puny electronics.


I was curious what you'd say :)

As for the first idea I mentioned, disasters there could be easily tolerated since it's just a side-project thing, so using it as something to (slowly!) work towards could be interesting.

With the Secure Boot idea, I now understand that this would absolutely need to be a group effort, and I'd need the (great) assistance of others with significant experience in security for it to work. That makes perfectly logical sense now I think about it (I'm crazy for thinking I could manage it on my own...) - now I know what direction to go in! (And also the fact that I need to do quite a bit of thinking about this.)

I must confess my curiosity at the type of copy protection you referred to. I was thinking you electrically glitched the EEPROM in a dongle, but that doesn't explain the asm code.

Thanks again.

And what you say about fast FPGAs vs puny electronics is very true :D - and in all fairness, Apple haven't been in security for very long.


PS. You seem to me like a person bent on building The Next Big Thing, reading up on things, accumulating knowledge, having big expectations... I used to be like this most of my life. If you want to actually cash in on that knowledge, you need to BUILD THINGS.

Want to get into hardware security? Buy a hardware glitcher, break some chips, write up on it. Find out it's been done before. Feel confident that you can now break harder, better protected chips. Try it. Succeed or fail. Repeat.

Thinking about secure enclaves? Implement one for some hardware of your choice. Document it. Put it up on Github. Submit to Hacker News. Get feedback. Repeat.

Dreaming about a C64-style machine? Get a devkit for a suitable platform. Write some kernel code. Write examples. Breadboard a second prototype. Design a PCB and an enclosure, have it 3D printed. Heck, get a 3D printer yourself and use it all the time. Write a game. Play it until your fingers hurt. Find out how to build a better keyboard that doesn't hurt your fingers. Get a graphic designer to make some advertising templates. Ship one piece of it and bask in glory for five minutes. Pack the whole thing in a cardboard box, stash it in the attic and go thinking about the next one. Repeat.

The important part? Get something from the idea to a finished thing, repeatedly. It doesn't have to be big - but it has to be 100%. Getting it "just working" and moving to the next big thing won't cut it. There's no other way.


(Took me a bit to get back to this)

I used to have a really bad case of The Next Big Thing, but I've slowly started to come round to the idea of taking the time to study what already exists and consider where I might be the one who needs to learn and adapt. I've only just started with this train of thought, but I think this mindset is critical to the process of doing things that are accessible and successful.

Someone once told me that to get anywhere you have to come up with a pie-in-the-sky idea that's absolutely crazy and then go for it. While taking that literally is a recipe for superfast burnout, it seems to me that that mindset tends toward system-based rather than goal-based motivation, so it might have some reasonable benefits for creativity and creative discipline. Not sure. Still figuring it out.

I definitely am interested in absorbing as much as I can. I've been figuring out how to build a tab/bookmarks/history-management extension for a while now, hopefully I get the courage to start (Chrome's APIs are so verbose and complex, and JavaScript requires so much boilerplate, I can't say I like it). But I currently have 652 tabs open that I need to bookmark and close, and something like 20k bookmarks that I need to tidy up (!), so it's on the todo list. Heh.

The first time I heard about hardware glitching was "Many Tamagotchis Were Harmed in the Making of this Presentation", https://youtu.be/WOJfUcCOhJ0. (I also just discovered and watched the update, http://youtu.be/mCt5U5ssbGU.) That was fun to learn about, but now that I realize this sort of thing is widely applicable, it's even more interesting. Thanks for the heads-up! The concept of hardware glitching is something I've been interested in for a while, actually.

It never occurred to me that I could implement a secure enclave myself. I thought you needed a secure processor for the design to even be viable. I'm only aware of things like the ORWL secure PC (https://www.crowdsupply.com/design-shift/orwl, https://www.orwl.org/wiki/index.php?title=Main_Page), which does use a secure processor (http://www.st.com/en/secure-mcus/st33g1m2.html) to manage the system's various security features.

It's mostly my complete domain ignorance, but I can't envisage a way to build a truly secure processor setup, mostly because of limited access to secure parts. A fast OTP microcontroller with enough space for a burnt-in key and the ability to interface with external Flash could work, but if I just used this for storage I/O going to another CPU, you could simply tap the bus lines to achieve untraceable information leakage.

The secure processor would need to deal with everything between keyboard input and LCD update, and only output encrypted data to the 2G radio. The chip I linked is only 25MHz, which would make for quite a limited device. It most definitely would work - I have an Ericsson MC218 that's that fast, and the EPOC OS (forerunner of Symbian) on it is incredible - but it would be much more accessible if the CPU were 250MHz or so instead. I'm not aware of secure processors that are that fast - and does the chip I linked even require an NDA to use? It doesn't look like it but I wouldn't be surprised if it did.

Ideally, I'd love for a way to use one of SiFive's RISC-V chips as the secure processor when they release their high-speed designs. But implementing secure boot on one of those would depend on both how the chip is designed (eg, boot sequence sensitivity considerations) and how the chip is physically constructed (I expect RISC-V chips with active meshes etc will eventually exist).

My pie-in-the-sky step-up from this basic concept would be to make a dual-chip system, with a secure CPU sitting alongside an off-the-shelf ARM CPU running Android. The secure CPU can take control of the screen and keyboard in such a way that the ARM CPU cannot attack (the keyboard would be easy - just route all keyboard I/O through the secure processor - but I fear that wielding the video side of things would be incredibly nontrivial to implement securely). Then when you wanted to do secure tasks you can simply tell the system to switch to the secure processor, which takes over the screen and keyboard until you tell it to return control to Android.

My ultimate goal would be a secure processor fast enough to capture medium-resolution video (something like 640x360 max, to begin with) from a camera module, and then play it back on an LCD, all without any sensitive data leakage (or depending on external processors that would require that). Ideally I'd like to go higher, but I think these are reasonable (beginning) expectations for a device that I would rather not put a GPU in. (Yes, crazy, but GPUs require NDA'd firmware, so the best ever case scenario I could manage is getting access to the BSP source and looking it over, but I'm most definitely not a security researcher, so I don't consider it viable. I can get away with Wi-Fi+cellular because the data going over that would already be fully encrypted with keys those chipsets cannot access, regardless of how malicious they are.)

Regarding the handheld not-quite-sure-what-it'll-be-yet thingy, the keyboard has been my biggest puzzle for a while. :) Tactile switches with low actuation force are one simple solution, but will never feel as professional as a proper rubber-dome actuator setup or similar. I've never used one, but the original BlackBerry (pager) looks really close to what I want (in fact I've heard that thing runs on an 80386-compatible CPU - not sure of the manufacturer - and that there was once a devkit for it floating around and generally available). I wonder whether it uses a rubber actuator system or tactile buttons.

I completely understand your closing key point about actually manufacturing stuff though. I've gone for a very long time with just pondering and wondering, and no actual iteration, and I can't help but acknowledge that there is a lot of truth in the idea of "quantity over quality" - or more accurately our brain's ideas of "quality."

The idea that studying a subject with the notion that improving our understanding of that subject will make us better at it does hold true for a lot of areas and domains, but I think it tends to break down in a lot of the creative process. The process of making - whether that thing is something intangible like a piece of software, or a physical product - is generally something that must always be learned as a discrete subject nowadays. Unfortunately, this seems to be a rather hard idea to grasp, and there's a bit of a learning curve to it.

We depend on so many tools now, and those tools have developmental and process histories of their own that we need to appreciate in order to take the best advantage of those processes.

But our brains are likewise tools, and to use them most effectively we have to figure out how they work best. That process is a bit like jumping off a philosophical/psychological cliff :)

As for practically running off with any of these ideas and actually getting started with them, that's a ways off yet. I unfortunately don't have the budget for those things right now due to fun, expensive medical issues that make it impossible for me to get a job (yeah).


I'm going to keep it short and sweet - not because I don't care; conversely: I do, and I want to get the message through. This is an "I'm an old hacker and I'm here to set you straight" message and it's not gonna be pretty. You're free to disagree; I'm not here to argue, only to offer a heavy bit of advice.

1. Your P/PC balance is completely lopsided. You seem to be focusing only on acquiring ideas and knowledge but not actually using them.

2. 500+ tabs, 20k bookmarks? Are you aware that at this rate you'll never get anything done because consumption of information will take 100% of your life, with an ever-growing TODO list of things to read? This is borderline addiction.

3. Execution is a skill. If you ever tried to do any of the things you read so much about (as opposed to just reading & talking), you'd find you completely lack experience in doing. You seem to be living under an illusion that you're acquiring skills. You are not.

You sound smart. Very smart. Almost too smart for your own good. But intelligence - and knowledge - is not enough. You need to jump off that psychological cliff before you build it up so high the fear will stop you from ever making the first step.

Having said that: close your web browser. Open your editor. You already have enough inspiration; now you need code. That's all you'll hear from me.

Kosma.

PS. If you follow my advice and start building things instead of just thinking about it, you'll find your creations don't even begin to live up to your expectations. That's normal; it simply shows the discrepancy between what you are and what you could be.


First time I've ever seen "That comment was too long." on HN.

This is part 1 of 2.

---

Mentoring is something I'm admittedly a bit lacking in, so this is highly appreciated! The to-the-point approach is an even bigger benefit.

I'm not sure if you'll respond to this - it isn't needed, unless you want to continue this conversation (even in a few weeks or months, maybe) - but I actually agree with most of what you've said.

> 1. Your P/PC balance is completely lopsided. You seem to be focusing only on acquiring ideas and knowledge but not actually using them.

Ah, production vs. production capability. Very interesting concept.

Quite some time ago, when I didn't have a mental map of a new thing, I would glitch out and keep trying to find the edges of that thing so I could conceptualize it, predictably and consistently getting stuck in infinite loops until I'd explode from stress. My ability to summarize has historically been horribly broken, and the side effect of that here was that it took me way too long to realize that a lot of things cannot be summed up without relevant mental mnemonics already in place - so mental-mapping must be multi-pass.

This meant that I was atrociously imbalanced (like, practically vertically so) toward acquisition/observation/spectation over participation. In my case I did want to participate, but my attention span didn't permit me the mental stack space to automatically create and interconnect component details as I went along, making me simply unable to parse some subjects.

The sub-problem was my lack of a toolkit to use to get past the "bah, that particular detail is BORING" phase with certain things. I have quite a backlog of things I need but don't have available because of this...

For example, I still don't know assembly language (I only just recently realized that I saw learning a language as learning its grammar, while asm is all about CPU architecture, which I was never looking at) and I also don't know basic math.

Also, I was standing in a store a while ago completely stumped about what buttons to push on my calculator to figure out how many grams of X I could get because I had $Y to spend. I did figure it out in the end but I don't have any sort of mental map of how to do these tasks because my brain doesn't find them interesting enough to focus on.

An aside: I tried to optimize my (re)typing, so I typed "production{,} /capability" before. That didn't really work; a) bash doesn't let you put the space in a brace expansion, so this canonically doesn't work, and b) typed out like that it isn't very clear and visually looks terrible. I think I inadvertently proved your point before I got 4 words out. lol
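For what it's worth, a quick check of what bash actually does here: an unquoted space inside braces kills the expansion, but an escaped one survives (ugly as it is):

```shell
# Escaping the space keeps brace expansion working; an unquoted space
# would make bash treat the braces literally.
echo production{,\ capability}
# prints: production production capability
```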

> 2. 500+ tabs, 20k bookmarks? Are you aware that at this rate you'll never get anything done because consumption of information will take 100% of your life, with an ever-growing TODO list of things to read? This is borderline addiction.

It definitely looks like that, yes. Some clarification!

This is actually because I'm using a ThinkPad T43, and Chrome on 2GB RAM and a single-core <2GHz CPU doesn't tolerate hundreds of tabs very well. I think my real maximum working tab count is around 50-100 tabs or so, but what ends up happening is that bookmarking those tabs gets uncomfortable after only about 10 tabs are open, because opening the bookmark folder selection popup (I use Better Bookmark) means Chrome has to spawn a new renderer, an operation that makes the system swap to death and can routinely take 10-15 seconds (sometimes 30+ seconds or more). Unfortunately it's easier to just suspend the tab (with The Great Suspender) than do this.... oops, now I have 731 tabs open. Except 680 of those tabs are actually Sad Tabs now because Chrome's broken malloc decided it didn't have enough memory (with only ~1.3GB of my 7.8GB of swap in use...) and it killed all my extensions, and The Great Suspender has no functionality to detect and reload "crashed" tabs when it restarts, and fixing it manually makes the system swap to death easily for 10 minutes (yep).

TL;DR: Chrome encourages me to suspend and forget about tabs rather than get back to them and sort them out. I argue that because no work is being done to fix this, it IS kind of deliberate. But would there be a way to fit this into a bug report? No. :(

The real issue is that The Great Suspender is easily 1k+ SLOC because JavaScript, "modern" OOP, and edge-case management immediately lead to verbose, hard-to-learn code. I've looked at the code and it would be quite outside my comfort zone to maintain it.

So, in the end, I'd need to make my own extension - which would need to be a rewrite, since I kinda dislike the GPLv2 for productivity stuff like this, I also don't want to wind up as the maintainer for this extension, and I need an integrated bookmark manager+tab manager+tab suspender, so I can do things like bookmark suspended tabs and get the right thing, unload/close a tab but keep it in a "read later" list, bookmark things out of that list, etc etc.

I'm at the point where I can't deny that I need to do it. I'm working on a crawler for a website that's technically already shut down so I can try and get the data off it - or, more accurately, going round in circles where I can't focus because I don't know whether the site will really shut down in 10 minutes or next week or whatever, and it's messing with my motivation - but once that's done I think I'll be starting on this.


First time I've ever seen "That comment was too long." on HN.

This is part 2 of 2.

---

> 3. Execution is a skill. If you ever tried to do any of the things you read so much about (as opposed to just reading & talking), you'd find you completely lack experience in doing. You seem to be living under an illusion that you're acquiring skills. You are not.

This was actually exactly what I was trying to say before. You said it a lot more succinctly than I did:

> The idea that studying a subject with the notion that improving our understanding of that subject will make us better at it does hold true for a lot of areas and domains, but I think it tends to break down in a lot of the creative process.

You make an undeniable point. I also noted that:

> Unfortunately, this seems to be a rather hard idea to grasp, and there's a bit of a learning curve to it.

and I wish I was making faster progress...

> You sound smart. Very smart. Almost too smart for your own good. But intelligence - and knowledge - is not enough. You need to jump off that psychological cliff before you build it up so high the fear will stop you from ever making the first step.

Thanks. I've had exactly this problem for quite some time. It actually got to a point where I nearly became fully mentally detached and went off the deep end - I was thinking about ideas I had until I'd find a hole somewhere, then scrabble around frantically until I found the first thing that sounded like it would fix that problem, at least in theory. Do that for long enough, without any groundedness, going entirely off of "reasonable guesses".... welp. :D I've thankfully moved past those anxiety issues!!

In my case the psychological wall is built up as a side effect of another process: my attention span is like a broken bicycle that I can be pedaling as fast as humanly possible, but which will gradually slow down halfway up the hill, stop, and begin rolling backwards (all while I'm pedaling at crazy speed). So no matter how much interest I have and no matter how much effort I invest (my current project, the crawler, being a textbook-for-me case in point), I always roll to a stop.

This has perplexed me for years - depression/mood doesn't quite nail it, since I can crack up at stuff on Imgur and Reddit all day (well, not all day, those websites are like chewing gum, they dry out after an hour or so at the most), and my perspective is not predominantly dark/black, which I would think is a prerequisite for behavior that could be argued looks like "giving up."

I've learned a bit about the foundational health issues behind my autism, OCD, nutrition absorption problems, brain fog, etc etc, and made some good progress with correcting those problems - particularly issues with mental clarity - but I still have quite a ways to go, as I've noted above.

> Having said that: close your web browser. Open your editor. You already have enough inspiration; now you need code. That's all you'll hear from me.

Oh yeah, I've been thinking of writing a text editor for a while now... :P

In all seriousness, my motor coordination is terrible (I use two fingers to type with, and sometimes my muscles jump) so text editors with complex shortcuts involving multiple keys or key sequences that must be executed perfectly are a deal-breaker for me. Stuff like CTRL+S is my current comfort-zone limit for keyboard shortcut complexity, although I wouldn't mind something like making the Shift or Ctrl key itself save too. If I don't use a function as frequently then I don't mind, but I save almost obsessively (I use file alteration watching to rerun my code) - I actually just hit ^S while typing that :D (I don't usually do that in Chrome, lol) - so I prefer "single-chord" or single-step keyboard shortcuts. I never used WordStar when I was younger, I guess?

I don't like that it's impossible to completely filter out the religious pretentiousness of emacs and vim, which both have their pros and cons. But vim is installed by default in most places, and I can see effort was made to give it user-friendly default keybindings, so it's what I learned (or more accurately, know I'll be able to use without learning :P). emacs is essentially where all IDEs got their inspiration, so is associated with carefully-finetuned installation and configuration, and (arguably) associated themes of fragility. I get a very "this UI is a carefully designed optical illusion" vibe from emacs, like the last time I ran it and played with the package installer I discovered that the entire UI locks up while it's doing network requests (IIRC). Fun.

So yeah, I want a simple editor that follows widespread traditions, but also one that offers some obscure things like realtime syntax highlighting/formatting similar to QBasic's editor, which I've not found in any other environment (!).

> PS. If you follow my advice and start building things instead of just thinking about it, you'll find your creations don't even begin to live up to your expectations. That's normal; it simply shows the discrepancy between what you are and what you could be.

I really really like this way of interpreting this. It's very motivating. Thanks!

Btw, I followed you on tumblr. :P


> I must confess my curiosity at the type of copy protection you referred to. I was thinking you electrically glitched the EEPROM in a dongle, but that doesn't explain the asm code.

Load exploit/dumper code over JTAG, then glitch the CPU into thinking there's no JTAG connected, making it run the code with full permission level. As simple as that. It was all written in the datasheets and reference manuals - if you knew what to look for and how to combine the knowledge.
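The search for working glitch parameters (when to drop the supply rail, and for how long) is typically automated. A toy sketch of that sweep logic, with a stub standing in for the bench gear - the success window here is entirely invented:

```python
import itertools

# Stub target: in reality this would pulse a programmable supply and watch
# the chip's response over JTAG/serial. Here it just pretends the JTAG-detect
# check gets skipped inside one narrow (invented) parameter window.
class StubTarget:
    def glitch(self, offset_ns: int, width_ns: int) -> bool:
        return 120 <= offset_ns <= 130 and 8 <= width_ns <= 12

def sweep(target, offsets, widths):
    """Try every (offset, width) pair until one glitch lands."""
    for off, wid in itertools.product(offsets, widths):
        if target.glitch(off, wid):
            return off, wid
    return None

hit = sweep(StubTarget(), range(0, 200, 5), range(2, 20, 2))
print(hit)  # first working parameters found by the sweep
```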


Ah, I almost figured out what the variable power supply was for :)

So searching "CPU glitching" didn't do much, but "CPU voltage glitching" found me lots of results.

I realize all you need is a variable voltage supply, and maybe (?) something to easily inject voltage +/- pulses (within a given voltage limit) and that a lot of glitching stuff is probably unnecessary, but it might be useful for learning.

And yeah, making (often lateral) connections between disparate pieces of information is often what makes the difference. I think it's mostly about exposure to a given field to get really good at that. Guess I should get started :) ...soon.


Speaking out of curiosity, can you name some companies that provide NDA-laden USB-serial bridges? I'm painfully aware that it's hard to even know such products exist, especially when your production volume barely runs into the hundreds or thousands and you're not established in the industry... and from my limited interactions with even the smaller players like SIMCom or SiRF, it takes an NDA just to talk to someone who can get you to sign a second NDA so you can see the datasheets. Even if you have tons of money, giving it to such a company in return for their product can be a challenge.


> FTDI is not and never was especially big in the USB-to-Serial market

... whaaa? They're probably the biggest company in that market, closely followed by Prolific (PL2303).

> ...and their chips tended to implement CDC in variously broken ways

No, their chips implement a proprietary protocol. Not CDC at all.


Isn't one reason that RS-232 is an open protocol but USB requires licensing? I thought it cost money to get an official VID/PID from the USB organization?


Funny, I had to learn all this stuff for my Master's thesis, as it was a crucial part of my project to provide reliable shell command exchange over a serial connection. It was really, really hard to find anybody who knew anything about this lower network level and terminals.

What I can add for everybody who feels the same disappointment as ESR: it's very common for three things to happen in a growing community.

A) The number of people with just a little knowledge of the holy grail of your community increases.

B) The popular communication is taken over by great communicators who care more about their publicity than your holy grail.

C) This gives the impression that the number of really cool people decreases. And that is depressing to old-timers. But it's in fact often not true. Actually, most often the number of cool people increases too! It's just that their voices are drowned out in all the spam of what I like to call the "Party People" (see B).

So yes, you can actually cheer. It's harder to find the other dudes, but there are more of them! Trust me, I'm not the oldest guy here, but I've seen some communities grow and die by now, and it's nearly always like that.


And these days

D) the B)s use various "social" tactics to tar and feather A)s that get in their way...


I've seen this kill entire companies in an extremely slow, painful death. I used to fight this... These days, as soon as I see a Chief Architect or a CTO start to social their way to some technical goal, I just leave the company - for it's a very clear signal that an engineer is a second-class citizen there. No point in waging that battle.


As someone still learning about the finer points of social interaction (and sorely lacking in experience), what would you say are some of the signs that this sort of thing is happening?



Having seen, and been part of, the start, rise, and fall of a certain music scene, I found this article a great piece to reflect upon how and what really did happen :)


This was so entertaining & insightful to read I ended up buying the actual book [1]. Thank you.

[1] https://www.amazon.com/dp/B00F9IV64W/


Burning Man?


When you find swaths of knowledge that younger people don't know, you've found success in the overall human goal of abstracting concepts and building on the shoulders of those who came before us.

I'm not suggesting the article is a "Gosh, Millennials!" conversation. I just get a warm tingle when reminded that I have absolutely no clue how to do something people did just a generation ago, and I don't need to. It's success!


Then you'd probably enjoy watching two TV series by James Burke: Connections and The Day the Universe Changed. They explore the driving forces behind many of the major technical inventions in the past 800 years -- how the status quo created a void that invention arose to fill.

The videos may look a bit dated now, but the content is amazing and Burke is terrific.

https://en.wikipedia.org/wiki/James_Burke_(science_historian...


Every time I think about this series that used to air on TLC I just get so depressed at what TLC has become.

It once really was The Learning Channel


I was confused about why there were funniest home videos on National Geographic, too...


Commercial television for you.


I recall running into those on late night BBC sat broadcasts.

The best part is perhaps Burke himself, and his very very British presentation style.


Still, I have a strange feeling that our whole technological edifice is standing on the head of a pin. When the folks pass who know this stuff, and something breaks, it'll be a while before things get running again. Hopefully before the food riots.


This stuff won't be forgotten. It just doesn't need to be known by everyone wanting to do something with software.


It's really no different than COBOL. As long as there's value in knowing it, there will be a small number of people who can command a large paycheck for that rarely-useful knowledge.

At least that's what I comfort myself with as I gaze over my stash of DB25 connectors and Z80-SIO chips...


I'm curious what these are used for.

---

For those curious like myself (heavily elided for smaller wall of text):

The Z80-SIO (Serial Input/Output) ... basic function is a serial-to-parallel, parallel-to-serial converter/controller ... configurable ... "personality" ... optimized for a given serial data communications application.

The Z80-SIO ... asynchronous and synchronous byte-oriented ... IBM Bisync ... synchronous bit-oriented protocols ... HDLC and IBM SDLC ... virtually any other serial protocol for applications other than data communications (cassette or floppy disk interfaces, for example).

The Z80-SIO can generate and check CRC codes in any synchronous mode and can be programmed to check data integrity in various modes. The device also has facilities for modem controls in both channels, in applications where these controls are not needed, the modem controls can be used for general-purpose I/O.

What's really interesting is this bit:

• 0-550K bits/second with 2.5 MHz system clock rate

• 0-880K bits/second with 4.0 MHz system clock rate

880Kbit/s works out to 110K bytes/sec at 4MHz; numerically that's almost the 115200 baud we're all familiar with. At 4MHz! (2.5MHz yields 68750 bytes/sec, or about 67.1KiB/s.)
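For anyone checking the arithmetic, a minimal sketch of the bit-rate to byte-rate conversion, assuming synchronous framing at 8 bits per byte (async framing with start/stop bits would cost more per byte):

```python
# Convert the Z80-SIO's quoted bit rates to byte throughput.
# Assumes synchronous framing: 8 data bits per byte, no start/stop bits.
def bytes_per_sec(bits_per_sec: int, bits_per_byte: int = 8) -> int:
    return bits_per_sec // bits_per_byte

print(bytes_per_sec(550_000))  # 68750 bytes/s with the 2.5 MHz clock
print(bytes_per_sec(880_000))  # 110000 bytes/s with the 4.0 MHz clock
```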

https://archive.org/download/Zilog_Z80-SIO_Technical_Manual/...

Also - rather amusingly, I discovered that the MK68564 datasheets ripped off the Z80-SIO's intro text verbatim. Are these compatibles or something completely different? https://www.digchip.com/datasheets/parts/datasheet/456/MK685...


Not familiar with those, but it's likely they licensed Zilog's design.


Just like when all the horse-tenders passed and nobody knew how to plow fields with horses anymore


We're in a different situation than the horse-tenders. Much of our technology is built on top of this older tech.

It'd be like if we moved from using horses to using giant machines made out of glued-together living horses, and then all the horse-tenders died.


It can take weeks or months to figure out low-level stuff. Even a college education. If something critical depends upon it (and pretty much all critical systems do these days?) then we won't have time. Before the meltdown/sewage backup/food riots.


> When you find swaths of knowledge that younger people don't know, you've found success in the overall human goal of abstracting concepts and building on the shoulders of those who came before us.

Abstracting, yes, but I don't know about building-upon. The thing is, a lot of this stuff is still sitting around beneath the covers, and someone needs to understand it.

Even worse, sometimes there's stuff that's abstracted over that's important, e.g. if the Excel or 1-2-3 teams had known about Field/Group/Record/Unit Separators, would they have ever come up with CSV?

Or the fact that SYN can be used to self-synchronise, due to its bit pattern …
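That self-synchronising property is easy to check: SYN is 0x16 (0b00010110), and no rotation of that bit pattern equals the pattern itself, so a receiver sampling a stream of SYNs at the wrong bit offset never sees 0x16 and can slip one bit at a time until it does. A quick sketch:

```python
# SYN = 0x16 = 0b00010110. In a continuous run of SYN bytes, an 8-bit
# window at bit offset k (k = 1..7) sees a rotation of the pattern.
# None of those rotations equals SYN, so seeing 0x16 means byte-aligned.
SYN = 0b00010110

def rotations(byte: int) -> set:
    return {((byte << k) | (byte >> (8 - k))) & 0xFF for k in range(1, 8)}

assert SYN not in rotations(SYN)  # no misaligned window looks like SYN
```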


> Even worse, sometimes there's stuff that's abstracted over that's important, e.g. if the Excel or 1-2-3 teams had known about Field/Group/Record/Unit Separators, would they have ever come up with CSV?

Neither team had anything to do with CSV, which originates in Fortran's list-directed I/O. And there is no Field Separator (which would be redundant with Unit Separator): FS is the File Separator. These codes were intended for data and databases over sequential I/O; in modern parlance, a group is a table.


This exactly! I like to make the car analogy:

I have no idea how my car works. I mean, I more or less understand the principles underlying the internal combustion engine, but I wouldn't be able to service one, much less assemble one. But I don't need to. Typically, the only indication made available to me that something is wrong is a single bit of information ("check engine light"), but that is enough. You don't have to be a "car person" to make effective use of a car. I get in, I go, and well over 99% of the time that's the end of the story.

Compare this with computers. When something goes wrong, it's usually vital that you (or your users) relay the precise error message (and God help you if there isn't one). You generally have to be a "computer person" to some degree to make effective use of a computer. If you are unconvinced by this comparison, contrast how often your family asks you to perform [computer task] versus how often you would approach a mechanic family member to perform [car task]. Contrast how often you hear "I can't do this, I'm not a car person" versus "I can't do this, I'm not a computer person".

I consider swaths of modern hackers who simply don't know about much of ASCII as evidence of babysteps towards computers maturing as a technology.


I understand your argument, but I would disagree with a few points:

First, we use computers for so many more things than cars. The average user does really well with basic tasks like checking their email and simple word processing. This would be daily driving in your car analogy. Occasionally things blow up, but that isn't too different from a major problem with a car. However users are constantly trying new things with computers, new programs, websites, and tasks. Car-owners who are constantly trying new things with their cars have as many problems, if not more, than the average computer user. The difference is that the people who use the full range of their car's capabilities are deeply interested in their vehicles.

Second, abstractions like the check engine light are far from perfect. How do you know whether the light signals imminent failure or a minor inconvenience? What additional information is needed for the mechanic to diagnose the problem? I recently chased down a problem in my car that caused the check engine light to come on with a code that was physically impossible. It took a few weeks of careful experimentation and instrumentation before I was able to figure out what it thought was going on. This was a case where I absolutely needed more than a cursory knowledge of how my car works.

I also think that a hacker should be similar to an amateur mechanic: although their car might be fuel-injected, they have a cursory knowledge of how a carburetor works. They may have an automatic transmission, but they understand what a clutch is. Compare that to many developers who have never set foot outside their niche: they have never used a radically different programming language or a different OS. They've never taken the time to dig into the layers beneath the one they use. I would argue that is a weakness. How will you ever debug a problem when it inevitably occurs in the layers beneath you?


I have no idea how my car works.

I find this weird. As I proceeded through Comp sci in high school, going from Pascal to C to assembler, I was always troubled by my lack of understanding, "but why does it work?" That anxiety finally disappeared in college when I learned how logic gates are constructed and went through the exercise of implementing multiply as a series of logical operations.

Similarly, I find it strange that someone would be comfortable driving a car without fairly deep knowledge of how it functions and how to repair it. I don't understand how you're not plagued with anxiety.


> If you are unconvinced by this comparison, contrast how often your family asks you to perform [computer task] versus how often you would approach a mechanic family member to perform [car task]; Contrast how often you hear "I can't do this, I'm not a car person" versus "I can't do this, I'm not a computer person".

I think this is more about social norms/conventions than anything. I would never ask a family member who happens to be a surgeon to remove my gall bladder for me, or a family member who happens to be a mechanic to replace my clutch over the weekend. But for some crazy reason, it's perfectly acceptable to ask your "computer person" family members to spend hours removing the 1200 malware infections you got by installing that cute puppy toolbar you downloaded. The complexity of the tasks doesn't have anything to do with it.


> I consider swaths of modern hackers who simply don't know about much of ASCII as evidence of babysteps towards computers maturing as a technology.

Nobody else has argued this point yet, so I'll throw my 2¢ in. (An aside: I had to look up the codepoint for ¢, U+00A2, so I could use it. I haven't memorized ASCII yet, let alone Unicode.)

In my opinion, things like the first 32 characters of ASCII, line discipline, NIC PHY AUIs, the difference between RS-232 (point-to-point) and RS-485 (multi-drop differential), how to make your Classic Mac show a photo of the developer team (hit the Interrupt key, then enter "G 41D89A"), or how to play notes on period VT100s (set the keyboard repeat rate really high) belong to an era we've left behind. We've moved into one where web stacks reveal hard-to-diagnose bugs in nearly-40-year-old runtimes (Erlang); where Apple will give you $200k if you extract the Secure Boot ROM out of your iPhone (in one person's case via a bespoke tool that attached to the board and talked PCIe); where the UEFI in Intel NUCs is such a close match for the open-source BSP Intel releases that it's a lot easier than everyone would like to make UEFI modules that step on things that shouldn't be step-on-able and fall through holes into SMM (Ring -2); and where few people care that sudo on macOS doesn't really give you root-level privileges anymore.

We've just replaced all the old idiosyncrasies with a bunch of more modern idio[syn]crasies. You're right that technology has matured, but this has unfortunately meant that a lot of the innocence we took for granted has been lost. Things aren't an absolute disaster, but it's more political now, and we have to keep on our toes. Computers aren't universally somewhere we can go to have fun; we have to work to find the fun now.

(Also, I just did that thing I often do with forums - I expect my reply to appear at the end of the thread, so I go to click on the reply button at the end. But that would reply to the comment at the end of the thread, not yours. This is a problem endemic to forum UI and not a HN issue. Not all aspects of computers have matured yet, not by a long shot.)


> how often you would approach a mechanic family member to perform [car task];

I guess it depends on where one lives, as I see that happen all the time (either family, friends, or neighbors).


Many of the control codes are still in active use today in the air-ground communications protocol spoken between airplanes and Air Traffic Control.

The ACARS[0] protocol I work with every day starts each transmission with an SOH, then some header data, then an STX to start the payload, then ends with either an ETX or an ETB depending on whether the original payload had to be fragmented into multiple transmissions or fits entirely into one.

These codes aren't archaic and obsolete in the embedded avionics world.

[0] ACARS: "Aircraft Communications Addressing and Reporting System" - see ARINC specification 618[1]

[1]http://standards.globalspec.com/std/10037836/arinc-618
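As a rough sketch of the framing described above (hypothetical header and payload; real ARINC 618 blocks carry addressing, label, block ID, CRC and more):

```python
# Hypothetical sketch of SOH/STX/ETX-or-ETB framing as described above.
SOH, STX, ETX, ETB = b'\x01', b'\x02', b'\x03', b'\x17'

def frame(header: bytes, payload: bytes, more_follows: bool = False) -> bytes:
    # ETB ends a fragment with more to come; ETX ends the final fragment.
    end = ETB if more_follows else ETX
    return SOH + header + STX + payload + end

print(frame(b'HDR', b'HELLO'))      # b'\x01HDR\x02HELLO\x03'
print(frame(b'HDR', b'HEL', True))  # b'\x01HDR\x02HEL\x17'
```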


Here's the origin of that - The Teletype Model 28 "stunt box".[1] This was a mechanical state machine, programmable by adding and removing metal levers with tines that could be broken off to encode bit patterns. These were used in early systems where a central computer polled mechanical Teletype machines in the field, and started and stopped their paper tape readers and other devices. Remote stations would punch a query on paper tape and put it in the reader, then wait until the central computer would poll them and read their tape. This was addressable, so many machines could be on the same circuit. Used in 1950s to 1970s, when mainframes were available but small computers were not.

[1] https://www.smecc.org/teleprinters/28stuntbox001.pdf


Thanks so much for sharing this. This is exactly the kind of TTY history I've always been looking for.


Those DB9 and DB25 connectors are still kicking around the bottom of my toolbox.

Why is DEL's bit value 0x7f (or 0177)? Because there was a gadget out there for editing paper tape. Yes. You could delete a character by punching out the rest of the holes in the tape frame. I used it once. It was ridiculous.
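The punch-over trick is just a bitwise OR: a punch can add holes (1-bits) but never un-punch them. ORing any 7-bit code with DEL's all-ones pattern yields DEL, which is why tape readers could simply skip it:

```python
# Punching more holes over an existing character is a bitwise OR.
# OR with DEL (0x7f, all seven holes punched) always yields DEL.
DEL = 0x7f  # 0b1111111

for ch in 'Ab3\t':
    assert ord(ch) | DEL == DEL  # every character "deletes" to DEL
```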


And don't forget the lace card - a card punched full of DEL. Having every single hole punched, it was so fragile it would crumple up and jam the reader. People today think they're smart 'cause they invent things like DRAM rowhammer... it's all been done before, kids. ;)


Ohh no.

Was there anything that created these lace cards as part of normal operation? I'm guessing not, considering the ramifications.

What about programs that did this when they encountered bugs?


LOL, I swear it feels like the answer to every "why is this weird computer thing this way?" question I see is "because we used to do it this way on punch cards."


Wait until you find out that "coder" originally meant "someone who encodes messages into Morse". And if you start digging into the word "code", you'll find it comes from Latin "codex", which is a mutation of "caudex", meaning, literally, "tree trunk".

Because back then, people would write on wooden tablets covered with wax.

So.. next time you see a kludge and think of "historical reasons", consider that "historical" goes back much farther than 20th century. :)


Huh. I remembered from school that "caudex" was a reasonable translation for the insult "blockhead"... now I know why!


And, what did the word "computer" mean back in the day? Hint: they were mostly women. And, they defeated the Nazis in World War Two, with the help of Alan Turing and his crew.



Not just a gadget, but teletypes in general. (See e.g. https://en.wikipedia.org/wiki/Teletype_Model_33) You press a key, it punches the code on paper tape. If there are already holes, the new holes get punched over top. All holes is the only thing that can be punched over anything else with a consistent result.

The intent was that, when reading tape, DEL is ignored, because it's a position that was punched over with DEL, i.e. deleted.

Also, when punching tape, a key will punch its holes and (naturally) advance to the next character position. For DEL, that means you erase the character under the cursor, and then the next character is under the cursor. That is, it's a ‘forward’ delete. (I think it was DEC's VT2x0 series terminals that screwed that up for everyone.)


And note also that Teletypes and paper tape were the rationale for the ultimate "you asked for it, you got it" text editor - TECO. The basic idea was, if you had a paper tape with typographical errors, you could feed that and a correction tape into TECO, and it would apply the corrections. It eventually morphed into an all-purpose editor; DEC's VTEDIT was a screen-oriented editor written in TECO macros, and Emacs was originally implemented in TECO as well.

It's kind of appropriate that a typical TECO command line looks a lot like transmission line noise. :)


TECO = Tape Editor and COrrector


It's still a forward delete in Windows.


Nobody has mentioned the smell of the data.

The archive tapes were saturated with insecticide, so bugs would not be inclined to chew up your stored info.

There were also separate mechanical duplicators, plus multi-layer tape so that ordinary terminals could make more than one copy in real time.

For RS-232 electrical reliability, it's hard to beat a design intended to allow any pins to be connected to + or - 25 volts or ground in any combination without doing any damage to the equipment at either end.

Plus it wasn't restricted to the minuscule cable-length specifications of USB, and to reach the rest of the connected (by phone) world, the same open EIA digital protocol was simply modulated/demodulated to analog audio on send/receive.

Remember RS-232 was always expected to be at least building-wide if not site-wide, depending on the size of the site.

Ordinary data communication at relatively slow speeds has benefits that might as well be taken advantage of when they are needed.

Of course no code was absolutely required for any of these processes, but you could still seamlessly share ASCII files between Apple, Commodore, DOS PCs, etc. using native COM port commands.

To get more speed between two points, on early PCs you could get software to multiplex more than one COM port to handle a single data stream over multiple signal pairs.

When needed, this would require (and tie up) multiple phone lines to reach off-site, but it worked. It was the same technique on your own local copper, but then it was more feasible to be always on.

Many buildings were originally equipped with top-quality AT&T/Bell copper pairs each dedicated to a separate signal for each (prospective) phone line to each office through its on-site relay box. At the time many of these pairs were rapidly becoming idle with the arrival of the modern office multiline phone which ran on fewer pairs, or had its own dedicated wiring installed at deployment.

With Windows 9x, COM port multiplexing was built into Windows, and with the arrival of 115Kbaud UARTs you could theoretically get 460Kbaud between offices by running four 3-conductor DB9 cables from the phone access plate on the nearest office wall, and using a PC having connectors for the full 4 COM ports which had become standard on motherboards.


You mean DE9.

The DB series is the size of the old 'parallel' ports. DE is the common smaller width, like the DE15 used for VGA.


From the article:

>Standard RS-232 as defined in 1962 used a roughly D-shaped shell with 25 physical pins (DB-25), way more than the physical protocol actually required (you can support a minimal version with just three wires, and this was actually common). Twenty years later, after the IBM PC-AT introduced it in 1984, most manufacturers switched to using a smaller DB-9 connector (which is technically a DE-9 but almost nobody ever called it that)

>Almost nobody ever called it that


I'm sad that "[FGRU]S ({Field|Group|Record|Unit} Separator)" didn't get much use, and instead we have to rely on tabs or commas (TSV / CSV), and suffer from the problem of quoting / escaping.

BTW, I use Form Feed (CTRL+L) character in my code to divide sections, and have configured Emacs to display them as a buffer-wide horizontal line.
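For a feel of what "ASCII-separated values" could have looked like, a minimal sketch using RS (0x1e) between records and US (0x1f) between fields; no quoting or escaping is needed as long as the data itself never contains those two bytes:

```python
# "ASCII-separated values": RS between records, US between fields.
RS, US = '\x1e', '\x1f'

def dumps(rows):
    return RS.join(US.join(fields) for fields in rows)

def loads(text):
    return [record.split(US) for record in text.split(RS)]

rows = [['name', 'quote'], ['Anna', 'she said, "hi there"']]
assert loads(dumps(rows)) == rows  # commas and quotes survive untouched
```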


It seems that so many programming headaches have the same root cause: the set of characters that compose "text" is the same set that we use to talk about text. Hence the nightmares with levels of quoting and escaping. The use of out-of-band characters like NULLs to separate pieces of text does help, but I don't think there is a complete solution. Because, eventually, we want to explain how to use these special characters, which means we must talk about them, by including them in text....


> Hence the nightmares with levels of quoting and escaping.

PostgreSQL has an interesting approach to this problem that I've found really straightforward; it lets me express text as text without resorting to strange characters. What they've done is allow a character sequence for quoting rather than relying on a single character. They start with a sequence that is unlikely to appear in actual text, $$; it's called dollar quoting. Beyond just $$, you can insert a word between the dollar signs to allow for nesting. It's better explained in the docs:

https://www.postgresql.org/docs/current/static/sql-syntax-le...

The key here is that I am able to express string literals in PostgreSQL code (SQL & PL/pgSQL) using all of the normal text characters without escaping, and the $$ quoting hasn't come with any additional cognitive load the way complex escaping can (and before dollar quoting, PostgreSQL had nightmarish escaping issues). I wish other languages had this basic approach.


Perl's had something like that for a long time: quote operators. You can quote a string using " or ' (which mean different things), and you can quote a regex using /. But for each of these you can change the quote character by using a quote operator: qq for the double-quote behavior, q for the single-quote behavior, and qr for the regex behavior. (There are a few others too, but I used these most often.)

    my $str1 = qq!This is "my" string.!;
    my $str2 = qq(Auto-use of matching pairs);
    $str2 =~ qr{/url/match/made/easy};
The work I did with Perl included a LOT of url manipulation, so that qr{} syntax was really helpful in avoiding ugly /\/url\/match\/made\/hard/ style escaping.


Perl is still, I think, the gold standard for quoting and string manipulation syntax. I am to this day routinely perplexed by the verbosity and ugliness of simple operations on strings in other languages.

(Of course, this may also be one of the reasons that programmers in its broad language family have a pronounced tendency to shoehorn too many problems into complex string manipulation, but I suppose no capability comes without its psychological costs.)


Yup, the 8085 CPU emulator in VT102.pl[1] uses a JIT which is essentially a string-replacement engine.

[1]: http://cvs.schmorp.de/vt102/vt102 (note - contains VT100 ROM as binary data, but opens in browser as text)


Perl also supports heredocs — blocks of full lines with explicit terminator-line:

  print '-', substr(<<EOT, 0, -1), "!\n";
  Hello, World
  EOT
Prints:

  -Hello, World!
IIRC sh-style shells also have that.


This seems like an awesome feature. I wish Python had something like it.


Python has triple-quoted strings which generally do the trick, and uses prefixes for "non-standard" string behaviours (though it doesn't have a regex version IIRC; Python 3.6 adds interpolation via f-strings)

    str1 = f"""This is "my" string."""
    str2 = """Auto-use of matching pairs"""
    str3 = r"""/url/match/made/easy"""


Yes, I've belatedly caught on to using triple-quotes to avoid some escaping. But I didn't know about the f-strings - thanks! (I'll be using those when I start using 3.6.)


Interesting, especially as I use PostgreSQL. Unfortunately, "$$" is very common in actual text (millions of TeX documents, for example), as is $TAG$. But this could still work if you were careful to use TAGs that would never be found in text. But what if the document that you linked to itself had to be quoted? Would that lead to a problem?


I wouldn't call it a perfect system, or one created with that intention. I'm sure in some disciplines, especially technical ones, you may well come across those sequences much more often... which sounds like your experience. Most of what I do is in mid-range business systems, and after 20 years of professional life it's something I've never come across. I suspect those sequences are fairly rare outside of specific domains, and that's why the PostgreSQL developers made that choice.

Your question about self-referential documents and linking I don't understand; maybe an example. The PostgreSQL dollar sign quoting feature is simply a way to use single quotes (important in SQL) without having as many escaping issues. So instead of:

  SELECT 'That''s all folks!';
You could write:

  SELECT $$That's all folks!$$;  
or

  SELECT $BUGS$That's all folks!$BUGS$;
And where it starts to save you in PostgreSQL is with something like (PL/pgSQL):

  DO
      $anon$
          BEGIN
              PERFORM $quote$That's all folks!$quote$;
              PERFORM 'Without special quoting';
          END;
      $anon$;
Note: this code produces nothing, it just should run without error (I ran it on PostgreSQL 9.4). In PL/pgSQL, the body of the procedural code is simply a string literal... but that means any SQL related single quoting would have to be escaped if we used single quotes. So using normal single quotes the previous code example would look something like:

  DO
      '
          BEGIN
              PERFORM ''That''''s all folks!'';
              PERFORM ''Without special quoting'';
          END;
      ';
And it gets worse as you get into less trivial scenarios... which is why I suspect this dollar quoting system was created to begin with.


I agree this solution handles a lot of common cases and makes the code easier to read than when forced to escape everything. I wasn't clear in my comment about self-reference. I meant that, suppose you are storing the text of articles in the DB (not a great idea, but it happens). The article (the one that you linked to) explains the $$ mechanism by showing how it works, so it's full of $$ sequences - the very sequences that we are assuming won't be encountered in normal text. That's what I meant in my beginning comment when I said that handling text that talks about our quoting conventions will lead to problems.


Ah, that's clearer for me.

There are a couple ways to handle depending on the scenario. If I were dealing with a static text under my control, say a direct insert of the text, I would either just enclose it all in traditional ' characters or come up with some unique quote text between the $$.

If I'm dealing with arbitrary text coming from, say a blogging website, I would either handle traditional SQL escaping in my input sanitizing code (or thereabouts) since I have to do that anyway ($$ is great for handwritten code where escaping introduces cognitive load, but not necessarily important for machine generated code) or I might create an inserting PL/pgSQL function with the article text as a parameter... that will get escaped without my having to do anything assuming I simply insert the text directly from the parameter.
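One defensive sketch of that idea (hypothetical helper; for machine-generated SQL, parameterized queries remain the right tool): pick a tag that provably doesn't occur in the text, so even a document full of $$ sequences quotes cleanly:

```python
# Hypothetical helper: choose a dollar-quote tag not present in the text,
# so even documents discussing $$-quoting can be embedded safely.
# (For real applications, prefer parameterized queries.)
import itertools

def dollar_quote(text: str) -> str:
    for n in itertools.count():
        tag = f'$q{n}$'
        if tag not in text:
            return f'{tag}{text}{tag}'

print(dollar_quote("That's all folks!"))  # $q0$That's all folks!$q0$
print(dollar_quote('uses $q0$ already'))  # $q1$uses $q0$ already$q1$
```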


> The use of out-of-band characters like NULLs to separate pieces of text does help, but I don't think there is a complete solution.

NULL is actually in-band, not out-of-band, and in fact it illustrates the issues with in-band communication you mention. That's what, presumably, ESC was for: a way to signal that the following character was raw and did not hold its normal meaning.

You can still devise a pretty good protocol with straight ASCII over a wire, using SYN to synchronise the signal, the separator characters to separate data values and ESC to escape the following character (like '\' is used in many programming languages).
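A sketch of the escaping half of such a protocol (the byte values and framing here are illustrative, in the spirit of DLE byte stuffing): prefix each in-band control byte with ESC so arbitrary data can travel between the separator characters:

```python
# Illustrative escaping layer: ESC before any in-band control byte.
ESC = 0x1b
SPECIAL = {0x1b, 0x1e, 0x1f}  # ESC, RS, US

def escape(data: bytes) -> bytes:
    out = bytearray()
    for b in data:
        if b in SPECIAL:
            out.append(ESC)  # flag: next byte is raw data, not a control
        out.append(b)
    return bytes(out)

def unescape(data: bytes) -> bytes:
    out, it = bytearray(), iter(data)
    for b in it:
        out.append(next(it) if b == ESC else b)
    return bytes(out)

raw = b'field\x1fwith\x1eseparators'
assert unescape(escape(raw)) == raw
```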


> That's what, presumably, ESC was for: a way to signal that the following character was raw and did not hold its normal meaning.

Yes, ESC is a code extension mechanism: it means that the following character(s) are not to be interpreted according to their plain ASCII meaning, but some other pre-arranged meaning. Ultimately a shared alternate meaning for terminal control was standardized as ISO 6429 aka ECMA-48 aka “ANSI”. Free reading: https://www.ecma-international.org/publications/standards/Ec...

That this gave us keyboards with an Escape key that GUIs would repurpose to mean ‘get me out of here’ is a coincidence. (Plain ASCII had Cancel = 0x18 = ^X for that.)

MIT culture for historical non-ASCII reasons also referred to Escape as ‘Altmode’, which is ultimately how EMACS and xterms ended up with their Alt-key/ESC-prefix clusterfjord.


DLE not ESC https://en.wikipedia.org/wiki/C0_and_C1_control_codes#DLE

ESC is used for introducing C1 control sequences


We recently had to deal with an issue like this. My decision was to sort of punt on the issue and just base64-encode the text, so there would be no shenanigans with escape-character processing and such. The loss in efficiency was considered acceptable.


Form Feed would actually cause the printer to feed to the top of the next page. Original line printers had a carriage control tape that indicated to the printer where the top of the page was.

Also, line printers using standard paper printed 132 columns across and 66 lines down, which was 11 inches at 6 LPI. This matched the US portrait paper height and allowed for about 10 characters per inch plus margins for the tractor feed and perforations.


I remember when a common (and relatively harmless) prank was to send a file with several thousand form feeds to the high speed "tree killer" printer at a remote site. Newbie operators would have a coronary when the fan-fold paper started spewing across the room.

("relatively harmless" because the paper wasn't actually printed on or otherwise damaged -- the operator just had to refold it and move it back to the input side).


Agreed; if there had been just a little more editor support and use early on, the whole CSV mess could have been avoided.

They actually display fairly well in vim


Perl has built-in support for these characters, sort-of, for use when reading in text files. Several of Perl's operators pay attention to certain global variables, like $INPUT_RECORD_SEPARATOR.

    while (my $item = <>) {
        ...
    }
By default this will read lines from STDIN, but if you set $INPUT_RECORD_SEPARATOR to the RS character, it'll read a whole record at a time. You can also set $LIST_SEPARATOR to the field separator (US), pass it to the split function to divide your record into fields, and pass it to join to turn the list of fields back into a record.

I used these characters and Perl features in the early 2000s for managing a data processing workflow. The data was well-suited to being processed as a stream of text, and by using these separator characters I was able to avoid the overhead of quoting and escaping which made the processing MUCH more efficient.


There's also the $IFS (Internal Field Separator) for Bourne-like shells.

Oh, and Ruby copied Perl's `$/` and `$,` for record and field separators. (The "English" module provides $INPUT_RECORD_SEPARATOR and $FIELD_SEPARATOR.)


I worked on a database whose format was derived from Pick [1], that used 0xFE, 0xFD, 0xFC as record, field, and sub-field separators - which was great until we had customers that needed those code points for their data (i.e. data that was not all in English) and we had no escape mechanism. Using characters from the control set would have made so much more sense.

[1] https://en.wikipedia.org/wiki/Pick_operating_system


> I use Form Feed (CTRL+L) character in my code to divide sections, and have configured Emacs to display them as a buffer-wide horizontal line.

I've seen formfeeds used in source files like that before; it's a simple way of getting some primitive wider navigation (Emacs has built in commands for navigating by page), but I don't know why you'd limit yourself to them when there are much smarter options available now.


I don't limit myself to them, I don't really even use page navigation in Emacs. It's just that I like a nice, solid, horizontal line I have displayed instead of this character, and I kind of like the feeling of doing the old-school ^L thing :).


Things that I still find hard to forget these days:

* ASCII codes for those single and double box characters, so I could draw a fancy GUI on those old IBM text monitors

* Escape codes for HP Laserjets and Epson printers for bold, compressed character sizes etc.

* Batch file commands

* Essential commands for CONFIG.SYS

* Hayes modem control codes

* Wordstar dot format commands

* WordPerfect and DisplayWriter function keys

* dBaseII commands for creating, updating and manipulating records

I wish they would all move out of my head and leave room for me to learn some new stuff quicker!


Fun fact: Your smartphone is still using (extended) Hayes commands to control the cellular modem inside.


LTE smartphones too?


You'd be surprised how little has changed in that regard. Establishing a data session, from the early GPRS days until today's LTE, is still:

    ATD*99***1#
Even if the underlying technology has completely changed, the interface has not.


Whoa! I just dialed that on my Nexus 5x and got

> USSD code running...

then

> Connection problem or invalid MMI code.

So what was my phone just trying to do?


You are not supposed to dial *99# from the phone UI. It's a special number that gets interpreted by the modem itself and switches it into essentially router/access-server mode (the 1 at the end is an optional parameter, the index of the configuration you want, with the configurations being defined by AT+CGDCONT). The most common result is that the phone establishes a PPP session with whatever device sent ATD*99# and then routes it over the packet-based cellular network (i.e. GPRS/EDGE/UMTS/LTE). The mechanism is not specific to IP (although the actual network-side transport is always over IP), and L2 protocols other than PPP are sometimes supported. There is no reason why a smartphone has to use the Hayes protocol to communicate with its baseband (and thus this mechanism), but many do.


Replying to OP because it's the most logical place to put this -

I'm curious what a "USSD code" is, and what kinds of codes I could input into the dialpad that do interesting things.

(I'm already aware of standard telco features like call forwarding, and I know iOS and Android both have their own set of "easter eggs" accessible via the dialer. I'm talking specifically about non-secret codes that talk to the baseband and stuff like that, if this sort of thing exists.)


The wikipedia page answers all your questions https://en.wikipedia.org/wiki/Unstructured_Supplementary_Ser...


Thanks so much! That was a really interesting read.


Hook your phone to your computer via USB and try to invoke that command on the serial port that appears; you'll get a PPP connection right away. It won't work with the dialpad app.


Yes, definitely. Most LTE modems are derived from 3G modems, and support 3G and 2G as fallback. That is not great for security, because 2G encryption is weak and broken.


It was such an odd joy when I had a Nokia n900, and I realized that you could tether by using wvdial, the same program I used in the 90s to dial into my ISP when I couldn't be bothered to figure out pppd chat scripts...


I love how the modem control codes leaked across layers. So many scripted IRC clients had an option to send a quick "+++ATH0" to everyone and hopefully disconnect their crappy modems.


Lol talk about a walk down amnesia lane. I'm now remembering a time I embedded a couple hundred ASCII bell characters in a war script to knock other high school kids offline. Nothing would clear a chat faster at 2am than computers making a really loud noise that unplugging the speakers wouldn't stop. Ahh nostalgia.


All the cool kids attach the power LED to the speaker headers, to stop your bell script :P


Nice try. BITD the bell triggered a piezo buzzer soldered directly to the motherboard. Speakers had nuthin to do with it.


By the time I was on IRC, most people were using PC speakers, which were real speakers at the time. These days, you get a garbage piezo with your case if you're lucky.

Either way, the power led is useless, so better to hook it up to the speaker header ;)


> * ASCII codes for those single and double box characters, so I could draw a fancy GUI on those old IBM text monitors

Presumably not ASCII - they'll be from some extended IBM character set (CP437?), I would think.


You are indeed correct. I had an old IBM PC/XT manual I believe, which had the charts at the back that I always used to refer to. There were the standard ASCII chart, and another extended chart with the IBM specific set.


dBase III blew my young mind with how it would let you teach it new words for the data you wanted to get out of it. The UX still stands out in my mind as unbelievably good. I should probably see if I can dig out an old copy and find if it's as good as I remember.


Escape codes for HP Laserjets

<esc>&l1O to switch to landscape :)

I'm just about to write some code to parse HP PJL so you never know when ...


PJL is easier to parse than it looks at first glance. I worked on a laser printer controller back in the late '80s that for convenience we made mostly HP compatible. Have fun!


Back in 1985 I wrote a HP PCL to HPGL translator in C. Fun stuff!


Add to that the z80 assembly language instructions I used most often - for patching infinite lives in games. (JMP, JZ, NOP, and RET).


Great list. How about...

* Lotus 1-2-3 "slash" commands


Fun fact about octal: every commercial and most non-commercial aircraft have a transponder to identify with Air Traffic Control. The squawk code is in octal.

https://en.m.wikipedia.org/wiki/Transponder_(aeronautics)


And the civilian SSR system was built on the released frequencies and protocols of Mode 3 of the military Mark X IFF system, which was adopted by civilian agencies in the mid-1950s after a particularly nasty airborne collision.

Sixty years later and we're still bounded by legacy. As a result of shortage the general-use squawk codes are namespaced into each national ATC region, so aircraft have to constantly change squawks even in supposedly contiguous regions such as Eurocontrol. Squawk 4463 means a different thing in UK airspace than in French.

Ironically, military aircraft still support Mode 3 in order to integrate with civilian ATC, who call it Mode-A, but all their special don't-shoot-me-I'm-friendly stuff is handled by more modern encrypted protocols.


And the squawk codes are broadcast as part of the ADS-B system[0]. These packets can be received with the inexpensive USB software-defined radios (e.g., [1]).

[0] https://en.wikipedia.org/wiki/Automatic_dependent_surveillan...

[1] http://www.rtl-sdr.com/adsb-aircraft-radar-with-rtl-sdr/


A lot of the data buses on aircraft also use octal to identify the data words.


The fact that Windows uses CR-LF as a line separator baffles me to this day (and I am not old enough to have ever used or even seen a teletype terminal!) - for a system that was developed on actual teletype terminals, it would have made perfect sense: To start a new line, you move the carriage back to the beginning of the line and advance the paper by one line.

But DOS was developed on/for systems with CRT displays.

It doesn't really bother me, but every now and then this strikes me as peculiar.


The thing to remember is that the CRs are never really gone, they're just silently inserted for you by the tty.

Dealing with raw vs cooked (where LF is automatically translated to CRLF) ttys in UNIX is also a giant pain when you have to do it, so in a way it's not surprising that they decided to leave that out. The original DOS kernel was very minimal compared to UNIX even of the same era. Of course, it turns out that having to write CRLF into files is also a pain - Windows has binary and text mode files instead of raw and cooked mode ttys - and one that you encounter much more often.


When you think about how a typewriter works it's actually correct. The "newline" handle does both a carriage return and a line feed. But you could conceivably do a carriage return without a line feed (and type over what you already have), or a line feed without a carriage return (which might have some actual use).


It's not just conceivable. It's how the output of grotty in TTY-37 mode works to this day. When you are reading a manual with underlining and boldface, and you haven't brought your manual system into the bold new GNU world of the 1990s where it actually uses ECMA-48 instead of TTY-37, the tty output driver is using carriage returns and backspaces to enact overstrikes.

* http://jdebp.eu./Softwares/nosh/italics-in-manuals.html

I wrote a better manual page for ul(1) that explains some of this. ul is basically a TTY-37 to your-terminal-type converter, and it implements a lot of the effects that one would see on a real teletype. Unfortunately, the old manual hasn't progressed much beyond the original 1970s one and doesn't explain a lot of the functionality that the program actually has.

* http://jdebp.eu./Proposals/ul.1


One use for CR without LF was to overstrike passwords on printing terminals. You'd enter your password (which would appear on the paper) then the system would go through several rounds of issuing a CR followed by overstriking the password with random characters.


I still use CR without LF all the time. If I'm writing something that features a long loop, I might want status updates, so printf("%d \r", percentage); helps tremendously. You'll need to fflush(stdout) too, since usually flush is triggered by "\n".

On the other hand, there are far fewer use cases for LF without CR, certainly nothing that isn't better done using ANSI codes.
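The same CR-without-LF progress trick can be sketched in Python (the format string and step size here are arbitrary):

```python
import sys
import time

def progress(pct):
    # '\r' returns the cursor to column 0 without a line feed,
    # so the next write overwrites this one in place
    return f"{pct:3d}%\r"

for p in range(0, 101, 20):
    sys.stdout.write(progress(p))
    sys.stdout.flush()  # no '\n' to trigger the usual flush, so flush by hand
    time.sleep(0.05)
sys.stdout.write("\n")  # finally advance past the status line
```

Just as with the printf version, the explicit flush is the part people forget.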


This is the use of CR without LF on a virtual TTY. On a real TTY or typewriter there is less use, but as one user pointed out one could obscure information like a password which was printed.

LF without CR is something that one would do on a typewriter for typing tabulated data or mathematical formulae. It's just a way to go "down" but stay at the horizontal position you were previously at.



Aaaah, yes, that makes sense. I remember setting up LPD on NetBSD years ago, and part of that was setting up a filter to prevent the staircase effect. Those were the days... ;-)


and without the CR you could never have made the spinning line \|/-\|/-... :-)


I call BS ;)


AIUI it's just that DOS was complying with the standard, while Unix had decided to break it. The standard was designed for physical teletypes and has a bunch of issues (like, what do you do with LF-CR?) but it's not unreasonable of DOS to have stayed with it.


Apparently the LF vs CR+LF printer control code standard arose ~1965 as ASA (pre-ANSI). Neither Unix nor Microsoft created the standard, and of course ASCII didn't either.

https://en.wikipedia.org/wiki/Newline


CR/LF always struck me as the logical choice. And according to the SO link posted by 'chadcmulligan, it turns out that pure LF separator is actually a Unix hack that somehow got later accepted as the Right Way...


It wasn't a Unix invention; see https://en.wikipedia.org/wiki/Newline#History (as posted by randcraw nearby); for confirmation of the story there, see the Euro version of the standard for free at https://www.ecma-international.org/publications/standards/Ec...


It is the right way. Different terminals require different line endings and abstracting that to a single new line character makes sense. The UNIX tty driver then translates that new line character to what is appropriate to start a new line on a given terminal. In most cases CRLF is enough but older devices also needed additional delays.


If you are going with the abstract and translate approach, wouldn't the right way be to abstract to RS?


An interesting thing to note is that if you put a unix terminal into raw mode, you also have to use both CR and LF, because LF only moves the cursor down, but doesn't move it to the start of the line.


Windows is backwards compatible with DOS. DOS was backwards compatible with CP/M. It wasn't at all uncommon to see CP/M used with a printer as the primary interface.


And helpfully enough, EPS files can contain \r and \n as line endings, singly or both, in either order.


Related, Python has universal newlines support:

https://www.google.com/search?q=python+universal+newlines+mo...

Edit: added the PEP (it's from 2002) and excerpt from it:

https://www.python.org/dev/peps/pep-0278/

This PEP discusses a way in which Python can support I/O on files which have a newline format that is not the native format on the platform, so that Python on each platform can read and import files with CR (Macintosh), LF (Unix) or CR LF (Windows) line endings.
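A small sketch of what the mode does in modern Python 3, where `newline=None` (the default for text-mode `open`) enables universal newlines (the file name here is a throwaway temp file):

```python
import os
import tempfile

# A file mixing Mac (\r), Unix (\n) and Windows (\r\n) line endings
raw = b"mac line\rold unix line\nwindows line\r\n"
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(raw)
    path = f.name

# Default newline=None: '\r', '\n' and '\r\n' all decode to '\n'
with open(path) as f:
    lines = f.read().splitlines()
os.remove(path)

print(lines)  # ['mac line', 'old unix line', 'windows line']
```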


Yeah, but it doesn't do \n \r, and iirc, it's not too happy when the line endings change midstream.


Interesting, I had not checked it for those cases. Maybe it only reads a few lines at the start and assumes from the line endings for those, what it is going to be for the rest.


I seem to remember the Acorn BBC used \n\r for line endings.


I think it's mainly to maintain backwards compatibility.


Thanks.

That seems to be the reason for a lot of odd design choices. ;-)


> It is now possible that the user has never seen a typewriter, so this needs explanation […]

Aw man… I'm only 36, but now I feel old for growing up in a time where a typewriter was still common enough to run into (even if they were rapidly being displaced by personal computers).

They still exist in the wild though as a hipster accessory — they probably do well on Instagram too I suppose.


I'm 36... back in the mid-00s when I was in the Army we were still using typewriters, mostly to fill in pre-printed government forms (and award certificates, etc). I even had to deal with carbon paper.

My grandmother had an old manual typewriter, which I had to use once to type up some homework when I was in High School.

I do not miss them.


I work at a global logistics company. Manual typewriters are still somehow part of our workflow for filling in forms, not even carbon-backed.

It boggles my mind why that hasn't been replaced by a PDF form. Perhaps IT being siloed in another building keeps such legacy going.


With many of these things i suspect it comes down to laws and regulations more than anything else.

Meaning that your typewritten document will be accepted as evidence during a lawsuit or similar, while a PDF of same may not.


I gave a talk on the origins of Unicode a while ago (now published on InfoQ at https://www.infoq.com/presentations/unicode-history if you're interested) where I talked about ASCII, and where that came from in the past (including baudot code and teletype).

The slide pertaining to ASCII is here:

https://speakerdeck.com/alblue/a-brief-history-of-unicode?sl...


Ah the good old days, when hackers were hackers and quiches were quiches.

Oh wait, this article is 'man ascii' & 'man kermit'.


Although interestingly enough from 'man ascii', it's clear why ^C is ETX:

  >         Oct   Dec   Hex   Char                        Oct   Dec   Hex   Char
  >         ────────────────────────────────────────────────────────────────────────
  >         000   0     00    NUL '\0'                    100   64    40    @
  >         001   1     01    SOH (start of heading)      101   65    41    A
  >         002   2     02    STX (start of text)         102   66    42    B
  >         003   3     03    ETX (end of text)           103   67    43    C
  >         004   4     04    EOT (end of transmission)   104   68    44    D
  >         005   5     05    ENQ (enquiry)               105   69    45    E
Holding Ctrl clears bits 6 and 7, masking the 7-bit code down into the control range, so 'C' (0x43) and 'c' (0x63) both map to ETX (0x03). 'C' and 'c' themselves differ by bit 6 only ('1' for 'c').
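In code, the Ctrl mapping is just a mask of the low five bits; a quick Python sketch:

```python
def ctrl(ch):
    # Ctrl-<key> keeps only the low five bits of the character code
    return chr(ord(ch) & 0x1f)

assert ctrl('C') == ctrl('c') == '\x03'  # ETX, the interrupt character
assert ctrl('[') == '\x1b'               # ESC
assert ctrl('I') == '\t'                 # TAB
assert ctrl('H') == '\x08'               # BS (backspace)
```

The same mask explains every entry in the left-hand column of the table above.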


It was just a month or two ago that I saw an ASCII table laid out in a way that made it click: "oh, so THAT'S why backspace is ^H". The other control chars you end up using, like ^D, ^C, ^G, ^[, suddenly made sense after that.


I remember CLU said, "Acknowledge" but I can't remember if the MCP said, "End of transmission" or "End transmission."

From now on every time I Ctrl-d I want to think the voice of the Master Control Program.


I'm pretty sure the MCP says "end of line", not "End of transmission".

It's been a while though, maybe he says both.


Ah, right. I thought there was at least one place where he said, "End transmission," but I'm probably wrong.


Then there's "End Of User" which makes their terminal explode.

http://www.catb.org/jargon/html/E/EOU.html


The MCP said "END OF LINE"


> quiches were quiches

Kind of interesting how remnants of culture wars of 35+ years ago linger today, and from the perspective of 2017 how anybody could have gotten annoyed at people who eat egg-and-cheese pies.


kbdmap(5) had a fictional character in its ASCII control character list for 17 years. ascii(7) used non-ASCII names for quite a while, too.

* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205776

* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205778


I think you mean "quiche eaters". :)


I uh was too hungry to type it like that (it's lunchtime)


Does anyone remember EBCDIC? IBM defined EBCDIC for the same purposes as ASCII, but ASCII took off with newer generations of machines. The last time I wrote an ASCII-EBCDIC conversion routine was the late 90's, part of generating a file for upload to a vendor's mainframe.


Some popular projects still have to support it: https://github.com/apache/httpd/blob/2.4.x/include/util_ebcd...


The .NET CLR has an EBCDIC text encoder/decoder in the base class library. I had to research this once when it was asked of me how easy it was to ingest some COBOL-built files in C#. We ultimately didn't need to use that as the COBOL side had switched to ASCII at some point and its owners had forgotten, but I suppose it is good to know that even modern .NET code can speak EBCDIC if need be.
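Python can do the same out of the box; its standard codec set includes several EBCDIC code pages (cp037 is the US/Canada variant, cp500 the international one), so a conversion routine reduces to an encode/decode pair:

```python
text = "HELLO, WORLD"
ebcdic = text.encode("cp037")  # EBCDIC, US/Canada code page

# The byte values have nothing in common with ASCII...
assert ebcdic != text.encode("ascii")
# ...but the text round-trips cleanly
assert ebcdic.decode("cp037") == text

print(ebcdic.hex())
```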


As a young student of Electronics, we had to religiously perform conversions between EBCDIC and ASCII.
