Ivo – a reimagined Unix terminal system (lubutu.com)
284 points by lubutu on Dec 1, 2011 | 151 comments

I'll answer with a Koan, its author long forgotten (if you know him, do tell me his name!):

A UNIX wizard hears cries of torment from his apprentice's computer room where the apprentice is studying, and goes to investigate.

He finds the apprentice in obvious distress, nearly on the verge of tears. "What's the problem?" he asks. "Why did you cry out?"

"It's terrible using this system. I must use four editors each day to get my studies done, because not one of them does everything."

The wizard nods sagely, and asks, "And what would you propose that will solve this obvious dilemma?"

The student thinks carefully for several minutes, and his face then lights up in delight. Excitedly, he says, "Well, it's obvious. I will write the best editor ever. It will do everything that the existing four editors do, but do their jobs better, and faster. And because of my new editor, the world will be a better place."

The wizard quickly raises his hand and smacks the apprentice on the side of his head. The wizard is old and frail, and the apprentice isn't physically hurt, but is shocked by what has happened. He turns his head to face the wizard. "What have I done wrong?" he asks.

"Fool!" says the wizard. "Do you think I want to learn yet another editor?"

Immediately, the apprentice is enlightened.

I'm not feeling immediately enlightened. The existence of a program doesn't inconvenience anyone. No-one has to use it if they don't want to. That wizard needs to chill out and not get so upset about people kicking ideas around.

No one has to use it, but someone has to decide not to use it. A major concern with newcomers to many, many systems in a wide variety of markets is what option to pick. Most go with the default, which reinforces that default but doesn't necessarily fit their needs. They do this because the choice is overwhelming.

I trust users who know how to use the shell to be able to decide what suits them best and to not be paralyzed by indecision between their system's default terminal emulator and some guy's obscure hobby terminal system.

I was specifically calling out the text editor comment of your parent, but there are many different choices for shells.

Bourne, ash, bash, dash, ksh, mksh, zsh, csh, tcsh, rc, etc. Each of them has a slightly different feature set. You can go with the default of bash and it could work out very well for you, but you'd be turning down potentially better alternatives. You seem pretty derisive towards hobbyist projects for a site called "Hacker News". That hobbyist project could be the best thing you've ever used, but you'll never know. That was the entire point of my original post.

I did not imply that the default thing is necessarily better than hobby thing, but that someone who knows enough to know what a shell is knows what they want and how to get it. And if they made a poor choice, so what? It's easy to change one's mind.

Scrutinizing an idea for a new method of programmer-computer interaction from the perspective of a newcomer makes little sense to me, as does the "competing standards" thing from a neighboring comment, scrutinizing OP's idea for being potentially unable to win a popularity contest. OP's terminal system isn't a standard struggling to gain widespread public acceptance. It's just some guy's program.

I think that an idea about programmer-computer interaction ought to be scrutinized for its merit in facilitating programmer-computer interaction, and that criticism from a perspective that isn't the programmer's and isn't the computer's is useless.

PS. You forgot one of the most interesting Unix shells, es. It could turn out to be the best shell you've ever used.

"It's easy to change one's mind."

Citation needed :)

The joke is that those four text editors mentioned by the student are also the result of someone one day attempting the impossible goal of creating the perfect tool.

Standards compete on many levels on a give-and-take basis and have very strong incentives to have a minimal amount of complexity. Text editors don't suffer from that set of requirements. This hypothetical editor does everything the others do, cleanly. Almost everyone will agree that it's better. To my eyes the only real problem is that creating the program itself is infeasible.

I disagree. There are already many text editors that do everything the others do, but they all do it differently: some in a GUI, some on the command line, some with different shortcuts, etc. You could argue that Eclipse does almost everything and has a very high level of extensibility, and yet you won't get me or many people I know to use it for most tasks. It's not just about some checklist of possible actions it can do. It's about workflow, ease of use, and integration with the larger jobs. It's more similar to the "standards" argument than you give it credit for. Otherwise everyone would use Emacs or Eclipse.

Hypothetical is the key here. Everyone can dream up a perfect system, but in the real world, it will have to make compromises, which means that it won't be perfect for everyone.

If anyone genuinely believed this, would any of the software any of us here use exist?

Hacker News certainly wouldn't exist. The web wouldn't exist.

Hell, Unix wouldn't have ever existed.

The "UNIX way" isn't to write a better editor, it's to use an editor that interacts well with other tools (i.e., the tools we already have at the command line.)

It's an allegory, it's not meant to be genuinely believed in the first place.

It's an allegory to push a dumb and harmful stance.

The stance it's pushing is "understand the reasons why the tools before you are limited, and give some thought into the deeper reasons why you would want to create a new tool. Don't just rush headlong into it"

Yeah, how dumb and harmful that is, asking people to think before acting.

That would be the stance it pushed if it actually had anything revelatory to say about that idea, or if it wasn't trotted out mindlessly every time someone started work on a new tool. As it is, it's just hidebound smugness.

I do not see this start of work you talk about.

Then you might RTFA instead of posting vapid koans.

While text editing interfaces could be more polished and modern, the underlying problem that causes your dissatisfaction is the text itself. Storing code as text is an evolutionary dead end. Storing code graphs in text requires the use of plaintext names for graph references, which binds logic and presentation together. The gap between the goal-completion logic a user comes up with and how they turn those mental instructions into plaintext is an unnecessary jump--an encoding--that produces no end of trivial but infuriating miscommunications.

Storing code as pure structure (not XML or anything silly, the serialization is trivial) avoids a huge class of artificial problems we've had to deal with since the dawn of compilers, due to the disconnect between human meaning encoded in plaintext and machine parsing.

Visualizing code as text--text as the primary view in the MVC, complemented with colours and annotations and hyperlinks and hints--is extremely useful. What we really need is a structural editor with the familiarity and ease of use of a text editor. But that is a hard problem, and more polished text manipulators are a nice stop-gap in the meanwhile.

Here's a problem: anyone who's ever used Microsoft Word has experienced random invisible bits of formatting stuck in the text somewhere, difficult to extricate, impossible to replicate, impossible to script. Anyone who's used GUI tools to edit data for long periods of time has experienced actions that are hard to perform in bulk-- that might be scriptable with some complicated API, true, but that would have been trivial to perform if the underlying data was in a text file, probably ending up as some search-and-replace.

If what I'm editing is not the original data, it had best be a perfect, 1:1 representation of that data; I should be able to edit any part of it with a regex without the fear of losing data or missing some annotation I can't see. Certainly in the case of an editor designed for programmers, that will mostly be true-- but there's a good chance that it won't be completely true, and then suddenly the editor is getting in my way and wasting my time.

I don't want to fight against technology. With current editors, I can get pretty much the same functionality, but all the metadata required to colorize text, tab complete, jump to definition, refactor, navigate by s-expr, etc. is just a cache, not something I have to think about.

p.s. this is why I hate Xcode project files.

"Microsoft Word has experienced random invisible bits of formatting stuck in the text somewhere"

In following the links to the paper on Femke, I found in that paper this expansion of the acronym Femke: "emphfunctional micro kernel experiment". I had to look at it a while before I realized that the markup had leaked into the result.

Then I thought that "emphfunctional" might just be a very useful neologism, perhaps one that well describes the primary article's idealistic shell, but I'm not sure what it should mean yet.

These are the questions that plague the mind that is having trouble focusing on its work.

Why are you limiting your view to some made-up markup for describing data? Data already has metadata: formats, containers, etc. It's all metadata.

Agreed. Absolutely.

I think something like this is already here.

When I'm working with Lisp in Emacs using Paredit, I am not editing text. Instead it is an interface to the expressions directly. I add, remove, transpose, cut, paste, move around, up and down, all with whole expressions at a time.

If someone were to build a new structural editor for programs, I could only hope it is as good as Paredit.

It's not as hip as Lisp/Emacs, but Java/Eclipse works in much the same way. IDEs with deep introspection capabilities are great. I hate all those mini refactorings when I'm programming Javascript or Coffeescript.


I love Emacs too, but so far I've managed to constrain myself to caressing it only with my fingers.

Exactly, a usable structure editor is extremely difficult to write.

The problems with Lisp are that your source code ends up flattened into plaintext in the end, and that identifiers and symbols must be resolved by string lookup; there is no "first-class" reference, just a collection of resolution rules and environments and symbol manglers. You could fix these, but you wouldn't really have Lisp anymore.

We code with text because it leverages our innate language skills. Words and grammar are how we communicate naturally. Typing in words and numbers transmits an enormous amount of information very quickly, compared to e.g. manipulating graphical controls. We use diagrams for very narrow domains of information, not for general concepts.

If you've got something better in mind, let's see it. I mean, let's see exactly how it is supposed to work. The general idea has been around a long time, but nobody has produced even a design that is compelling.

"Typing in words and numbers transmits an enormous amount of information very quickly"

This reminded me of NASA 'Mission Control' used for Apollo.

To eyes used to the movies or video games the consoles look bizarre: there are no graphs, no icons, no _pictures_. All the data came in, and went out, as numbers, or text.

They did it like that because to the engineers on the console the numbers could be grasped more quickly and meant more than a picture.

Communication using numbers instead of pictures is also how engineers failed to convince managers to stop the Challenger launch, which ended in disaster because its O-rings were frozen far out of spec.

I have not read about Challenger in years, but I'll grant you the point.

One thing stands out in my mind: the guys on the console for Apollo were not the same guys on the console in 1986.

Was the problem the way the data was presented, the people interpreting the data, or an organizational problem?

Good post. I should put my money where my mouth is: my replies in this thread all relate to the insights I've gained from a side project I've been picking at for four years now. But the editor is only one part of it, and one I've barely touched. I'll work on a blog post with illustrations.

I do completely agree with you that we ought to transition toward structural code editors (and therefore, I suppose, structural terminals), but as you say, that's a very hard problem -- I have yet to see one which isn't terribly clumsy to use.

VOPs in SideFX Houdini are the best use of code-with-nodes I've ever used. You 'write' your code as a network of nodes, which is converted on the fly to VEX (Houdini's internal scripting language) and compiled; the result runs very quickly. If you'd like to check it out, get Houdini Apprentice and have a play with the 'VOP SOP'.

Apple's Quartz composer is pretty interesting. Still clumsy, but improving: http://en.wikipedia.org/wiki/Quartz_Composer

I think this is a cool idea, but it seems to necessitate a language with a pure tree structure, i.e. lisp without conventional line-comments. Perhaps code merges would be more enjoyable too if the SCM software operated on trees instead of text files with lines.

It doesn't necessitate pure tree structure. The visuals are like conventional programming languages even if the underlying structure is tree-like. (Which is essentially the case anyway for any AST-based language.) Comments and other metadata are annotated onto the AST. (For example, English identifiers. In this way, a module of code can have an overlay of names in English, a French overlay, etc. And renaming a function or variable doesn't cause a rebuild or break any references because nothing is referenced by human-language names internally, even though it appears as so in the editor.)
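A minimal sketch in Python of the overlay idea described above (all names here are hypothetical, invented for illustration): identifiers are referenced by stable ID, and human-language names live in per-language overlays, so renaming never breaks a reference:

```python
import uuid

class Node:
    """An AST node that refers to other nodes by stable ID, never by name."""
    def __init__(self, kind, children=()):
        self.id = uuid.uuid4().hex
        self.kind = kind
        self.children = list(children)

# Human-readable names live in overlays keyed by node ID, one per language.
overlays = {"en": {}, "fr": {}}

square = Node("function")
overlays["en"][square.id] = "square"
overlays["fr"][square.id] = "carré"

# "Renaming" only edits an overlay; references (held by ID) never break.
overlays["en"][square.id] = "square_of"
```

The editor would display whichever overlay the user prefers, while the stored structure stays untouched.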

Yes, the idea is to unify the structure editor with version control. Having the entire history of how nodes were moved around in the code gives you perfect knowledge of what was moved where or changed into something else, unlike line-based editing, where the problem is AI-complete.

Maybe you should work on this problem full time.


Visual programming for dataflows - as in a series of mapreduce jobs on Hadoop - is an excellent case for this kind of programming. If you haven't read it, check out this paper on PigPen from Yahoo Research: http://bit.ly/v6eqwq

I could have written your exact comment ten years ago. We should hang out, so to speak.

> Storing code as pure structure

s-expressions, then. High-end Lisp editors can do a good bit of structural editing based on the fact s-expressions are pure structure.

> The gap between the goal-completion logic a user comes up with and how they turn those mental instructions into plaintext is an unnecessary jump--an encoding--that produces no end of trivial but infuriating miscommunications.

I think this is the wrong idea. Coming up with the logic in the first place is the hard part; encoding it is trivial once the encoding is learned. The fact is, thinking logically long enough to come up with a nontrivial piece of logic for a computer to execute is hard for humans, as is turning that logic, in whatever form it's stored, into an intuitive mental structure.

S-expressions are still stored as text. Editing as structure is what IDEs do, but they serialize back to text in the end. This is the mistake. Smalltalk is closer to what I'm talking about. Code stored like git stores data, with hashes and version control.

> I think this is the wrong idea [...]

You're arguing about something else. I worded it badly. The logic is hard and the transformation into a series of instructions is hard. Once you're about to type those in, though, you should be having a conversation with the computer, as opposed to sending a series of tokens into a black box and wondering if it understood what you meant.

> Smalltalk is closer to what I'm talking about.

And in Smalltalk, applications lived in the image and were difficult to extract from it. They didn't work well with code outside the image.

That was a bigger deal when the Desktop was King. It might actually be an advantage for web applications: Everything in the image is trusted in your security model and nothing much from the outside can get in. Run the image in a virtual machine (think Xen) to enhance the iron box effect.

> Code stored like git stores data, with hashes and version control.

I don't... we already have git. What could be more 'like git' than git?

> Once you're about to type those in, though, you should be having a conversation with the computer, as opposed to sending a series of tokens into a black box and wondering if it understood what you meant.

There has been some slight work done towards visual development, where you create 'circuit-diagram software': wire up logical components with visual control flow (or visual data flow, perhaps). It never seems to catch on.

> There has been some slight work done towards visual development, where you create 'circuit-diagram software': wire up logical components with visual control flow (or visual data flow, perhaps). It never seems to catch on.

We've done this in the past. You can blame implementation, perhaps, but we've found that in our use case (3D graphics processing workflows), it gets too hard to understand what is occurring in any non-trivial "program", since you're bound by the constraints of how much information we decided to display visually, whereas with a textual programming language (general-purpose or domain-specific) you can choose how much information to surface, by choosing the level of abstraction you want (or not, as the case may be).

The reason we chose to go visually in the first place, is because someone has the bright idea that "anyone" should be able to create these workflows, but it ends up that a technically minded 3D graphics engineer does it anyway, and they just get frustrated by not being able to write a proper program/script :)

It strikes me that IC design systems started out oriented towards a 2d visualization scheme but have evolved towards a text representation. Chip design was once done by laying out components in a 2d plan view of the chip, but is now done in VHDL/RTL.

An opaque image is the wrong idea, yes. A structure editor necessarily makes the structure easy to extract and manipulate, like git's tooling makes it easy to work with the git DAG.

Think of code where every blob in the git DAG is one AST node. (Obviously this is horrendously inefficient, this is just an example.) No plaintext.
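As a rough sketch of that idea (Python, with JSON standing in for a real serialization; the node kinds are invented), each AST node becomes a content-addressed blob whose hash covers its children's hashes, exactly as git stores trees:

```python
import hashlib, json

def put(store, kind, children=()):
    """Store one AST node as a content-addressed blob, git-style: its
    hash covers its own kind plus the hashes of its children."""
    blob = json.dumps({"kind": kind, "children": list(children)}, sort_keys=True)
    h = hashlib.sha1(blob.encode()).hexdigest()
    store[h] = blob
    return h

store = {}
two = put(store, "lit:2")
expr = put(store, "add", [two, put(store, "lit:3")])
same = put(store, "add", [two, put(store, "lit:3")])
# Identical subtrees hash identically, so they are stored exactly once.
```

Since equal subtrees share a hash, "what moved where" in a refactoring falls out of comparing node IDs rather than diffing lines.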

I'm not talking about that particular subset of visual systems like circuit diagrams or other awkward things like LEGO mindstorms. A good representation would still look like code as you know it, since that's proved to be such an information-dense and useful form. The "view" is the same, but the model has been separated from it.

Reading the introduction, I was really excited. I fully agree with the premise, but the proposed solution seems inadequate. Putting shell in an editor, adding hidden metadata, and making the output hyperlinked just don't feel radical enough.

Also, some parts of the proposal are very vague (description of MVC), while others are extremely specific (whole paragraph about an obscure Unicode delimiter), which makes it hard to get the big picture. That said, improving the terminal is a really challenging and important problem, so I'm glad there are people thinking about it.

> improving the terminal is a really challenging and important problem, so I'm glad there are people thinking about it.

People thought about it 20 years ago, not only about the terminal, but about improving UNIX in general. And not just any people, but the people that did UNIX and C in the first place. Their effort is called Plan9.

Very few people have heard of Plan9 and of those people even fewer used it to the point of understanding the novel ideas.

I use it. Even when I am forced to use UNIX I still use the Plan9 tools along with the Acme and Sam editors. Once you get used to the new Plan9 ideas, you feel crippled in UNIX and can never go back.

> Once you get used to the new Plan9 ideas, you feel crippled in UNIX and can never go back.

Would you care to expand on this? Any specific examples of things that Plan9 does for you that you feel crippled without? I ask out of genuine curiosity, as I've often heard Plan9 mentioned but have never given it enough research to understand its appeal.

For example, I'm a full time web developer and I spent a lot of time in zsh and vim. Would you recommend that someone like me checks out Plan9? Is it the type of thing you could possibly use as your general purpose OS?

I'm using Acme for web devel. Yields very well to scripting. Together with plumber and a few scripts, it provides pretty much a complete IDE. The ability to simply click on any text to 1) execute it or 2) search for it or 3) copy/paste/replace is something that I miss from other editors. Essentially, any text becomes hypertext. Output from compiler or debugger links you to source files etc.

Aside of that there's a big chunk of remote filesystem access ported from Plan 9, but I haven't used that yet.

Acme is, in fact, a modern, windowing terminal -- it lets you view, edit, execute and navigate. No surprise given that it comes from Rob Pike -- one of the guys behind an earlier windowing terminal: http://en.wikipedia.org/wiki/Blit_(computer_terminal) http://doc.cat-v.org/bell_labs/blit/

http://swtch.com/plan9port/ is quite easy to start with on any POSIX system.

Linux actually adopted some of those ideas: UTF-8, union directories, and the /proc filesystem are all Plan 9 ideas. But rio seems pretty awesome as well: https://en.wikipedia.org/wiki/Rio_%28windowing_system%29

The MVC part seems to have confused MVC the design pattern with the Smalltalk IDE.

The idea of using an obscure Unicode code point to indicate the start of metadata, in the expectation that nobody will use it, is probably self-defeating. As soon as it's in use, there will be a reason to use that obscure code point (in documentation, in code, etc.), so you will still have to deal with escaping it.

You'll have to deal with escaping it, but you won't need to escape as often, and may never need to escape if you aren't writing code/documentation dealing with the system itself.
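A sketch of what that escaping might look like (Python; the choice of U+2063 INVISIBLE SEPARATOR as the delimiter is hypothetical, not the article's actual pick), doubling literal occurrences of the delimiter:

```python
# Hypothetical metadata delimiter: an obscure code point, escaped by doubling.
DELIM = "\u2063"  # U+2063 INVISIBLE SEPARATOR, chosen here for illustration

def escape(text: str) -> str:
    """Double any literal delimiter so it can't be mistaken for metadata."""
    return text.replace(DELIM, DELIM * 2)

def unescape(text: str) -> str:
    """Collapse doubled delimiters back to literals."""
    return text.replace(DELIM * 2, DELIM)

payload = "docs about " + DELIM + " itself"
assert unescape(escape(payload)) == payload
```

Ordinary text without the delimiter passes through untouched, which is the "won't need to escape as often" case.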

I wholly agree that splattering chrome over a terminal isn't the best way forward. Emacs shell mode gets me all (or at least most) of the way to the interface posed as the solution (a text editor as the interface to the terminal). Being Emacs, I am sure I could finish up the proposed solution with "a few lines of elisp".

And I don't think that's the solution. These issues spring from deeper in the design. Multiple streams of information are being spawned and must be handled sanely: pipe redirection starts breaking down here because it's a line; what's needed is a graph, able to handle different cases and join back together for the next step.

One approach might simply be to develop a higher-powered programming language environment that calls directly into the system.

Windows Powershell attempts to do what you're describing. PowerShell doesn't manipulate text streams, it manipulates .Net CLR objects. That allows shell programs to expose data in a much more meaningful way. You don't have to have a parser in every program that interacts with the shell when you can send and receive typed data.

I haven't worked much with it lately, but I wouldn't be surprised if Powershell was the closest existing technology to what the author is thinking of.
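A toy analogue in Python (not actual PowerShell; the stage names and record fields here are invented for illustration): pipeline stages pass structured records instead of text, so downstream stages select fields by name rather than re-parsing:

```python
# Toy analogue of object piping: stages pass dicts, not text, so no
# stage needs its own parser.

def get_processes():
    # Stand-in data; a real stage would query the OS.
    yield {"name": "init", "pid": 1, "cpu": 0.0}
    yield {"name": "httpd", "pid": 80, "cpu": 12.5}

def where(records, key, pred):
    """Keep records whose field `key` satisfies `pred` (like Where-Object)."""
    return (r for r in records if pred(r[key]))

def select(records, *fields):
    """Project records down to the named fields (like Select-Object)."""
    return ({f: r[f] for f in fields} for r in records)

busy = list(select(where(get_processes(), "cpu", lambda c: c > 1), "name"))
# busy == [{"name": "httpd"}]
```

The point the comment makes survives the translation: the filter selects `cpu` by name, with no AWK-style column counting anywhere.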

I think he agrees, except for the textual display.

From the article:

  Data structure. Windows PowerShell had a chance to 
  redesign the terminal from scratch, but defaulted to 
  the same old grid of ASCII. One innovative thing they did 
  do was add structure to their data, piping .NET objects 
  instead of raw text, allowing the user to select fields by 
  name instead of writing elaborate AWK scripts. The shell 
  for the research OS Famke does a similar thing for 
  higher-order functions.

I didn't quite get where the author was going with that. Windows Powershell has access to the full Windows .Net API. If you want to spawn a window to display graphics or what have you, you're free to do so.

You can already side-chain data flows with file descriptors. You can solve more complex problems with temp files and tee. I can't really imagine that making me use a GUI to select where to pipe what would be more efficient. Most of the time, when people have problems that require complex data manipulation like this, they'll just use their favorite scripting language.

There is also not much stopping you from using PHP, Python, Ruby, or anything else with a REPL as your daily shell, but bash is the most popular and works quite well for this problem domain.

Not to sound like an old dude stuck in his ways, but there is a reason that we're still all using VT 100 and 220 emulators from 30+ years ago -- they work great.

"Not to sound like an old dude stuck in his ways, but there is a reason that we're still all using VT 100 and 220 emulators from 30+ years ago -- they work great."

No, they don't. Really. Look at the source code to ncurses and see if the millisecond timing loop around select() is still there, to distinguish between a terminal sending ESC[1D in response to you pressing "left cursor" and a human pressing "ESC" "[" "1" "D" very quickly :-)

It sucks. Windows got this right a long time ago, to the extent that the left & right modifier keys are distinguishable. Try doing that on a VT100.

Of course, a REAL VT100 doesn't have half the keys; F1-F4 (IIRC) aren't sent "over the wire" and have no standard encoding. Linus basically invented one for the "linux" console terminal type, and lo, yet another "standard" was born.
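The timing hack a terminal library is forced into can be sketched as a pure function (Python; the timeout value and classification logic are illustrative, not ncurses' actual implementation):

```python
# After an ESC byte arrives, the library waits a few milliseconds: if more
# bytes follow quickly it was probably a terminal-generated escape sequence;
# if not, the user really pressed ESC. (Illustration only.)

ESC_TIMEOUT_MS = 25  # arbitrary here; real libraries make this configurable

def classify(first_gap_ms: float, rest: bytes) -> str:
    """Decide what an initial ESC meant, given how long the next byte
    took to arrive and what bytes followed."""
    if first_gap_ms > ESC_TIMEOUT_MS or not rest:
        return "ESC key"
    if rest == b"[D":
        return "left arrow"
    return "unknown sequence"

assert classify(1, b"[D") == "left arrow"   # bytes arrived fast: terminal sent them
assert classify(200, b"[D") == "ESC key"    # a human typed ESC, then [ and D
```

The ambiguity is inherent to the wire format: nothing in the byte stream itself distinguishes the two cases, only the arrival timing.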

And don't talk to me about shells. The lunacy that is never quite knowing what insane metaquoting scheme you'll need today based on which arcane shell some moron has configured as the default for THIS particular machine is EXACTLY what the top-voted Koan is about.

> There is also not much stopping you from using PHP, Python, Ruby, or anything else with a REPL as your daily shell, but bash is the most popular and works quite well for this problem domain.

This is relevant: http://stackoverflow.com/q/3637668/336455 In short, there are some objective reasons, not just popularity.

"There is also not much stopping you from using PHP, Python, Ruby, or anything else with a REPL as your daily shell"

That's a fascinating idea. Know of anyone who does that?

Does emacs count?

Emacs shell mode was my first thought too. It doesn't try to implement a new standard for pipes, though. The idea of using invisible control characters as metadata is interesting.

(My biggest complaint about shell mode is that it's impossible to send certain characters to the underlying process, TAB being the most important one, so you can't rely on built-in tab completion behavior.)

Thanks, this reminded me of Microsoft Research's Dryad project, which includes a distributed "two-dimensional" shell, Nebula. http://research.microsoft.com/en-us/projects/dryad/

Sometimes when I have text files open I want to be able to run command line tools against them. I know I can go write a shell script, but sometimes I want to run grep against my inbox or against all the windows I have open at the time. I want to be able to highlight some text (possibly in more than one window) and then have a transparent shell appear that lets me write expressions to work with that text, like sorting it in place. Often I have to move things into and out of text files, spreadsheets, and databases to be able to apply all the tricks that I like. I wish those tricks could just appear and work with what I'm looking at.

Emacs (and Vim as well IIRC) can run shell commands against selections of text so just read the files in there.


I have used emacs before and I did like having multiple shells open and the ability to copy many text fragments into the buffer. I use windows for my day to day work so I'm thinking about something more geared to that.

I want to be able to take the best features from the command line and the GUI programs and package them up into something that just floats at the OS level, so that I can use a keyboard shortcut to pull up a buffer that can grab text (or entire files), work on it, and push the results back to whatever app/file I pulled it from.

I think DTerm (http://decimus.net/DTerm) is a step in the direction that you're describing...

That looks really interesting. I'm going to install it when I get home. If I'm able to be looking at a text file, launch DTerm with a keyboard shortcut, run something that modifies the file in place, and then have DTerm fade away then that will definitely be useful.

> and Vim as well IIRC

Yup, via `!` and `r!`. See, for example, http://www.oualline.com/vim-cook.html#format_para.

You can also execute a selection with ":w !sh". For example, if I have a line that says 'echo "Hello, world!"' I can select the line with Shift-V and then ":w !sh" to execute the line in the shell. It works with other interpreters too, so ":w !perl" would execute it in perl. The vim command line should show "'<,'>" between the ':' and 'w'. Using it without selecting something first sends the whole buffer to the interpreter.

> The vim command should show "'<,'>" between the ':' and 'w'.

I've always wondered about this; I sometimes get it (but don't know how to reproduce it) when trying to navigate. What does it mean?

It's a range that means from the beginning of the selected area to the end. It's really two marks ("'<" and "'>"), and there is a good description of them in the vim help (":help '<" will bring you there).

I am skeptical about total reimaginations, and when I read this I started to take it as one of those airy "Dude, someone totally revolutionize this for me..." posts, but then I hit:

I’m working on this in my spare time, starting with...

and now I'm curious to see what the author comes up with.

There's just something about an actual tool, no matter how prototypical, that improves the discussion of new tools.

You can and should separate view and controller. Imagine if you could spawn windows at monitors, and then have things display on them via pipeline.

You might type into a tiny window on one monitor that never loses focus, but sends graphs and text and streams to these windows.

There's no reason that the place you enter text should correspond to the output display, and a lot of value to be had by separating them.

Imagine being able to plug display consumers into a port so that you could do visual demos.

Most unixy programs actually do have them divided: stdin and stdout (and a third stream, stderr, for errors). It would be very easy to separate the streams to different places.

ETA: at the last place I worked, we actually had the touchscreen on one desk and the program running in a window on another computer. It was pretty nice to type something and make it show up across the room.

On this topic of stdin and stdout: wouldn't it be neat if, instead of signals, there was an input channel called stdctrl that control messages could be sent in on?

Glad to read about the scenario at your last place - I've never seen this in action.

Well, those are just the defaults; a program can open a lot more file descriptors if it wants to. IPC is an old problem with a lot of pretty good solutions by now.

Well, it's not exactly what you describe (which does sound kind of intriguing), but Linux does offer the signalfd(2) syscall, which allows you to read signals from a file descriptor. Doesn't do much for the sender of said signals, however.

At AOL circa 2000, our application framework had a Tcl interpreter listening on a "control port", so that you could telnet to any app and get stats, change settings, etc.

Of course, nowadays that's more commonly done with REST endpoints, which I think would satisfy your stdctrl idea.

I think Plan 9 does something similar. You send signals to a process by writing to a file in the process's /proc directory.

The concept of a view and a controller are distinct, a terminal simply happens to be both, because it can do both. Commands are solely controllers, and you're right, a read-only terminal would be just a view.

Here's my six-year-old rant about how crufty terminal technology is. It's not a visionary re-imagining of terminals like this article is, but personally I think that one of the biggest barriers to innovation in terminal technology is that the current stack is so incredibly baroque and difficult to program or extend elegantly.


There are some interesting ideas here. One thing that occurred to me while reading was using a message queue to implement the pipeline functionality. I've been interested in Apache ActiveMQ/Camel for a while now and am excited about some of the things it can do. With that model, each 'line' could have its own set of headers, allowing transformation by aware tools while staying transparent by default. This would also blur the line between running commands and having background filters. It would also work nicely over the network.

I did some experiments with clojure/camel at one point, and came up with stuff like this:

(defroute context (from "file:/home/jw/scratch/inbox?noop=true") (to "file:/home/jw/scratch/outbox")) ; from http://codeabout.blogspot.com/2010/06/using-apache-camel-fro...

Which is basically a continuous file copy. I could see different flags for "do now and exit", "do at HH:MM:SS", "do until I tell you to stop", "do when system is idle", etc.

I'm not sure what you'd really gain from using a message queue. Pipes with streams seem to be lower overhead, and could be extended pretty trivially to get most of the functionality you'd want.

Could you perhaps explain a bit better?

The author confuses a terminal with a shell running in a terminal.

I'd like for my terminal to be able to open a stream of html in my web browser. Use case: running `man blahblah` in a ssh session opens a nicely formatted page in a local browser window.

I'd like for my terminal to be able to open a stream of text in my editor, and accept a stream of text back from my editor to save somewhere. Use case: running `sudo -e /etc/blahblah` in a ssh session allows me to edit the remote file /etc/blahblah in a local text editor, and save my changes back.

Aren't you confusing a terminal with a terminal emulator?

Possibly, although as far as traditional unix shells are concerned, they're the same thing.

How about something like Archy? https://en.wikipedia.org/wiki/Archy#Features It was never polished but you can still play with an old (windows) build: http://users-www.wineme.fb5.uni-siegen.de/home/SebastianDrax...

Edit: or for a gui, something like Enso (for the whole OS) http://humanized.com/enso or Ubiquity https://wiki.mozilla.org/Labs/Ubiquity/Latest_Ubiquity_User_... (for the browser) Both are abandoned open-source projects with a lot of the hard work already done (the internationalized parser in Ubiquity is very nice).

Acme seems to come pretty close to the author's idea, combining both a terminal and an editor.

For anyone wondering: http://acme.cat-v.org

Fun fact: Acme was Dennis Ritchie's editor of choice.

His vision also sounds quite a bit like Emacs, especially the GTK version.

I've been thinking about a user interface for a long time: one that leverages the graphical capabilities of the browser and the linguistic capabilities of the command line.

The two enhancements over the command line that I envision are the discoverability of commands and the ability to select the output of commands. So what you'd have is an Enso/Vimperator-like command interface where the output would be rendered as an 'object' or list of 'objects'. Each output object would be selectable and usable as input for other commands able to digest them.

You can sort of do this now with the command line, but there are a few issues. The first is the inability to render graphical results in a terminal. The second is having to know something about the output of a command a priori, before piping it to the next command. Finally, it is difficult to discover the commands available at your fingertips on the $PATH.

The more I meditate on a system like this, the more I think it would be wildly productive, because a whole host of problems could be solved with this single workflow.


While perhaps off-topic, I'll chime in with an anecdote in remote computing I had this past Thanksgiving.

Short version: if you're an emacs veteran, tramp is the ssh version of ange-ftp.

Long version: I was down in the San Diego area for Thanksgiving and my development environment (Linux) was up here in San Francisco. I had ssh and VNC access to my Linux server in SF, however the latency was still pretty bad, making VNC just barely usable. However, I had access to a Mac in San Diego with MacPorts X11 emacs installed. I could then run emacs locally and use tramp to transparently access my remote files via ssh, just like the old ange-ftp package. Another benefit is that running a shell within emacs from the current tramp buffer will automatically ssh to that same server. Tramp would also work with version control (in this case svn).

So the big wins here: 1) local editing speed, 2) efficient network communication (only the file data is transferred), 3) shell and version control support, 4) can still use emacs.



sshfs does that, and you don't have to use emacs. Gnome's VFS also does that, with its own protocol that gets exposed as a fuse filesystem for non-vfs applications.

I think they're seriously misunderstanding both TermKit's goal and the difference between running commands and editing text.

TermKit, for instance, is not just a widget-infested terminal. The core idea behind it is what >50% of this post is about: data interchange that's not un-tokenized text that you have to `awk` to hell and back to do basic things.

Meanwhile, text editors do two things incredibly differently than system-control interfaces: they edit blocks of text, and they can un-do almost every action. Next time you `rm` something accidentally, try pressing `u`, and see if it comes back. Or rewind that `drop database production;` your cat typed into your ssh session.

I also don't want Bash to be my editor for similar reasons why I don't want to manipulate my filesystem with Vim - I rarely need to enter visual select mode when composing something in Bash, and Vim is poorly-suited to piping streams of text through multiple programs.

Some months ago I tried to improve the shell while keeping all its features. So I made a WebKit-based terminal emulator (a pseudoterminal, like xterm) with special escape sequences to turn on HTML output (in a console, yes). Screenshots: https://github.com/shepik/wkterm/#readme

This looks pretty cool. I actually spent quite some time thinking about something like that. I came up with two things that I'd add to a shell: graphics and links. Graphics you did. It seems that it doesn't degrade gracefully though; could it behave like programs that do colored output and disable it when run non-interactively?

I imagined links as something which when activated would be pasted in your command line. For example when running "git status" I would be able to get a ready made command to add a file by activating the "add" link next to it. Each file name would also be a link so that I don't have to copy/paste it. Ideally, links could be triggered with keyboard shortcuts to avoid using the mouse.

I don't think there is an issue with graceful degradation. Basically, wkterm is a terminal plus a set of pretty-printers that read data from stdin and print it as HTML. As in my example with the CPU usage graph, there is some ./gen_cpu program which just prints CPU usage values as text, and there is a graph-drawing program. So, use them both for graphical output, and use only the first for text.

Links, as you describe them, can be implemented with JavaScript. I'll try to make an example over the weekend.

You might also be interested in the Cope project (https://github.com/yogan/cope). With the same approach, you could make existing commands return nice-looking output for your HTML terminal.

This sort of reminded me of TermKit, which I think is still on Github... but sort of stagnant the last time I looked.

Yes, it's here: https://github.com/unconed/TermKit

I still think the idea is great. These guys should work together instead of making yet another competing project.

Yes, I really liked that project and I think the design is fantastic (it leverages WebKit for rendering).

Both TermKit and this project will have an interesting dilemma with editing remote files though. The advantage of SSH is that you really aren't storing any remote files locally (only the currently visible characters are stored in memory). An editor that uses this new model and eliminates lag requires transfer of remote files to the local computer. There are a lot of situations where this is not a viable option.

A very good point. My prototype protocol, rwr, reads only the data in the file that it needs to know about. Additionally, when one inserts a character it simply says "insert 'a' at byte 4", rather than having to stream the entire file. That's another benefit of running filesystem models as daemons on remote servers.

The beauty of Unix is its simplicity.

This made Unix so reliable. If you want to write a beautiful terminal as an _add-on_, I have no problem with that. But I would not accept a _replacement_ for the old-fashioned terminals. Because they are so simple, they just work.

Apparently you've never messed around with termcap. Or had to use terminals other than xterms and vt100s, or some very near clones of them.

Meanwhile, it took Unix forever before the PC's backspace key 'just worked'.

The terminal guts are unbelievably gnarly, and it's kinda surprising it hasn't been replaced with a simpler approach designed for virtual terminals by now.

> surprising it hasn't been replaced with a simpler approach designed for virtual terminals by now.

Because they ... just work? :-)

If you want to waste less time reinventing the wheel, why not try a Lisp-machine style terminal system with "live" objects (presentations)? It's proven to work, and many people liked it more than a terminal-based command line. Use polymorphic lines (http://groups.google.com/group/comp.lang.lisp/msg/02782906f6...) to get a very concise, powerful presentation. OpenDylan (http://opendylan.org/) and its IDE/editor Deuce implement these ideas.

I agree that there is too much focus on "intuitiveness," but I'm neither a programmer (I'm working on it) nor a sysadmin, and I'd like a more robust user interface too! I have a lot of work to get done, and I'm willing to learn a different set of interface axioms than the ones we've been working with for the last few decades.

Here's the beginning of my thinking on the subject: http://blog.byjoemoon.com/post/9325300749/a-different-kind-o...

It's great to hear other voices in this discussion, though!

This is text editing in OpenGL: http://www.youtube.com/watch?v=2O5DJTOy6EA I'd like to have such a terminal.

Sublime Text is a text editor which runs on OpenGL; it's not quite what you mean, though.


It used to, but starting with version 2 it switched to software rendering.

iTerm2 already has the feature where you click on a filename and it opens the file. If you're running a newish iTerm2, give it a go: do an ls, then command click.

I've called it Semantic History. It also lets you drag files out of the terminal. Old video walkthrough: http://vimeo.com/21872771

It does not require any special ls. As long as there's a legit filename, it should work.

How does it know which directory you're in when it isn't the full path? It must integrate with the shell somehow, right?

Heh, that was a fun one to figure out. When digging around in iTerm2 source, I found out that you can figure out the current working directory of the shell, so that was relatively easy.

The hard part was my goal of getting it to work even if you change the working directory, so that paths that were legit before stay legit. I found out that my shell sends an escape code that updates the terminal title when it changes directories, so I hooked into that, but it turns out that's oh-my-zsh specific.

I also added the ability for it to work even with spaces, which is essentially brute force, so it's not pretty.

I find the article a bit confusing. Given this statement:

In addition, the keyboard is often more effective than the mouse for our work, since instead of floundering around in nested menus we can just type what we want. However, it’s worth noting that we don’t avoid the mouse because it is slow — if one wants to move the cursor to an arbitrary location elsewhere on the screen, one can often do so faster with a mouse than a keyboard. The problem is the transition from the keyboard to the mouse. It’s an expensive context switch, which should not be done lightly.

I don't really understand the following:

We then add syntax highlighting and hyperlinks, so you can easily navigate between man pages, or click on a grep result to visit that line in a file. Clicking on a hyperlinked directory in a file listing would reveal the contents of that directory in a nested list, slightly indented; clicking on a file would open it in a new tab.

Too much clicking for my taste.

Could someone try to distill out some sort of concrete proposal from this? I had trouble understanding what he was implying.

Kind of off topic, but this article just reminded me: I was thinking how nice it would be for a Linux distro to be made for programmers. Bundled with every bit of programming language (from C to Haskell and back), the standard editors with their syntax highlighting (the .vimrc already set up a little), cool special tools that maybe only serve a special purpose (stuff like GNU Radio). I was just thinking it'd be nice to have one big package, a "here ya go, have at it" kind of Linux. Maybe even with some alternative (but vetted) how-to documentation; for some reason I just can't get used to the flow and syntax of man pages. And it can leave out lots of the fancy GUI sidebars and stuff (not saying get rid of the GUI, just dumb it down; I'm here for the terminal, and maybe some IDEs... maybe).

Read the koan in the above comment again. Your vision of a Linux distribution "for programmers" is very likely different from my vision, let alone that of Dennis Ritchie or Linus Torvalds. This approach is the fundamental mistake made by other operating systems: to assume you know what the user wants/needs, and give it to them in one nice shiny box with a bow on it. It would be so convenient... right?

The core philosophy of Unix is to build tools that do one job well. By combining those tools, it is then possible to build great things. Do not assume that means you can hide the tools away and just give people the great thing, and realize it is naive to believe that you can solve everyone's problem by creating the one true master editor/program/os that combines all the great things from previous attempts, but this time "gets it right".

This is one of those rare delightful articles. After 6 years of non-programming work, and now back in front of the console, I came to understand that what I like most is the freedom the shell provides, and the room for creativity when accomplishing a task with a readable composition of tools in one line.

Certainly the various successors of the Bourne shell may be more elegant. Certainly, piping data through several tools often requires transforming the data representation to match the required input format, and sometimes those transformations use different principles (regular expressions, shell wildcards, awk, sed, ...), which pollutes the logical flow of problem resolution. But I came to the conclusion that, despite not being optimal, this data-transformation noise holds information that helps one understand, or at least remember, the data model from the source to the sinks, both when reading and when explaining to others. Still, I believe more uniform yet more universal data transformation techniques would be progress.

Until now this comment is more about shell than terminals, but for me, these are the biggest advantages of working on a command line - freedom and creativity in the usage of available tools.

Now, to come closer to the terminal aspect.

I am a typing fan. I went through hell when moving from Germany to France and having to learn a new keyboard layout: neither layout is good, but if you are used to one it is very annoying to get used to another. After some missions in other countries with different keyboard layouts, I found a solution that is in line with my philosophy of using typing tools: learn the US keyboard layout, to the point of using it blindly without needing letters on the keys; it is installed on all OSes in all countries. At home, use the US International extension with AltGr dead keys, so I can now write German, French, Turkish and many other Latin-based accents with a single layout. When on a mission at a customer, I ask whether I can switch the open session to the US layout (if needed); so far there have only been some raised eyebrows, but no objections. And at least ThinkPads can be ordered with the US layout, even with the € sign. So for keyboards I have tackled the problem.

For text editors I solved the problem ten years ago: I use Vim. vi is on all Unixes, and Vim is on Windows (I have not yet had to work on Macs).

On a Unix shell the first thing I type is "set -o vi". This way, many vi shortcuts and commands are available at the command line.

One quote in the article made me smile the most:

"... However, it’s worth noting that we don’t avoid the mouse because it is slow — if one wants to move the cursor to an arbitrary location elsewhere on the screen, one can often do so faster with a mouse than a keyboard. The problem is the transition from the keyboard to the mouse. It’s an expensive context switch, which should not be done lightly. ..."

Yeah, this is the reason I use a track point instead of a mouse. Nothing to carry with you, it takes no space, and you don't have to lift your hands from the keyboard. The only problem is the craving when having to sit at a customer's keyboard that has no track point; I sometimes find myself searching for it with my fingers until I realize that I am not on my ThinkPad. It is a shame this device is disappearing from most computers.

Lastly: I think what unites the aficionados of the command line is the choice to spend more effort and time learning tools which are more difficult than their alternatives (mouse vs. track point, command line vs. GUI, ten-finger typing vs. two-finger typing) at the beginning, but pay off big time in efficiency in the long run, and as a plus give the pleasure of creativity and freedom.

I encourage the author of the article to press on with his ideas! We absolutely need innovation in this space. It's just that the bar is high because of all the tools we've become used to.

> Yeah, this is the reason I use a track point instead of a mouse.

Vim + a touchpad beneath the space bar on a standard laptop works well too. I can left/right-click with my left thumb and move the cursor with my right thumb, without my primary digits leaving the home keys.

Sure. The thing is that

(a) a good track point is far more precise than a touchpad

(b) to cross the screen from one corner to another on a touchpad you have to make several strokes, whereas with a track point you never have to leave the device

> "We then add syntax highlighting and hyperlinks, so you can easily navigate between man pages"

GNU info?

This article appears to be functionally describing emacs.

I'd be 90% satisfied if we just got rid of termcap and friends.

Let's just replace escape codes with some kind of markup language (or s-expressions or whatever).

Interactive SCSH Shell. That's what I'm waiting for.

> our primary interface emulates a DEC VT100 terminal from 1978

Huh? I didn't know the DEC could run software like Visual Studio, Explorer and TortoiseGit.

Whether you use an interface from the past or not is entirely a choice. Many developers apparently prefer it, and that's fine. I don't prefer it, so I use tools with a rather modern and well-designed interface, tailored towards developers.

Please don't forget that the terminal should be useful to just get work done without programming. It's also all designed to have a human interface - all the output can be grok'd by a human and not some complex tool or parser. This is what prevents things from just magically working and forces us to come up with hacks to pipe and grep and cut pieces of data to do what we want.

If you really want to "reimagine" it, throw out the box and make a new one. If you redesigned all the standard unix tools to have a universal API and added hooks for each function they contain you could just specify a workflow to execute and the tools would figure out how to transmogrify the data internally. So for example:

  rehooliginator --store=val1 --filesystem=/proc/cpuinfo --rowname='model name' --match='([[:digit:].]\+)GHz' --store=val2 --filesystem=/proc/meminfo --rowname='MemFree' --match='([[:digit:]]\+) kB' --store=val3 --cmd=ps --fields=rss,comm --sort=rss --match='java' --field=rss --sum --math='$SUM*1024' --store=val4 --cmd=vmstat --samples=5 --field=cpu-idle --avg --output="Stats:\n\tCPU: $val1\n\tFree Memory: $val2\n\tResident memory used by Java: $val3\n\tCPU idle time: $val4\n"
Not the greatest example but you get the idea. If this seems more complex than traditional one-liner scripting it's because you're trying to do a lot of little things on a single line. It may be better to shove all this into a little file in easy to understand non-programmer language and save it for later. (Also, the long GNU options could be replaced by short options for quicker use, depending on the API/module being used)

This is obviously not getting away from the 'old school dynamic' of a fake terminal, but it does remove some of the need for it when we have tools robust enough that the terminal doesn't have to be as user-friendly as it is. You could combine a tool/framework like the above with a text editor to write multi-liners, sample the output and execute them on the fly. Build in hooks to execute commands over an ssh connection - or even gather output from various hosts at a time - and you could automate sampling your whole network from a one-liner.

For a "friendly interface" I think a simple tree view file browser would work nicely. So basically an "explorer"-type app with an embedded text editor and output window to let you explore a system rapidly and also automate tasks on the fly. Hell, you could build an IDE or other friendly GUI to build your query tool's arguments using quick mouse clicks.

Eshell (The Emacs Shell) is a lot of what the original post already describes in terms of power: http://www.gnu.org/software/emacs/manual/html_mono/eshell.ht.... Too bad it isn't being very actively developed anymore.

While I like the terminal... I don't want to be one.

Best Ivo

"And we, the users, play along, pretending our machine is a video terminal presenting a grid of ASCII characters in all of 256 colours. This is ridiculous."

Not everybody uses their terminal to churn out HTML pages and add 'Nyan mode' to their 'newly discovered' emacs program. For those who do, just buy a Macintosh or whatever is this week's hip flavor of Best Buy PC. Otherwise, use a language that doesn't require more than 256 colors to be represented meaningfully.

"Typography is the future"? Thank goodness X.org/XFree86 has supported custom fonts since the 1990's.

"Opening a man page would scroll gently to the top of the page, letting you scroll down and read, or search through it as you would any text"


"We then add syntax highlighting and hyperlinks, so you can easily navigate between man pages"

Many terminals and shells support these features already.

"Finally we add visualisations so you can view plots of lines of code, etc., without having to context-switch."

Huh? I read that as 'code folding' and clang compilation.

I think the main takeaway here is that most of his "ideas" can be easily achieved within the current ecosystem of available programs, most of which are stock on modern UNIX-like OS distributions. I do think he misfiled this article under "Ideas"; it's more akin to a polite rant.

edit: colours/color killed due to conflict with reality (and irrelevance anyway).

No downvote, but I believe it's worth thinking about ways to advance power-user interfaces beyond the emulation of teletype machines (first deployed in 1910). It may be that they are the optimal power-user text interface, but it may also be that they simply occupy a local optimum, and we need to keep searching.

You don't seem able to see the forest for the trees... Taking quotes out of context and dismissing them is not sufficient to dismiss the entire article.

Also, I'm British.

All quotes save for one-liners are out of context. The points you make stand on their own.

I don't dismiss the entire article; I appreciate thought and innovation in the space of the terminal, but I disagree with your ideas. Thanks for putting them out there to begin with.

I'm curious, have you used Plan9? If yes, how long did you use it?

I have, yes. I own a Pentium 4 which runs Plan 9, and have done for two years or so.

I'll throw in my .02.

I see the context-switch as one between text and graphics.

If I'm working on the command line, then most times I have no need to have X11 running. I'm working exclusively with text. I can boot to a command line and start working. No X11 is needed.

But when a need arises for graphics, e.g., to read a PDF composed of scanned images (not pure Postscript), then I have to "context-switch" to the X11 context.

I find that switching back and forth between these two contexts is not smooth and can easily lead to instability.

There is often a presumption, as in Plan9, that we will just switch once: to the graphical environment. And not return to the original console.

To me, neither an X11 terminal emulator nor the Plan9 environment is "the console". It's another layer of abstraction on top of the console.

That is a lot of overhead I do not need if I'm just working with text.

Isn't that problem solved by virtual terminals?

Sort of. But you have to keep X11 running on another vt. Stopping and restarting X many times in a session is a different story. At least for me.

And even when I keep X11 running on another vt, I've found that when using no wm, or a simple one like evilwm, switching back and forth from the console (on one vt) to X11 (on another vt) many times does not work well. Eventually it fails.

This is on {Net,Free,Open}BSD.

I don't think that Plan9 has vt's as such. It's more like what the article envisions, with graphics capabilities seemingly woven into the terminal. But you're pretty much stuck in an X11 type environment. Plan9 experts correct me if I'm wrong.

I've always found this "context switch" from console to graphics is like a one-way street. You're not really expected to keep shutting down the graphics and going back to the console. At least I've never found anyone who does that.

{Net,Free,Open}BSD... which one? All three?

I'm a NetBSD user that uses a tiling window manager (i3 - not ion3, it's different). ALT+1 and ALT+2 are where I keep my urxvts, ALT+3 my web browser, etc. The switch happens instantaneously.

Am I mistaken in my understanding of your issue, or is this helpful?

Weak typing and implicit coercion are not the same thing. They're not even close to the same thing. Haskell does (things equivalent to) implicit coercion when it adds a floating-point Number to an integral Number and nobody (sane) says Haskell is weakly-typed.

Haskell uses type inference, but that's not the same as implicit coercion, which Haskell explicitly avoids. That is: if you type 4 + 4.2

Then the compiler will infer that you mean (4::Fractional a=>a) + (4.2::Fractional a => a)

However, you cannot add an integer and a float: (4::Int) + (4.2::Float)

    Couldn't match expected type `Int' with actual type `Float'

    In the second argument of `(+)', namely `(4.2 :: Float)'

    In the expression: (4 :: Int) + (4.2 :: Float)

    In an equation for `it': it = (4 :: Int) + (4.2 :: Float)
This follows from the type of (+) :: Num a => a -> a -> a

I guess I still don't get how there could be a difference between Float and Fractional, then.

Haskell uses type classes to support ad hoc polymorphism, or overloading. Consider the statement:

(4.2::Fractional a=> a) + (4::Float)

Float is an instance of the type class Fractional. That is: methods which are defined for all fractional types must be defined for floats.

The compiler infers that (4.2::Fractional a=>a) must have type Float, as it is being added to a Float. This is compatible with the original type of the expression: since Float is an instance of Fractional, it is valid to read 4.2 as a Float.

This diagram might help: http://www.haskell.org/onlinereport/basic.html#sect6.3
