Are We Sixel Yet (arewesixelyet.com)
206 points by ecliptik on May 14, 2023 | hide | past | favorite | 149 comments



See also the rant[1] by the sixel-tmux author.

> It's 2021, and we should be able to do litterate programming in the console, with full graphical support.

Yeah. We are stuck cosplaying computers from the sixties.

What's even funnier, even if you find a modern terminal emulator that supports features like ligatures, graphics, emoji etc. you still will be blocked by tmux. Sure - not everyone needs tmux. If you never work on remote machines, you can live without it.

But I work on remote machines all the time. I also use the Kakoune text editor, which defers window management to external tools (WM or tmux, but to be honest, tmux is much better). Zellij is more of r/unixporn bait than a usable tool for now. So I'm stuck with a text-only interface.

[1]: https://github.com/csdvrx/sixel-tmux/blob/main/RANTS.md


For those like me with the opposite inclinations, try dtach. It is only a session handler. It does not pretend to be a terminal, or do splits or tabs or anything but I/O. That's left to your actual terminal.


How does that work?

I mean, you could avoid knowing anything about the terminal state machine if you dumped the entirety of the session’s bytestream from the very beginning on each attach event, but unless you are willing to do that it seems to me that you’re going to have to track enough state to basically amount to maintaining a screen buffer (and then you do have to care about sixels).

For example, suppose the user fills the screen (maybe even with an image), then spends a lot of bytes overwriting the last line over and over (think progress bar). Either you’re tracking the first byte to have affected the current screen (and need to store the entirety of the “a lot” and dump it onto a newly attached client) or you’re trying to discard all the stale updates while keeping the last one and the initial screen contents (and that’s just a screen buffer with extra steps). (Incidentally, I suspect Muratori’s Refterm fails this test to the point of requiring a redesign, though I haven’t checked.)


dtach sends control-l or sigwinch or one of a few other things when you reattach, at your option, so if you're running a fullscreen program that cares about the terminal state machine, it will redraw itself


I see, yeah, that’s neat. I can still imagine situations where it’ll fail, but they’re reasonably limited. (If you detach and attach in the middle of a large sixel image being transmitted, you’ll have the tail of it dumped on you in text form; if you detach from a session with the terminal set to red text, it’ll be back to default when you return; and so on.)


For me the most common breakage was due to ssh dropping overnight (by corp security policy) while a remote editor has the (local) terminal in an altered state (i.e. modifyOtherKeys). Easy to recover, and for me well worth it for the direct I/O passthrough.


> do splits or tabs or anything but I/O. That's left to your actual terminal.

I use tmux exactly to split panels in any way I want. My terminal (iTerm2) doesn't give me the full power to manipulate windows/panels using shortcuts.

But unfortunately I'm gonna have to stop using tmux because it's not compatible with sixel. I spent a lot of time ricing it tho...


> My terminal (iTerm2) doesn't give me the full power to manipulate windows/panels using shortcuts.

Which tmux pane management features do you miss in iTerm2?


If you're in control of the sixel producer, tmux has a ‘passthrough’ sequence, ESC Ptmux …

(I used to use tmux, but ultimately found it more trouble than it was worth.)
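
As a rough illustration of the passthrough sequence mentioned above: the wrapped payload starts with ESC P tmux ; and ends with ESC \, and every ESC inside the payload is doubled. The helper name `tmux_passthrough` is made up for this sketch, and it assumes a tmux recent enough to have the `allow-passthrough` option, which has to be switched on for the wrapper to have any effect.

```shell
#!/bin/sh
# tmux_passthrough is a hypothetical helper name, not something tmux ships.
# Envelope: ESC P tmux ; <payload with every ESC doubled> ESC \
tmux_passthrough() (
  esc=$(printf '\033')
  # Double each ESC so tmux forwards the payload instead of parsing it.
  doubled=$(printf '%s' "$1" | sed "s/$esc/$esc$esc/g")
  printf '%sPtmux;%s%s\\' "$esc" "$doubled" "$esc"
)

# Example payload: a 4x6 pixel red sixel block (DCS q ... ST).
sixel=$(printf '\033Pq#0;2;100;0;0#0~~~~\033\\')

# cat -v makes the control characters visible; drop it to send the
# sequence to the terminal for real.
tmux_passthrough "$sixel" | cat -v
echo
```

This only helps when you control the program producing the sixels, as the comment says; output from arbitrary programs is not wrapped.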


I switched from iTerm2 to WezTerm, which has Sixel and Kitty’s graphics support plus a built-in multiplexer.


For working on remote machines, I need 3 things:

* show multiple terminals on screen. The best solution is to use a tiling window manager. Both MacOS and Windows have limited windows tiling capabilities.

* run long term TUI applications (like editors). The best solution is mosh.

* run long term CLI applications (like shells). The best solution is dtach.

tmux helps all 3, but not particular good at either.


> tmux helps all 3, but not particular good at either.

iTerm2 on macOS has some nice tmux integration[1]. Basically, you run a tmux session (using tmux -CC), but the actual window management on the client side is handled by iTerm2. This works pretty nicely with the tiling WM (Amethyst[2]) I use on macOS.

If anybody is aware of Wayland compositors that integrate similarly, please let me know. I'd love to be able to do the same on my linux machines.

[1]: https://iterm2.com/documentation-tmux-integration.html

[2]: https://github.com/ianyh/Amethyst


it's good at the first point. if you use a tiling wm you are not multiplexing the remote connection, or am I missing something?



we're using mosh


I simply open multiple ssh sessions if that’s what’s needed. My terminal can then natively do tabs, panes, ligatures, emojis etc. There’s no middleman.

So after all these years I still managed to not need tmux. Am I missing out? Everyone seems to be using it.


I got in the habit of using GNU screen and so have never used tmux. But if you have a long-running text UI program (e.g. irssi) you want to be able to connect to on a whim and pass around to multiple client sessions, they're handy for that.


Yeah, I've also been using screen for years without ever feeling the need to switch to tmux. Although remote ssh work is relatively rare for me (eg a couple times per month)


That's a very different work experience. Imagine starting a long-running process on a remote machine, probably after hopping through a number of different machines. The long-running process should not get killed. That is impossible to guarantee over plain ssh connections.

Similarly, I keep all my work alive in remote tmux sessions: Vim IDE, mutt, ... I would not want to restart all of this every day. The best part is, I can reach my setup from the office, from home, and while mobile. And if necessary even from my phone.


I usually just do: ctrl-z, bg, disown. Sure, I'll lose the output, but typically that's OK for me.


For me personally tmux gives minor improvements (some of them are done by some terminals, some are not), e.g.:

* Text selection using various shortcuts (usually I use it only for URLs):

https://github.com/tmux-plugins/tmux-copycat

* FZF autocompletion from output, e.g. in case I want to diff some file I see changed in `git status`:

https://github.com/laktak/extrakto


tmux is one of those things that is nice if you take the time to learn and actually USE it for awhile.

But it doesn’t have amazing uses that immediately make you say “wow”. The biggest I’ve found is setting up something on a remote server and being able to disconnect it. For example it can be much nicer for running a Minecraft server than daemonizing.


I felt the same way, but ended up learning tmux because you can't make asciinema span multiple tabs or panes, and sometimes you want to make an asciicast with more than one shell.

It's handy to know sometimes, but if I have a sane window manager at hand then I don't use tmux. They're not common, sane window managers, so I use it occasionally.


> because you can't make asciinema span multiple tabs or panes

Of course you can:

1. create tmux session, create splits/panes

2. detach

3. asciinema rec

4. attach (inside asciinema rec)

This way asciinema records your whole terminal, and tmux running inside of it.

In other words, if you have it like this:

terminal -> tmux -> pane -> asciinema rec

asciinema can't see outside of the pane it's running in. However, if you have it like this:

terminal -> asciinema rec -> tmux attach

asciinema sees all of it because its parent is your terminal, not a tmux pane.


If you just want a super simple tiling WM for demos and such, ratpoison is as minimal as it gets.

StumpWM is a similar concept but more powerful.


The best use case of tmux for me is Azure Cloud Console, where there is no good, non-cumbersome way to open multiple sessions at once.


I have been trying out a kakoune arcan frontend [1] with the intention to use it over the network along with cat9 [2].

[1] https://github.com/cipharius/kakoune-arcan

[2] https://github.com/letoram/cat9


Never got into tmux, still using screen after all these years. Is tmux still claimed to be superior?


Does screen offer vertical splitting now? That was the killer feature for me a decade ago. And the more Vimlike keybindings.


> Does screen offer vertical splitting now?

Yes, for nine years now; bound to “Ctrl-a |” by default: <https://www.gnu.org/software/screen/manual/screen.html#Split>


Nice, thanks. I knew there was a patch, didn’t know it was shipping in standard distros.


In my memory, the patch was not in upstream, and even further away from distro packages, at the time tmux came alive. I also went with tmux for vertical splits (left-right panes).


screen is gpl, tmux isn't


Note for others: tmux is under the ISC license, a permissive non-copyleft opensource license.


As an end-user though, it doesn't usually matter.


As an end-user, what software you choose to use is a form of advocacy for that software and its license. So it does matter, but you can choose to not care.


Have a look at zellij, just as good if not better than tmux (imho obviously), and has support for sixels and much more.


I have tried zellij and it is not yet for me:

* Can't move tabs (fixed position)

* sixel support with iterm2 is meh. Yes, it shows image, but it does not look good.

* Support for undercurl seems to be missing.

* I need to sacrifice some other functionality I have in tmux.

Yes, I get some extras but it is not ready yet for me personally.


zellij is amazing in terms of UI/UX, but in my experience not as stable as the older options. Which is understandable, screen and tmux have had decades to sort things out, but it does mean that given the general messiness and complexity of remote network connections (which is the most common use case), running into these rough edges is not difficult at all.

I feel like zellij is the future of terminal multiplexers, and I like using it in my personal machines, but I stopped using it for work after a while.


https://github.com/topcat001/tmux/tree/sixel

This is a more promising and updated sixel branch of tmux, which is slowly getting updates propagated to mainline.


I think DomTerm (in addition to handling sixels and extended emoji) does multiplexing and remoting quite well. Remote access (https://domterm.org/Remoting-over-ssh.html) is done with a simple wrapper over ssh. Mosh-style predictive echo. No special privileges needed: Just drop a domterm executable somewhere on the remote machine. Multiplexing with tabs and tiles that can be dragged between windows. Builtin attach/detach.


I use emojis on tmux. I don't think it conflicts with them. :o

You just have to have an emoji font installed (Twemoji, of course). And I use Alacritty, a Rust-based, GPU-accelerated terminal emulator. How's that for "modern"? :sunglasses:


Eh, Terminology has more features, and on my machine if I run tree from the home folder it's finished quicker than anything else I've tried.

https://www.enlightenment.org/about-terminology.md


while it is indeed useful for a terminal emulator to be efficient at discarding characters you will never see, it probably isn't the main criterion to judge it by


I presume you're talking about Alacritty? Because Terminology doesn't do this AFAIK (and there's a mini-map next to the scrollbar showing everything that came by)


well, processing characters you'll never see, then


You need to configure it properly or pass the -u flag for it to support unicode I believe.


>Sure - not everyone needs tmux. If you never work on remote machines, you can live without it.

I'd rather not live without it even on local stuff. It's way too good. Yes, I already use a tiling wm, and no it's not the same.


not sure how cosplaying as a vt340+ or a late 90s japanese cellphone is supposed to be an improvement?

loose coupling via byte streams is great, and so is text, but we can do better than character cell terminals

if you sacrifice textuality, xpra does reattachment, multiwindow, and lowish bandwidth pretty well


Author of sixel-tmux (and of the matching rant) here.

Your description of sixels using words like 'cosplaying' or 'late 90s japanese cellphone' is not very technical. It seems biased. In any case, it would be better than being stuck with the 60s "text only" VT100 like protocols.

We can talk about technicalities, as some people do not like how sixels work under the hood, but... they work fine: I can play video in my terminals. I use gnuplot everyday.

Could it be done better by a protocol that would use compression and other features? Maybe. But does it have to be? Why should we waste time reinventing the wheel when locally there's plenty of bandwidth, and remotely too often enough?

Personally, I like sixels because I prefer a standard that is well defined, and can't change on someone's whim: I like stable interfaces.

I think users are unfairly held back due to biases like the one you demonstrated, and that seem very common in the free software world, which was the point of my rant: there is no logical reason sixels couldn't be offered today in gnome terminal if the key people making sure that doesn't happen changed their mind.


a couple of minor corrections

emoji support, not sixel support, is "cosplaying as a 90s japanese cellphone"

the vt100 is from 01978

my 'cosplaying' comment was a specific response to teddydd's 'cosplaying computers from the sixties'

---

i've been playing video and plotting functions from remote machines on terminals since 01993, but using x11, not sixel

i think h.264 (cosplaying as an mp4 player from 02005) is probably a better approach for graphical applications; that's what xpra uses


My apologies for the confusion for the vt100 year. I should have checked instead of assuming it was a rough contemporary of ascii.

With these clarifications, I now understand your reply better, and it is indeed accurate: emojis are too limited.

However, their frequent use in text interfaces demonstrates the need to transmit more information density than regular text can carry.

Personally, in terminals I suggest the use of sixels, since they work well enough.

Bandwidth is never an issue for local terminals: playing videos with mpv demonstrates that.


bandwidth is almost always an issue for local terminals; a 4k (3840×2160) terminal at 120fps and 32bpp is 4 gigabytes per second, and according to https://news.ycombinator.com/item?id=35694801, typically with ddr4 ram you get 30 or 40 gigabytes a second

if you send all that video data through the ddr4 bus twice (once writing, once reading) you're using about 20–25 percent of the machine's memory bandwidth, but the linux tty subsystem actually makes several copies and runs many instructions per byte [citation needed] so you can't get anywhere close to that in practice. especially not if you're spending a lot of time repacking bits into six-bit bytes in order to feign compatibility with the vt340, a misguided engineering dead end from 37 years ago that only supported 16 colors and 800×480 anyway
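
The arithmetic behind those figures can be checked with a few lines of shell. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes, a shell with 64-bit integer arithmetic, and takes the 30 and 40 GB/s bus figures from the comment as given.

```shell
#!/bin/sh
# Uncompressed 4K video at 120 fps and 32 bits (4 bytes) per pixel.
width=3840; height=2160; bytes_per_pixel=4; fps=120

frame_bytes=$((width * height * bytes_per_pixel))
stream_bytes=$((frame_bytes * fps))
# Each byte crosses the memory bus twice: once written, once read.
bus_bytes=$((stream_bytes * 2))

# bus_share_tenths GB_PER_S: bus usage in tenths of a percent
# (integer arithmetic only, so the result is truncated).
bus_share_tenths() {
  echo $((bus_bytes * 1000 / ($1 * 1000000000)))
}

echo "one frame:  $frame_bytes bytes"
echo "one second: $stream_bytes bytes"
for gb in 30 40; do
  t=$(bus_share_tenths "$gb")
  echo "share of a $gb GB/s bus: $((t / 10)).$((t % 10))%"
done
```

One second comes to 3981312000 bytes, and the two-pass share works out to 26.5% of a 30 GB/s bus and 19.9% of a 40 GB/s bus, which is the rough range quoted above.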

other protocols like x11 and xpra work considerably better, but x11 video reproduction on a local terminal normally uses xshm or its moral equivalent with xvshmputimage or gl. if you're on a local terminal, you might as well use x11; the only benefit to putting the pixels in the same bytestream as text is if your terminal is connected to your drawing application over a bytestream such as an ssh connection, a tcp connection, an spi connection, or a uart

i think we could do better by designing a bytestream protocol that minimizes copies and pixel format conversions and is therefore within epsilon of the speed of xshm, but sixel isn't it, and neither is x11 without xshm. a bidirectional bytestream protocol could include flow control to also avoid transmitting pixels that won't be visible and increasing latency due to bufferbloat

as for emoji, they don't improve information density; there are only 2666 emoji in unicode 10.0, so at a maximum they convey only 13 bits of information, and they normally occupy roughly the space of two letters like 'n', so you have slightly fewer bits per pixel. people use emoji because they are cute and colorful, not because they are ithkuil. they're not

some technologies, like the unix shell, smalltalk, and tcp/ip, make hard things easy and easy things possible. others, like retrocomputing, code golf, and malbolge, make easy things hard and hard things impossible. sixel makes easy things hard and hard things impossible, like the rest of retrocomputing.

that's a fun way to spend my time when i choose to (e.g., in https://asciinema.org/a/390271 i did real-time 3d graphics in a unicode terminal emulator with braille) but it's not how i want my primary user interface to my computer to work


4 gigabytes per second? Ok, and no.

Sixel is defined for two purposes: the first is to allow display of bitmaps. The second is to allow the definition of character glyphs (which xterm does NOT have -- due to graphics limitations). If the glyph definition could be done, your bandwidth requirement would be reduced. Back in the 80's we generated custom fonts for PostScript (the Apple laser printer PostScript would cache the glyphs). As needed, the printer would request glyphs and these would be supplied by an external driver or box (we called this the "Robin Box" for fairly obscure reasons). This system provided pre-press proofing for many customers (which would include publishers like Ziff-Davis). Please note that the speed of the communication was 9600 baud (roughly 1000 cps). The image of that printer was 300x300 dpi, with some printers doing 600x600. Sixel would have been good... we used ASCII.

I like sixel -- I use xterm which gives me the option. For everything? No. but if I need a graphic, it is easy enough to use. The alternative for most would be to simply generate PostScript, and run GhostScript to view the results (typically on another system). I have NEVER contemplated watching a video with sixel... I could do it, but the player would be a "labour of love" - I would use a common decoder to produce an uncompressed bitmap (scaling and colour reduction) then convert that into frames of sixel... Just to show StarWars on a 340. But, no urge.

The main issue is that sixel offers the feature. No real reason to NOT have it... and it exists.

Try pbmtoln03, ppmtosixel, imagemagick convert and lsix

sixel is not a protocol: just an encoding. Your idea conflates the two things.

The purpose of the Robin Box? The publishers typically had hundreds or thousands of typefaces. The PostScript Printer? 15 to 50. Since RAM in the printer was limited, this approach allowed the scanning of the target typeface, conversion to outline form and production of dynamic programs executing in the printer. That are discarded but results cached. (and note that technology is within a period that, in this case, was bounded by RAM space, and typographic conventions -- all of which changed fairly rapidly).

That approach allowed hundreds of typefaces to be used on a single page! With "standard" PostScript; on a printer with only a megabyte or so of RAM available. Sixel is simply a similar tool. You really can't predict how something like this would be used if it were generally available.


yeah, i agree that sixel was a reasonable way to implement downloadable fonts in the 01980s, though one that handled line noise and background process output interleaving poorly

(see supdup's graphics protocol for a better design for integrating pixel graphics into a (still retrocomputing) serial terminal. however, supdup foolishly omits any font-downloading facility)

i just think sixel, whatever its merits for the problems of 40 years ago, is a worse way to display graphics today than things like x11, which is itself no shining gem, just not as bad as sixel. also we're sort of stuck with most of x11 for the foreseeable future because there's a lot of software written for it; let's make sure that doesn't happen with sixel

(you point out that sixel is an encoding, not a protocol, which is sort of true, but it has an associated state machine in the terminal, so it's also sort of a protocol, and that's the part that has the poor error-recovery characteristics)

by 'worse' i mean: it requires more effort to achieve an equivalent result; it provides a worse result with the same effort; and there are some results it simply can't achieve that the alternatives can

if you're looking for a puzzle to solve for fun, of course, those are advantages, not disadvantages; but i repeat myself

it's especially an advantage if you find something that was previously thought to be impossible due to laboring under such artificial limitations but turns out to actually be possible despite them

here are some more nontechnical descriptions of sixel that are more correct than any technical description possibly can be

sixel is oulipo programming; encoding your graphics in sixel is like writing a novel without the letter 'e'. btw, a wonderful article about oulipo programming is https://100r.co/site/working_offgrid_efficiently.html

sixel is an art project, not an engineering project


> there is no logical reason sixels couldn't be offered today in gnome terminal if the key people making sure that doesn't happen changed their mind.

I read through that entire Bugzilla page, and IMHO the usability and technical concerns are pretty valid. You can say "well, those are not an issue for me" or "I can live with it" and that's entirely fair, but I can understand how a maintainer would want it to work really well before merging.

Also, I believe it did get merged in 2020? The current status is a bit unclear to me.


> I use gnuplot everyday.

Same. Sixel support in gnuplot is very useful: it’s great to have decent graphics without relying on X with dodgy vpns. It’s probably not the most useful feature, but it was a significant QoL improvement when I found out about the Sixel terminal.


I just use screens but I guess tmux is better in every way


This led to a rabbit hole. Here’s kitty’s competing protocol:

https://sw.kovidgoyal.net/kitty/graphics-protocol/

Look at the “The transmission medium” section. I’m reasonably familiar with what can go wrong when messing with files you shouldn’t mess with on Linux, I’ve worked on the various security mechanisms that can help, and I would not want to implement this. If someone wanted me to consult on implementing this, my advice would be, first and foremost, not to.

There isn’t even a “security considerations” section in the document!


Since your comment is (was) topmost, I want to clarify that while kitty seems to have an overlarge attack surface, Sixel doesn't seem bad at all.

There's an escape code to enter "sixel" mode, then base64-style data representing 1x6 pixel bitmaps, then an escape code to get back out.

No vector graphics that might overdraw unexpectedly (security considerations!), no mechanism for out-of-band data (security considerations!), no unproven compression libraries (security considerations!), but also none of the extra magic that something like kitty would provide.

I could live happily with Sixel being universally supported in my terminal emulators.
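
The encoding described above (enter sixel mode, send printable characters that each stand for a 1x6 pixel column, leave again) is small enough to sketch in shell. The function names `sixel_char` and `sixel_image` are made up for this example; the sketch assumes the usual sixel convention that the top pixel is the lowest bit and that the six-bit value is offset by 63.

```shell
#!/bin/sh
# sixel_char BITS: BITS is six 0/1 digits, top pixel first.
# Prints the sixel data character for that 1x6 column.
sixel_char() (
  bits=$1; value=0; weight=1
  while [ -n "$bits" ]; do
    case $bits in 1*) value=$((value + weight)) ;; esac
    bits=${bits#?}            # drop the digit just handled
    weight=$((weight * 2))    # next pixel down is the next higher bit
  done
  # Offset by 63 so the result is a printable character ('?' to '~'),
  # emitted through an octal escape in printf's format string.
  printf "\\$(printf '%03o' $((value + 63)))"
)

# sixel_image DATA: wrap sixel data in DCS q ... ST.
sixel_image() {
  printf '\033Pq%s\033\\' "$1"
}

full=$(sixel_char 111111)   # all six pixels set
top=$(sixel_char 100000)    # only the top pixel set
echo "full column: $full   top pixel only: $top"

# Define colour 0 as RGB 100%,0%,0%, select it, draw 4 full columns.
# cat -v keeps the escapes visible; drop it on a sixel-capable terminal.
sixel_image "#0;2;100;0;0#0$full$full$full$full" | cat -v
echo
```

Colour definitions, repeat counts and line breaks are just more printable characters inside the same envelope, which is why there is so little for a terminal to get wrong.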


> I could live happily with Sixel being universally supported in terminal emulators.

Me too!

If I want fast graphics from a local application, there are a couple of widely available ways to make it work. X11 and Wayland come to mind :)


FUD as usual. I am so sick of people waving around the security word. If you are scared of dealing with files on Linux, I suggest you throw your computer in the garbage and retire to a mountain fastness with no electricity. If you have a specific criticism of the kitty protocol make it, otherwise spare us the vague FUD.


You connect your terminal to a program (pts, which may map to a sandbox or a remote SSH server you don’t trust or just a file you feed to cat). And it contains an escape sequence that causes your terminal to read and process ~/.ssh/id_rsa or /etc/shadow or /dev/sda or /proc/self/something or some other wildly inappropriate object. And your terminal opens and reads the file.

My terminal does not live in a mountain fastness, and it’s not as exposed as a web browser, but it should at least try to make it safe to feed it untrustworthy input.


Heavens! Your terminal opens and reads a file. What a disaster. Still waiting for a concrete issue with the actual kitty graphics protocol. How is it unsafe to feed a terminal that supports the kitty graphics protocol untrusted input. One single solitary example would go a long way to prove you aren't just full of hot air.


For those who like me had no idea what a sixel was: https://en.m.wikipedia.org/wiki/Sixel


Love the idea, but Sixels are a can of worms - it’s inefficient and it’s unclear how they should map to physical pixels - most do 1:1 but I’d prefer a more historically accurate mapping similar to the original terminals that supported them. It was originally designed for dot-matrix printing and later implemented in CRT terminals.

Also, ReGIS and Tektronix (disclaimer: I proposed including Tek graphics in VTE) could also be very useful.


Is ReGIS support uncommon? I know the terminal I use on Linux (mlterm) does, as does plain ol' xterm of course. (As an aside, it's annoying how many terminals set TERM to `xterm-256color` without being xterm compatible.)


I’m not sure - not many terminals advertise it. And yes - terminals lying on TERM is extremely annoying. Pretty much everything that has ever existed has a terminfo entry.


>how they should map to physical pixels - most do 1:1

Isn't this can of worms inherent to the subject, anyway? This question has been asked - and answered - by pretty much every rendering/display system, ever.

Seems to me, sixel adoption at this point is the precursor to optimization ...


This is great, but the presentation has lots of repeated text, and it is hard to get the full picture.

It would probably work better as a comparison table, with ticks and crosses all in one column.


This. At least add the check/cross to the table at the beginning, so you can get an idea how widespread support is. Having to scroll down is really annoying.


Yeah, sorry about that. I threw this together with the first terminal-like Hugo theme I could find, since I wanted to focus on getting the info out there. I'd probably either have to create a custom theme, or just rewrite the site from scratch. Your point is noted though--others have brought this up.


FWIW, it might be worth mentioning in this context that iTerm 2 also has its own protocol for displaying pixels, so you can cat images in the terminal using an "imgcat" script that they offer for download:

https://iterm2.com/documentation-images.html


I was debating whether or not I should mention iTerm's image protocol, since iTerm already supports SIXEL.


That is also supported by some other terminals; at least mlterm and Konsole handle it.


Does Sixel have a spec? I.e. how large are the pixels, whether there's a newline after the image (couldn't find how to turn that _off_)?

Recently discovered Kitty's graphics protocol (https://sw.kovidgoyal.net/kitty/graphics-protocol/) which has more features or at least more documented ones :)


> Does Sixel have a spec?

DEC invented Sixels back in the ’80s, and they were serious about their docs, so the corresponding chapter of the VT3xx manual[1] is probably as good as it gets.

> I.e. how large are the pixels [...].

Historical implementations likely assume the relation between pixels and character cells that’s implied by the geometry of the DEC fonts. I’ve seen a lot of arguing about adapting this to the modern world, but I don’t know if a consensus has emerged.

[1] https://vt100.net/docs/vt3xx-gp/chapter14.html


No, there is an escape code that queries the window size in pixels:

    "\x1b[14t"
Combined with the escape code that queries the window size in character cells ("\x1b[18t"), you can calculate the number of pixels per character cell (the "pixel size").
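
A small sketch of that calculation, working on the reply strings only. It assumes the xterm reply formats `ESC [ 4 ; height ; width t` for the pixel query and `ESC [ 8 ; rows ; columns t` for the cell query; `parse_report` and `cell_size` are names invented for this example.

```shell
#!/bin/sh
# parse_report REPLY: print the two numeric fields of "ESC [ Ps ; a ; b t".
parse_report() (
  body=${1#*\[}     # drop everything up to and including "["
  body=${body%t}    # drop the final "t"
  IFS=';'
  set -- $body      # $1 = report type, $2 and $3 = the two values
  echo "$2 $3"
)

# cell_size PIXEL_REPLY CELL_REPLY: print "<cell width>x<cell height>" in pixels.
cell_size() (
  set -- $(parse_report "$1") $(parse_report "$2")
  # $1 = pixel height, $2 = pixel width, $3 = rows, $4 = columns
  echo "$(( $2 / $4 ))x$(( $1 / $3 ))"
)

# Example replies for a 1280x720 pixel window showing 80x24 cells.
pixels=$(printf '\033[4;720;1280t')
cells=$(printf '\033[8;24;80t')
cell_size "$pixels" "$cells"
```

Integer division truncates, so a window whose pixel size is not an exact multiple of the cell grid gives a slightly low answer.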


Are these escape codes actually implemented in the average terminal? I recently tried to get e.g. alacritty to tell me this stuff but I don't even know how you're supposed to read back the response.


Yes, every xterm-compatible should have them.

You just send the particular query (e.g. ‘CSI 14 t’) and the terminal sends back a response in the defined form¹. Of course you'll want raw mode, echo off, etc. Normally a library like curses does this for you. If you want to see, https://gist.github.com/kpschoedel/6a87ec2157ce2140be69193d1... (I just whipped this up to answer the question; don't expect production quality)

¹ https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-F...
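
For a feel of what "raw mode, echo off" looks like without curses, here is a minimal shell sketch. It is not the code from the gist linked above; `read_reply` and `query_terminal` are hypothetical names, and the half-second timeout is an arbitrary choice so that a terminal which never answers does not hang the script.

```shell
#!/bin/sh
# read_reply: collect bytes from stdin up to and including the final "t".
read_reply() {
  reply=
  while c=$(dd bs=1 count=1 2>/dev/null) && [ -n "$c" ]; do
    reply=$reply$c
    [ "$c" = t ] && break
  done
  printf '%s' "$reply"
}

# query_terminal N: send "CSI N t" and show the reply in visible form.
query_terminal() {
  [ -t 0 ] || { echo "stdin is not a terminal" >&2; return 1; }
  saved=$(stty -g)                 # remember the current tty settings
  stty raw -echo min 0 time 5      # raw, no echo, 0.5 s read timeout
  printf '\033[%st' "$1" > /dev/tty
  answer=$(read_reply)
  stty "$saved"                    # restore the tty before printing
  printf '%s\n' "$answer" | cat -v
}

# Only talk to the terminal when there is one.
if [ -t 0 ]; then
  query_terminal 14   # window size in pixels
  query_terminal 18   # window size in character cells
fi
```

A real program would also restore the tty settings from a signal handler, which this sketch leaves out.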


Thank you so much! I will incorporate this to my sixel feature branch of a tui matrix client. Memes in the terminal!


Which matrix console client? I only know gomuks and I really miss proper image support there.


https://github.com/ulyssa/iamb

It's still in the early phase of development, but actually fully functional.


Most implementations I've seen use an ioctl to query those particular bits. That's implemented quite reliably, since the same ioctl is used for character size as window size. Some implementations just set the character size to zero though.


Ioctl doesn't work over a serial port. The escape code queries are more general.


‘CSI 16 t’ reports the character size in pixels directly.


How about the cursor position? The spec talks about "Sixel Scrolling Mode" but I couldn't find any way to display a sixel image inline: https://stackoverflow.com/questions/70647549/displaying-a-si...


What the hell, ouch. This looks to be very buggy across the board. Here’s a test:

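  # Building blocks; printf expands \n, \e and \\ when they appear in its format string.
  LF='\n'; ESC='\e'; CSI=$ESC'['; DCS=$ESC'P'; ST=$ESC'\\'
  CUF=$CSI'%dC' # cursor forward
  SIXEL=$DCS'q%s'$ST # sixel image
  # Print "before ", the sixel image, move the cursor 2 cells forward, then " after".
  printf "before $SIXEL$CUF after$LF" \
    '#0;2;0;0;0#1;2;100;100;0#2;2;0;100;0#1~~@@vv@@~~@@~~$#2??}}GG}}??}}??-#1!14@' \
    2

With a freshly launched terminal on my machine, I get: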

- in XTerm (xterm -xrm "XTerm*decTerminalID: vt340"), "before ", "HI", " after", that is to say exactly what you want, out of the box;

- in Foot, "before ", "HI", newline, some spaces, "after";

- in Contour, "before ", "HI", enough newlines to clear the screen (?..), no spaces (?!..), "after".

OK, sez I, let’s just save the cursor position (DECSC, ESC 7) before the image and restore it (DECRC, ESC 8) afterwards, then skip over it; that is,

  DECSC=$ESC'7'; DECRC=$ESC'8' # add to definitions
  printf "before $DECSC$SIXEL$DECRC$CUF after$LF" # change format string

In XTerm, this (rightly) makes no difference. In Foot and Contour however, you still end up a line resp. a screen below where you started, if now with the correct horizontal position.

So it seems to me like what you want should work by default, except it doesn’t.

It should be possible to instead just treat the whole thing as a framebuffer overlay (by computing or directly asking for the character cell size, as Kirill Panov rightly admonishes me is possible with XTWINOPS) without touching the cursor; that’s what the “sixel scrolling” setting (DECSDM) is supposed to do. Then you can just manually move the cursor forward however many positions after you’re done drawing.

Except apparently the DEC manual (the VT330/340 one above) and DEC hardware contradict each other as to which setting of DECSDM (set or reset) corresponds to which scrolling state (enabled or disabled), and XTerm has implemented it according to the manual not the VT3xx[1,2,3]—then most other emulators followed suit[4]—then XTerm switched to following the hardware[5,6] (unless you and that’s what I’m seeing on my machine right now. So now you need to check if you’re on XTerm ≥ 369 or not[7]. And also for other terminals’ versions, because apparently that’s a thing now[8,9].

Again, ouch.

P.S. DEC had an internal doc for how their terminals should operate (DEC STD 070) [10]. It does not document DECSDM at all.

[1] https://github.com/wez/wezterm/issues/217#issuecomment-86449...

[2] https://github.com/hackerb9/lsix/issues/41

[3] https://github.com/dankamongmen/notcurses/issues/1782

[4] https://github.com/arakiken/mlterm/pull/23

[5] https://invisible-island.net/xterm/xterm.log.html#xterm_369

[6] https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-T...

[7] https://github.com/dankamongmen/notcurses/commit/0918fa251e2... (the correct version cutoff is 369 not 359, the patch contains a now-fixed bug)

[8] https://github.com/dankamongmen/notcurses/issues/2204

[9] https://github.com/dankamongmen/notcurses/blob/master/src/li... (look for mentions of invertsixel or invert80)

[10] http://www.bitsavers.org/pdf/dec/standards/EL-SM070-00_DEC_S...


> [10] http://www.bitsavers.org/pdf/dec/standards/EL-SM070-00_DEC_S...

Nice. I wish I'd had that years ago when the maintainer of a then-popular virtual terminal got very angry at me for suggesting that DECCOLM (set 80/132 columns) should not change the number of lines.


It's interesting to read the discussion about Sixel support in Kitty [1], where the pros and cons of Sixel are weighed in relation to Kitty. I find this comment [2] by the maintainer of libsixel particularly intriguing:

> After I took over the maintainership of libsixel I unfortunately decided it cannot support the security demands of Kitty, it is too insecure internally. I need to write a Rust library or something.

[1] https://github.com/kovidgoyal/kitty/issues/2511

[2] https://github.com/kovidgoyal/kitty/issues/2511#issuecomment...


Kitty is the epitome of NIH. They don't do modifyOtherKeys either.


> Kitty is the epitome of NIH.

Sorry to be that guy, but what is a "NIH"? All I know is the https://www.nih.gov/ :)



My apologies — I dislike seeing unexplained acronyms myself. As detaro answered before me, it's ‘not invented here’, the tendency to reject existing solutions for a sense of control.


Check out the xterm author's comments about the history of modifyOtherKeys (MOK): some people tried to present his work as theirs.


Shouldn't you just use a web browser or a GUI at some point, if you're trying to do graphics? Adding bitmaps seems like feature creep, muddying the whole purpose and appeal of the terminal.

Next you'll want accessibility, alt text for blind readers, DPI awareness, responsive scaling on different screens, etc.


If this is feature creep at all, this is very old feature creep. It happened in 1981, when the ability to do graphics, actually available on some specialized terminals from the decade before, became mainstream in the world of character mode terminals, when DEC put Sixel and ReGIS into its VT range.

It has been in XTerm since the 20th century, and terminal emulators for MS/PC/DR-DOS like Reflection gained Sixel support somewhere around 1989.

Don't be fooled into thinking that "the terminal" is what one can do on the kernel virtual terminals of Linux, FreeBSD, or SCO Xenix. The reality is that the massive reinvention wave of the 1990s and 2000s has actually lost you functionality that was in real, hardware, terminals and contemporary terminal emulation programs of the 1980s, and given those who never lived it a very blinkered idea of terminal functionality based upon 1960s TeleType terminals that was a little outdated even when Unix itself was invented.


I agree we should tread carefully regarding feature creep, but it's already very common to display graphical data in the terminal with ASCII art or Unicode. Adding basic bitmaps would remove the need for those hacks.

Convenient terminal tools already exist for viewing formatted text like Markdown or HTML, or viewing binary data like ELF files, or showing live dashboards like htop... but images still require launching a separate application. And this is all data that you might produce while working solely within the terminal--only to have to visit GUI land to see what's in a PNG file. It's a pretty obvious gap.


I don't think so. There are plenty of terminal based programs that would clearly benefit from proper graphics. And doing it in the terminal means you don't have to set up a second connection, e.g. when you're using SSH.

That said it's disappointing that the best we can do is Sixel, a hilariously inefficient and ancient format.


Kitty term has its own modern protocol, which is getting more traction.


> Kitty [graphics protocol] gets more traction.

I guess that may be true in some senses, but per Nick Black[1] (of Notcurses fame) the set of Kitty graphics protocol implementations consists of Kitty and Wezterm, that’s it.

[1] https://github.com/alacritty/alacritty/issues/910#issuecomme...


Also KDE's Konsole.


For the record: My terminal emulator does not do sixel, and providing that or any other graphics mechanism is not a design goal, and not in keeping with the internal mechanics of storing the terminal display as an array of character code points. The terminal emulators that my terminal emulator is designed to supplant, the kernel built-in emulators of Linux, FreeBSD, OpenBSD, NetBSD, and indeed SCO Unix, don't do sixel graphics.

This is not to say that I am against sixel. As someone who was doing graphics on terminals in the 1980s, sometimes by accident when I catted the wrong file, the idea that "terminals don't do graphics" seems blinkered and ahistoric to me. They do and did, in my direct experience. I'm simply not targeting that as a feature in a program that has the specific goal of matching specific kernel built-in emulators. You know where XTerm is if you want it. (-:

On the gripping hand, my terminal emulator does Unicode (as do several of the kernel built-in ones, albeit with severe limitations), and so all of the things that one can do with Unicode pseudo-graphics are possible on it, from MouseText windows through progress bars using the 1/8th block glyph set to PC-style line drawing.


I was sad to see that alacritty doesn't support it. I like their vim-selection-mode (even if it is missing a few movements).

I'd like to play with terminal graphics in an app I'm working on. Does anybody know of a terminal that supports:

- Wayland

- sixel

- copying substrings without a mouse

I know I can just pipe to wl-copy, but sometimes the right chain of sed & awk is less convenient than the right chain of vim highlight movements.


IIRC WezTerm can do it. I was an alacritty/kitty user for quite a while, but have since moved to WezTerm because it is just more pleasant to use. I did see some crashes for a while but they seem to be fixed now.


I am using alacritty with a sixel patch. Sadly, the maintainers aren't interested in supporting sixel or kitty graphics.


I just started using this! I'm making a tool called command-line-maps, it's gonna be amazing.

Echoing another comment here, it doesn't work in tmux though which is pretty heartbreaking. But viewing beautiful maps in the terminal is amazing, and works perfectly.

(I'm using the alacritty branch that supports it, worked perfectly.)


What’s the point? Why would I want images in my console?


Displaying the result of a command as a graph (histograms, plots) can be more pleasant than a bunch of **** bars. An external graphic window works well too, but displaying in place can be convenient sometimes, no need to click or alt-tab.


Running MRI processing on a supercomputer, I want to check at a certain point in the pipeline that things went correctly by viewing a couple of slices without downloading a huge brain image.



Jupyterfying the console


Eg, Euporie - https://euporie.readthedocs.io/en/latest/

> If you’re working with Jupyter notebooks in a terminal only environment, like an SSH server or a container, or just prefer working in the terminal, then euporie is the tool for you!

Uses Kitty ("Currently only the kitty and WezTerm terminals support this"), Sixels, or ansi art.

Also supports SVG, HTML, LaTeX, and Markdown.


Show images inline and have it scroll with other text. This is useful for working with programs that process images. For example, ImageMagick can render a PNG in the terminal like this:

   convert image.png six:-
"six" means use Sixel format, and "-" tells it to write to stdout.

Demo (running on cygwin's mintty): https://twitter.com/uguu_org/status/1631207051602042880


This NES emulator uses Sixels for rendering games over TCP :) https://github.com/henrikpersson/potatis


Checking snapshot diffs from git repo. I would love doing that in terminal instead of opening in external app.

I have scripts that do the diffing and open the results in an external app, but I will replace this with in-terminal images once I find a way to do that.


I find it a lot faster to do icat previews than bother opening up an application to find & view a specific image. Even something as lightweight as feh is still gonna open a split in my window manager, which is going to be more jarring than a preview in the terminal.


It would allow portable graphics applications on the terminal, e.g. this C64-emulator-in-Docker only renders ASCII characters, but could be extended with sixels to render graphics (I actually tinkered with this, but didn't get far because most terminals have either no sixel support or one that is too slow):

https://github.com/floooh/docker-c64
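For what it's worth, an application like that can probe for sixel support before drawing anything. A common approach is the Primary Device Attributes query (CSI c): terminals that advertise sixel graphics list attribute 4 in the reply. The sketch below only parses a reply that has already been read; the function name is made up, and a terminal may of course support sixel without advertising it (or the other way round).

```sh
# Hypothetical helper: succeed if a DA1 reply such as ESC [ ? 62 ; 4 ; 22 c
# lists attribute 4 (sixel graphics).
da1_has_sixel() {
  reply=${1#*\?}             # drop everything up to and including '?'
  reply=${reply%c}           # drop the final 'c'
  old_ifs=$IFS; IFS=';'
  set -- $reply              # split the parameters on ';'
  IFS=$old_ifs
  [ $# -gt 0 ] && shift      # the first parameter is the terminal class
  for attr in "$@"; do
    [ "$attr" = 4 ] && return 0
  done
  return 1
}
```

An emulator could fall back to its ASCII renderer whenever this returns non-zero.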


Cat photos in /etc/motd


I was installing a fresh copy of Windows 10 this week and Cortana told me that once I connected to a Wifi network I'd be able to get on with browsing cat photos o_O

Microsoft has jokes.


I have literally patched OpenSSH to show me cat pictures instead of the usual randomart.


how many cat pictures do you have for this to work?


As many as necessary and then some. No such thing as too few cat pictures!

I use AI to generate the pictures when natural supply is scarce. Cache them locally, address by host key hash. Use the hash as seed for bonus points.


Mathematical plots, seeing the result of AI generated images, seeing the result of image compression algorithms, ...

Having these intermixed between the regular scrollable textual output is quite powerful.


Plotting data, mathematical notation.


So you can browse Twitter timeline with icons on kernel console on NetBSD/m68k


    $ mpv -vo=sixel ~/Downloads/pirated-hollywood-blockbuster.mkv


Today I learned about Terminal Graphics Protocol (an alternative to Sixel):

https://news.ycombinator.com/item?id=35940691


Sixels are cool. There's one thing I wish konsole would do better: if you draw a sixel image, and then zoom in/out, the image should also zoom in/out. It should scale the sixel graphics and the regular text the same way.

And the kitty protocol, which konsole also supports, had the exact same issue last time I tried: the image doesn't zoom in.


Works in mintty on Windows: just install msys2 and try it.

It's a simple feature, considering mintty was built on top of putty


This would be amazing to implement a jupyter kernel natively in the terminal!! can’t wait


I'm excited to tell you it already exists with kitty and euporie https://github.com/joouha/euporie


Please, let's move away from red and green as "good"/"bad" colors. It's hard to tell the difference for colorblind folks, especially when the symbols used are quite similar.


Sorry about this. I assumed that checkmark/cross would compensate for this color combination, but I agree that better colors would make this more accessible for everyone. From what I'm reading online, Blue / Red is a better combination here?


It's useful, though:

Green: moral

Red: immoral

Red green colourblind: amoral


That makes me wonder, if it should be solved client side by translating server side colours to something different on the client, according to the user’s needs.


There are absolutely solutions that do exactly that client side (just mapping the colors from the hard-to-tell-apart hues to other hues depending on which type of color blindness).

For instance, in Mac OS, under Accessibility there are 'Color Filters' with a dropdown for presets for a few common types of colorblindness including red/green. It is a filter that is applied systemwide at the graphic output level.

I think this is the proper approach (device-level color filtering) versus each app attempting to serve all, as that could lead to double-filtering.


Why do I smell another “just adding more capability…” thing turning into an OS? Maybe only for devs? Sure there’s a long way, but I see it coming. This time without DOM.


Missing mlterm



Any example echo command to test it?


HN isn't putting this on one line, but

  echo $'\EP0;0;0q"1;1;6;6#0;2;100;0;100#1;2;0;0;100#2;2;0;100;100#3;2;0;100;0#4;2;100;100;0#5;2;100;0;0#6;2;100;100;100#7;2;0;0;0#0!6_$#1!6O$#2!6G$#3!6C$#4!6A$#5!6@-\E\\'
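In case the pixel data in that string looks like line noise: each data character encodes a column of six pixels as 63 plus a 6-bit mask (lowest bit = top pixel), `#n` selects a colour, `!6` repeats the next character six times, `$` returns to the start of the band and `-` moves to the next band. A small sketch in POSIX shell that regenerates the pixel part of the command above (function names are made up for illustration; the DCS header and palette definitions are not reproduced):

```sh
# Print the sixel data character for a 6-bit column mask (0..63).
# Bit 0 is the top pixel of the six-pixel column, bit 5 the bottom one.
sixel_char() {
  printf "\\$(printf '%03o' $(( 63 + $1 )))"
}

# Rebuild the pixel data of the test image: one colour per pixel row,
# bottom row first, each row 6 pixels wide.
stripes() {
  colour=0
  for mask in 32 16 8 4 2 1; do
    printf '#%d!6%s' "$colour" "$(sixel_char "$mask")"
    [ "$mask" -eq 1 ] || printf '%s' '$'   # back to the start of the band
    colour=$(( colour + 1 ))
  done
  printf '%s' '-'                          # move to the next six-pixel band
}
```

So the image is six horizontal stripes, six pixels wide, one per colour register.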


That works, thanks.


OP should have linked the page as-is, without the #terminalapp part of the URL.

Do not assume my terminal!


Probably unintentional. People often click into subsections of the page and then submit their current URL as it is, without realizing they're linking to a subsection of the page.


I think it's fair to call Terminal.app out especially, though. It's in sore need of an uplift in many ways — with True Color support being another annoying example. In a few weeks at WWDC we'll find out if they've ignored it again for another year.


Or you could just use iTerm and ignore the built-in app (do you try to use TextEdit to edit code as well?)


Terminal.app is much faster and has lower input latency, which is why I stick to it.


This is not a mutually exclusive “or”.



