Texttop – An interactive X Linux desktop rendered in TTY and streamable over SSH (github.com)
551 points by sandebert 6 months ago | 92 comments

Author here. Surprised to see this here as I've been literally spending the last 2 weeks on a complete rewrite from scratch that actually layers REAL text over the blocks. You can get an idea of my approach with this function: https://github.com/tombh/texttop/blob/webext-rewrite/webext/...

After the success of hitting the front page here last year, I really wanted to sit down and do this properly. So not only am I working on real text support (that you can of course copy and paste without even zooming). But I've removed the dependencies on ffmpeg, Xorg (for Firefox at least - Chrome strangely doesn't support webextensions in headless mode), docker AND it will work on all webextension-compatible browsers. It's going to be a single cross-platform, static, Go binary, that launches your preferred browser in the background.

Generally it's bad luck to talk about something before it's finished, but seeing as it's suddenly on the front page I wanted to let all those interested know that I'm making this 10 times better. And an early apology, because I'll definitely be posting the rewrite (and rename) here and hoping to get all your attention again.

You might get a slightly better resolution with the approach taken here: https://github.com/stefanhaustein/TerminalImageViewer (= find best matching block graphics character for most prominent colors in a cell; there is a comparison to half blocks at the end).

Cool, TIV looks great! (Just compiled, ran, and tested it.)

I thought [libcaca](https://github.com/cacalabs/libcaca) was the only terminal image viewer in town, with its attendant `cacaview` prog. I'd very much appreciate it if anyone here knows of any other libs or especially the sanest way to do this in either C/C++ or Ruby … It's for a terminal-based project I have in mind :)

Some alternatives are listed in this askubuntu thread: https://askubuntu.com/questions/97542/how-do-i-make-my-termi...

I had some problems with this software: lots of segfaults. Some images work one time but segfault on the second view, while others render half or nothing at all before I hit a segfault.

Please file an issue at the github project, ideally with an image that always crashes.

That makes me think of the custom characters on the old Commodore 64s that were often used for games that ran in text mode.

This is still a technique used by DOS game programmers today to get access to the full CGA color set with graphics. See Paku Paku for an example:


> It's going to be a single cross-platform, static, Go binary, that launches your preferred browser in the background.

This is where Go really shines in my opinion (generics or not :P). Cross-platform just works; I was stunned when I installed Go on Windows and was able to build third-party libs, and build and run exes on Windows.

No need to apologise - this is really creative! Best of luck :)

Have you thought about adding sound?

Either something like Opus, or...

You could do a decomposition of the audio signal and send a small array of {frequency, amplitude} pairs for each frame, then ramp to each new pair as they arrive.

If you do this fast enough then you get just enough of the timbre of the original to make a listener think they are hearing the "right" thing for a video they already know.

Even with a single oscillator you can get some interesting sounds this way.
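
A rough sketch of the decomposition idea above, in Python purely for illustration. Everything here is an assumption rather than anything Texttop does: the function names `dominant_partials` and `ramp` are made up, the transform is a naive DFT (slow, but dependency-free), and a real implementation would likely use an FFT and overlapping windows.

```python
import cmath
import math


def dominant_partials(frame, sample_rate, count=4):
    """Return the `count` strongest (frequency_hz, amplitude) pairs in one frame."""
    n = len(frame)
    pairs = []
    for k in range(1, n // 2):  # skip the DC bin, stop below Nyquist
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        amplitude = 2 * abs(coeff) / n  # rescale so a pure sine reports its own amplitude
        pairs.append((k * sample_rate / n, amplitude))
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:count]


def ramp(previous, target, steps):
    """Linearly interpolate from one (freq, amp) pair to the next, ending on the target."""
    return [
        (
            previous[0] + (target[0] - previous[0]) * i / steps,
            previous[1] + (target[1] - previous[1]) * i / steps,
        )
        for i in range(1, steps + 1)
    ]


# Hypothetical test signal: a 440 Hz tone plus a quieter 880 Hz overtone.
sample_rate = 8000
n = 400  # bins are 20 Hz apart, so both tones fall exactly on a bin
frame = [
    0.5 * math.sin(2 * math.pi * 440 * t / sample_rate)
    + 0.2 * math.sin(2 * math.pi * 880 * t / sample_rate)
    for t in range(n)
]
top = dominant_partials(frame, sample_rate, count=2)
```

The receiving side would feed each ramped pair to an oscillator; with only a handful of pairs per frame, the payload is a few numbers instead of raw audio samples.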

edit: syntax error

I've thought about it a little, it seems really hard. But that's a really interesting idea you have there. Are there any docs or blog posts or something you can point me to?

I think the server part would be pretty straightforward. For example, using Jack you could port the system outs (or Firefox outs) into something like Pure Data and do the decomposition there into the frequency/amp pairs. From there you could either send to an ssh tunnel or feed into whatever process is running on the server that sends messages to the client.

On the client side, I'm not so sure. Keeping with Pure Data, you could instantiate in the background and have it listen on a localhost port, then forward the pairs and resynthesize them.

If you want I can make a demo patch to show you what this sounds like.

Interesting. I've actually completely rewritten the app (see the webext-rewrite branch), so now the main code lives in a webextension (i.e. inside a tab). Having researched it, I unfortunately can't see any way to get the audio for a tab :( So we'd have to use your approach of connecting to the outside of Firefox somehow.

So then the other thing is how to make it as convenient as possible to open up the audio connection. It's gonna have to be something like `ssh texttop -p1234` for the imagery and `texttop_audio_client.sh -p1235` for the audio? There's no way to combine them I don't think?

So when you do `ssh texttop -p1234` how do the mouse movements and clicks get sent from the client to the server?

Just through SSH, they're essentially identical to keyboard input, just wrapped in different ANSI escape codes.
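
For anyone curious what that looks like on the wire, here is a small sketch of parsing xterm-style SGR mouse reports (the ones a terminal sends after the application enables them with `ESC[?1000h` and `ESC[?1006h`). Whether Texttop uses this exact mouse mode is an assumption, and `parse_sgr_mouse` is a made-up helper name, not part of the project.

```python
import re

# SGR mouse report: ESC [ < code ; column ; row, then 'M' (press) or 'm' (release)
SGR_MOUSE = re.compile(r"\x1b\[<(\d+);(\d+);(\d+)([Mm])")


def parse_sgr_mouse(data):
    """Extract mouse events from a chunk of terminal input."""
    events = []
    for match in SGR_MOUSE.finditer(data):
        code = int(match.group(1))
        events.append(
            {
                "button": code & 3,         # 0 = left, 1 = middle, 2 = right
                "motion": bool(code & 32),  # set while dragging
                "wheel": bool(code & 64),   # set for scroll wheel events
                "pressed": match.group(4) == "M",
                "col": int(match.group(2)),  # 1-based terminal column
                "row": int(match.group(3)),  # 1-based terminal row
            }
        )
    return events


# A left click at column 10, row 5, mixed in with an ordinary keypress.
events = parse_sgr_mouse("a\x1b[<0;10;5M\x1b[<0;10;5m")
```

Since these reports arrive on stdin like any other bytes, SSH carries them without needing to know they are mouse events.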

Just FYI you copy-pasted your HN comment straight into your readme and it broke the link in your readme (because of the way HN formats links).

Thanks. Fixed.

Also FYI, README.md contains a link "elinks" http://www.xteddy.org/elinks/ which is currently 404. http://www.xteddy.org/ has a link "Elinks" http://www.elinks.org/ which is currently 403. Sigh. A googled http://www.starshine.org/xteddy/thomas/elinks/ links to http://elinks.or.cz/ .

Fun project, I dreamed of this so many times. Happy to see it running

While the geekery is really, really cool, I also wonder if this method can be beaten by the best in video compression. Extremely low-bandwidth encoding perhaps provides better results (if you put CPU/GPU load aside); for example, see this x265 example over 50 Kbit/s (http://wp.xin.at/archives/4020/comment-page-1). People are watching Netflix in low quality at 100-200 Kbit/s. With the upcoming AV1 (AOMedia Video 1) it gets even better. Another avenue: Citrix/RDP connections work OK over 30-50 Kbit/s. Then there is the counterpart, FreeRDP. Wouldn't FreeRDP be the preferred solution here, except that Mosh solves the potential latency issue? With Linux desktops running on low-power clients already, why bother? Does it use less battery, so you can get 48 hours of surfing the internet in the mountains or forest while the solar-powered server runs at home?

I certainly think you can get close with video. But one of the things about this method is that I wanted it to be as universal and as accessible as possible. As in: accessing a fully modern JS-enabled browser over just an SSH client.

I did a project with WebKit that renders to the terminal:


Waat! We need to talk - I'm taking a very similar approach in my rewrite. There are some very interesting algorithm problems. E.g., I'm actually comparing pixels in 2 screenshots (with and without text rendered) in order to get text colour and visibility. This way is actually more performant than using `getElementAt(x, y)` and `getComputedStyle(element)`.
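
A toy sketch of the two-screenshot comparison described above, written in Python for illustration only (the rewrite itself lives in a webextension). The helper names `text_pixels` and `text_colour` are hypothetical, screenshots are assumed to be same-sized grids of RGB tuples, and anti-aliasing is ignored, so this is just the core idea and not the author's algorithm.

```python
from collections import Counter


def text_pixels(with_text, without_text):
    """Map (x, y) -> colour for every pixel that changes when text is rendered."""
    found = {}
    for y, (row_a, row_b) in enumerate(zip(with_text, without_text)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                found[(x, y)] = a
    return found


def text_colour(with_text, without_text):
    """Most common colour among the changed pixels, or None if no text is visible."""
    changed = text_pixels(with_text, without_text)
    if not changed:
        return None
    return Counter(changed.values()).most_common(1)[0][0]


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
background = [[WHITE, WHITE, WHITE], [WHITE, WHITE, WHITE]]
rendered = [[WHITE, BLACK, WHITE], [WHITE, BLACK, BLACK]]
colour = text_colour(rendered, background)
```

A region with no changed pixels means the text there is invisible (hidden, clipped, or the same colour as its background), which covers the "visibility" half of the problem.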

I rewrote mine too. One big problem I hit before rewriting was performance, and also CSS :after pseudo-elements: they are not reachable in the DOM, if I remember correctly. So I resorted to patching phantom to give me access to the WebKit render tree (which is a tree similar to the DOM tree, but not the same).

I can't remember if there was some big issue that eventually made me stop developing. I think the final issue was characters with dual width, like many Chinese characters: the GUI library, blessed, did not understand the concept of characters with non-standard width, and a single such character would destroy the entire layout. There were also issues with correlating the render tree and the DOM tree.
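
To illustrate the dual-width problem: a terminal layout has to count cells, not characters. A minimal sketch using Python's standard `unicodedata` module follows; the helper names are made up, and it deliberately ignores harder cases such as emoji sequences and ambiguous-width characters.

```python
import unicodedata


def cell_width(ch):
    """Number of terminal cells a single character is expected to occupy."""
    if unicodedata.combining(ch):
        return 0  # combining marks stack on the previous character
    # 'W' (wide) and 'F' (fullwidth) characters take two cells
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def display_width(text):
    """Total cell width of a string, as opposed to len(text)."""
    return sum(cell_width(ch) for ch in text)


ascii_width = display_width("abc")
cjk_width = display_width("中文")
accent_width = display_width("e\u0301")  # 'e' plus a combining acute accent
```

A layout engine that uses `len()` instead of something like `display_width()` will push everything after a wide character one cell to the right, which is how a single character can wreck a whole screen.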

Now I remember: the main problem before switching to the render tree was text rows. There is no good way of knowing where a text line will break without resorting to the render tree or using some form of OCR.

Ah yes, that was a big hurdle, but that's actually fully supported now with DOM Ranges - basically codified selection boxes, those regions that highlight when you select with a mouse. There's still some extra leg work though to render whitespace in the same way as the browser.

I did that too, if you are referring to getBoundingClientRect & getClientRects, before I switched over to the render tree.

It does work perfectly for single-line text but not multi-line text. The rect will become larger than the text and you won't know where each line starts.

Well that is what I remember happening.

Oh! Sorry, I just assumed they weren't around then. That's really interesting to know. Hey, could I have a chat with you somewhere, IM or something? Would be really good to hear your story with this.

It uses Xvfb, ffmpeg, hiptext (with fixes from the author), and a custom interfacer written in go, which employs xzoom in C and termbox-go (with fixes from the author).

The interfacer alone looks fun, seems like I can control a remote X session via ssh.

This looks amazingly cool!

And I apparently starred the repo previously, but don't remember ever having seen this particular badass shit before.

Am I losing it?

If your situation is anything like mine, it goes much more like: "oh, neat; I'll look at that tonight" followed by a browser bounce, or something else becoming a bigger priority, lather, rinse, repeat.

There is so much great content -- both "empty calories" and otherwise -- that it is so easy to get buried under it. I came to HN to retrieve _one specific link_ and ended up with 7 new browser tabs open, counting this one.

HN Tabs are like an unstoppable cancer. I'm on XFCE and I have a whole workspace dedicated to over a year's worth of articles, books, discussions, etc from HN that "I'll eventually get around to reading".

To say nothing of the gigs of research paper PDFs; hey, we're all going to need _something_ to do during retirement, right?

I use https://www.linkpack.io/ on iOS as my black hole for links.

Didn't know about that one. Pocket would also work, esp if you're using Firefox. Also works on the Kobo e-reader.

Other options I'm aware of are conventional bookmarks, and OneTab.

I just upvote articles I want to read later, then get around to revisiting that list once in a blue moon. But at least it reduces tab bloat.

I block HN in my computer's /etc/hosts file. If I want something to stick, I have to discover it on mobile or my iPad and jump through an extra hoop to get new links on my desktop. I find the added friction is nice for reducing "empty calorie" links.

> reducing "empty calorie" links

Heh, I actually meant things like "LA to Vegas", "Family Guy", and the like, but yes, I can believe there are empty calorie HN links, too

I presume you're aware of the "noprocrast" and "minaway" settings on your HN profile? Using `/etc/hosts` is some hard-core HN Addicts Anonymous level stuff :-)

No desktop/laptop screws up a lot. No consistent access. No feeling I will not be persistently made to make mistakes re: worse than prior...

This was posted in a lot of places a year or so ago (probably around the time it was released). I remember checking it out earlier for sure. Do you think you might've stumbled upon it from HN/Reddit and starred to show support etc?


Another interesting project, that's similar, but for VNC.

Finally, good news in the world of computing. The zoom function is a nice touch...

I expected this to be an elaborate joke, but the author obviously has a good use case for it. That is quite ingenious!

> Why not VNC?

Why not ssh -Y and use X protocol as intended?

He describes the motivation in the README here: https://github.com/tombh/texttop#why

Looks like the goal is to be able to view modern websites over poor internet connections using mosh. (And also probably a lot of fun to implement/see working)

> sometimes I don't have very good Internet. If all I have is a 3kbps connection

I don't think X would handle that very well, and you may not have an X server at the other side.

3 kbps is not enough for smooth TTY text-only ssh either...

I've done pretty productive work over much tighter links than that. Think 300bps.

One of the systems I'm responsible for currently can only be reached over a 1200bps dialup connection.

But does this work require full refreshes of 80x24 screen or larger?

That's where `mosh` comes in.
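
Some back-of-the-envelope numbers for the question above, plus a toy version of "only send what changed". This is not mosh's actual protocol, just an illustration of why syncing screen state beats resending the whole screen; the function names are made up and the estimate assumes one byte per cell with no protocol overhead.

```python
def seconds_to_send(num_bytes, bits_per_second):
    """Time to push a payload through a link, ignoring latency and overhead."""
    return num_bytes * 8 / bits_per_second


def changed_cells(old, new):
    """List (row, col, char) for every cell that differs between two screens."""
    return [
        (r, c, new[r][c])
        for r in range(len(new))
        for c in range(len(new[r]))
        if old[r][c] != new[r][c]
    ]


full_screen_bytes = 80 * 24  # 1920 cells at one byte each
full_at_300bps = seconds_to_send(full_screen_bytes, 300)
full_at_3kbps = seconds_to_send(full_screen_bytes, 3000)

# Typing one character only touches one cell.
before = [list(" " * 80) for _ in range(24)]
after = [row[:] for row in before]
after[0][0] = "x"
delta = changed_cells(before, after)
```

A full 80x24 repaint takes roughly 51 seconds at 300 bps and about 5 seconds at 3 kbps, whereas the single changed cell above is a few bytes either way.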

With wayland this would be a dead end

Why ?

ssh -Y is insecure, unless you completely trust the SSH server.

Hmm I wonder how many of these I can run at the same ti

For more detail, there's sixels, see https://github.com/saitoha/libsixel

I did look into that, but it's not as widely supported as the UTF8 half-block.

With many bytes necessary to print just six pixels, it wouldn't be as low-bandwidth, right?

On the other hand, how about libcaca? https://en.wikipedia.org/wiki/Libcaca

The TTY "graphics" rendering is actually relatively trivial, it's really just a few lines of JS in my rewrite now. So there's no need for an external rendering lib. Also with the new approach, I'm focussing on real text as the main feature, so you'll actually be able to turn off the colour blocks and just surf the web as raw text, with all the goodness of full CSS rendered, realtime text and JS.
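
For a sense of how little code the half-block trick needs, here is a sketch in Python (the rewrite does this in JS; this is not that code). It assumes a terminal with 24-bit colour support and an image given as rows of RGB tuples with an even number of rows; `render_half_blocks` is a made-up name.

```python
UPPER_HALF_BLOCK = "\u2580"  # foreground paints the top half, background the bottom
RESET = "\x1b[0m"


def render_half_blocks(pixels):
    """Render rows of (r, g, b) pixels as text, two pixel rows per terminal line."""
    lines = []
    for y in range(0, len(pixels) - 1, 2):
        cells = []
        for top, bottom in zip(pixels[y], pixels[y + 1]):
            cells.append(
                "\x1b[38;2;{};{};{}m\x1b[48;2;{};{};{}m".format(*top, *bottom)
                + UPPER_HALF_BLOCK
            )
        lines.append("".join(cells) + RESET)
    return "\n".join(lines)


# A single cell: red on top, blue underneath.
cell = render_half_blocks([[(255, 0, 0)], [(0, 0, 255)]])
```

Each character cell carries two vertically stacked "pixels", which is why terminal rows end up looking roughly square despite cells being about twice as tall as they are wide.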

Wow. I definitely like the Idea.

After reading the first few paragraphs I also finally understand why it would be such a great idea to have a blocky Firefox.

This is awesome, I've always wanted something like this. Zoom is the killer feature.

I wonder how you'd implement something like this in Wayland? Would you have a compositor on the remote side rendering the text to a terminal (that you'd connect to over ssh/mosh)?

Wow! Can't wait to try this! Coolest thing I have seen in a minute!

That's a good idea. I'll continue reading about it.

very creative, looking forward to giving it a try.

rick roll opportunity totally missed ;D

Why not stream X directly over ssh?

Because X over SSH requires a somewhat low latency and high bandwidth. The guy's requirement is unpredictable latency and low bandwidth (like 3 kbps). X is no fun at all over that kind of connection.

I find X-over-ssh unusable over anything non-local, even over local wifi it stumbles: the latency is especially painful. There's Xpra, which makes this somewhat bearable even on mid-level residential connections (unimpressive speed, high latency, horrible jitter), but for single kB speed, even plain ssh is really bad (with mosh making it just bad). This would make the result not-quite-horrible.

Finally, high-performance remote display!

No sound though :)

Could it not play (monotone) sound through the terminal bell?

It's a feature not a bug ;)

Thus the smiley :)

Sounds like a challenge

That's fucking nuts!

Given the quality of text (or the requirement to zoom) I'm not sure what the fuss is about. It looks abysmal, and having to zoom all the time is a habit I am glad we got rid of due to higher resolutions. Though still existing on mobile devices it is also user-friendlier to zoom there, and the reason for the feature is the device is more mobile.

You could already run a browser in a framebuffer for ages, which means you're not bound to X11. My friend, an X11 hater (he was a former Amiga user), just used the CLI and a CLI web browser (well, he used Emacs), but sometimes he did need a GUI and that was his solution. We're talking end of the '90s/beginning of the '00s here, during a time when X didn't even have compositing.

If you want to run X11 over SSH, that's also possible, e.g. via NX Server. Another route could be SDN plus (Free)RDP. Mosh is great for latency, tho can't be used for SSH over Tor due to UDP.

I'm actually in the middle of a complete rewrite (I was hoping to stay under the radar until then). I'm ditching X11 and ffmpeg and actually rendering real text as a layer over the TTY blocks. See: https://github.com/tombh/texttop/blob/webext-rewrite/webext/...

Have you considered using libcaca for some use cases? Might be appropriate for images/videos.

Have you (or anyone else) tried Texttop over Tor + SSH?

NX (FreeNX) and X2Go were already mentioned in the previous thread [1]. I've used NX successfully over 32 kbit/sec upload in 2005 (server in NL, client in CA/US). Latency was acceptable. But we're not talking about network speeds or latencies like 2G/Edge (which seems to be your use case). They support resuming of sessions. In other words, like RDP. And with regards to RDP: you can decrease the image quality to make it more bearable. Nowadays, open source implementations are available. If you're on very slow links you can also perhaps enable (more) compression; at the cost of more CPU power, it might still actually decrease latency. Not sure if LZMA2 is used for compression in this context these days.

Good luck with your project!

[1] https://news.ycombinator.com/item?id=11744788

Can't edit my post anymore but I just tried Texttop out. I ran it according to your Quickstart guide in Docker as you suggested. In Terminal the keybinds didn't work, in iTerm zooming at least works. Whilst I was zooming in, I noticed my CPU fans suddenly went crazy on my MBP compared to Chrome RDP. The only time that happens on RDP is when I play a game remotely, or watch a video remotely. Htop confirmed it was iTerm.App. I didn't see either of these terminals in your known to work list either.

Thanks for trying. Yes the iTerm keybindings are a known problem. Like I said, I've completely rewritten Texttop from scratch - it's now merely a lightweight Go binary (to receive a websocket and print to STDOUT) and a webextension. So it's significantly more performant.

Tom, my bad. It seems there is a bug in Docker for Mac which causes high CPU usage [1]. Def made the overheating worse. I'll give it a whirl on a different environment instead.

EDIT: Just tried it on Windows. Is there a terminal you can recommend working on Windows? I tried CMD.exe, Powershell, Bash (WSL / Ubuntu) and Cmder. None worked well.

[1] https://github.com/docker/for-mac/issues/1759

Oh, interesting, I never tried it on Windows. What doesn't work? Everything!? Can you make a GitHub issue for it?

I was using RDP. Had different issues with each terminal but am going to try local, and I'll report the issue(s) on GitHub.

Thank you :)

I can't speak for the alternatives you suggest, but I feel you're being overly unfair on this project. Even if it wasn't at all practical, it's still an interesting demonstration of the approach. Plus, it sounds like the author finds it useful, and I'm sure there are relevant use-cases. Maybe you could explain the "run a browser in a framebuffer" technique for those of us unfamiliar with it.

Zooming (not panning) is essentially as easy as scrolling, and I'm looking forward to a day where it's utilised more since it's such a natural motion. I can imagine a desktop (relevant!) that presents a huge 'canvas', allows you to zoom in and out of the filesystem, navigate the web at different levels, etc. There's definitely potential in interface design; what annoys you about zoom? I agree it shouldn't be a primary action that one would have to carry out very frequently.

I'm just expressing how I don't see it as useful (I'm curious to hear use cases though). That doesn't mean I want nobody in the world to find it useful. That's up to each and every one of us. Heck, perhaps it can be improved, who knows.

These projects I mentioned have already (partly) tackled the latency issues. RDP and NX suffer far less from latency issues than X11 and VNC.

As for framebuffer, see [1] [2] for examples. Although now that I'm looking into it perhaps my friend was using Links2 and not a Mozilla offspring. I remember he also used lynx/w3m/dillo. As you can see from the example, one could just run MPlayer locally back in those days, with a framebuffer on a local tty. Apparently YouTube works as well [3]. DirectFB links seem dead though, and it seems MPlayer website is also down [4].

Zooming appears a natural motion on mobile devices because it allows one to see more content which is hidden due to physical reasons (a trade-off). Do you really find that useful on today's large monitors? Laptops even? Instead, it appears that for desktop devices scrolling provides an adequate user experience. Zooming was added ages ago in Xorg, allowing a compositing window manager to utilise it. Has it been used a lot besides the obvious accessibility feature? The only reason it is usable in this specific use case is the low number of pixels (low pixel density). But I am curious about all these use cases you've got in mind for zooming.

[1] https://en.wikipedia.org/wiki/DirectFB

[2] https://en.wikipedia.org/wiki/Linux_framebuffer

[3] https://github.com/notro/fbtft/wiki/Framebuffer-use

[4] https://www.mplayerhq.hu/DOCS/HTML/en/fbdev.html

I have a use case! I travel with my smartphone (and leave my desktop at home), but I receive links to huge files via document sharing sites (gigabytes and up - not the medium I would choose, but the choice isn't mine: the clients like those, for some strange reason). The links expire quickly, are not downloadable through console browsers (JS/images/CAPTCHAs), and I might not even have enough space and/or bandwidth to download them locally, to the smartphone. Up to now, I've had to resort to weird hacks such as "X+ssh forwarding a remote xpra session from the desktop, with a browser window in it", which sort-of-worked, but was painfully slow. This is a different approach, one that seems workable.

Zooming is necessary due to the smallish smartphone display, "latency issues tackled by RDP" is mostly wishful thinking (okay, it's gone from "unusably bad" to "just bearably bad").

Right so, if the previous options of running UI apps remotely don't work, you'd need a web browser with JPEG, PNG, JS and Linux framebuffer support.

Dillo and Lynx are quick. Dillo's last release was 2 years ago. Lynx is in active development. They don't have JS support though.

Links has support for JPEG/PNG and the Linux framebuffer, but somewhere in the '00s they removed JS support (it was an unmaintained port of SpiderMonkey). The ELinks port, which also supported all 4 features, is dead.

There's surf and uzbl. I don't know the latter much, former is used with keyboard shortcuts only. AFAIK these don't use the Linux framebuffer. Dooble is a lightweight web browser, but no Linux framebuffer support.

I question wanting JS 24/7 locally. You can also just not load all this JS nonsense and have a quicker experience (with NoScript or uMatrix). Same with loading all images. You could use a proxy to downsize JPEG pictures as well. The very first Opera versions on mobile phones did the same thing. They didn't even serve HTML! They spoke a different language with the browser. I'm not sure if Vivaldi (formerly Opera) still employs this feature. But you need to remember there's a market for this in upcoming economies where mobile bandwidth is scarce.

Sources used [1] [2]

[1] https://en.wikipedia.org/wiki/Comparison_of_web_browsers

[2] https://en.wikipedia.org/wiki/List_of_web_browsers_for_Unix_...

"Console browsers won't work at those particular sites, because the sites are dependent on JS and image CAPTCHAs". - "I question wanting JS 24/7 locally [because you can just not use JS and you'll be fine]". What part of "if I do not have JS, it just won't work at all" did I fail to explain? Did you even mean to reply to this comment, and not some completely different one? (Rest assured that I use NoScript for my usual browsing needs, most sites cope with no-JS gracefully, and this use-case is not a typical browsing set-up, or anything that would be useful for general browsing.)
