This is very efficient, but the consequence is that Vi becomes slow when working with huge files, because it has to traverse a bigger linear array.
To put "bigger" into perspective, a modern CPU can traverse and copy memory at over 10GB/s.
This is why I've always found the argument for more "efficient" (and more complex) text editor data structures a bit tenuous: even if you have to move MBs of data with every insertion (as happens with a simple gapless buffer), computers truly are so fast that it wouldn't look any different to the end user; a 1 ms delay and a 1 µs delay upon each keystroke are, to the user, practically indistinguishable.
That's not to say I'm one of those who preach against "premature optimisation" and don't care about efficiency; far from it, in fact. But the popularity of, and lack of speed-related complaints about, the small DOS text editors and even syntax-highlighting IDEs on PCs from the 80s through the early 90s, which used the same "one buffer" paradigm on machines with a fraction of today's memory bandwidth, suggest that the complexity of more "clever" data structures may not be worth it.
...and yet we somehow still manage to make editors that peg a single CPU core just blinking a cursor.[1]

[1] https://news.ycombinator.com/item?id=13940014
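For a rough sense of the arithmetic behind that claim, here's a sketch (mine, not the commenter's) of the naive "one contiguous buffer" insert being discussed: every keystroke shifts the whole tail of the buffer, and at ~10 GB/s even a 10 MB tail costs on the order of a millisecond per keystroke.

    #include <stdlib.h>
    #include <string.h>

    /* Naive gapless buffer: one contiguous array of bytes. */
    typedef struct {
        char  *data;
        size_t len;
        size_t cap;
    } Buffer;

    /* Insert one byte at `pos`, shifting everything after it right by one.
     * With a 10 MB tail and ~10 GB/s of memory bandwidth, the memmove is
     * roughly a millisecond of work on every keystroke. */
    int buf_insert(Buffer *b, size_t pos, char c) {
        if (b->len == b->cap) {
            size_t ncap = b->cap ? b->cap * 2 : 64;
            char *p = realloc(b->data, ncap);
            if (!p) return -1;
            b->data = p;
            b->cap  = ncap;
        }
        memmove(b->data + pos + 1, b->data + pos, b->len - pos);
        b->data[pos] = c;
        b->len++;
        return 0;
    }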
> computers truly are so fast that it wouldn't look any different to the end user; a 1 ms delay and a 1 µs delay upon each keystroke are, to the user, practically indistinguishable.
Unless, of course, the 1 ms operation drains the user's battery 1000x faster than the 1 µs operation.
Efficiency is no longer about speed; it's about noise, temperature, and battery life. Speed is a happy side effect.
You’re assuming the chipset is completely free and waiting to handle your text edits.
If my editor is responsive while the computer is idle, but slows to a halt when I am running some other intensive process, that’s not good enough. I’d put the editor in the same category with the window manager and the terminal... the “needs to still work even while the machine is under extreme load for random reasons” category.
I tend to agree with you. I'm using my own editor most of the time now. I occasionally fall back to Emacs still, but less and less. An early decision I made was to separate the buffers and almost all the rest of the editor code.
It's slow Ruby. It's storing everything in an Array of String objects. That means one (Ruby-internal) allocation per line for the String object, and an additional allocation for the rest of the String if it exceeds something like 24 characters (Ruby String objects store characters in the object itself if they're short enough, but otherwise store a pointer to a separate buffer). On every keypress (because I've been lazy and haven't bothered optimizing away cases where it's not needed), the frontend requests the full set of visible lines from the backend over a TCP connection (using DRb for RPC), re-applies syntax highlighting and other formatting, and in the process causes a flurry of additional object allocations. It's not optimized at all. It's almost flagrantly wasteful in many areas.
It started that way out of simplicity and a desire to get it to the point where I could use it for most of my editing ASAP. And it has remained that way so far because the lag just isn't noticeable to me, and it's very rare I edit large enough files for it to matter. I might optimize it at some point, but to me this seems like something people tend to worry too much about.
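For concreteness, here's a rough C analogue of that "array of line strings" shape (the real thing is Ruby, and these names are made up for illustration): one allocation per line, and the per-keypress "give me the visible lines" request reduces to copying a window of line pointers.

    #include <stdlib.h>
    #include <string.h>

    /* Buffer held as an array of per-line strings: one heap allocation
     * per line, plus the array of pointers itself. */
    typedef struct {
        char  **lines;    /* lines[i] is a NUL-terminated heap string */
        size_t  nlines;
    } LineBuffer;

    /* What the frontend's "visible lines" request boils down to: a copy
     * of the pointers for lines [first, first + count). In the editor
     * described above, this crosses an RPC boundary on every keypress. */
    char **visible_lines(const LineBuffer *b, size_t first, size_t count) {
        if (first >= b->nlines) return NULL;
        if (first + count > b->nlines) count = b->nlines - first;
        char **view = malloc(count * sizeof *view);
        if (!view) return NULL;
        memcpy(view, b->lines + first, count * sizeof *view);
        return view;  /* caller frees the view array, not the lines */
    }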
Sometimes efficiency dies by a thousand cuts, and most text editors today end up running with plugins that also do compiling and highlighting and IntelliSense and so on. The work could add up, perhaps?
Also, sometimes people work with really, really big files, for various reasons good and bad.
If you've never poked at the saved state of an editor that stores its edit history tree, it's a revelation when you try the UNIX strings command on one. It took me a few minutes to realise that inserts and moves were not the same thing, because the structure follows insert order to permit infinite undo. The actual text position was a subsidiary location tagged onto each string.
I think everyone should be asked to write their own simple editor, and then, as an extension, try to convert from the obvious simple array-based approach to one that begins to do what these editors do.
Sure. You could leap to the best solution first. Probably you lucked out in the gene pool and your future fate is not to write bad code. The rest of us, who did not acquire sufficiently complex frontal lobes, struggle with the concepts and probably dive into stupider solutions first, like I did at uni in the 1980s: allocating a fixed-size buffer (big initial mistake) and maintaining a huge linked list of edit-sequence events into it.
I passed the course, but future evidence suggests I didn't learn as much as I hoped!
Ha ha, well, I'll confess, I got lucky and discovered the existence of gap buffers before I tried to write a simple editor. And these days, there's a Wikipedia article on quite a few of the fundamental editor data structures.
It's worth noting that no widely-used editor since emacs uses a gap buffer, so it's not like it's won the hearts and minds of implementers.
I wouldn't say a gap buffer is the "complicated and crazy" data structure for text editing. Certainly not the best. There's no free lunch, every approach has its downsides :)
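For anyone who hasn't met one, here's a minimal sketch of the gap buffer idea (a generic illustration, not any particular editor's code): the free space sits at the cursor, so typing is O(1), and moving the cursor only shifts the distance moved rather than the whole file.

    #include <string.h>

    /* Gap buffer: the text lives in one array with a hole (the "gap")
     * at the cursor. Typing fills the gap; moving the cursor slides it. */
    typedef struct {
        char   data[1 << 20];
        size_t gap_start;   /* cursor position */
        size_t gap_end;     /* first byte of text after the gap */
    } GapBuffer;

    /* Insert at the cursor: O(1) while the gap has room. A real
     * implementation would grow the buffer when the gap closes. */
    void gb_insert(GapBuffer *g, char c) {
        if (g->gap_start < g->gap_end)
            g->data[g->gap_start++] = c;
    }

    /* Move the cursor: memmove only the distance moved. */
    void gb_move_cursor(GapBuffer *g, size_t pos) {
        if (pos < g->gap_start) {
            size_t n = g->gap_start - pos;
            memmove(g->data + g->gap_end - n, g->data + pos, n);
            g->gap_start -= n;
            g->gap_end   -= n;
        } else if (pos > g->gap_start) {
            size_t n = pos - g->gap_start;
            memmove(g->data + g->gap_start, g->data + g->gap_end, n);
            g->gap_start += n;
            g->gap_end   += n;
        }
    }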
> [Sam] is one of the first editors to separate its UI from the actual editor - Sam can be used on both the command-line and as a graphical text editor.
There were different implementations of TECO (written in the assembly languages of different machines) that had different user interfaces. That does not in any way imply that TECO had an editor process using a defined protocol to communicate with its UI process; in fact, in 1964, TECO ran on two computers, and neither of them had separate processes, or an operating system to run them under. ITS development hadn't started yet.
As far as I know, all the implementations of TECO, even today, glom the screen update logic (if any!) together with the editor-buffer logic in a single monolithic process.
I use ropes in xi editor. I did not find the argument against ropes convincing. Yes, they're not trivial to implement, but in a proper programming language you're not dealing with the data structure directly, you're always going through the interface, so you get the logic right once in the rope library implementation and then forget about it.
In a low-level design, your editing operations would be poking at the data structure directly. There, the simplicity of a gap buffer is a pretty big win. I agree in this environment ropes are too complicated. However, I don't see any good reason to architect a text editor in this way. Use abstractions.
The linked article contains a factual error: the referenced Crowley paper does not consider ropes, so it cannot be used to support the argument that piece tables outperform ropes.
There's one other important concern with piece tables I didn't see addressed. It depends on the file contents on disk not changing. If your file system supported locking or the ability to get a read-only snapshot, this would be fine, but in practice most don't. It's very common, say, to checkout a different git branch while the file is open in the editor. Thus, the editor must store its own copy to avoid corruption. In the long term, I would like to see this solved by offering read-only access to files, but that's a deeper change that can be made piecewise.
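To make that disk dependence concrete, here's a minimal piece-table sketch (illustrative only): every piece points into either the original file's bytes or an append-only "add" buffer, which is exactly why the file changing on disk underneath the editor corrupts the document unless the editor keeps its own copy.

    #include <stddef.h>

    /* Piece table: the document is a sequence of "pieces", each a span
     * of either the original file (never modified) or an append-only
     * add buffer holding inserted text. */
    enum Source { SRC_ORIGINAL, SRC_ADD };

    typedef struct {
        enum Source src;
        size_t      start;   /* offset into that source buffer */
        size_t      len;
    } Piece;

    typedef struct {
        const char *original;  /* the file's contents as opened */
        const char *add;       /* grows as text is typed        */
        Piece      *pieces;    /* in document order             */
        size_t      npieces;
    } PieceTable;

    /* Read byte i of piece p. SRC_ORIGINAL reads straight from the
     * opened file's bytes, so if that file changes on disk (say, a git
     * checkout) and the editor kept no private copy, the document is
     * silently corrupted. */
    static char piece_byte(const PieceTable *pt, const Piece *p, size_t i) {
        const char *base = (p->src == SRC_ORIGINAL) ? pt->original : pt->add;
        return base[p->start + i];
    }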
> I'm surprised programmers haven't created overlays for Vi or Emacs for the GtkTextView. Definitely an opportunity for someone there.
https://github.com/polachok/gtk-vikb was already a couple of years old by the time this was written. (Not sure about the current status of that library though, especially given the commit history and the absence of any mention of GTK3.)
MicroEmacs, a free text editor from the 1980s, uses a doubly-linked list of lines. It is fast and memory-compact; after all, it worked on a 64K DOS machine.
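Roughly the shape of that structure (a sketch, not MicroEmacs's actual code): each line is its own small allocation linked to its neighbours, so inserting or deleting a line never moves the rest of the file.

    #include <stdlib.h>
    #include <string.h>

    /* Doubly-linked list of lines: inserting or deleting a line only
     * touches a couple of pointers, never the rest of the file. */
    typedef struct Line {
        struct Line *prev, *next;
        size_t       len;
        char        *text;   /* just this line's characters */
    } Line;

    /* Insert a new line after `at` (or at the front if at == NULL). */
    Line *insert_line_after(Line **head, Line *at, const char *text, size_t len) {
        Line *l = malloc(sizeof *l);
        if (!l) return NULL;
        l->text = malloc(len + 1);          /* NUL-terminated copy of this line */
        if (!l->text) { free(l); return NULL; }
        memcpy(l->text, text, len);
        l->text[len] = '\0';
        l->len  = len;
        l->prev = at;
        l->next = at ? at->next : *head;
        if (l->next) l->next->prev = l;
        if (at) at->next = l; else *head = l;
        return l;
    }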
In that memory regime (and without virtual memory), code size matters. With a simple approach, your data structures may take 10% more memory, but if the code is a few KB smaller, it's a net win even when memory is tight.
Also, with memory that tight, people would sometimes choose slower approaches if they used less memory. For example, for each line, do you use a gap buffer, or simply move every line? The former may be slightly more efficient, but eats extra memory.
In the comment sections the author claims that Sublime text uses a fork of GtkTextView. Is this true?
Here's the relevant bit:
> Open Sublime in a hex editor, there are gtk debug strings everywhere, along with gtktextview/edit strings too. It's as "home grown" as a fork can get really. Some searching should bring up more evidence.
Let's not assume everyone is so stupid that they need the exceptions to their statement, taken to the extreme, pointed out to them.
I've noticed an increase in obnoxious disclaimers like "there are exceptions of course" and "it's just my opinion of course" as people get tired of responding to pointless responses that point those statements out.
That's just my opinion though. There are certainly exceptions. Feel free to disagree!
I absolutely agree sometimes. YMMV. Happy to hear your thoughts.
If only we had the ability to (literally) check boxes to include these disclaimers as flair on our comments. Then they might add disclaimer packs for the most controversial topics so that folks don't have to click too many boxes.
Obviously, there might be exceptions. Some folks might not want this feature. Others may. Who knows? Certainly not me.
While I don't disagree that 8GB isn't much for a developer machine these days, I do also think your claim is a little exaggerated. I certainly don't have any issues running a couple of IntelliJ instances plus Chrome with at least a dozen tabs open (seriously; I'm terrible for leaving tabs open).
What I find kills my machine is trying to run Chrome and Firefox concurrently. Particularly if I still have IntelliJ open.
I do most of my work on a laptop with 4GiB of memory (manufactured in 2009). Probably the last time I did anything dev-related that meaningfully stressed the machine was when I decided to give Android Studio a shot.
Just sitting there, with nothing else open, the machine was completely unusable.
I cannot recall anything else I have done with the machine in its lifetime that's been a real problem.
This includes running Firefox and Chromium concurrently, plus a couple of virtual machines.
As always when people start comparing system performance, there are so many variables at play (CPU make / model / age / cache / number of cores, bus speed, RAM type, OS / distro / package versions / compile flags, swap space, running daemons / services or other background processes, number of browser tabs and even the specific sites left open, etc.)...
But for me it's the Flash plugin container for Firefox that really does the damage when it comes to concurrency. A few Firefox tabs running YouTube (or another video streaming service) and my fan already starts spinning up louder than a revving motorbike. If I have much more open aside from Firefox, the system will just lock up without warning. I suspect Flash is thrashing both CPU and RAM, but it's consistent enough behaviour that I tend to avoid it rather than investing time debugging and fixing it (though since it is Flash that kills it, I suspect any such "fix" would just be altering the behaviour of the software's end user (i.e. me) anyway).
>with at least a dozen tabs open (seriously; I'm terrible for leaving tabs open).
Come back when you have 150 tabs open. Anyway, somewhere around 200 tabs across 4 windows is where I start seeing about 10GB of RAM used, including the OS and other apps.
RAM never seems to be much of an issue unless I'm abusing things; it's CPU/disk that kills me, particularly when some background process goes out of control.
I'm horrible about tabs too. But I mostly use them as an "I'll use this page again in a moment" stash. Often that's true, but often I also just end up leaving them around.
My solution was the OneTab extension for Chrome. It gives you a button that closes all tabs (with some ways of marking exceptions) and adds them to a page, so you won't lose them. It groups them by date, from newest to oldest. That makes it easy to come back to the things I left in tabs "just in case", while keeping the tab count down.
In practice I find I come back to very few of them, but knowing I can makes it a lot easier mentally to press that button.
(And now I'm off to do just that before doing some work)
Been there, done that. I was constantly losing documents I wanted saved because the browser would crash, etc. Then I realised my behaviour was counterproductive and have since employed a saner approach to tab usage.
Honestly; having more than a dozen tabs open doesn't make you a hero. It just makes your life a little less convenient.
CPU is rarely an issue for me, but disks are a constant source of annoyance.
The turning point for me was when I got a new PC at home that has an SSD. I used to think people exaggerated the benefits of SSDs, or that they were just very impatient. ... Now I am spoiled.
While I care a lot about typing lag (but not at all about autocomplete), if an editor doesn't start fast, or at least have a client-server model where the client starts fast, I'd drop it after 5 minutes.
That's one reason to care about program size, though less about the size of the basic binary than about how many extra dependencies it has; and what else the editor does during initialization certainly matters too (e.g. many Emacs configs do DNS requests synchronously in the critical path on startup, and end up hanging until timeouts trigger if your network is down... fun times).
RAM usage, I agree, matters less. I can't remember the last time Emacs warned me I was about to open a large file. My own editor is so RAM-inefficient it'd bring my laptop to its knees in no time if I were to open a huge file; I'm happy to fall back on another editor if I for some reason ever need that. Though with many applications, RAM usage is large for all the wrong reasons (my editor would be in that category for other people; for me it's the right reason: it keeps it tiny and fast for me to hack on for the time being - but I don't inflict it on other people (yet)).
But it's still useful to evaluate on these criteria, as they may matter more for others.
To put "bigger" into perspective, a modern CPU can traverse and copy memory at over 10GB/s.
This is why I've always found the argument for more "efficient" (and complex) text editor data structures a bit tenuous --- even if you have to move MBs of data with every insertion (as happens with a simple gapless buffer), computers truly are so fast that it wouldn't look any different to the end-user; a 1ms and 1us delay upon each keystroke is, to the user, practically indistinguishable.
That's not to say I'm one of those who preach against "premature optimisation" and don't care about efficiency; far from that, in fact, but the popularity of and lack of speed-related complaints against the small DOS text editors and even syntax-highlighting IDEs on PCs in the 80s through early 90s which used the same "one buffer" paradigm, on machines with a fraction of the memory bandwidth of those today, suggest that the complexity of more "clever" data structures may not be worth it.
...and yet we somehow still manage to make editors that peg a single CPU core just blinking a cursor.[1]
[1] https://news.ycombinator.com/item?id=13940014