Hacker News new | comments | ask | show | jobs | submit login
Turn Vim into Excel: Tips for Editing Tabular Data (2013) (alangrow.com)
92 points by luu 72 days ago | hide | past | web | favorite | 35 comments



Since no one mentioned it so far: Emacs has a mode built-in called Org Mode which has fantastic table support: https://orgmode.org/manual/Tables.html

I'm using it every day for so many things, importing and exporting of data, calculations, writing offers to clients, there's an infinite amount of good things to be done with spreadsheets within your text document.


Yes tables in org mode are very nice


If you want to use a spreadsheet with vi key bindings, most linux distros have sc (spreadsheet calculator) in their repositories. It's kind of a cross between vi and the DOS version of Lotus 1-2-3. A newer, improved version of sc with features including undo/redo and csv/tsv export and import is sc-im (https://github.com/andmarti1424/sc-im).


Vim keybindings are not Vim. The same claim is made about Vim emulators in IDEs. IntelliJ has the best Vim emulator I've ever seen, and it's still useless to a Vim power user.


While I don't know IntelliJ's emulation, I have to say that I find Evil-Mode for Emacs to be nothing short of amazing.


Evil is just the most complete vim that isn't actually vim


As a user of both (I use IntelliJ for Scala development and Spacemacs for everything else), I'd say Evil is way way better. The IntelliJ vim plugin is awesome, but it will never feel as integrated as Evil does. Emacs is way more extensible.


I think that's the most important differentiator evil has.

Embedding vim-style editing in an IDE or less extensible editor might get you 90% of what you want from Vim, but that 10% can be a dealbreaker depending on your workflow. Without easy extensibility, you might not be able to get that last 10% at all.

I'd say evil starts out a little better, maybe 95% out of the box, but now you have all the power of emacs to make up that last 5% and then start to go even further if you're not content then.


I've been finding visidata (http://visidata.org) the perfect vim-inspired spreadsheet tool.


Seconded. I enjoy using this software.


I found that for most problems where people use Excel I prefer CSV, various command line tools (awk, perl and the awesome xsv) and Vim. I'm regularly faster that way. When data manipulation is involved my solutions are robuster, more reliable, work well even for huge data sets and most importantly: they are reproducible.

That being said, there are two features that Excel - or any other spreadsheet software for that matter - provides, which are missing from any Vim solution I have found:

1. Hiding overly long cells.

In most tables I see most of the cells have reasonably short lengths, but almost always there are a few overly long outliers. What I want is a way to manually set the column width and then all outliers cut-off. When vim's conceal feature came out I hoped someone would make a plugin to that effect, but as far as I know no one ever did. Looking into the problem myself a while ago, I found that it's not trivial to do.

2. Quick and easy filtering.

I find the column filters in Excel pretty useful. The first value they provide is that they give an overview what values are actually used in the column. Second value is to temporarily hide all the rows one is not interested in. They are also very easy to use. I would like to have some easy way to do that in Vim just for viewing and without actually manipulating the file.


xsv is great -- as is csvkit, which preceded it and is probably a bit slower, but has a few extra utilities that I find essential:

- in2csv: extracts Excel worksheets into CSV

- csvsql: produces (and executes) the SQL needed to create a table and insert the data from a CSV

I found these scripts to be profoundly useful when working with govt data and needing to provide a automated workflow that collects and collates their Excel spreadsheets and converts them into CSV and SQLite databases.


I think 2 is fulfilled with xsv with the frequency command and the select command. It's 2 commands, but it's not excel and it's reproducible. I find the source code is kinda helpful to see what arguments I should set or how it works.


I personally prefer just strait sqlite3 and then R or maybe awk for automation.

There are tons of interactive tools that can be used with it too if you want that.


I'm not convinced. Excel is used predominantly used to evaluate formula's that are based on values in other cells. While I'm not a fan of Excel, vim simply does not have this capability and really can't replace Excel for most reasonable use-cases.


Exactly.

Excel is so much more than tabular data. It is a directed acyclic graph, a functional, reactive programming lanaguage with immutable data structures, and a display layer for the results of your computation.


The filter command in vim allows you to use any external program to manipulate the contents of the buffer. So, if you have a CSV or TSV and want to do something like compute the sum of numbers in a column, you could write a awk or perl one-liner with the appropriate field-separator parameter and do it.

So, for a CSV, you could do something like this in perl

    :% !perl -F/,/ -ane '$total += $F[3] if $F[4] =~ /Merchandise/; END {print $total;}'
This command will filter every line in the vim buffer through the perl one-liner. The perl one-liner splits the line by comma, and adds the amount in the fourth column to the total amount if the fifth column contains the word "Merchandise". The lines in the vim buffer are replaced by the computed total. I actually use this to determine how much I spend for each category in my Discover Card statement after downloading it as a CSV.


This won't work if there are commas within fields, as permitted by CSV quoting rules.


Checkout sc-im(https://github.com/andmarti1424/sc-im) if you are heavy vim user, you would definitely love it


Why is tab separated value format superior to CSV? I understand why CSV is a poor choice and commas are more likely to be needed in cell contents than tabs, but there's not a standard format for TSV anymore than there is for CSV.


There is also SC ( https://www.linuxjournal.com/article/10699 ) and the similarly named https://github.com/andmarti1424/sc-im

But I'm happily reaching out to gawk most of the time for CSV processing.


Why use Tab separated values instead of Comma separated values?

Is it possible to just use some random character, maybe even from different charset, to act as delimiter for importing/exporting tabular data?


The argument is here [http://www.catb.org/esr/writings/taoup/html/ch05s02.html]: This format is to Unix what CSV (comma-separated value) format is under Microsoft Windows and elsewhere outside the Unix world. CSV (fields separated by commas, double quotes used to escape commas, no continuation lines) is rarely found under Unix.

In fact, the Microsoft version of CSV is a textbook example of how not to design a textual file format. Its problems begin with the case in which the separator character (in this case, a comma) is found inside a field. The Unix way would be to simply escape the separator with a backslash, and have a double escape represent a literal backslash. This design gives us a single special case (the escape character) to check for when parsing the file, and only a single action when the escape is found (treat the following character as a literal). The latter conveniently not only handles the separator character, but gives us a way to handle the escape character and newlines for free. CSV, on the other hand, encloses the entire field in double quotes if it contains the separator. If the field contains double quotes, it must also be enclosed in double quotes, and the individual double quotes in the field must themselves be repeated twice to indicate that they don't end the field.

The bad results of proliferating special cases are twofold. First, the complexity of the parser (and its vulnerability to bugs) is increased. Second, because the format rules are complex and underspecified, different implementations diverge in their handling of edge cases. Sometimes continuation lines are supported, by starting the last field of the line with an unterminated double quote — but only in some products! Microsoft has incompatible versions of CSV files between its own applications, and in some cases between different versions of the same application (Excel being the obvious example here).


There's a bunch of ASCII characters (28 - 31) specifically meant to be used as delimiters, but they never caught on.

But even if they did, you still always need to handle the case where one might legitimately want to use the delimiter character inside a field value.


Wow, TIL this actually exists. Mind is blown on how they never got adopted outside of folks thinking (foolishly) that these characters were not "plain" text.


This is one of my disappointments -- the fact that unit and field separators did not catch on.


There are absurdly many corner cases either way. For my money the best representation for an Excel spreadsheet is xlsx. There are several perfectly capable open source xlsx reading and writing libraries, not to mention Excel itself. (At this point, Excel is pretty much the only reason I can think of for wanting to install Windows.)

Using xlsx rules out vim of course, and much as I'm a fan of vim I think it's plain foolishness to endorse it as a spreadsheet program. Vim is fine for ad-hoc viewing and quick-and-dirty edits, especially if the file is full of numbers and uncomplicated text, but it's really easy to break a csv that has fancy escaping in it.


I too don't see why tsv is better than csv. That said, as many people have pointed out, commas have legitimate uses in fields like Address of a locality etc. and this could cause a problem in scripting.

About exporting using random characters, I have found that ~ (tilde) is almost never seen in actual data, hence it can be used as a field separator in most cases. Another safe choice is the | (pipe) symbol. (In Windows, you can change the field separator in the "Region Settings" - a rather weird place to have that setting.)


Tilde is used in path names too much (~user), pipe is a logical OR. The best I've found is ^ (caret) which rarely shows up in real life.


Tab is in the Control Character block of ASCII so that alone makes it strictly more appropriate


It's VS Code + the vscodevim extension for me lately and it's been great. I am not a first-rate Vim user anyway, so the convenience and ecosystem of VS Code won me over easily.


Well... I'm using the same extension in VS Code, and I find the experience rather... meh. It works, it makes VS Code usable, but if you're accustomed to the performance of the real Vim, it's still underwhelming (especially if you're used to macros).


I have tried previously to switch from vim to other editors (IDEA and sublime text). It felt painful. But this time (to VS Code) it has been smooth so far. I think it might be the result of me forcing myself to forget about the vim way completely. For example I have been deliberately using mouse more often, and supressing the thoughts and longing of registers and macros etc. I am not a vim power user to begin with so It's not that difficult. Losing vim power but gaining the benefits of VS code, so far so good.


This sounds fun but honestly reminds me of all these emacs folks who've turned a text editor into a byzantine operating system with email, RSS, microwave cooking, and laundry folding.


Well technically emacs being a lisp interpreter gives you infinity possibilities.

I have heard that there was a saying that the only job of a Mac air was to open emacs.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: