
Tad, a tabular data viewer - tosh
http://tadviewer.com/
======
Macuyiko
Nice, some initial remarks:

\- Sad to see it's an Electron app, but I don't want to start this flamewar
all over again

\- Mid-sized CSV file opens fairly quickly

\- If you want to build this into a full-fledged app, be prepared to handle a
lot of CSV edge cases, also think about supporting AVRO, Parquet files

\- Regarding filtering: I can filter on CONTAINS and '=' but not on NOT-
CONTAINS or '<>' I think? This makes it annoying to filter out empty strings
(didn't check in much detail so I might be wrong here)

\- Regarding filtering: I have a lot of columns, would be nice if the dropdown
would allow me typing some characters to prune the list and quickly find what
I'm looking for

\- You might think about including some statistics per column, e.g. variance
or entropy to allow for exploration in case many columns are present and you
quickly want to highlight the more "interesting" ones
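
As a rough illustration of the per-column statistics idea, here is a minimal
Python sketch that ranks columns by Shannon entropy. It is not something Tad
does; the function names and the sample data are made up for the example.

```python
import csv
import io
import math
from collections import Counter


def column_entropy(values):
    """Shannon entropy (in bits) of the value distribution in one column."""
    counts = Counter(values)
    total = sum(counts.values())
    # p * log2(1/p) summed over distinct values; 0.0 for a constant column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def rank_columns_by_entropy(csv_text):
    """Return (column, entropy) pairs, most varied column first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    scores = {
        name: column_entropy([row[name] for row in rows]) for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample: "status" is constant, so it ranks last.
sample = "id,country,status\n1,BE,ok\n2,BE,ok\n3,US,ok\n4,FR,ok\n"
ranking = rank_columns_by_entropy(sample)
for name, bits in ranking:
    print(f"{name}: {bits:.2f} bits")
```

A constant column scores 0 bits and a unique-per-row column scores the
maximum, so the columns in between are usually the ones worth a closer look.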

Congrats on putting something out! Some competition in this space:
[https://exploratory.io/](https://exploratory.io/),
[https://github.com/saulpw/visidata](https://github.com/saulpw/visidata) and
[http://www.delimitware.com/](http://www.delimitware.com/) (as others have
mentioned).

~~~
leandot
Genuinely curious, what can be used instead of Electron for cross-platform
desktop app?

~~~
shakna
Tk - It can look a bit dated, but it's easy to use, comes bundled with Python
or Tcl, and has bindings in most popular languages. Can run on Windows, Linux,
macOS.

Qt - The IDE is great. It doesn't look native. Either dynamically linked, or
commercially licensed. Runs on Windows, Mac OS X, Linux, Android, iOS, and a
range of embedded hardware.

wxWidgets - Native backends, so it always looks like it belongs. Bindings in a
ton of languages. Can be both simple or complex, depending on what you need it
to do. Runs on Windows, Mac OS X, Linux, and in-progress for Android and iOS.

JavaFX - Java's replacement for Swing. Runs anywhere Java does. Fairly
flexible, and easy to use.

Kivy - A Python framework. Mainly aimed at touch-compatibility. Runs on
Windows, Linux, macOS, iOS, and Android.

LCL - A Lazarus framework. (Think Pascal). Really easy to use, with great
drag'n'drop and the like in the IDE. Runs on Windows, macOS and Linux.

nuklear - Fairly easy to use, with bindings in a lot of languages. Sometimes
requires a bit more work getting it to run on some platforms.

This is not all, but the ones I find easy to use (as easy or easier than
Electron), and easy to set up and deploy.

~~~
leandot
Thanks for such a detailed answer. Can you elaborate why those would be better
than Electron?

~~~
shakna
It depends on what you want from the framework.

Electron doesn't look native, which a lot of consumers don't like. Others
don't care. So, depending on your audience, it can be an extra hurdle that
some like wxWidgets don't have.

Electron is difficult to get performant. You need to really think and test
your perf. Qt, Tk, and a few of the others are much faster, much easier. And
if you put in the same effort you need to put in for a fast Electron program,
you can end up with blisteringly fast speeds.

Electron bundles are large to download. Some places this doesn't matter,
others it does. (Think Australia, Africa and the like where tiny download caps
exist). IIRC Qt, the biggest dependency, is about half the size of Electron.
Tk and wxWidget are small, nuklear is just a header. It's tiny.

Electron is not good with touchscreens (or "harder to get right"), but more
and more people have them. Kivy is great for that usecase.

Electron is "just code". Qt Creator and Lazarus' IDE are phenomenal for
putting GUIs together. You'll be surprised how little code you need.

Electron is primarily JavaScript. Some people like that, others don't. If you
want an easy language, you can use Python, or Nim. Want more control? C or C++.
Value both time and control? Go with Java.

Electron has its place.

If you have a really tight time-to-market window, or this is just a tiny
personal project, and JavaScript is your "goto" language, then awesome. Use
it.

But, the cross-platform GUI world is big. You have a dozen or so mature,
stable, proven frameworks.

So if your project takes off, or you have the time to do things right,
evaluate if any of these libraries, including Electron, fit your needs.

~~~
nimmer
How's Java giving user more control than Nim? The latter allows for hardware
access like C/C++

~~~
shakna
I said C/C++ was more control; Java was a middle ground between the two. Nim is
easier, but its memory usage and performance are harder to be sure about,
because it hasn't been around as long.

------
rabidrat
If you prefer working from the terminal, you should check out VisiData
([https://github.com/saulpw/visidata](https://github.com/saulpw/visidata)).
It's a curses tabular data tool that can start browsing terabyte .csv files
immediately, with few dependencies and no clicking.

~~~
nur0n
I was looking through the comments hoping to find a tool like this. You did
not let me down, fellow HNer! I try to use the command line whenever I can.
Somehow there is a command line tool for just about everything, except for the
obvious cases where it is not useful, like visual media and graphs.

~~~
antonycourtney
I don't want to dissuade you if curses-based UIs are your thing. But one thing
that's been crucial to me from Day 1 is that you can type "tad foo.csv" at the
command line and immediately get a usable view of your data with no extra
config or tweaking needed. I mostly launch tad from the command line.

~~~
rabidrat
I just launched "tad 311_Service_Requests_from_2010_to_Present.csv" (a 10GB
dataset from data.cityofnewyork.us) and it ate up all my memory without ever
showing any data. "vd 311_Service_Requests_from_2010_to_Present.csv" shows the
first rows instantly, and I can start rearranging columns while it is still
loading, and I can press Ctrl-C to stop the load (and the rows already loaded
are still available). No config or tweaking needed.

------
vortico
Looks fantastic! I love the combination of SQLite fixed schema with the
hierarchical view.

Some advice on the website: Do some browser sniffing (I know) to display a
screenshot of the software on the user's operating system. This immediately
answers the question "Does the software work on my computer?" Also, the source
code link should be near the Download section. Not everyone is trained that
the triangle GitHub icon in the top right means "source code".

~~~
beefsack
The screenshot suggestion is a good one. Whenever I open a GUI software page
and see a screenshot of Windows or macOS, I often just close the tab and move
on.

~~~
type0
When I see screenshots of macOS, I usually assume that it's Mac-only software,
and most of the time that's true.

------
misterdata
Just noticed the installer changes the file association for CSV's so they all
open with Tad, without asking me first. Please do not do this.

Otherwise, nice app. How does it scale to larger CSV files? I have been
working on something similar ([https://warp.one](https://warp.one)) for Mac,
which streams CSV files (apparently you use a SQLite database behind the
scenes as cache?)

~~~
johnlbevan2
Upvote for "don't change default file handler without asking".

~~~
antonycourtney
Really sorry about this! I do specify Tad as a handler for CSV files, but
becoming the default handler was not my intent at all; I will look into what's
causing this.

------
gravelc
Just gave it a quick test run on a 6 million row by 8 column CSV my software
produces. Fairly slow to open on an iMac, but got there in the end. Looks
nice. Sorting fast. Have got an additional column called 'Rec' for some
reason.

------
TheAceOfHearts
Related to this, if you're looking for a CLI tool for handling CSVs, xsv [0]
is looking promising.

Based on prior experiences with CSV, one of the big problems I've seen has
been figuring out what text encoding is being used. This problem appears to be
more prominent with people outside of the US. It looks like Tad is using fast-
csv, which I don't think will properly handle different file encodings. Life
would be so much simpler if everyone just used UTF-8.

[0] [https://github.com/BurntSushi/xsv](https://github.com/BurntSushi/xsv)

~~~
burntsushi
You basically have two approaches:

1\. Write your CSV parser with an assumption that the data is ASCII compatible
(this means it works with either UTF-8 or Latin-1 out of the box, possibly
modulo non-ASCII meta characters). To support additional encodings---such as
UTF-16---either the CSV library or the caller must transcode first.

2\. Write your CSV parser such that it can work on multiple different
encodings. For example, this means looking for `\x2C\x00` when parsing
UTF-16LE data instead of just `,`. This introduces implementation complexity,
and you'll be unlikely to support the full gamut of encodings that other tools
support whose job it is to do that sort of thing.

(2) is kind of weird, although I can
imagine it being useful in very niche circumstances. e.g., "I have a boat load
of UTF-16 encoded CSV data and transcoding it to UTF-8 to use this CSV parser
isn't worth my time because ______." I can't actually fill in that blank, so
solutions in (1) tend to be the way to go.

Now... If you're building a full on CSV tabular viewer, then I might
understand why it should handle encoding for you automatically, but when it
comes down to it, the viewer is still going to need to choose between (1) and
(2). Unless they want to hand roll their own CSV library, I imagine they're
just going to pick (1), and when possible, transcode the data first. In that
case, it shouldn't really matter whether their underlying CSV parser supports
alternative encodings or not.
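
A minimal Python sketch of approach (1), assuming the caller already knows
the encoding: decode the bytes first, then hand plain text to an ordinary CSV
parser. `read_csv_bytes` is a hypothetical helper, and Python's `csv` module
just stands in for whatever parser is actually used.

```python
import csv
import io


def read_csv_bytes(raw, encoding="utf-8"):
    """Transcode first, then parse: the parser itself never sees UTF-16."""
    text = raw.decode(encoding)
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.reader(io.StringIO(text, newline="")))


# In UTF-16LE a comma is the two bytes 0x2C 0x00, which is what a parser
# following approach (2) would have to search for instead of b",".
comma_utf16le = ",".encode("utf-16-le")

utf16_data = "name,city\nZoë,Malmö\n".encode("utf-16-le")
rows = read_csv_bytes(utf16_data, encoding="utf-16-le")
print(rows)
```

The design point is that all encoding knowledge lives in one `decode` call,
so the parser stays simple and works for any encoding the runtime supports.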

~~~
TheAceOfHearts
The use-case in which I got to see a lot of CSVs was in a data importing tool.
Personally, I think that everyone should just use UTF-8. But since a lot of
people don't do that, we were forced to try guessing the encoding while
providing a preview, along with a way to manually try other encodings.
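
For what it's worth, the guess-then-override flow can be sketched in a few
lines of Python. This is only an assumption about how such an importer might
work, not the actual tool; `guess_encoding`, `preview`, and the candidate
list are invented for illustration.

```python
def guess_encoding(raw, candidates=("utf-8-sig", "cp1252", "latin-1")):
    """Return (encoding, text) for the first candidate that decodes cleanly.

    Order matters: strict encodings go first, and latin-1 goes last because
    it accepts every byte sequence and therefore never fails.
    """
    for encoding in candidates:
        try:
            return encoding, raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def preview(raw, encoding, max_lines=3):
    """Decode with a user-chosen encoding so they can eyeball the result."""
    return raw.decode(encoding, errors="replace").splitlines()[:max_lines]


# Hypothetical legacy export that is not valid UTF-8.
legacy_bytes = "name,city\nZoë,Malmö\n".encode("cp1252")
guessed, text = guess_encoding(legacy_bytes)
print(guessed)
print(preview(legacy_bytes, guessed))
```

A clean decode only proves the bytes are valid in that encoding, not that the
guess is right, which is why the manual override and preview are still needed.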

I think the only other time I've seen something that hacky has been with a
date parser that would try to guess the format for you. That pushed me to the
firm belief that ISO8601 is the only sensible way to store dates.

------
MrMattWright
This is absolutely terrific! I often work with csv or tsv and large data, and
I just want to quickly look at it without loading excel or exporting to google
sheets. [https://atom.io/packages/tablr](https://atom.io/packages/tablr) Tablr
for Atom is good too. One feature request. I very often get data in JSON
format (then convert to csv), it would be amazing to also support JSON. Thanks
for sharing.

~~~
baldfat
OpenRefine [http://openrefine.org/](http://openrefine.org/) for years has been
my go to tool when looking at csv or tsv. I really love this project.

> OpenRefine (formerly Google Refine) is a powerful tool for working with
> messy data: cleaning it; transforming it from one format into another; and
> extending it with web services and external data.

~~~
MrMattWright
Oh thanks for reminding me, I did look at that a while ago, it's absolutely
great. The transforming steps are amazing. Time to have another look at it.
Exploratory.io is similar (nice transformation steps using dplyr) but for me
it was just a bit too close to actually writing the R code.

~~~
baldfat
I LOVE writing code for R for scraping. I am actually serious. The Hadleyverse
and the package tidyverse have changed my programming life (well, learning
Racket helped a ton also, since there is Lisp in the foundations of R).

------
fjert
Wow, this seems simple but I got excited immediately when seeing this. Since I
have been writing code to export data for clients, I have grown so tired of
Kingsoft Spreadsheets and Google Sheets lagging like crazy with any sizable
amounts of data. This will be a cool new tool to show my coworkers tomorrow
and I'll be using it. Performance seems very snappy so far!

~~~
antonycourtney
I invested some effort to keep it performant even with fairly large CSV files,
including a custom port of some C++ code for fast CSV import. My current
favorite example is the Met Museum's 228MB 450k row collection data set; takes
about 12 sec. to open in Tad on my 2013 MacBook Pro. Definitely not lag free
(and hard to achieve that without going to some serious column store data
warehouse like Amazon Redshift), but still reasonable.
[https://twitter.com/antonycourtney/status/869252722624561152](https://twitter.com/antonycourtney/status/869252722624561152)

~~~
elcritch
Thanks for putting this out there!

There are some projects out there using memory mapped files to do fast CSV
parsing. Could be a nice way to speed up the memory loading and scroll it in
real time. Can't find the link to the library I saw it used in, but it might
be an interesting avenue to consider. Another library that does it seems to be
astropy fast ascii IO module [1].

[1]:
[http://docs.astropy.org/en/stable/io/ascii/fast_ascii_io.htm...](http://docs.astropy.org/en/stable/io/ascii/fast_ascii_io.html)

~~~
camkego
Try benchmarking OS read() calls vs. either sequential or random reads using
memory mapping. Whenever I do this, OS read() calls end up being quite a bit
faster.
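
A small Python harness for that comparison, as a sketch only: it covers the
sequential case, counts newlines in a throwaway file, and the chunk size and
file size are arbitrary assumptions. Results depend heavily on OS, file
system, and page cache state, so it makes no claim about which side wins.

```python
import mmap
import os
import tempfile
import time


def count_newlines_read(path, chunk_size=1 << 20):
    """Sequential scan using plain unbuffered read() calls."""
    count = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
    return count


def count_newlines_mmap(path, chunk_size=1 << 20):
    """Sequential scan over a read-only memory mapping of the same file."""
    count = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for start in range(0, len(mapped), chunk_size):
                count += mapped[start:start + chunk_size].count(b"\n")
    return count


def timed(fn, path):
    """Run fn(path) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(path)
    return result, time.perf_counter() - start


# Throwaway test file; a real benchmark should use a much larger one.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"a,b,c\n" * 200_000)
    path = tmp.name

try:
    read_result, read_secs = timed(count_newlines_read, path)
    mmap_result, mmap_secs = timed(count_newlines_mmap, path)
    print(f"read(): {read_secs:.4f}s  mmap: {mmap_secs:.4f}s")
finally:
    os.remove(path)
```

For meaningful numbers, repeat each run several times and use a file larger
than RAM if you want to see cold-cache behaviour.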

------
Cogito
I couldn't see any way to export data, once I had filtered or pivoted it.

Is this supported at all, or do I need to use the copy mechanism? I have a
~4GB file to filter down and analyse, and this almost looks like a nice tool
for business users to use to explore the data themselves, but they need to be
able to export to excel at some point.

------
ericfrigot
Very nice !

Mainly for the cascaded pivot option.

First remarks:

\- Requires \n or \r\n line endings; does not work with \r

\- Requires comma as separator (not possible to change it or I don't find how)

\- Does not support (for example) ANSI encoding

~~~
Svip
Just out of curiosity, which systems use \r as newline these days? As far as
I understand it, Mac OS did it prior to its version 10 (its Unix version), but
the modern macOS does not.

~~~
roller
fwiw, a Save As CSV from "Microsoft® Excel for Mac" 15.33 got me this:

test.csv: UTF-8 Unicode (with BOM) text, with CR line terminators

Apparently, even recent software is not up to date on what the line separator
should be.
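
For reference, a file like that one (UTF-8 with BOM, CR-only terminators) can
be read with Python's standard library as sketched below. The sample bytes
and the `detect_line_terminator` helper are made up for illustration; this
says nothing about how Tad's own importer works.

```python
import csv
import io


def detect_line_terminator(raw):
    """Crude heuristic: report which line terminator appears in the bytes."""
    if b"\r\n" in raw:
        return "CRLF"
    if b"\r" in raw:
        return "CR"
    if b"\n" in raw:
        return "LF"
    return "none"


# Hypothetical bytes mimicking the Excel for Mac export described above.
raw = b"\xef\xbb\xbfname,qty\rapple,3\rpear,5\r"

terminator = detect_line_terminator(raw)
# "utf-8-sig" strips the BOM; newline="" keeps \r intact so that the csv
# module, which accepts \r as a record terminator, can split the rows.
text = raw.decode("utf-8-sig")
rows = list(csv.reader(io.StringIO(text, newline="")))
print(terminator, rows)
```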

------
shusson
Interesting, I would describe it as an SQL client with an easy interface to
load column based files. It's built on top of SQLite, which gives an
indication of the kind of performance one can expect.

~~~
PeCaN
SQLite is pretty fast actually. More problematic is that this uses Electron.
Loading a tiny 200-line CSV file took 120MB for me lol

~~~
eon1
Thanks for the heads up on Electron - I was going to look into onboarding this
for our team until that. We have more than enough in the way of crufty
crapplications already.

------
redgetan
A little UI comment if the creator is reading this. It actually took me a
while to find the search/filter button.. only to see that it's actually on the
bottom. Perhaps make it more visible (on top maybe? and show the form right
away instead of having to click the filter link).

------
hougaard
They lost me right at the front page. Left justification of numbers is just
wrong...

------
rattray
The brief screenshot and short description look cool – does anyone have a
screencast demo?

EDIT: once you install it, a rich README pops up which includes more
screenshots and some example datasets. Fun to play around with

------
aargh_aargh
Dies immediately when launched on Debian Jessie:

Error: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not
found (required by /tmp/.org.chromium.Chromium.ibqKoR)

    
    
      $ strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX_
      ...
      GLIBCXX_3.4.19
      GLIBCXX_3.4.20
      GLIBCXX_DEBUG_MESSAGE_LENGTH

~~~
Asooka
I think you should be able to rebuild the electron npm module from source to
have a compatible runtime.

------
johnlbevan2
Feature Suggestions

(top level comment to hold various suggestions, so upvoting can be used per-
suggestion to bubble the best to the top)

~~~
johnlbevan2
Aggregate Function: Count

I've seen that COUNT is implemented for numeric values; but not for text.
There's no reason to limit COUNT to numeric (unlike AVG and SUM).
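
To illustrate with plain SQLite (which the thread says Tad is built on), here
is a small Python sketch. The table and column names are invented, and this
is not Tad's actual query: COUNT simply counts non-NULL values of any type,
while AVG only makes sense for numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (agency TEXT, complaint TEXT, minutes REAL)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("NYPD", "Noise", 12.0),
        ("NYPD", None, 30.0),
        ("DOT", "Pothole", None),
    ],
)

# COUNT(complaint) works on a text column; NULLs are skipped in every column.
rows = conn.execute(
    "SELECT agency, COUNT(complaint), COUNT(minutes), AVG(minutes) "
    "FROM requests GROUP BY agency ORDER BY agency"
).fetchall()
conn.close()

for row in rows:
    print(row)
```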

~~~
antonycourtney
Good points and great suggestions, thanks!

------
foxbarrington
Very cool! I'm interested to see how easy it is to create pivots on the fly.
We use react-pivot[1] all over the place, but it needs js to set them up.

[1] [https://github.com/davidguttman/react-pivot](https://github.com/davidguttman/react-pivot)

------
lucasgonze
I downloaded this when this post came up and have been using it steadily. It's
good. Reliable, does a job that needs doing, has a few bells and whistles.

Definitely still early stage, but that will pass.

I have shown it to a couple people I work with and they have started using it
too.

------
mrmondo
Hey, congrats on putting this out, one small thing: Homebrew is showing v0.8.3
as the latest version however v0.8.4 has been released (probably an easy fix),
also I was wondering if there were plans to make this a native / non-
javascript app?

------
amelius
Nice project page.

If only there was a tool/service to create such project pages automatically.

~~~
nerdponx
There are several, sort of. I think these kinds of pages are becoming more
accessible to design and publish with static site generators.

------
asteinbr
Another Electron app, yay...

------
tadfisher
Oh joy, more Slack pings from the data team (check my username).

Nice work!

------
lumost
One area to consider growing this: existing SQL clients/viewers are pretty
awful. It would be awesome if I could directly connect to a DB.

~~~
nerdponx
If you use MySQL, you should definitely check out Sequel Pro. It's very
polished and very usable.

------
alexpetralia
Maybe I missed it but how is this better than Excel?

~~~
antonycourtney
I don't think I ever claimed it was better than Excel, which is targeted at a
different set of users and use cases. That said, here are some reasons why
some users might prefer Tad to Excel: It's free and MIT licensed with the
sources on github. It's available for Linux, Windows and Mac. It works on any
tabular data source. The UI for creating a pivot table is clear and intuitive
and efficient (just a few clicks); many users find Excel's pivot table
interface difficult and confusing. The underlying typed data model is a better
fit for many data sets. And finally, it is many times faster than Excel for
loading and working with large CSV files: My large test data set takes 12 sec
to load in Tad vs 40 sec. for Excel.

~~~
jonaf
Perhaps the better question is, how is this better than Google Spreadsheets?
Most of your reasons still apply, but I believe google docs handles large
spreadsheets pretty well (after an initial loading phase).

~~~
unclesaamm
One big difference is that Google Sheets isn't open source and MIT licensed.

------
malkia
Is this using JUCE?

~~~
TkTech
[https://github.com/antonycourtney/tad](https://github.com/antonycourtney/tad)

It's an Electron app.

------
continuational
If I have a table with an email column, can I pivot by email domain?

------
lloydjatkinson
Ah _of course_ it's a Node/Electron thing.

------
sullyj3
How on earth did you not call it "Tada"?

