
Show HN: Ananas – a hackable data tool for beginners - millboh
https://ananasanalytics.com/
======
sails
This is great.

Things to keep on your radar:

Meltano, spun out of GitLab, is working in this domain, and I think they are
making progress. [https://meltano.com/](https://meltano.com/)

dbt are building the transformation engine that Meltano is using (I think),
and are worth keeping an eye on
[https://www.getdbt.com/](https://www.getdbt.com/)

In my experience the issue with this domain isn't the "one-off" analysis, but
rather orchestrating the BI function across the business, maintaining the
single source of truth, testing and deploying across the ETL/ELT layers. I
can't speak to how well Ananas is managing these but Meltano and dbt are
giving this area a lot of thought.

------
kfk
This looks good. One of your competitors charges about $4,000 per seat per
year so this seems to be a good space. If you add the possibility of building
user defined nodes with Python you’d have a solid product.

~~~
justaguyhere
woah, that is steep. who is this competitor?

~~~
ZeroCool2u
Alteryx would be my guess.

~~~
kfk
Interestingly enough, both Ananas here and Alteryx use a declarative approach.
Alteryx is using XML and Ananas YAML.

~~~
ZeroCool2u
The YAML approach seems far more appealing to me honestly, though I'm sure XML
was a sane choice at the time.

------
dewey
Pretty similar to
[https://github.com/getredash/redash](https://github.com/getredash/redash)
from a first look. What would you say are the main differences?

~~~
millboh
It seems that Redash is a BI tool and closed source. Ananas is open source, and
can be used not only as a BI tool, but also as an ETL tool. Moreover, you can
run your pipeline on your own infrastructure, as Ananas can be run on multiple
execution engines.

~~~
codetrotter
Parent commenter linked to their GitHub repo. Redash is open source under the
terms of the two-clause BSD license. (Maybe they edited to add link, maybe you
overlooked it?)

~~~
millboh
Oh, sorry about that, my bad. I just looked through redash home page, but
didn't find the open source link

~~~
djmips
redash source <- search

------
canada_dry
An older, but very slick tool for wrangling data:
[http://vis.stanford.edu/wrangler/](http://vis.stanford.edu/wrangler/)

Which is now a commercial tool: [https://www.trifacta.com/products/wrangler-
editions/#wrangle...](https://www.trifacta.com/products/wrangler-
editions/#wrangler)

------
Jazgot
Is the name inspired by Orange [1]?

[1] [https://orange.biolab.si](https://orange.biolab.si)

------
mtarnovan
This looks a bit like [https://metabase.com](https://metabase.com) -- anyone
used both and can make a comparison?

~~~
wiremine
Curious if anyone has used Metabase for serious work and can comment on it. I
tried setting it up and got frustrated pretty quickly with the UX. It looks
slick, but the mental model was confusing...

~~~
mazameli
Sorry to hear that (I'm the UX guy at MB). Would appreciate your
thoughts/feedback on any specifics you want to share. Thanks!

~~~
wiremine
Thanks for following up! Here are some of the things that tripped me up:

1. The lack of in-depth docs.

2. The setup and usage of metrics was confusing. This was the main use case I
was hoping Metabase could help me with, and it felt like an add-on feature.

3. For whatever reason, managing dashboards was really confusing, and the UI
[1] didn't seem to match the docs.

[1] I was using the mac version.

~~~
mazameli
Thanks, I appreciate your feedback.

------
programbreeding
Just FYI, about 1/3 of the way down your Getting Started page[0] it has a
broken link[1] to the fifa2019.csv file. The first link on the page is
valid[2], but the second one leads to a 404 due to pointing to .../raw/...
rather than .../blob/...

[0] [https://ananasanalytics.com/docs/user-guide/getting-
started](https://ananasanalytics.com/docs/user-guide/getting-started)

[1] [https://github.com/ananas-analytics/ananas-
examples/raw/mast...](https://github.com/ananas-analytics/ananas-
examples/raw/master/FifaPlayer2019/fifa2019.csv)

[2] [https://github.com/ananas-analytics/ananas-
examples/blob/mas...](https://github.com/ananas-analytics/ananas-
examples/blob/master/Fifa2019/fifa2019.csv)

~~~
bhou
Thanks for the reminder, the links are fixed now.

------
nishkalkashyap
I would recommend distributing binaries from a dedicated release server
combined with a CDN, possibly DigitalOcean Spaces. It really increases
download speeds for end users compared to GitHub releases.

------
nishkalkashyap
I would recommend code-signing the build before distributing.

~~~
millboh
Thanks for your feedback, we will look for some affordable code-signing
certificates. Any suggestions? By the way, here is the issue link:
[https://github.com/ananas-analytics/ananas-
desktop/issues/61](https://github.com/ananas-analytics/ananas-
desktop/issues/61)

~~~
pimterry
I set up code signing for an electron app relatively recently. Best option I
could find was Digicert. Really sucks that this stuff is necessary nowadays
and not free, but it's not so bad.

That's for Windows - for Mac you'll also need an Apple developer account,
afaik they're the only people who can issue certs.

EDIT: Woah, I take that back. Digicert has now gone up from $74/year to
$474/year, which is crazy. I now also need a new certificate provider...

~~~
NewsAware
For Electron signing we use Tucows code signing certs (you need to register as
a Tucows author for free), which are provided by Comodo for $140 for 2 years.
Didn't have any issues besides getting a proper CI/CD process running.

------
najarvg
This looks very good and a fit for my end users who deal with Excel files all
the time. Are there any plans to add Excel as a data source? I cannot convert
to CSV without major pain since the Excel files are exports from mainframe apps
which are out of my control. Thanks

~~~
millboh
Excel source was one of our first supported data sources. See our early video
demo:
[https://www.youtube.com/watch?v=GwqZlhmei78](https://www.youtube.com/watch?v=GwqZlhmei78).
We just created an issue on GitHub: [https://github.com/ananas-
analytics/ananas-desktop/issues/60](https://github.com/ananas-
analytics/ananas-desktop/issues/60) We will add this feature back in the
following release.

------
jbverschoor
The app icon is transparent on Mac, and therefore only clickable on the
border.

------
jbverschoor
Unfortunately it's created by an unidentified developer.

~~~
jessaustin
Be sure and write some special firewall rules before running this...

------
chrsstrm
At what scale has this been tested? As in, are you aware of any data file size
limits? I have a csv with ~6M rows and when paging through the docs the
"Exploring your data source" gave me pause thinking this app might try and
open all 6M rows at once. Will I be OK importing such a large source or will
my computer turn into a space heater before refusing to respond?

~~~
bhou
Ananas has been tested in production, processing terabytes of data on a daily
basis (with Google Dataflow, but you can achieve the same thing with your own
Spark cluster too).

In terms of exploring a large source file, the design principle is to paginate
any kind of data that supports random access to records (for example CSV, logs,
etc). So when "exploring the data" of a CSV with 6M rows, Ananas will not load
6M rows at once, but read a few rows at a time for each page. For example, this
early demo video shows a 755M CSV file being explored in seconds:
[https://www.youtube.com/watch?v=GwqZlhmei78&t=01m00s](https://www.youtube.com/watch?v=GwqZlhmei78&t=01m00s)
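
For anyone curious what this kind of pagination can look like, here is a minimal Python sketch. It is not taken from the Ananas code base and may differ from what Ananas actually does. The function names `build_page_index` and `read_page` are made up for illustration, and the sketch assumes a UTF-8 CSV with a header row and no quoted fields that contain line breaks. The idea is to scan the file once to record the byte offset where each page starts, then seek straight to a page and read only its rows.

```python
import csv


def build_page_index(path, page_size):
    """Scan the file once and record the byte offset where each page starts.

    Assumption: one record per physical line (no embedded newlines in
    quoted fields). Only the offsets are kept in memory, not the rows.
    """
    offsets = []
    with open(path, "rb") as f:
        f.readline()  # skip the header row
        row = 0
        while True:
            pos = f.tell()
            line = f.readline()
            if not line:
                break
            if row % page_size == 0:
                offsets.append(pos)
            row += 1
    return offsets


def read_page(path, offsets, page, page_size):
    """Read a single page of rows by seeking directly to its byte offset."""
    with open(path, "rb") as f:
        header = next(csv.reader([f.readline().decode("utf-8")]))
        f.seek(offsets[page])
        lines = [f.readline().decode("utf-8") for _ in range(page_size)]
    # Drop the empty strings returned once the end of the file is reached.
    rows = csv.reader(line for line in lines if line)
    return [dict(zip(header, row)) for row in rows]
```

With an index built by `build_page_index("fifa2019.csv", 50)`, a call like `read_page("fifa2019.csv", offsets, 10, 50)` touches only the 50 rows of that page, however large the file is. The one-off indexing pass still reads the whole file, but it streams through it line by line.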

------
eli_gottlieb
Ok, but why did you name it after pineapples?

~~~
millboh
Ananas, Analytics made easy :) Pineapple was cool too. Will probably change it
if we see more comments ;)

~~~
hondadriver
No, just keep it. It's fun, and why does the name have to be an existing English word?

Fun fact: if everybody starts using it, it will eventually become proper
English.

------
mingabunga
Thought about adding some words to the data output using natural language
generation? Eg arria.com or other nlg vendor?

~~~
millboh
Excellent idea. We've thought about machine learning transformers, including
NLP. NLG is something that would definitely be nice to have. Please create
an issue and we will prioritize it.

------
robtherobber
I can't download it: [https://github.com/ananas-analytics/ananas-
desktop](https://github.com/ananas-analytics/ananas-desktop) , I get a Github
server error. Is it just me?

Edit: not just me, Github issues.

~~~
notduncansmith
Looks like GitHub is experiencing errors, just had the same problem with an
unrelated repo.

~~~
robtherobber
Thanks!

~~~
millboh
Yep, we expected anything but a GitHub issue ;)

------
jugg1es
Does this have any sort of hinting for indexed queries at all? I would worry
that a beginner would create a horrid mess of queries that could consume all
available resources.

~~~
millboh
That's a good point. Actually we think of this tool as a collaboration tool
which enables non-technical users and data engineers to share this visual DAG
and work together. The Apache Beam runners we use behind the scenes have a
query planner to optimize chained queries. However, you're totally right:
this can't stop non-technical users from writing messy queries. The visual DAG
should, however, help them split a complex query into simpler ones.

------
VvR-Ox
Oh nice - thank you very much! :-D

I thought about writing my own app for exactly that task but when I see yours
I think I don't need to do that anymore. Awesome! :-)

------
richk449
Can I use this if all I have is an odbc connection?

~~~
millboh
What kind of data source exactly do you need? We should be able to add a
Microsoft data source such as MSSQL if you request it.

------
mtw
This looks great as a promise - I looked into visualizations provided, and we
need much more than what's provided though

------
lucasverra
Can we hook this up to an API GET request? I guess I could do API -> download
JSON -> Ananas, but you know.. :)
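
For reference, the manual "API -> download JSON -> Ananas" route could be scripted roughly like this. It is only a sketch using the Python standard library, not an Ananas feature: the helper names `fetch_json` and `records_to_csv` and the example URL are hypothetical, and it assumes the endpoint returns a flat JSON array of objects. The resulting CSV can then be loaded through a CSV source.

```python
import csv
import json
import urllib.request


def fetch_json(url, timeout=10):
    """GET a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def records_to_csv(records, out_path):
    """Write a list of flat dicts to a CSV file and return the column names.

    Columns are the union of all keys, in first-seen order. Keys missing
    from a record are written as empty cells.
    """
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return fieldnames


# Hypothetical usage; the URL is a placeholder, not a real endpoint:
# records_to_csv(fetch_json("https://example.com/api/readings"), "readings.csv")
```

Nested JSON would need flattening first, which this sketch does not attempt.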

~~~
cr0sh
I was thinking the same thing, for like IoT monitoring sensors, etc - but it
is open source; grab it (once GH is back online), add the new "source", and
issue a PR - that'd be the way to do it I think...

------
yazan94
This looks really cool! Thanks for sharing, I can't wait to test this

------
thoughtpalette
Looks great to me. Nice job shipping!

~~~
millboh
Thanks

------
pplonski86
What is your business model?

------
towlinson
What about bananas?

------
overcast
My unprofessional professional opinion. The product looks great, but the name
has to go. I can't imagine pronouncing that, let alone communicating it over a
phone. Any simple word before analytics would be better.

Edit: pineapplytics is the obvious cute and available one, however may still
be difficult to communicate.

~~~
henrikschroder
Fun fact: Only the English language calls the fruit "pineapple", almost every
other language calls it "ananas" or similar.

~~~
overcast
I looked it up, I get it, but this post and the site are targeting English
speakers.

~~~
_frkl
Well, or they just want everyone to be able to access it? There is really no
choice other than to publish something like this in English. Just a guess, but
I'd guess the number of people accessing it who are not native English speakers
is larger than the number who are.

------
dlphn___xyz
what advantages does this have over the ELK stack?

------
tracer4201
No Redshift support? Hmmm.

~~~
bhou
It is on our roadmap! We will continue adding more data sources in the
following release.

