It is aimed at professionals who have data to transform, but aren't programmers or data science professionals.
Use cases include:
* making a list of all the people in mailing list A that are not in mailing list B
* filtering a log file
* joining two spreadsheets
* renaming, reordering and adding/deleting columns in a table
* reformatting dates
* de-duplicating a postal mailing list
It is desktop software for Windows and Mac, so there is no latency and you don't have to upload sensitive data to a third party server.
At some point we plan to start charging. But the current beta is free until the end of November. And there may be another free beta after that.
We would love to get some feedback, particularly from people using it to solve real-world problems.
Here’s a podcast interview I recently did with OP about his product:
Worth listening to if you are interested in the decisions that go into creating, designing, naming, doing usability testing and promoting a desktop app like Easy Data Transform.
#1. The "Show First 10 Rows" dropdown... nice here would be "Show First 10 MOST FREQUENT Rows" ... helps get a view of the distribution of values
#2. A "Map" transformation - you can give it a list of input values and a list of one or more output values to which the inputs should be mapped. E.g. input values might be "New York", "Peekskill" and "Middletown" which map to "New York State" which can be placed in a new column (like the "If" transform)
You should be able to do this with a pivot, then a sort. But pivot doesn't work with non-numeric values at present. Next release!
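For anyone wondering what "pivot, then sort" amounts to here: count each distinct value in a column, then sort by count. A minimal Python sketch, assuming the rows are plain dicts; the function name `most_frequent_values` and the sample data are made up for illustration and are not part of Easy Data Transform:

```python
from collections import Counter

def most_frequent_values(rows, column, n=10):
    """Return the n most frequent values in `column` as (value, count) pairs."""
    counts = Counter(row[column] for row in rows)
    # most_common() sorts by count, highest first
    return counts.most_common(n)

# Hypothetical sample data
rows = [
    {"city": "New York"},
    {"city": "Peekskill"},
    {"city": "New York"},
    {"city": "Middletown"},
    {"city": "New York"},
    {"city": "Peekskill"},
]
top = most_frequent_values(rows, "city", n=2)
print(top)
```

That gives a quick view of how values are distributed without reading the whole table.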
>#2. A "Map" transformation - you can give it a list of input values and a list of one or more output values to which the inputs should be mapped. E.g. input values might be "New York", "Peekskill" and "Middletown" which map to "New York State" which can be placed in a new column (like the "If" transform)
You could do that with 'IF'. But I guess that could be a bit verbose and I should perhaps offer a 'Lookup' transform as well. The table lookup has the advantage that the lookup table can be created/modified by Easy Data Transform.
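To make the lookup idea concrete, here is a rough Python sketch of a table-driven mapping into a new column. It is only an illustration of the concept, not how Easy Data Transform does or would implement it; `apply_lookup`, the column names, and the blank default for unmatched values are all assumptions:

```python
def apply_lookup(rows, source_column, target_column, lookup, default=""):
    """Add `target_column` to each row by looking up the value in `source_column`."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left unchanged
        new_row[target_column] = lookup.get(row[source_column], default)
        result.append(new_row)
    return result

# The lookup table is just data, so it can be edited without touching the logic
lookup = {
    "New York": "New York State",
    "Peekskill": "New York State",
    "Middletown": "New York State",
}
rows = [{"city": "Peekskill"}, {"city": "Boston"}]
mapped = apply_lookup(rows, "city", "state", lookup)
print(mapped)
```

Compared with a chain of 'IF' conditions, adding another city means adding one entry to the table.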
Yes, an option to have some sort of scriptable transform would be very useful (even if it is slightly at odds with the "without programming" positioning). I personally loathe JavaScript, but I guess it would be easier to embed than, say, Python or Lua.
Thanks for the feedback.
Easy Data Transform is aimed at people who don't have either the skills, time or inclination to take on something like Nifi (which is most people!).
The aim with Easy Data Transform is that someone can use it to transform their data within a few minutes of first seeing it.
That's definitely true :)
When I write data-transformation code, I always have the feeling that it's often too inter-connected and an approach, like the one this tool follows, would be nicer.
Somehow only the core idea of using these connected nodes is good, the rest of the UI is too clunky, so I drop down to "real" code again for some nodes and sooner or later the mixing up of nodes and code becomes too cumbersome and I drop down to "real" code for everything.
But, perhaps one day in the future, I might be able to add a script or plugin node, so you can add your own custom transforms.
Edit: not saying you’re shady, just that it has a vibe of being shady :)
I can see that might trigger some people to think it is of dubious provenance. Maybe I should put the above on a 'Buy' page?
BTW the software is digitally signed (and notarized on Mac) and we've been selling software online since 2005.
If this is a standalone binary, what I'd want to see as a user would be a one-time purchase and an optional annual support plan.
With a recurring license, I'd expect:
- Cloud computing
- Cloud storage
- Real-time collaboration
If there are no service features like real-time collaboration, then a recurring revenue model makes even less sense.
I mean, I get it; I've written software that I sell with annual licenses myself. But it depends on cloud services to work, so there are costs to me too. I think that's where it's maybe worth stepping back to look at the architecture and whether it's better suited to a web app if recurring revenue is important. Just my two cents....
The question then becomes how you justify that charge, but I think you can legitimately say support and/or new feature development (especially if you allow customers to have some kind of input into that). Having guaranteed support from a company with the technical chops that Andy has would be worth it alone for me in this instance.
JetBrains have an interesting model with their IDEs whereby you can fall back to a perpetual desktop license for their products if you don’t want support or updates. Perhaps that’s a nice compromise option.
Yes, but not infinitely, not for a static product. Otherwise I should also be willing to accept infinite punishment for whatever harm my product does as long as it exists.
If I create a hammer, it could still be generating value in a hundred years, but it could also be used to break windows and kill people. So if I deserve continuous payment, shouldn't I also be liable for the damages? Why should I be infinitely rewarded just because my tool had the potential to add value when it also had the potential to do harm?
> Having guaranteed support [...] would be worth it
I agree with you there. But I think it should be optional. It sounds like that's what JetBrains is doing, albeit in a roundabout way.
I'm burned out by all of these subscriptions and trying to go back to my "roots".
If you estimate that the average user will use the product for N years, you can charge a one-time fee of X or a yearly sub of X/N and make the same amount either way (assuming you got N right and forgetting inflation). A yearly sub is very simple to explain. And people who use the product for longer pay more, which seems fair. The sub also helps to finance the ongoing costs of development and support.
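A quick worked version of that arithmetic in Python, with made-up numbers (a one-time fee X of 100 and an expected lifetime N of 4 years are purely illustrative, as is the function name):

```python
def subscription_total(one_time_fee, expected_years, years_used):
    """Total paid on a yearly sub priced at X/N, after `years_used` years."""
    yearly_fee = one_time_fee / expected_years
    return yearly_fee * years_used

X = 100  # hypothetical one-time fee
N = 4    # hypothetical average number of years a user keeps the product

for years in (2, 4, 6):
    print(years, "years ->", subscription_total(X, N, years))
```

A user who stays exactly N years pays the same as the one-time fee; someone who leaves after 2 years pays half, and someone who stays 6 years pays one and a half times as much. Inflation is ignored, as in the comment above.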
I need to think a bit about how to flatten an XML/JSON doc into a table and then turn it back into an XML/JSON doc.
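One common approach, sketched in Python below, is to turn nested keys into dotted column names and split them again on the way back. This is only an assumption about how it could work, not a description of what Easy Data Transform will do; it handles non-empty nested objects only (no arrays) and assumes keys do not themselves contain dots. The function names are hypothetical:

```python
import json

def flatten(doc, prefix=""):
    """Turn nested objects into one flat row with dotted column names."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

def unflatten(flat):
    """Rebuild the nested document from dotted column names."""
    doc = {}
    for path, value in flat.items():
        keys = path.split(".")
        node = doc
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return doc

record = json.loads('{"name": "Ann", "address": {"city": "Peekskill", "zip": "10566"}}')
row = flatten(record)
print(row)
print(unflatten(row) == record)
```

Arrays are the harder part, since a repeated element has to become either extra columns or extra rows.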
(*XLS(X) output currently only works on Windows, because it uses ActiveX. But I plan to have XLS(X) input/output on Windows and Mac at some point.)
It is true that a desktop system may not be suitable for transforming million-row datasets or processing that is running 24x7 - but that is not the market we are aiming for.
TLDR : It depends.
1. Users always work with the latest version, so you only have 1 version to support
2. It would make monthly pricing an easier sell
But I think there are some downsides here, with an app that is solely about data:
1. If users' data has to flow through it, there are privacy, GDPR and intellectual property concerns (for both the SaaS vendor and customers)
2. Latency, since you're going to have to upload data
3. Possibly issues with bandwidth fees (I think most clouds only charge for egress bandwidth, but users are still going to want to download the processed data)
4. Monthly pricing is a big turn-off for a large segment
And there's OpenRefine http://openrefine.org/
Also building binaries for Linux is a pain. Which distributions to support?
Target flatpak and/or snap.
just build static binaries for x86-64
or better, distribute the source code ;)
My gut feeling is that Linux users:
a) Don't want to pay for software.
b) Probably have the technical chops to roll their own solution in Python/R/SQL.
Data pipelining, cleaning and feature construction is the most time-consuming part of data science. It's almost always a struggle, and the process usually produces fragile and ephemeral code that will need to be rewritten to put into production. If you could provide a LabVIEW-like GUI thing to remove some of this drudgery, assuming you could hook it up to database and CSV back ends, or if there were a target which could do this, and the result were robust and could be deployed, it would make me much more productive than fiddling around with pandas or R data tables.
Maybe it isn't what you built, but I've said many times this is the most useful data science product; the one that is 10x easier than writing your own every damn time. Fancy woo-woo classifiers with alleged superpowers don't even begin to compare to tooling like this.
Currently it can input XLS/XLSX (on Windows) and delimited text (e.g. CSV) and can output delimited text. I am looking to support other inputs and outputs, e.g. JSON / ODBC / SQLite, depending on feedback.
CSV is the lingua franca for prototyping of course. For deploying (which could make your code more sticky in an enterprise setup), you need to hook up to real databases.
I'd suggest finding a local senior data scientist or consulting group and use them to drive your product feature development. I'm not sure who you had in mind for your end user, but I do think data science types (the ones who get paid; not smurfs who want a DS job some day) are a viable market for something like this.
People who have data they need to transform, but probably aren't programmers or data science professionals.
I would obviously be delighted if data science professionals want to buy it. But I assume they have lots of powerful tools already at their disposal.
But I am still learning about the market, so I could be completely wrong about where the opportunity is.
I can't imagine who that would be. Excel power users maybe?
It's free for non-commercial use though
Anyway - the space does need some competition so good luck with your project!
That's a bit of a stretch.