
Show HN: Data Commandr – As Easy as a Spreadsheet, as Powerful as SQL - asavinov
http://conceptoriented.com
======
stoavio
How do I schedule my transformations so they happen on a recurring basis?

I would like to try the tool but keep getting a "Server Error" message which
appears to prevent it from fully loading.

~~~
asavinov
> How do I schedule my transformations so they happen on a recurring basis?

Currently, formulas are evaluated only when the user _explicitly_ presses the
Evaluate button (or makes an explicit API call).

The logic of automatic evaluation is a work in progress. It is needed mostly
for data integration and stream analytics. The assumption is that formulas
will be evaluated under certain conditions: at regular intervals (say, once
per second), on certain events (say, after 10 appends) and so on.
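
A minimal sketch of how such trigger conditions might be modelled, assuming a count-based and a time-based rule. The class `EvaluationTrigger` and its parameters are hypothetical names made up for illustration and are not part of Data Commandr's API. Time is passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EvaluationTrigger:
    """Decides when to re-evaluate formulas (hypothetical helper, not DC's API)."""
    evaluate: Callable[[], None]           # callback that re-evaluates the formulas
    max_appends: Optional[int] = None      # evaluate after this many appends
    interval_sec: Optional[float] = None   # evaluate once this much time has passed
    _pending: int = 0                      # appends since the last evaluation
    _last_eval: float = 0.0                # time of the last evaluation

    def on_append(self, now: float) -> bool:
        """Record one appended row; return True if an evaluation was run."""
        self._pending += 1
        return self._maybe_evaluate(now)

    def on_tick(self, now: float) -> bool:
        """Called periodically by a timer; return True if an evaluation was run."""
        return self._maybe_evaluate(now)

    def _maybe_evaluate(self, now: float) -> bool:
        by_count = self.max_appends is not None and self._pending >= self.max_appends
        by_time = (self.interval_sec is not None
                   and self._pending > 0
                   and now - self._last_eval >= self.interval_sec)
        if not (by_count or by_time):
            return False
        self.evaluate()
        self._pending = 0
        self._last_eval = now
        return True
```

With `max_appends=10` the callback runs on every tenth append. With `interval_sec=1.0` it runs at most once per second, and only if new rows have arrived since the last evaluation.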

> I would like to try the tool but keep getting a "Server Error" message

Hm, I have just checked and it works. It uses JSESSION, so cookies must be
enabled. There may also be a problem with an old session id, so try opening a
new, clean browser window.

It is an MVP, so there can definitely be some technical problems. Sorry for
that, and thanks for testing it.

------
techno_modus
> it could be applied to tasks where Spark is normally used

How does it relate to Spark? Is it able to read from HDFS, translate to Spark
map-reduce jobs or integrate somehow with Spark jobs?

~~~
asavinov
It could be a technological (column-oriented) _alternative_ to what map-reduce
in general, and Spark with RDDs in particular, are doing. It might be easier
to define a data processing workflow as a number of column definitions rather
than as a graph of collection transformations. Such a column transformation
workflow could also be more efficient at run time.
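
The difference between the two styles can be sketched on a toy table. This is only an illustration in plain Python: the helper `add_column`, the column names and the data are assumptions made for the example, and it shows neither the dc-core API nor any performance characteristics.

```python
# Toy input table stored column-wise: one list per column (names are illustrative).
table = {
    "price":    [10.0, 20.0, 5.0],
    "quantity": [2, 1, 1],
}

# Collection-oriented style: each step produces a new collection of rows.
rows = [dict(zip(table, values)) for values in zip(*table.values())]
rows = [{**r, "amount": r["price"] * r["quantity"]} for r in rows]
rows = [{**r, "is_large": r["amount"] > 15.0} for r in rows]


# Column-oriented style: each new column is defined by a formula over one row
# of the existing columns, and the result is attached to the same table.
def add_column(table, name, formula):
    """Evaluate `formula` for every row and attach the result as column `name`."""
    n = len(next(iter(table.values())))
    table[name] = [formula({c: table[c][i] for c in table}) for i in range(n)]


add_column(table, "amount", lambda row: row["price"] * row["quantity"])
add_column(table, "is_large", lambda row: row["amount"] > 15.0)
```

Both variants compute the same values. In the second one the workflow is just the list of column definitions, each depending on previously defined columns.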

Here is my implementation of this column-oriented approach as a Java library
for server-side transformations:
[https://bitbucket.org/conceptoriented/dc-core](https://bitbucket.org/conceptoriented/dc-core)

------
asavinov
Data Commandr: [http://conceptoriented.com](http://conceptoriented.com)
(Version 0.6.0, 2017-05-14)

Data Commandr (DC) is a web application for working with _tables_. Its goal is
to make complex operations on data tables as easy as working with classical
spreadsheets (although it is not a spreadsheet application). It is intended
for producing reports, data integration, data migration and other _data
wrangling_ tasks. Its main distinguishing feature is that it does not use
traditional (set-oriented) queries but rather relies on _column formulas_ to
derive new data.

I would like to get your opinion about Data Commandr from three points of view:

\- [As developers] How do you like the current UI and general design? How can
it be improved?

\- [As users] Can such a tool be useful for working with tables?

\- [As investors] What other tasks can be solved by using this approach? For
example, I want to pivot in the direction of stream analytics. Or it could be
applied to tasks where Spark is normally used (DC might have much higher
performance and a much simpler data processing language). Other applications:
data dictionaries and data catalogs.

Thank you in advance for testing the app and your feedback.

COMPARISON:

\- [Spreadsheets] In spreadsheets, a cell is defined as a _function_ of other
cells so we get a functional model where the result of each cell (one value)
is computed by evaluating other cell formulas. In contrast, DC uses _columns_
instead of cells so that a model is a number of _column formulas_.

\- [Set-oriented approaches] In set-oriented approaches (RM, Hadoop, Spark
etc.), new data is derived by applying an operation to input sets and
producing an output set. Data transformations are represented as a graph of
set operations. In contrast, data transformations in DC are represented as a
graph of _function operations_.

\- [Research] Formally, Data Commandr relies on the _concept-oriented model_
of data [2]. In particular, DC does not rely on operations such as join and
group-by, which are difficult to use and understand. Instead, it uses _link
columns_ [4] and _accumulate functions_ [1], respectively.

\- [Competitors] When I started this project [5] I knew of only one similar
product: Power BI with its DAX language. Now there are relatively new
products such as airtable, fieldbook and rowshare.
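
The idea of link columns and accumulate functions mentioned in the list above can be illustrated roughly as follows. This is a plain-Python sketch of one possible reading, not DC's formula syntax: the tables, column names and data are made up for the example.

```python
# Two toy tables stored column-wise (all names and data are illustrative).
products = {"name": ["apple", "pear"], "price": [2.0, 3.0]}
orders = {"product_name": ["pear", "apple", "pear"], "quantity": [1, 5, 2]}

# Link column: for each order, store the row index of the matching product.
index_by_name = {name: i for i, name in enumerate(products["name"])}
orders["product"] = [index_by_name[n] for n in orders["product_name"]]

# A column formula can then follow the link instead of joining the tables.
orders["amount"] = [
    qty * products["price"][p]
    for qty, p in zip(orders["quantity"], orders["product"])
]

# Accumulate function: each order updates the running value of a column in
# the linked product row, instead of grouping the orders first.
products["total_amount"] = [0.0] * len(products["name"])
for p, amount in zip(orders["product"], orders["amount"]):
    products["total_amount"][p] += amount
```

Here `orders["product"]` plays the role of a link column and the final loop plays the role of an accumulate function. No intermediate joined or grouped table is produced.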

[1] A. Savinov. From Group-By to Accumulation: Data Aggregation Revisited.
Proc. IoTBDS 2017, 370-379, 2017.
[https://www.researchgate.net/publication/316551218_From_Grou...](https://www.researchgate.net/publication/316551218_From_Group-by_to_Accumulation_Data_Aggregation_Revisited)

[2] A. Savinov. Concept-Oriented Model: the Functional View. arXiv:1606.02237
[cs.DB] 2016.
[https://arxiv.org/abs/1606.02237](https://arxiv.org/abs/1606.02237)

[3] A. Savinov. DataCommandr: Column-Oriented Data Integration, Transformation
and Analysis. Proc. IoTBD 2016, 339-347.
[https://www.researchgate.net/publication/301764506_DataComma...](https://www.researchgate.net/publication/301764506_DataCommandr_Column-Oriented_Data_Integration_Transformation_and_Analysis)

[4] A. Savinov. Joins vs. Links or Relational Join Considered Harmful. Proc.
IoTBD 2016, 362-368.
[https://www.researchgate.net/publication/301764816_Joins_vs_...](https://www.researchgate.net/publication/301764816_Joins_vs_Links_or_Relational_Join_Considered_Harmful)

[5] A. Savinov. ConceptMix: Self-Service Analytical Data Integration Based on
the Concept-Oriented Model. Proc. DATA 2014, 78-84.
[https://www.researchgate.net/publication/265301356_ConceptMi...](https://www.researchgate.net/publication/265301356_ConceptMix_Self-Service_Analytical_Data_Integration_based_on_the_Concept-Oriented_Model)

