
Gunnar Carlsson on the Shape of Data (2012) [video] - espeed
https://www.youtube.com/watch?v=kctyag2Xi8o
======
xtacy
I find the topic intriguing, but can someone who is well versed in _both_
topology and machine learning comment on what the key innovation is here?

At first glance, the methods here seem a lot like the toolbox of
dimensionality reduction techniques (PCA, spectral embedding, or more general
manifold learning, etc.) from the machine learning literature.

What specific insights from the field of topology have helped further our
understanding of data that we missed earlier?

~~~
espeed
Prof Carlsson's
([http://math.stanford.edu/~gunnar/](http://math.stanford.edu/~gunnar/))
"Topology and Data" paper provides a good overview:
[http://www.ams.org/journals/bull/2009-46-02/S0273-0979-09-01...](http://www.ams.org/journals/bull/2009-46-02/S0273-0979-09-01249-X/S0273-0979-09-01249-X.pdf)

My recent dive into the literature has enlightened my thinking in terms of
database and systems design. It has led me to think more in terms of
properties, invariants, intervals, constraints, and dynamic fluidity -- "there
are no things" (only actions and properties):
[https://edge.org/response-detail/11514](https://edge.org/response-detail/11514)

Maybe the antiquated abstractions we have been using for database systems are
what limit us. Maybe we need to stop thinking in terms of things -- objects,
partitions, and static state -- and start thinking in terms of millions of
fluid dynamic processes. Maybe Jim Starkey is on the right track:
[http://www.nuodb.com/about-us/jim-starkey](http://www.nuodb.com/about-us/jim-starkey).

Sussman seems to be converging there too -- see his talk "We Really Don't Know
How to Compute"
([http://www.infoq.com/presentations/We-Really-Dont-Know-How-T...](http://www.infoq.com/presentations/We-Really-Dont-Know-How-To-Compute))
and his work on the Propagator
([https://github.com/ProjectMAC/propagators](https://github.com/ProjectMAC/propagators)).

The rapid flow of data stressing our system designs is making this more
apparent, and we're starting to see stream processing systems emerge like
Google Dataflow and Apache Flink. Ideas from functional programming and
immutable state are looking more prescient. Now our database management
systems need to evolve.

    
    
      "At no period in human culture have men understood the     
      psychic mechanisms involved in invention and technology. 
      Today it is the instant speed of electric information that, 
      for the first time, permits easy recognition of the 
      patterns and the formal contours of change and development. 
      The entire world, past and present, now reveals itself to 
      us like a growing plant in an enormously accelerated movie. 
      Electric speed is synonymous with light and with the 
      understanding of causes."
    
      -- Marshall McLuhan, Understanding Media: The Extensions of Man (1964)
    

'okram's recent paper provides a new graph-based model for stateless
functional flows that could be applied in other systems: See "Quantum Walks
with Gremlin"
([http://arxiv.org/pdf/1511.06278v1.pdf](http://arxiv.org/pdf/1511.06278v1.pdf))

And Vladimir Kornyak touches on some of these ideas in these papers:

1\. On Compatibility of Discrete Relations (2005)
[http://arxiv.org/pdf/math-ph/0504048.pdf](http://arxiv.org/pdf/math-ph/0504048.pdf)

2\. Structural and Symmetry Analysis of Discrete Dynamical Systems (2010)
[http://arxiv.org/pdf/1006.1754.pdf](http://arxiv.org/pdf/1006.1754.pdf)

3\. Discrete Dynamical Models: Combinatorics, Statistics and Continuum
Approximations (2015)
[http://mmg.tversu.ru/images/publications/2015-vol3-n1/Kornya...](http://mmg.tversu.ru/images/publications/2015-vol3-n1/Kornyak-2015-01-05.pdf)

~~~
pramodliv1
It's quite clear from the Ayasdi website that they're targeting enterprise
customers. I wish they had an API similar to clarif.ai or wit.ai.

~~~
espeed
Javaplex: Persistent Homology and Topological Data Analysis Library
([http://appliedtopology.github.io/javaplex/](http://appliedtopology.github.io/javaplex/))
-- primarily developed by the Computational Topology workgroup at Stanford.
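
For a rough feel of what persistent homology computes, here is a toy sketch of the 0-dimensional case only (connected components) in plain Python. It is not Javaplex's API; the function name `zero_dim_persistence` and the sample points are made up for illustration. The idea is to grow a distance threshold, merge components as edges appear, and record the scale at which each component disappears:

```python
from itertools import combinations
from math import dist


def zero_dim_persistence(points):
    """Return (birth, death) intervals for connected components (H0)
    of the Vietoris-Rips filtration on `points`.

    Every point is born at scale 0. A component dies at the edge length
    that merges it into another; one component never dies (death = inf).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    intervals = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # merge: one component dies at this scale
            intervals.append((0.0, length))
    if n > 0:
        intervals.append((0.0, float("inf")))  # the surviving component
    return intervals


# Hypothetical data: two well-separated clusters on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
intervals = zero_dim_persistence(points)
```

Short intervals are components that merge almost immediately (noise at small scales); the one long finite interval reflects the gap between the two clusters. Libraries like Javaplex extend the same bookkeeping to higher dimensions (loops, voids), which this sketch does not attempt.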

~~~
pramodliv1
Thank you!

