

Show HN: Sense - A New Cloud Platform for Data Science and Big Data Analytics - tristanz
https://senseplatform.com

======
tristanz
Sense cofounder here. We're just getting started, so feedback is welcome.

Sense supports R, Python, JavaScript and SQL out of the box, but is fully
extensible to new languages and tools:

[https://github.com/SensePlatform/sense-engine](https://github.com/SensePlatform/sense-engine)

We have Julia, Hive, and Spark engines in development.

~~~
tansey
Looks great!

Do you have full support for numpy, scipy, matplotlib, pymc, and pandas?

~~~
tristanz
Yes. I'll also point out that Anand (apatil) was a core developer for PyMC. We
have big plans around Bayesian computation on Sense.

------
houshuang
Looks awesome, great with multi-engine support. I'd love if you could open-
source your two-pane approach to IPython... I really like IPython (and have
been working with IHaskell lately), but I find the RStudio approach much
better... having my code on the left, moving up and down and executing lines
with Cmd+Enter, running entire cells (knitr Rmd style), seeing graphs and
documentation while you're working, etc...

~~~
apatil
Glad you like it. The IPython engine isn't open source at the moment, but that
may change in the future. Out of curiosity, if it were open source, what might
you use it for?

~~~
houshuang
For my own work, maybe help integrate it with IPython as an alternative front-
end. I don't have a cloud project, I just do my own data analysis/learning
with IPython and IHaskell, and think a multi-pane approach would be much more
powerful. (I applied for an account with Sense, looking forward to playing
with it and providing feedback).

------
fawce
Also see Domino Data Lab
([http://www.dominodatalab.com](http://www.dominodatalab.com)) which is in
public beta. Similar, with more emphasis on reproducibility of past results.

------
Blahah
_Very_ interesting! Nice work guys. Just sent you an email about an academic
research use case.

You don't mention anything about RAM in your pricing - what are the
restrictions? And what about I/O and storage?

~~~
apatil
We're planning to charge per core for usage. Each core will be a true physical
core, with 5.5 ECU. The container will get 3.75GB of RAM and a slice of the
host's bandwidth per core. We'll also have a very inexpensive micro tier, and
eventually some kind of long-lived services tier, as Tristan mentioned in
another comment.
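
For a rough sense of what those per-core figures add up to, here is a minimal Python sketch. It assumes ECU and RAM scale linearly with the number of cores, which the comment suggests but does not spell out. `ContainerSpec` and `container_spec` are made-up names for illustration and are not part of Sense.

```python
from dataclasses import dataclass

# Per-core figures quoted in the comment above.
ECU_PER_CORE = 5.5
RAM_GB_PER_CORE = 3.75


@dataclass(frozen=True)
class ContainerSpec:
    """Hypothetical summary of what a container with `cores` cores gets."""
    cores: int
    ecu: float
    ram_gb: float


def container_spec(cores: int) -> ContainerSpec:
    """Scale the quoted per-core figures linearly (an assumption)."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return ContainerSpec(
        cores=cores,
        ecu=cores * ECU_PER_CORE,
        ram_gb=cores * RAM_GB_PER_CORE,
    )


if __name__ == "__main__":
    for n in (1, 4, 16):
        spec = container_spec(n)
        print(f"{spec.cores:>2} cores: {spec.ecu:.1f} ECU, {spec.ram_gb:.2f} GB RAM")
```

Under that linear assumption, 16 cores would come with 60 GB of RAM.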

~~~
Blahah
any plans for more RAM options? something like the EC2 244GB?

~~~
apatil
Not at the moment. Right now, the biggest single dashboard is 60GB. You can
launch as many dashboards as your plan allows if your application can be
distributed, but I'm guessing that isn't the case for you.

------
jasonkolb
Hey Tristan, looks awesome. Great work!

This has some really interesting adjacencies to a project that we currently
have in limited beta and are getting ready to roll out widely very soon. I'd love to
chat about some ideas I have to work together that could work out really
nicely for both of us. If you're interested drop me a line:
jason@applieddatalabs.com

------
notastar
Tristan, quick question here: who did all the coding? Do you still code?
Does Anand code?

Since everyone on your team is very high profile (Stanford, Harvard), I am
wondering how much you have all kept doing the groundwork after rising to such
a level. Thanks for your answer.

ps: I am hopeful for Stanford MBA admission.

~~~
tristanz
Anand and I built everything, including choosing the colors of buttons.
That's early stage startup life.

------
micro_cam
Looks nice.

How well does the distributed filesystem perform and what size data sets can it
handle?

How quickly can you ramp up 10, 100 or 1000 cores?

Improved performance in these areas is the big thing that would get our
group to adopt a new platform.

~~~
tristanz
Scaling up to 10, 100, or 1000 cores is fast (3s per engine in parallel).
However, something like 1000 cores would currently require spinning up new
instances (1-2 minutes) if deployed in the cloud.

The distributed filesystem is meant for easily sharing code and medium sized
data across containers. In the cloud, it is best to use S3 directly for
large data and local disks for high IO tasks. For on premise deployments,
there are more options.
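
As a minimal sketch of the "S3 for large data, local disk for high IO" pattern, the helper below copies one object from S3 to local disk before working on it. It assumes the client object behaves like a boto3 S3 client (it only needs a `download_file(bucket, key, filename)` method). The function name, bucket, and key are hypothetical, and nothing here is specific to Sense.

```python
import os
import tempfile


def fetch_to_local_disk(s3_client, bucket, key, local_dir=None):
    """Download one S3 object to local disk and return the local path.

    `s3_client` is expected to look like a boto3 S3 client, i.e. to
    provide download_file(Bucket, Key, Filename).
    """
    local_dir = local_dir or tempfile.gettempdir()
    local_path = os.path.join(local_dir, os.path.basename(key))
    s3_client.download_file(bucket, key, local_path)
    return local_path


# Example usage (needs boto3 and AWS credentials; names are placeholders):
#
#   import boto3
#   path = fetch_to_local_disk(boto3.client("s3"), "example-bucket", "data/big.csv")
#   # ...then run the IO-heavy work against `path` on local disk.
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object when no AWS account is at hand.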

I'd be interested to hear about your use case. Feel free to drop me a line at
tristan@senseplatform.com

------
MaBu
If you think the platform is finished enough maybe post it on Kaggle:
[http://www.kaggle.com/](http://www.kaggle.com/) there are many potential
users for this app IMHO.

------
berto99
A little off topic, but nice to see you're using Angular.

------
hcarvalhoalves
Looks a lot like GitHub. Were you inspired by it?

~~~
tristanz
Yes, we're fans of how sharing and collaboration works on GitHub. The goal is
to make Sense the center of gravity for data scientists the way GitHub is for
developers.

We're not trying to replicate GitHub's features. The core of Sense is a better
way to work with data: the compute infrastructure, engines, and analytics
workflow. Advanced users using git will likely use GitHub in addition to
Sense.

------
berto99
Tried to sign up, but I need an invitation code.

~~~
tristanz
You currently need an invitation code to register. We're giving these out
slowly to make sure everything works smoothly.

~~~
mdda
Typo : "Distributed POSIX _complaint_ project file system"

~~~
apatil
Thanks.

------
quasiben
Is IPython notebook on the roadmap?

~~~
tristanz
The Python engine is IPython under the hood. Any code or visualizations
that work in IPython notebooks should work in Sense.

There is a difference though. In our experience, notebook-style development,
with code inline, is awkward when doing serious analytics. It's harder to use
version control, editors, etc. We have opted for the dual-pane experience
common in R and Matlab. The output, however, can be rich and interactive just
like an IPython notebook and is always saved.

