

Rate our pivot: A Python cloud spreadsheet - gpjt

We're a not-quite-startup that's pivoting. We're looking for feedback on our new product, Dirigible, a programmable cloud spreadsheet.

http://www.projectdirigible.com/

Dirigible is a spreadsheet that displays in a browser and calculates on the server. It uses Python both as its formula language and as its (more than) macro language. Our users can run their spreadsheets in parallel across a bunch of servers.

It's in free beta right now, we're aiming to start charging for compute time soon (we're using EC2, so will be charging a mark-up on what Amazon charges us).

We'd appreciate any feedback at all, but we're particularly interested in:

- Ideas about people that might benefit from using it (so far we have thought of bioinformaticians, finance professionals and accountants/actuaries).

- Missing features that would prevent you from using it.

- How well we're presenting our case.

- How smooth the signup process is.

Any thoughts would be much appreciated -- thanks in advance!
======
todayiamme
I love the product, but I have doubts on whether or not you can make this a
long term success.

>>>\- How well we're presenting our case.<<<

I think that's the central challenge to your startup. The problem isn't
whether or not Dirigible is good (which it is). It's the question of why
anyone should use this.

As the average jane I really don't care about using Python in my spreadsheet
or that it can contain other objects than just text or numbers. I really don't
care. I have Google on one end, Zoho in another corner and something nice from
Microsoft that's as cool as the Office on my PC.

So, why should I invest time to learn something like this? The website really
isn't that clear on the matter. Terms like that aside, why should anyone move
beyond options that work okay for now?

~~~
gpjt
That's excellent feedback, many thanks. I think that you're right that the
average user won't get any benefits from Dirigible -- it only really helps
power users, so it sounds like we need to be crystal-clear on that, and then
make it clear how it can benefit those power users:

- maintainability (which is the real advantage of the whole
objects-in-the-grid thing)

- calculation speed (through parallelism and big-iron servers on the backend)

- potentially, larger spreadsheets (we'll need to work to implement that but
in theory we should be able to support vast datasets)

Would that improve the pitch, do you think?

~~~
todayiamme
There isn't any need to thank me. I actually like doing this.

Yes, but why can't I just run this locally without having to worry about
network lag?

If you target quants then most of the serious ones already have their own
setup to crunch numbers. So, why should they use something like that? Or why
shouldn't they just write a python script on EC2? And download a raw text file
or something instead?

I think that the core of your product is wonderful and it has many, many uses,
but I'm afraid that as a market the entire spreadsheet business isn't the best
one to be in. I might be wrong as I hardly have any experience other than
reading stuff, but the thing is that it seems to _me_ no one will have the,
um, otaku in this field to make your product a success (as Seth Godin would
say).

I just hope that I'm wrong.

~~~
gpjt
I hope you're wrong too, but it's a very good point. The advantage we _think_
we have over writing their own Python (or Hadoop, or...) on EC2 is that it's
just easier to put these things together in a spreadsheet environment. But we
clearly need proof points for that, and perhaps we should be focusing as much
as possible on finding someone who can champion it.

One thought -- as lazylland suggested elsewhere in this thread, perhaps we
should consider downplaying the spreadsheet aspect? Ultimately, what we're
trying to do is provide an easy-to-use way for people to access large-scale
computing resources. It just so happens that a spreadsheet-like interface
turns out to be a great way of doing that, but that doesn't mean we need to
bill ourselves immediately as "a spreadsheet".

(Also -- it sounds like we should be promoting the option of running the app
on local server farms more heavily.)

~~~
todayiamme
You know something? Why don't you take the core of your service (the ability
to handle large datasets) and just make something that's essentially
Quantitative Analysis for dummies? Or an HFT fund for dummies?

There are tons of, um, for lack of a better word, amateur would-be
stockbrokers who like to play the market, but they really have neither the
knowledge nor the access to tools like this. So, if you offer this to them
then the sky is the limit to what you can make. Essentially leverage all of
your abilities to help everyone from amateurs to professionals work in the
cloud and handle datasets.

Moreover, replace Python with your own query language with an English-like
syntax that will appeal to such people.

If you see this then can you email me to continue this conversation? This is
an interesting problem.

~~~
gpjt
Only just saw this! I'll drop you a line now.

------
notahacker
Charging based on compute time makes no sense to me. Not only does it force
your customer to guesstimate their required compute time (a function of code
they haven't yet written and servers they don't administer), but it also
implies that your service offers little more value than the commodity
infrastructure underpinning it.

I'm far more likely to have advance knowledge of the number of inputs and
outputs I'm expecting and to pay more for features. My gut feeling is that the
real value in what you've developed is that business types with little
programming knowledge can use a spreadsheet based on data automatically
imported from elsewhere. They'll be the sort of people very happy to pay more
for widgets that simplify using the relevant libraries (your screen-scraping
example is excellent) and very uncertain about how many minutes they need
(unless they're doing something very simple on the free plan). If you need to
charge more
for intensive processes it might make more sense to have it as an abstract
bolt-on like Heroku's dynos that can be added on top of the monthly fee.
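
To make the difference between the two pricing shapes concrete, here is a rough sketch. Every figure in it (hourly cost, mark-up, monthly fee, allowance, bolt-on size and price) is invented purely for illustration, and the function names are hypothetical; none of it reflects Dirigible's or Amazon's actual prices.

```python
import math

def metered_charge(compute_seconds, cost_per_hour, markup):
    """Compute-time pricing: provider cost for the time used, plus a mark-up."""
    return compute_seconds / 3600 * cost_per_hour * (1 + markup)

def plan_charge(monthly_fee, compute_seconds, included_seconds,
                unit_seconds, unit_price):
    """Flat monthly fee plus whole bolt-on units for usage beyond the allowance."""
    extra = max(0, compute_seconds - included_seconds)
    units = math.ceil(extra / unit_seconds)
    return monthly_fee + units * unit_price

# All figures below are made up for the sake of the example.
for hours in (1, 10, 100):
    seconds = hours * 3600
    metered = metered_charge(seconds, cost_per_hour=0.10, markup=0.5)
    plan = plan_charge(20.0, seconds, included_seconds=36000,
                       unit_seconds=36000, unit_price=5.0)
    print(hours, round(metered, 2), plan)
```

With the metered model the bill tracks usage that the customer has to guess in advance; with the plan model the bill only moves in coarse, pre-announced steps.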

Unrelated aside: for some reason I have to sign in to view one of your
examples: /user/tutorial/sheet/1024

~~~
gpjt
Interesting, thanks for that. Our model for compute-time-plus-markup pricing
was the plethora of companies that make money using a similar model with
storage -- Dropbox being the shining example. But you raise a good point that
those are inherently more predictable for the users. You know that you need
_N_ GB of storage because that's what's on your hard drive, whereas how many
seconds a calculation will take is much less knowable.

Of course, if we were to take the more-technical route suggested by others on
this page (i.e. moved from being a "super-powerful programmable spreadsheet" to
being "easy cloud supercomputing [with a shiny spreadsheetlike interface]")
then perhaps our current pricing model would make sense.

On the other hand, if we stick to the spreadsheet model, you could well be
right.

Hmm. Food for thought. Thanks for the pointer, anyway, and in particular for
the reference to Heroku's dynos -- always good to know how people have
successfully solved similar problems!

Oh, and thanks for the heads-up re: the sheet that needed a sign-in, I've
fixed that, and 1025 as well (it seemed to have the same problem).

------
vineet
Very interesting tool.

Two main thoughts:

I felt that you could remove the word programmable from 'programmable cloud
spreadsheet' - since most people will think that spreadsheets are
programmable.

I am a little worried that you have only gone half-way to your users. If you
are about the cloud and your users understand that, then perhaps you should
let them deploy to their own cloud machines. The other alternative would be to
charge not per compute time but per operation (line of code).

Beyond that you might need to experiment with a couple of things: Try giving
an example of how much a small license will buy me, or just giving me an
example per compute time. Try letting me just buy time so that you can see
what I would do with it, and perhaps don't show the number of concurrent
servers in the signup page - just making the decision process simpler. Having
lots of options means more decisions that a new user will have to go through -
it might work for you if they are aligned with your users, but otherwise it
might just mean more work for you to do to get it right.

~~~
gpjt
Thanks for that.

Re: "programmable" -- what we're trying to emphasise with that is that it's
more programmable than (say) Excel -- the whole Python in cells thing, plus
its ability to rework the recalculation loop with Python code. Our first cut
was actually "a Python cloud spreadsheet" but we figured that we didn't want
to tie ourselves down to people who already knew and liked Python, so we moved
on to "programmable". Is there a better word, do you think?

We've definitely been thinking about having some kind of option for people to
deploy to their own private clouds (or indeed to their own server farms) but I
see that we've not put anything on the site about that; we'll add something
for that.

I like the idea of having an example of what the different pricing
plans/compute time allowances get you, we'll definitely add that.

We've had a lot of debates internally about the concurrent server support;
perhaps we should just drop that from the plans entirely?

~~~
vineet
Re: "programmable". What you are saying makes sense. But when I first see a
site that says 'programmable cloud spreadsheet' - I don't get that.

In fact, there is a part of me that is still trying to figure out for which
problems you are better than Google Spreadsheets and its programming model
(even though I understand that you are different from it).

Perhaps what you need to do is talk about the problems that you solve instead
of what you are building. This will definitely be helpful when trying to get
to product/market fit.

Re: dropping concurrent server support - I doubt that anyone can really tell
you what you should do there. You need to experiment. And the answer right now
(with early adopters) might be very different from the answer a few months
down the road.

~~~
gpjt
Right, that makes sense -- the classic benefits versus features balance --
which is, of course, made more complex by the fact that one user's benefit is
another user's feature, but that's not to say that there's no balance to
strike. One thing we did with our earlier product was to have a "How we can
help..." section on the front page, naming three markets, with links through
to separate pages for each, naming appropriate problem and solution/benefits
for each. Perhaps we need to do the same thing here.

Re: what we're better at than Google spreadsheets -- our big benefits come in
simplifying complex spreadsheets, by using programmerly techniques such as
objects/lists/NumPy arrays in cells, and in automated parallelisation for
large calculations. These are definitely power user features, for normal
cashflow spreadsheets and the like there's no advantage to using Dirigible
instead of Google.
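
As a rough illustration of the objects-in-cells idea, here is a sketch in plain Python. It is not Dirigible's actual API: the `grid` dict, the cell names and the `present_value` helper are all made up for this example. One cell holds a whole list of cashflows, another holds a small object of parameters, and a third is computed from both, instead of spreading the values across a row of helper cells.

```python
# Hypothetical sketch: a "grid" is just a dict mapping cell names to values.
# Nothing here is Dirigible's real API.
grid = {}

# One cell holds an entire list instead of one number per cell.
grid["A1"] = [100.0, 110.0, 121.0, 133.1]

# A second cell holds a richer object (here a dict) with the model parameters.
grid["A2"] = {"discount_rate": 0.10}

def present_value(cashflows, rate):
    """Discount each cashflow back to period 0 and sum the results."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# A third cell is computed from the two objects above.
grid["A3"] = present_value(grid["A1"], grid["A2"]["discount_rate"])

print(round(grid["A3"], 2))
```

The maintainability argument is that the discounting logic lives in one named function rather than being copied into a formula per column.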

(Random thought: we're currently using the example of calculating the orbits
of the planets to show a bunch of those features off. We chose that example
specifically because it's simple enough for most technical people to grok
pretty quickly and doesn't require any real domain knowledge to see what's
going on. Perhaps we also need more industry-specific examples to make the
same point -- eg. pricing portfolios of exotic options for the finance guys.)

Re: concurrent server support -- well, I guess a bit of A/B testing might be
in order -- with a mind to the fact that we'll need to keep running the tests
as time goes on and we (fingers crossed) cross the chasm.

------
peterbe
Not going to comment on the usefulness of the product but...

1) I love that the website presents a video right away that is calm and clear
and explains both the simplicity and the power at the same time

2) What you revealed here about the simple fact that you're running right on
top of EC2: I think that should be made clear on the website because it's
a trustworthy brand and a transparent business aspect

3) The light blue background colour hurts my eyes. It's very intense and sharp
and not a pleasant background for the white logo.

4) What is the logo? A balloon trapped in ropes? An H2O molecule?

~~~
gpjt
Thanks for the feedback. It's really useful:

1. Cool. We noticed yesterday that it could use an update, but we'll stick
with the calm simplicity.

2. Excellent idea.

3. Interesting. We'll experiment with a less intense version to see how that
looks.

4. Like ihumanable says, it's an airship where the gas envelope is a cloud.
It breaks up into pixels to suggest a spreadsheet.

------
spne
Do you have any plans to license the software to clients, where we could use
it on our own servers? The functionality looks great, but I have concerns
about proprietary and confidential information.

~~~
millenniumhand
We do. We're using the public version to polish the interface and debug the
server provisioning, but a self-hosted version is definitely on the cards.

~~~
spne
What kind of pricing model are you planning?

~~~
gpjt
To be honest, we haven't given that any thought. We are open to suggestions,
though.

------
imechura
Hi,

I know this is outside of your requested feedback, but have you considered
adding some vertical business back ends to it?

I cannot see why I would want to use your spreadsheet over Google Apps'. But
if you were to offer built-in statistical analysis or, say, accounting
functions, and then market it to that user base, you may be able to make it
stick.

Another example would be common functions used to perform land surveys.

Just my 2 cents. Probably worth less than that.

~~~
imechura
Just to add a little more... Could you create an API so third party developers
could create "function packs" that could be bought and sold to end users?

~~~
gpjt
Thanks, that's a great idea -- sorry for the slow response (PyCon got in the
way)

We already support a bunch of Python modules -- for open source ones it's just
a case of using pip to install them on our server -- and we've always intended
to allow people to upload their own. Making it easy for people to share
extension modules they've written with other people (hmm, and maybe even to
charge for use, app-style) would make this much more powerful.

------
lazylland
* 'Spreadsheet' is a really bad summary of the product. My first impression was 'Great another Office wannabe, _NEW_ with Python', but when I watched the video, I saw it as "effortless desktop supercomputing" .. cool !

* Another message that could be given is "wrangling big data the easy way"

* Focus the product on the niches you have already identified, all the big boys with the hard numbers to crunch. Do not dilute for the Mom and Pop segment.

~~~
gpjt
Oh wow, that's an ace tagline! Still, it does run on web servers, would
"effortless cloud supercomputing" have as good a ring, do you think?

We're definitely going to avoid trying to cater for the Mom and Pop segment --
Google docs have a perfectly decent solution for them.

~~~
lazylland
Now that I think of it .. just "effortless supercomputing" would suffice !
Maybe it's just me, but the "cloud" term is so overused now that it has come
to mean everything and nothing.

Kudos to you guys on a cool product .. wish you all success !

~~~
gpjt
Heh, we had a crisis of conscience before we decided to use it :-)
<http://blog.projectdirigible.com/?p=507>

Thanks!

------
epynonymous
love the web interface, it's quite clean.

i didn't rtf-faq, but can you do things like import xls and google doc
spreadsheets? that would be pretty cool to translate xls macros into python.

it would also be cool to have like a side by side comparison of say a common
problem that's done in an xls spreadsheet versus this and how much easier it
would be to use this.

nice work!

~~~
gpjt
Thanks! We can import xls files, but only the values right now -- not the
formulae or the macros. Formulae are definitely on the cards, we're not so
sure about macros -- compiling VBA to Python is doable but might lead to
completely unmaintainable code.

Google Docs is something we definitely need to add.

I love the idea of the side-by-side comparison, we'll have to do that.

------
limmeau
Nice. I sometimes wonder why I have to write my formulae as one line into one
tiny window when I have a full-HD screen on my desk.

The simplest example on your examples page consists of a Python program with a
function and a for-loop. Perhaps you could add even simpler examples to pick
up spreadsheet users (e.g. just add A1 and B1, giving C1)?

~~~
gpjt
Thanks, that's great feedback. We kind of expect that our users will know how
to do the spreadsheet basics, but I think you're right that the for loop
should move out of the first tutorial, and maybe the function too. Best not to
scare people off :-)

------
sagacity
Clickable: :)

<http://www.projectdirigible.com/>

~~~
gpjt
Thanks! Couldn't get it to work in the post. HN markup remains a mystery to
me...

------
jcsalterego
Reminds me of <http://www.resolversystems.com>

~~~
jared314
It is the same person.

~~~
gpjt
Yup, that's right. Resolver One is a desktop app, we're now focussing much
more on the web (though we'll obviously not abandon Resolver One)

------
mef
My first thought upon getting into an actual spreadsheet was "I'd use it if it
supported Ruby or Javascript". Any plans to support languages other than
python?

Also, any plans to import spreadsheets from excel/gdocs?

~~~
gpjt
We definitely could support other languages, Python is the first one we went
for because of modules like NumPy and SciPy, which are a great fit for
spreadsheets that do serious number-crunching.

If we were to support Ruby or JS, the one question would be what to do with
the formula language. Right now we recognise a language that is kind of a
superset of both "traditional" spreadsheet formula syntax and Python
expressions. (We actually started with the BNF for the Python grammar and then
extended it.) Once we've parsed formulae using that grammar, we compile down
to Python and run it. Now, obviously we could compile down to Ruby or
JavaScript equally well, but I suspect we'd need to use a more Rubyesque or
JavaScript-like formula language too, which might be... interesting.
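
The "translate the formula, then compile to Python and run it" idea can be sketched very roughly as follows. This is not the real grammar-based parser described above: it swaps in a plain regular expression that rewrites cell references into worksheet lookups, and the names `formula_to_python`, `evaluate` and `ws` are assumptions made up for this example.

```python
import re

# Matches simple cell references such as A1 or BC42 (assumption: no ranges,
# no sheet names, no absolute "$A$1" references).
CELL_REF = re.compile(r"\b([A-Z]+[0-9]+)\b")

def formula_to_python(formula):
    """Rewrite '=A1 + B1' into the Python source 'ws["A1"] + ws["B1"]'."""
    if not formula.startswith("="):
        raise ValueError("formulae must start with '='")
    return CELL_REF.sub(r'ws["\1"]', formula[1:].strip())

def evaluate(formula, ws):
    """Compile the translated formula and evaluate it against worksheet ws."""
    code = compile(formula_to_python(formula), "<formula>", "eval")
    # Only `ws` and a couple of whitelisted names are visible to the formula.
    # NOTE: this is not a security boundary; eval on untrusted input is unsafe.
    return eval(code, {"__builtins__": {}, "ws": ws, "sum": sum, "max": max})

ws = {"A1": 2, "B1": 3, "D1": [1, 2, 3]}
ws["C1"] = evaluate("=A1 + B1", ws)                  # spreadsheet-style formula
ws["E1"] = evaluate("=sum(x * A1 for x in D1)", ws)  # Python expression in a cell
print(ws["C1"], ws["E1"])
```

A regex is much cruder than a grammar: it would, for instance, also rewrite the text `A1` inside a string literal. It does show why the formula language and the target language end up tied together, since everything after the `=` that is not a cell reference is passed through as Python.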

We currently can import values from Excel, but not formulae -- we've
definitely got to improve that. Gdocs import is something we'll get sorted
too.

~~~
mryan
Before spending time adding support for additional languages, I would wait for
someone to demonstrate their need and willingness to pay for this feature.

I have read some interesting things about Python being used in the
maths/science worlds (NumPy/SciPy being great examples), but have never heard
of Ruby or JS being used in this context. So I think a lot of web devs will
say "It should support Ruby/JS/whatever" - but how many of those people are
actually going to pay for the site? Is there any reason to support these
languages other than the cool factor?

Adding support for VBA would probably help bring a lot of corporates over to
your site, although I really feel for whoever ends up coding that! :-)

This is a great site btw, lovely logo/design, and a fantastic piece of tech
behind it. I could see this being bought by Google and folded in to
Spreadsheets.

~~~
gpjt
Sounds sensible. And definitely agreed re: VBA...

> This is a great site btw, lovely logo/design, and a fantastic piece of tech
> behind it. I could see this being bought by Google and folded in to
> Spreadsheets.

Thanks! Our fingers are crossed :-)

------
revorad
This looks very interesting. Have you looked at Hypernumbers? They are
building something similar. YC-funded Skysheet was also working on a web-based
spreadsheet but they seem to be still in private beta mode.

~~~
gpjt
We've been watching Hypernumbers, they look pretty good. I'd somehow missed
Skysheet (great product name), thanks for the pointer!

------
Yoric
Haven't checked it yet, but calculating on the server, in particular with
Python as a macro language, sounds like a can of worms: doesn't it make it
very easy for any malicious user to DoS your server?

~~~
gpjt
We've done a lot of work on locking stuff down, and the back-end runs across a
number of servers, so it would be pretty hard to take down the site as a
whole. But yeah, there are bound to be loopholes.

OTOH, if we're charging for compute time and someone chooses to write
something that locks up the server completely, then, hey, it's their money ;-)
So long as we can keep it running for everyone else, anyway.

We're a bit more worried about how to stop people from using it to create a
botnet or otherwise use it to do Bad Things to other people; but I guess every
cloud computing service ultimately hits that problem, and all you can do is
monitor and shut stuff down quickly if something bad starts happening.
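
One generic way to stop a runaway calculation from tying up a worker is to run it in a separate process with a time limit. The sketch below uses only the Python standard library; it is an assumption about how such a limit could look, not a description of how Dirigible is locked down, and `run_user_code` is a hypothetical name.

```python
import subprocess
import sys

def run_user_code(source, timeout_seconds):
    """Run source in a separate interpreter with a wall-clock time limit.

    Returns the captured stdout, or None if the time limit was exceeded.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return None
    return result.stdout

print(run_user_code("print(1 + 1)", timeout_seconds=10))
print(run_user_code("while True: pass", timeout_seconds=1))
```

This only bounds wall-clock time. It does nothing about memory, network or filesystem access, which is where the botnet worry comes in and why monitoring is still needed.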

------
hlfshell
What was your original idea/goal? Why did you pivot? Pure curiosity.

~~~
gpjt
We started out to produce a similar desktop tool (see comments about Resolver
One). That was working out OK but hasn't been the breakout success we'd always
hoped for. In addition, people kept asking us (quite reasonably) about a
cross-platform version.

So we thought about that and decided that a rewrite for the web would solve
the cross-platform bit (JavaScript being the ultimate cross-platform language
;-) and also allow us to support parallel execution of spreadsheets so that
people could scale their compute resources up and down.

------
jared314
Can I upload my company's proprietary modules? And restrict access to the API?

~~~
millenniumhand
I'm another developer on Dirigible. We already have module upload on our
roadmap. We know what we need to do to implement it, and all the hard work to
prevent cross-sharing of modules between users is already done.

The API is secured with a key that is unique to each spreadsheet and it can be
easily regenerated if necessary. Are there other restrictions that would be
of interest to you?
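
A per-spreadsheet key that can be regenerated might look roughly like this. It is a minimal sketch using the standard library, under the assumption of a simple in-memory store; the `api_keys` dict, the sheet id and both function names are hypothetical and not taken from the real service.

```python
import hmac
import secrets

# Hypothetical in-memory store: spreadsheet id -> current API key.
api_keys = {}

def regenerate_key(sheet_id):
    """Create (or replace) the key for a sheet; any old key stops working."""
    api_keys[sheet_id] = secrets.token_urlsafe(32)
    return api_keys[sheet_id]

def is_authorised(sheet_id, presented_key):
    """Check a presented key using a constant-time comparison."""
    expected = api_keys.get(sheet_id)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), presented_key.encode())

old_key = regenerate_key("sheet-1024")
new_key = regenerate_key("sheet-1024")
print(is_authorised("sheet-1024", old_key))
print(is_authorised("sheet-1024", new_key))
```

Because each sheet maps to exactly one key here, regenerating it revokes access for everyone who held the old one.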

~~~
jared314
Can I generate multiple API keys to control customer access?

~~~
gpjt
Not yet, but soon.

