
Guesstimate – A Spreadsheet for things that aren't certain - 666_howitzer
https://www.getguesstimate.com/
======
ozgooen
Cofounder Here: Happy to see this on hnews again.

Update: Matthew (the other cofounder) and I got Guesstimate to a stage we were
happy with. After a good amount of work it seemed like several customers were
pretty happy with it, but there weren't many obvious ways of making a ton more
money on it, and we ran out of many of the most requested/obvious
improvements. We're keeping it running, but it's not getting much more active
development at this time.

Note that it's all open source, so if you want to host it in-house you're
encouraged to do so. I'm also happy to answer questions about it or help with
specific modeling concerns.

Right now I'm working at the Future of Humanity Institute on some other tools I
think will complement Guesstimate well. There was a certain point where it
seemed like many of the next main features would make more sense in separate
apps. Hopefully, I'll be able to announce one of these soon.

~~~
smallnamespace
Are you able to apply global correlations to all the variates?

One of the triggers for the financial crisis in '08 was that the Monte Carlo
pricers assumed the various risks were much less correlated than they actually
were.

For example, they largely assumed that it was unlikely for many mortgages or
underlying MBS securities to simultaneously default (low correlation). This is
how many AAA rated CDO securities ended up trading at 50%+ discounts.

IMHO, any multivariate Monte Carlo analysis that doesn't show your sensitivity
to correlation is essentially useless, since your answers may change
completely.

In the second example model
([https://www.getguesstimate.com/models/316](https://www.getguesstimate.com/models/316)),
Fermi estimation for startups, you would expect many of the inputs (deals at
Series A, B, C, amount raised per deal) in real life to be highly correlated
with each other since they all depend on 'how well is VC in general doing
right now?'

The final estimate of 'Capital Lost in Failed Cos from VC' has a range of 22B
to 39B, this seems way too low. The amount of VC money lost during a crisis
(like in '01) can easily be an order of magnitude more.

~~~
ozgooen
I'd definitely agree that correlations can be a really big deal, especially in
very large models like that one.

Guesstimate doesn't currently allow for correlations as you're probably
thinking of them. However, if two nodes are both functions of a third base
node, then they will both be correlated with each other. You can use this to
make somewhat hacky correlations in cases where there isn't a straightforward
causal relationship.
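
A minimal sketch of that shared-base-node trick, in plain Python rather than in Guesstimate itself. The node names (`climate`, `deals`, `amount_per_deal`) and all the numbers are made up for illustration; the point is only that two nodes computed from the same base node come out correlated, while the same nodes built independently do not.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
n = 5000

# Hypothetical base node: an overall "market climate" shared by both children.
climate = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Each child is a function of the base node plus its own independent noise.
deals = [100 + 20 * c + rng.gauss(0.0, 10.0) for c in climate]
amount_per_deal = [5 + 1.0 * c + rng.gauss(0.0, 0.5) for c in climate]

# The same two quantities sampled with no shared base node, for comparison.
deals_indep = [100 + rng.gauss(0.0, 22.4) for _ in range(n)]
amount_indep = [5 + rng.gauss(0.0, 1.12) for _ in range(n)]

corr_shared = pearson(deals, amount_per_deal)
corr_indep = pearson(deals_indep, amount_indep)
print(f"shared base node: {corr_shared:.2f}")
print(f"independent:      {corr_indep:.2f}")
```

The strength of the induced correlation is controlled by how large the shared term is relative to each node's own noise.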

Implementing non-causal correlations in an interface like this is definitely a
significant challenge. It could introduce essentially another layer to the
currently 2-dimensional grid. It's probably the feature I'd most like to add,
but the cost was too high so far.

I think Guesstimate is really ideal for smaller models, or for the prototyping
of larger models. However, if you are making multi-million dollar decisions
with hundreds of variables and correlations, I suggest more heavyweight tools
(either enterprise Excel plugins or probabilistic programming).

~~~
smallnamespace
Thanks for explaining your thought process. I read your other replies and I
agree that many decisions are being made without any formal probabilistic
model at all. There's a lot of value in sitting down and working out how
things might be related to each other.

> where there isn't a straightforward causal relationship

One way to interpret a global pairwise correlation is simply that the person
building the model is being systematically biased in one direction—either
being too pessimistic or optimistic. This is a 'non-causal' relationship but
often the biggest contributor to variance between the model and the real
world.

Philosophically, this is a bit like the difference between 538's modeling
approach and Princeton Election Consortium's for the 2016 election—the former
gave Hillary a 2/3 chance of winning, while the latter ascribed a ~99% chance.

The risk of leaving modeling error out is that you'll end up with much more
confidence than is called for—it feels very different to come up with a point
estimate (I'll save $10k this year) vs. a tight range (I'll save 9k-11k this
year), if the true range is much wider.

In the former case you know your point estimate may be very far off, but in
the latter you may be tempted to rely on an estimate for variance that is too
low.

> It could introduce essentially another layer to the currently 2-dimensional
> grid

You could probably get away with doing almost all of this automatically for
the user as long as they decide on what the 'primary' output is:

\- For every input, calculate whether it's positively or negatively correlated
with the output

\- Apply a global rank correlation to all the inputs with all the standard
techniques, flipping the signs found above as appropriate

\- Report what the output range looks like with a significant positive correlation
(usually the negative correlation case isn't as interesting)
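
The three steps above can be sketched roughly as follows. This is not how Guesstimate works; it is a simplified rank-reordering approach (in the spirit of the Iman-Conover method) driven by a single common factor. The `model` function, the input distributions, and the value of `rho` are all hypothetical.

```python
import random

def model(a, b, c):
    # Hypothetical primary output: two inputs raise it, one lowers it.
    return a * b - c

def cov_sign(xs, ys):
    """+1 if xs and ys move together in the sample, otherwise -1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return 1 if cov >= 0 else -1

def rank_reorder(samples, scores):
    """Reorder samples so their ranks match the ranks of scores."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    out = [0.0] * len(samples)
    for value, idx in zip(sorted(samples), order):
        out[idx] = value
    return out

def interval(xs, lo=0.05, hi=0.95):
    s = sorted(xs)
    return s[int(lo * (len(s) - 1))], s[int(hi * (len(s) - 1))]

rng = random.Random(1)
n = 4000
rho = 0.7  # assumed strength of the global correlation

# Independent input samples (made-up lognormal parameters).
inputs = [
    [rng.lognormvariate(3.0, 0.3) for _ in range(n)],  # a
    [rng.lognormvariate(1.0, 0.3) for _ in range(n)],  # b
    [rng.lognormvariate(3.5, 0.3) for _ in range(n)],  # c
]

# Step 1: sign of each input's relationship with the primary output.
base_out = [model(*row) for row in zip(*inputs)]
signs = [cov_sign(col, base_out) for col in inputs]

# Step 2: tie every input to one common factor, flipping signs as needed,
# and reorder each input's samples to follow the resulting ranks.
common = [rng.gauss(0.0, 1.0) for _ in range(n)]
correlated = []
for col, s in zip(inputs, signs):
    scores = [s * (rho ** 0.5 * g + (1 - rho) ** 0.5 * rng.gauss(0.0, 1.0))
              for g in common]
    correlated.append(rank_reorder(col, scores))

# Step 3: compare the output range with and without the correlation.
corr_out = [model(*row) for row in zip(*correlated)]
lo0, hi0 = interval(base_out)
lo1, hi1 = interval(corr_out)
print(f"independent: {lo0:.0f} to {hi0:.0f}")
print(f"correlated:  {lo1:.0f} to {hi1:.0f}")
```

Reordering leaves each input's marginal distribution untouched, so only the joint behavior changes between the two reported ranges.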

------
carlob
Maybe it's just the short video and the FAQ, but I found it particularly
difficult to find information about the distributions involved and how to
choose that.

I imagine there are a bunch of cases where the defaults would not work, like when
you're trying to do error propagation (all normal distributions) or you're trying
to compute interval arithmetic.

Is it the case that if you input a range which spans multiple orders of
magnitude then you get lognormal rather than normal?

I might not be exactly the target audience, but I would appreciate a more
in-depth explanation of the math and heuristics involved.

EDIT: I found this on their blog

[https://medium.com/guesstimate-blog/lognormal-
normal-833bf41...](https://medium.com/guesstimate-blog/lognormal-
normal-833bf413c7a3)

~~~
ozgooen
Thanks for the feedback.

We have some documentation at
[https://docs.getguesstimate.com/](https://docs.getguesstimate.com/),
and some in the sidebar entree.

Generally, we recommend lognormal distributions for estimated parameters that
can't be negative. This works when you span multiple orders of magnitude,
though it's possible you may want an even more skewed distribution (which is
unsupported).
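
As a rough illustration of fitting a lognormal to a range, here is a sketch that assumes the range is read as a 90% interval (5th to 95th percentile). That reading is an assumption made for this example, and `lognormal_from_range` is a hypothetical helper, not a Guesstimate function.

```python
import math
import random

Z95 = 1.6448536269514722  # standard normal quantile at 0.95

def lognormal_from_range(low, high):
    """Return (mu, sigma) of a lognormal whose 5th/95th percentiles are low/high."""
    if not 0 < low < high:
        raise ValueError("need 0 < low < high")
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z95)
    return mu, sigma

# A range spanning two orders of magnitude.
mu, sigma = lognormal_from_range(10, 1000)

rng = random.Random(0)
samples = sorted(rng.lognormvariate(mu, sigma) for _ in range(20000))
p5 = samples[int(0.05 * len(samples))]
p95 = samples[int(0.95 * len(samples))]
print(f"median ~ {math.exp(mu):.0f}, 5th pct ~ {p5:.0f}, 95th pct ~ {p95:.0f}")
```

The median lands at the geometric mean of the two endpoints, and every sample is positive, which is why this shape suits parameters that can't be negative.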

I may be able to make a much longer video introduction sometime soon.

------
iamwil
I saw this a couple of years ago, when it was just a project. Now that there's
a price, how did you guys decide on a price? How did you find your first
customers? For a broadly applicable tool, how did you know where to start
looking?

~~~
EGreg
Wouldn’t it be cool if they got it from their own spreadsheet?

~~~
iamwil
Yeah, I wonder if they did. That would be an interesting spreadsheet to keep
updated as people kept purchasing to update their model.

------
wcrichton
Does this permit Bayesian inference? e.g. looks like graphical probabilistic
programming (hooking up various distributions and performing inference),
except the key missing component is the ability to observe values for any
given distribution beyond the prior.

~~~
marmaduke
> the key missing component is the ability to observe values

Spot on, it’d need to load some data on final predictions. Or, it could dump
the model in a way that another software could use it.

I was thinking of the same thing.

------
zemlyansky
I'm developing a similar open-source app for statistical modeling and
inference in the browser: [https://statsim.com](https://statsim.com). You can
create probabilistic models and then infer their parameters using algorithms
such as MCMC or Hamiltonian Monte Carlo. The app is still in beta but it might
be useful. Some models:
[https://github.com/statsim/models](https://github.com/statsim/models)

------
samstave
This looks fantastic.

I love that it was a no BS signup and start using. Super clean and easy. It
would be great to be able to show data on GIS as well - effectively showing
the outcomes as geographic representations. I'll see if the data I was looking to
work with today will work with this tool meaningfully.

------
triggercut
I've used Palisade @Risk quite a bit, but for my use case, most of the time I
feel like I'm taking a Lamborghini to the corner store. This is perfect for
someone like me who is more of a "casual" estimator of things modelling with
probability.

------
r0mulus
God dammit. This sort of thing pisses me off. Here I am, on vacation, waiting
for my family to wake up. What better way to spend my time but peruse HN. I
happen upon something like this. Something so damned useful that I have no
choice but investigate.

------
wenc
Can this sample from an empirical distribution? (i.e. from a CDF, not from a
known distribution family)

~~~
ozgooen
You can copy & paste an array of samples and Guesstimate will sample from that
cluster. For instance, try pasting the following into the value field of a
cell: [1,1,1,2,2,2,2,3,3,3,3,3,4,4,4,4,4,7,7,7,7]

You can use tools like distshaper to generate arbitrary distributions, then
copy the samples into Guesstimate.

[http://smpro.ca/pjs/distshaper/](http://smpro.ca/pjs/distshaper/)

Guesstimate doesn't yet support an input format for arbitrary distributions
other than samples.
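
For readers wondering what sampling from a pasted array amounts to, here is a generic sketch of resampling with replacement from the example array above. It illustrates the idea only and makes no claim about Guesstimate's internals; `sample_empirical` is a hypothetical helper.

```python
import random
from collections import Counter

# The example array from the comment above.
pasted = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 7, 7, 7, 7]

def sample_empirical(data, n, rng):
    """Draw n values uniformly, with replacement, from the observed data."""
    return rng.choices(data, k=n)

rng = random.Random(0)
draws = sample_empirical(pasted, 10000, rng)
freq = Counter(draws)

for value in sorted(freq):
    print(value, round(freq[value] / len(draws), 3))
```

Each distinct value shows up in proportion to how often it appears in the pasted array, and nothing outside the array (5 or 6 here) is ever produced.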

------
jmhnilbog
I would love to see this idea translated into event planning/calendaring.
Probabilistic party planning. I want to see what might be happening tonight in
addition to what is definitely happening.

"If 5 people show up at my house tomorrow evening, I'll hold a poker night."
10 people were invited and 4 of them RSVP yes and 2 of them RSVP no. It looks
like there's a 95% chance I'm holding a poker night tomorrow.

"The X team has a monthly meeting on the 1st, never fail. They haven't decided
on the location yet, just that it's on the North Side." As the team members
pick possible locations, the possible locations appear more distinct until one
is chosen.
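
The poker-night figure can be sanity-checked with a small binomial calculation. This sketch assumes everyone who said yes shows up, everyone who said no stays home, and each of the 4 people who haven't replied attends independently with the same probability; none of that is stated in the comment, and `prob_at_least` is a hypothetical helper.

```python
from math import comb

def prob_at_least(needed, confirmed, pending, p_pending):
    """P(total attendees >= needed) when each pending guest attends with p_pending."""
    shortfall = max(0, needed - confirmed)
    return sum(
        comb(pending, k) * p_pending ** k * (1 - p_pending) ** (pending - k)
        for k in range(shortfall, pending + 1)
    )

invited, yes, no = 10, 4, 2
pending = invited - yes - no  # 4 people have not replied

# If each undecided guest is a coin flip:
p_coin_flip = prob_at_least(5, yes, pending, 0.5)

# Per-guest probability that would give exactly a 95% chance of a game:
implied_p = 1 - 0.05 ** (1 / pending)

print(f"coin-flip guests: {p_coin_flip:.4f}")
print(f"per-guest probability implied by 95%: {implied_p:.3f}")
```

Under these assumptions, the 95% figure corresponds to each undecided guest being slightly more likely than not to attend.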

------
paraschopra
It wasn’t obvious from the landing page but can you link estimates from
different models? It would be super cool to directly import variables and
their estimates from other models.

------
ken
This is terrific! The UI is very clever. I may have to steal some ideas from
this.

~~~
ozgooen
Please do so. The UI is all open-source react, so you may be able to copy some
components directly if you wanted. I'd be happy to help people out with this
if you have requests.

------
crb002
I proposed writing something like this while working for DuPont's Encirca
platform. Years later still little to no adoption in the farm IT field of
these models.

------
dang
From 2015:
[https://news.ycombinator.com/item?id=10816563](https://news.ycombinator.com/item?id=10816563)

------
m0zg
This is such a no-brainer for Microsoft to acquire.

------
hliyan
Is there a regular spreadsheet product that allows cells to have explicit
names and descriptions like this? (in addition to the value)

~~~
knight17
Excel has named cell ranges. So instead of

    
    
      =A5*A6
    

you could have

    
    
      =interest*principal
    

You can create them easily too -- you can name them individually, can assign names
from existing tables and so on. You can have constants too, that is, they
don't have to point to any cell [1]. It is a godsend when working with bigger
tables having lots of formulas.

[1] [https://www.ablebits.com/office-addins-
blog/2017/07/11/excel...](https://www.ablebits.com/office-addins-
blog/2017/07/11/excel-name-named-range-define-use/#named-constant)

[2][https://www.contextures.com/xlNames01.html](https://www.contextures.com/xlNames01.html)

~~~
hliyan
Yes, but I was hoping for a feature where the name is displayed on the
interface...

~~~
chrispsn
Mesh? www.mesh-spreadsheet.com. Toggle name visibility for a cell with the F3
key.

------
chairmanwow
This looks incredible! Heads up: the mobile version of the site
(Safari/iPhoneX) could use some TLC.

------
struct
Great tool! (I’m a regular user :D)

~~~
samstave
Can you give some examples on what you have built with it?

~~~
struct
Sure! Here's a particularly big one I built when deciding whether to invest in
Walmart:
[https://www.getguesstimate.com/models/10935](https://www.getguesstimate.com/models/10935)

Here's another one I did to ballpark a pension plan:
[https://www.getguesstimate.com/models/11133](https://www.getguesstimate.com/models/11133)

Another thing I like is that you can do simple statistical reckoning with it.
For my job, I often have to benchmark something several hundred times with or
without a patch applied. It can be a bit difficult to put "on average x% faster"
in context when the benchmark is noisy, but Guesstimate allows you to answer
questions like "assuming somebody ran one run of this benchmark with the
patch, and one run without it, what's the expected range of performance
improvement that they'd see?" with the actual numbers that you get out of the
benchmark:
[https://www.getguesstimate.com/models/11850](https://www.getguesstimate.com/models/11850)
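
The same single-run question can be sketched outside Guesstimate by resampling measured timings. The numbers below are invented placeholders, not the data from the linked model.

```python
import random

# Hypothetical timings in seconds; substitute real benchmark output.
baseline_runs = [10.2, 9.8, 10.5, 10.1, 9.9, 10.7, 10.0, 10.3]
patched_runs = [9.6, 9.9, 9.4, 10.1, 9.5, 9.7, 9.3, 9.8]

def single_run_improvement(rng):
    """Percent speedup seen by someone who does one run of each variant."""
    before = rng.choice(baseline_runs)
    after = rng.choice(patched_runs)
    return 100.0 * (before - after) / before

rng = random.Random(0)
draws = sorted(single_run_improvement(rng) for _ in range(10000))

low = draws[int(0.05 * len(draws))]
high = draws[int(0.95 * len(draws))]
mean = sum(draws) / len(draws)
print(f"mean {mean:.1f}%, 90% of single-run comparisons fall in {low:.1f}% to {high:.1f}%")
```

With noisy data like this, a single pair of runs can even show the patch as slower, although it is faster on average.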

~~~
thisisit
> whether to invest in Walmart:
> [https://www.getguesstimate.com/models/10935](https://www.getguesstimate.com/models/10935)

That is one heavy model. My computer comes to a standstill trying to open this
one.

I am curious though what was the result of the analysis?

------
leot
The mode of the distribution in the video appears to be zero. How is that
possible?

~~~
jsharf
I think technically the mode is one of the middle buckets, it looks slightly
taller to me.

Anyways, it's a histogram, so the x-axis is split into buckets. The bar all
the way on the left is likely some range of hours from 0 to whatever the
bucket size is
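
A quick way to see the bucket effect is to bin strictly positive samples by hand. The lognormal parameters and bin width below are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)

# Strictly positive, right-skewed samples (hypothetical "hours" estimates).
hours = [rng.lognormvariate(1.0, 1.0) for _ in range(5000)]

bin_width = 5.0
n_bins = 10
counts = [0] * n_bins
for h in hours:
    idx = int(h // bin_width)
    if idx < n_bins:  # values past the last bin are ignored in this sketch
        counts[idx] += 1

for i, c in enumerate(counts):
    print(f"{i * bin_width:>4.0f} to {(i + 1) * bin_width:<4.0f} {c}")
print("smallest sample:", round(min(hours), 3))
```

The tallest bar is the one starting at 0 even though no sample equals 0; it simply collects everything between 0 and the bin width.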

------
oferzelig
I love it! so well executed!

------
scottmcdot
Almost looks like it could be used in education, teaching stats?

~~~
ozgooen
I've heard of it being used in a few classes. There was one estimation session
with one group of what I remember to be 8th-graders. Honestly, I really don't
think you need to be great at statistics to understand the fundamental
concepts.

------
Bootvis
Is this project still active? The activity on GitHub seems low.

~~~
ozgooen
We're keeping it running but aren't actively improving it.

------
Invictus0
You should add a gaussian quadrature method as well.

------
flaque
This is super cool.

