
Show HN: Prolly – DSL to express and query probabilities in code - iamwil
https://github.com/iamwilhelm/prolly
======
noelwelsh
Probability distributions can be nicely modelled as a monad. See, for example,
[https://github.com/jliszka/probability-monad](https://github.com/jliszka/probability-monad) or
[http://www.cs.tufts.edu/~nr/pubs/pmonad-abstract.html](http://www.cs.tufts.edu/~nr/pubs/pmonad-abstract.html)

I think this goes beyond what Prolly offers, and so might provide some
inspiration for extending it in a consistent manner.
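To make the idea concrete, here's a rough Ruby sketch of a probability monad (a hypothetical `Dist` class, not Prolly's or probability-monad's actual API): a distribution is a map from outcome to probability, and `bind` chains dependent random choices.

```ruby
# Minimal probability-monad sketch (hypothetical, for illustration only).
# A Dist is a hash of outcome => probability, normalized on construction.
class Dist
  attr_reader :probs

  def initialize(probs)
    total = probs.values.sum.to_f
    @probs = probs.transform_values { |p| p / total }
  end

  def self.uniform(outcomes)
    new(outcomes.map { |o| [o, 1.0] }.to_h)
  end

  # Monadic bind: the block maps each outcome to a new Dist,
  # and the results are weighted by the outcome's probability.
  def bind
    result = Hash.new(0.0)
    @probs.each do |outcome, p|
      yield(outcome).probs.each { |o2, p2| result[o2] += p * p2 }
    end
    Dist.new(result)
  end
end

# Two fair coin flips; distribution over the number of heads.
coin = Dist.uniform([:h, :t])
heads = coin.bind { |a| coin.bind { |b| Dist.uniform([[a, b].count(:h)]) } }
# heads.probs => {2 => 0.25, 1 => 0.5, 0 => 0.25}
```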

~~~
tel
Oleg's Hansei is also very worth checking out

[http://okmij.org/ftp/kakuritu/](http://okmij.org/ftp/kakuritu/)

------
meesterdude
Wahoo! This is neat, and in Ruby!

However, I have some issues here. First, I don't know that I like the
interface. Adding to the class instead of an instance of the class irks me. I
also find the documentation hard to follow - the examples are too limited, and
others are poorly formatted.

But it's Ruby, so I have no excuse not to try to clean it up some. I'll tinker
and see what I can come up with in a PR.

Related, can anyone point me to something digestible that will bring me up to
speed on the finer points of the mathematics?

Also, I'm curious what kinds of things this is used for in the wild. I don't
know of anything that operates at all based on probabilities, so some examples
would be helpful.

~~~
iamwil
OP here. I went with "adding to the class" because I had assumed people would
only use a single probability space at a time, and it shortens the syntax
since you don't need to instantiate the space.

However, it's possible to make an instance of Ps and use the same interface.
I'd need to change a method or two to support it, but it wouldn't be a hard
change. Do you envision using more than one probability space at once?
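To illustrate what I mean, here's a hypothetical sketch (not Prolly's actual implementation) of supporting both styles: class-level calls delegate to a lazily created default instance, so the short syntax keeps working while separate spaces stay possible.

```ruby
# Hypothetical sketch only: a toy probability space that counts events.
class PSpace
  def initialize
    @counts = Hash.new(0)
  end

  def add(event)
    @counts[event] += 1
    self
  end

  def pr(event)
    total = @counts.values.sum
    total.zero? ? 0.0 : @counts[event] / total.to_f
  end

  # Class-level calls fall through to a single shared instance.
  def self.default
    @default ||= new
  end

  def self.add(event)
    default.add(event)
  end

  def self.pr(event)
    default.pr(event)
  end
end

# Shared-space style (short syntax):
PSpace.add(:rain).add(:rain).add(:sun)
PSpace.pr(:rain)  # roughly 0.667

# Two independent spaces side by side:
weather = PSpace.new.add(:rain).add(:sun)
traffic = PSpace.new.add(:jam)
```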

As for the docs, lemme know in an issue or PR what's too hard to follow. I
tried to make it clear, but I welcome other eyes on it.

As for mathematics, perhaps Naive Bayes is a good place to start?
[http://suanpalm3.kmutnb.ac.th/teacher/FileDL/choochart82255418560.pdf](http://suanpalm3.kmutnb.ac.th/teacher/FileDL/choochart82255418560.pdf)

The Naive Bayes classifier is an example of something expressed entirely in
probabilities. So far I've only implemented a decision tree learner using
Prolly, but I'm planning to implement more things with it in the near future.
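To give a feel for it, here's a toy Naive Bayes classifier in plain Ruby (not using Prolly; all names here are made up for illustration). Everything is expressed as probabilities estimated from counts, with Laplace smoothing: a document is scored by log P(class) plus the sum of log P(word | class).

```ruby
# Toy Naive Bayes sketch (hypothetical, not Prolly's API).
# Train: count class frequencies and per-class word frequencies.
def train(examples)
  model = Hash.new { |h, k| h[k] = { count: 0, words: Hash.new(0), total: 0 } }
  examples.each do |words, label|
    m = model[label]
    m[:count] += 1
    words.each { |w| m[:words][w] += 1; m[:total] += 1 }
  end
  model
end

# Classify: pick the class maximizing log P(class) + sum log P(word | class),
# with Laplace (add-one) smoothing so unseen words don't zero out a class.
def classify(model, words)
  n = model.values.sum { |m| m[:count] }.to_f
  vocab = model.values.flat_map { |m| m[:words].keys }.uniq.size
  model.max_by do |_label, m|
    score = Math.log(m[:count] / n)
    words.each do |w|
      score += Math.log((m[:words][w] + 1.0) / (m[:total] + vocab))
    end
    score
  end.first
end

model = train([
  [%w[free money now], :spam],
  [%w[meeting at noon], :ham],
  [%w[free offer], :spam],
])
classify(model, %w[free money])  # => :spam
```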

~~~
meesterdude
I would definitely rather instantiate and build that way. I can dream up some
examples where I might want to work with more than one type of probability. It
would be good design to facilitate that, for sure.

I'm not up on the actual functionality - any reason you couldn't just work
with an arbitrary collection of objects? It would be nice to make an AR query,
wrap it in a Prolly object, and work that way.

So, I'm pondering. When/where would I want to rely on probabilities instead of
raw data? Everything I come up with seems like it would just work better with
actual data values than deriving probabilities. But, that might just be my
mathematical ignorance at play.

Cool gem, thanks for creating!

~~~
iamwil
Sure, I'll change it around to add the option of using an instantiated PSpace.

As for relying on probabilities, check out Naive Bayes. Hidden Markov Models
also rely on probabilities, rather than raw counts.

Thanks for the feedback!

------
brandonb
Super cool idea! I love that the interface is so simple and intuitive; I think
that makes machine learning accessible to a much longer tail of problems that
people would not previously have considered.

~~~
iamwil
Thanks! I tried a couple of times to write this and had some false starts. I
tried to make it less verbose, and this is as good as I could get it for now.

