

Show HN: Pykov – a tiny Python module on finite regular Markov chains - ricsca
https://github.com/riccardoscalco/Pykov

======
ayrx
Quite a few bad practices.

1\. Separate python 2 and python 3 modules? How can I be sure that they both
do the same thing?

2\. You have the tarball in your source control. Why?

3\. You have a 1.7k line file. Something tells me that it can be split up
into smaller modules.

4\. Commented out code? Why is it in there? If it's unused, remove it.

5\. No lines between methods, single line between classes. Please have a look
at PEP8.

6\. You have that .pysparse file in source control. Why?

7\. You say that Pykov depends on Scipy and Numpy but you don't specify the
dependency in your setup.py file. Anyone who installs the package from PyPI
is going to have a broken package.

~~~
ricsca
All good points, thanks for pointing them out. I'll consider each one
carefully.

------
Latty
I'd note that `Vector.sort()` is an awkward choice of name for a function that
returns a result rather than mutating the object in place, given that the
standard library tends to use infinitive verbs for in-place functions, while
past-tense verbs return a new value with the modification applied.

e.g. `list.sort()` is in-place, `sorted(list)` returns a new list.
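
A minimal sketch of that convention, using only built-in lists (Pykov's
`Vector` class is not reproduced here, and the variable names are just for
illustration):

```python
values = [3, 1, 2]

# sorted() returns a new list and leaves the original untouched.
ordered = sorted(values)

# list.sort() reorders the list in place and returns None,
# so assigning its result is a common mistake.
working_copy = values.copy()
returned = working_copy.sort()
```

A method named `sort()` that behaves like `sorted()` would invert what
readers of the second call expect.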

~~~
shoyer
True, but who really wants the mutating version? Worse, do we really want to
make it more obvious/natural to modify in place? I think this is a standard
library convention that is best forgotten. In pandas, for example, you need to
write inplace=True to use the mutating versions of methods, which hopefully
makes mutation awkward enough to discourage it.

~~~
andreasvc
More often than not, I want the in-place version. When dealing with a
significant amount of data, it's better not to make unnecessary copies. What
is your argument for wanting to discourage mutation? Either way, the argument
of the parent comment is clear: it's best to follow established naming
conventions.

~~~
shoyer
This Stack Overflow discussion summarizes the pros of immutability reasonably
well:
http://stackoverflow.com/questions/1863515/pros-cons-of-immutability-vs-mutability

The main reason is that mutable objects are more complex and thus harder to
reason about (which leads to more bugs).
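
A small sketch of the kind of surprise I mean. The two function names are
made up for this example and have nothing to do with Pykov or pandas:

```python
def normalize_in_place(weights):
    """Scale weights so they sum to 1, overwriting the caller's list."""
    total = sum(weights)
    for i, w in enumerate(weights):
        weights[i] = w / total
    return weights


def normalized(weights):
    """Return a new list of weights scaled to sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


raw = [2.0, 6.0]

# The copying version leaves `raw` alone.
probs = normalized(raw)
raw_after_copying_call = list(raw)

# The in-place version changes `raw` through any name that refers to it.
alias = raw
normalize_in_place(alias)
```

After the last call `raw` no longer holds the original counts, even though
the code never mentions `raw` on that line.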

------
piqufoh
I'm currently choosing whether to go with an existing MC library, or roll my
own. Has anyone used this library? How is it working out?

Reading the source sets a couple of alarm bells ringing -

entirely separate and independent py2k py3k versions?

entire methods just commented out?

and, as Latty points out, wandering from established naming conventions...

~~~
ricsca
I coded this library during my PhD, where I applied Markov chains to the study
of protein folding dynamics. I am quite confident in it, but as far as I
know few others have used it. At that time, I used pykov with networks of
around 100k nodes and 500k links. Only recently did I replace pysparse with
scipy sparse solvers. Entire methods are commented out because they are
heuristics that do not belong to standard Markov theory, and they can be
removed.

------
mrcactu5
How does this differ from PyMC? How is this tiny?

I would be interested in a Python Markov chain library since I use them a lot.

~~~
ricsca
Pykov implements the computation of some of the most common quantities related
to discrete-time finite regular Markov chains, namely: steady state, mean
first passage times and absorbing times. The calculations are performed by
means of the analytic formulas described in Kemeny & Snell, and the steady
state is derived with the inverse iteration method. If you are interested in
random walkers, pykov offers a handy way to generate them and evaluate their
probability. Sorry, I do not know PyMC well enough to say whether the above
quantities can also be calculated with it.
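
To give a rough idea of what "steady state" means here, the sketch below runs
plain power iteration on a two-state chain. It is not Pykov's API and it is
not the inverse iteration method mentioned above; the function name
`steady_state`, the dict-of-dicts layout and the example chain are all
assumptions made for illustration.

```python
def steady_state(P, tol=1e-12, max_iter=10_000):
    """Approximate the distribution pi satisfying pi = pi P.

    P maps each state to a dict {next_state: probability}; every row is
    assumed to sum to 1 and the chain is assumed to be regular.
    """
    states = list(P)
    pi = {s: 1.0 / len(states) for s in states}  # start from uniform
    for _ in range(max_iter):
        # One step of the chain applied to the current distribution.
        nxt = {s: 0.0 for s in states}
        for s, row in P.items():
            for t, p in row.items():
                nxt[t] += pi[s] * p
        # Stop once the distribution no longer changes noticeably.
        if max(abs(nxt[s] - pi[s]) for s in states) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical two-state chain.
P = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.5, "B": 0.5},
}
pi = steady_state(P)
```

For this chain the balance condition 0.1 * pi[A] = 0.5 * pi[B] gives
pi[A] = 5/6 and pi[B] = 1/6, which the iteration approaches.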

