
Pareto Front Sort for Python - kummappp
https://github.com/kummahiih/pypareto
======
kummappp
I did not find pareto front sorting library so I made one. It took a weekend.
Save your weekend and use my module instead.

~~~
anonymoushn
In the first example, why does (None, 0, 1) dominate (0, 0, 0)? The latter has
a higher value in the first position.

~~~
kummappp
Oh, it is a bug. I'll fix it. How embarrassing and thanks. I have mainly used
the None means the whole row is inferior -thing.

------
phijFTW
Generally, I would also recommend using pymoo [1] if you have multi-objective
optimization problems. They also have an implementation of Non-dominated
Sorting [2], which should serve the same purpose.

[1]: [https://pymoo.org/](https://pymoo.org/)

[2]: [https://github.com/msu-
coinlab/pymoo/blob/master/pymoo/util/...](https://github.com/msu-
coinlab/pymoo/blob/master/pymoo/util/nds/fast_non_dominated_sort.py)

~~~
kummappp
Thanks. The 'Non-dominated sorting' was the magic phrase for google i was
missing.

------
anamexis
Can you explain, for the uninitiated, what Pareto front searching is?

~~~
hansvm
The general idea is that if you have multiple things you care about
(objectives) it's generally not possible to optimize them all at once (e.g. in
computer science you often can't minimize runtime and memory usage in the same
algorithm). However, there are some objectively bad answers. If you have an
algorithm running in O(n) time and O(n) space (assuming no other objectives of
interest), one running in O(n) time and O(n^2) space is worse.

The Pareto front is a formalization of that idea. It's a collection of all
"non-dominated" points. A point is dominated if there is some other point
that's at least as good in all dimensions and better in at least one. A point
is "non-dominated" if there are no points that dominate it.

Going back to the [O(n), O(n)] example, this would not be dominated by, e.g.,
an [O(1), O(n^2)] algorithm since even though the time complexity is better,
the space complexity is worse. Out of the three examples I've given, those two
would be in the pareto front.

I haven't read through the whole library yet, but it looks like the way this
works is it finds the pareto front (all non-empty, finite collections have a
non-empty sub-collection of non-dominated points), finds the pareto front of
the remainder, and so on till it's exhausted the input. Your output is a
collection of "waves" of the original data.

Edit: I used algorithmic complexity just because it would be serve as a
familiar example to Hacker News. In general you just need a comparison
function in each dimension (and often you'll make a choice for each dimension
as to whether you care about maximums or minimums -- they're directly
interchangeable, so that's just for end user convenience). E.g., you might
want to maximize crop yield and minimize input costs, or in a dimensionality
reduction algorithm you might want to maximize separation while minimizing how
different your reduction is from the original data by some metric. It's an
extremely general purpose paradigm that sidesteps some of the issues that
arise when you have multiple things you care about that can't be cleanly
converted into a single cost function (all the hyperparameters in a machine
learning model come to mind as well). You still need a mechanism for choosing
things from the Pareto front, and there are whole fields of study in how
people interact with a Pareto solver, but being able to restrict your inputs
to _just_ the Pareto front is still a big win.

~~~
anamexis
That is very helpful, thank you!

------
cochleari_major
For low dimensions, it might be useful to look at the convex properties of the
Pareto front. If a point P is on Pareto front, it is not in the interior of
the convex hull of undominated points. In two dimensions, one can compute the
convex hull of N points in O(N log N) time. This typically allows for faster
Pareto front computation, but not in the worst case.

~~~
contravariant
Yeah I'm not sure this works. As far as I can tell the Pareto front doesn't
have to be convex and can therefore contain points that _are_ in the interior
of the convex hull.

Of course if you accept convex combinations of options then the pareto front
is part of the convex hull.

