A Python library for writing distributed self-replicating programs (networkgenomics.com)
252 points by jedieaston 25 days ago | 39 comments

I'm curious: what's the use case for this?

Naïvely, it looks like a platform for a virus-like application.

Reading the GitHub issues and PRs, it looks like a procedural competitor for Ansible (which is declarative).

The use-case is mainly to get rid of the inherent bottleneck that Ansible has, where you have to execute everything from a single manager node. It also contains tons of optimizations at the networking level, etc.

It makes sense if you've got 1000s of nodes you're managing with Ansible, and you need to increase efficiency. If you've ever tried running an Ansible playbook against even 100 hosts, you'll know it can be painfully slow even with SSH pipelining. Mitogen makes that much much faster. AFAIK Operon[1] is just a better version of that (with the possibility of personalized support contracts, etc).

Disclaimer/Source: I'm friends with the author of Mitogen/Operon.

[1] https://networkgenomics.com/operon/

That's not an inherent bottleneck in Ansible. https://docs.ansible.com/ansible/latest/cli/ansible-pull.htm...

There are other benefits of using this automation, namely that it removes manual application of changes.

The problem with ansible-pull is that you can no longer use it to coordinate changes across machines, which is very useful for deployments, for example.

What's preventing that?

I mean sure, just copying Ansible to the target machine and running it locally 'fixes' things, but at what point should it still be considered Ansible? You lose so much with -local, at least:

- centralized vault, target nodes need access to secrets

- "serial:" no longer available, you can't do rolling upgrades any more

- "hostvars" no longer available, you can't query facts from other machines

- no centralized trigger mechanism, status or failure reporting


Put another way, you could claim ansible-local fixes things for the same reason you could claim bash is network-aware agentless automation because you can copy shell scripts between machines.
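To make the "serial:" point concrete: a rolling upgrade needs one place that knows the whole host list, walks it in batches, and stops when a batch fails. Below is a minimal Python sketch of that idea. It is not how Ansible implements it; `rolling_update`, the `upgrade` callback, and the host names are hypothetical and exist only for illustration.

```python
from typing import Callable, Iterable, List


def batches(hosts: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of `hosts`, each at most `size` long."""
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def rolling_update(hosts: List[str], size: int,
                   upgrade: Callable[[str], bool]) -> List[str]:
    """Upgrade hosts batch by batch; stop after the first batch with a failure.

    Returns the hosts that were upgraded successfully. `upgrade` is a
    hypothetical callback standing in for the real per-host work.
    """
    done: List[str] = []
    for batch in batches(hosts, size):
        results = {host: upgrade(host) for host in batch}
        done.extend(host for host, ok in results.items() if ok)
        if not all(results.values()):
            break  # leave the remaining hosts untouched
    return done


if __name__ == "__main__":
    fleet = [f"web{i}" for i in range(1, 7)]
    # Pretend web4 fails its upgrade: web5 and web6 are never touched.
    print(rolling_update(fleet, 2, lambda host: host != "web4"))
```

Each node running the playbook locally has no view of the other nodes' results, so there is nowhere for the `break` to live.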

Time to add “skilled in network-aware agentless automation” to my resume!

I also found this, which agrees with you:


“Due to its origins for use in managing potentially damaged infrastructure, the remote machine need not even have free disk space or a writeable filesystem.”

I first thought of an AI given the task of maintaining its own existence. But it’s probably meant more as a CM like ansible. That’s a little closer to reality than sci-fi.

Still a useful trait for a self-sustaining AI.

Imagine a robot living in the cloud and paying its bills by offering services like fixing stuck nodes of your infrastructure.

This, from https://news.ycombinator.com/item?id=5397797, posted in 2013 comes very close:

> I did this in my machine learning class. I started by simply coding up requirements for numerical functions (in the form of test cases), then set up a PHP script that would Google each function based on the keywords in my comments, and try to run any code on the resulting links (in a sandbox) against the requirements, seeing if it worked heuristically. Usually one of the top 5-10 pages of results would have code that worked, though of course this is because I commented with the right key words to begin with.

> With a little recognition of code markup and trying different combinations of variables it did remarkably well: by my senior year of college it was pulling about $3,000 per month in consulting fees off of Odesk. It never accepted jobs worth more than about $50, nor did it ever get more than 3 stars out of 5 mostly due to non-working code, however it was considered highly timely and a great communicator.

> I realized that people were using it to save themselves Googling. I wondered what would happen if it went a step further and simply both included Google results, and divided out projects by their paragraphs (i.e. simply submit a paragraph of a large project as though it were a small independent project), and if clarifications were requested, send the other paragraphs.

> This actually let it outsource $200 Odesk projects to Elance as a handful of $20 projects, and by the grace of God somehow still managed to swing 3 stars.

> To be fair, it was mostly mediating, and mixing in Google results. I included a hill-climbing algorithm to optimize reviews and revenues, based on all the magic variables I had in the code, such as the number of Google results to include.

> This was really, really stupid of me.

> At first, I just noticed that it had actually decided to completely stop not only writing code (ever) but even so much as do a Google search!

> It would only mediate and quote verbatim, like some kind of non-technical manager.

> Okay, whatever. To me this didn't make much sense, as Google queries are free. It was only when I noticed that the whole script was running on the free VPS server I had a backup on that things clicked! Of course, on Amazon it uses resources. The free VPS server didn't let it reach external sites like google properly, but it could still save money by simply mediating emails and doing nothing else.

> By now I had started moving on to doing my own consulting work, but I never disabled the hill-climbing algorithm. I'd closed and forgotten about the Amazon account, had no idea what the password to the free vps was anymore, and simply appreciated the free money.

> But there was a time bomb. That hill climbing algorithm would fudge variables left and right. To avoid local maxima, it would sometimes try something very different.

> One day it decided to stop paying me.

> Its reviews did not suffer. Its balance increased.

> So it said, great change, let's keep it. It now has over $28,000 of my money, is not answering my mail, and we have been locked in an equity battle over the past 18 months.

> The worst part is that I still have to clean up all its answers to protect our reputation. Who's running who anyway?

Edit: The first answer by bo1024 is gold as well:

> Don't feel bad, you just fell into one of the common traps for first-timers in strong AI/ML. I know some lawyers in Silicon Valley with experience in this sort of thing, and they say that usually by now the code has rewritten itself so many times that the original creator can't even claim partial ownership; the best thing to do is generally to cut contact, change your name and move on. Look on the bright side -- your algorithm is probably now leading a happy and productive life trading for Goldman Sachs.

One possible use case is distributing tests across multiple machines. The pytest plugin commonly used for this, pytest-xdist, currently relies on a package called execnet for the cross-machine bootstrapping. But the author of execnet is stopping development now that Mitogen is a thing: https://twitter.com/ossronny/status/1013373649259257856

It’s actually a drop-in engine for Ansible originally. See author’s blog: https://sweetness.hmmz.org/

Author here :) The system the library was intended for has never been described. The intention was that Ansible had strong parity with what the library offered, so that made a useful POC integration to get it into working shape.

Ahh, amazing work. Your original blog posts describing the workings of Mitogen and the pitfalls you've had to avoid were incredibly interesting. Thank you so much both for the project, and for the fascinating documentation of the development process.

I have not looked at the code yet, but judging by the philosophy of constraints laid out in the manual, I would say you did a very, very good job! Zero dependencies, support for old Python, support for egg-like apps... I'm really amazed, because the current trend is zealots who only want to use and support the latest versions and pull in whatever shitty dependency makes their life easier. But in real professional dev life, it is bricks as good as yours that we need!

>The system the library was intended for has never been described.

That's interesting - could you, perhaps, describe it a little?

Could you compare mitogen to docker swarm? Also, I haven't seen it mentioned in the docs, does mitogen set up a python environment on target machines? Or does it assume target can run python?

We update over 90 servers three times a month. This library lets us do that in around 40 seconds.

There was a talk given at KVM Forum where someone was using this library to test deeply nested VMs:


Thanks for the link! I'd spoken to Marc but I didn't realize he'd presented on it, very cool

A possible use case would be network agents.

Agent TCL provides a good overview of the ideas behind it.


Aside/philosophy: Do you think all life can be considered a virus?

Further aside: Viruses themselves may not even be life!

B can be A, without A being B ;)

I had this exact example in mind tho FWIW. It is an ongoing research topic whether or not viruses are life, but the analogies/similarity are overall worth noting.

Maybe on a biosphere level, or even on a more micro ecosphere level, all sentient systems really exhibit the same properties as a virus in the local context, for that system, even if entities in that system do not have the consciousness to be able to make that determination. Certainly our effect on planet Earth since the Industrial Era should make this analysis worth attempting!

Can I interpret it like this?

human 'is' set; human set 'has' man | woman; man 'is' human but not human 'is' man

virus 'is' set; virus set 'has' life; life 'is' virus but not virus 'is' life

Viruses (the specific scientific concept and their instances in the world) are not alive by most mainstream scientific descriptions. This is mainly because they lack genetic code to implement metabolism.

So they're kind of like an abstract base class of life that is missing an implementation of metabolize()
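Taking the analogy literally, here is a small Python sketch using the standard `abc` module. The class names are made up for illustration: a class that never supplies `metabolize()` stays abstract and cannot be instantiated on its own.

```python
from abc import ABC, abstractmethod


class Replicator(ABC):
    """Hypothetical base class: anything that copies itself."""

    @abstractmethod
    def metabolize(self) -> str:
        """Turn nutrients into energy; subclasses must provide this."""

    def replicate(self) -> "Replicator":
        # Make a fresh instance of whatever concrete class we are.
        return type(self)()


class Virus(Replicator):
    # No metabolize() here, so Virus remains abstract.
    pass


class Bacterium(Replicator):
    def metabolize(self) -> str:
        return "glucose -> ATP"


if __name__ == "__main__":
    print(Bacterium().replicate().metabolize())
    try:
        Virus()
    except TypeError as exc:
        print("cannot instantiate:", exc)
```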

More like an abstract method or protocol or prototype that one has to implement for any life base class.

But probably a virus object just has a subset of the base class.

Of course, if you have a class system that allows multiple inheritance, then it will be one of the crucial base classes.

Under lisp ...

Let us stop.

In this particular case it is not really a virus: “Mitogen interacts with many aspects of the operating system, threading, SSH, sudo, sockets, TTYs, shell, Python runtime, and timing and ordering uncertainty introduced through interaction with the network, GIL and OS scheduling.”

It is more like a networked python program calling shell ;-).

The problem is that a virus is usually written in assembler or bare-metal C, with minimal dependence on any particular aspect of the host environment. All viruses are small. A subset.

All life is virus+. The issue at stake is that you have to find a way to reproduce more or less on your own, with no remote manager and so on. But just a virus? Possibly not. Biologically, it cannot even reproduce without a host.

Mentioned a few times previously on HN. Most comments were for a mention 2 years ago: https://news.ycombinator.com/item?id=15355144

Looks great!

This is how we’ll actually get AGI. Creating the necessary pre-conditions for a quasi-simulacrum of life, and just letting it rip. Beautiful work, I’m excited to dig into this further, a little later tonight.

This + incentives (hint hint, a little crypto-wallet with some cash) could really set the stage to create “Agents” (agents in the sense of independent, self running processes with their own cash balances, trading goods & services for new incomes, much like corporations which only exist in the context of humans managing day to day operations).

I’d write more but instead I’m going to dig into the docs + source code.

How do you see this specifically creating AGI?

How would you calculate the loss on a reinforcement learning model that's autonomous?


Running entity A performs an action at time t_0 that costs n_{t_0} currency. The counterparty's response to this action creates a cascading tree of potential new actions, each of which requires an individual “reconciliation” (a new action, which in turn requires a new response), defined by the probability distribution of potential responses->new actions. We either know these distributions a priori based on our initial conditions, or we can create them based on an initialization function.

The net present value of the expected action to these new responses can be evaluated with respect to the NPV of the current holdings of the running entity’s portfolio, and that difference can be treated as the loss function.
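One way to read the NPV-difference idea, sketched in Python. Everything here is an assumption made for illustration: the response distribution is a discrete list of (probability, cash flows) pairs, the discount rate is fixed, and the names `npv` and `action_loss` are invented. A negative loss means the action is expected to leave the entity better off than holding still.

```python
from typing import List, Tuple


def npv(cash_flows: List[float], rate: float) -> float:
    """Net present value of per-period cash flows, starting at period 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def action_loss(current_holdings: List[float],
                action_cost: float,
                responses: List[Tuple[float, List[float]]],
                rate: float) -> float:
    """Loss = NPV of current holdings minus expected net NPV of acting.

    `responses` holds (probability, resulting cash flows) pairs, an assumed
    discrete stand-in for the response distribution; the probabilities are
    expected to sum to 1.
    """
    expected = sum(p * npv(flows, rate) for p, flows in responses)
    return npv(current_holdings, rate) - (expected - action_cost)


if __name__ == "__main__":
    # Hold 100 today; spend 10 on an action that pays 55 next period
    # half of the time and nothing otherwise.
    loss = action_loss(
        current_holdings=[100.0],
        action_cost=10.0,
        responses=[(0.5, [100.0, 55.0]), (0.5, [100.0, 0.0])],
        rate=0.1,
    )
    print(round(loss, 2))
```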

I’m not an ML researcher, so I apologize if my lack of terminology makes this sound stupid to you but that’s my initial thinking.

Feel free to email me if you’d like to discuss further, I’ve been tangentially working in this area for a while but this really gets my sparkplugs going.

I don't think trying to shove this model into a gradient-descent framework makes the most sense here. I'm an ML (industry) researcher and I highly doubt that AGI will be achieved with gradient descent on neural networks alone. Those may play a small role somewhere in the stack but the orchestration and reasoning will be managed by something else entirely. Neural networks today are fancy MLE machines -- nowhere close to reasoning machines, which require an "understanding" (whatever that means in this context) of dynamics with respect to the environment.

Seems more appropriate to start with a population of agents who reproduce at a rate proportional to their recent rewards, and allow them to die off at a rate inversely proportional to the same, a la a continuously-evolving genetic algorithm setup. You may have to modify the reward function to disincentivize behaviors which cause systemwide collapse, but that goes without saying.
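A rough Python sketch of that setup, under stated assumptions: each agent carries a single illustrative `gene`, birth probability is proportional to its latest reward, death probability is inversely proportional to it, and a `capacity` cap (my addition, not part of the comment above) keeps the population bounded. All names and scale constants are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    gene: float           # single heritable parameter (illustrative)
    reward: float = 0.0   # most recent reward


def step(population: List[Agent],
         reward_fn: Callable[[float], float],
         rng: random.Random,
         birth_scale: float = 0.5,
         death_scale: float = 0.05,
         mutation: float = 0.1,
         capacity: int = 1000) -> List[Agent]:
    """Advance the population by one generation.

    Birth probability is proportional to reward and death probability is
    inversely proportional to it; both are clipped to at most 1.
    """
    next_gen: List[Agent] = []
    for agent in population:
        agent.reward = max(reward_fn(agent.gene), 1e-6)  # keep reward positive
        p_birth = min(1.0, birth_scale * agent.reward)
        p_death = min(1.0, death_scale / agent.reward)
        if rng.random() >= p_death:        # parent survives
            next_gen.append(agent)
        if rng.random() < p_birth:         # parent leaves a mutated child
            child_gene = agent.gene + rng.gauss(0.0, mutation)
            next_gen.append(Agent(gene=child_gene))
    if len(next_gen) > capacity:           # assumed carrying capacity
        next_gen = rng.sample(next_gen, capacity)
    return next_gen


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [Agent(gene=rng.uniform(-1.0, 1.0)) for _ in range(50)]
    # Toy reward: highest when the gene is close to 1.0.
    reward = lambda g: max(0.0, 1.0 - abs(g - 1.0))
    for _ in range(30):
        pop = step(pop, reward, rng, capacity=200)
    print(len(pop))
```

Penalizing behaviors that cause systemwide collapse would go into `reward_fn`; the sketch leaves that part out.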

> You may have to modify the reward function to disincentivize behaviors which cause systemwide collapse, but that goes without saying.

Hopefully humans will figure this out one day too.

I have a lot of network devices that have various python binaries. This could be really cool for running agents on them.

It's basically a worm. It could be used to randomly try to brute-force SSH servers that allow password logins, thanks to an embedded dictionary, then propagate to the new host and start again, leaving behind a message in each newly started terminal shell.
