Show HN: My friend's project to simulate an entire C. Elegans (davidad.org)
147 points by SlyShy 1448 days ago | 42 comments

While interesting (especially from a funding perspective!), most of these approaches aren't quite as ground-breaking as one might think. Neuroscientists in academia are doing most of these things. For optogenetics, see Deisseroth's work; for in vivo Ca2+ imaging in nematodes, refer to the work of Bargmann as well as the various Whitesides collaborations involving microfluidics; pan-neuronal in vivo imaging is currently being pioneered by Engert at Harvard and a couple of Janelia Farm labs.

These are massive efforts, and involve horrendous heaps of diligent busywork. This makes me the boring naysayer, but please don't be distracted by the startup-like appearance and the peculiar financing situation. It's possible but unlikely that the major obstacle here is simply the combination of available techniques!

Honestly, what I'm most curious about are his thoughts on model-driven interrogation of an in vivo system -- biologists, and even computational neuroscientists, are a bit too hesitant when it comes to letting computers find and test hypotheses. In the age of highly advanced genetic techniques (e.g., binary expression systems in Drosophila or zebrafish) and 2-photon imaging, the process of actually evaluating hypotheses has become a bit old-fashioned...

I like to describe my techniques as "not novel, just new." I'm absolutely building on the work of many others in academia. I've spent time with Engert at Harvard, Rex Kerr at Janelia, and some of Deisseroth's students, among others (including Boyden, Ramanathan, Manuel Zimmer in Vienna, Aravi Samuel, George Church, and more). I've only briefly met Bargmann and I loved the advice she gave me: "Hmm. That's probably not going to work, but...it's really worth trying."

Unfortunately, most academic scientists are not in a position to risk their careers on such a bold and encompassing proposal. However, I benefit from this in some ways, because academics who are interested in this problem often work on a small piece of it instead of going for the whole thing, and are then incentivized to share that piece with me to integrate with all the other pieces, so they can see their contribution realize its full potential.

I wouldn't underestimate the technical demands of the project - it certainly would have been unthinkable 10 years ago. Sydney Brenner once famously wrote: "Progress in science depends on new techniques, new discoveries, and new ideas, probably in that order." However, you're right to point out that it's really the abstract methodology of delegating experimentation to a machine which truly distinguishes my technical proposal from related work. I have to admit that I haven't worked out in detail what the math will look like, though <http://arxiv.org/abs/1103.5708> is a pretty good start. The tricky bit is defining the probability space (that is, the family of models under consideration). As a probability space, it must have a measurable structure. But for efficient and effective inference, it should also have additional structure, like a vector space. Yet any particular choice of vector space representation will trade off dimensionality against non-convexity, so it's desirable to have multiple representations of the same space. I've been taking a "cross that bridge when I come to it" approach, as I'm occasionally reminded by academics that if all I manage to do is collect a bunch of time series data about hundreds of neurons simultaneously, that would still probably be scientifically interesting and novel, even with ordinary, human-driven analyses.
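
To make "delegating experimentation to a machine" concrete, here is a minimal sketch of the simplest possible case: a finite family of candidate models, a Bayesian update over them, and a rule that picks the next experiment by expected information gain. This is only an illustration under stated assumptions, not the method proposed above (the comment itself says the math is not yet worked out). The three models, the two stimulation names, and all response probabilities are hypothetical toy values.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def posterior(prior, likelihoods):
    """Bayes' rule over a finite model family: prior * likelihood, renormalized."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

def expected_information_gain(prior, p_response):
    """Expected entropy reduction from one experiment with a binary outcome.

    p_response[i] is the probability, under model i, that the neuron responds.
    """
    gain = entropy(prior)
    for outcome in (1, 0):
        lik = [p if outcome == 1 else 1 - p for p in p_response]
        p_outcome = sum(pr * l for pr, l in zip(prior, lik))
        if p_outcome > 0:
            gain -= p_outcome * entropy(posterior(prior, lik))
    return gain

def choose_experiment(prior, experiments):
    """Pick the experiment whose outcome is expected to be most informative."""
    return max(experiments,
               key=lambda name: expected_information_gain(prior, experiments[name]))

# Hypothetical toy setup: three candidate models, two possible stimulations.
PRIOR = [1 / 3, 1 / 3, 1 / 3]
EXPERIMENTS = {
    "stimulate_A": [0.9, 0.9, 0.9],  # all models predict the same: uninformative
    "stimulate_B": [0.9, 0.5, 0.1],  # models disagree: informative
}

best = choose_experiment(PRIOR, EXPERIMENTS)
# Suppose the neuron responded (outcome 1); update beliefs over the models.
updated = posterior(PRIOR, EXPERIMENTS[best])
```

The finite list of models sidesteps exactly the hard part described above: a realistic model family would be a continuous, structured space, and choosing its representation is the open problem.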

I think it's fair to say that the hope is to combine state-of-the-art technologies to discover more, scientifically, than has been known before.

Additionally, this pushes technology. The hope is to push the technology far enough that not only do the components work, but you can build actually useful engineered systems with them.

David started at 14, not after 20 like most grad students. At minimum, he will end up making something like Mathematica, as Wolfram did; at most, he will change the world.

The Big and Little Oh of this story are both extreme events.

That's an awfully high "minimum". I knew several "geniuses" at that age (might even qualify myself, though admittedly I was only three years ahead of par at that age); most of them have gone on to fairly normal (though by no means unsuccessful) careers, and a couple burned out quite spectacularly. I don't think putting that weight of expectation on is helpful.

Indeed, you remind me of Solon's warning. Perhaps the better way to put it is that he's attacking one heck of a problem, and with the right detachment from both industry and academia, so the minimum is based more on the quality of the problem than on the individual (as unique as that is in this particular case).

But yes, being an entrepreneur is an added layer, as you have to learn to organize people toward a larger goal than just research.

And he shows signs of that even on the jobs page: while avoiding the broken method of interviewing that is so often practiced, he's almost defining a boundary that selects for relatively high signal from applicants.

His fluid approach is reminiscent of the caper that the Google guys pulled at Stanford... and as it turns out, I think he also got funded by Larry.

I love the internship application page http://nemaload.davidad.org/jobs. The author does not expect any candidate to be perfect before meeting, but sets out a small set of topics for any potential candidate to research.

Probably in the process of researching any of these topics a potential intern would gain a good understanding of the upcoming internship.

Sometimes I wish job interviews were laid out like this.

Sounds really interesting. You could try to secure additional funding for this project via Microryza (http://www.microryza.com/), which is a kickstarter for scientific research.

Side note, I used to do work with CAD systems for simulating/building gene circuits. I can't wait until we move from open tools (Tinkercell, Biobricks, Genome Compiler, etc.) to fully-modeled open platforms for organisms. Imagine having the latest build for E. coli or C. elegans, or a package manager to organize them, at your fingertips so that you can hack away at new designs.

It makes you wonder, with so many people independently studying and creating observed and simulated data, are there centralized places to aggregate and share these?

How is "Genome Compiler" an open tool? I hate those guys; it's not even an actual compiler, or open. If they wanted to do something useful, then yeah, I would shoot for that package management concept.

Hmm - it seems those guys are not what I remembered/thought they were. I think they were somehow tied to the open bio standards project which is what I was thinking of.

Oh definitely, they pump out lots of confusing/wrong marketing all the time. I can see how they would get into that sort of list. Edit: you have a phd in pirate science?

I do not think funding is David's problem :-).

This isn't directly related, but he enrolled in an MIT graduate program at the age of 14? That's incredible.

I would love to see how this project progresses. It has fascinating implications for artificial intelligence and the singularity.

...and was fired from it.

This is not actually true.

The about page [1] is very interesting. "3. Providing a foundation for uploading research .... If it can be done for a worm, the next steps are to attempt a zebrafish, then a fruit fly, then a honeybee, then a mouse, then a dog, then a macaque monkey, then a chimpanzee, and, ultimately, a human." -- that really fascinates me.

Isn't that awesome? But ... "the philosophical assumptions fail, and human immortality through uploading is fundamentally impossible".

Could anybody explain this a little bit?

[1] http://nemaload.davidad.org/about

Do be careful when you get to lobsters.

Charles Stross' book Accelerando includes a short story about uploaded lobster brain-scans defecting from the KGB. And they don't like humans. :)


The jump from honeybee to mouse is enormous - much larger than any of the other jumps except chimpanzee to human.

Maybe try lizard in between them.

I'm tempted to point out that reptiles are so distinct from either mammals or insects that they wouldn't be a natural stepping stone. But some of my esteemed colleagues advocate jumping from zebrafish to mouse, and I just realized they're probably thinking along similar lines: why try to figure out insects along the way from fish to mammals?

Personally, though, I'm limiting my scope to the worm (at least for now!). The particular path which is best to take beyond that is much less clear.

The rest are a bit out of order too... you'd probably want to try Drosophila before zebrafish. But yeah, the leap to mouse is pretty huge. You could potentially try to do Xenopus tropicalis (frog) before mouse, but I'm not sure how well that is mapped as a neural model.

The link in the sidebar offers a reasonable elaboration: http://nemaload.davidad.org/about/philosophy

I'm curious, is the goal to prove one or few models wrong? I think while it is extremely sexy, his explanation kind of answers the question for him in a cheesy way - "depends on which lens you look at the problem with".

Doing a nematode-version Turing test is one thing. You could even do it as a nematode in a Chinese box. But isn't there a flaw in comparing how a real nematode's brain reacts to light with how a Nemaload function reacts to a specific computed wavelength of light? I think I'm not understanding something.

> Isn't that awesome? But ... "the philosophical assumptions fail, and human immortality through uploading is fundamentally impossible".

To boil it down: They assume roughly that simulating the state and functionality of the neurons is sufficient to reproduce consciousness.

If there's something more to consciousness - say for example that consciousness requires the specific organisation of matter of the human brain - then uploading, at least into software, will fail.

Unless the software models those bits too and models them well enough.

Interesting... although a similar project already exists (http://www.openworm.org/), and is, from a cursory glance, much more developed.

Doesn't seem to be the same thing. OpenWorm is a computer simulation, while the OP plans to build a model based on observing an actual living individual [1].

[1] http://nemaload.davidad.org/about

That's about right. My end product is also a computer simulation, as far as that goes, but I'm taking an obsessively data-driven approach, motivated by the advent of new technologies for single-neuron measurement and perturbation in living, behaving animals. Meanwhile, OpenWorm is taking a bottom-up approach, driven by physics and other first principles. We'll meet in the middle eventually.

yo david, long time no talk.. did you ever get a chance to make use of this data set?


It's a few thousand hours of videos of the worms under different stimulation conditions for observing/identifying common behaviors.

We miss you in IRC.

The videos and statistics, as well as a description of the setup used for the recordings can be found here:


BTW, what you're attempting blew my mind. That's a nice hack involving many different disciplines; it will be amazing if you can pull this off.

So it's not just 'simulate', it's more like 'virtualise'. They want to examine neurons in a real worm in real time and then model that. You know all those sci-fi books where people upload their minds into software? This worm is going to get there first. Hope they give it somewhere nice to virtually live.

Oooh Exciting stuff! I've been following openworm off and on for a while now. The projects should be extremely complementary. I hope they communicate.

(This project is going to collect a totally new dataset)

Yes, I know Stephen Larson and some of the other OpenWorm guys. In fact, I'm even a member of their organization on GitHub <https://github.com/openworm?tab=members>. Once both projects are a bit more developed, I expect ample opportunities for substantive cooperation; but for now, I'm focusing on data collection challenges, while they're focusing on simulations, visualizations, and data management.

I like the "optional" hands-on challenges he listed for applicants in the jobs page. It may end up filtering out some talented candidates (not wanting to put in this much effort for one application), but it will certainly call out to the tinkerers and hobbyists who would do those tasks for fun anyway.

The challenges are great because it's not just another meaningless programming test. Instead it's "week 1" of the internship, where you learn the background necessary to start work on the project. And even if you don't get the position, all those tasks are useful for a grad student in neuroscience or bioinformatics.

This has been out for a while. Would be interesting to see the next steps.

Yeah, I haven't updated the site in a while. (This really wasn't the best time for a Hacker News blitz. I was just telling my friend that Nemaload hadn't been on HN yet, as far as I knew, and he simply decided to fix that pronto.) But there have been plenty of developments since November (when I think I last updated the site). I have access to a few candidate strains for pan-neuronal calcium imaging (though the genetic engineering is still an ongoing optimization process) and have collected a couple of large datasets, one on a light-sheet microscope and one on a light-field microscope. Much of the data is useless, and we're still working on the algorithms to extract signals from it, as well as the infrastructure for opening it up for distribution. But there will certainly be more publicly visible activity soon (June at the latest).

Oops, didn't mean to link to the HTTPS-restricted area. <http://nemaload.davidad.org/news/2013-developments>, but it's basically just what I wrote above, as suggested (-:
