
Deep learning sharpens views of cells and genes - lainon
https://www.nature.com/articles/d41586-018-00004-w
======
klmr
The “DeepVariant” example exemplifies the problem with deep learning in
biology: some exceptions (when actual imaging is involved) notwithstanding, we
still haven’t found a good use for it. Most problems that we can formulate in
biology are solvable with linear classifiers: the models are purely additive.
The nonlinearity of deep learning offers no benefit.
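To make the "purely additive" point concrete, here's a toy sketch (my own illustration, not from the article; effect sizes and noise level are made up): if a trait really is additive in genotype dosage, plain least squares already recovers the effects, so there's nothing left for a nonlinear model to capture.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_snps = 2000, 5
true_effects = np.array([0.5, -0.3, 0.0, 0.8, 0.1])  # assumed effect sizes

# Genotypes coded as allele dosage 0/1/2, the standard additive encoding.
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
y = X @ true_effects + rng.normal(scale=0.1, size=n_samples)

# Ordinary least squares: the "purely additive" linear model.
est, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(est, 2))  # close to true_effects
```

A deep network fit to the same data can at best match this, at vastly higher cost.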

The article says that

> In tests, DeepVariant performed at least as well as conventional tools.

Which is technically correct but highly misleading. A more honest way of
saying this would be:

> DeepVariant performs _no better than_ conventional tools, and given the
> nature of the problem there is no reason to expect further boosts in the
> future from this method.

This is pretty much what the rest of the field thinks, anyway (example of this
view in [1]).

From a pure science point of view, DeepVariant is interesting: it applies a
new technique to an old problem and shows that it works. This alone is
exciting (to me personally at least). But in practical terms it’s useless; it
does no better than existing methods and is far more complex and _orders of
magnitude less efficient_.

[1] https://www.forbes.com/sites/stevensalzberg/2017/12/11/no-googles-new-ai-cant-build-your-genome-sequence/#7b636c6c5774

~~~
EpicEng
> This is pretty much what the rest of the field thinks, anyway (example of
> this view in [1]).

Yup. I'm not a CV engineer myself, but I work in between our CV folks and pure
software engineering. I've worked on automatic labeling and classification of
circulating tumor cells (focusing primarily on the imaging system(s)) for the
last five years. Every experienced CV person I've worked with or spoken with
dismisses DL out of hand for what we're doing. The less experienced folk jump
right to a DL solution without understanding the problem domain.

~~~
xyhopguy
The biggest complaint about DeepVariant is that it isn't even a CV problem. I
mean you start with a bunch of images, but those get processed and turned into
basecalls, which get aligned to the genome, and then turned into an image
again (why though?) for DeepVariant. Interesting approach to say the least.
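Roughly, that last re-encoding step looks like this (a toy sketch of my own; the intensity values are made up, and the real DeepVariant pileup has several channels for base, quality, strand, etc.): aligned read bases around a candidate site become a 2-D "image" with one row per read and one pixel per position.

```python
import numpy as np

# Hypothetical base-to-intensity mapping; "." marks positions with no read.
BASE_INTENSITY = {"A": 250, "C": 30, "G": 180, "T": 100, ".": 0}

reads = [  # made-up reads aligned over a 7-bp window
    "ACGTACG",
    ".CGTTCG",
    "ACGTACG",
    "..GTTC.",
]

# Reads x positions grid of pixel intensities, ready for a CNN.
pileup = np.array([[BASE_INTENSITY[b] for b in read] for read in reads],
                  dtype=np.uint8)
print(pileup.shape)  # (4, 7)
```

So the CNN never sees the original sequencer images, only this synthetic rendering of data that was already summarized statistically.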

------
aaavl2821
The advances in this field will probably come from people like the biologists
quoted at the end of the article, who are using these techniques to improve
workflows and find new ways to explore biology

The Google stuff, as others have said, seems...useless. DeepVariant performs
as well as, but no better than, conventional methods. As far as being able
to tell someone's age or smoking status or blood pressure, that is already
done pretty well by...asking people, looking at birth certificates or using a
blood pressure monitor

A few years ago, google publicized its ability to detect macular degeneration
or some disease better than humans using deep learning. But it was only
marginally better, not enough to change clinical decisions. And to actually
implement that tech would be almost impossible given existing healthcare
workflows, treatments and economics. The ability to predict heart attack from
eye images is cool in theory, but they probably can't actually do that yet
technically with a good enough specificity, and how would you get eye images
on a regular enough basis for it to be useful?
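The specificity worry is easy to see with a back-of-envelope calculation (numbers here are hypothetical, not from the Google work): at low prevalence, even a fairly specific screen flags mostly false positives.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive screens that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed: 1% of screened people are actually at risk, test is 80% sensitive
# and 95% specific.
ppv = positive_predictive_value(sensitivity=0.80, specificity=0.95,
                                prevalence=0.01)
print(round(ppv, 3))  # 0.139: ~86% of flagged patients are false alarms
```

That's before you even get to the workflow problem of collecting the images.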

~~~
voidifremoved
> how would you get eye images on a regular enough basis for it to be useful

Could they be collected from all those devices with front-facing cameras
people spend their lives staring at?

~~~
aaavl2821
I don't know enough about those cameras to know, but not sure if they'd be
able to get retinal images of sufficient quality. My guess would be they can't
but I dunno

------
xyhopguy
even nature is hyping the verily paper now?

It's like I'm taking crazy pills. Variant calling was already very very good
and grounded in statistics. Why ON EARTH do we need to convert it to AN IMAGE
and give it 1000x more compute? The gains are like 1% too. That's barely above
the error rate of sequencing.

~~~
klmr
> The gains are like 1% too

If that. I’m sceptical (and so are others). Given that all methods are
imprecise, it’s not terribly hard to find a special case on which method X
outperforms all the others. But on average?

------
daemonk
Isn't this the software that turns alignments into images to run through an
image analysis pipeline? They just jerryrigged alignment data onto an image
analysis system?

Does it offer speed/memory advantages over traditional variant calling?

~~~
xyhopguy
It's worse than that. It turns imaging data (from the Illumina basecaller) into
sequences, aligns them, turns them back into images (nucleotides go into the
same channel for some reason), THEN uses 100x more compute to get a
questionable 1% gain on indels.

Why.

------
amelius
Disappointed by the lack of images.

~~~
lainon
the cited paper "Building a 3D Integrated Cell"
(https://www.biorxiv.org/content/biorxiv/early/2017/12/21/238378.full.pdf)
has images

