
Classifying White Blood Cells with Deep Learning - sethbannon
https://blog.athelas.com/classifying-white-blood-cells-with-convolutional-neural-networks-2ca6da239331?hn2
======
itchyjunk
So is this 98% on stained images from the data set? With my limited knowledge,
I thought CNNs were data hungry. 98% on ~400 images seems fairly impressive,
but I wonder how well it will perform on unseen images.
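
To put a number on that worry, here is a minimal sketch of how uncertain an accuracy figure is when the test set is small. It computes a Wilson score interval using only the standard library. The split is an assumption for illustration (80 held-out images, 78 correct); the article's actual test-set size is not stated here.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95% confidence)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half

# Hypothetical split: 80 held-out images, 78 classified correctly (97.5%).
low, high = wilson_interval(78, 80)
print(f"accuracy 95% interval: {low:.3f} to {high:.3f}")
```

With these assumed numbers the interval spans roughly 91% to 99%, so a headline accuracy measured on a few dozen images leaves a lot of room either way.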

Tangentially, there seems to be little news on new open source data sets for
anything. I saw Google put out a few different ones some time back. Do
research companies only care about advancing techniques so they can use their
private data sets to reap the benefits of all that public research? Or is
there genuinely no way to create data sets? For example, is it financially or
technologically impossible to make an "ImageNet" of cells? (Or maybe a lot of
data sets are coming out and I am just unaware of them.)

~~~
devl82
It is very expensive and time-consuming to create a large, properly labeled
dataset of cell images. In general you need >2 pathologists to confirm the
cell types (they disagree sometimes and you usually take the majority vote);
this almost never happens with cat images ;) Also, there is a multitude of
acquisition modalities for image capture in microscopy, different stains for
the same types of cell, etc., and simple RGB cameras are actually considered
fairly low tech for this kind of work. PS: I am no deep learning expert (I use
more 'traditional' ML), but as you pointed out, ~400 images for these
techniques can be a recipe for overfitting.
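
The majority-vote labeling step mentioned above can be sketched roughly as follows. The cell IDs, label names, and the rule of flagging images without a strict majority for review are all assumptions for illustration, not a description of any particular lab's protocol.

```python
from collections import Counter

def majority_label(votes):
    """Return the label picked by a strict majority of annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

# Hypothetical annotations: three pathologists label each cell image.
annotations = {
    "cell_001": ["neutrophil", "neutrophil", "neutrophil"],
    "cell_002": ["lymphocyte", "monocyte", "lymphocyte"],
    "cell_003": ["eosinophil", "basophil", "neutrophil"],
}

consensus = {cell: majority_label(v) for cell, v in annotations.items()}
# Images with no strict majority go back for another opinion.
needs_review = [cell for cell, label in consensus.items() if label is None]
print(consensus)
print("needs review:", needs_review)
```

Every image costs several expert opinions, and some still end up without a usable label, which is part of why large cell datasets are expensive.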

~~~
itchyjunk
Would something like training on images of rat cells and then transfer
learning to human cells be worthwhile? The author of the article tried it with
ImageNet and it didn't work out, but I wonder about the viability of that
technique with non-human cells.

~~~
devl82
Well, the same principle as with VGGNet applies here too. If the rat images
differ 'significantly' (whatever that means; I am a research engineer, not a
pathologist) then you will have nothing to transfer. Maybe it would be more
fruitful to try to transfer via a huge number of artificially generated cell
images (there are toolboxes for that, and it's not just linear transformations
like rotation etc.) blended with some subset of VGGNet (or similar) trained
only on 'circular' objects...
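
As a toy illustration of artificially generated, roughly circular images, here is a minimal standard-library sketch. It is not one of the toolboxes referred to above; the function name `synthetic_cell`, the image size, and the intensity values are all made up for the example. The outline radius varies with angle, so the shape change is not a plain rotation or scaling.

```python
import math
import random

def synthetic_cell(size=32, rng=None):
    """Return a size x size grayscale image (rows of floats in [0, 1]) of one blob."""
    rng = rng or random.Random()
    cx = rng.uniform(0.4, 0.6) * size
    cy = rng.uniform(0.4, 0.6) * size
    radius = rng.uniform(0.2, 0.3) * size
    lobes = rng.randint(2, 5)          # number of bumps on the outline
    wobble = rng.uniform(0.0, 0.15)    # relative bump height
    phase = rng.uniform(0, 2 * math.pi)
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            dx, dy = x - cx, y - cy
            angle = math.atan2(dy, dx)
            # Boundary radius depends on the angle: a non-linear shape change.
            edge = radius * (1 + wobble * math.sin(lobes * angle + phase))
            inside = math.hypot(dx, dy) <= edge
            value = (0.8 if inside else 0.1) + rng.gauss(0, 0.03)
            row.append(min(1.0, max(0.0, value)))
        image.append(row)
    return image

rng = random.Random(0)  # fixed seed so the toy dataset is reproducible
dataset = [synthetic_cell(rng=rng) for _ in range(100)]
print(len(dataset), len(dataset[0]), len(dataset[0][0]))
```

Real synthetic-cell generators model texture, staining, and optics far more carefully; this only shows the shape of the idea, which is pretraining data that costs nothing to label.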

------
minimaxir
As an off-topic aside, there has been _very_ suspicious behavior with this
submission. It has been submitted _at least 6 times in 3 days_ by mostly the
same people
([https://news.ycombinator.com/from?site=athelas.com](https://news.ycombinator.com/from?site=athelas.com)),
and I know the OP of this submission submitted it earlier today and deleted it
after it didn't get upvotes.

Don't do that. At the least, it isn't worth it for just a blog post.

~~~
dang
Yes, deleting stories like that isn't allowed (see
[https://news.ycombinator.com/newsfaq.html](https://news.ycombinator.com/newsfaq.html))
and we penalize accounts that do it. Deletion is for things that shouldn't
have been submitted in the first place.

This particular submission is one that a moderator saw (independently) and put
in the second-chance queue described at
[https://news.ycombinator.com/item?id=11662380](https://news.ycombinator.com/item?id=11662380).
We might not have done that if we'd seen the previous ones, but as long as the
community is interested in the story it can stay up now.

