Ok there are some warning signs here. First, bagged decision trees are a little ...

NamTaf · on Dec 12, 2016

It sounds like the OP is scanning for internal defects in bonds via impurities being trapped in there. These occur relatively randomly and there's some balancing point where it's just not worth trying to make the production line cleaner vs binning parts that fail some QA criteria. I do similar things with castings, where you simply just get certain voids and porosity in the steel when cast and either you can spend a tonne of money trying to eliminate them or you can spend less money testing the parts and binning those that aren't up to par.

I'd hazard a guess that the 95% is the reduction in how many parts made it through the first test only to be caught later at the more expensive stage. So instead of binning 100 parts a month at that second stage, they now bin 5 parts a month and catch way more early on.

It sounds like the OP is using ML to identify flaws that simply just occur due to imperfections in the manufacturing process. That's life, it happens. You can know that they will occur without necessarily being able to prevent them because maybe there's some dust or other particulates in the air that deposit into the resin occasionally, or maybe the resin begins to cure and leaves hard spots that form bond flaws. There's heaps of possible reasons. It sounds more like the ML is doing classification of 'this is too much of a flaw in a local zone' vs 'this has some flaws but it's still good enough to pass', which is how we operate with casting defects. For example, castings have these things called SCRATA comparator plates, where you literally look at an 'example' tactile plate, look at your cast item, then mentally decide on a purely qualitative basis which grade of plate it matches. Here [1] are some bad black and white photos of the plates.

[1] http://www.iron-foundry.com/ASTM-A802-steel-castings-surface...

altshiftprtscrn · on Dec 13, 2016

This is pretty spot on. We know why the defects happen and why they cause downstream test failures, but we lack the ability to prevent (all of) them.

To clarify on that 95% value because it is admittedly really vague: That's actually a 95% correct prediction rate. So far we get ~2.5% false-positives and ~2.5% false-negatives. 2.5% of the parts evaluated will be incorrectly allowed to continue and will subsequently fail downstream testing (no big deal). More importantly, 2.5% of parts evaluated will be wrongly identified as scrap by the model and tossed, but this still works out to be a massive cost savings because a lot of expensive material/labor is committed to the device before the downstream test.

andrewchambers · on Dec 13, 2016

I hope you get a decent chunk of those cost savings as a reward for your effort, great job.

mattmanser · on Dec 13, 2016

D'Angelo Barksdale: Nigga, please. The man who invented them things? Just some sad-ass down at the basement at McDonald's, thinkin' up some shit to make some money for the real players.
Malik 'Poot' Carr: Naw, man, that ain't right.
D'Angelo Barksdale: Fuck "right." It ain't about right, it's about money. Now you think Ronald McDonald gonna go down in that basement and say, "Hey, Mista Nugget, you the bomb. We sellin' chicken faster than you can tear the bone out. So I'm gonna write my clowny-ass name on this fat-ass check for you"?
Wallace: Shit.
D'Angelo Barksdale: Man, the nigga who invented them things still workin' in the basement for regular wage, thinkin' up some shit to make the fries taste better or some shit like that. Believe.
[pause]

Wallace: Still had the idea, though.

agopaul · on Dec 13, 2016

Can't believe I'm seeing the Wire referenced on HN

slmyers · on Dec 13, 2016

Oh, a Wire reference on HN... my life is one step closer to completion.

taejo · on Dec 13, 2016

> 2.5% of parts evaluated will be wrongly identified as scrap by the model and tossed

2.5% of what, though? If only 1 in a million parts are actually bad, you're still tossing many more good parts than bad parts.

altshiftprtscrn · on Dec 14, 2016

Correct - as mentioned, the cost savings still work out. The defect rate is around 30% (nowhere close to 1 in a million).
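
The base-rate point in this exchange can be checked with a little arithmetic. The sketch below is illustrative only: it assumes both 2.5% figures are fractions of all parts evaluated (which is how the wording above reads), and the function name, batch sizes, and the zero miss rate used for the 1-in-a-million case are hypothetical choices, not anything stated in the thread.

```python
def scrap_breakdown(n_parts, defect_rate, wrongly_scrapped, wrongly_passed):
    """Count outcomes for n_parts, treating both error rates as fractions
    of ALL parts evaluated (an assumption about the thread's wording)."""
    bad = n_parts * defect_rate
    good_scrapped = n_parts * wrongly_scrapped  # good parts tossed by the model
    bad_passed = n_parts * wrongly_passed       # bad parts that fail downstream
    bad_scrapped = bad - bad_passed             # bad parts caught early
    return {
        "good_scrapped": round(good_scrapped),
        "bad_scrapped": round(bad_scrapped),
        "bad_passed": round(bad_passed),
        # Share of everything scrapped that really was defective.
        "scrap_that_was_bad": bad_scrapped / (bad_scrapped + good_scrapped),
    }


# Figures reported in the thread: ~30% defect rate, ~2.5% each way.
reported = scrap_breakdown(1_000, 0.30, 0.025, 0.025)

# The 1-in-a-million hypothetical, with the (assumed) best case that the
# model never misses a bad part.
hypothetical = scrap_breakdown(1_000_000, 1e-6, 0.025, 0.0)

print(reported)
print(hypothetical)
```

Under these assumptions, a batch of 1,000 parts at a 30% defect rate has 275 bad parts and 25 good parts scrapped, so roughly 11 of every 12 scrapped parts really were defective. At a 1-in-a-million defect rate the same 2.5% would mean about 25,000 good parts tossed for every bad one, which is the scenario the question was probing.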