
A 0.07-Second Power Problem at Toshiba Chip Plant May Affect Digital Device Availability - PanMan
http://spectrum.ieee.org/riskfactor/semiconductors/memory/a-007second-power-problem-at-toshiba-chipplant-may-affect-digital-device-availability-and-shortterm-higher-prices?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+IeeeSpectrum+%28IEEE+Spectrum%29
======
mechanical_fish
There is something weird about this story. The glitch happened at 5am on
December 8. They got everything up and running at 100% capacity by 15:00 on
December 10. That's 2.5 days, at most. 2.5 days isn't 20% of two months, so
why are they losing 20% of shipments over two months?
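
To make the mismatch concrete, here is a rough back-of-the-envelope check. The year and the 60-day figure for "two months" are assumptions made only for illustration; the timestamps are the ones quoted above.

```python
from datetime import datetime

# The year is an assumption; only the gap between the two timestamps matters.
glitch = datetime(2010, 12, 8, 5, 0)       # 5am on December 8
recovered = datetime(2010, 12, 10, 15, 0)  # 15:00 on December 10

downtime_days = (recovered - glitch).total_seconds() / 86400

two_months_days = 60  # assumed round figure for "two months"

# Share of the two-month window that the plant was actually down.
downtime_share = downtime_days / two_months_days

# How many days of output a 20% shortfall over two months corresponds to.
equivalent_lost_days = 0.20 * two_months_days

print(f"downtime: {downtime_days:.2f} days")
print(f"downtime as share of two months: {downtime_share:.1%}")
print(f"20% of two months: {equivalent_lost_days:.0f} days of output")
```

Under these assumptions the outage covers roughly 4% of the window, while the reported shortfall is worth about 12 days of output, so downtime alone cannot account for it.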

The article says it's because "wafers in process were ruined". But in a fab,
at any given time, most of the wafers are sitting in carriers waiting for
robots to move them in and out of the tools. It's easy to see how a power
glitch might kill wafers that are sitting in an etcher being etched, but not
so easy to see how it might kill wafers that are sitting on a conveyor belt.
It might _stop_ the belt, maybe even for days, but then don't you just fall
behind by a few days?

Could it be a post-bottleneck disaster? Cartoon example: You spend six months
making wafers, put the six-month supply into a big box, then drop the box. But
that would be quite odd. Engineers don't tend to let nearly-finished wafers
pile up in (metaphorical) big boxes and then process them in huge batches...
precisely because such a practice makes you vulnerable to a tiny glitch like
this one.

Not that I disbelieve the story or anything. The number of crazy things that
can go wrong in a fab is truly unbounded. And, though I once worked in a fab,
that was a few years ago and it wasn't a flash RAM fab, so maybe I just don't
appreciate the range of possible disasters. But I wish we had more details.

~~~
sparky
They could mean that wafer-starts-per-day were back up to their normal levels
within 2.5 days, but the pipeline bubbles that result from destroyed wafers
anywhere in the process almost certainly won't be filled for another month.

It's easy to believe that the numbers cited in the article are fudged slightly
-- fabs tend to be tight-lipped -- but there are also several stages in the
process where many wafers could get screwed up at once, e.g., most ion implant
machines process wafers in big batches. From a vulnerability point of view,
you're right that it's foolish to do so, but you need to balance satisfying
Little's Law with the cost and space of equipment -- it'd just be too
expensive most of the time to buy enough single-wafer implant machines to
match the rate of other, much faster machines.
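
As a rough sketch of why ruined work-in-process hurts more than the downtime itself: Little's Law says WIP = throughput x cycle time, so losing a fraction f of the WIP costs about f x cycle time days of output. The start rate, the cycle time, and the ruined fraction below are all hypothetical numbers picked for illustration, not figures from the article.

```python
def wip_from_littles_law(starts_per_day: float, cycle_time_days: float) -> float:
    """Little's Law: average work-in-process = throughput * cycle time."""
    return starts_per_day * cycle_time_days


def lost_output_days(fraction_ruined: float, cycle_time_days: float) -> float:
    """Days of output lost when a fraction of the WIP is scrapped.

    ruined wafers / throughput = (f * rate * W) / rate = f * W,
    so the result does not depend on the start rate.
    """
    return fraction_ruined * cycle_time_days


# All three values are hypothetical, chosen only to illustrate the arithmetic.
starts_per_day = 1000.0   # wafer starts per day
cycle_time_days = 60.0    # time for one wafer to traverse the whole line
fraction_ruined = 0.20    # share of in-process wafers scrapped by the glitch

wip = wip_from_littles_law(starts_per_day, cycle_time_days)
ruined = fraction_ruined * wip
days_lost = lost_output_days(fraction_ruined, cycle_time_days)

print(f"WIP in the line: {wip:.0f} wafers")
print(f"wafers ruined: {ruined:.0f}")
print(f"output lost: {days_lost:.0f} days' worth")
```

With a long cycle time, scrapping even a modest slice of the line costs many days of shipments, and that gap only shows up as the damaged lots would have reached the end of the pipeline.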

------
patio11
Working at a Japanese company in that region, where all engineers are heavily
influenced by the manufacturing culture, was enormously educational. We
literally have reliability down to a science, because if you don't, "small"
mistakes wreck you.

(Sadly, nobody reports when "Worldwide economy saved _again_ today after
emergency line shutdown at key Toyota plant averted by high school trained
production laborer, using Stats 101 and a checklist.")

~~~
krschultz
I _love_ lean manufacturing, kanbans, statistical process control, all of that
stuff. When I first learned about all of the ways Japanese manufacturing had
innovated it was akin to when I first learned about lambdas and functional
programming, except on the mechanical engineering side of my brain.

~~~
tomjen3
I have read about lean startups, but I have never read about the
manufacturing side of things. Could you recommend a good book on the subject
for a fellow HN user?

~~~
count
I think the classic in this field is 'The Toyota Way':
<http://www.amazon.com/Toyota-Way-Jeffrey-Liker/dp/0071392319>

