“Has something ever failed in the field due to a bug in a CAD tool?”

sammcf · on April 30, 2016

I had a tremendously embarrassing failure in a sheet metal part designed in Solidworks due to a bug in the way bend allowance is calculated. The long story short is that the flat pattern developed by SW was about 10% dimensionally inaccurate which resulted in a rotating part binding up in the field and going blammo. This was years and years ago as a fresh young designer, so I reported the issue to my reseller and the response was essentially "yeah, you're gonna have to take that into account." I've since learned that this sort of idiosyncratic difference in calculating certain values, particularly things like bend radii and bend allowance, things like beam extensions on structural frames, etc, is par for the course for CAD packages. Pretty much every engineer I know has learnt this lesson the hard way.

I guess it's not really a bug per se, just hidden behaviour that you have to learn about and take into consideration.

kazinator · on April 30, 2016

> "yeah, you're gonna have to take that into account."

What? But the software was supposed to take that into account, which is why it it supports bend allowance calculations in it!

(I have a tiny bit of experience with this; I designed some sheet metal thing. I somehow clued into bend allowances before doing it, even though it was my first and only one! I figured, you can't just draw the naive shape; the metal has a bend radius. How the hell does that work? [half a day of Googling] Oh. I used a 2D drafting program that had no clue about this, so I manually drew the pattern with the allowance in it already, based on the thickness of the aluminum.)

sparky_z · on April 30, 2016

> But the software was supposed to take that into account, which is why it it supports bend allowance calculations in it!

Structural engineers aren't allowed to treat software as a black box and trust that it works correctly. They're legally responsible for choosing their tools and verifying that they work correctly.

alkonaut · on April 30, 2016

I think think this varies very much between different "cultures", such as different countries and different fields.

I make a structural engineering program which is used by pretty senior engineering types in Scandinavia and Germany, who can certainly be trusted to know that with some nonsense input, the output may also be nonsense.

The requirements for the UK are completely different - they want the program to be used not as a tool (like a fancy calculator) but as a simple black box that a novice engineer or even a non-engineer can use.

kazinator · on April 30, 2016

I.e. the "that" in "that into account" means the tool being possibly wrong.

There is no, "sorry the TV blew up in your face; my scope was out of calibration". :)

rdl · on April 30, 2016

Yet another example that software engineering isn't.

chris_wot · on April 30, 2016

Uh, I don't think there is a better example of closed source than CAD packages.

afuchs · on April 30, 2016

I don't think he was talking about verifying the software itself. Even if the software was open source, would it have been formally verified to make sure it has zero bugs? It is better to check the calculations by hand to make sure that no critical errors make it into the final design.

chris_wot · on April 30, 2016

If you are checking the calculations by hand, then it seems to me you are verifying the software. It's not a bad thing to do, but I wonder why it's considered acceptable when the results don't match your own? If your results are correct, then you have a malfunctioning tool...

Wouldn't it be helpful to be able to look at the code to see what it is actually doing behind the scenes?

joesmo · on April 30, 2016

Not if you can't code, which most engineers cannot (nothing wrong with that). In fact, I'd say without a team of highly trained software engineers and a whole lot of time and money, having the code for a CAD program is almost useless. In civil engineering, it's my understanding that it's standard procedure to check everything CAD tools output. If a bug leads to failure in the field, at least one person, but probably multiple people, has failed horribly and he will rightfully be held responsible.

azernik · on April 30, 2016

There are many problems where finding a solution is much harder than verifying a solution.

masklinn · on April 30, 2016

A well known one used all the time in crypto being finding the prime factorisation of a number versus verifying that prime factorisation.

FeepingCreature · on April 30, 2016

So verify the solution you give me!

azernik · on April 30, 2016

If the whole point if verification is that you don't trust your software, then that's not a very good idea. So you use the ultimate slow-but-reliable computer, the trained engineer.

IkmoIkmo · on April 30, 2016

I remember meeting an engineer over easter brunch a few years back and I asked him what his actual day looked like, you know day to day activities. Then someone started making a toast so he quickly summarised 'I make calculations, and then I add a giant margin of error that's totally unnecessary if our model was completely correct, just to make sure' haha.

rtpg · on April 30, 2016

I don't get this though...

Like if the software calculates bend radii that will break things, isn't that a bug? What's the explanation for that even happening? Is it just that the formula involved means that you can't always get a "right" answer, so the software will be like "well, this is the best we can do, figure out the rest"?

chris_wot · on April 30, 2016

So what are you gonna do? Move to another CAD package? Take a look at the reasonably large number of CAD packages listed on Wikioedua here:

https://en.m.wikipedia.org/wiki/Comparison_of_computer-aided...

Virtually all of them are proprietary. They are too expensive to move away from, and every one of them has flaws that firms workaround and build there workflows on. Retraining, proprietary file formats and the disruption of migrating an essential part of most businesses is prohibitive.

Frankly, IMO the CAD marketplace is an example of how closed source really bites an industry. Funnily enough, most engineers - who would never accept inaccurate tolerances on items like screws or bolts - seem to accept this is the way it is and despite the stories we read here let the situation continue.

I'm frankly amazed there aren't more failures in infrastructure from bugs in CAD software. That's more a testament to the fact that engineers are engineers though - when they realise there are problems, they dig in, figure out where the problem is and find a solution, and in the case of CAD software I really think they build or design their way around the problems.

Overall, CAD software is, despite all the flaws of being coded source, still pretty amazing and vendors do fix issues eventually. We can do so many more things in society because of it, but as we evolve as a society it becomes apparent there are limitations. Eventually CAD software vendors will be disrupted, but we're a long way from that point right now!

Niksko · on April 30, 2016

> Eventually CAD software vendors will be disrupted, but we're a long way from that point right now!

Not so sure we're that far off. Nobody really has much incentive to innovate in this space because it's so niche. If you're a believer in the version of the future where 3D printers are as commonplace as 2D printers are today, then this may all change.

Suddenly, 3D modelling skills are today's word processing skills (either for work or for fun), and there are a whole bunch of people working on coming up with innovative CAD solutions for the masses.

vlehto · on April 30, 2016

>Nobody really has much incentive to innovate in this space because it's so niche.

http://www.pwc.com/gx/en/industries/technology/publications/...

Siemens makes CAD packages. Dassault too. Autodesk does nothing but CAD packages. All in the top 20 of "software companies by revenue". PTC is number 42 and Bentley is 70 again CAD packages. It might be a niche, but it's pretty expensive niche then. Pure CAD companies Autodesk, PTC and Bentley made 2,5 billion dollars of revenue in 2014.

CompelTechnic · on April 30, 2016

I would argue that SolidWorks has been doing a good bit of innovation. The fact that its been stealing more and more market share over time speaks to it.

auxym · on April 30, 2016

Each new version of SW introduces new "features" (some genuinely useful, some shiny marketing gimmicks) and a host of new bugs. Software churn at it's best. Gotta release a new version to make money, just fixing bugs isn't profitable. Oh, and make sure file formats are not backwards compatible, even if for no good reason but to force upgrades.

Pamar · on April 30, 2016

Your analogy would work better if wordprocessor vendors had been disrupted by open source alternatives that took the world by storm.

rtpg · on April 30, 2016

I mean sure, things have bugs, right? I get that.

Sometimes autocorrect messes up things I type. It's not really a bug per-se, more just a side-effect of how things like autocorrect work, right? But sometimes Chrome crashes, and that's a bug...

The OP seems to say that the bend radius thing is more of the first case than the second, and I'm wondering how true that is. I'm not really sure of the science behind it (maybe it's an impossible constraints problem).

chris_wot · on April 30, 2016

Oh, no doubt. The issue is that the tools don't appear to document their limitations and constraints, and there is often no way of knowing until you get bitten.

At least with OSS you can look at the code behind the program to work out how it calculates things like the bend radius.

I'm no engineer, so I had to look it up (Wikipedia was no help whatsoever on this occasion) but I located this page which gives the formula and an explanation:

"Calculating the correct flat pattern layout is crucial to getting a good quality finished part from your press brake. Yet, many CAD and CNC programmers have no idea how to calculate the required values. Years ago, the real experts created cheat sheets and tacked them to the wall. They only taught the new apprentice how to apply the results shown on the cheat sheet, not how to calculate the numbers. Well, now those experts have retired and it's time for a new generation to learn the right way to do the calculate the correct flat pattern layout."

http://www.sheetmetalguy.com/bend-allowance.htm

Now admittedly the guy does put the responsibility on the operator, but it appears CAD packages can indeed calculate this for you. So long as the right info was input into the program, the formula should be programmed correctly and the operator should be able to have confidence in the results!

sammcf · on April 30, 2016

>Now admittedly the guy does put the responsibility on the operator, but it appears CAD packages can indeed calculate this for you. So long as the right info was input into the program, the formula should be programmed correctly and the operator should be able to have confidence in the results!

"Should" is the operative term. I can tell you with that having hand checked the automated results of several CAD packages at developing sheet metal flat patterns and finding unexplained variability and errors too many times to count that if you want confidence in the dimensional accuracy of a sheet metal part, you need to develop it by hand. Especially for anything over 5mm in sheet thickness. As far as I'm concerned you really can't engage in sheet metal design without being able to manually verify the development results, so in a sense it is up to the designer regardless of the capabilities of the CAD software.

To be fair, I'm picking on one of the hardest physical manufacturing processes to actually get right - even CNC brake presses have a great amount of variance in calibration (e.g. the amount of bend allowance you need to make for various sheet thicknesses...) across different suppliers, materials, etc to produce physically identical parts. The valuable lesson for a fresh designer is don't trust your 3d representation of the part, don't trust the manufacturer specs on ANYTHING and always prototype. I'm guessing this is why people tolerate it - the world with 3d CAD is still a vastly nicer place to be an engineer than otherwise, so we just rationalise away the shitty parts as a reasonable price to pay.

chris_wot · on April 30, 2016

I'm guessing this is why people tolerate it - the world with 3d CAD is still a vastly nicer place to be an engineer than otherwise, so we just rationalise away the shitty parts as a reasonable price to pay.

Yeah, I know. In my experience, engineers are normally immensely pragmatic people. My dad was an engineer, so I have a lot of respect for the profession. It would just be nice if CAD would be more open.

rjsw · on April 30, 2016

There is a common file file format for CAD as well as other areas of engineering [1], most proprietary CAD tools support it. I am the editor of a small part of it.

[1] https://en.wikipedia.org/wiki/ISO_10303

colomon · on April 30, 2016

That said, in the real world there are plenty of people who don't want to use it, for whatever reason. I've got customers who insist that they'd rather use a not-yet-written reverse-engineered Parasolid X_T exporter than a well-tested STEP exporter that's been around a decade.

rjsw · on April 30, 2016

We also get people coming along every few years who argue that because there are some people not using STEP we should start again from scratch. They are very enthusiastic for a while until they realize how much work is involved and they give up, having wasted a fair bit of time in standards meetings.

I think there is scope for some disruption from small companies willing to use STEP as the internal data model for their software but just working on small parts of the picture.

coldtea · on April 30, 2016

>Like if the software calculates bend radii that will break things, isn't that a bug?

Well, it's not like all companies fixes their bugs or care about them. If, as the parent says, all CAD packages have such issue, what you gonna do?

chris_wot · on April 30, 2016

The difference in LibreOffice not rendering an EMF correctly and a bug that causes wrong dimensions on a drawing in a CAD tool is that LibreOffice gives you a bad looking document and a CAD program costs you money when the material starts to be cut up and it all had to be thrown away as scrap.

Mostly I can fix the LibreOffice bug and make the document render correctly (unless it physically corrupts the document when saving), once physical material gets an error it's often impossible to fix.

SixSigma · on April 30, 2016

It is a formula but you need to know the physical characteristics of your metal.

malka · on April 30, 2016

Or it's on purpose. This way you can sell training courses and each user is locked on it's cad implementation due to the cost of retraining on another tool

paulyg · on April 30, 2016

What all the non mechanical engineers don't realize is the bend allowance is different for types of material (steel, aluminum), grade or strength (high tensile steel will stretch less than mild carbon steel), and even the tool (mandrill vs brake bender). Software like Solidworks and Inventor come with default values. You need to know to go in and change them based on your circumstances.

cm2187 · on April 30, 2016

I don't know how expensive is the stuff you were making but if it was just about cutting metal it can't be that expensive. Isn't testing samples the 101 of engineering or have we gone 100% trusting CAD now? It looks to me that there are so many things that can go wrong in real life like deformations because of temperature, different calibrations on different models of machine, different composition of the material used, etc...

sammcf · on April 30, 2016

Metal isn't that expensive I guess but still, several hundred dollars and a turnaround time increase of a week+ minimum isn't always on the cards for every project. Sure, if I had unlimited time and budget, I'd physically prototype and field test everything. Client demands tend to take precedence. Part of being a good engineer is balancing those concerns, learning which parts of the design and manufacturing process can be trusted implicitly and which assumptions must be rigorously tested - suffice to say I've been a lot more stringent about sheet metal design in the years since.

PhasmaFelis · on April 30, 2016

> I reported the issue to my reseller and the response was essentially "yeah, you're gonna have to take that into account."

> I guess it's not really a bug per se, just hidden behaviour that you have to learn about and take into consideration.

I'm not that kind of engineer, so I may be missing something, but it sounds like the reseller snowed you into blaming yourself for their faulty software.

I get what others have said, that a structural engineer is responsible for the things they produce and for doing whatever is necessary to verify correctness, but that doesn't mean that software engineers get to pass off responsibility for the things they produce.

sammcf · on April 30, 2016

I mean, it's a really easy issue to work around (if occasionally time consuming), most people who use the software are familiar with the issue, it's not enough to make me want to change CAD package given the massive sunk costs one rapidly accumulates using a particular package...I guess there's no real incentive for the software guys to devote time to fixing this when they could be working on new features or whatever.

In a better world, I'd love for bug fixing, or even just polishing workflows for existing features, to be as high of a priority for CAD companies as new feature. However, only one of those things is seen as growing the install base...

malka · on April 30, 2016

Usually there is a lire in the eula for this purpose. I have no idea how this would plan in front of à judge though

kazinator · on April 30, 2016

The answer is a resounding YES, from me right here. Wait, did you say in the field? Maybe not exactly then.

I routed a two-layer PCB (printed circuit board) using a buggy program whose DRC (design rule check) did not catch the fact that a plated through-hole meant went right through an unrelated track in the bottom layer.

Yes, and so the couple of boards I had fabbed failed. Now that is not in the field. But it's not hard to imagine that such a thing could escape into the field; it's almost in the field.

I didn't catch that until I stuffed one board with all the components.

The workaround was to drill out that through hole slightly wider, removing the metal connecting the layers, but leaving plenty of the pad in the correct layer to which that hole belonged.

This was not a case of logic not implemented in the DRC, but a bug. I tested the issue by adding such mistakes on purpose, and DRC was flagging them properly.

In every other regard, the boards worked perfectly; the fix in the artwork is just to move a track a few millimeters over not to intersect with unrelated through-hole.

Of course, the responsibility was mine to check the artwork and not trust the tool. But then, if you can't trust the tool, what good is it? What is the value in a design rule check which fails, say 1% of the time. Does it still increase productivity? I believe that it does, during the development of the artwork. Any time the DRC catches something, it saves you time. It's just that at the very end when you have the thing fabricated, you have to make a couple of passes over it with your own eyes. You don't have to be that careful at absolutely every stage of the routing, because the DRC "has your back" (mostly).

It's the same with compilers. Just because they don't catch every error with a diagnostic doesn't mean diagnostics aren't useful. They are still useful even if they are buggy, in that certain cases of something that requires a diagnostic isn't diagnosed. But then who reads tens of thousands of lines of code to spot where some compiler failed to diagnose an invalid pointer conversion without a cast, you know?

akavel · on April 30, 2016

Not a straightforward answer, but this reminded me again of the "RISKS Digest", or "Forum On Risks to the Public in Computers and Related Systems", a long running and still active free e-periodical about various computer-related incidents to infrastructure in real world. Highly recommended!

https://groups.google.com/forum/#!forum/comp.risks

From https://en.wikipedia.org/wiki/RISKS_Digest:

"RISKS is concerned not merely with so-called security holes in software, but with unintended consequences and hazards stemming from the design (or lack thereof) of automated systems."

analog31 · on April 30, 2016

I suspect that on a percentage basis, the incidence of field failures due to CAD bugs are vastly outnumbered by field failures for other reasons.

Also, perhaps by culture, we don't trust CAD enough to let it get us into that kind of trouble, because we know that modeling always involves approximations and assumptions. Truth be told, I rarely see CAD used to determine critical design margins, without at least a sanity check.

ms013 · on April 30, 2016

Finite element methods are often used to validate that a structure meets its expected requirements, and are part of the CAD workflow. The Sleipner-A offshore platform failed and sank due to bad finite element analysis.

See: https://www.ima.umn.edu/%7Earnold/disasters/sleipner.html

Hondor · on April 30, 2016

That was the analyst making a mistake, not a bug in the software. They used poorly shaped or too few elements which lead to an underestimate of the stress. This has always been a well known limitation of the finite element method and any competent user will know about it and the techniques to protect against it.

Here's a slightly more detailed article

http://journals.library.mun.ca/ojs/index.php/prototype/artic...

sitkack · on April 30, 2016

Edit: Wasn't just a matter of not enough elements, but that the software couldn't handle the geometric configuration of the tri-cell. And that the computed values were extrapolated from the simulation, compounding the flaws in the simulation.

--

Seems like this could be avoided by automatically adding more elements and rerunning simulation, measure the gradient of the solution as the number of elements change. Exactly the kind of thing a computer should catch. And exactly the kind of thing having alternative analysis should prevent.

  * used an unverified model
  * trusted software, didn't double check results
  * low safety margin
  * incomplete risk model, lack of project wide oversight

auxym · on April 30, 2016

From my experience in the field, convergence studies are something everyone knows they should do, but never actually happen because results are always needed yesterday.

As for automating it, sure for a model like that with tens or hundreds of elements, it would take seconds on a modern computer, great. Except, model complexity always scales to reach the boundary of reasonable computation time on modern computers. This has to be a corollary to Moore's law or something. Recently I've been working on a model with tens of millions of elements. Takes 12-ish hours to solve a few load cases. A good convergence study might take weeks.

Thankfully, finite element solvers have gotten much better at tolerating bad element geometry, and at warning users for about geometry likely resulting in bad results. Plus things like NASTRAN's integrated extrapolation of stresses at element corners, instead of outputting only element stresses and having the analyst extrapolate. A colleague of mine ran some tests a few years ago at a customer's request, running a very coarse model with corner stress extrapolation enabled. Results were plenty good, well within normal engineering uncertainties and tolerances.

FEM was still pretty young in the early 90s and probably required very experienced/knowledgeable engineers to use it. As with most things, it's gotten more and more user frendly, and more and more software are taking very automated and behind the scenes approaches to FE (ANSYS Workbench, Solidworks Simulations, Simlab, etc). It has its good and bad sides.

sitkack · on April 30, 2016

> A good convergence study might take weeks.

How about a bulk simulation service for running HQ models in a couple different packages for a final check? I could imagine that being a regulatory requirement for big buildings, bridges, aircraft, etc.

sitkack · on May 2, 2016

Could the convergence study go the other direction, towards a coarser mesh to show that on isn't already operating at the limit?

jotux · on April 30, 2016

I designed a PCB once that had a pin on a regulator connected to a ground pour. In the CAD tool the pour extended to the pad between two traces but when I generated gerbers for the board the pour no longer connected to the pad. I didn't realize this until I got the board back. The bug was known and fixed in the next version but I didn't know about it at the time. Because of that board I now review all gerbers before I send them out for fab.

fsloth · on April 30, 2016

At least in construction to a large part the drawings are still considered the ground truth. They may be produced by CAD tools and autogenerated from 3D models. Generally they are read and re-read by various discipline specialists until anything gets built. So no one usually is just blindly trusting the output.

That said, mistakes happen in construction all the time. With or without cad. So, things failing in the field is an 'accepted' facet of the process - in that context it's probably very difficult to collect precise data which parts are to CAD errors and which are of different sort - too much noise.

That said, any engineering office worth their salt will enforce a single software version for a duration of a construction process. They all know there will be horrible bugs, but then they have a process to take them into account with expert users doing special configurations and vendors suggesting all sorts of workarounds. It's still far more efficient to do it like that rather than with pen and paper.

CAD is totally unlike this modern apps and ecosystems thing in so many ways.

cm2187 · on April 30, 2016

But in construction you typically keep huge safety margins, don't you (like in case someone left a dead body in the concrete!)?

fsloth · on April 30, 2016

(Note:I'm not personally in construction but my daily job involves implementing features for various CAD products.)

Sure, but a design error is still a design error. The capability to recover from design errors might be there but it does not mean unexpected flaws won't incur an overhead to costs and schedule.

jpollock · on April 30, 2016

My understanding is that the Therac-25 UI was essentially a CAD program that allowed the operator to design treatment programs.

The software failed to prevent the operator from entering invalid treatment programs which then killed patients.

I seem to remember that this was made worse by operators using a shortcut from a previous version which was safe because of some additional hardware, but the hardware lock was removed in the 25, and it killed people.

https://en.wikipedia.org/wiki/Therac-25

kazinator · on April 30, 2016

> Therac-25 UI was essentially a CAD program

Clever comment! But wait, is that really the Therac-25 you're thinking of? The Therac-25 was basically just a bug-trap of assembly language race conditions, that's all.

There was another similar case of a radiation machine in which the dose was supposed to take into account the shield. The user could draw the shield with a CAD-like plannning program. The users tried to draw a shield with a cutout, and the CAD display led them to believe that the shapes were being subtracted. (The filled path was drawn with the in-out rule or whatever.) The actual dose calculation though ignored this and added the negative area as a positive. Whoa, you have lots of shielding, crank up the radiation! Or something like that.

This is not actually the radiation machine but the treatment "planning" software. (It meets the definition of CAD: it's computer software, and it's assisting in the design of something: the treatment.)

Gee what was this? Perhaps that circa 2000 incident in Panama?

http://www.ncbi.nlm.nih.gov/pubmed/17199912

This is a very informative report with all the details, including diagrams:

http://www-pub.iaea.org/mtcd/publications/pdf/pub1114_scr.pd...

(See P. 22 for instance)

chris_wot · on April 30, 2016

You are seriously scaring the shit out of me. I'd be more scared, only my Uncle was in charge of a standards body in a certain nuclear research organisation several decades ago and a few of his old war stories would make your hair turn white.

Thankfully when I did work for that same organisation they had completely changed their culture, and things were considerably better, but still I suspect that if you explore the bush around the facility (which I doubt the police would allow...) you might see some unusual things.

jpollock · on April 30, 2016

Yeah, that's what it was!

goalieca · on April 29, 2016

Yes. Microprocessors. But you probably mean things like bridges and buildings.

HorizonXP · on April 30, 2016

Is that actually a bug in the CAD tool though? I always thought microcode bugs were design issues.

Unless you mean simulation should've picked it up.

Gibbon1 · on April 30, 2016

Some guys I worked with got a mixed signal IC back and the inductors were off. I don't know if it was a true bug[1] but the proprietary software used to layout the inductors was giving values about 50% too large. Which meant the inductors values were about 33% too small.

[1] The inductors were small spirals laid out on the top metalization layer. I assume finite element analysis of those involves a lot of magic.

raverbashing · on April 30, 2016

Yes, really, people can not put too much trust in IC resistors, to think an inductor is going to have an exact value is just asking for trouble (but probably not 50% off though)

Gibbon1 · on April 30, 2016

The little I know about RF circuits and inductors is that within 10% is usually fine. Off by 30% though means your RF circuit isn't centered anymore. With the IC mentioned the RF receive path was hosed. The guy that did the inductor design also had the issue that a simulation took about 8 hours.

In their case once they knew how far off the inductors were with the process being used they were able to adjust the target inductance and then do a re-layout (couple of 80 hours weeks, and another full set of masks, no problem). I think part of the problem was that for the previous geometry and design frequency 1GHZ they were using their tool gave good results. New geometry and 2.4GHZ and it was wrong. No way to test either.

jonsen · on April 30, 2016

... the root cause of the failure resulted from inaccurate NASTRAN calculations in the design of the structure.

https://en.m.wikipedia.org/wiki/Sleipner_A

boulos · on April 30, 2016

Would you count either of Frank Gehry's fiascos (Stata Center Leaking, Concert Hall Laser Beam) as caused by "bugs"?

I was going to say "No, ignoring your tools doesn't count as a bug" but now I'm curious if the Disney Hall one (like the more recent building in London http://www.bbc.com/news/magazine-23944679) was caused by only seeing approximation-based global illumination. If you don't get caustics rendered correctly, you don't realize you'll melt a car ;).

arnarbi · on April 29, 2016

Not a bug per se, but Airbus claimed some of the delays in the production of A380 were due to incompatible designs from two different versions of the CAD software used.

http://calleam.com/WTPF/?p=4700

xythobuz · on April 30, 2016

I've heard of this for the first time and just talked to a family member that worked at Airbus and was part of the team using Catia V5.

They had heard of this story, which is told often, but it's just not that simple. Problems like this should have been noticed and fixed much earlier with the proper processes.

Now blaming Catia for the organizational problems seems too easy.

bpchaps · on April 30, 2016

My dad does food manufacturing and loves to talk about this subject. Fact is, if you pour sugar paste on something, don't expect anything under it to slide away very cleanly. CAD doesn't do a very good job with sticky stuff. :)

chris_wot · on April 30, 2016

I'd love to read about the physics of sugar paste on moving surfaces :-)

bpchaps · on April 30, 2016

I wish I knew more about it, but what he's told me is that it's common to do something similar to the pulling a sheet from a table trick in certain manufacturing processes. Well, if you pour sugar goo onto something and expect it to stay still... you're going to have a day just as bad as the wall. He said it was one of the funniest things he's ever seen. Who doesn't love a high speed pastry launcher? Management doesn't. :)

dang · on April 29, 2016

I noticed that pg posted this and thought HN users might know some relevant things.

AndrewKemendo · on April 30, 2016

It's almost an impossible question to answer because it assumes that follow on reviews and implementations would not QC or spot check the design through the build process.

It happens a lot actually with CNC processes, if you transfer one drawing between systems, but even then I wouldn't consider that a bug.

jameshart · on April 30, 2016

Has a book ever contained a misprint because of a bug in a word processor?

Has software ever contained a bug because of a bug in an editor?

skmurphy · on April 30, 2016

The premise of the question is flawed; by definition a field failure is a failure of the organization and in particular the leadership. Gerald Weinberg made this point many years ago: programmers cannot make million dollar mistakes, only senior management can. Every person, every tool, every individual process step is fallible. The key is to create an overall methodology that detects and corrects errors.

jws · on April 30, 2016

How about the SpaceX tank struts? These were presumably designed with the aid of a computer, and were not as strong as they were needed to be.

The news stories focused on testing, but testing is kind of the emergency fail safe of product construction. The fix involved a redesign, not more testing, so it seems safe to say the initial design was a failure.

It is also possible that the requirements were off.

hga · on April 30, 2016

From their investigation, http://www.spacex.com/news/2015/07/20/crs-7-investigation-up...:

The strut that we believe failed was designed and material certified to handle 10,000 lbs of force, but failed at 2,000 lbs, a five-fold difference.

jws · on April 30, 2016

I could read that as either evidence that the CAD computations were wrong, or evidence that something completely different happened. For example, someone might have taken a shortcut in manufacturing.

mikeash · on April 30, 2016

Most of the struts performed to spec, with a fraction of a percent failing under much less force than they should have. That sounds like a manufacturing problem, not a design problem.

mng2 · on April 30, 2016

What I heard from a friend of a friend is that QA was not handled as well as it could've been.

desdiv · on April 30, 2016

I think that's why PG specified "general-purpose CAD tool", in order to exclude in-house software.

vermontdevil · on April 30, 2016

Not bug in CAD but more of human oversight maybe?

http://www.slate.com/blogs/the_eye/2014/04/17/the_citicorp_t...

sp332 · on April 30, 2016

Along the same lines, this sensor used for parachute deployment was designed, installed, and tested (?) upside-down. https://en.wikipedia.org/wiki/Genesis_%28spacecraft%29#Recov...

sitkack · on April 30, 2016

https://en.wikipedia.org/wiki/Poka-yoke

imglorp · on April 30, 2016

I would argue the Intel FDIV bug was one. This stuff gets simulated and synthesized at length, using Cadence, Synopsys and other tools before it sees tapeout.

https://en.wikipedia.org/wiki/Pentium_FDIV_bug

chollida1 · on April 29, 2016

Does the mars rover crash due to metric vs imperial units not being converted count?

http://www.cnn.com/TECH/space/9909/30/mars.metric.02/

danso · on April 29, 2016

I don't think so. From the summary posted on Wikipedia [0], it sounds more like a routine software engineering interface error: the Lockheed Martin system returned U.S. units, while the NASA system expected metric units.

Also, looks like classic project mismanagement made a vital contribution, too:

> The discrepancy between calculated and measured position, resulting in the discrepancy between desired and actual orbit insertion altitude, had been noticed earlier by at least two navigators, whose concerns were dismissed. A meeting of trajectory software engineers, trajectory software operators (navigators), propulsion engineers, and managers, was convened to consider the possibility of executing Trajectory Correction Maneuver-5, which was in the schedule. Attendees of the meeting recall an agreement to conduct TCM-5, but it was ultimately not done.

[0] https://en.wikipedia.org/wiki/Mars_Climate_Orbiter

sitkack · on April 30, 2016

Which would have been prevented by using unit preserving calculations.

http://futureboy.us/frinkdocs/

https://msdn.microsoft.com/en-us/library/dd233243.aspx

http://unitsofmeasurement.github.io/

jokoon · on April 30, 2016

I wonder if industrial software has enough developers, compared to web developers, and if one could compare the added value of web versus the industry.

I tend to think the web is more volatile and more oriented towards sales, ads and services, which in my mind are somehow "less important" than industrial software.

Of course every developer makes his own choice, but I'll always be more condescendent towards web techs in general, versus industrial, sometimes C-oriented programming. Maybe it comes from my technical teaching where I learned programming.

hyperion2010 · on April 30, 2016

This doesn't happen much any more, but the case of the Vasa is a pretty good example of a bug in the measurement hardware. Different, incompatible rulers used during construction of opposite sides of the ship. Unfortunately that was during implementation phase and not the design phase. I would guess that there might be some errors in simulation routines that have cause structures to not have been designed with the tolerances they needed, however I don't think that counts as CAD either.

agumonkey · on April 30, 2016

Is the situation improving or steady around the same number of issues ?