Maybe it is true for him. This isn't my area, but the Wikipedia article for the RAD5500 says it's for "high radiation environments experienced on board satellites and spacecraft," while the grandparent only mentions a need for tolerance to "modest radiation exposure (avionics)." Maybe the RAD5500 is overkill for his application, and there actually hasn't been anything new (in his niche) for 10 years.
Or, it can design a safety-critical, rad-hard chip and sell you maybe a couple thousand? :-)
At least, there is now the option to get defense-grade FPGAs and throw some radiation-hardening techniques (triple modular redundancy, scrubbing, etc.) at them.
Why not just have three exactly identical chips, run the exact same code on them, and compare results? If results ever differ, you power cycle the bad one, self test it, and you're good to go again.
Sure, you have to avoid on-chip random number generators and make sure there is nothing non-deterministic in your silicon (e.g. things that depend on PLL lock time), but that sounds pretty trivial.
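For illustration, here's a minimal sketch of that vote-and-recover loop in C; read_chip_output, power_cycle, self_test and enter_safe_state are hypothetical board-support calls standing in for whatever the real hardware provides, not any real API:

    #include <stdint.h>
    #include <stdbool.h>

    uint32_t read_chip_output(int chip);   /* hypothetical */
    void     power_cycle(int chip);        /* hypothetical */
    bool     self_test(int chip);          /* hypothetical */
    void     enter_safe_state(void);       /* hypothetical, does not return */

    uint32_t vote_and_recover(void)
    {
        uint32_t r[3];
        for (int i = 0; i < 3; i++)
            r[i] = read_chip_output(i);

        /* 2-of-3 majority: any pair of chips that agrees wins. */
        for (int i = 0; i < 3; i++) {
            int a = (i + 1) % 3, b = (i + 2) % 3;
            if (r[a] == r[b]) {
                if (r[i] != r[a]) {
                    /* Chip i is the odd one out: power cycle it; a real
                     * system would track its self-test result before
                     * trusting it again. */
                    power_cycle(i);
                    (void)self_test(i);
                }
                return r[a];
            }
        }

        /* All three disagree: no majority exists, fall back to a safe state. */
        enter_safe_state();
        return 0; /* unreachable */
    }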
There are three main things to protect against with radiation hardening: total lifetime dose, latch-ups, and upsets.
Chips can only take being blasted with heavy particles for so long before they cease to function properly. Smaller, newer process technology is more affected by this (thicker traces and larger gates last longer), so designers tend to use older chip designs when making new radiation-hardened parts.
This design choice also reduces the chance of latch-ups. A heavy particle can short out traces and cause the chip to halt and draw a lot of current; larger process geometry makes this less likely to happen.
Upsets are a different story. You have to mitigate these in a few different ways; an easy one is thick layers of insulating substrate to isolate the gates better, which also helps with latch-ups. Next come things like memory voting and error correction, methods that can be done in software (sketched below). One layer up from that are redundant systems, which are very robust against bit flips but add more complexity and cost to the system.
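To make the memory-voting part concrete, here's a minimal C sketch assuming each word is simply stored three times: reads take a bitwise 2-of-3 majority and write the voted value back, which is a crude form of scrubbing. The names tmr_word/tmr_read/tmr_write are made up, and a real system would also keep the copies in physically separate memory regions:

    #include <stdint.h>

    typedef struct {
        uint32_t copy[3];   /* three independent copies of one word */
    } tmr_word;

    static uint32_t tmr_read(tmr_word *w)
    {
        /* Bitwise 2-of-3 majority: an output bit is set iff it is set
         * in at least two of the three copies, so any single-bit upset
         * is outvoted. */
        uint32_t v = (w->copy[0] & w->copy[1]) |
                     (w->copy[1] & w->copy[2]) |
                     (w->copy[0] & w->copy[2]);

        /* Scrub: write the voted value back so a flipped copy is healed
         * before a second flip can accumulate. */
        w->copy[0] = w->copy[1] = w->copy[2] = v;
        return v;
    }

    static void tmr_write(tmr_word *w, uint32_t v)
    {
        w->copy[0] = w->copy[1] = w->copy[2] = v;
    }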
>Why not just
The problem with 'just' is that it's usually not true. Hardening versus redundancy is a typical safety-engineering tradeoff, where the details and the balance between requirements, verification, and maintenance are what matter.
Easier said than done :) But yes, that is a common solution.
They do not run in lockstep, but if one of the machines detects that its peer's output differs from its own, it just triggers an emergency brake.
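A minimal sketch of that cross-check in C, with hypothetical names (compute_output, exchange_with_peer, trigger_emergency_brake) standing in for whatever the real system uses:

    #include <stdint.h>

    uint32_t compute_output(void);            /* hypothetical: this channel's result */
    uint32_t exchange_with_peer(uint32_t v);  /* hypothetical: swap results with the peer */
    void     trigger_emergency_brake(void);   /* hypothetical: force the safe state */

    void cross_check_step(void)
    {
        uint32_t mine  = compute_output();
        uint32_t peers = exchange_with_peer(mine);

        /* Two channels can detect a fault but not identify the faulty
         * one, so the only safe reaction to a mismatch is to stop. */
        if (mine != peers)
            trigger_emergency_brake();
    }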
Time flies I guess.
To your comment on Infineon... I thought the TriCore was stupid, but my German boss at the time insisted. We're using one of their SoCs today, but it's actually very very good for what we're doing and it's fairly cheap. We could get cheaper but that would require the addition of other parts on the board. If we can secure the volumes we may go custom - and that's something I haven't seen in person in automotive (I'm beginning to hear about it though).
Note: If speed isn't a big deal, there's the formally-verified AAMP7G from Rockwell and the old 1802 from Intersil.
"We can run this $gadget at $somany GHz and most of them will fail at 4 years, or we can get to 5 years if we reduce the number of gigglehertzes." (Imagine software people gnashing teeth here, but they always do, so ignore them).
Then it might turn out that the models were conservative, so the hardware folks could say "Every machine gets an additional 50 gigglehertzes" (note that nobody tells the consumer about that lifetime projection, though).
It's not just silicon, of course. It's the nature of hardware to die: Electrolytic caps go dry, or coils buzz themselves open, or electromigration murders a metal run on a chip, or fans seize, or thermal cycling cracks some solder. Memento mori, don't expect your grandchildren to play on your current generation game console.
IIRC, NASA still requires all space hardware to be manufactured with leaded solder to avoid having a multi-million/billion dollar mission ruined by some tin whiskers.