
Apollo Guidance Computer switching power supply works after 50 years - asmithmd1
http://www.righto.com/2019/08/reliable-after-50-years-apollo-guidance.html
======
userbinator
_We found that the capacitors were all in good shape with the proper
capacitances. This is in contrast to modern capacitors, which often leak or
fail after a few years. NASA used expensive aerospace-grade capacitors and
X-rayed each one to test for faults, and this made a large difference._

That's also because these are hermetically sealed _wet tantalum_ caps, not the
dry type that's notorious for shorting out and catching fire. "Wet tants" are
_very_ expensive and you can still buy them today:

[https://www.mouser.com/Vishay/Passive-
Components/Capacitors/...](https://www.mouser.com/Vishay/Passive-
Components/Capacitors/Tantalum-Capacitors/Tantalum-Capacitors-Wet/HE3-Series/)

 _We were somewhat surprised that both power supplies worked flawlessly after
50 years._

I'm not all that surprised, actually --- but perhaps that's because I've seen
plenty of videos on YouTube of more mundane equipment, like vehicles and
appliances, coming back to life after several decades of storage or exposure
to the elements with minimal repairs, so for a clearly high-reliability
component like this AGC PSU to work is almost expected.

~~~
xellisx
Working link: [https://www.mouser.com/Vishay/Passive-
Components/Capacitors/...](https://www.mouser.com/Vishay/Passive-
Components/Capacitors/Tantalum-Capacitors/Tantalum-Capacitors-
Wet/_/N-75hr5?P=1z0zls5)

~~~
jakeogh
$/F record?
[https://www.mouser.com/ProductDetail/Vishay/XTV157T270P0L?qs...](https://www.mouser.com/ProductDetail/Vishay/XTV157T270P0L?qs=sGAEpiMZZMuEN2agSAc2ptl%252BlXX0IRpaER3axfKJ7FA%3D)

------
jacquesm
That is beautiful work, the _welded_ connections are really impressive.
Obviously they had to take into account massive forces, vibration and possibly
impact damage, but still, to see components welded in place is something I've
never seen before, not even in very high end HF power electronics.

> The AGC that we're restoring belongs to a private owner who picked it up at
> a scrapyard in the 1970s after NASA scrapped it.

That is one lucky find. Imagine if it had _not_ been found and had instead
been scrapped for its metal value. The mind boggles at the potential
destruction of such a historical artifact.

> the cordwood components are mounted differently from the other cordwood
> modules.

For those from cities who have never seen cordwood:

Cordwood is a stack of wood of short length seen from the endgrain, typically
used when referring to firewood but also sometimes used in construction.

~~~
consp
> ... Obviously they had to take into account massive forces, vibration and
> possibly impact damage ...

That's why they also glued them in epoxy. If you don't want your electronics
to move (and thus break connections), put them tight in epoxy. And I mean
completely glued in. No air left. This is as valid for modern solder joints as
it is for welded ones.

~~~
kens
The AGC we restored was used for ground testing, so most of the modules
weren't encased in epoxy. This made things much more convenient for us. We did
have to dig through encapsulation to fix one module, though.

------
esmi
I was shocked to see the title because I thought forward converters were
invented in the 70s but it turns out they were invented in the 50s with some
early examples in the 20s, if one gets loose enough with the definition of
“converter”.

[https://pdfs.semanticscholar.org/0911/34a5dc72c5a363ec937e40...](https://pdfs.semanticscholar.org/0911/34a5dc72c5a363ec937e40c0aa3706f1cd99.pdf)

~~~
kens
Yes, these converters go way back, although the AGC used a buck converter not
a forward converter. They became more practical in the 1970s as transistor
technology improved. I wrote a history of power supplies in the IEEE Spectrum
recently: [https://spectrum.ieee.org/computing/hardware/a-half-
century-...](https://spectrum.ieee.org/computing/hardware/a-half-century-ago-
better-transistors-and-switching-regulators-revolutionized-the-design-of-
computer-power-supplies)

~~~
esmi
Nice article, thanks for sharing it. What definition are you using for forward
converter? The definition I always use, which I thought I got from Erickson’s
Fundamentals of Power Electronics, is any topology where the inductance is
charged in series (as opposed to parallel like a boost or flyback) with the
load. I’ve always liked this definition because it still conveys critical
meaning even when engineers do trivial topology changes from the “standard”
forward converter topology.

~~~
kens
I've seen a forward converter described as buck-derived, or buck with a
transformer, but buck converters and forward converters are generally viewed
as two separate things. I'm not too attached to definitions but I consider a
forward converter as using a transformer and transferring energy while the
switch is on. This is the definition used by the article you linked to, and by
Wikipedia for Forward Converter. I also like the TI poster of power
topologies:
[https://www.ti.com/lit/sg/sluw001f/sluw001f.pdf](https://www.ti.com/lit/sg/sluw001f/sluw001f.pdf)

~~~
esmi
I think the article I linked to didn’t require the transformer, only that
there be transfer with the switch on, which implies the series connection. :)
Wikipedia certainly does though and honestly, especially in power conversion,
there’s not a lot of standardization in terms, which is why I asked. I’m not
too hung up on the definition either. I’m probably just oversensitive because
of all the marketing material I’m asked to review...

------
CamperBob2
Huh? They got the whole computer working, not just the power supply.

Anybody who hasn't seen the two dozen or so videos made by CuriousMarc is in
for a geekout of historic proportions. Some truly amazing work, with more ups
and downs than the Apollo program itself. I can't easily come up with a link
on mobile but you'll find it by looking up CuriousMarc on YT. Watch the whole
playlist, it's awesome.

~~~
kens
Yes, the whole AGC works; this article discusses the power supplies in
particular. The video playlist is at:
[https://www.youtube.com/playlist?list=PL-_93BVApb59FWrLZfdli...](https://www.youtube.com/playlist?list=PL-_93BVApb59FWrLZfdlisi_x7-Ut_-w7)

~~~
chiph
Were you able to see who made the tantalum capacitors? Or did the cordwood
construction prevent that?

I ask because my dad was Group Product Manager at Kemet during this time
period. Growing up, he mentioned their products going into several aerospace
systems (various ICBMs and military projects, etc) as well as the IBM System
360. But he never mentioned involvement with Apollo and I'm curious if he
might have had a hand in the AGC.

BTW, your link to the NASA contract drawing is incorrect - here's the one you
probably wanted.

[http://www.ibiblio.org/apollo/SCDs/scd_1006755t.pdf](http://www.ibiblio.org/apollo/SCDs/scd_1006755t.pdf)

~~~
kens
The (corrected) drawing says Sprague made these capacitors:
[http://www.ibiblio.org/apollo/SCDs/scd_1006755-.pdf](http://www.ibiblio.org/apollo/SCDs/scd_1006755-.pdf)
(I don't know if Kemet made other capacitors used in Apollo.)

~~~
thewonderidiot
To the contrary! In our AGC, all of the capacitors not tucked into cordwood
holes sport a KEMET logo and part number:
[https://photos.app.goo.gl/oGsrgXBpNMccmkWV7](https://photos.app.goo.gl/oGsrgXBpNMccmkWV7)

I think the Computer History Museum's might have Sprague caps in it, though.

~~~
CamperBob2
Yep, that's the capacitor I was referring to below. At least some of them were
definitely Kemet parts. I don't remember seeing any Sprague logos but I
imagine there were some of those around, too.

------
beloch
It's always impressive to see somebody try to build something that lasts, and
succeed. It's even rarer to see such a thing in consumer electronics.

I'm currently sitting in the same room as a Bryston amplifier that, according
to its date code, was manufactured in late 1998. That means it's almost 21
years old and just 1 year off warranty. It's been switched on for most of
those 21 years but still works great and has never been serviced. Even more
surprisingly, it's not obsolete. It's currently hooked up to a 2018 receiver.

I'd love to see somebody do tear-downs of devices like this and explain how
their construction takes longevity into consideration without resorting to the
same extremes as NASA (e.g. X-raying caps).

~~~
WalterBright
I run my Carver amplifier, bought new in 1981, every day all day. That's 38
years of near constant use!

~~~
beloch
Damn! Take that NES!

------
blackflame7000
It’s sad that there is a dichotomy between building a good product and
building too good of a product that you don’t get enough return customers.

~~~
smitty1e
The art would seem to be to build the durable product with room for extension,
so that compelling upgrades are possible.

As seen with people who collect Apple products and just keep adding to the
fruit basket.

~~~
zaat
Apple products are pretty much the opposite of good products, at least in the
meaning of "good" implied in the context. Leaving aside the failure rate of
these, most of their products are sealed with a proprietary battery. This
gives them a pretty short service life, which guarantees returning customers.

~~~
smitty1e
Was that true of the old Mac computers, though?

I'm not talking about the post-Jobs stuff.

~~~
zaat
The old Mac computers were not as bad as the first iPod, but they weren't
heavy-duty or anything of the sort. Nothing exceptional in that dimension.

------
bladedtoys
Slightly off topic but I understand as the Apollo 11 LEM descended, there were
several "computer overload errors".

Is this out of memory or skipped input or dropped processing requests or too
much power drain? What exactly is an "overload error" on such an unusual
machine?

~~~
jhayward
If you google for "apollo 1201 or 1202 alarm" you'll find several good
magazine-length articles on the design of the landing guidance computer
systems.

The TLDR is: a switch misconfiguration was tasking the computer to process
some rendezvous tracking information which was not needed. This exhausted a
rather limited set of storage locations and triggered the alarm.

The computer was properly programmed to treat the condition as a secondary
failure and continue with its primary task.

The thing that was remarkable to me was the control room engineer's response
to the 1201 right before touchdown - they hadn't seen one of those, and it was
not the same as the 1202's they had seen earlier.

In the videos I've seen you hear the audio - "1201 alarm" and within less than
a second the 28-year-old (average age, just guessing) responds "1201 go". He
just gave instant clearance to proceed past a malfunction a few hundred feet
above the moon's surface, in real time. Talk about being in the zone.

~~~
thewonderidiot
Sadly most of what you read by googling these problems is misinformation. It
was actually an incredibly sinister systems integration bug, that wasn't well
described even at the time.

It wasn't a switch misconfiguration; the Apollo 11 astronauts were flying to
the checklist, and did as they had simulated. The Rendezvous Radar switch has
three settings -- LGC, AUTO TRACK, and SLEW. In LGC mode, the AGC controls the
positioning of the antenna; in AUTO TRACK, the radar automatically tracks the
CSM based on return strength; and in SLEW it is manually positioned.

The trouble came from how the trunnion and shaft angles of the antenna were
measured. They used "resolvers", which are sort of like variable transformers.
Resolvers look like motors, and attached to the shaft there are two windings
positioned 90 degrees apart from each other. An AC "reference voltage" is
applied to an outer winding in the case, and that voltage couples onto the two
inner windings with a magnitude proportional to the angle on the shaft. One
winding (the "sine" winding) produces an output equal to Vref*sin(theta) and
the other ("cosine") winding produces an output equal to Vref*cos(theta),
where Vref is the reference voltage and theta is the angle of the shaft. The
voltage and phase of both windings can be used to determine exactly what the
theta was that produced them.
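
A minimal sketch of that last step, assuming ideal windings and treating each
winding's output as a signed amplitude (positive when in phase with the
reference, negative when inverted). The function names are made up for
illustration, and this is plain trigonometry rather than what the flight
hardware did:

```python
import math

def resolver_outputs(theta, vref=28.0):
    """Ideal signed amplitudes of the sine and cosine windings (theta in radians)."""
    return vref * math.sin(theta), vref * math.cos(theta)

def recover_angle(v_sin, v_cos):
    """Recover the shaft angle in [0, 2*pi) from the two signed amplitudes."""
    # atan2 uses the signs of both inputs to pick the quadrant; Vref cancels out.
    return math.atan2(v_sin, v_cos) % (2 * math.pi)

theta = math.radians(200.0)
v_sin, v_cos = resolver_outputs(theta)
print(math.degrees(recover_angle(v_sin, v_cos)))
```

Because only the ratio of the two outputs matters here, the amplitude of the
reference drops out, which is the property that makes resolvers attractive.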

The circuitry to do this is a bit involved though and lived outside the
computer, in a device called the CDU, or Coupling Data Unit. The CDU
constantly maintained its own idea of what the angle was ("psi") in a digital
register. It translated the incoming sine and cosine voltages into a digital
representation by mechanizing the equation +-sin(theta-psi) =
+-sin(theta)*cos(psi) -+ cos(theta)*sin(psi). It did so by using the bits of
its digital register containing psi to switch on and off resistor dividers
that effected cos(psi) and sin(psi) onto the incoming signals, which were then
added together with a summing amplifier. The goal of the CDU is to zero this
sum; to accomplish this, it "counts" the angle register up or down to reduce
the magnitude of the sum. As it counts, switches are changed, which switch out
resistors in the circuit, which in turn change cos(psi) and sin(psi) in the
above equation. And also, with every other increment, a pulse is transmitted
to the AGC to indicate that the angle has changed slightly.
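
The counting loop can be sketched in software, as a rough illustration only.
The register width, the stopping threshold, and the extra cos(theta - psi)
check (used here to avoid settling on the false null 180 degrees away) are
all assumptions of this sketch, not details of the real CDU:

```python
import math

def cdu_track(theta, bits=12, psi_counts=0, max_steps=10_000):
    """Count the psi register up or down until sin(theta - psi) is nulled.

    Returns the final register value and the list of +1/-1 counts taken.
    """
    n = 1 << bits
    step = 2 * math.pi / n                 # angle represented by one count
    threshold = 0.75 * math.sin(step)      # illustrative "close enough" limit
    counts = []
    for _ in range(max_steps):
        psi = psi_counts * step
        s, c = math.sin(psi), math.cos(psi)
        error = math.sin(theta) * c - math.cos(theta) * s      # sin(theta - psi)
        in_phase = math.cos(theta) * c + math.sin(theta) * s   # cos(theta - psi)
        if in_phase > 0 and abs(error) < threshold:
            break                          # sum is nulled, stop counting
        direction = 1 if error > 0 else -1
        psi_counts = (psi_counts + direction) % n
        counts.append(direction)
    return psi_counts, counts

psi, counts = cdu_track(math.radians(123.4))
print(psi, len(counts))
```

This sketch records every count; per the description above, the real unit
sent the AGC a pulse only on every other increment.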

The problem comes in because in addition to the above, the CDU also, for many
angles, added to the sum some fraction of the reference voltage directly. This
is fine when the switch is in the LGC position; the resolvers are supplied
with the same 28V, 800Hz reference voltage that is used inside the CDU.
However, when the switch is put in either of the other two positions, the
reference voltage for the RR resolvers is switched to an unrelated 15V rail.
Critically, this 15V reference has no defined phase relationship with the
CDU's 28V 800Hz reference. The phasing is locked in by the exact millisecond
at which you power up your subsystems.

So when the switch is changed, the sine and cosine outputs from the resolver
are suddenly derived from the 15V reference -- they are much lower than before
and at a random phase. The CDU doesn't know that this has happened, and still
tries to perform the summing as before. However, for many theta/phase
relationships, it becomes impossible for the CDU to actually null the sum. In
these cases, the CDU becomes "manic", and starts seeking back and forth,
frantically changing switches to try to figure out what the angle is, but
never succeeding.

This causes a huge flurry of +1 and -1 pulses to the AGC. In order to minimize
circuitry, the AGC implemented what was called "unprogrammed" or "cycle-
stealing" instructions. The computer only contains a single adder, and adding
or subtracting 1 from the current angle requires use of that adder and a
memory cycle. Rather than generating a full interrupt, which would require
many memory cycles and instructions to handle, the computer simply
transparently inserts a single-cycle instruction in between two "programmed"
instructions that performs the addition or subtraction. This is totally
transparent to software, normally. But with a manic CDU that is incessantly
seeking on both RR angles, the AGC receives something close to 12,800 pulses
per second, which translates into something around 15% of its total
computational time. The landing software had only been designed with a margin
of 10% or so.
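
The 15% figure can be checked with back-of-envelope arithmetic. The clock
numbers below are not from this comment; they are the commonly quoted AGC
timing (a 1.024 MHz clock and 12 clock periods per memory cycle, about
11.72 microseconds), assumed here along with one stolen cycle per pulse:

```python
MASTER_CLOCK_HZ = 1_024_000     # assumption: AGC timing clock
CLOCKS_PER_MEMORY_CYCLE = 12    # assumption: one memory cycle is ~11.72 us

def stolen_fraction(pulses_per_second, cycles_per_pulse=1):
    """Fraction of all memory cycles consumed by counter increments."""
    cycle_seconds = CLOCKS_PER_MEMORY_CYCLE / MASTER_CLOCK_HZ
    return pulses_per_second * cycles_per_pulse * cycle_seconds

load = stolen_fraction(12_800)
print(f"{load:.1%} of memory cycles stolen")
print("exceeds a 10% margin:", load > 0.10)
```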

The 1202s were also a lot less benign than is often reported. They occurred
because of the fixed two-second guidance cycle in the landing software. That
is, once every two seconds, a job called the SERVICER would start. SERVICER
had many tasks during the landing. In order: navigation, guidance, commanding
throttle, commanding attitude, and updating displays. With an excessive load
as caused by the CDU, new SERVICERs were starting before old ones could
finish. Eventually there would be too many old SERVICERs hanging around, and
when the time came to start a new one, there would be no slots for new jobs
available. When this happened, the EXECUTIVE (job scheduler) would issue a
1201 or 1202 alarm and cause a soft restart of the computer. Every job and
task was flushed, and the computer started up fresh, resuming from its last
checkpoint. It was essentially a full-on crash and restart, rather than a
graceful cancellation of a few jobs. And contrary to what is often said, the
computer wasn't dropping low-priority things; it was failing to complete the
most critical job of the landing, the SERVICER.
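
A toy model of that pile-up, as a sketch under stated assumptions: the slot
count, the SERVICER's CPU demand (90% of the period, echoing the 10% margin),
and the "newest job runs first" rule are illustrative choices, not the real
EXECUTIVE's behavior:

```python
def simulate_servicer(stolen=0.15, job_load=0.90, slots=7, period=2.0, cycles=30):
    """Return the times (seconds) at which a new SERVICER found no free slot.

    stolen   -- fraction of CPU time lost to counter increments
    job_load -- CPU time one SERVICER needs, as a fraction of the period
    slots    -- hypothetical number of job slots
    """
    unfinished = []   # remaining CPU seconds per unfinished SERVICER, newest last
    restarts = []
    for n in range(cycles):
        now = n * period
        if len(unfinished) >= slots:
            restarts.append(now)      # no slot: alarm, flush every job, start fresh
            unfinished.clear()
        unfinished.append(job_load * period)
        budget = period * (1.0 - stolen)   # CPU time before the next SERVICER is due
        # The newest job runs first; older ones only get whatever is left over.
        while unfinished and budget > 1e-12:
            used = min(budget, unfinished[-1])
            unfinished[-1] -= used
            budget -= used
            if unfinished[-1] <= 1e-12:
                unfinished.pop()
    return restarts

print(simulate_servicer(stolen=0.0))    # no extra load
print(simulate_servicer(stolen=0.15))   # manic CDU load
```

With no stolen time every SERVICER finishes inside its two-second window; with
15% stolen, each one is left slightly unfinished, so the slots fill and the
model restarts periodically.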

Luckily, the load was light enough that of the SERVICER's duties, the old
SERVICER was usually in the final display updating code when it got preempted
by a new SERVICER. This caused times in the descent when the display stopped
updating entirely, but the flight proceeded mostly as usual. However, with
slightly more load, it was fully possible that the SERVICER could have been
preempted in the attitude control portion of the code, or worse yet, the
throttle control portion. Since each SERVICER shared the same memory location
as the last one (since there was only ever supposed to be one running at a
time), this could lead to violent attitude or throttle excursions, which would
have certainly called for an abort. Luckily, this didn't happen -- and the
flight controllers didn't abort the mission not because 1202s were always
safe, but because they didn't understand just how bad it could be, were the
load just a tiny bit higher.

~~~
gazsp
Could I ask how you know so much about this, or where I can read something
more detailed than the usual story that's reported? Thanks.

~~~
thewonderidiot
Many years now of research and simulation of the system (I led the restoration
of the computer mentioned in the article). There's not a single place where
you can read everything, unfortunately, aside from the comment above. We're
planning on making a video on it in the future. But I can cite sources:

CDU theory of operation (starting PDF page 15):
[http://www.ibiblio.org/apollo/Documents/HSI-208435-003.pdf](http://www.ibiblio.org/apollo/Documents/HSI-208435-003.pdf)

CDU coarse module schematic:
[https://archive.org/stream/apertureCardBox462NARASW_images#p...](https://archive.org/stream/apertureCardBox462NARASW_images#page/n1507/mode/1up)

Grumman memo (from 1968!) describing the problem, and mentioning it is due to
the reference switching to a 15V 800Hz source:
[https://www.ibiblio.org/apollo/Documents/Memo-
GAEC_LMO_541_1...](https://www.ibiblio.org/apollo/Documents/Memo-
GAEC_LMO_541_108_text.pdf)

Excerpt from the LM-8 Systems Handbook showing the reference voltage RR switch
wiring: [https://i.imgur.com/fMsQ7RI.png](https://i.imgur.com/fMsQ7RI.png)

Don Eyles describes the software side best in his book Sunburst and Luminary
(which I highly recommend) but he also talks about it in some detail on his
website:
[https://doneyles.com/LM/Tales.html](https://doneyles.com/LM/Tales.html)

------
avmich
Now I wonder if fuel elements powering this PSU can be restored to a workable
state? Then moving forward to restoration of the whole spacecraft. Shouldn't
be too hard :) given that some of the most complex units are already working...

------
fortran77
Buck converters, as any fan of Big Clive videos knows, are still widely used
in low cost power supplies for LED light bulbs and cheap USB chargers, due to
the few components needed and wide tolerances the circuits can handle.

~~~
tsomctl
Buck converters are still widely used in absolutely everything.

------
NewOldComputer
I wonder if there were any issues with the STBY button being located where it
is. It seems a really risky place to have an off key, right between other
frequently used buttons.

~~~
thewonderidiot
Nah, they put a couple of precautions in to make sure you couldn't put the
computer into standby accidentally. First, software has to set a bit to enable
standby mode to be entered. For the astronauts, this was keying in VERB 37
ENTER 06 ENTER, which started P06, the pre-standby program. They then had to
press and hold the button for a period of time between 1.28 and 2.56 seconds
(exactly how long it takes depends on where the clock divider chain was when
they first pressed the button). To bring it out of standby, you have to push
the button in for a similar duration.
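
One way to see where a 1.28 to 2.56 second window could come from, as a guess
rather than a description of the actual circuit: assume the button is sampled
on a divider tick every 1.28 seconds and must be seen held down on two
consecutive ticks. Both the mechanism and the names below are hypothetical:

```python
TICK = 1.28  # seconds between divider ticks, taken from the figures above

def hold_time_needed(phase):
    """Seconds the button must be held if pressed `phase` seconds after a tick.

    Assumption: the press only registers once two consecutive ticks have
    seen the button down.
    """
    if not 0.0 <= phase < TICK:
        raise ValueError("phase must be in [0, TICK)")
    # Wait for the first tick, then one full period for the second tick.
    return (TICK - phase) + TICK

for phase in (0.0, 0.64, 1.27):
    print(phase, round(hold_time_needed(phase), 2))
```

Pressing just after a tick gives the longest hold, and pressing just before
one gives the shortest, which reproduces the stated range.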

