
PID Without a PhD (2016) [pdf] - darshanrai
http://www.wescottdesign.com/articles/pid/pidWithoutAPhd.pdf
======
metaphor
Fortunately, basic PID analysis, including linear state-space representation
and practical tuning methods (e.g. Ziegler-Nichols[1]), is taught in pretty
much any first EE control theory course at the undergraduate level.

[1]
[https://en.wikipedia.org/wiki/Ziegler%E2%80%93Nichols_method](https://en.wikipedia.org/wiki/Ziegler%E2%80%93Nichols_method)

~~~
beambot
Not only that... Classic controls (including PID, Routh-Hurwitz, root-locus,
etc) are NOT taught in many graduate PhD controls programs!

~~~
budadre75
Are those topics no longer relevant to today's state of the art research?

~~~
Sean1708
It's possible that you're just expected to know them at that point.

~~~
donquichotte
Agreed. Many state of the art control strategies nowadays are based on
optimization techniques (e.g. Model Predictive Control, MPC, optimizes a
convex cost function of the state, setpoint and input to your system over a
certain time horizon). This has become a viable control strategy for fast
(>1kHz) processes only in the past 15 years, since the optimization can be
computationally intensive.

------
fest
There was an "aha" moment for me when I realized that a PID loop only works well
for linear systems (and sometimes it's not obvious if a system is linear or
not).

Real world example: imagine you have a single-axis motion stage driven by
an electric motor and you want to control the position of a carriage. Usually
your control output is motor voltage. Motor voltage approximately translates to
current through the motor windings, which in turn approximately translates to
torque exerted by the motor. Torque exerts a force (T = F * r) on the carriage.
Applying force to the carriage makes it accelerate (F = m * a). Acceleration
linearly increases the carriage velocity (v = v0 + a * t). The carriage having
some velocity finally causes the position to change (s = v0 * t + a * t^2 / 2).

In essence, the system turns out to be non-linear with respect to the
parameter you are controlling.

To improve this system, one solution is to add a velocity sensor (or
differentiate the position sensor, if its resolution is high enough) and
introduce a cascading PID loop topology: the outer loop takes position error
and outputs velocity error, which is fed to the inner loop (input = velocity
error, output = acceleration). The coefficients for the loops have to be tuned
starting from the innermost loop.

Another solution is to use a different control algorithm which is suited for
non-linear systems (e.g. LQR).
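A minimal Python sketch of the cascaded position/velocity topology described above. It assumes a unit-mass carriage, ideal sensors, and a force command rather than motor voltage. The outer loop's output is treated here as a velocity setpoint. The `PID` class, `simulate_cascade`, and all gains are illustrative assumptions, not values from any real stage.

```python
from dataclasses import dataclass


@dataclass
class PID:
    kp: float
    ki: float = 0.0
    kd: float = 0.0
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_cascade(target: float = 1.0, steps: int = 5000, dt: float = 0.001) -> float:
    """Drive a toy unit-mass carriage to `target`; return the final position."""
    outer = PID(kp=5.0)            # position error -> velocity setpoint (assumed gain)
    inner = PID(kp=20.0, ki=50.0)  # velocity error -> force command (assumed gains)
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        vel_setpoint = outer.update(target - pos, dt)
        force = inner.update(vel_setpoint - vel, dt)
        vel += force * dt          # a = F / m with m = 1
        pos += vel * dt
    return pos


if __name__ == "__main__":
    print(simulate_cascade())
```

The inner loop is given the faster dynamics, which matches the advice to tune from the innermost loop outward.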

~~~
darshanrai
We actually ended up using this when writing a controller for a quadcopter[1].
Essentially, we have one PID controller operate on the absolute angle (error =
desired angle - actual angle). The output of this controller is fed into the
second controller as the desired rate of rotation (RoR) (error = desired RoR -
actual RoR). The output of the second controller is finally fed to the motors.

Apart from this being easier to tune, I just found a good article[2] on why
this approach works better for problems such as this. For quadcopters, of
course, this allows one to easily switch between rate/acro mode and angle mode.

[1]
[https://github.com/ThePinkScareCrow/TheScareCrow/blob/master...](https://github.com/ThePinkScareCrow/TheScareCrow/blob/master/controls.cpp#L284-L285)
[2] [https://www.controleng.com/single-article/fundamentals-of-ca...](https://www.controleng.com/single-article/fundamentals-of-cascade-control/bcedad6518aec409f583ba6bc9b72854.html)

~~~
fest
This is also the approach PX4/Ardupilot folks use. It's actually even more
elaborate in their case: position loop -> velocity loop -> acceleration/angle
loop -> angular rate loop -> motor outputs.

------
cr0sh
Just going to throw my 2-cents in - I should note that my knowledge of PID is
very limited.

The first time I encountered an explanation of how PID worked that made sense
to me, was via the Udacity CS373 course that I took in 2012 (Thrun also had a
great explanation on how a Kalman filter worked as well). This explanation of
PID was repeated when I took the Udacity Self-Driving Car Engineer Nanodegree
(starting in 2016).

In the CS373 course, Thrun detailed a few different ways to tune a PID
controller - but one way he covered was rather curious in how it worked. It
wasn't perfect, but it got you "close enough". I'm not sure if it would work
well for a "real system" but for the purpose of learning it seemed to work
well enough.

He called it "twiddle" - I don't know if he was the original author of it, or
what - but here's a video that describes how it is implemented and how it
works:

[https://www.youtube.com/watch?v=2uQ2BSzDvXs](https://www.youtube.com/watch?v=2uQ2BSzDvXs)

It's actually pretty neat - and the concept can be applied to virtually any
algorithm in which you are trying to minimize an error amount.

~~~
gh02t
Twiddle, AKA coordinate ascent, definitely works in practice. It's basically
the same process a human uses to tune PID parameters, cyclically tune one
parameter at a time until your performance is good enough. It's closely
related to gradient ascent and suffers from the same problems, namely that it
can get trapped in a local minimum, but it's easy to implement and does a good
job if you start with an ok guess at the parameters.
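For reference, a short Python sketch of the twiddle idea as described above: nudge one parameter at a time and keep the change only if the cost drops. The function name, the step growth/shrink factors (1.1 and 0.9), and the toy quadratic cost are assumptions for illustration. In real tuning, `cost` would run the control loop (or a simulation of it) and return an accumulated error.

```python
from typing import Callable, List


def twiddle(cost: Callable[[List[float]], float],
            params: List[float],
            steps: List[float],
            tol: float = 1e-4) -> List[float]:
    """Coordinate-wise search: adjust one parameter at a time, keep what helps."""
    params, steps = list(params), list(steps)
    best = cost(params)
    while sum(steps) > tol:
        for i in range(len(params)):
            params[i] += steps[i]              # try a step up
            c = cost(params)
            if c < best:
                best = c
                steps[i] *= 1.1                # success: be bolder next time
                continue
            params[i] -= 2 * steps[i]          # try a step down instead
            c = cost(params)
            if c < best:
                best = c
                steps[i] *= 1.1
            else:
                params[i] += steps[i]          # restore and shrink the step
                steps[i] *= 0.9
    return params


def toy_cost(p: List[float]) -> float:
    # Stand-in for "run the loop and measure the error"; minimum at (2, -1).
    return (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2


if __name__ == "__main__":
    print(twiddle(toy_cost, [0.0, 0.0], [1.0, 1.0]))
```

Because it only ever compares costs, it inherits the local-minimum problem mentioned above when the cost surface is not as well behaved as this toy one.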

------
dosshell
I am a little surprised that the author does not use a low-pass filter for the
D part, for example 1/(1+s Tf), and instead tells you to modify the hardware.

Sure, you need to cut off the noise in hardware with respect to your Nyquist
frequency, but you can still have a lot of high frequency noise. And it is not
always possible to use hardware filters to get rid of the noise, for example
if you are tracking an object with a camera and the x,y is your input to the
PID. Of course you can skip the D part.

What am I missing here?

I also find it easier to use a reset based integrator anti-windup, where
you specify your limits on your control signal instead of limiting the
integrator.

Here is a diagram of what I usually do:

    
    
              +--------+           +------+
       -y     |    1   |           |      |
     -------->+ ------ +---------->+ Kd*s +----+
              | 1+s*Tf |           |      |    |
              |        |           +------+    v
              +--------+   +----+            +-+-+       +-------+   
       e=r-y               |    |            |   |       |    __ |      u
     --------+------------>+ Kp +----------->+ ∑ +---+-->+ __/   +---+--->
             |             |    |            |   |   |   |       |   |
             |             +----+            +-+-+   |   +-------+   |
             |                                 ^     |               |
             |   +----+   +---+    +-----+     |     |     +---+     |
             |   |    |   |   |    |  1  |     |     |   - |   | +   |
             +-->+ Ki +-->+ ∑ +--->+  -  +-----+     +---->+ ∑ +<----+
                 |    |   |   |    |  s  |                 |   |
                 +----+   +-+-+    +-----+                 +-+-+
                            ^                                |
                            |                                |
                            |      +----+                    |
                            |      |    |                    |
                            +------+ kt +<-------------------+
                                   |    |
                                   +----+
    
    
    

The kt gain will reduce the integrator when we have saturated output.
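One way the diagram above could be written in discrete time, as a hedged Python sketch: derivative on the low-pass filtered `-y`, proportional on `e = r - y`, and an integrator that is fed `Ki*e + kt*(u - v)`, where `v` is the unsaturated sum and `u` the saturated output. The class name and the backward-Euler discretization of the filter are assumptions, not something taken from the article.

```python
class FilteredPIDBackCalc:
    def __init__(self, kp, ki, kd, tf, kt, u_min, u_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.tf, self.kt = tf, kt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0   # integrator state (already scaled by Ki)
        self.y_filt = None    # low-pass filtered measurement

    def update(self, r, y, dt):
        if self.y_filt is None:
            self.y_filt = y   # avoid a derivative spike on the first sample
        # 1/(1 + s*Tf) discretized with backward Euler (an assumption).
        y_prev = self.y_filt
        self.y_filt += dt / (self.tf + dt) * (y - self.y_filt)
        # Kd*s acting on the filtered -y.
        d_term = -self.kd * (self.y_filt - y_prev) / dt
        e = r - y
        v = self.kp * e + self.integral + d_term   # unsaturated output
        u = min(max(v, self.u_min), self.u_max)    # saturation block
        # Back-calculation: kt*(u - v) bleeds the integrator while saturated.
        self.integral += (self.ki * e + self.kt * (u - v)) * dt
        return u
```

With a large sustained error, the integrator settles at a finite value instead of growing without bound. Larger `kt` pulls it back faster.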

Does anyone know the pros and cons of the different anti-windup
solutions?

~~~
jgable
I always do integrator anti-windup similar to how you do it -- apply limits to the
overall controller output (which are typically necessary for the system
anyway), and use that to limit the integrator. Just putting limits on the
integrator is too crude, and it's not like it's any easier to program. Maybe
the author introduced that because it's easier for a beginner to understand.

However, I typically do not change the state of the integrator the way you are
doing here. I just hold the integrator state (i.e. don't update it) if the
controller is limited. That way, noise on the P or D signals doesn't cause my
integrator state to hop around. I may do it your way if the controller limits
are changing over time for some reason.
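The "hold the integrator while limited" variant described above might look like this in Python. This is a sketch for a PI controller only, and `pi_step_hold` and its argument names are made up for illustration.

```python
def pi_step_hold(error, integral, kp, ki, dt, u_min, u_max):
    """One PI step that freezes the integrator while the output is limited.

    Returns (output, new_integral).
    """
    candidate = integral + ki * error * dt
    u = kp * error + candidate
    if u > u_max or u < u_min:
        # Output is limited: discard the update and keep the old state.
        u = min(max(kp * error + integral, u_min), u_max)
        return u, integral
    return u, candidate
```

Some implementations hold only when the error would push the output further into the limit. That refinement is not shown here.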

I also always low-pass filter the D term, as you mentioned. Otherwise, the D
term is just too noisy.

I often find that my P and D terms are limited by how much sensor noise I'm
willing to let thru to my controller output.

------
RhysU
'Feedback systems: an introduction for scientists and engineers' is another
nice reference [1]. I used it and another Wescott text to put together
effectively a somewhat extended version of the OP's code [2].

[1]
[http://www.cds.caltech.edu/~murray/amwiki/index.php/Main_Pag...](http://www.cds.caltech.edu/~murray/amwiki/index.php/Main_Page)

[2] [https://github.com/RhysU/helm](https://github.com/RhysU/helm)

------
xchip
Or you can use my PID simulator, where you can play with the constants in a
simulated car to reach different speeds; it updates the output graph instantly.

[http://codinglab.blogspot.be/2016/04/online-pdi-trainer.html](http://codinglab.blogspot.be/2016/04/online-pdi-trainer.html)

~~~
Tepix
Great, thanks!

------
throwawaybbqed
My SW eng program did not even hint at these. Is there a good edX or coursera
course that goes through more extensive material than the pdf? I'm also
curious what tools are used by professionals? Matlab, simulink???

~~~
wustangdan
Not sure if this is allowed but I created a Udemy course on PID control where
I go through the theory and then in the assignments you write Python code to
create a PID controller for an elevator (with the goal of moving the elevator
to the desired height). A PID controller can be written in less than 10 lines
of code but understanding the different components is very important for
tuning it and getting it to work. If you're interested, just go to Udemy and
search PID Control, you'll find it.

If you'd like a discount code (not worth the full price if you're a SW eng) DM
me on Twitter as I'm not sure I'm allowed to post it here.

When I did my MSc I did all my PID controllers in MATLAB/Simulink but since
the actual code for a PID controller is very simple, it's easy to implement in
Python or C++.
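To make the "less than 10 lines" claim concrete, here is a bare-bones textbook PID in Python. It is a sketch, not the course's code: `make_pid` is a made-up name, and there is no anti-windup, derivative filtering, or output limiting.

```python
def make_pid(kp, ki, kd):
    """Return a step(setpoint, measurement, dt) function holding its own state."""
    integral, prev_error = 0.0, None

    def step(setpoint, measurement, dt):
        nonlocal integral, prev_error
        error = setpoint - measurement
        integral += error * dt
        # No derivative on the very first sample (nothing to difference against).
        derivative = 0.0 if prev_error is None else (error - prev_error) / dt
        prev_error = error
        return kp * error + ki * integral + kd * derivative

    return step
```

Usage would be something like `pid = make_pid(2.0, 1.0, 0.5)` followed by `pid(target_height, measured_height, dt)` once per control cycle.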

~~~
throwawaybbqed
Looks interesting .. as a SW Eng, I find Udemy very useful for these hw
courses. Thanks for the pointer!

------
roel_v
So here's an awfully uninformed question that will probably get me laughed at
by actual engineers (as in, not 'software engineers' like myself - and yes
Canadians please bite your tongues, I've heard your shpiel by now): when you
have a more powerful computer to work with, isn't it much easier to just
derive some regression parameters and call it a day?

My use case: I have this home-built growhouse that I use to start my seedlings
in at the end of winter (and as a fermentation chamber for wine). It's
controlled by a Raspberry Pi and as inputs I have 4 groups of LED grow lights,
a 30w heating element, a fan that draws warm air out of the box and thus
brings the temperature within the box towards ambient temperature, and DHT22
sensors that measure temperature and humidity inside and outside the box. The
goal is to keep temperature as close to a certain target temperature as
possible, keeping in mind that I want the lights to be on for a certain nr of
hours per day and the lights give off so much heat that they will pretty much
always make the temperature go over the target (so the fan then needs to bring
it down again). It's easy to get this working the naive way with a hysteresis
of say +/- 2 degrees (which is fine for my purposes), but the nerd inside me
sees it as a game to make that band smaller. So I've been reading up on PID
controllers and dear baby jesus I can't even work out how I'd get started. So
the pragmatic part of my brain goes 'just run it for a few weeks with
incremental inputs between 15 and 35 degrees (C, of course), measure the
responses, toss all measurements into R and derive some regression
parameters'.

Does anyone do this in industry? Probably not because everywhere I've asked,
everybody who's ever done real work on this uses PID controllers - but is
that because they're easier to build on microprocessors (i.e., need less
number crunching)?

~~~
Tepix
There will almost always be external influences that change the behaviour over
time. If you have a vehicle, going up the hill will require more energy to
keep the speed constant. For your growhouse, if it's hot outside of the box it
may require less heating, etc. Or the LEDs that you use get less efficient
over time. You really want to have a dynamic algorithm such as PID that can
deal with these changes in a near optimal fashion.
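A toy Python illustration of that point, under loud assumptions: the box is modelled as a first-order system whose steady-state temperature is ambient plus applied power, negative power stands in for the fan, and every constant is invented. A fixed setting calibrated at one ambient temperature drifts when the ambient changes, while a PI loop keeps correcting.

```python
def run_box(ambient, controller, target=25.0, dt=1.0, steps=5000):
    """Simulate a toy grow box; return the final temperature in degrees C."""
    temp, integral = ambient, 0.0
    for _ in range(steps):
        error = target - temp
        if controller == "pi":
            integral += error * dt
            power = 2.0 * error + 0.05 * integral   # assumed PI gains
        else:
            power = 10.0   # fixed setting, calibrated once for 15 C ambient
        # Assumed plant: 60 s time constant, losses proportional to (temp - ambient).
        temp += dt * ((ambient - temp) + power) / 60.0
    return temp


if __name__ == "__main__":
    for ambient in (15.0, 20.0):
        print(ambient, run_box(ambient, "fixed"), run_box(ambient, "pi"))
```

In this model the fixed setting only hits the target at the ambient temperature it was calibrated for, whereas the integral term absorbs the offset.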

------
darshanrai
There's also this[1] series of blog posts by Brett Beauregard that speaks of a
few improvements for real-world implementations.

[1]: [http://brettbeauregard.com/blog/2011/04/improving-the-beginn...](http://brettbeauregard.com/blog/2011/04/improving-the-beginners-pid-introduction/)

~~~
carlmr
That gets you through 90% of real world control problems. Filtering the
derivative input gives you another 5% for some noisy processes. The other 5%
require somebody who knows a bit more.

------
budadre75
What I have been searching for for years is not theories or tutorials about
PID, but a detailed explanation of how to tune a digital controller (instead of
a continuous one) in an easy-to-set-up environment, whether it be simulation or
cheap hardware. I know examples can include inverted pendulum (IP), double IP,
mountain cart, heater with thermometer feedback, line-follower, etc, but where
can I easily play around with these examples?

~~~
cellularmitosis
I found this video useful:
[https://youtu.be/uXnDwojRb1g](https://youtu.be/uXnDwojRb1g)

------
matthberg
Wish my high school FRC robotics team had this when we naively attempted to
control elevator levels with a PID.

~~~
dokem
I also remember tuning PID values on the old Jaguar motor controllers. It was
not a pleasant or particularly fruitful experience.

------
midjji
The MATLAB System Identification Toolbox is nice to get a casual feel for the
systems. Note that PID controllers are mostly used due to their computational
rather than control performance. They are mathematically equivalent for linear
systems, but that's like saying x=inv(A)b is just as good...

The powerful idea is the linear system model or the stochastic linear system
model. Understand them and the properties of recursive filters and pid will
come naturally.

------
Double_a_92
It's so sad. I studied this stuff in depth for almost 1 year in college... And
now after 2 years of working as a software dev I forgot almost everything.

~~~
rrmm
I wouldn't worry about it. In your career, you're going to forget a ton of
stuff (if you're not repeating the same project over and over again). That's
just how it works.

The win is that someday you will end up needing either the stuff you learned
or the math that you used, and you'll find out it's way easier (and more
illuminating) learning it the second time around.

------
clebio
509'd. Does anyone have the PDF for it? Google cache seems to just render the
document, not give me the PDF (I like to download and read offline, from time
to time).

------
wnoise
He stresses regularly spaced sampling. I don't see why high-precision
timestamps wouldn't work just as well, though that would complicate the
computation of the I and D signals.
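A sketch of what timestamp-based I and D terms could look like in Python, using the measured interval between samples instead of a fixed period. The class name and the choice of trapezoidal integration are assumptions. This only handles the arithmetic and says nothing about how jitter affects loop stability.

```python
class TimestampedPID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_t = None
        self.prev_error = 0.0

    def update(self, error, t):
        """`t` is the sample timestamp in seconds; spacing may be uneven."""
        derivative = 0.0
        if self.prev_t is not None and t > self.prev_t:
            dt = t - self.prev_t
            # Trapezoidal rule weights each interval by its actual length.
            self.integral += 0.5 * (error + self.prev_error) * dt
            derivative = (error - self.prev_error) / dt
        self.prev_t, self.prev_error = t, error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Note that the gains now carry explicit units (`ki` per second, `kd` in seconds).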

Actually, throughout he doesn't seem to acknowledge the different units
between the P, the I, and the D, instead implicitly treating the whole system
as discrete time for many purposes, in e.g. the recommendations for 100:1
ratios between P, and I; and D and P.

~~~
ebrewste
Regarding equal spacing: the sample rate determines a part of the system
latency, and therefore performance and stability limitations. Eliminating
the stability of sample rate eliminates the decades of work basic controls
analysis relies on. Or said another way, slow your sample rate and pay with
your phase margin. Not to say it’s impossible to deal with (small) changes in
sample rate, but in a text where he is already wildly simplifying things, this
sounds like the wrong place to innovate.

------
beagle3
Does anyone here have experience with Fliess iPID controllers?[0] The papers
indicate they are superior, Fliess is a generally respected academic, but the
math seems the same with slightly different terminology....

[0] [https://hal-polytechnique.archives-ouvertes.fr/inria-0037232...](https://hal-polytechnique.archives-ouvertes.fr/inria-00372325/document)

------
fixermark
This is a great short overview, and timely for the FIRST Robotics Competition
season.

I took a course on this in college, and while that course mentioned the notion
of sampling the D signal directly from feedback instead of from error, it
never laid out why. This guide's practical explanation of why one might want
to do that is great.
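A small Python comparison of the two options, with made-up helper names. When the setpoint steps, differentiating the error produces a one-sample spike (often called derivative kick), while differentiating the measurement does not. When the setpoint is constant, the two agree.

```python
def d_on_error(kd, setpoint, prev_setpoint, y, prev_y, dt):
    """Derivative term computed from the change in error."""
    return kd * ((setpoint - y) - (prev_setpoint - prev_y)) / dt


def d_on_measurement(kd, y, prev_y, dt):
    """Derivative term computed from the change in feedback only."""
    return -kd * (y - prev_y) / dt


if __name__ == "__main__":
    dt, kd = 0.01, 1.0
    # Setpoint jumps from 0 to 1 while the plant has not moved yet.
    print(d_on_error(kd, 1.0, 0.0, 0.0, 0.0, dt))   # large spike
    print(d_on_measurement(kd, 0.0, 0.0, dt))       # no spike
```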

------
davidblair
Mirror:
[https://web.archive.org/web/20170721121215/http://www.wescot...](https://web.archive.org/web/20170721121215/http://www.wescottdesign.com:80/articles/pid/pidWithoutAPhd.pdf)

------
xchip
My sousvide controller uses a PID to regulate the temperature super precisely.

[https://github.com/aguaviva/SousVide](https://github.com/aguaviva/SousVide)

------
the-dude
I have been doing a (integer) PID without a PhD over the last years as well :
meCoffee ( [https://mecoffee.nl](https://mecoffee.nl) ) for espresso machines.

~~~
jmiserez
I just bought an espresso machine, and it’s crazy that something as simple as
a PID controller justifies a several hundred dollar markup on some models. The
processing power needed is minuscule...!

~~~
the-dude
True. And if your machine does not have a pressostat, it makes a _huge_
difference.

~~~
jmiserez
Does the optional dimming of the boiler make a difference in your design? Mine
just fully turns on and off all the time, and I reckon dimming would probably
be easier on the power grid. 2000W on/off vs ~80W average (I measured) is
quite the difference.

~~~
lmilcin
I am using an SSR with zero crossing and it makes a big difference.

First, I am dithering the boiler power (the power is applied to the boiler on
only some AC cycles). I am basically deciding per cycle whether I want my
boiler turned on for the next 1/100th or 1/120th of a second. I am already
running my model about 50 times per second (not using PID) so that's ok.

Being able to apply fractional power to the boiler means it is quiet and there
are far fewer scale deposits.
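One possible way to pick which half-cycles to energize, sketched in Python. This is an assumption about how such dithering could be done (an accumulator that fires when it overflows), not a description of the setup above. The function name and integer duty representation are made up.

```python
def half_cycle_schedule(on_count, out_of, n_half_cycles):
    """Spread `on_count` energized half-cycles evenly over every `out_of`.

    Returns a list of booleans, one per mains half-cycle.
    """
    acc, schedule = 0, []
    for _ in range(n_half_cycles):
        acc += on_count
        if acc >= out_of:
            acc -= out_of
            schedule.append(True)    # fire the SSR for this half-cycle
        else:
            schedule.append(False)   # skip it
    return schedule


if __name__ == "__main__":
    # e.g. roughly 80 W average from a 2000 W element is a 4% duty.
    print(sum(half_cycle_schedule(4, 100, 100)))
```

Integer arithmetic keeps the long-run average exact, and the energized half-cycles end up evenly spaced rather than bunched together.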

Also due to the zero-crossing control I have much less electrical noise.
Turning the boiler fully-on blindly causes electrical noise which is not nice
for other devices (my Baratza Sette grinder turns on spontaneously when this
happens).

------
Tepix
Is a machine learning algorithm capable of getting better results than a PID?
Is the extra effort worth it?

~~~
dosshell
What do you mean by "better result"?

For a non-linear system, machine learning can perform much better and solve
more complex tasks. Like walking for a humanoid robot. A linearized PID
solution can not beat that.

For linear systems:

Depends on what you mean by better, I guess. My understanding is that you can
place some of the poles of a system with a PID pretty much where you want. So
for a simple linear system: no.

There also exist other controllers, such as LQR, that can be used in linear
systems to get a theoretically optimal controller by minimizing a cost on the
control outputs. (Yes, several outputs.)

However maybe a ML can be used to choose and tune the controllers better than
humans?

EDIT: Also, the discretization is not always perfect (backward or forward
Euler are common approximations). So maybe ML can help in that too?

~~~
tnecniv
> For a non-linear system, machine learning can perform much better and solve
> more complex tasks. Like walking for a humanoid robot. A linearized PID
> solution can not beat that.

Right but you are comparing a nonlinear regression model with a linear control
scheme, which isn't really fair. Plenty of nonlinear control strategies exist
and are used in practice for things like humanoid walking. Moreover, many
people prefer these controllers due to the theoretical guarantees they provide
over a learned control policy.

That said, ML does have a place in control theory, and you can find papers
going back to at least the 90s (I never looked earlier but they probably exist)
that combine the two, and not just in a reinforcement learning context.

> However maybe a ML can be used to choose and tune the controllers better
> than humans?

Probably not, assuming you are discussing linear systems. You would need to
come up with some cost function for your ML algorithm to determine the
performance of a particular control policy, but then why not just use that as
the objective function for your optimal control algorithm?

------
leafario2
What's the reason he insists on using double precision math? (If not using
fixed point)

~~~
davepage
Keeping things simple, if the input data SNR is, say, 16 bits, and the PID
coefficients are 16 bits, then there are 32 bits of significand at the output
of the control gain multipliers. For single precision float, this is truncated
to 24 bits (23 stored). So, the system may settle when the transducer input is near zero
(and the exponent can scale down to maintain sufficient precision), but then
fail to settle well at the ends of the transducer range (where the transducer
input is near +/-2^15 and the exponent in the single needs to stay high to
avoid saturation). Also, consider the numerical integrator is running an
infinite series of adds, so therein lies a trap for error growth, as well.

I have made this mistake before, and the sometimes strange control behavior
can be frustrating. Considering the small performance penalty for using
doubles on modern hardware, it is excellent advice.
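The integrator trap is easy to reproduce. The Python sketch below emulates single precision by round-tripping through `struct` (Python floats are doubles) and keeps adding a small increment to a large accumulator. The helper names and the specific numbers are chosen for illustration only.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def integrate(start: float, increment: float, n: int, single: bool) -> float:
    """Add `increment` to `start` n times, optionally rounding to float32 each step."""
    acc = start
    for _ in range(n):
        acc = acc + increment
        if single:
            acc = to_f32(acc)
    return acc


if __name__ == "__main__":
    # Near 2^15 the spacing between adjacent float32 values is 2^-8 (about 0.0039),
    # so an increment of 0.001 is rounded away on every single addition.
    print(integrate(32768.0, 0.001, 1000, single=True))
    print(integrate(32768.0, 0.001, 1000, single=False))
```

In single precision the accumulator never moves, which is the kind of "fails to settle at the ends of the range" behavior described above.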

~~~
leafario2
Thanks for the explanation! I have only hardware with no FPU (teensy 3.2) or
single precision FPU(teensy 3.5)... I suppose in that case it would be better
to write a fixed point implementation?

~~~
ebrewste
Fixed point often works better, as it has better resolution because it doesn't
need the exponent. The flip side is that the range of a fixed point is very
limited (because it doesn't have the exponent). Practically, that means you
need to do some analysis to make sure you don't overflow or saturate, and use the
bulk of your fixed point range to get the resolution benefits. If you do that,
fixed point is great.

If you don't have time or knowledge to do that, floating point works better
because you can be more ham-fisted with your scaling and still not overflow or
saturate.
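To illustrate the scaling trade-off, here is a hedged Python sketch of a PI step in Q16.16 fixed point with explicit saturation. Python integers stand in for the 32-bit registers and 64-bit intermediate product a microcontroller would use. The Q16.16 format and every helper name are assumptions, not a recommendation for any specific board.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS                      # 1.0 in Q16.16
INT32_MAX, INT32_MIN = 2**31 - 1, -2**31


def to_q(x: float) -> int:
    """Convert a float to Q16.16 (for setting up constants only)."""
    return int(round(x * ONE))


def sat32(x: int) -> int:
    """Clamp to the signed 32-bit range instead of wrapping around."""
    return max(INT32_MIN, min(INT32_MAX, x))


def q_mul(a: int, b: int) -> int:
    # Wide intermediate product, shifted back to Q16.16, then saturated.
    return sat32((a * b) >> FRAC_BITS)


def pi_step_q(error_q: int, integral_q: int, kp_q: int, ki_dt_q: int):
    """One PI step; all values are Q16.16. Returns (output_q, integral_q)."""
    integral_q = sat32(integral_q + q_mul(ki_dt_q, error_q))
    out_q = sat32(q_mul(kp_q, error_q) + integral_q)
    return out_q, integral_q
```

Q16.16 gives a resolution of 2^-16 but a range of only about +/-32768, which is why the range analysis matters.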

~~~
leafario2
I have the choice between using a slower processor (96 MHz) without an FPU or
a faster one(120MHz) with a 32 bit FPU. I might use the slower one but with a
fixed-point implementation. Thanks for your tips!

------
senatorobama
This reminds me of my Achilles heel during undergrad. Kalman filters.

No matter how much I read about the topic, I just could not grok it. The whole
"state update" algorithm and sheer amount of different variables threw me off.
Does anyone else feel the same way?

~~~
blt
The book "Probabilistic Robotics" explains the Kalman filter really well by
putting it in the larger context as a Bayesian filter. I understood it only
after reading that book.

~~~
carlmr
I haven't read that book, but seeing it as a Bayesian filter made it click for
me. All the control theory guys who just handwaved with "it's basically a
stochastic Luenberger observer" didn't capture the essence of the filter for
me. With a strong statistics background it was much better to go from Bayesian
first principles to the Kalman filter.
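For anyone who wants to see the Bayesian view in code, a one-dimensional Kalman filter reduces to two small Gaussian operations. This is a textbook-style Python sketch with made-up function names, assuming scalar state, additive motion, and Gaussian noise. It is not taken from the book mentioned above.

```python
def predict(mean, var, motion, motion_var):
    """Prior for the next step: shift the belief and grow its uncertainty."""
    return mean + motion, var + motion_var


def update(mean, var, z, z_var):
    """Posterior after measuring z: product of two Gaussians (Bayes' rule)."""
    k = var / (var + z_var)               # Kalman gain: how much to trust z
    return mean + k * (z - mean), (1.0 - k) * var


if __name__ == "__main__":
    belief = (0.0, 1.0)                   # (mean, variance)
    belief = update(*belief, z=2.0, z_var=1.0)
    belief = predict(*belief, motion=1.0, motion_var=0.25)
    print(belief)
```

The "state update" equations in the matrix form are the same two steps with variances replaced by covariance matrices.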

