* Multiple contingencies occurred simultaneously (loss of generation from two major generators and loss of distributed generation, totalling 1,400 MW), resulting in a drop in system frequency to 49.1 Hz
* Standby generation (frequency response reserve) was deployed, totalling 1,000 MW (sized to cover the largest single generation contingency), and began to arrest the system frequency decline
* Just as system frequency began to recover, a third contingency occurred, resulting in the loss of a further 210 MW of generation. This caused system frequency to fall again, to 48.8 Hz
* Load shedding kicked in as designed and dropped 5% of load to stabilize the system
The largest loss of generation was from the Hornsea offshore wind farm. The wind farm should have ridden through the system disturbance, but instead its control and protection systems rapidly curtailed active power generation in response to an undamped oscillation in its voltage regulator's response through the disturbance.
Basically, the internal voltage of the Hornsea wind farm collector system dropped due to the voltage regulator oscillations (from 35 kV nominal to 20 kV), while active power generation remained the same. Power = current * voltage, so an overcurrent condition occurred and protection systems operated to prevent overload of the wind turbine generators.
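The arithmetic above can be sanity-checked directly. A minimal sketch, where the ~800 MW infeed and the three-phase relation P = √3·V·I are assumptions for illustration, not figures from the report:

```python
# Current needed to deliver constant active power at a sagging voltage.
# Three-phase relation assumed: P = sqrt(3) * V_line * I_line.
def current_ka(power_mw: float, line_kv: float) -> float:
    """Line current in kA for a given three-phase power and line voltage."""
    return power_mw / (3 ** 0.5 * line_kv)

p_mw = 800.0                          # assumed wind farm output, held constant
i_nominal = current_ka(p_mw, 35.0)    # collector at nominal 35 kV
i_sagged = current_ka(p_mw, 20.0)     # collector sagged to 20 kV

print(f"nominal: {i_nominal:.1f} kA, sagged: {i_sagged:.1f} kA "
      f"({i_sagged / i_nominal:.0%} of nominal current)")
```

Constant power at 20/35 of the voltage means 35/20 = 175% of rated current, which is the kind of overload the turbine protection acts on.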
Subsynchronous oscillations (SSO), i.e. oscillations at below power frequency (50 Hz), are a known issue in power system controls that can lead to unstable or unexpected consequences during system disturbances. The reduction in system inertia caused by the replacement of large synchronous machines with asynchronous generators as wind and solar replace conventional generators exacerbates the possibility of problematic SSO because there is less damping.
Nowadays, in North America, very specific modelling is done at the design stage to identify the possibility of such behaviour and to ensure that, if present, it is adequately damped. Some system operators, such as ERCOT (Texas), require this for new wind projects. I imagine that this major occurrence will lead to revisions of modelling and grid code testing standards in the UK to protect against future incidents.
All in all, kudos to Ofgem, National Grid and all other participants for producing a thorough, public technical report in just about one month.
Turbine-generator systems have nice, simple behaviour in response to a frequency drop: they act to maintain the frequency by transferring more energy from the shaft rotation to the generator. In the long run this slows the turbine down or triggers a throttle response, but over the few-second period we're talking about, the shaft speed is basically constant due to its own inertia.
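That inertial response can be sketched with the textbook swing equation for an aggregate grid, df/dt = -ΔP·f0/(2H). The inertia constant H and the 5% deficit below are assumed numbers, not values from the report:

```python
# Forward-Euler integration of the aggregate swing equation.
f0 = 50.0    # nominal frequency, Hz
H = 4.0      # assumed aggregate inertia constant, seconds
dP = 0.05    # assumed 5% generation deficit, per unit of system load
dt = 0.01    # time step, s

f = f0
for _ in range(int(1.0 / dt)):       # first second after the loss
    f -= dP * f0 / (2 * H) * dt      # stored rotational energy supplies the gap
print(f"frequency after 1 s: {f:.2f} Hz")
```

With these numbers the frequency falls at roughly 0.31 Hz/s; a larger H flattens the slope, which is exactly what is lost as synchronous machines retire.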
The wind farm "saw" the rapid fluctuations in connection voltage, tried to compensate, and instead went into oscillation. This appears to have been a software bug:
> "During the incident, the turbine controllers reacted incorrectly due to an insufficiently damped electrical resonance in the subsynchronous frequency range, so that the local Hornsea voltage dropped and the turbines shut down. Orsted have since updated the control system software for the wind turbines and have observed that the behaviour of the turbines now demonstrates a stable control system that will withstand any future events in line with Grid Code and CUSC requirements"
(Oscillation damping is "control theory 101", but in a complex system like this it's not so easy!)
Good news is that, while more renewables do potentially have this kind of vulnerability, battery systems are the perfect counter. Some are already being deployed for "fast frequency response". Being a DC-AC system, they can deploy power with any frequency and phase angle required to compensate for problems.
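A minimal sketch of what such a fast-frequency-response battery controller does (a droop curve with a deadband; all parameters here are invented for illustration):

```python
def ffr_power_mw(freq_hz: float, f_nom: float = 50.0, deadband_hz: float = 0.015,
                 droop_mw_per_hz: float = 500.0, rating_mw: float = 200.0) -> float:
    """Active power to inject (+) or absorb (-), proportional to frequency error."""
    df = f_nom - freq_hz
    if abs(df) <= deadband_hz:
        return 0.0                                   # ignore normal jitter
    df -= deadband_hz if df > 0 else -deadband_hz    # measure from band edge
    return max(-rating_mw, min(rating_mw, droop_mw_per_hz * df))

for f in (50.00, 49.90, 48.80):
    print(f"{f:.2f} Hz -> {ffr_power_mw(f):+.1f} MW")
```

Because the inverter synthesises its own waveform, this response can be delivered within a cycle or two, far faster than a governor opening a steam valve.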
This is a nice understatement.
My first gig (the summer after freshman year), I worked with D. Van Ness, who had an inquiry from the Bonneville Power Administration to determine why their frequency was oscillating (yes, the frequency). This oscillation would rapidly get worse until something tripped and the whole network in the Northwest would go down.
He modeled the system with a state vector and interconnect matrix. The matrix was 500x500, and the path to understanding it was to find the eigenvectors and eigenvalues of this system. If there are any poles to the right of the imaginary axis, you have a growing oscillation. Over time, they changed enough to get it stable.
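The same stability test is a few lines today. A toy 2x2 analogue of that 500x500 problem (the matrix entries are invented; the criterion, every eigenvalue strictly in the left half-plane, is the real one):

```python
import numpy as np

# State-space model x' = A x; stable iff all eigenvalues have negative real part.
A = np.array([[ 0.0,  1.0],
              [-4.0, -0.1]])   # a lightly damped oscillator, made-up numbers

eigenvalues = np.linalg.eigvals(A)
rightmost = max(lam.real for lam in eigenvalues)
print("eigenvalues:", eigenvalues)
print("stable" if rightmost < 0 else "UNSTABLE: pole(s) in the right half-plane")
```

Here the eigenvalues come out as a complex pair with a small negative real part: stable, but barely damped, which is exactly the "oscillating frequency" signature described above.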
And you make some good points about the synchronization available if everything is a classic generator, and these other power sources are not.
And this was many years ago, so the power systems of today are likely much harder to model.
You put this nicely:
> They act to maintain the frequency by transferring more energy from the shaft rotation to the generator
Another way to think of this is that in a system with more than one generator, a phase difference anywhere in the hookup causes power to flow in direct proportion to the difference in phase angle. In other words, the slow generator becomes a motor.
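That relation is the classic power-angle curve: for two buses joined by a mostly inductive line, P = (V1·V2/X)·sin(δ), which is roughly linear for small angles. A sketch with made-up per-unit values:

```python
import math

V1 = V2 = 1.0   # bus voltages, per unit (assumed)
X = 0.5         # line reactance, per unit (assumed)

for delta_deg in (-5, 0, 5, 10):
    P = V1 * V2 / X * math.sin(math.radians(delta_deg))
    print(f"delta = {delta_deg:+3d} deg -> P = {P:+.3f} pu")
```

A negative angle gives negative power: the machine that has slipped behind absorbs power instead of delivering it, i.e. the slow generator becomes a motor, as described above.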
In general I don't see it as a renewable vs. conventional issue. SSO/SSR/SSCI have been around since the 1960s, when PSS started to be deployed in synchronous generator excitation control systems. Rather, it reflects the greater complexity of modelling high-speed digital controls versus physical, inertial responses that are described very effectively by well-known equations. As we layer on more and more controls, we have to model not only what is going on at power frequency (50 or 60 Hz) but also at harmonic and sub-synchronous frequencies. Renewable generators just happen to depend much more heavily on complex control systems: for power conversion, for mimicking synchronous generator response characteristics, and for marrying all the components of a large renewable plant together.
At the same time, we have far more powerful tools for power system simulation today that can effectively mitigate this risk, as long as engineers realize the risk is there.
A good reference explaining SSO as it applies to conventional generators can be found here: http://www.cigre.org.br/archives/pptcigre/07_subsynchronous_...
* Power system stabilizers (PSS) are a part of synchronous generator excitation control that improves dynamic stability by damping generator oscillations against the grid. However, PSS systems can actually cause additional, long-distance oscillations with other PSS systems in the frequency range of 0.1 to 1 Hz. See: https://www.wecc.org/Reliability/Power%20System%20Stabilizer... and http://www.meppi.com/Products/GeneratorExcitationProducts/St...
This is all made more complex by the fact that many of the components of these systems have really non-linear behavior. Like a dam spill that hits a hard boundary.
So I don't see that wind (in the sense of weather patterns) was a contributing factor, but that complexity of regulation probably did contribute.
But then again, as the Little Barford CCGT station showed, it's still perfectly possible to have unexpected failure modes on more conventional generating equipment. (Little Barford entered service in 1996, and is presumably fairly typical of the kind of CCGT stations that were built in large numbers in the UK through the 90s.)
So would a large flywheel, or something similar, be of value?
I wasn't sure, for the intended use case (compensating for thousands of MW of power loss), whether a battery would necessarily be the best thing.
Page 27. "The effects were exacerbated as the fleet was undergoing a software change which meant the train drivers could not recover trains which were operating on the new software."
Appendix F – Govia Thameslink Railway (GTR) technical report, Page 47-50. http://www.ofgem.gov.uk/system/files/docs/2019/09/eso_techni...
Appendix F, page 49: "Therefore, the affected Class 700 and 717 sets did not react according to their design intent in these circumstances."
Great technical report, would have loved to have had more information on Victoria line. Lessons can be learned from this report.
There also seem to be multiple faults at once that they don't know anything about. Turbine trip? No clue, could be bad sensors, could have been some actual physical problem in the turbine. Overpressure in the condenser? No clue, could be anything. Second generator tripped? Dunno boss.
It implies they don't have enough sensor coverage and are hoping to literally eyeball something wrong when they next open it up. It also implies they can't shut down their own plant to diagnose apparent faults in anything like a reasonable timeframe. Not good.
Also worth noting - Newcastle Airport were totally fine on their UPS but demanded to be considered a priority customer anyway, and that request was granted? Why? They clearly don't need to be, they have a working UPS!
Honestly I'd be sending this report back for more work if I were the boss guy receiving it. It's not good. Filled with repetition, bad grammar, missing information (why did London Underground shut down, what was this 'internal traction issue') and most seriously it leaves a gaping hole around Little Barford.
The impression I get from the data centre outages reported here on HN is that backup generators are about the least reliable thing in IT.
And a 1% chance of losing power to air traffic control and runway lights is worse than a 100% chance of 1000 homes having their dinner spoiled by the cooker turning off.
Not just in IT, as the Badim Hospital fire (with 11 dead and IIRC over 70 wounded) two days ago shows.
They say it required a software update to fix it which was applied the next day - probably this was just a change in the gains in the voltage controller rather than an update to the actual program or firmware.
Somebody did a bad job of commissioning the voltage regulators on the wind turbines, and that is what caused a normal transmission line reclose to escalate into such a large loss of generation.
I am still curious as to why the DAR (delayed auto-reclose) time on the transmission line is 20 s. I would have thought it would be more like 1-2 s tops.
Also I'm not sure I agree that the SSO was there all along. The system configuration appears to be several STATCOMs at the HV interconnection substation plus the VAR capabilities of the individual wind turbines. There may be an interaction between these control systems that leads to SSO under certain conditions only while being effectively damped at other times.
As for the reclose time, it may have to do with circuit breaker duty cycles. We don't know what equipment they're using but if it's dated stuff, it's conceivable that it requires that level of delay before it's rated for another interrupting operation.
It should be a device that can be brought in on a truck, hooked up to the plant during commissioning, and the artificial load should be able to replay any conditions during grid incidents in the past, to check all the control systems work as designed.
A 1-gigawatt artificial load that can work for 1 second could consist of 10 cubic meters of water (on a truck) and a few miles of nichrome wire... You'll also need some beefy switching silicon to simulate something more than a simple resistive load, but those exist already in any DC undersea power project.
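The water figure checks out on the back of an envelope, assuming the full 1 GJ goes into heating the water:

```python
# 1 GW for 1 s = 1 GJ dissipated into a resistive water load.
energy_j = 1e9
water_m3 = 10.0
mass_kg = water_m3 * 1000.0        # ~1000 kg per cubic metre
c_water = 4186.0                   # specific heat of water, J/(kg*K)

delta_t = energy_j / (mass_kg * c_water)
print(f"temperature rise: {delta_t:.1f} K")
```

Roughly a 24 K rise, so 10 m³ of water comfortably absorbs a one-second, 1 GW pulse without getting anywhere near boiling.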
When I’m setting the gains for any control loop I tend to prefer choosing lower gains that still meet the performance requirements rather than having high gains closer to the edge of stability. I would not leave a system behind that had 13 oscillations after a step.
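A toy illustration of that tradeoff: the step response of a normalized second-order loop, counting the turning points that still lie outside a 2% settling band. The damping ratios and band are invented numbers, not anything from the report:

```python
def count_peaks(zeta: float, band: float = 0.02,
                dt: float = 0.001, t_end: float = 60.0) -> int:
    """Count overshoot/undershoot peaks outside the band after a unit step."""
    x, v, peaks, prev_v = 0.0, 0.0, 0, 0.0
    for _ in range(int(t_end / dt)):
        a = 1.0 - 2.0 * zeta * v - x       # x'' + 2*zeta*x' + x = unit step
        v += a * dt
        x += v * dt                        # semi-implicit Euler integration
        if prev_v * v < 0 and abs(x - 1.0) > band:
            peaks += 1                     # turning point still outside the band
        prev_v = v
    return peaks

for zeta in (0.05, 0.3, 0.7):
    print(f"zeta = {zeta}: {count_peaks(zeta)} visible oscillation peaks")
```

Low damping buys a fast initial response but leaves a long tail of visible swings; 13 oscillations after a step puts a loop firmly at the underdamped end of that range.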
If this kind of oscillation was present under all conditions, it would be an oversight to not have caught it in the modelling stage. The level of modelling we have to do in some North American regions for a 50 MW wind farm would catch that kind of behaviour, let alone an 800 MW unit.
From section 5.2.1:
"The train manufacturer, Siemens, are developing a patch which will allow the drivers to recover the trains themselves without the need for a reboot or technician to attend site."
It's psych-profiled, companies which employ drivers will be looking for "compliance" (a psychological tendency to obey rules even if you don't understand why) so that the driver obeys all the safety rules.
It's also a fairly complicated machine, not as complicated as a jet liner but far more complicated to operate than a bus, so that reduces your pool of candidates further, in most cases they'll be looking for someone with some mechanical aptitude to understand how it works.
They need communication skills, the driver needs to work with their signallers, and potentially also company dispatch, and on trains without separate customer service personnel they need to talk directly to passengers.
For example yesterday I was on a train which was delayed by trespassers. The driver will have needed to use "proceed with caution" rules, where they drive the train slowly enough that they can always stop it within the distance they can clearly see, obeying any signals, and then call their signaller back each time a signal cancels that authority, to get a new authority overriding each signal. Then, clear of the problem but much delayed, they needed to handle the fact that their dispatch turned their train into an Express to get it back where it should be, so they need to make announcements to passengers about where passengers should disembark to get a different train that's still going to their destination.
Mainline train drivers make similar money to me (or at least similar to what I made five years ago) but I can't say I feel like they don't earn it. Like me their job is pretty easy when things go right, but not so much when things go wrong. Lots of people couldn't do it, and more wouldn't.
I've been on trains when they were obviously rebooting due to some sort of fault. It takes forever. What is the train doing during this time, exactly? Does every sub-computer on board boot serially or something?
Every actuator will be energised, de-energised, moved end to end to test limit switches, etc.
It's mostly done serially because some things would be disrupted by other things, and figuring out a dependency tree is tricky.
Quite a lot of stuff is auto-configured in bootup. For example, it might spend 10 seconds trying to ping a debug console to see if it should enter debug mode. Or 30 seconds with IP addresses configured to the 'lab' setup before switching over to the production networking config when the lab settings won't let it connect to anything.
Vector Shift Protection (triggered by lightning, led to loss of 150 MW):
As far as I can see, this protection shuts down generation when part of the grid might be disconnected from the rest. Shutting down when islanding hasn't actually occurred is wrong, and destabilises the grid. Perhaps we should be measuring islanding another way? What about applying Gold-coded frequency modulation to the actual system frequency? A Gold code of length one million could be injected at just a few points on the national grid, at a power of just a few kilowatts, and be measurable from anywhere. When islanding occurs, the signal disappears, and embedded generation can switch off.
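A rough sketch of the detection side of that idea, with an ordinary pseudo-random ±1 sequence standing in for a real Gold code (the lengths, noise level, and threshold are all invented):

```python
import numpy as np

rng = np.random.default_rng(0)
code = rng.choice([-1.0, 1.0], size=100_000)   # stand-in spreading code
noise_sigma = 10.0                             # noise power 20 dB above the chip power
noise = rng.normal(0.0, noise_sigma, size=code.size)

connected = noise + code     # code present: still tied to the main grid
islanded = noise             # code absent: we have been cut off

def code_present(signal: np.ndarray, replica: np.ndarray) -> bool:
    # Correlating N chips gives ~sqrt(N) processing gain over the noise;
    # the normalized correlator noise has std = noise_sigma.
    score = signal @ replica / np.sqrt(replica.size)
    return score > 5.0 * noise_sigma    # 5-sigma detection threshold

print("connected:", code_present(connected, code))
print("islanded: ", code_present(islanded, code))
```

Even with the code buried 20 dB below the noise, correlation over 100k chips separates "present" from "absent" by many standard deviations. A real scheme would also need code synchronization, which Gold codes are designed to make robust.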
Rate of change of frequency (RoCoF) protection (led to the loss of 350 MW). What's the purpose of this protection at all? If frequency is changing in a downward direction, the faster it's falling, the more important it is not to disconnect supply.
High positive rate of change of frequency might be a reason to disconnect generation to prevent oscillation (effectively acting as the "D" term in a pid loop), but did this occur?
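For reference, a loss-of-mains RoCoF relay is about this simple, which is part of the problem: it cannot tell a genuine island from a system-wide frequency excursion. The 0.125 Hz/s figure was a widely used legacy GB setting; the sampling details below are invented:

```python
def rocof_trips(freq_samples_hz, sample_dt_s: float = 0.02,
                setting_hz_per_s: float = 0.125, window_s: float = 0.5) -> bool:
    """Trip if |df/dt|, averaged over the window, exceeds the setting."""
    n = int(window_s / sample_dt_s)
    if len(freq_samples_hz) < n + 1:
        return False                      # not enough history yet
    df = freq_samples_hz[-1] - freq_samples_hz[-1 - n]
    return abs(df / (n * sample_dt_s)) > setting_hz_per_s

falling = [50.0 - 0.30 * k * 0.02 for k in range(50)]   # system-wide event, 0.3 Hz/s
drift = [50.0 - 0.01 * k * 0.02 for k in range(50)]     # benign slow drift
print(rocof_trips(falling), rocof_trips(drift))
```

The relay trips on the falling trace even though, as argued above, disconnecting embedded generation at that moment only deepens the deficit; GB loss-of-mains settings have since been revisited for exactly this reason.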
Many important plants and transmission lines are connected by a utility's own fiber, which can be used to transmit the actual state of the system instead of trying to infer it from the power waveforms. This is the best solution but obviously expensive.
Uncontrolled, decentralized embedded generation is not meant to ever energize a dead line. If it has a solid-state power electronics interface to inject power (a fancy inverter), it is probably operating in a mode where it follows the waveform on the grid to make sure it stays in phase. If the grid waveform is poor quality, full of harmonics due to faults, or phase-shifting due to major loads or generation disappearing and instantly changing power flows, the inverter can't stay in phase. It is probably a delicate balancing act to be able to follow the grid frequency but also affect it by injecting active and reactive power. I'm not an inverter guy, so this is mostly speculation from reading data sheets and manuals for small grid-tie battery systems.
Not sure about rocof tripping generation on falling frequency. There are situations in which injecting power in to an island can result in overvoltages that would damage all of the equipment on the island, so it is better to avoid it if it looks like an island might be forming. Just a guess based on experience with an embedded steam turbine.
There could have been a very short increase in frequency for the 80ms or 4-5 cycles when the single phase to ground fault occurred as faults cause machines to accelerate since it is the same as removing the load and replacing with a short circuit. Otherwise the only increase in frequency was when it started to recover by the system operator calling for more generation.
"…not be expected to trip off or de-load in response to a lightning strike. This therefore appears to represent an extremely rare and unexpected event."
Looking at the timeline, both of those events are logged within 1 second of the strike. To me, with even a little bit of experience with systems having complex interacting components, it seems vastly more likely that there is some unknown interaction rather than pure chance. I would imagine the prior probability of either of those two going offline is very low, so the probability of both independently going offline within one second of a potential causal event seems vanishingly small.
On the other hand, by releasing the report publicly, the internal findings of the involved power companies can be scrutinized by academics, independent engineers, and members of the public.