The final report from ENTSO-E is now available following the Iberian blackout on 28 April 2025. The report is not an easy read, and its presentation style makes it difficult to identify the key takeaways amongst the sea of text and images.
As a power systems engineer and someone completing a PhD in power systems stability, I have tried to pick through the weeds and give a high-level summary below. This article is written in two parts: the first is a high-level summary written by me; the second is a structured AI-generated response, with me leading Claude Opus 4.6 through the discussion and analysis.
Part 1: Human Only (My View)
With any report of this size, the challenge is deciding what to focus on. For brevity and technical interest, I have concentrated on the key issues and highlighted the points I found most surprising.
- Two distinct oscillatory events occurred in the 30 minutes before the blackout. The first at 0.63 Hz was a forced converter-driven oscillation from a specific plant, not a classical SSO or natural mode. The second at 0.2 Hz was the known East-Centre-West inter-area mode, tipped into instability partly due to the earlier event.
- The control room spotted it quickly and activated a protocol to contain the disturbance. But the protocol was designed for 0.1–0.35 Hz inter-area modes; there was no procedure for locating and disconnecting the source of a forced oscillation.
- The control room was manually switching lines and shunt reactors to manage the network, and whilst doing so was fighting the competing requirements of system strength (impedance) versus power flow and voltage control.
- The source was traced to a large legacy PV plant in Badajoz, confirmed by multiple independent methods. The plant was pre-RfG and was operating below its available output due to its market schedule.
- Interestingly, eigenvalue analysis did not detect the mode. There are a few reasons for this: eigenvalue analysis is not perfect and cannot capture detailed modes from IBR plant controllers unless the IBR models are fully open (many are simplistic or encrypted), and it fundamentally cannot detect forced injections.
- The HVDC Power Oscillation Damping (POD) controller could have provided some damping for the inter-area mode, but it was disabled during the first oscillation due to output saturation and never re-enabled.
- PSS coverage was very weak. Surprisingly, the use of PSS controllers on traditional synchronous plant in Spain is not standard, and this allowed the oscillation to continue without being damped.
- System inertia was low, but not unreasonably so, and was identified as not being the root cause. The same was true of the system fault level.
- Local grid conditions around the source plant were mixed: the fault level was OK (ish), but synchronous generation was sparse and the IBR concentration was massive. These are exactly the conditions where converter-driven instability becomes likely.
- What caused the system collapse was a loss of voltage control and a cascading voltage rise. There were insufficient voltage control mechanisms in the network to prevent it from spiralling out of control. To be clear, there were potentially enough physical resources in the system to manage it, but many were left in fixed power factor control mode.
- Low Frequency Demand Disconnection (LFDD) operated as designed, but it was intended to protect against falling frequency, whereas the problem was voltage control and sites tripping on over-voltage.
Part 2: AI With Human Prompts
AI is becoming increasingly important in power systems analysis. At a personal and professional level we are actively exploring the use of AI and agentic systems in workflows and analysis. This has led to many interesting observations and improvements, but equally to many hours lost diving down rabbit holes, varying levels of hallucination, and 'missing the wood for the trees'.
So, to see how good an analysis could be, I compacted the PDF and ran it through Claude Opus 4.6 with a structured series of prompts. What follows is that question-and-answer exchange with Claude.
1. What triggered the initial SSO / 0.63 Hz oscillation?
The report is quite definitive on this: the 0.63 Hz oscillation was not a subsynchronous oscillation in the classical torsional/electrical resonance sense — it was a forced/converter-driven oscillation. The Expert Panel classified it firmly within the “converter-driven stability” family.
The source was traced, through multiple independent methods (Dissipative Energy Flow analysis, harmonic presence analysis, SCADA-based source location, and PCA on PMU data), to a single large PV plant in the Badajoz province in southwest Spain. The convergence of all methods on the same node is compelling. The plant was classified as an “existing generator” (pre-RfG), meaning it was operating under legacy grid code requirements with older inverter technology.
What’s interesting is the mechanism: the inverter was not operating at maximum power during 12:03–12:15 (available RES energy was higher but was being curtailed), and when it increased its P setpoint at ~12:16, the oscillatory behavior diminished. This is consistent with well-documented inverter impedance interaction phenomena — old inverters running at intermediate power levels are known to exhibit instability when the system reactance seen from the terminals hits an unfavorable operating point. The report references this explicitly, citing literature on IBR instability under high grid impedance conditions.
The eigenvalue analysis found no poorly damped natural mode anywhere near 0.63 Hz, and the historical modal analysis from January–April 2025 showed this frequency appearing only sporadically with no consistent “dominance.” When the simulated forcing signal was removed, the oscillation died almost immediately — confirming its non-natural origin. So this wasn’t SSO in the IEEE sense — it was a converter-grid interaction that forced a periodic disturbance onto the network.
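To make that distinction concrete, here is a minimal sketch, entirely my own construction with illustrative numbers (the 0.63 Hz forcing frequency is the only value taken from the report): a lightly damped second-order mode driven by an external forcing term. While the forcing is present the oscillation persists; the moment it is removed, the response collapses at the mode's own damping rate, which is the signature used to confirm the non-natural origin.

```python
# Toy forced-oscillation demo: a lightly damped mode
# x'' + 2*zeta*w_n*x' + w_n^2*x = u, driven at 0.63 Hz.
# All parameters except the forcing frequency are illustrative.
import numpy as np

f_n, zeta = 0.70, 0.05              # assumed natural frequency (Hz) and damping ratio
f_force = 0.63                      # forcing frequency (from the report)
w_n = 2 * np.pi * f_n

dt, t_end = 0.01, 300.0
t = np.arange(0.0, t_end, dt)
x, v = 0.0, 0.0
response = np.empty_like(t)

for i, ti in enumerate(t):
    u = np.sin(2 * np.pi * f_force * ti) if ti < 150.0 else 0.0  # forcing removed at 150 s
    a = u - 2 * zeta * w_n * v - w_n ** 2 * x
    v += a * dt                     # semi-implicit Euler step
    x += v * dt
    response[i] = x

i_off = int(150.0 / dt)
print(f"peak amplitude just before the forcing stops: {np.max(np.abs(response[i_off - 300:i_off])):.3f}")
print(f"peak amplitude 50 s after the forcing stops:  {np.max(np.abs(response[i_off + 5000:i_off + 5300])):.3f}")
```

A natural instability would keep growing after the injection is removed; here the residual decays as exp(-ζω_n·t), mirroring the report's finding that the oscillation died almost immediately once the simulated forcing signal was taken out.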
2. What triggered the voltage rise and runaway — and was it linked to the SSO?
The voltage rise and the oscillations are linked, but not in the way you might initially expect. The causal chain is more subtle and involves several compounding factors:
The structural cause of high voltages was the combination of massive solar PV output in the south/southwest (~16 GW of solar across Spain at the time), long electrical distances to load centres, lines opened for voltage control (which increased system impedance), and critically — the reactive power control mode of RES generators. Spanish RES was operating in fixed inductive power factor mode under PO 7.4, meaning reactive power absorption was proportional to active power output. When a generator ramped down its P, its Q absorption reduced proportionally. The Expert Panel identified three generators at TS1-Granada that ramped down by 78.8 MW in 24 seconds, reducing their reactive absorption by 33.3 MVAr — and the local voltage jumped at a rate of 22.6 kV/min.
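Those numbers are internally consistent, and a quick sanity check (my own arithmetic, not the report's) exposes how mechanical the coupling is: in fixed power factor mode the absorbed reactive power is simply Q = P·tan(φ), so the quoted ΔP and ΔQ imply the setpoint in force.

```python
# Sanity check of the fixed power factor coupling at TS1-Granada.
# The MW/MVAr deltas are from the report; the rest is my arithmetic.
import math

dP = 78.8                 # MW ramped down in 24 seconds
dQ = 33.3                 # MVAr of reactive absorption lost as a result

tan_phi = dQ / dP         # implied tan(phi) of the fixed setpoint
pf = math.cos(math.atan(tan_phi))
print(f"implied tan(phi) = {tan_phi:.3f}, power factor ~ {pf:.3f} inductive")
# ~0.92 inductive. In voltage control mode the plants would instead have
# *increased* absorption as local voltage rose, rather than shedding it.
```

The design flaw is visible in one line: Q tracks P, not voltage, so the reactive response to a generation ramp-down is exactly backwards for voltage containment.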
The oscillations contributed through two mechanisms. First, each oscillation episode caused small embedded PV disconnections at distribution level due to overvoltage protection trips on the swings — the 5-minute aggregated data from two major inverter manufacturers showed clear correlation between oscillation episodes and overvoltage trip rates, particularly in Andalusia. Second, the oscillations depleted operator bandwidth and introduced a period where shunt reactors were being disconnected (to arrest low voltage during oscillation troughs) — but then weren’t reconnected quickly enough when voltages swung back.
The runaway itself was a classic positive feedback cascade: generation trips → loss of reactive absorption → voltage rises → more overvoltage trips → more reactive absorption lost → voltage rises further. Many of the overvoltage protection settings on the generation were non-compliant with Order TED 749/2020. The Expert Panel found multiple plants with thresholds set at 1.089 pu or 1.09 pu with 1–1.5 second delays, when they should have been withstanding 1.10 pu (440 kV) for 60 minutes. One 550 MW PV plant at TS1-Seville tripped at 435.8 kV — well below the 440 kV threshold. The relay hysteresis problem is also flagged: once overvoltage protection picks up, it requires the voltage to fall to ~95% of the pickup threshold to reset, so brief exceedances can lock in a trip even after the voltage recovers.
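The hysteresis behaviour is worth a small illustration. The sketch below is my own toy model, not the actual relay logic; the 1.089 pu pickup and the ~95% reset ratio are the values cited above, while the sampling and test trace are invented.

```python
# Toy definite-time overvoltage element with reset hysteresis.
# Pickup/reset values as cited in the report; everything else assumed.
PICKUP = 1.089            # pu pickup threshold (non-compliant setting cited)
RESET = 0.95 * PICKUP     # element only resets below ~1.03 pu
DELAY = 1.0               # s definite-time delay before tripping

def relay_trips(voltage_trace, dt=0.02):
    """Return True if the element stays picked up long enough to trip."""
    picked_up, timer = False, 0.0
    for v in voltage_trace:
        if not picked_up and v >= PICKUP:
            picked_up = True
        elif picked_up and v <= RESET:     # hysteresis: reset is well below pickup
            picked_up, timer = False, 0.0
        if picked_up:
            timer += dt
            if timer >= DELAY:
                return True
    return False

# A brief swing above pickup, then voltage hovering just below it:
trace = [1.06] * 10 + [1.095] * 20 + [1.07] * 100   # 0.02 s steps
print(relay_trips(trace))   # True: 1.07 pu never reaches the ~1.03 pu reset
```

Even though the voltage "recovered" to 1.07 pu, the element never resets, the timer keeps running, and the plant trips: exactly the lock-in mechanism the report flags.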
The event on 22 April was a warning shot: similar overvoltage conditions occurred, reaching 452 kV at PMU ARN4ALD, but the connection of conventional plant and the shunt reactor response recovered the situation.
3. Why were local PSSs and HVDC oscillation damping unable to intercept?
This is where the picture gets uncomfortable. The PSS coverage was appallingly thin:
In Spain: No PSS on nuclear, no PSS on hydro, no PSS on coal. Only one pumped storage plant (five units) and six CCGT plants (eight units) had PSSs. Against a system with ~29 GW of demand, the PSS-equipped synchronous generation was a fraction of the dispatched fleet. The nuclear fleet (2,740/3,030 MW in the centre and 660/1,000 MW in the east) had zero stabilisation contribution — and the eigenvalue sensitivity studies showed that fitting PSSs to NPPs alone would have added ~1.27% to inter-area mode damping. That’s significant when your base damping is 0.28%.
For the 0.63 Hz forced oscillation specifically, PSSs were never going to be the primary defence. The report’s simulation shows that even with all corrective measures applied (HVDC mode change, countertrading, topology changes), the impact on the 0.63 Hz component was “small but positive” — because you cannot fully suppress a forced oscillation without removing the source. PSSs damp natural electromechanical modes; they don’t cancel a forcing function. The real failure was the inability to identify and disconnect the source plant.
For the 0.2 Hz inter-area mode, the HVDC POD controllers had a more complex story. The POD-Q on the INELFE-1 link auto-disabled at 12:03:51 — approximately 50 seconds into the first oscillation — because the manufacturer’s logic detected the controller saturating at ±100 MVAr too frequently. The report’s eigenvalue analysis shows POD-Q was actually more effective than POD-P for the inter-area mode (adding 2.09% damping vs 0.46% for POD-P alone). So the most effective damping tool shut itself down during the first event and was still disabled when the second (more dangerous) inter-area oscillation began at 12:19.
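The auto-disable logic is easy to appreciate with a sketch. What follows is my reconstruction of the behaviour as described, not the manufacturer's actual implementation; the ±100 MVAr limit is from the report, while the rolling window and saturation fraction are assumptions.

```python
# Sketch of a POD-Q output stage that latches itself off when it spends
# too much of a rolling window on its +/-100 MVAr limits.
import numpy as np

Q_LIMIT = 100.0        # MVAr output limit (from the report)
SAT_WINDOW = 10.0      # s rolling window for the check (assumed)
SAT_FRACTION = 0.5     # disable if saturated >50% of the window (assumed)

def run_pod(q_demand, dt=0.1):
    """Clip demand to the limits; latch off on persistent saturation."""
    n_win = int(SAT_WINDOW / dt)
    sat_flags, outputs, disabled_at = [], [], None
    for i, q in enumerate(q_demand):
        if disabled_at is not None:
            outputs.append(0.0)             # stays off until re-enabled manually
            continue
        q_out = max(-Q_LIMIT, min(Q_LIMIT, q))
        sat_flags = (sat_flags + [abs(q_out) >= Q_LIMIT])[-n_win:]
        if len(sat_flags) == n_win and sum(sat_flags) > SAT_FRACTION * n_win:
            disabled_at, q_out = i, 0.0     # the auto-disable fires here
        outputs.append(q_out)
    return outputs, disabled_at

t = np.arange(0.0, 60.0, 0.1)
demand = 5.0 * t * np.sin(2 * np.pi * 0.2 * t)   # growing 0.2 Hz swing
_, disabled_at = run_pod(demand)
if disabled_at is not None:
    print(f"POD auto-disabled at t = {0.1 * disabled_at:.1f} s")
```

The perversity is structural: the larger the oscillation, the more the controller saturates, and the more likely the protection is to remove the damping precisely when it is most needed.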
The switch from ADC to CPC mode on the HVDC was the right call — ADC mode emulates an AC line, which actually exacerbates inter-area oscillations. Table 4-6 in the report is telling: the system went from -1.03% damping (ADC, 180 MW, full exchange) to +0.28% (CPC, 500 MW, reduced exchange). But even after all measures, the base damping of the ECW mode was only 0.28% — one stochastic perturbation away from going negative.
4. Why had similar SSO not been seen before despite higher renewable penetration?
The report addresses this carefully through the FSSI analysis on multiple days in April 2025. The historical modal analysis from January to April 2025 showed no consistent 0.63 Hz mode in the system — it appeared only occasionally and sporadically.
The key insight is that this wasn’t about renewable penetration per se — it was about a specific converter-grid impedance interaction at a specific plant under specific operating conditions. The Badajoz plant was running at intermediate power (not maximum), the grid impedance seen from its terminals was elevated due to open lines and the geographical concentration of IBR generation in the southwest, and the inverter technology was pre-RfG legacy equipment. Change any one of those conditions and the oscillation may not have appeared.
The report also notes a brief 0.63 Hz oscillation (15 mHz peak-to-peak) on 8 April at 14:06, after a 2.6 GW upward generation ramp — suggesting the mode was latently present under certain dispatch conditions but not persistent or dominant.
Higher penetration days would have had different topology, different dispatch, different grid impedance profiles. The conditions on 28 April were a particular confluence — the “chaotic system behaviour” that the Expert Panel references. This is fundamentally different from classical SSO where you have a fixed torsional mode being excited by series compensation. Converter-driven oscillations are operating-point dependent and can appear and disappear with dispatch changes.
5. Was system inertia in line with their own policies?
The report is somewhat diplomatic here but the data tells its own story. The figures show the number of conventional units coupled to the Spanish network from January to April 2025, and on 28 April the values weren’t at the absolute minimum, but they were in the lower range of what had been seen in the period.
However, the more important finding is that inertia was not the decisive factor. The sensitivity analysis in Section 4.4.2.3 is remarkably clear: a +20% increase in system inertia had “minimal impact” on the final trajectory of the event. The simulations show the loss of synchronism was barely delayed, and in fact the higher inertia case caused slightly more overvoltage disconnections because the system exchanged energy with CE more slowly and strongly during the loss-of-synchronism swing, resulting in higher transient unloading and earlier overvoltage trips.
For the inter-area oscillation analysis (Section 4.2.5.3.4), tripling the Spanish system’s equivalent inertia improved the ECW mode damping by 0.93% — non-negligible, but less effective than fitting PSSs to nuclear plants (1.27%) or activating the HVDC POD-Q (2.09%). The report explicitly states that “increasing inertia solely to damp inter-area oscillations is not a realistic option.”
Spain didn’t have a formal minimum inertia requirement as such — the system was operated within the bounds of existing market outcomes and dispatch. The real gap wasn’t inertia — it was the absence of dynamic voltage control tools and PSS coverage.
6. Could there have been forced oscillation / hunting?
Yes — and the report confirms this is exactly what happened with the 0.63 Hz event. The Expert Panel’s classification is unambiguous: forced/converter-driven oscillation. The inverter at the Badajoz plant was forcing a periodic disturbance onto the network, and nearby generators responded to it.
The more nuanced question is whether there was modal coupling / resonance between the forced oscillation and the natural inter-area modes. The report’s simulation in Section 4.2.5.1.3 demonstrates this explicitly: when they artificially reduced the ECW mode damping in the model while maintaining the 0.63 Hz injection, the inter-area mode was excited and amplified by the forced oscillation. The 0.63 Hz injection produced both its fundamental frequency and harmonics at 1.26 Hz (visible in FSSI analysis), and it also excited the 0.2 Hz component — the sliding-window FFT shows the 0.2 Hz component growing over the 300-second simulation while the 0.63 Hz component was still being injected.
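The sliding-window FFT technique itself is straightforward to reproduce. Below is a sketch on a synthetic signal of my own construction (not report data): a steady 0.63 Hz forced component plus a slowly growing 0.2 Hz component, tracked window by window.

```python
# Sliding-window FFT on a synthetic signal: steady 0.63 Hz forcing plus
# a growing 0.2 Hz mode. All amplitudes and growth rates are illustrative.
import numpy as np

fs, t_end = 10.0, 300.0                       # 10 Hz, PMU-like sampling
t = np.arange(0.0, t_end, 1.0 / fs)
signal = (0.05 * np.sin(2 * np.pi * 0.63 * t)                        # forced component
          + 0.01 * np.exp(0.01 * t) * np.sin(2 * np.pi * 0.2 * t))   # growing mode

win = int(50 * fs)                            # 50 s window -> 0.02 Hz resolution
freqs = np.fft.rfftfreq(win, 1.0 / fs)
for start in range(0, len(t) - win, win):
    seg = signal[start:start + win] * np.hanning(win)
    spec = np.abs(np.fft.rfft(seg)) / win
    a02 = spec[np.argmin(np.abs(freqs - 0.2))]
    a063 = spec[np.argmin(np.abs(freqs - 0.63))]
    print(f"t = {start / fs:5.0f} s   0.2 Hz: {a02:.4f}   0.63 Hz: {a063:.4f}")
```

The 0.63 Hz bin stays roughly flat while the 0.2 Hz bin grows window over window, the same qualitative picture the report's 300-second simulation shows.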
So the sequence was: forced oscillation at 0.63 Hz → partial excitation of the already-marginal 0.2 Hz inter-area mode → corrective measures applied but insufficient → second (inter-area) oscillation at 12:19 when the ECW damping went negative. The “hunting” between the two modes is visible in the PMU data and reproduced in simulation.
The SCADA data also shows 0.5 pu peak-to-peak active power swings and 0.25 pu reactive power swings at the Badajoz plant during 12:03–12:08 — that’s a very active generator for something nominally under control.
7. Could the TSOs have taken more credible actions?
The report is measured but several things stand out:
Before the event: The voltage control framework was fundamentally fragile. Operating at voltage references of 405–410 kV on a 400 kV system with 435 kV as the “eventually acceptable” limit leaves only 25–30 kV of headroom — and with RES in fixed power factor mode, any generation ramp down immediately erodes reactive absorption. The 22 April event was a clear precursor that should have triggered an urgent review. The fact that Operational Procedure PO 7.4 dated from 2000 and had never been effectively enforced (penalties were not applicable in practice) is a structural governance failure. Conventional generators were not reliably meeting their reactive power obligations, and there was no close-to-real-time monitoring of this divergence.
During the event: RE and RTE activated the oscillation protocol within one minute of detection and took broadly the right actions — switching HVDC to CPC, initiating countertrading, connecting lines, disconnecting reactors. But the protocol was designed for the 0.1–0.35 Hz inter-area band and wasn’t equipped for forced oscillations at 0.63 Hz. The most effective action for a forced oscillation — identifying and disconnecting the source — was not something the control room could execute in real time with the tools available.
What more could have been done? The report’s sensitivity analyses suggest: (a) If POD-Q had remained active on the HVDC (the auto-disable logic was a design vulnerability), damping would have been substantially better. (b) If the oscillation source location tools (DEF, SCADA-based algorithms) had been available in real-time rather than post-mortem, the source plant could potentially have been identified and curtailed. (c) If the RES fleet had been in voltage control mode rather than fixed power factor, the reactive power response would have been automatic rather than perverse. (d) The LFDD plan achieved only 28.7% load shedding against a 45% target (Spain) — but the simulation shows even “perfect” load shedding wouldn’t have prevented the blackout. The voltage cascade was the killer, not the frequency cascade.
8. Could AI plugged into real-time monitoring have anticipated the event?
This is where I’ll go beyond what the report says and give you my honest assessment based on what the data shows.
Could it have anticipated it? Partially, yes — but with important caveats. An AI system monitoring the right signals could have provided earlier warning of degrading conditions, though “anticipating” the exact blackout requires distinguishing between conditions that are necessary and conditions that are sufficient.
What the data shows was detectable in advance:
The modal analysis tool in RE’s control room was already tracking oscillatory modes in real-time and the 0.63 Hz mode was showing increasing occurrences through the morning (48 occurrences in the 10:00–11:00 window, 140 in 11:00–12:00). An AI classifier watching the rate of change of mode occurrence density and the trend in damping ratios could have flagged the deteriorating oscillatory stability margin well before 12:03. The damping of the ECW mode was “consistently low for several minutes before becoming negative around 12:19” — that’s a trackable trajectory.
The voltage trajectory was also signalling trouble. An AI monitoring the dQ/dt of the aggregated RES fleet against voltage trends would have spotted the perverse reactive power dynamics (reduced P → reduced Q absorption → voltage rise) accelerating through the morning. The correlation with the 22 April precursor event would have been an obvious training signal. A minimal sketch of both checks follows below.
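Neither check needs heavy machinery. In the sketch below, the two occurrence counts are the figures quoted from the report, while the synthetic Q/V trends, the alarm threshold and the fit are all my assumptions.

```python
# (1) Trend alarm on 0.63 Hz mode-occurrence counts (counts from the report).
import numpy as np

hours = np.array([10.5, 11.5])        # window mid-points: 10:00-11:00, 11:00-12:00
counts = np.array([48.0, 140.0])      # occurrences per hour
slope = np.polyfit(hours, counts, 1)[0]
if slope > 30.0:                      # assumed alarm threshold
    print(f"ALARM: 0.63 Hz occurrence rate rising at {slope:.0f}/h per hour")

# (2) Perverse Q-V coupling check on synthetic morning trends: aggregated
# RES reactive absorption eroding while transmission voltage creeps up.
rng = np.random.default_rng(0)
t = np.arange(0.0, 10800.0, 60.0)                                # 3 h, 1/min
q_absorbed = 2000.0 - 0.1 * t + rng.normal(0.0, 20.0, t.size)    # MVAr
voltage = 405.0 + 2e-3 * t + rng.normal(0.0, 0.5, t.size)        # kV
corr = np.corrcoef(q_absorbed, voltage)[0, 1]
print(f"corr(Q_absorbed, V) = {corr:.2f}")    # near -1: absorption falls as V rises
# In voltage control mode this correlation would flip strongly positive.
```

Nothing here is "AI" in any deep sense, which is rather the point: the precursor signals were simple enough that an automated watchdog, let alone a trained model, could have surfaced them.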
Part 3: Human Vs Machine
This was an interesting exercise, both in understanding the causes of the blackout and in gauging the capability of an AI to analyse a complex problem compared to a human. The AI generally did very well. Although it could be argued it was not 'understanding' the problem, the synthesis of the overall analysis, and the ability to sift through a large volume of data to find the salient points, is very impressive. The AI did not really miss any of the key points in the report, but it perhaps over-focused on detail that distracts from the overall narrative and story.
This is the coming future of power systems analysis. We are not yet at the point where AI will replace engineers; instead it will be a powerful assistant for leveraging research, understanding and innovation.
The final report can be seen here.
#IberianBlackout #PowerSystems #GridStability #RenewableIntegration #EnergyTransition #AI #Claude #AnthropicAI #AIinEngineering