Operational Resilience is Raising the Bar: What GRC Teams Need from Risk and Continuity Programs

In the fourth installment in OCEG™'s expert panel blog series, Ben Bradley and Oliver DeBoer, Product Managers for GRC at Resolver, explore how operational resilience regulation is shifting GRC expectations from having policies in place to proving they work under pressure, and why connecting risk and continuity programs is no longer optional for organizations that want to move from compliance tasks to strategic value.

Operational resilience has been sitting in the background of risk programs for years: Important, but rarely urgent. That’s changed. Regulatory bodies have started setting expectations that go beyond having policies in place. They want evidence that those policies work when tested.
That’s shifting how GRC teams structure risk programs, where they focus, and what they report up to the board. And the pressure isn’t theoretical. Between March 2023 and April 2024, four U.S. banks failed. One of them, Silicon Valley Bank, was at the time the second-largest bank failure in U.S. history, holding more than $200 billion in assets when it collapsed. Each one pointed to a different surface-level issue: liquidity management, deposit concentration, or governance. But underneath was the same operational pattern: no early warning, no contingency plan that worked, no consistent indicators being captured, connected, or acted on. Then, in 2025, two more failed. These weren’t fringe institutions.
In 2023, EU/EEA banks reported 3 million operational risk events, consistent with historical reporting levels. The bigger story was the financial impact: Losses from those events were 27% higher than the year before. That spike in losses forced risk teams to spend more time on post-incident investigations, strengthening controls, and reassessing high-impact vulnerabilities.
These failures weren’t caused by black swan events. They were the result of missed signals and fragmented planning: Continuity strategies that looked fine in theory but weren’t built to hold up under pressure.
What GRC Teams Need from Risk and Continuity Programs
For most teams, the challenge isn’t building something from scratch. It’s untangling the work currently happening across different systems, departments, and mandates, and making it work as a single framework.
Many GRC programs still treat risk events and continuity planning as separate domains. They’re managed by different teams, stored in different systems, and structured around different workflows. And most of the time, they stay that way, until something breaks.
That division used to be manageable. Risk teams focused on identifying issues and tracking controls. Continuity leads maintained static recovery plans, mostly for audit. The assumption used to be that, if each function handled its part, the organization would be covered. But under operational stress, that assumption falls apart.
We’ve seen issues spiral because no one was watching for repeated patterns. Or someone was, but didn’t have the mandate to act. We’ve seen continuity plans break down because they were built around org charts instead of real dependencies. The risk gets logged, the plan gets built, and no one checks if the two line up.
That’s what operational resilience regulation is trying to surface. Not to penalize teams for not predicting every disruption, but to raise expectations around how those disruptions are managed. Can the board demonstrate that the controls and systems tied to critical operations are both effective and understood? That risk and continuity activities are connected and regularly reviewed? That they function together, not just exist side by side?
When responsibility for operational resilience lives “off the side of the desk,” the result is a patchwork of disconnected tasks. Each one defensible on its own, but not cohesive enough to support the organization when it matters.
Disruption Doesn’t Care About Silos
Disruption isn’t a one-team problem. It doesn’t follow workflows or org charts. But most programs still do.
Events are logged by one group. Continuity is managed by another. Risk teams close out cases without knowing if they’ve been raised upstream. Recovery assumptions are made without visibility into third-party performance. When that happens, teams are solving local problems without fixing the real issue.
That’s how problems build.
We’ve seen it play out over and over. A near-miss gets logged in the risk tracker, but the root cause never makes it to continuity planning. The same vendor misses the same SLA three times, yet it’s only after a major outage that the pattern gets attention. By then, there’s no backup vendor, no manual workaround, and operations grind to a halt. Plans are based on how things ran last year, not on the incidents happening right now.
These aren’t edge cases. They’re symptoms of siloed programs that don’t share accountability.
That’s the gap connected risk and continuity teams close. Not by centralizing everything under one platform or function, but by creating shared triggers that prompt review: if a control fails three times, it should escalate; if a vendor is tied to a critical operation, the risk owner should know which continuity plans depend on that vendor.
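To make that concrete, here is a minimal sketch in Python of what such a shared trigger could look like, using the three-failure rule described above. The record structure, control IDs, and threshold are invented for illustration, not drawn from any particular platform.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ControlFailure:
    # Illustrative fields only; a real GRC record would carry far more context.
    control_id: str
    critical_operation: str


def escalations(failures: list[ControlFailure], threshold: int = 3) -> list[str]:
    """Return control IDs whose repeat failures should trigger a joint
    risk/continuity review (the three-failure rule from the text)."""
    counts = Counter(f.control_id for f in failures)
    return [control_id for control_id, n in counts.items() if n >= threshold]


if __name__ == "__main__":
    log = [
        ControlFailure("CTRL-042", "payments"),
        ControlFailure("CTRL-042", "payments"),
        ControlFailure("CTRL-007", "client onboarding"),
        ControlFailure("CTRL-042", "payments"),
    ]
    print(escalations(log))  # ['CTRL-042'] -> prompt a continuity-plan review
```

The point isn’t the code, it’s that the rule is shared: the same failure count that the risk team tracks is the signal that puts a continuity plan up for review.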
The organizations that are starting to get this right aren’t overhauling their tools. They’re reconnecting existing ones. They’re building muscle memory, catching patterns early, prompting reviews, and course-correcting before the real impact hits.
What a Modern Operational Resilience Framework Looks Like
Incidents happen, recovery takes place, and life moves on. But unless the issue is massive (or politically painful), no one goes back to ask: What did this reveal about the way we work? That’s the shift modern frameworks are making.
A vendor failure isn’t an isolated event; it’s a prompt to challenge the assumptions in your recovery plans. The same goes for a control breakdown that keeps repeating: It’s a sign your risk program isn’t getting the visibility it needs.
Connecting those dots means using past incidents to tighten the plan you have now, while also watching for the early signals that risk events can reveal. That forward view is what keeps small disruptions from snowballing into outages.
It plays out the same way in risk-heavy environments. Teams log the same incident more than once. The cause is familiar. The fix is manual. But because no one’s connecting events over time, the pattern gets missed, and the control never improves. A modern framework wouldn’t let that loop stay open. It would surface repeat issues, flag stale mitigations, and prompt review without waiting for a crisis.
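As a rough sketch of that loop-closing behavior (with invented incident causes, dates, and thresholds, not any vendor’s data model), a review rule over incident history might look like this:

```python
from datetime import date, timedelta

# Hypothetical records; the causes, dates, and field names are invented for illustration.
incidents = [
    {"cause": "vendor-sla-miss", "date": date(2024, 11, 2)},
    {"cause": "vendor-sla-miss", "date": date(2025, 1, 15)},
    {"cause": "vendor-sla-miss", "date": date(2025, 3, 9)},
    {"cause": "backup-restore-delay", "date": date(2025, 2, 20)},
]
# Date each mitigation was last reviewed.
last_reviewed = {
    "vendor-sla-miss": date(2023, 6, 1),
    "backup-restore-delay": date(2025, 2, 28),
}


def review_queue(incidents, last_reviewed, repeat_threshold=2, max_age_days=365):
    """Flag causes that keep recurring or whose mitigation hasn't been
    revisited within a year, so review happens before the next crisis."""
    queue = []
    for cause in {i["cause"] for i in incidents}:
        repeats = sum(1 for i in incidents if i["cause"] == cause)
        reviewed = last_reviewed.get(cause)
        stale = reviewed is None or date.today() - reviewed > timedelta(days=max_age_days)
        if repeats >= repeat_threshold or stale:
            reasons = []
            if repeats >= repeat_threshold:
                reasons.append(f"{repeats} repeat incidents")
            if stale:
                reasons.append("stale mitigation")
            queue.append((cause, ", ".join(reasons)))
    return queue


print(review_queue(incidents, last_reviewed))
# e.g. [('vendor-sla-miss', '3 repeat incidents, stale mitigation')]
```

Whether the logic lives in a GRC platform, a workflow tool, or a spreadsheet matters less than the fact that repeat issues and stale mitigations land in someone’s queue automatically.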
The same principle applies to business continuity. Recovery plans are only as useful as their inputs. If the business impact analysis (BIA) hasn’t been revisited since a reorganization, or if it still treats all vendors equally, it’s not a resilience document. It’s a false sense of security. Modern frameworks rebuild that map regularly, because operational reality doesn’t sit still.
Most programs don’t need a new tool. They need a better sense of pattern recognition. There’s no single template that gets this right. But there are shared traits:
- Critical operations are defined based on what would make you fail.
- Recent risk events and disruptions are part of continuity planning.
- External and internal dependencies are weighted based on real impact.
- Feedback isn’t optional; it’s continuous.
- Control failures are prompts, not paperwork.
The best programs don’t guess what’s coming. They learn from what already did.
From Compliance Task to Strategic Value
Most GRC programs are built to meet requirements. The ones that actually influence decisions are built to expose gaps.
That’s the difference. One is built to pass. The other is built to improve.
Compliance asks if the BIA was completed. Resilience asks if it’s still accurate after the last reorganization. Compliance logs the incident. Resilience asks what pattern it reveals, and which control(s) didn’t hold. Compliance pushes testing before audit. Resilience tests when something meaningful changes.
Programs shift from task to value when they stop treating disruption as an interruption and start treating it as information.
It doesn’t take a massive overhaul. Risk events, incident logs, continuity gaps, failed vendor handoffs: it’s all there. What’s missing is the structure to connect it, escalate it, and use it. Disruption is the data. It’s what tells you what didn’t work.
This shift forces clarity around ownership. When risk and continuity are tracked meaningfully, gaps in responsibility surface fast. That’s when real improvement starts. You stop arguing over whose spreadsheet is right and start deciding who’s going to fix the failure.
The organizations getting ahead aren’t more compliant. They’re more aware. They have the mechanisms to surface what’s not working and the discipline to follow up on it. And when leadership sees that kind of operational self-awareness, they stop viewing the program as overhead and start treating it as insight.
Because at that point, it is.
Continuity, Risk, and What Every GRC Team Should Ask
Audit-ready programs don’t just document controls. They learn from what fails, and correct it.
Risk teams can spend months building out continuity plans and refining control libraries. Then something breaks. A vendor misses its SLA. A test veers off course. The same outage reappears for the third time. And still, the response? Note it, close it, and move on... again.
Some of the strongest continuity programs we’ve seen treat each disruption as a signal. They don’t just track whether recovery met the timeline. They investigate what slowed it down, who owned it, and what changed afterward. And they treat that information as input, not admin.
If you’re trying to close audit gaps (or even just pressure test your own program), ask yourself:
- What’s failing repeatedly, and who else needs to know?
- Do our tests reflect what’s changed, or what’s convenient?
- When something breaks, who owns the fix before it escalates?
If the answer is unclear, the next audit won’t just ask for documentation. It’ll ask why no one acted on the data.
Resolver helps connect your data so you’re not relying on fragmented tools or tribal knowledge to prove resilience, because software alone doesn’t build operational strength. What does is treating every failure as a warning, and building the system that prevents the next one.