An outage a day won’t keep the regulator away

Time for a check-up

When there are more UK banking outages than days in the year, it’s not a good sign for the health of the British financial industry. In 2018, 375 incidents were reported over a nine-month period, and the Financial Conduct Authority (FCA) has been calling for check-ups ever since.

Consumers are more reliant than ever on technology for banking services. Where once access to online banking was an exciting and convenient development of the digital age, it is now viewed as a given. Yet for millions of account holders, that access is not always there when they need it.

The result? Regulatory and consumer pressure on the financial services sector has been mounting as system failures escalate. In its ‘2019 rethink’ on the future of regulation, the FCA stepped in and made operational resilience a priority for all C-suite decision makers. Following a Bank of England discussion paper released in July 2018 and an inquiry by MPs, the report showed once and for all that operational resilience is top of the agenda, and that firms must prepare accordingly or face the consequences.

Legacy rot

The past fifty years have seen an astounding evolution of operations within the banking sector, with little pause for thought. First came the technology used by the bank’s tellers, then ATMs and call centres, more recently banking websites, and now mobile banking. Instead of a clean shift between each phase of technology, each has been layered on top of the last, so that new technology and channels are running in tandem with legacy technology.

On top of that, new development and deployment techniques like DevOps are being applied to technology stacks they are not well suited to. In short, old IT systems aren’t compatible with the modern world, but for a long time an overhaul has simply been seen as too expensive.

It should have come as no surprise when the FCA revealed there had been a 300% increase in outages in the financial services industry year-on-year. While many were quick to blame hackers, and cyberattacks have historically hogged the press, the fact of the matter is that mundane operational system failures are far more frequent.

The cure

Ever wondered why outages hit on a Monday? The truth is that many firms attempt to implement system changes and updates over the weekend, when things are quieter. But it doesn’t always go to plan. If processes overrun, firms are unprepared to manage the demands of users who come flooding in on Monday while systems are still down for maintenance.

This brings us to the first, simplest and most often overlooked solution to IT failures: change management. Over 60% of outages could be avoided with a careful change management plan and a system to fall back on if things aren’t up and running in time.

The second solution, and the key to pre-empting a failure, is thorough testing. Say you’ve had an issue with change management and a few of your customers are experiencing IT failures. As soon as the rest of your customers get wind of this, which could be within minutes in the age of Twitter, they all rush to check their applications, compounding the change management issue with a surge in demand. Load testing simulates a growing number of users on a platform to find the point at which the system will fail, so that firms can provision capacity accordingly. This is just one example of the ways a variety of testing tools can improve your resilience and uptime.
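
As a rough illustration of the load-testing idea, the sketch below ramps simulated users against a stand-in service until the error rate crosses a threshold. Everything in it is an assumption made for the example: `FakeService`, its fixed `capacity`, the load levels and the 1% error limit are hypothetical, and a real load test would drive actual endpoints with a dedicated tool rather than an in-process stub.

```python
import threading


class FakeService:
    """Hypothetical stand-in for a banking endpoint with a fixed number of slots."""

    def __init__(self, capacity):
        self._slots = threading.Semaphore(capacity)

    def try_begin(self):
        # Non-blocking: a request that finds no free slot is rejected.
        return self._slots.acquire(blocking=False)

    def end(self):
        self._slots.release()


def error_rate(service, users):
    """Fire `users` simultaneous requests and return the fraction rejected."""
    all_arrived = threading.Barrier(users)
    results = [False] * users

    def user(i):
        results[i] = service.try_begin()
        all_arrived.wait()  # hold the slot until every user has made an attempt
        if results[i]:
            service.end()

    threads = [threading.Thread(target=user, args=(i,)) for i in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(False) / users


def find_breaking_point(service, levels, max_error_rate=0.01):
    """Ramp through `levels`; return the first load whose error rate exceeds the limit."""
    for users in levels:
        if error_rate(service, users) > max_error_rate:
            return users
    return None


if __name__ == "__main__":
    service = FakeService(capacity=50)  # assumed capacity, for illustration only
    print(find_breaking_point(service, [10, 25, 50, 75, 100]))
```

With the assumed capacity of 50, the ramp reports 75 as the first load level at which requests start being rejected, which is the figure a firm would then provision above.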

Third is end-user experience monitoring. Firms must know not just what’s going on in their corner of the globe, but also in the cloud and across the estates of all outsourced third parties. It’s imperative to track the health of applications and infrastructure in real time so that firms can react to problems faster and proactively fix application issues before they affect the business and its customers.

Finally, you wouldn’t build your business headquarters on sand, so why would you build your vastly more important IT estate on unstable, incompatible legacy technology? Firms must be prepared to overhaul any points of weakness and build on modern, up-to-date software that can operate across multiple computers so that, if one fails, the rest are able to pick up the slack.
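
The "pick up the slack" idea can be sketched as a simple failover loop: a request is offered to each machine in turn until one answers. This is only a minimal sketch under stated assumptions. `NodeDown`, `call_with_failover` and the node names are hypothetical, and production systems would normally rely on a load balancer or clustering layer rather than hand-written code like this.

```python
class NodeDown(Exception):
    """Raised by a node that cannot serve the request (hypothetical)."""


def call_with_failover(nodes, request):
    """Try each (name, handler) pair in order; return the first successful response."""
    failures = []
    for name, handler in nodes:
        try:
            return name, handler(request)
        except NodeDown as exc:
            # Record the failure and let the next node pick up the request.
            failures.append((name, str(exc)))
    raise RuntimeError(f"all nodes failed: {failures}")


def overrun_node(request):
    # Simulates a machine still down after a weekend change overran.
    raise NodeDown("maintenance overrun")


def healthy_node(request):
    return f"ok:{request}"


if __name__ == "__main__":
    nodes = [("node-a", overrun_node), ("node-b", healthy_node)]
    print(call_with_failover(nodes, "balance"))
```

Here the first node is unavailable, so the request is served by the second. If every node fails, the error lists each failure rather than hiding it.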

The UK banking sector stands at a crossroads: firms must disrupt their own systems and practices, or be disrupted. Those that cite a lack of financial resources as a reason not to prioritise operational resilience will soon find themselves lost to history. Those that champion round-the-clock operational resilience will be rewarded with a healthy IT estate and a competitive edge.
