Financial institutions run a multitude of different systems
in different environments, on a wide range of hardware, and, in some
cases, across thousands of different applications. In an ideal world there would be
standards that all vendors adhere to, making the management of all these
systems as straightforward as possible. Unfortunately, in the real world this
is not the case. Far from it. The reality is that institutions are running
hundreds, or even thousands, of business-critical systems, some with their own interface and some with
no interface at all. How can an institution keep track of all these applications and
make sure they are fully functioning, especially for real-time applications
where outages can be very expensive indeed? How can they ensure that these
systems are running smoothly, or in fact know that they are running at all? If
an application has stopped it may mean that the institution is no longer
fulfilling its compliance obligations and its risk management is compromised.
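As a rough illustration of what "knowing that a system is running at all" can mean in practice, the sketch below flags any system that has been silent for too long. It is a minimal sketch under stated assumptions, not a description of any particular product: the system names, the 30-second limit and the idea that each system reports a heartbeat timestamp are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_silent_systems(last_heartbeat, now, max_silence=timedelta(seconds=30)):
    """Return the sorted names of systems whose last heartbeat is older than max_silence."""
    return sorted(
        name
        for name, seen in last_heartbeat.items()
        if now - seen > max_silence
    )


# Hypothetical sample data: system name -> time its last heartbeat was received.
now = datetime(2024, 1, 1, 9, 0, 0, tzinfo=timezone.utc)
heartbeats = {
    "order-router": now - timedelta(seconds=5),
    "risk-engine": now - timedelta(minutes=3),
    "settlement-feed": now - timedelta(seconds=29),
}

print(find_silent_systems(heartbeats, now))  # ['risk-engine']
```

A system with no interface of its own would need some other signal, such as a process check or a log timestamp, to stand in for the heartbeat.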
Training support staff in all the different systems, their
interfaces, and their operation is time-consuming, labour intensive, and
ultimately highly inefficient as well as potentially ineffective. Of course,
banks do monitor their systems, but given the complexity and the sheer number
of systems in place, doing so manually is a virtually impossible task. The drawback with the
manual “reactive” approach to system management is that frequently there is
little or no warning before a catastrophic failure occurs. By this time the
chances are that there will be traders screaming and the institution will be
haemorrhaging money. This is not an attractive proposition, and a situation
that is, obviously, best avoided if at all possible. A much better approach would
be to have an interface that is able to communicate with and analyse all critical
systems in real time and provide an overall management system that is easy to
use and able to predict problems and issue alerts, so that problems can be
spotted and fixed before failures occur.
When all the business-critical processes are tracked, under
constant automated management, and fully transparent, support staff are
alerted not only to full-blown problems but also to systems and processes that
may be close to hitting their maximum load or threshold. This buys valuable
time to deal with potential problems before they become serious and expensive
to fix.
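The idea of warning before a system reaches its maximum load can be sketched as a simple two-level threshold check. This is only an illustrative sketch: the function name, the 80% and 95% ratios and the sample figures are assumptions chosen for the example, not values taken from any real monitoring product.

```python
def classify_load(current, capacity, warn_ratio=0.8, critical_ratio=0.95):
    """Classify a load reading as 'ok', 'warning' or 'critical' relative to capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    ratio = current / capacity
    if ratio >= critical_ratio:
        return "critical"
    if ratio >= warn_ratio:
        return "warning"
    return "ok"


# Hypothetical readings: messages per second against a 1,000 msg/s capacity.
print(classify_load(500, 1000))  # ok
print(classify_load(850, 1000))  # warning
print(classify_load(990, 1000))  # critical
```

The "warning" band is what buys support staff time: it fires while the system is still working, rather than after it has failed.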
There are thousands of things that can go wrong with trading
systems, software, hardware and networks. The cost of major failures can be
frightening, but even problems on a smaller scale can be cripplingly expensive
to detect and repair. The ability to spot a potential problem early and then
fix it can avert a major disaster.
As an example, the cost of a single failed trade has been
estimated at $250 for a domestic trade and $500 or more for a
cross-border trade [source: Fulcrum Research]. The costs can mount up quickly,
especially with the squeeze on margins across many markets. Failed trades also
present a business risk and make it difficult to calculate an institution’s
position, thus possibly affecting compliance. This is particularly the case
where a trade is part of a complex structured trade. There are numerous reasons
why a trade might fail (no communications link, no trade confirmation, the
exchange server being down, and so on), but the quicker a management system can
identify the point in the settlement or payment chain where a trade has failed,
the lower that cost will be.
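To show how quickly these costs mount up, the per-trade estimates quoted above can be turned into a simple lower-bound calculation. This is a back-of-the-envelope sketch: the trade counts are hypothetical, and the cross-border figure is treated as a minimum because it is quoted as "$500+".

```python
# Per-trade estimates quoted in the text (source: Fulcrum Research).
DOMESTIC_COST_USD = 250
CROSS_BORDER_MIN_COST_USD = 500  # quoted as "$500+", so used here as a lower bound


def minimum_failed_trade_cost(domestic_fails, cross_border_fails):
    """Return a lower-bound cost in USD for the given numbers of failed trades."""
    if domestic_fails < 0 or cross_border_fails < 0:
        raise ValueError("trade counts must not be negative")
    return (
        domestic_fails * DOMESTIC_COST_USD
        + cross_border_fails * CROSS_BORDER_MIN_COST_USD
    )


# Hypothetical day: 40 failed domestic trades and 10 failed cross-border trades.
print(minimum_failed_trade_cost(40, 10))  # 15000
```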
ITRS’s Geneos is a management and monitoring system which
can do that and much more. It is designed as a ‘follow the sun’ system so it
can be deployed in multiple locations around the world. Support staff in
different locations have the same overview of their systems and applications,
so not only can business-critical processes be managed from different
locations, but problems can also be fixed remotely by the staff best trained to
handle a specific fault. At the end of the day in each region, the management of
the systems can be handed over to the next trading region.
Financial institutions now operate in a world where nothing less than
100% system availability is acceptable, so it is no longer viable, or
necessary, to endure management systems that at best can only report a
fault or failure after it has occurred, limiting support to reactive
firefighting. The true value of a management system is to alert an institution
to a potential problem, giving it time to fix it before it affects the business.
It is far more efficient, secure and reliable to implement an institution-wide
management system that can, in real time, capture and analyse stress loads,
capacity levels, and hardware and network problems before they become catastrophic
failures, allowing a proactive and effective approach to system management and
support.