Monitoring the RV Routing Daemon
Geneos monitors all aspects of a financial institution’s trading infrastructure. It captures, in real time, value and event data from thousands of data points, and provides an early warning of impending disruption as well as key performance metrics to support operational decision making.
It manages the processes for recovering from error conditions, including the allocation of operator resources and resource optimisation, and it captures and stores event and value information from across the trading infrastructure.
One of the most popular protocols on market data distribution networks is the RV protocol, especially where Reuters Data Feeds are involved. RV (or ‘rendezvous’) is multicast-based messaging middleware for low-latency, high-throughput data distribution, and the RV Routing Daemon (RVRD) is used to route this RV traffic between different subnets.
“Our own RVRD plug-in uniquely allows you to monitor the RV Routing Daemon; there is no other product that collects this information and presents it all in one place. That’s important to know, because one of our clients approached us with a problem,” says Joy Tiu, Support Services at ITRS Group.
Solving a client’s problem
“The client manually visited the RVRD admin page to look at each neighbor interface and compare its backlog to the max backlog for the router. This involved checking about 500 neighbor interfaces manually. What the client wanted to know was whether there was a way to automate this process. Our answer was: use the RVRD plug-in.”
Backlogs occur when there is buffering between neighbors. They can occur for several reasons, such as unexpected bursts of data or insufficient WAN capacity. Excessive backlogs on a router can cause the whole machine to run out of memory. Each router has an allotted amount of memory, and you want to make sure that limit is not breached.
The RVRD plug-in, which monitors the RV Routing Daemon, helps ensure the integrity of this limit by producing key metrics in three categories:
- Router Metrics
- Local Interface Metrics
- Neighbor Interface Metrics
With the metrics collected by this plug-in, users can write a rule that compares the current backlog for each neighbor interface to the Max Backlog for each router. Without this plug-in, the user would have to monitor each route on each RV routing daemon individually, by going to its corresponding admin page. And with hundreds of routes, that is clearly a daunting task.
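The comparison behind such a rule can be sketched in a few lines of Python. This is only an illustration of the logic, not Geneos rule syntax or the plug-in's own output format: the `NeighborInterface` class, the field names, the sample router names and the 80% warning ratio are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NeighborInterface:
    name: str     # hypothetical neighbor interface identifier
    router: str   # value of the Router column, used to find the parent router
    backlog: int  # current backlog, assumed to be in the same units as the max backlog


def find_backlog_breaches(neighbors, max_backlog_by_router, warn_ratio=0.8):
    """Return (name, router, ratio) for every neighbor interface whose
    backlog is at or above warn_ratio of its router's max backlog."""
    breaches = []
    for n in neighbors:
        limit = max_backlog_by_router.get(n.router)
        if not limit:
            # Unknown router, or a zero limit: nothing to compare against.
            continue
        ratio = n.backlog / limit
        if ratio >= warn_ratio:
            breaches.append((n.name, n.router, round(ratio, 2)))
    return breaches


if __name__ == "__main__":
    # Illustrative values only; real figures would come from the
    # Router and Neighbor views described in this article.
    routers = {"router-a": 1000, "router-b": 500}
    neighbors = [
        NeighborInterface("nbr-1", "router-a", 120),
        NeighborInterface("nbr-2", "router-a", 900),
        NeighborInterface("nbr-3", "router-b", 450),
    ]
    for name, router, ratio in find_backlog_breaches(neighbors, routers):
        print(f"{name} on {router}: backlog at {ratio:.0%} of max")
```

Matching each neighbor interface to its router by name mirrors what an operator would otherwise do by eye across hundreds of admin pages. The warning ratio is a tuning choice and would depend on how much headroom each router needs.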
The plug-in itself simply consists of a runnable jar file (rvrdmon.jar) which can be executed by configuring a Toolkit sampler on the netprobe. The plug-in requires the following configuration information:
URL: the URL of the RVRD admin interface
Username: the username of the RVRD user
Password: the password of the RVRD user
It’s important to be aware that the plug-in does not need to be deployed onto the machine where the RVRD is located. One netprobe can be used to monitor several RVRDs, by creating multiple samplers.
Effective monitoring of RVRD processes
Why is this important? Because without effective monitoring, an RVRD process could go down. That would mean the flow of data would stop, and critical trading applications would be impacted. A host going down would bring the same result, only on a much larger scale.
A final thought on Neighbor, Local, and Router modes.
In Neighbor mode the plug-in is used to display metrics for each neighbor interface.
This diagram shows all neighbor interfaces regardless of what router they are on.
The neighbor interfaces can be matched up with their corresponding routers using the Router column.
In Router mode the plug-in is used to display metrics for each router. This is the page that shows the max backlog for each router.
In Local mode the plug-in is used to display metrics for each local interface.
This image shows the exported subjects and the imported subjects for each local interface.
Thank you for reading and I would love to hear any thoughts or comments.
Client Solutions Architect