Insurance Industry
Insurance providers are expected to give their clients all applicable information about their insurance through a web portal. The stability of that portal is therefore extremely important, especially during enrollment periods.
Overview
In this case study we'll take a look at one such provider that was looking to consolidate its alerts into a single management system and provide dashboards to the portal owners depicting the portal's health and availability.
The Challenge
A number of tools were already monitoring the portal, but not all of them had been integrated into OMNIbus, the event management system. In addition, the existing monitors were not designed to feed a dashboard and therefore required additional enrichment.
The portal's resources, and their dependencies on each other and on outside systems, had previously been discovered by Tivoli Application Dependency Discovery Manager (TADDM). However, these resources had not been grouped into Business Services or Business Applications, which was needed in order to present them in a dashboard view.
Finally, the portal owners had never attempted to view their application from a monitoring, events and KPI perspective before and were having difficulty providing the enterprise systems management team with requirements for their dashboards.
Solution
The first step to successfully completing a project involving business dashboards is meeting with the application owners and consumers of the dashboards to ensure that all facets of the application are understood, to obtain architecture diagrams and to design the dashboards.
In this case, the owners and consumers were one and the same. In meeting with the application owner, a design for the technical level dashboards was developed that was loosely based on their application architecture diagrams. In meeting with the executives, it was determined that the executive level dashboards needed to include not only the status of the portal application but also the status of other critical applications, and that they must display the status of certain infrastructure related services, such as DNS and LDAP.
Tivoli Business Service Manager (TBSM) was used to create the dashboards via custom canvases. The underlying service models that supported the indicators on the dashboards were populated by resources discovered and grouped by TADDM. The TBSM dashboard design dictated which Business Services and Business Applications had to be built within TADDM.
Once the resources were properly grouped within TADDM, the XML Toolkit was customized to import the data from TADDM and align the resources with the custom TBSM templates designed to support the dashboards. TADDM discovers a vast number of resources and relationships that are not always needed within TBSM. In addition, the resources or their hierarchy must sometimes be modified before TBSM can use them properly. The XML Toolkit provided the means to accomplish both.
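The filtering and template alignment described above can be pictured with a small sketch. This is not XML Toolkit configuration; it is a plain Python illustration of the idea, and every name in it (the resource classes, the template names, the Resource type and the select_for_import function) is a hypothetical stand-in rather than an actual TADDM class or TBSM template.

```python
from dataclasses import dataclass

# Hypothetical mapping from a discovered resource class to a dashboard template.
# These are illustrative names, not real TADDM classes or TBSM templates.
TEMPLATE_BY_CLASS = {
    "AppServer": "PortalAppServerTemplate",
    "Database": "PortalDatabaseTemplate",
    "WebServer": "PortalWebServerTemplate",
}


@dataclass(frozen=True)
class Resource:
    name: str
    resource_class: str
    business_application: str


def select_for_import(resources, wanted_application):
    """Keep resources in the wanted application whose class has a template."""
    selected = []
    for resource in resources:
        if resource.business_application != wanted_application:
            continue  # belongs to a different Business Application
        template = TEMPLATE_BY_CLASS.get(resource.resource_class)
        if template is None:
            continue  # discovered, but not needed in the service model
        selected.append((resource.name, template))
    return selected


# Hypothetical discovery output: more is discovered than the dashboards need.
discovered = [
    Resource("portal-was-01", "AppServer", "Portal"),
    Resource("portal-db-01", "Database", "Portal"),
    Resource("portal-was-01-fs", "FileSystem", "Portal"),
    Resource("hr-db-01", "Database", "HR"),
]
imported = select_for_import(discovered, "Portal")
```

In this sketch only the two Portal resources with a matching template are carried forward; the file system and the HR database are discovered but left out of the service model.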
In order to properly show the state of the business services within the dashboard, all existing monitoring tools had to be integrated into OMNIbus, which was accomplished utilizing various probes. While the modifications to the XML Toolkit provided the framework to correlate events to services, the probe rules had to be updated to properly enrich the events. The enrichment data and the Toolkit modifications worked together to accurately correlate events and services.
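As a rough illustration of how enrichment and correlation fit together, the sketch below adds a service identifier to each incoming event and then groups events under the service instances they identify. It is written in plain Python, not in probe rules syntax, and the field names (node, summary, service_id), the identifier format and the lookup table are all assumptions made for the example.

```python
# Hypothetical lookup: monitored node -> service instance identifier.
# In the project this role was played by the probe rules; the values here
# are invented for illustration.
SERVICE_ID_BY_NODE = {
    "portal-was-01": "Portal/AppServer/portal-was-01",
    "portal-db-01": "Portal/Database/portal-db-01",
}


def enrich(event, lookup=SERVICE_ID_BY_NODE):
    """Return a copy of the event with a service_id field (empty if unknown)."""
    enriched = dict(event)
    enriched["service_id"] = lookup.get(event.get("node", ""), "")
    return enriched


def correlate(events, service_ids):
    """Group event summaries under the service instances they identify."""
    by_service = {sid: [] for sid in service_ids}
    for event in events:
        sid = event.get("service_id", "")
        if sid in by_service:
            by_service[sid].append(event["summary"])
    return by_service


raw_events = [
    {"node": "portal-was-01", "summary": "JVM heap above threshold", "severity": 4},
    {"node": "portal-db-01", "summary": "Tablespace nearly full", "severity": 3},
    {"node": "print-srv-07", "summary": "Toner low", "severity": 2},
]
enriched_events = [enrich(e) for e in raw_events]
correlated = correlate(enriched_events, SERVICE_ID_BY_NODE.values())
```

The event from the unrelated print server receives an empty identifier and so does not attach to any portal service, which is the behavior a dashboard needs: only events that carry a matching identifier influence a service's state.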
Determining which events affected the TBSM services, how they affected each service and how child services affected parent services was done through multiple meetings with the application owner. The first meetings produced an initial set of business service rules. As the service model was built and the dashboards began to take form, the rules were reviewed multiple times with the application owner.
Seeing the dashboards in action gives the application owner a greater understanding of how the alerts and rules affect the dashboards, and it always leads to requested changes.
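The kind of rule discussed in those meetings can be sketched in a few lines. The example below is a hypothetical illustration in plain Python, not TBSM rule syntax: the Status levels, the severity thresholds and both propagation rules are assumptions chosen to show why the choice of rule matters, not the rules used in this project.

```python
from enum import IntEnum


class Status(IntEnum):
    GOOD = 0
    MARGINAL = 1
    BAD = 2


def status_from_severities(severities, marginal_at=3, bad_at=5):
    """Map the worst event severity on a service to a status.

    The thresholds are assumptions; they presume a numeric severity
    scale where higher means worse.
    """
    worst = max(severities, default=0)
    if worst >= bad_at:
        return Status.BAD
    if worst >= marginal_at:
        return Status.MARGINAL
    return Status.GOOD


def worst_child(child_statuses):
    """Parent takes the status of its worst child."""
    return max(child_statuses, default=Status.GOOD)


def share_of_bad_children(child_statuses, bad_threshold=0.5):
    """Parent is BAD only when enough children are BAD.

    Any child that is not GOOD makes the parent at least MARGINAL.
    """
    if not child_statuses:
        return Status.GOOD
    bad = sum(1 for s in child_statuses if s == Status.BAD)
    if bad / len(child_statuses) >= bad_threshold:
        return Status.BAD
    if any(s != Status.GOOD for s in child_statuses):
        return Status.MARGINAL
    return Status.GOOD


# A hypothetical cluster of four application servers with one member down.
cluster = [Status.BAD, Status.GOOD, Status.GOOD, Status.GOOD]
strict_view = worst_child(cluster)
tolerant_view = share_of_bad_children(cluster)
```

With one of four servers down, the worst-child rule marks the parent BAD while the share-based rule marks it MARGINAL. Differences like this are what an application owner typically needs to see on a live dashboard before deciding which behavior matches how the application actually degrades.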
Conclusion
Having monitoring in place for all technology components within an application is the first step towards gaining a greater understanding of the application and its health. However, viewing an event list with hundreds or thousands of events does not help an application owner understand how their application is performing, whether it is adversely impacted or what the root cause of an issue is, nor does it help shorten recovery time during an outage.
The dashboards reflect the alerts generated by the monitoring tools because those alerts are correlated with the resources that make up the application. While it is possible to manually create the business services representing all of an application's resources within TBSM, doing so can be extremely difficult, time-consuming and error-prone. Utilizing TADDM to discover the resources and their dependencies, and importing this data into TBSM, provides an automated build of the service model, automated updates and a finer level of resource granularity than is typically achieved manually.
Creating business level dashboards provides the insight and ability that an event list alone lacks. However, since most application owners have never seen their application from this perspective, designing and developing the dashboards is an iterative process, as the application owner gains additional understanding of the concepts along the way. It is through this process that you can ensure that the final dashboards put in place will exactly meet the needs of the application owner, which in turn guarantees a successful project.