Case Study
Identifying and Remediating Key Risks for a Global Airline

Customer Challenge
A large North American airline approached Citihub following multiple technology failures that significantly impacted its operations. In addition to identifying and rectifying the issues that triggered the outages, the airline wanted a detailed review of a specific core application and of the infrastructure supporting it and many other business-critical applications.
Preliminary anecdotal evidence revealed no noticeable pattern to the outages, and the vendors brought in to inspect their products did not identify any glaring deficiencies.
How Citihub Helped
Using our Application Availability Assessment (AAA) methodology, Citihub began by collecting all pertinent information relating to the specific application and its production infrastructure. The information-gathering exercise covered everything from incoming data feeds, the core infrastructure supporting the airline's applications, performance and utilization statistics, architecture diagrams and service level agreements through to the plans and policies governing configuration management, resiliency tests, security, release management, production acceptance tests, reboot cycles and more.
This information was then used as a basis for a series of interviews with our client's subject matter experts (SMEs) to identify all risks relating to IT availability and performance. The interview phase of the project is where Citihub differentiates itself from other consultancies. Our people are all veterans of the IT industry and have held the same responsibilities as the SMEs we interview. A level of trust is quickly established, making for productive conversations that quickly drive out underlying issues affecting our client's IT ecosystem.
In conjunction with the interviews, our team conducted a hands-on inspection of Windows and Unix servers, databases, virtualization devices, network configuration and security, storage solutions and converged technologies. A “Trust but Verify” model ensures that we have a complete understanding of the environment prior to making any recommendations.
The documentation review, interviews and hands-on inspection all contributed to a risk log and remediation deliverable that categorized every risk finding by level of criticality and by IT theme.
Results
Citihub identified 60+ “quick win” remediations that could be implemented by mid-June, before the start of the airline's summer peak season.
A further 200+ risks were identified that had either short-term tactical remediations (which could not be completed by mid-June) or longer-term strategic solutions.
The risks were rolled into themes to assist the client in prioritizing the remediations and assigning them to the correct teams.
The biggest themes were architecture and design, monitoring and governance, where the client had systemic, organizational and cultural issues.
As a result of this review, the client has seen a reduction in the number of incidents for the aircraft communications application and is starting to see the benefits of the quick wins in other areas too.
In addition, the client has started to restructure its organization and is putting plans in place to address the systemic and cultural issues.