Crisis Averted: Top AI Use Cases for Predicting and Preventing IT Failures
October 20, 2022 / Dan Chalk
Predict and respond to IT failures earlier for greater business resilience
For years, digital transformation has been a top strategic priority for enterprises. But it took a global pandemic to push the adoption of digital technologies – such as AI and machine learning, automation, the cloud, robotics, and the Internet of Things – into overdrive. IT leaders deployed these technologies in less than a year to keep remote knowledge workers connected and productive and their businesses agile enough to navigate market disruptions, innovate new digital offerings and deliver personalized experiences for customers, partners, and workers.
This tidal wave of digital transformation has radically increased the speed, agility, and competitiveness of businesses. But it has also created more complex IT environments that are harder to monitor and manage. As enterprise networks have scaled, the sheer volume of data and transactions flowing across them has grown exponentially, as has the number of layers required to support business operations.
Traditional IT monitoring tools (typically threshold-based alert systems) simply weren’t built for modern IT landscapes. They can’t scale to provide comprehensive monitoring of vast IT landscapes. They send ineffective alerts that are received too late for IT to head off failures and bottlenecks. And they can’t provide the business context IT needs to understand the meaning of these alerts. For example, what’s changing in the business to cause this alert? What’s at risk operationally and for customers? And how should we respond to best support the business?
What’s needed is an intelligent early-warning system designed for the modern IT infrastructure – one that can predict IT issues earlier, send recommendations based on business context, and trigger preventive actions to keep the business up and running optimally.
The good news is that with artificial intelligence (AI), you can make this a reality today.
Using AI to power a predictive warning and prevention system
AI-powered monitoring systems can be embedded across IT networks to predict problems earlier so IT can proactively head off issues – and even automate corrections for a self-healing network. For AI to deliver, it needs to analyze and correlate lots of data, including:
- Business data (such as log data on sales volumes from an ERP, forecasts, etc.)
- IT data (such as server usage data, storage array volumes and application performance)
IT professionals already have IT data at their fingertips. To create an early warning system, they need to combine it with business data from ERPs, CRMs, and other enterprise systems used to run the business, such as transaction volumes per minute, usage data, forecasts, and more. When both types of data are fed to an AI – and this data is both accurate and complete – AI models can:
- Discover what’s “normal” for your business so it can detect what’s not.
The AI learns on its own what's typical for your business, so it can detect subtle changes and anomalies earlier than a human or a traditional threshold-based detection system could.
- Correlate business and technical data to assign meaning to these events.
AI can correlate business and IT data to determine that when a particular business event occurs, a specific technical event also occurs, and vice versa. This provides the essential business context needed to inform any number of strategic and operational decisions.
- Drill down on these correlations to make recommendations and trigger preventive actions.
Over time, AI models can use signals and patterns to predict events (such as failures or overloads) far earlier, trigger prebuilt automations to prevent them, and ultimately create a self-healing IT network. For example, AI can trigger a predefined automation such as “restart a server” to prevent a failure that would halt online purchases by customers.
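The three capabilities above can be illustrated with a minimal sketch. It is not a description of any particular product: it assumes a simple rolling statistical baseline (a z-score) in place of a trained AI model, and the function names, the CPU-per-transaction ratio, and the threshold of three standard deviations are all hypothetical choices made for illustration. The idea is that an IT metric (CPU usage) is normalized by a business metric (transactions per minute), so a CPU spike that matches a sales spike is treated as normal, while the same spike without matching business volume triggers a predefined automation.

```python
from statistics import mean, stdev
from typing import Callable, Sequence


def load_per_transaction(cpu_percent: float, transactions_per_min: float) -> float:
    """Combine an IT metric with a business metric: CPU cost per transaction."""
    # Guard against division by zero during idle minutes (assumed convention).
    return cpu_percent / max(transactions_per_min, 1.0)


def is_anomalous(history: Sequence[float], latest: float, z_limit: float = 3.0) -> bool:
    """Return True if `latest` deviates from the learned baseline by more than z_limit."""
    if len(history) < 2:
        return False  # not enough data yet to know what "normal" looks like
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest != baseline
    return abs(latest - baseline) / spread > z_limit


def check_and_remediate(
    history: Sequence[float],
    latest: float,
    remediate: Callable[[], None],
    z_limit: float = 3.0,
) -> bool:
    """Run the prebuilt automation (for example, a server restart) on an anomaly."""
    if is_anomalous(history, latest, z_limit):
        remediate()
        return True
    return False
```

In use, `history` would hold recent values of `load_per_transaction(...)`, and `remediate` would wrap whatever automation the operations team has already built and approved. A production system would likely replace the z-score with a model that accounts for seasonality and multiple signals.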
What does this look like in the real world?
The value of AI-enabled warning systems extends to any business area – not just IT operations. Wherever there's a clear business problem to solve, organizations can apply AI to understand that area more deeply, automate responses, and empower humans to effect better business outcomes. For example, here are a few ways Unisys has applied this technology and the resulting real-world impact.
- A mining company: AI-enabled IT network monitoring and management
A large mining company was suffering routine IT outages. In one case, a two-day outage for a critical process cost the business $10M in penalties and lost business.
To better predict issues and head them off, the company implemented AI-enabled IT network monitoring and management. Their next network outage lasted just two seconds because the AI instantly triggered a prebuilt automation to self-correct the problem. As a result, their network recovered automatically with no business impact.
- A transportation company: AI-enabled early warning system
A large regional transportation company had frequent outages and transportation disruptions due to weather events, with massive financial ramifications.
Weather events can cause sudden, huge spikes in traffic that stress underlying IT systems (such as network bandwidth and processing capacity). To address this, IT deployed an AI-enabled early warning system that feeds weather forecast data into its AI models. Now, IT receives early warnings of demand spikes expected from forecasted weather events, as far as two weeks in advance. As a result, IT can scale up bandwidth and processing power (for example, by putting more virtual machines online) to handle the spike and then scale back down afterward.
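The scale-up step in this example reduces to simple arithmetic once a demand forecast is available. The sketch below is an illustration only: it assumes the forecast arrives as a per-day demand multiplier relative to a baseline request rate, and the function names, the per-VM capacity figure, and the 20% headroom are hypothetical values, not details from the project described above.

```python
import math
from typing import List, Sequence


def vms_needed(forecast_rpm: float, capacity_per_vm: float, headroom: float = 0.2) -> int:
    """Number of virtual machines required to serve the forecast load plus headroom."""
    if capacity_per_vm <= 0:
        raise ValueError("capacity_per_vm must be positive")
    # Round up, and always keep at least one machine online.
    return max(1, math.ceil(forecast_rpm * (1 + headroom) / capacity_per_vm))


def scaling_plan(
    baseline_rpm: float,
    daily_multipliers: Sequence[float],
    capacity_per_vm: float,
    headroom: float = 0.2,
) -> List[int]:
    """Translate a multi-day demand forecast into a VM count for each day."""
    return [
        vms_needed(baseline_rpm * multiplier, capacity_per_vm, headroom)
        for multiplier in daily_multipliers
    ]
```

Calling `scaling_plan` with a two-week list of multipliers would give a day-by-day VM count, which shows both when to scale up ahead of the weather event and when it is safe to scale down again.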
Top use cases for AI
As these examples illustrate, with AI, IT can see what’s coming well in advance, understand the business implications of anomalies and changes, and take preventive actions at a network level to keep the business running optimally. So, where should you get started?
There are two primary use cases for AI adoption by IT:
- Dealing with unplanned changes to your existing IT environment. Sudden unplanned changes in your supply chain, customer purchasing behavior or business operations can cause usage patterns to vary. AI can continuously look for early signs of change and use deductive analysis to respond to them. AI also gives IT the early insights it needs to proactively engage with business leaders about what’s changing and why, so the business can invest in new infrastructure strategically without overbuilding and overspending.
- Dealing with a planned change to your existing IT environment. Your business may be planning a change in its operations or infrastructure, such as moving your ERP to the cloud. Often, however, IT needs to run through several business cycles to build confidence that everything is running correctly. With AI, you have the necessary business context to anticipate the business risks and benefits of IT changes, moves and investments. And once a change is made, AI can detect problems early, so there’s no need to wait through business cycles to have confidence in an IT change.
Where should you get started?
Chances are you’re already thinking about how AI can lighten workloads and create a more resilient, adaptive digital foundation for your business. Success requires focusing on AI investments where:
- You have the most vulnerable business systems or are changing something substantive. These are the high-value opportunities that can dramatically improve the resilience of your business.
- You can access high-quality and complete data – both transactional data and IT performance data. Incomplete data reduces the ability of AI to predict accurately, and poor-quality data results in excessive false negatives and positives. So, focus AI investments where you have quality, complete data to feed to AI.
- The risk-benefit ratio makes sense for both IT and the business. For instance, where you have high transaction volumes, a low cost of failure, and a low level of complexity to implement AI, using AI is a no-brainer. One example is a firewall with limited capacity: AI can automatically reboot the machine at the optimal time to throw off “dead users” and avoid a crash.
Are you ready to get started with AI?
To jump-start your planning, use this simplified risk analysis grid to think through your business scenarios and identify the best opportunities. Identify opportunities where your data quality is high (e.g., high level of accuracy, completeness, reliability, and timeliness), and where there is a low risk for something going wrong (e.g., impacted users, security, or cost of change).
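One way to work through such a grid is to score each candidate scenario on the two axes and sort the results. The sketch below is only an assumed interpretation of the approach: the 1-to-5 scales, the cutoffs, the quadrant labels, and the ranking rule are all hypothetical and should be replaced with your own criteria.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    name: str
    data_quality: int  # 1 (poor) to 5 (excellent): accuracy, completeness, timeliness
    risk: int          # 1 (low) to 5 (high): impacted users, security, cost of change


def quadrant(c: Candidate, min_quality: int = 4, max_risk: int = 2) -> str:
    """Place a candidate in one of four cells of the (assumed) grid."""
    good_data = c.data_quality >= min_quality
    low_risk = c.risk <= max_risk
    if good_data and low_risk:
        return "start here"
    if good_data:
        return "proceed with caution"
    if low_risk:
        return "improve data first"
    return "defer"


def rank(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Order candidates from most to least attractive (high quality, low risk first)."""
    return sorted(candidates, key=lambda c: c.data_quality - c.risk, reverse=True)
```

Scenarios that land in "start here" correspond to the simple, safe first projects; the other cells indicate what would need to change before a scenario becomes a good candidate.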
Start simple and safe – perhaps where you are not looking at things as closely as you should. You’ll learn from every AI implementation, no matter how small, and build confidence and evidence of positive impact for your next business case.
Unisys offers proven tools, technology, and experience to help you on your journey to AI. We’re here to help – and accelerate your success.