AI As A Service Redefines Customer Experiences, But Think Twice Before You Jump Into AutoML
August 11, 2020 / Unisys Corporation
The rapid application development market is expected to grow from $7.8 billion in 2018 to $46.2 billion by 2023, at a CAGR of 42.9%. Due to this market shift, collaboration between professional data scientists and application developers is no longer a necessity. Today, citizen developers can operate alone, using predefined machine learning (ML) models built with low code or no code to deliver polished, AI-enhanced solutions.
Does this mean predefined models ignore the important role of human input? Keep reading, as this article will help you understand how humans and machines can work together to make machine learning more effective with augmented ML, and most importantly, cover a few key technical debt concerns you need to address prior to jumpstarting your own AutoML (AI for AI).
A Quick Recap Of The AI Market
- AI augmentation will create $2.9 trillion of business value in 2021.
- According to IDC findings, by 2025, at least 90% of new enterprise applications will embed AI, and “over 50% of user interface interactions will use AI-enabled computer vision, speech, and natural language processing.”
Evolving ROI-Driven AI Use Cases (Prediction + Automation + Optimization)
Here are some snapshot use cases of how AI is being profitably deployed across many vertical markets:
- In fintech, fraud detection and robo-advisory engines leverage machine learning to detect fraudulent and abnormal financial behavior.
- IT operations employ robotic process automation (RPA) to digitize processes in weeks, without replacing legacy systems.
- Governments are preventing physical threats by using AI to scan videos and identify suspicious activity in public places.
- News entities are experimenting with AI content management systems, such as Bertie, Forbes’s Content Management System.
But where do you start when experimenting with and implementing AI-driven projects? By following these three steps in your own environment, you can minimize typical augmented AI project implementation time and obtain better results.
1. In-Database Analytics Data Strategy
Build ML on top of existing data. Take AI to your data: keep your significant data troves intact and use them “as is,” without compromising the legacy processes that already do a great job of collecting and storing data. Most companies have established cost-effective, on-premises ways to exploit existing data, using products such as Isilon OneFS storage to create data lakes and serve hot (real-time, in-memory), warm (near real-time, NoSQL) and cold (batch, Hadoop) datasets for both batch and real-time AI access, following a lambda architecture. Implement data virtualization with automated data tiering that takes analytics to the data and not the other way around, thereby significantly reducing latency and performance issues.
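To make the hot, warm and cold split concrete, here is a minimal sketch of an automated tiering rule that assigns a record to a tier by how recently it changed. The `Record` type, the `choose_tier` function and both age thresholds are hypothetical and chosen only for illustration; real cut-offs depend on the workload and the storage product in use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical freshness thresholds (assumptions, not taken from any product).
HOT_MAX_AGE = timedelta(minutes=5)   # real time, served from memory
WARM_MAX_AGE = timedelta(hours=24)   # near real time, served from a NoSQL store


@dataclass
class Record:
    key: str
    last_updated: datetime


def choose_tier(record: Record, now: datetime) -> str:
    """Return 'hot', 'warm' or 'cold' based on the age of the record."""
    age = now - record.last_updated
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"  # batch layer, for example Hadoop


def group_by_tier(records, now):
    """Group record keys by tier so each layer can be queried where it lives."""
    tiers = {"hot": [], "warm": [], "cold": []}
    for record in records:
        tiers[choose_tier(record, now)].append(record.key)
    return tiers
```

Calling `group_by_tier` with a list of records and the current time returns the keys bucketed by tier, which a query layer could then use to decide whether to read from memory, the NoSQL store or the batch store.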
2. Tech Stack Layers DNA
To bring some clarity to the complex AI ecosystem, divide the technology stack into layers across services, hardware, etc., as there is no standard definition in the industry:
- Infrastructure for the AutoML, including GPU/TPU processors, cloud containers, etc.
- Framework for AI programming, such as TensorFlow, R, etc.
- Platforms that support AI projects, such as AWS SageMaker, Google AutoML, H2O.ai, etc.
- Services that enable output and input to the applications, such as vision and translation.
With the above stack, you can jumpstart a hybrid cloud “bring your own” project with prebuilt blueprints.
- Zero Setups: With serverless, plug-and-play open-source tooling preinstalled and data governance prebuilt, you can establish full data governance that embraces DataOps automation.
- Optimized Algorithms: Use streaming datasets and bring your own algorithm to manipulate exceptionally large datasets using data virtualization. These run anywhere, wholly in the cloud or on edge servers, with production-like data copies for ML training.
- Easier Training: For ML training, use hyperparameter tuning along with bring-your-own-container and bring-your-own-script strategies to perform complex operations.
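As an illustration of the hyperparameter tuning mentioned in the last point, the sketch below runs an exhaustive grid search over two settings of a toy model. Grid search is only one simple strategy, picked here as an assumption; the toy data, the `train`, `validation_error` and `grid_search` functions, and the candidate values are all hypothetical and are not taken from any AutoML platform.

```python
from itertools import product

# Toy data following y = 2x, split by hand into training and validation sets.
TRAIN = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
VALID = [(4.0, 8.0), (5.0, 10.0)]


def train(learning_rate: float, steps: int) -> float:
    """Fit the single weight w in y = w * x with plain gradient descent."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in TRAIN) / len(TRAIN)
        w -= learning_rate * grad
    return w


def validation_error(w: float) -> float:
    """Mean squared error of the fitted weight on the validation set."""
    return sum((w * x - y) ** 2 for x, y in VALID) / len(VALID)


def grid_search(grid: dict) -> tuple:
    """Try every combination in the grid; return (best_params, best_error)."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = validation_error(train(**params))
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


best_params, best_error = grid_search(
    {"learning_rate": [0.001, 0.01, 0.1], "steps": [10, 100]}
)
```

The same loop shape applies when the training function is a bring-your-own script running in a container; only the body of `train` and the scoring function change.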
3. The Need For Augmented ML
After you establish the above key foundation, you can jump into AutoML. Here, the machine learning pipeline components leverage the data characteristics and metadata learning, with built-in optimization and reinforcement learning methods.
Augmented ML will play a significant role in future applications, but we have a long way to go for full AutoML automation capabilities due to the following:
- No consortium or credible third party can prove AutoML accuracies are valid, as underlying hardware plays a significant role in the AutoML algorithmic choices.
- Data cleaning and preprocessing in current AutoML frameworks have many limitations.
- The wealth of knowledge brought by subject matter experts from years of experience is unparalleled in data processing and feature set engineering.
- Model “explainability” needs human intervention, particularly for addressing accuracy or prediction (e.g., business impact, ROI, etc.). Users want to trace why AI made the decision it made for transparency, trust and compliance.
- AI will be deployed first with industry vertical focuses (e.g., health care, manufacturing, energy, etc.) that require domain experts and subject matter experts for input and validation.
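One common way to keep subject matter experts in the loop, as the points above call for, is to accept only high-confidence predictions automatically and queue the rest for human review. The sketch below shows that routing rule. It is an illustrative pattern rather than a feature of any particular AutoML product, and the `Prediction` type, the `route` function and the 0.80 threshold are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cut-off; in practice a domain expert would help choose it.
REVIEW_THRESHOLD = 0.80


@dataclass
class Prediction:
    record_id: str
    label: str
    confidence: float  # model's estimated probability for `label`, 0.0 to 1.0


def route(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into auto-accepted ones and ones queued for expert review."""
    accepted, needs_review = [], []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            accepted.append(prediction)
        else:
            needs_review.append(prediction)
    return accepted, needs_review
```

Labels corrected by the reviewers can then be fed back as training data, which is where the expert's domain knowledge re-enters the pipeline.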
Doing a full ML proof of concept is very easy; the challenge is integrating your current on-premises environment with evolving cloud-native analytics as a service through cloud bursting (i.e., horizontally scaling on-site data lakes). Many cloud vendors talk about cloud bursting (without significantly moving on-premises workloads) but have very few proven scenarios, so push vendors to show pilots for specific on-prem flavors (e.g., Cloudera cloud bursting with Google Cloud).
Let’s challenge the status quo of most organizations, including Fortune 500s, which continue to use their data the same way they would have used it a decade ago. Let’s create better customer experiences with augmented intelligence. Establish end-to-end data governance by addressing “garbage in, garbage out,” and use commercial off-the-shelf (COTS) products that already exist in the market today.