The hard road to AI deployment

By Ravi Annavajjhala, CEO Kinara Inc.

Friday, 9th September 2022

Artificial Intelligence (AI) is booming. A 2019 Gartner report revealed that AI implementation had grown by 270 percent since 2015. According to PwC, as AI implementation continues to spread, AI’s global economic impact will be valued at over $15 trillion by 2030.

However, it must be recognized that successfully training and deploying AI is no mean feat. In fact, a 2020 report from IDC shows that as many as 28% of AI and Machine Learning (ML) initiatives fail. Many organizations find the process beset with challenges and hurdles that stall their journey towards deploying AI.

Training is crucial to success

Getting an AI model to deployment can be an immensely fraught process. Companies must first collect the right amount of data to train their models; how much is required depends on both the complexity of the problem and the complexity of the learning algorithm. That data must then be labelled so that the model can differentiate the various elements it is meant to spot and analyze. Labelling is an arduous task that demands time and expense to identify the objects, motions, and inputs the AI is meant to detect, although specialized tools can significantly automate the process.
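
To make the labelling step concrete, the sketch below shows roughly what a single labelled image might look like for an object-detection model, using a COCO-style bounding-box annotation. The file name, category, and coordinates are purely illustrative.

    import json

    # Illustrative COCO-style annotation for one camera frame: the bounding box
    # is [x, y, width, height] in pixels, and category 1 stands in for "shopper".
    annotation = {
        "images": [{"id": 1, "file_name": "store_cam_0001.jpg", "width": 1920, "height": 1080}],
        "annotations": [{"id": 1, "image_id": 1, "category_id": 1, "bbox": [412, 230, 180, 420]}],
        "categories": [{"id": 1, "name": "shopper"}],
    }

    with open("labels.json", "w") as f:
        json.dump(annotation, f, indent=2)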

Then users have to choose their models. Few ever create their own, given the tremendous skill and effort required, and pre-existing models are already widely available for most use cases. Selecting among them, however, presents another challenge: developers must balance factors such as accuracy, the total number of computations, and the sheer size of the model.

Take, for example, a cashierless store, which often requires hundreds to thousands of AI-powered cameras to establish what is going on within the store. Further challenges arise as developers try out different options and attempt to balance their various needs. They may initially opt for a model like MobileNet-SSD, find that its accuracy falls short of the requirements, and then switch to an alternative such as YOLO or EfficientDet, models which provide greater accuracy but require more processing power.
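
Parameter count is one quick proxy for that trade-off. As a rough sketch (assuming PyTorch and torchvision are installed; YOLO and EfficientDet live in their own repositories), the snippet below compares a lightweight and a heavier off-the-shelf detector on size alone. Accuracy and latency on the target hardware still have to be measured separately.

    from torchvision.models import detection

    # Two off-the-shelf detectors standing in for the "small and fast" versus
    # "large and accurate" ends of the spectrum.
    candidates = {
        "SSDlite + MobileNetV3": detection.ssdlite320_mobilenet_v3_large(weights=None),
        "Faster R-CNN + ResNet-50": detection.fasterrcnn_resnet50_fpn(weights=None),
    }

    for name, model in candidates.items():
        params = sum(p.numel() for p in model.parameters())
        print(f"{name}: {params / 1e6:.1f}M parameters")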

Training to deployment

With those elements in place, training can begin. AI models must be trained to know what to spot, how to respond and how to predict behavior. This typically involves training data that simulates real-world inputs, teaching and continually improving the model over time. Training is usually carried out in the cloud because of its capacity for high-performance processing and because cloud services like Microsoft Azure provide well-understood mechanisms for training models.
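
At its core, that training is a loop over labelled batches. The sketch below is a minimal, self-contained PyTorch example; the tiny model and random tensors simply stand in for a real architecture and a real labelled dataset. In the cloud, the same loop would typically run on larger GPU instances, often wrapped by a managed training service.

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    # Stand-ins for a real model and a real labelled dataset.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    data = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
    loader = DataLoader(data, batch_size=32, shuffle=True)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(5):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: loss {loss.item():.3f}")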

When it is time for full operation, organizations must deploy their AI models onto live systems. This can be an especially difficult phase, and it is here that many projects stall. A 2020 report found that 40% of companies take more than a month to deploy an ML model; only 14% were able to do so in seven days or less.

One of the reasons behind the stalls is that many teams do not consider the architectural and technical specifics of their edge AI accelerator.

To successfully deploy a preferred ML model into a real system, it must be integrated into an inference pipeline that produces outputs an application can act upon. It is here that a whole new range of factors and considerations come into play in taking the “on paper” model onto an operational edge system.
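
In outline, such a pipeline is a chain of steps around the model: capture an input, preprocess it into the shape the model expects, run inference, then post-process the output into something the application can act on. The skeleton below illustrates the shape of that chain; the run_model function is only a placeholder for whatever runtime the chosen accelerator provides.

    import numpy as np

    def preprocess(frame: np.ndarray) -> np.ndarray:
        # Scale pixel values and add a batch dimension, matching what the
        # model was trained on.
        return (frame.astype(np.float32) / 255.0)[None, ...]

    def run_model(batch: np.ndarray) -> np.ndarray:
        # Placeholder for the accelerator runtime call (for example, a
        # session.run() on the vendor's SDK or an ONNX Runtime session).
        return np.random.rand(1, 10).astype(np.float32)

    def postprocess(scores: np.ndarray) -> int:
        # Reduce raw scores to a decision the application can act on.
        return int(scores.argmax())

    frame = np.zeros((224, 224, 3), dtype=np.uint8)   # stand-in for a camera frame
    decision = postprocess(run_model(preprocess(frame)))
    print("predicted class:", decision)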

The importance of tools

At this stage, tools become a critical ingredient for success, and organizations must pick the right accelerator with tools that can ease the process of taking the chosen model into live deployment.

Taking a system from the cloud to the edge requires tools that can be easily installed and set up and that are able, without any special configuration, to optimally convert the model to the format required by the chosen edge AI accelerator.
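
In practice, the first step of that conversion is often exporting the trained model to an interchange format such as ONNX, which many edge toolchains accept before running their own vendor-specific compiler. The sketch below, assuming PyTorch and torchvision are installed, exports an example MobileNet to ONNX; a real project would export its own trained network.

    import torch
    from torchvision import models

    # Example model only; a real project would export its own trained network.
    model = models.mobilenet_v3_large(weights=None).eval()
    dummy_input = torch.randn(1, 3, 224, 224)   # the input shape the edge compiler will assume

    torch.onnx.export(model, dummy_input, "model.onnx",
                      input_names=["images"], output_names=["scores"])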

At this stage, where the model is quantized and compiled, AI projects can run into accuracy problems. AI models are mostly trained in 32-bit floating-point format to ensure that accuracy remains high. When they are deployed, however, the model must be quantized to the 8-bit integer format that AI inference accelerators use to process models, which also reduces the model size by approximately four times.
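
Edge toolchains usually perform this quantization themselves, calibrated against representative data, but the idea can be illustrated generically. The sketch below uses ONNX Runtime's dynamic quantizer on the model exported in the previous example and then compares file sizes to see the reduction; a vendor flow would do a fuller INT8 calibration.

    import os
    from onnxruntime.quantization import quantize_dynamic, QuantType

    # Quantize the FP32 weights of model.onnx down to INT8 (a generic
    # illustration, not a vendor-specific compile step).
    quantize_dynamic("model.onnx", "model_int8.onnx", weight_type=QuantType.QInt8)

    for path in ("model.onnx", "model_int8.onnx"):
        print(path, round(os.path.getsize(path) / 1e6, 1), "MB")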

Retraining might also be required when developers must tailor their models to the specific needs of the edge AI accelerator the model will be deployed to. Accelerators can have architectural limitations which mean that the model, as it currently exists, cannot be run without losses in performance or accuracy. This can be a pain point for AI initiatives, so, again, developers need to choose their tools carefully to limit or avoid the need for retraining in these areas.

The process of conversion can involve a significant amount of accuracy loss if not done carefully. From that point of view, tools that keep accuracy loss to a minimum are crucial to success.
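
A simple safeguard is to evaluate the model before and after conversion on the same held-out data and compare the scores. The sketch below does this with ONNX Runtime; the random images and labels are stand-ins for a real validation set, and the input name "images" matches the earlier export example.

    import numpy as np
    import onnxruntime as ort

    # Stand-ins for a real validation set.
    val_images = np.random.rand(64, 3, 224, 224).astype(np.float32)
    val_labels = np.random.randint(0, 1000, size=64)

    def accuracy(model_path: str) -> float:
        session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
        correct = 0
        for image, label in zip(val_images, val_labels):
            # The exported model expects a batch of one, so add a batch axis.
            scores = session.run(None, {"images": image[None]})[0]
            correct += int(scores.argmax() == label)
        return correct / len(val_labels)

    print("FP32 accuracy:", accuracy("model.onnx"))
    print("INT8 accuracy:", accuracy("model_int8.onnx"))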

Training is an ongoing process

The move from training to deployment doesn’t just happen once. In fact, a well-maintained AI model will likely be retrained and redeployed continually over its lifespan. After all, AI models learn: as they operate within a given use case, the system collects information to learn how to perform the required operations more precisely and more efficiently. If a model is working correctly, it should be able to anticipate what happens within a normal range of activity.

However, when new situations occur, models must incorporate the new data. Although it concerns cloud-based AI rather than the edge, one study from the Bank of England illustrates the point: over a third of British banks reported that their ML models struggled during the pandemic. The models had never seen a shock like that before and so had not incorporated that possibility into their understanding. Those models would have had to be retrained, and many AI models must be retrained consistently to update their analysis capabilities and take advantage of their inherent ability to learn.

Retraining is also used to refine a model to fit a given use case. Openly available models will typically have been trained on datasets like Google Open Images, which are readily available online and encompass a wide variety of data types, so developers will often need to retrain their models on the specific data they want the AI model to analyze. If, for example, an image recognition model has been trained on general images but the developers want to use it within a store, it may have to be retrained to spot cans of beans or bunches of carrots.
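
In code, that kind of refinement usually means taking a network pre-trained on a general dataset, swapping its classification head for the store-specific classes, and fine-tuning on the store's own labelled images. The sketch below shows the idea with torchvision's MobileNetV3; the class names and random tensors are illustrative stand-ins for a real retail dataset.

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset
    from torchvision import models

    store_classes = ["can_of_beans", "bunch_of_carrots", "other"]

    # Start from a general-purpose backbone (pass weights="DEFAULT" to load
    # pre-trained weights) and replace the final layer with a store-specific head.
    model = models.mobilenet_v3_large(weights=None)
    model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, len(store_classes))

    # Stand-in for the store's own labelled images.
    data = TensorDataset(torch.randn(64, 3, 224, 224), torch.randint(0, len(store_classes), (64,)))
    loader = DataLoader(data, batch_size=16, shuffle=True)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss_fn(model(images), labels).backward()
        optimizer.step()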

Arriving at your AI destination

Deploying AI can be challenging, and many organizations stall on a journey filled with obstacles. However, the key to reaching the destination is to carefully select the right AI model deployment tools and accelerators. These decisions will make the difference between an effective, live AI application and one that stalls at the side of the road.