AI’s Reliability Problem, and the Human-Centered Solution

AI can synthesize and reproduce the combination of words it has seen before. It can provide helpful recommendations or predict the next word in a sentence. Yet, when striving to understand something new to human knowledge, or where context is critical, or where human experiences play a major role in the understanding, there are severe limits to the technology’s abilities. When applying generative AI in an advanced field to push knowledge forward, the systems are not yet in a place where they can reach full autonomy.

AI’s Limitations

Over-reliance on AI’s abilities has manifested itself in many ways. Earlier this year, the National Highway Traffic Safety Administration shared data showing hundreds of accidents and several deaths over one year connected to driver-assistance technology. This technology may be groundbreaking and exciting, but it needs to be used with a watchful eye. These programs are still so new that we must monitor them to be certain outlying factors aren’t causing accidents.

Examples of AI not living up to its promised role keep accumulating. A sticker on a stop sign could lead a self-driving car to fail to recognize that it needs to stop, which could result in a serious accident. There was also the infamous story of the self-driving Uber that was not trained to stop for jaywalking pedestrians and ended up killing someone. Failures like these can skew the data used to train future models, and they can also be dangerous to human life.

Another example came when scientists at the National Institutes of Health showed that AlphaFold2, a remarkable advancement, failed to predict protein fold switching. The AI can generate protein models, but this result shows that it does not yet have the human knowledge needed to predict something so vital, resulting in serious inaccuracies.

More recently, we have seen major issues with AI chat technologies convincingly fabricating answers, sharing wrong data, and otherwise hallucinating – in smaller ways misleading the user, and in bigger ways spreading misinformation or suggesting dangerous or criminal ideas or actions.

The balance here comes in utilizing AI for its benefits while mitigating its downsides as much as possible. The examples above reflect both the extremes and the dangers of AI systems left unchecked. Judea Pearl, a pioneer in AI, said in Quanta Magazine, “all the impressive achievements of deep learning amount to just curve fitting.” Put much more crudely: “garbage in, garbage out.” Too often a machine is discarded for its faults when we could instead learn what caused the mistake and create a solution or an improvement. Those faults could also be ignored, which would create misleading research.


Advancing AI in Systems Across the Enterprise

The answer to AI’s reliability problem is in the tools we create — tools to augment and complement AI with rigor, trustworthiness, and ways to amplify human judgment. Much work is happening to move towards this, including enabling better monitoring, root-cause analysis of faults, simplifying retraining, understanding bias, improving interpretability, and more. These tools should be paired with a human-centered AI approach, where AI techniques amplify and support our abilities and respect and preserve the human context. When we think about the technology as augmenting humans, AI actually works. When we think about it as a replacement, AI fails.

Not long ago, there were predictions that radiologist jobs were in imminent danger and that AI would soon replace these roles. We’ve consistently seen that AI can help, not replace, the humans in these roles. Understanding how technology can amplify human experiences in these areas will lead to continued, broader interest and advancement in AI. Radiologists are now using AI to amplify their ability to read vast numbers of images to identify and detect diseases. They must monitor these systems, though, because something as seemingly small as markings on radiology images could lead to inaccurate test results.


An interactive loop is possible across the enterprise. AI systems can help us advance hypotheses, process and synthesize results, and iterate. Such an AI-augmented approach is not only possible but is the best way to come up with amazing discoveries and insight. Again, this requires tools that provide scalable monitoring solutions. These tools are key to ensuring that the risks associated with models – drift, uncertainty in the data, lack of documentation, lack of clarity on lineage, etc. – are minimized so we may freely use AI to amplify our judgments.
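
As one illustration of what monitoring for drift can look like, here is a minimal Python sketch. It is not a description of any particular product: the choice of the Population Stability Index (PSI) as the distance, the function names `psi` and `drift_alerts`, the window size, and the 0.2 threshold are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def psi(reference: Sequence[float], window: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference sample and a recent window."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Small floor so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(reference), proportions(window)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_alerts(
    reference: Sequence[float],
    stream: Sequence[float],
    window_size: int = 100,   # assumed window length
    threshold: float = 0.2,   # assumed alert threshold
) -> List[Tuple[int, float]]:
    """Return (start_index, psi) for each non-overlapping window whose PSI exceeds the threshold."""
    alerts = []
    for start in range(0, len(stream) - window_size + 1, window_size):
        score = psi(reference, stream[start:start + window_size])
        if score > threshold:
            alerts.append((start, score))
    return alerts
```

A window that matches the reference distribution scores close to zero, while a window whose values have shifted scores higher and is flagged for a human to review.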


The Future of Productive AI

We will see a shift toward parameters or “features” that let AI consider how it works with humans rather than how it replaces them. AI techniques help improve efficiency, encouraging a symbiosis between AI and our daily lives. Incorporating cooperation and human feedback will create intelligent systems that save us time and energy, so we can use our cognitive abilities more effectively.


We are currently in a crucial phase of AI development — deciding how it will work in our lives in a trustworthy, explainable, and beneficial way. This necessitates features like monitoring and automatic retraining that enable companies to make scalable, effective AI solutions and ensure models are not becoming biased or unfair in their behavior. Trust, more than the technology itself, is our most significant responsibility. When used to amplify human abilities, the technology truly has unlimited potential to accelerate humanity’s advancement.

Our Approach

H+AI is the philosophy that underpins all of the work we do at Vianai, all of the products we build, and how we work with customers.

Our ML monitoring solution supports high-performance ML operations at scale across the enterprise, with detailed monitoring, root-cause analysis, retraining and model validation in a continuous loop across large, complex, feature-rich models – ensuring models are trustworthy, explainable and transparent.

Our performance acceleration technology aims to bring down the cost and resources needed to run AI, to increase access and ensure AI is more responsible in terms of cost–performance and environmental impact.

Dealtale brings conversational AI that sits on top of marketing, CRM and advertising platforms, and causal inference – advanced AI techniques – directly to marketing professionals.

Finally, hila, our AI-powered financial research assistant, was built from scratch with reliability in mind – our document-centric approach helps us ensure that answers are accurate, including by providing citations from the financial text.

To learn more about our high-performance ML monitoring capabilities that can help your business tackle AI’s reliability problems, request a demo here. To learn more about all of our products, get in touch here. We would love to connect!

Be sure to also check out our new video series, A Conversation about Human-Centered AI, in which we tackle various aspects of AI’s reliability problems, and how we can work to solve them. The first episode, Hype vs. Reality, is live here.

AI Hype vs. Reality | Part One in Our Video Series, A Conversation About Human-Centered AI with Dr. Vishal Sikka

Where is AI today, and what is real vs what is hype?


During the last 10 years, there have been many impressive advancements in AI, with the last six months bringing on a rapid acceleration in AI technologies.

On the other hand, there are significant issues around AI that need to be addressed – issues of bias and fairness, reliability and responsibility. How do we monitor and govern these systems?


How do we ensure that they are trustworthy?

Underpinning these questions are specifics that we hope to tackle in this series, and in this first episode:


    • AI investments are in the hands of a few, and we need more democratization of AI both from a usage standpoint as well as an investment standpoint.
    • AI expertise is extremely limited worldwide. The people who understand these systems – how they work, why they are behaving as they are, how to monitor and govern them – are a very small group. AI education is critical.
    • AI needs to understand the real world, the human experience, in order to have impact.
    • AI is incredibly expensive to run, and we need to bring down the cost so that it is more accessible for all, and responsible not just from an accuracy standpoint but from an environmental-impact standpoint.

These are big issues that aren’t easy to solve. We tried to take a thoughtful approach to appreciating the breakthroughs while examining the limitations, so that we can all benefit from the ways AI can enhance humanity and be human-centered in order to bring real value.

We announced our video series, “A Conversation about Human-Centered AI” last month and are excited to share Episode One, Part One: AI Hype vs. Reality.

Let’s start a conversation! Tell us what you think about AI hype vs. reality on LinkedIn or Twitter. Or reach out to see how your company can navigate the hype vs the reality for your organization at

Our CEO Dr. Vishal Sikka sits down with Vianai’s Shabana Khan to explore the rapid and impressive advancements of AI, as well as the limitations of AI technology.

There’s never been a more important time to dive deep into the complexities of AI. In our video series, “A Conversation about Human-Centered AI,” we explore important questions and challenges facing AI advancements, from ethical considerations and bias issues to government regulations and the education needed to ensure AI is responsible, trusted, and aligned with the advancement of humanity.

Tune in as we explore the exciting breakthroughs in AI that we have seen accelerating in recent months, as well as the sometimes-uncomfortable questions around the reality of AI today, and how it should be managed so that it is trustworthy, reliable, and human-centered.

A Conversation about Human-Centered AI

There’s never been a more important time to dive deep into the complexities of AI. In our new video series, “A Conversation about Human-Centered AI” with Vianai’s Founder & CEO, Dr. Vishal Sikka, we explore important questions and challenges facing AI advancements, from ethical considerations and bias issues to government regulations and the education needed to ensure AI is responsible, trusted, and aligned with the advancement of humanity. Tune in as we explore the exciting breakthroughs in AI that we have seen accelerating in recent months, as well as the sometimes-uncomfortable questions around the reality of AI today, and how it should be managed so that it is trustworthy, reliable, and human-centered.

Take a look at the sneak peek here:

How We Harnessed Advanced AI to Solve Costly Supply Chain Issues

Lean Operations

Lean operations have been the holy grail of manufacturing for decades with a variety of tools and processes applied over the years, such as Kanban, Poka Yoke, and Kaizen. Yet, often, the systems rely on many working behind the scenes racing to meet tight customer deadlines, ensuring supplier delivery, and keeping the warehouses as lean as possible. This results in a monthly sprint cajoling suppliers, juggling inventory, and managing clients.

Against this backdrop, one very large manufacturer asked us to help bring advanced AI technologies to predict, as early as possible, when a supplier would miss a deadline.

The Problem

At the root of our customer’s problem was stranded inventory cost. Because our client built large, complex, and often custom-made products involving tens of thousands of components, one missing or late part could delay completion of the entire product. This occurs regardless of the cost of the component: a $1,000 part could delay a product worth several million dollars.

Today, for our customer, this results in $150 million per year, nearly half a million dollars a day, in inventory costs.

Our customer needed to have the ability to intercede as early as possible. Today, the system functions on a very human basis. Supply chain managers often know who might need more time, which suppliers are habitually late, and which parts are irreplaceable.

Yet even the best supply chain managers can’t accurately map out tens of thousands of parts. Worse yet, the ERP systems they work in often don’t have reliable or accurate information.

So, in short, the problem for our customer came down to several dimensions:

    • One part could delay an entire product, regardless of the cost of the part.
    • The supply chain professionals couldn’t look across a complete product’s thousands of parts to accurately predict where to put their attention.
    • And the systems they use had unreliable datasets.

The Solution

We started by tying our customer’s many databases together, which yielded several thousand tables of data. We then interviewed the main supply chain managers who work on these problems daily.

From their notes, we generated hundreds of features, along with additional tables, through our analysis, deep learning and feature engineering work, which leveraged our extensive understanding of supply chain management processes and ERP systems.

We learned how they interact with the data, what types of signals they look for, and what kind of tools they needed to be more successful. In this effort, we discovered a few key insights:

    • Intercession was a key component to a successful tool. Our aim should be to reduce the number of suppliers that a manager had to consider.
    • The system we built would need to use the ERP tools in ways that our client hadn’t considered.
    • Our system would need to provide tens of thousands, even 100,000 predictions in a day across all the parts in all the product lines.

The true difference and difficulty in our system comes in how we adapted it to work on incomplete data. Even using our customer’s faulty ERP data, we could predict late deliveries eight weeks out, the most important time to make a prediction about a supplier because it allows buyers to react to a change in expected behavior.

Furthermore, system accuracy improves as the delivery date approaches. Eight weeks was viewed as the key balance point between certainty in a prediction and having enough time to take mitigating actions. Our client also indicated that a supplier changing its delivery date eight days before delivery is equivalent to completely missing it, because it is too late for buyers to react. Our system therefore classified all last-minute changes as missed deliveries.
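
The labeling rule for last-minute changes can be sketched in a few lines of Python. This is a minimal illustration, not the system we built: the `Order` record, its field names, and the reading of “last-minute” as “within eight days of the due date” are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# From the text: a date change this close to delivery counts as a miss.
LAST_MINUTE_WINDOW = timedelta(days=8)


@dataclass
class Order:
    # All field names are hypothetical, chosen for illustration.
    due_date: date                          # delivery date the buyer was planning around
    delivered_date: date                    # date the part actually arrived
    date_changed_on: Optional[date] = None  # when the supplier revised the date, if it did


def is_missed(order: Order) -> bool:
    """Label an order as a missed delivery for model training."""
    if order.delivered_date > order.due_date:
        return True
    if order.date_changed_on is not None:
        # A revision inside the last-minute window leaves no time to react,
        # so it is labeled the same way as a late delivery.
        return order.due_date - order.date_changed_on <= LAST_MINUTE_WINDOW
    return False
```

For example, an order due on March 1 whose supplier revised the date on February 25 is labeled as missed, while the same order revised back in January and delivered on time is not.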

In the end, the ML models reduce the error in classifying late orders by more than 60 percent.

That translates into a 60 percent reduction in the effort users (i.e., procurement managers) spend contacting suppliers. For orders expected to be late, the ML models also reduce the error in the delivery date prediction by more than 50 percent. That is a great improvement, given that most orders have long lead times (1+ year).
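
For readers who want the arithmetic behind an “error reduction” figure, here is a short Python sketch. The counts are made up purely to illustrate the calculation and are not our customer’s data; the function name `relative_reduction` is likewise just for this example.

```python
def relative_reduction(baseline_errors: int, model_errors: int) -> float:
    """Fraction by which the error count drops relative to the baseline."""
    if baseline_errors <= 0:
        raise ValueError("baseline_errors must be positive")
    return (baseline_errors - model_errors) / baseline_errors


# Made-up counts, chosen only to illustrate the arithmetic.
baseline_misclassified = 1000  # orders misclassified without the ML models
model_misclassified = 380      # orders misclassified with the ML models
reduction = relative_reduction(baseline_misclassified, model_misclassified)
print(f"{reduction:.0%}")  # 62%
```

Every order that is no longer misclassified is one fewer supplier a manager has to chase unnecessarily, which is why the error reduction maps onto a reduction in effort.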

The Results

Our customer had a complete model that could accurately predict a late delivery or commit failure with enough time to allow supply chain managers to intercede before it endangered the shipment date of their product and caused stranded inventory issues.

We put in place a robust ML system using the data they had on hand, deriving value that they had not realized existed. It allowed for greater trust and transparency, and amplified the abilities of the supply chain managers to triage the most important and challenging problems.

For more information on what we did for this particular customer, and to find out how we could work with your company to solve complex problems using AI and ML capabilities, reach out today.

Vianai Heads to the Toronto Machine Learning Summit

Let’s meet at the Toronto Machine Learning Summit!

We are looking forward to a great event this week at the Toronto Machine Learning Summit! Sue Dunnell, Jeremy Jiang, Max Bergstein and Srikar Srinath from the Vianai team will be onsite to join the discussion and explain how we can help data scientists, machine learning engineers, machine learning operations teams and others.

We are ready to help businesses bridge the gaps across the AI lifecycle to monitor models for drift, bias, uncertainty and other risks, monitor models at a very large scale, and dramatically increase model execution speed and throughput. We’ve worked with companies to accelerate model inference speed more than 100x while reducing the model’s footprint by 300x. We are able to run state-of-the-art AI models on limited compute surfaces without taxing firewalls or routers.

We also know last-mile issues cause delays when figuring out when to retrain and redeploy models. Our next-generation monitoring capabilities help eliminate alert fatigue, and make it easy for ML engineers – or anyone tasked with monitoring models – to understand why models are drifting, and we suggest actions to take to keep models trustworthy while running in production. We call it our continuous operations process, and it includes wizard-driven policy creation for each model, highly flexible and granular custom thresholds for distance and window-based monitoring, and the ability to view billions of data points in a single window. We provide retraining recommendations and make it easy to use advanced deployment techniques like challenger/champion models. There’s a lot more, and we’d love to tell you all about it.

Attending? Find us on the Whova app to set up a meeting or visit us at Booth #2.


We can’t wait to meet with you!