As boards of directors call for the deployment of artificial intelligence, IT leaders such as chief information officers know that there is more to the story than having a solid AI use case.
The challenge preventing technology leaders from deploying AI isn’t actually generating a model and rolling it out, said Prukalpa Sankar, co-founder of Atlan, which makes data catalog and management software. Instead, she said, the obstacle is data that is not AI-ready. “Everyone is ready for AI except your data,” Sankar said.
In a recent global study of more than 1,300 technology and data executives, only 18% of companies say they are fully ready for AI deployment, meaning their data is fully accessible and unified (another 40% consider themselves mostly ready, but not completely there).
To get to that point of readiness, Sankar said companies must overcome several hurdles. The first is finding and organizing all of a company’s data, a job primarily for data engineers. “You want to bring together data that is otherwise siloed in different business units to actually deploy for a specific use case,” she said.
Businesses must also complete complex data labeling and classification, primarily to keep private data within proper boundaries. “Depending on who is asking the question, I can change the data behind it,” Sankar said. For example, a human resources chatbot might be able to use payroll data while a general-purpose chatbot cannot.
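One way to picture that kind of label-based boundary is the minimal Python sketch below. It is an illustration only, not how Atlan or any other vendor implements access control: the dataset names, the sensitivity labels and the `ALLOWED_LABELS` policy table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    label: str  # hypothetical sensitivity label, e.g. "public" or "payroll"


# Hypothetical policy: the labels each chatbot is cleared to read.
ALLOWED_LABELS = {
    "hr_chatbot": {"public", "payroll"},
    "general_chatbot": {"public"},
}


def visible_datasets(chatbot, datasets):
    """Return only the datasets whose label the given chatbot may read."""
    # A chatbot with no policy entry sees nothing (deny by default).
    allowed = ALLOWED_LABELS.get(chatbot, set())
    return [d for d in datasets if d.label in allowed]


catalog = [
    Dataset("holiday_calendar", "public"),
    Dataset("salary_records", "payroll"),
]

hr_view = [d.name for d in visible_datasets("hr_chatbot", catalog)]
general_view = [d.name for d in visible_datasets("general_chatbot", catalog)]
```

Here the HR chatbot would see both data sets, while the general-purpose one would see only the holiday calendar. The sketch only works if every data set carries a label, which is why labeling and classification come first.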
With AI, data management is not so cut and dried
All of this falls under the umbrella of data governance, or how an enterprise manages data assets through policies, processes and standards. Matt Carroll, CEO and co-founder of data security platform Immuta, said data management is not new, but AI is changing how it is done.
“When you think about traditional business intelligence, which we’ve been doing for 30 years, management has always been a structured, well-oiled machine,” Carroll said. “The way you introduce AI, you can’t do it the same way.”
This is because businesses must constantly add new data to support AI models from both internal and external sources.
Ultimately, Carroll said, AI readiness comes down to three things: “They have to be able to find the data, they have to use it, and they have to be able to observe how it’s being used.”
Having a mature data management pipeline is not common across industries, or at least not yet. A 2024 AI Readiness Report from MIT found that data governance, trust and security are a greater focus in government and financial institutions than in other industries. Carroll said that this practice should extend far beyond banks and government, since they are not the only sectors that handle sensitive data. All businesses pursuing generative or other types of AI solutions must perform a dance between IT, legal and broader organizational managers, as well as the departments into which they trickle down.
Additionally, Carroll wants to see more businesses implement continuous data readiness even after deploying AI. One way companies can do this is through an AI hotline, which could be a full hotline in a large company, or a more feasible managed Slack channel in a smaller company. What matters is that domain experts have a direct line to the engineering team to report issues such as hallucinations or incorrect data labeling.
“They need that feedback loop, so maybe a model review board can take it down or reevaluate it, or possibly flag it for retraining and revalidation,” Carroll said, “which isn’t a negative thing, by the way. That’s just the game.”
This is, of course, in addition to continuous testing of models to look for abnormal behavior and make sure they meet the company’s quality standards.
Companies are getting creative to get ready for AI
From the start of AI deployment journeys, Sankar said she’s seeing companies create AI readiness scores to help quantify the process of getting their data in order. For example, such a score might rate a data set out of 5.0, based on a range of factors. “Unless you measure it, nothing moves,” she said.
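As a rough illustration of how such a score could be computed, the Python sketch below takes a weighted average of a few factors and scales it to 5.0. The factor names and equal weights are assumptions made for the example; the article does not say which factors companies actually use or how they weight them.

```python
# Hypothetical factors and weights; the weights sum to 1.0.
WEIGHTS = {
    "discoverable": 0.25,  # can the data set be found in a catalog?
    "labeled": 0.25,       # is it classified for sensitivity?
    "owned": 0.25,         # does it have a named data steward?
    "fresh": 0.25,         # is it updated on schedule?
}
MAX_SCORE = 5.0


def readiness_score(factors):
    """Scale a weighted average of factor values (each 0.0 to 1.0) to 0.0-5.0."""
    for name in WEIGHTS:
        value = factors.get(name)
        if value is None or not 0.0 <= value <= 1.0:
            raise ValueError(f"factor {name!r} must be between 0.0 and 1.0")
    weighted = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    return round(weighted * MAX_SCORE, 1)


example = {"discoverable": 1.0, "labeled": 0.6, "owned": 1.0, "fresh": 0.6}
score = readiness_score(example)
```

With these made-up inputs the data set would score 4.0 out of 5.0. The value of a score like this lies less in the exact formula than in making progress comparable from one data set, or one quarter, to the next.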
Another trend experts are seeing is adding a secondary title of data steward to an employee’s primary role. “You’re in the business, you happen to know the domain, but now, all of a sudden, you’re going to own this dataset that may or may not be used for AI,” Carroll said. Additionally, he said, highly specialized data controllers (who might have an official title of data management managers or data management engineers, for example) are hard to find, but increasingly important and something we’ll see more of in the future.
Sankar likens the data infrastructure ecosystem to a marketplace. “On one side of the market you have business-ready AI use cases,” she said. “And on the other side is your complicated data infrastructure.”
For organizations pursuing AI, experts agree that data readiness must come first. But even the broad category of data readiness breaks down further. Before even tackling step one, Carroll said, it’s worth asking what can be an unpopular question in the C-suite: “In data readiness, there’s also a question of, should you do it at all?” By this, Carroll means that all companies face an ethical decision about whether to expose certain types of data in their systems. Only with that approval can companies truly pursue AI readiness.