How you can turn 2025 AI pilots into an enterprise platform – cio.com

Most enterprises right now are running two AIs.
The first AI is the visible, exciting one: developer-led copilots, RAG pilots in customer support, agentic PoCs someone spun up in a cloud notebook and the AI that quietly arrived inside SaaS apps. It’s fast and easy to get up and running, shows impressive potential and usually lives just outside the formal IT perimeter.
The other AI is the one the CIO has to defend: the one that must be governed, costed, secured and mapped to board expectations. Those two AIs are starting to collide — which is exactly what May Habib described when she said 42% of Fortune 500 executives feel AI is “tearing their companies apart.”
As with past waves of innovation, AI follows an inevitable path: new tech starts in the developer’s playground, then becomes the CIO’s headache and finally matures into a centrally managed platform. We saw that with virtualization, then with cloud, then with Kubernetes. AI isn’t the exception.
So far, generative AI has given application and business teams access to powerful tools that help them solve real problems without waiting for a 12-month IT cycle. Yet success breeds sprawl: enterprises are now dealing with multiple RAG stacks, different model providers, overlapping copilots in SaaS apps and no shared guardrails.
That’s the tension showing up in 2025 enterprise reporting: AI value is uneven and organizational friction is high. We have reached the point where IT has to step in and say: this is how our company approaches AI, with a single way to expose models, consistent policies, better economics and full visibility. That’s the move McKinsey describes as “build a platform so product teams can consume it.”
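To make “a single way to expose models” concrete, here is a minimal sketch of what such a gateway could look like. Everything in it (the Gateway and Policy classes, the approved-model sets, the call_provider stub) is an illustrative assumption, not a reference to any specific product:

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    allowed_models: set          # models this team is approved to call
    max_tokens: int              # per-request token ceiling

@dataclass
class Gateway:
    policies: dict                                # per-team guardrails
    usage: dict = field(default_factory=dict)     # tokens metered per team

    def complete(self, team: str, model: str, prompt: str, max_tokens: int) -> str:
        policy = self.policies[team]
        if model not in policy.allowed_models:
            raise PermissionError(f"{team} is not approved for {model}")
        if max_tokens > policy.max_tokens:
            raise ValueError("request exceeds the team's per-request token ceiling")
        # Metering every call is what gives the CIO the visibility and
        # economics the platform is supposed to provide.
        self.usage[team] = self.usage.get(team, 0) + max_tokens
        return call_provider(model, prompt, max_tokens)

def call_provider(model: str, prompt: str, max_tokens: int) -> str:
    # Stand-in for the real provider adapters (hosted APIs, on-prem serving).
    return f"[{model} answered in <= {max_tokens} tokens]"

gw = Gateway(policies={"support": Policy(allowed_models={"rag-small"}, max_tokens=512)})
print(gw.complete("support", "rag-small", "Summarize ticket 4821", max_tokens=256))
```

The point of the sketch is the shape, not the code: one entry point, shared policy enforcement and usage metering, regardless of which model or provider sits behind it.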
What’s different with AI is where the pain is. With cloud adoption, for example, security and network were the first blockers. With AI, the blocker is inference — the part that delivers the business returns, touches private and confidential data and is now the main source of opex. That’s why McKinsey talks about “rewiring to capture value,” not just adding more pilots. And this matches the widely reported results of a recent MIT study: 95% of enterprise gen-AI implementations have had no measurable P&L impact because they weren’t integrated into existing workflows.
The issue isn’t that models don’t work — it’s that they weren’t put on a common, governed path.
The biggest mistake we can make today is treating AI infrastructure like a static, dedicated resource. The demands of language models (large and small), the pressure of data sovereignty and the relentless drive for cost reduction all converge on one conclusion: AI inference is now an infrastructure imperative. And the solution is not more hardware; it’s a CIO-led platformization strategy that enforces accountability and control, making AI a strategic infrastructure service. This requires a strong separation of duties and the implementation of a scale-smart philosophy versus just a scale-up approach.
We must elevate the management of AI infrastructure to a financial priority. This mandates a clear split: the infrastructure team focuses entirely on the platform — ensuring security, managing the distributed topology and driving down the $/million tokens cost — while the data science teams focus solely on business value and model accuracy.
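As a back-of-the-envelope illustration of why $/million tokens works as a financial lever, here is the basic arithmetic; the hourly GPU cost, throughput and utilization figures are assumptions for the sake of the example, not benchmarks:

```python
gpu_cost_per_hour = 4.00      # assumed hourly cost of one inference GPU
tokens_per_second = 2_000     # assumed sustained throughput on that GPU
utilization = 0.40            # fraction of each hour doing useful work

tokens_per_hour = tokens_per_second * 3600 * utilization
cost_per_million = gpu_cost_per_hour / tokens_per_hour * 1_000_000
print(f"${cost_per_million:.2f} per million tokens")  # about $1.39 at these numbers
```

Doubling utilization to 0.8 halves the cost to roughly $0.69 per million tokens without buying any new hardware, which is exactly the kind of lever the infrastructure team owns under this split.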
This framework, which I call the AI P&L center, ensures that resource choices are treated as direct financial levers that increase margin and guarantee compliance. Research highlights that CIOs are increasingly tasked with establishing strong AI governance and cost control frameworks to deliver measurable value.
The technical strategy must implement a scale-smart philosophy: a continuous process of monitoring, analyzing, optimizing and deploying models based on economic policy, not just load. It requires deep intelligence about each workload to map the model’s needs precisely to the infrastructure’s capabilities. This operational shift is essential because it enables the effective use of resources in support of two of the most critical innovations in artificial intelligence.
In both cases, and more generally whenever a model performs inference, a double-digit reduction in $/million tokens is achievable only when every request is automatically routed based on cost policy and optimized by techniques that continuously tune the model’s execution against heterogeneous hardware. That, in turn, requires a centralized, unified platform designed and built to support inference across the enterprise.
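A minimal sketch of what routing “based on cost policy” could mean in practice, assuming two hypothetical hardware pools with made-up cost and latency figures: send each request to the cheapest pool that still meets its latency budget.

```python
from dataclasses import dataclass

@dataclass
class Pool:
    name: str
    cost_per_mtok: float   # assumed $ per million tokens on this hardware
    p95_latency_ms: int    # observed p95 latency for this model on this pool

POOLS = [
    Pool("prior-gen-gpus", cost_per_mtok=0.90, p95_latency_ms=450),
    Pool("latest-gen-gpus", cost_per_mtok=2.60, p95_latency_ms=120),
]

def route(latency_budget_ms: int) -> Pool:
    """Pick the cheapest pool that satisfies the request's latency budget."""
    eligible = [p for p in POOLS if p.p95_latency_ms <= latency_budget_ms]
    if not eligible:
        raise RuntimeError("no pool meets the latency budget")
    return min(eligible, key=lambda p: p.cost_per_mtok)

print(route(latency_budget_ms=500).name)  # batch summarization -> prior-gen-gpus
print(route(latency_budget_ms=200).name)  # interactive chat -> latest-gen-gpus
```

Latency-tolerant traffic lands on the cheaper hardware while interactive traffic pays for the fast pool, and the routing decision becomes a policy the platform owns rather than a choice each team makes ad hoc.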
The traditional approach we use to manage most of our enterprise infrastructure — what I call the scale-up mentality — is failing when applied to continuous AI inference and can’t be used to build the inference platform needed by CIOs. We’ve been provisioning dedicated, oversized clusters, often purchasing the newest and largest GPUs and replicating the resource-intensive environment required for training.
This is fundamentally inefficient for at least two reasons: inference demand is variable, so dedicated capacity sits idle between peaks, and most inference workloads don’t need training-class hardware, so the premium paid for the newest, largest GPUs is largely wasted.
A unified platform is not about forcing alignment to a single model; it’s about establishing the governance layer necessary to unlock a much wider variety of models, agents and applications that meet enterprise security and cost management requirements.
The transition from scale-up to scale-smart is the essential, unifying task for the technology leader. The future of AI is not defined by the models we train, but by the margin we capture from the inference we run.
The strategic mandate for every technology leader must be to embrace the function of platform owner and financial architect of the AI P&L center. This structural change ensures that data science teams can continue to innovate at speed, knowing the foundation is secure, compliant and cost-optimized.
By enforcing platformization and adopting a scale-smart approach, we move beyond the wild west of uncontrolled AI spending. The choice for CIOs is clear: continue trying to manage the escalating cost and chaos of decentralized AI, or seize the mandate to build the AI P&L center that turns inference into a durable, margin-driving competitive advantage.

This article is published as part of the Foundry Expert Contributor Network.

Dante Malagrino is a seasoned Silicon Valley executive, entrepreneur and technology leader with a deep focus on data security, AI adoption and enterprise platform evolution. He is currently the CEO and co-founder of servescale.ai, an AI inference-serving platform provider for enterprises.

Previously, Dante served as chief product and technology officer at Protegrity, where he was instrumental in advancing the company’s data protection platform, enabling global enterprises to navigate multi-cloud analytics while ensuring strict regulatory compliance. He also held a key leadership role at Riverbed, driving the development and go-to-market strategy for cloud and digital networking solutions, including the launch of major SaaS acceleration products.

Malagrino spent a significant part of his career at Cisco in various technical and business executive roles, playing a key part in critical industry transitions across storage, networking and compute. As an entrepreneur, he co-founded Embrane, a company focused on software-defined networking solutions, which successfully raised several rounds of VC funding and was acquired by Cisco in 2015.