The AI conversation has been dominated by model size, training breakthroughs and eye-watering infrastructure spend, but I get a real sense that this is increasingly the wrong lens. For me, the true battleground is not where AI is built but where it is used, and that place is inference.
Inference is where AI models move from theoretical capability to operational reality. Every customer interaction, financial decision, medical recommendation or autonomous action happens at this layer. It is continuous, not episodic; it is the brokerage of AI. It is where cost accumulates, where latency matters and, crucially, where risk materialises. That is precisely why it is also the primary cyber security attack surface in AI systems, and exactly where adversaries operate through:
- Prompt injection and manipulation at runtime
- Data leakage and exfiltration through model responses
- Model inversion and extraction attacks
- Abuse of toolchains and API integrations
In short, inference is where AI systems are interacted with and therefore attacked. At present, most inference stacks are engineered for efficiency, not assurance. They route queries, optimise compute and balance workloads but they do not prove that outcomes are compliant, safe or even correct. In a world shaped by the EU AI Act, DORA and NIS2, that is no longer sufficient. It is not enough for AI to simply work; it must be demonstrably trustworthy at the point it operates. This is where trust and cyber security converge.
WAKE UP Big Tech … Trust can no longer be a policy statement or retrospective audit. It must be enforced at runtime: continuously validating inputs, constraining outputs, monitoring behaviour and generating evidence. In cyber terms, inference must evolve into a zero-trust execution layer with real-time assurance because, ultimately, inference is not just where AI delivers value; it is where it is tested, exploited and either secured or compromised.
This is why inference is fast becoming the economic control point of the AI market, and a very exposed one at that. Optimising inference determines margin. Orchestrating inference determines performance. Governing inference, and demonstrating verifiable real-time assurance, determines whether AI can be deployed at scale, and that last point is where trust becomes decisive.
This reframes the market entirely. Trust is no longer a policy exercise or post-hoc audit; it must be embedded at runtime and verifiable in real time, continuously validating behaviour, enforcing boundaries and generating authoritative evidence.
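To make that less abstract, here is a minimal sketch of what runtime enforcement could look like: validate the input, constrain the output, and write tamper-evident evidence for every decision. Everything in it is my own illustration rather than any vendor's stack: the names `guarded_inference` and `EvidenceLog`, the two toy regex policies and the hash-chained log format are all assumptions, and a real deployment would need far richer policies than a pair of patterns.

```python
import hashlib
import json
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical toy policies, for illustration only.
INJECTION_PATTERNS = [re.compile(r"ignore (all )?previous instructions", re.I)]
SECRET_PATTERN = re.compile(r"\b\d{16}\b")  # something shaped like a card number

GENESIS_HASH = "0" * 64  # assumed starting value for the hash chain


@dataclass
class EvidenceLog:
    """Append-only log; each record commits to the hash of the one before it."""
    records: list = field(default_factory=list)

    def append(self, event: dict) -> dict:
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS_HASH
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        record = {"event": event, "prev": prev_hash, "hash": digest}
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered record breaks it."""
        prev_hash = GENESIS_HASH
        for record in self.records:
            body = json.dumps(record["event"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
            if record["prev"] != prev_hash or record["hash"] != expected:
                return False
            prev_hash = record["hash"]
        return True


def guarded_inference(prompt: str, model: Callable[[str], str], log: EvidenceLog) -> str:
    """Wrap a model call with an input check, an output check and evidence."""
    # Validate the input before the model ever sees it.
    if any(pattern.search(prompt) for pattern in INJECTION_PATTERNS):
        log.append({"stage": "input", "decision": "blocked"})
        return "[request blocked by input policy]"
    log.append({"stage": "input", "decision": "allowed"})

    # Constrain the output before it leaves the runtime.
    raw = model(prompt)
    redacted = SECRET_PATTERN.sub("[REDACTED]", raw)
    decision = "redacted" if redacted != raw else "passed"
    log.append({"stage": "output", "decision": decision})
    return redacted
```

Here `model` is any callable that maps a prompt to a response, so the wrapper sits in front of whatever actually serves inference. The point of chaining the hashes is that an auditor can call `verify()` later and detect whether any decision record was altered after the fact, which is the difference between a dashboard and evidence.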
My concern is that we find ourselves in the peculiar position of building ever more intelligent (and I use that term loosely) machines, only to deploy them through runtime layers that trust everything and verify nothing, comforted, no doubt, by dashboards reassuring us that performance is excellent, right up until the moment s**t hits the fan. So stop treating inference as just another technical layer: it is the moment where AI either earns trust or loses the right to operate.
Posted on April 13, 2026