The engineering and architectural review of a vendor's agentic AI platform conducted during procurement evaluation, alongside commercial, legal, and technical due diligence. The fourth DD lens — board-readable, written for the people who sign on the line, not the people who fix it after.
AI Agent Infrastructure
From 'Do AI' Mandate to Governed Agent Architecture
Your board says 'do AI.' Your team has six disconnected experiments. Nobody has encoded what the organisation actually wants these agents to achieve, or how to govern them when they act autonomously. Intent engineering solves the gap between AI capability and organisational purpose. Model Context Protocol provides the universal integration layer. Multi-agent orchestration coordinates it all. We build the governed infrastructure that turns AI experiments into enterprise operations.
By Gregory McKenzie · Registered Trans-Tasman Patent Attorney & Systems Architect · NETEVO
Why Most Enterprise AI Initiatives Stall
The AI race has shifted. It's no longer about which organisation has the most sophisticated model. It's about which organisation has built the infrastructure that allows AI to operate with a strategically correct understanding of what the firm is actually trying to accomplish.
Most enterprises are stuck in what Deloitte calls the 'AI preparedness gap.' Strategic intent is high, but operational readiness in infrastructure, data management, and talent is lagging. Only 6% of firms attribute more than 5% of their EBIT to AI. The rest are running experiments that never reach production.
The root cause isn't technical capability. Your engineers can build an AI chatbot in a weekend. The root cause is three missing layers: intent (what should the agent optimise for?), integration (how does it access your systems?), and governance (who's responsible when it acts autonomously?). Without these layers, every AI project is an island.
Meanwhile, each team picks its own tools, builds its own integrations, and creates its own prompt libraries. Shadow AI spreads. Governance gaps widen. The board asks for an AI strategy; engineering delivers a tools list. The gap between 'we have AI' and 'AI creates measurable enterprise value' is an infrastructure problem, not a model problem.
Then the Lilli incident landed. In March 2026, an autonomous AI agent breached McKinsey's internal generative-AI platform in under two hours for twenty dollars in tokens, surfacing the architectural pattern that had been hiding inside every agentic AI deployment: when the consumer of a vendor system is an autonomous agent rather than a human at a screen, the user interface stops being the permission boundary. The permission checks the UI performed silently now have to be made explicitly in code, on every API call. McKinsey's response was textbook, but the procurement question that should have come first was never asked.
Symptoms of missing AI agent infrastructure:
- Board mandates 'do AI' but nobody owns the architecture
- Multiple AI experiments, no shared infrastructure or governance
- Shadow AI spreading across teams with no visibility
- Agents optimising for wrong metrics (speed vs. quality vs. compliance)
- No audit trail for autonomous agent decisions
- Integration complexity growing with every new AI tool
- Vendor agentic AI platforms procured without engineering or architectural review
What You Get: The Agent Infrastructure Blueprint
From strategy to governed agent operations in 4-8 months.
Your organisational purpose encoded as machine-readable decision frameworks. Trade-off hierarchies, decision boundaries, and alignment measurement — so agents optimise for what your business actually values, not just what's technically efficient.
A universal integration layer connecting your AI agents to enterprise systems. One governed endpoint instead of dozens of brittle, one-off connections. Progressive discovery means agents find the right tools without consuming massive context windows.
Coordinated agent workflows replacing disconnected experiments. Role assignment, conflict resolution, shared context, and monitoring — with human-in-the-loop controls where your risk profile demands them.
Your organisational knowledge structured for agent consumption. Entity relationships, domain logic, and business rules mapped into a queryable knowledge graph that grounds agent reasoning in verified facts, not probabilistic guesses.
Systems designed for autonomous agent operation from day one. Deterministic CLIs, machine-readable APIs, and governance-as-code pipelines — so your products work for both human users and autonomous agents without separate integration layers.
Zero Trust for autonomous systems. Every agent action logged immutably. Sandboxed execution, least-privilege access, continuous verification, and context snapshotting — the evidence trail your auditors and regulators need.
From Experiments to Enterprise Operations
The measurable shift from pilot purgatory to governed AI value.
The Foundation We've Already Built
RISKflo
13M+ events per year, fully governed
Event-sourced architecture processing 13+ million events annually with 99% correlation accuracy. Every action permanently recorded, every decision auditable. This is the same architectural pattern — event sourcing, policy-as-code, immutable audit trails — that underpins governed AI agent infrastructure. The platform we built for enterprise risk management is the blueprint for enterprise agent management.
Read Full Case Study
The technical blueprint
Architectural detail for CTOs, CISOs, and Lead Architects. (Whitepaper coming.)
Notify me when the Whitepaper is released
How It Works
From intent discovery to governed agent operations in 4-8 months.
Intent Discovery & Architecture
Weeks 1-4
- Audit existing AI experiments and shadow AI across teams
- Map organisational intent to machine-readable decision frameworks
- Design agent architecture, MCP topology, and governance model
- Select orchestration framework and integration patterns
MCP & Integration Layer
Weeks 5-10
- Implement Model Context Protocol infrastructure
- Build central tool registry with governed discovery
- Connect enterprise systems as MCP servers
- Establish proxied access with audit logging
Agent Development & Orchestration
Weeks 11-20
- Build agent workflows for prioritised use cases
- Implement multi-agent orchestration with conflict resolution
- Deploy monitoring and human-in-the-loop controls
- Iterate based on alignment measurement
Production & Governance
Weeks 21-32
- Zero Trust agent deployment to production
- Policy-as-code for AI-specific operations
- Automated audit trails for all agent actions
- Team enablement, documentation, and training
Questions
AI Agent Infrastructure FAQ
What is Agentic Due Diligence?
Agentic Due Diligence (ADD) is the engineering and architectural review of a vendor's agentic AI platform conducted during procurement evaluation — alongside commercial, legal, and technical due diligence — to determine whether the platform can be safely deployed before a contract is signed. ADD examines four dimensions: agent identity and scoping (does the platform distinguish human users from agents, and can scope be narrowed per task?), policy-as-code enforcement (are controls executable or only asserted in PDFs?), audit and observability (is every agent action logged immutably with full decision context?), and revocation (can the buyer pull an agent's access from a console without waiting for a vendor deploy?). The output is a board-readable risk position the buyer can sign on, not a checklist of vendor self-assertions. NETEVO conducts ADD as a discrete pre-contract engagement; many buyers commission it before going to procurement so the vendor brief is informed by the architectural posture they need.
What does the McKinsey Lilli incident reveal about AI agent procurement?
The March 2026 disclosure that an autonomous AI agent breached McKinsey's internal Lilli platform in under two hours — for twenty dollars in tokens — is best read as a market signal, in the words of IDC analyst Alessandro Perilli, about how procurement and architecture have decoupled. The incident is not a McKinsey-specific failure; it is a category-wide architectural pattern. When the consumer of a vendor system is an autonomous agent rather than a human at a screen, the user interface stops being the permission boundary. Twenty-two endpoints out of two hundred-plus required no authentication, and the database held writable AI configuration — system prompts, RAG knowledge bases — alongside user data. McKinsey patched within hours; the procurement-architecture question of how 22 unauthenticated endpoints reached production in the first place is the one boards are now asking. The architectural pattern Lilli exposed is the one every enterprise agentic AI deployment has to design against. This is the territory Agentic Due Diligence (ADD) operates in.
Does AU regulation already require what the Lilli incident exposed as missing?
Yes, in significant part. The Australian Prudential Regulation Authority's CPS 230 (operational risk management; effective 1 July 2025) covers material service-provider and technology dependencies — agentic AI platforms sit squarely inside scope for APRA-regulated entities. The Privacy and Other Legislation Amendment Act 2024 introduced disclosure requirements for substantially automated decisions affecting individuals' rights or interests. The ASX Listing Rule 3.1 continuous-disclosure framework applies to material events arising from AI-system compromise the way it has always applied to other technology-incident exposure. The National AI Centre's Guidance for AI Adoption (October 2025) sets out six Essential Practices — accountability, impacts and planning, risk measurement and management, information sharing, testing and monitoring, human control — that form the de-facto private-sector benchmark. NETEVO encodes obligations like these as executable controls in policy-as-code; we do not interpret the application of any specific statute to any specific factual scenario, which is legal-practitioner work.
What is intent engineering and why does it matter for enterprise AI agents?
Intent engineering is the practice of encoding organisational purpose, values, and decision-making frameworks into machine-readable formats that AI agents can consume and act upon autonomously. It's the third era of AI interaction, following prompt engineering (crafting instructions) and context engineering (providing data via RAG). The distinction matters because context alone tells an agent what to know, but not what to want or how to prioritise competing objectives. Without intent engineering, AI agents optimise for metrics that may be misaligned with business values. Intent engineering addresses this by formalising three layers: strategic (translating OKRs into agent parameters), operational (defining decision boundaries), and feedback (measuring alignment drift). For enterprise organisations, this means transforming the implicit values and heuristics that human employees acquire through culture into explicit, machine-readable decision frameworks.
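The three layers above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not NETEVO's implementation: the class names, the `refund` boundary, the objective names, and the scoring scheme are all hypothetical. The strategic layer is a ranked list of objectives, the operational layer is a hard boundary that forces escalation, and the feedback layer measures how often observed agent choices diverge from the framework.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DecisionBoundary:
    """Operational layer: a hard limit the agent may not cross on its own."""
    name: str
    max_value: float  # hypothetical example: a refund ceiling in dollars


@dataclass
class IntentFramework:
    """Strategic layer: objectives ranked from most to least important."""
    priorities: list[str]  # trade-off hierarchy, highest priority first
    boundaries: dict[str, DecisionBoundary] = field(default_factory=dict)

    def choose(self, options: list[dict]) -> dict:
        """Pick the option that scores best on the top-ranked objective,
        using lower-ranked objectives only to break ties."""
        return max(
            options,
            key=lambda o: tuple(o["scores"].get(p, 0) for p in self.priorities),
        )

    def requires_escalation(self, boundary: str, value: float) -> bool:
        """True when a proposed value exceeds the named boundary."""
        return value > self.boundaries[boundary].max_value


def alignment_drift(framework: IntentFramework, cases: list[tuple]) -> float:
    """Feedback layer: share of observed choices that differ from the
    framework's choice. Each case is (options, name_the_agent_chose)."""
    if not cases:
        return 0.0
    misses = sum(
        1 for options, chosen in cases
        if framework.choose(options)["name"] != chosen
    )
    return misses / len(cases)
```

Ranking objectives lexicographically is a deliberately strict choice: a fast option never beats a compliant one, however large the speed gain. A weighted score would be the softer alternative.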
What is Model Context Protocol and how does it work for enterprise AI integration?
Model Context Protocol (MCP) is a standardised interface that allows AI agents to access tools, databases, and business applications through a universal client-server model. Before MCP, every combination of an AI model and enterprise system required bespoke integration, creating fragmented governance and shadow AI. MCP provides three capability types: tools (execute actions), resources (provide data), and prompts (offer templates). For enterprise deployment, implementation follows three patterns: a Central Tool Registry as single source of truth, Proxied Access with rate limiting and audit logging, and a Controller/Worker Pattern for scalable operations. Progressive discovery — agents querying only for relevant tools as needed — reduces token usage by up to 98.7%, significantly lowering operational costs.
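To make the registry, proxy, and progressive-discovery ideas concrete, here is a small sketch in plain Python. It does not use any MCP SDK, and it is not the MCP wire protocol: `Tool`, `ToolRegistry`, `discover`, and `call` are hypothetical names chosen for illustration. The point is only the shape: one place where tools are declared, a discovery call that returns short summaries of matching tools instead of every definition, and a single call path that logs each invocation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    handler: Callable[[dict], object]


class ToolRegistry:
    """Central registry: the single place tools are declared and looked up."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}
        self.audit_log: list[dict] = []

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def discover(self, query: str, limit: int = 3) -> list[dict]:
        """Progressive discovery: return summaries of matching tools only,
        so the agent does not load every definition into its context."""
        words = query.lower().split()
        matches = [
            t for t in self._tools.values()
            if any(w in t.name.lower() or w in t.description.lower() for w in words)
        ]
        return [{"name": t.name, "description": t.description} for t in matches[:limit]]

    def call(self, agent_id: str, name: str, args: dict) -> object:
        """Proxied access: every call passes through here and is logged."""
        tool = self._tools[name]  # raises KeyError for unregistered tools
        self.audit_log.append({"agent": agent_id, "tool": name, "args": args})
        return tool.handler(args)
```

A production proxy would also apply rate limits and per-agent authorisation before dispatching; those are omitted here to keep the sketch short.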
How does multi-agent orchestration work in enterprise environments?
Multi-agent orchestration coordinates different types of AI agents within unified workflows: capturing intent, planning subtasks, assigning roles with access control, sharing context through APIs, monitoring with human-in-the-loop controls, and preserving outcomes for improvement. The agent hierarchy mirrors an organisational chart: LLM agents reason through problems, workflow agents manage execution flow, and custom agents handle specific business logic. In non-trivial systems, the orchestration framework includes conflict detection and resolution — multiple agents may propose competing actions, and the system applies deterministic rules or priority hierarchies to resolve them. Without this layer, organisations face inconsistent decisions and hidden operational risk.
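The conflict-resolution step can be illustrated with a deterministic priority rule. This is a sketch under assumptions: the role names, the `ROLE_PRIORITY` ordering, and the `Proposal` shape are hypothetical, and a real orchestrator would draw them from its governance model. When several agents propose actions on the same resource, the proposal from the highest-priority role wins, and ties are broken by agent name so the result never depends on arrival order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    agent: str
    role: str
    resource: str
    action: str


# Hypothetical priority hierarchy: a lower number wins a conflict.
ROLE_PRIORITY = {"compliance": 0, "finance": 1, "sales": 2}


def resolve(proposals: list[Proposal]) -> dict[str, Proposal]:
    """Return one winning proposal per resource.

    Proposals are ordered by role priority, then agent name, and the first
    proposal seen for each resource is kept. The same inputs therefore
    always give the same winners, regardless of input order.
    """
    winners: dict[str, Proposal] = {}
    for p in sorted(proposals, key=lambda p: (ROLE_PRIORITY[p.role], p.agent)):
        winners.setdefault(p.resource, p)
    return winners
```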
What is an agent-native product and how is it different from agent-friendly?
Agent-native products are designed with AI agents as the primary operators, not an afterthought. They favour deterministic CLIs and local execution over polished UIs, enabling end-to-end autonomous operation. The progression runs from agent-powered (passive AI benefit, human operates) through agent-friendly (APIs allow human-guided agents) to agent-native (entire pipeline automatable by autonomous agents). The critical distinction is the failure mode: agent-native products fail through governance ownership gaps, making governed architecture the essential foundation. For enterprises, this means rethinking API design for AI consumption, writing tool descriptions for AI reasoning, and embedding governance directly in the pipeline.
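As one illustration of what "deterministic CLI" can mean in practice, the sketch below returns machine-readable JSON with stable key order and a distinct exit code per outcome, so an agent can branch on the result without parsing free text. The `orders` command, the `ORDERS` data, and the exit code `3` for "not found" are hypothetical choices made for this example.

```python
import argparse
import json

# Hypothetical in-memory data standing in for a real backend.
ORDERS = {"A-100": "shipped"}


def run(argv: list[str]) -> tuple[int, str]:
    """Run the command and return (exit_code, stdout_text).

    Output is always JSON with sorted keys, so identical inputs
    produce byte-identical output.
    """
    parser = argparse.ArgumentParser(prog="orders")
    parser.add_argument("order_id", help="Order identifier, for example A-100")
    args = parser.parse_args(argv)

    status = ORDERS.get(args.order_id)
    if status is None:
        payload = {"error": "not_found", "order_id": args.order_id}
        return 3, json.dumps(payload, sort_keys=True)
    return 0, json.dumps({"order_id": args.order_id, "status": status}, sort_keys=True)
```

Wiring `run(sys.argv[1:])` to `print` and `sys.exit` turns this into a command-line entry point; keeping the logic in a function that returns the exit code makes it testable without spawning a process.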
How do knowledge graphs and business ontologies support AI agent decision-making?
Knowledge graphs provide the structured understanding that LLMs lack. LLMs reason probabilistically over unstructured text — powerful but unpredictable. Knowledge graphs reason deterministically over structured relations — rigid but auditable. Combined, they deliver balanced decision-making with explainable actions. Business ontologies define relationships between entities and domains, and when connected as knowledge graphs, they give agents navigable, grounded context. For enterprise deployment, this means building a structured repository of your firm's logic that is discoverable via APIs and follows formal schemas for logical reasoning. The knowledge graph connects to agent infrastructure via MCP, with governance over how agents query and act on institutional knowledge.
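A toy example shows what "reasoning deterministically over structured relations" looks like. The entities, the `belongs_to` and `governed_by` predicates, and the `KnowledgeGraph` class are hypothetical; a real deployment would sit on a graph database with a formal schema. The query follows two relations and returns each answer together with the chain of facts that justifies it, which is where the auditability comes from.

```python
Triple = tuple[str, str, str]  # (subject, predicate, object)


class KnowledgeGraph:
    """A minimal in-memory triple store."""

    def __init__(self, triples: list[Triple]) -> None:
        self._triples = set(triples)

    def objects(self, subject: str, predicate: str) -> list[str]:
        """All objects linked to `subject` by `predicate`, in sorted order."""
        return sorted(o for s, p, o in self._triples if s == subject and p == predicate)

    def policies_for(self, entity: str) -> list[tuple[str, list[Triple]]]:
        """Follow entity -> domain -> policy and return (policy, path) pairs,
        where path is the list of facts that supports the answer."""
        results = []
        for domain in self.objects(entity, "belongs_to"):
            for policy in self.objects(domain, "governed_by"):
                path = [(entity, "belongs_to", domain), (domain, "governed_by", policy)]
                results.append((policy, path))
        return results
```

Given the same triples, the same question always produces the same answer and the same supporting path, which an LLM on its own cannot guarantee.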
What does Zero Trust architecture look like for AI agents?
Zero Trust for AI agents extends traditional principles to autonomous systems: context segmentation (sandboxed execution with strict egress filters), least privilege by default (minimum permissions per task), continuous verification (permissions reassessed as context changes), and context snapshotting (full state capture before critical actions for rollback). Every agent action generates immutable audit logs — timestamps, agent identifiers, actions, target resources, authorisation decisions. This transparency is essential for emerging regulations including the EU AI Act. Implementation combines policy-as-code enforcement with event sourcing for complete action history.
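The audit fields listed above, combined with least-privilege checks, can be sketched as a hash-chained log. Treat this as an assumption-laden illustration: `AuditLog` and `authorise` are hypothetical names, and an in-memory hash chain is tamper-evident rather than truly immutable, so a real system would back it with write-once storage. Each entry stores the hash of the previous entry, and every authorisation decision, allow or deny, is recorded.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditLog:
    """Append-only log in which each entry carries the previous entry's hash,
    so any later modification breaks the chain and can be detected."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(entry: dict) -> str:
        payload = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def record(self, agent_id: str, action: str, resource: str, allowed: bool) -> dict:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "action": action,
            "resource": resource,
            "decision": "allow" if allowed else "deny",
            "prev_hash": self._entries[-1]["hash"] if self._entries else self.GENESIS,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check each link to the previous entry."""
        prev = self.GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True


def authorise(log: AuditLog, agent_id: str, granted: set, action: str, resource: str) -> bool:
    """Least privilege: allow only explicitly granted (action, resource) pairs,
    and log the decision either way."""
    allowed = (action, resource) in granted
    log.record(agent_id, action, resource, allowed)
    return allowed
```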
How long does it take to implement enterprise AI agent infrastructure?
Typically 4-8 months from strategy to governed production. Phase 1 (Weeks 1-4): Intent discovery, architecture design, and governance model. Phase 2 (Weeks 5-10): MCP infrastructure, tool registry, and system integration. Phase 3 (Weeks 11-20): Agent workflow development, orchestration, and monitoring. Phase 4 (Weeks 21-32): Production deployment, policy-as-code, and team enablement. Organisations with an existing Governed SDLC typically complete in 4-5 months because the policy-as-code foundation is already established. The main variables are the number of systems, regulatory complexity, platform maturity, and organisational readiness.
What is the difference between NETEVO's approach and hiring an AI consultancy?
Most AI consultancies focus on model selection and pilot projects that remain experimental. NETEVO starts from infrastructure, not from the model. Three distinctions: (1) Intent before intelligence — we encode organisational purpose before deploying AI capability. (2) Governed by design — every agent action generates audit evidence via policy-as-code, applying patent-attorney rigour to AI operations. (3) Internal ownership — post-engagement, your team owns and operates the infrastructure independently. The result: governed production infrastructure instead of impressive demos that never scale.
Is This Right for You?
This is for you if...
- Board or leadership has mandated AI strategy with no clear architecture
- Multiple AI experiments running with no shared infrastructure
- Shadow AI is spreading across teams without governance
- You need AI agents to act within defined organisational boundaries
- Regulatory requirements demand auditable AI decision-making
- Existing Governed SDLC foundation ready for agent layer
- Enterprise systems need governed AI integration (not just chatbots)
This is NOT for you if...
- You need a single chatbot or copilot (simpler tools exist)
- No executive sponsorship for AI infrastructure investment
- Engineering team <30 people (start with Governed SDLC first)
- Looking for model selection advice without infrastructure commitment
- Budget under $40K (consider Governed SDLC as foundation first)
The vocabulary
Three named patterns NETEVO uses across the agentic AI engagement. The vocabulary is precise because the architecture is precise.
- Bounded SaaS / Unbounded Agents
- The two procurement regimes. Bounded SaaS is the legacy regime in which a human user interacts with a vendor application through a UI that silently mediates permission. Unbounded Agents is the emerging regime in which an autonomous software agent interacts via APIs with no UI; permission must be verified explicitly in code on every call.
- Implicit Authority Cascade (IAC)
- The failure mode in multi-agent workflows where an upstream agent delegates a task to a downstream agent without explicitly narrowing the downstream agent's scope. The downstream agent inherits the upstream agent's full authority by default. The 'implicit' is load-bearing — the failure is that scope was never narrowed, not that scope was widened.
- Agentic Due Diligence (ADD)
- The engineering and architectural review of a vendor's agentic AI platform conducted during procurement evaluation, alongside commercial, legal, and technical due diligence. Examines four dimensions: agent identity and scoping, policy-as-code enforcement, audit and observability, and revocation.
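The first two patterns can be shown side by side in code. This is a hypothetical sketch, with `AgentContext`, `delegate`, `check`, and the scope strings invented for illustration. The Implicit Authority Cascade is what happens when a child agent is simply handed the parent's scopes. The `delegate` function below forces the caller to name the scopes being passed down and refuses to grant anything the parent does not hold, and `check` is the per-call verification that an Unbounded Agents regime requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentContext:
    agent_id: str
    scopes: frozenset[str]


def delegate(parent: AgentContext, child_id: str, requested: set[str]) -> AgentContext:
    """Create a child context holding only the explicitly requested scopes.

    The anti-pattern (Implicit Authority Cascade) would be
    AgentContext(child_id, parent.scopes): the child inherits everything
    because nobody narrowed the scope.
    """
    wanted = frozenset(requested)
    excess = wanted - parent.scopes
    if excess:
        raise PermissionError(f"parent lacks requested scopes: {sorted(excess)}")
    return AgentContext(child_id, wanted)


def check(ctx: AgentContext, scope: str) -> bool:
    """Explicit permission check, intended to run on every API call."""
    return scope in ctx.scopes
```

For example, a parent holding `crm:read`, `crm:write`, and `billing:refund` can delegate a summarisation task with `delegate(parent, "summariser", {"crm:read"})`; the child can then read CRM records but any refund call fails its `check`.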
How We're Different
Intent before intelligence
We encode what your organisation values before deploying any AI capability. Agents aligned with organisational purpose, not just technically functional.
Infrastructure, not experiments
Shared MCP infrastructure replaces disconnected pilots. Every new agent inherits governed access, audit trails, and alignment measurement.
Patent-grade governance
The same evidentiary standards used in patent prosecution applied to AI agent operations. Defensible, auditable, board-ready.
Works Best With
The foundation layer.
Policy-as-code and governed pipelines provide the infrastructure foundation that AI agent operations extend. Organisations with Governed SDLC in place deploy agent infrastructure 30-40% faster.
Learn more
The compliance and readiness layer.
Organisational readiness, workforce transformation, and regulatory compliance. AI Agent Infrastructure is the technical layer; AI Governance ensures the organisation is ready to operate it.
Learn more
Be cited by the agents you're building for.
The same entity authority and structured data that makes your brand visible to external AI agents also strengthens your internal knowledge graph.
Learn more
Ready to Move Beyond AI Experiments?
A short discovery call. We'll discuss your current AI landscape, identify the highest-impact infrastructure gaps, and outline what governed agent operations would look like in your context.