The first article in this series identified a structural gap: in regulated organisations deploying AI agents, the question of who carries operational accountability for what those agents do does not have a clean answer. The accountability is present in formal documents. It is less present in practice.
A reasonable response to that observation is to ask what already exists. If the governance gap is real, surely the frameworks and tooling that have emerged over the past several years are attempting to close it. This article examines what they cover and what they do not. The conclusion is not stated at the outset, because the map only makes sense once you have walked it.
Part One, The Frameworks
Four frameworks matter most for regulated industries deploying AI. They are not interchangeable, and understanding the difference between them is more useful than treating the landscape as a single body of guidance.
NIST AI Risk Management Framework
Released in January 2023 and extended with a Generative AI Profile in July 2024, the NIST AI RMF is the most operationally accessible framework for product and engineering teams. Its four core functions (Govern, Map, Measure, Manage) are designed to integrate with existing development workflows rather than sit above them as an external audit layer. Its strength is its vocabulary: it gives organisations a common language for AI risk before they have a mature governance programme.
What it leaves open: because it is voluntary, it accommodates a wide range of implementation depth. The Measure function covers feedback on measurement efficacy, but what regulatory-grade evidence looks like for financial services, who inside a product organisation is responsible for generating it, and what happens when an agent operates outside its assessed parameters all remain implementation decisions the framework defers to the organisation.
EU AI Act
The EU AI Act is the only instrument here that carries direct legal force for organisations operating in or selling into European markets. For high-risk systems, conformity obligations are substantial: technical documentation, human oversight mechanisms, decision logging, and registration before deployment. It applies to any AI system placed on the EU market or whose output is used in the EU, regardless of where the provider is headquartered, making EU AI Act compliance a market access condition, not a distant consideration.
What it demands from product specifically: the obligation to maintain technical documentation, ensure human oversight is architecturally present, and generate audit-ready evidence cannot be delegated entirely to legal or compliance. The logging architecture, override mechanisms, and monitoring infrastructure the Act requires are product design decisions, which means the product team carries them.
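As one illustration of oversight being architecturally present rather than merely documented, the sketch below routes high-risk agent actions through a human gate and reports how each action was resolved, so override events exist as first-class records rather than reconstructed narrative. This is a minimal sketch: every name and the 0.8 threshold are assumptions for illustration, not values the Act prescribes.

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative sketch: every name and the 0.8 threshold are assumptions,
# not values the EU AI Act prescribes.
@dataclass
class ProposedAction:
    description: str
    risk_score: float  # produced by whatever risk model the team already runs

def execute_with_oversight(action: ProposedAction,
                           ask_human: Callable[[ProposedAction], bool],
                           apply: Callable[[ProposedAction], None],
                           threshold: float = 0.8) -> str:
    """Route high-risk agent actions through a human before they take effect."""
    if action.risk_score >= threshold:
        if not ask_human(action):
            return "blocked_by_human"  # an override event; it belongs in the audit log
        outcome = "human_approved"
    else:
        outcome = "auto_applied"
    apply(action)
    return outcome
```

The design point is not the gate itself but the return value: every path through the function yields a record of how oversight was exercised, which is the raw material of audit-ready evidence.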
ISO/IEC 42001
Published in December 2023, ISO/IEC 42001 is the first international standard for AI management systems that organisations can be independently certified against. Its Plan-Do-Check-Act structure integrates with existing ISO 27001 infrastructure, which reduces friction for organisations already inside the ISO ecosystem. The standard mandates a management system rather than specific controls. Its commercial significance is growing: ISO 42001 is becoming a procurement baseline in regulated enterprise markets, and both NIST and the EU AI Act cross-reference it, making it a useful structural backbone for organisations building toward either framework.
AIGA Hourglass Model
The AI Governance Framework from Turku School of Economics maps governance tasks to the OECD AI lifecycle model across environmental, organisational, and system layers. For product teams trying to understand where a specific governance obligation needs to be implemented in their system, rather than in their policies, it is one of the more useful reference points available. It treats accountability and ownership as a distinct governance domain: not a consequence of other controls, but a design decision in its own right.
| Framework / tool | Regulatory obligations | Process & policy | System lifecycle | Runtime behaviour |
|---|---|---|---|---|
| NIST AI RMF | ✓ | ✓ | ◑ | – |
| EU AI Act | ✓ | ◑ | ✓ | ◑ |
| ISO/IEC 42001 | ◑ | ✓ | ◑ | – |
| AIGA Hourglass | ◑ | ✓ | ✓ | – |
| GRC platforms | – | ✓ | – | – |
| AI gov. platforms | – | ◑ | ✓ | – |
| ML observability | – | – | – | ✓ |
Figure 1, Governance coverage by framework and tool (✓ primary; ◑ partial; – not addressed)
Part Two, The Tooling
The commercial landscape for AI governance tooling has developed rapidly, and the claims being made by platforms in this space have expanded considerably in the past eighteen months. Four categories are relevant to this analysis. Each addresses a distinct governance surface, and each warrants careful reading before drawing conclusions about what the landscape collectively covers.
Agentic GRC Automation Platforms
A category of compliance automation platforms has evolved significantly beyond its origins in evidence collection and policy management. These platforms now deploy AI agents that claim to conduct control assessments, review compliance evidence against defined criteria, flag deficiencies, score vendor risk, and draft remediation plans. Several integrate directly into enterprise toolstacks (cloud environments, identity providers, development pipelines) and claim to surface continuous compliance signals from those integrations rather than capturing point-in-time records.
Coverage of AI-specific frameworks is expanding within this category. NIST AI RMF and ISO 42001 appear among the supported frameworks of several platforms, and at least one has achieved ISO 42001 certification for its own operations. The positioning of the most developed platforms in this category has shifted from audit readiness toward continuous, agentic compliance programme management.
What warrants closer examination: the governance surfaces these platforms monitor are defined by the integrations they have built. Whether those integrations reach the specific runtime artefacts that financial regulators examine when auditing a live AI agent deployment (inference logs at decision time, input data at inference, the model version in production at a specific timestamp, override events) is an empirical question that depends on each platform's integration architecture. This research has not verified that level of specificity, and does not draw a conclusion from general platform positioning alone.
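To make that level of specificity concrete, the four artefacts just listed could be captured in a record shaped like the sketch below, assuming an append-only JSON-lines sink. Every field name is an illustrative assumption; no regulator or platform publishes this schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json
import uuid

# Illustrative only: field names are assumptions, not a regulator's schema.
@dataclass
class InferenceAuditRecord:
    decision_id: str
    decision_time: str        # inference log at decision time (ISO 8601, UTC)
    model_version: str        # exact version in production at this timestamp
    inference_input: dict     # input data as presented at inference
    output: dict              # what the agent decided or produced
    override_event: dict | None  # populated if a human intervened

def write_record(model_version: str, inference_input: dict, output: dict,
                 override_event: dict | None = None,
                 sink_path: str = "inference_audit.jsonl") -> None:
    record = InferenceAuditRecord(
        decision_id=str(uuid.uuid4()),
        decision_time=datetime.now(timezone.utc).isoformat(),
        model_version=model_version,
        inference_input=inference_input,
        output=output,
        override_event=override_event,
    )
    # An append-only file stands in for whatever durable store the
    # organisation's logging architecture actually uses.
    with open(sink_path, "a") as sink:
        sink.write(json.dumps(asdict(record)) + "\n")
```

The empirical question above is whether a GRC platform's integrations can read records at this granularity, not whether such records could in principle exist.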
Dedicated AI Governance Platforms
A second category manages the AI system lifecycle rather than the compliance programme around it. These platforms provide centralised AI system inventory, risk classification aligned to regulatory frameworks, automated model cards and impact assessments, and documentation packages structured for audit purposes. They are recognised by major analyst firms as a formally tracked market category, with customers that include large financial institutions.
Platforms in this category explicitly claim coverage of EU AI Act conformity requirements, NIST AI RMF alignment, and ISO 42001 management system support. Their stated governance surface is the AI system lifecycle: what systems exist, who approved them, how they are classified, and what the policy record shows.
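To ground what such a lifecycle record contains, the sketch below shows a minimal inventory entry covering the four elements just listed: what exists, who approved it, how it is classified, and what the policy record shows. The field names and the risk-class vocabulary are assumptions for illustration; the classes loosely mirror the EU AI Act's tiers, but no platform or regulation mandates this exact shape.

```python
from dataclasses import dataclass, field
from enum import Enum

# Risk classes loosely mirror the EU AI Act's tiers; the labels and all
# field names here are illustrative assumptions, not a mandated schema.
class RiskClass(Enum):
    MINIMAL = "minimal"
    LIMITED = "limited"
    HIGH = "high"
    PROHIBITED = "prohibited"

@dataclass
class AISystemRecord:
    """One entry in a centralised AI system inventory."""
    system_id: str
    name: str
    owner: str        # accountable individual or team
    approved_by: str  # who signed off on deployment
    risk_class: RiskClass
    frameworks: list[str] = field(default_factory=list)
    policy_artefacts: list[str] = field(default_factory=list)

inventory = [
    AISystemRecord(
        system_id="ais-001",
        name="credit-decision-agent",
        owner="lending-product-team",
        approved_by="model-risk-committee",
        risk_class=RiskClass.HIGH,
        frameworks=["EU AI Act", "NIST AI RMF"],
        policy_artefacts=["model_card_v3.md", "impact_assessment_2025Q1.pdf"],
    ),
]
```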
A structural consideration: these platforms are designed around the assumption that a governance programme already exists for them to operate against. For organisations that have already resolved the foundational questions of AI system inventory and ownership, they provide substantial infrastructure. For organisations still working through those questions, the platform arrives before the operating model that would make it effective. Whether they extend into continuous runtime behaviour monitoring, and through what mechanism, is a question this research continues to examine.
ML Observability and Model Monitoring
A third category monitors what models actually do in production: data drift detection, inference quality evaluation, LLM-specific testing, explainability analysis, and AI agent validation. Platforms in this category generate runtime evidence of model behaviour, the kind of technical signal that governance frameworks reference but rarely specify how to produce.
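For a sense of the kind of runtime signal these platforms produce, a population stability index (PSI) computation is one common drift measure. The sketch below is a generic illustration of the technique, not the implementation of any named platform, and the drift thresholds in the docstring are an industry convention rather than a regulatory standard.

```python
import numpy as np

def population_stability_index(expected: np.ndarray,
                               actual: np.ndarray,
                               bins: int = 10) -> float:
    """PSI between a reference distribution and live inference inputs.

    Rule of thumb (an industry convention, not a regulatory threshold):
    < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant drift.
    """
    # Bin edges come from the reference window so both samples are
    # compared on the same grid.
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip empty bins to avoid division by zero and log(0).
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct)
                        * np.log(actual_pct / expected_pct)))

# Example: drift in a single input feature between training and production.
rng = np.random.default_rng(0)
training_feature = rng.normal(0.0, 1.0, 10_000)
live_feature = rng.normal(0.4, 1.2, 10_000)  # shifted distribution
print(f"PSI = {population_stability_index(training_feature, live_feature):.3f}")
```

A number like this is exactly the technical signal the frameworks reference: whether it constitutes regulatory evidence is the translation question taken up below.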
The most governance-oriented platforms in this category have developed explainability and audit features explicitly positioned for regulated industries including financial services. They claim counterfactual analysis and compliance certification features that can support examination of individual model decisions.
The translation question: the evidence these platforms generate is structured for ML engineering and data science teams. Whether that technical output maps directly onto the evidence standard a financial regulator applies when examining an AI agent deployment is not a question the platforms fully address in their documentation. The gap between runtime monitoring output and regulatory audit artefact may be narrowing as governance features develop, or closing it may require an organisational interpretive layer that currently sits outside any product. This research holds that question open.
GRC Engineering as Adjacent Discipline
A practitioner movement has emerged that frames compliance as an engineering discipline: embedding governance requirements at design time, treating policy as continuously tested code, and connecting security controls directly to engineering delivery. This framing is analytically adjacent to the accountability questions this research examines, though it remains an emerging discipline rather than an established organisational model within regulated financial services. Its arguments are noted here as context rather than examined as a primary subject.
The following companies were observed within these categories during this research phase. Their inclusion is descriptive, not evaluative, and capabilities are evolving.

- Category 1, Agentic GRC Automation: Complyance, Anecdotes, Sprinto, Vanta, Drata
- Category 2, AI Governance Platforms: Credo AI
- Category 3, ML Observability: Evidently AI, Fiddler AI, Arize
- GRC Engineering movement: GRC Engineer

Readers operating within these organisations or their customer environments are better positioned than this research to assess current capability state.
Part Three, What the Map Reveals
Laid out in sequence, the frameworks and tools form a recognisable pattern. The regulatory instruments define what accountability should exist. The agentic GRC platforms claim to systematise compliance programme management with increasing intelligence. The dedicated AI governance platforms claim to manage the AI system lifecycle against documented policy. The observability tools generate runtime evidence of what systems actually do in production.
Whether these categories, individually or in combination, address the translation between regulatory obligation and the specific evidence a financial regulator examines when auditing a live AI agent deployment is a question this research cannot resolve from outside the organisations operating these systems. It depends on integration architectures, regulatory interpretation in practice, and organisational operating models that vary considerably. What this mapping exercise surfaces is that the question warrants precise examination rather than assumption.
| Layer | What it governs | Typical owner |
|---|---|---|
| Regulatory instruments | Legal obligations, risk classifications | Regulator → Legal / Compliance |
| Governance frameworks | What a governance programme should achieve | Risk / Compliance function |
| Agentic GRC platforms | Compliance programme management, agentic evidence review, control assessment | Compliance / InfoSec |
| AI governance platforms | AI system lifecycle, inventory, risk classification, audit artefacts | AI governance / Risk function |
| ML observability tools | Runtime model behaviour, drift, quality, anomalies, explainability | Data science / ML engineering |
| ↓ Open question ↓ | Translation between regulatory obligation and runtime evidence for AI agent deployments in regulated financial environments | Typically within product; ownership and coverage not yet settled |
Figure 2, The governance layer stack and the open question of where translation between layers currently occurs
| Layer | What the landscape claims |
|---|---|
| Regulatory frameworks | Principles, obligations, risk classifications, what governance must achieve |
| Agentic GRC platforms | Continuous compliance programme management, agentic evidence review, control assessment, AI framework support |
| AI governance platforms | AI system lifecycle governance, inventory, risk classification, regulatory alignment, audit documentation |
| ML observability tools | Runtime model behaviour monitoring, drift detection, inference quality, explainability, anomaly identification |
| Open question | Whether existing capabilities, individually or in combination, cover the translation between regulatory obligation and runtime evidence for AI agent deployments in regulated financial environments |
The organisations that navigate the next regulatory cycle without significant remediation costs will not be those with the most governance tooling. They will be those where accountability is resolved at design time rather than after deployment, where the question of what evidence the system needs to generate is answered before the system is built, not after it has been audited.
Whether the tooling landscape has reached the point where that accountability can be genuinely operationalised, rather than documented, is the substantive open question this research continues to examine. Practitioners working inside regulated financial organisations who have formed a view, from either direction, are a primary audience for the next phase of this work.