The Agents Were Working. The Governance Wasn’t: Runtime AI Observability on SAP BTP

A set of A2A agents had been running, routing requests, calling tools, producing answers. The question that surfaced: which agent handled that session, what routing decision was made, and what did it cost in tokens?

The agents were working. The governance wasn’t.

That gap led to AEGIS, a lightweight runtime governance dashboard running on SAP Business Technology Platform (BTP) that answers those questions without changing the agents’ decision-making logic.

The Problem

The setup is a set of A2A agents: an orchestrator that routes incoming requests and a collection of specialist agents that do the actual work. Each agent calls Claude via SAP AI Core / Generative AI Hub. The tool-use loop can span multiple back-and-forth iterations before producing a final answer, which means token usage and latency accumulate in ways that are hard to reason about after the fact.

The operational questions that could not be answered:

Which agent handled this request, and with what confidence?
How many tokens did that interaction consume across the full tool-use loop?
Did any agent breach a cost or latency threshold?
Can the governance layer demonstrate, through schema design and retained records, that no prompts or responses were stored?

SAP AI Agent Hub is SAP’s vendor-agnostic command center for discovering, inventorying, governing, and evaluating AI agents, LLMs, and MCP servers across the enterprise landscape. Its focus on agent observability, verification, and runtime governance reflects what practitioners are finding in the field: governance is not an optional extra for agentic AI. It is a load-bearing requirement, especially in enterprise settings.

AEGIS is a practical step in that direction: not a product or framework, but a working prototype built to answer these questions on a live deployment.

Why Application Logs Were Not Enough

Cloud Foundry logs are useful for troubleshooting, but they are not a governance model. What is needed are structured, queryable, role-protected records that can answer business-facing questions: which agent acted, what decision label was produced, what confidence score was reported, how much the interaction cost, and whether any threshold was breached.

That requires a purpose-built operational metadata layer rather than relying on raw application logs.

One Design Principle: Only Metadata

The most important design decision before getting into architecture:

⚠️ AEGIS never stores prompts or responses. Only structured metadata is persisted.

Every governance event contains fields like agent name, event type, timestamp, session ID, routing decision, confidence score, token counts, and latency. The actual content of what the user asked, or what the agent answered, never leaves the agent process. This is a deliberate privacy boundary, not an accidental omission.
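One way to make that boundary enforceable rather than a convention is an explicit allowlist at the point where the event is built. The sketch below is illustrative only: the GovernanceEvent class and build_event helper are hypothetical names, and the field set is assumed from the fields listed above plus the model_id used in the integration example. The actual AEGIS schema may differ.

```python
from dataclasses import dataclass, asdict, fields


# Hypothetical event shape: metadata fields only, no prompt or response text.
@dataclass(frozen=True)
class GovernanceEvent:
    agent: str
    event_type: str
    timestamp: str
    session_id: str
    decision: str
    confidence: float
    latency_ms: int
    input_tokens: int
    output_tokens: int
    model_id: str


ALLOWED_FIELDS = {f.name for f in fields(GovernanceEvent)}


def build_event(raw: dict) -> dict:
    """Return a metadata-only event, rejecting any key outside the allowlist."""
    unexpected = set(raw) - ALLOWED_FIELDS
    if unexpected:
        # A key such as "prompt" or "response" fails here and never reaches the wire.
        raise ValueError(f"non-metadata fields rejected: {sorted(unexpected)}")
    return asdict(GovernanceEvent(**raw))
```

With a check like this, a stray "prompt" key fails loudly inside the agent process instead of being silently persisted.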

Architecture Overview

AEGIS is four Cloud Foundry applications deployed from a single mta.yaml, wired together through SAP HANA Cloud and XSUAA.

The key design choice is separation: agents emit metadata, the collector owns ingestion, CAP owns governed read access, and the UI remains a consumer of already-sanitized operational data.

The flow in six steps:

1. Agent calls LLM via SAP AI Core / Generative AI Hub. Token usage accumulates across the full tool-use loop.
2. Agent emits a governance event, metadata only, after each significant action.
3. aegis-collector (Python / FastAPI) receives the event via POST /events over HTTPS, protected by the aegis.ingest XSUAA scope.
4. Collector writes to SAP HANA Cloud via direct SQL into four tables: AEGIS_DECISIONLOG, AEGIS_TOKENBUCKET, AEGIS_ANOMALYALERT, AEGIS_THRESHOLDCONFIG.
5. aegis-cap (CAP Node.js / OData V4) exposes those tables as a governed read API with role-based access tied to XSUAA scopes.
6. aegis-approuter + React UI serves the dashboard with BTP SSO. Three tabs: Audit, Operations, Anomaly Alerts.

What the Dashboard Shows

Audit Tab

A time-ordered log of every governance event. Each row shows the agent name, event type, routing decision, confidence score, and timestamp. Clicking a row opens a detail drawer with token counts and latency for that specific interaction.

This is the tab for the question: “what happened, when, and which agent made the call?”

Operations Tab

Daily token cost rolled up by agent and model. A nightly job aggregates raw decision log entries into AEGIS_TOKENBUCKET, giving a clean view of where tokens are being spent over time.
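The roll-up itself is simple grouping. The following is a minimal in-memory sketch of the idea, not the actual nightly job: the rollup_daily_tokens function and the row keys are assumptions modeled on the event fields, and a real job would more likely run as SQL against HANA.

```python
from collections import defaultdict


def rollup_daily_tokens(decision_rows):
    """Aggregate raw decision-log rows into per-day, per-agent, per-model totals."""
    buckets = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0, "events": 0}
    )
    for row in decision_rows:
        day = row["timestamp"][:10]  # date prefix of an ISO 8601 timestamp
        key = (day, row["agent"], row["model_id"])
        buckets[key]["input_tokens"] += row["input_tokens"]
        buckets[key]["output_tokens"] += row["output_tokens"]
        buckets[key]["events"] += 1
    return dict(buckets)
```

Each key of the result corresponds to one bucket row: a day, an agent, and a model, with summed input and output tokens.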

This turns token usage from a hidden runtime detail into something teams can track before the cloud bill arrives.

Anomaly Alerts Tab [pending]

Per-agent configurable thresholds for token count, latency, and confidence. A background job runs every 15 minutes, checks recent activity against the configured thresholds, and writes alerts when a threshold is breached. Alerts are debounced to avoid flooding.
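A rough sketch of what such a check with debouncing could look like follows. Everything here is assumed for illustration: the check_thresholds function, the threshold keys (max_tokens, max_latency_ms, min_confidence), and the one-hour debounce window are hypothetical and not taken from the AEGIS code.

```python
from datetime import datetime, timedelta


def check_thresholds(events, thresholds, last_alert_at, now,
                     debounce=timedelta(hours=1)):
    """Return new alerts, skipping (agent, metric) pairs alerted within `debounce`."""
    alerts = []
    for ev in events:
        limits = thresholds.get(ev["agent"])
        if limits is None:
            continue  # no thresholds configured for this agent
        breaches = []
        if ev["input_tokens"] + ev["output_tokens"] > limits["max_tokens"]:
            breaches.append("tokens")
        if ev["latency_ms"] > limits["max_latency_ms"]:
            breaches.append("latency")
        if ev["confidence"] < limits["min_confidence"]:
            breaches.append("confidence")
        for metric in breaches:
            key = (ev["agent"], metric)
            previous = last_alert_at.get(key)
            if previous is not None and now - previous < debounce:
                continue  # debounced: an alert for this pair was raised recently
            last_alert_at[key] = now
            alerts.append({"agent": ev["agent"], "metric": metric, "raised_at": now})
    return alerts
```

The last_alert_at dictionary stands in for whatever state the job keeps between runs. In a deployed job it would presumably be derived from the most recent rows in the alert table.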

The goal is not to prove the agent is wrong, but to surface behavior that deserves human review.

The Integration Point: One Function Call

Adding governance to an agent requires a single async call after a decision is made. No middleware, no interceptors, no changes to routing or decision logic.

Simplified example below. Production usage should include schema validation, retry/backoff behavior, request correlation IDs, and clear failure handling:

# Assumes: import os, time; from datetime import datetime, timezone
await fire_governance_event({
    "agent": "orchestrator",
    "event_type": "routing",
    "session_id": context_id,
    # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
    "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    "decision": f"routed to {routing['name']}",
    "confidence": routing.get("confidence", 0.0),
    "latency_ms": int((time.monotonic() - t0) * 1000),
    "input_tokens": usage["input_tokens"],
    "output_tokens": usage["output_tokens"],
    # First configured deployment ID wins; falls back to "unknown"
    "model_id": os.environ.get("AICORE_HAIKU_DEPLOYMENT_ID") or os.environ.get("AICORE_SONNET_DEPLOYMENT_ID") or "unknown",
})

The collector validates the XSUAA token, writes the event to HANA, and triggers the anomaly check, all asynchronously, so the agent’s response path is not blocked.

Implementation Notes

Token accounting across tool-use loops is non-trivial. Claude’s API returns token usage per API call, but an agentic loop makes multiple calls before producing a final answer. AEGIS accumulates input_tokens and output_tokens across the full loop, not just the last call. The number on the Audit Tab reflects real cost, not a single-turn approximation.
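The accumulation can be sketched as a running total kept alongside the loop. This is a simplified illustration under stated assumptions: call_model is a hypothetical callable standing in for one LLM request, and the response shape (a usage dictionary plus a stop_reason of "tool_use" while the model still wants a tool) follows the Anthropic Messages API. The shape returned through Generative AI Hub may differ.

```python
def run_tool_loop(call_model, max_iterations=10):
    """Call the model until it stops requesting tools, summing usage per call."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    response = None
    for _ in range(max_iterations):
        response = call_model()
        # Every API call reports its own usage; add each one to the running total.
        totals["input_tokens"] += response["usage"]["input_tokens"]
        totals["output_tokens"] += response["usage"]["output_tokens"]
        if response["stop_reason"] != "tool_use":
            break  # final answer reached
        # (a real loop would execute the requested tool and append its result here)
    return response, totals
```

The totals dictionary, not the usage of the final response, is what would be passed on as input_tokens and output_tokens in the governance event.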

Routing confidence is more useful than routing outcome. Knowing that the orchestrator routed to a given agent is less useful than knowing it did so with 0.43 confidence. Low-confidence routing decisions are exactly the ones worth reviewing.
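As a small illustration of how that review queue could be derived from the stored metadata, the helper below filters routing events under a cutoff and lists the least confident first. The low_confidence_routes function and the 0.5 cutoff are assumptions for the example, not values from AEGIS.

```python
def low_confidence_routes(events, threshold=0.5):
    """Return routing events below `threshold`, least confident first."""
    flagged = [
        e for e in events
        if e["event_type"] == "routing" and e["confidence"] < threshold
    ]
    return sorted(flagged, key=lambda e: e["confidence"])
```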

What This Is and What It Is Not

AEGIS is not a product, not a framework, and not a recommendation that every team should build its own governance layer. It is a working prototype that demonstrates what runtime metadata matters when agents start operating across systems.

SAP AI Agent Hub brings enterprise-scale agent governance to the SAP ecosystem. AEGIS is a small prototype built in that same spirit, a practical exercise in understanding what runtime metadata matters when A2A agents operate across systems on BTP.

Technical Summary

Platform: SAP BTP Cloud Foundry
Agent protocol: A2A (Agent-to-Agent)
LLM: Claude Sonnet via SAP AI Core / Generative AI Hub
Collector: Python / FastAPI, receives governance events, writes to HANA
Data store: SAP HANA Cloud HDI container, 4 tables, 90-day retention
API layer: CAP Node.js / OData V4, role-based read access
UI: React / Vite, served via SAP AppRouter with BTP SSO
Security: XSUAA scopes: aegis.ingest (agents), aegis.audit (reviewers), aegis.ops (operators), aegis.admin (admins)
Deployment: Single mta.yaml covering all four apps, the HDI container, and the XSUAA service instance

Disclaimer: The views expressed in this post are my own and do not represent the position, strategy, or roadmap of SAP SE or any SAP affiliate. AEGIS is a personal prototype built for learning and demonstration purposes only. It is not an SAP product, not an official SAP sample, and not endorsed or supported by SAP. Nothing in this post constitutes a product announcement, commitment to deliver functionality, or representation of future SAP product direction.




Byali