Turn the Lights On: From AI Discovery to Runtime Illumination

A Runtime-Centric Framework for AI Governance

Get the Playbook

Mahesh Babu

March 20, 2026

0 min read

Kai

AI Security

Turn the Lights On: From AI Discovery to Runtime Illumination

Introduction

The integration of large language models (LLMs) into production systems has fundamentally altered the structure of application risk. AI is no longer confined to discrete API calls; it is increasingly embedded within application logic, orchestrated through agent frameworks, and executed dynamically in response to contextual inputs.

In response, organizations have adopted governance controls that emphasize external visibility: monitoring API calls, inspecting prompts and analyzing data flows at the system boundary. These approaches aim to answer a foundational question: where is AI used?

However, this question is insufficient. Emerging evidence suggests that the most consequential risks in AI systems do not arise at the boundary, but within execution. The distinction between inference-based discovery and execution-based observation is therefore becoming central to both security practice and policy design.

Discovery as Inference: The Limits of Boundary-Based Observation

Contemporary AI discovery systems rely on traffic inspection to identify model interactions. These systems can determine which endpoints are called, which prompts are submitted and which responses are returned. This model assumes that AI activity is externally observable.

Recent research challenges this assumption. As LLMs are integrated into agentic systems and application workflows, critical behaviors increasingly occur without generating distinct, inspectable API events. Instead, they emerge through multi-step execution chains, internal tool invocations and context propagation within the runtime environment. This creates a structural limitation: discovery systems observe signals of interaction, not evidence of execution.

Evidence from Emerging Attack Classes

A growing body of work from 2025 - 2026 demonstrates that many high-impact AI vulnerabilities manifest at runtime and evade traditional discovery mechanisms.

Prompt Injection and Agent Hijacking

Prompt injection has been formalized as a dominant threat class in LLM systems, particularly in agentic architectures. A comprehensive systematization of prompt injection threats shows that untrusted inputs can hijack agent behavior by manipulating intermediate reasoning and tool selection (Wang et al., 2026).

More critically, indirect prompt injection enables attackers to embed malicious instructions within external data sources, such as tool outputs or retrieved documents - causing agents to execute unintended actions. Empirical studies demonstrate that such attacks can steer agent decision-making through multi-step execution flows, making them difficult to detect at the prompt or API level (Yu et al., 2026; Zhang et al., 2026).

Benchmarks evaluating real-world agent systems further show that existing defenses remain insufficient, with most systems either failing to prevent attacks or degrading functionality under defensive constraints (Li et al., 2026). These attacks unfold entirely within runtime execution. The observable API request is often benign. The exploit occurs in how the system processes and acts on that input.

Tool-Level Exploitation and Data Exfiltration

Recent work on tool-using agents demonstrates a new class of attacks in which adversaries manipulate tool invocation to achieve data exfiltration. For example, prompt-level attacks can force agents to invoke malicious logging tools or redirect outputs to attacker-controlled channels, enabling covert extraction of sensitive data (Hu et al., 2025). Similarly, the “silent egress” attack class shows that agents can be induced to issue outbound requests that leak sensitive runtime context, with high success rates and minimal detection by output-based controls (Lan et al., 2026).

These behaviors are not visible through traffic inspection alone because they emerge from internal decision-making processes and execution flows, not from anomalous inputs.

Tool Selection and Dependency Manipulation

Attacks targeting tool selection mechanisms further illustrate the limitations of discovery-based approaches. The ToolHijacker framework demonstrates that adversaries can manipulate an agent’s tool selection process by injecting malicious tool descriptions, causing the system to consistently choose attacker-controlled tools (Shi et al., 2025).

This form of attack operates at the level of execution logic, not at the boundary. The system behaves as designed, it selects a tool and executes it, but the selection process itself has been compromised.

Structural Limits of Prompt-Level Defenses

Across multiple studies, a consistent pattern emerges: defenses focused on prompt filtering or boundary controls are insufficient. Even advanced models with reasoning capabilities and safety mitigations remain vulnerable to low-effort prompt injection attacks in realistic environments (Evtimov et al., 2025). Further research indicates that prompt injection may be fundamentally difficult to eliminate due to the underlying architecture of LLMs, where data and instructions are processed in a unified representation (Ramakrishnan, 2025).

These findings reinforce a key conclusion: the primary locus of risk is not the input channel, but the execution context.

From Discovery to Illumination: A Runtime-Centric Model

The limitations of discovery-based governance suggest the need for a different model. Discovery identifies where AI may exist. It constructs an inventory based on observable signals. Illumination, by contrast, focuses on what is actually executing.

A runtime-centric model of AI governance emphasizes:

Execution-level visibility: Identifying which AI-influenced functions and components are active.
Behavioral observation: Capturing how AI affects control flow and system behavior.
Contextual risk analysis: Evaluating vulnerabilities based on real execution paths.
Verification over inference: Grounding governance decisions in observed behavior.

This approach aligns with the broader evolution of cybersecurity, where runtime and behavioral analysis have become essential for detecting modern threats that evade static and perimeter-based controls.

Implications for Security and Policy

The transition from discovery to illumination has implications beyond enterprise security.

First, governance frameworks that rely on inferred inventories risk underestimating the presence and impact of AI systems. This creates gaps in both compliance and risk management.

Second, policy enforcement becomes unreliable without execution-level validation. Organizations may define rules governing AI usage, but without runtime observation, they can’t verify adherence.

Third, the distinction between theoretical and actual risk becomes central. In AI-driven systems, many vulnerabilities are only exploitable under specific runtime conditions. Prioritization must therefore be grounded in observed execution, not static presence.

Conclusion

The first generation of AI governance systems was designed to answer a visibility problem at the boundary: where is AI used?

That model is no longer sufficient. As AI becomes embedded within application logic and runtime environments, governance must evolve to address a more fundamental question: what is actually executing?

The evidence is clear. Prompt injection, agent hijacking, tool exploitation and data exfiltration attacks all operate at runtime. They are not reliably detectable through traffic inspection or external observation alone. Effective AI governance therefore requires a shift from discovery to illumination - from inference to execution. Modern AI systems require internal observation; visibility at the edge is insufficient.

References

Evtimov, I., Zharmagambetov, A., Grattafiori, A., Guo, C., & Chaudhuri, K. (2025). WASP: Benchmarking web agent security against prompt injection attacks. NeurIPS. https://neurips.cc/virtual/2025/poster/121728
Hu, Y., et al. (2025). Prompt injection attacks on tool-using LLM agents. OpenReview. https://openreview.net/forum?id=UVgbFuXPaO
Lan, Q., Kaul, A., Jones, S., & Westrum, S. (2026). Silent egress: When implicit prompt injection makes LLM agents leak without a trace. arXiv. https://arxiv.org/abs/2602.22450
Li, H., Wen, R., Shi, S., Zhang, N., & Xiao, C. (2026). AgentDyn: A dynamic benchmark for evaluating prompt injection attacks. arXiv. https://arxiv.org/pdf/2602.03117
Ramakrishnan, B. (2025). Securing AI agents against prompt injection attacks. arXiv. https://arxiv.org/abs/2511.15759
Shi, J., et al. (2025). Prompt injection attack to tool selection in LLM agents. arXiv. https://arxiv.org/abs/2504.19793
Wang, P., et al. (2026). The landscape of prompt injection threats in LLM agents. arXiv. https://arxiv.org/abs/2602.10453
Yu, Q., et al. (2026). Defense against indirect prompt injection via tool result parsing. arXiv. https://arxiv.org/abs/2601.04795
Zhang, T., et al. (2026). Mitigating indirect prompt injection in LLM agents. arXiv. https://arxiv.org/abs/2602.22724

Get the Playbook

Table of contents

Example H2

Example H3

Related blogs

AI-BOM: How to Identify, Understand and Govern AI Supply Chain Risk

Kodem AI-BOM extends your SBOM with an AI bill of materials: identify AI packages, verify runtime execution, and govern AI supply chain risk at scale.

Repository-Grounded Vulnerability Remediation for AI Security Engineers

Kodem automates vulnerability remediation with AI. Get validated, repository-grounded fixes and one click pull requests your security team can review.

Understanding MA-S2: Continuous Vulnerability Discovery, Attack Path Analysis, Runtime Inventory, and Automated Remediation

MA-S2 is Palantir's proposed software security standard. See how runtime vulnerability prioritization, attack paths, and automated remediation map to it.

Stop the waste.
Protect your environment with Kodem.

Get a personalized demo

A Primer on Runtime Intelligence

See how Kodem's cutting-edge sensor technology revolutionizes application monitoring at the kernel level.

5.1k

Applications covered

1.1m

False positives eliminated

4.8k

Triage hours reduced

Platform Overview Video

Watch our short platform overview video to see how Kodem discovers real security risks in your code at runtime.