Turn the Lights On: From AI Discovery to Runtime Illumination
A Runtime-Centric Framework for AI Governance
Introduction
The integration of large language models (LLMs) into production systems has fundamentally altered the structure of application risk. AI is no longer confined to discrete API calls; it is increasingly embedded within application logic, orchestrated through agent frameworks, and executed dynamically in response to contextual inputs.
In response, organizations have adopted governance controls that emphasize external visibility: monitoring API calls, inspecting prompts and analyzing data flows at the system boundary. These approaches aim to answer a foundational question: where is AI used?
However, this question is insufficient. Emerging evidence suggests that the most consequential risks in AI systems do not arise at the boundary, but within execution. The distinction between inference-based discovery and execution-based observation is therefore becoming central to both security practice and policy design.
Discovery as Inference: The Limits of Boundary-Based Observation
Contemporary AI discovery systems rely on traffic inspection to identify model interactions. These systems can determine which endpoints are called, which prompts are submitted and which responses are returned. This model assumes that AI activity is externally observable.
Recent research challenges this assumption. As LLMs are integrated into agentic systems and application workflows, critical behaviors increasingly occur without generating distinct, inspectable API events. Instead, they emerge through multi-step execution chains, internal tool invocations and context propagation within the runtime environment. This creates a structural limitation: discovery systems observe signals of interaction, not evidence of execution.
Evidence from Emerging Attack Classes
A growing body of work from 2025 - 2026 demonstrates that many high-impact AI vulnerabilities manifest at runtime and evade traditional discovery mechanisms.
Prompt Injection and Agent Hijacking
Prompt injection has been formalized as a dominant threat class in LLM systems, particularly in agentic architectures. A comprehensive systematization of prompt injection threats shows that untrusted inputs can hijack agent behavior by manipulating intermediate reasoning and tool selection (Wang et al., 2026).
More critically, indirect prompt injection enables attackers to embed malicious instructions within external data sources, such as tool outputs or retrieved documents - causing agents to execute unintended actions. Empirical studies demonstrate that such attacks can steer agent decision-making through multi-step execution flows, making them difficult to detect at the prompt or API level (Yu et al., 2026; Zhang et al., 2026).
Benchmarks evaluating real-world agent systems further show that existing defenses remain insufficient, with most systems either failing to prevent attacks or degrading functionality under defensive constraints (Li et al., 2026). These attacks unfold entirely within runtime execution. The observable API request is often benign. The exploit occurs in how the system processes and acts on that input.
Tool-Level Exploitation and Data Exfiltration
Recent work on tool-using agents demonstrates a new class of attacks in which adversaries manipulate tool invocation to achieve data exfiltration. For example, prompt-level attacks can force agents to invoke malicious logging tools or redirect outputs to attacker-controlled channels, enabling covert extraction of sensitive data (Hu et al., 2025). Similarly, the “silent egress” attack class shows that agents can be induced to issue outbound requests that leak sensitive runtime context, with high success rates and minimal detection by output-based controls (Lan et al., 2026).
These behaviors are not visible through traffic inspection alone because they emerge from internal decision-making processes and execution flows, not from anomalous inputs.
Tool Selection and Dependency Manipulation
Attacks targeting tool selection mechanisms further illustrate the limitations of discovery-based approaches. The ToolHijacker framework demonstrates that adversaries can manipulate an agent’s tool selection process by injecting malicious tool descriptions, causing the system to consistently choose attacker-controlled tools (Shi et al., 2025).
This form of attack operates at the level of execution logic, not at the boundary. The system behaves as designed, it selects a tool and executes it, but the selection process itself has been compromised.
Structural Limits of Prompt-Level Defenses
Across multiple studies, a consistent pattern emerges: defenses focused on prompt filtering or boundary controls are insufficient. Even advanced models with reasoning capabilities and safety mitigations remain vulnerable to low-effort prompt injection attacks in realistic environments (Evtimov et al., 2025). Further research indicates that prompt injection may be fundamentally difficult to eliminate due to the underlying architecture of LLMs, where data and instructions are processed in a unified representation (Ramakrishnan, 2025).
These findings reinforce a key conclusion: the primary locus of risk is not the input channel, but the execution context.
From Discovery to Illumination: A Runtime-Centric Model
The limitations of discovery-based governance suggest the need for a different model. Discovery identifies where AI may exist. It constructs an inventory based on observable signals. Illumination, by contrast, focuses on what is actually executing.
A runtime-centric model of AI governance emphasizes:
- Execution-level visibility: Identifying which AI-influenced functions and components are active.
- Behavioral observation: Capturing how AI affects control flow and system behavior.
- Contextual risk analysis: Evaluating vulnerabilities based on real execution paths.
- Verification over inference: Grounding governance decisions in observed behavior.
This approach aligns with the broader evolution of cybersecurity, where runtime and behavioral analysis have become essential for detecting modern threats that evade static and perimeter-based controls.
Implications for Security and Policy
The transition from discovery to illumination has implications beyond enterprise security.
First, governance frameworks that rely on inferred inventories risk underestimating the presence and impact of AI systems. This creates gaps in both compliance and risk management.
Second, policy enforcement becomes unreliable without execution-level validation. Organizations may define rules governing AI usage, but without runtime observation, they can’t verify adherence.
Third, the distinction between theoretical and actual risk becomes central. In AI-driven systems, many vulnerabilities are only exploitable under specific runtime conditions. Prioritization must therefore be grounded in observed execution, not static presence.
Conclusion
The first generation of AI governance systems was designed to answer a visibility problem at the boundary: where is AI used?
That model is no longer sufficient. As AI becomes embedded within application logic and runtime environments, governance must evolve to address a more fundamental question: what is actually executing?
The evidence is clear. Prompt injection, agent hijacking, tool exploitation and data exfiltration attacks all operate at runtime. They are not reliably detectable through traffic inspection or external observation alone. Effective AI governance therefore requires a shift from discovery to illumination - from inference to execution. Modern AI systems require internal observation; visibility at the edge is insufficient.
References
- Evtimov, I., Zharmagambetov, A., Grattafiori, A., Guo, C., & Chaudhuri, K. (2025). WASP: Benchmarking web agent security against prompt injection attacks. NeurIPS. https://neurips.cc/virtual/2025/poster/121728
- Hu, Y., et al. (2025). Prompt injection attacks on tool-using LLM agents. OpenReview. https://openreview.net/forum?id=UVgbFuXPaO
- Lan, Q., Kaul, A., Jones, S., & Westrum, S. (2026). Silent egress: When implicit prompt injection makes LLM agents leak without a trace. arXiv. https://arxiv.org/abs/2602.22450
- Li, H., Wen, R., Shi, S., Zhang, N., & Xiao, C. (2026). AgentDyn: A dynamic benchmark for evaluating prompt injection attacks. arXiv. https://arxiv.org/pdf/2602.03117
- Ramakrishnan, B. (2025). Securing AI agents against prompt injection attacks. arXiv. https://arxiv.org/abs/2511.15759
- Shi, J., et al. (2025). Prompt injection attack to tool selection in LLM agents. arXiv. https://arxiv.org/abs/2504.19793
- Wang, P., et al. (2026). The landscape of prompt injection threats in LLM agents. arXiv. https://arxiv.org/abs/2602.10453
- Yu, Q., et al. (2026). Defense against indirect prompt injection via tool result parsing. arXiv. https://arxiv.org/abs/2601.04795
- Zhang, T., et al. (2026). Mitigating indirect prompt injection in LLM agents. arXiv. https://arxiv.org/abs/2602.22724
Related blogs
Turn the Lights On: Why AI Governance Cannot Rely on Traffic Inspection Alone
Inspecting traffic to AI endpoints cannot provide a complete picture of enterprise AI activity. The core governance question is therefore changing. It is no longer simply “What AI traffic do we observe?” It is increasingly “What AI systems are actually executing?”
5

Prompt Injection was Never the Real Problem
A review of “The Promptware Kill Chain”Over the last two years, “prompt injection” has become the SQL injection of the LLM era: widely referenced, poorly defined, and often blamed for failures that have little to do with prompts themselves.A recent arXiv paper, “The Promptware Kill Chain: How Prompt Injections Gradually Evolved Into a Multi-Step Malware,” tries to correct that by reframing prompt injection as just the initial access phase of a broader, multi-stage attack chain.As a security researcher working on real production AppSec and AI systems, I think this paper is directionally right and operationally incomplete.This post is a technical critique: what the paper gets right, where the analogy breaks down, and how defenders should actually think about agentic system compromise.
A Primer on Runtime Intelligence
See how Kodem's cutting-edge sensor technology revolutionizes application monitoring at the kernel level.
Platform Overview Video
Watch our short platform overview video to see how Kodem discovers real security risks in your code at runtime.
The State of the Application Security Workflow
This report aims to equip readers with actionable insights that can help future-proof their security programs. Kodem, the publisher of this report, purpose built a platform that bridges these gaps by unifying shift-left strategies with runtime monitoring and protection.
.png)
Get real-time insights across the full stack…code, containers, OS, and memory
Watch how Kodem’s runtime security platform detects and blocks attacks before they cause damage. No guesswork. Just precise, automated protection.

