vm2 Sandbox Escape Vulnerabilities: The 2026 CVE Wave Turning AI Agents Into Host RCE Vectors

Kodem Security Research Team
May 14, 2026

Thirteen vm2 sandbox escape vulnerabilities disclosed between January and early May 2026, twelve of them in early May and many carrying CVSS scores between 9.0 and 10.0, allow attackers to break out of vm2’s isolated JavaScript execution environment and run arbitrary code on the underlying host system. Potentially affected environments include AI agent frameworks, plugin systems, code execution platforms, and SaaS automation services that rely on vm2 to evaluate untrusted or LLM-generated JavaScript.

vm2 has repeatedly demonstrated structural weaknesses as a security isolation boundary, and AI agent execution has turned that long-known weakness into an infrastructure-level remote code execution problem. This blog covers the active CVEs, the attack chain from sandboxed JavaScript to host shell, the indicators security teams should hunt for and the response runbook.

What Happened: The 2026 vm2 Sandbox Escape Disclosure Wave

vm2 sandbox escape vulnerabilities disclosed throughout 2026 allow attackers who can execute untrusted JavaScript inside vm2 to escape the sandbox and run arbitrary code in the host Node.js process. Early May 2026 produced twelve CVEs against the library, dated May 4 and May 7 in the table below and many with CVSS scores between 9.0 and 10.0, following CVE-2026-22709 in January.

| CVE | Affected Versions | Fixed Version | Disclosed | Technique |
| --- | --- | --- | --- | --- |
| CVE-2026-22709 | vm2 < 3.10.2 | vm2 3.10.2 | January 26, 2026 | Promise callback sanitization bypass |
| CVE-2026-24118 | vm2 < 3.11.0 | vm2 3.11.0 | May 4, 2026 | __lookupGetter__ sandbox escape |
| CVE-2026-24120 | vm2 < 3.10.5 | vm2 3.10.5 | May 4, 2026 | Promise species patch bypass |
| CVE-2026-24781 | vm2 < 3.11.0 | vm2 3.11.0 | May 4, 2026 | inspect() proxy unwrap |
| CVE-2026-26332 | vm2 < 3.11.0 | vm2 3.11.0 | May 4, 2026 | SuppressedError sandbox escape |
| CVE-2026-26956 | vm2 < 3.10.5 | vm2 3.10.5 | May 4, 2026 | WASM/JSTag sandbox escape to host RCE |
| CVE-2026-43997 | vm2 < 3.11.0 | vm2 3.11.0 | May 7, 2026 | Host object access plus sandbox escape |
| CVE-2026-43999 | vm2 < 3.11.0 | vm2 3.11.0 | May 7, 2026 | Allowlist bypass to child_process RCE |
| CVE-2026-44005 | vm2 < 3.11.0 | vm2 3.11.0 | May 7, 2026 | Prototype pollution plus sandbox escape |
| CVE-2026-44006 | vm2 < 3.11.0 | vm2 3.11.0 | May 7, 2026 | getPrototypeOf injection-based escape |
| CVE-2026-44007 | vm2 < 3.11.1 | vm2 3.11.1 | May 7, 2026 | nesting: true sandbox escape |
| CVE-2026-44008 | vm2 < 3.11.2 | vm2 3.11.2 | May 7, 2026 | Array species handling escape |
| CVE-2026-44009 | vm2 < 3.11.2 | vm2 3.11.2 | May 7, 2026 | Null-prototype exception escape |
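
The table above can be turned into a quick lookup that answers "which of these CVEs apply to the version I have installed?". The following is a minimal sketch, not an official advisory feed: the fixed versions are copied from the table, the function names (`parseVm2Version`, `isBelowVersion`, `affectedCves`) are hypothetical, and the version comparison is deliberately simple (it ignores pre-release tags). Confirm results against the current npm advisory.

```javascript
// Fixed versions copied from the table above; any version below "fixed" is treated as affected.
const VM2_ADVISORIES = [
  { cve: "CVE-2026-22709", fixed: "3.10.2" },
  { cve: "CVE-2026-24118", fixed: "3.11.0" },
  { cve: "CVE-2026-24120", fixed: "3.10.5" },
  { cve: "CVE-2026-24781", fixed: "3.11.0" },
  { cve: "CVE-2026-26332", fixed: "3.11.0" },
  { cve: "CVE-2026-26956", fixed: "3.10.5" },
  { cve: "CVE-2026-43997", fixed: "3.11.0" },
  { cve: "CVE-2026-43999", fixed: "3.11.0" },
  { cve: "CVE-2026-44005", fixed: "3.11.0" },
  { cve: "CVE-2026-44006", fixed: "3.11.0" },
  { cve: "CVE-2026-44007", fixed: "3.11.1" },
  { cve: "CVE-2026-44008", fixed: "3.11.2" },
  { cve: "CVE-2026-44009", fixed: "3.11.2" },
];

// Parse "major.minor.patch" into numbers. Pre-release suffixes are dropped in this sketch.
function parseVm2Version(version) {
  return version.split("-")[0].split(".").map((part) => Number.parseInt(part, 10) || 0);
}

// True when `version` is strictly lower than `fixed`.
function isBelowVersion(version, fixed) {
  const a = parseVm2Version(version);
  const b = parseVm2Version(fixed);
  for (let i = 0; i < 3; i += 1) {
    const left = a[i] ?? 0;
    const right = b[i] ?? 0;
    if (left !== right) return left < right;
  }
  return false;
}

// List the CVE identifiers from the table that apply to an installed version.
function affectedCves(installedVersion) {
  return VM2_ADVISORIES
    .filter((advisory) => isBelowVersion(installedVersion, advisory.fixed))
    .map((advisory) => advisory.cve);
}
```

For example, `affectedCves("3.11.1")` returns only CVE-2026-44008 and CVE-2026-44009, while `affectedCves("3.11.2")` returns an empty list.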

Broader JavaScript sandbox research disclosed on February 9, 2026 also included SandboxJS escape CVE-2026-25881, which NVD tracks separately from vm2.

The May 2026 wave continues a multi-year pattern of vm2 sandbox escape disclosures rather than marking a one-time anomaly. Each successive wave adds new escape primitives (proxy unwrap, async sanitization bypass, prototype pollution chains, exception handling abuse, allowlist bypass against child_process) that vm2's isolation model answers only with one-off patches, not a structural defense. CVE-2026-22709 was disclosed in January 2026. The early May cluster added twelve further CVE identifiers within a few days.

How the vm2 Sandbox Escape Chain Executes

A vm2 sandbox escape requires the attacker to execute code inside the sandbox, then chain a known weakness in vm2's isolation model (prototype pollution, exception handling, async primitives, or proxy unwrap) to reach Node.js host capabilities and run arbitrary commands. The chain runs in four stages: initial execution, sandbox boundary abuse, host capability access, and payload execution.

Initial Execution

Untrusted JavaScript reaches vm2. In AI agent environments, the source is often LLM output produced in response to a manipulated prompt. In plugin or SaaS environments, the source is user-submitted code.

Sandbox Boundary Abuse

The attacker invokes one of the disclosed primitives. Three from the 2026 wave illustrate the pattern:

  1. Promise callback sanitization bypass (CVE-2026-22709), documented in StepSecurity's analysis.
  2. SuppressedError exception abuse (CVE-2026-26332), disclosed May 4, 2026.
  3. Prototype pollution chained to host access (CVE-2026-44005), paired in the May 2026 cluster with CVE-2026-44006 (getPrototypeOf injection-based escape) as the two prototype-related primitives.

All three produce the same outcome by different paths: code executing inside vm2 obtains a reference, callback, or property lookup that crosses the sandbox boundary.

Host Capability Access

Once outside the sandbox, the attacker has access to Node.js process APIs, including child_process, the file system, and outbound network. CVE-2026-43999 demonstrates this path directly with an allowlist bypass to child_process RCE.

Payload Execution

Arbitrary shell commands execute with the privileges of the Node.js runtime. In containerized AI agent deployments, this can lead to arbitrary code execution within the containerized runtime environment and potentially broader infrastructure access depending on hardening controls.

Why This Wave Matters: Prompts Become Shells

Microsoft Security Research names the structural problem the 2026 vm2 CVEs expose: in AI agent frameworks where prompts can influence executable logic, a sandbox escape vulnerability converts a prompt injection into host-level remote code execution.

AI agents generate and execute code dynamically, and many AI agent frameworks use vm2 as the isolation layer for that execution. That placement makes vm2 a structural component of AI agent security, not a peripheral dependency.

vm2 sandbox escapes are not theoretical. The library has produced sandbox escape disclosures repeatedly across multiple years and multiple CVE waves; the 2026 cluster covered above is the latest. For agentic AI security teams, the operational implication is that vm2's isolation model has structural weaknesses, and the 2026 pattern is incremental disclosure of new escape primitives that patches address one at a time.

The combination converts a prompt injection (a known and increasingly common attack vector) into an infrastructure compromise. The threat class is qualitatively different from chatbot output manipulation. Prompt injection that surfaces a hallucinated answer is an AI agent security problem at the response layer. Prompt injection that reaches a vm2 evaluation path is an AI agent security problem at the host layer, and the latter is what "Prompts Become Shells" names. Agentic AI security programs need to map which frameworks in their environment run dynamic JavaScript through vm2 or any equivalent sandbox.

The question is no longer whether vm2 is installed. The question is whether untrusted prompts can reach a dynamic execution path that uses it.

Indicators of Compromise (IOCs) and Behavioral Signals

Hunting for vm2 sandbox escape activity is primarily a runtime exercise. The escape itself happens inside a Node.js process, and the host-level signal is anomalous child process creation, unexpected outbound network activity, or filesystem access by a process that should not have any.

No malware hashes, C2 domains, or named binaries are available for the 2026 vm2 CVE wave because the disclosures are vulnerability research rather than active campaign reports. The signals below are behavioral patterns to hunt for in runtime telemetry, grouped into four categories.

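Package versions to audit. Every vm2 installation across direct and transitive dependencies, up to and including the most recent vulnerable release. In the table above, the highest fixed version is 3.11.2, so any earlier version is affected by at least one listed CVE. Confirm the exact patched version against the current npm advisory before deploying detection logic.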
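
One way to find every installed copy of vm2 is to walk the `packages` map of a parsed package-lock.json (lockfileVersion 2 or 3), whose keys are install paths such as `node_modules/vm2` or `node_modules/foo/node_modules/vm2`. The sketch below assumes an npm lockfile in that format; `findVm2Installs` is a hypothetical helper, and other package managers (Yarn, pnpm) use different lockfile layouts.

```javascript
// Return every installed copy of vm2 recorded in a parsed package-lock.json.
// Note: because npm hoists dependencies, a top-level "node_modules/vm2" entry
// can still be a transitive dependency, so the path alone does not say who pulled it in.
function findVm2Installs(lockfile) {
  const packages = lockfile.packages ?? {};
  const hits = [];
  for (const [installPath, meta] of Object.entries(packages)) {
    if (installPath === "node_modules/vm2" || installPath.endsWith("/node_modules/vm2")) {
      hits.push({ path: installPath, version: meta.version });
    }
  }
  return hits;
}
```

In practice the input would come from reading and `JSON.parse`-ing the service's package-lock.json; each returned version can then be compared against the patched release named in the advisory.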

Process indicators. node processes spawning child_process.exec or child_process.spawn calls that originate from sandboxed contexts. Shell invocations (sh, bash, cmd.exe) parented to a Node.js runtime that should not be invoking them.
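
As a rough illustration of the process indicator, the check below flags a process-creation event when a shell is parented to a Node.js runtime. The event shape (`parentImage`, `image`) is an assumption for this sketch; real EDR or audit telemetry will use its own field names, and the shell list is limited to the three named above.

```javascript
// Shells named in the process indicators above.
const SHELL_BINARIES = new Set(["sh", "bash", "cmd.exe"]);

// Base name of an executable path, handling both "/" and "\" separators.
function executableName(executablePath) {
  return executablePath.split(/[\\/]/).pop().toLowerCase();
}

// True when a shell was spawned directly by a Node.js process.
function isShellSpawnedByNode(event) {
  const parent = executableName(event.parentImage);
  const child = executableName(event.image);
  const parentIsNode = parent === "node" || parent === "node.exe";
  return parentIsNode && SHELL_BINARIES.has(child);
}
```

A match is a lead, not a verdict: some services legitimately shell out, so the signal is strongest on runtimes that should never invoke a shell.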

Filesystem indicators. Reads of ~/.aws/credentials, ~/.npmrc, ~/.docker/config.json, /proc/self/environ, and any environment variable dump from inside an agent runtime. Writes to /tmp, /var/tmp, or container-writable paths from an agent process.
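
The filesystem indicators can be expressed as a small classifier over file-access events. This is a sketch under assumptions: the event shape (`op`, `path`) is hypothetical, paths are assumed to be absolute with `~` already expanded, and the path lists contain only the locations named above.

```javascript
// Credential and environment files named in the filesystem indicators above.
const SENSITIVE_READ_SUFFIXES = ["/.aws/credentials", "/.npmrc", "/.docker/config.json"];
const SENSITIVE_READ_EXACT = ["/proc/self/environ"];
// Temp locations where writes from an agent process deserve a look.
const WATCHED_WRITE_PREFIXES = ["/tmp/", "/var/tmp/"];

// Return "sensitive-read", "temp-write", or null for an uninteresting event.
function classifyFileEvent(event) {
  const { op, path } = event;
  if (op === "read") {
    const exact = SENSITIVE_READ_EXACT.includes(path);
    const suffix = SENSITIVE_READ_SUFFIXES.some((s) => path.endsWith(s));
    if (exact || suffix) return "sensitive-read";
  }
  if (op === "write" && WATCHED_WRITE_PREFIXES.some((p) => path.startsWith(p))) {
    return "temp-write";
  }
  return null;
}
```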

Network indicators. Outbound DNS or HTTPS from an agent runtime to a domain not on the application allowlist. Unexpected egress from CI/CD runners and AI agent execution environments.
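
For the network indicators, the core check is whether a destination hostname appears on the application allowlist. The sketch below assumes a simple allowlist format invented for this example: an entry matches exactly, and an entry with a leading dot also matches any subdomain. The event field `hostname` and the example domains are hypothetical.

```javascript
// True when `hostname` is covered by the allowlist.
function isAllowedHost(hostname, allowlist) {
  const host = hostname.toLowerCase().replace(/\.$/, ""); // drop a trailing root dot
  return allowlist.some((entry) => {
    const rule = entry.toLowerCase();
    // A leading dot means "this domain and any subdomain of it".
    if (rule.startsWith(".")) return host === rule.slice(1) || host.endsWith(rule);
    return host === rule;
  });
}

// Keep only the egress events whose destination is not on the allowlist.
function unexpectedEgress(events, allowlist) {
  return events.filter((event) => !isAllowedHost(event.hostname, allowlist));
}
```

Matching on the full dotted suffix (rather than a bare `endsWith` on the domain) avoids treating `notinternal.example` as a subdomain of `internal.example`.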

Immediate Response: The First-Hour Runbook

The response priority for a vm2 sandbox escape disclosure is: identify exposure, patch, rotate credentials accessible from the affected runtime, and audit logs for indicators of prior exploitation. Work through the steps in that order.

  1. Inventory every service that has vm2 in its dependency tree, direct or transitive. Include AI agent frameworks, plugin systems, SaaS platforms, CI/CD runners, and any code execution service.
  2. Upgrade vm2 to the latest patched release. Where upgrade is blocked, isolate the affected service from sensitive infrastructure immediately.
  3. Identify every credential the affected runtime had access to. This includes cloud credentials, API keys, registry tokens, and CI/CD secrets. Rotate all of them.
  4. Pull logs covering the disclosure window. For AI agent environments, look for prompts that resemble code execution attempts. For plugin systems, look for code submissions that exercise prototype manipulation (CVE-2026-44005), exception handling abuse (CVE-2026-26332) or proxy unwrap patterns (CVE-2026-24781).
  5. Audit child process creation and outbound network activity from any affected runtime during the disclosure window.
  6. Apply defense in depth. Add container or process-level isolation in addition to vm2. Treat vm2 as a convenience layer, not a hardened security boundary.
  7. Hold publication and freeze production changes on the affected service until the audit completes.
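
Step 4 of the runbook can be started with a crude keyword pass over stored prompts or code submissions. The sketch below is a triage heuristic only: the tokens are guesses derived from the technique names in the CVE table, not exploit signatures, and benign code uses several of them. The entry shape (`id`, `text`) and the function name are hypothetical.

```javascript
// Tokens loosely derived from the technique names in the CVE table (assumptions, not signatures).
const TRIAGE_TOKENS = [
  { token: "__lookupGetter__", hint: "__lookupGetter__ escape" },
  { token: "SuppressedError", hint: "SuppressedError exception abuse" },
  { token: "Symbol.species", hint: "Promise/Array species handling" },
  { token: "getPrototypeOf", hint: "getPrototypeOf injection" },
  { token: "__proto__", hint: "prototype pollution" },
  { token: "child_process", hint: "child_process access" },
  { token: "WebAssembly", hint: "WASM-based escape" },
];

// Return the log entries that mention any triage token, with the matching hints.
function triageLogEntries(entries) {
  const flagged = [];
  for (const entry of entries) {
    const matches = TRIAGE_TOKENS.filter((t) => entry.text.includes(t.token));
    if (matches.length > 0) {
      flagged.push({ id: entry.id, hints: matches.map((m) => m.hint) });
    }
  }
  return flagged;
}
```

Flagged entries are candidates for manual review alongside the child process and network audit in step 5; an empty result does not show that no exploitation occurred.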

Why Conventional Tooling Caught vm2 But Missed the Escape

Most SCA scanners detect vm2 as a dependency. Few of them flag the runtime context that turns a vm2 install into a critical exposure: whether the sandbox actually executes untrusted code, and whether that execution path can reach AI agent prompts or user-submitted logic.

Static scanning identifies the package. Runtime telemetry identifies whether untrusted execution paths can actually reach it. A web application that uses vm2 to evaluate a hardcoded internal script is a different risk profile than an AI agent framework that executes LLM-generated code in vm2 with shell tools enabled. Both will look the same in a dependency manifest. 
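
The distinction between the two services can be made explicit as a small triage rule. This is an illustrative sketch, not a scoring standard: the service fields (`hasVm2`, `runsUntrustedCode`, `shellToolsEnabled`) and the "high" tier are assumptions introduced here, and in practice the second and third fields have to come from runtime observation rather than the manifest.

```javascript
// Classify a service's vm2 exposure from context that a dependency manifest cannot show.
function classifyVm2Exposure(service) {
  if (!service.hasVm2) return "not-applicable";
  // vm2 only evaluates trusted, hardcoded scripts: a low-risk dependency.
  if (!service.runsUntrustedCode) return "low";
  // Untrusted or LLM-generated code reaches vm2; shell tools raise the impact further.
  return service.shellToolsEnabled ? "critical" : "high";
}
```

Both example services from the paragraph above have `hasVm2: true`, yet the web application evaluating a hardcoded script classifies as "low" and the AI agent framework with shell tools classifies as "critical".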

Kodem covers this risk surface across three products:

  • Kodem SCA identifies vm2 in the dependency graph.
  • Kodem ADR (Application Detection and Response) catches the runtime behavior the escape produces, including child process spawning, unauthorized filesystem reads, and anomalous outbound network activity. 
  • Kodem Malicious Package Detection handles the upstream supply chain risk on related packages in the same ecosystem.

Hardening Your Pipeline Against the Next vm2 Variant

Additional vm2 sandbox escape variants are highly likely. The pattern across 2026 is incremental disclosure of new escape primitives, and the underlying isolation model has structural weaknesses that patches address one at a time. Operate on the assumption that more are coming.

  1. Treat vm2 as a non-security boundary. If you must execute untrusted JavaScript, layer process isolation (separate container, gVisor, Firecracker) underneath the sandbox.
  2. Enforce least privilege on the Node.js process. Remove child_process access where the runtime doesn’t legitimately need it.
  3. Disable network egress from sandboxed execution paths unless the workload requires it. Apply allowlists, not blocklists.
  4. Audit AI agent tool configurations. Shell access and filesystem access for an agent multiply the impact of a sandbox escape from "code runs" to "infrastructure compromised."
  5. Strip credentials from sandbox-adjacent environments. The Node.js process should not have AWS, GCP, or registry credentials available unless the workload requires them. Use scoped tokens with short TTLs.
  6. Monitor child process creation, anomalous filesystem reads, and outbound network activity from runtimes that use vm2 or any equivalent JavaScript sandbox.
  7. Track the vm2 advisory feed and patch on disclosure. Build the assumption of frequent CVEs into your patch cadence.
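
Several of the hardening steps above (process isolation, no network egress, least privilege, no inherited credentials) can be combined when the untrusted code runs in its own container. The sketch below only builds an argument list for `docker run`; the function name, image name, and default limits are hypothetical, and `--runtime runsc` works only on hosts where gVisor is installed and registered with Docker under that name.

```javascript
// Build `docker run` arguments for a locked-down, single-use sandbox container.
// No `-e` or `--env-file` flags are added, so host credentials are not passed in.
function buildSandboxRunArgs({ image, useGvisor = false, memory = "256m", pidsLimit = 64 }) {
  const args = [
    "run", "--rm",
    "--network", "none",                     // no network egress from the sandbox
    "--read-only",                           // read-only root filesystem
    "--cap-drop", "ALL",                     // drop all Linux capabilities
    "--security-opt", "no-new-privileges",   // block privilege escalation via setuid binaries
    "--pids-limit", String(pidsLimit),       // cap the number of processes
    "--memory", memory,                      // cap memory use
    "--user", "65534:65534",                 // run as an unprivileged UID:GID
  ];
  if (useGvisor) args.push("--runtime", "runsc"); // assumes gVisor is configured on the host
  args.push(image);
  return args;
}
```

The resulting array would be passed to a process launcher together with the `docker` binary; vm2 can still run inside the container for ergonomics, with the container acting as the actual isolation boundary.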

What the vm2 Wave Reveals About the AI Agent Threat Trajectory

The 2026 vm2 disclosure wave demonstrates that JavaScript sandbox escapes are evolving from isolated Node.js vulnerabilities into AI infrastructure risks. In AI agent environments, prompt-controlled execution paths now create a realistic path from prompt injection to host-level compromise.

  • Dynamic execution environments are proliferating. AI agents, autonomous tooling, and runtime code generation are all expanding the attack surface for sandbox escape primitives.
  • Static security boundaries are losing ground. The traditional model of "scan dependencies, fix versions" is necessary but insufficient against a class of vulnerabilities that requires understanding what the runtime actually does.
  • Prompt injection is becoming an RCE precursor. The "Prompts Become Shells" framing will likely apply to additional sandbox technologies (isolated-vm, V8 isolates, custom JavaScript runtimes) as researchers turn attention to the broader category.

One specific, falsifiable prediction: by the end of Q4 2026, expect public disclosure of at least one production AI agent compromise in which a prompt injection was chained to a JavaScript sandbox escape to reach the host. The vm2 CVE wave is the early indicator.

Frequently Asked Questions

Questions security teams ask first when a vm2 advisory drops, grounded in the 2026 wave above.

  1. What is the vm2 sandbox escape vulnerability? A vm2 sandbox escape is a class of vulnerability in which an attacker who can execute untrusted JavaScript inside the vm2 sandboxing library breaks out of the isolation boundary and runs arbitrary code on the host system. Throughout 2026, over a dozen CVEs in this class have been disclosed, with severity scores between 9.0 and 10.0.
  2. Which CVEs affect vm2 in 2026? The 2026 vm2 sandbox escape wave includes CVE-2026-22709 (Promise callback sanitization bypass), CVE-2026-26956 (WASM/JSTag escape leading to host RCE), and additional CVEs disclosed in early May 2026 covering proxy unwrap, prototype pollution, and child_process allowlist bypass techniques. CVE-2026-25881 is a related SandboxJS escape that NVD tracks separately from vm2.
  3. What does "Prompts Become Shells" mean? "Prompts Become Shells" is Microsoft Security Research's May 2026 framing for the structural risk that vm2 sandbox escape vulnerabilities create in AI agent frameworks. When an AI agent uses vm2 to execute dynamically generated code, a prompt injection can be chained through a sandbox escape into host-level remote code execution.
  4. How do I know if my environment is affected? Inventory every service with vm2 in its direct or transitive dependency tree, focusing on AI agent frameworks, plugin systems, CI/CD runners, online IDEs, and any code execution service. If any of these execute untrusted code or LLM-generated code through vm2, the environment is exposed.
  5. Can SCA tools detect a vm2 sandbox escape? SCA tools detect that vm2 is installed and flag the relevant CVEs. They do not detect whether the runtime actually executes untrusted code through vm2, which is what determines whether the installation is a critical exposure or a low-risk dependency. Runtime detection is required to close that gap.
  6. How do I respond if I find vm2 in my AI agent framework? Upgrade to the latest patched release immediately, then rotate every credential the affected runtime had access to (cloud credentials, API keys, registry tokens, CI/CD secrets). Audit logs for prompts or inputs that resemble code execution attempts during the disclosure window.
  7. Is vm2 still safe to use after patching? vm2 should not be treated as a hardened security boundary. The 2026 disclosure pattern continues a multi-year history of escape techniques, and additional variants are expected. Use vm2 as a convenience layer for ergonomics, and layer process-level or container-level isolation (gVisor, Firecracker, separate containers) underneath it for actual security isolation.
  8. What is the fastest way to stop prompt-to-RCE chains in AI agents? Restrict the tools available to AI agents. An agent that can’t invoke child_process, can’t write to the filesystem outside a scratch directory, and can’t reach unallowlisted network destinations can’t turn a sandbox escape into an infrastructure compromise. Tool restriction is faster to deploy than re-architecting the execution model.
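
Tool restriction, as described in the last answer, can be as simple as filtering an agent's tool list before it is registered. The sketch below assumes a hypothetical tool descriptor with a `capabilities` array and capability names invented for this example; real agent frameworks expose their own configuration for this.

```javascript
// Capabilities that are denied unless explicitly allowed (names are assumptions for this sketch).
const DENIED_BY_DEFAULT = new Set(["shell", "filesystem-write", "network"]);

// Keep only tools whose capabilities are all either harmless or explicitly allowed.
function restrictTools(tools, allowedCapabilities = []) {
  const allowed = new Set(allowedCapabilities);
  return tools.filter((tool) =>
    tool.capabilities.every((cap) => !DENIED_BY_DEFAULT.has(cap) || allowed.has(cap))
  );
}
```

Denying by default and allowing per workload mirrors the "allowlists, not blocklists" guidance in the hardening section.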

References

  1. BleepingComputer. May 6, 2026. Critical vm2 sandbox bug lets attackers execute code on hosts.
  2. Kodem Security. September 8, 2025. Malicious Packages Alert: The Qix npm Supply-Chain Attack: Lessons for the Ecosystem.
  3. Kodem Security. March 18, 2026. Malicious React Native npm Releases Trigger Supply Chain Exposure.
  4. Kodem Security. Runtime SCA: Know which packages are actually exploitable, in your environment.
  5. Kodem Security. Integrity intelligence for your software supply chain.
  6. Kodem Security. Triage-Free Open Source Security.
  7. Kodem Security. Detection and response that beats attackers to the punch.
  8. Microsoft. May 7, 2026. When prompts become shells: RCE vulnerabilities in AI agent frameworks.
  9. NVD. January 26, 2026. CVE-2026-22709 Detail.
  10. NVD. February 9, 2026. CVE-2026-25881 Detail.
  11. Semgrep. January 2026. New Sandbox Escape Affecting Popular nodejs Sandbox library vm2.
  12. StepSecurity. January 27, 2026. CVE-2026-22709: Critical Sandbox Escape Vulnerability in vm2.