CVE-2025-61620

CVE-2025-61620 is a medium-severity improper input validation vulnerability in vllm (pip), affecting versions >= 0.5.1, < 0.11.0. It is fixed in 0.11.0.

Summary

A resource-exhaustion (denial-of-service) vulnerability exists in multiple endpoints of the OpenAI-Compatible Server due to the ability to specify Jinja templates via the chat_template and chat_template_kwargs parameters. If an attacker can supply these parameters to the API, they can cause a service outage by exhausting CPU and/or memory resources.

Details

When using an LLM as a chat model, the conversation history must be rendered into a text input for the model. In hf/transformer, this rendering is performed using a Jinja template. The OpenAI-Compatible Server launched by vllm serve exposes a chat_template parameter that lets users specify that template. In addition, the server accepts a chat_template_kwargs parameter to pass extra keyword arguments to the rendering function.

Because Jinja templates support programming-language-like constructs (loops, nested iterations, etc.), a crafted template can consume extremely large amounts of CPU and memory and thereby trigger a denial-of-service condition.

Importantly, simply forbidding the chat_template parameter does not fully mitigate the issue. The implementation constructs a dictionary of keyword arguments for apply_hf_chat_template and then updates that dictionary with the user-supplied chat_template_kwargs via dict.update. Since dict.update can overwrite existing keys, an attacker can place a chat_template key inside chat_template_kwargs to replace the template that will be used by apply_hf_chat_template.

# vllm/entrypoints/openai/serving_engine.py#L794-L816
_chat_template_kwargs: dict[str, Any] = dict(
    chat_template=chat_template,
    add_generation_prompt=add_generation_prompt,
    continue_final_message=continue_final_message,
    tools=tool_dicts,
    documents=documents,
)
_chat_template_kwargs.update(chat_template_kwargs or {})

request_prompt: Union[str, list[int]]
if isinstance(tokenizer, MistralTokenizer):
    ...
else:
    request_prompt = apply_hf_chat_template(
        tokenizer=tokenizer,
        conversation=conversation,
        model_config=model_config,
        **_chat_template_kwargs,
    )

Fixes

Impact

If an OpenAI-Compatible Server exposes endpoints that accept chat_template or chat_template_kwargs from untrusted clients, an attacker can submit a malicious Jinja template (directly or by overriding chat_template inside chat_template_kwargs) that consumes excessive CPU and/or memory. This can result in a resource-exhaustion denial-of-service that renders the server unresponsive to legitimate requests.

The application does not adequately validate input before processing it, allowing unexpected values to reach sensitive code paths. Typical impact: varies by context: data corruption, logic bypass, or denial of service.

CVE-2025-61620 has a CVSS score of 6.5 (Medium). The vector is network-reachable, low privileges required, and no user interaction. A CVSS score reflects the worst-case severity of the vulnerability, not your specific exposure. Whether this affects your application depends on whether the vulnerable code is present and reachable in your environment. A fixed version is available (0.11.0); upgrading removes the vulnerable code path.

Affected versions

vllm (>= 0.5.1, < 0.11.0)

Security releases

vllm → 0.11.0 (pip)

Kodem intelligence

Severity tells you how bad this could be in the worst case. It does not tell you whether you are exposed. Exploitability and impact are functions of runtime truth: whether the vulnerable code is present, reachable, and actually executes in your application. A vulnerable package can sit in your dependency tree and never run.

Kodem, an Intelligent Application Security platform, uses runtime intelligence to reveal which vulnerabilities actually execute in production, so teams prioritize the ones that genuinely matter. Kodem's runtime-powered SCA identifies whether this CVE is reachable in your applications.

See it in your environment

Remediation advice

Upgrade vllm to 0.11.0 or later to resolve this vulnerability.

Kodem Kai can prioritize this vulnerability in your dependency tree and generate a fix recommendation.

Frequently Asked Questions

  1. What is CVE-2025-61620? CVE-2025-61620 is a medium-severity improper input validation vulnerability in vllm (pip), affecting versions >= 0.5.1, < 0.11.0. It is fixed in 0.11.0. The application does not adequately validate input before processing it, allowing unexpected values to reach sensitive code paths.
  2. How severe is CVE-2025-61620? CVE-2025-61620 has a CVSS score of 6.5 (Medium). This score reflects the worst-case severity of the vulnerability, not your specific exposure. Whether it represents real risk in your environment depends on whether the vulnerable code is present and reachable.
  3. Which versions of vllm are affected by CVE-2025-61620? vllm (pip) versions >= 0.5.1, < 0.11.0 is affected.
  4. Is there a fix for CVE-2025-61620? Yes. CVE-2025-61620 is fixed in 0.11.0. Upgrade to this version or later.
  5. Is CVE-2025-61620 exploitable, and should I be worried? Whether CVE-2025-61620 is exploitable in your environment depends on whether the vulnerable code is present and reachable. A CVSS score is a worst-case rating; it does not account for your specific deployment, configuration, or usage patterns. Kodem, an Intelligent Application Security platform, uses runtime intelligence to show which vulnerabilities actually execute in production, so you can focus on the ones that represent real risk. Get a demo
  6. What actually determines whether CVE-2025-61620 is exploitable, and how bad it is? Exploitability and impact are not fixed properties of a CVE. They depend on runtime truth: whether the vulnerable code is present, reachable, and actually executes in your application. A high CVSS score on a dependency that never runs is not the same as real risk. Kodem, an Intelligent Application Security platform, uses runtime intelligence to reveal which vulnerabilities actually execute in production, so teams prioritize the ones that genuinely matter.
  7. How do I fix CVE-2025-61620? Upgrade vllm to 0.11.0 or later.

Other vulnerabilities in vllm

CVE-2026-54233CVE-2026-54236CVE-2026-53923CVE-2026-12491CVE-2026-48746

Stop the waste.
Protect your environment with Kodem.