Summary
Issue 1: EXIF orientation not normalized → The image orientation processed by the model differs from how humans view it, introducing interpretation bias.
Issue 2: PNG tRNS not explicitly flattened before converting to RGB → After conversion, transparent/semi-transparent pixels are rendered unexpectedly, making otherwise subtle overlay elements visible and distorting the input content. (This attack is similar to AlphaDog: RGBA handling is already correct in vLLM, but because tRNS can attach transparency to non-RGBA images such as RGB, those images never take the correct processing path.)
Issue 3: Pillow loads only the first frame of APNG or GIF files.
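As a rough illustration of Issue 3, a caller can at least detect that an input carries more than one frame before the first frame is silently used. This is a minimal sketch, not code from vLLM: `frame_count` and `is_multiframe` are hypothetical helper names, and what to do with the extra frames (reject, log, or process each) is a policy choice the advisory does not specify.

```python
from PIL import Image


def frame_count(image: Image.Image) -> int:
    """Return the number of frames Pillow reports for this image.

    Multi-frame plugins (GIF, APNG, ...) expose an ``n_frames`` attribute;
    images without it are treated as single-frame.
    """
    return getattr(image, "n_frames", 1)


def is_multiframe(image: Image.Image) -> bool:
    """True when the image has frames beyond the one Pillow loads by default."""
    return frame_count(image) > 1
```

A loader could call `is_multiframe` right after `Image.open` and decide explicitly how to treat animated input instead of implicitly using frame 0.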
Root Cause
- Rotation: After opening an image, ImageOps.exif_transpose is not called to normalize EXIF orientation.
- Transparency: Only RGBA→RGB is flattened with a background; PNGs carrying tRNS in P/L/RGB + tRNS and other non-RGBA modes take the image.convert("RGB") path, which implicitly discards/remaps transparency semantics.
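The rotation half of the root cause can be sketched as follows. This is an illustrative example under stated assumptions, not the merged fix: `normalize_orientation` is a hypothetical helper name, and only `ImageOps.exif_transpose` is taken from the advisory.

```python
from PIL import Image, ImageOps


def normalize_orientation(image: Image.Image) -> Image.Image:
    """Apply the EXIF Orientation tag to the pixel data.

    ImageOps.exif_transpose returns a new image that is rotated/flipped
    according to the tag, so the pixels match what an image viewer shows.
    """
    return ImageOps.exif_transpose(image)
```

Calling this immediately after opening an image, before any mode conversion or resizing, keeps the pixels the model receives aligned with what a human reviewer sees.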
Affected Code
Current state: ImageOps.exif_transpose is not used. (Although the rescale_image_size function (https://github.com/vllm-project/vllm/blob/main/vllm/multimodal/image.py#L14) exists and includes a transpose parameter, I’ve found that it doesn’t seem to be called anywhere outside the test directory.)
Call order: _convert_image_mode runs first; if the conditions are met, convert_image_mode is called.
Issue: Only the “RGBA → RGB” path is explicitly flattened. P, L, or RGB with tRNS all fall back to image.convert("RGB"). For PNGs that include tRNS, convert("RGB") directly produces 24-bit RGB, leading to:
- P mode: The transparent index becomes an actual RGB color (often black, white, or an undefined background), so transparency is lost.
- L/LA and RGB + tRNS: convert("RGB") doesn’t composite against a chosen background first, so elements that relied on transparency to be hidden or softened become solid.
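One way to avoid the implicit convert("RGB") behavior described above is to promote any transparency to a real alpha channel and composite it onto an explicit background. The sketch below is an assumption-laden illustration, not the code from the vLLM fix: `has_transparency` and `flatten_to_rgb` are hypothetical names, and the white default background is an arbitrary choice.

```python
from PIL import Image


def has_transparency(image: Image.Image) -> bool:
    """True for alpha-channel modes or when Pillow recorded a tRNS chunk."""
    return image.mode in ("RGBA", "LA", "PA") or "transparency" in image.info


def flatten_to_rgb(image: Image.Image, background=(255, 255, 255)) -> Image.Image:
    """Composite the image onto a solid background and return an RGB image."""
    if not has_transparency(image):
        return image.convert("RGB")
    # Converting to RGBA turns tRNS-style transparency (stored by Pillow in
    # image.info["transparency"]) into an explicit alpha channel.
    rgba = image.convert("RGBA")
    canvas = Image.new("RGB", rgba.size, background)
    # Use the alpha channel as the paste mask so hidden pixels stay hidden.
    canvas.paste(rgba, mask=rgba.getchannel("A"))
    return canvas
```

With this approach an RGB image whose tRNS chunk marks one color as transparent shows the background in those pixels, rather than the solid color that a direct convert("RGB") would keep.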
Impact & Scope
- Impact: Pixels the model sees can diverge from operator expectations (due to orientation or transparency handling), potentially altering downstream reasoning.
- Scope: The image I/O and mode-conversion paths in vllm/multimodal/image.py. The existing RGBA→RGB flattening is correct; the issues center on missing EXIF normalization and non-RGBA tRNS not being explicitly composited.
Case
EXIF: http://qiniu.funxingzuo.top/exif_orient_180.jpg
tRNS: http://qiniu.funxingzuo.top/hello.png
Impact
CVE-2026-12491 has a CVSS score of 4.8 (Medium). The attack vector is network, with no privileges and no user interaction required. A CVSS score reflects the worst-case severity of the vulnerability, not your specific exposure. Whether this affects your application depends on whether the vulnerable code is present and reachable in your environment. No fixed version is listed yet, so configuration controls and monitoring matter more in the interim.
Kodem intelligence
Severity tells you how bad this could be in the worst case. It does not tell you whether you are exposed. Exploitability and impact are functions of runtime truth: whether the vulnerable code is present, reachable, and actually executes in your application. A vulnerable package can sit in your dependency tree and never run.
Kodem, an Intelligent Application Security platform, uses runtime intelligence to reveal which vulnerabilities actually execute in production, so teams prioritize the ones that genuinely matter. Kodem's runtime-powered SCA identifies whether this CVE is reachable in your applications.
Remediation advice
A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44974
Frequently Asked Questions
- What is CVE-2026-12491? CVE-2026-12491 is a medium-severity security vulnerability in vllm (pip), affecting versions >= 0.11.0, <= 0.23.0. No fixed version is listed yet.
- How severe is CVE-2026-12491? CVE-2026-12491 has a CVSS score of 4.8 (Medium). This score reflects the worst-case severity of the vulnerability, not your specific exposure. Whether it represents real risk in your environment depends on whether the vulnerable code is present and reachable.
- Which versions of vllm are affected by CVE-2026-12491? vllm (pip) versions >= 0.11.0, <= 0.23.0 are affected.
- Is there a fix for CVE-2026-12491? No fixed version is listed for CVE-2026-12491 yet. Monitor the advisory for updates and apply mitigations in the interim.
- Is CVE-2026-12491 exploitable, and should I be worried? Whether CVE-2026-12491 is exploitable in your environment depends on whether the vulnerable code is present and reachable. A CVSS score is a worst-case rating; it does not account for your specific deployment, configuration, or usage patterns. Kodem, an Intelligent Application Security platform, uses runtime intelligence to show which vulnerabilities actually execute in production, so you can focus on the ones that represent real risk.
- What actually determines whether CVE-2026-12491 is exploitable, and how bad it is? Exploitability and impact are not fixed properties of a CVE. They depend on runtime truth: whether the vulnerable code is present, reachable, and actually executes in your application. A high CVSS score on a dependency that never runs is not the same as real risk. Kodem, an Intelligent Application Security platform, uses runtime intelligence to reveal which vulnerabilities actually execute in production, so teams prioritize the ones that genuinely matter.