CVE-2026-54235

Description

vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, all temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity under Python's IEEE 754 float semantics. Both values therefore pass every guard and propagate to the GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. This vulnerability is fixed in 0.23.1rc0.
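The comparison behavior described above can be reproduced in plain Python. The following is a minimal sketch, not vLLM's actual code: `naive_check` and `strict_check` are hypothetical names, and the lower-bound-only guard is an assumed stand-in for the kind of gate the advisory describes. It shows why a `<` comparison lets NaN and positive Infinity through, and how an explicit finiteness test with `math.isfinite` closes the gap.

```python
import math


def naive_check(temperature: float) -> bool:
    """Hypothetical guard that relies only on a comparison operator.

    NaN compared with anything is False, and +inf is not less than 0,
    so neither value is rejected here.
    """
    if temperature < 0.0:
        return False
    return True


def strict_check(temperature: float) -> bool:
    """Hypothetical hardened guard: reject non-finite values first."""
    return math.isfinite(temperature) and temperature >= 0.0


if __name__ == "__main__":
    for value in (0.7, -1.0, float("nan"), float("inf")):
        print(value, naive_check(value), strict_check(value))
```

Checking finiteness before any range comparison is one possible design; the referenced patch may validate the value differently.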

CVSS breakdown

CVSS 4.0

Attack Vector: Network
Attack Complexity: Low
Attack Requirements: None
Privileges Required: None
User Interaction: None
Confidentiality (Vulnerable System): None
Integrity (Vulnerable System): None
Availability (Vulnerable System): Low
Confidentiality (Subsequent System): None
Integrity (Subsequent System): None
Availability (Subsequent System): None

Affected products

vllm-project / vllm: versions < 0.23.1rc0

References

VENDOR_ADVISORY: https://github.com/vllm-project/vllm/security/advisories/GHSA-7h4p-rffg-7823
PATCH: https://github.com/vllm-project/vllm/pull/45116
PATCH: https://github.com/vllm-project/vllm/commit/d598d239737cfa37bcfcb98886ec3f3557fc7198