Description
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.22.0, vLLM's revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies --revision or --code-revision can still load dynamic code, GGUF files, image processors, retrieval side weights, or same-repository subfolder weights/config from an unpinned/default revision. This is a supply-chain integrity issue for pinned vLLM deployments. Operators can believe they are serving a reviewed model revision while vLLM resolves behavior-affecting nested or sibling artifacts outside that reviewed revision. This vulnerability is fixed in 0.22.0.
CVSS breakdown
CVSS 3.1
Attack Vector
Network
Attack Complexity
High
Privileges Required
None
User Interaction
None
Scope
Unchanged
Confidentiality
Low
Integrity
High
Availability
None
Affected products
- vllm-project / vllm< 0.22.0 – < 0.22.0