Description
vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.22.0, an assert-based security check in vLLM's activation function loading allows any unauthenticated attacker to achieve arbitrary code execution on the server by publishing a malicious HuggingFace model, when vLLM runs in Python optimized mode (python -O or PYTHONOPTIMIZE=1). This vulnerability is fixed in 0.22.0.
CVSS breakdown
CVSS 3.1
Attack Vector
Network
Attack Complexity
High
Privileges Required
None
User Interaction
Required
Scope
Unchanged
Confidentiality
High
Integrity
High
Availability
High
Affected products
- vllm-project / vllm< 0.22.0 – < 0.22.0