Description
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0.
CVSS breakdown
CVSS 3.1
Attack Vector
Adjacent
Attack Complexity
Low
Privileges Required
Low
User Interaction
None
Scope
Changed
Confidentiality
High
Integrity
High
Availability
High
Affected products
- vllm-project / vllm>= 0.6.5, < 0.8.0 – >= 0.6.5, < 0.8.0