CVE-2026-7304 · SGLang's multimodal generation runtime is vulnerable to unauthentica…

01 · The Real Story

This is a loaded nail gun left on the bench, not a landmine buried in every floor tile

CVE-2026-7304 is a pre-auth remote code execution flaw in SGLang's serving runtime: the custom_logit_processor request field carries a hex-encoded dill object, and affected builds deserialize it with dill.loads() without trust checks. Antiproof reports the affected range as v0.4.1.post7 and later, and the flaw is only reachable when operators launch SGLang with --enable-custom-logit-processor and expose a generation endpoint the attacker can hit.

The vendor-style 9.8/CRITICAL score is technically defensible in a vacuum because the sink is genuine unauthenticated network-to-host RCE. In production, though, the feature gate is what decides exposure: the flag is explicitly documented as disabled by default for security, so this is not 'every SGLang box is instantly owned' criticality. It is a high-priority, configuration-gated RCE that becomes especially serious in DeepSeek-R1-style deployments, where the docs recommend enabling the risky option.

"Real bug, real RCE, but the blast radius shrinks hard because the dangerous flag is opt-in and off by default"

02 · The Attack Path

4 steps from start to impact.

STEP 01

Reach an SGLang generation endpoint

The attacker first needs HTTP access to an SGLang inference API such as the OpenAI-compatible generation endpoints. In practice the weaponized tooling here is mundane: curl, httpx, or any API client is enough because the vulnerable path sits behind normal request handling rather than a hidden admin socket.

Conditions required:

SGLang is deployed and reachable over the network
The target exposes a generation endpoint on a network path the attacker can reach
Multimodal/runtime components relevant to the request path are present

Where this breaks in practice:

Many enterprise inference stacks are internal-only or behind API gateways
Some deployments terminate access at a reverse proxy with IP allowlists or auth
Not every SGLang deployment exposes the affected serving surface to untrusted users

Detection/coverage: External scanners can find SGLang-style OpenAPI/docs exposure on the default 30000 port, but there is no reliable public exposure count in the sources reviewed.
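
The reachability condition above can be checked from wherever an attacker would sit. A minimal sketch, assuming a plain TCP connect is an acceptable probe; the function name is illustrative, and the only value taken from this write-up is the default port 30000. A successful connect says the port is open, not that SGLang is behind it or that the flag is enabled.

```python
import socket


def is_port_reachable(host, port=30000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical target: replace with an address from your own inventory.
    print(is_port_reachable("127.0.0.1"))
```

Run it from an untrusted network segment as well as from inside the cluster; a difference between the two results is the gateway or allowlist doing its job.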

STEP 02

Depend on the dangerous feature flag being enabled

The exploit chain collapses unless the server was launched with --enable-custom-logit-processor. That is the decisive friction point: SGLang's own server-arguments documentation marks the option as disabled by default for security, so the attacker is betting on an operator opting into the unsafe feature.

Conditions required:

Server started with --enable-custom-logit-processor
Affected SGLang version is installed
Request path accepts custom_logit_processor input

Where this breaks in practice:

Flag is off by default
Only a subset of operators need this feature at all
Change-controlled production fleets may not permit ad hoc feature flags

Detection/coverage: Strong host-side detection: process command lines, container args, Helm values, and systemd unit files can directly reveal the flag.
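
Beyond live processes, the flag usually lives in files under version control. A rough sketch that greps a checkout of deployment manifests for the literal flag string; the function name and the glob list are assumptions, and a chart that toggles the flag through its own values key (rather than a raw argument) would not be caught.

```python
from pathlib import Path

FLAG = "--enable-custom-logit-processor"
# Assumed file patterns; extend to match how your fleet is actually deployed.
DEFAULT_GLOBS = ("*.yaml", "*.yml", "*.service", "*.sh", "Dockerfile*")


def find_flag_in_tree(root, globs=DEFAULT_GLOBS):
    """Return sorted (path, line_number, line) tuples where the flag appears."""
    hits = set()
    for pattern in globs:
        for path in Path(root).rglob(pattern):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue  # unreadable file: skip rather than abort the sweep
            for lineno, line in enumerate(text.splitlines(), start=1):
                if FLAG in line:
                    hits.add((str(path), lineno, line.strip()))
    return sorted(hits)


if __name__ == "__main__":
    for path, lineno, line in find_flag_in_tree("."):
        print(f"{path}:{lineno}: {line}")
```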

STEP 03

Send a malicious serialized callable

Once the flag is on, the attacker can submit a crafted custom_logit_processor payload whose callable property contains a malicious dill blob. The weaponized component is Python dill deserialization itself: the vulnerable code path loads attacker-controlled bytes and reconstructs executable Python objects.
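
Why loading attacker bytes equals running attacker code is easiest to see with a harmless object. The sketch below uses the standard-library pickle module as a stand-in for dill, on the assumption that dill honors the same __reduce__ hook; the class name is made up, and the callable is the builtin sorted rather than anything dangerous.

```python
import pickle


class LooksLikeData:
    """An object whose serialized form names a callable to run at load time."""

    def __reduce__(self):
        # The unpickler will call sorted([3, 1, 2]) and return its result.
        # An attacker would name a different callable here.
        return (sorted, ([3, 1, 2],))


# Mirror the hex transport described above: serialize, hex-encode, decode, load.
blob_hex = pickle.dumps(LooksLikeData()).hex()
result = pickle.loads(bytes.fromhex(blob_hex))

# The caller asked for an object back and instead got the return value of a
# function call that ran inside loads().
print(type(result).__name__, result)  # list [1, 2, 3]
```

Nothing in the receiving code opted into calling sorted; the serialized bytes chose it. That is why no amount of post-load validation helps: the call has already happened by the time loads() returns.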

Conditions required:

Attacker can place custom_logit_processor in request body
Payload is forwarded intact to SGLang
Target code still contains the dill.loads() sink

Where this breaks in practice:

Some upstream schema validators or API brokers may strip unknown/extra fields
Application-layer auth, if present, can reduce who can reach the body parser
Size limits or custom wrappers may block unusually large hex payloads

Detection/coverage: Log and alert on requests containing custom_logit_processor, very long hex strings, or repeated deserialization failures. Most vuln scanners will miss this unless they know the flag state.
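
The request-side signals above can be turned into a simple triage function for captured request bodies. A minimal sketch, assuming bodies are JSON and that a run of 512 or more hex characters counts as "very long"; the threshold, the function name, and the finding labels are all illustrative.

```python
import json
import re

FIELD = "custom_logit_processor"


def inspect_body(raw_body, min_hex_len=512):
    """Return a sorted list of finding labels for one request body."""
    try:
        doc = json.loads(raw_body)
    except (ValueError, TypeError):
        return ["unparseable_json"]

    hex_run = re.compile(r"[0-9a-fA-F]{%d,}" % min_hex_len)
    findings = set()
    stack = [doc]
    while stack:  # iterative walk so deeply nested bodies cannot recurse us out
        node = stack.pop()
        if isinstance(node, dict):
            for key, value in node.items():
                if key == FIELD:
                    findings.add("field_present")
                stack.append(value)
        elif isinstance(node, list):
            stack.extend(node)
        elif isinstance(node, str) and hex_run.search(node):
            findings.add("long_hex_string")
    return sorted(findings)


if __name__ == "__main__":
    sample = json.dumps({"text": "hi", FIELD: json.dumps({"callable": "ab" * 400})})
    print(inspect_body(sample))
```

The hex check runs on every string value, so it still fires when the field arrives as a JSON string wrapping its own object. Legitimate users of the feature will trip both findings, which is the point: on a host where nobody should be using the flag, any hit deserves a look.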

STEP 04

Execute code in the SGLang service context

Successful deserialization yields arbitrary Python execution in the server process, which means immediate access to model files, API tokens, service credentials, and lateral movement opportunities from that host or container. Weaponized follow-on tooling would typically be standard post-exploitation tradecraft rather than anything SGLang-specific.

Conditions required:

Deserialization succeeds
Service account has useful filesystem, network, or cloud permissions
Runtime isolation does not fully contain the process

Where this breaks in practice:

Containers, read-only filesystems, seccomp/AppArmor, and tight egress controls can reduce blast radius
Short-lived inference pods may limit persistence
Least-privilege service accounts can sharply constrain follow-on impact

Detection/coverage: EDR should catch unusual child processes, shell spawns, outbound callbacks, credential access, or filesystem tampering from the SGLang worker.
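
Where no EDR rule exists yet, the "unusual child process" signal can be approximated from any process snapshot. A sketch under stated assumptions: the snapshot is a list of dicts with pid, ppid, and cmd keys (a hypothetical shape, collect it however your tooling allows), and the set of suspicious executable names is a starting guess, not a vetted list.

```python
import os

# Assumed list of executables an inference worker rarely needs to spawn.
SUSPICIOUS_NAMES = {"sh", "bash", "dash", "zsh", "curl", "wget", "nc", "ncat"}


def suspicious_descendants(processes, parent_marker="sglang", names=SUSPICIOUS_NAMES):
    """Return sorted pids of suspicious processes that descend from a marked parent."""
    by_pid = {p["pid"]: p for p in processes}

    def has_marked_ancestor(proc):
        seen = set()
        ppid = proc["ppid"]
        # Walk up the parent chain; 'seen' guards against malformed cycles.
        while ppid in by_pid and ppid not in seen:
            seen.add(ppid)
            parent = by_pid[ppid]
            if parent_marker in parent["cmd"]:
                return True
            ppid = parent["ppid"]
        return False

    hits = []
    for proc in processes:
        argv = proc["cmd"].split()
        exe = os.path.basename(argv[0]) if argv else ""
        if exe in names and has_marked_ancestor(proc):
            hits.append(proc["pid"])
    return sorted(hits)


if __name__ == "__main__":
    snapshot = [
        {"pid": 1, "ppid": 0, "cmd": "/sbin/init"},
        {"pid": 100, "ppid": 1, "cmd": "python -m sglang.launch_server --enable-custom-logit-processor"},
        {"pid": 101, "ppid": 100, "cmd": "/bin/sh -c id"},
    ]
    print(suspicious_descendants(snapshot))
```

Expect false positives from legitimate helper shells; treat a hit as a prompt to check the command line and timing, not as proof of compromise.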

03 · Intelligence Metadata

The supporting signals.

In-the-wild status	No confirmed active exploitation in the reviewed primary sources as of 2026-05-29.
KEV status	Not listed in CISA KEV as of 2026-05-29.
PoC availability	Public technical details are available from Antiproof, including the vulnerable sink and a minimal unsafe-deserialization example. That is enough for competent attackers to reproduce.
EPSS	0.00426: low near-term exploitation probability, consistent with the narrow exposed population.
CVSS vector reality check	`CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H` describes the sink correctly once the flag is enabled, but CVSS does not model the deployment narrowing created by the opt-in feature gate.
Affected versions	Antiproof reports `v0.4.1.post7` and later.
Fixed versions	As of the v0.5.12.post1 release notes (2026-05-26), there is no publicly documented fix for CVE-2026-7304 in the sources reviewed. This is an inference from public release material, not a vendor statement.
Exposure / scanning data	No authoritative public GreyNoise/Shodan/Censys/FOFA count was found in the reviewed sources. Exposure is still plausible because SGLang docs show `--host 0.0.0.0` examples and default API docs on port `30000`, but this CVE additionally requires the custom-logit flag.
Disclosure timeline	CERT/CC VU#777338 and NVD both show public disclosure on 2026-05-18.
Reporter / context	CERT/CC credits Alon Shakevsky. The risky feature is disabled by default for security, yet SGLang docs also recommend enabling it for DeepSeek-R1 workflows.
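
The 9.8 behind the CVSS row can be reproduced from the vector itself. A minimal sketch of the CVSS v3.1 base-score arithmetic, deliberately limited to Scope Unchanged vectors; the function and table names are illustrative, while the weights and rounding rule follow the published v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the v3.1 base score for a Scope Unchanged vector string."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":", 1) for part in parts[1:])
    if metrics.get("S") != "U":
        raise ValueError("this sketch handles Scope Unchanged only")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

Every input to that formula describes the sink. None of them can express "only when an operator passes an opt-in flag", which is the gap the verdict below adjusts for.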

04 · The Call

noisgate verdict.

Final Verdict

↓ DOWNGRADED to HIGH (8.3/10)

The single biggest downward pressure is the prerequisite that operators must explicitly turn on --enable-custom-logit-processor; without that opt-in, this CVE is dead code from an attacker's perspective. It still lands in HIGH because when that flag is present the exploit is true pre-auth network-to-host RCE, and the official docs make the risky setting attractive for real model-serving deployments.

HIGH Technical exploitability of the vulnerable code path

MEDIUM How often enterprises actually run the feature flag in reachable production deployments

Why this verdict

Start at 9.8: the underlying primitive is genuine unauthenticated remote code execution through unsafe dill deserialization.
Down one notch for attacker reachability: the attacker needs network access to the generation API, which already excludes air-gapped, loopback-only, or tightly brokered inference stacks.
Down another notch for configuration gating: --enable-custom-logit-processor is explicitly disabled by default for security, so a large chunk of the installed base should be non-exploitable in practice.
Hold it at HIGH, not MEDIUM: SGLang is widely deployed, the docs recommend this flag for DeepSeek-R1 usage, and a successful hit is full service-context code execution with obvious credential and lateral-movement value.

Why not higher?

There is no KEV listing, no confirmed active exploitation in the reviewed sources, and the exposed population is materially narrowed by an opt-in flag that is off by default. This is dangerous code, but not the kind of internet-wide fire drill where every default deployment is instantly reachable.

Why not lower?

If your fleet actually enabled the feature, the attack is about as clean as it gets: pre-auth, low-complexity, host-level code execution. The docs themselves normalize the risky option for at least some common deployment patterns, which means this will show up in real estates rather than only lab setups.

05 · Compensating Control

What to do — in priority order.

Disable the feature flag — Remove --enable-custom-logit-processor anywhere it appears in service args, Helm values, container entrypoints, or systemd units. For a HIGH verdict, deploy this compensating control within 30 days; if the business genuinely needs the feature, treat that as an exception requiring sign-off rather than a silent default.
Fence the API behind trusted networks — Restrict SGLang generation endpoints to approved source ranges, private load balancers, or service-mesh identities so untrusted networks cannot reach the request body at all. Deploy within 30 days, and prioritize any host listening on 0.0.0.0 or fronted by internet-reachable gateways.
Require upstream auth and field filtering — Put an authenticated reverse proxy or API gateway in front of SGLang and explicitly reject or strip the custom_logit_processor field unless there is a documented business need. Deploy within 30 days; this does not fix the code, but it meaningfully reduces who can hit the sink.
Hunt for suspicious request patterns — Search HTTP logs, proxy logs, and app telemetry for requests containing custom_logit_processor, unusually long hex strings, or deserialization-related failures, then correlate with child-process, egress, and credential-access telemetry from the host. Start immediately and keep it running until a real code fix is deployed.
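
The field-filtering control above can be sketched as a small function a gateway or proxy hook would call on each request body. This is an illustration, not a drop-in plugin: the function names, the (status, body) return shape, and the choice of 400 for rejected requests are all assumptions.

```python
import json

BLOCKED_FIELD = "custom_logit_processor"


def scrub(node):
    """Return (copy of node without the blocked field, number of removals)."""
    if isinstance(node, dict):
        cleaned, removed = {}, 0
        for key, value in node.items():
            if key == BLOCKED_FIELD:
                removed += 1
                continue
            child, count = scrub(value)
            cleaned[key] = child
            removed += count
        return cleaned, removed
    if isinstance(node, list):
        cleaned, removed = [], 0
        for item in node:
            child, count = scrub(item)
            cleaned.append(child)
            removed += count
        return cleaned, removed
    return node, 0


def filter_request_body(raw_body, reject=True):
    """Return (status, body): 400 with no body on reject, else 200 with clean JSON."""
    try:
        doc = json.loads(raw_body)
    except (ValueError, TypeError):
        return 400, None  # refuse anything the filter cannot parse
    cleaned, removed = scrub(doc)
    if removed and reject:
        return 400, None
    return 200, json.dumps(cleaned)


if __name__ == "__main__":
    print(filter_request_body('{"text": "hi", "custom_logit_processor": "{}"}'))
```

Rejecting is the safer default because it surfaces callers who still send the field; stripping (reject=False) keeps them working silently, which is convenient but hides who depends on the feature.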

What doesn't work

Relying on CVSS alone does not help here; the real decision point is whether the dangerous flag is actually enabled.
A generic WAF signature-only approach is weak because the payload can look like ordinary JSON with application-specific fields unless you explicitly understand and block custom_logit_processor.
Assuming 'it's internal so it's fine' is a bad control in flat enterprise networks; once an attacker gets any foothold, internal-only AI services become excellent lateral-movement targets.

06 · Verification

Crowdsourced verification payload.

Run this on the target Linux SGLang host itself, or push it through your SSH/orchestration tooling. Save it as noisgate-verify.py and invoke it as python3 noisgate-verify.py; root is recommended so it can inspect all /proc/*/cmdline entries and service unit files, though non-root may still return useful results.

noisgate-verify.py

PYTHON · READ-ONLY · SAFE

#!/usr/bin/env python3
# Verify likely exposure to CVE-2026-7304 on a Linux host
# Exit codes: 0=PATCHED, 1=VULNERABLE, 2=UNKNOWN

import glob
import re
import sys
from pathlib import Path

try:
    from importlib import metadata as importlib_metadata
except Exception:
    import importlib_metadata  # type: ignore

LOWER_AFFECTED = "0.4.1.post7"


def parse_version(v):
    if not v:
        return None
    m = re.match(r"^(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?", v)
    if not m:
        return None
    major, minor, patch, post = m.groups()
    return (int(major), int(minor), int(patch), int(post or 0))


def version_gte(a, b):
    pa = parse_version(a)
    pb = parse_version(b)
    if pa is None or pb is None:
        return None
    return pa >= pb


def get_installed_version():
    candidates = ["sglang", "SGLang"]
    for name in candidates:
        try:
            return importlib_metadata.version(name)
        except Exception:
            pass
    return None


def get_package_root():
    try:
        dist = importlib_metadata.distribution("sglang")
        for f in dist.files or []:
            p = Path(dist.locate_file(f))
            # Match only the top-level package, not sglang/<subpackage>/__init__.py.
            if p.name == "__init__.py" and p.parent.name == "sglang":
                return p.parent
    except Exception:
        pass
    for base in sys.path:
        p = Path(base) / "sglang"
        if p.exists() and p.is_dir():
            return p
    return None


def source_has_sink(pkg_root):
    if not pkg_root:
        return False, None
    target = pkg_root / "srt" / "sampling" / "custom_logit_processor.py"
    if not target.exists():
        return False, str(target)
    try:
        txt = target.read_text(encoding="utf-8", errors="ignore")
    except Exception:
        return False, str(target)
    patterns = [
        'dill.loads(bytes.fromhex(data["callable"]))',
        "dill.loads(bytes.fromhex(data['callable']))",
        "dill.loads(",
    ]
    return any(p in txt for p in patterns), str(target)


def find_running_flagged_processes():
    hits = []
    for proc in glob.glob("/proc/[0-9]*/cmdline"):
        pid = proc.split("/")[2]
        try:
            raw = Path(proc).read_bytes()
            if not raw:
                continue
            cmd = raw.replace(b"\x00", b" ").decode("utf-8", errors="ignore").strip()
        except Exception:
            continue
        if "sglang" in cmd and "--enable-custom-logit-processor" in cmd:
            hits.append({"pid": pid, "cmd": cmd})
    return hits


def find_unitfile_flag():
    roots = ["/etc/systemd/system", "/lib/systemd/system", "/usr/lib/systemd/system"]
    hits = []
    for root in roots:
        p = Path(root)
        if not p.exists():
            continue
        for unit in p.rglob("*.service"):
            try:
                txt = unit.read_text(encoding="utf-8", errors="ignore")
            except Exception:
                continue
            if "sglang" in txt and "--enable-custom-logit-processor" in txt:
                hits.append(str(unit))
    return hits


def main():
    version = get_installed_version()
    pkg_root = get_package_root()
    sink_present, sink_file = source_has_sink(pkg_root)
    flagged_procs = find_running_flagged_processes()
    flagged_units = find_unitfile_flag()

    print(f"installed_version={version or 'UNKNOWN'}")
    print(f"package_root={str(pkg_root) if pkg_root else 'UNKNOWN'}")
    print(f"source_file={sink_file or 'UNKNOWN'}")
    print(f"deserialization_sink_present={'yes' if sink_present else 'no'}")
    print(f"running_processes_with_flag={len(flagged_procs)}")
    for hit in flagged_procs[:10]:
        print(f"process_hit pid={hit['pid']} cmd={hit['cmd']}")
    print(f"unit_files_with_flag={len(flagged_units)}")
    for unit in flagged_units[:10]:
        print(f"unit_hit path={unit}")

    affected_by_version = None
    if version:
        affected_by_version = version_gte(version, LOWER_AFFECTED)
        print(f"affected_by_version={'yes' if affected_by_version else 'no' if affected_by_version is False else 'unknown'}")
    else:
        print("affected_by_version=unknown")

    flag_enabled = bool(flagged_procs or flagged_units)

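    # Versions older than the first affected release are reported as PATCHED
    # (strictly "not affected"; no fixed release is known in the reviewed sources).
    if affected_by_version is False:
        print("PATCHED")
        sys.exit(0)

    # Affected version or sink present, and the opt-in flag is on somewhere.
    if (affected_by_version is True or sink_present) and flag_enabled:
        print("VULNERABLE")
        sys.exit(1)

    # Flag not observed on this host, or version/sink undetermined: the flag
    # could still be set elsewhere (container args, Helm values), so stay UNKNOWN.
    print("UNKNOWN")
    sys.exit(2)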


if __name__ == "__main__":
    main()

07 · Bottom Line

If you remember one thing.

TL;DR

Monday morning, pull a fleet-wide inventory of SGLang nodes, then specifically hunt for --enable-custom-logit-processor in running processes, service definitions, container args, and deployment manifests; any host with that flag enabled should have the feature disabled or be network-fenced behind trusted callers within the noisgate mitigation SLA of ≤30 days. Because no public vendor fix is documented in the reviewed sources as of May 29, 2026, do not wait for patch availability to reduce exposure: remove the flag where possible, lock the API down now, document temporary exceptions, and when an upstream fix finally lands, push it under the noisgate remediation SLA of ≤180 days.
