CVE-2026-25591 · New API is a large language model (LLM) gateway and artificial intel…

01 · The Real Story

This is a smoke grenade in the server room, not a master key to the building

Before v0.10.8-alpha.10, QuantumNous new-api mishandles wildcard characters in the authenticated /api/token/search endpoint. The keyword and token inputs are fed into SQL LIKE clauses without properly neutralizing % and _, so a logged-in user can trigger slow, expensive queries and exhaust database or application resources. GitHub flags affected versions as <= v0.10.8-alpha.9, and the fix lands in v0.10.8-alpha.10.
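
To make the bug class concrete, here is a minimal sketch of LIKE wildcard neutralization, written in Python against an in-memory SQLite database rather than new-api's own code. The `tokens` table and the `escape_like` and `search_tokens` helpers are hypothetical names chosen for illustration; the upstream patch may differ in detail.

```python
import sqlite3


def escape_like(term: str, escape: str = "\\") -> str:
    """Neutralize LIKE metacharacters so they match literally."""
    # Escape the escape character first so later replacements are not doubled.
    term = term.replace(escape, escape + escape)
    term = term.replace("%", escape + "%")
    return term.replace("_", escape + "_")


def search_tokens(conn: sqlite3.Connection, keyword: str, limit: int = 20) -> list:
    """Substring search with user wildcards neutralized and a bounded result set."""
    pattern = "%" + escape_like(keyword) + "%"
    rows = conn.execute(
        "SELECT name FROM tokens WHERE name LIKE ? ESCAPE '\\' LIMIT ?",
        (pattern, limit),
    )
    return [row[0] for row in rows]


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (name TEXT)")
conn.executemany(
    "INSERT INTO tokens VALUES (?)",
    [("prod_key",), ("prodXkey",), ("100%",)],
)

print(search_tokens(conn, "d_k"))  # "_" is matched as a literal underscore
print(search_tokens(conn, "%"))    # "%" is matched as a literal percent sign
```

Without the escaping step, the keyword `d_k` would also match `prodXkey`, and a bare `%` would match every row. The `LIMIT` bound stands in for the pagination the patch is reported to add.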

The vendor's MEDIUM call is basically right, and I'd keep it there after a friction audit. The important down-pressure is that exploitation needs a valid account and only buys *availability impact*; there is no confidentiality or integrity loss in the published advisory, no KEV listing, and no public evidence of broad in-the-wild abuse. The up-pressure is that the abuse path is straightforward once authenticated, especially on default SQLite-backed or undersized self-hosted deployments.

"Post-authenticated app-layer DoS in a niche self-hosted gateway: patch it, but don't let it jump the queue."

02 · The Attack Path

4 steps from start to impact.

STEP 01

Get a normal user session

The attacker needs ordinary authenticated access to the new-api web/API surface. That can be their own low-privilege account, a compromised user credential, or any tenant account in a shared deployment. Tooling is trivial: browser dev tools, curl, or Burp Suite can all drive the endpoint once a session token is present.

Conditions required:

Authenticated remote access
Reachability to the new-api HTTP interface
A vulnerable build at or below v0.10.8-alpha.9

Where this breaks in practice:

Requires prior access; this is not an internet-scale unauthenticated bug
SSO, MFA, private-only exposure, and tenant enrollment controls reduce reachable population
Many enterprises will have this service internal-only rather than public-facing

Detection/coverage: SCA tools that track the GitHub advisory should flag the vulnerable version, but generic network scanners will not reliably prove exploitability because the endpoint sits behind authentication.

STEP 02

Send pathological wildcard search patterns

The attacker submits crafted keyword or token values containing % and _ patterns that expand the cost of LIKE matching. The GHSA explicitly states these values were concatenated into SQL LIKE clauses without escaping, and the advisory's exploitation scenario calls out wildcard-heavy requests as the trigger. Burp Intruder, curl, or a small Python script are enough; no exploit framework is required.

Conditions required:

Ability to call /api/token/search while authenticated
Input reaches the vulnerable query path unchanged

Where this breaks in practice:

If upstream reverse proxies enforce request throttling, burst power drops
Smaller datasets reduce individual query cost
WAF rules matching repeated % patterns can blunt the obvious abuse cases

Detection/coverage: Look for spikes in requests to /api/token/search, unusually wildcard-dense query strings, and corresponding slow-query entries in SQLite/MySQL/PostgreSQL logs.
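
The "wildcard-dense query string" signal can be approximated with a small log-triage helper. This is a sketch under stated assumptions: the search terms are assumed to arrive as `keyword` and `token` query parameters, and the thresholds (`min_wildcards`, `min_density`) are hypothetical starting points that would need tuning against real traffic.

```python
from urllib.parse import parse_qs, urlsplit

SEARCH_PATH = "/api/token/search"
WATCHED_PARAMS = ("keyword", "token")


def wildcard_profile(value: str) -> tuple[int, float]:
    """Return (wildcard count, wildcard share of the value's length)."""
    count = value.count("%") + value.count("_")
    return count, (count / len(value) if value else 0.0)


def is_suspicious(url: str, min_wildcards: int = 3, min_density: float = 0.5) -> bool:
    """Flag token-search requests whose parameters are mostly LIKE wildcards."""
    parts = urlsplit(url)
    if parts.path != SEARCH_PATH:
        return False
    # parse_qs percent-decodes values, so an encoded %25 is counted as a literal %.
    params = parse_qs(parts.query, keep_blank_values=True)
    for name in WATCHED_PARAMS:
        for value in params.get(name, []):
            count, density = wildcard_profile(value)
            if count >= min_wildcards and density >= min_density:
                return True
    return False


for url in (
    "/api/token/search?keyword=%25_%25_%25_%25",
    "/api/token/search?keyword=prod_key",
    "/api/user/self?keyword=%25%25%25%25",
):
    print(url, is_suspicious(url))
```

Requiring both a minimum count and a minimum density keeps ordinary names that contain a few underscores from being flagged.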

STEP 03

Amplify with concurrency

The advisory's own exploitation scenario describes launching 50-200 concurrent requests until the database is overwhelmed and the app exhausts memory processing large result sets. Commodity load tools such as hey, ab, or Burp Intruder are enough to do this from one account. The patch commit adds pagination, rate limiting, and stricter pattern validation, which is a strong signal the pre-patch abuse path was practical.

Conditions required:

Sustained authenticated request stream
Backend resources small enough that expensive queries stack up

Where this breaks in practice:

Per-user rate limits, if already present externally, reduce blast radius
Autoscaling or an oversized DB tier can absorb some abuse
Service monitoring may catch the surge before full outage

Detection/coverage: APM and DB telemetry should show elevated query latency, CPU, and memory around token search. If you only rely on perimeter scanners, you will miss this.

STEP 04

Degrade or deny service

The impact described by GitHub and NVD is denial of service through resource exhaustion. Realistically, this means the gateway slows badly or becomes unavailable for that instance or tenant set, which matters if new-api is a central AI proxy for internal apps. It does *not* provide code execution, tenant breakout, or data theft in the published record.

Conditions required:

The vulnerable service is operationally important enough that slowdown matters

Where this breaks in practice:

Blast radius is limited to availability of this application stack
No published path to confidentiality, integrity, or host compromise
Recovery is usually service/database resource recovery rather than incident-response-for-RCE

Detection/coverage: Health checks, 5xx spikes, queue depth, DB slow-query logs, and container restarts will usually show the effect even if the root cause is missed at first.

03 · Intelligence Metadata

The supporting signals.

In-the-wild status	No authoritative evidence found of active exploitation. Not present in CISA KEV as checked against the KEV catalog source.
Proof-of-concept availability	No primary-source public exploit repo or Metasploit/Nuclei module found. However, the GitHub advisory includes an explicit exploitation scenario and the patch commit clearly shows the abused code path, so reproduction effort is low.
EPSS	`0.00022` from the user-supplied intel; that is very low and consistent with a narrow, post-authenticated open-source app DoS.
KEV status	No. No `dateAdded` applies because `CVE-2026-25591` is absent from the CISA KEV catalog.
CVSS vector	`CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H` — the score is carried by easy network reachability and high availability impact, but the `PR:L` prerequisite is the major real-world limiter.
Affected versions	GitHub advisory: `<= v0.10.8-alpha.9`. NVD also maps vulnerable `new_api` builds up to but excluding `0.10.8`, including `0.10.8-alpha.1` through `alpha.9`.
Fixed versions	Upstream fix: `v0.10.8-alpha.10`. I found no distro backport advisories for packaged Linux distributions; treat upstream GitHub release/commit as the authoritative patch point.
Scanning / exposure data	No reliable public census or fingerprint source was found in primary data for exposed `new-api` instances. The repo README shows simple Docker deployment on port `3000`, which implies exposure is plausible, but internet-scale prevalence is unverified.
Disclosure timeline	GitHub advisory published 2026-02-22; NVD shows the CVE record published 2026-02-23 and modified 2026-03-03. Your intel block uses 2026-02-24; that is close, but the primary records point to Feb. 22–23, 2026 depending on source/time zone.
Researchers / reporting	GitHub credits reporters `xuemian168` and `callmeiks`; remediation developer listed as `Calcium-Ion`.
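
For readers who want to see where the 6.5 comes from, the sketch below recomputes the base score from the vector in the table using the CVSS v3.1 base equations and metric weights. It only handles unchanged scope (`S:U`), which is all this vector needs; `base_score` and `roundup` are illustrative helper names.

```python
import math

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a scope-unchanged vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers scope-unchanged vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

With `PR:N` instead of `PR:L` the same vector scores 7.5, which shows how much of the number hinges on the authentication requirement.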

04 · The Call

noisgate verdict.

Final Verdict

UNCHANGED at MEDIUM (5.0/10)

The single biggest reason this stays out of the top patch tier is the attacker position requirement: authenticated remote access to a niche self-hosted application endpoint. Once that hurdle is cleared the abuse is easy, but the published impact is still limited to application availability rather than code execution or data compromise.

HIGH Technical description, affected versions, and fixed version

MEDIUM Real-world exposure prevalence of internet-facing `new-api` deployments

MEDIUM Absence of public exploitation evidence beyond advisory-level reproduction details

Why this verdict

Start at vendor 6.5 MEDIUM: the CVSS math is reasonable for an easy network-reachable DoS, but it overstates urgency for enterprise triage because PR:L means the attacker is already inside the trust boundary of the application.
First downward adjustment — attacker position: this requires authenticated remote access, which implies prior compromise, valid tenancy, or legitimate user enrollment. MFA, SSO, private exposure, and account governance all suppress who can even start the chain.
Second downward adjustment — reachable population: new-api is a popular open-source project, but it is still a specific self-hosted AI gateway, not a universally deployed enterprise platform. That narrows exposure compared with edge appliances, VPNs, or broadly exposed collaboration software.
Third downward adjustment — impact type: the published record shows availability-only impact. There is no vendor claim of RCE, privilege escalation, data access, or tenant escape, so the blast radius is materially smaller than the CVSS number can make it feel.
Small upward pressure: the patch adds input validation, pagination, and per-user rate limiting, which suggests the pre-fix abuse path was not theoretical. On undersized or SQLite-backed deployments, one ordinary account may be enough to cause real service pain.

Why not higher?

There is no evidence here of unauthenticated exploitation, code execution, or sensitive-data compromise. The requirement for a valid account compounds with the limited impact class: this is disruptive, but it is still a post-access service exhaustion bug in a specific application component.

Why not lower?

This is not a harmless nuisance. The vulnerable endpoint is remotely reachable once logged in, the exploit mechanics are simple, and the patch itself confirms the vendor had to add multiple guardrails—sanitization, rate limiting, and pagination—to close the hole. If new-api is your central AI gateway, an authenticated user can still knock over something operationally important.

05 · Compensating Control

What to do — in priority order.

Throttle token search aggressively — Apply per-user and per-IP rate limits specifically on /api/token/search to reduce concurrency-based resource exhaustion. For a MEDIUM verdict, deploy this within 30 days if you cannot patch immediately; faster if the service is internet-facing or multi-tenant.
Block abusive wildcard patterns — At the reverse proxy or WAF, deny requests where keyword or token contains repeated %, %%, or obviously wildcard-dense patterns. This is a practical stopgap because the upstream fix explicitly sanitizes these cases; deploy within 30 days if patching is delayed.
Restrict exposure — Move new-api behind SSO, VPN, corporate IP allowlists, or internal ingress only. Since the exploit requires authentication, shrinking who can authenticate cuts the reachable attacker pool; for a MEDIUM finding, do this during the normal hardening cycle within 30 days if externally exposed.
Watch slow queries and endpoint spikes — Alert on elevated latency, CPU, memory, and request volume tied to /api/token/search, plus slow-query logs in SQLite/MySQL/PostgreSQL. This won't prevent exploitation, but it shortens time-to-detect while you move to the patched build.
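
The per-user throttling idea can be sketched as a sliding-window limiter. This is an illustration of the technique only, not new-api's rate limiter: the class name, the limits, and the in-process storage are all assumptions, and a real deployment would more likely enforce this at the reverse proxy or in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per user within any window_seconds span."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 10.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # user id -> timestamps of allowed requests

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Demo with a fake clock so the behavior is deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0,
                               clock=lambda: fake_now[0])
print([limiter.allow("alice") for _ in range(3)])  # third request is rejected
fake_now[0] = 10.0
print(limiter.allow("alice"))  # window has passed, so the request is allowed
```

Keying the limiter on the authenticated user rather than the source IP matters here, because the attack in this report is driven from a single valid account.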

What doesn't work

Generic perimeter vuln scanning doesn't help much because the vulnerable path sits behind authentication and abuse depends on application behavior, not a banner check.
EDR on the host won't reliably stop this by itself; the attack is valid application traffic causing expensive queries, not malware execution.
Database backups do nothing for the exploit path because the issue is live resource exhaustion, not data loss.

06 · Verification

Crowdsourced verification payload.

Run this on the target host or Docker node that actually runs new-api. Invoke it as python3 verify_cve_2026_25591.py --container new-api or python3 verify_cve_2026_25591.py --version v0.10.8-alpha.9; it needs only local file read access unless you use --container, in which case it needs permission to talk to the Docker socket.

noisgate-verify.py

PYTHON · READ-ONLY · SAFE

#!/usr/bin/env python3
# verify_cve_2026_25591.py
# Exit codes:
#   0 = PATCHED
#   1 = VULNERABLE
#   2 = UNKNOWN / usage error

import argparse
import os
import re
import subprocess
import sys
from typing import Optional, Tuple

FIXED = "v0.10.8-alpha.10"
CANDIDATE_VERSION_FILES = [
    "./VERSION",
    "/app/VERSION",
    "/opt/new-api/VERSION",
    "/usr/local/share/new-api/VERSION",
]

SEMVER_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-alpha\.(\d+))?$")


def parse_version(v: str) -> Optional[Tuple[int, int, int, int, int]]:
    v = v.strip()
    m = SEMVER_RE.match(v)
    if not m:
        return None
    major, minor, patch, alpha = m.groups()
    # Stable release sorts after alpha releases of same base version.
    is_stable = 1 if alpha is None else 0
    alpha_num = int(alpha) if alpha is not None else 999999
    return (int(major), int(minor), int(patch), is_stable, alpha_num)


def cmp_versions(a: str, b: str) -> Optional[int]:
    pa = parse_version(a)
    pb = parse_version(b)
    if pa is None or pb is None:
        return None
    if pa < pb:
        return -1
    if pa > pb:
        return 1
    return 0


def docker_inspect_version(container: str) -> Optional[str]:
    # Try image tag first.
    try:
        cmd = ["docker", "inspect", "--format", "{{.Config.Image}}", container]
        out = subprocess.check_output(cmd, stderr=subprocess.DEVNULL, text=True).strip()
        # Example: calciumion/new-api:v0.10.8-alpha.10
        m = re.search(r":(v?\d+\.\d+\.\d+(?:-alpha\.\d+)?)$", out)
        if m:
            return m.group(1)
    except Exception:
        pass

    # Fallback: read VERSION inside the container.
    for path in ["/app/VERSION", "/VERSION"]:
        try:
            cmd = ["docker", "exec", container, "sh", "-c", f"test -r {path} && cat {path}"]
            out = subprocess.check_output(cmd, stderr=subprocess.DEVNULL, text=True).strip()
            if out:
                return out.splitlines()[0].strip()
        except Exception:
            continue
    return None


def local_file_version() -> Optional[str]:
    for path in CANDIDATE_VERSION_FILES:
        if os.path.isfile(path):
            try:
                with open(path, "r", encoding="utf-8") as f:
                    line = f.readline().strip()
                    if line:
                        return line
            except Exception:
                continue
    return None


def judge(version: str) -> Tuple[str, int]:
    comparison = cmp_versions(version, FIXED)
    if comparison is None:
        return (f"UNKNOWN - could not parse version '{version}'", 2)
    if comparison < 0:
        return (f"VULNERABLE - detected version {version} is older than fixed {FIXED}", 1)
    return (f"PATCHED - detected version {version} is at or newer than fixed {FIXED}", 0)


def main() -> int:
    parser = argparse.ArgumentParser(description="Verify exposure to CVE-2026-25591 in QuantumNous new-api")
    parser.add_argument("--version", help="Explicit new-api version, e.g. v0.10.8-alpha.9")
    parser.add_argument("--container", help="Docker container name or ID running new-api")
    args = parser.parse_args()

    detected = None
    source = None

    if args.version:
        detected = args.version.strip()
        source = "argument"
    elif args.container:
        detected = docker_inspect_version(args.container)
        source = f"docker container {args.container}"
    else:
        detected = local_file_version()
        source = "local VERSION file"

    if not detected:
        print("UNKNOWN - unable to determine installed new-api version from provided inputs or common local files")
        return 2

    result, rc = judge(detected)
    print(f"{result} (source: {source})")
    return rc


if __name__ == "__main__":
    sys.exit(main())

07 · Bottom Line

If you remember one thing.

TL;DR

Monday morning: identify every new-api deployment, confirm whether it runs v0.10.8-alpha.9 or earlier, and check whether any instance is internet-facing or shared across many users. Because this lands at MEDIUM, there is no noisgate mitigation SLA; the noisgate remediation SLA is 365 days to move to v0.10.8-alpha.10 or later. If an instance is externally exposed or business-critical, put reverse-proxy throttling and wildcard blocking in place during normal change control anyway, and do not sit on the upgrade that long: fold it into the next routine application patch cycle after validating the fix.
