CVE-2026-7482 · Ollama before 0.17.1 contains a heap out-of-bounds read vulnerabili…

01 · The Real Story

This is a memory peephole that only matters when you leave the side door to model management unlocked

CVE-2026-7482 is a heap out-of-bounds read in Ollama’s GGUF loader affecting versions before 0.17.1. A crafted GGUF can declare tensor offsets and sizes that exceed the real file length; when Ollama processes that file during model creation and quantization, it reads past the intended heap buffer and can pull adjacent process memory into the generated model artifact. That memory can include environment variables, API keys, system prompts, and other users’ in-flight chat data.

The vendor’s 9.1 CRITICAL score is technically defensible for an exposed instance, but too broad for enterprise prioritization. The decisive real-world friction is exposure shape: Ollama binds to 127.0.0.1 by default, and the vulnerable path is the model-management flow (/api/create, often followed by /api/push), not the ordinary inference path. That keeps this out of universal-CRITICAL territory, but any internet-reachable or weakly proxied deployment is still a serious no-auth remote secret leak.

"Bad bug, but not every Ollama host is reachable; this is a HIGH exposure-shaped leak, not a universal CRITICAL."

02 · The Attack Path

4 steps from start to impact.

STEP 01

Find an exposed Ollama API with curl or internet scan data

An attacker first identifies Ollama instances by probing the default API base URL or by using internet-facing exposure datasets. The product serves on http://localhost:11434/api by default, but operators commonly rebind with OLLAMA_HOST=0.0.0.0:11434 or front it with a reverse proxy, which turns a local service into a remote target.

Conditions required:

Attacker has network reach to the Ollama listener or proxy
The deployment is not strictly localhost-bound
Firewall or reverse proxy permits access to the API

Where this breaks in practice:

Default installs bind locally, which kills unauthenticated remote exploitation outright
Many enterprises expose chat endpoints but not model-management endpoints
Network ACLs, VPN-only access, or private VPC placement sharply reduce reachable population

Detection/coverage: External attack-surface tools can usually identify exposed Ollama services; Censys already documents dedicated scanning for Ollama exposure.
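The localhost-versus-rebound distinction above can be checked mechanically during inventory. The following is a minimal sketch, assuming the configured value looks like `host`, `host:port`, or `scheme://host:port`; the function name `classify_bind` and the labels `loopback` and `exposed` are hypothetical, and anything the function does not recognize is conservatively reported as exposed.

```shell
#!/usr/bin/env bash
# Classify an OLLAMA_HOST-style value as loopback-only or network-reachable.
# Assumption: values look like host, host:port, or scheme://host:port.
# classify_bind and its output labels are illustrative, not part of Ollama.
classify_bind() {
  local value="${1:-127.0.0.1:11434}"   # unset/empty: assume the local default
  local host
  value="${value#*://}"                 # drop an optional scheme
  value="${value%%/*}"                  # drop any trailing path
  case "$value" in
    \[*\]*) host="${value#\[}"; host="${host%%\]*}" ;;  # [ipv6]:port
    *:*:*)  host="$value" ;;                            # bare ipv6 without port
    *)      host="${value%%:*}" ;;                      # host or host:port
  esac
  case "$host" in
    127.*|localhost|::1) echo "loopback" ;;
    *)                   echo "exposed" ;;   # conservative default
  esac
}
```

For example, `classify_bind "$OLLAMA_HOST"` on a host configured with `0.0.0.0:11434` prints `exposed`. A loopback result says nothing about a reverse proxy or port forward sitting in front of the service, so it narrows the list rather than clearing a host.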

STEP 02

Reach the vulnerable model-creation path with curl and a crafted GGUF

The exploit path targets model import and creation, not ordinary prompt submission. Public research and PoC tooling show a malicious GGUF can be uploaded and processed through the creation workflow so the loader trusts attacker-controlled tensor metadata and reads beyond the valid backing buffer.

Conditions required:

The attacker can hit /api/create or equivalent model-import path
The instance runs a vulnerable version before 0.17.1
The service accepts attacker-supplied GGUF content

Where this breaks in practice:

Some deployments never expose model import to untrusted users
Reverse proxies may publish /api/generate but block admin-like endpoints
Operators may disable custom model workflows operationally even if the binary is vulnerable

Detection/coverage: Version scanners will flag <0.17.1, but most commodity scanners will not prove exploitability because they do not safely exercise GGUF parsing.

STEP 03

Trigger the over-read during quantization using gguf_cve2026_7482_poc.py-style input

The attacker submits a GGUF whose declared tensor offset and size exceed the file’s actual contents. During quantization and tensor handling, Ollama reads past the allocated heap buffer and mixes unintended process memory into the generated model output. This is a disclosure primitive first, not a clean RCE primitive.

Conditions required:

Malformed tensor metadata is accepted by the vulnerable loader
The target executes the quantization path
The process has sensitive material resident in memory

Where this breaks in practice:

Heap contents are opportunistic, so exact loot quality varies by timing and workload
The result is data disclosure, not reliable code execution from this CVE alone
Some attempts may crash or fail noisily rather than yield clean secrets

Detection/coverage: EDR may only see normal Ollama child activity; application logs around model creation are more useful than endpoint exploit signatures here.
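The root cause in this step is a missing bounds check: a declared tensor range is trusted even when it runs past the end of the file. The sketch below illustrates that check in isolation. It is not Ollama's actual fix; the function name `tensor_in_bounds` and its argument order are hypothetical, and it assumes the tensor offset is relative to the start of the tensor data section.

```shell
#!/usr/bin/env bash
# tensor_in_bounds DATA_START OFFSET SIZE FILE_LEN
# Returns 0 when [DATA_START+OFFSET, DATA_START+OFFSET+SIZE) fits in the file,
# 1 when the declared range runs past the end, 2 when an input cannot be
# evaluated safely. Hypothetical helper for illustration only.
tensor_in_bounds() {
  local v
  for v in "$1" "$2" "$3" "$4"; do
    case "$v" in
      ''|*[!0-9]*) return 2 ;;        # not a plain non-negative integer
    esac
    # Bash arithmetic is signed 64-bit; refuse values that could wrap.
    [ "${#v}" -le 18 ] || return 2
  done
  # 10# forces base 10 so leading zeros are not read as octal.
  local data_start=$((10#$1)) offset=$((10#$2)) size=$((10#$3)) file_len=$((10#$4))
  (( data_start <= file_len )) || return 1
  local room=$(( file_len - data_start ))
  (( offset <= room )) || return 1
  # Compare by subtraction so offset + size is never computed directly.
  (( size <= room - offset ))
}
```

A caller should treat any non-zero status as a rejection. With an 8192-byte file and a data section starting at byte 1024, `tensor_in_bounds 1024 0 4096 8192` succeeds, while a declared size of 8192 fails because only 7168 bytes remain.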

STEP 04

Exfiltrate leaked memory through model publication with /api/push

The CVE record explicitly notes that leaked data can be exfiltrated by publishing the resulting model artifact to an attacker-controlled registry via /api/push. That makes the bug much more operationally dangerous than a local-only memory disclosure because the same exposed control plane can both read and ship the loot.

Conditions required:

Outbound network egress permits registry access
The deployment exposes or permits /api/push semantics
The attacker can name a reachable registry destination

Where this breaks in practice:

Egress filtering can break the clean one-shot exfil chain
Some organizations do not allow outbound registry access from inference nodes
Registry auth and proxy controls may add noise or fail closed

Detection/coverage: Watch for unexpected model push activity, new outbound registry destinations, and anomalous create/push sequences from the same source.
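The create-then-push sequence described above can be hunted in reverse-proxy access logs. This is a minimal sketch, assuming combined-log-format lines where the first field is the client IP and the seventh is the request path; the function name `flag_create_then_push` is hypothetical, and field positions will differ for other log formats.

```shell
#!/usr/bin/env bash
# Read access-log lines on stdin and print each client IP that requested
# /api/create and later /api/push. Assumes combined log format:
#   IP - - [time zone] "METHOD PATH PROTO" status bytes ...
# so $1 is the client and $7 is the request path.
flag_create_then_push() {
  awk '
    $7 ~ /^\/api\/create/ { created[$1] = 1 }
    $7 ~ /^\/api\/push/ && created[$1] && !seen[$1]++ { print $1 }
  '
}
```

It can be fed a log directly, for example `flag_create_then_push < access.log`, where `access.log` is a placeholder path. A hit is a lead for review rather than proof of exploitation, since legitimate operators also create and push models.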

03 · Intelligence Metadata

The supporting signals.

In-the-wild status	No authoritative evidence of active exploitation found in reviewed primary sources as of 2026-05-31. Not in CISA KEV.
Proof-of-concept availability	Public PoC and detector material exists, including `msuiche/gguf_cve2026_7482` on GitHub and third-party validation by RAXE Labs.
EPSS	`0.00034` from the supplied intel, which is an effectively near-floor exploit-likelihood signal; percentile was not authoritatively retrieved from FIRST during this review.
KEV status	Not listed in the CISA KEV catalog during source review on 2026-05-31.
CVSS vector	`CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:H` — vendor model assumes no-auth network reach and high confidentiality/availability impact once the vulnerable path is reachable.
Affected versions	All Ollama versions before `0.17.1`; NVD/OpenCVE show a semver range of `0` to `<0.17.1`.
Fixed versions	Upstream fixed in `0.17.1` via PR `#14406` / commit `88d57d0`. No distro backport data was found in reviewed sources.
Exposure data	Censys measured about 10.6K high-confidence exposed Ollama services after filtering likely honeypots, with over 25% on non-default ports. That supports real exposure, but not universal exposure.
Disclosure timeline	PR merged 2026-02-25; CVE published 2026-05-04; release `v0.17.1` shipped without a visible security callout in the release notes.
Researcher / reporting org	OpenCVE/RAXE attribute the CVE publication to Echo and credit Cyera Research Team (Dor Attias, Ofek Itach).

04 · The Call

noisgate verdict.

Final Verdict

↓ DOWNGRADED to HIGH (8.2/10)

The single biggest downgrade factor is that remote reachability is configuration-shaped, not default: localhost-bound Ollama is not remotely exploitable. I kept it at HIGH because any exposed instance gives an unauthenticated attacker a practical path to leak secrets and conversation data, and public PoC material lowers attacker effort substantially.

HIGH Affected version and patch mapping

HIGH Exposure-shape downgrade versus vendor baseline

MEDIUM Real-world exploit prevalence assessment

Why this verdict

Downgrade for exposure shape: vendor scoring assumes network reach, but Ollama serves on 127.0.0.1 by default and only becomes remotely reachable when admins deliberately publish it via OLLAMA_HOST, port forwarding, or a reverse proxy.
Downgrade for path specificity: the attacker needs access to the model-management workflow, especially /api/create, not just any chat/inference endpoint. In real deployments that sharply narrows the exploitable population.
Held at HIGH because blast radius is real when exposed: the Ollama API itself requires no authentication, the leak can include secrets and user data, and /api/push gives a clean exfil route. Public PoC material means this is not theoretical.

Why not higher?

I did not keep the vendor’s CRITICAL because this is not a wormable, universal edge-service bug across every host that has Ollama installed. The remote condition depends on operator exposure decisions, and many enterprise deployments either stay localhost-bound or never expose model creation to untrusted users. There is also no authoritative in-the-wild exploitation signal or KEV listing during this review.

Why not lower?

I did not drop this to MEDIUM because exposed instances are one unauthenticated HTTP workflow away from leaking high-value process memory. The combination of public internet exposure, an API with no built-in authentication, and public PoC availability makes this too actionable and too damaging for a lower bucket.

05 · Compensating Control

What to do — in priority order.

Bind Ollama to localhost — Force OLLAMA_HOST=127.0.0.1:11434 or equivalent and remove direct external listeners. This is the strongest exposure reduction and should be deployed within 30 days for a HIGH verdict, with same-day priority for anything currently internet-facing.
Put auth in front of model-management routes — Require authentication and authorization at the reverse proxy for /api/create, /api/push, and related import/publish paths. If business needs force network access, do this within 30 days so the vulnerable workflow is no longer anonymously reachable.
Restrict egress from Ollama hosts — Allow only approved registries and package destinations from inference nodes. This breaks the clean /api/push exfil chain and should be enforced within 30 days on systems that handle sensitive prompts or credentials.
Hunt for model creation and push anomalies — Review Ollama logs, reverse-proxy logs, and outbound connections for unexpected create/import/push sequences, especially from unfamiliar IPs. Do this immediately on exposed nodes, then operationalize alerting within 30 days.

What doesn't work

Changing to a non-default port does not materially help; Censys specifically found a long tail of exposed Ollama instances on non-standard ports.
A WAF that only protects chat endpoints is insufficient if /api/create or /api/push remain reachable behind the same proxy.
Traditional AV signatures do not reliably detect malformed GGUF metadata abuse because this is an application-layer parsing flaw, not a commodity malware dropper.

06 · Verification

Crowdsourced verification payload.

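Run this on the target Ollama host or inside the Ollama container/VM. Saved as check-cve-2026-7482.sh, invoke it as bash check-cve-2026-7482.sh http://127.0.0.1:11434 or just bash check-cve-2026-7482.sh, which defaults to that URL. The script reads the version from the ollama CLI first and falls back to the /api/version endpoint. No root is required for version/API checks, though root may help if you separately inspect service config.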

noisgate-verify.sh

BASH · READ-ONLY · SAFE

#!/usr/bin/env bash
# check-cve-2026-7482.sh
# Determine whether Ollama is vulnerable to CVE-2026-7482 based on installed/runtime version.
# Exit codes: 0=PATCHED, 1=VULNERABLE, 2=UNKNOWN

set -u

TARGET_URL="${1:-http://127.0.0.1:11434}"
API_URL="${TARGET_URL%/}/api/version"
REQUIRED="0.17.1"

have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

normalize_ver() {
  # Strip leading v and any non-semver suffix.
  echo "$1" | sed -E 's/^v//; s/[^0-9.].*$//'
}

ver_ge() {
  # returns 0 if $1 >= $2
  [ "$1" = "$2" ] && return 0
  local first
  first=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)
  [ "$first" = "$2" ]
}

extract_version_text() {
  # Accepts strings like:
  # "ollama version is 0.17.0"
  # "ollama version 0.17.0"
  # '{"version":"0.17.0"}'
  echo "$1" | grep -Eo '([0-9]+\.[0-9]+\.[0-9]+)' | head -n1
}

VERSION=""
SOURCE=""

if have_cmd ollama; then
  CLI_OUT=$(ollama --version 2>/dev/null || true)
  CLI_VER=$(extract_version_text "$CLI_OUT")
  if [ -n "$CLI_VER" ]; then
    VERSION="$CLI_VER"
    SOURCE="cli"
  fi
fi

if [ -z "$VERSION" ] && have_cmd curl; then
  API_OUT=$(curl -fsS --max-time 3 "$API_URL" 2>/dev/null || true)
  API_VER=$(extract_version_text "$API_OUT")
  if [ -n "$API_VER" ]; then
    VERSION="$API_VER"
    SOURCE="api"
  fi
fi

if [ -z "$VERSION" ]; then
  echo "UNKNOWN - could not determine Ollama version from CLI or $API_URL"
  exit 2
fi

VERSION=$(normalize_ver "$VERSION")
REQUIRED=$(normalize_ver "$REQUIRED")

if ver_ge "$VERSION" "$REQUIRED"; then
  echo "PATCHED - Ollama version $VERSION detected via $SOURCE (fixed in $REQUIRED+)"
  exit 0
else
  echo "VULNERABLE - Ollama version $VERSION detected via $SOURCE (requires $REQUIRED+)"
  exit 1
fi

07 · Bottom Line

If you remember one thing.

TL;DR

Monday morning, pull a fleet-wide inventory of every Ollama node, then split the list into nodes reachable from the internet or an untrusted network versus localhost/private-only. For any exposed or reverse-proxied instance, remove anonymous access to /api/create and /api/push or rebind to localhost as fast as operationally possible; under the noisgate mitigation SLA, this is within 30 days for a HIGH verdict, but exposed edge cases should be handled in the current sprint. Then upgrade every instance running <0.17.1; under the noisgate remediation SLA, complete that patch rollout within 180 days.
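The inventory split above can be turned into a simple triage pass. This is a minimal sketch, assuming an inventory with one `host version exposure` triple per line where exposure is either `exposed` or `local`; the function names and the labels `OK`, `MITIGATE-NOW`, and `PATCH` are hypothetical and not noisgate terminology. It relies on `sort -V`, as the verification script does.

```shell
#!/usr/bin/env bash
# triage_host VERSION EXPOSURE
# Prints OK for 0.17.1 or later, MITIGATE-NOW for a vulnerable exposed host,
# and PATCH for a vulnerable host that is not exposed.
triage_host() {
  local version="${1#v}" exposure="$2" fixed="0.17.1" lowest
  # The smaller of the two versions sorts first under sort -V.
  lowest=$(printf '%s\n%s\n' "$version" "$fixed" | sort -V | head -n1)
  if [ "$lowest" = "$fixed" ]; then
    echo "OK"
  elif [ "$exposure" = "exposed" ]; then
    echo "MITIGATE-NOW"
  else
    echo "PATCH"
  fi
}

# Read "host version exposure" lines on stdin and print "host<TAB>label".
triage_inventory() {
  local host version exposure
  while read -r host version exposure; do
    [ -n "$host" ] || continue          # skip blank lines
    printf '%s\t%s\n' "$host" "$(triage_host "$version" "$exposure")"
  done
}
```

Running `triage_inventory < inventory.txt`, with `inventory.txt` as a placeholder file, yields one labeled line per host, so the exposed and vulnerable nodes can be pulled to the top of the queue.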
