This is a trapdoor hidden in the PDF intake chute, not a fire in every server room
CVE-2017-9096 is an XXE bug in iText's XML handling: versions before 5.5.12, and 7.x versions before 7.0.3, can resolve external entities when parsing attacker-controlled XML embedded in a PDF, especially the XFA path called out by the iText 7 release notes. In plain English, if your application *ingests* untrusted PDFs and uses vulnerable iText code to parse that XML content, an attacker may coerce the server into reading local files or making outbound requests.
The vendor-style HIGH 8.8 score is technically defensible in a lab, but it overstates enterprise reality. iText is usually an embedded library, not a directly reachable service, and a large share of deployments use it for PDF generation only; the vulnerable population narrows further if the app must specifically parse XFA/XML content from attacker-supplied PDFs. That is why this lands as MEDIUM in practice: real impact can be serious, but reachability is much smaller than the CVSS headline suggests.
4 steps from start to impact.
Step 1: Craft a malicious XFA-bearing PDF with Burp Suite or a public XXE PoC
The payload references file:// or http(s):// content. Public reporting and PoC references show this is straightforward XXE tradecraft, not bespoke exploit development.
Requires:
- The attacker can submit a PDF to the target workflow
- The target workflow accepts PDFs from untrusted or weakly trusted sources
Limits:
- Many iText deployments only *generate* PDFs and never parse attacker-supplied files
- Some upload paths validate file type, strip forms, or reject XFA-heavy documents before iText ever sees them
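The last limiting factor above, rejecting XFA-bearing documents before iText sees them, can be approximated with a cheap pre-screen. What follows is a minimal sketch rather than a real PDF parser. The names `looks_like_xfa_pdf` and `screen_upload` are hypothetical, and the check assumes the `/XFA` key appears uncompressed and unescaped in the file. A PDF that keeps its form dictionary inside a compressed object stream, or writes the name with `#` hex escapes, will pass this check, so a negative result means "not obviously XFA" and nothing stronger.

```python
import re
from pathlib import Path

# Naive byte-level match for an /XFA dictionary key. The lookahead avoids
# matching longer names such as /XFAFoo. This is a heuristic, not a parser.
_XFA_KEY = re.compile(rb"/XFA(?![A-Za-z0-9])")


def looks_like_xfa_pdf(data: bytes) -> bool:
    """Return True if the raw bytes look like a PDF that declares XFA."""
    if b"%PDF-" not in data[:1024]:
        return False  # no PDF header near the start of the file
    return _XFA_KEY.search(data) is not None


def screen_upload(path: str) -> str:
    """Hypothetical intake hook: 'reject' XFA-looking PDFs, 'pass' the rest."""
    data = Path(path).read_bytes()
    return "reject" if looks_like_xfa_pdf(data) else "pass"
```

A screen like this belongs in front of the parsing worker, not in place of the upgrade; it only narrows what reaches the vulnerable code path.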
Step 2: Trigger the vulnerable parser path in PdfReader / XFA handling
iText's 7.0.3 notes explicitly tie the fix to PdfReader parsing XFA, which means generic library presence alone is not enough; the application has to hit the vulnerable feature path.
Requires:
- The application uses vulnerable iText versions
- The workflow invokes parsing/flattening/extraction on XFA or related XML content
Limits:
- A lot of enterprise code uses iText for server-side generation, stamping, or merging only
- Not every PDF-processing feature reaches XFA/XML parsing
Detection: PdfReader and XFA/form-processing paths. Runtime detection is weak unless the app logs parser exceptions or outbound fetches.
Step 3: Abuse XXE for local file read or SSRF using Interactsh/DNS callbacks
Requires:
- The parser allows external entity resolution
- The application can reach local files and/or make outbound network requests
Limits:
- Egress filtering, container isolation, read-only runtimes, and least-privilege service accounts reduce value
- Some XXE attempts succeed only as blind SSRF with no direct response body
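The XXE step depends on the parser honoring a DTD that declares external entities. As a defense-in-depth illustration, an ingestion service can refuse any extracted XML stream that carries a DOCTYPE or ENTITY declaration before a parser touches it. This is a sketch in Python under stated assumptions, not iText's actual fix: the function name `xml_declares_dtd` and the reject-on-any-DTD policy are hypothetical, and the check only handles UTF-8 and BOM-marked UTF-16 input.

```python
import re

# XML requires the literal, case-sensitive tokens "<!DOCTYPE" and "<!ENTITY".
_DTD_MARKERS = re.compile(r"<!(?:DOCTYPE|ENTITY)\b")


def _to_text(xml_bytes: bytes) -> str:
    """Decode BOM-marked UTF-16, otherwise fall back to UTF-8."""
    if xml_bytes[:2] in (b"\xff\xfe", b"\xfe\xff"):
        return xml_bytes.decode("utf-16", errors="replace")
    return xml_bytes.decode("utf-8", errors="replace")


def xml_declares_dtd(xml_bytes: bytes) -> bool:
    """Return True if the XML carries a DOCTYPE or ENTITY declaration."""
    return _DTD_MARKERS.search(_to_text(xml_bytes)) is not None


# Textbook-style XXE shape, used here only to exercise the check.
SAMPLE_XXE = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE x [<!ENTITY e SYSTEM "file:///etc/hostname">]>'
    "<x>&e;</x>"
)

if __name__ == "__main__":
    print(xml_declares_dtd(SAMPLE_XXE.encode("utf-8")))
```

Rejecting every DTD is deliberately blunt. Form XML rarely needs one, and the blunt rule avoids having to reason about which entity declarations are harmless.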
Step 4: Turn disclosed secrets into follow-on access
Requires:
- Sensitive material is readable from the application's execution context
- The attacker can use the disclosed data elsewhere
Limits:
- Secrets may be vaulted, rotated, or scoped too narrowly to matter
- Even successful SSRF/file read may expose low-value data only
The supporting signals.
| Signal | Detail |
|---|---|
| In-the-wild status | No authoritative exploitation evidence surfaced in reviewed sources, and CISA KEV does not list this CVE. That is meaningful downward pressure versus internet-wormable bugs. |
| Proof-of-concept availability | Public PoC references exist, including reporting around jakabakos/CVE-2017-9096-iText-XXE; this is weaponizable commodity XXE, not a theoretical parser edge case. |
| EPSS | 0.07637 (roughly 7.637%), which GitHub Advisory places around the 92nd percentile. That says attackers *could* use it, but it is far from the top of the pile. |
| KEV status | Not KEV-listed. No CISA due date, no public KEV-driven urgency signal. |
| CVSS vector reality check | CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H assumes network reachability and high CIA impact, but the real choke point is feature reachability: the attacker needs a PDF-ingestion flow that hits the vulnerable XML/XFA parser. |
| Affected versions | Authoritative sources show com.itextpdf:itextpdf < 5.5.12 and >= 7.0.0, < 7.0.3 are affected. GitHub also tracks legacy com.lowagie:itext <= 4.2.2 with no fix in that old line. |
| Fixed versions and distro posture | Upgrade targets are 5.5.12 and 7.0.3. Debian marks this issue NOT-FOR-US, reinforcing that this is typically an application dependency problem, not a distro-managed network service patch. |
| Reachable attack surface | There is no meaningful Shodan/Censys/GreyNoise census for this CVE because iText is an embedded library, not a fingerprintable internet service. Exposure depends on whether your apps accept untrusted PDFs and parse forms/XFA. |
| Disclosure timeline | Compass Security published the advisory on 2017-11-06; NVD lists publication on 2017-11-08. This is a mature, well-understood bug with long-standing fixes. |
| Reporting researcher / org | The original public advisory was released by Compass Security (CSNC-2017-017). iText's own 7.0.3 notes later tied the fix to PdfReader parsing XFA. |
noisgate verdict.
The decisive factor is reachability: this is an embedded-library flaw that only matters when a real application accepts attacker-supplied PDFs and drives them into the vulnerable XML/XFA parsing path. That sharply reduces the exposed population compared with the vendor's network-reachable HIGH baseline, even though the resulting file-read/SSRF primitive can still be dangerous where the workflow exists.
Why this verdict
- Baseline starts at vendor HIGH 8.8 because unauthenticated remote delivery of a crafted PDF is plausible where PDF upload or email-ingestion workflows exist.
- Downward adjustment: attacker must reach a PDF-ingestion path. This is not a standalone service flaw; it requires an application feature that accepts untrusted PDFs, which immediately shrinks the reachable population.
- Downward adjustment: not every iText deployment parses the dangerous content. iText's own 7.0.3 release notes tie the fix to PdfReader parsing XFA, so simple PDF generation, stamping, or merging workloads are often unaffected in practice.
- Downward adjustment: modern controls can break the chain. Egress filtering, sandboxed workers, read-only runtimes, and least-privilege service accounts often turn successful XXE into low-value noise instead of a major breach.
- No exploitation amplifier from CISA KEV or reviewed public campaign reporting. There is public PoC material, but not the operational signal that would justify keeping this in HIGH on reachability grounds alone.
Why not higher?
This is not internet-wormable and not broadly reachable just because a vulnerable JAR exists somewhere in the fleet. The exploit chain depends on a fairly specific business workflow: accepting untrusted PDFs, parsing the right XML/XFA path, and exposing enough file or network access for the XXE primitive to matter. That is too much real-world narrowing for HIGH.
Why not lower?
Where exposed PDF-processing workflows do exist, the attacker does not need prior authentication to land a malicious document, and XXE can still yield server-side file disclosure or SSRF. Those impacts are materially useful in modern cloud and app environments, so this is more than backlog trivia.
What to do — in priority order.
- Inventory inbound PDF workflows — Identify every application, batch job, mailroom service, and document pipeline that uses iText to *read* PDFs rather than just generate them. For a MEDIUM verdict there is no mitigation SLA; do this as part of normal risk triage and use the result to drive remediation within 365 days.
- Constrain egress from PDF workers — Block direct outbound DNS/HTTP(S) from PDF-processing services except approved destinations so blind XXE and SSRF attempts cannot pivot out. There is no mitigation SLA for MEDIUM, but this is the highest-value hardening move for any exposed ingestion path while you patch within 365 days.
- Run PDF parsing in a low-privilege sandbox — Move document processing into isolated workers with minimal filesystem access, no instance-metadata access, and tightly scoped service credentials. This contains the bug's blast radius if XXE is triggered and should be implemented opportunistically where those workflows already exist, with patching still completed inside the 365-day remediation window.
- Disable or strip XFA/form content where business permits — If your workflow does not need XFA, reject or normalize XFA-bearing PDFs before they hit iText. That directly attacks the code path iText associated with the fix and is a strong compensating control for exposed upload pipelines; for a MEDIUM issue there is no mitigation SLA, but use it to reduce risk until remediation is done within 365 days.
- Prioritize SCA over perimeter scanning — Use SBOM/SCA, repository scanning, and code search to find com.itextpdf:itextpdf, com.lowagie:itext, and old bundled JARs. This CVE is mostly invisible to network scanners, so dependency discovery is the practical way to close it inside the 365-day remediation window.
- A generic WAF does not reliably help because the exploit payload is often inside a PDF upload or mail attachment, not clean XML in an HTTP parameter.
- Version-only perimeter scans do not help much because iText is an embedded library with no native wire fingerprint.
- MFA is irrelevant to the core flaw; this is about server-side parsing of untrusted content, not account takeover.
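The dependency-discovery item in the priority list can be sketched against an SBOM. The example below assumes a CycloneDX JSON document whose `components` entries carry `group`, `name`, and `version` fields and may nest further `components`. The helper names `find_itext_components` and `load_sbom` are hypothetical. It only reports where the two Maven coordinates named above appear and does not judge whether a version is vulnerable.

```python
import json

# Maven coordinates named in the advisory text above.
WATCHED = {("com.itextpdf", "itextpdf"), ("com.lowagie", "itext")}


def find_itext_components(sbom: dict) -> list:
    """Return sorted (group, name, version) tuples for watched components."""
    hits = []
    stack = list(sbom.get("components", []))
    while stack:
        comp = stack.pop()
        # CycloneDX allows components to nest, so walk the children too.
        stack.extend(comp.get("components", []))
        key = (comp.get("group", ""), comp.get("name", ""))
        if key in WATCHED:
            hits.append((key[0], key[1], comp.get("version", "")))
    return sorted(hits)


def load_sbom(path: str) -> dict:
    """Read a CycloneDX JSON file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Feeding the resulting versions into the same range logic as the verification payload keeps the two checks consistent. Shaded or repackaged JARs will not appear under these coordinates, which is why the text also recommends searching for old bundled JARs.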
Crowdsourced verification payload.
Run this on an auditor workstation or CI runner with read access to application directories, artifact caches, build outputs, or golden images. Invoke it as python3 check_cve_2017_9096_itext.py /opt/apps /srv/jars or python check_cve_2017_9096_itext.py C:\Apps; no admin rights are required unless the paths are protected.
#!/usr/bin/env python3
# check_cve_2017_9096_itext.py
# Detect likely vulnerable iText artifacts for CVE-2017-9096.
# Exit codes: 0=PATCHED, 1=VULNERABLE, 2=UNKNOWN
import os
import re
import sys
import json
import zipfile
from pathlib import Path

# Findings are collected globally and summarized in main().
VULN_FOUND = []
PATCHED_FOUND = []
UNKNOWN_FOUND = []

JAR_NAME_RE = re.compile(r'(itextpdf|itext|itext7)[-_]?([0-9][0-9A-Za-z._-]*)?\.jar$', re.I)
DLL_NAME_RE = re.compile(r'(itextsharp|itext7)[._-]?([0-9][0-9A-Za-z._-]*)?\.dll$', re.I)
NUPKG_RE = re.compile(r'(itextsharp|itext7)[._-]?([0-9][0-9A-Za-z._-]*)?\.nupkg$', re.I)


def normalize(v):
    # Reduce a version string to its numeric components, e.g. '5.5.12-SNAPSHOT' -> [5, 5, 12].
    if not v:
        return []
    v = v.strip().lower()
    v = v.replace('+', '.')
    parts = re.split(r'[^0-9]+', v)
    nums = [int(p) for p in parts if p != '']
    return nums


def cmp_ver(a, b):
    # Compare two version strings numerically; returns -1, 0, or 1.
    aa = normalize(a)
    bb = normalize(b)
    maxlen = max(len(aa), len(bb))
    aa += [0] * (maxlen - len(aa))
    bb += [0] * (maxlen - len(bb))
    if aa < bb:
        return -1
    if aa > bb:
        return 1
    return 0


def classify_itext_version(version, package_hint=''):
    if not version:
        return 'UNKNOWN'
    hint = package_hint.lower()
    # Legacy GHSA note: com.lowagie:itext <= 4.2.2 has no fix
    if 'lowagie' in hint:
        if cmp_ver(version, '4.2.2') <= 0:
            return 'VULNERABLE'
        return 'UNKNOWN'
    # iText 7 range: 7.0.0 - 7.0.2 vulnerable, 7.0.3+ patched
    if cmp_ver(version, '7.0.0') >= 0:
        if cmp_ver(version, '7.0.3') < 0:
            return 'VULNERABLE'
        return 'PATCHED'
    # iText 5 and earlier: < 5.5.12 vulnerable, 5.5.12+ patched
    if cmp_ver(version, '5.5.12') < 0:
        return 'VULNERABLE'
    return 'PATCHED'


def record(state, path, version, detail):
    item = {'path': str(path), 'version': version or '', 'detail': detail}
    if state == 'VULNERABLE':
        VULN_FOUND.append(item)
    elif state == 'PATCHED':
        PATCHED_FOUND.append(item)
    else:
        UNKNOWN_FOUND.append(item)


def scan_jar(path):
    # Prefer Maven pom.properties, then the manifest, then the file name.
    version = None
    package_hint = ''
    try:
        with zipfile.ZipFile(path, 'r') as zf:
            for name in zf.namelist():
                low = name.lower()
                if low.endswith('pom.properties') and ('itext' in low or 'lowagie' in low):
                    data = zf.read(name).decode('utf-8', errors='ignore')
                    for line in data.splitlines():
                        if line.startswith('version='):
                            version = line.split('=', 1)[1].strip()
                        elif line.startswith('groupId='):
                            package_hint = line.split('=', 1)[1].strip()
                elif low.endswith('manifest.mf') and not version:
                    data = zf.read(name).decode('utf-8', errors='ignore')
                    for line in data.splitlines():
                        if line.lower().startswith('implementation-version:'):
                            version = line.split(':', 1)[1].strip()
        if not version:
            m = JAR_NAME_RE.search(path.name)
            if m and m.group(2):
                version = m.group(2)
        state = classify_itext_version(version, package_hint)
        record(state, path, version, f'jar package_hint={package_hint or "unknown"}')
    except Exception as e:
        record('UNKNOWN', path, version, f'jar read error: {e}')


def scan_deps_json(path):
    try:
        data = json.loads(path.read_text(encoding='utf-8', errors='ignore'))
    except Exception as e:
        record('UNKNOWN', path, None, f'deps.json read error: {e}')
        return
    libs = data.get('libraries', {})
    hit = False
    for key in libs.keys():
        low = key.lower()
        if low.startswith('itextsharp/') or low.startswith('itext7/'):
            hit = True
            name, version = key.split('/', 1)
            state = classify_itext_version(version, name)
            record(state, path, version, f'deps.json package={name}')
    if not hit:
        # no relevant package reference; stay silent
        pass


def scan_filename_only(path):
    m = DLL_NAME_RE.search(path.name) or NUPKG_RE.search(path.name)
    version = m.group(2) if m and m.group(2) else None
    state = classify_itext_version(version, path.stem)
    record(state, path, version, 'filename-based detection only')


def walk(root):
    for dirpath, _, filenames in os.walk(root):
        for fn in filenames:
            p = Path(dirpath) / fn
            low = fn.lower()
            if low.endswith('.jar') and 'itext' in low:
                scan_jar(p)
            elif low.endswith('.deps.json'):
                scan_deps_json(p)
            elif low.endswith('.dll') and ('itextsharp' in low or 'itext7' in low):
                scan_filename_only(p)
            elif low.endswith('.nupkg') and ('itextsharp' in low or 'itext7' in low):
                scan_filename_only(p)


def main():
    if len(sys.argv) < 2:
        print('UNKNOWN - usage: python3 check_cve_2017_9096_itext.py <path> [<path> ...]')
        sys.exit(2)
    for arg in sys.argv[1:]:
        if os.path.exists(arg):
            walk(arg)
        else:
            record('UNKNOWN', arg, None, 'path does not exist')
    # Any vulnerable finding decides the overall verdict.
    if VULN_FOUND:
        print('VULNERABLE')
        for item in VULN_FOUND:
            print(f"[VULN] {item['path']} version={item['version']} detail={item['detail']}")
        for item in PATCHED_FOUND:
            print(f"[PATCHED] {item['path']} version={item['version']} detail={item['detail']}")
        for item in UNKNOWN_FOUND:
            print(f"[UNKNOWN] {item['path']} version={item['version']} detail={item['detail']}")
        sys.exit(1)
    if PATCHED_FOUND and not UNKNOWN_FOUND:
        print('PATCHED')
        for item in PATCHED_FOUND:
            print(f"[PATCHED] {item['path']} version={item['version']} detail={item['detail']}")
        sys.exit(0)
    print('UNKNOWN')
    for item in PATCHED_FOUND:
        print(f"[PATCHED] {item['path']} version={item['version']} detail={item['detail']}")
    for item in UNKNOWN_FOUND:
        print(f"[UNKNOWN] {item['path']} version={item['version']} detail={item['detail']}")
    sys.exit(2)


if __name__ == '__main__':
    main()
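The exit-code contract stated in the script header (0=PATCHED, 1=VULNERABLE, 2=UNKNOWN) makes the payload easy to gate on in a pipeline. The wrapper below is a minimal sketch, not part of the payload: `run_check` and its `fail_on_unknown` switch are hypothetical names, and whether UNKNOWN should fail a build is a policy choice left to the caller.

```python
import subprocess
import sys

# Exit-code contract taken from the script header above.
VERDICTS = {0: "PATCHED", 1: "VULNERABLE", 2: "UNKNOWN"}


def run_check(script, paths, fail_on_unknown=False):
    """Run the checker script and return (verdict, should_fail_build)."""
    proc = subprocess.run(
        [sys.executable, script, *paths],
        capture_output=True,
        text=True,
    )
    # Any unexpected exit code is treated as UNKNOWN rather than as a pass.
    verdict = VERDICTS.get(proc.returncode, "UNKNOWN")
    should_fail = verdict == "VULNERABLE" or (fail_on_unknown and verdict == "UNKNOWN")
    return verdict, should_fail
```

A call such as `run_check("check_cve_2017_9096_itext.py", ["/opt/apps"])` returns the verdict string and a boolean the pipeline can act on.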
If you remember one thing.
Upgrade iText to 5.5.12 or 7.0.3 on a normal priority track. For this MEDIUM verdict there is no noisgate mitigation SLA, so go straight to the 365-day remediation window. If you discover internet-facing or email-fed PDF parsing that processes forms/XFA, apply temporary egress/sandbox controls immediately and still complete patching within that window.