arXiv PinTrace — LLMs Systemically Pin Vulnerable Dependency Versions
AI relevance: AI coding agents that generate requirements.txt or pyproject.toml files routinely pin library versions carrying known critical CVEs — the bias is baked into training data, meaning switching to a "better" model does not fix the problem.
- Wang et al. evaluated 10 LLMs on PinTrace, a 1,000-task Python benchmark drawn from Stack Overflow, instrumenting every generated dependency specification against the National Vulnerability Database (arXiv:2605.06279).
- 36.70%–55.70% of generated tasks include at least one library at a version with a known CVE.
- All ten models converge on the same small set of risky releases — the failure is systemic, caused by co-occurrence bias in the training corpus, not per-model behavior.
- Of pinned versions that carry CVEs, 62.75%–74.51% are rated Critical or High severity.
- For 72.27%–91.37% of vulnerable versions, the CVE had been disclosed before the model's training cutoff: the models had access to the information but selected vulnerable versions anyway due to corpus frequency bias.
- Manifest files (requirements.txt, pyproject.toml) receive version specifications least often (6.45%–59.19% rate), yet this is the surface that controls reproducible installs.
- Static install success rates for model-pinned versions range from 19.70% to 63.20%, and functional test pass rates are lower still, at 6.49%–48.62%. Vulnerable versions that do install introduce CVEs silently, whereas broken versions at least fail visibly and force a fix.
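To make the task-level figure above concrete (the share of tasks with at least one vulnerable pin), here is a minimal sketch. It is not the paper's pipeline: the advisory table and task data are invented placeholders, and the names `KNOWN_VULNERABLE`, `task_is_vulnerable`, and `vulnerable_task_rate` are hypothetical. A real check would match against the NVD rather than a hand-written set.

```python
# Hypothetical advisory table: (package, version) pairs assumed to carry a CVE.
# These entries are placeholders for illustration, not real advisories.
KNOWN_VULNERABLE = {("examplelib", "1.0.0"), ("otherlib", "2.3.1")}


def task_is_vulnerable(pins):
    """Return True if any (package, version) pin is in the advisory table."""
    return any((name.lower(), version) in KNOWN_VULNERABLE for name, version in pins)


def vulnerable_task_rate(tasks):
    """Fraction of tasks with at least one vulnerable pin (0.0 for no tasks)."""
    if not tasks:
        return 0.0
    flagged = sum(1 for pins in tasks if task_is_vulnerable(pins))
    return flagged / len(tasks)


# Each task is the list of pins one model emitted for one prompt (made-up data).
tasks = [
    [("examplelib", "1.0.0"), ("safelib", "4.2.0")],
    [("safelib", "4.2.0")],
    [("OtherLib", "2.3.1")],
    [],
]
rate = vulnerable_task_rate(tasks)
print(f"{rate:.2%}")  # prints 50.00%
```

A task counts once no matter how many of its pins are flagged, which is why the rate is reported per task rather than per pin.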
Why it matters
AI coding agents are increasingly used to scaffold projects and generate dependency manifests. When 37–56% of generated tasks pin at least one CVE-carrying version, and switching models does not fix it, organizations need infrastructure-level guards rather than better prompts. The CVE feed has no signal path into the model's co-occurrence prior.
What to do
- Run pip-audit, npm audit, or Dependabot security updates as a blocking CI gate on any manifest generated by an AI agent.
- Use an internal package mirror (Artifactory, Nexus) that blocks known-vulnerable versions at install time.
- Pair agent output with automated version bumping via Renovate or Dependabot so safe-at-merge versions stay current.
- Lock then resolve: pipe agent-generated requirements.txt through pip-compile, uv lock, or poetry lock in a clean environment before merging.
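As a rough illustration of what a blocking gate checks, the sketch below scans requirements-style text for entries that are not exact pins or that appear on a deny-list. It assumes a hard-coded, hypothetical `DENYLIST` and a simplified pin pattern. It is not a substitute for pip-audit or a real advisory feed, and `check_manifest` is an illustrative name, not part of any tool.

```python
import re

# Hypothetical deny-list; in practice this would come from an advisory feed
# or a scanner such as pip-audit rather than being hard-coded.
DENYLIST = {("examplelib", "1.0.0")}

# Simplified pattern for exact pins: "name==version" with optional extras.
# Assumption: names are only lowercased, not fully normalized.
PIN_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)")


def check_manifest(text):
    """Return a list of human-readable problems found in requirements text."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "-")):
            continue  # skip blanks, comments, and pip options such as -r / -e
        match = PIN_RE.match(stripped)
        if match is None:
            problems.append(f"line {lineno}: not an exact pin: {stripped}")
            continue
        name, version = match.group(1).lower(), match.group(2)
        if (name, version) in DENYLIST:
            problems.append(f"line {lineno}: deny-listed pin {name}=={version}")
    return problems


# Made-up manifest standing in for agent output.
sample = """\
# generated by an agent
examplelib==1.0.0
safelib==4.2.0  # fine
otherlib>=2.0
"""
problems = check_manifest(sample)
for problem in problems:
    print(problem)
```

A CI job could fail the build whenever the returned list is non-empty. Flagging unpinned entries alongside deny-listed ones reflects the point that the manifest is the surface controlling reproducible installs.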