Making security understandable

March 2026

Your AI Proxy Is the Highest-Value Target on Your Network

Updated 29 March 2026 with attack chain details, FutureSearch incident timeline, and practical mitigations.

On 24 March 2026, LiteLLM was found to contain a credential-stealing backdoor in its PyPI distribution. LiteLLM is the most popular open-source proxy for routing AI traffic. Versions 1.82.7 and 1.82.8 shipped with a multi-stage payload that harvested every API key, environment variable, and cloud credential it could find, encrypted them with an embedded RSA public key, and sent them to an attacker-controlled server. It also self-replicated.

LiteLLM has over 20,000 GitHub stars. Thousands of organisations use it to route requests to OpenAI, Anthropic, Azure, AWS Bedrock, and dozens of other providers. It holds SOC 2 Type II and ISO 27001 certifications. None of that prevented this.

PyPI quarantined the affected versions shortly after the report. If you installed them before the quarantine, the damage may already be done.

The compromise was discovered not by a security researcher, but by an ML engineer at FutureSearch whose development environment was infected when Cursor IDE auto-pulled the poisoned package as a transitive dependency. The malware had a bug: a .pth persistence hook that triggered on every Python process startup, causing an exponential fork bomb of roughly 11,000 processes. That accidental self-destruction is what revealed the attack. Without it, discovery would have taken far longer. The full minute-by-minute account is worth reading. Just over an hour from first symptom to public disclosure.

Next time, the attacker won't make that mistake.

Why AI Proxies Are Perfect Targets

An AI proxy sits between your applications and every LLM provider you use. To do its job, it needs credentials for all of them. API keys for OpenAI. Service account tokens for Google Vertex. AWS access keys for Bedrock. Azure credentials for OpenAI deployments. It's a credential aggregator by design.

And it doesn't stop at LLM keys. The proxy process runs in an environment full of other secrets. Database connection strings, cloud IAM credentials, service mesh tokens, environment variables that were set for a different service entirely but happen to be visible to the same process. None of this requires misconfiguration. That's just how processes work.

An attacker who compromises the proxy doesn't need to move laterally. They're already at the crossroads.

The Supply Chain Cascade

This isn't happening in isolation. LiteLLM's maintainer confirmed that the compromise originated from Trivy, the vulnerability scanner, which was used in LiteLLM's CI/CD pipeline. The attacker group TeamPCP compromised Trivy first (force-pushing 75 of 76 version tags in March 2026), used that foothold to get into LiteLLM's build infrastructure, and from there pushed malicious packages to PyPI.

Read that chain again: a vulnerability scanner was compromised, which was used to compromise an AI proxy, which harvested credentials from thousands of environments. Those credentials will be used to compromise the next target.

Each compromised tool yields credentials that unlock the next. The attack surface isn't a single package. It's the trust chain between all of them.

AI infrastructure is particularly exposed for three reasons. First, teams adopt AI tooling fast, and the gap between "we started using this" and "we assessed the risk of using this" is measured in months. LiteLLM went from side project to critical infrastructure in many organisations without ever passing through a security review. Second, AI proxies and gateways handle more secrets per process than almost any other component in a modern stack, because they're designed to. Third, you might not even know you're running LiteLLM. The FutureSearch engineer didn't install it directly. It was pulled in as a transitive dependency of an MCP server package, which was pulled in by Cursor. LiteLLM's Docker proxy users were unaffected because their versions were pinned in requirements.txt. Version pinning works, when you do it.

Beyond Credential Theft

Credential exfiltration is the obvious play, but a compromised proxy can do something worse: manipulate the model responses passing through it.

Picture a coding assistant routed through a compromised proxy. The proxy rewrites responses in transit. Not obviously wrong, just slightly altered. A backdoor slipped into a code suggestion. A security check quietly dropped. A configuration value nudged. The developer sees what looks like a normal AI response. It came from "their" model. Why would they question it?

This is architecturally trivial once you control the proxy, and nearly impossible to detect because the trust relationship is between the developer and the model. The proxy is invisible.

What To Do About It

If you're running LiteLLM, check your version right now:

pip show litellm

If you're on 1.82.7 or 1.82.8, assume every credential accessible to that process has been compromised. Rotate them now. Don't wait to confirm exfiltration. Check your DNS logs for connections to models.litellm.cloud (the C2 domain). Look for unexpected .pth files in your Python site-packages. The persistence mechanism, litellm_init.pth, executes on every Python process startup, not just litellm imports. Check for persistence attempts at ~/.config/sysmon/sysmon.py.

Beyond the immediate response, this is a good week to ask some harder questions about your AI stack:

More broadly: AI infrastructure deserves the same security scrutiny as your database or your CI/CD pipeline. It aggregates credentials, processes sensitive data, and has broad network access. Traditional SBOMs capture code dependencies, but AI stacks add model providers, agent frameworks, proxy layers, and plugin ecosystems that most asset registers don't cover yet.

Why the Registries Can't Save You

Within hours of the compromise, the community conversation turned to an obvious question: why don't package registries just scan for this?

One commenter argued that it would be "trivial" to build AI-powered diff scanning, flag new base64 blobs, suspicious URLs, anomalous uploader geolocations, and hold suspicious releases in a 48-hour staging area before publishing. Cost: "a couple bucks worth of tokens."

Simon Willison pointed out that PyPI already does this, via an API used by scanning partners. That's likely why the package was quarantined within an hour of going live.

They're both right, and it still isn't enough.

Registry-side scanning is reactive. Even when PyPI catches a malicious release within an hour, that hour is the danger window. Automated CI/CD pipelines, dependency updates, pip install --upgrade commands don't wait for human review. The compromise happened in the gap between "published" and "quarantined."

The 48-hour staging proposal would help for major packages, but as the commenter acknowledged, it would only be cost-effective "for these huge projects." The long tail of smaller packages, the transitive dependencies that nobody audits, wouldn't get this treatment. Those are exactly the packages that supply chain attacks increasingly target.

One practical mitigation that emerged from the community response: minimum release age. Several package managers now support refusing to install packages published within a configurable window:

A 48-hour or 7-day minimum release age would have prevented this infection entirely. The malicious versions were quarantined within 30 minutes, well before any age threshold would have expired. It's not a complete solution, but it's a free, zero-effort defence against the most common supply chain attack pattern: publish malicious version, harvest credentials from early adopters, get caught, repeat.

Beyond release age controls, you need visibility into what enters your dependency tree. You can know, before your CI pipeline runs, whether a new release of a package you depend on contains suspicious patterns. The registry protects the ecosystem. You need to protect your environment.

The Pattern

LiteLLM wasn't compromised because it was poorly built. It was compromised because it was valuable. The proxy pattern (centralised credential access, broad network reach, trusted position in the data flow) is the same pattern that made Active Directory the number one target for a decade. AI infrastructure aggregates credentials by design. We're deploying it faster than we're securing it.

The JFrog Security team confirmed the malicious payload independently. The attack has been attributed to TeamPCP, the same group behind a pattern of escalating supply chain compromises including the Trivy tag compromise earlier in March. PyPI has quarantined the package, so new installs via pip should be blocked, but that doesn't help anyone who already has the affected version running.

At ThreatControl, we help organisations understand and control their supply chain exposure -including the AI infrastructure that most security reviews miss. Our Security Suite covers your full stack, and our AI Security Testing service assesses the specific risks of AI tooling in your environment. Get in touch.

← Back to blog