Artificial intelligence has become part of everyday life. According to IDC, global spending on AI systems is projected to surpass $300 billion by 2026, showing how rapidly adoption is accelerating. AI is no longer a niche technology—it is shaping the way businesses, governments, and individuals operate.
Software developers are increasingly incorporating Large Language Model (LLM) functionality into their applications. Well-known LLMs such as OpenAI’s ChatGPT, Google’s Gemini, and Meta’s LLaMA are now embedded into business platforms and consumer tools. From customer support chatbots to productivity software, AI integration is driving efficiency, reducing costs, and keeping organizations competitive.
But with every new technology comes new risks. The more we rely on AI, the more appealing it becomes as a target for attackers. One threat in particular is gaining momentum: malicious AI models, files that look like helpful tools but conceal hidden dangers.
The Hidden Risk of Pretrained Models
Training an AI model from scratch can take weeks of compute time, powerful hardware, and massive datasets. To save time, developers often reuse pretrained models shared through platforms like PyPI, Hugging Face, or GitHub, usually in formats such as Pickle and PyTorch.
On the surface, this makes perfect sense. Why reinvent the wheel if a model already exists? But here’s the catch: not all models are safe. Some can be modified to hide malicious code. Instead of simply helping with speech recognition or image detection, they can quietly run harmful instructions the moment they are loaded.
Pickle files are especially risky. Unlike most data formats, Pickle can store not only information but also executable code. That means attackers can disguise malware inside a model that looks perfectly normal, delivering a hidden backdoor through what seems like a trusted AI component.
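The mechanism fits in a few lines of Python. The sketch below is purely illustrative (it runs a harmless echo command instead of real malware) and shows how Pickle's `__reduce__` hook lets an object tell the unpickler to call an arbitrary function the moment the file is loaded:

```python
import pickle

class Payload:
    """Any object can tell the unpickler to call a function on load
    by implementing __reduce__; attackers abuse this hook."""
    def __reduce__(self):
        import os
        # Benign stand-in for a real payload (reverse shell, downloader, etc.)
        return (os.system, ("echo code executed during unpickling",))

blob = pickle.dumps(Payload())

# Simply loading the bytes runs the command; no method on the object is ever called.
pickle.loads(blob)
```

Because the call happens inside pickle.loads itself, merely loading the model file is enough to trigger the payload.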
From Research to Real-World Attacks
Early Warnings – A Theoretical Risk
The idea that AI models could be abused to deliver malware is not new. As early as 2018, researchers published studies such as Model-Reuse Attacks on Deep Learning Systems showing that pretrained models from untrusted sources could be manipulated to behave maliciously.
At first, this seemed like a thought experiment—a “what if” scenario debated in academic circles. Many assumed it would remain too niche to matter. But history shows that every widely adopted technology becomes a target, and AI was no exception.
Proof of Concept – Making the Risk Real
The shift from theory to practice happened when real examples of malicious AI models surfaced, demonstrating that Pickle-based formats such as PyTorch checkpoints can embed not just model weights but executable code.
A striking case was star23/baller13, a model uploaded to Hugging Face in early January 2024. It hid a reverse shell inside a PyTorch file: loading the model could hand attackers remote access while it continued to function as a valid AI model. Cases like this show that security researchers were actively testing proof-of-concepts at the end of 2023 and into 2024.
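A PyTorch checkpoint is essentially a ZIP archive wrapping a Pickle stream, so torch.load drives the same unpickling machinery shown above. As a small defensive sketch (not a description of how the malicious model itself was built, and using a generic checkpoint filename), recent PyTorch releases let callers refuse arbitrary code during loading:

```python
import torch

# Older PyTorch releases fully unpickle whatever the checkpoint contains,
# which is exactly what a trojanized model relies on. weights_only=True
# (available in recent releases) restricts loading to plain tensor data and
# rejects pickled callables, though it is not a substitute for scanning files.
state_dict = torch.load("pytorch_model.bin", weights_only=True)
```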
By 2024, the problem was no longer isolated. JFrog reported more than 100 malicious AI/ML models uploaded to Hugging Face, confirming this threat had moved from theory into real-world attacks.
Supply Chain Attacks – From Labs to the Wild
Attackers also began exploiting the trust built into software ecosystems. In May 2025, fake PyPI packages such as aliyun-ai-labs-snippets-sdk and ai-labs-snippets-sdk mimicked Alibaba’s AI brand to trick developers. Although they were live for less than 24 hours, these packages were downloaded around 1,600 times, demonstrating how quickly poisoned AI components can infiltrate the supply chain.
For security leaders, this represents a double exposure:
- Operational disruption if compromised models poison AI-powered business tools.
- Regulatory and compliance risk if data exfiltration occurs via trusted-but-trojanized components.
Advanced Evasion – Outsmarting Legacy Defenses
Once attackers saw the potential, they began experimenting with ways to make malicious models even harder to detect. A security researcher known as coldwaterq demonstrated how Pickle's ability to stack multiple serialized objects in a single file ("Stacked Pickles") could be abused to hide malicious code.
By injecting malicious instructions between multiple layers of Pickle objects, attackers could bury their payload so it looked harmless to traditional scanners. When the model was loaded, the hidden code unpacked layer by layer, revealing its true purpose.
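A stripped-down illustration of the idea (with a harmless print call as the buried payload, and no compression or encoding on top): two Pickle streams are written back to back, so a tool that parses or loads only the first STOP-delimited object sees nothing but benign data:

```python
import io
import pickle

class HiddenPayload:
    def __reduce__(self):
        # Stand-in for the real payload buried in a deeper layer.
        return (print, ("second-layer pickle executed",))

# Layer 1: an innocuous-looking object. Layer 2: the payload.
stacked = pickle.dumps({"weights": [0.1, 0.2]}) + pickle.dumps(HiddenPayload())

stream = io.BytesIO(stacked)
first = pickle.load(stream)    # a scanner that stops here sees only benign data
second = pickle.load(stream)   # the next STOP-delimited object triggers the payload
```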
The result is a new class of AI supply chain threat that is both stealthy and resilient. This evolution underscores the arms race between attackers innovating new tricks and defenders developing tools to expose them.
How MetaDefender Sandbox Detections Help Prevent AI Attacks
As attackers improve their methods, simple signature scanning is no longer enough. Malicious AI models can use encoding, compression, or Pickle quirks to hide their payloads. MetaDefender Sandbox addresses this gap with deep, multi-layered analysis built specifically for AI and ML file formats.
Leveraging Integrated Pickle Scanning Tools
MetaDefender Sandbox integrates Fickling with custom OPSWAT parsers to break down Pickle files into their components. This allows defenders to:
- Inspect unusual imports, unsafe function calls, and suspicious objects.
- Identify functions that should never appear in a normal AI model (e.g., network communications, encryption routines).
- Generate structured reports for security teams and SOC workflows.
The analysis highlights multiple types of signatures that can indicate a suspicious Pickle file. It looks for unusual patterns, unsafe function calls, or objects that do not align with a normal AI model’s purpose.
In the context of AI training, a Pickle file should not require external libraries for process interaction, network communication, or encryption routines. The presence of such imports is a strong indicator of malicious intent and should be flagged during inspection.
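As a rough approximation of this kind of check (the sandbox itself uses Fickling together with OPSWAT's own parsers), the standard library's pickletools module can list the imports a Pickle stream declares without ever executing it, and anything pointing at process, network, or eval-style modules can be flagged. The model.pkl filename below is just a placeholder:

```python
import pickletools

# Modules a legitimate model checkpoint has no business importing.
SUSPICIOUS_MODULES = {"os", "subprocess", "socket", "builtins", "sys", "base64"}

def suspicious_imports(path):
    findings, recent_strings = [], []
    with open(path, "rb") as f:
        for opcode, arg, _pos in pickletools.genops(f):
            if isinstance(arg, str):
                recent_strings.append(arg)           # remember string operands
            if opcode.name == "GLOBAL":              # arg is "module attribute"
                module, _, attr = arg.partition(" ")
                if module.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append(f"{module}.{attr}")
            elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
                module, attr = recent_strings[-2], recent_strings[-1]
                if module.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append(f"{module}.{attr}")
    return findings

print(suspicious_imports("model.pkl"))   # e.g. ['os.system'] for the earlier payload
```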
Deep Static Analysis
Beyond parsing, the sandbox disassembles serialized objects and traces their instructions. For example, Pickle’s REDUCE opcode—which can execute arbitrary functions during unpickling—is carefully inspected. Attackers often abuse REDUCE to launch hidden payloads, and the sandbox flags any anomalous usage.
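To make that concrete, here is what the pattern looks like on the harmless `__reduce__` payload from earlier, disassembled with the standard library's pickletools module (a simplified stand-in for the sandbox's own tracer):

```python
import pickle
import pickletools

class Payload:
    def __reduce__(self):
        import os
        return (os.system, ("echo hidden payload",))

# Disassembling never executes the payload; it only prints the opcode stream.
# The GLOBAL/STACK_GLOBAL import of os.system, its argument tuple, and the
# REDUCE opcode that calls it form the "run a function during unpickling" pattern.
pickletools.dis(pickle.dumps(Payload()))
```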
Threat actors often hide the real payload behind extra encoding layers. In recent PyPI supply chain incidents, the final Python payload was stored as a long base64 string; MetaDefender Sandbox automatically decodes and unpacks these layers to reveal the actual malicious content.
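A rough sketch of that unpacking step (a heuristic stand-in for the sandbox's real decoder, with suspect_payload.bin as a placeholder for an extracted artifact): find long base64-looking runs and decode them, repeating until no further layer appears:

```python
import base64
import re

# Long runs of base64 characters are a common wrapper for a hidden payload.
B64_RE = re.compile(rb"[A-Za-z0-9+/]{200,}={0,2}")

def peel_base64_layers(data: bytes, max_depth: int = 5) -> bytes:
    """Repeatedly decode the largest base64 blob found, up to max_depth layers."""
    for _ in range(max_depth):
        blobs = B64_RE.findall(data)
        if not blobs:
            break
        try:
            data = base64.b64decode(max(blobs, key=len), validate=True)
        except ValueError:
            break
    return data

with open("suspect_payload.bin", "rb") as f:   # hypothetical extracted artifact
    print(peel_base64_layers(f.read())[:200])
```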
Uncovering Deliberate Evasion Techniques
Stacked Pickles can also be used to hide malicious behavior: by nesting multiple Pickle objects, spreading the payload across layers, and combining the result with compression or encoding, attackers make each layer look benign on its own, so many scanners and quick inspections miss the malicious payload.
MetaDefender Sandbox peels those layers one at a time: it parses each Pickle object, decodes or decompresses encoded segments, and follows the execution chain to reconstruct the full payload. By replaying the unpacking sequence in a controlled analysis flow, the sandbox exposes the hidden logic without running the code in a production environment.
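Conceptually, the layer-by-layer pass resembles the following sketch (simplified, using pickletools rather than the sandbox's engine, and covering only the stacking part, not decompression): keep disassembling STOP-delimited Pickle objects until the stream is exhausted, so a payload appended after the first object cannot hide. stacked_model.pkl is a placeholder filename:

```python
import io
import pickletools

def enumerate_pickle_layers(blob: bytes):
    """Yield the opcode list of every STOP-delimited Pickle object in the stream."""
    stream = io.BytesIO(blob)
    while stream.tell() < len(blob):
        try:
            ops = list(pickletools.genops(stream))   # genops halts right after STOP
        except ValueError:
            break                                    # trailing bytes are not a pickle
        yield ops

with open("stacked_model.pkl", "rb") as f:           # hypothetical stacked file
    for index, ops in enumerate(enumerate_pickle_layers(f.read())):
        names = {op.name for op, _arg, _pos in ops}
        print(f"layer {index}: {len(ops)} opcodes, REDUCE present: {'REDUCE' in names}")
```

Applied to the two-layer example shown earlier, this reports both objects, including the second one that a single load-and-stop pass would miss.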
For CISOs, the outcome is clear: hidden threats are surfaced before poisoned models reach your AI pipelines.
Conclusion
AI models are becoming the building blocks of modern software. But just like any software component, they can be weaponized. The combination of high trust and low visibility makes them ideal vehicles for supply chain attacks.
As real-world incidents show, malicious models are no longer hypothetical—they are here now. Detecting them is not trivial, but it is critical.
MetaDefender Sandbox provides the depth, automation, and precision needed to:
- Detect hidden payloads in pretrained AI models.
- Uncover advanced evasion tactics invisible to legacy scanners.
- Protect MLOps pipelines, developers, and enterprises from poisoned components.
Organizations across critical industries already trust OPSWAT to defend their supply chains. With MetaDefender Sandbox, they can now extend that protection into the AI era, where innovation doesn’t come at the cost of security.
Learn more about MetaDefender Sandbox and see how it detects threats hidden in AI models.
Indicators of Compromise (IOCs)
star23/baller13: pytorch_model.bin
SHA256: b36f04a774ed4f14104a053d077e029dc27cd1bf8d65a4c5dd5fa616e4ee81a4
ai-labs-snippets-sdk: model.pt
SHA256: ff9e8d1aa1b26a0e83159e77e72768ccb5f211d56af4ee6bc7c47a6ab88be765
aliyun-ai-labs-snippets-sdk: model.pt
SHA256: aae79c8d52f53dcc6037787de6694636ecffee2e7bb125a813f18a81ab7cdff7
coldwaterq_inject_calc.pt
SHA256: 1722fa23f0fe9f0a6ddf01ed84a9ba4d1f27daa59a55f4f61996ae3ce22dab3a
C2 Servers
hxxps[://]aksjdbajkb2jeblad[.]oss-cn-hongkong[.]aliyuncs[.]com/aksahlksd
IPs
136.243.156.120
8.210.242.114