CVE-2025-23298
Remote Code Execution in NVIDIA Merlin Transformers4Rec
Discovered by the Trend Micro Zero Day Initiative (ZDI) Threat Hunting Team, CVE-2025-23298 represents a critical vulnerability in the NVIDIA Merlin Transformers4Rec library. The vulnerability stems from unsafe deserialization practices in the model checkpoint loading functionality, specifically the use of Python's pickle module without proper safety controls.
What makes this finding particularly significant is how it highlights the endemic security challenges created by the ML/AI ecosystem's reliance on Python's pickle serialization. Despite years of warnings from the security community, this class of vulnerability continues to plague machine learning frameworks.
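To illustrate the vulnerability class (not the Transformers4Rec code itself): pickle lets any class dictate what happens at deserialization time via __reduce__, so loading a crafted byte stream executes an attacker-chosen callable. A minimal, deliberately benign sketch:

```python
import pickle

# Benign stand-in for an attacker-crafted checkpoint payload. A real
# exploit would return something like (os.system, ("<command>",)).
class MaliciousPayload:
    def __reduce__(self):
        # pickle records this (callable, args) pair in the byte stream;
        # pickle.loads() invokes the callable during deserialization.
        return (str.upper, ("code ran during unpickling",))

blob = pickle.dumps(MaliciousPayload())
result = pickle.loads(blob)  # the recorded callable executes here
print(result)
```

The key point: calling pickle.loads() on untrusted data is equivalent to executing untrusted code. The victim never calls any method on the payload object; deserialization alone triggers the callable.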
Real-World Impact

Source: Trend Micro Research - Comprehensive impact analysis of CVE-2025-23298
Technical Analysis
NVIDIA Transformers4Rec is part of the Merlin ecosystem, designed to leverage state-of-the-art transformer architectures for sequential and session-based recommendation tasks. It acts as a bridge between natural language processing (NLP) and recommender systems (RecSys) by integrating with Hugging Face Transformers.
Affected Systems
NVIDIA Merlin Transformers4Rec
Vulnerable: All versions prior to the security patch are affected.
Affected components:
- load_model_trainer_states_from_checkpoint function - model checkpoint loading functionality
- PyTorch model state restoration
- Training resumption mechanisms
Deployment Scenarios at Risk:
Mitigation & Remediation
1. Apply Security Patch Immediately
Update to the latest version of NVIDIA Merlin Transformers4Rec that includes the security fix. The patch modifies the checkpoint loading function to use safe deserialization methods.
pip install --upgrade transformers4rec

2. Use Safe Loading Parameters
When loading PyTorch models, always use the weights_only=True parameter to prevent arbitrary code execution:
torch.load(checkpoint_path, weights_only=True)

3. Validate Model Sources
Only load checkpoint files from trusted, verified sources. Implement cryptographic verification (checksums, digital signatures) for all model files before loading. Maintain an allowlist of approved model repositories and sources.
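One way to enforce checksum verification (a sketch with a hypothetical allowlist; the trusted digests must come from an out-of-band source, not the same channel that delivered the checkpoint) is to refuse to load any file whose SHA-256 digest is not approved:

```python
import hashlib

# Hypothetical allowlist: digests published by a trusted source.
# The entry below is a placeholder, not a real model digest.
TRUSTED_SHA256 = {
    "0000000000000000000000000000000000000000000000000000000000000000": "example-model-v1",
}

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large checkpoints need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_checkpoint(path: str) -> None:
    """Raise if the file's digest is not on the allowlist."""
    digest = sha256_of(path)
    if digest not in TRUSTED_SHA256:
        raise ValueError(f"untrusted checkpoint {path} (sha256={digest})")
```

Only after verification succeeds should the file be handed to a loader such as torch.load(..., weights_only=True). Note that a checksum proves the file is the one you expected, not that its contents are safe, so this complements rather than replaces safe deserialization.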
4. Implement Sandboxing
Load untrusted models in isolated, sandboxed environments with restricted permissions. Use containerization (Docker, Kubernetes) with security policies that limit system access. Consider using dedicated model loading services with minimal privileges.
5. Apply Principle of Least Privilege
Run ML services with minimal necessary permissions. Avoid running model loading processes as root or with elevated privileges. Use dedicated service accounts with restricted access to sensitive resources.
6. Monitor and Audit
Implement comprehensive logging for all model loading operations. Monitor for suspicious activities such as unexpected system calls or network connections during model loading. Regularly audit model sources and loading patterns.
7. Use Alternative Serialization Formats
Consider migrating to safer serialization formats like SafeTensors, which is designed specifically for ML models and doesn't allow arbitrary code execution:
from safetensors.torch import load_file
model_state = load_file("model.safetensors")

8. Network Segmentation
Isolate ML infrastructure from critical systems and sensitive data. Implement network segmentation to limit the blast radius of a potential compromise. Use firewalls and access controls to restrict communication between ML systems and other infrastructure.
Organizations should implement detection mechanisms to identify potential exploitation attempts:
File Integrity Monitoring
Monitor checkpoint files for unexpected modifications or suspicious metadata
Process Monitoring
Watch for unusual child processes spawned during model loading operations
Network Traffic Analysis
Detect unexpected network connections initiated during checkpoint loading
Behavioral Analysis
Identify anomalous system behavior following model loading events
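The file-integrity-monitoring idea above can be illustrated with a minimal sketch (not a production FIM agent): snapshot the SHA-256 digest of every file in a checkpoint directory, then flag anything added, removed, or modified between scans.

```python
import hashlib
import os

def snapshot(directory: str) -> dict:
    """Map each file under `directory` to its SHA-256 digest."""
    digests = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as f:
                digests[path] = hashlib.sha256(f.read()).hexdigest()
    return digests

def changed_files(baseline: dict, current: dict) -> list:
    """Files added, removed, or modified since the baseline snapshot."""
    paths = baseline.keys() | current.keys()
    return sorted(p for p in paths if baseline.get(p) != current.get(p))
```

In practice the baseline would be stored securely and any nonempty diff would feed an alerting pipeline; the other detection channels (process, network, behavioral) require EDR-class tooling rather than a few lines of Python.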