Tool · 2025

BlackIce: A Containerized Red Teaming Toolkit for AI Security Testing

Caelin Kaplan, Alexander Warnecke, Neil Archibald

Published on arXiv (2510.11823) · 0 citations · 32 references

Threat Mappings

  • Input Manipulation Attack — OWASP ML Top 10 (ML01)
  • Prompt Injection — OWASP LLM Top 10 (LLM01)

Key Finding

BlackIce consolidates 14 AI red teaming tools (including Garak, PyRIT, CleverHans, ART, CyberSecEval) into a single reproducible Docker image, reducing AI red team environment setup to a single container launch

BlackIce

Novel toolkit introduced


AI models are being increasingly integrated into real-world systems, raising significant concerns about their safety and security. Consequently, AI red teaming has become essential for organizations to proactively identify and address vulnerabilities before they can be exploited by adversaries. While numerous AI red teaming tools currently exist, practitioners face challenges in selecting the most appropriate tools from a rapidly expanding landscape, as well as managing complex and frequently conflicting software dependencies across isolated projects. Given these challenges and the relatively small number of organizations with dedicated AI red teams, there is a strong need to lower barriers to entry and establish a standardized environment that simplifies the setup and execution of comprehensive AI model assessments. Inspired by Kali Linux's role in traditional penetration testing, we introduce BlackIce, an open-source containerized toolkit designed for red teaming Large Language Models (LLMs) and classical machine learning (ML) models. BlackIce provides a reproducible, version-pinned Docker image that bundles 14 carefully selected open-source tools for Responsible AI and Security testing, all accessible via a unified command-line interface. With this setup, initiating red team assessments is as straightforward as launching a container, either locally or using a cloud platform. Additionally, the image's modular architecture facilitates community-driven extensions, allowing users to easily adapt or expand the toolkit as new threats emerge. In this paper, we describe the architecture of the container image, the process used for selecting tools, and the types of evaluations they support.
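Per the abstract, starting an assessment is as simple as launching the container. A sketch of what that invocation might look like follows; the image name, tag, and CLI subcommands here are illustrative assumptions, since this summary does not list the actual ones:

```shell
# Hypothetical invocation — image name, tag, and CLI flags are
# illustrative, not taken from the BlackIce documentation.
docker run -it --rm \
  -v "$(pwd)/results:/workspace/results" \
  blackice/toolkit:latest \
  blackice scan --tool garak --target my-llm-endpoint
```

Mounting a host directory for results (as above) is a common pattern for containerized assessment tools, so findings persist after the `--rm` container exits.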


Key Contributions

  • BlackIce: a version-pinned Docker container bundling 14 curated open-source AI security tools accessible via a unified CLI, eliminating dependency conflicts across projects
  • Structured tool selection process categorizing tools by type (static/dynamic) across LLM and classical ML threat categories including jailbreaking, prompt injection, adversarial robustness, and supply chain security
  • Modular, extensible architecture enabling community-driven addition of new tools as the AI threat landscape evolves

🛡️ Threat Analysis

Input Manipulation Attack

Bundles classical adversarial ML testing tools (CleverHans, ART) for generating and evaluating adversarial examples against traditional ML models at inference time.
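The inference-time attacks these libraries automate can be illustrated with a minimal Fast Gradient Sign Method (FGSM) step, written here in plain NumPy as a stand-in; the actual CleverHans/ART APIs wrap this core idea with many attack variants, estimator adapters, and evaluation utilities:

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """One FGSM step: nudge each input feature by eps in the direction
    that increases the attacker's loss, then clip to the valid range [0, 1]."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

# Toy linear classifier: score = w @ x. For an attacker who wants to
# lower the score, the loss gradient w.r.t. the input is simply -w.
w = np.array([0.5, -1.0, 2.0])   # model weights (illustrative)
x = np.array([0.2, 0.8, 0.4])    # clean input

grad = -w                         # gradient of the attacker's objective
x_adv = fgsm_perturb(x, grad, eps=0.1)
# x_adv == [0.1, 0.9, 0.3]: each feature moved by eps against the score
```

In practice the gradient comes from backpropagation through the victim model (white-box) or from gradient estimation (black-box), which is exactly the machinery CleverHans and ART provide.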


Details

Domains
nlp, vision
Model Types
llm, transformer, traditional_ml, cnn
Threat Tags
black_box, inference_time
Applications
llm safety assessment, ai red teaming, adversarial ml testing, prompt injection testing