defense 2025

Enhancing Reliability in LLM-Integrated Robotic Systems: A Unified Approach to Security and Safety

Wenxiao Zhang , Xiangrui Kong , Conan Dewitt , Thomas Bräunl , Jin B. Hong

0 citations

α

Published on arXiv

2509.02163

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Achieves 30.8% improvement under prompt injection attacks and up to 325% improvement in complex adversarial environment settings compared to baseline scenarios


Integrating large language models (LLMs) into robotic systems has revolutionised embodied artificial intelligence, enabling advanced decision-making and adaptability. However, ensuring reliability, encompassing both security against adversarial attacks and safety in complex environments, remains a critical challenge. To address this, we propose a unified framework that mitigates prompt injection attacks while enforcing operational safety through robust validation mechanisms. Our approach combines prompt assembling, state management, and safety validation, evaluated using both performance and security metrics. Experiments show a 30.8% improvement under injection attacks and up to a 325% improvement in complex environment settings under adversarial conditions compared to baseline scenarios. This work bridges the gap between safety and security in LLM-based robotic systems, offering actionable insights for deploying reliable LLM-integrated mobile robots in real-world settings. The framework is open-sourced with simulation and physical deployment demos at https://llmeyesim.vercel.app/


Key Contributions

  • Unified framework combining structured prompt assembly, dynamic state management, and interpretable safety validation to reject unsafe LLM outputs in mobile robots
  • Novel adversarial evaluation methodology with purpose-built metrics (MOER, TLR, ADR) for quantifying mission robustness and safety under injection attacks
  • Real-world deployment on a physical LiDAR/camera robot providing the first empirical sim-to-real validation under adversarial conditions

🛡️ Threat Analysis


Details

Domains
nlpmultimodal
Model Types
llm
Threat Tags
black_boxinference_time
Applications
llm-integrated mobile roboticsautonomous robot navigationembodied ai