defense 2025

Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan

Yu-Kai Shih , You-Kai Kang

0 citations · 20 references · arXiv

α

Published on arXiv

2509.21367

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Secure RAG variant achieves >95% accuracy on 223 benign queries with substantial injection detection; GPT-5 alone blocks ~85% of adversarial prompts, underscoring residual need for layered defenses.

Reverse RAG Guardrail

Novel technique introduced


As smart tourism evolves, AI-powered chatbots have become indispensable for delivering personalized, real-time assistance to travelers while promoting sustainability and efficiency. However, these systems are increasingly vulnerable to prompt injection attacks, where adversaries manipulate inputs to elicit unintended behaviors such as leaking sensitive information or generating harmful content. This paper presents a case study on the design and implementation of a secure retrieval-augmented generation (RAG) chatbot for Hsinchu smart tourism services. The system integrates RAG with API function calls, multi-layered linguistic analysis, and guardrails against injections, achieving high contextual awareness and security. Key features include a tiered response strategy, RAG-driven knowledge grounding, and intent decomposition across lexical, semantic, and pragmatic levels. Defense mechanisms include system norms, gatekeepers for intent judgment, and reverse RAG text to prioritize verified data. We also benchmark a GPT-5 variant (released 2025-08-07) to assess inherent robustness. Evaluations with 674 adversarial prompts and 223 benign queries show over 95% accuracy on benign tasks and substantial detection of injection attacks. GPT-5 blocked about 85% of attacks, showing progress yet highlighting the need for layered defenses. Findings emphasize contributions to sustainable tourism, multilingual accessibility, and ethical AI deployment. This work offers a practical framework for deploying secure chatbots in smart tourism and contributes to resilient, trustworthy AI applications.


Key Contributions

  • Multi-layered prompt injection defense framework combining system norms, intent gatekeepers, and reverse RAG text prioritization to ground responses in verified data over adversarial inputs
  • Tiered secure RAG chatbot architecture with multi-level linguistic intent decomposition (lexical, semantic, pragmatic) for tourism-domain query handling
  • Empirical evaluation of GPT-5's inherent robustness against 674 adversarial prompt injection prompts without additional guardrails, finding ~85% attack blocking

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_timeblack_box
Datasets
Deepset adversarial promptsRubend18 adversarial prompts
Applications
tourism chatbotcustomer service chatbotsmart tourism services