defense 2025

Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan

Yu-Kai Shih , You-Kai Kang

National Dong Hwa University

0 citations · 20 references · arXiv

Published on arXiv

2509.21367

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Secure RAG variant achieves >95% accuracy on 223 benign queries with substantial injection detection; GPT-5 alone blocks ~85% of adversarial prompts, underscoring residual need for layered defenses.

Reverse RAG Guardrail

Novel technique introduced

As smart tourism evolves, AI-powered chatbots have become indispensable for delivering personalized, real-time assistance to travelers while promoting sustainability and efficiency. However, these systems are increasingly vulnerable to prompt injection attacks, where adversaries manipulate inputs to elicit unintended behaviors such as leaking sensitive information or generating harmful content. This paper presents a case study on the design and implementation of a secure retrieval-augmented generation (RAG) chatbot for Hsinchu smart tourism services. The system integrates RAG with API function calls, multi-layered linguistic analysis, and guardrails against injections, achieving high contextual awareness and security. Key features include a tiered response strategy, RAG-driven knowledge grounding, and intent decomposition across lexical, semantic, and pragmatic levels. Defense mechanisms include system norms, gatekeepers for intent judgment, and reverse RAG text to prioritize verified data. We also benchmark a GPT-5 variant (released 2025-08-07) to assess inherent robustness. Evaluations with 674 adversarial prompts and 223 benign queries show over 95% accuracy on benign tasks and substantial detection of injection attacks. GPT-5 blocked about 85% of attacks, showing progress yet highlighting the need for layered defenses. Findings emphasize contributions to sustainable tourism, multilingual accessibility, and ethical AI deployment. This work offers a practical framework for deploying secure chatbots in smart tourism and contributes to resilient, trustworthy AI applications.

Key Contributions

Multi-layered prompt injection defense framework combining system norms, intent gatekeepers, and reverse RAG text prioritization to ground responses in verified data over adversarial inputs
Tiered secure RAG chatbot architecture with multi-level linguistic intent decomposition (lexical, semantic, pragmatic) for tourism-domain query handling
Empirical evaluation of GPT-5's inherent robustness against 674 adversarial prompt injection prompts without additional guardrails, finding ~85% attack blocking

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_timeblack_box

Datasets

Deepset adversarial promptsRubend18 adversarial prompts

Applications

tourism chatbotcustomer service chatbotsmart tourism services

Read PDF arXiv DOI

Design and Implementation of a Secure RAG-Enhanced AI Chatbot for Smart Tourism Customer Service: Defending Against Prompt Injection Attacks -- A Case Study of Hsinchu, Taiwan

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Prompt Attack Detection with LLM-as-a-Judge and Mixture-of-Models

Attacks by Content: Automated Fact-checking is an AI Security Issue

LLM Reinforcement in Context

Active Honeypot Guardrail System: Probing and Confirming Multi-Turn LLM Jailbreaks

Incentive-Aligned Multi-Source LLM Summaries

BlueCodeAgent: A Blue Teaming Agent Enabled by Automated Red Teaming for CodeGen AI

Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?

A Multi-Agent LLM Defense Pipeline Against Prompt Injection Attacks