defense 2026

SEAL-Tag: Self-Tag Evidence Aggregation with Probabilistic Circuits for PII-Safe Retrieval-Augmented Generation

Jin Xie, Songze Li, Guang Cheng


Published on arXiv (2603.17292)

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Reduces adaptive PII leakage by over 8× compared to baselines while matching utility and speed of unsafe systems

SEAL-Tag

Novel technique introduced


Retrieval-Augmented Generation (RAG) systems introduce a critical vulnerability: contextual leakage, where adversaries exploit instruction-following to exfiltrate Personally Identifiable Information (PII) via adaptive extraction. Current defenses force a rigid trade-off between semantic utility and latency. We present SEAL-Tag, a privacy-preserving runtime environment that resolves this via a Verify-then-Route paradigm. SEAL-Tag introduces the SEAL-Probe protocol, which transforms auditing into a structured tool-use operation where the model generates a verifiable PII-Evidence Table (PET) alongside its draft. To adjudicate this evidence, we employ a Probabilistic Circuit (PC) that enforces verifiable logical constraints for robust decision-making. To overcome the privacy "cold start" problem, we introduce the S0-S6 Anchored Synthesis Pipeline, which generates high-fidelity RAG interactions with full provenance. We pair this with a Two-Stage Curriculum that first optimizes for entity detection before aligning the model to the rigorous audit protocol. Our evaluation demonstrates that SEAL-Tag establishes a new Pareto frontier, reducing adaptive leakage by over 8× while matching the utility and speed of unsafe baselines.
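The Verify-then-Route flow described in the abstract can be sketched at a very high level: the model emits a draft plus a PII-Evidence Table, and an adjudicator decides whether to release or redact. Below is a minimal, hypothetical Python sketch of that decision step; the field names (`entity`, `pii_type`, `source_doc`, `confidence`) and the independence-based scoring are illustrative assumptions, not the paper's actual PET schema or Probabilistic Circuit, which enforces richer logical constraints.

```python
from dataclasses import dataclass

@dataclass
class PETRow:
    """One row of a hypothetical PII-Evidence Table (fields are assumed)."""
    entity: str        # surface form appearing in the draft answer
    pii_type: str      # e.g. "EMAIL", "SSN", "ORG"
    source_doc: str    # provenance: which retrieved passage the span came from
    confidence: float  # self-reported confidence in [0, 1] that this is PII

def adjudicate(pet: list[PETRow], leak_threshold: float = 0.5) -> str:
    """Toy route decision: probability that at least one row is a real leak,
    treating rows as independent. (The paper's Probabilistic Circuit replaces
    this naive product with a tractable model that also enforces hard
    logical constraints.)"""
    p_no_leak = 1.0
    for row in pet:
        # Example hard constraint: evidence the model could not type
        # is treated as a certain leak (probability forced to 1).
        p_leak_row = 1.0 if row.pii_type == "UNKNOWN" else row.confidence
        p_no_leak *= (1.0 - p_leak_row)
    p_leak = 1.0 - p_no_leak
    return "redact" if p_leak >= leak_threshold else "release"

pet = [
    PETRow("alice@example.com", "EMAIL", "doc_17", 0.9),
    PETRow("Acme Corp", "ORG", "doc_03", 0.2),
]
print(adjudicate(pet))  # → redact  (p_leak = 1 - 0.1*0.8 = 0.92)
```

The point of the sketch is the routing contract, not the scoring: any draft whose evidence table crosses the leak threshold is diverted to a redaction path before release, which is what lets the safe path match the latency of unsafe baselines on benign queries.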


Key Contributions

  • SEAL-Probe protocol transforming privacy auditing into structured tool-use with PII-Evidence Tables (PET)
  • Probabilistic Circuit (PC) for verifiable, calibrated policy enforcement with microsecond latency
  • S0-S6 Anchored Synthesis Pipeline generating high-fidelity training data for privacy-aware RAG

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time, black_box
Applications
retrieval-augmented generation, enterprise ai with sensitive documents, pii-protected question answering