Unvalidated Trust: Cross-Stage Vulnerabilities in Large Language Model Architectures
Published on arXiv: 2510.27190
- Prompt Injection (OWASP LLM Top 10, LLM01)
- Excessive Agency (OWASP LLM Top 10, LLM08)
Key Finding
String-level input filtering is architecturally insufficient against cross-stage LLM vulnerabilities; zero-trust principles (provenance enforcement, context sealing, plan revalidation) are required to mitigate 41 identified risk patterns in commercial LLM pipelines.
Countermind: novel technique introduced
As Large Language Models (LLMs) are increasingly integrated into automated, multi-stage pipelines, risk patterns that arise from unvalidated trust between processing stages become a practical concern. This paper presents a mechanism-centered taxonomy of 41 recurring risk patterns in commercial LLMs. The analysis shows that inputs are often interpreted non-neutrally and can trigger implementation-shaped responses or unintended state changes even without explicit commands. We argue that these behaviors constitute architectural failure modes and that string-level filtering alone is insufficient. To mitigate such cross-stage vulnerabilities, we recommend zero-trust architectural principles, including provenance enforcement, context sealing, and plan revalidation, and we introduce "Countermind" as a conceptual blueprint for implementing these defenses.
Key Contributions
- Mechanism-centered taxonomy of 41 recurring risk patterns arising from unvalidated trust in multi-stage commercial LLM pipelines
- Empirical analysis demonstrating that inputs can trigger implementation-shaped responses and unintended state changes without explicit commands, establishing these as architectural failure modes
- Countermind — a conceptual zero-trust defense blueprint incorporating provenance enforcement, context sealing, and plan revalidation for LLM pipeline security
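To make the three zero-trust mechanisms concrete, here is a minimal Python sketch of how a pipeline stage might enforce them. This is an illustration of the general idea, not the paper's Countermind design: the `Message` class, the `<data>` sealing delimiter, and the allow-list check are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Provenance enforcement: every piece of text entering the pipeline
# carries an explicit origin label that later stages can check.
TRUSTED = "system"
UNTRUSTED = "external"

@dataclass(frozen=True)
class Message:
    content: str
    provenance: str  # origin label; illustrative two-value scheme

def seal_context(messages):
    """Context sealing (sketch): wrap untrusted text in inert delimiters
    so downstream stages treat it as data, never as instructions."""
    sealed = []
    for m in messages:
        if m.provenance == UNTRUSTED:
            sealed.append(Message(f"<data>{m.content}</data>", m.provenance))
        else:
            sealed.append(m)
    return sealed

def revalidate_plan(plan, allowed_actions):
    """Plan revalidation (sketch): each step the model proposes is
    re-checked against an allow-list before any tool is executed."""
    return [step for step in plan if step in allowed_actions]
```

The point of the sketch is that trust decisions happen at stage boundaries: sealing occurs before model input is assembled, and revalidation occurs after the model emits a plan but before anything acts on it, so a string-level filter is never the only line of defense.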