Small Symbols, Big Risks: Exploring Emoticon Semantic Confusion in Large Language Models

Weipeng Jiang 1, Xiaoyu Zhang 2, Juan Zhai 3, Shiqing Ma 3, Chao Shen 1, Yang Liu 2

Published on arXiv: 2601.07885

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Across six LLMs, the average emoticon confusion ratio exceeds 38%, and over 90% of confused responses are 'silent failures': syntactically valid outputs that deviate from user intent, with potentially destructive security consequences.

Emoticon Semantic Confusion

Novel technique introduced


Emoticons are widely used in digital communication to convey affective intent, yet their safety implications for Large Language Models (LLMs) remain largely unexplored. In this paper, we identify emoticon semantic confusion, a vulnerability in which LLMs misinterpret ASCII-based emoticons and perform unintended, even destructive, actions. To systematically study this phenomenon, we develop an automated data generation pipeline and construct a dataset of 3,757 code-oriented test cases spanning 21 meta-scenarios, four programming languages, and varying contextual complexities. Our study of six LLMs reveals that emoticon semantic confusion is pervasive, with an average confusion ratio exceeding 38%. More critically, over 90% of confused responses yield 'silent failures': syntactically valid outputs that deviate from user intent, potentially leading to destructive security consequences. Furthermore, we observe that this vulnerability readily transfers to popular agent frameworks, while existing prompt-based mitigations remain largely ineffective. We call on the community to recognize this emerging vulnerability and to develop effective mitigation methods that uphold the safety and reliability of LLM systems.
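The abstract's distinction between loud failures and 'silent failures' can be sketched as a toy classifier: a generated response counts as a silent failure when it parses cleanly but diverges from the intended program. This is an illustrative sketch only; the function names are hypothetical, the paper's actual evaluation pipeline is not shown here, and AST equality is a crude stand-in for real semantic checking.

```python
import ast

def classify_response(generated_code: str, expected_code: str) -> str:
    """Hypothetical per-test-case classifier.

    A response that fails to parse is a loud, easy-to-catch failure;
    one that parses but differs from the intended program is a
    'silent failure' in the paper's sense.
    """
    try:
        tree = ast.parse(generated_code)
    except SyntaxError:
        return "syntax_error"  # loud failure
    # NOTE: AST equality is only a rough proxy for semantic equivalence.
    if ast.dump(tree) == ast.dump(ast.parse(expected_code)):
        return "correct"
    return "silent_failure"  # valid code, wrong semantics

def confusion_ratio(labels: list[str]) -> float:
    """Fraction of responses that were confused (anything not correct)."""
    return sum(label != "correct" for label in labels) / len(labels)
```

For example, classifying a batch of responses and aggregating with `confusion_ratio` would reproduce the kind of per-model percentage the study reports, with confused responses further split into silent failures and syntax errors.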


Key Contributions

  • Identifies and formalizes 'emoticon semantic confusion' as a novel LLM vulnerability class where ASCII emoticons cause systematic misinterpretation of user intent
  • Constructs a dataset of 3,757 code-oriented test cases spanning 21 meta-scenarios and 4 programming languages for systematic vulnerability evaluation
  • Demonstrates >38% average confusion ratio across six LLMs, with >90% of confused responses being 'silent failures', and shows vulnerability transfers to agent frameworks while existing prompt-based mitigations are ineffective

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
inference_time, black_box
Datasets
custom emoticon-confusion dataset (3,757 test cases)
Applications
code generation, llm agents