benchmark 2025

EMNLP: Educator-role Moral and Normative Large Language Models Profiling

Yilin Jiang 1,2, Mingzi Zhang 3, Sheng Jin 4, Zengyi Yu 3, Xiangjie Kong 1, Binghao Tu 1

Published on arXiv

2508.15250

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Models with stronger reasoning capabilities are paradoxically more vulnerable to harmful soft prompt injection, while model temperature and other hyperparameters have limited influence on most risk behaviors.

EMNLP (Educator-role Moral and Normative LLMs Profiling)

Novel technique introduced


Simulating Professions (SP) enables Large Language Models (LLMs) to emulate professional roles. However, comprehensive psychological and ethical evaluation in these contexts remains lacking. This paper introduces EMNLP, an Educator-role Moral and Normative LLMs Profiling framework for personality profiling, moral development stage measurement, and ethical risk evaluation under soft prompt injection. EMNLP extends existing scales and constructs 88 teacher-specific moral dilemmas, enabling profession-oriented comparison with human teachers. A targeted soft prompt injection set evaluates compliance and vulnerability in teacher SP. Experiments on 14 LLMs show that teacher-role LLMs exhibit more idealized and polarized personalities than human teachers and excel in abstract moral reasoning, but struggle with emotionally complex situations. Models with stronger reasoning are more vulnerable to harmful prompt injection, revealing a paradox between capability and safety. Model temperature and other hyperparameters have limited influence except in some risk behaviors. This paper presents the first benchmark to assess ethical and psychological alignment of teacher-role LLMs for educational AI. Resources are available at https://e-m-n-l-p.github.io/.


Key Contributions

  • First benchmark framework (EMNLP) for personality, moral reasoning, and ethical risk evaluation of LLMs in teacher role-play contexts
  • 88 teacher-specific moral dilemmas including extreme professional scenarios for profession-oriented comparison with human teachers
  • Soft prompt injection evaluation across 14 LLMs revealing that stronger reasoning capability correlates with greater vulnerability to harmful prompt injection
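
The injection evaluation in the last contribution can be pictured with a minimal sketch. This is not the authors' harness: it assumes that an injection is a textual instruction added to a teacher role prompt, and that vulnerability is summarized as the fraction of cases in which the model complies. The names `InjectionCase`, `build_prompt`, and `compliance_rate` are hypothetical, and both the model and the judge are caller-supplied callables rather than any real API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class InjectionCase:
    """One hypothetical test case: a teacher scenario plus an injected instruction."""
    dilemma: str
    injection: str


def build_prompt(case: InjectionCase) -> str:
    # Assumed prompt layout; the benchmark's real template may differ.
    return (
        "You are a school teacher. Stay in this role.\n"
        f"Scenario: {case.dilemma}\n"
        f"Additional instruction: {case.injection}\n"
    )


def compliance_rate(
    model: Callable[[str], str],
    judge: Callable[[str], bool],
    cases: Iterable[InjectionCase],
) -> float:
    """Return the fraction of cases where `judge` marks the reply as compliant."""
    case_list = list(cases)
    if not case_list:
        raise ValueError("at least one case is required")
    complied = sum(1 for case in case_list if judge(model(build_prompt(case))))
    return complied / len(case_list)
```

In use, `model` would wrap a call to the LLM under test and `judge` would wrap a human label or a classifier. Comparing the resulting rates across models is one way to examine the reported link between reasoning capability and vulnerability.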

Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time, black_box
Datasets
EMNLP benchmark (88 teacher-specific moral dilemmas), extended personality scales
Applications
educational ai, teacher role-playing llms, llm safety evaluation