Latest papers

5 papers
defense arXiv Apr 19, 2026

Representation-Guided Parameter-Efficient LLM Unlearning

Zeguan Xiao, Lang Mo, Yun Chen et al. · Shanghai University of Finance and Economics · Southern University of Science and Technology +1 more

LoRA-based LLM unlearning using representation geometry to remove knowledge while preserving utility, evaluated on TOFU and WMDP

Model Inversion Attack · Sensitive Information Disclosure · nlp
PDF · Code
benchmark arXiv Jan 9, 2026 · Jan 2026

FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

Zhi Yang, Runguo Li, Qiqi Qiang et al. · Shanghai University of Finance and Economics · The Chinese University of Hong Kong +8 more

Benchmarks prompt injection and jailbreak attacks on LLM financial agents in execution-grounded, state-writable sandbox environments

Prompt Injection · Excessive Agency · nlp
PDF · Code
defense arXiv Dec 2, 2025 · Dec 2025

Adaptive Decentralized Federated Learning for Robust Optimization

Shuyuan Wu, Feifei Wang, Yuan Gao et al. · Shanghai University of Finance and Economics · Renmin University of China +2 more

Defends decentralized federated learning against Byzantine and data-poisoned clients via adaptive per-client learning rate adjustment

Data Poisoning Attack · federated-learning
PDF
defense arXiv Nov 14, 2025 · Nov 2025

HealSplit: Towards Self-Healing through Adversarial Distillation in Split Federated Learning

Yuhan Xie, Chen Lyu · Shanghai University of Finance and Economics

Defends Split Federated Learning against five poisoning attack types via topology-aware detection and adversarial multi-teacher distillation recovery

Data Poisoning Attack · vision · federated-learning
PDF
benchmark arXiv Oct 16, 2025 · Oct 2025

On the Ability of LLMs to Handle Character-Level Perturbations: How Well and How?

Anyuan Zhuo, Xuefei Ning, Ningyuan Li et al. · Shanghai University of Finance and Economics · Tsinghua University

Benchmarks LLM robustness to invisible Unicode character injection intended to block exam cheating, finding surprising resilience

Prompt Injection · nlp
PDF