Latest papers

4 papers
attack arXiv Feb 13, 2026

Realistic Face Reconstruction from Facial Embeddings via Diffusion Models

Dong Han, Yong Li, Joachim Denzler · Huawei Technologies · Friedrich Schiller University Jena

Attacks privacy-preserving face recognition systems by inverting facial embeddings into realistic face images using KAN and diffusion models

Model Inversion Attack vision
PDF
benchmark arXiv Jan 7, 2026

What Matters For Safety Alignment?

Xing Li, Hui-Ling Zhen, Lihao Yin et al. · Huawei Technologies

Large-scale safety alignment benchmark evaluating 32 LLMs with 56 jailbreak techniques, finding CoT prefix attacks raise ASR by 3.34x

Prompt Injection nlp
PDF
attack arXiv Dec 6, 2025

Metaphor-based Jailbreaking Attacks on Text-to-Image Models

Chenyu Zhang, Yiwen Ma, Lanjun Wang et al. · Tianjin University · Huawei Technologies

Metaphor-based jailbreak attack that bypasses T2I model safety filters without knowledge of the deployed defense type, using LLM multi-agent prompt generation

Prompt Injection vision nlp multimodal generative
1 citation PDF Code
defense arXiv Nov 11, 2025

Class-feature Watermark: A Resilient Black-box Watermark Against Model Extraction Attacks

Yaxin Xiao, Qingqing Ye, Zi Liang et al. · The Hong Kong Polytechnic University · Huawei Technologies +1 more

Proposes WRK to break existing black-box model watermarks, then introduces CFW watermarking resilient to combined extraction and removal attacks

Model Theft vision
PDF Code