Latest papers

5 papers
attack arXiv Apr 1, 2026 · 5d ago

Fluently Lying: Adversarial Robustness Can Be Substrate-Dependent

Daye Kang, Hyeongboo Baek · University of Seoul

Discovers substrate-dependent adversarial failure mode where SNN detectors maintain detection count while accuracy collapses under standard PGD

Input Manipulation Attack vision
PDF
defense arXiv Mar 6, 2026 · 4w ago

SPOILER: TEE-Shielded DNN Partitioning of On-Device Secure Inference with Poison Learning

Donghwa Kang, Hojun Choe, Doohyun Kim et al. · Korea Advanced Institute of Science and Technology · University of Seoul

Defends edge-deployed DNNs against model theft via TEE partitioning and self-poisoning that renders the exposed backbone functionally incoherent

Model Theft vision
PDF
attack arXiv Feb 2, 2026 · 9w ago

Zero2Text: Zero-Training Cross-Domain Inversion Attacks on Textual Embeddings

Doohyun Kim, Donghwa Kang, Kyungjae Lee et al. · Korea Advanced Institute of Science and Technology · University of Seoul

Training-free embedding inversion attack recovers private text from RAG vector databases without in-domain data, defeating differential privacy defenses

Model Inversion Attack Sensitive Information Disclosure nlp
1 citation PDF
benchmark arXiv Aug 23, 2025

ObjexMT: Objective Extraction and Metacognitive Calibration for LLM-as-a-Judge under Multi-Turn Jailbreaks

Hyunjun Kim, Junwoo Ha, Sangyoon Yu et al. · AIM Intelligence · KAIST +2 more

Benchmarks LLM judges on recovering hidden jailbreak objectives in multi-turn transcripts and calibrating their own confidence in safety evaluations

Prompt Injection nlp
PDF Code
attack arXiv Aug 19, 2025

Timestep-Compressed Attack on Spiking Neural Networks through Timestep-Level Backpropagation

Donghwa Kang, Doohyun Kim, Sang-Ki Ko et al. · Korea Advanced Institute of Science and Technology · University of Seoul +1 more

Accelerates gradient-based adversarial attacks on spiking neural networks by 57% via timestep-level backpropagation and membrane potential reuse

Input Manipulation Attack vision
PDF