IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation
Yanpei Guo 1, Wenjie Qu 1, Linyu Wu 1, Shengfang Zhai 1, Lionel Z. Wang 2, Ming Xu 1, Yue Liu 1, Binhang Yuan , Dawn Song 3, Jiaheng Zhang 1
Published on arXiv
2602.22700
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
IMMACULATE reliably detects model substitution, quantization abuse, and token overbilling with under 1% throughput overhead on both dense and MoE LLMs
IMMACULATE
Novel technique introduced
Commercial large language models are typically deployed as black-box API services, requiring users to trust providers to execute inference correctly and report token usage honestly. We present IMMACULATE, a practical auditing framework that detects economically motivated deviations-such as model substitution, quantization abuse, and token overbilling-without trusted hardware or access to model internals. IMMACULATE selectively audits a small fraction of requests using verifiable computation, achieving strong detection guarantees while amortizing cryptographic overhead. Experiments on dense and MoE models show that IMMACULATE reliably distinguishes benign and malicious executions with under 1% throughput overhead. Our code is published at https://github.com/guo-yanpei/Immaculate.
Key Contributions
- Threat model formalizing economically motivated LLM provider deviations: model substitution, quantization abuse, and token overbilling
- Selective auditing protocol using verifiable computation that amortizes cryptographic overhead across requests to achieve under 1% throughput overhead
- Empirical demonstration on dense and MoE LLMs that IMMACULATE reliably distinguishes benign from malicious executions without trusted hardware or model internals
🛡️ Threat Analysis
IMMACULATE is a verifiable inference scheme that detects when LLM API outputs have been tampered with or produced dishonestly (wrong model, wrong precision, fraudulent token counts). ML09 explicitly includes 'verifiable inference schemes (proving outputs weren't tampered with)' — this is the paper's core contribution.