DeepSight: An All-in-One LM Safety Toolkit

Authors: Bo Zhang, Jiaxuan Guo, Lijun Li, Dongrui Liu, Sujin Chen, Guanxu Chen, Zhijie Zheng, Qihao Lin, Lewen Yan, Chen Qian, Yijin Zhou, Yuyao Wu, Shaoxiong Guo, Tianyi Du, Jingyi Yang, Xuhao Hu, Ziqi Miao, Xiaoya Lu, Jing Shao, Xia Hu | Published: 2026-02-12

PAC to the Future: Zero-Knowledge Proofs of PAC Private Systems

Authors: Guilhem Repetto, Nojan Sheybani, Gabrielle De Micheli, Farinaz Koushanfar | Published: 2026-02-12

More Haste, Less Speed: Weaker Single-Layer Watermark Improves Distortion-Free Watermark Ensembles

Authors: Ruibo Chen, Yihan Wu, Xuehao Cui, Jingqi Zhang, Heng Huang | Published: 2026-02-12

LoRA-based Parameter-Efficient LLMs for Continuous Learning in Edge-based Malware Detection

Authors: Christian Rondanini, Barbara Carminati, Elena Ferrari, Niccolò Lardo, Ashish Kundu | Published: 2026-02-12

Stop Tracking Me! Proactive Defense Against Attribute Inference Attack in LLMs

Authors: Dong Yan, Jian Liang, Ran He, Tieniu Tan | Published: 2026-02-12

Differentially Private and Communication Efficient Large Language Model Split Inference via Stochastic Quantization and Soft Prompt

Authors: Yujie Gu, Richeng Jin, Xiaoyu Ji, Yier Jin, Wenyuan Xu | Published: 2026-02-12

Jailbreaking Leaves a Trace: Understanding and Detecting Jailbreak Attacks from Internal Representations of Large Language Models

Authors: Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis | Published: 2026-02-12

Cachemir: Fully Homomorphic Encrypted Inference of Generative Large Language Model with KV Cache

Authors: Ye Yu, Yifan Zhou, Yi Chen, Pedro Soto, Wenjie Xiong, Meng Li | Published: 2026-02-12

Towards Explainable Federated Learning: Understanding the Impact of Differential Privacy

Authors: Júlio Oliveira, Rodrigo Ferreira, André Riker, Glaucio H. S. Carvalho, Eirini Eleni Tsilopoulou | Published: 2026-02-10

CAPID: Context-Aware PII Detection for Question-Answering Systems

Authors: Mariia Ponomarenko, Sepideh Abedini, Masoumeh Shafieinejad, D. B. Emerson, Shubhankar Mohapatra, Xi He | Published: 2026-02-10