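Authors: Urchade Zaratiana, Mary Newhauser, George Hurn-Maloney, Ash Lewis
Translator: Knownsec 404 Lab Translation Team
Original link: https://arxiv.org/html/2605.07982v1
Abstract: Keeping large language model (LLM) outputs safe and policy-compliant requires content moderation that scales in real time across multiple safety dimensions. However, current state-of-the-art guardrail models are all based on 7 billion–27...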
Author: Knownsec 404 Advanced Threat Intelligence Team
I. Introduction
SilverFox has become one of the most active cyber threats in recent years, targeting managerial and finance staff in organization...
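Authors: Neha Nagaraja, Lan Zhang, Zhilong Wang
Translator: Knownsec 404 Lab Translation Team
Original link: https://arxiv.org/html/2603.03637v1
Abstract: Multimodal large language models (MLLMs) combine vision and text capabilities to power a wide range of applications, but this integration also introduces new security vulnerabilities. This paper studies image-based prompt injection (IPI), a black-box attack in which the attacker embeds adversarial instructions in natural images...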