Aggregator
HackedGPT – 7 New Vulnerabilities in GPT-4o and GPT-5 Enables 0-Click Attacks
Seven critical vulnerabilities in OpenAI’s ChatGPT, affecting both GPT-4o and the newly released GPT-5 models, that could allow attackers to steal private user data through stealthy, zero-click exploits. These flaws exploit indirect prompt injections, enabling hackers to manipulate the AI into exfiltrating sensitive information from user memories and chat histories without any user interaction beyond […]
The post HackedGPT – 7 New Vulnerabilities in GPT-4o and GPT-5 Enables 0-Click Attacks appeared first on Cyber Security News.
Belangrijke rol van vrouwen bij vrede en veiligheid over het voetlicht
新 HDR10+ Advanced 将改进运动平滑
Eurojust Leads Global Crackdown on EUR 300 Million Credit Card Fraud Scheme with 18 Arrests
UK carriers to block spoofed phone numbers in fraud crackdown
Google строит дата-центр в небесах. Проект Suncatcher: почему TPU-ускорители переедут на орбиту и как это удешевит ИИ-вычисления
KubeCon + CloudNativeCon North America 2025 — Must-See Sessions
KubeCon + CloudNativeCon North America 2025 is almost here, and whether you’re a cloud-native newcomer, seasoned SRE, or Kubernetes fan, Atlanta will be full of energy this month. The conference isn’t just for hardcore technologists, either. It’s designed for anyone interested in how cloud-native technology is shaping modern software and infrastructure. Here’s what to expect, which sessions we’ve circled on our schedules, and how to maximize your KubeCon + CloudNativeCon adventure this year.
The post KubeCon + CloudNativeCon North America 2025 — Must-See Sessions appeared first on Security Boulevard.
Ruag Confirms Ransomware Attack on U.S. Subsidiary
Why Data Security Is the Key to Transparency in Private Markets
CWES Review – HackTheBox Certified Web Exploitation Specialist
Interlock
You must login to view this content
INC
You must login to view this content
Akira
You must login to view this content
Akira
You must login to view this content
Rhysida
You must login to view this content
University of Pennsylvania confirms data stolen in cyberattack
NDSS 2025 – Safety Misalignment Against Large Language Models
SESSION
Session 2A: LLM Security
Authors, Creators & Presenters: Yichen Gong (Tsinghua University), Delong Ran (Tsinghua University), Xinlei He (Hong Kong University of Science and Technology (Guangzhou)), Tianshuo Cong (Tsinghua University), Anyu Wang (Tsinghua University), Xiaoyun Wang (Tsinghua University)
PAPER
Safety Misalignment Against Large Language Models
The safety alignment of Large Language Models (LLMs) is crucial to prevent unsafe content that violates human values. To ensure this, it is essential to evaluate the robustness of their alignment against diverse malicious attacks. However, the lack of a large-scale, unified measurement framework hinders a comprehensive understanding of potential vulnerabilities. To fill this gap, this paper presents the first comprehensive evaluation of existing and newly proposed safety misalignment methods for LLMs. Specifically, we investigate four research questions: (1) evaluating the robustness of LLMs with different alignment strategies, (2) identifying the most effective misalignment method, (3) determining key factors that influence misalignment effectiveness, and (4) exploring various defenses. The safety misalignment attacks in our paper include system-prompt modification, model fine-tuning, and model editing. Our findings show that Supervised Fine-Tuning is the most potent attack but requires harmful model responses. In contrast, our novel Self-Supervised Representation Attack (SSRA) achieves significant misalignment without harmful responses. We also examine defensive mechanisms such as safety data filter, model detoxification, and our proposed Self-Supervised Representation Defense (SSRD), demonstrating that SSRD can effectively re-align the model. In conclusion, our unified safety alignment evaluation framework empirically highlights the fragility of the safety alignment of LLMs.
Our thanks to the Network and Distributed System Security (NDSS) Symposium for publishing their Creators, Authors and Presenter’s superb NDSS Symposium 2025 Conference content on the organization’s’ YouTube channel.
The post NDSS 2025 – Safety Misalignment Against Large Language Models appeared first on Security Boulevard.