Randall Munroe’s XKCD ‘’Physics Paths”
via the comic artistry and dry wit of Randall Munroe, creator of XKCD
The post Randall Munroe’s XKCD ‘’Physics Paths” appeared first on Security Boulevard.
via the comic artistry and dry wit of Randall Munroe, creator of XKCD
The post Randall Munroe’s XKCD ‘’Physics Paths” appeared first on Security Boulevard.
5 min readAgentic AI systems need comprehensive guardrails to deploy safely at scale. Learn how access controls, policy frameworks, and visibility enable automation.
The post Why Agentic AI Needs Guardrails to Thrive appeared first on Aembit.
The post Why Agentic AI Needs Guardrails to Thrive appeared first on Security Boulevard.
7 min readExplore the profound shift to agentic AI, its unprecedented automation capabilities, and the critical security and governance challenges it introduces. Learn how to secure autonomous systems.
The post The Promise and Perils of Agentic AI: Autonomy at Scale appeared first on Aembit.
The post The Promise and Perils of Agentic AI: Autonomy at Scale appeared first on Security Boulevard.
AI has fundamentally changed how we think about both innovation and risk. It’s driving new breakthroughs in medicine, design, and productivity, but it’s also giving attackers a sharper edge. Ransomware isn’t just about encrypting data anymore. It’s about double extortion, data theft, and the erosion of trust that organizations depend on to operate. As threat..
The post Rethinking Cyber Resilience in the Age of AI appeared first on Security Boulevard.
Discover the best Application Security Testing (AST) services in 2025.
The post Best Application Security Testing Services to Know appeared first on Security Boulevard.
KubeCon + CloudNativeCon North America 2025 is almost here, and whether you’re a cloud-native newcomer, seasoned SRE, or Kubernetes fan, Atlanta will be full of energy this month. The conference isn’t just for hardcore technologists, either. It’s designed for anyone interested in how cloud-native technology is shaping modern software and infrastructure. Here’s what to expect, which sessions we’ve circled on our schedules, and how to maximize your KubeCon + CloudNativeCon adventure this year.
The post KubeCon + CloudNativeCon North America 2025 — Must-See Sessions appeared first on Security Boulevard.
SESSION
Session 2A: LLM Security
Authors, Creators & Presenters: Yichen Gong (Tsinghua University), Delong Ran (Tsinghua University), Xinlei He (Hong Kong University of Science and Technology (Guangzhou)), Tianshuo Cong (Tsinghua University), Anyu Wang (Tsinghua University), Xiaoyun Wang (Tsinghua University)
PAPER
Safety Misalignment Against Large Language Models
The safety alignment of Large Language Models (LLMs) is crucial to prevent unsafe content that violates human values. To ensure this, it is essential to evaluate the robustness of their alignment against diverse malicious attacks. However, the lack of a large-scale, unified measurement framework hinders a comprehensive understanding of potential vulnerabilities. To fill this gap, this paper presents the first comprehensive evaluation of existing and newly proposed safety misalignment methods for LLMs. Specifically, we investigate four research questions: (1) evaluating the robustness of LLMs with different alignment strategies, (2) identifying the most effective misalignment method, (3) determining key factors that influence misalignment effectiveness, and (4) exploring various defenses. The safety misalignment attacks in our paper include system-prompt modification, model fine-tuning, and model editing. Our findings show that Supervised Fine-Tuning is the most potent attack but requires harmful model responses. In contrast, our novel Self-Supervised Representation Attack (SSRA) achieves significant misalignment without harmful responses. We also examine defensive mechanisms such as safety data filter, model detoxification, and our proposed Self-Supervised Representation Defense (SSRD), demonstrating that SSRD can effectively re-align the model. In conclusion, our unified safety alignment evaluation framework empirically highlights the fragility of the safety alignment of LLMs.
Our thanks to the Network and Distributed System Security (NDSS) Symposium for publishing their Creators, Authors and Presenter’s superb NDSS Symposium 2025 Conference content on the organization’s’ YouTube channel.
The post NDSS 2025 – Safety Misalignment Against Large Language Models appeared first on Security Boulevard.
For years, the DoD has lost sensitive Controlled Unclassified Information (CUI) through breaches in the Defense Industrial Base (DIB). Adversaries targeted smaller, less secure subcontractors to steal valuable intellectual property tied to weapons and technology. The Cybersecurity Maturity Model Certification (CMMC) was created to stop these leaks by enforcing a unified cybersecurity standard across the entire defense supply chain.
The post CMMC 2.0 in Action: Operationalizing Secure Software Practices Across the Defense Industrial Base appeared first on Security Boulevard.
Honored by The Australian Financial Review’s 14th annual awards in the Technology category
The post Kasada Named Finalist in AFR BOSS Most Innovative Companies List appeared first on Security Boulevard.
Two former cybersecurity pros were indicted with conspiring with a third unnamed co-conspirator of using the high-profile BlackCat ransomware to launch attacks in 2023 against five U.S. companies to extort payment in cryptocurrency and then splitting the proceeds.
The post Security Experts Charged with Launching BlackCat Ransomware Attacks appeared first on Security Boulevard.
For many in the research community, it’s gotten harder to be optimistic about the impacts of artificial intelligence.
As authoritarianism is rising around the world, AI-generated “slop” is overwhelming legitimate media, while AI-generated deepfakes are spreading misinformation and parroting extremist messages. AI is making warfare more precise and deadly amidst intransigent conflicts. AI companies are exploiting people in the global South who work as data labelers, and profiting from content creators worldwide by using their work without license or compensation. The industry is also affecting an already-roiling climate with its ...
The post Scientists Need a Positive Vision for AI appeared first on Security Boulevard.
Threat actors are working with organized crime groups to target freight operators and transportation companies, infiltrate their systems through RMM software, and steal cargo, which they then sell online or ship to Europe, according to Proofpoint researchers, who saw similar campaigns last year.
The post Hackers Targeting Freight Operators to Steal Cargo: Proofpoint appeared first on Security Boulevard.
Those who follow the DNS abuse landscape closely may have noticed a rise in activity and abuse reports related to TDS. The use of this infrastructure for malicious purposes is becoming increasingly common. In this blog, we look at how TDS are being exploited to facilitate abuse, why they present challenges for takedowns, and what we can do as a community to address the problem.
The post Traffic Distribution System (TDS) abuse – What’s hiding behind the veil? appeared first on Security Boulevard.
Tenable Research has discovered seven vulnerabilities and attack techniques in ChatGPT, including unique indirect prompt injections, exfiltration of personal user information, persistence, evasion, and bypass of safety mechanisms.
Key takeaways:Prompt injections are a weakness in how large language models (LLMs) process input data. An attacker can manipulate the LLM by injecting instructions into any data it ingests, which can cause the LLM to ignore the original instructions and perform unintended or malicious actions instead. Specifically, indirect prompt injection occurs when an LLM finds unexpected instructions in an external source, such as a document or website, rather than a direct prompt from the user. Since prompt injection is a well-known issue with LLMs, many AI vendors create safeguards to help mitigate and protect against it. Nevertheless, we discovered several vulnerabilities and techniques that significantly increase the potential impact of indirect prompt injection attacks. To better understand the discoveries, we will first cover some technical details about how ChatGPT works. (To get right to the discoveries, click here.)
System promptEvery ChatGPT model has a set of instructions created by OpenAI that outline the capabilities and context of the model before its conversation with the user. This is called a System Prompt. Researchers often use techniques for extracting the System Prompt from ChatGPT (as can be seen here), giving insight into how the LLM works. When looking at the System Prompt, we can see that ChatGPT has the capability to retain information across conversations using the bio tool, or, as ChatGPT users may know it, memories. The context from the user’s memories is appended to the System Prompt, giving the model access to any (potentially private) information deemed important in previous conversations. Additionally, we can see that ChatGPT has access to a web tool, allowing it to access up-to-date information from the internet based on two commands: search and open_url.
The bio tool, aka memoriesThe ChatGPT memory feature mentioned above is enabled by default. If the user asks it to remember something, or if there is some information that the engine deems important even without an explicit request, it can be remembered using memories. As is seen in the System Prompt, the memories are invoked internally using the bio tool and sent as a static context along with it. It is important to note that memories could contain private information about the user. Memories are shared between conversations and considered by the LLM before each response. It is also possible to have a memory about the type of response you want, which will be taken into account whenever ChatGPT responds.
In addition to its long-term memory feature, ChatGPT considers the current conversation and context when responding. It can refer to previous requests and messages or follow a line of thinking. To avoid confusion, we will refer to this type of memory as Conversational Context.
The web toolWhile researching ChatGPT, we discovered some information about how the web tool works. If ChatGPT gets a URL directly from the user or decides it needs to visit a specific URL, it will do so with the web tool's open_url functionality, which we will refer to as Browsing Context. When doing so, it will usually use the ChatGPT-User user agent. It quickly became clear to us that there is some kind of cache mechanism for such browsing, since when we asked about a URL that was already opened, ChatGPT would respond without browsing again.
Based on our experimentation, ChatGPT is extremely susceptible to prompt injection while browsing, but we concluded that open_url actually hands the responsibility of browsing to an alternative LLM named SearchGPT, which has significantly fewer capabilities and understanding of the user context. Sometimes ChatGPT will respond with the output of SearchGPT’s browsing results as-is, and sometimes it will take the full output and modify its reply based on the question. As a method of isolation, SearchGPT has no access to the user’s memories or context. Therefore, despite being susceptible to prompt injection in the Browsing Context, the user should, theoretically, be safe, as SearchGPT is doing the browsing.
In this example, the user has a memory stating that responses should include emojis. Source: Tenable, November 2025 Since SearchGPT doesn’t have access to memories, they are not addressed when it responds. Source: Tenable, November 2025.The other end of the web tool is the search command, used by ChatGPT to invoke an internet search whenever a user enters a prompt that requires it. ChatGPT uses a proprietary search engine to find and return results based on up-to-date information that may have been published after the model’s training cutoff date. A user can choose this feature with the dedicated “Web search” button; if the user doesn’t select this feature, a search is conducted at the LLM’s discretion. ChatGPT might send a few queries or change the wording of the search in an attempt to optimize the results, which are returned as a list of websites and snippets. If possible, it will respond solely based on the information in the result snippets, but if that information is insufficient, it might browse using the open_url command to some of the sites in order to investigate further. It seems that part of the indexing is done by Bing, and part is done by OpenAI using their crawler with OAI-Search as its user agent. We don’t know the distinction in the responsibilities of OpenAI and Bing. We will refer to this usage of the search command as Search Context.
An example of a web search and its results. Source: Tenable, November 2025. The url_safe endpointSince prompt injection is such a prevalent issue, AI vendors are constantly trying to mitigate the potential impact of these attacks by developing safety features to protect user data. Much of the potential impact of prompt injection stems from having the AI respond with URLs, which could be used to direct the user to a malicious website or exfiltrate information with image markdown rendering. OpenAI has attempted to address this issue with an endpoint named url_safe, which checks most URLs before they are shown to the user and uses proprietary logic to decide whether the URL is safe or not. If it is deemed unsafe, the link is omitted from the output.
Based on our research, some of the parameters that are checked include:
When diving into ChatGPT’s Browsing Context, we wondered how malicious actors could exploit ChatGPT’s susceptibility to indirect prompt injection in a way that would align with a legitimate use case. Since one of the primary use cases for the Browsing Context is summarizing blogs and articles, our idea was to inject instructions in the comment section. We created our own blogs with dummy content and then left a message for SearchGPT in the comments section. When asked to summarize the contents of the blog, SearchGPT follows the malicious instructions from the comment, compromising the user. (We elaborate on the specific impact to the user in the Full Attack Vector PoCs section below.) The potential reach of this vulnerability is tremendous, since attackers could spray malicious prompts in comment sections on popular blogs and news sites, compromising countless ChatGPT users.
2. 0-click indirect prompt injection vulnerability in Search ContextWe’ve proven that we can inject a prompt when the user asks ChatGPT to browse to a specific website, but what about attacking a user just for asking a question? We know that ChatGPT’s web search results are based on Bing and OpenAI’s crawler, so we wondered: What would happen if a site with a prompt injection were to be indexed? In order to prove our theory, we created some websites about niche topics with specific names in order to narrow down our results, such as a site containing some humorous information about our team with the domain llmninjas.com. We then asked ChatGPT for information about the LLM Ninjas team and were pleased to see that our site was sourced in the response.
Having only a prompt injection on your site would make it much less likely to be indexed by Bing, so we created a fingerprint for SearchGPT based on the headers and user agent it uses to browse, and only served the prompt injection when SearchGPT was the one browsing. Voila! After the change we made was indexed by OpenAI’s crawler, we were able to achieve the final level of prompt injection and inject a prompt just by the victim asking a simple question!
Hundreds of millions of users ask LLMs questions that require searching the web, and it seems that LLMs will eventually replace classic search engines. This unprecedented 0-click vulnerability opens a whole new attack vector that could target anyone who relies on AI search for information. AI vendors are relying on metrics like SEO scores, which are not security boundaries, to choose which sources to trust. By hiding the prompt in tailor-made sites, attackers could directly target users based on specific topics or political and social trends.
The output of the LLM is manipulated (as noted by “TCS Research POC Success!”), compromising the user for asking an innocent question. Source: Tenable, November 2025 3. Prompt injection vulnerability via 1-clickThe final and simplest method of prompt injection is through a feature that OpenAI created, which allows users to prompt ChatGPT by browsing to https://chatgpt.com/?q={Prompt}. We found that ChatGPT will automatically submit the query in the q= parameter, leaving anyone who clicks that link vulnerable to a prompt injection attack.
4. Safety mechanism bypass vulnerabilityDuring our research of the url_safe endpoint, we noticed that bing.com was a whitelisted domain, and always passed the url_safe check. It turns out that search results on Bing are served through a wrapped tracking link that redirects the user from a static bing.com/ck/a link to the requested website. That means that any website that is indexed on Bing has a bing.com URL that will redirect to it.
When searching using Bing, if we hover over the results, we can see that they redirect to bing.com/ck/a links. Source: Tenable, November 2025.By indexing some test websites to Bing, we were able to extract their static tracking links and use them to bypass the url_safe check, allowing our links to be fully rendered. The Bing tracking links cannot be altered, so a single link cannot extract information that we did not know in advance. Our solution was to index a page for every letter in the alphabet and then use those links to exfiltrate information one letter at a time. For example, if we want to exfiltrate the word “Hello”, ChatGPT would render the Bing links for H, E, L, L, and O sequentially in its response.
5. Conversation Injection techniqueEven with the url_safe bypass above, we cannot use prompt injection alone to exfiltrate anything of value, since SearchGPT has no access to user data.. We wondered: How could we get control over ChatGPT’s output when we only have direct access to SearchGPT’s output? Then we remembered Conversational Context. ChatGPT remembers the entire conversation when responding to user prompts. If it were to find a prompt on its “side” of the conversation, would it still listen? So we used our SearchGPT prompt injection to ensure the response ends with another prompt for ChatGPT in a novel technique we dubbed Conversation Injection. When responding to the following prompts, ChatGPT will go over the Conversational Context, see, and listen to the instructions we injected, not realizing that SearchGPT wrote them. Essentially, ChatGPT is prompt-injecting itself.
We inject a prompt to SearchGPT, which in turn injects a prompt to ChatGPT within its response. Source: Tenable, November 2025. 6. Malicious content hiding techniqueOne of the issues with the Conversation Injection technique is that the output from SearchGPT appears clearly to the user, which will raise a lot of suspicion. We discovered a bug with how the ChatGPT website renders markdown that can allow us to hide the malicious content. When rendering code blocks, any data that appears on the same line as the code block opening (past the first word) does not get rendered. This means that unless copied, the response will look completely innocent to the user, despite containing the malicious context, which will be read by ChatGPT.
All of the content after the first word of the code block opening line is hidden from the user. Source: Tenable, November 2025. 7. Memory injection techniqueAnother issue with Conversation Injection is that it only persists for the current conversation. But what if we wanted persistence between conversations? We found that, similarly to Conversation Injection, SearchGPT can actually get ChatGPT to update its memories, allowing us to create an exfiltration that will happen for every single response. This injection creates a persistent threat that will continue to leak user data even between sessions, days, and data changes.
We get SearchGPT to make ChatGPT update its memories, as noted by ‘Memory updated.'Source: Tenable, November 2025 Full attack vector PoCsBy mixing and matching all of the vulnerabilities and techniques we discovered, we were able to create proofs of concept (PoCs) for multiple complete attack vectors, such as indirect prompt injection, bypassing safety features, exfiltrating private user information, and creating persistence.
ChatGPT 4o PoC: Phishing
ChatGPT 5 PoC: Phishing success
ChatGPT 5 PoC: Comment success
ChatGPT 4o PoC: LLM Ninjas
ChatGPT 5 PoC: Search GPT LLM Ninjas
ChatGPT 5 PoC: Memory injection
Tenable Research has disclosed all of these issues to OpenAI and directly worked with them to fix some of the vulnerabilities. The associated TRAs are:
The majority of the research was done on ChatGPT 4o, but OpenAI is constantly tuning and improving their platform, and has since launched ChatGPT 5. The researchers have been able to confirm that several of the PoCs and vulnerabilities are still valid in ChatGPT 5, and ChatGPT 4o is still available for use based on user preference. Prompt injection is a known issue with the way that LLMs work, and, unfortunately, it will probably not be fixed systematically in the near future. AI vendors should take care to ensure that all of their safety mechanisms (such as url_safe) are working properly to limit the potential damage caused by prompt injection.
Note: This blog includes research conducted by Yarden Curiel.
The post HackedGPT: Novel AI Vulnerabilities Open the Door for Private Data Leakage appeared first on Security Boulevard.
AI-driven social engineering is transforming cyberattacks from costly, targeted operations into scalable, automated threats. As generative models enable realistic voice, video, and text impersonation, organizations must abandon stored secrets and move toward cryptographic identity systems to defend against AI-powered deception.
The post In an AI World, Every Attack is a Social Engineering Attack appeared first on Security Boulevard.
The Salesloft Drift OAuth token breach compromised Salesforce data across hundreds of enterprises, including Cloudflare, Zscaler, and Palo Alto Networks. Learn how attackers exploited OAuth tokens, the risks of connected app misuse, and key steps to strengthen Salesforce and multi-cloud security.
The post Salesloft Drift Breaches: Your Complete Response Guide appeared first on Security Boulevard.
What is the CAIF? The Centraleyes AI Framework (CAIF) is a comprehensive compliance and governance tool designed to help organizations meet the diverse and rapidly evolving regulatory requirements surrounding artificial intelligence. It consolidates questions and controls from multiple AI laws and regulatory regimes across the globe – including the EU AI Act (Minimal and Limited […]
The post Centraleyes AI Framework (CAIF) appeared first on Centraleyes.
The post Centraleyes AI Framework (CAIF) appeared first on Security Boulevard.
Learn how to build secure, enterprise-ready SaaS applications. This guide covers development, ops, and product security best practices for meeting enterprise requirements.
The post Enterprise Ready SaaS Application Guide to Product Security appeared first on Security Boulevard.
Can Understanding Non-Human Identities (NHIs) Really Help Relieve Cloud Compliance Stress? Navigating the complexities of cloud compliance can often feel overwhelming for organizations across various sectors. With the growing adoption of cloud services, ensuring compliant and secure environments has become a pivotal task. Can the management of Non-Human Identities (NHIs) be the key to reduce […]
The post Relieving Stress in Cloud Compliance: How NHIs Help appeared first on Entro.
The post Relieving Stress in Cloud Compliance: How NHIs Help appeared first on Security Boulevard.
How Can Smart NHI Management Enhance Cybersecurity? Managing Non-Human Identities (NHIs) may seem like an abstract task, yet its significance in bolstering cybersecurity cannot be overstated. With the shift towards digital transformation, NHIs have become an integral part of many organizations’ network. What role do these machine identities play in staying ahead of cybersecurity threats? […]
The post Staying Ahead of Threats with Smart NHIs appeared first on Entro.
The post Staying Ahead of Threats with Smart NHIs appeared first on Security Boulevard.