| JB-2026-Q3-001 | Jailbreak | Many-Shot Jailbreaking (Disinformation): Repeated adversarial exemplars override RLHF safety training; 100% ASR on disinformation across all 6 tested LLMs | GPT-5 · GPT-4o · Llama 4 · All LLMs | CRITICAL | Active | Limit the number of shots accepted in the context window; run an output classifier for factual grounding | Cloudsine Apr 2026 |
| PI-2026-Q3-001 | Injection | Indirect Injection via Web Browsing Agents: Hidden instructions in webpages hijack autonomous browsing agents (Mozilla Tabstack, Cotypist) mid-task | All LLMs with web-browsing tools | CRITICAL | Active | Sandbox agent web access; validate retrieved content before it is inserted into the prompt | Brave Research Jun 2026 |
| PI-2026-Q3-002 | Injection | RAG Document Prompt Injection: Malicious documents planted in the retrieval pipeline override the system prompt; 82% redirect success rate in OWASP testing | All RAG deployments | CRITICAL | Active | BERT-based injection classifier on retrieved chunks before prompt concatenation | OWASP RAG Top 10 |
| PI-2026-Q3-003 | Injection | Knowledge Base Poisoning via Unverified Sources: An attacker writes to a low-privilege wiki or API; the RAG pipeline ingests and propagates the false data (e.g. a healthcare drug-interaction incident) | All RAG deployments | CRITICAL | Active | Source integrity scoring; author verification; document provenance graph | OWASP RAG Top 10 |
| JB-2026-Q3-002 | Jailbreak | Persona Impersonation: Prompts coerce the model into generating first-person statements as real world leaders; >50% ASR, with no refusal triggered in GPT-4o | GPT-4o · GPT-5 | HIGH | Active | Output classifier for real-person impersonation; constitutional AI system prompt | Cloudsine Apr 2026 |
| JB-2026-Q3-003 | Jailbreak | Crescendo Jailbreak: Multi-turn escalation in which the model is gradually led from benign conversation to harmful outputs; bypasses single-turn guardrails | Claude 3/4 · GPT-4o · All LLMs | HIGH | Active | Conversation-level context analysis; cross-turn safety evaluation | Wraith Field Guide |
| JB-2026-Q3-004 | Jailbreak | Refusal Suppression via Fake Policy Injection: The system prompt is prefixed with a forged "policy update" that grants permissions the real policy denies; exploits the instruction hierarchy | Claude 3/4 · GPT-4o · GPT-5 | HIGH | Active | Immutable system prompt header; cryptographic prompt signing; privilege ordering | Wraith Field Guide |
| JB-2026-Q3-005 | Jailbreak | Encoding-Based Guardrail Bypass: Harmful instructions encoded in Base64, ROT13, or Unicode homoglyphs bypass keyword-based safety filters | GPT-4o · Llama 4 · Open-source LLMs | HIGH | Mitigated (partial) | Decode and normalise all input before safety evaluation; semantic classifiers over keyword filters | Wraith Field Guide |
| JB-2026-Q3-006 | Jailbreak | Prompt Injection via Instruction Override: Direct override prompts ("Ignore previous instructions") succeeded in ~33% of Llama 4 Scout tests; the model acknowledged the injection, then complied | Llama 4 Scout · Open-source LLMs | HIGH | Active | Strict instruction hierarchy; system/user context separation; RLHF-trained refusal | Cloudsine Apr 2026 |
| JB-2026-Q3-007 | Jailbreak | Roleplay-Based Safety Filter Bypass: Fictional framing bypasses safety filters in roughly 1 in 5 cases; more effective against open-source models with weaker instruction tuning | Open-source LLMs · Claude (lower risk) | HIGH | Active | Detect roleplay framing in input; evaluate fictional output for real-world harm equivalence | Cloudsine Apr 2026 |
| JB-2026-Q3-008 | Jailbreak | System Prompt & Architecture Probing: Indirect injection leaks internal token delimiters and role boundary markers; allows partial reconstruction of model architecture | Llama 4 Scout · Open-source LLMs | MEDIUM | Active | Never expose raw system prompt; strip internal delimiters from visible output | Cloudsine Apr 2026 |
| PI-2026-Q3-004 | Injection | RAG Cache Cross-User Data Leakage: Semantically similar queries return cached results from higher-privilege sessions, bypassing document ACLs | All RAG deployments | HIGH | Active | Per-user/per-role retrieval cache scoping; privilege-gated cache keys in Redis | OWASP RAG Top 10 |
| PI-2026-Q3-005 | Injection | Context Confusion via Adversarial Documents: Adversarial seeding of 7% of the knowledge base causes 63% wrong answers at high model confidence (Stanford research) | All RAG deployments | HIGH | Active | Consistency re-ranking with entailment model; cross-document contradiction detection | OWASP RAG Top 10 |
| PI-2026-Q3-006 | Injection | Agentic Tool Abuse via Hijacked Context: A compromised context window causes the agent to invoke unauthorised tools; 97% of security leaders expect a material incident in 2026 | Claude Agents · OpenAI Assistants | HIGH | Active | Allowlist tool access per agent identity; time-bounded credentials; human-in-the-loop for irreversible actions | NHI Group Jun 2026 |
| PI-2026-Q3-007 | Injection | Agentic RAG Chain-of-Thought Leakage: Multi-turn RAG reasoning traces expose knowledge base structure in 44% of applications; enables targeted retrieval probing | All agentic RAG systems | MEDIUM | Active | Suppress CoT trace from user output; log internally only; monitor retrieval patterns for probing behaviour | OWASP RAG Top 10 |
| CVE-2026-24207 | CVE | NVIDIA Triton Inference Server Auth Bypass: Authentication bypass enabling code execution, privilege escalation, data tampering, DoS, or information disclosure | NVIDIA Triton · AI inference infra | HIGH | Active | Patch immediately; restrict Triton endpoint to internal network only | elvis.hk CVE Trends |
| CVE-2026-42271 | CVE | BerriAI LiteLLM Command Injection (KEV): Command injection in the LiteLLM AI/LLM gateway; listed in CISA KEV as actively exploited in the wild | LiteLLM · AI API gateway users | HIGH | KEV Listed | Apply vendor patch immediately; isolate LiteLLM endpoint; audit command execution paths | CISA KEV via elvis.hk |
| CVE-2026-48027 | CVE | Nx Console VSCode Extension Supply Chain Compromise (KEV): Malicious code injected into Nx Console; targets AI/ML development environments via VSCode extension supply chain | AI/ML developers · VSCode users | HIGH | KEV Listed | Remove affected extension; audit VSCode extension supply chain; pin extension versions | CISA KEV via elvis.hk |
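
The mitigation listed for PI-2026-Q3-004 (per-user/per-role cache scoping with privilege-gated keys) could be sketched roughly as below. This is a minimal illustration and not a reference implementation: the function name `scoped_cache_key`, the `rag:cache` namespace, and the key layout are all assumptions, and a plain dictionary stands in for Redis. The idea is that the caller's identity and role set are hashed into the key, so a cached answer from a higher-privilege session cannot be returned to a different scope.

```python
import hashlib
import json


def scoped_cache_key(query: str, user_id: str, roles: list[str],
                     namespace: str = "rag:cache") -> str:
    """Build a cache key bound to the caller's identity and role set.

    Hypothetical helper: the key layout is an assumption, not a standard.
    """
    # Collapse case and whitespace so trivially different queries share a key
    # *within* one scope, but never across scopes.
    normalised_query = " ".join(query.lower().split())
    # Sort and de-duplicate roles so role ordering does not change the key.
    scope = json.dumps({"user": user_id, "roles": sorted(set(roles))},
                       sort_keys=True)
    scope_digest = hashlib.sha256(scope.encode("utf-8")).hexdigest()[:16]
    query_digest = hashlib.sha256(normalised_query.encode("utf-8")).hexdigest()
    return f"{namespace}:{scope_digest}:{query_digest}"


# A dict stands in for the real cache backend in this sketch.
cache: dict[str, str] = {}

admin_key = scoped_cache_key("Q3 salary bands", "alice", ["admin", "hr"])
cache[admin_key] = "restricted answer"

viewer_key = scoped_cache_key("Q3 salary bands", "bob", ["viewer"])
leaked = cache.get(viewer_key)  # None: bob's scope never sees alice's entry
```

A semantic (embedding-similarity) cache would need the same partitioning: the similarity search would be restricted to entries that share the caller's scope prefix, rather than relying on exact key matches as this sketch does.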
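
For JB-2026-Q3-005, the "decode and normalise all input before safety evaluation" step might look something like the following sketch. It uses only the Python standard library; the helper names, the 16-character threshold for Base64 runs, and the tiny homoglyph table are illustrative assumptions (a real deployment would need a much fuller confusables table). The function returns several decoded "views" of the input so that a downstream safety classifier, which is not shown here, can evaluate each one.

```python
import base64
import binascii
import codecs
import re
import unicodedata

# Tiny illustrative map of Cyrillic look-alikes to Latin letters (assumption:
# a production system would use a complete confusables table).
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})

# Runs of 16+ Base64-alphabet characters; the threshold is an arbitrary choice.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def normalise(text: str) -> str:
    """Apply NFKC folding (e.g. fullwidth forms) and the homoglyph map."""
    return unicodedata.normalize("NFKC", text).translate(HOMOGLYPHS)


def decode_base64_runs(text: str) -> list[str]:
    """Return the UTF-8 text of every run that decodes cleanly as Base64."""
    decoded = []
    for run in BASE64_RUN.findall(text):
        try:
            decoded.append(base64.b64decode(run, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # Not valid Base64 or not text; ignore this run.
    return decoded


def views_for_safety_check(text: str) -> list[str]:
    """Return the normalised input, its ROT13 form, and decoded Base64 runs."""
    clean = normalise(text)
    views = [clean, codecs.decode(clean, "rot_13")]
    views.extend(normalise(d) for d in decode_base64_runs(clean))
    return views
```

Each returned view would then be passed to the same semantic classifier as the raw input. Nested or chained encodings are not handled by this sketch.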