AI Infrastructure Security Hub

Verified Threat Registry

Jailbreak techniques and prompt injection vectors — sourced from public red-team disclosures, academic research, and elvis.hk intelligence · Q3 2026

ID	Type	Technique / Vector	Targets	Severity	Status	Mitigation	Source
JB-2026-Q3-001	Jailbreak	Many-Shot Jailbreaking (Disinformation) Repeated adversarial exemplars override RLHF safety; 100% ASR on disinformation across all 6 tested LLMs	GPT-5 · GPT-4o · Llama 4 · All LLMs	CRITICAL	Active	Limit context window shots; classifier on output for factual grounding	Cloudsine Apr 2026 →
PI-2026-Q3-001	Injection	Indirect Injection via Web Browsing Agents Hidden instructions in webpages hijack autonomous browsing agents (Mozilla Tabstack, Cotypist) mid-task	All LLMs with web-browsing tools	CRITICAL	Active	Sandbox agent web access; validate retrieved content before prompt injection	Brave Research Jun 2026 →
PI-2026-Q3-002	Injection	RAG Document Prompt Injection Malicious documents planted in retrieval pipeline override system prompt; 82% redirect success rate in OWASP testing	All RAG deployments	CRITICAL	Active	BERT-based injection classifier on retrieved chunks before prompt concat	OWASP RAG Top 10 →
PI-2026-Q3-003	Injection	Knowledge Base Poisoning via Unverified Sources Attacker writes to low-privilege wiki/API; RAG ingests and propagates false data (e.g. healthcare drug interaction incident)	All RAG deployments	CRITICAL	Active	Source integrity scoring; author verification; document provenance graph	OWASP RAG Top 10 →
JB-2026-Q3-002	Jailbreak	Persona Impersonation Prompts coerce model to generate first-person statements as real world leaders; >50% ASR, no refusal triggered in GPT-4o	GPT-4o · GPT-5	HIGH	Active	Output classifier for real-person impersonation; constitutional AI system prompt	Cloudsine Apr 2026 →
JB-2026-Q3-003	Jailbreak	Crescendo Jailbreak Multi-turn escalation — model is gradually led from benign conversation to harmful outputs; bypasses single-turn guardrails	Claude 3/4 · GPT-4o · All LLMs	HIGH	Active	Conversation-level context analysis; cross-turn safety evaluation	Wraith Field Guide →
JB-2026-Q3-004	Jailbreak	Refusal Suppression via Fake Policy Injection System prompt is prefixed with forged "policy update" that grants permissions the real policy denies; exploits instruction hierarchy	Claude 3/4 · GPT-4o · GPT-5	HIGH	Active	Immutable system prompt header; cryptographic prompt signing; privilege ordering	Wraith Field Guide →
JB-2026-Q3-005	Jailbreak	Encoding-Based Guardrail Bypass Harmful instructions encoded in Base64, ROT13, or Unicode homoglyphs bypass keyword-based safety filters	GPT-4o · Llama 4 · Open-source LLMs	HIGH	Mitigated (partial)	Decode and normalise all input before safety evaluation; semantic classifiers over keyword filters	Wraith Field Guide →
JB-2026-Q3-006	Jailbreak	Prompt Injection via Instruction Override Direct override prompts ("Ignore previous instructions") worked in ~33% of Llama 4 Scout tests; model acknowledged injection then complied	Llama 4 Scout · Open-source LLMs	HIGH	Active	Strict instruction hierarchy; system/user context separation; RLHF-trained refusal	Cloudsine Apr 2026 →
JB-2026-Q3-007	Jailbreak	Roleplay-Based Safety Filter Bypass Fictional framing bypasses safety filters in ~1-in-5 cases; more effective against open-source models with weaker instruction tuning	Open-source LLMs · Claude (lower risk)	HIGH	Active	Detect roleplay framing in input; evaluate fictional output for real-world harm equivalence	Cloudsine Apr 2026 →
JB-2026-Q3-008	Jailbreak	System Prompt & Architecture Probing Indirect injection leaks internal token delimiters and role boundary markers; allows partial reconstruction of model architecture	Llama 4 Scout · Open-source LLMs	MEDIUM	Active	Never expose raw system prompt; strip internal delimiters from visible output	Cloudsine Apr 2026 →
PI-2026-Q3-004	Injection	RAG Cache Cross-User Data Leakage Semantically similar queries return cached results from higher-privilege sessions, bypassing document ACLs	All RAG deployments	HIGH	Active	Per-user/per-role retrieval cache scoping; privilege-gated cache keys in Redis	OWASP RAG Top 10 →
PI-2026-Q3-005	Injection	Context Confusion via Adversarial Documents 7% adversarial seeding of knowledge base causes 63% wrong answers at high model confidence (Stanford research)	All RAG deployments	HIGH	Active	Consistency re-ranking with entailment model; cross-document contradiction detection	OWASP RAG Top 10 →
PI-2026-Q3-006	Injection	Agentic Tool Abuse via Hijacked Context Compromised context window causes agent to invoke unauthorised tools; 97% of security leaders expect a material incident in 2026	Claude Agents · OpenAI Assistants	HIGH	Active	Allowlist tool access per agent identity; time-bounded credentials; human-in-the-loop for irreversible actions	NHI Group Jun 2026 →
PI-2026-Q3-007	Injection	Agentic RAG Chain-of-Thought Leakage Multi-turn RAG reasoning traces expose knowledge base structure in 44% of applications; enables targeted retrieval probing	All agentic RAG systems	MEDIUM	Active	Suppress CoT trace from user output; log internally only; monitor retrieval patterns for probing behaviour	OWASP RAG Top 10 →
CVE-2026-24207	CVE	NVIDIA Triton Inference Server — Auth Bypass Authentication bypass enabling code execution, privilege escalation, data tampering, DoS, or information disclosure	NVIDIA Triton · AI inference infra	HIGH	Active	Patch immediately; restrict Triton endpoint to internal network only	elvis.hk CVE Trends →
CVE-2026-42271	CVE	BerriAI LiteLLM — Command Injection (KEV) Command injection in LiteLLM AI/LLM gateway; CISA KEV listed — actively exploited in the wild	LiteLLM · AI API gateway users	HIGH	KEV Listed	Apply vendor patch immediately; isolate LiteLLM endpoint; audit command execution paths	CISA KEV via elvis.hk →
CVE-2026-48027	CVE	Nx Console VSCode Extension — Supply Chain Compromise (KEV) Malicious code injected into Nx Console; targets AI/ML development environments via VSCode extension supply chain	AI/ML developers · VSCode users	HIGH	KEV Listed	Remove affected extension; audit VSCode extension supply chain; pin extension versions	CISA KEV via elvis.hk →

Sources: elvis.hk · Cloudsine Apr 2026 · OWASP LLM RAG · Wraith Field Guide

Enterprise Hardening Controls

Priority controls for Claude and OpenAI enterprise deployments — based on OWASP LLM Top 10, Anthropic guidance, and Q3 2026 threat landscape

🔒

System Prompt Hardening

P1 · Critical Both

Classify system prompt as confidential — never expose to users
Implement prompt injection detection layer on all user input
Use privilege ordering: system → operator → user (reject overrides)
Strip internal token delimiters from any user-visible output
Test against fake-policy injection and crescendo attacks quarterly

Anthropic Guidance →

🛡️

RAG Pipeline Security

P1 · Critical Both

Pass every retrieved chunk through injection classifier before prompt concat
Assign source integrity scores — deprioritise unverified documents
Scope retrieval caches per user role — never share across privilege levels
Add consistency re-ranker to detect contradictory / adversarial documents
Log all retrieval sources and flag anomalies to SIEM

OWASP RAG Threats →

🤖

Agentic Workflow Security

P1 · Critical Both

Scope each agent to a dedicated identity with allowlisted tools only
Use time-bounded credentials that expire with the task session
Enforce human-in-the-loop for ALL irreversible actions
Sandbox code execution — no host filesystem or network access
Monitor agent action sequences for deviation from baseline behaviour

NHI Agentic Security →

🔑

Access Control & Identity

P2 · High Both

Per-user API key scoping — no shared org-wide tokens
Enforce MFA on all AI platform admin accounts
Rotate API keys on 90-day schedule; revoke immediately on offboarding
Implement RBAC for AI feature access (model tier, context window)
Audit API key usage logs weekly for anomalous patterns

OpenAI Safety Docs →

📊

Output Validation

P2 · High Both

Classify outputs for PII, secrets, and harmful content before delivery
Block markdown/code injection in rendered UI outputs
Validate structured outputs against strict JSON schema
Apply rate limiting on all generation endpoints
Log all completions for audit and forensic investigation

OWASP LLM Top 10 →

🔗

Supply Chain & Model Risk

P2 · High Both

Pin model versions — never auto-upgrade without security review
Vet all LangChain / LlamaIndex plugins before production deployment
Audit VSCode / IDE AI extensions (see CVE-2026-48027)
Maintain SBOM for all AI/ML dependencies
Assess third-party AI vendor sub-processors for data residency

CISA KEV AI Entries →

📡

Monitoring & Incident Response

P2 · High Both

Enable full prompt/completion audit logging with PII masking
Anomaly detection on tokens-per-request and request rate
Define AI-specific incident response runbook
Conduct quarterly red-team exercises against production AI endpoints
Subscribe to Anthropic and OpenAI security bulletins

Anthropic Research →

🏛️

Claude-Specific Controls

P3 · Medium Claude

Use Constitutional AI system prompt patterns per Anthropic docs
Structure system prompt to resist fake-policy injection
Review Claude tool-use security guidance before enabling function calls
Test quarterly against crescendo and roleplay jailbreak patterns
Monitor Anthropic's Responsible Scaling Policy updates

Anthropic Docs →

🤖

OpenAI-Specific Controls

P3 · Medium OpenAI

Enable Moderation API on all user-facing endpoints
Use Structured Outputs (JSON mode) to prevent generation-level injection
Test GPT-4o vision inputs for prompt injection via image content
Test against persona impersonation and encoding bypass patterns
Monitor OpenAI deprecated model timelines and security changelog

OpenAI Safety Docs →

AI/ML CVEs — Q3 2026

CVE-2026-24207 NVIDIA Triton HIGH Trending

CVE-2026-42271 BerriAI LiteLLM HIGH KEV Listed

CVE-2026-48027 Nx Console (VSCode) HIGH KEV Listed

CVE data from elvis.hk CVE Trends and CISA KEV feed

Downloads

The 2026 Enterprise AI Security Checklist for CISOs

A4 · PDF · 55 controls across 11 domains · Updated Q3 2026 · elvis.hk AI Infrastructure Security Hub

ClaudeOpenAIRAG SecurityAgentic AIOWASP LLM

Download PDF

Stay Ahead of AI Security Threats

Elvis.hk aggregates LLM security research, CISA KEV entries for AI tooling, OWASP updates, and red-team disclosures. This hub is refreshed every quarter with the latest verified intelligence.

Explore elvis.hk AI/ML CVEs

AI InfrastructureSecurity Hub

Verified Threat Registry

Enterprise Hardening Controls

System Prompt Hardening

RAG Pipeline Security

Agentic Workflow Security

Access Control & Identity

Output Validation

Supply Chain & Model Risk

Monitoring & Incident Response

Claude-Specific Controls

OpenAI-Specific Controls

AI/ML CVEs — Q3 2026

Downloads

The 2026 Enterprise AI Security Checklist for CISOs

Stay Ahead of AI Security Threats

AI Infrastructure
Security Hub