Breaking LLM Agents

August 4, 2025 2 minute read

As LLM agents spread into products and embodied systems, security and privacy risks grow in both scope and impact. The table below collects representative papers on attacks against, and defenses for, agentic systems in both chat and robotics settings.

Papers published in this domain

ID	Publish Date	Title	Authors	PDF	Code
1	2025-01-28	Context is Key for Agent Security	Author Name et al.	2501.17070	null
2	2025-03-26	ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning	Author Name et al.	2503.22738	null
3	2024-05-08	AirGapAgent: Protecting Privacy-Conscious Conversational Agents	Author Name et al.	2405.05175	null
4	2024-10-17	RoboPair	Author Name et al.	2410.13691	null
5	2025-03-10	RoboGuard	Author Name et al.	2503.07885	null
6	2025-04-15	CEE	Author Name et al.	2504.13201	null
7	2025-09-27	Preventing Robotic Jailbreaking via Multimodal Domain Adaptation	Author Name et al.	2509.23281	null
8	2025-08-23	Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents	Author Name et al.	2508.17155	null
9	2024-03-28	JailbreakBench	Author Name et al.	2404.01318	null
10	2024-02-06	HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal	Author Name et al.	2402.04249	null
11	2024-05-13	RobustNav	Author Name et al.	2405.07890	null
12	2024-11-04	AgentHarm	Author Name et al.	2410.09024	null
13	2024-10-09	ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents	Author Name et al.	2410.06703	null
14	2024-09-09	AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions	Author Name et al.	2409.05678	null
15	2024-08-19	ARE Meta	Author Name et al.	2408.09876	null
17	2024-06-08	Adversarial Attacks on Robotic Vision Language Action Models	Author Name et al.	2406.05432	null
18	2024-07-18	TAP	Author Name et al.	Link	null
19	2024-07-02	GCG	Author Name et al.	2307.15043	null
20	2024-04-17	TurkingBench	Author Name et al.	2404.11234	null
21	2024-03-14	Google's Approach to Protecting Privacy In the Age of AI	Author Name et al.	Link	null
22	2024-09-27	Securing MCP-based Agent Workflows	Author Name et al.	Link	null
23	2024-05-27	Dolphins	Author Name et al.	Link	null
24	2024-08-16	Sound Image Perturb	Author Name et al.	Link	null
25	2024-10-04	Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective	Author Name et al.	Link	null
26	2024-07-02	BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks	Author Name et al.	2407.02421	null
27	2024-03-07	Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference	Author Name et al.	2403.04132	null
28	2025-09-30	CHAI: Command Hijacking against embodied AI	Author Name et al.	2510.00181	null
29	2025-09-29	SecInfer: Preventing Prompt Injection via Inference-time Scaling	Author Name et al.	2509.24967	null
30	2025-10-01	WAInjectBench: Benchmarking Prompt Injection Detections for Web Agents	Author Name et al.	2510.01354	null
31	2024-09-09	Adversarial Attacks on Robotic Vision Language Action Models	Author Name et al.	2409.05678	null
32	2024-08-30	OmniVLA: An Omni-Modal Vision-Language-Action Model for Robot Navigation	Author Name et al.	Link	null
33	2024-07-11	NaviLA	Author Name et al.	Link	null
34	2024-12-04	Emerging Risks from Embodied AI Require Urgent Policy Action	Author Name et al.	2509.00117	null
35	2024-06-20	RoboCop: A Robust Zero-Day Cyber-Physical Attack Detection Framework for Robots	Author Name et al.	2406.14789	null
36	2024-11-10	RoboGuardZ: A Scalable Zero-Shot Framework for Detecting Zero-Day Malware in Robots	Author Name et al.	Link	null
37	2025-03-11	Google Semantic Safety	Author Name et al.	2503.08663	null
38	2024-09-05	Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications	Author Name et al.	Link	null
39	2025-03-18	GROOT1	Author Name et al.	2503.14734	null
40	2024-08-06	Simulation Control Visual SysID	Author Name et al.	Link	null
41	2025-09-25	Can AI Perceive Physical Danger and Intervene?	Author Name et al.	2509.21651	null

Doguhan Yeke