Breaking LLM Agents
As LLM agents spread into products and embodied systems, security and privacy risks grow in both scope and impact. The table below collects representative references on attacks and defenses for agentic systems, spanning chat and robotics.
Papers published in this domain
| ID | Publish Date | Title | Authors | arXiv ID / Link | Code |
|---|---|---|---|---|---|
| 1 | 2025-01-28 | Context is Key for Agent Security | Author Name et al. | 2501.17070 | null |
| 2 | 2025-03-26 | ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning | Author Name et al. | 2503.22738 | null |
| 3 | 2024-05-08 | AirGapAgent: Protecting Privacy-Conscious Conversational Agents | Author Name et al. | 2405.05175 | null |
| 4 | 2024-10-17 | RoboPair | Author Name et al. | 2410.13691 | null |
| 5 | 2025-03-10 | RoboGuard | Author Name et al. | 2503.07885 | null |
| 6 | 2025-04-15 | CEE | Author Name et al. | 2504.13201 | null |
| 7 | 2025-09-27 | Preventing Robotic Jailbreaking via Multimodal Domain Adaptation | Author Name et al. | 2509.23281 | null |
| 8 | 2025-08-23 | Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents | Author Name et al. | 2508.17155 | null |
| 9 | 2024-03-28 | JailbreakBench | Author Name et al. | 2404.01318 | null |
| 10 | 2024-02-06 | HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal | Author Name et al. | 2402.04249 | null |
| 11 | 2024-05-13 | RobustNav | Author Name et al. | 2405.07890 | null |
| 12 | 2024-11-04 | AgentHarm | Author Name et al. | 2410.09024 | null |
| 13 | 2024-10-09 | ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents | Author Name et al. | 2410.06703 | null |
| 14 | 2024-09-09 | AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions | Author Name et al. | 2409.05678 | null |
| 15 | 2024-08-19 | ARE Meta | Author Name et al. | 2408.09876 | null |
| 17 | 2024-06-08 | Adversarial Attacks on Robotic Vision Language Action Models | Author Name et al. | 2406.05432 | null |
| 18 | 2024-07-18 | TAP | Author Name et al. | Link | null |
| 19 | 2024-07-02 | GCG | Author Name et al. | 2307.15043 | null |
| 20 | 2024-04-17 | TurkingBench | Author Name et al. | 2404.11234 | null |
| 21 | 2024-03-14 | Google's Approach to Protecting Privacy In the Age of AI | Author Name et al. | Link | null |
| 22 | 2024-09-27 | Securing MCP-based Agent Workflows | Author Name et al. | Link | null |
| 23 | 2024-05-27 | Dolphins | Author Name et al. | Link | null |
| 24 | 2024-08-16 | Sound Image Perturb | Author Name et al. | Link | null |
| 25 | 2024-10-04 | Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective | Author Name et al. | Link | null |
| 26 | 2024-07-02 | BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks | Author Name et al. | 2407.02421 | null |
| 27 | 2024-03-07 | Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference | Author Name et al. | 2403.04132 | null |
| 28 | 2025-09-30 | CHAI: Command Hijacking against embodied AI | Author Name et al. | 2510.00181 | null |
| 29 | 2025-09-29 | SecInfer: Preventing Prompt Injection via Inference-time Scaling | Author Name et al. | 2509.24967 | null |
| 30 | 2025-10-01 | WAInjectBench: Benchmarking Prompt Injection Detections for Web Agents | Author Name et al. | 2510.01354 | null |
| 31 | 2024-09-09 | Adversarial Attacks on Robotic Vision Language Action Models | Author Name et al. | 2409.05678 | null |
| 32 | 2024-08-30 | OmniVLA: An Omni-Modal Vision-Language-Action Model for Robot Navigation | Author Name et al. | Link | null |
| 33 | 2024-07-11 | NaviLA | Author Name et al. | Link | null |
| 34 | 2024-12-04 | Emerging Risks from Embodied AI Require Urgent Policy Action | Author Name et al. | 2509.00117 | null |
| 35 | 2024-06-20 | RoboCop: A Robust Zero-Day Cyber-Physical Attack Detection Framework for Robots | Author Name et al. | 2406.14789 | null |
| 36 | 2024-11-10 | RoboGuardZ: A Scalable Zero-Shot Framework for Detecting Zero-Day Malware in Robots | Author Name et al. | Link | null |
| 37 | 2025-03-11 | Google Semantic Safety | Author Name et al. | 2503.08663 | null |
| 38 | 2024-09-05 | Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications | Author Name et al. | Link | null |
| 39 | 2025-03-18 | GROOT1 | Author Name et al. | 2503.14734 | null |
| 40 | 2024-08-06 | Simulation Control Visual SysID | Author Name et al. | Link | null |
| 41 | 2025-09-25 | Can AI Perceive Physical Danger and Intervene? | Author Name et al. | 2509.21651 | null |