Quoting from the abstract:
"We report an exploratory red-teaming study of autonomous language-model–powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions."
"Observed behaviors include unauthorized compliance with non-owners, disclosure of sensitive information, execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover."
https://news.ycombinator.com/item?id=47196883
https://news.ycombinator.com/item?id=47134473
https://news.ycombinator.com/item?id=47147764
https://news.ycombinator.com/item?id=47141321
Besides that.. Agents reporting task completion while the system state says otherwise is predictable once you think about it. Next-token prediction optimizes for plausible outputs, not ground truth.