Abstract:
Security is the foundation of cooperation between individuals, groups, and nations. It underlies trust and confidence, and provides the stability and predictability needed to form long-term commitments. Without security, cooperation is risky for the weak, inefficient for the strong, and difficult for peers. But with stealth and deception as constant threats, the mere perception of insecurity can spiral cooperation into conflict, giving rise to true security dilemmas. As AI agents proliferate, these dynamics are carried into the digital world. In this talk, I will review the foundations of the field of multi-agent security and examine the information-theoretic limits of stealth. Starting with recent progress on perfectly secure AI-generated steganography, I will discuss illusory attacks, a new class of information-theoretically undetectable adversarial attacks, as well as new approaches to out-of-distribution dynamics detection in reinforcement learning. I will then present a novel model evaluation framework that sheds light on when AI agents may decide to collude maliciously. I will close with a discussion of open questions and urgent research priorities for ensuring secure cooperation in our multi-agent world.
Bio:
I am a leading researcher in foundational AI and information security. My recent work includes a breakthrough result on the 25+-year-old problem of perfectly secure steganography (jointly with Sam Sokota), which was featured in Scientific American and Quanta Magazine, and on Bruce Schneier's Security Blog. During my Ph.D., I helped establish the field of cooperative deep multi-agent reinforcement learning, contributing popular learning algorithms such as QMIX, MACKRL, IPPO, and FACMAC, as well as the standard benchmark environments SMAC and Multi-Agent MuJoCo.