Multi-agent AI could change everything – if researchers can figure out the risks
You might have seen headlines sounding the alarm about the safety of an emerging technology called agentic AI.
That’s where Sarra Alqahtani comes in. An associate professor of computer science at Wake Forest University, she studies the safety of AI agents through the new field of multi-agent reinforcement learning (MARL).
Alqahtani received a National Science Foundation CAREER award to develop standards and benchmarks to better ensure that multi-agent AI systems will continue to work properly, even if one of the agents fails or is hacked.
AI agents do more than sift through information to answer questions, which is what the large language model (LLM) technology behind tools such as ChatGPT and Google Gemini does. AI agents think and make decisions based on their changing environment – like a fleet of self-driving cars sharing the road.
Multi-agent AI offers innumerable opportunities. But failure could put lives at risk. Here’s how Alqahtani proposes to solve that problem.
What’s the difference between the AI behind ChatGPT and the multi-agent AI you study?
ChatGPT is trained on a huge amount of text to predict what the next word should be, what the next answer should be. It’s driven by human writing. The AI agents that I build – multi-agent reinforcement learning – think, reason and make decisions based on the dynamic environment around them. So they don’t only predict; they predict and then make decisions based on that prediction. They also identify the level of uncertainty around them and decide accordingly: Is it safe for me to act, or should I consult a human?
AI agents live in an environment, and they react and act within it, changing that environment over time – like a self-driving car. ChatGPT still has some intelligence, but that intelligence is tied to text and to predicting text, not to acting or making decisions.
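As a rough illustration of the uncertainty check Alqahtani describes – not her algorithm, just a minimal sketch in which an invented ensemble-disagreement measure and threshold stand in for the agent’s sense of uncertainty – an agent might decide between acting and consulting a human like this:

```python
import numpy as np

# Hypothetical uncertainty check: disagreement among an ensemble of value
# estimates stands in for how unsure the agent is. If the models disagree
# too much, the agent defers to a human instead of acting.
def choose_action(q_estimates: np.ndarray, threshold: float = 0.5):
    """q_estimates: shape (n_models, n_actions) of estimated action values."""
    mean_q = q_estimates.mean(axis=0)             # consensus value of each action
    disagreement = q_estimates.std(axis=0).max()  # worst-case disagreement across actions
    if disagreement > threshold:
        return None, "consult_human"              # too uncertain: hand off the decision
    return int(mean_q.argmax()), "act"            # confident: take the best-valued action

# Example: three models scoring two actions (e.g., speed up vs. slow down)
action, decision = choose_action(np.array([[1.0, 0.2], [0.9, 0.3], [1.1, 0.1]]))
print(action, decision)  # 0 act
```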
You teach teams of AI agents through a process called multi-agent reinforcement learning, or MARL. How does it work?
A team of AI agents collaborates to achieve a certain task. You can think of it as a team of medical drones delivering blood or medical supplies. They need to coordinate and decide, in time, what to do next – speed up, slow down, wait. My research focuses on building and designing algorithms that help them coordinate efficiently and safely, without causing catastrophic consequences to themselves or to humans.
Reinforcement learning is the learning paradigm behind how even we humans learn. It trains an agent to behave by making mistakes and learning from those mistakes. So we give the agents rewards and punishments when they do something good or bad. Rewards and punishments are mathematical functions – a number, a value. If you do something good as an agent, I’ll give you a positive number. That tells the agent’s brain that’s a good thing.
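To make the “rewards are just numbers” point concrete, here is a minimal, generic tabular Q-learning sketch in Python – a textbook single-agent update, not the multi-agent algorithms Alqahtani builds; the states, actions and constants are illustrative:

```python
import numpy as np

# The agent's "brain" is a table of values: how good each action looks in each state.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9  # learning rate and discount factor (illustrative values)

def update(state, action, reward, next_state):
    """Standard tabular Q-learning update: the reward is just a number that
    nudges the value of the action the agent took."""
    best_next = Q[next_state].max()
    Q[state, action] += alpha * (reward + gamma * best_next - Q[state, action])

update(state=0, action=1, reward=+1.0, next_state=2)  # a "reward": value of that action goes up
update(state=0, action=0, reward=-1.0, next_state=3)  # a "punishment": value of that action goes down
print(Q[0])  # [-0.1  0.1]
```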
We researchers anticipate the problems that could happen if we deploy our AI algorithms in the real world. We simulate those problems, deal with them, and then patch the security and safety issues – hopefully before we deploy the algorithms. As part of my research, I want to develop foundational benchmarks and standards for other researchers, to encourage them to work on this very promising area that’s still underdeveloped.
It seems like multi-agent AI could offer many benefits, from taking on tasks that might endanger humans to filling in gaps in the healthcare workforce. But what are the risks of multi-agent AI?
When I started working on multi-agent reinforcement learning, I noticed that when we add small changes to the environment or to the task description, the agents make mistakes. So they are not totally safe unless we train them on the same exact task again and again, like a million times. Also, when we compromise one agent – by compromise, I mean we assume an attacker takes over and changes that agent’s actions away from its optimal behavior – the other agents are also severely impacted. Their decision-making is disrupted because one member of the team is doing something unexpected.
We test our algorithms in gaming simulations because they are safe and the games have clear rules, so we can anticipate what will happen if the agents make a mistake. The big risk is moving them from simulations to the real world. That’s my research area: how to keep them behaving predictably and avoid mistakes that could harm them and humans.
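A toy version of that kind of simulation test might look like the sketch below – an invented coordination game, not one of the benchmarks described here, just to show how overriding one agent’s actions drags down the whole team’s score:

```python
import random

# Toy cooperative game: three agents earn a shared point each step only when
# they all pick the same action. A "compromised" agent is overridden with
# arbitrary actions, and the whole team's score drops.
def trained_policy():
    return 0                       # trained agents have learned to pick action 0

def compromised_policy():
    return random.choice([0, 1])   # attacker overrides this agent's choices

def run_episode(policies, steps=100):
    team_reward = 0
    for _ in range(steps):
        actions = [policy() for policy in policies]
        team_reward += 1 if len(set(actions)) == 1 else 0  # reward only when coordinated
    return team_reward

healthy = run_episode([trained_policy] * 3)
attacked = run_episode([trained_policy, trained_policy, compromised_policy])
print(f"healthy team: {healthy}/100, one agent compromised: {attacked}/100")
```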
My main concern is not the sci-fi part of AI – that AI is going to take over or steal our jobs. My main concern is how we are going to use AI and how we are going to deploy it. We have to test our algorithms and make sure they are understandable for us and for end users before we deploy them out in the world.
You have received an NSF CAREER award to make multi-agent AI safer. What are you doing?
Part of my research is to develop standards, benchmarks and baselines that encourage other researchers to be more creative with the technology and to develop new, cutting-edge algorithms.
My research is trying to solve the transition of these algorithms from simulation to the real world, and that involves paying attention to the safety of the agents and their trustworthiness. We need some predictability in their actions, and at the same time we want them to behave safely. So we don’t only optimize them to do the task efficiently; we also want them to do the task safely – for themselves, as equipment, and for humans.
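One common way to encode “efficient and safe” in reinforcement learning is to fold a safety penalty into the reward. The sketch below is a generic, hypothetical example of that idea, not Alqahtani’s formulation; the function name and weight are invented for illustration:

```python
# Hypothetical shaped reward: the training signal rewards finishing the task
# but subtracts a penalty whenever the agent does something unsafe. The
# safety_weight controls how strongly safety is prioritized over speed.
def shaped_reward(task_reward: float, unsafe_event: bool, safety_weight: float = 10.0) -> float:
    penalty = safety_weight if unsafe_event else 0.0
    return task_reward - penalty

# A drone that delivers quickly but flies dangerously close to a person scores
# worse than one that delivers a bit more slowly but safely.
print(shaped_reward(task_reward=5.0, unsafe_event=True))   # -5.0
print(shaped_reward(task_reward=4.0, unsafe_event=False))  #  4.0
```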
I’ll test my MARL algorithms on teams of drones flying over the Peruvian Amazon rainforest to detect illegal gold mining. The idea is to keep the drones safe while they explore, navigate and detect illegal mining activity – and avoid being shot by illegal gold miners. I work with a team with diverse expertise: hardware engineers; researchers in ecology, biology and environmental sciences; and the Sabin Center for Environment and Sustainability at Wake Forest.
There’s a lot of hype about the future of AI. What’s the reality? Do you trust AI?
I do trust the AI that I work on, so I would flip the question: Do I trust the humans who work on AI? Would you trust riding with an Uber driver in the middle of the night in a strange city? You don’t know that driver. Or would you trust a self-driving car that has been tested in that same situation, in a strange city?
I would trust the self-driving car in this case. But I want to understand what the car is doing. And that’s part of my research, to provide explanations of the behavior of the AI system for the end users before they actually use it. So they can interact with it and ask it, what’s going to happen if I do this? Or if I put you in that situation, how are you going to behave? When you interact with something and you ask these questions and you get to understand the system, you’ll trust it more.
I think the question should be: Do we have enough effort going into making these systems more trustworthy? Are we spending enough effort and time to make them trustworthy?