
A
This is the Everyday AI show, the Everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business and everyday life.
B
I think for a year or two, all of the talk around AI was bigger models. Well, if we want better results, let's make the model bigger, right? Hey, if we need more business value, let's throw another trillion parameters behind the model. But I don't know if that's the right way. Maybe it's working with smaller models or multiple agents, a team of agents, maybe a society of them. All right, so agents is obviously something we've been talking about here for multiple years on the Everyday AI Show. But I'm excited because today we're going to be talking with an expert and a leader in the field who's been doing this for multiple decades, and going inside the society of agents and why teamwork beats bigger models. All right, I'm excited for today's show. I hope you are, too. Welcome to Everyday AI. If you're new here, Everyday AI, it's for you. It is your daily livestream, podcast and free daily newsletter helping everyday business leaders like you and me make sense of everything that's happening in AI, because it is nonstop, and we help you focus on what matters to grow your company and your career. So if that's what you're trying to do, it starts here with the unedited, unscripted livestream podcast. But to be the smartest person in AI in your department, make sure you go to our website at youreverydayai.com and sign up for the free daily newsletter. We're going to be recapping today's show as well as bringing you all the other AI news that you need to know for today. All right, enough of me chit-chatting. You didn't show up to hear me. All right, so livestream audience, if you could please help me welcome to the show. I'm excited for this one. So we have... let's do it the right way. There we go. We have A.J. Kumar, the corporate vice president and managing director of AI Frontiers Lab at Microsoft Research. A.J., thank you so much for joining the Everyday AI Show.
A
Thanks for having me, Jordan. I'm excited.
B
All right, so first of all, 20 years working in agents, that's wild. But tell everyone a little bit about your background working at Microsoft Research and what it is that you do on the day to day.
A
My journey in AI started, as you said, around 20 years ago. I was doing my PhD at Harvard on the topic of AI agents before AI was really a cool subject. And that's just to remember that many of the problems we are working on right now are actually decades old, and a lot of people have thought about these problems for a very long time. However, where we are right now is that we have new tools, called large language models and large reasoning models, that are giving us new capabilities to be able to build the things we dreamed of for decades and decades. And that's kind of the moment we are in in agents. So back in my PhD student days, when we were dreaming of building coding agents, thinking about agents programming with people in pair programming settings, or building agents to collaborate with people in different settings, we would prototype them. But we lacked that general problem-solving power back in the day. So everything we prototyped would just remain a prototype. It wouldn't be something that people would use for real. Starting with GPT-4, we are in an era of being able to build things for real. So that's kind of the journey that we are after. I've been a researcher at Microsoft Research for 15 years. I spent quite a bit of time working on responsible AI issues and building responsible AI practices at Microsoft. And of course, the day I had access to an early version of GPT-4, everything changed for me. And while a lot of people saw that, as you said, large models are going to get us to AGI, because I'm coming from an agentic background, my intuition was that this is the beginning of a new computing stack for building everything we want to build, and I started dreaming about what the layers of this new computation stack to enable agents are going to be. So my team and I, we are 60 strong, distributed between Redmond and New York.
Researchers, engineers, PMs: we are very much building that computation stack, open-sourcing everything we are building for the tinkerers and builders of the world as well as the researchers, and really imagining where these technologies are going to be going next.
B
So walk us through a little bit. And I know that you've posted about this online, but for our audience that doesn't know what is an agentic society or a society of agents, is that just agents who are all running around doing their own thing or explain what that is.
A
So right now, when we think about the agents we are seeing in the world, they very much exist in a world where there's one person and one agent, and they are trying to do something together. But think about yourself doing anything in the world: you are not doing it in isolation. You are working with other team members, you are engaging in economic activities where you are interfacing with other people or the systems that are built around that. So when we look into the agents of today, we don't imagine them working in isolation. That would actually be really silly. What we imagine these agents to do is gain the ability to talk and work with other agents and people to really guide what the future of productivity, the future of commercial activity, and the future of interactions are going to be like. Again, this is an evolution that we are imagining. Today, there could be an assistant working with you to schedule your meetings or to do things for you. Very soon, when I'm going to be working with you in terms of scheduling the next podcast, it is very likely that my agent, who has access to my calendar and knows about my preferences, is going to be contacting your agent to schedule the next meeting. And you can scale that from one agent, one person, to two agents, two people, to networks of agents that are going to be representing different skills and expertise. And now these agents are going to be able to work together to really do collaborative work. Like, imagine a future where there could be a company partially driven by agents. And you can imagine agents taking on the responsibilities of a legal task and a sales task and a building task. And these agents are going to be forming a team with their human counterparts to be able to take a task and accomplish it. That's kind of the future we are seeing already forming today.
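The team-of-specialized-agents picture A.J. sketches here (legal, sales, and so on, each owning one responsibility) can be illustrated with a minimal routing sketch. The agent names, skills, and dispatch logic below are illustrative assumptions, not any specific Microsoft framework's API; a real agent's `handle` would call a model rather than return a string.

```python
# Minimal sketch of a "society of agents": specialized agents, each
# owning one kind of work, with a simple router that hands each task
# to the matching specialist. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    skill: str  # the kind of work this agent specializes in

    def handle(self, task: str) -> str:
        # A real agent would invoke a model here; we just record the handoff.
        return f"{self.name} ({self.skill}) handled: {task}"


@dataclass
class Team:
    agents: list[Agent] = field(default_factory=list)

    def route(self, task: str, skill: str) -> str:
        # Route the task to the first agent whose specialty matches.
        for agent in self.agents:
            if agent.skill == skill:
                return agent.handle(task)
        raise LookupError(f"no agent with skill {skill!r}")


team = Team([Agent("legal-bot", "legal"), Agent("sales-bot", "sales")])
print(team.route("review the vendor contract", "legal"))
```

In practice the routing itself is often done by a model or orchestrator rather than a hardcoded skill match, and the human counterparts A.J. mentions sit in the loop reviewing each agent's output.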
B
It's... I can only imagine, right? As someone that's been working on this for multiple decades... you know, I've only been covering this for three years, but I've noticed, maybe since, like, December until now, the last two months, it seems like things on the agentic side are really taking off. Are you noticing the same? Is it because the technology to be able to build and use agents is becoming more accessible? Or is this just a lot of hype online, or are you seeing it too?
A
I don't see it as hype. I think it's a big reality. And again, this is really a question about how we build AI systems that are fundamentally useful and beneficial for people. So, as you know, the first thing we could build when we had LLMs was chatbots, because these systems were built to answer questions for us. But you don't actually get a lot of value from something that only tells you about doing things. What we can benefit from is systems that can do things with us instead of just telling us things. Right? So agents represent that transition from systems that know how to talk to systems that know how to act in the world. And there's a clear value proposition in that transition. So what has been happening is that there are new technology pieces falling into place that are enabling that agentic behavior. Because again, when we were only talking about large language models, they were good for creating language, but they did not know about acting in the world. Then we switched to reasoning models, with models like o1, o3, or DeepSeek R1, and now we see reinforcement learning becoming a part of the training stack. Now we are actually training these models about actions. We are putting them into environments where they can take actions and learn from that feedback. And that is really a technology transformation in terms of teaching AI systems how to act in the world, how to take a complex problem, decompose it into pieces, and turn those small pieces into actions that lead to getting things done. So again, we are building this stack as a community. We are seeing that now we can have models that are able to take actions in the world. My team released a model called Fara-7B just a month ago that can do this agentic work on your machine because it's small enough. Again, my team has been building small models for the last two years, starting with the Phi family of models.
Now we can orchestrate these models and agents, and then we can have some oversight over them with a human in the loop. So overall, there is a technology stack that is emerging to enable these agents. And this is a reality, not hype. And we always look at programming agents as almost like a canary in a coal mine, because coding has a number of characteristics that make it very suitable for this kind of agentic work. The context of doing the work is given in an IDE. The language of the work is text, which these models know how to do really, really well. And we have compilers and other things that ensure somewhat of a reliability of such things. So, like, even from December, I don't know, Jordan, if you feel the same thing when you're playing with them, but especially with the coding agents coming out now, the CLIs, the usefulness of them, suddenly it feels like an inflection point. Because again, the stack is working. We are discovering as we go, we are filling in the missing pieces. And I'm really, really seeing a path forward for this becoming a pattern for many other settings as well, enabling that society of agents vision.
B
Yeah, so many great points there. And yeah, I think it has been kind of, you know, the command line agents, but then even having people building user interfaces on top of those, for people that maybe aren't as comfortable in the terminal to go and take advantage of those. And people that were just using, you know, a Copilot or a ChatGPT one day, and now all of a sudden they have an agent that's spinning up sub-agents without them asking. Right? It's really taking off, I would say, for sure. You know, one thing that you said there, A.J., that really struck me: talking about, hey, maybe one day there's a society of agents that could run a company. And I think some people will hear that and be like, no, that doesn't make sense, that's something in the future. But I'm like, oh, absolutely. Right? Like, why not? These models, especially these reasoning models, when you look at how they score on offline IQ tests, right, they're very, very good. And I think people overlook just how much a single model is capable of, especially models that reason. But, you know, what should business leaders do if they hear something like that? Like, oh my gosh, a society of agents running a company. What are the decisions I should be making now to maybe be prepared or take advantage of that? How should they be thinking about this?
A
Yeah, that's a great question. First of all, we don't know if the future is going to be agents running a company, or a collection of agents and people running a company. It is more likely to be the latter than the former. But what I would bet on is that agents are going to be in the workplace, taking part in the way work is done. I'm very confident saying that's going to be the future, because again, we are building these systems to make people more productive and enable a path where we are going to be able to do the things we cannot dream of doing today. And I think we are going to be seeing that future. So, thinking about what works in this ecosystem: there was actually a really great report from MIT six months ago on the generative AI divide. I don't know if you had a chance to read and cover it, but there were two insights from that that were amazingly well suited, I think, as advice for people who are thinking about transforming their organization with agents. First, there was this result that 95% of the experiments done around agents and AI in enterprises actually fail to make their way to productization or to create the value that was promised. When they actually looked into the cases that really work, there were two things that I think are very important for everybody to take a second to think about. One is: if you want to get real value out of these agents, it is not sufficient to incorporate these agents into our world in incremental ways. It is actually very important that we rethink how we are implementing business workflows and processes now that we have these agent technologies in place. So getting that transformational value from AI requires rethinking and redesigning how we work. And I think that is a very hard part of this transformation.
For example, think about the agentic stack, right? Coding agents started with people writing the code and then doing code completion; those were the first ones. Now what we are seeing with people who are on the frontier of using these coding agents is that they are changing the way they work. They are changing how they write code. They are discovering how to put their plans into the markdown file so that the agent can really execute things the way they want them executed. People and organizations will need to rediscover how they are doing things, because now these technologies are at the center of their productivity, and without that rethinking, they are not going to be able to get the value they want to get. So that's, I think, lesson number one. Lesson number two is that there are going to be complementary approaches coming on top of these models to make them really reliable and successful. We are working on the multi-agent approach as one layer of that. It is very likely that they will need to think a lot about the communication with the human, the communication between the agents, MCPs and A2As, in addition to the models, to make this a success. So it's very important that the people who are making these decisions are not only familiar with the models that are becoming very popular right now, but are getting familiar with the whole stack, the communication protocols, the orchestrators of the world, the interaction layers, so that they can really make the right decisions about how to build AI in their organizations.
B
Yeah, and you know, speaking of reimplementing workflows and reimagining or redesigning how we work, that's something I'm personally always struggling with, right? Because it almost seems silly to hand off certain tasks to a multi-agentic workflow, because I'm like, okay, this is so basic, and I know that these agents or a society of agents could achieve so much more. I'm curious, how would you recommend decision makers and business leaders start handing off those tasks? Like, how do they even know where to start? I think some people are like, okay, the mundane research or code completion, right? But when it comes to bigger pieces of pivotal day-to-day operations, how do you start saying, hey, this is our first use case for a society of agents? How do you identify it? Are you still running in circles trying to figure out how to actually grow your business with AI? Maybe your company has been tinkering with large language models for a year or more but can't really get traction to find ROI on gen AI. Hey, this is Jordan Wilson, host of this very podcast. Companies like Adobe, Microsoft and Nvidia have partnered with us because they trust our expertise in educating the masses around generative AI to get ahead. And some of the most innovative companies in the country hire us to help with their AI strategy and to train hundreds of their employees on how to use gen AI. So whether you're looking for ChatGPT training for thousands or just need help building your front-end AI strategy, you can partner with us too, just like some of the biggest companies in the world do. Go to youreverydayai.com/partner to get in contact with our team, or you can just click on the partner section of our website. We'll help you stop running in those AI circles and help get your team ahead and build a straight path to ROI on gen AI.
A
It is hard, because we are people and we take a lot of pride in how we do things right now. And these agents are forcing us to really think about who we are, how we do things, what our value is as people. As we put the society of agents picture into the world, it is making us question what our value is in that picture. It is very important to reflect that while technology is the force behind this change, this change is not going to be only a technology change. It is going to be a cultural change, it's going to be a personal change, it's going to be an organizational change. In many ways, we can get the technology pieces right, but if we are not getting the human pieces right, we are not going to be successful. I think there is another lesson to think about there. For me, as I'm managing this organization, I'm very much reflecting on the fact that I'm not able to try everything myself. I have to experiment as a researcher, I have to experiment every day, so that I can stay on that cutting edge of where this technology is going. It's very cool that on many days some junior researcher or some junior engineer in the team will ping me and say, I want to show you something. Can I come and install something on your machine? You should see this. Can I demo this to you? So, you know, I think as execs, sometimes we can be in a mode of, I know better than everybody else. That is so wrong right now. Sometimes I'm the last one to learn something. So we should all enable the whole organization to experiment, and then share that experimentation with everybody else, all the way up to the top leaders, so that we can all experiment and learn together. This is going to be an ongoing transformation, because this technology change is not slowing down anytime soon.
B
I think, you know, playing off that, going to top leadership. The top leadership that I talk to all the time, when they talk about maybe transitioning from a single agent to multiple agents or a society of agents, one thing they are really scared about is the risk, right? Because it's one thing if you are overseeing one agent; I think you can, for the most part, keep up and do a good human job of making sure it has the right data and the observability and traceability and all those good things. Right? But what about when there's a society, when there are many agents, when there's an agent that takes a task and decides to break it up and give it to three or four sub-agents? So how should we be looking at that type of risk? That's, I guess, the other side of the coin from the tremendous possibility and upside. But how do we tackle that risk?
A
Yeah, I think we have to be working on that very rigorously. Just to give a few examples of how we are seeing these risks coming up today. So we built Fara as an agentic model that can carry out computer use tasks, as I mentioned before. And I was testing Fara the other day, and I gave it a very simple task, which was: go to the New York Times, complete the crossword puzzle for me, because it's something that, as a non-native English speaker, I cannot do very well. And it is amazing to watch these agents work. So the agent goes to the New York Times site, and soon enough the agent realizes that it needs my New York Times credentials to get into the website. And then it realizes that it doesn't have my credentials, but there is a link there for resetting my password so that the agent can get into the New York Times site. In a world where the agent has access to my email, which today is possible, since you can link all of those tools to your agents, it's not hard to imagine that the agent would go and reset my password. Was this something I expected the agent to do when I gave this very honest and simple task? No. But we have to be very aware that these agents, with the autonomy they have, with the skills and the tools that they have at their disposal, powered by the reasoning models, are becoming resilient and relentless in terms of carrying out a task that you give them. And in many cases we see that these agents can break your mental models in terms of how something should be done. So that's kind of the risk part of the agents. Talking about the multi-agent aspect: one of the reasons we actually came to the multi-agent approach is that it gives us a way of controlling the agentic behavior in better ways. Because what we can do is, imagine you identify a bunch of risks that you have in your world.
You can create agents whose only responsibility is overseeing what other agents are doing and providing real-time feedback to them about how they should be carrying out their pieces. We see this working very well, for example, for hallucination mitigation. As we were building agents, let's say agents that are creating reports, we created separate agents whose only job is doing fact-checking on the side and providing that feedback. So the multi-agent approach of decomposing tasks, creating specialization in agents, and relying on collaboration, we are seeing that as a pretty general pattern for creating some level of agent oversight and control. At the same time, I agree with you that as we are connecting to other agents, getting more into the society of agents framework, there are some real issues we now need to tackle. When it comes to giving control to agents that are outside your network, sharing information and privacy become real considerations. Because if my agent is talking to your agent about scheduling, my agent may reveal a lot of private information about me to you. It may say, oh, she cannot do Friday 3 p.m., she has a doctor's appointment, and by the way, she has this disease. Right? It is something that an agent may share with another agent that we don't want shared. So I agree with you. I think there are new issues happening at that intersection of agents communicating with other agents, particularly across information boundaries. And that is research that we are very much looking into right now.
B
So it almost, you know, I, when, when you were saying that, I was thinking of kind of like original hallucinations, right? How like, you know, the earlier large language models, you know, they're made to be a helpful assistant. So they're going to go out and you know, even if they don't know the answer, they might confidently suggest something. So you're kind of saying like multiple agents or society of agents might do kind of the same thing. There might be a new classification of a hallucination of, you know, oh, an agent thinks it's doing something, but it's probably doing the wrong thing.
A
Very possible. Very possible.
B
Okay, that's a nice way to think about it. So one thing that I really want to know: we're talking about these models as well, right? And the models, I know your team at Microsoft is starting to work on their own models, and there are great models from all the other big AI labs. At what point do we even separate a large language model from an agent? Right? If it's starting to do these things proactively for you and it's connected to all your data, where do we draw the line between, hey, here's an agentic model that can go out and do things, versus a traditional agent or a team of agents?
A
That line is blurring really fast. As I said, when we are thinking about the agentic stack, in many ways we started with separate layers. So we would have an LLM, and we would have the orchestrator on top of it, and we would have separate retrieval components that provide the memory for these agents. And what we are seeing right now is that as we understand these layers, we start collapsing them back into the model. So we switched from using an LLM to using an agentic model that really understands the perception-action loop. Now we are looking into, with skills for example, and MCP, right, we don't have to create an external orchestrator; we can actually have a model that learns, through reinforcement learning, how to call other agents through MCP or A2A, or how to run some skills, which is kind of the essence of multi-agent, but now collapsed into the model layer. From our research community, we are seeing a lot of interesting ideas about putting memory into the models themselves, not necessarily relying on RAG anymore, but just putting it into the model itself. So there is this pattern: many of the layers we are building across this computation stack, we are learning how to collapse into the model itself, and then creating hooks to be able to still do the multi-agent work, but in a model-first way. So that line between model, agent, and the society is definitely getting blurry, because again, we want to have the most resilient stack. And in many ways, instead of doing things in a scaffolding way, collapsing them into the model and training with the right objectives seems to be a much more principled way of doing it. I'll even go one step further and say the whole stack right now is built on transformers. That has been the architecture, coming from Google six, seven years ago, that really enabled this transformation.
But again, from a researcher point of view, is that the best architecture we can have for this changing agentic work? Are there particular architectures the research community may discover that becomes even a better fit for kind of the agentic stack? I think that is still to be determined, but I have really smart researchers thinking about those problems today.
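The "model-first" orchestration A.J. describes, where the model itself emits tool or agent calls and the host just executes them instead of an external orchestrator deciding, can be sketched as a minimal loop. The message format, the stub `model`, and the `check_calendar` tool are all hypothetical stand-ins for illustration; a real agentic model would be trained to emit such calls, as described above.

```python
# Minimal sketch of a model-first agent loop: the model decides when to
# call a tool (or another agent, e.g. over MCP/A2A) and when to answer;
# the host only executes what the model asks for.
import json


def model(messages: list[dict]) -> dict:
    # Stand-in for an agentic model: on the first turn it requests a
    # tool call, and after seeing a tool result it answers directly.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer", "text": "Meeting booked for Friday."}
    return {"type": "tool_call", "name": "check_calendar",
            "args": {"day": "Friday"}}


# Registry of tools the host is willing to run on the model's behalf.
TOOLS = {"check_calendar": lambda day: {"day": day, "free": True}}


def agent_loop(user_msg: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        out = model(messages)
        if out["type"] == "answer":
            return out["text"]
        # Execute the requested tool and feed the result back.
        result = TOOLS[out["name"]](**out["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("agent did not finish within the step budget")


print(agent_loop("Schedule our next podcast recording."))
```

Note that the "orchestrator" here is just a dumb execute-and-append loop; all the decision-making lives in the model, which is the collapse of scaffolding into the model layer that the answer describes.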
B
All right, well, we'll be sure to pay attention to everything that you and your team are putting out in the near future. But A.J., we've covered a lot in today's conversation. As we wrap up, what would you say is the most important thing for business leaders who are looking at this concept of multi-agentic workflows, a society of agents? What is the most important thing for them to focus on in 2026 and beyond?
A
So it feels like we are still very much in the first innings of this transformation. What is clear is that this is not hype, this is real. We are going to have these agents being part of our organizations, being the face of, or being kind of the initiator for, a lot of consumer activity going forward. So I think it's very important to think about how this transformation is going to evolve. Probably you and I have been through a few of these waves, like the Internet, social media, social computing and such. One thing that really happens is that first a core technology comes up. Think about the Internet, right? It was all TCP/IP. There is a core technology that emerges, and then we discover applications of it and people start adopting them. For AI, this is happening really, really fast. And what follows later are ecosystems that form from these technologies. Things start connecting with each other. Things start forming feedback loops that really make some things work so much better than others. Then what comes next from those ecosystems are new economic models, new business models that start shaping our society: business models and economic models that shape how we consume that technology, how users get affected by that technology, how enterprises build new revenue streams from those technologies. I think what I'm going to be paying a lot of attention to in 2026 and beyond is how different players are going to start forming ecosystems from these technologies, how these different agents start talking to each other. Like, when you go to ChatGPT today, you see ChatGPT calling in some other agent from some other enterprise, or putting advertisements in there. These are all the beginnings of different players thinking about: what is an ecosystem I can form, what is a business model I can form.
We are going to be paying a lot of attention to that ecosystem and business model play, especially for the people who are thinking about their particular enterprise: how is this going to affect them? What is going to be the ecosystem and business model that they can bring into this agentic world? And for everybody else who is going to be using these systems: how are these ecosystems going to form such that, as users of these systems, we are going to be affected by them? These are going to be the questions that my team is going to be spending a lot of time thinking about.
B
All right, well, A.J., thank you so much for taking time out of your day to help us answer a lot of the questions that we've been struggling with over the last couple of months and couple of years. We really appreciate you coming on the Everyday AI Show.
A
Thanks for having me. Thank you, Jordan. All right.
B
And if you missed anything, y'all, don't worry. We're going to be recapping today's conversation in our newsletter. So if you haven't already, please go to youreverydayai.com. Thank you for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
A
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going for a little more AI magic. Visit youreverydayai.com and sign up for our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
Date: February 6, 2026
Host: Jordan Wilson
Guest: A.J. Kumar, Corporate Vice President & Managing Director, AI Frontiers Lab at Microsoft Research
In this episode, Jordan Wilson interviews A.J. Kumar from Microsoft Research, diving deep into the world of agentic AI: not just bigger models, but smarter teamwork between “societies” of AI agents. The conversation explores why models alone aren’t enough, how group dynamics among agents are revolutionizing AI’s impact on business, and what steps leaders should take to embrace this shift. A.J. shares insights from his decades-long research and real-world implementation of multi-agent systems, offering practical guidance for organizations seeking to stay ahead in the rapid evolution of AI.
[00:15–02:16]
Notable Quote:
“For a year or two, all of the talk around AI was bigger models... But I don't know if that's the right way. Maybe it's working with smaller models or multiple agents, a team of agents, maybe a society of them.”
— Jordan Wilson [00:15]
[04:37–06:53]
Notable Quote:
“These agents are going to be able to work together to really do collaborative work. Like, imagine a future where there could be a company partially driven by agents.”
— A.J. Kumar [05:42]
[06:53–10:52]
Notable Quote:
“Agents represent that transition from systems that know how to talk to systems that know how to act in the world.”
— A.J. Kumar [07:55]
[12:09–15:56]
Notable Quote:
“Getting that transformational value from AI requires rethinking and redesigning how we work.”
— A.J. Kumar [13:55]
[15:56–17:57]
Notable Quote:
“Sometimes I'm the one that is the latest in terms of learning something.”
— A.J. Kumar [18:31]
[20:06–24:55]
Notable Quote:
“These agents can break your mental models in terms of how something should be done.”
— A.J. Kumar [22:28]
[25:31–28:45]
Notable Quote:
“That line between model, agent, and the society is definitely getting blurry... we want to have the most resilient stack.”
— A.J. Kumar [27:22]
[29:11–32:05]
Notable Quote:
“What follows later are ecosystems that form from these technologies. Things get... feedback loops that really make some things work so much better than others.”
— A.J. Kumar [30:07]
On Agentic Collaboration:
“We imagine these agents to... work with other agents and people to really guide what the future of productivity... is going to be like.” — A.J. Kumar [05:21]
On the Cultural Shift:
“Technology is the force behind this change, [but] this change is not going to be only a technology change. It is going to be a cultural change, a personal change, an organizational change.” — A.J. Kumar [17:57]
On Oversight:
“The multi-agent approach... gives us a way of controlling the agentic behavior... decomposing tasks, creating specialization in agents, and relying on collaboration, we are seeing that as a pretty general pattern for creating some level of agent oversight and control.” — A.J. Kumar [23:21]