Transcript
A (0:00)
Creating great products isn't just about features or roadmaps. It's about how organizations think, decide, and operate around products. Product Thinking explores the systems, leadership, and culture behind successful product organizations. We're bringing together insights from multiple product leaders, pulled from past conversations, to explore one shared topic, offering different perspectives and lessons from real-world experience. I'm Melissa Perri, and you're listening to the Product Thinking podcast by Product Institute. Today we're looking at how AI is moving from experimentation into real products, and what that means when trust, accuracy, and risk really matter. We'll start with Dr. Maryam Ashoori, who works at IBM and was involved in building watsonx. She breaks down what AI agents actually are, how they reason, plan, and take action, and why those capabilities introduce new challenges when systems start operating on their own. Let's hear from Maryam.
B (1:00)
An AI agent is an intelligent system with reasoning and planning capabilities that can automatically make decisions and take actions. So now, how can I use this? Because it has some sort of reasoning (we can question whether it's true reasoning or just rudimentary planning capabilities), it has the opportunity to break down complex problems and play the role of a partner for brainstorming, for market evaluation, for gathering and analyzing information and data. And then because it has some sort of action-taking, which in the AI world we call tool calling or action calling, I can automate some of the actions that are possible for an agent to execute on. For example, we are having this conversation on the podcast. I can have an agent automatically summarize everything that we are talking about and identify a list of actions depending on what we said. So it uses those reasoning and planning capabilities to identify the actions, and I connect it to some internal systems that enable it to automatically go and execute those action items and get them started. Let's go back to the story of Gen AI in 2023, when you mentioned ChatGPT. There was a series of use cases that Gen AI was applicable to at the time: content generation, code generation, content-grounded question answering, summarization, classification, and information extraction. Mid-2024, we started seeing LLMs expanding to taking actions, the tool calling and function calling capabilities that we talked about. So suddenly businesses saw a window of opportunity to take all that acceleration we previously got from LLMs and, through action calling and function calling, bring it to every single part of their businesses, even the legacy systems. You can think of automation that can come everywhere. You can think of acceleration that can come everywhere.
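The loop Maryam describes, plan first, then call tools, can be sketched in a few lines. Everything here is illustrative: the `summarize` and `create_ticket` tools and the stub `plan` function are hypothetical stand-ins for a real LLM planner and real internal systems.

```python
# Minimal sketch of an agent's tool-calling loop, with a stub "planner"
# standing in for the LLM's reasoning step. All names are hypothetical.

def summarize(text: str) -> str:
    """Toy tool: 'summarize' by returning the first sentence."""
    return text.split(".")[0] + "."

def create_ticket(title: str) -> dict:
    """Toy tool: pretend to open an item in an internal system."""
    return {"ticket": title, "status": "created"}

TOOLS = {"summarize": summarize, "create_ticket": create_ticket}

def plan(transcript: str) -> list:
    """Stand-in for the reasoning/planning step: decide which tools
    to call, and with what arguments."""
    return [
        ("summarize", (transcript,)),
        ("create_ticket", ("Follow up on action items",)),
    ]

def run_agent(transcript: str) -> list:
    """Execute the planned tool calls: the 'action-taking' part."""
    return [TOOLS[name](*args) for name, args in plan(transcript)]

print(run_agent("We discussed agents. We agreed on next steps."))
```

The split matters: the planner only chooses tool names and arguments, while the dispatch table decides what the agent is actually allowed to execute against internal systems.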
So practically any use case that you can think of can potentially benefit from some of the capabilities that are enabled by agents. There is no reasoning, really, no logic of thinking behind LLMs. It's unsupervised learning: the LLM is exposed to a very large body of information, so when you ask a question, it basically calculates the probability of the next token. It's just the world of next-token probability, not understanding of the whole thing. And because of this, the model can hallucinate: by just following the probability rules, there is a high probability that the output it generates sounds plausible. When you look at it, it's like, hey, it looks good. And most of the time the models are very confident in generating this; you can feel the confidence in the tone. But it's not accurate information. It's just a collection of words nicely put together. There are two things that product managers can do. One is using the technology to catch some of those failures. For example, for agents we have been developing guardrails, agentic guardrails, that check things like context relevance and faithfulness to the content. So as a product manager you can put these guardrails in the flow of decision making for the agent, to make sure it stays close to the truth. The second thing is, when designing the workloads, making sure that a human is in the loop when there is sensitive information or when there is a need for very highly accurate output. So for example, if you're using AI to provide recommendations for where to eat, maybe the confidence can be low and nothing bad will happen, unless we are talking about food allergies and such, which is a serious thing. So maybe the product is providing recommendations.
You can make sure that if there is something in that query about food allergies, a human is in the loop, or there is an extra check so that the output is double-validated before it is communicated to the end user.
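The routing Maryam describes, escalating sensitive queries to a human while letting low-stakes ones through, can be sketched like this. The keyword list and the `needs_human_review` check are simplistic assumptions for illustration; a real guardrail would use a trained classifier, not string matching.

```python
import re

# Hedged sketch of a human-in-the-loop guardrail: queries that touch
# sensitive topics (here, a hypothetical keyword list) are routed to
# human review instead of being answered automatically.

SENSITIVE_TERMS = {"allergy", "allergies", "allergic"}

def needs_human_review(query: str) -> bool:
    """Flag queries where a wrong answer carries real risk."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & SENSITIVE_TERMS)

def route(query: str, auto_answer) -> dict:
    """Send risky queries to a human; auto-answer the rest."""
    if needs_human_review(query):
        return {"route": "human_review", "query": query}
    return {"route": "auto", "answer": auto_answer(query)}

# Example usage with a stand-in recommender:
rec = lambda q: "Try the taco place on 5th."
print(route("Where should we eat tonight?", rec))
print(route("Where can I eat with a peanut allergy?", rec))
```

The design point is that the check sits in the flow before the model's answer ever reaches the end user, which is exactly where the "extra validation" step belongs.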
