
Benjamin Shapiro
The Martech Podcast is a proud member of the iHear Everything Podcast Network. Looking to launch or scale your podcast, iHear everything delivers podcast production, growth and monetization solutions that transform your words into profit. Ready to give your brand a voice? Then visit iheareverything.com.
From advertising to software as a service to data across all.
Juan Mendoza
Of our programs and clients, we've seen a 55 to 65% open rate.
Getting brands authentically integrated into content performs better than TV advertising.
Benjamin Shapiro
Typical life span of an article is about 24 to 36 hours.
Juan Mendoza
If we're reaching out to the right person with the right message and a clear call to action, then it's just a matter of timing.
Benjamin Shapiro
Welcome to the Martech Podcast, a member of the I Hear Everything Podcast Network. In this podcast, you'll hear the stories of world class marketers that use technology to drive business results and achieve career success. Here's the host of the Martech Podcast, Benjamin Shapiro.
Welcome to the Martech Podcast. I'm Benjamin Shapiro, the Executive Producer of the Martech Podcast and today we've got a special episode for you which is going to be guest hosted by Juan Mendoza, the author of the Martech Weekly Newsletter. Juan is a recovering Martech consultant turned creator who writes an amazing weekly newsletter about the Martech industry and I'm thrilled to invite him and some of his friends to take the mic and share their knowledge with you, our loyal Martech Podcast listeners. All right, here's a special episode of the Martech Podcast, guest hosted by Juan Mendoza, the author of the Martech Weekly Newsletter.
Juan Mendoza
Welcome back, Martechers. My name is Juan Mendoza, your guest host on the Martech Podcast and the CEO of the MarTech Weekly, a Martech nerd who is excited to talk about AI and how it's hallucinating in the marketplace right now. Joining me today is Philip Miller. He's the Senior Product Marketing Manager for AI at Progress. Progress is innovating at the edge of disruptive change in generative AI, large language models, and the real sort of bleeding edge of technology in using machine learning and other AI models. Relied upon by over 4 million developers and technologists worldwide, Progress empowers organizations to develop, deploy and manage AI powered applications and experiences with agility and ease. So Philip Miller and I are going to discuss navigating generative AI's hallucinations. I'm sure you've seen the memes, you've seen Google recommending that the best way to stick cheese on a pizza is to glue it down, or random stuff that comes out of a ChatGPT session today. This episode is kind of to think about what do we do with that from a marketing standpoint? How much can you actually trust these tools when they don't get it right? How should we treat them? Should we treat them like a coworker, a boss? Or should we treat them like an intern that thinks they know it all, but they really don't? So Philip and I are going to dive into this. He's the Senior Product Marketing Manager for AI at Progress, and we're excited to have him.
Benjamin Shapiro
But before we get to today's interview, I want to tell you about what I'm listening to. Ever wanted to sit down to a candid conversation with marketing leaders from the world's biggest brands? The Current Podcast is your chance. On The Current Podcast, you'll find exclusive interviews with the experts and trendsetters who are on the front lines of digital advertising. And they always leave the ad tech jargon at the door. So subscribe to The Current at www.thecurrent.com or anywhere you get your podcasts today.
Juan Mendoza
Philip, thanks for joining us.
Philip Miller
Well, thanks very much for having me. Yeah, really excited to be here today. Really looking forward to discussing, like you say, hallucinations and how to kind of navigate that in this space. It's a new field to a lot of people. But interestingly, as we've been defining AI in our organization, we've been pulling out some sort of foundational subsets of AI, like machine learning, like expert systems, like retrieval-augmented generation. Now that's new in that space, and that can really help in this arena. So, yeah, really interested to have a chat with you today.
Juan Mendoza
I am really excited, too. I've been dying to do an episode on this topic on AI hallucinations and this whole problem around AI getting stuff wrong. And I think that if we take a step back, is hallucinations the right word here? Because when you say someone's hallucinating, it means that they're either being somewhat knocked unconscious or they're mad. Like they literally have a mental disorder. But we've kind of used this word to explain what happens when these large language models get it wrong. But maybe you can kind of start there with the definition of what is a hallucination and what does it actually mean in the context of large language models?
Philip Miller
Well, it's kind of in our nature to humanize the things around us, to recognize patterns and put shapes in faces. So hallucinations fit the bill. But underlying the hallucinations is really maths. So when you prompt a gen AI, it will then go and look at its model, which comes from its training, and then go, I need to answer this question. Okay, what's the most likely next token? Not necessarily a whole word, but the next token to come up in my response to this prompt. And then it will fill that out until it has basically recognized the prompt and then answered that question. Now, because it's maths, essentially I like to say it's like a weighted number generator. So it's producing these random sequences, but it's saying, actually, I know that this should be here. So sometimes if it doesn't have the answer, but it has to answer the question, it's been told it's got to answer this question, then it will, quote, unquote, hallucinate. It will generate the next word in that thing and can diverge quite quickly from what the answer should be. And for businesses, for organizations, for NGOs, whatever it might be, that is a problem, because either they're relying on these systems internally for efficiency's sake and things like that, or they're pushing this out to customers and that's the way their customers are engaging with them, using these systems. So if things go wrong, then we need to look at that and how do we mitigate that?
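The "weighted number generator" picture above can be sketched in a few lines of Python. This is a minimal illustration, assuming a made-up three-token distribution: the tokens, the probabilities, and the function names are all hypothetical and are not taken from any real model. It shows that sampling by weight usually returns the likeliest token but can occasionally return an unlikely one, which is where an answer can start to diverge.

```python
import random

# Hypothetical toy distribution over the next token. The numbers are invented
# for illustration only; a real model scores many thousands of tokens.
NEXT_TOKEN_PROBS = {
    "Paris": 0.90,
    "Lyon": 0.07,
    "glue": 0.03,
}


def sample_next_token(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def most_likely_token(probs):
    """Return the single highest-probability token (no randomness)."""
    return max(probs, key=probs.get)


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
samples = [sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(1000)]
counts = {token: samples.count(token) for token in NEXT_TOKEN_PROBS}

print("most likely:", most_likely_token(NEXT_TOKEN_PROBS))
print("sampled counts:", counts)
```

The low-probability token still has a non-zero weight, so over enough draws it can be chosen; nothing in the sampling step checks whether the chosen token is factually right.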
Juan Mendoza
It's interesting because the metaphor I use is your typical kid logic, like your toddler logic. So one story, that which I love to retell, which is a father brings a daughter to the airport for the first time. It's the first time she's this little girl has seen airplane, right? Let's say she's five years old. So she's sitting there and she's watching a plane take off, and then she turns to her dad and she said, daddy, why when people travel or when people get on planes and they go somewhere else, is it that they get really small? Why is that? Why do they get really small? And then he had to explain, no, that's just because they're moving into the distance really fast. But she thought that she made the wrong inference there. Just because the plane looks like it's getting smaller, it means that that's what travel is, is getting really small. And I think that's a good metaphor for AI hallucinations, which is it's pattern matching. But sometimes it's just matching on the wrong things. Like I mentioned before, cheese and glue, well, they seem to be sticky type substances that join things together. There's this sort of false positive on things that match together that really don't, which seems to me more of an experience question than like a learning question as opposed to creating random things out of nowhere. It seems to be pattern matching, but just on the wrong sort of categories or the wrong edges.
Philip Miller
There's two things really. AI has been trained on publicly available data. This is data that it has, and I'm not going to go into the ethics of this in terms of how they've got this data and things like that, but I will say that it's been trained on this publicly available data. But there's a whole other data set underneath this that it's never seen before, that is inaccessible to it for whatever reason, and it's that proprietary private data. It's the data in your organization, about your organization, about the products, the services, the domain you exist in, how you describe things in your business. The AI has never seen that. So when you start to get specificity coming into place in these gen AI solutions, i.e., it's not asking a generalist question, it's asking specific questions and you're relying on those giving specific answers, you need to provide it with your data, you need to share with it your view of the world. Now we could, and people have, in a very basic implementation of what they call retrieval-augmented generation, one solution that can help reduce hallucinations, just send all of your data to the gen AI and say, have a look at this. Learn about my business, learn about my world. Everything's there, just have a look at it. Right? And the AI can then make inferences. But again it's based on the gen AI's model that has been trained on that other generalist set. What we've done and what our customers are now deploying is we use a multimodal semantic RAG approach, and these are all fancy words to basically say we take your data and we run it through our Progress Data Platform, which includes two kind of fundamental technologies. One is the MarkLogic server, which is a multimodal server that will ingest the data as is. It will curate and harmonize that data, put it into like a single model of your data for like downstream applications. You can then search that with our multimodal query engine to return results. And that's great.
We also have another feature that works in conjunction with our sister technology in the platform, which is Semaphore, which is this knowledge management tool. It's the knowledge modeling tool. It identifies data about the data and then it puts it into a model of your world, of your business, and can take in external models like industry standard taxonomies or ontologies. Again, big words. But basically these are models of the world that they exist in. And then what you can do is you're not just providing raw data to the gen AI. What you're providing is connected, contextualized data, so that essentially you are telling the AI, you know you exist. Great. But you exist in this world. And this world is really important, that we make sure that we know that this is part of this, which is part of this. Maybe look at it as like the tree of life or other taxonomy models or ontological models of the world. But it's basically putting a lens on your data. And what that allows people to do is the AI then has more context, has more information, but it has correct knowledge of your business. And it's informed by, and this is what we try and do with our customers and how we implement these things, it's informed by having that human in the loop. It's informed by the dad in that conversation. Yeah, you're kind of right, but it's going further away. But that's perspective and not what actually happens. So it's taking those subject matter experts within your business, business experts, whether they be technical or non technical, it's taking the industry knowledge of that world with those external ontologies or taxonomies. It might be standards, say, for instance, or legislation or something like that. And it's applying that to that data so that you really have that data in context and that the AI can understand that data.
And that's where we're seeing when we get more accurate answers, we get specificity to the answers because it's not just saying like standard Google search, you Google something and then you go, it gets to the top page and you have a look and then you have a look down. Doesn't give you necessarily the answer, it just gives you where it found that sequence of words. And what we're saying is no, no, in that document or in that webpage or in that whatever it might be, in paragraph seven there is the answer to your question. And so not only are we getting specificity, but we're getting trust because it's trust but verify that kind of intelligence world approach to everything. So you are helping the end user quickly gain insight and trust into the answers that the gen AI is delivering.
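The retrieve-then-generate flow described above, including pointing the reader at the exact paragraph an answer came from, can be sketched in Python. This is a minimal illustration under stated assumptions: the knowledge base, the document names, the keyword-overlap scoring, and the functions `retrieve` and `build_prompt` are all hypothetical stand-ins. It is not the Progress Data Platform, MarkLogic, or Semaphore API, and it stops at building the prompt rather than calling any model.

```python
import re

# Hypothetical in-house knowledge base: (document id, paragraph number, text).
KNOWLEDGE_BASE = [
    ("returns-policy", 3, "Customers may return unopened products within 30 days of delivery."),
    ("returns-policy", 7, "Refunds are issued to the original payment method within 5 business days."),
    ("shipping-guide", 2, "Standard shipping takes 3 to 5 business days within the country."),
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, kb, top_k=2):
    """Rank paragraphs by how many words they share with the question.

    Plain keyword overlap is used here only to keep the sketch short; a real
    system would use a proper search or semantic index.
    """
    question_words = tokenize(question)
    scored = [
        (len(question_words & tokenize(text)), doc, para, text)
        for doc, para, text in kb
    ]
    scored = [row for row in scored if row[0] > 0]  # drop non-matching paragraphs
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_k]


def build_prompt(question, passages):
    """Assemble a prompt that carries the retrieved context and its sources."""
    context = "\n".join(
        f"[{doc}, paragraph {para}] {text}" for _, doc, para, text in passages
    )
    return (
        "Answer using only the context below and cite the bracketed source.\n"
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


question = "How many days until refunds are issued?"
print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE)))
```

Carrying the document id and paragraph number through to the prompt is what makes the "trust but verify" step possible: a reader can open the cited paragraph and check the answer against it.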
Juan Mendoza
I think that there's some interesting angles on this, which is there's this semantic approach where we're starting to see this, and this will be out in tomorrow's episode when hopefully he'll be back with us, Phil. But we're talking a lot more about integrating generative AI into the enterprise. But I think a big part you called out is that semantic layer, right, which is a taxonomy, which is basically fancy words for a bunch of data and a bunch of information about your specific context in which you want the generative AI tool to operate, whether that be on a specific data set within your marketing or a specific bunch of email campaigns or customer feedback. It's not general intelligence. It's like very good, specialized intelligence that you can kind of build now with these tool sets. So I appreciate you calling that out because that's one way to kind of put the guardrails around hallucinations. The smaller the world, in theory, the lower the rate of error when working with clean data and clean information for that model. You can kind of see, as companies start to look at this, or marketers start to pick it up, there's this opportunity to go, well, how do you minimize that error rate, that hallucination? It's probably making a smaller world for the AI to play in, because a lot of what obviously has made so much noise in the industry is ChatGPT. When it came out back in November 2022, which feels like forever now, it was trained on just the slop of the Internet, just everything. And yes, a lot of the information is actually pretty good. It's pretty high quality. But on the other side, there is a lot of stuff that's not great in it as well. And so you can kind of think through this and go, okay, well, what model application do I actually need? And then maybe that's one way to tackle hallucinations.
Another way I've been thinking about recently, and I would love your thoughts on this, is how much responsibility should you give an AI model if it's getting some things wrong and you can't trust it to be absolutely correct? Well, you can't really trust an intern to be absolutely correct on everything either. But the difference, and maybe I'm wrong here, but the difference is that an intern can at least, well, sheepishly explain how they got it wrong or explain what they did to get to the outcome that was incorrect. How do you approach this? Is this a copilot? Is this an intern that is kind of sometimes wrong and you laugh it off? How do you see the way in which a marketer should think about the error rate with these platforms in their own business and the risks they should take with it?
Philip Miller
Content creation has always been a collaborative effort in any organization. You have editors, you have final checks, you have people looking at the content. I have people. It makes me a better writer. It makes my message be more on point. Or I can use different words and things like that. And we accept that as human beings. So why is that not the same for an AI agent where you're asking it, can you create me a piece of content? Now, you can give it better prompts. You can then include data into those prompts to say, oh, and take this from here. So you're taking that RAG approach, yeah, that retrieval-augmented generation. You're searching your data, you're providing it context, and then you're saying, generate that response. But it's the back end as well. When it comes back, that response, you should be looking at that. Now, I can tell you, because of the platform that we use, what happens to your data for 90% of its journey. I can tell you when that data was created, if that data is talking about other timelines within the document itself. So it might be talking about stuff that happened two weeks ago. And I can create timelines from that. I can tell you whether it's compliant with your internal controls and governance. I can tell you whether it's compliant with legislation. I can classify and I can tell you why I classified that data, why those facts were important, how they're connected, why we connected them. Right. I can tell you right up until the AI. So I can do 90% of the journey. That 10% is in that, and it's that black box. And explainable AI is a field of study, but it's not something that is going to be solved quickly, if ever, because the dimensions that AIs are dealing with are much more complicated than how we view the world. I like to say, like, we have a model of the world in our heads and we only use our perception, the five senses. There's more than that.
But of the five senses that we use, only 2% of that data actually informs how we view the world, roughly. The rest of it is the model. So I do it like this. Walk into the living room, look around, and then you go, this is fine. You don't actually use the eyes and sounds and smells and things like that. Most of that information is from the model, because you expect the living room to be the same. If you open the door and there's a dinosaur there, it goes to the model and goes, should this dinosaur be there? No. Quick, fight or flight.
Juan Mendoza
Right.
Philip Miller
Get out of the door. So you won't necessarily be able to explain why the AI responded as it did, but what you can then do is, with the response, take that and then look at your knowledge base, look at that connected, contextualized data that you have of your thing. Does it exist? Yes, it does. Okay, is this the right answer? And that comes down to that human in the loop, that human-AI collaboration that we see not only for successful classification and connected and contextualized data, but we also see it in the output as well. So marketers are one thing we're really interested in. We have a lot of publishing customers, we have a lot of standards organizations that are publishing adjacent. We have a lot of content use cases in our business, right throughout the content value chain. We have a smarter content series coming out. Essentially the first two are out and another one is next week, dealing with this topic, this next gen application, including the application of gen AI. But it starts with the data. Like, any AI solution starts with your data. Bad data in, bad data out. Good, clean, contextualized, connected data, you're going to get a better response. Will we completely eliminate hallucinations? No, because every time it does it, and a lot of AI people will be very cross with me for saying this, it's rolling the dice, seeing what the response should be and then putting that response in there. That's why that human in the loop on both ends of the spectrum, the prompt and the response, that's why that's so important for organizations.
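The response-side check described above ("does it exist? is this the right answer?") can be sketched as a small triage step in Python. This is a minimal sketch, assuming the generated draft comes back with a list of cited sources; the `DraftAnswer` class, the `triage` function, the status strings, and the source names are all hypothetical. The code only automates the "does the source exist" question; whether the answer is right is left to a human reviewer.

```python
from dataclasses import dataclass, field

# Hypothetical set of (document id, paragraph number) pairs that really exist
# in the organization's knowledge base.
KNOWN_SOURCES = {("returns-policy", 3), ("returns-policy", 7)}


@dataclass
class DraftAnswer:
    """A generated draft plus the sources it claims to rely on."""
    text: str
    citations: list = field(default_factory=list)


def triage(draft, known_sources):
    """Decide how much human scrutiny a draft needs before it goes out.

    Every path ends with a person: drafts with missing or unknown sources are
    flagged for closer review, and the rest still go to an editor for sign-off.
    """
    if not draft.citations:
        return "needs_review: no sources cited"
    unknown = [c for c in draft.citations if c not in known_sources]
    if unknown:
        return f"needs_review: unknown sources {unknown}"
    return "ready_for_editor"


grounded = DraftAnswer("Refunds arrive within 5 business days.", [("returns-policy", 7)])
ungrounded = DraftAnswer("Refunds arrive instantly.")
invented = DraftAnswer("Refunds are doubled on Fridays.", [("pricing-faq", 1)])

for draft in (grounded, ungrounded, invented):
    print(triage(draft, KNOWN_SOURCES), "|", draft.text)
```

The design choice here is that no status means "publish automatically": the check narrows what the human has to look at rather than replacing the human.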
Juan Mendoza
There's this thing that you're touching on which is a very interesting philosophical concept, which is this mental model of the world. And every person's mind has one. You walk into your living room. On the way there, you're predicting what it's going to look like, and your attention on certain things fades into the distance as well. You might be just focused on one thing, like your car keys in your living room. Not the table, not the lounge, not the light switches, not all the other things that are going on. Your mind's focus is actually very selective, ignoring most of the information that's in your domain and selecting the things that are the most important to what you want to achieve or accomplish. And I think that's kind of an interesting way of thinking about generative AI, because in a way it's got billions and billions of data sets at its fingertips, right? And then it has to sift through all of that in order to find the one focal point, the one needle in the haystack that is actually the correct answer, that is then formulated the right way, and that makes sense for a human. Now that is a technological feat itself and that is something worth celebrating. And I think that's kind of where we're heading here. Yes, there's problems, yes, there's hallucinations, but there are ways, explainable AI, as you mentioned, there are ways to actually overcome this and there are ways to sort of minimize that error rate by giving a better focal point for what you want to achieve with AI. Instead of taking ChatGPT off the shelf, looking to apply it to specific domains of business and knowledge would just be a far more sound way of thinking about integrating this technology. So Phil, this has been a fabulous conversation, very fascinating, very interesting. A lot of new things happening in this space, but I do appreciate your expertise in navigating generative AI hallucinations.
We could talk about this all day long, but we can't because you have to head off, and we'll be catching you again tomorrow. So thank you to Philip Miller from Progress for joining us. In our second interview tomorrow, Philip and I are going to discuss developing and deploying AI powered apps successfully in the enterprise. If you can't wait until our next episode and you'd like to learn more about Philip, you can find a link to his LinkedIn profile in our show notes or you can visit his company website at progress.com. Phil, thank you so much for joining us.
Philip Miller
Brilliant, thanks very much for having me. Cheers.
Benjamin Shapiro
Okay, that wraps up this episode of the Martech Podcast. Thanks to our guest host, Juan Mendoza, the author of the Martech Weekly newsletter. If you'd like to get in touch with Juan, you could find a link to his LinkedIn profile in our show notes or you can contact him on Twitter. His handle is Juan Mendoza, but it's spelled crazy pants. It's J U 4 N M E N D 0 Z 4. Or it's a little easier to just visit his company's website, which is themartechweekly.com. A special thanks to The Current Podcast for sponsoring today's interview. If you're looking for candid conversations with marketing leaders from the world's biggest brands, then give The Current Podcast a listen. On The Current Podcast you'll find exclusive interviews with experts and trendsetters who are on the front lines of digital advertising, and they always leave the ad tech jargon at the door. So subscribe to The Current at www.thecurrent.com or anywhere you get your podcasts today. Just one more link in our show notes I'd like to tell you about. If you didn't have a chance to take notes while you were listening to this podcast, head over to martechpod.com where we have summaries of all of our episodes and contact information for our guests. You can also subscribe to our weekly newsletters and you can even send us your topic suggestions or your marketing questions, which we'll answer live on our show. Of course, you can always reach out on social media. Our handle is martechpod, M A R T E C H P O D, on LinkedIn, Twitter, Instagram and Facebook. Or you can contact me directly. My handle is benjshap, B E N J S H A P. And if you haven't subscribed yet and you want a daily stream of marketing and technology knowledge in your podcast feed, we're going to publish an episode every day this year, so hit the subscribe button in your podcast app and we'll be back in your feed tomorrow morning. All right, that's it for today, but until next time, my advice is to just focus on keeping your customers happy.
Thanks for listening to the MarTech Podcast, an I Hear Everything production. Looking to launch or scale a podcast like this one for your brand? Then visit iheareverything.com.
Episode: Navigating GenAI's Hallucinations
Release Date: November 26, 2024
Host: I Hear Everything Podcast Network
Guest Hosts: Juan Mendoza & Philip Miller
In this insightful episode of the MarTech Podcast™, guest host Juan Mendoza delves into the intriguing yet challenging phenomenon of generative AI (GenAI) hallucinations. Joined by Philip Miller, Senior Product Marketing Manager for AI at Progress, the discussion unpacks the complexities of AI-generated inaccuracies and explores strategies to mitigate their impact on marketing and business operations.
Philip Miller begins by demystifying AI hallucinations, clarifying that these are not instances of AI "losing it" but rather mathematical outcomes of how large language models generate responses.
“When you prompt a Gen AI, it looks at its training model to determine the most likely next token... sometimes if it doesn't have the answer, it has to generate something, which can lead to what we call hallucinations.”
— Philip Miller [04:58]
He explains that hallucinations arise because AI models predict the next most probable word based on their training data, which doesn't always guarantee factual accuracy.
Juan Mendoza uses a relatable analogy to illustrate AI hallucinations, comparing them to a child who sees a departing airplane appear to shrink and concludes that travelers get smaller, until her father explains that the plane is simply moving into the distance.
“...the plane looks like it's getting smaller, it means that’s what travel is, is getting really small.”
— Juan Mendoza [06:27]
This metaphor highlights how AI, much like a child, can misinterpret patterns and make inaccurate inferences when context is limited or misaligned.
The conversation shifts to strategies for reducing AI errors. Philip Miller emphasizes the importance of integrating semantic layers—structured taxonomies and ontologies—that provide context-specific data to GenAI systems.
“We take your data and run it through our Progress data platform... providing connected contextualized data that essentially you are telling the AI you know you exist in this world.”
— Philip Miller [11:35]
By using tools like multimodal semantic retrieval-augmented generation (RAG), organizations can ensure that AI systems are grounded in accurate, domain-specific information, thereby minimizing hallucinations.
Juan Mendoza raises critical questions about the responsibility placed on AI systems, likening AI to an intern in terms of error rates and accountability.
“How much responsibility should you give an AI model if it's getting some things wrong and you can't trust it to be absolutely correct? It’s like an intern that is kind of sometimes wrong and you laugh it off.”
— Juan Mendoza [14:37]
This comparison sparks a discussion on the balance between leveraging AI for efficiency and maintaining oversight to catch and correct mistakes, highlighting the necessity of human involvement.
Philip Miller advocates for a collaborative approach where humans remain integral to both prompting AI systems and verifying their outputs.
“Human in the loop on both ends of the spectrum, the prompt and the response, that's why that's so important for organizations.”
— Philip Miller [17:10]
He underscores that while AI can significantly enhance content creation and data analysis, human expertise is crucial for ensuring accuracy and contextual relevance.
Juan Mendoza expands the discussion to a philosophical level, contemplating the mental models AI uses to interpret data and the ongoing quest for explainable AI.
“It's a technological feat itself and that is something worth celebrating... but you can kind of think through this and go, what model application do I actually need?”
— Juan Mendoza [18:24]
This reflection acknowledges both the impressive capabilities and the inherent limitations of current AI technologies, pointing towards the future of more specialized and trustworthy AI applications.
The episode concludes with anticipation for the next discussion on developing and deploying AI-powered applications in the enterprise. Juan Mendoza and Philip Miller provide listeners with resources to learn more about their work and encourage continued exploration of AI's role in marketing and business growth.
“Thank you to Philip Miller from Progress for joining us... If you can't wait until our next episode, visit his company website at progress.com.”
— Juan Mendoza [20:28]
For more detailed insights and summaries of all episodes, visit martechpod.com. Stay connected with the MarTech Podcast™ on LinkedIn, Twitter, Instagram, and Facebook @martechpod.