Podcast Summary

Everyday AI Podcast

Episode 786: "2026 LLM Cheat Code: 10 Essential Steps To Get the Most out of Any AI Chatbot" (Start Here Series Vol. 26)
Host: Jordan Wilson
Date: May 28, 2026

Episode Overview

This episode of the Everyday AI Podcast, hosted by Jordan Wilson, provides a comprehensive framework for effectively leveraging any large language model (LLM) chatbot in 2026. Jordan distills insights from years of rapid AI evolution and shares the "10 essential steps" that apply across the biggest AI platforms—ChatGPT, Claude, Gemini, Copilot, and more. The focus is on practical, actionable best practices that help everyday professionals and enterprises harness AI for dramatically improved productivity, output, and business value.

Key Discussion Points & Insights

1. The State of LLMs in 2026: Cookie Cutter Era (00:16–04:00)

Main Idea: Previously, every AI model and interface felt different—now, major players have standardized core interfaces and capabilities.
- "It's not like there's a cheat code, right? ... But there's one thing I found out over the last three or four months, the big players have all just kind of started to copy each other, which is actually a good thing for you." (Jordan, 01:28)
Implication: Best practices can now be applied almost universally.
Challenge: The flip side is innovation on the surface can appear slow, but under the hood, models are getting smarter and automation is becoming default.

2. The 10 Essential Steps for LLM Mastery

Step 1: Understand What an LLM Is (and Isn't) (13:42)

LLMs are generative and non-deterministic, not search engines or simple chatbots.
Outputs to identical prompts can vary widely—context, prompt quality, and model choice dramatically affect outcomes.
- "A large language model is not a search engine, it is not a chatbot. It is a reasoning engine that combined with your data and your human knowledge can output economically viable work at a rate better and faster than humans." (Jordan, 14:53)
Notable Quote: "If you're only using ChatGPT or Claude...as a smarter, faster, better source version of Google, you're missing out." (Jordan, 13:45)
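
To make the generative versus deterministic distinction concrete, here is a minimal Python sketch. It is a toy and does not describe how any production model works: the `NEXT_TOKEN_PROBS` table, its probabilities, and the `lookup` and `sample` helpers are hypothetical names invented for this illustration. It only shows why the same input can produce different outputs when the next token is drawn at random rather than looked up.

```python
import random

# Hypothetical toy distribution over next tokens; not taken from any real model.
NEXT_TOKEN_PROBS = {
    "The quarterly report": [("shows", 0.5), ("indicates", 0.3), ("suggests", 0.2)],
}


def lookup(query: str) -> str:
    """Deterministic: always returns the single highest-probability continuation."""
    options = NEXT_TOKEN_PROBS[query]
    return max(options, key=lambda pair: pair[1])[0]


def sample(query: str, rng: random.Random) -> str:
    """Generative: draws one continuation at random, weighted by probability."""
    tokens, weights = zip(*NEXT_TOKEN_PROBS[query])
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()
    prompt = "The quarterly report"
    print("lookup:", [lookup(prompt) for _ in range(5)])       # same answer every time
    print("sample:", [sample(prompt, rng) for _ in range(5)])  # may differ run to run
```

Running the script a few times should show `lookup` repeating one answer while `sample` varies, which is the behavior the episode describes when many listeners submit the same prompt.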

Step 2: Choose Your AI Operating System (17:38)

Opt to standardize most of your work within a primary platform (ChatGPT, Claude, Gemini, etc.).
Modular approach: Be ready to pivot parts of your workflow as new features or models emerge.
Migration and transfer of context between platforms is more streamlined but remains fundamentally "a very in-depth and detailed prompt."

Step 3: Select the Right Surface (20:50)

Surfaces (interfaces) are now shifting: from web to desktop, with desktop versions offering deeper access to local data, built-in browsers, and software control.
Choose based on your organization's privacy needs, automation requirements, and speed.
- "Everything is moving to the desktop...it can read and write on the desktop surface, which is huge." (Jordan, 22:40)

Step 4: Use the Right Account Plan and Model (24:36)

Critical advice: Avoid free plans for anything beyond basic experimentation.
- "People, stop using a free plan, period. ... If you're making any business decisions based on using a free or a non-thinking model, ... you, your department, your company...not gonna make it." (Jordan, 24:38)
Modern features and 'thinking' models are only on paid tiers.

Step 5: Master the Context Layer (28:14)

Context determines what the AI "sees"—training data (often outdated), your added data (files, app connectors), and the web.
Understand context windows: Data outside the window gets ignored or dropped—quality and order matter.
- Analogy: Star Wars scrolling credits as context window—older data scrolls off screen.
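
The scrolling-off behavior can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's actual policy: `count_tokens` is a crude stand-in that counts whitespace-separated words (real tokenizers differ), and `fit_to_window` is a hypothetical helper that keeps only the most recent messages. Real platforms may truncate, summarize, or handle overflow in other ways.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit; older ones are dropped without warning."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # everything older than this point falls off the screen
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "Always write in our brand voice and cite sources.",       # oldest: the instructions
    "Here is the Q1 sales summary for the EMEA region.",
    "Draft a two paragraph update for the leadership team.",   # newest: the request
]

# With a 20-token budget, the original instructions no longer fit.
visible = fit_to_window(history, max_tokens=20)
```

In this toy run the first message, which carried the standing instructions, is the one that disappears. That matches the point that early context is often the most important and that nothing tells you when it has been dropped.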

Step 6: Context Engineering: Prompting & Role Design (32:10)

Go beyond basic prompts: Use structured methods like prime-prompt-polish, and understand when to use AI or handle tasks manually.
Include roles, goals, sources, examples to clarify context; iterate on prompts for better performance.
- "The first output that a model gives you is usually garbage, even if you do everything else correctly." (Jordan, 33:52)
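
One way to picture "roles, goals, sources, examples" is as fields assembled into a single block of text before anything is asked of the model. The sketch below is only that: `PromptContext` and its `render` method are hypothetical names, the section labels are an arbitrary choice, and none of it is an API from ChatGPT, Claude, Gemini, or Copilot.

```python
from dataclasses import dataclass, field


@dataclass
class PromptContext:
    """Hypothetical container for the pieces of context named in this step."""

    role: str
    goal: str
    sources: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Assemble the fields into one labeled block of text."""
        sections = [f"Role: {self.role}", f"Goal: {self.goal}"]
        if self.sources:
            sections.append("Sources:\n" + "\n".join(f"- {s}" for s in self.sources))
        if self.examples:
            sections.append("Examples:\n" + "\n".join(f"- {e}" for e in self.examples))
        return "\n\n".join(sections)


context = PromptContext(
    role="Senior B2B content editor",
    goal="Summarize the attached webinar for a customer newsletter",
    sources=["webinar_transcript.txt"],  # placeholder file name
    examples=["Last month's newsletter summary"],
)
prompt_text = context.render()
```

Keeping the pieces as separate fields makes it easier to change one of them between iterations, for example swapping the examples, without rewriting the whole prompt.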

Step 7: Integrate with Files, Apps, and Company Data (35:00)

Modern AI platforms now offer deep integrations (read/write) with common business apps, files, and connectors (e.g., Gmail, CRM, project management).
Agents and connectors reduce human "duct tape" by moving and transforming data between tools automatically.
- "Files make AI far more specific to the actual task... you still have to tell it when... and sometimes direct it to look into what chapter." (Jordan, 35:11)
Reference to earlier episodes: "Agentic Context Carry" for building workflows across apps.

Step 8: Privacy, Permissions, and Governance (38:10)

Always follow approved channels and permissions when connecting company data.
Proper setups ensure security matches or exceeds traditional cloud vendors.
- "If you upload, you know, your company's data to a cloud it is the exact same thing... as using these connectors in these apps." (Jordan, 38:28)
Well-designed governance means not relying on shadow IT or personal accounts for business workflows.

Step 9: Transparency, Observability, and Reasoning Artifacts (40:14)

Track and understand every step AI takes in workflows, from context ingestion to final output.
Enterprise platforms make it easier to audit usage, track actions by agents, and view documentation of decision-making ("reasoning artifacts").
- "The steps in the middle are sometimes the most overlooked, but they're the most important, because if you don't understand what's going on, you don't own it." (Jordan, 41:07)
Critical as models and agentic tools evolve rapidly.

Step 10: Verification, Iteration, and Workflow Design (41:57)

Never accept first outputs—review, refine, and iterate to achieve high-quality, on-brand results.
Once you have a solid workflow, convert it into repeatable, automated processes: skills, plugins, or scheduled agentic workflows.
- "Iteration is huge... Even if you do everything else, steps one through nine correctly, your first output...is at best case...generic garbage." (Jordan, 41:58)
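
The review-and-refine habit can be expressed as a small loop. This is a sketch under assumptions rather than a workflow from the episode: `refine`, `generate`, and `review` are hypothetical names, `generate` stands in for whatever model call you use, and `review` stands in for a human or automated check that returns feedback, or `None` once the draft is acceptable.

```python
from typing import Callable, Optional


def refine(
    prompt: str,
    generate: Callable[[str], str],
    review: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Return the final draft and the number of revision rounds that were used."""
    draft = generate(prompt)  # the first output is a starting point, not the answer
    rounds = 0
    while rounds < max_rounds:
        feedback = review(draft)
        if feedback is None:  # reviewer accepts the draft
            break
        # Feed the previous draft and the feedback back in as added context.
        draft = generate(
            f"{prompt}\n\nPrevious draft:\n{draft}\n\nFeedback:\n{feedback}"
        )
        rounds += 1
    return draft, rounds
```

Capping the loop with `max_rounds` is a design choice made here so that a reviewer who is never satisfied cannot keep the process running forever. Once a loop like this produces consistently good results, it is the kind of process that could be saved as a repeatable workflow.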

Notable Quotes & Memorable Moments

Cookie Cutter LLMs:
- "We have the McMansions of models because they're all kind of the same. Obviously the capabilities and the harnessing of the tool is all completely different and unique. But I think for the first time...we have a set of rules that can apply unilaterally." (Jordan, 02:44)
The Model Is Not a Chatbot:
- "A large language model is not a search engine, it is not a chatbot. It is a reasoning engine that, combined with your data and your human knowledge, can output economically viable work at a rate better and faster than humans." (Jordan, 14:53)
Stop Using Free Plans:
- "People, stop using a free plan, period. ... If you're making any business decisions based on using a free or a non-thinking model, ... you, your department, your company...not gonna make it." (Jordan, 24:38)
On Context Windows:
- "Think of [context window] as the Star Wars credits ... comes in, words come in slightly slanted and...eventually the words, they get smaller, smaller, smaller, and then they're off the screen. Think of that as a context window." (Jordan, 28:30)
First Output Is Garbage:
- "Your first output for the most part from a large language model, even a great one, is at best case it's going to be generic garbage." (Jordan, 33:52)

Timestamps for Important Segments

| Timestamp | Segment | Key Topic/Event |
|-------------|----------------------------------------------|---------------------------------------------------|
| 00:16–04:00 | Introduction, AI Overload & LLM Convergence | The ‘fire hose’ pace of AI/AI platforms converging |
| 13:42 | Step 1: What Is (and Isn’t) an LLM? | Defining LLMs, generative vs. deterministic |
| 17:38 | Step 2: Choosing Your OS | Pick a core AI platform; modular approach |
| 20:50 | Step 3: Choose the Right Surface | Shift from web to desktop, key considerations |
| 24:36 | Step 4: Paid Models Only | Why free plans don’t cut it |
| 28:14 | Step 5: Context Layer | Sources of context, pitfalls of old data |
| 32:10 | Step 6: Context Engineering | Prime-prompt-polish, structured prompt techniques |
| 35:00 | Step 7: Files, Apps, Company Data | Deep integrations, connectors, agents |
| 38:10 | Step 8: Governance and Permissions | Privacy, shadow IT, why permissions matter |
| 40:14 | Step 9: Transparency & Observability | Traceability, reasoning artifacts, audit trails |
| 41:57 | Step 10: Verification and Iteration | How to refine outputs & turn them into workflows |

Practical Takeaways

The chaos of LLM innovation has led to a surprisingly uniform set of best practices—adopt them for supercharged results, no matter your preferred tool.
Start with a proper platform, plan, and interface; always wrangle the context layer; iterate outputs and make them part of scheduled, transparent, automated workflows.
Data safety and governance are as critical as technical skills in today’s agent-powered workplaces.

Additional Resources Mentioned

Start Here Series:
- Dedicated, ordered episodes on AI basics & advanced topics at starthereseries.com
- Free “inner circle” community for support and learning
Recommended Past Episodes:
- "Agentic Context Carry" (details on connecting workflows & context across platforms)
- "The Seven Deadly Sins of AI" (pitfalls in LLM work)

Final Note from Jordan (43:31)

"I want to cut through the bs. I want to tell people exactly how these models work, and I want to be able to do it for as for free as long as I can, right? ... Go to starthereseries.com to get exclusive access to all of the episodes in this series."

For anyone overwhelmed by AI, this episode condenses years of trial, error, and evolution into a practical, step-by-step playbook that just works—no matter what logo is on your chatbot.

Transcript

[00:01]
Everyday AI Host
This is the Everyday AI show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business and everyday life.
[00:17]
Jordan
The saying that keeping up with AI is like trying to drink water from a fire hose, though it's tired and cliché, is obviously 100% true. I mean, with all the nonstop updates, your head is probably spinning trying to keep up. So you're thinking, with all these new releases, how do I use ChatGPT? Or is using Claude kind of the same as using Gemini on the desktop? Or can I prompt Copilot kind of like I'd prompt Codex? Or maybe you've had to start the race sprinting and you never really got the proper 101 on how all of these AI chatbots work. Regardless, you definitely aren't alone in the fire hose updates drowning you out. That's the new struggle for every enterprise. Because it's not like there's a cheat code, right? Because as the AI models are changing almost daily, so too does the input required to get the best output. And I get it. As someone that covers AI daily, I understand the struggle of trying to play a game, yet the field dimensions change without notice. And you were using a softball yesterday, but cricket gear today, and tomorrow you might kind of play it out like rugby. The input rules and the output capabilities are moving targets. But there's one thing I found out over the last three or four months: the big players have all just kind of started to copy each other, which is actually a good thing for you. So I think it's actually been over the last two months that we've finally stumbled on a set of best practices for getting the best outputs out of any large language model. So that means, well, there is a cheat code or at least a set of somewhat concrete rules that, when followed, will give you stellar outputs, really, no matter what model you're using. And that's exactly what we're going to be giving you today on Everyday AI. Welcome to the Start Here series. If you're new here, the Start Here series is your essential guide to getting caught up and getting ahead with AI. But first, let's talk about the big picture.
So the capabilities of these models are very, very real and it is hard to keep up, right? So right now you probably see examples all over the place. You see people sharing examples, whether it's online, whether it's media articles that are so poorly written, and you're like, well, these AI models are really bad. Look at all these mistakes. But then you also see these benchmarks and these tests and, you know, these big companies laying people off in lieu of spending billions of dollars on AI. And you're confused because you can't even know or don't even really understand what button to click or which model should I use. And up until recently, it's been too hard to follow because there's been too many different paths. And now I think essentially we have cookie cutters, right? We have the McMansions of models because they're all kind of the same. Obviously the capabilities and the harnessing of the tool is all completely different and unique. But I think for the first time, maybe ever, at least, when it comes to, you know, the three to five handful of big AI players, we have a set of rules that can apply unilaterally to really any of them. So that's what we're going to be going over on today's show. Stick with me. This one, trust me, is going to be 30 minutes. All right? I know sometimes I say 25 to 30 and then you're like 45 minutes in, but stick with me on today's show. Here's what you're going to learn. You're going to know why the cookie cutter nature of today's large language models is actually both a good and a bad thing. You're going to know the up-to-date capabilities, features, pros and cons of the big players, specifically focusing on ChatGPT, Claude, Gemini, and, technically applicable, Copilot, but we're not going to go into that one as deep. As well, I mean, you can apply it to Perplexity, Grok, open models, etc., because they are kind of all getting the same.
And you're going to leave today's show with the 10 essential steps to get the most out of any AI chatbot. All right, this is the Start Here series. This is the essential podcast series to both learn the AI basics and to double down on your AI knowledge. Because after doing this for three and a half years, nearly almost 800 episodes now, I never had a good answer when people ask, Jordan, I'm new to the podcast, where do I start? Well, you start here with the Start Here series and you go to starthereseries.com. That is going to give you free access to our inner circle community. It's exclusive. This is the only way you can actually get access. In the Start Here series space inside of our free community, you can go find every single Start Here series episode all in one easy to find place. A Spotify playlist that's updated, all the newsletters. Go read it, it's awesome. All right, so if you missed our last Start Here series episode, we covered Build, Buy, Partner, or Wait: the four-layered AI stack decision framework for 2026, and today we are giving you the LLM cheat code: 10 essential steps to get the most out of any AI chatbot. All right, and I'm just going to give you those steps now and we're obviously going to break them down as we go along, but here they are, kind of steps, best practices, rules to live by, whatever. But number one, you have to understand what a large language model is and what it is not. Number two, you have to choose your AI operating system like ChatGPT, Claude, Gemini or Copilot and really stick to those as much as you can for the majority of your day-to-day knowledge work. Number three, you have to choose the right surface, and the surfaces are unfortunately changing, but I think it's actually for the good. Number four, you have to choose the right account plan and model. I'm going to break down those best practices as well. Number five, you have to understand the context layer first.
You have to understand it before you get to number six, context engineering basics. I'm going to tell you Prime, Prompt, Polish; Refine Q; 555; a lot of the secrets that I've kind of held close to the vest throughout the years. Number seven, I'm going to tell you the how and why and risks of working with files, apps and company data and why that matters. Number eight, we're going to talk about privacy, permissions and governance and how that actually plays out in these AI operating systems. Yeah, I call AI chatbots now AI operating systems because that's what they are. Number nine, transparency, observability and reasoning artifacts. I'm going to show you how to use those things to your advantage to get better outputs. And last but not least, verification, iteration and workflow design. There you go. You can stop here if you want, but trust me, this is like thousands of hours of conversations over the past three years and I'm going to do my darndest to get them to you in 30 minutes or less. So here's what I want to talk about, why this is both a good and a bad thing. And I think this really became evident when Google, at their I/O conference, like, last week. And I know, right? So right now it's the end of May 2026. I should put this out there because I know, you know, a lot of you might be listening to this in, you know, July or December. So obviously some of the things are going to change here when I'm talking about certain surfaces, models, modes, et cetera. But hopefully these concrete 10 steps will still hold. But I noticed at Google's I/O conference when they announced their new Antigravity 2.0 desktop app, right? And not going to get into it, but I'm like, wait, this just looks like Codex. It functions like Codex. And then I'm like, wait, Cursor kind of does too. And that's when I started to notice that for the most part, especially compared to 18 months ago, when all of the different, whether we're talking about the web interfaces, right?
gemini.google.com, claude.ai, chatgpt.com, grok.com, whatever you're using. 18 months ago they all looked really different. It was almost like speaking different languages. Now it's all speaking the same language with slightly different dialects, right? Like Antigravity literally looks like it is Codex light, right? Color schemes, fonts, layouts all look the same. So it's good and bad. From a bad perspective, it might seem to you, the non-technical user, that the front-facing innovation is kind of slow, right? As an example,
[09:06]
Start Here Series Narrator
AI moves too fast to follow, but you're expected to keep up. Otherwise your career or company might lag behind while AI native competitors leap ahead. But you don't have 10 hours a day to understand it all. That's what I do for you. But after 700 plus episodes of Everyday AI, the most common question I get
[09:27]
Jordan
is where do I start?
[09:29]
Start Here Series Narrator
That's why we created the Start Here series, an ongoing podcast series of more than a dozen episodes you can listen to in order. It covers the AI basics for beginners and sharpens the skills of AI champions pushing their companies forward. In the ongoing series, we explain complex trends in simple language that you can turn into action. There's three ways to jump in. Number one, go scroll back to the first one in episode 691. Number two, tap the link in your show notes at any time for the Start Here series. Or you can just go to starthereseries.com, which also gives you free access to our inner circle community where you can connect with other business leaders doing the same. The Start Here series will slow down the pace of AI so you can get ahead.
[10:18]
Jordan
Go back to, you know, 2024. Front-end innovation for users, and maybe early in 2025, was everywhere, right? All of a sudden we had artifacts from Claude and then we had, you know, canvas mode and, you know, we had these agents that worked inside. And all of a sudden, you know, large language models went from simple, you know, next token prediction transformers to models that could think and reason and call tools, right? So I think there was this period of fast innovation in terms of what buttons you would click on all of these modes, deep research, etc. But if I'm being honest, in 2026 we haven't seen that same thing. And I think that's actually good because what happened is, I think, number one, now we can kind of apply these best practice rules. But more than anything else, I think it allows enterprises to be able to modularly pivot as needed. Right? Because 18 months ago, if you were a heavy ChatGPT Teams user, back when that's what it was called before it turned into ChatGPT Business, and then you tried to go to, you know, Gemini, right? There was no Gemini Business at the time, it was just Gemini. You'd be confused. But now they've kind of all copied each other with certain features, projects, GPTs, Gems, right? They're all kind of just the same thing now. But the actual race is smarter, faster models with autonomous desktop harnesses. So the surface is changing a little bit. But ultimately the thing that you have to keep in mind, if you don't get anything else from today's show: AI models are smarter than all of us if you're using them the right way, and that's a big if. And when I talk about jagged capabilities, right, people will share something where it's like, oh, AI models are dumb. And then you have this false sense of security, like, oh, I don't have to worry about keeping up with these day to day. Yes you do. Anyone that shares those things, that's a skill issue.
You know, send them this video and then they'll be like, oh, I've been using everything wrong. But AI models by default, studies show this, even judged blindly by experts, today's AI models with all of this harnessing built around them, they can produce artifacts one shot, right? You know, complete websites, complete apps, spreadsheets, PowerPoints, Word docs, PDFs, etc. Right? So they're no longer just a little chatbot to talk to. And we've covered that in depth in the Start Here series so far. So let's dive in. We have some upgraded visuals today, but let's get into our kind of the 10 new rules of work, I'll say. So number one, what is a large language model and what is it not? Well, if you're only using ChatGPT or Claude or whatever as a version of Google, as a smarter, faster, better source version of Google, you're missing out, right? If you're just treating it back and forth like talking to a smart friend, you're also missing out. So by default, large language models are generative. So that means they are non-deterministic, where a search engine like Google, aside from localization and personalization, is deterministic, right? If we put in the exact same query, chances are, aside from that personalization and localization, if we all did an incognito Google search, right, five years ago, before AI Overviews and AI Mode, we for the most part all would have gotten the exact same outputs, right? Even if there's, you know, 20,000 people listening to this episode and we all did it, we would for the most part get the exact same results. It's not the same with generative AI; it is generative. There is an element of next token prediction. And essentially large language models have been trained on the entirety of the Internet. Copyrighted works, you know, open source coding projects, videos, photos, everything, just terabytes of data.
So the old quote, unquote, I always say that there is a line in the sand that came with OpenAI with the first widely available reasoning model. So previously when you first started using ChatGPT, or maybe you're listening to this and it's been a while, right? The older models were transformers and that's really all they were, next token predictors, but still they're generative. So the whole point is if the 20,000 people listening to this show all put the same prompt in ChatGPT or Claude or Copilot or whatever, you're probably going to get very different results, right? There might be 2,000 very different results, there might be 20,000 very different results, right? It kind of goes through this next token prediction process. But that is why you have to really understand how these work and build up a context and good practices to get the best output. So there's also in this, without going too deep, there's a lot of nuance to this, right? Because now these models can think very much like a human would. And we're going to get to that a little bit later. But for the most part, a large language model is not a search engine, it is not a chatbot. It is a reasoning engine that, combined with your data and your human knowledge, can output economically viable work at a rate better and faster than humans. All right, so that is number one, what a large language model is and what it isn't. So now that we know what it is, where do we use it? All right, so also in the Start Here series I had an entire episode on this concept of an AI operating system. But I think that's important to talk about because now they can all connect with your data. All right. And depending on, you know, what route you go down, I do think it's best to move most of your organization, most of your enterprise, into one AI operating system. Right. So personally I'm using ChatGPT via Codex more than anything.
A lot of people are using Claude, whether that's on the web or the desktop. We're going to get to that in a minute. There's also Gemini for those people who are really ingrained in the kind of Google Workspace ecosphere. And then obviously with Copilot it's a little more difficult to explain because there's so many hoops to jump through in terms of permissions, role-based access control, all these things. It's a little convoluted. But I always tell companies, try to move the majority of your day-to-day operations inside one of these AI operating systems. Obviously you should be building modularly. So when there are, you know, new kind of frontier breakthroughs or huge advantages, maybe a section of your team can take that context over to another one. Also, now every single big player makes it fairly easy to bring context over. So whether that's things like skills and plugins, your context, memory, chat history, etc., there's easy ways to port this over. It is more or less just a very in-depth and detailed prompt, FYI. All right, so number two is you have to choose the operating system. Number three, you have to choose the right surface. And this, at least in 2026, is where we've seen the most movement, not necessarily what we saw in, you know, mid 2024 to mid 2025, which is the features and the modes, the deep researches, right, the, you know, agentic capabilities with these reasoning models by default, the connectors, the apps. The surface is what's changing. So what do I mean by that? In 2022 with the advent of ChatGPT, it was the web, we went and chatted, right? Then we started to bring our teams on board and you could collaborate with others in kind of a shared workspace. And now in, you know, late 2025 with Claude Code, then Claude Cowork in early 2026 and now Codex and everyone else, right?
Google got on board with their Gemini Desktop app, with their Antigravity, which I don't think is that great, but hopefully it'll get better. Right now everything is moving to the desktop, and one of the reasons is, as well, it can use kind of your local machine to do a lot more, faster, and access your data. It can read and write on the desktop surface, which is huge. And then obviously everything is agentic by default. So there's no easy answer to say, okay, well should I still use the web version? Should I use the desktop version? You know, to get to that answer, you have to sit down and talk about what's important in terms of access, in terms of automations, in terms of, you know, what you want in the cloud versus running locally. Actually, you know, although there's a lot of risk, reward, trade off with a desktop version, in some instances it might be better for some companies that are, you know, privacy first. So number three, you know, is understanding that that surface dictates access, right? So because it's important to know that the agent tools, I mean they can use your actual browsers, right? And that's one of the biggest advantages I think to the desktop surface aside from being able to read and write to your local folders. Right? But to be able to use your actual browsers, I think Codex's built-in browser is so far ahead. Computer use technology, where it can literally, you know, use your computer, launch different programs. I have mine, you know, it can go through and read my iMessage on my computer and, you know, open up all these different desktop programs if there isn't a built-in app or connector. All right, number four, choose the right model. People, stop using a free plan, period. If your company is like, yeah, use the free plan until we get this approved. Don't do that, the risks are too high. All right?
I don't care if you're using ChatGPT, Claude, Gemini, it doesn't matter. For the most part you might get like one or two quote unquote prompts or outputs, you know, with a decent enough plan, if you're on a decent enough model. If you're on a free plan, don't do it, it's not worth it. If you want real outputs, you have to use a paid model, right? All these people that are sharing things online and you're like, oh, AI is so dumb, right? They're usually using an old model, a free model, and they really just don't understand kind of the underlying harness. So this is like, you know, you could have a convertible body, but oh, the engine is actually, you know, a dude on a bike, right? That's the equivalent of using a free model. You can't hide it in the fact that, like, oh, I'm using Claude. No, if you're using the free model, right, for the most part, you probably don't even know because companies don't always tell you what version it is, right? Like, did you know that there's a GPT 5.4 latest? Like, no, you don't. Because unless you're like me, you're not out there testing these things, right? You just think, oh, I'm using the best in the world. And you probably aren't if you're on a free plan. And you also need to be using thinking models. Do not use these models that don't think, don't reason, all right? You need to be using those always, even if that means waiting an extra 10, 20, 30 seconds. All right, I'm going to get to this later, but read the chain of thought. Do some exercise. Have multiple, you know, prompts running at the same time and, you know, dictate, you know, move between. I do think the job of the future is agentic orchestration. All right? So don't just, you know, go and chat with one, you know, free chatbot thread. At that point, you're wasting your time. And if you're making any business decisions based on using a free or a non-thinking model, I'm sorry, as the kids say, NGMI, not gonna make it. You, your department, your company.
If you're doing that, if that's part of your actual strategy, don't. Right? Just, just please don't. Unless you're just trying to learn, like where all the buttons are, like, how do I work this thing? That's the only time you should be doing that. All right, so number five, understanding the context layer. And this part is important before we even dive into the, you know, best practices of context engineering. But context is essentially everything that an AI model can see or not see, right? The way that I've, you know, we've taught it. We used to do like two live, you know, kind of prompt engineering courses a week for like two and a half years. And kind of the analogy I've always used, if you think back to the Star Wars credits from the original Star Wars, right? And it comes in, and the words come in slightly slanted and it says, in a galaxy far, far away, right? And eventually the words, they get smaller, smaller, smaller, and then they're off the screen. Think of that as a context window. Different models have different context windows. So essentially you might be using a large language model, or you might have a skill saved or a plugin or a workflow saved, and you're using it repeatedly. And all of a sudden, you know, it started out great, and all of a sudden it's like, wait, this stinks now. You know, oh, you know, this company must have nerfed the model. Maybe, but probably not. Probably you just ran over your context window. So that's essentially the amount of information, like a computer's hard drive, that a large language model can retain. All right? And a lot of times the first prompts, the first pieces of information you give a model are oftentimes the most important. It's where you give it direction. So the difference between a context window, or understanding how context works, and a hard drive is if you try to save a 2 gigabyte file on a full hard drive, it's just going to say, nope, can't do it, no room. All right?
A large language model is going to do it, and it's just going to kick off something and you have no clue. So working with relevant, accurate business context is one of the most important things to do when working with a large language model, because essentially, large language models get data from one of three places. Number one, their internal training data, which for the most part is very, very old. Number two, any of your company data that you share. That can be, you know, apps, connectors, your files, memory, chat history. You know, there's essentially a layer of your personal context, company context, all your SaaS apps that you use. Right? And then there's the web. All right? So you have to understand that there's now this context layer that didn't really exist before. Right? It used to just be the training data. And keep in mind, training data is usually really old. There's a knowledge cutoff date. But I would venture to guess the overwhelming majority, like 99%, of the data that actually ends up in today's large language models is nowhere close to that date. So, you know, let's just say if there was a way to turn off web search and to turn off your company's context, and if you were to only rely on a model's training data, it would be very bad, right? It's actually crazy to think it took Anthropic so long to add web search. Which is why I was like, companies should never use it, right? Because without web search, you know, you're probably playing with data that's, on average, at minimum 15 to 18 months old on the good side. And think of how quickly your company, your competitive landscape, your sector moves. And imagine if you only had access to data that was 15 to 18 months old. Yeah, recipe for disaster. And that's also why you get these outputs sometimes, especially before, you know, the 2024 reasoning models that could agentically call the web and pull your data.
That's why you'd get these outputs that looked like they were absolutely terrible. Well, because they were. It was relying on old data. And large language models are trained to be helpful assistants. All right, so make sure you go back and listen to the Start Here series episode going over the, I think we called it the seven deadly sins of AI, where we kind of talked about some of these things. But that's why sometimes outputs from large language models seem generic and you're like, is this actually truthful or is it kind of lying to me? Well, the context helps steer that away from just giving you these general, it-sounds-helpful-but-is-it-true answers to being pinpoint specific and valuable for your company.
Number six, context engineering. All right, this obviously plays in line with the context layer, but this is where you insert, right, one of those three layers of data into the context window. So this can be things like your traditional prompt engineering. So you can share that context via your prompting. Right. But the important thing really is knowing that it works in layers. So I'm going to give you a couple hints. Right. So we taught Prime, Prompt, Polish. All right. The 30-second version of that is you should always prime a model, work with it a lot before you prompt it and ask it for an output, and then polish. We're actually going to get to that technically in number nine and 10, because the first output that a model gives you is usually garbage, even if you do everything else correctly. All right, Refine-Q. That's where we talk about, you know, role, examples, fetch, insights, narration, explanation and questions. All right. If you want to go more on the, you know, Prime, Prompt, Polish, Refine-Q, 5-5-5, this is like a decision tree on when you should use a large language model versus when you should do it, quote unquote, manually without AI. Actually, when you do go to starthereseries.com and sign up for our community, you will also get access. Yeah, a lot of people don't know this.
We recorded that. It's on demand, right? I used to do it live. I can't do it live anymore. But I keep the videos pretty updated. So you can go take our updated Prime, Prompt, Polish course. I think it was updated as of like GPT 5.4. So I'll probably update it once we get like 5.6 or something. But this is basic context engineering 101. All right? So it can be as simple as, you know, doing the role, goal, sources, constraints, examples, whatever it is, but it's making sure that through your words that you're sharing, you are steering the model in the right direction and telling it what context to use, what output you need, and working with it before you just expect it to give you amazing output.
All right, number seven, working with files, apps and company data. This one is huge. So files make AI far more specific to the actual task. All right? And a lot of people think, oh, if I just dump all of my context into one of these large language models and tell it, you know, okay, now go write next quarter's press releases and update all our job descriptions based on the SOPs. Right? In this instance, without getting too technical and accidentally turning this again into a 90 minute show, think of them as books on the bookshelf, right? You still have to tell the model when, whether that's through a skill, a plugin, a certain workflow, automation, etc. You still have to tell it when to use the book and sometimes direct it to look into what chapter. But creating these, you know, app connections is huge, because depending on if you're talking about ChatGPT, Claude or Gemini, they all treat these connectors a little bit differently. As an example, some of them might index or cache some of your most important connectors, like your Gmail, right?
So you don't even have to wait, it just knows, and it can actually, you know, proactively go out and find these things and surface them to you, which is actually huge. So, you know, whether we're talking about uploading files, connecting a connector or an app, right, like a Google Drive or a Microsoft SharePoint, OneDrive, Box, etc., your CRM, right. Most of the big three, I don't know if Google does, but I think mostly everyone else has, like, HubSpot and some of the big names in CRMs, right? All of these things now for the most part have read and write, and that is big. And that's much different than where we were at, you know, in the early days of connectors, like nine months ago, right? So now these can, well, run actions for you. So, you know, let's just say you get some information off of a Google Sheet and you have to go update something in a database, and then you have to go change some options in a CRM, and then you have to update your project management tool, ClickUp as an example. Right? These are all processes, what I normally call the human duct tape, that you would have to do. Now these connectors can talk to each other and carry that context over. We did a show called Agentic Context Carry in the Start Here series. So go back and listen to that if you want to know more about how that works. But essentially this is why these models with the context layer, and an agent being able to carry that context and keep it in the context window, is huge, because this is the majority of what so many knowledge workers spend their time doing. You know, you have the 5 to 15 probably different SaaS applications that you spend your time in. You use your brain, your role, your requirements, your KPIs, you grab all of that important context from all of these different apps. You personalize it, create some value out of it, and then you piece it together in all of these other apps.
This is what agents do now, right? So working with files, apps, connectors and your company data is huge.
All right, number eight, and this is important, especially following up, maybe it should have been number seven, right? Don't do the shadow IT, right? Shadow IT, or shadow AI or shadow agents, whatever you want to call it, is when you don't have permission. When you don't have permission to use your company data, you shouldn't be uploading it, right? But the good thing is, whether we're talking about Claude Enterprise, Gemini Enterprise, ChatGPT Enterprise, they have the exact same data security and privacy that you would have from your cloud provider. So if you upload, you know, your company's data to a cloud, it is the exact same thing as using these connectors in these apps. As long as you're going through, you know, turning off model training and doing the 101 best practices, you're not running any additional risk. Again, that's a big if. As long as you're using these models correctly, turning off training on your data, which most of the paid business or enterprise versions do by default, you're not running any additional risk with your company's data. It's not like I'm going to upload or connect my data, turn off model training, and my competitor is going to be like, oh, what's Everyday AI doing? That's not how it works. But people, smart people, with brains, still think that today. That's not how it works. Right. Governance is also extremely important. That is permission design. Right. This is not paperwork, this is not a one-time 20 minute training, because the systems are constantly changing. All right. Also, personal accounts, those aren't, like, unofficial company AI systems. Like I said, you know, you should only be connecting your company's data.
Number one, if you've gone through the guardrails, the governance, the privacy permissions, getting the sign-off from everyone involved, but you should also only be using it through the proper channels. Right. I can't tell you the number of companies, when they're telling the truth, they're like, okay, well, yeah, we got our, you know, Copilot approved, but no one can get the right access to it. But, well, if we're using SharePoint and Copilot, we should just be able to use it in ChatGPT, right? Well, in theory, kind of, but also, absolutely not. Right. That is going against Governance 101. So, you know, expert-driven loops are so important when we talk about properly setting up these guardrails, especially as these large language models, these quote unquote AI chatbots, by default can now take actions for us and we can schedule agents to go take actions. Right. With write, W-R-I-T-E, write capabilities. Right. It can go off and email customers and update your CRM, and, you know, maybe if you're not keeping an expert-driven loop, it could potentially make a huge mistake that can cost your organization a lot of money. So there's obviously a downside to productivity and efficiency if you skip over step number eight, which is proper privacy, permissions and governance.
Yeah. So after permissions, though, leaders need visibility into what actually happened, and that's where we have to talk about the importance of transparency, observability and reasoning artifacts. So this is where people skip over showing your work and being able to observe that as a team. Right. So I can't go into all the specifics because it's going to take a while, but especially when you get up to enterprise accounts, not only can you see, you know, usage and things like that, you can see, right. I think Copilot actually leads in this. You know, I think with their Entra ID you can see every action certain, you know, agents make.
I think ChatGPT's new workspace agents for business and enterprise teams do a great job of doing that as well. But you have to be able to understand and show the work, especially when agents are, or, sorry, AI chatbots are agentic by default. You have to be able to observe them and understand the reasoning artifacts. What does that mean? You know, there's usually a little thing you can click, except in Gemini. Gemini, can we get a little better at that, please? Right. I know there's a tiny step that we got with 3.5 Flash and the, you know, updated version of Gemini, but we need complete transparency when it comes to seeing how models get from context engineering and great setups and great workflows to creating this great economically valuable work. And the reason being is because models change all the time. So if you can't properly see every single step and every tool call and every website that a model or an agent went to in order to deliver that, you know, first draft deliverable for your team that you ultimately sent off to a client, or the RFP or whatever it is, then you don't have the observability. Then you don't actually own that asset, because you don't know what's happening. Right. So I like to say the steps in the middle are sometimes the most overlooked, but they're the most important, because if you don't understand what's going on, you don't own it. Right. So what does that mean? If something happens in the harness, or a model changes, or, you know, there's a drastic switch-up in how a model uses different tools, if you don't own and understand that observability, the reasoning artifacts, then you may not be able to replicate those same things. And if these things are running on scheduled loops, which is very easy to do, and one little thing changes, it can break your entire business if you become overly reliant on passive, lazy human-in-the-loop workflows versus expert-driven loops.
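One way to picture "reasoning artifacts" is as a structured record of every step an agent took, with the write actions set aside for an expert to review. The sketch below is only an illustration under stated assumptions: `Step`, `RunLog`, the step kinds, the example URL and the `crm.update_contact` tool name are all hypothetical, and it does not reflect the admin or audit features of any particular product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One observable thing the agent did (hypothetical schema)."""
    kind: str                  # e.g. "web_visit", "reasoning", "tool_call"
    detail: str
    writes_data: bool = False  # True if the step changes something outside the chat


@dataclass
class RunLog:
    """Collects every step of one agent run so a person can replay it."""
    steps: List[Step] = field(default_factory=list)

    def record(self, kind: str, detail: str, writes_data: bool = False) -> None:
        self.steps.append(Step(kind, detail, writes_data))

    def needs_expert_review(self) -> List[Step]:
        # Write actions (emails, CRM updates) are the ones that can cost money.
        return [step for step in self.steps if step.writes_data]

    def summary(self) -> List[str]:
        return [f"{i}. {s.kind}: {s.detail}" for i, s in enumerate(self.steps, 1)]


log = RunLog()
log.record("web_visit", "https://example.com/pricing")
log.record("reasoning", "Pricing page is newer than the uploaded PDF, prefer it")
log.record("tool_call", "crm.update_contact", writes_data=True)

for line in log.summary():
    print(line)
print("Needs review:", [s.detail for s in log.needs_expert_review()])
```

The design choice is simply that nothing the agent does is kept only in its head: if a model or tool changes later, the log from a good run is what lets a team compare and reproduce it.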
Expert-driven loops means that you are looking at the transparency and observability.
All right. Last but not least is verification, iteration and creating those workflows. Right. So what do I mean by this? Iteration is huge. So I talked about, you know, the Prime, Prompt, Polish. This is the polish. You know, your first output, even if you do everything else, steps one through nine, correctly, your first output for the most part from a large language model, even a great one, is, best case, going to be generic garbage, right? It's not going to be very good. It's not going to sound like you, even when you give it the proper examples. Right. You have to really iterate on the output. You have to understand how it got to that output and then run it over and over and make it better. That is this iteration loop where you turn, you know, your first output, right? In my old journalism days, the first version of the story you turned in was never really good or the best, right? You always had to go through the multiple editors and all the red lines to make it better. And then once you get to that point, you need to then turn it into a workflow. And this looks a little bit different in ChatGPT versus Claude versus Gemini, right? But for the most part, we are getting these universally applicable skills. Another thing, Anthropic led the way on this, and everyone else is adopting them, but skills are essentially ways that, at the end of iterating, you can go through and essentially, in more words than this, say, turn this into a skill, right? And then once you turn it into a skill, then you can schedule it, because most of these systems have scheduling by default. So it's all about verifying that it works. You know, make sure you're using the right model, the right mode for the right tool. You're getting the right outputs, you're verifying it. You're going through the transparency, observability, the reasoning artifacts.
But then you don't stop when you get that first, you know, hey, this draft is good enough. No, keep working. That's where you polish. You turn it into a skill, you turn it into a plugin, and then you turn it into an automated workflow. And that's how we go from working with an AI chatbot to commanding AI agents.
All right, I hope this one was helpful, y'all. If so, let me know. If you're listening on the podcast, please do me a favor. All right, this one has been a long time in the waiting, so, you know, I hope more people can hear this, because I want to cut through the BS. I want to tell people exactly how these models work, and I want to be able to do it for free for as long as I can, right? So that only happens if you're, number one, sharing this with others. So if you are listening, you know, on LinkedIn as an example, please repost this if this was helpful. If you are listening on the podcast, please subscribe to the show. That would mean a lot. Leave us a rating too, if you can. So I hope this was helpful. Like I said, make sure to go to starthereseries.com to get exclusive access to all of the episodes in this series. Thank you for tuning in. I hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
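The idea of polishing a prompt and then turning it into a skill can be sketched as saving the role, goal, sources, constraints and examples from the context engineering step in one reusable object. This is a minimal illustration, not any vendor's actual skill format: the `Skill` class, its fields and the `render` method are hypothetical names chosen for the example, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import List


def _bullets(title: str, items: List[str]) -> str:
    """Format a titled bullet list for the prompt text."""
    return title + ":\n" + "\n".join(f"- {item}" for item in items)


@dataclass
class Skill:
    """A hypothetical reusable prompt template built after iterating on outputs."""
    name: str
    role: str
    goal: str
    sources: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)

    def render(self, task: str) -> str:
        """Build the full prompt for one task, skipping any empty sections."""
        sections = [f"Role: {self.role}", f"Goal: {self.goal}"]
        if self.sources:
            sections.append(_bullets("Sources", self.sources))
        if self.constraints:
            sections.append(_bullets("Constraints", self.constraints))
        if self.examples:
            sections.append(_bullets("Examples", self.examples))
        sections.append(f"Task: {task}")
        return "\n\n".join(sections)


skill = Skill(
    name="press-release",
    role="PR lead",
    goal="Draft a press release",
    sources=["Q3 launch brief"],
    constraints=["Under 400 words"],
)
prompt = skill.render("Announce the new pricing tier")
print(prompt)
```

Once the wording that finally produced a good draft lives in something like this, the same instructions can be reused for each new task, which is the step that makes scheduling or automating the workflow possible.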
[40:45]
Everyday AI Host
And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.