#728: The Duck Talks Back - Using GENAI in Your Work - AWS Podcast

Summary5 min read

AWS Podcast Episode #728: The Duck Talks Back - Using GENAI in Your Work

Release Date: July 7, 2025

In Episode #728 of the AWS Podcast, hosted by Amazon Web Services, Lish delves into the evolving landscape of generative AI (GenAI) and its practical applications in software development and other technology-driven fields. This episode, titled "The Duck Talks Back - Using GENAI in Your Work," offers a comprehensive exploration of how developers and IT professionals can effectively integrate GenAI into their workflows, leveraging best practices to maximize benefits while navigating the rapid advancements in AI technologies.

1. Embracing the Fundamentals in a GenAI Revolution

Lish begins the discussion by emphasizing the importance of foundational software development practices amidst the GenAI revolution. He asserts that "the fundamentals have not changed. In fact, they're more important than ever" (00:02:30). Continuous Integration and Continuous Deployment (CI/CD) are highlighted as critical for future-proofing systems against the swift evolution of Large Language Models (LLMs). By implementing robust CI/CD pipelines, developers can seamlessly integrate new AI models and updates without overhauling existing systems.

Key Points:

Adaptability: Systems must be designed to accommodate frequent updates and changes in LLM capabilities.
Good Practices: Traditional best practices in development remain essential and should be diligently applied alongside new AI technologies.

2. Collaborative Development with LLMs

Transitioning to the practical application of GenAI, Lish discusses how LLMs can be collaborative partners in the development process. He shares his personal experience using QCLI, a conversational CLI tool, to aid in creating documentation and design systems.

“Instead of sitting there at your desk talking to a rubber ducky, which is an old debugging technique, the duck talks back.” (00:07:45)

Key Points:

Interactive Dialogue: Engaging in a two-way conversation with LLMs enhances problem-solving and design thinking.
Automated Documentation: LLMs can generate comprehensive design documents and reports, saving time and improving accuracy.
Prompting for Deeper Insights: Using LLMs to ask probing questions can uncover aspects of projects that may not have been initially considered.

3. Swarming Multiple LLMs for Enhanced Productivity

Lish introduces the concept of "swarming," where multiple LLMs work concurrently on different aspects of a problem domain. This approach mirrors traditional methods like autoscaling in web server management.

Key Points:

Separation of Concerns: Dividing tasks among various LLMs ensures focused and efficient problem-solving.
Managing State: Implementing strategies like shared project files or git work trees helps maintain coherence and manage state across multiple LLM interactions.
Model Coordination Protocol (MCP): Utilizing MCP allows LLMs to access and integrate information seamlessly, enhancing their utility and functionality.

4. Practical Use Cases: Debugging and Project Management

Lish provides concrete examples of how GenAI tools have streamlined his workflow. One notable instance involved debugging Lambda code related to pre-signed S3 URLs. By instructing QCLI to review the code without making immediate changes, he received a detailed report identifying credential-related issues and actionable fixes.

“What this type of technology does well is process text really, really well, understand context really, really well, and connect information sources really, really well.” (00:15:30)

Key Points:

Efficient Debugging: LLMs can quickly analyze codebases and identify intricate issues that might take humans significantly longer to uncover.
Automated Reporting: Detailed reports and suggested fixes provided by LLMs enhance troubleshooting efficiency.
Project Automation: From managing ticketing systems to updating Slack groups, LLMs can handle various project management tasks without writing additional code.

5. Leveraging Tools and Protocols for Enhanced AI Integration

The episode highlights several tools and protocols that facilitate the integration of GenAI into development workflows:

Model Context Protocol (MCP): A uniform system for accessing information both locally and remotely, allowing LLMs to utilize and interact with various data sources effectively.
SAM CLI and CloudWatch Logs MCPs: Enable better error checking, validation, and log analysis, turning what is typically a tedious task into an automated, efficient process.
Git MCP: Assists in version control by managing commits and facilitating rollback to previous states when necessary.

Key Points:

Automation of Repetitive Tasks: Tools like MCP streamline interactions with existing systems, reducing manual effort.
Enhanced Error Handling: Automated tools improve error detection and resolution processes.
Persistence and Documentation: LLMs can maintain up-to-date documentation and integrate with tools like Quip for information persistence.

6. Best Practices and Human Oversight

While GenAI offers substantial automation and efficiency gains, Lish underscores the necessity of human oversight:

“You don’t have to do all the work for yourself to make your life easier.” (00:25:10)

Key Points:

Human-in-the-Loop: Continuous monitoring and intervention ensure that AI-generated outputs meet quality standards and align with project goals.
Balanced Automation: While LLMs can handle unit testing effectively, integration testing still requires significant human involvement.
Commitment to Best Practices: Regular git commits and documentation of architectural decisions remain crucial for maintaining project integrity and coherence.

7. Insights and Future Outlook

Lish concludes with insightful reflections on the future of GenAI in technology:

“Generative AI will disrupt every knowledge-based value chain. It's about getting insight into the process and what's going on and how it's working.” (00:35:50)

Key Points:

Skill Evolution: As LLMs handle more routine tasks, the value of deep problem domain expertise and strategic thinking increases.
Prompt Engineering Evolution: The art of crafting prompts is becoming less critical as LLMs become better at understanding context, similar to the diminishing emphasis on advanced search techniques with improved search engines.
Kent Beck’s Observation: Quoting Kent Beck, Lish notes that while 90% of skills may lose individual value due to automation, the remaining 10% become exponentially more valuable, emphasizing the enduring importance of specialized knowledge and strategic application.

Conclusion

Episode #728 of the AWS Podcast offers a deep dive into the integration of generative AI in professional workflows, particularly in software development. Lish articulates a balanced perspective that celebrates the efficiencies and enhancements provided by GenAI while advocating for the continued importance of foundational practices and human oversight. By embracing collaborative tools like QCLI, leveraging multiple LLMs, and adhering to best practices, developers and IT professionals can harness the full potential of GenAI to drive innovation and maintain robust, future-proof systems.

Listeners are encouraged to share their experiences and insights, fostering a community-driven exploration of GenAI’s evolving role in technology.

For more insights and to share your feedback, visit AWSpodcastmazon.com.

Timestamps:

00:02:30 - Importance of Fundamentals in GenAI
00:07:45 - Collaborative Development with LLMs
00:15:30 - Efficient Debugging with LLMs
00:25:10 - Best Practices and Human Oversight
00:35:50 - Future of GenAI and Skill Evolution

Note: The timestamps provided are illustrative and correspond to key points discussed in the episode.

Loading summary

Transcript1 lines

[00:00]
Lish
This is episode 728 of the AWS podcast, released on July 7th, 2025. Hello everyone. Welcome back to the AWS Podcast. I'm Lish here with you. Great to have you back flying solo on this one because it's a little bit of a different episode. We're not talking about any particular service, we're not interviewing anyone. I'm just going to talk for a little bit and I wanted to share some insights because I've had a lot of questions from folks around. You know this newfangled technologies and generative AI and people are using it for coding and other things and not quite sure how to necessarily grasp it and how it works and how it all fits together and how you can actually use it. And I'm not going to stand here and say I'm the expert of all experts, everyone's learning as they go. But I'm using it a lot and I have a lot of customers using it a lot. So I have some perspectives that might be useful for you. So I'm offering them in that spirit. So firstly, I want to step back before we even talk about LLMs in generative AI and how it all fits together to some fundamentals that I think are really important to consider in this technology revolution, really that we're facing, and that some of the fundamentals have not changed. In fact, they're more important than ever. And in particular, the ability to do continuous integration and continuous deployment is really, really important. The reason is because the evolution of generative AI technology, the LLMs that are popular at the time, the capabilities of the LLMs, et cetera, the price performance is changing really, really rapidly. And so it's not necessarily about just building a system that can take advantage of this technology. It's a system that can future proof itself against using this technology in different iterations. So things like the Converse API in Bedrock is really useful because it lets you change your models as you go, have different models, et cetera. They're having to make kind of wholesale changes to the code. But you have to think about your end to end process. Let's say you build a system that has some LLMs involved in the processing. You have to imagine from an architectural standpoint that you're going to change those. In fact, you definitely will change those over time. It's just going to happen. And what does that look like from a development flow, from a testing flow, from a verification flow, from a tooling flow, from a observability flow. So nothing of this is new. These are the Good practices we should be doing anyway, right? But I think they kind of get forgotten sometimes. And it's really made me think how important this is because as I've been doing some work myself and been seeking to change models and do different things, I'm like, oh my goodness, all these good practices we're supposed to do actually make a lot of sense and are really important. So how do we actually get some benefit out of these types of technologies? Well, the first thing I'll tell you is, and particularly this is relevant for coding and software development. YOLOing and single shot vibe coding is not the way you're not going to get a great outcome. And so when I see folks experimenting, oh, I try. You know, I wrote a prompt that said, you know, build a system that does A, B and C and it really sucked. But it looked like it was convincing, but it didn't do anything. Well, I'm not surprised it doesn't work that way. In fact, what I've found is that the classic case of proper planning prevents poor performance. The multiple P strategy is even more important when using Genai to do software development. So it is actually a huge payoff to spend an inordinate amount of time designing your system up front, thinking about architectural decisions, clarifying your user requirements, your user stories. Doing all that stuff upfront is really, really important. Now the good news is, is you're not doing it on your own. You actually do it in concert with the chatbot, with the LLM. In my case, I'm always using qcli. It's the thing that works for me and I like using it. And so I won't sit there and go, wow, I've now got to create a set of design documents. I'll sit down with a CLI and I'll say, hey, I'm about to create a set of design documents. I want you to help me. You can help me by keeping them organized. You can help me by asking me questions, offering me different opinions, approaches, choices, etc. Let's start the dialogue. And then I may spend hours having that dialogue with the bot, essentially. And it's prompting me to think about things that I may not think about. It may be asking me questions about testing strategies. It may be offering design choices. I may be coming up with ideas and saying, hey, can you go investigate this capability? Or what are the best options for libraries to do X these days? What's the best practice? Or go look in the documentation for product Y about how to do something? These things are really important and help get you to a point where you have, I guess, a coherent view of what it is you're trying to build. Now, you'll see many different luminaries in this space. We'll talk about these types of process and there's a bunch of different ones that are out there. Reuven Cohen's SPAC1 is a really spark one I should say, is really interesting, but there's lots. And it's all evolving all the time. So if you find a process, you'll go, oh, well, that's great, but now it's changed again, that's okay. But the key thing is that the skill of building a system has not gone away. But what's really nice is instead of sitting there at your desk talking to a rubber ducky, which is an old debugging technique, the duck talks back. And so you actually get to have an interesting informed dialogue with something that has access to the vast amount of information across the Internet. And so once you've gone through this very deep process of asking questions, challenges, design decisions, having IT do the documentation for you, so good, let me tell you, you don't have to write documentation anymore. It does a much better job than I ever did. And markdown is king. And so it will create all these design documents for you. And then the next step is you can start telling it to work methodically to create the prompts to start to create the designs for these components. And this is where things like separation of concerns and good coding practices also come into play. Because another thing I like to do is I'm an impatient man. I don't want one LLM working on my problem domain. I want multiple LLMs to work on my problem domain. So having a relatively clean separation of concerns means that multiple qchat CLI instances can be off and running and writing code for their particular functional domain and share the information between them. Or you can go with a sort of git work tree type approach where you're completely running on separate trees. That brings other challenges. And again, you'll see this is a space that's evolving rapidly. But this concept, I think a lot of folks are calling it swarming of having multiple LLMs working on the same problem domain at the same time, but dividing up the work is key. It's again, fundamentals. What do we do when a web server is overloaded? We auto scale it and have lots of web servers. It's kind of the same thing, except we're working to manage state. And so managing state can be challenging. But having sort of a shared set of project files, et cetera, is a quite useful strategy for that. The other thing that's really useful is when you're having a dialogue with the system, it can really get excited and want to do a lot of stuff, can be quite enthusiastic. So one technique I find very useful is to tell it to just give me a report on what it finds. Don't actually do anything. So I'll give you a real example that literally I did a few minutes ago. This is how relevant my examples are. So I've created a lot of code to help me produce this podcast. There's a lot of work that goes on behind the scenes. There's, you know, the recording process, then episodes have to get edited and then they come back and they get approved and all that good stuff happens. And that means sharing of data around there. And I've written some lambda code to do a lot of that stuff for me. And I've been finding some bugs in a new version of this that I've released. I tried to, you know, classic engineer, I tried to make it better. We all try and make it better. And I've clearly broken something along the way. And it was related to pre signed URLs. So these are pre signed URLs that were related to both uploading to S3 and downloading from S3. And these were links I was gonna share with people around the place and get sent automatically and all that good stuff. And they weren't working properly and I, I kind of knew that the way I was signing them was not right. There was some sort of problem there, but I just wasn't sure what the result was. So I literally jumped into, into QCLI and I gave it a simple, what you would think is a simple instruction that actually yields a lot of work that would have taken me a long time to do. Now I'm just scrolling so I can, can tell you what it was. So basically I gave it a. I said to the system, familiarize yourself with this code base. I'm going to solve some bugs with you, so I need you to be ready. So it went through, looked at all the files, understands all the files that are in the code base, et cetera. And then I basically said, okay, so the issues are related to pre signing of S3 URLs. I'm having issues with both the pre signed upload URL and generated URLs for reviewing processed episodes in S3. I'm getting issues where the expiration time is not at the maximum. I get errors when I paste the URLs, etc. Can you do a full review of the related logic for all URL generation? Review the documentation deeply first to ensure you have maximum reference information at hand to suggest fixes. Do not make any changes yet. Give me a report on what you find. And it did a tremendous job. It chunked away, figured stuff out, et cetera, and it produced a full and robust report for me explaining that some of the problems that I had related to the credentials that were being used for certain signing in certain cases. Because if you use one type of credential, you can get one maximum. If you use a different kind of credential, you can get a different maximum. So we're getting expirations that were earlier than we thought they were getting. There were silent errors because I wasn't doing correct error checking on the actual things that were being created. So I wasn't sure that was going on. The whole experience was bad, which is what I'd identified, but it broke it down for me specifically, and it gave me some specific fixes that I could do straight away. So things like using the right credentials in the right places, implementing credential aware exploration, standardizing the exploration logic, aggregating expiration validation, updating the notification system, the whole thing, really nice breakdown, pseudocode, Python code, the whole thing. And it gave me a set of five actions to do of different priorities and also a testing plan. And so I was able to then say, hey, go ahead, think step by step and implement all these changes. And it went and did all those things for me. And so now it's waiting for me to go test it. So what's interesting about this process is using the technology for the things it does well. So what this type of technology does well is process text really, really well, understand context really, really well, and connect information sources really, really well. So rather than me having to go and do a deep dive into various documents to figure out what I'd done wrong, because clearly I'd done something wrong, it could do it in seconds. And what do you do with that time? Well, it's interesting. You can get some of your emails done, you can be reading something else that's important to you, you can be thinking about a strategy that you're trying to work on, et cetera, et cetera, you got lots of options. What's also related to this point here is how you instrument your session to have access to the tools you need. And this is where MCP or Model Context Protocol becomes really, really important. It's nothing amazing per se, and that a lot of people are going, well, this is just sort of like, you know, open API, but just differently implemented, et cetera. It's kind of not about that. What it is is that it's a uniform system to access information locally and remotely, and doing it in a way that the LLM can natively figure out what to do with it without having to be retrained on it, which is really, really important. And so I'm using a lot of the tools that are provided by AWS to do AWS functionality. So for example, I'm a big user of the SAM CLI to do the serverless application model. There's a MCP tool for that. So it can have much better error checking, much better validation, it doesn't have to use a command line, etceter. Another one that's really good is the CloudWatch logs one as well. Because let me tell you, spelunking through logs is not a fun thing to do as a human, but it's great to do as a robot. So it does it really well. And essentially thinking about all the tooling that you want makes for a better experience. Now, this can also be around ergonomics and things that make your life easier. So you'll recall earlier on I was talking about the fact that I kind of had multiple gen AI agents working on a particular software piece I was working on. Not the podcast one, a different, much bigger, more involved one. And I think I had four separate sessions going at once doing different things. And I'm a big believer in human in the loop. So there were sort of, you know, with each of these tasks there'd be points where once it was done with something, it would. I would want to know that it was done and I could tell it what to do next and assess the success it had in doing the thing I'd asked it to do. But also because these bots can work 24, seven, I was having them run, you know, out of hours, you know, doing, doing work while I was off the clock, essentially. But I like the feeling of having my computer doing stuff while I'm not doing stuff. So I wanted to keep going, but it would mean that I'd need to keep an eye on it to know when things were done. And so I could implement a really simple MCP called say, that would do the say command on my Mac. And I instructed it that whenever it finished a particular task, use the say tool and tell me that you're done and you're ready for input. So I'd be off doing Something in the house, and this sort of disembodied voice would come and say, hey, you know, Project X ready for the next step? Like, oh, awesome. I'll go over and give it what it needs and continue on. So, again, you have this real flexibility in how you operate the system to your needs. And again, best practices are evolving and approaches are evolving. Your way is not necessarily my way, my way is not your way, but there's some really interesting ways. So investigating MCPS is really worth your time. It really unlocks a lot of stuff. I know for a lot of folks, the access to internal systems of record within their organizations has been unlocked by MCPS and lets you do cool things. So let me give you a for instance. You can actually get your QCLI session to essentially become an intelligent bot, a repeatable intelligent bot to do certain tasks. And the way you do this is firstly, I use a really cool prompt that comes from the prompts, that's prompts with a Z. I have to put a link in the Show Notes website that has a really interesting set of rules and prompts for Amazon Q. And one of the prompts is called Project Intelligence. And it tells the system how to organize itself to maintain a full set of project documentation and essentially keep up to date with what it's doing and have the sense of intelligence. And what I found is this means that you can also then tell the system to create a Persona of itself to do certain functions. So let me give you a for instance. So again, the production of the podcast is an evolved process. We have ticketing system, we have the preparation of content, we have guests, we have an approval process, all that good stuff. And I sat down with my LLM one day and said, okay, let me have a conversation with you in a fresh project context and say, this is what I want to build. I want to build an entire production flow. I want you to be able to manage it all for me. I want you to be able to interact with an MCP into the ticketing system. I don't like using the ticketing system myself. I don't have to have interaction with it. I just want to talk to you. And so I spent probably half an hour to an hour telling it about what I'm doing, how it works, et cetera, giving it the tools that it needs. And now I have a complete natural language system that lets me ask it all kinds of questions and do things. You know, I can have it update tickets for me. I can get it to give me a report on what's coming up next, what I need to prepare for, I can ask it to give me a brief rundown of my next two episodes I have to record and what the content should be. I can generate information and then send it to Slack groups as well. So for example, for my co host, I can say, hey, here's what's coming up and it'll just put it into the Slack group. All of these things don't use code. That's what's really interesting to me in this particular solution. It's just created for itself a set of markdown documents that records how it thinks about things and how it processes information. And then it uses the MCPS to reach out to information sources and it can also write information. So for example, Quip is one of the tools we use. It can create a quip document, it can update the Quip document, it can do, you know, read, write, type activities as well. So there's persistence of information. But this makes for a really interesting use case of what you can do. So another example of something I've been toying with just lately. So it's not fully developed or a complete thought process, but I did the same thing with my time management because there's, you know, I use Outlook as my email client. There's an MCP for Outlook. I like to use a tool called Things for my task management. There's an MCP for Things. You're getting, you're getting the hint here. There's an MCP for everything, really. And I also have got a document that I created at the start of the year which sort of outlines all the things I want to work on within Amazon, the projects I'm working on, my priorities, the things I'm going to pay attention to, et cetera. And again, I was able to create this instance, this essentially personal assistant. Bottom and I could give it this document, I could give it a set of my tools, I could talk it to it about my priorities and start to say, hey, let's look at my calendar and tell me how I'm tracking in terms of what I'm doing. Am I meeting my goals? Am I devoting enough time to the things I want to be working on? What are some of the things that you're seeing are taking my attention, et cetera, et cetera. So you can see how suddenly you have this intelligent view of your world that you can interact with and query like you would an assistant. Now the important thing is asking questions in the right way and again, not just YOLOing it. And so you'll hear People talk a lot about prompt engineering, et cetera. And you can go down a real rabbit hole, prompt engineering. And I think if I look in my crystal ball, I think the skill of prompt engineering will become less valuable in the future. Just like the skill of how to do a good Google search became less important as the search engines got better and could understand context better. But today what you put in is really, really important. So think about your, your prompting and your queuing. And so that's an example of, you know, trying to solve a problem. You know, I've been saying for a long time that generative AI will disrupt every knowledge based value chain. And this is what I mean. It's, it's these kinds of tasks that can be automated, compressed, made easier, made richer, you can get better information. But it's not just about, you know, paper shuffling or moving things through ticketing systems. It's trying to actually get insight into the process and what's going on and how it's working. So my current stack and it changes all the time. But like I say, for these types of jobs that I'm doing, I'm reaching for QCLI or qchat off the rack straight away. I'm making sure I set up my project rules, I'm making sure I set up my project intelligence prompt. I'm making sure I think about the tools that I want it to have access to and give it access to those tools. Then I create a profile for that particular Q chat session and that saves that information altogether so that I know that whichever session I'm in, what I'm actually doing. Because it gets confusing when you have a lot ongoing. And that's kind of my approach at the moment. And I'm finding it, it's helping me a lot. Now it doesn't do all the work for you. You must pay attention. You're going to have to review. I'm finding that it's really good at unit testing, not so good at integration testing. So I'm spending a lot more time on my integration testing that I used to spend on my unit testing. Understanding flows and documenting decisions is always really important. So telling it to document architectural decisions along the way and refer back to them is also really important because it forgets. So sometimes you'll, you'll tell it to do something, it'll do something and then it does it the old way, like, hang on, I told you not to do it this way, please undo it, et cetera. So yeah, committing to git is also really important. Making sure you have regular commit Points is also important. The good thing is, is that again, there's a git mcp. So, you know, I've often had situations where the system's made a change and go, look, I really don't like this change. Go look at a previous version and see what you did then and redo it. And it would just do it for me. So, again, all this stuff, you don't have to do yourself to make your life easier. So, yeah, that's what this episode was a bit about. Just sharing some of the knowledge work changes that Generative AI opens up for a technology person. As a builder, as a technology person, we're really lucky that we have access to these tools. We either know how to use these tools or can learn how to use these tools, but the real skill is understanding the problem domain and applying the tools. And this hasn't changed. Like, I've been doing this for 35 years. The technologies have changed many times. I've done mainframe, I've done client server, I've done web base, I've done mobile, I've done cloud. Now, in general AI, you know, the technologies just change and change and change. But the fundamental skill of being really crisp in understanding what the need is, what the problem domain is, and how best to apply the technology to it is always valuable and will never go away. I think it was Kent Beck who said recently that thanks to generative AI, 90% of his skills now have a dollar value of zero, but 10% of his skills are now 1000 times more valuable. And I thought it was really interesting. I thought, that's exactly right. There's a lot of the drudgery work that just isn't valuable anymore because the LLM will do it. But all the other stuff is really, really important. So it's fascinating to me how it's evolving. And so I wanted to take a minute and just share that with you all and also to Hear your feedback. AWspodcastmazon.com is the place to share it. Tell me about what you're seeing, the experiences you are having. Has anything I've spoken about resonated with you? Or you're like, Simon, you're way off beam. It's a fascinating space. It's changing a lot, so we're gonna keep track of it for you. And of course, until next time, keep on building.