Podcast Summary: Software Misadventures
Episode: "LLMs are like your weird, over-confident intern"
Guest: Simon Willison (Datasette)
Hosts: Ronak Nathani, Guang Yang
Date: September 10, 2024
Overview
In this engaging episode, Ronak and Guang sit down with Simon Willison, the creator of Datasette and a prolific independent developer, to discuss the evolution of large language models (LLMs), their everyday use, best practices in documentation, and the future (and challenges) of LLM-powered software development. Simon offers his unique perspectives on productivity, open source life, prompt engineering, and why LLMs are best understood as "over-confident interns." The conversation covers surprising moments in AI's evolution, practical workflows, the impact on programming culture, and the responsibilities of practitioners.
Key Topics & Insights
1. The "Weird Intern" Mental Model of LLMs
- Simon coins the "weird intern" analogy:
"It's like having an intern who has read all of the documentation and memorized the documentation for every programming language and is a wild conspiracy theorist and sometimes comes up with absurd ideas and they're completely, massively overconfident. It's the intern that always believes that they're right." — Simon Willison (00:00)
- LLMs can be endlessly prompted, corrected, and even bullied into better results:
"You can be like, do it again. Do that again. No, that's wrong. And you don't have to feel guilty about it... One of my favorite prompts is you just say, do better, and it works." (00:00, 78:54)
2. Simon's Journey Into LLMs
- Early explorations starting with GPT-2 and GPT-3:
"I actually started playing with GPT2 back in 2020... I tried to use it to generate New York Times headlines for current affairs... but it never felt really like an AI. It wasn't like you were conversing with something." (01:51)
- Transformation with the advent of ChatGPT:
"All they did is they slapped the chat interface on top of their existing model... ChatGPT was an experimental prototype and a bunch of people inside of OpenAI thought it was a bad idea... And it was, I think it's the fastest growing consumer application in the history of the world." (03:37)
3. Blogging, Learning, and Developer Growth
- Simon’s enduring blogging habit as accountability and learning mechanism:
"Since January 1st, I've been trying to post something on my blog every single day... It's been an accountability mechanism for me for wider work for a few years now because I'm now sort of independent." (05:40)
- TIL ("Today I Learned") style entries for continuous learning and lowering the bar for public writing:
"With a TIL blog, no, you don't [have to say something new]... I'm publishing it—honestly, it's mainly for me." (08:44)
- Value in documenting and celebrating “small” learnings, even as a veteran engineer.
4. Documentation & Productivity Habits
- Rigorous workflow using GitHub issues for all work, research, and project memory:
"All of my work that I do, software work and a lot of my other stuff as well is in GitHub issues... Every single one of my projects has a very active GitHub issues setup." (11:27)
- Issues as living design documentation; every detail and decision is archived and searchable.
"The issues are the design documentation effectively... The only problem with design documentation... is if it falls out of sync with the code, then people lose trust in it." (15:38)
- Evolution from long commit messages to rich issues, with screenshots, video, and update logs.
"Issues are a blog, right? An issue thread is basically a one off blog for the story of this change the story of this feature." (18:59)
5. Blogging vs. Substack vs. Own Domain
- Substack used as a broadcast mechanism for the blog via a custom tool:
"I'm using Substack as a free mechanism to let people subscribe to my blog via email... I built myself a little tool... that pulls all of the content from my blog, reformats it into like HTML rich text, and then gives me a big copy button." (20:50)
- Strong belief in the value of owning your content and domain:
"There's something sort of wholesome about having a little corner of the Internet that's just for you like that." (22:58)
6. Command-Line LLM Tool ("llm") & Unix Philosophy
- Simon's llm CLI tool: invoke LLMs from the terminal, with piping, logging, and a plugin architecture.
"The UNIX piping idea is always like, you get some content, you pipe it into another thing which transforms it, you pipe it back out again. That's all language models are." (28:52)
"I grabbed this beautiful three letter... so pip install llm is how you get it." (29:38)
- Support for OpenAI, Anthropic, Google Gemini, local models, and logging all prompts/results for research.
- Plugins as a model for scalable open source.
7. Building the LLM Usage Habit
- Expertise comes only from sustained use and learning the models' quirks and limitations:
"You need to build this really, really deep model of what they can and can't do... One of the lessons I think people need to learn as quickly as possible is you've got to run prompts where it gets the answer wrong in a really confident way. The earlier you do that, the better..." (33:31, 36:43)
- The value of follow-up prompts, iteration, and never assuming a single output is correct.
8. LLMs as End-User Programming
- LLMs as a democratizing tool for automation, potentially finally solving end-user programming:
"I feel like language models could be the key to unlocking this... I continually hear from people who... are using these tools on a daily basis, who've never programmed before, but now they can do stuff." (43:40)
"The fact that you've got a semicolon error and paste it into ChatGPT, it tells you the fix. So it's like having a teaching assistant on hand 24 hours a day..." (44:30)
- Caution about code as “goop”—emphasizing the need for foundational skills even as barriers lower.
9. Beyond Chatbots: Interfaces to LLMs
- The power of non-chat UI, like GitHub Copilot's inline code completion and TLDraw's "make it real" feature:
"GitHub Copilot was the first mainstream non chat-based... That interface, the gray text which you get to approve, seems so simple and obvious now." (47:46)
- Simon's Datasette tooling: English-to-SQL translation, structured extraction, and enriching tables via LLMs.
"Ask a question in English and have... Claude Haiku... turn [it] into a SQL query and then it'll run that SQL query..." (51:42)
"You can define a SQLite table... and it will populate the database for you. That works so well." (58:40)
10. Prompt Engineering & Model Selection
- Prompt engineering as a genuine craft:
"For me, prompt engineering is about figuring out... for a SQL thing we need to send the full schema and these examples... That's engineering. It is engineering. It's complicated." (53:32)
- Practical tips: always ask for options, include schema/examples, and iterate.
- Simon’s model usage is "vibes based evaluation": usage patterns shift as models and buzz shift.
"It's vibes based. It's frustrating. I wish I had better benchmarks of my own to try these things out." (79:59)
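The schema-plus-examples pattern Simon describes can be sketched as a prompt-assembly function. This is a hypothetical illustration, not Datasette's actual implementation: the function name, table, and examples are invented for the sketch.

```python
def build_text_to_sql_prompt(question, schema, examples):
    """Assemble a text-to-SQL prompt: the full schema first, then a few
    worked question/SQL example pairs, then the user's question.
    (Hypothetical sketch of the pattern described in the episode.)"""
    parts = ["Given this SQLite schema:", schema, ""]
    for example_question, example_sql in examples:
        parts.append(f"Question: {example_question}")
        parts.append(f"SQL: {example_sql}")
        parts.append("")
    parts.append(f"Question: {question}")
    parts.append("SQL:")
    return "\n".join(parts)


# Hypothetical schema and one few-shot example pair.
schema = "CREATE TABLE plants (id INTEGER PRIMARY KEY, name TEXT, species TEXT);"
examples = [("How many plants are there?", "SELECT COUNT(*) FROM plants;")]
prompt = build_text_to_sql_prompt("List every distinct species.", schema, examples)
print(prompt)
```

The point of the sketch is Simon's claim that this is engineering: deciding what context (schema, examples, phrasing) the model needs is the craft, regardless of which model the prompt is ultimately sent to.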
11. The Double-Edged Sword of LLM Productivity
- Simon is willing to trade some atrophy of low-level skills for vastly increased ambition and productivity:
"I can get so much more stuff done that I'm willing to pay with a little bit of my soul... It's making me ambitious. I'm writing software I would never have even dared to write before..." (86:09)
- Key workflow: From idea → prototype with LLM → GitHub issues for research, code, and documentation → automated release.
"The moment it turns into a project I'm actually going to try and commit to, I start a GitHub issue for it." (73:32)
12. Open Questions: Quality, Slop, and Social Impact
- Concern about the growing volume of "slop" (unverified, low-quality AI output), especially outside code.
- LLMs benefit senior engineers most, but raise worries about code quality and “goop” from those who don’t understand what’s being generated.
- Call for more sociology/humanities research into AI's impact:
"I want sociology papers. I want all of the sort of humanities doing research into the impact on these things. How do people learn to use them?" (91:58)
13. Advice for Junior Engineers
- Build real projects, not just follow tutorials:
"The fastest way to learn anything in software is to build something with it... If a candidate can show me stuff that they've built that's worth more to me than any degree... The two easiest forms of blogging are something that I learned or something that I built." (95:29)
- Invest in README files with screenshots and documentation.
14. Life as an Independent Open Source Developer
- Struggles with prioritization; uses "weeknotes" and deadlines like conference talks to create accountability.
"What to work on next. Prioritisation is so difficult when you don't have any external sort of forcing factors." (101:09)
"My ambition is I want someone to win a Pulitzer Prize for a piece of investigative reporting where my software was one of the tools that they used." (103:25)
- Backstory: the Lanyard startup, co-founding with his wife, acquisition by Eventbrite, and the experience of true autonomy after Stanford’s JSK fellowship.
"I experienced freedom for a year and I'm like, I do not want to give this up. I'm having so much fun working on these things." (105:39)
15. Ethics and Practitioner Responsibility
- Practitioners have a duty to figure out positive use cases, be transparent, and educate others.
"We understand this stuff better than 99% of the population, which I think puts a responsibility on us to figure out the positive ways of using this and then to share that." (111:31)
Notable Quotes
- On LLMs as overconfident interns:
"It's like having an intern who has read all of the documentation and memorized the documentation for every programming language and is a wild conspiracy theorist and sometimes comes up with absurd ideas and they're completely, massively overconfident." — Simon (00:00)
- On blogging as learning:
"Writing is thinking and it's such a great way of forcing you to structure your thinking. You know, the best way to learn something is to try and explain it to somebody else." — Simon (05:40)
- On notes and documentation:
"It feels like writing all of these notes should slow you down. It's the opposite. It speeds you up." — Simon (11:27)
- On chat interfaces:
"It's a joke. It's an absolute joke that we've got this incredibly sophisticated software and we've given it a command line interface and launched it to 100 million people. What were we thinking?" — Simon (40:31)
- On prompt engineering:
"Prompt engineering is about figuring out, yeah, okay, for a SQL thing we need to send the full schema and we send these three examples and these three responses, we need to prompt it in this specific way. That's, that's engineering." — Simon (53:32)
- On "slop" and the dark side:
"Will we just be flooded in slop and be like wow, I wish nobody'd ever invented this stuff at all. And it's harder for me to evaluate that because I think programmers are the best equipped to use these tools..." — Simon (91:27)
- On responsibility:
"If we can figure out what other things we can do that generally enhance people's lives that make the world a better place, those positive impacts. And if we stay away from generating like garbage slop and dumping that on people, that feels right." — Simon (111:31)
Additional Memorable Moments
- 1.4 million hits from an Elon Musk tweet (25:08)
"Yeah, I got 1.4 million hits on a page from that one. And yeah, without Cloudflare, I would have instantly melted."
- The "goop" of LLM-generated code (86:09)
"The idea that code is now goop. As a programmer, that offends my very soul. Like, that's, that's sort of horrifying."
- Copy-paste as the ultimate LLM API (67:42)
"Copy and paste is the best API, right? Copy and paste. Copy and paste half a million tokens of information about that person in, and I am certain you'd get good results out of that."
Recommended Resources
- Prompt engineering: Anthropic’s Claude documentation (56:35)
- Simon’s LLM CLI tool: [pip install llm] [Simon's GitHub]
- Simon’s blog and Substack: [simonwillison.net]
- Prompt Injection security: [Simon's blog / PyCon talk]
- Stanford JSK Fellowship: [Stanford JSK]
Timestamps for Essential Segments
- LLM as “Weird Intern” analogy: 00:00 / 78:54
- Early LLM experimentation: 01:51
- Blogging as thinking/accountability: 05:40 / 08:44
- Productivity via GitHub issues: 11:27 / 15:38
- Substack as blog delivery: 20:50 / 22:58
- llm CLI tool and plugins: 28:52
- Vibes-based model evaluation: 79:59
- "Goop" and fears about code quality: 86:09
- Advice for junior devs: 95:29
- Life as an independent/prioritization: 101:09
- Stanford JSK independence story: 105:39
Final Takeaways
- LLMs can be thought of most effectively as flawed but tireless and educable assistants—overconfident, sometimes wrong, but immensely powerful when used wisely and with a strong feedback loop.
- Effective documentation and iterative habits, particularly via living issue logs and open, iterative blogging, are secret weapons for increased productivity in complex environments.
- Prompt engineering is not mere wordplay—it is rapidly becoming a foundational software discipline.
- The balance of innovation and harm ("slop") in the age of AI demands both optimism and ethical awareness from practitioners.
- Senior engineers and independent developers will pull farthest ahead, but LLMs hold massive potential to democratize automation and programming for all.
For further reading and Simon’s tools, refer to the resources and show notes.
