Eye on A.I. Podcast – Episode #324
Guest: Sharon Zhou, VP of AI at AMD
Host: Craig S. Smith
Date: February 27, 2026
Episode Overview
In this episode, host Craig S. Smith sits down with Sharon Zhou, VP of AI at AMD, to delve into AMD's strategy for building self-improving artificial intelligence. Sharon discusses the cutting-edge efforts underway to use AI for automatic kernel code generation—crucial for boosting performance on AMD hardware—and how these self-improving systems could shape the ecosystem at large. The conversation covers kernel generation, continual learning and catastrophic forgetting, the boundary between automation and genuine self-improvement, and the broader educational outreach Sharon is piloting. The discussion situates AMD's efforts within the constantly evolving landscape of AI research, hardware, and global compute demands.
Key Discussion Points and Insights
1. Introduction: Sharon Zhou’s Background and AMD’s AI Focus
- Sharon introduces herself as VP of AI at AMD, describing her transition from AI research at Stanford, through a startup focused on post-training language models on AMD GPUs, to her current role.
- “I am Sharon. I'm the VP of AI at AMD and I think about self improving AI, self improving LLMs which we'll get into later. ... And most recently ... we have transitioned now to AMD. So my team and I are there now and very excited to enable more people to use compute and to get access to compute because that really is the limiting factor.” (00:23)
2. Defining Self-Improving AI and Kernel Generation
- Sharon explains "self-improving AI" as models that can edit any part of themselves for improvement—data, architecture, evaluation, or even the foundational kernel code.
- AMD’s efforts are particularly focused on enabling language models to write their own low-level kernel code to optimize performance on GPUs.
- “It's the idea of these models being able to edit any part of themselves to improve themselves ... what I'm working on is ... how fast they actually run on the GPUs themselves. They are writing the kernel code that underlies these models to run faster on these GPUs ...” (02:00)
3. Kernel Generation: Industry Collaboration and Benchmarks
- Sharon describes a landscape where multiple organizations—Meta, Google, DeepMind, Nvidia, Stanford—are working on AI-assisted kernel generation.
- AMD collaborated on a NeurIPS tutorial to educate the community about AI-generated kernels and shared benchmarks/methods.
- “What we did most recently was in collaboration with actually a bunch of different institutions ... Was NeurIPS tutorial on generating kernels using AI ... how we're using AI agents [to] generate these kernels and how we're thinking about post training these models to generate kernels more effectively.” (02:51)
4. Technical Deep Dive: What Are Kernels and Why Optimize Them?
- The conversation explores what “kernels” are: small, low-level programs that run specific operations (e.g. matrix multiplication) on GPUs and are key to AI model efficiency.
- “For a given piece of hardware ... you do have to write the software to connect the models of today and of tomorrow, ideally to that hardware. ... It's utilizing the GPU effectively, both the memory ... as well as the actual raw compute power.” (04:58)
- Matrix multiplication is the most common and critical kernel operation for LLMs.
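To make the kernel discussion concrete, here is an illustrative sketch (not AMD's actual kernel code): at its core, a matrix-multiply "kernel" is the triple loop below, and real GPU kernels express the same computation in HIP or CUDA with explicit tiling over GPU memory and compute units.

```python
import numpy as np

def naive_matmul(A, B):
    """Illustrative matrix-multiply 'kernel': the triple loop that every
    optimized GPU kernel ultimately implements, restructured with tiling,
    vectorization, and careful memory access to use the hardware well."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n))
    for i in range(m):          # rows of A
        for j in range(n):      # columns of B
            for p in range(k):  # inner (reduction) dimension
                C[i, j] += A[i, p] * B[p, j]
    return C

# Sanity check against NumPy's optimized implementation.
A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
assert np.allclose(naive_matmul(A, B), A @ B)
```

The gap between this naive loop and a hand-tuned GPU kernel, which can be orders of magnitude in speed, is exactly what kernel engineers (and now AI agents) work to close.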
5. Evolutionary Strategies, Agentic Approaches & Bottlenecks
- AMD combines evolutionary, agentic, and post-training methods to generate and refine these kernels.
- Manually written kernels require rare expertise in both hardware and software, making automation/self-improvement a huge productivity booster.
- “Having those two areas of expertise in one person's head is quite rare. ... But of course we're also teaching the models about this. ... We have the ROCm stack, which is open source. ... That being open actually helps us from a language model perspective because the models can read that data and train from that data.” (08:52)
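A minimal sketch of the evolutionary loop described above, with placeholder functions: `generate_variant` stands in for an LLM-backed mutation step and `benchmark` for a real on-GPU timing harness, both assumptions for illustration only.

```python
import random

def evolutionary_search(seed_kernel, generate_variant, benchmark,
                        population=4, generations=3):
    """Sketch of evolutionary kernel search: repeatedly mutate the best
    candidate kernel and keep whichever variant benchmarks fastest.
    `benchmark` returns runtime (lower is better)."""
    best_time, best_kernel = benchmark(seed_kernel), seed_kernel
    for _ in range(generations):
        candidates = [generate_variant(best_kernel) for _ in range(population)]
        for cand in candidates:
            t = benchmark(cand)
            if t < best_time:
                best_time, best_kernel = t, cand
    return best_time, best_kernel

# Toy stand-ins: a "kernel" is just a number whose value is its runtime.
variant = lambda k: k + random.uniform(-1.0, 0.5)
runtime = lambda k: max(k, 0.1)
t, k = evolutionary_search(10.0, variant, runtime)
assert t <= 10.0
```

In practice, agentic approaches layer planning and error feedback on top of this loop, and post-training teaches the generator model to propose better variants in the first place.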
Notable Quotes & Discussion Highlights
The Impact and Ambitions of Kernel Self-Improvement
- “The part of self improving though we've been working on is using these language models to write low level kernel code. ... Get the model to actually write its own code to run even faster on the GPU so that it can learn even faster. This is the self improving loop…” (13:59)
On Continual Learning and Catastrophic Forgetting (12:22)
- Sharon explains the risk of “catastrophic forgetting” during post-training, especially when access to pre-training data is limited:
- “Even if you include ... only 1% of the pre training data back in during post training, you can actually prevent catastrophic forgetting, basically enabling the model to actually like connect back to its representations back in pre training.” (12:22)
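The replay idea Sharon describes can be sketched as a batching scheme; the sampling strategy and exact mixing ratio here are illustrative assumptions, not AMD's training recipe.

```python
import random

def mixed_batches(post_train_data, pretrain_data, replay_frac=0.01,
                  batch_size=100, num_batches=10, seed=0):
    """Sketch of replay mixing against catastrophic forgetting: each
    post-training batch reserves ~replay_frac of its slots for
    pre-training examples, anchoring the model to its earlier
    representations while it learns the new task."""
    rng = random.Random(seed)
    n_replay = max(1, int(batch_size * replay_frac))
    for _ in range(num_batches):
        batch = rng.sample(post_train_data, batch_size - n_replay)
        batch += rng.sample(pretrain_data, n_replay)
        rng.shuffle(batch)
        yield batch

post = [f"post_{i}" for i in range(1000)]
pre = [f"pre_{i}" for i in range(1000)]
batch = next(mixed_batches(post, pre))
assert len(batch) == 100
assert sum(x.startswith("pre_") for x in batch) == 1  # the 1% replay slot
```

The striking claim from the conversation is that even this tiny fraction of replayed pre-training data can be enough to prevent the model's earlier capabilities from degrading.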
Distinction: Automation vs. Self Improvement (26:21)
- Sharon sees kernel generation as more than rote automation:
- “I view it as self improvement, not automation ... the kernel itself is what, is what the model is running on. So it is the model's own code and it enables the model to run even faster on the GPU. ... but if we want to view it through a lens that's like more exciting ... that is just a different perspective on it.” (26:42)
The Reality of AI Autonomy (27:20)
- On whether models will soon truly rewrite their own code:
- “Yeah, I definitely think so. I think more likely it will be a different model that does it.” (27:31)
Resource Saving and Industry Economics (34:30)
- On the tangible impact:
- “A very, very complex kernel can take ... let's say a non-expert, but someone who might still be tasked to do it, months to write ... an expert would take a couple weeks ... if you can even shave off ... a tiny amount of time it takes for one matrix multiplication that occurs billions, trillions of times inside of a model ... that can incur billions, hundreds of billions of dollars to a company.” (34:30)
Compute Demand and Infinite Chips (39:26)
- On whether kernel optimization could ease the chip shortage:
- “I think people want infinite chips, Craig. So no, it doesn't relieve the pressure. I don't think they found a plateau where they are like, we're, we're done ... I, I don't see that right now.” (39:26)
Segment Timestamps
| Timestamp | Segment |
|-----------|------------------------------------------------|
| 00:23 | Sharon’s background & her excitement at AMD |
| 02:00 | Defining "self-improving AI" |
| 02:51 | Industry collaborations on kernel generation |
| 04:58 | Technical explanation: kernels & GPU stack |
| 07:01 | Evolutionary/agentic strategies for kernel gen |
| 08:52 | Bottlenecks & value of open-source ROCm |
| 12:22 | Catastrophic forgetting & continual learning |
| 13:59 | Models writing their own kernel code |
| 19:53 | AMD's approach: research & product focus |
| 21:39 | Recap of NeurIPS tutorial for community |
| 26:42 | Automation vs. self-improvement discussion |
| 27:20 | Realistic prospects for model autonomy |
| 34:30 | Economic impact and time savings |
| 39:26 | Compute demand & market implications |
| 41:54 | Use cases: language models, diffusion, vision |
| 45:46 | Sharon's educational initiatives |
Educational & Outreach Efforts
- Sharon teaches AI at scale, collaborates with DeepLearning.AI and Harvard University, and is developing practical courses for different levels, including a popular post-training class on RL and fine-tuning.
- "I teach online. I teach about a million people ... and lately we built up a partnership with DeepLearning.AI ... launched a post training class on reinforcement learning and fine tuning of these models." (42:36)
- Courses are available for free on DeepLearning.AI; Harvard courses coming soon. (46:04)
Memorable Moments
- On AI replacing kernel engineers:
- “If you're listening, I encourage you to go learn about it. But of course we're also teaching the models about this. ... That's the impact ... massive ...” (08:52)
- State of AI self-improvement:
- “We're further than we think and closer than we think.” (15:33)
- On the open/closed nature of GPU stacks:
- “We have the ROCm stack, which is open source. That's the equivalent of Nvidia's CUDA stack, which is not as open. That being open actually helps us ...” (08:52)
Closing Synthesis
This episode offers an insider’s view into how AMD is leveraging AI not only to speed up its own hardware, but also to automate critical software infrastructure, democratize access to high performance compute, and seed a broader community of AI practitioners. Sharon Zhou’s expertise bridges academic research, practical productization, and educational outreach, situating AMD at the intersection of hardware innovation and AI self-improvement.
Where to Learn More
- Courses:
- DeepLearning.AI: Free courses on RL and model fine-tuning. (46:04)
- Harvard University: Upcoming professional courses for non-engineers and leaders.
- Open source contributions from AMD: Kernels and benchmarks.
For more insight on the AI hardware-software frontier and the evolving balance between automation and autonomy, this episode is a must-listen for practitioners and enthusiasts alike.
