Podcast Summary: Software Engineering Daily – "CodeRabbit and RAG for Code Review with Harjot Gill"
Release Date: June 24, 2025
In this enlightening episode of Software Engineering Daily, host Kevin Ball engages in a deep dive with Harjot Gill, the founder and CEO of CodeRabbit—a pioneering startup integrating generative AI into the code review process. They explore the architecture of CodeRabbit, its innovative use of Large Language Models (LLMs), and the intricate mechanisms that ensure quality, security, and maintainability in software development at scale.
1. Introduction to CodeRabbit and Harjot Gill
Harjot Gill introduces CodeRabbit as a solution that leverages generative AI to enhance code reviews, ensuring code quality and security across platforms like GitHub and GitLab. With over 100,000 daily users, CodeRabbit has rapidly gained popularity among developers across various industry segments.
Harjot Gill [00:00]: “One of the most immediate and high impact applications of LLMs has been in software development. The models can significantly accelerate code writing, but with that increased velocity comes a greater need for thoughtful, scalable approaches to code review.”
2. User Experience with CodeRabbit
Kevin Ball inquires about the practical aspects of using CodeRabbit. Harjot Gill explains that CodeRabbit seamlessly integrates into existing development workflows, primarily functioning within the pull request model. Once a feature branch is ready, opening a pull request triggers CodeRabbit to perform automated code reviews alongside traditional CI/CD pipelines. Additionally, a recently launched VS Code extension allows developers to review code before pushing it to remote repositories.
Harjot Gill [02:34]: “CodeRabbit sits alongside those tools and uses AI to perform code reviews. And very recently... we also released a VS Code extension that also works... so that the developers can also review the code before they even push the code to the remote Git branch.”
3. Technical Architecture and Implementation
a. Code Generation vs. Code Review
Harjot Gill differentiates between code generation and code review, emphasizing that while code generation focuses on autocomplete and suggestions using smaller, low-latency models, code review demands deep reasoning and comprehensive analysis.
Harjot Gill [04:18]: “The workflow that CodeRabbit is sitting on is latency-insensitive because you're running it in the CI/CD pipeline, and that workflow can typically take several minutes to complete.”
b. Leveraging Large Language Models (LLMs)
CodeRabbit employs an ensemble of multiple LLMs tailored to specific tasks within the pipeline. This strategy balances performance and cost, using smaller models such as GPT-4.1 Nano for context preparation and more capable models for nuanced code analysis.
Harjot Gill [10:15]: “CodeRabbit is an ensemble of models we draw on. We don't even expose what models we are using to the end customers.”
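The episode does not reveal which model backs which stage, but the routing pattern Gill describes, cheap fast models for context preparation and stronger models for the review itself, can be sketched roughly as follows. The model names and task types here are illustrative placeholders, not CodeRabbit's actual configuration:

```python
# Illustrative router: send lightweight pipeline stages to a cheap model
# and reasoning-heavy stages to a stronger one. All names are hypothetical.
CHEAP_MODEL = "small-fast-model"
STRONG_MODEL = "large-reasoning-model"

# Stages that only need extraction or summarization, not deep reasoning.
LIGHTWEIGHT_TASKS = {"summarize_diff", "extract_symbols", "rank_context"}

def route_model(task_type: str) -> str:
    """Pick a model tier based on what the pipeline stage demands."""
    if task_type in LIGHTWEIGHT_TASKS:
        return CHEAP_MODEL
    # Anything else (e.g. "review_hunk", "security_analysis") goes to
    # the slower, more expensive, more capable model.
    return STRONG_MODEL
```

The point of this pattern is that the expensive model is reserved for the steps where reasoning quality actually matters, which is also how the cost management described later in the episode becomes tractable.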
c. Contextual Analysis and Indexing
To provide relevant and accurate code reviews, CodeRabbit builds a dynamic code graph from pull request payloads, analyzing diffs, dependencies, and contextual information from issue trackers like JIRA. Additionally, past interactions and team-specific learnings enrich the AI’s understanding, ensuring personalized and effective reviews.
Harjot Gill [06:21]: “There are like, I don't know, 10 to 15 different data points that we pull in during the context.”
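The "10 to 15 data points" Gill mentions suggest a context-assembly step that bundles many signals before the LLM sees the pull request. A minimal sketch of that idea, with hypothetical field names that are not CodeRabbit's actual payload schema, might look like:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewContext:
    """Hypothetical bundle of signals gathered before the LLM reviews a PR."""
    diff: str
    changed_files: list = field(default_factory=list)
    linked_issues: list = field(default_factory=list)   # e.g. JIRA tickets
    team_learnings: list = field(default_factory=list)  # past review feedback

def build_context(pr_payload: dict) -> ReviewContext:
    """Assemble review context from a pull-request payload.

    Field names are illustrative; a real system would pull each signal
    from its own source (issue tracker, code graph, learnings store).
    """
    return ReviewContext(
        diff=pr_payload.get("diff", ""),
        changed_files=pr_payload.get("files", []),
        linked_issues=pr_payload.get("issues", []),
        team_learnings=pr_payload.get("learnings", []),
    )
```

Keeping the bundle explicit like this also makes it easy to drop or rank individual signals when the context window gets tight, which is exactly the limitation the next section addresses.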
4. Handling LLM Limitations
LLMs, despite their prowess, have limited context windows and can suffer from quality degradation when overloaded with information. Harjot Gill discusses how CodeRabbit strategically manages context by supplying the AI with essential hints and creating sandbox environments where the AI can execute agentic loops—running CLI commands and web queries to fetch additional context as needed.
Harjot Gill [06:48]: “What we are doing, which is a cool thing, which is like so differentiated right now, we create all these like sandbox environments in the cloud.”
5. Sandboxing and Security
Ensuring secure and efficient sandboxing is pivotal for CodeRabbit. The system employs standard containerization techniques, allowing the AI unrestricted internet access to perform necessary operations like running shell scripts and accessing GitHub APIs. Tokens are securely stored, and the AI leverages its training data to understand and execute CLI commands without additional handholding.
Harjot Gill [17:37]: “We actually generate code instead of doing tool calls. We have a sandbox and CLI. That's all you need.”
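The "generate code instead of tool calls" idea is worth pausing on: rather than constraining the model to a fixed schema of tool invocations, the model emits a plain shell script and the sandbox simply executes it. A toy sketch of that execution side, run locally here purely for illustration rather than in an isolated container, could look like:

```python
import subprocess

def run_in_sandbox(script: str) -> str:
    """Execute a model-generated shell script and return its stdout.

    In a real system this would run inside an isolated container with
    scoped credentials; running it directly like this is only a sketch.
    """
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, timeout=30
    )
    return result.stdout

# Instead of a structured call like {"tool": "grep", "args": [...]},
# the model emits an ordinary script the sandbox executes verbatim:
model_generated_script = "echo 'hello from sandbox'"
output = run_in_sandbox(model_generated_script)
```

The appeal of this design, as Gill frames it, is that models already know CLI tools from their training data, so no per-tool schema needs to be defined or maintained.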
6. Agentic Loop and Task Management
CodeRabbit's agentic loop is a sophisticated pipeline that breaks down code review tasks into a dynamic task graph. This system delegates subtasks to specialized agents, ensuring thorough and accurate analysis. Each agent's output is meticulously tracked, allowing the main agent to reassess and replan if necessary, maintaining a high standard of code quality.
Harjot Gill [25:09]: “That's right. And this task graph is dynamic, as you can guess. I mean it's figured out by the AI.”
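The dynamic task graph Gill describes can be pictured as a dependency graph of subtasks executed in topological order, with each subtask delegated to an agent and its completion tracked by the main agent. This is an illustrative sketch of that shape, not CodeRabbit's implementation; the task names are hypothetical:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def run_task_graph(graph: dict, agents: dict) -> list:
    """Execute review subtasks in dependency order.

    `graph` maps each task to the set of tasks it depends on;
    `agents` maps each task to the callable that performs it.
    Returns the ordered list of completed tasks.
    """
    completed = []
    for task in TopologicalSorter(graph).static_order():
        agents[task]()          # delegate to the specialized agent
        completed.append(task)  # main agent tracks each output
    return completed

# Hypothetical review graph: hunk review needs context; the summary
# needs the hunk reviews. In CodeRabbit the AI builds this graph itself.
results = []
graph = {
    "gather_context": set(),
    "review_hunks": {"gather_context"},
    "summarize": {"review_hunks"},
}
agents = {t: (lambda t=t: results.append(t)) for t in graph}
order = run_task_graph(graph, agents)
```

The key difference in the real system, per the episode, is that the graph is not hard-coded: the AI plans it, and the main agent can reassess and replan as subtask outputs come back.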
7. Cost Management Strategies
To maintain cost-effectiveness, CodeRabbit strategically employs a mix of cheaper and more expensive models depending on the task complexity. Additionally, the platform implements rate limits and incremental review processes to minimize unnecessary computations, ensuring scalability without exorbitant costs.
Harjot Gill [28:12]: “One of the things that people love about CodeRabbit, it's an incremental reviewer. So it will remember the last time we left the review, and next time when it resumes, it will first see whether I have to really re-review something or not.”
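The incremental-review behavior Gill describes, skipping anything already reviewed and only re-reviewing what changed, can be sketched with a simple content-hash cache. This is an illustrative approximation, not CodeRabbit's actual mechanism:

```python
import hashlib

def review_incrementally(files: dict, cache: dict) -> list:
    """Return only the files whose content changed since the last review.

    `files` maps path -> current content; `cache` maps path -> the content
    hash recorded on the previous run, and is updated in place.
    """
    to_review = []
    for path, content in files.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if cache.get(path) != digest:   # new or modified since last review
            to_review.append(path)
            cache[path] = digest
    return to_review
```

Each skipped file is an LLM call that never happens, which is how incremental review doubles as a cost-control mechanism alongside the cheap/expensive model split.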
8. Challenges and Principles in Building CodeRabbit
Developing CodeRabbit involved navigating the inherent non-determinism of LLMs and ensuring a reliable user experience. Harjot Gill emphasizes the importance of hiding LLM deficiencies from users, maintaining high accuracy, and embedding AI seamlessly into existing workflows to foster adoption.
Harjot Gill [30:10]: “The trick has been how do you hide those deficiencies from the end user. They tend to be noisy, they tend to create a lot of slop otherwise.”
9. Scaling and Customer Acquisition
CodeRabbit’s growth was propelled by a consumer-style marketing approach, leveraging influencer partnerships, organic social media presence, and a strong word-of-mouth effect. By making the tool accessible and free for open-source projects and individual developers, CodeRabbit built a habit-forming product that quickly resonated with its target audience.
Harjot Gill [43:44]: “We made the product accessible to as many people as we could. We made the product free for open source users... to quickly iterate on the product and make sure that we build a habit forming product.”
10. Future Directions
Looking ahead, CodeRabbit is expanding its focus from code reviews to code polishing, addressing the final 20% of code quality enhancements such as documentation and unit test coverage. This strategic shift aims to solidify CodeRabbit’s position as an indispensable tool in the software development lifecycle.
Harjot Gill [46:17]: “We are focusing on that in the PR. Can we eliminate all the deficiencies? Like for example like if you're missing documentation and you as a company care about it, can we add doc strings, can we add missing unit test case coverage.”
11. Conclusion and Takeaways
Harjot Gill concludes by encouraging developers to experience CodeRabbit firsthand, highlighting its effectiveness and seamless integration into daily workflows.
Harjot Gill [48:10]: “I recommend everyone at least try it once.”
Key Insights:
- Integration Over Intrusion: CodeRabbit successfully embeds AI into existing workflows without disrupting developer habits, ensuring widespread adoption and love from its user base.
- Strategic Use of LLMs: By employing an ensemble of specialized models and dynamic task management, CodeRabbit balances performance with cost, delivering high-quality code reviews at scale.
- Transparency and Trust: Providing users with contextual insights and a clear reasoning trail fosters trust and allows for validation, mitigating the risk of AI-induced errors.
- Scalable and Sustainable Growth: A focus on accessibility, combined with intelligent cost management and a strong user-centric approach, has propelled CodeRabbit’s exponential growth.
For developers looking to enhance their code review processes with state-of-the-art AI, CodeRabbit offers a robust, reliable, and seamlessly integrated solution. As Harjot Gill aptly puts it, experiencing CodeRabbit firsthand is the best way to appreciate its transformative potential.
