AI at Anaconda with Greg Jennings - Software Engineering Daily

Summary

Podcast Summary: AI at Anaconda with Greg Jennings

Episode: AI at Anaconda with Greg Jennings
Podcast: Software Engineering Daily
Release Date: July 3, 2025
Host: Kevin Ball

Introduction to Greg Jennings and Anaconda

In this episode of Software Engineering Daily, host Kevin Ball welcomes Greg Jennings, the Vice President of Engineering and AI at Anaconda. Anaconda is renowned for its comprehensive solutions in managing packages, environments, security, and large-scale data workflows, significantly contributing to the accessibility and scalability of Python-based data science.

Greg Jennings shares his journey from a physics and material science graduate to leading AI initiatives at Anaconda:

“I started as a graduate out of physics and material science... I found that was actually faster to do very often.”
[01:23]

Anaconda vs. Standard Python

Kevin Ball prompts Greg to differentiate Anaconda from the standard Python installation available on platforms like MacBooks.

Greg Jennings explains that Anaconda was created to address the complexities of managing Python’s binary dependencies and environment isolation, which are common challenges, especially on Windows machines. He recounts his initial struggles with installing packages like NumPy using standard Python and how Conda, Anaconda's package manager, provided a seamless experience:

“When I started using Python... I was like, this is magical.”
[05:04]

Anaconda’s defaults distribution offers a curated set of packages, ensuring compatibility and ease of installation, which stands in contrast to the more cumbersome standard Python setup.

Business Model and Sustainability

Greg delves into how Anaconda, the company, sustains itself. The company maintains the defaults distribution, providing enterprise support, observability, and governance tools tailored for large organizations. Additionally, Anaconda plays a pivotal role in the Conda Foundation, sponsoring and maintaining packages that cater to both enterprise and individual practitioners.

“Anaconda makes its money by providing enterprise support... governance for large organizations.”
[07:08]

Focus on AI: Traditional ML vs. Generative AI Models

Transitioning to AI, Greg distinguishes between traditional machine learning models and the current landscape dominated by generative pre-trained transformers (GPT). Traditional models required specific, problem-centric data and were often brittle, necessitating extensive monitoring. In contrast, GPT models are pre-trained on vast datasets, embedding extensive information that can be leveraged out-of-the-box.

“The pre-trained is really the key part... how people are incorporating those into their experiences.”
[08:57]

Anaconda’s AI Tooling Ecosystem

Kevin Ball inquires about the specific AI tools Anaconda is developing. Greg outlines three primary focus areas:

Internal AI Enhancements: Utilizing AI to streamline internal processes like package building and management.
Anaconda Toolbox: A context-aware assistant integrated within Jupyter Notebook and PyExT, designed to assist users with Python-specific challenges by providing inline support, reducing the need for manual context switching.

“Anaconda Assistant... it just kind of works.”
[11:13]
Anaconda Assistant’s Growth: Initially targeting key pain points like data visualization, the assistant has expanded to over 50,000 active users, continuously improving with more powerful models.

Overcoming Challenges with LLMs in Data Science

Greg addresses the inherent challenges of Large Language Models (LLMs), particularly their tendency to hallucinate. Anaconda tackles this by embedding tools that explain generated code snippets within notebooks, fostering better understanding and validation among users.

“We have the ability to inline immediately explain any code snippets...”
[26:33]

He emphasizes the importance of user education and validation to prevent misinterpretations of data, especially for non-expert users engaging in data science.

Enhancing Accessibility and Democratizing Data Science

The conversation highlights how AI tools like Anaconda Assistant are democratizing data science by making complex tasks more accessible. Greg envisions a future where notebooks, enhanced by AI, become even more user-friendly, enabling broader usage beyond seasoned data scientists.

“AI is helping us unlock that and surface that in a much more accessible way.”
[20:02]

Future Directions: AI Navigator and Package Management Evolution

Looking ahead, Greg discusses AI Navigator, Anaconda’s initiative to create a local AI control plane. AI Navigator aims to integrate AI models seamlessly into Python applications, facilitating the creation of resilient, AI-driven workflows. This includes evolving package management to accommodate external AI dependencies and enabling applications where agents can interact with one another.

“AI Navigator is this local control plane that's designed to do that...”
[47:12]

Conclusion: Embracing the Changing Software Stack

Greg Jennings concludes by acknowledging the transformative impact of AI on the software stack. Anaconda is committed to evolving alongside these changes, ensuring that developers and practitioners have the tools and support needed to leverage AI effectively within their workflows.

“We're working towards very interesting things in all of them...”
[47:43]

Key Takeaways:

Anaconda provides a robust ecosystem that simplifies Python package management, especially for data science applications.
AI Integration: Anaconda is at the forefront of integrating AI tools like Anaconda Assistant into development environments, enhancing productivity and accessibility.
Challenges with LLMs: Addressing issues like hallucinations in AI models is crucial for reliable data science workflows.
Future Innovations: Initiatives like AI Navigator signal Anaconda’s commitment to evolving package management and AI integration in software development.

This episode offers valuable insights into how Anaconda is shaping the future of AI in software engineering, making advanced tools more accessible and efficient for developers worldwide.

Transcript

[00:01]
Greg Jennings
Anaconda is a software company that's well known for its solutions for managing packages, environments, security, and large-scale data workflows. The company has played a major role in making Python-based data science more accessible, efficient and scalable. Anaconda has also invested heavily in AI tool development. Greg Jennings is the VP of Engineering and AI at Anaconda. He joins the podcast with Kevin Ball to talk about the tooling ecosystem around AI app development, the Anaconda Toolbox, the rapidly evolving role of AI in engineering, and more. Kevin Ball, or K. Ball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow K. Ball on Twitter or LinkedIn or visit his website Kball LLC.
[01:12]
Kevin Ball
Greg, welcome to the show.
[01:13]
Greg Jennings
Hi Kevin, very nice to meet you.
[01:15]
Kevin Ball
Yeah, excited to have you on here. Let's maybe start with a little bit about you. Can you share some of your background and how you got involved with Anaconda?
[01:24]
Greg Jennings
Sure. So I started as a graduate out of physics and material science, out of graduate school, and went to work at a large consulting organization where we were building sort of complex models and simulations for mostly government organizations to help them do things like figure out complex dependencies and procurement schedules and anticipate the capabilities that they would gain from picking Platform A versus Platform B and balancing costs. And so I'd written mostly simulation code in graduate school, but I started writing a lot more, I guess, user-facing code to kind of interact with end users and expose some of those capabilities to end users. So during my time there I briefly found Python. We were writing things in Java Swing at the time, it was a painful experience. So we grew a team there, then ultimately decided along with a few other colleagues to set out and start to forge our path in a startup world as some people do, and started to then explore Python. And so then at that point stepping away, I really sort of dove into Python much more deliberately and found that I could be way more productive with it. Of course some people have found that they love typed languages for large projects, but for me trying to move very fast, it was extremely helpful to be writing in Python and so started writing a lot of things in Python. Even I found when I started doing consulting work that also leveraged my Java background, I would oftentimes write the algorithm in Python first and then port that algorithm over to Java. And I found that was actually faster to do very often. And the thing that I found that really made it faster to do was the ability to work in the REPL, and especially work in, at that time, IPython notebooks. So the ability to really iterate on things like that made it so powerful. And then from there, I kind of formed a product startup and we built the entire stack in Python. 
And again, starting off as sort of an individual contributor where I was sort of building everything end to end, starting with the web framework, all the way down to writing kind of custom machine learning code, it was extremely powerful. Like, there was really no other single language that I could have used that would have enabled me to do all of that individually. And as part of that, I found a package, a magical package, called Numba, which helped to accelerate a lot of those computationally intensive workloads that were a core part of the application that we built. And so when time came for me to move on from that startup, I had interacted with and known a number of folks. I made a small pull request at one point to the Numba library and had a very positive impression of Anaconda because it had been such a transformational product in my own workflow and really solved so many problems for me. So I had a very strong impression of it. And I came to Anaconda just about three years ago, not quite three years ago. It'll be three years at the end of April. And now I am the VP of Engineering for AI at Anaconda, where I lead all of our AI initiatives from the engineering side.
[04:39]
Kevin Ball
That's awesome.
[04:39]
Greg Jennings
Yeah.
[04:40]
Kevin Ball
I think a lot of people have had that experience with Python where it's like, there was an XKCD comic at some point about this, right? Where it's like, "I'm flying." "How?" "Python." It just, like, feels like you can move so quickly and productively in it.
[04:52]
Greg Jennings
Absolutely.
[04:53]
Kevin Ball
Let's actually talk a little bit about Anaconda. For folks who are not deeply in the Python world, what is Anaconda and how does it differ compared to, say, the Python that just comes installed on a MacBook?
[05:05]
Greg Jennings
Right. Well, when I started using Python, I can, like, leverage my own personal experience and talk a little bit about this. When I started using Python, I started using it on a Windows machine. And one of the first big challenges I had was trying to figure out how to install NumPy on a Windows machine with the existing standard Python. Because one of the challenges that Python has historically had is the inability to work really well with complex binary dependencies and the ability to kind of have different sort of semi-isolated environments on the same machine. So anybody that started doing any sort of heavy numerical work, and as NumPy came along, which Travis Oliphant, who was one of the original founders of Anaconda, wrote, that was a core problem that they identified that people kept having. So they realized that a huge opportunity existed to solve that problem for people. So Anaconda was kind of created around this sort of package manager, the conda ecosystem, with the idea that they could create a sort of way to build the packages and manage all of those dependencies so that people could stand those things up super easily and get started. And in fact, like, that was my experience. And so at some point, I think back, this might have even been like pre the days when Stack Overflow, I guess Stack Overflow was around back then. But I found at some point on a forum and somebody said like, just use conda, right? So I did just use conda to install NumPy and was able to get everything up and running and was like, this is magical. And so I think that initial sort of magical experience where everything just worked is kind of the core of the practitioner experience that Anaconda really tries to deliver to all of our users and all of the folks who are our customers as well.
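To make the "semi-isolated environments on the same machine" point concrete, here is a minimal sketch that reports which environment the running interpreter belongs to. It assumes only the standard library; `CONDA_DEFAULT_ENV` is the variable conda sets on activation, and the helper name `describe_environment` is made up for illustration.

```python
import os
import sys


def describe_environment() -> dict:
    """Report which Python environment the current interpreter belongs to."""
    return {
        # Root directory of the environment this interpreter is running from.
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this from two different environments on the same machine should show two different prefixes, which is the isolation being described.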
[06:48]
Kevin Ball
Yeah, I love that description. So let's really spend a tiny bit more time on this and then move on a little bit into AI, which is, as I understand, kind of your bread and butter there. But you have Anaconda the distro and then Anaconda, the company that is running it. Like what is the relationship? How does Anaconda, the company sustain itself? Like, what does that look like?
[07:09]
Greg Jennings
So Anaconda, the company, maintains what's called the defaults distribution. And this is a set of curated packages that we build ourselves. So we take those packages, we take the upstream build recipes, we bundle that all together, we run the build process and we package that into that distribution, which is available through the Anaconda defaults channel. That's the primary distribution that all of our customers use. And that is where Anaconda makes its money, is by providing that and providing a layer of enterprise support for that and providing a level of capability on top of that, which is mainly around observability and governance for large organizations. So this is how Anaconda makes its money. Now Anaconda, the larger organization, is also a core part of kind of the Conda foundation, so we are also a big sponsor of that. We host and maintain all of the packages that, for instance, conda-forge serves, which is on the anaconda.org channel. And many people, especially, you know, practitioners outside of enterprises, also use those packages.
[08:18]
Kevin Ball
Got it. Okay. So as I understand it, there's a bunch of stuff you can get as an individual, but if you want to use this as an enterprise, you're paying Anaconda the company and getting support and all these different pieces, correct?
[08:29]
Greg Jennings
Yep.
[08:30]
Kevin Ball
Okay, cool. Let's now dive in to what you are focused on. You said AI is your area, you're VP of AI applications and tooling. Let's talk a little bit about what we mean with AI because you've been in the ML space for a long time. A lot of people hear AI right now and they think the latest in LLMs and the LLM craze. But what are the different parts of the AI ecosystem that you focus on?
[08:57]
Greg Jennings
So most of what we are focused on internally is around how to use, say, pre-trained AI models. And that's distinct from, as you noted, like a lot of the traditional machine learning models and sort of what my historical background is. So way back when people built machine learning models, typically they wouldn't do anything without having your own data that you needed to bring to them. And oftentimes it was problem-specific data. So for instance, if I wanted to train a decision tree model or, you know, classical regressor model or classifier model, I needed to bring my own data to it. I needed to apply that and train an algorithm. And then once I had that, I could run inference. But it only really worked on this very specific narrow problem. And they were often kind of brittle. You had to do a lot of monitoring associated with it. So their utility was kind of limited to places where you had that kind of both problem stability and like the value of the problem that you were going to apply it to was enough that it was worth it to sort of throw those resources at it. The era now that we find ourselves in is quite different, which is like, GPT is generative pre-trained transformer. And the pre-trained is really the key part of that because that step involves taking a huge amount of information, tons of information pulled from all over the Internet and data sources like the Pile and feeding that into the model in advance. And what that creates is a situation where you have these much bigger, huge models that have a lot of information in them sort of baked in and available out of the box. And the primary place that we focus on at Anaconda is how we can use those kinds of models to help enhance people's workflows and how people are incorporating those into their experiences that they're building with our tools.
[10:53]
Kevin Ball
That makes sense. So there's a lot of different takes going on in that space. And you have kind of, I think the first in this space, most of those tools were in Python. You had like LangChain and all these other different tools for doing it. So what is the tooling that your team is focused on? What parts are you building out? Is this application frameworks or is this integrations into notebooks or how are you thinking about it?
[11:14]
Greg Jennings
We're thinking about it kind of in a couple different ways, actually three major ways which we'll kind of go through. The first is that we in Anaconda work through a lot of Python-related issues and there's ways that we can apply AI internally to help us address some of those issues. So one of the things that we do which is complex is building packages, building packages for our distribution. And so we've done some work to try to determine how can we use the existing AI tooling that exists and has emerged in order to help us do that job more efficiently. And so that sort of internally focused effort, especially around the core product offerings that we have, is a place where we're targeting some of our efforts. Then on the external facing, specific product efforts, there's something we call like Anaconda Toolbox. Anaconda Toolbox exists within Jupyter Notebook and it also exists within the PY ext which we might talk about. And it's basically sort of a context-aware assistant that is customized in a way to be able to help people with Python-specific problems. So this is designed to have better context awareness of where the user is in their workflow and provide direct assistance to it. So one of the initial places, for instance, when we started building Anaconda Assistant, we noticed was that we were routinely, if we were operating in a notebook and operating in that REPL environment, we were constantly copy-pasting information from the notebook over to the chat window. Looking at what the chat window said, oh, I didn't provide the right context or it got a variable name wrong. The column name in my data frame is different than the one that I thought was there. So you were not only copying and relaying what problem you wanted the model to solve, you were also copying and relaying a lot of the context that the model needed to do a good job of solving it. And this sort of process went back and forth if you hit an error. 
Now I have to relay what the error is, I copy the error message and paste it over here. And then it might need to ask more information in order to get a good handle on how to solve it. So we thought, well, any place there's a situation where I'm copy-pasting context back and forth between windows on a routine basis is a really good place for me to see if I can provide more direct inline, in-workflow AI assistance. So Anaconda Assistant, the concept was really born out of that. And one of the core problems that we wanted to solve initially, we had these sort of user flows that we wanted to make easier, these kind of key pain points. One of them was around data visualization, for instance. So data visualization is sort of notorious. It's very powerful in notebooks, it's an amazing use case. But there's a bunch of different types of data visualization libraries. There's seaborn, there's matplotlib, Bokeh. Count them up. There's tons of them, depending on what you want to do. And they can have, depending on how complex the graph you want to make is, they can have complex APIs. So even people who are experts who work with those all the time often find themselves going back to the documentation and trying to figure out what are the right things that I do in order to format the graph in exactly this way. And so we thought, you know, I bet we can make that specific workflow a lot more efficient. And so that was one of the primary goals that we had with Anaconda Assistant. It's grown. We now have over 50,000 active users on it and we are now expanding some of those capabilities, and it obviously does a lot more now than it did at the very beginning, where it was really designed to focus on just a handful of workflows. But also the models that we're having run inference workloads behind it are much more powerful as well. 
So it's kind of interesting in a way to see those capabilities where we have an embedded AI workflow right alongside of what the user's already doing, get additional value, really, just by kind of adding a better model behind it. And suddenly things and kind of workflows that didn't really work before, all of a sudden they just kind of work. And so we discover those every time we update a model. But some of those things are.
[15:27]
This episode of Software Engineering Daily is brought to you by Capital One. How does Capital One stack? It starts with applied research and leveraging data to build AI models. Their engineering teams use the power of the cloud and platform standardization and automation to embed AI solutions throughout the business. Real-time data at scale enables these proprietary AI solutions to help Capital One improve the financial lives of its customers. That's technology at Capital One. Learn more about how Capital One's modern tech stack, data ecosystem, and application of AI/ML are central to the business by visiting capitalone.com/tech.
[16:07]
Kevin Ball
So, yeah, I think this is a great example because this is something we're seeing showing up in a lot of places, right? In the general software development ecosystem. Cursor is doing something like this where it's like, okay, we can have the chat connected deeply to your code, and simply by being able to automatically load the right context in the right places, we can go a tremendous distance. I'm curious maybe to dive in a little deeper about how you implemented, for example, this graph approach. Now, one challenge that folks consistently run into is around, is this in the training data of the model or not? How are you approaching it? So when you were implementing those very first graph assistant workflows, what did that look like? Is that some additional context being loaded in? Is it a little fine tuning on top of the foundational models? Like, how are you approaching making that happen?
[16:55]
Greg Jennings
Yeah, great question. So we actually, on that particular case, haven't done any actual fine-tuning. We've done a lot of prompt optimization. So we watched very carefully some of the cases where we had a lot of internal use of the application. We have some folks who've kind of consented to have their anonymous information improve the system as well. And so we kind of look at, what are the situations where it generates errors, and we track what errors actually get generated and we can go and look and see what specific requests led to those cases. And then we try to figure out, well, what kind of ways can we prompt the system in order to make that not happen? Right. And so a lot of the layer that's kind of sitting in between Anaconda and the raw sort of inference layer that we have sitting behind it, which is now in our bedrock, is in kind of adjusting and adding things to the prompt. And then the other thing, as you mentioned, is, well, there's a lot of information that exists within the developer workflow. So for Cursor or for normal workflows, it might be like, well, they have snippets of code and they're trying to inject snippets of code into the context window alongside the prompts so the model can reason about it more effectively. For us, it's what are all of the different variables that exist within the global scope of the notebook. If I have a data frame, what are the columns of that data frame? What are the things that the user has inside of the notebook in terms of the packages and the things that are available to them? A lot of these are sort of like things that we've already done, and a lot of them are things that are ongoing. We try to figure out what's the next piece of context that I can add to the system that's going to reduce the odds that somebody gets an error. And we track the number of snippets of code that we've generated that create an error and we try to manage to that. 
The goal being, obviously, that people can do, in a lot of cases now, full interactive workflows for things that are really great in notebooks, like interactive data exploration and other things that are great REPL-type workflows directly. With this sort of like Andrej Karpathy's vibe coding style, where you don't necessarily have to write a line of code, I would say that's not quite at a fully ready state just yet. You still have to know a little bit about how Python works, how notebooks work, in order to get that kind of experience. And obviously somebody who really knows a lot about what they're doing inside of those systems can use it to drive more complex behaviors. But for a lot of cases, I think using a notebook in that fashion to do things like interactive data exploration and asking questions about data sets, a notebook's actually the perfect user interface for that. It just so happens that it was kind of inaccessible, I guess, for a lot of people previously who might otherwise have been business analysts and relied on other tools. So we're optimistic that going forward we'll have the ability to bring the power of Jupyter notebooks to a lot more people because we think that AI is helping us unlock that and surface that in a much more accessible way.
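The context-injection idea described in this answer (variables in the notebook's global scope, data frame columns) might look roughly like the following. This is a hypothetical sketch, not Anaconda's implementation: `summarize_namespace`, `build_prompt`, and `FakeFrame` are invented names, and `FakeFrame` stands in for a pandas DataFrame so the example runs without third-party packages.

```python
def summarize_namespace(namespace: dict) -> list[str]:
    """Describe user-defined variables so a model sees names it would otherwise guess."""
    lines = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_"):
            continue  # skip private names and notebook bookkeeping
        description = f"{name}: {type(value).__name__}"
        # Duck-typed check: pandas DataFrames expose `.columns`.
        columns = getattr(value, "columns", None)
        if columns is not None:
            description += f" with columns {list(columns)}"
        lines.append(description)
    return lines


def build_prompt(request: str, namespace: dict) -> str:
    """Prepend the namespace summary to the user's request."""
    context = "\n".join(summarize_namespace(namespace))
    return f"Variables in scope:\n{context}\n\nUser request: {request}"


class FakeFrame:
    """Stand-in for a DataFrame so the sketch runs without pandas."""

    def __init__(self, columns):
        self.columns = columns


demo_namespace = {
    "sales": FakeFrame(["region", "revenue"]),
    "threshold": 0.5,
    "_i1": "notebook history entry",
}
print(build_prompt("Plot revenue by region", demo_namespace))
```

In a real notebook the namespace would come from the kernel's globals; the point is only that column names reach the model without the user pasting them into a chat window.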
[20:03]
Kevin Ball
I love that. Well, and one of the things that's really interesting, if you're hosting the notebooks, you're running the execution, you can do things like, a common loop I've seen in Cursor is you're trying to debug something and you'll ask it for help and it will suggest logs that you could insert that will help with debugging, but there's still then a copy and paste loop because you have to say, okay, here's where the logs are going. And here in your environment, you could potentially even introduce logging that isn't exposed to the end user that you intercept and just feed to the assistant for help.
[20:31]
Greg Jennings
Oh yeah, absolutely. And the way the system actually works, it has awareness, you know, when the system generates an error, we intercept that and we pipe that to the assistant. And so we catch the error and we actually offer directly in line to the user, do you want to fix this error? I see you got an error here. Would you like us to try to fix it? And most of the time it will suggest a fix that works. And so obviously that's another thing we track and we want to make sure that that experience continues to get better. And if they have an error, we're able to fix it. Ideally we identify enough of them and we're able to categorize them well enough that we start to minimize the odds they'll have an error in the first place. But those things I think are coming. But I think the notebook environment is just such a rich, powerful environment for those kinds of workflows. Especially when you're doing kind of things interactively, you're sort of exploring. And my personal feel is that actually in many cases it's a better testbed, a better approach for a lot of types of problems than starting off in a full IDE environment or starting off with a full code base. A lot of times I may just simply want to explore how well a particular function does, right? A single function, run a particular type of data against it, swap out a particular function and see if that performs better or performs worse. Right. Notebooks have a lot of notebook magic things like the ability to do timings of different runs, as you mentioned, sort of seeing a lot of the output directly in line and having that be attached to the specific cell. There's a lot of value there that I think is underexplored. And a lot of what we want to do with Anaconda Assistant is help to drive some of that capability into the community. 
Because I really think that there is potentially a new possibility to sort of get a lot more people to use and understand the value of notebooks and the sort of REPL-based development approach when combining it with AI.
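The intercept-and-offer-a-fix loop, together with the error tracking mentioned earlier, could be sketched as below. This is an assumption-laden illustration rather than how Anaconda Assistant works internally: `run_cell` and `error_rate` are invented helpers, and plain `exec` stands in for a notebook kernel.

```python
import traceback


def run_cell(source: str, namespace: dict) -> dict:
    """Execute one cell; on failure, package what an assistant would need to suggest a fix."""
    try:
        exec(source, namespace)
        return {"ok": True}
    except Exception as exc:
        return {
            "ok": False,
            "error_type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
            "source": source,
        }


def error_rate(results: list[dict]) -> float:
    """Fraction of executed snippets that raised, the metric to drive down."""
    if not results:
        return 0.0
    return sum(not r["ok"] for r in results) / len(results)


namespace = {"prices": [3, 5, 8]}
results = [
    run_cell("total = sum(prices)", namespace),
    run_cell("average = total / count", namespace),  # NameError: 'count' is undefined
]
for result in results:
    if not result["ok"]:
        print(f"{result['error_type']}: {result['message']} -> offer a fix inline")
print(f"error rate: {error_rate(results):.0%}")
```

The failed result carries the source, the error type, and the traceback, which is the context the user would otherwise have to copy into a chat window by hand.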
[22:30]
Kevin Ball
Yeah, I think you're absolutely right. I mean, one of the things that to me stands out about notebooks is it takes this iteration loop of exploration that previously was really only accessible to programmers working in environments that expose that. Like I came up in the Ruby on Rails days. And the fact that you had the console that you could actively explore your code base in and have that interactive loop was an incredible innovation and more and more environments provide that now. But notebooks are what gives that to data scientists. Oh, I can do this and see my data and do those things. I'm kind of curious to explore that direction a little bit more around, how do you see LLM-based assistants and agents working in the data science world and really expanding accessibility to understanding of data?
[23:18]
Greg Jennings
Yeah, I think you're correct that there is accessibility and understanding of data. First of all, the way people pull data in to start to do the exploration in the first place, I think, is something that AI can help to unlock. Organizations, large organizations, and not just large organizations, but even small organizations often have tons of data sets sitting out there in different formats, different tables, maybe in different structured data sets. Maybe they're sitting in an Excel spreadsheet somewhere. And a lot of that, people wouldn't really try to access normally because you don't really know what led to that data, what was the provenance of that data, how did it get created? I think that, you know, now having better use of AI throughout an organization can help people answer some of those questions, which maybe previously was sort of like the tribal knowledge. So if I, for instance, have a legacy code base and I see, well, it wrote this database in this form, it's actually reasonably straightforward now for me to put aspects of that code base into an LLM that has a giant context window and try to ask it to explain, and even ask it to make something like a Mermaid diagram that shows me what the process was that led to that data being created and then explain to me what the meaning is behind those tables. And then once it has that context, it can help me, for instance, write a query to get information out of those data sources that matter. Or it can help me to think about things like complex cross joins. Maybe I have multiple tables and multiple data sets within my organization that maybe wouldn't have been connected before, but now, you know, I think the ability for people to use language models to help to rapidly ingest a lot of information about how those data sets were created in the first place can help to facilitate that and get access to that. 
The second thing is within the notebook environment itself, like pulling data in from an external database. For a software engineer, it's mostly not too difficult usually, but for someone who's a business analyst, sometimes that can be really problematic. They don't necessarily understand how to set up all the things to be able to connect to the database. They don't know necessarily how to write the SQL query. They don't know what package they should pull in to do the SQL query, they don't know what package they should use to like set up a version of that database locally that is appropriate for analytic queries like a local DuckDB or something. And those are things I think that we can think about as enhancements to the developer workflow that are also very likely to really improve people's day to day working lives. You know, using code and using Python.
[26:08]
Kevin Ball
I love those examples because they're doing a lot of what I see as one of the most powerful things with LLMs of allowing you to translate intent into implementation, even without necessarily having to understand all the different underlying pieces in that implementation. So looking at that, then what are you working on now in notebooks or Anaconda Assistant to kind of bring that future to life?
[26:33]
Greg Jennings
Well, I'll circle back to the last point you just made, which is: yes, you can absolutely express the intent now in more ambiguous terms and the language model will fill in the gap as to what it thinks you meant or what probably makes sense to it. The old adage in machine learning, I think, was that all models are wrong, but some models are useful. I think the analog in the AI space is: all gen AI models hallucinate, but some hallucinations are valuable. And in this case, most of the time the hallucinations make really good, rational decisions, right? But there's always a danger, when you are expressing your request in natural language against something that is code, that it's going to misinterpret things in such a way that the code will run but maybe gives you an unintended result. So one of the things that we've put directly into the notebook experience is the ability to immediately explain, inline, any code snippets that the system generates. One of the things that I think Anaconda has really internalized as its mission is not just arming people to do their jobs more effectively in data science and with Python, but also helping them understand and learn about the Python ecosystem and how it works, and improve as developers and as practitioners. So it's one of the places where we have felt there's an opportunity to do that directly in line. People's focus and attention is already on that spot. That's one area that I wanted to make sure I touched on in answering your question.
[28:23]
Kevin Ball
I want to dig in on this because I think this is a really key and important thing to understand about LLMs and also to incorporate in application design. Right. A fundamental thing about how these things work is that they're going to make things up, because they don't have a core understanding of truth. I like to say they should not be your system of record. Right. They're an interpretation environment. Tools like Perplexity do a really nice job of, oh, I'm going to show you all the sources that I'm using to create an answer for you, and let you then follow through and verify with sources and do other kinds of validation of the correctness of the LLM-generated output. So it sounds like, if I'm understanding you correctly, in the notebooks you're doing something similar, where here's the generated code and it's going to explain it and link to things that will help you understand what it's actually doing.
[29:14]
Greg Jennings
Yeah. So there's a control within each cell, if you are running Anaconda Assistant, that gives you the ability to explain any cell that you've written in the context of the notebook. Say it generates a SQL query, right? You might say, well, I want to generate a SQL query that gives me sales by month. It interprets what you said in terms of sales by month as being all aggregate sales of all products. What you really meant was sales by month of your product, because you are individually only responsible for one business line. Well, it might not be apparent to you that that's the result it's giving when it gives you the graph, if you don't look at the code, right, and you just take at face value the code that it generates. It will probably generate working code, because models are really good at generating working code now. They've been trained and tuned specifically to do that. But the code it generates might not be the right answer. So I think it's a little bit like this: they're going to give you the answer that is probably the highest-probability set of responses given whatever data they were trained on. That's what's going to guide the response, and that just may not be appropriate in all cases. So we think it's very important to give people tools to help them understand what those things actually mean, especially in cases where it's a talk-to-data kind of application, right.
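[Illustration, not part of the conversation] The "sales by month" ambiguity described above can be sketched with Python's built-in sqlite3 module. The table, column names, product names, and figures are all invented for the example; this is not the query Anaconda Assistant generates. Both queries run without error, but they answer different questions.

```python
import sqlite3

# Hypothetical sales table; names and numbers are invented for illustration.
rows = [
    ("2025-01-15", "widgets", 100.0),
    ("2025-01-20", "gadgets", 250.0),
    ("2025-02-03", "widgets", 120.0),
    ("2025-02-10", "gadgets", 300.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sold_on TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Reading 1: "sales by month" as all products combined.
all_products = conn.execute(
    "SELECT strftime('%Y-%m', sold_on) AS month, SUM(amount) "
    "FROM sales GROUP BY month ORDER BY month"
).fetchall()

# Reading 2: "sales by month" for the one product line the analyst owns.
one_product = conn.execute(
    "SELECT strftime('%Y-%m', sold_on) AS month, SUM(amount) "
    "FROM sales WHERE product = ? GROUP BY month ORDER BY month",
    ("widgets",),
).fetchall()

print(all_products)
print(one_product)
```

A chart built from either result would look plausible, which is why reading (or having explained) the generated query matters: the only visible difference is the WHERE clause.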
[30:45]
Kevin Ball
It's actually a really interesting area to explore, because data has this interesting thing where the naive interpretation of a particular set of data may not always be correct. To pick an example, right: if you're looking at the outcomes of an experiment and you just look at the percentage difference between the experiment and the control group, and you don't look at how many samples were taken, you actually don't have any idea if a 2% difference is hugely significant or not significant at all. So I'm kind of curious: if you're exposing data analysis to more and more non-data-trained people, which is great, right, I think it's really valuable to broaden the access, how do you help them put that into context when they, for example, ask a question like, hey, is my experiment successful or not?
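[Illustration, not part of the conversation] The sample-size point can be sketched with a pooled two-proportion z-test using only the standard library. The conversion counts are invented, and the z-test is one common choice for this kind of comparison, assumed here for illustration. The same 10% versus 12% rates give very different p-values at different sample sizes.

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided normal tail probability via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))

# Same 10% vs 12% conversion rates, very different sample sizes (made-up counts).
small = two_proportion_p_value(10, 100, 12, 100)
large = two_proportion_p_value(10_000, 100_000, 12_000, 100_000)

print(f"n=100 per group:     p = {small:.3f}")
print(f"n=100,000 per group: p = {large:.3g}")
```

With 100 samples per group the difference is well within noise; with 100,000 per group the same two-point gap is overwhelmingly unlikely to be chance.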
[31:36]
Greg Jennings
This is a great question, and I think it would be disingenuous for me to say that we've figured that out. I think this is probably going to be one of the big challenges people have with providing additional capabilities to people with LLMs in general. I remember reading Judea Pearl's book and learning about Simpson's paradox, for instance, which is another common misleading case. It can happen in social sciences and medicine where, because you've taken different sizes of samples from different subpopulations, you can wind up with these really interesting reversals, or paradoxes, as they call them. So you might say, well, medicine A is better than medicine B for both men and for women, but medicine B is better overall. And you say, well, how can that possibly be right? And I think it's going to be really difficult to get an LLM to understand that it should explain those things to people as they're doing data analysis, and that it should look for those things, so that people don't draw the wrong interpretations from data, because drawing the wrong interpretations from data is also really, really easy to do. One of my favorite statistics quotes, from somebody I worked with a long time ago, is: if you torture the data long enough, it will confess. And it's true. You can manipulate numbers in a lot of different ways, and it's not necessarily that people would do it intentionally. Perhaps some people do sometimes, but it's more that it's very easy to have the wrong interpretation of data if you make certain mistakes: cherry-pick your endpoints incorrectly in a time series, or don't look at the subpopulations that you're sampling from, and then make a larger assessment of how those things actually roll up.
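[Illustration, not part of the conversation] The reversal described above (medicine A better for both men and women, medicine B better overall) can be reproduced with a few lines of arithmetic. The counts are illustrative numbers chosen to produce the effect, not data from the episode. The reversal arises because each medicine was given mostly to a different subgroup.

```python
# (recovered, treated) per medicine and subgroup; illustrative numbers only.
outcomes = {
    "A": {"men": (81, 87), "women": (192, 263)},
    "B": {"men": (234, 270), "women": (55, 80)},
}

def rate(recovered, treated):
    """Recovery rate for one group."""
    return recovered / treated

def overall(medicine):
    """Recovery rate with both subgroups pooled together."""
    groups = outcomes[medicine].values()
    return rate(sum(r for r, _ in groups), sum(t for _, t in groups))

for group in ("men", "women"):
    a = rate(*outcomes["A"][group])
    b = rate(*outcomes["B"][group])
    print(f"{group}: A={a:.3f}  B={b:.3f}")

print(f"overall: A={overall('A'):.3f}  B={overall('B'):.3f}")
```

An analysis that only looks at the pooled numbers would pick B; one that only looks at subgroups would pick A. Which reading is right depends on how the groups were assigned, which is the kind of context an assistant would need to surface.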
So I think these are all situations where we're going to have to, as a field, be very careful and cautious, and just be aware that, and I won't necessarily call this the era of AI-generated slop, but there's a lot of possibility for people to do their own hobbyist data science. And maybe they haven't seen these kinds of issues before. And for these specific purposes, I won't say we're nearly there yet, but one of our goals is in fact to make sure that the AI assistance that we provide for this specific purpose does a great job of this, because Anaconda is one of the core organizations that helped unlock data science to the world. So we think helping people learn about the things that they're writing, and giving an explanation of that as a second level of validation, is part of it. Giving people a better sense of all of the things that could go wrong with the way that they're thinking about interpreting the data is probably another part of it. But ultimately, I think we're a long way from not having a human expert at all to make these complex evaluations and draw conclusions. Right. You can certainly draw conclusions, but I think we're a long way away from just turning the keys over to the LLMs.
[34:53]
Kevin Ball
Going a little bit more in this direction of gaps: we've talked some about this goal of democratizing data and the ability to do things with it, and how the assistant is exposing more and more things to folks who might not otherwise do them. What other gaps do you see that you're trying to address and fill in this space?
[35:15]
Greg Jennings
So I think one of the other gaps that we've seen is that the emergence of AI models is changing the kind of applications that people write. More and more we see those applications being written in such a way that they incorporate an AI model into the workflow. I think perhaps another part of your question is, well, how do we as a field think about organizing and architecting our applications in a resilient way to account for that? People write tests that are basically deterministic, for the most part, and now all of a sudden I have a stochastic system that I've injected into the middle of my previously deterministic workflow. I don't think that we've really reckoned with what that means yet, and how tests themselves are going to have to change, how things are going to have to adapt. I will say, though, one of the things that we want to do to support that is, as people start to incorporate models into their local workflow, we want to provide a lot of information and metadata around the capabilities of those models, to help them at least start to make better decisions about which models are potentially useful for which use cases. So we have internally started to curate some models and provide those alongside our distribution. That's one of the things we're going to be rolling out in a more deliberate way this year. And as part of that, we want to make sure that we also bring information about where they're good and where they're not. That can help a little bit with that type of problem. But I think the whole field is going to have to understand this much more clearly. Right.
And we're sort of still in this era where AI is mostly used, you know, in chat applications, at least LLMs, you know, generative language models are mostly used in chat applications where people are bringing them into their workflows with some type of human controls. Still, for anything that's generating like an external user facing output, that's going to go out to a lot of folks. But when we start to directly bring those language models into applications, it's going to be different and we're going to have to think about how to change the way we're testing and evaluations of those models. And having a better idea around how to do structured generation outputs is going to be, I think, a lot more important.
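[Illustration, not part of the conversation] One way to test around a stochastic component, sketched under stated assumptions: instead of asserting an exact output string, validate the structure and invariants of each response across many runs. The `fake_model` function is a hypothetical stand-in for an LLM call, and the JSON schema (summary, trend, confidence) is invented for the example.

```python
import json
import random

def fake_model(prompt, rng):
    """Hypothetical stand-in for an LLM call: wording varies from run to run."""
    summary = rng.choice(["Sales rose.", "Revenue went up.", "Sales increased."])
    confidence = round(rng.uniform(0.6, 0.99), 2)
    return json.dumps({"summary": summary, "trend": "up", "confidence": confidence})

def validate(raw):
    """Check structure and invariants instead of comparing to an exact string."""
    data = json.loads(raw)
    assert set(data) == {"summary", "trend", "confidence"}
    assert data["trend"] in {"up", "down", "flat"}
    assert 0.0 <= data["confidence"] <= 1.0
    assert isinstance(data["summary"], str) and data["summary"]
    return data

# Seeded generator so the sketch is repeatable; a real model would not be.
rng = random.Random(0)
results = [validate(fake_model("Summarize Q1 sales", rng)) for _ in range(20)]
print(len(results), "responses passed structural validation")
```

The summaries differ between runs, so an equality test would be flaky, while the structural checks hold for every sample. Constraining the model to a fixed output schema is what makes this kind of check possible.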
[37:55]
Kevin Ball
Evaluation is kind of a good topic to go down, because I think one of the things that is coming up all the time is that if you're trying to build applications around these, you need evals for how they're behaving. And it almost reminds me of how everyone's now having to grapple with the stuff that previously only MLOps folks had to grapple with: okay, we're putting a new model in, how are we validating that it works for our data distribution and all these different pieces? Are you all building tooling for that as well? How do you do it internally?
[38:27]
Greg Jennings
Even internally, we have an evaluation framework that we have used on a number of the different... So I mentioned that we have internally captured a lot of information about the specific use cases that people have and the way they use Anaconda Assistant. We've done some topic modeling and organized some of those questions, and we've structured some evaluations around that, so that when we update the inference backend that sits behind Anaconda Assistant, we have a good sense of where it's good and where it's bad. Even for a better model, a lot of times the prompts that you've chosen might need to be different in order to get the right level of performance out of it, and most real applications have very specific ways that they do prompt injection and add context to the request. So I think you're correct. Any organization that's building any kind of application like this is going to have to have some kind of evaluation, and it's going to have to be specific to the problems they see. I don't think they'll be able to get it all right out of the box, because even the best AI models can fail in unexpected ways and unexpected places. You just won't be able to know a priori where to focus your evaluations. You have to do the manual back-and-forth testing, although people have done a good job now of figuring out ways to automate some of that. And we've done a little bit of this too, automating some of that with LLMs themselves: try to ask, what questions might users ask in this particular situation? I think that's a possibility. We might be able to use LLMs themselves to help fix this. But again, I think it's still going to require domain expertise, and it's going to require a lot of classical data science: looking at the data, understanding where the challenges are, and redirecting your efforts to address them.
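[Illustration, not part of the conversation] The idea of topic-organized evaluations that are re-run when the inference backend changes can be sketched as below. This is not Anaconda's internal framework; the topics, prompts, checks, and both backend functions are hypothetical stand-ins invented for the example.

```python
from collections import defaultdict

# Hypothetical eval cases: (topic, prompt, check applied to the response).
cases = [
    ("plotting", "plot sales by month", lambda out: "plot" in out),
    ("plotting", "bar chart of revenue", lambda out: "bar" in out),
    ("sql", "total sales per region", lambda out: "GROUP BY" in out),
]

def old_backend(prompt):
    """Stand-in for the current inference backend."""
    return "df.plot()" if "plot" in prompt else "SELECT * FROM sales"

def new_backend(prompt):
    """Stand-in for a candidate replacement backend."""
    if "chart" in prompt:
        return "df.plot.bar()"
    if "plot" in prompt:
        return "df.plot()"
    return "SELECT region, SUM(amount) FROM sales GROUP BY region"

def pass_rates(backend):
    """Fraction of cases passed per topic for a given backend."""
    passed, total = defaultdict(int), defaultdict(int)
    for topic, prompt, check in cases:
        total[topic] += 1
        passed[topic] += bool(check(backend(prompt)))
    return {topic: passed[topic] / total[topic] for topic in total}

print("old:", pass_rates(old_backend))
print("new:", pass_rates(new_backend))
```

Reporting per topic rather than a single overall score shows where a backend change helps or hurts, which matches the "where it's good, where it's bad" framing above.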
[40:31]
Kevin Ball
So taking things in a slightly different direction, you said even the best models have things they're good at and things they're not good at. And I remember I read a review at some point of all of the different Microsoft tooling that they put out, and one of the things that was completely panned was the Excel assistant. Now, you mentioned you have PI Excel, you're digging in there. Are you bringing your assistant into that environment as well? And how are you navigating that?
[40:55]
Greg Jennings
So we have our assistant inside of Excel in the form of an Anaconda Assistant integration with PI Excel Toolbox. PI Excel Toolbox is something that we created. Now, I should say we have a great relationship and a great partnership with Microsoft, and there's a whole capability set around PI Excel which allows people to run Python code within an individual cell inside of Excel. PI Excel Toolbox is a capability that sits alongside the Excel user experience, very similar to how the Jupyter notebook Assistant sits alongside the user as they're going through a Jupyter notebook. It can do things like help you understand data visualizations and help make some Python-based data visualizations. We have the ability to share code snippets back and forth between people working in Python and people working in Excel. We have something we've developed internally, which you may be familiar with, called PyScript, which is a WASM-based Python, plus the ability to run that natively in the web, sort of. And that is driving a lot of the capability there. And yes, Assistant is a part of that. But we're really focusing Assistant inside of the Excel experience on that: we want to have a first-class Python experience inside of Excel. We don't want to compete with Microsoft around making a generalized Excel assistant. I think they're much better positioned to do that than we are. But anybody who's used Excel for any length of time starts to hit certain barriers with what it can and can't do natively. A lot of people who are more advanced Excel users asked for a long time, can we bring Python code into the Excel workspace? This is really where we want to provide that additional level of assistance for all of those more advanced users who are engaging with Python and with those more advanced use cases inside of that system.
[43:02]
Kevin Ball
Got it. That makes a ton of sense. And thank you for clarifying the sort of constrained scope. I am curious from the implementation side then, how similar can that assistant be to the one that's running in notebooks? Like, is the context layer the only thing that's different or how does that work?
[43:19]
Greg Jennings
So there actually is a lot of similarity. We've leveraged a lot of the same front-end pieces. Now, I'd say building an Excel plugin is quite a lot different than building a Jupyter extension, so not a lot could be directly reused in all cases, but some of it is overlapping. On the back end there are different endpoints that we have, and they both accept context in different ways. Right. So the context for a Jupyter notebook might be things like: what is the data frame, what are all of the different fields and columns of the data frame, what's in globals, what packages are available. In Excel, it might be much more about what the user is looking at: what data, what cells are on their page. And I should mention as well that in that system, because of PyScript, we also have the ability to let people run Python code internally, in that little sandbox inside of Excel. And that would also be something that we would get into the context window. But the way the system mostly works is that it interprets the data as a data frame. Think about a lot of people who are using an Excel table; that's a common use case. We would take that Excel table and just read it directly into a data frame. And at that point, a lot of the same things that we've done to optimize the performance in answering questions or building data visualizations for how users should interact with data frames just translate.
[44:50]
Kevin Ball
Cool. Well, so we're getting close to the end of our time together. Are there any things that we haven't talked about yet that you think would be important to discuss for our audience?
[45:00]
Greg Jennings
Well, I think, for one, just that we recognize the entire software stack is changing. There are some other things we didn't talk about, like our work around AI Navigator, which is a local AI access point and control plane to allow people to work locally with some of those models that we're curating. We're excited about where we can take that. We're excited about being able to work with developers and practitioners to figure out ways to better streamline a lot of those workloads. On some of the problems that we talked about, around how people are going to be incorporating models directly into their applications to a much larger degree: we think the whole way we think about package management is probably going to have to change, right? Its scope is going to have to grow a little bit to accommodate that. Because now, all of a sudden, to run an application effectively, I might not just download a single conda package plus its dependencies, right? There might be external dependencies that I need to rely upon. Some of those external dependencies may be things that run on a remote somewhere. And more and more, I think people will want the same kind of experience that they've gotten with Conda, where I can just install stuff and it just works. They'll want that experience translated to a world where they have more than just Python packages, or even R packages, which we also support. They will also want the ability to run external models as a part of that, and potentially multiple external models. And I think this will be much more important as we think about a future world where models and those AI-enabled applications are run as agents, and I can have an agent which calls another agent.
All of those things I think are just tremendously exciting areas of exploration for us and we're working towards very interesting things in all of them.
[47:03]
Kevin Ball
Let me double-click really quickly on AI Navigator. So is this essentially an Anaconda equivalent to something like Ollama, or how would that fit into the mental model?
[47:13]
Greg Jennings
In some ways, you could think about it that way. Right. It has a lot of the same kind of capabilities as something like Ollama. It's really designed to be used as a little bit of a control plane. I think Ollama is a fantastic tool, and I use Ollama myself in many cases. It's similar in that I can download a model and very easily stand it up as a server. So we have that capability, but we also have internally the ability to run some specific agents on top of those models. So we're driving toward that environment where there are Python applications which include some type of model that you might run locally, plus the runtime for that model, and you can incorporate all that together. Because we are in this position where we've been a trusted source of a lot of the Python packages, and we've helped to drive a lot of those capabilities into the open source community, we think we're in a unique position to provide a lot of value to users by connecting that to model workflows also. So we want to pull that together. And AI Navigator, which we're developing, is this local control plane that's designed to do that.
[48:21]
Kevin Ball
Got it, got it. So it's not just, here's a model I can stand up and get access to; you can even package up an application that has an embedded model or is referencing a model and stand up a little endpoint that you can talk to. Now, is that connected to the package management as well? So I could have a dependency on...
[48:40]
Greg Jennings
Okay.
[48:40]
Kevin Ball
...AI Navigator with this agent and that model and this thing.
[48:44]
Greg Jennings
Yes. Yeah, absolutely. So this is what we're working towards. This is exactly what we're working towards. And we have a number of things working internally. Those capabilities aren't quite ready for full release yet, but absolutely, that's exactly what we're working towards.
[48:56]
Kevin Ball
That is super cool. Awesome. Well, this has been super fun and interesting. I love diving into this stuff. I don't think I have any more questions. Anything else before we wrap up?
[49:07]
Greg Jennings
I don't think so, Kevin. It was a pleasure talking to you, and I will definitely look forward to coming on again at some point and chatting with you again.
[49:15]
Kevin Ball
Awesome.