
Michael Mignano
Hey everyone, and welcome to Generative Now. I am Michael Mignano and I am a partner at Lightspeed. This week on the podcast, I'm sharing a conversation with Rajarshi Gupta, the head of machine learning at Coinbase. He joined us on the Generative San Francisco stage last year with my colleague Anand Iyer, a Lightspeed venture partner focused on crypto. This is a really interesting conversation about how Coinbase is incorporating machine learning into every part of the company. Rajarshi also answers audience questions and offers some great advice for founders. Enjoy.
Anand Iyer
So, welcome to Generative SF. My name is Anand. I work on crypto here at Lightspeed and I'm excited to have Rajarshi here. I'll have him introduce himself. And just as a reminder, this is a session that's focused on meme coin trading. Rajarshi is an expert. I'm completely joking. The AI folks here are genuinely freaked out, so I'm going to retract that statement right away. But why don't I turn it over to you, Rajarshi? Maybe you can give a background on yourself and everything from the genesis block to how we got here today.
Rajarshi Gupta
Yeah, sure. All right, so I did my PhD across the bay at Berkeley and then I spent 10 years at Qualcomm Research. At Qualcomm Research, the most fun thing I did in my career was that we built the industry's first on-device machine learning engine. This was way back, and we launched it in 2015. It was meant to catch malware on Android devices by looking at everything that was going on on the phone. And so, you know, I was really excited about it. This was my zero-to-one project. It was my idea. I was the only engineer on the project at the beginning, and then we built it and shipped it and so on. After my Qualcomm journey, I went and did a couple of startups. Both startups were in security. The first was a small startup called Balbix, which is a Mayfield startup down in San Jose. And then I joined a much bigger startup called Avast, which was actually the world's largest consumer security startup, with 500 million active users. And then we went public. Since then we merged with Norton for like $8 billion, and now the new company is called Gen. Then I joined AWS. I hadn't worked at a very big company. I joined AWS as a GM in the SageMaker team for a few years. And now for the last three years, I've been leading AI at Coinbase.
Anand Iyer
No, that's perfect. Thank you for being here. I don't think we talked about this Android experience. I'm curious, because that was pre-generative AI.
Rajarshi Gupta
Oh, totally.
Anand Iyer
I mean, malware on Android has been a thing forever. So I'm just curious, maybe let's unpack that, because I don't think we've talked about this before. What was that like? And was this model running on device?
Rajarshi Gupta
Oh yeah, the model was running on device. In fact, the entire model was written in C, and we had to write our own training algorithms because there were no tools at the time. The model had to be written in C because you couldn't put it on the regular Linux stack. We had to put it in the internal secure stack of the phone so that the model code itself could not be hacked. So that was really fun. We were learning how to do training, because nobody was really doing it well at the time. This is like 2011, 12, 13, right? We had to learn how to do training, we had to write the code, we had to write the training. Well, the training was happening offline, but the inference code had to be written in a very optimized way. It was looking at everything happening from the Linux layer all the way to the app layer on the Android devices, and it could literally catch a ton of malware. I don't know, most people here probably have iPhones, but if you had an Android phone: we shipped it from 2015 to 2019. It shipped in all the high-end Huawei, Samsung and LG phones. So eventually it shipped on over a billion chips. That was a really fun experience.
Anand Iyer
What was the learning process like? Because how do you know when you were successful and how did that kind of feedback loop go back into the models?
Rajarshi Gupta
So when we first started doing it, I remember, Qualcomm is a nice company. I think at the time I was a staff engineer. As a staff engineer, you have a cool idea and you could literally go up to the CTO and present the idea. He was coming to our office and I got a slot to present the idea to him, Matt Grob, and he basically looked at the idea and said, this is a crazy idea, it's never going to work. But if you think this is going to work, I'm happy to let you try. So go ahead and try it. I mean, this is what we are supposed to be doing, right? I wouldn't be doing my job if I didn't encourage you to go try this thing. So don't believe me. Just go and try it. And then I tried it. I was the only person at first, then I got a couple more people and a couple of interns to help me, and we built the prototype. For the prototype, there used to be a version of Android called, I'm forgetting the name. Basically you could take that version of Android and make changes to it on your own. And then we made it work on a real phone. There was real malware and it was actually catching and stopping it. And then there was the power of the demo. The demo was, hey, here's the real phone, this is the real malware. We had two phones. One was not catching it, one was catching it. And then people said, heck, this works, so let's try and build it together. And we did.
Anand Iyer
Amazing. Obviously, it's been a while since then, and you're now at Coinbase working on machine learning. It'd be great for everyone here to understand: what does your role entail? What do you do there? And I know you have a fairly large team as well that's pretty geographically dispersed.
Rajarshi Gupta
So the first thing we do, which is probably the most well known, is that my team makes sure that every transaction and every login is protected. As you might imagine, it's not just Coinbase. If you have a PayPal account, if you have a Visa account, every transaction is protected by a series of machine learning models. And in crypto in particular, there are a lot of people trying to attack your account. We protect literally every login, every transaction. So that's one piece of it. The second, fun way of thinking about it is: you open your app, and we get to decide what you see. Honestly, there are so many apps, so many assets, so many things. At every stage when you do a transaction, you see a bunch of other assets that you could buy, or you get notifications. Naturally, this is like Facebook or Instagram, right? Or LinkedIn. The notifications we send are targeted towards you. We decide what you get to see. So these are very traditional. One is traditional finance applications, one is traditional Web 2 applications. With the app, we have to do these things.
Anand Iyer
I have so many questions for you. Maybe we'll start with because you've been there for three years, and I feel like Generative AI.
Rajarshi Gupta
Oh, sorry. Yeah. I mean, I knew you were going to ask. I just told you about the rest of it. And then I have a whole big Gen AI team. But let's talk, let's get there.
Anand Iyer
Yeah, I'm curious. Maybe you could share as much as you can about the kinds of models you've developed. How much of it is off the shelf? Are you using any known flagship foundation models, and so on and so forth?
Rajarshi Gupta
I mean, one of the things we did was that when the generative AI explosion started, I would say around the end of '22. Well, basically GPT-3.5 came out, which was around Thanksgiving '22, right? So very soon after that, we made a strategic bet that this is going to be a game changer and we are going to invest heavily in it. Now, this was not a very good time for Coinbase, because this was at the bottom of the crypto doldrums and we had just had a big layoff and stuff. But I was like, all right, this is the technology. This is what is really going to help us. So we are going to focus. I mean, okay, we're not hiring, but we are going to focus everything we can on this. And I can tell you that we released our employee assistant in fall of '23, so it was pretty early. An all-company employee assistant was released just after Thanksgiving '23, actually.
Anand Iyer
And the employee assistant is. What does that do?
Rajarshi Gupta
Okay, so I'll tell you. What are the things we are doing with Gen AI? There are two halves to the projects. One half is to essentially build something for all our employees, and the other half is to build an assistant for all our customers. Now, naturally, the employee side is easier because you're talking to your own employees. You don't have to worry about as many things. The first big release was there. We're doing two things. The first one is very straightforward. We have an employee assistant with personas, where every one of them has an integration with Glean to be able to get access to your data, so that it can give you answers based on what you know. Enormously useful. A lot of people use it. Then there's a whole series of other things that my team initially built to help different people: things that help our designers, things that help our finance people. We built several of these ourselves. One that's the most popular in the company: we just went through our performance review time, so we have a performance review assistant. You basically take your notes or your bullet points and drop them in, and it writes your performance self-review for you in the Coinbase format, with the correct structure and the correct word limit, and then you need to edit it. This time, out of I think 3,200 people, 2,000-some people used it. So everybody across the company, all the way to the C-suite, uses it. We can't see what they're doing with it, but we can see that a number of people on our exec team are using it. So that's one example. Then we really wanted to build it as a platform. So we released much of the functionality as an API such that other people could build on it. The entire company doesn't have AI expertise, but we built it such that other people can use it as an API.
So somebody went ahead and built a really nice one, which is called an incident bot. When there's an incident going on, you get into the Slack channel, and if you look at the history of the Slack channel, basically somebody comes in and asks what the heck is going on. Somebody answers. Ten minutes later somebody else comes in and asks what the heck is going on, and by then things have changed. So now every person who comes in just asks the bot, hey, what's going on? It'll DM you back exactly what's going on. And this was built by that team. We have our data science team, who built a text-to-SQL bot. And then there are all these other things. So that's the other one. But this is for internal employees, right? Now for external customers, we released the first few releases, where of course we are planning to do an assistant for all your user journeys. The first user journey that we tackled is the obvious one, which is the support user journey. So we released our LLM-based chatbot in November. We're now handling, whatever, tens of millions of user requests. With that we took over the search on our site. So now if you do a search either on your phone or on the website, you'll get a Gemini-like answer first from AI. And now we are doing other things, and this hasn't been released yet, but we are working on things where a customer is trying to do some research. When a customer is trying to find out about certain types of crypto assets, which is pretty complex, we will provide insights and help you. And then this is where we are going to expand. Got it.
Anand Iyer
How do you evaluate and manage deployments of these kinds of experiences? Assistants, agents, models: what does that process look like?
Rajarshi Gupta
So the way we do that is this. This is called CB GPT, Coinbase GPT, and it is a platform. It's truly a multi-cloud, multi-LLM platform. We literally use models from Azure, GCP and Amazon. You asked the question earlier: the two biggest use cases, which are the chatbot and the help search, are both on Claude, but the load is shared between AWS and GCP. Now, this is one of my biggest problems with the whole LLM space, right? I grew up with machine learning. I've been working on machine learning for, I don't know, 15 years now. And before that I worked on statistics. Our entire life we've dealt with: here's a prediction, here's the confidence interval. All of a sudden you're in this space where here's a prediction and I don't know the confidence interval. So it's a very uncomfortable situation, right? For LLMs, you don't know how good the answer is. So you're having to do all these weird things, like using a different LLM as a judge. And it's a constantly moving space, right? You had an LLM, and you're using another one as a judge. Now all of a sudden this LLM is better than the judge, so you need a better judge. So this gets very hard. I don't have a solution. We're doing the same thing everybody else is doing. We have an evaluation portal where people can try their own ground truth sets, which suck, because people just don't have a sense of it. You're a normal user of the app. How do you know what's a good ground truth set to figure out whether you're doing well? And then we are doing LLM as a judge, and we are doing human evaluation, and we are doing curated data sets. Nothing fancy. It's just what everybody else is doing, because I haven't seen any good answer. So, lots of startups here: evaluation of LLMs is truly a great, great, great problem to solve.
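Editor's illustration: the combination described above (curated ground truth sets, LLM as a judge, and human evaluation) can be sketched in a few lines. This is a minimal sketch and not Coinbase's evaluation portal. `Example`, `token_overlap_judge`, `evaluate`, and the `review_below` threshold are all hypothetical names, and the judge here is a trivial token-overlap stand-in for the place where a second LLM would be prompted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    question: str
    reference: str  # curated ground-truth answer


# A judge maps (question, reference, candidate) to a score in [0, 1].
Judge = Callable[[str, str, str], float]


def token_overlap_judge(question: str, reference: str, candidate: str) -> float:
    """Stand-in judge (assumption): fraction of reference tokens found in the
    candidate. A real setup would prompt a second LLM here instead."""
    ref = set(reference.lower().split())
    cand = set(candidate.lower().split())
    return len(ref & cand) / len(ref) if ref else 0.0


def evaluate(
    model: Callable[[str], str],
    dataset: List[Example],
    judge: Judge,
    review_below: float = 0.5,  # hypothetical cutoff for human review
) -> Tuple[float, List[Tuple[str, str, float]]]:
    """Score every example with the judge and queue low scores for humans."""
    scores: List[float] = []
    needs_human: List[Tuple[str, str, float]] = []
    for ex in dataset:
        answer = model(ex.question)
        score = judge(ex.question, ex.reference, answer)
        scores.append(score)
        if score < review_below:
            needs_human.append((ex.question, answer, score))
    mean = sum(scores) / len(scores) if scores else 0.0
    return mean, needs_human
```

Because the judge is passed in as a plain function, swapping in a stronger judge when the model under test overtakes the old one only changes one argument. It does not solve the missing confidence interval that the speaker describes.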
Anand Iyer
You've gotten plugs in for a couple of Lightspeed portfolio companies already. We've got Glean, we've got Anthropic. Okay, we'll keep it going: Patronus, for evaluations. But that's really helpful. By the way, for folks who have questions, it's a very small, intimate crowd. So if you have anything that's coming up at the moment, feel free to fire away, but I'll park that for now.
Rajarshi Gupta
I'm on Anthropic's customer advisory board and they've been a really good partner. So for our chatbot, right? When we were releasing our chatbot, one of the biggest scares was that we were releasing it in June of 2024. If you just think back, not that many companies, like Uber, had a chatbot released at the time. So we're like, wow, we are really pushing the edge. And we were scared. And naturally the fear was about the guardrails, right? What if your chatbot says something? What if somebody... I mean, the New York Times front page scare, right? And they were very nice and they proposed to do a joint venture with us. So we built the guardrails model, a separate guardrail model, in collaboration. And they were super helpful, because we are one of their early customers, or big customers. And we did a really nice thing, and it saved us so much time. Yeah.
Anand Iyer
What's keeping you up at night these days?
Rajarshi Gupta
I think, I mean, honestly speaking, I tell people that for most of your life at work, the general feeling is that you're pushing a boulder uphill, and there's gravity pushing it down, and you're fighting something, right? Once in your life, the boulder is rolling downhill and you're chasing after it. Right now we are in that phase, and have been for the last two years. So it's awesome. I've been doing this head of AI kind of role for a while, and most of the time people are used to the way they do things, right? You have to go and convince people that, hey, this way is better, you're going to do this, and so on. And now all of a sudden the whole thing switches, and people just come and say, can you please do this for me? So I think that's the thing. The part that keeps me up at night is the fact that people don't plan for the amount of effort and the amount of sophistication it takes to make these solutions real. I'm not talking about the hype bubble or anything. It's just people's expectations. We can match the expectation, but that doesn't come for free. I mean, I've started telling people that AI is like magic, but before you can do magic, you have to go to Hogwarts for seven years. It's that seven years which is hard. And people don't get the fact that you can't just put in an LLM and it'll do the work. You have to do a lot of work in getting the plumbing right, doing the testing, doing the measurement, doing the analysis, and do seven iterations of it, and then it becomes really good.
Anand Iyer
That's a really interesting point. But do you use generative AI internally? You have a pretty large team. Are you using tools like Cursor or v0 or anything like that? How is the eng team starting to adopt it?
Rajarshi Gupta
That's a great point. And that is one of the cases where we made a straightaway buy-over-build decision. As we were beginning to look at generative AI, that was right when, if you remember, GitHub came up with Copilot, within like two months of 3.5 being released. I analyzed it with some of my team and said, whoa, this is a good product, and we adopted it. So right now in our company we have rolled out Copilot for everybody, and then we did Sourcegraph Cody, and just today we rolled out Cursor to everybody in the company, all engineers.
Anand Iyer
Amazing.
Rajarshi Gupta
So we think these are great. They're doing an extremely good thing, all the developers love it, and we are adopting it. The funny thing, though, and I'm sure every company is beginning to measure this, right? You read, okay, like Sundar Pichai said, 25% of code is being written by these things. So if you're the CEO, you think 25% of code is being written by AI. Awesome. I can have 25% fewer developers, or we're going to have 25% more time. But it turns out that developers don't code for eight hours a day. They only code for like two hours a day. The rest of the time they're trying to find data, find out what happened, do debugging and so on. So even if it's 25% of code, it's 25% of two hours. So you're saving like half an hour a day. I think the bigger advantage is going to come from these systems, or these agents, which not only can predict the next three lines of code, but actually understand the problem. It's a much harder problem. Software, if you think about it, and most of us have been software developers in our lives: you don't just sit and write code, all right? You spend a lot more time figuring out how to solve the problem. And then the actual coding of it doesn't take that long.
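Editor's illustration: the back-of-the-envelope arithmetic above, written out. The function name is hypothetical, and the calculation assumes that time saved is proportional to the share of code written by AI, which is the simplification the speaker makes. The eight-hour day is the figure he cites.

```python
def daily_hours_saved(coding_hours_per_day: float, ai_written_share: float) -> float:
    """Upper-bound estimate: hours spent coding times the share written by AI."""
    return coding_hours_per_day * ai_written_share


saved = daily_hours_saved(coding_hours_per_day=2.0, ai_written_share=0.25)
share_of_workday = saved / 8.0  # fraction of an eight-hour day

print(f"{saved} hours saved per day, {share_of_workday:.2%} of the workday")
```

Under these assumptions the saving is half an hour, or about 6% of the workday rather than 25%.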
Anand Iyer
Yeah, I was talking to a friend of mine who works on Gemini, and he was saying that that 20%, 25% stat was literally about auto completion of code, not about writing code itself. So that's kind of a misleading statistic. When we talk to some folks about some of the issues they're facing when it comes to the adoption of AI, usually GPU shortage comes up, quality assurance comes up. Is that something that's been on your mind, too?
Rajarshi Gupta
Oh. So I was giving a talk at GCP, Google Cloud Next, a few months ago, and one of the prompts they gave me was, what keeps you awake at night? And I wrote down in my slide: you guys not giving me enough GPUs is what keeps me up at night. And the person who was looking at it, she was like, I need to get that reviewed by somebody. But thankfully they were completely fine with it. Whoever reviewed it said, no, this is the right problem. So, to my big surprise, getting available GPUs was the biggest problem we faced, by far. For the entire last year, that was by far the one that caused me the most grief. At the beginning, we did our employee assistant and these assistive things for our agents and our developers. Great. You know, a thousand people, or like 600 people, are using it. You don't hit any bandwidth limits, right? Then you suddenly switch from 6,000 people to 6 million people. And then you realize that these bursts are coming and there just aren't any GPUs. So we had a couple of instances early in the year when it went down, and we really struggled with it and literally had to go and escalate it to both AWS and GCP. That was actually the reason why our main solutions are across both AWS and GCP. Literally, that's the only reason. There's no reason to do it otherwise. It's the same model. It's load balancing and the ability to get capacity and throughput in those places. And I don't blame them. I mean, I used to be a GM at AWS. I know Atul, who's the GM for Bedrock at AWS, very well. Their problem is that these new models are showing up every month. And when a new model comes, you don't know whether to run that model on 1,000 GPUs or 10,000 GPUs, and how the demand across models is going to switch around, and which ones are going to go to Llama, and so on. So there's not enough predictive ability on it.
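Editor's illustration: a minimal sketch of sharing load for the same model across two cloud providers, with failover when one runs out of capacity. Everything here is an assumption made for illustration: `CapacityError`, `call_with_failover`, and the provider callables are hypothetical stand-ins, not a real AWS or GCP client API.

```python
import random
from typing import Callable, Dict


class CapacityError(Exception):
    """Raised by a (hypothetical) provider client when it has no throughput left."""


def call_with_failover(
    prompt: str,
    providers: Dict[str, Callable[[str], str]],
    weights: Dict[str, float],
    rng=random,
) -> str:
    """Pick a provider by weight, then fall back to the others on CapacityError."""
    names = list(providers)
    # Weighted choice spreads normal traffic across providers.
    first = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    order = [first] + [n for n in names if n != first]
    last_error = None
    for name in order:
        try:
            return providers[name](prompt)
        except CapacityError as err:
            last_error = err  # try the next provider
    raise RuntimeError("all providers are out of capacity") from last_error
```

Since both providers serve the same model, the caller does not need to care which one answered. The weights only control how traffic is split under normal conditions.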
Anand Iyer
And just to unpack that a little bit more, can you tell us about what is the workflow? Because you're using specific hosted instances and putting these models, are they like specific weights or specific kinds of models that you're hosting on these GPUs?
Rajarshi Gupta
We have many different use cases. For some of our use cases, we have our own trained models based on the Llama family. These are hosted internally, but these are not high-bandwidth ones. These are typically the ones where we have some legal or security reasons that we don't want the data to go out. But these are small use cases, no problems, right? For us, and there may be other companies who are doing it differently, the big use cases are hitting these hosted models, particularly Claude, and now we have Gemini also. Now what happens is that even if you think of something as simple as a chatbot, which is the most common use case, it's not like the user says something and we send it to a chatbot. It's a chain with between five and nine LLM calls within it. Because we have to figure out: is the user saying something bad? What does the user mean? Because somebody says "can't send crypto," which is one of the most common ones. All right, you need to get more information, you need to get context. You have to do a RAG call, you have to get information, then you make a call, then you have to change the response so that it sounds empathetic, depending on what the answer is. So depending on which side of the flowchart you're on, it takes between five and nine calls. That's what we are doing. And when you have a million customers and you make nine calls for each, it adds up pretty quickly. Yeah.
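Editor's illustration: a minimal sketch of a support chatbot as a chain of LLM calls rather than a single call. The step names (`guardrail`, `classify_intent`, `clarify`, `draft_answer`, `adjust_tone`), the `llm` and `retrieve` callables, and the return values are all hypothetical. This sketch has fewer steps than the five to nine described above and is not Coinbase's actual flowchart.

```python
from typing import Callable, List, Tuple

# Hypothetical LLM interface: (step_name, text) -> model output.
LLM = Callable[[str, str], str]


def answer_support_message(
    message: str,
    llm: LLM,
    retrieve: Callable[[str], List[str]],
) -> Tuple[str, int]:
    """Run one user message through the chain; return (reply, number of LLM calls)."""
    calls = 0

    def step(name: str, text: str) -> str:
        nonlocal calls
        calls += 1
        return llm(name, text)

    # 1. Is the user saying something bad?
    if step("guardrail", message) == "unsafe":
        return "Sorry, I can't help with that.", calls

    # 2. What does the user mean? Vague messages take a longer branch.
    intent = step("classify_intent", message)
    if intent == "ambiguous":
        message = step("clarify", message)
        intent = step("classify_intent", message)

    # 3. Retrieval (the RAG lookup), then draft an answer from the context.
    docs = retrieve(intent)
    draft = step("draft_answer", message + "\n\n" + "\n".join(docs))

    # 4. Rewrite so the reply sounds empathetic.
    reply = step("adjust_tone", draft)
    return reply, calls
```

The call counter makes the cost visible: the number of LLM calls per message depends on which branch of the flow is taken, and it multiplies with the number of users.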
Anand Iyer
Is there a requirement for specific kinds of GPUs? Like, is there homogeneity that's needed for the kinds of models? Or like is there a specific kind of instance that you always need from GCP or aws?
Rajarshi Gupta
I mean, we try to get the biggest one we can get. We don't really specify, because we are not big enough to demand an isolated cloud instance. That's too big. So we are on shared instances. We basically work off latency and bandwidth.
Anand Iyer
I see. Got it. If you had to look out maybe a year from now, as you're starting to build up your expertise, and obviously Coinbase is on a roll and crypto is doing well, so there's added pressure on your team, I guess: what is your team going to be doing a year from now? What are the deliverables that you want to hit over the course of next year?
Rajarshi Gupta
I think there are two axes where we're really trying to make progress, and the two axes are in conflict with each other. Right. So one axis is that we always worry about the bull run. You know, a bull run is happening right now.
Anand Iyer
We don't worry about a bull run. Just to be clear, the opposite problem.
Rajarshi Gupta
We worry about a bull run all the time. I mean, honestly. For those of you who are in crypto, the meme forever has been: crypto goes up, Coinbase goes down, meaning the Coinbase site. This is the first time, and we are super proud of it, that through the election and the whole crypto run, our site stayed completely up with no problems, because of the enormous amount of investment we have had to make in the platform to make it all work smoothly. But of course, there is a capacity issue, right? I mean, we are handling what's going on right now just fine. But what if it goes up 10x? So that's one side of it. It's not just me. Everybody on the platform is worried, because we don't want that to happen. There's a lot of work that goes in, and we have some headroom, but I don't know how much. You know, we are estimating how much, but these things are very hard to estimate when that many users come. We do load tests and do everything. So that's one axis, and naturally that sucks up a lot of resources and thinking time. The other axis is that there are so many new features and new capabilities that we are trying to build. There are so many new spaces that we are trying to get into. As you might imagine, Coinbase is a very regulated company, because we end up getting hit with two sets of regulations. We get hit with the regular financial regulations because we have people's money. But then crypto tends to have its own sets of regulations in many, many jurisdictions. So we have a lot of people in the company, humans, whose job is to make sure that we stay compliant and follow the regulations and so on. And naturally these processes are not efficient. Laws change. Laws are written in many languages and so on. These are all use cases for AI.
For example, earlier, let's say a new law is written in the Philippines. It's published in the Filipino language, right? Somebody in the Philippines tells you that the new law was published. So you get hold of the law, you hire someone who can translate it, you wait three weeks, you pay them some fairly large sum of money. Three weeks later you have the translated version. Now we can literally do this in no time. So there are all these great use cases. There are many operations in the company and we are trying to optimize many of them. But that's a lot of work because of the Hogwarts seven-years problem. Most of these things are not geared for computers. They are geared for human beings. And we have to do a lot of software work to make sure that these work. So those are the two axes of problems.
Anand Iyer
That's super helpful. We'd love to hear from you folks. I'm sure you have questions for Rajarshi, so please tee them up. I'll ask just one more thing and then please start to fire away. There are folks here who are excited to start something. I'm sure there's an opportunity to squeeze in a request for startups, or something you want to see get built. What does that look like?
Rajarshi Gupta
Yeah, I love that. I'll give a quick answer. We talked about a few little things here and there. Like I said, for example, a fundamental startup that does real LLM evaluation in a scientific way would be super useful for the industry, and I'm sure you guys would get an enormous valuation. But there's a broader set, and you're reading this in the news right now, right? What is plateauing is the training gains. And this was known, because as early as GPT-4 we had pretty much given it all the written knowledge. The Internet was already in there. So you knew that only training gains were coming, not data gains anymore. But I think the enormous gap is from the capability of the technology to really solving the real customer problem, especially in enterprises. There is such an enormous amount of money in enterprise problems that can be solved with Gen AI. It is unbelievable. It's not easy, though, because in every company the processes are different, the tooling is different, the data pipelines are different. But companies that are going to be able to solve this... I mean, just take Glean, which we talked about, right? Enterprise search. You would think, Internet search, which is a 100x bigger problem, was solved in 2005. So why did it take 17 more years to solve enterprise search? It was stuck, right? It was so bad. And why did it take that long to do enterprise search? Because the enterprise plumbing and those things are so difficult. And somehow, this is a great company, and they managed to do a bunch of cool stuff in the space. There are so many problems. You look at anything: you look at HR, you look at finance, you look at legal. All these operational functions in a company are ripe for improvement. But they don't know AI. They don't have the data pipelines. So this gap between what the AI can do and what the real problem solutions are is huge.
So that's my advice. If you want to do a startup, figure out the startup that is using these tools but is solving the problem. The AI is not the hard part of the problem. That part has already been solved. But how can you use the power of the Gen AI models to solve this problem? That's where the big space lies. Awesome. Thank you.
Audience Member 1
Based on what you just said, isn't that what SAP is kind of positioning themselves for? Because they have all the data of all the enterprises, right?
Rajarshi Gupta
Every company is saying they're doing AI today. It doesn't matter what you are. Like a tire shop on the street is also using AI to change your tires. So that's not the point. Sure, SAP has the advantage and Salesforce has a ton of data and they are doing AI and they are doing a lot of AI. But yeah, I mean if they can solve it, great for them and they'll be even more valuable. But I am not seeing the solutions yet.
Audience Member 1
So that's my real question. So then what is the gap? Right? Because I don't really know. I don't know any solutions either. But we don't have the visibility because it's all within the corporation as well.
Rajarshi Gupta
That is exactly the problem. For a startup, the fact that the data isn't easily accessible is a problem. So here's what's happening. Every startup or every company that is in any space is saying, we're doing AI, but in order to do AI you have to build an AI team. We had this situation where, just as an internal trial, we wanted to pick one of our non-technical teams and do a project with them. So a couple of my guys went and built a thing, and these guys loved it. They were like, oh, this is such a useful thing, but we want an app for it. We are not an app-building team. Now, at the same time, this group is getting hit by many startups who are all saying, we are doing AI and we are doing this thing. So we encouraged them. Then they decided to do a bake-off. The bake-off was: there are these five companies, and they were going to use us as the benchmark and say, okay, we have the benchmark and we'll pick the best. It turned out that we were way better than all five of them. So these guys were like, well, this is free, because you guys have already built it. So we're just going to pay this internal team to build an integration with our external tool, and we're just going to use it. So the fundamental knowledge exists, but the people don't. There is a big skills gap. And if you are a startup building databases, or you are a startup building sales recommendations, I mean, if you're big enough, like Salesforce, you can build that team and you certainly have an advantage. But I think there is a lot of space for a startup to come in. In fact, if you are building on top of a platform like Salesforce, that's actually good for you, because Salesforce already has all the data. So you don't have to do integration with 20 things. You just do integration with Salesforce, you do the optimization thing, and it works. The hard case is when you have to integrate with, you know, 17 different enterprise tools.
Like in security, if you're trying to analyze security logs, it's such a big challenge because you know there are 50 different types of logs. But you know, that doesn't change my fact that there's a big gap between the technology and the solutions and there's a lot of money to be made here.
Eugene Chung
Hi, thanks very much. I'm Eugene Chung. Since 2013, I've been a bitcoiner and a mostly happy Coinbase customer.
Rajarshi Gupta
So if you held onto your bitcoin, you're really happy.
Eugene Chung
Yeah, I'm not much of a degen, but over the years I've been lucky to see the various cycles, the various hype. So there's obviously the DeFi summer, the ICO boom, the NFT craze, all these other things. Now, of course, we seem to be in a boom of agentic AI. We have things like Marc Andreessen giving the AI bot Truth Terminal $50K of Bitcoin and then having it, quote, invest. And now we have these meme coins like Goatseus Maximus and ai16z, unrelated to a16z, touching about a billion dollars, I think, in market cap as of today. So I'm curious, given Coinbase's history of reacting to market trends, is this a trend that's interesting to you all? And if so, where do you predict some of the integrations could be with agentic AI? Well, I'll call them agentic AI themed coins, meme coins, because a lot of them don't have much in the way of sophisticated AI.
Rajarshi Gupta
Yeah, so I think agentic AI on crypto, on blockchains, is very important, because if you just strip away all the hype and everything, blockchain brings certain very, very interesting characteristics. I mean, honestly, when I was interviewing at Coin, one of the questions was, what interests you? And my answer was all about blockchain, not about crypto as an investment vehicle. It was the fact that blockchain is the first technology that really makes distributed computing possible, because it provides the ability to do things provably, immutably and anonymously. Anonymous is not as important, but provably and immutably. And then it also has an incentive trading mechanism, which is Bitcoins. Right, okay, that's a lot of technical jargon, but in reality, one of my colleagues at Coinbase said it very, very nicely: an AI agent cannot own a wallet with cash, but they can own a crypto wallet. So I think that is a space that we love. I mean, crypto wallets are big for us, and they allow a wonderful mechanism for agents; this is about the only mechanism available for agents. I was just talking to people over snacks a little bit earlier: today on the Internet, if you want to exchange $20, you can Venmo and there are all these mechanisms, but if you want to send 0.02 cents, there's no mechanism. You can't really send 0.02 cents between each other. And micropayments are such an important tool for this kind of agentic distribution. And we can make that available, especially, to put in a plug for Base, if you come to things like Base, which has very, very low transaction fees. So we love this methodology. They're two separate questions, right? As an AI person, what do I think of agentic AI, and then, is crypto going to make agentic AI a thing? And yes, not crypto necessarily, but blockchains and the ability to exchange crypto payments solves a huge problem for agentic AI, which we absolutely adore.
And then the other answer is like, you know, do I really feel that agentic AI. Yes. I think once again, I mean, maybe the hype is. Hype is expanding on, like, what agents can really do, but agents can really solve problems. And, you know, the ability to take a problem, break it up into smaller things, put the answers back together. That's quite powerful. Quite powerful. And that's. I mean, to go back to my previous answer, that's a great way by which these complex enterprise problems are going to get solved.
Eugene Chung
Awesome. Thank you.
Audience Member 2
Hi, how's it going? And thanks so much. I was curious to hear more about the guardrail product and also just how you think about guardrails from a framework perspective. Right? So you're in a world where, hypothetically, a bot's LLM experience could be informational. That information could be generic, or it could be personal, or it could also be agentic and go take actions. And in your particular product, those actions can be quite expensive if done wrong.
Rajarshi Gupta
Yes.
Audience Member 2
So anyway, I'm just curious to hear a little bit more about the guardrails and how to think about the structure.
Rajarshi Gupta
You're absolutely correct. In fact, honestly speaking, we can take actions. I mean, forget LLMs; even before that, you could take actions on your bot, right? You could talk to a chatbot in the pre-LLM days, where you clicked to say which of these seven choices you want: I want to send, I want to do this. But forget our chatbot. I was just booking; we are traveling, doing a vacation, and I had to make some hotel changes. I went to Expedia. Nice chatbot, very controlled, old-school chatbot, but it says, are you trying to do one of these things? Yes, I want to change dates. All right, which one? It gives me the three options of my hotels I have. I said this one. Your dates are from the 23rd to the 25th. What do you want? I say I want to change it to the 26th to the 29th. Okay, this is the price I have. Do you want to do it? I said yes. And it does it. So you can take action, which can be reasonably expensive, I mean, thousands of dollars, and they'll do it for you. And we can do that too. It's not a stretch. Even pre-LLM chatbots were perfectly capable of doing a set of actions. And that's okay because we have other ML models that are looking: if it says send 75 bitcoins to something, something else will trigger and that'll stop your transaction, but not the chatbot-ness of it. To answer your broader question, which is how do you do guardrails? So guardrails are hard. They are easier for internal-facing products. I would say guardrailing is the main reason why, if you look at the slew of products we are releasing, most of them are internal and a few of them are external. But of course, the external ones are the largest scale, the largest money and so on. Now, in order to do guardrails, you have to have different levels of guardrail, right? So one level of guardrail is to make sure that you're not giving any extra information or you're not using a particular type of tone that you're not supposed to.
Another level of guardrail is the fact that it actually looks at what's coming in. So as you give it more capability, you have to keep upping your guardrail. You were absolutely bang on target: the first version we released was only informational. It would only look at generic information. The second version is one that looks at information on your account. So you have to have the guardrails to make sure that I cannot look at your account and give you information about Dylan's account. And then the third version is the one which can actually take action on your account. And for every one of them, you need to have the guardrails go hand in hand. So, no, there is no simple answer. It's testing, it's mechanisms. There's one very interesting use case of the guardrails that we hadn't planned on at all. We are using the guardrails to protect our human agents, because human agents get hit with intimidation, threats, things that an agent really shouldn't have to deal with, like abusive language and so on. We hadn't planned on it, but then we realized that, heck, we have this guardrail, and it was just completely fortuitous, right? We were in a meeting with the same CX team. They were talking about how one of them had visited the center and had seen the kind of things that they had been shown. And then we were like, wow. And then as they were talking, my product manager was just trying some 15 or 20 of them, and our guardrail engine caught every one of them and said, no. And then we said, wait, we already have a solution. We should just use this. So we of course had to change the... then it was just the software work, right? Because the protocol was that you did the chatbot, and if the chatbot didn't answer, it was going to the agents. Now every time we went to the agent, the agent had to make separate LLM calls. So a bunch of software work, no machine learning work.
So we are doing a lot of guardrails and sometimes you get freebies.
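The three capability tiers described in this answer (generic information, information about your own account, and actions on your account) can be sketched as layered checks, where each higher tier inherits every lower-tier check. This is a minimal illustration, not Coinbase's guardrail engine: the `Tier`, `Request`, and `check` names and the `MAX_UNREVIEWED_BTC` threshold are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Tier(IntEnum):
    """Capability tiers, in the order described in the conversation."""
    INFORMATIONAL = 1   # generic information only
    ACCOUNT_READ = 2    # information about the caller's own account
    ACCOUNT_ACTION = 3  # actions taken on the caller's own account


@dataclass
class Request:
    tier: Tier
    caller_id: str
    target_account: Optional[str] = None  # account the request touches, if any
    amount_btc: float = 0.0               # only meaningful for actions


# Hypothetical threshold; the talk describes separate ML models making this call.
MAX_UNREVIEWED_BTC = 1.0


def check(request: Request) -> Tuple[bool, str]:
    """Return (allowed, reason). Higher tiers also run every lower-tier check."""
    if request.tier >= Tier.ACCOUNT_READ:
        # A caller may only see or touch their own account.
        if request.target_account != request.caller_id:
            return False, "cross-account access"
    if request.tier >= Tier.ACCOUNT_ACTION:
        # Large transfers are stopped for review instead of executed.
        if request.amount_btc > MAX_UNREVIEWED_BTC:
            return False, "amount needs review"
    return True, "ok"
```

Under these assumptions, a request to read someone else's account is rejected at the second tier, and a 75 BTC transfer on your own account is rejected at the third.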
Josh
Really appreciate the discussion. I'm Josh, founder of ox. It's an AI job training and enablement platform for businesses. I want to touch on how important the confidence interval is in evaluating LLMs. It makes me think back to a previous life when precision and recall mattered and we used to measure each model iteration as it came out. Can you dream up any way to bring back some of that rigor to modern LLMs? How would you even start on that problem?
Rajarshi Gupta
So I don't have an answer. If I had the answer, I would really do it. And it's probably because, if you really think about it, and I'm aging myself here, my whole machine learning background is from pre-LLM days, right? The LLMs happened in the 2014, '15, '17 time frame, when I was already in companies and not doing a lot of original work anymore. So I don't have an answer here, and I'm not the best or most qualified person to do it. There are a lot of very smart people who are working on this problem in both academia and industry. And I am honestly a little bit surprised that this rigor is not coming out. I asked this question to someone in academia, actually, and the answer they gave is kind of there. Their answer was that the LLM, for the first time, is basically mimicking human speech, and all the mechanisms that we had developed about accuracy were from pre-LLM days, when they were trying to look at math. And all of a sudden here is something that essentially mimics humans, and we don't know how to measure that. Which is probably a good representation of the problem, but to me it's not an answer. I mean, we should as an industry be able to figure it out. And that's my ask, but I'm not qualified enough anymore.
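For reference, the precision and recall that Josh mentions are straightforward to compute once every prediction has a ground-truth label, which is exactly what open-ended LLM output lacks. A minimal sketch for binary labels follows; the function name and the example data are illustrative only.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for two equal-length lists of 0/1 labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)      # true positives
    fp = sum(1 for p, a in pairs if p and not a)  # false positives
    fn = sum(1 for p, a in pairs if not p and a)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative data: 2 true positives, 2 false positives, 1 false negative.
predicted = [1, 1, 1, 1, 0, 0]
actual = [1, 1, 0, 0, 1, 0]
precision, recall = precision_recall(predicted, actual)
```

With this data, precision is 2/4 and recall is 2/3. The difficulty raised in the question is that a free-form LLM answer has no obvious 0/1 label to feed into a calculation like this.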
Anand Iyer
It feels like maybe a lot of the focus has been on LLMs, right? And I think maybe, looking at 2025, we'll go from probabilistic to deterministic: SLMs that are more niche, more nuanced, that can avoid hallucinations and can understand how to work with guardrails, so evaluations become easier, more math-driven. So maybe this was the impetus we need to go from.
Rajarshi Gupta
Yeah, I mean, honestly speaking, look at how fast the thing changes, right? Now it's December, and in June, when we were working on it and releasing it, the biggest problem everybody was worried about was hallucinations. But especially for enterprise use cases, with small language models and RAG, we don't see hallucinations. I mean, we never really solved the problem, but it kind of went away, just by changing the constraint parameters. Once you put in constraints, the hallucination doesn't happen. Really, we rarely see hallucinations in a thing. I was like, all right. It says no. But yeah, I think you're right that SLMs are more deterministic. They're more accurate. I'm not sure if it's more deterministic. It's definitely more accurate.
Anand Iyer
Sure.
Rajarshi Gupta
More accurate.
Anand Iyer
Thank you so much for being here and spending some time with us. I, you know, we really genuinely and sincerely appreciate it.
Rajarshi Gupta
No, we love, love the questions, love the engagement. Thank you for inviting me.
Anand Iyer
Yeah, maybe a round of applause for Rajarshi.
Rajarshi Gupta
Thank you.
Michael Mignano
Thank you for listening to Generative Now. If you liked this episode, please rate and review the show and, of course, subscribe. It really does help. And if you want to learn more, follow Lightspeed at @LightspeedVP on X, YouTube or LinkedIn. Generative Now is produced by Lightspeed in partnership with Pod People. I am Michael Mignano and we will be back next week. See you then.
Host: Lightspeed Venture Partners (Michael Mignano)
Guest: Rajarshi Gupta, Head of Machine Learning, Coinbase
Date: February 6, 2025
Episode Theme:
Artificial Intelligence and Crypto at Coinbase
This episode dives into the ways Coinbase integrates machine learning and generative AI across its platform, both for internal operations and customer-facing tools. Rajarshi Gupta, with a rich background in security and AI, shares a behind-the-scenes look at Coinbase’s AI transformation, discusses the practical challenges of deploying large language models (LLMs), and offers candid advice for AI founders on navigating enterprise adoption and industry gaps.
"We built the industry's first on-device machine learning engine... It shipped in all the high-end Huawei and Samsung and LG phones. So eventually it shipped over a billion chips."
– Rajarshi Gupta (02:44)
"With LLMs, you don't know how good the answer is. You're having to do all these weird things as a different LLM as a judge..."
– Rajarshi Gupta (12:04)
"Developers don't code for eight hours a day... they’re trying to find data, do debugging. So, even if 25% of code is AI-written, that's 25% of two hours."
– Rajarshi Gupta (16:54)
"To my big surprise, getting the available GPUs was the biggest problem we faced. Like, by far. Like entire last year."
– Rajarshi Gupta (18:49)
"The AI is not the hard part of the problem. That part has already been solved. But how can you use the power of the gen AI models to solve this problem?"
– Rajarshi Gupta (28:37)
"Blockchains and the ability to exchange crypto payments solves a huge problem for agentic AI, which we absolutely adore."
– Rajarshi Gupta (33:26)
"We are doing a lot of guardrails and sometimes you get freebies."
– Rajarshi Gupta (39:30)
"We should as an industry be able to figure it out. And that's my ask, but I'm not qualified enough anymore."
– Rajarshi Gupta (41:29)
On the transformative power of AI:
"AI is like magic, but before you can do magic, you have to go to Hogwarts for seven years." — Rajarshi Gupta (15:10)
On enterprise opportunity:
"There is such an enormous amount of money in enterprise problems that can be solved with gen AI. It is unbelievable. It’s not easy though…" — Rajarshi Gupta (28:02)
On integrating AI in regulated environments:
"Coinbase is a very regulated company…we have a lot of people in the company, humans whose job is to make sure that we stay compliant…these are all use cases for AI."
— Rajarshi Gupta (24:30)
On crypto as a platform for AI agents:
"Crypto wallets are big for us and the agents allow a wonderful mechanism…an AI agent cannot own a wallet with cash, but they can own a crypto wallet." — Rajarshi Gupta (33:07)
| Timestamp | Segment |
|-------------|---------|
| 01:05–03:53 | Rajarshi’s background; story of the first Android ML engine |
| 05:31–06:41 | ML at Coinbase: security, personalization, gen AI team |
| 07:47–11:19 | Employee assistant, internal/external AI tool development |
| 11:27–13:15 | Evaluation and management of LLM deployments |
| 13:34–16:05 | Anthropic partnership, guardrails, launch anxiety |
| 16:18–18:40 | Developer tools adoption, productivity nuance |
| 18:40–22:27 | GPU availability struggles, platform scaling |
| 23:18–26:34 | Future challenges: regulatory compliance, AI in operations |
| 26:34–29:47 | Enterprise AI startup gap, integration is the challenge |
| 32:05–35:33 | Agentic AI and blockchain synergies |
| 35:34–39:53 | LLM guardrails: implementation, levels, human protection |
| 39:53–42:50 | The challenge of LLM evaluation and measurement rigor |
Rajarshi Gupta’s perspective underscores the real-world complexity of enterprise AI adoption—where model performance, productization effort, and organizational change outpace the “magic” of any new model. He spotlights the critical, lucrative gap for startups in turning AI’s raw capability into frictionless, compliant, integrated enterprise solutions.
“The AI is not the hard part of the problem. That part has already been solved. But how can you use the power of the gen AI models to solve this problem? That's where the big space lies.”
— Rajarshi Gupta (28:37)