Host / Interviewer
I mean, you have a really interesting background. You were Geoff Hinton's first hire at Google Brain in Toronto, is that right? He very generously explained neural networks to me. This was 2017, and that really woke me up to what was going on. Since then I started the podcast, and I've talked to a lot of people. The big companies are focused on AGI. You guys are not; you're focused on the more practical problems of getting AI to work in the enterprise. So why don't you introduce yourself: give your background, how you ended up with Geoff at Google Brain, and then how you joined Cohere, or founded Cohere with Aidan. I don't know how many of you were there at the founding. And then what Cohere has been doing for the last couple of years. And then we'll talk about everything that's in the air these days.
Nick Frost
Yeah, happy to. So I'm Nick Frost. I'm a co-founder of Cohere. Prior to this, I was a researcher. I had the pleasure of working with Geoff Hinton at the Toronto Google Brain group for a few years, working on foundational research. I first got introduced to neural networks when I was at the University of Toronto in undergrad. I was pretty excited about the idea of AI at the time and learned about neural networks in the cognitive science department, actually, and thought they made a whole lot of sense and were really exciting. And I remember learning about them in, I think, 2012 or something, which was shortly after AlexNet, which was the paper that really showed
Host / Interviewer
Yeah.
Nick Frost
That neural networks were the best form of machine learning, I think, at that time.
Host / Interviewer
Right.
Nick Frost
And I remember learning about that in 2012 and thinking, wow, if only I had been here a little bit sooner; I had kind of missed it. That turned out, of course, not to be true. Neural networks have remained exciting since then, and I've been working in and around them ever since learning about them back then.
Host / Interviewer
Yeah. And on founding Cohere, how did you meet Aidan?
Nick Frost
Yeah, so I knew Ivan, the other co-founder, from the University of Toronto. We were both there at the same time. Aidan and I were also there at the same time, but we didn't really meet. So I properly met him when he was working at Google Brain as well. He came up to the Toronto office for a summer and then later came back and told me about Transformers, after having helped write the paper that introduced the Transformer architecture. We were all very excited about that and realized that there was an opportunity and a need for a company to make Transformers useful for the enterprise. And we've been doing that since.
Host / Interviewer
Yeah. And famously OpenAI jumped on the Transformer algorithm and scaled it up to the GPTs. Was that already happening when you guys started Cohere, or were you kind of at the same starting line?
Nick Frost
So we started quite a while after OpenAI started. We were founded at the end of 2019, beginning of 2020. So were we at the same starting line? No, we were a little while later, and they had been working on scaling up Transformers. We founded Cohere after the first GPT paper, but before the second one.
Host / Interviewer
I see. But the first model that you built, was it a similar sort of architecture to what OpenAI built?
Nick Frost
Yeah, I mean, all the architectures are the same architecture. Hilariously, the models that we're using now are still transformers, the architecture from the paper that came out in 2017. So we're almost a decade into pretty much the same architecture. There have been changes in scale and changes in how you train them, and now we're seeing changes in combining them, like having multiple models together and then using some switching mechanism to choose which one to use based on a query. But all of the models themselves are still transformer models. They're still the exact same architecture that was introduced by those authors back in 2017.
Host / Interviewer
Yeah, I've been talking to people about post-transformer architectures and world models, and there are interesting things happening. But most of the post-transformer architectures, I mean like Mamba or things like that, they're really based on transformers. They have these different tricks and things, but ultimately it's still transformer-based.
Nick Frost
Yeah, I think the transformer happens to be very stable and easy to scale. And scale is what has been making these language models suddenly useful. I mean, I myself worked on a variety of other architectures when I was working with Geoff. We were working on a few different forms of capsule networks. They were very cool and had really interesting properties, but did not scale well at all. And so they are not relevant from a practical aspect. They're not used in industry at all today, mostly because they don't scale well.
Host / Interviewer
Yeah. I think the last time I talked to him was about capsule networks.
Nick Frost
Yeah. And I think transformers have some of the nice properties of capsule networks, but are way simpler, way easier to scale, and line up way better with the hardware that we have, and so can be much, much more useful.
Host / Interviewer
Yeah. So you've also worked quite a bit on distillation, which I remember Geoff was working on the last couple of times I talked to him. And Cohere famously has sort of eschewed the AGI goal. I mean, that's not your stated goal, as it is for many of the companies coming out of the transformer revolution. Can you talk about what Cohere's focus is and how distillation, for example, plays into it, or what other
Nick Frost
Yeah, let me handle those two questions separately. I don't think they have a ton to do with each other. So, to address the goal of Cohere as it relates to AGI or not AGI: when people talk about AGI, I mean, we've been talking about it for many years now, but there's still ambiguity about what they actually mean.
Host / Interviewer
Yeah.
Nick Frost
When someone says AGI, the way that I interpret it is a computer that you treat like a person, that for all intents and purposes you expect to behave as a person. You ask it the things you would expect a person to be capable of doing, and it does those. When you're not asking it to do something, it continues to behave as a person. That's kind of what I mean. And then when people say ASI, artificial superintelligence, they mean that, but instead of a person, a god. So I don't think we've built AGI and I don't think we're going to anytime soon. I don't think the transformer behaves like a person, and I don't think making better transformers makes them more like people.
Host / Interviewer
Yeah, I.
Nick Frost
The analogy I use, which I think was first introduced by Feynman, is the artificial flight analogy. That's to say, flight is a property. There are biological systems that can fly, like birds. There are artificial systems that can fly, like planes, and they fly in completely different ways.
Host / Interviewer
Right.
Nick Frost
A bird flies by generating lift by flapping its wings down. A plane flies by generating forward propulsion and then having a wing shape that results in a difference of pressure between the top and the bottom of the wing, and that creates upwards lift. They share some things. The wing designs are kind of similar, and a plane can glide in the same way that a bird can glide, but they don't generate upwards lift in the same way at all. And that means that while planes are super useful and have completely changed the economy and even our understanding of the world and social dynamics and all kinds of things, there are lots of things that they can't do that birds can.
Host / Interviewer
Right.
Nick Frost
We've made nothing as efficient as an albatross. We've made nothing as reactive as a bat. We've made nothing that can hover or turn on a dime like a hummingbird. The flip side is, planes can carry insane amounts of weight, they're way larger, and they can go way faster than birds. We've done all kinds of things that an artificial bird machine, like an ornithopter, wouldn't be able to do.
Host / Interviewer
Yeah.
Nick Frost
So I think we're at a very similar place with intelligence. We've definitely made artificial intelligence. It's just not the way humans do intelligence. And subsequently it can do lots of things that people can't do. A transformer at this stage can summarize documents better than me, can find answers to questions in mountains of unstructured data better than me, can figure out tool chains better than me, and can do mathematical reasoning better than me in many cases. But it can't understand nuance, it can't understand cultural dynamics, and it can't work on its own in a goal-directed way, in the way that a person can. I think anybody who uses an LLM is aware of those limitations. They figure out how to make LLMs useful, and they are really useful, while understanding the pretty huge difference between working with a person and working with an LLM. All of that was a preamble to say that I don't think we're going to make AGI. And when I say we, I mean humanity. I don't think anybody's going to make AGI. That used to be a controversial opinion. Two years ago, I would say that, and people would be like, nah, look at what we're getting. At this stage, that's the widely held opinion outside of the very small Silicon Valley rooms. That's the widely held belief in all universities, and for anybody who's paying attention who isn't doing that within a small room in Silicon Valley. So that's not our goal, because I don't think it's possible right now, and I don't think we need to do that in order to make something super useful. I think LLMs are massively transformative, massively useful, and they can still be adding so much more value to the world than what they're adding right now. So our goal is to make these useful for the enterprise, useful for people working with them, and we've made a lot of progress on that. There's more progress to go; we have more work to do. And so that's what we as a company have been focusing on since we started.
Host / Interviewer
Yeah, and what are the practical steps or practical directions that you've taken Cohere in to address that? That's why I mentioned distillation. I know Geoff was talking a lot about distillation at the time that you were working with him. And I know that your thesis is providing AI that's capital efficient, I think is the term that you guys use, for the enterprise. In other words, you don't need these massive models. You're more concerned with trust and reliability and capital efficiency and working with regulated industries. Can you talk about all of that? Where do you take transformers to get to that?
Nick Frost
Yeah, that's exactly what we've been focusing on. So when I say we're making it useful for the enterprise, I mean we're making it an efficient solution, so that it doesn't just burn through money and instead provides value above and beyond what you're paying to use it. So we make very efficient models that can be deployed on a small number of GPUs. We make models that can be privately deployed, so that they can access secret private data without leaking that data to your competition. We make models that can be customized. We work with some of our partners to make models for particular areas, particular languages, or particular industries, so that they're really good at the things that the companies want those models to do, as opposed to being really great at generating images or doing mathematical reasoning, which it turns out is not the majority of the work that's done out there in the world. So we make things that are useful for our customers. We also work a lot on frameworks for using those models. It's one thing to have a model; it's another thing to have a system that your employees can use to help them with their work. So we work a lot on things like that. You mentioned distillation. Distillation is a little bit of an overloaded word these days. When Geoff first introduced it, he was calling it dark knowledge, which was a very exciting term. And what he meant at that time by distillation was logit distillation. I don't know if your audience is mostly machine learning engineers. I assume it's not. So we could break down what logit distillation means.
Host / Interviewer
Sure.
Nick Frost
Yeah. So a neural network, and even a transformer: we think about it as outputting one word at a time, or people think about it as writing a sentence. That's not really what they do. What they do is give you a probability distribution over what words are likely to come next. When you give a prompt to a model, the very first output is a probability distribution over all the words that could come next. And then you pick one word. You can base it on the probabilities themselves, or you can just say, I'm going to take the most likely next word. Those are the various sampling techniques. People talked about that a lot in the beginning of this. At this stage, we've kind of figured it out: you can use a pretty standard sampling technique, and that works. Logit distillation is when you train one model based on the output of those probabilities from another model. And the dark knowledge: when Geoff first coined the term, he was able to show that you could take a really big model trained on image classification, and then you could train a small model on the output probabilities from that big model, and it would learn a bunch of things much faster. Imagine you're trying to recognize handwritten digits. This was kind of the canonical machine learning task for a while.
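As a rough illustration of the step described here (a probability distribution over next words, then either taking the most likely word or sampling in proportion to the probabilities), the following is a minimal sketch. The four-word vocabulary, the logit values, and the function names are all hypothetical and chosen only for the example; real models work over vocabularies of many thousands of tokens.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(vocab, probs):
    """Always take the single most likely next word."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best]


def sample_pick(vocab, probs, rng):
    """Draw the next word in proportion to its probability."""
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical toy vocabulary and logits for a prompt like "The cat sat on the"
vocab = ["mat", "sofa", "roof", "moon"]
logits = [3.0, 2.0, 1.0, -1.0]
probs = softmax(logits)

rng = random.Random(0)  # seeded so repeated runs draw the same word
print("greedy :", greedy_pick(vocab, probs))
print("sampled:", sample_pick(vocab, probs, rng))
```

Greedy picking is deterministic, while sampling can return any word with nonzero probability, which is why the two strategies can produce different continuations from the same distribution.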
Host / Interviewer
Yeah.
Nick Frost
And you would have one model that was really good. And so its output probability distribution would say, hey, it's a three. It's definitely a three, but it looks a little bit like an eight. And if you trained a small model on that probability distribution, it would do a lot better than if you trained it by just saying, hey, it's a three. So that was distillation. That's not really done much these days in the world of transformers. In part, that's because it's a really computationally intensive process. If you want to train a small model from the logit distribution of a big model, you've got to be running that big model all the time, and that takes up a lot of time. Also, when you're recognizing handwritten digits, the probability distribution is over zero through nine; those are the digits. When you're looking at the probability distribution from a transformer, it's over every word. So it's a huge probability distribution, and it's just not as effective. So we're not really doing anything like that. But your question comes from the right place, which is that we're interested in making efficient models. One way we could be doing that is with distillation. We're not, but it makes sense. It's a good thought.
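The handwritten-digit example can be made concrete with a small sketch of the two training signals being compared: the teacher's full probability distribution ("a three, but a bit like an eight") versus a bare label ("it's a three"). The logit values below are invented for illustration, and the temperature parameter is an assumption borrowed from the usual formulation of distillation rather than something stated in the conversation. This shows only how the loss is computed, not a training loop.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy of the predictions measured against a target distribution."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)


# Hypothetical teacher logits over digits 0..9 for an image of a "3"
# that also looks a little like an "8".
teacher_logits = [0.0] * 10
teacher_logits[3] = 6.0
teacher_logits[8] = 3.0

# Hypothetical small student that is only mildly confident it is a "3".
student_logits = [0.0] * 10
student_logits[3] = 2.0

T = 2.0  # assumed temperature; softens both distributions

# Soft targets carry the "looks a bit like an 8" information.
soft_targets = softmax(teacher_logits, T)
# A hard label carries only "it's a 3".
hard_target = [1.0 if d == 3 else 0.0 for d in range(10)]

distill_loss = cross_entropy(soft_targets, softmax(student_logits, T))
hard_loss = cross_entropy(hard_target, softmax(student_logits))

print("teacher P(3), P(8):", round(soft_targets[3], 3), round(soft_targets[8], 3))
print("distillation loss :", round(distill_loss, 3))
print("hard-label loss   :", round(hard_loss, 3))
```

The cost mentioned in the conversation shows up directly in this setup: producing `soft_targets` requires running the large teacher on every training example, and for a language model the distribution spans the whole vocabulary rather than ten digits.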
Host / Interviewer
So how are you doing it?
Nick Frost
Yeah, a lot of it comes from just going at that angle specifically and figuring out what our customers want the models to do, and what is kind of cool but not really relevant to real-world enterprise use cases. Generating images is a good example. You can't generate a meme with our model. There's a really good consumer application for that. I like using generative images to make jokes. It's fun to put up your picture and see it as a cartoon; that's a good time. It turns out it's not really relevant for a business. There just aren't that many use cases, certainly not in the industries and among the knowledge workers we work with. And so if you just don't train your model to do that, you can save a bunch of parameters, you can make the model a lot smaller, and it can still be really great. Now, the flip side of that is there are lots of things that enterprises do really care about, like tool use, like enterprise search, like being able to answer a question based on a really complicated technical document with image input and graphs and schematics and things like that. And so we train the model to be really good at that stuff. That's how we're able to make models that are efficient and subsequently much easier to deploy in real-world environments. Just to call that out: the model that we released recently is called Command A Reasoning. It's a reasoning model trained on enterprise reasoning. It requires two GPUs to run. DeepSeek models take about 16 GPUs. Lots of the other models take significantly more. So there are models that we outperform while being, you know, eight times as easy to deploy from a hardware perspective.
Host / Interviewer
Yeah. And are you starting with a base model and then training it on highly curated data for a specific use case? Are you working with open-weight models and fine-tuning them? How do you get to the small model?
Nick Frost
So you could do that; you could take an open-source model and then try to fine-tune it. But the reality is most of the underlying capabilities of a model come from the pre-training phase. So we train our own models from scratch. There are about 10, call it 10, companies in the world that make foundational models from scratch. We are one of them, and we are unique in that group in our singular focus on the enterprise.
Host / Interviewer
Yeah. And how do you make a smaller model, a more capital-efficient model, if not through distillation, if not through pruning? Through data curation?
Nick Frost
Well, there are some techniques that we use that I won't tell you about. But the short of it is you just choose an architecture that works, you're principled about the data you put into it, you train it for as long as you possibly can, and you focus on the things that give real ROI to businesses, as opposed to cool, flashy tricks for consumers.
Host / Interviewer
Yeah. And that focus is primarily through the quality and scope of the data that you're training it on?
Nick Frost
Yeah, yeah. Data curation and a handful of other techniques.
Host / Interviewer
Yeah. So you can train a model to be good at, I don't know, what would be a
Nick Frost
A really common use case is this: I have a model, I have a bunch of documents, and I have a bunch of tools, things that my company uses, like my HR software or my sales software. I want to ask the model to answer a question and write a document about it, cross-referencing all of this various data and these tools, and then I want it to update another tool based on the output of that document. You can imagine that generic description covers a lot of what people do behind a computer. And so we make a model that's really good at thinking through those kinds of things.
Host / Interviewer
Yeah. And the training data, is its training set focused on business problems? I mean, how
Nick Frost
So we still train on the open web, on the text that's available for training from the web, as a starting point, and then further refine it on stuff like I've just described.
Host / Interviewer
Right.
Nick Frost
And add that into the pre training.
Host / Interviewer
And how many models have you guys deployed so far?
Nick Frost
How many models? The whole industry, all of us, have been releasing about a new family of models once a year, it kind of seems.
Host / Interviewer
But your models, are they domain-specific, or are they just more efficient and more
Nick Frost
Yeah, they're more efficient, they're easier to deploy, and they outperform on the things that our enterprise customers care about.
Host / Interviewer
Yeah. Can you give us a few use cases? And from what I've read, you're focused on regulated industries. So these models are small enough to deploy on premises or
Nick Frost
Yeah.
Host / Interviewer
Virtual private cloud or something.
Nick Frost
That's exactly right. It turns out that for a real-world application, a language model is really only as useful as the data it has access to. And a lot of the useful data is, for very good reason, private, secret data. It's data that you can't just send over the wire to some server somewhere. You can't have a company train on that, for various legal reasons and for various strategic reasons. And so when we work with a customer, we don't ask them to bring their data to us; rather, we bring our model to them. When we deploy, we deploy in a customer's environment, whether that's the cloud that they're managing or a virtual private cloud or a GPU that they have on a shelf somewhere. We'll deploy wherever we need to deploy in order to be useful for that customer. That has opened up a lot of real-world applications that you would not be able to use an LLM on otherwise.
Host / Interviewer
What are some of the applications that.
Nick Frost
Yeah. So a large customer of ours that's using our models and our agentic platform is the Royal Bank of Canada, RBC, one of Canada's largest companies. They're using our models across the company for things like doing analysis on quarterly earnings reports, that's a good example, or working through documentation on a bunch of different things. But it's mostly being used by the people who work in that company to do their job behind a computer faster, easier, and more effectively.
Host / Interviewer
Yeah. Did you go to NeurIPS this year? I went.
Nick Frost
I wasn't there this year, but I have been in previous years. I missed this one.
Host / Interviewer
Yeah. I met a bunch of guys from Sakana. My wife's Japanese, so I know I'm going to say it wrong: I say Sakana, but I think it's pronounced slightly differently. What's that?
Nick Frost
Yeah, in Japanese I think Sakana. That's fine.
Host / Interviewer
Yeah. And they're working on neuroevolution. They just came out with this book.
Nick Frost
Yeah.
Host / Interviewer
Are you looking at those? Does Cohere explore these other schools of AI, which are not necessarily new, to see how they might be applied?
Nick Frost
So we are not. Those guys are great. I've got a lot of respect for some of the research that they did, and they continue to do cool stuff. We are not working on evolutionary things. I think it's a promising, cool research direction, and I think one day some useful stuff might come out of it. It's so far not particularly relevant to business, but it's really cool, and one day it might be. But for now, no.
Host / Interviewer
Yeah. But more generally, do you guys kind of have your heads down working on enterprise applications and models that apply to the enterprise, or are you also at the same time still exploring new architectures? Or is there enough work ahead of you that that's for somebody else to do?
Nick Frost
We've definitely done some work on new architectures, and we continue to do a lot of research. We've published more than 100 papers at this point; people within Cohere have been collaborators on more than 100. We release the weights of our models into the research community. Our open science initiative, Cohere Labs, continues to collaborate across the industry with people and publish huge amounts of papers. Some of their work has been on new architectures; some of it has been on efficiency, or the impacts of AI, or other training paradigms, or the consequences of evaluation, and stuff like that. So they're kind of working all over the place. We have a big commitment to open science, and I think that's the way that this industry will continue to evolve. But our focus is always on making useful stuff. Everybody here is motivated by making something useful. And I think sometimes people feel like that's at odds with the idea of research. They'll try to say, oh, those are two different things: you're either doing research, which is inherently useless, or you're building something useful, you're doing enterprise stuff. Those things are very similar to me. We're solving problems, we're answering questions. Those questions are relevant, and we think the answers to those questions are going to be impactful. We're not pontificating, but we publish our findings all the time and collaborate with people and solve novel problems. It's just that those novel problems are targeted towards something grounded in reality and useful.
Host / Interviewer
Yeah. And that's something I've wondered. Do you build to problems? I mean, you're obviously very involved with the enterprise writ large. Do you see problems that you can solve and build to that? Or do you build generally useful models, and then companies come to you and you figure out how those models apply to their problems? It's a chicken-and-egg thing.
Nick Frost
Yeah. So we build models that we release, and again, we release those weights for research use. For all of the models that we've ever made, you can find the weights on Hugging Face. And then we work with those models and our customers to further refine them to be particularly useful for whatever they're interested in. Sometimes we do that if the problem requires it. Sometimes people have come to us and said, hey, we need it to be slightly better at Japanese or something, or we need it to be better at this field. And we'll train the models a little more to make them particularly good at that. Other times, the model out of the box is actually great at a lot of what they want to do, so we'll work with that. It depends on the problem. But we start with and release new foundational models every year.
Host / Interviewer
Yeah, yeah. And how large are the models? I mean, you're saying they're 111 billion parameters. That's the largest model.
Nick Frost
The largest model that we have released so far is 111 billion parameters.
Host / Interviewer
Wow. And it can compete with the behemoths on specific tasks. How do you evaluate?
Nick Frost
Yeah, it's fraught. The space of evaluating large language models is fraught. A huge amount of the evaluation is misaligned with the way people actually want to use it. Our suggestion to companies is: don't look at whatever is the latest eval that you think is exciting. I mean, people these days are talking about ARC-AGI as an eval, but it's a weird pixel reasoning game. I've never seen anybody in my life have that task as a job.
Host / Interviewer
Right.
Nick Frost
And if you work under the assumption that these are not building towards AGI, as again most people think, and that they're instead building something useful but very different from a person, then evaluating it on a random pixel matching game isn't a particularly good measure of how useful it's going to be. So instead we say, look, just figure out what you want the LLM to do and write, I don't know, 10 or 20, or however many you can, examples of that problem, and then ask the LLM and see if it gets the right answer. That's your evaluation, and that's going to be a better eval set than looking at whatever is newest. I've been in this industry long enough to see a new eval come out every year, and it's exciting, and then people realize it's no longer relevant. When we started, the eval was called LM1B. I'm not even sure if you've ever heard of that. That was an eval that was about news sources from 2011. And at the time we wrote an article saying this is a very bad eval. I feel like we could have written a paper every single year talking about why the eval is bad. It fundamentally comes from a mismatch: people think that they're evaluating something super general purpose. And these LLMs are general purpose, but their deployment is more targeted. Their deployment is more like, hey, I need it to help me with summarizing the week's emails to prepare a slideshow to tell my boss about. That's a very targeted, grounded thing. And so we encourage people to just figure out what they want the model to do and use a small number of examples of that to figure out if the model is good, as opposed to relying on, you know, the eval du jour.
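The suggestion above (write 10 or 20 examples of your own task, ask the model, and check the answers) can be sketched in a few lines. Everything here is hypothetical: `toy_model` is a stand-in for whatever LLM call a team actually uses, the three prompts are invented, and a simple case-insensitive substring check is assumed to be good enough as a correctness test, which will not hold for every task.

```python
from typing import Callable, List, Tuple


def evaluate(model: Callable[[str], str],
             examples: List[Tuple[str, str]]) -> float:
    """Return the fraction of examples whose answer contains the expected text."""
    if not examples:
        raise ValueError("need at least one example")
    correct = 0
    for prompt, expected in examples:
        answer = model(prompt)
        # Assumed scoring rule: case-insensitive substring match.
        if expected.lower() in answer.lower():
            correct += 1
    return correct / len(examples)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns canned answers."""
    canned = {
        "Which quarter had the highest revenue in report A?": "Q3 had the highest revenue.",
        "Who approved purchase order 1042?": "It was approved by J. Smith.",
    }
    return canned.get(prompt, "I don't know.")


# A tiny task-specific eval set; in practice this would be 10 to 20 real examples.
examples = [
    ("Which quarter had the highest revenue in report A?", "Q3"),
    ("Who approved purchase order 1042?", "J. Smith"),
    ("What is the renewal date of contract 77?", "March 1"),
]

score = evaluate(toy_model, examples)
print(f"{score:.0%} of examples passed")
```

Swapping `toy_model` for a function that calls a deployed model keeps the rest unchanged, so the same small example set can be rerun against each candidate model.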
Host / Interviewer
Yeah, this capital efficiency. I mean, if people are using models for inference, they're not training them themselves. As long as you can deploy in a virtual private cloud, it doesn't really matter how large the model is. I guess there's a running cost.
Nick Frost
Yeah, there's a running cost. And the running cost for very large models is prohibitive for people. I mean, there's a reason. So you might be familiar with that MIT report that came out a while ago about saying, like, 95% of AI applications are in demo and not production. Right, yeah. Familiar with this? Yeah, yeah, yeah. It was a great piece of work. Our deployments are much closer to the inverse of that than they are to that. The vast majority of our deployments are production use. Like, we're not really interested in building in demos and nor are our customers anymore. And there's many reasons for that. One of them is that a lot of people built a demo and then realized that if they were to go to production, it would be way too expensive and it didn't provide enough value to justify the cost. Right. So our focus has always been, okay, like, what do you. What do you want to do? Great. This is probably based on, you know, we make a really efficient model. This is how much it's going to cost to run it into production? Is that, is that going to be good enough? And the answer is it normally is. Right. So that's been a big focus of us. Yeah. Inference, inference can be very expensive.
Host / Interviewer
Yeah. I mean, that's interesting, that the MIT report, and there have been others like it, says so much of, particularly, the agentic stuff is failing and pilots are not getting into production. And I'm curious how quickly this tech will penetrate the economy, because right now, despite all the stuff that's written about it, it's barely penetrating the economy.
Nick Frost
I think we'll see that over the next year, the next few years. I think at this stage, from a consumer perspective, almost everybody who's spending a lot of time on a computer has come across some way of using an LLM in their consumer life that was useful. Maybe not useful enough to pay a subscription fee. Right. Like, you know, the vast, vast majority of people using any of the consumer applications are not paying for it. And if you asked them to pay for it, I think most of them would stop. But it has, from a consumer perspective, been super, super widely used. If you ask how many people have used an LLM in their job, not as many. And a lot of that is because the models don't have access to the right data, so it's not super useful, or they're not quite good enough at the real thing, or it's too expensive for their company to deploy and they weren't allowed to bring the data into ChatGPT or something. So a lot of those issues we're solving, and when we work with a customer, we get into production and we get something useful and they start getting real ROI on it. But that's work that is ongoing and that I think we'll continue to see going on over the next few years. I think companies at this stage need to be adopting AI into their workflows if they want to stay competitive. But you're not going to wake up tomorrow and suddenly see everybody using it. This is still bleeding-edge technology. We're still in the early days of this stuff getting put to work.
Host / Interviewer
Yeah. And I presume you guys are working on, on agents and.
Nick Frost
Yeah, yeah, yeah, yeah, yeah. I mean agents is another overloaded term. I think what people talk.
Host / Interviewer
Yeah, I was just gonna say, I mean what, what is the difference between.
Nick Frost
I've heard a handful of different definitions. The one that I thought was useful is this: a regular large language model, when you're using it in a non-agentic way, is you put in a prompt, it writes the response, and we sample from the model directly. It gives us answers. An agentic LLM is when you give the prompt to a model and it decides some tools to call. Maybe it does a search, maybe it writes some code and runs the code, maybe it does a handful of searches or something, and then based on the results of those tools, maybe it calls more tools or maybe it responds. And if there's a little loop in there saying, hey, the model is going to call as many tools as it wants until it's found the answer, I'm going to call that agentic. But that's my working definition of it, and lots of people have a different definition. The answer is it's not a super useful word.
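[Editor's illustration] The working definition above (prompt in, the model picks tools, results feed back, repeat until it responds) amounts to a small loop. The sketch below is an assumption-laden toy, not any vendor's API: `run_agent`, `stub_policy`, and `fake_search` are hypothetical names, the "model" is a hard-coded policy function, and a step limit is added so the loop always terminates.

```python
from typing import Callable

History = list[tuple[str, str]]                  # (role or tool name, text)
Policy = Callable[[History], tuple[str, str]]    # returns (action, payload)
Tools = dict[str, Callable[[str], str]]


def run_agent(policy: Policy, tools: Tools, prompt: str, max_steps: int = 5) -> str:
    """Let the policy call tools in a loop until it chooses to respond."""
    history: History = [("user", prompt)]
    for _ in range(max_steps):
        action, payload = policy(history)
        if action == "respond":
            return payload
        if action in tools:
            result = tools[action](payload)
        else:
            result = f"unknown tool: {action}"
        # Feed the tool result back so the next decision can use it.
        history.append((action, result))
    return "stopped: step limit reached"


def fake_search(query: str) -> str:
    """Hypothetical search tool returning a canned result."""
    return f"1 document matching '{query}'"


def stub_policy(history: History) -> tuple[str, str]:
    """Stand-in for the model: search once, then answer from the result."""
    tools_used = [name for name, _ in history[1:]]
    if "search" not in tools_used:
        return ("search", history[0][1])
    return ("respond", f"Based on search: {history[-1][1]}")


answer = run_agent(stub_policy, {"search": fake_search}, "refund policy")
print(answer)
```

The non-agentic case in the same terms is a single call that returns a response with no loop; the step limit here is a design choice of this sketch, not something stated in the conversation.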
Host / Interviewer
Yeah, but how much of your work with a lot of it.
Nick Frost
A lot of it's like that.
Host / Interviewer
Yeah, because it turns out.
Nick Frost
Yeah, yeah, it turns out that stuff's super useful. Like doing, having a model, you know, iterate on that is very useful.
Host / Interviewer
Yeah, yeah. And then people are talking, I mean they've been talking for a while but they're starting to build these multi agent systems. Yeah, multi model systems where the models debate each other and then come up with a consensus answer. Do you, do you guys do all of that in the background?
Nick Frost
Yeah, yeah, we do all of that. And some of that stuff can be super useful. Like deep research is a good example of, you know, it's one, I mean it's, it's one call to a model resulting in many more calls to maybe the same model or different models but like a handful of other things and then that's all aggregated finally. And that, that I think it can be super useful in an enterprise setting in particular. That can be really useful. You know, you can ask something like hey, like read through all my emails and my slack and everything and cross reference it with Salesforce and figure out which customer has the potential to be a big success on my accounts but is currently really small and I should be looking after. You can imagine that's a complicated thing requiring many different searches all filtered down into a final report telling you the state of things that could be super useful. The flip side of that is that there's lots of other examples where you'll see people talking about hey, we just had this model work for seven hours straight or something and it gave us something completely wrong. But like wow. Isn't it cool how long it worked for?
So, you know, it's not always useful if they're not deployed correctly and they're not given access to the right data and trained on the right stuff.
Host / Interviewer
Yeah. And you were talking about working with the enterprise, deploying in the enterprise and using enterprise data. Are you working on RAG systems or you, when you get into the enterprise, are you fine tuning on the enterprise data or.
Nick Frost
Yes.
Host / Interviewer
How do you.
Nick Frost
Yeah, let's again define RAG. RAG is retrieval-augmented generation, a term that was first coined by Patrick Lewis, who actually leads our team here. He led the RAG team here. That team has now moved on to be the agents team. But the term RAG originally meant this: first the model is going to generate a search query, it's going to do that retrieval, it's going to find some relevant information, and then it's going to answer the question based on the relevant information. Now that's like, okay, the model is going to call a tool, and that tool might be search or it might be something else. And it might not call one tool, it might call five, and then based on the result it might call more tools or it might call one. That's the agentic framework. A lot of what we do is still that. It's a useful thing to do.
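[Editor's illustration] The original RAG recipe described above (form a search query, retrieve, answer from what was retrieved) can be sketched as follows. Everything here is a toy assumption: the three-document corpus, the word-overlap ranking, and the function names are hypothetical, and the generation step is stubbed by quoting the top document with a citation rather than calling a real model.

```python
import string

# Hypothetical mini corpus standing in for enterprise documents.
DOCS = {
    "hr-1": "Employees accrue 20 vacation days per year.",
    "it-7": "VPN access requires a hardware security key.",
    "fin-3": "Expense reports are due by the fifth of each month.",
}

STOP_WORDS = {"how", "many", "do", "the", "a", "is", "what"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and strip surrounding punctuation from each word."""
    return {w.strip(string.punctuation).lower() for w in text.split()}


def make_query(question: str) -> set[str]:
    """Step 1: turn the question into search terms (a real system would ask the model)."""
    return tokenize(question) - STOP_WORDS


def retrieve(query: set[str], docs: dict[str, str], k: int = 1) -> list[tuple[str, str]]:
    """Step 2: rank documents by how many query terms they share."""
    ranked = sorted(docs.items(), key=lambda item: len(query & tokenize(item[1])), reverse=True)
    return ranked[:k]


def answer(question: str, docs: dict[str, str]) -> str:
    """Step 3: answer from the retrieved text, with a citation (generation is stubbed)."""
    doc_id, text = retrieve(make_query(question), docs)[0]
    return f"{text} [source: {doc_id}]"


reply = answer("How many vacation days do employees get?", DOCS)
print(reply)
```

Swapping the single fixed retrieval step for a loop in which the model decides which tools to call, and how often, gives the agentic variant described in the same passage.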
Host / Interviewer
I had a call with somebody at Microsoft maybe six months ago and she was talking about societies of agents. This is back when Yuval Harari was everywhere talking about, you know, the agents sort of taking over the sub-fabric of the global economy, where there will be agents that are talking to each other and taking actions, you know, kind of unseen, but doing a lot of the necessary stuff. How realistic do you think that vision is? Because agents aren't yet that reliable that you can, well, fire and forget, you know.
Nick Frost
Yeah, I mean, again, what you've done in your mind is substituted the word agent for artificial general intelligence, and now we're just having the same conversation, but every time, instead of saying artificial general intelligence, you say agent. And I think Yuval Harari, when he's talking about it, that's what he's saying. He's just saying, okay, well, imagine if, you know, there's a digital person and then that person does stuff. Yeah, like, no, again, I don't think that's going to happen. I don't think these things work the same way as people. I don't think they're replacements for people. It's a very different thing. And I think a lot of what he was talking about was really just saying, well, what if AGI, right, and then he has his conversation. Now, I do think it's reasonable that, you know, I might have an agent that, again, is just a neural network with a system prompt and a bunch of tools, and it might have: hey, every time an email comes in, I want you to look at that email and then I want you to write a draft for it and then I want you to message me. And if it's a good message, I might tell you yes, and then I want you to send it. And each of those is just a different tool call. And so maybe an agent might be helping me write and send emails, and then somebody on the other end might get that email and they might have a similar neural net setup, an agent that does the same thing and helps them. And so I can imagine a world where LLMs are helping process a lot of the information to the degree that, you know, you might be able to say, hey, I want to meet with so-and-so, find a time and book it. You know, and that's back and forth between a few language models connected to various sources of data. But that's a lot different than a society of agents, and like, oh, what if we have this world of. And I just don't see the world going that way. Right.
Like, I don't. I think it's, it's much, this stuff is much more augmentative than it is fully replacing things. And so the idea of a society of agents I don't think is really.
Host / Interviewer
Well, this is interesting and this is getting off the topic of what cohere does. But, but, you know, you did work with Jeff Hinton. You know him very well. Why do you think he's gone off on this?
Nick Frost
Yeah, yeah, I talked to him. I have talked to him at length about this stuff. And we, and we disagree on our proximity to or we disagree on AGI. I think one of the things that, one of the reasons why we're different on this is he's kind of thinking about the indefinite future. A lot of the stuff he's saying, you know, he's, he's, he's, he'd invented this field and a lot of what he's saying is figuring out like, cool, now that I've invented it and it exists and, and he's taken a Step back. What can I say that's going to govern this and align people to be thinking about it forever up until, who knows, for as long as possible, well past when we're all gone and there's other people in the world thinking about this kind of stuff. And so I think a lot of what he's thinking about, as I understand it, is he's thinking about, cool, I gotta be somebody who invented. And certainly at some point we might get to AGI. And if we ever do, neural networks are going to be a component of that. I agree with that. I think they're probably necessary but definitely insufficient component of AGI. And so as the guy who created it, I think it's awesome that he's thinking about the future and thinking about how to set people up to think about this stuff correctly. We disagree on whether or not large language model like neural nets are a sufficient component of AGI. At times he thinks they're a sufficient component of AGI. And so that's just a technical disagreement that we have. And so that kind of explains like why I have the stance that I'm saying, I'm saying, look, we've made something super useful, super transformative the way planes are transformative, but not the way like, but they, you know, similar to planes and birds do this, this property, you know, intelligence versus flight, they do it in a completely separate way. 
And so we need to be thinking about the consequences of that completely separate way in the same way that no one's running around talking about, oh no, like what, what happens if, if, you know, if planes replace bees, planes replace hummingbirds or something. They're just, it's obviously different thing, you know. So you have conversations about planes.
Host / Interviewer
Yeah, yeah, I bring it up because I haven't spoken to him since before the pause letter, you know, Max Tegmark's thing. And I saw Tegmark a couple of weeks ago. You know, he has this scorecard for the big foundation model companies, like the top six or something. And Jeff's approach, you know, he's on 60 Minutes, he's on, you know, a lot of very broad public outlets talking about the existential threat of AI. And to me, that's not helpful. It just confuses the public, sort of.
Nick Frost
Yeah.
Host / Interviewer
Feeds this hysteria among the public about something that's not going to happen in, certainly in, in, in people my age's lifetime. So, you know, talk to the regulators, talk to the governments, maybe talk to the researchers, but why go out and broadcast this to the public, who's, who's not going to understand the time scales involved in that sort of thing. Anyway, that's, that's how I feel about it. But. So what's next for Cohere?
Nick Frost
Yeah, I mean, it's been a pretty great year for us. You know, you will have seen we announced a raise earlier, and that was off the back of a pretty massive increase in our revenue. And that is off the back of focusing on the real world stuff, focusing on enterprise applications, making large language models useful. But obviously there's way more to go, right? There's still so much more that LLMs can be doing. There's still so much work that people are doing that an LLM could be doing with them, for them, helping them do it way faster, way better, and allowing them to do the creative and the strategic and the interesting. I think about this a lot. There's not a ton I want to automate in my personal life. I don't really want to respond to text messages from my friends faster, but there's a huge amount I want to not do in my work life. Right. There's a huge amount of stuff that I would rather have an LLM do for me. And indeed, these days I rely on our own system a lot. Right. Like if I'm writing documents, it's helping me write them. If I'm responding to emails, it's responding to emails. If I'm pulling up data from various sources, it's doing that for me. If I'm searching through Slack and our documentation, it's doing that for me. So I'm using it a lot, but not everybody is. And there's still a lot more it could be doing for other people. So we're going to just keep focusing on that, focusing on enterprise reasoning, on multimodality for the enterprise, like looking at schematics, looking at graphs, looking at text and audio as well. So there's just a whole bunch that we're focused on.
Host / Interviewer
Yeah, you know, we were talking about how a lot of these pilots never get to production because one of the reasons, as you said, is the cost of inference in production. How do you measure the ROI of the systems that you're putting out there?
Nick Frost
Yeah, that's a good question. It's totally dependent on what the customer wants them to do. Right. So in some cases they're like, hey, we, you know, we want this, we want this LLM to help, to help analysts do research or research on companies that's very easy to measure the ROI on. It's like they used to be able to keep track of 10 companies, now they can keep track of 50. Great. That's how useful this is. Or we want to help them make predictions on where to do investments. You started giving them access to a thing to help them process documents and think through their reasoning for investments. And now they're doing this much better on investment. There are times like that when it's very measurable, but. But it's always connected to what the customer wants to do. And that's, that's very different across all our customers.
Host / Interviewer
Yeah, yeah. How, How? I mean, you, you talked a little bit about financial services or banking.
Nick Frost
Oh, yeah.
Host / Interviewer
How is this applied in healthcare?
Nick Frost
Yeah, healthcare. There's a lot that it can be useful for as well. But again, a lot of it's like administration.
Host / Interviewer
Right, right.
Nick Frost
Helping with it. Like, you know, doctors sometimes have to read through huge amounts of notes to prepare for a meeting. And if you have a model help summarize those notes for them, that can be way faster, that can be way more efficient. Or, you know, administrators within a hospital have to process insurance claims, and that requires reading through huge amounts of information. It's a whole mess. Right. And if you can have an LLM help you with that, then that can get you to the real healthcare work that needs to get done, which is caring for people. And every hour that you can save of a doctor's time in administration is a huge gain for the people that they're seeing. Right. So it's definitely a promising area.
Host / Interviewer
Yeah. I mean, 2025 was really the year of agents, or at least the, the dawn of agents in the enterprise. What, what do you see in the year coming up? What, what is the focus going to be for the enterprise in AI? Do you think it's just a year of.
Nick Frost
I'm gonna push back a little bit. I think 2025 was the year that people spoke about agents all the time, but they didn't even really define it, you know, they just spoke about them.
Host / Interviewer
Right.
Nick Frost
Without even really knowing what they were, as we mentioned at the beginning. I think this technology is going to get boring, because it's just going to be part of your work life. Right. Like, you don't talk about word processing anymore, but at one point having a word processor instead of a typewriter was the craziest thing that happened. Right. And you don't talk about that just because it's part of your work. And I think that's what's going to be happening with AI over the next little bit. It's just going to be, yeah, a way you use a computer, and you expect the computer to do it. And that's when you get to the real value. When something is no longer the topic of discussion, when it's just a natural reality of the world we live in, that's when it has real value. Like, I don't know when the last time was that someone talked about, you know, email or the Internet as being revolutionary, and yet I could not do my job, live my life, do anything that I'm used to doing on a daily basis without the Internet. It's a fundamental component of how I engage with the world. And so that's, I think, what's going to be happening over the next year. This technology is just going to be more useful and it will blend in with the natural technological fabric.
Host / Interviewer
Yeah. So you're not looking for a particular breakthrough or multi-agent systems to sort of come to the fore more. It's more a year of consolidation, deployment. Yeah, yeah, yeah.
Nick Frost
I hope we don't have a new buzzword next year. I hope we're just, I hope it's just useful, you know.
Host / Interviewer
Yeah. And from Cohere's point of view, you have kind of a bottomless market. Right.
Nick Frost
There's a lot, there's a lot we can do. Yeah.
Host / Interviewer
Yeah. I mean how, how do you handle that? Do you have, are you judicious in the customers you take on or, or are you growing as fast as you can to handle all the.
Nick Frost
Always a delicate balance. It's always delicate balance. There's certainly like a, there's certainly a huge number of people that our technology is useful for and so there's a huge number of people who are interested in using it. Yeah, we've grown incredibly fast over the, the last year and we'll continue to grow incredibly fast.
Host / Interviewer
And, and you guys build frameworks as, as well as the base models. Right. So for example you, you were talking about, you have
Nick Frost
platform is called North. That's the framework you use for.
Host / Interviewer
And is, is that what you're using in your daily work when you were
Nick Frost
talking and that's the thing that has access in a private data, data secure way to all the data that I have as an employee of Cohere.
Host / Interviewer
I see. And how is that priced? Is that a subscription model? I mean, how are you making that available? Is it all, you know, bespoke negotiation, or is it something I can go on the Cohere website, sign up for, pay $20 a month and no.
Nick Frost
Yeah. It's not a consumer product.
Host / Interviewer
Yeah.
Nick Frost
So you can't go and pay $20 a month for it. If you're an enterprise and you want to deploy this for your company, as all of our customers do, you can reach out to us and we'll figure out the fastest, most effective way to deploy it for you.
Host / Interviewer
Yeah, yeah. I've got to ask. Well, why, why not do a consumer product?
Nick Frost
I just don't. There's a handful of reasons. Yeah. One of them is I just don't think that's where this stuff adds the most value. I also think there's good consumer AI companies out there. You know, all the other companies are consumer. One of them is going to win. It's a pretty weird industry. There's a lot of weird stuff going on and I don't claim to understand or endorse a lot of it, but one of those consumer companies is going to be using AI for consumer applications in an effective way. But, you know, you look around the enterprise and there's not a lot of people using LLMs anywhere near as much as they could be if they were deployed correctly, if they answered the questions accurately, if they didn't hallucinate and gave good citations. Right. Like, those are all things we train the model to be good at. And so we see a need there, and it's where I think the models are the most useful.
Host / Interviewer
Yeah. So you started Cohere with who, who were the founders at the beginning?
Nick Frost
Aidan Gomez is our co-founder and CEO, and Ivan is also a co-founder.
Host / Interviewer
Right. And how big are you guys now? Oh, we're employee wise.
Nick Frost
Yeah, we're several hundred. We're pretty large. Yeah, we've grown quite a bit. Yeah.
Host / Interviewer
Yeah. And you're, you're all in Toronto?
Nick Frost
No, no, no, no, no. We're spread around the world. We have offices in London and Paris and San Francisco and New York, and a presence in Asia and a presence in Germany. Germany is a big place for us. And some people in the Middle East. Yeah, we're all over.
Host / Interviewer
And then Cohere Labs. How big is the crew there?
Nick Frost
Oh, that's A that's a smaller organization but that that organization works with many people within other with like it collaborates across the open science.
Host / Interviewer
I see. Yeah.
Nick Frost
So it collaborates with universities and with other labs even and with people all over the industry. Yeah, that has a lot of people who there's actually a whole, a whole online community in that of people just, just like starting out in the industry and starting out in research and cohere labs is often the entry point for them. So it's been really lovely. People start out working there.
Host / Interviewer
Yeah. And the, the size of the enterprises that you're handling are are or what
Nick Frost
It's a wide variety, but we work with, you know, some of the largest businesses in the world. Oracle is a large customer. RBC is a large customer. Bell is a recent customer. Some of the largest industries.
Host: Craig S. Smith | Guest: Nick Frosst, Cofounder of Cohere
Date: February 17, 2026
In this episode, host Craig S. Smith is joined by Nick Frosst, cofounder of Cohere, to unpack why Cohere has chosen to focus on pragmatic, enterprise-focused AI rather than pursuing artificial general intelligence (AGI). Frosst draws on his experience as a foundational AI researcher (including his tenure with Geoffrey Hinton at Google Brain) to outline Cohere’s trajectory, innovations, and philosophy. The conversation traverses technical distinctions between neural architectures, the practical constraints of deploying large language models (LLMs), the realities of enterprise adoption, and the broader industry misalignments around AGI versus targeted utility.
On AGI and AI Utility:
"I don't think we've built AGI and I don't think we're going to anytime soon. I don't think the transformer behaves like a person, and I don't think making better transformers makes them more like people."
— Nick Frosst ([10:53])
On the 'Flight Analogy' for AI:
"Planes can carry insane amounts of weight, they're way larger, can go way faster than birds... We've made artificial intelligence. It's just not the way humans do intelligence."
— Nick Frosst ([11:53])
On Efficient Models:
"The model that we released recently is called Command... It requires two GPUs to run; DeepSeek models take about 16 GPUs... so there's models that we outperform that were eight times as easy to deploy from a hardware perspective."
— Nick Frosst ([21:42])
On Evaluation:
"Our suggestion to companies is to say don't look at whatever is the latest eval that's exciting... just figure out what you want the LLMs to do and write 10, 20 examples of that problem and then ask the LLM—see if it gets the right answer."
— Nick Frosst ([33:12])
On Market Focus:
"There's a lot of weird stuff going on and I don't claim to understand or endorse a lot of it, but one of those consumer companies is going to be using AI for consumer applications in an effective way... there's not a lot of people using LLMs anywhere near as much as they could be if they were deployed correctly."
— Nick Frosst ([57:06])