
A
Clem Delangue, welcome to ACQ2.
B
Thanks for having me.
A
It's a pleasure to have you here. We have heard so much about Hugging Face over the last few years. It just feels appropriate in this moment to talk to you about the company directly.
B
I feel like we're at a very critical time for AI. And with Hugging Face, we have the pleasure and the honor to be at the center of it. So I'm excited to be able to share some of the things that we're seeing.
A
I think listeners who are tuning into this are saying, what is this episode going to be about? We want to frame it as: you can come in not needing to know anything about AI, and you should walk out with a pretty clear understanding of open source AI versus the more closed ecosystem. What is the difference between the two? What are the trade-offs? What are the virtues of each one? And we're going to tell it through the Hugging Face story. So what role do you play in the ecosystem? Who do you work with? Who do you not? How did this thing spring up out of quite an unlikely place, given the name of your company? We'll kind of work our way backwards from this moment in time. Today, how do you describe what Hugging Face is?
B
So Hugging Face has been lucky to become the number one platform for AI builders. AI builders are kind of like the new software engineers, in a way, right? In the previous paradigm of technology, the way you would build technology was by writing code. You would write like a million lines of code, and that would create a product like a Facebook, like a Google, or all the products that we use in our day-to-day life. Today, the way that you create technology is by training models, using data sets, and building AI apps. Most of the people who do that today are using the Hugging Face platform to find models, find data sets, and build apps. So we have over 5 million AI builders that are using the platform every day to do that.
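[A side note for readers: here is a minimal, illustrative sketch of what "finding models" looks like programmatically, using the Hub's public HTTP API with only the Python standard library. The `huggingface_hub` client library wraps these same endpoints more conveniently; the specific query parameters are just a plausible search, not anything from the conversation.]

```python
# Illustrative sketch: querying the Hugging Face Hub's public HTTP API
# for the five most-downloaded models, using only the standard library.
import json
import urllib.request

url = "https://huggingface.co/api/models?sort=downloads&limit=5"
with urllib.request.urlopen(url) as resp:
    models = json.load(resp)

# Each entry is a JSON object with an "id" like "org-name/model-name".
model_ids = [m["id"] for m in models]
print(model_ids)
```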
A
The ecosystem around Hugging Face in many ways reminds me of the 2008-10 era of Web 2.0: the RESTful APIs that everybody was publishing, and you could suddenly daisy-chain together a million different companies, services, this whole world of API mashups. It kind of feels like there's at least a loose analogy between the movement that you're on and that one. What can we create with a bunch of these more open, flexible building blocks?
B
Yeah, it's super exciting, because it's replacing some of the previous capabilities. Now you're starting to see search being built with AI. You're starting to see social networks being built with AI. But at the same time it's empowering new use cases, it's unlocking new capabilities that weren't possible before, to some extremes, right? Some people are talking about superintelligence, AGI, completely new things that we weren't even thinking about in the past. So we're at this very interesting time where the technology is starting to catch up to the use cases, and we're seeing the emergence of a million new things that weren't possible before.
A
That's cool. And just so listeners understand the scale at which you're operating: Hugging Face is currently valued, as of recording, at $4.5 billion. Investors include Nvidia, Salesforce, Google, Amazon, Intel, AMD, Qualcomm, IBM. It's a pretty wild set. What are some metrics that you care about as a company that you can use to describe the scale at which developers are using it today?
B
So I was saying that we have 5 million AI builders using the platform, but more interesting, I think, is the frequency and volume of usage that they have on the platform. Collectively, they've shared over 3 million models, data sets, and apps on the platform. Some of these models you might know, might have heard of, like Llama 3.1. Maybe you've heard of Stable Diffusion for images, maybe you've heard of Whisper for audio, or Flux for images. We're going to cross 1 million public models that have been shared on the platform soon, and almost as many that have not been shared and that companies are using internally, privately, for their use cases.
C
So the analogy and model for you guys really is just like GitHub, except for AI models, right? You can have public, open source repositories, open to everybody, and companies can also use internal, closed source repositories for their own use, right?
B
Yeah, it's a new paradigm. So AI is quite different than traditional software, so it's not going to be exactly the same, but we're similar in the sense that we're the most used platform for this new class of technology builders. For GitHub it was software engineers, and for us it's AI builders. And to add to the usage side of things, one interesting metric is that a model, a data set, or an app is now built every 10 seconds on the Hugging Face platform. So I don't know how long this podcast is going to last, but by the end of this podcast, we're going to have a few hundred more models, data sets, and apps built on the Hugging Face platform.
A
And to continue to maybe torture the repository comparison: the set of things that need to exist, besides "I'm going to upload a pile of code that everyone can see and potentially attempt to modify," it's also the data sets themselves. It's also a platform to actually run applications, and also a compute platform where, if you want to train a model, that is also possible on Hugging Face, right?
B
Yeah. And one additional aspect that sometimes people underestimate is a lot of features around collaboration for building AI. The truth is that you don't build AI by yourself as a single individual. You need the help of everybody in your team, but also sometimes people in other teams in your company, or even people in the field, right? So things like the ability to comment on a model, on a data set, on an app; to version your code, your models, your data sets; to report bugs; to comment and add reviews about your code, your models, your data sets. These are some of the most used features on the platform, because they enable bigger and bigger teams to build AI together. And that's something we're seeing at companies: a few years ago, maybe there was a small team of 5 or 10 people leading AI at a company. Now it's much bigger teams. So for example, at Microsoft, at Nvidia, at Salesforce, we have thousands of users using the Hugging Face platform altogether, privately and publicly.
A
So I have a whole bunch of questions, kind of philosophical ones, about where AI goes from here and how the mental model for the AI ecosystem is different than previous generations. But to get there, I think it's helpful to understand how you arrived here. So in 2016, you co-founded a company named after the Unicode code point for the hugging face emoji. And as far as I can tell, it was an emoji that you could talk to as a chatbot, aimed primarily at teenagers. Is that right?
B
Yes, yes, absolutely correct. It was a long journey.
C
So you started neither an AI infrastructure company, nor did you even start in the current era of AI.
B
No, but we did start based on our excitement and passion for AI, even if we weren't even calling it AI at the time, right? We were saying machine learning, deep learning. I was lucky enough, I think it's now almost 15 years ago, a few years more, to work at a startup in Paris called Moodstocks, where we were doing machine learning for computer vision. So much before a lot of people were talking about AI, and it kind of made me realize the potential of the new technology and the way we could change things with AI. So when we started Hugging Face with my co-founders Julien and Thomas, we were super excited about the topic, and we were like, okay, it's going to enable a lot of new things. So let's start with a topic that is both scientifically challenging and fun. And so we started with conversational AI. At the time we were like, okay, Siri, Alexa, they suck. We remembered our Tamagotchis, which were these fun virtual pets that you would play with. So let's build an AI Tamagotchi, a conversational AI that would be fun to talk to. And that's what we did. We worked on it for three years. We raised our first two rounds of funding on this idea. So shout out to our first investors, who invested in a very different idea than what we are today.
A
Who were your early investors?
B
So our earliest investor was betaworks in New York.
A
I had no idea.
B
Yes. With John Borthwick and Matt Hartman, who were our first supporters. They really backed us when we were random French dudes with no specific background or credentials, with broken English.
A
I assume you're now the most valuable company betaworks has ever invested in, or...
B
Yes. And I'm more proud of the fact that we're now one of the companies that they've invested the most money in. So we're like the biggest bet that they've made. They've been extremely, extremely supportive. But also the support from a bunch of very important, impactful angel investors for us, like Richard Socher, who's the founder of you.com and was the chief scientist at Salesforce at the time. And then the support of the Conway family, with A.Capital, run by Ronny Conway, that led our next round, and Ron Conway, who also supported us throughout the early days of Hugging Face.
A
That's awesome. And so this was all still for the "I'm going to chat with an emoji" idea.
B
Yes, yes.
A
And to put a finer point on it, you started the company in 2016. 2017 is when the Transformer paper gets released from Google. So we are not yet in the era of even people in the AI community really knowing LLMs were on the horizon. Like, OpenAI hadn't made their big pivot yet. And so the state of the art for natural language processing is still pretty limited: small models trained on very particular, well-cleaned data sets. Is that right?
B
Yeah. Surprisingly, or luckily, that's what led to what Hugging Face is today. Because at the time, the way you were doing conversational AI was by stitching together a bunch of different models that would each do very different tasks. So you would need one model to extract information from the text, one model to detect the intent of the sentence, one model to generate the answer, one model to understand the emotion in the message. And so very early on in the journey of Hugging Face, we started to think about how you build a layer, a platform, an abstraction layer that allows you to have multiple models with multiple data sets. Because we wanted the chatbot to be able to talk about the weather, talk about sports, talk about so many different topics that you needed a bunch of different data sets. And that was the foundation of what Hugging Face is today: this platform to host so many models, so many data sets. So it's a very interesting fate, a very interesting thing. Obviously, it reinforces for people who are listening the importance of being flexible, being opportunistic, and being able to seize new opportunities. Even three years in, right? For us, it was three years in, with maybe $6 million raised, completely changing what we're doing, what we're going after, what we're building. Obviously we don't regret it at all, but it's a good learning for everyone listening that even with $6 million raised, three years in, you can still pivot and find a new direction for your company. And this is for the best.
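[For readers curious what that "stitching" architecture looks like, here is a toy, purely illustrative sketch. The stub functions stand in for the separate trained models Clem describes (intent detection, entity extraction, per-topic answer generation); none of the names or logic come from Hugging Face's actual chatbot, and real systems used trained models at each step. The structure, not the logic, is the point.]

```python
# Toy sketch of the pre-LLM conversational AI architecture: one narrow
# "model" per task, stitched together by a routing layer. Each function
# below is a stub standing in for a separately trained model.

def detect_intent(text: str) -> str:
    """Stand-in for an intent-classification model."""
    lowered = text.lower()
    if "weather" in lowered:
        return "weather"
    if any(word in lowered for word in ("game", "score", "team")):
        return "sports"
    return "chitchat"

def extract_entities(text: str) -> list:
    """Stand-in for a named-entity-recognition model: naive
    capitalized-word matching, skipping the sentence's first word."""
    return [w.strip("?!.,") for w in text.split()[1:] if w[:1].isupper()]

def generate_answer(intent: str, entities: list) -> str:
    """Stand-in for per-topic response generators, one per data set."""
    if intent == "weather":
        place = entities[0] if entities else "your city"
        return "Looks sunny in " + place + " today!"
    if intent == "sports":
        return "Your team won last night."
    return "Tell me more!"

def chatbot(text: str) -> str:
    # The "abstraction layer": route each message through the task models.
    return generate_answer(detect_intent(text), extract_entities(text))

print(chatbot("What's the weather in Paris?"))
```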
C
How did those conversations start? How did they go? How much time did it take to go from talking about it to doing it?
B
Yeah. Surprisingly, the transition wasn't as hard as we thought. It all started from an initiative from Thomas, who's our third co-founder and our chief scientist. I think it was right at the time when BERT, so the first very popular Transformer model, came out.
A
That's Google's model.
B
Google's model, that they opened first, I think on a Friday. That day, I remember really vividly, Thomas told us, like, oh, there's this new Transformer model that came out from Google. It's amazing, but it sucks because it's in TensorFlow. And at the time, the most popular framework for AI was, and still is, actually, PyTorch. And he was like, oh, I think I'm going to spend the weekend porting this model into PyTorch. And Julien and I were like, okay, yeah, if you don't have anything better to do during your weekend, just have fun, do it. And on Monday, he released a PyTorch version of BERT and tweeted about it. And I think his tweet got maybe like 1,000 likes. And for us at the time, we were like, what is happening here? We broke the Internet. A thousand Twitter likes. That's insane.
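[For readers who want to see where that weekend project ended up: loading that same public BERT checkpoint in PyTorch is now a few lines with the `transformers` library. A minimal sketch, assuming `transformers` and `torch` are installed; `bert-base-uncased` is the standard public checkpoint, not necessarily the exact one ported that weekend.]

```python
# Minimal sketch: loading BERT in PyTorch via the transformers library,
# the modern descendant of the original weekend port.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)
# BERT-base produces 768-dimensional hidden states, one per token.
print(outputs.last_hidden_state.shape)
```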
A
The developer demand at that point in time was so obviously PyTorch, but since BERT was born out of Google, of course they implemented it in TensorFlow. They had to use their own endorsed stack. It was just waiting there for the first person to realize, oh my God, this thing needs to exist in PyTorch, and to go get all the Internet points by doing that.
B
Yeah, yeah. I guess it's another gift from fate or from the universe to us that we managed to seize thanks to the work of Tom. And after that, we kind of like saw the interest, doubled down on it. And I think six months later we told our investors, look, this is the adoption, this is the usage that we're getting on this new platform. We think we need to pivot from one to another. And luckily they were all super supportive and that's what led to the pivot and to the direction that we took.
C
Wow. How did you take Thomas porting BERT from TensorFlow into PyTorch into the idea of, oh, there should actually be a platform for this?
B
It was very organic. What we did is really follow community feedback. So what happened is, after this first model release, we just started to hear from other scientists building other models who expressed interest in adding their models to our library. So I think at the time it was things like XLNet, coming, if I'm not mistaken, from Guillaume Lample, who is now the founder of Mistral. Then there was, I think, GPT-2 from the OpenAI team at the time, which was open source.
C
That's right. It used to be open AI.
B
Yes. And they told us that they wanted to add their models, so we really followed the community feedback on it. And that's what took it from a single-model repository, I think the first name was pytorch-pretrained-bert, to pytorch-transformers, to Transformers. And then it expanded to the Hugging Face platform as we see it now.
A
And that's the thing you kind of got famous for, that Transformers library. And you were the steward of that open source project, and you constructed the Hugging Face platform around it to host and facilitate all the community interaction on Transformers. And it turned out, oh my gosh, there's a lot of other people building something that looks like our Transformers library who also want a place for that same infrastructure.
B
Exactly. It was the same process. At some point, users in the community started to tell us, oh, I have bigger models. I can't host them on GitHub anymore. All right, let's build a platform for that. Or I want to host my data sets, but I want to be able to search in my data sets to see is there good data, bad data, how can I filter my data, and things like that. So we started to build that, and a few months later we realized that basically we built a new GitHub for AI. So our development has always been very community driven, really following the feedback from the community. And I think that's a big part of the reason why we've been so successful over the years and why the community has contributed so much to our platform and to our success. We couldn't be anywhere close to where we are without the millions of AI builders, contributors that are sharing open models, open data sets, open apps that are contributing with comments, with bug fixes. It's the main reason for success today.
A
You're sort of famously open. I mean, you really embrace this: we will literally build the product that the community tells us they want. Internally, you have a very open policy. The Twitter account, your social media accounts, are actually accessible, I think, by all employees, right?
B
Yes, yes.
A
As someone who is a champion of open source, how much openness is too much openness? Like, you're not a DAO, you don't do the thing where you publish everyone's salaries, I don't think. What do you like to be open, versus what do you feel is good to keep proprietary?
B
What we like to do is give tools for companies to be more open than they would be without us, but without forcing them in any way. So I was mentioning the number of models, data sets, and apps that are built on the platform. Something that people don't know as well is that half of them are actually private, right? Companies are just using them internally for their own AI systems and not sharing them. And we're completely fine with that, because we understand that some companies build more openly than others, but we want to provide them tools to open what they feel comfortable opening. So sometimes it's not a big model or a big data set they can share; they can share a research paper on the platform, because obviously openness is even more important for science than it is for AI in general. And progressively, it allows them to share more and contribute more to the world. Because ultimately, we believe that openness, open source AI, open science is really the tide that lifts all boats, right? It enables everyone to build, it enables everyone to understand, to get transparency on how AI is working or not working, and ultimately it leads to a safer future. A lot of people right now are talking about AGI. I'm incredibly scared of a non-decentralized AGI. If only one company, one organization gets to AGI, I think that's when the risk is the highest, versus if we can give access to the technology to everyone, not only private companies, but also policymakers, nonprofits, civil society. I think it creates a much safer future, and the future I'm much more excited about.
A
I was going to not go here, because it's almost too much of a shiny question to ask, but we're talking AGI, so we have to do it. Do you feel that the models today are on a path to AGI, or do you feel like AGI is something completely separate and these are not stepping stones to it?
B
Well, I think they're building blocks for AGI, surely, in the sense that we're learning how to build some better technologies. But I think at the same time there's some sort of a misconception based on the name of the technology itself. We call it AI, artificial intelligence, and so in people's minds it brings associations with sci-fi, with acceleration, with singularity. Whereas for me, what I'm seeing on the ground is that it's just a new paradigm to build technology. So I prefer to call it almost software 2.0, right? You had software before; you have software 2.0. And I think it will keep improving in the next few years the way software has kept improving in the past few years. But it's not because we call it AI that it's any closer to some sort of RoboCop scenario of an all-dominating AI system that is going to take over the world.
A
It does feel like there's these two different things that masquerade under the same name of AI. One of them, I kind of like "software 2.0" for, because software gave humans leverage to do more and to scale more with a small set of humans, and this new era of software really feels like it's just that on steroids. The richness of applications that you can build very quickly is astonishing, and is another 10x improvement on top of the amazing software paradigms that we had until now. There is a completely separate thing, which is things that pass the Turing test: I'm talking to something, and I'm pretty convinced that thing is a human, but it's not. And it is a little bit funny to me that these are both referred to as AI, when one is really just leverage for builders on how much they can make.
B
Yes. It's also maybe because we overestimate the second field that you're talking about. To me, it doesn't feel incredibly difficult and incredibly mind blowing that we finally managed to build a chatbot.
C
You thought you could do it in 2016, right?
B
Yeah. If anything, I'm surprised that we didn't manage to build a good chatbot before. So to me, even that falls into the development of the technology over the past few decades. And I think sometimes we forget, because we're so entrenched in today, and we are more impressed with the progress of today than progress in the past. But imagine the first bicycles that went faster than humans. Imagine the first computer that could retrieve information much better than humans. Imagine the first time you would go on Google and find any information in a matter of a few seconds. These were all impressive progress. Now we take them for granted, but they were impressive progress. So I think technology continues to progress the way it's been progressing for the past few years. Obviously, some of the builders of these technologies are hyping it, right? And are excited about it, which is normal. But as a society, I think it's good to keep some moderation and understand that the technology will keep improving, that we need to take it in the direction that is positive for us, for society, for humans, and that everything is going to be fine, that we're not going to fall into a doomsday scenario in a few months because of a chatbot.
A
Fascinating. It's funny, as you were talking, you linked it to the bicycle. I always think back to the Steve Jobs quote, a computer is a bicycle for the mind, which is in many ways saying it's leverage. It's a way for the mind to output way more than it otherwise could have, the way that a bicycle does for someone walking. And it's almost like this software 2.0 is a bicycle for the bicycle for the mind. It's like a compounded bicycle. We want to thank our longtime friend of the show, Vanta, the leading trust management platform. Vanta, of course, automates your security reviews and compliance efforts. So frameworks like SOC 2, ISO 27001, GDPR, and HIPAA compliance and monitoring. Vanta takes care of these otherwise incredibly time- and resource-draining efforts for your organization and makes them fast and simple.
C
Yep. Vanta is the perfect example of the quote that we talk about all the time here on Acquired, Jeff Bezos's idea that a company should only focus on what actually makes your beer taste better. That is, spend your time and resources only on what's actually going to move the needle for your product and your customers, and outsource everything else that doesn't. Every company needs compliance and trust with their vendors and customers. It plays a major role in enabling revenue, because customers and partners demand it, but yet it adds zero flavor to your actual product.
A
Vanta takes care of all of it for you. No more spreadsheets, no fragmented tools, no manual reviews to cobble together your security and compliance requirements. It is one single software pane of glass that connects to all of your services via APIs and eliminates countless hours of work for your organization. There are now AI capabilities to make this even more powerful and they even integrate with over 300 external tools. Plus they let customers build private integrations with their internal systems.
C
And perhaps most importantly, your security reviews are now real time instead of static, so you can monitor and share with your customers and partners to give them added confidence.
A
So whether you're a startup or a large enterprise and your company is ready to automate compliance and streamline security reviews like Vanta's 7,000 customers around the globe, and go back to making your beer taste better, head on over to vanta.com/acquired and just tell them that Ben and David sent you. And thanks to friend of the show, Christina, Vanta's CEO, all Acquired listeners get $1,000 of free credit. That's vanta.com/acquired.
C
You've kind of been there for this whole arc of the modern development of AI. How would you characterize open versus closed over the last call it six, seven years that you've been in this. Does it feel like the pendulum has shifted significantly during that time? Or is it like, oh no, well there was always open and closed. You go back to the beginning and well, Facebook and Google were closed and the academic research community was open. How do you view it?
B
So first, the debate itself is a bit misleading, because the truth is that open source is the foundation of all AI, right? Something that people forget is that even the closed source companies are using open source quite a lot. If you think about OpenAI, if you think about Anthropic, they're using open research, they're using open source quite a lot. So it's almost like two different layers of the stack, right? Open source, open science is here, and then you can build closed source on top of this open source foundation. But I do think, if you look at the field in general, that it has become less open than it used to be. We talked about 2017, 2018, 2019. At that time, most of the research was shared publicly by the research community, right? That's how Transformers emerged. That's how BERT emerged. Players like Google and OpenAI at the time were sharing most of their AI research and their models, which in my opinion led to where we are now. It's all this openness and this collaborativeness in the field that led to much faster progress than we would have had if everything was closed source, right? OpenAI took Transformers, then GPT-2, GPT-3, and that led to where we are today. For the past few years, maybe two, three years, it became a bit less open, or a lot less open, depending on your point of view. Probably because more commercial considerations are starting to play a factor. Also because I think there have been some misleading arguments around the safety of openness versus closedness, which leads to something weird, where open source and open science are not as celebrated as they used to be.
A
Yeah, maybe talk about that. What is the argument and why do you feel it is misleading?
B
There are a lot of people emphasizing the existential risk of AI to justify the fact that it shouldn't be as open as it is. Right. Saying that it's better not to share research because it's dangerous.
C
A bad actor gets a hold of this and could do bad things.
B
Exactly. And that's not the first time such arguments have been used. Actually, in every technology cycle, if you look at it, it's kind of the same, you know: books are dangerous, they shouldn't be given to everyone, right? They should be controlled by just a few organizations. You need a license to write a book, to share a book.
A
It feels like that's never happened in the software industry, though. Yes, that happened in the nuclear era. But I don't remember any of this around, like, oh my God, software as a service, that's terribly dangerous. Or mobile apps: ah, make sure state actors don't get ahold of that.
B
Yeah, it's true. Maybe the cycle has been faster with AI between people not knowing about the technology at all, to everyone knowing. And so it creates more fears, more ability for people to manipulate and people to kind of like mislead. Maybe the name played a big factor. Right. When you call it artificial intelligence, it's much more scary than when you call it software.
C
Back in the day, the world viewed what was happening as, oh, it's a bunch of nerds. It was its own community, and the norms of the community were around openness, really coming out of the hippie movement in the Bay Area, frankly, in the 60s and 70s. But now the stakes are way higher.
B
Yeah, the competitive environment is quite different too. I feel like at the early days of software, I think it was easier for new companies, new actors to emerge than now, where you have much more concentration of power in the hands of a few big technology companies. So that might play a role. For me, one of the most important things in support to openness is that hopefully it's going to empower thousands of new AI companies to be built, which is incredibly exciting. Big companies are doing a lot of good and they're doing a great job in many aspects. But I think if we can use this change in paradigm between software and AI as a way to kind of like redistribute the cards and change things and empower a new generation of companies, of founders, of CEOs, of team members to play a bigger role in the world, it would be great. I think it would align in a way more the challenges and the preoccupations of society with what companies are actually building. So I'm excited to try to do that.
C
For listeners who haven't seen this firsthand: over the weekend I was with a good friend of mine who is a startup founder, non-technical, with a small bootstrapped company. He decided, probably a month ago, to essentially build an AI product around it, and built it over the course of a couple weeks, being non-technical, I'm sure using Hugging Face. He launched it 10 days ago, and it's completely transformed his business. The output of it as a product is mind-blowing and world-class, thanks to these AI tools.
B
Yeah, it's incredibly exciting. That's one of the reasons why I feel like we don't need the doomsday scenarios of AI, or the AGI, superintelligence talk about AI, because just the fact that it's a completely new paradigm to build all tech is exciting enough. Thinking about how many people it will empower, how many new capabilities, how many new startups and companies it's going to create, is exciting enough for me and for a lot of people. It's going to change a lot of things in the way you build companies, you build startups, and, as you mentioned, the way you invest in startups. I know a lot of investors are listening to this podcast. I think it's going to completely change the way you invest in startups. I've played a little bit with investment at this point. I've done a hundred angel investments in the past two years, mostly in the community around Hugging Face. And I think we're starting to see that building an AI startup is very different than building a software startup, in many ways that are, I think, impactful for the way you think about investing and returns for funds. For example, it seems like it's the first time that you're seeing so many of these startups with very heavy needs for capital for compute, like a Mistral, that we know, or like an OpenAI. So I think it changes a little bit the way you think about investment, returns on investment, burn for startups.
A
That category of companies requires way, way more capital. But there's not that many foundational model companies.
B
I think there could be, there could be. If you think of it, most of the investment now is going towards foundational LLMs, but it's just one modality: text, right? What about foundational models for video? What about foundational models for biology, for chemistry, for audio, for image? What if foundational model companies are actually just normal AI companies, the same way software companies were the new type, the new default, for all companies in the software paradigm? The truth is that we don't know yet, right? I think it's still too early to tell exactly what the recipes for AI startups are. And so that's why it's super exciting as an investor too, because the truth is you can't apply the same playbook that you used to in software, right? In software, you were so mature that you had the playbooks. You needed like a co-founder CTO, a CEO, a small team, and then you do the lean startup, and then you follow your rounds, and then you get to the highest probability of success. What if in AI it's completely different? For example, most of the founders actually are not software engineers anymore; they're scientists, right? It's a totally, completely different game. The lean startup doesn't work anymore, because they need heavy capital investment before any sort of return. So what I'm saying is that it just completely changes the game, and you have to forget everything that you've learned, everything that you've internalized, and start from scratch.
A
It's funny, where I thought you were going to go with this was AI companies or companies that use AI can be just a few people and get huge output because they're just using the API as provided by these foundational model companies. And there's an extreme amount of leverage to produce great value for customers with few employees. You took it completely the other direction, which I think is quite contrarian, and said most AI companies, or perhaps you were saying most dollars deployed into AI will require new foundational models and therefore they're going to be these unbelievably large investments to get these step function advancements in a lot of different fields. Am I hearing you right?
B
Yeah, yeah. And I think the truth is that nobody knows yet. So I'm not saying that I'm a hundred percent sure that it's going to go that way, but I'm saying that it's possible. And so that's why it's exciting to see how it's going to evolve in the next few years.
A
One easy way you win that argument is that the dollars consumed by foundational model companies are so large that even if there are a thousand times more regular startups consuming APIs provided by AI companies, it's still the case that most investment dollars will actually go to foundational models and large training runs.
B
I mean, if you look at some of the successful companies so far, if you look at Hugging Face, if you look at OpenAI, companies like that, I don't think they acted in the traditional way you would expect a software company to act, right? OpenAI started with a billion-dollar raise, did open source and open science for six, seven years, and then started a completely new model. For Hugging Face, we operated fully open source for many years, really community driven, a very different kind of organization than what everyone was telling us to do. So I think there's something to be said about really throwing away the playbooks, throwing away the learnings from the software paradigm, and really starting from scratch, maybe starting from first principles, and building a new model, a new playbook for AI.
A
Has Hugging Face as a company been particularly capital intensive and if so, why?
B
We haven't. So we raised a bit more than $500 million so far over the course of seven years. We actually spent less than half of that and we're lucky enough to be profitable.
A
Congratulations.
B
Which is quite unusual for most AI startups. We have a different kind of model than some other AI companies.
C
I assume you all don't have nearly the same kind of capital expenditure requirements that say an OpenAI does in terms of compute and training.
B
Yeah. Even though most of the usage is free, we have a quite straightforward, quite permissive freemium model, so we can easily get to a level of revenue that is meaningful. We have some specificities for sure that allow us to do that. And it was also an intentional decision for us, because as a community platform, we want to make sure that we're not going to be here for just a year or two. When people build on top of you, when they contribute to the platform, I think you have some sort of responsibility towards them to be here for the long term. And so finding a profitable, sustainable business model that doesn't prevent us from doing open source and sharing most of the platform for free was important for us, to be able to deliver to the community that we're catering to.
A
Your customers do use Hugging Face for very capital intensive things, training these models, but that doesn't show up in your financials as, oh my God, we had to sink a billion dollars into a training run. You partner with a cloud provider on the back end and pass it along to whoever is doing the training run, right?
B
Yeah, we try to find sustainable ways to do that: either by partnering with the cloud providers, by providing enough value so that the companies buying the compute are okay with paying a markup on the compute that makes it high margin for us, or by providing paid features that are basically 100% margin. For example, a lot of companies are now subscribed to our Enterprise Hub offering, an enterprise version of the hub, which is obviously a different kind of economics than selling compute.
A
Yep, very proven business models. You get to choose how you make money: are you marking up compute? Are you selling SaaS? Are you going the enterprise route and developing a custom package for every engagement? I'm very curious: on the routes where you choose to apply a margin or a markup on top of compute, what is it? Because clearly you're not ashamed of this, and I think it's a great business model. What is it that Hugging Face can provide where a customer goes, yeah, I'll do it through Hugging Face instead of figuring out how to do it myself directly on a cloud provider?
B
We've never been interested in taking part in the race to the bottom on compute. It's a much more challenging business model than a lot of people think, especially with the hyperscalers being in such a position of strength, both in terms of offering but also in terms of cash flow, giving them the ability to do a lot of things that other organizations wouldn't be able to do. And so the way we think about it is, instead of taking part in this race to the bottom, we're trying to provide enough value, both with the platform, the features and the compute, that companies are comfortable paying a sustainable amount of money for it. So when you use an offering like the Inference Endpoints or Spaces GPUs on the platform, the idea is that it's so integrated with the features of the platform that it actually makes it 10 times easier for you as a company to use it as a bundle, versus using just the platform and then going to a cloud provider for the compute. It's what I call locked-in compute. It's not the kind of compute you can trade in and out, where it doesn't really matter to you if you switch from AWS, Google Cloud or another provider. It's more that we make the experience so much more seamless, so much less complex, which is the name of the game for AI, right? AI is still complex enough for most companies that at the end of the day, yes, companies are paying more for it, but instead of having 10 ML engineers, maybe they're going to have one or two.
C
The alternative to this would be: you have your AI researchers working on models, and then when you want to go train or deploy them, if not through Hugging Face, you basically need a whole other team of AI infrastructure and deployment engineers, right?
B
Yeah. As we mentioned before, we're in the early days of AI monetization today. No one knows what a profitable, sustainable business model for AI looks like, right? Even the big players. I mean, OpenAI is of course generating a lot of revenue, but the profitability and sustainability of that revenue is still an open question. And I think they're going to figure it out, and I hope they're going to figure it out. But we're so early in figuring out business models for AI that there's a lot to build. And so that is extremely exciting.
A
And I would argue you're not figuring out any business model. You are using time-tested, proven ways to make money, where you occupy a particular part of the value chain and provide a rich set of experiences to developers. They're willing to pay for that directly, and they're willing to pay for it in the form of slightly more expensive compute. The nice thing is you get to innovate on all the AI things without having to build a business model from scratch. The foundational model companies, that is where there's this big open question of what exactly the business model is, especially when the consumer expectation from interacting with all these AI chat-style agents is that a huge set of functionality is free.
B
Yeah, the beauty of the position we're in is that if you're the number one platform that AI builders are using, and if AI becomes the default way to build all tech, it's pretty obvious that there's a sustainable, massive business model around it, right? Otherwise we would be doing something wrong. That's why we're so focused on usage and on the community. We believe that if we keep leading on usage and adoption, if we keep empowering the community to use our tools and be successful with them, there are going to be good things in the future for Hugging Face and hopefully for the community.
A
There are some businesses that are just perfect. You analyze them, Visa is a good example, and you're like, man, there's basically nothing wrong with this business model; everything about it is just glorious if you are a shareholder of Visa. And every business shy of Visa has these things where you're like, that's an exceptional thing about that business, and here's the thorn in my side that, as I'm operating this business, I just can't escape. We've talked a lot about all the ways in which you've positioned yourself in a remarkable place in the emerging AI ecosystem. What's the thing that you have to deal with where you're like, ugh, it is such a thorn in my side for us, inherently?
B
We have to almost take a step back from the communities that we're empowering. That's a little bit the curse of platforms. If you think, for example, of GitHub, it's probably the company in the past 20 years that has most empowered the way you build technology, right? Because virtually all software engineers have used GitHub as their way of collaboratively building. And yet people don't talk about them, right? They don't talk about the product. It's not as visible as Facebook, Google or companies like that can be. So we have some sort of a curse around visibility, maybe sexiness. We'll never be an OpenAI in terms of sexiness and hotness and people talking about us; we'll always stay a little bit in the background.
C
Back in the day though, when GitHub was in its earlier years and was a startup, it was very...
A
The hundred million dollar Series A. I still remember it.
C
Yeah, I remember that for sure. It was plenty buzzy. But to your point, as an infrastructure company, or a developer platform writ large, in your case an AI builder platform, you're more behind the scenes.
B
And then another challenge for us is that, yes, AI is starting to be mainstream in terms of usage, but if you really look at it, the underlying technology foundations are still evolving really fast. And so there's this constant battle between building mature, stable platforms and solutions, and at the same time innovating and iterating fast enough that you don't miss the next wave. For us, on the company-building side, it's something that we've always worried about. We're 250 team members in the company, and we say that we always want to stay an order of magnitude smaller than our peers. We could be 2,000 people, but we prefer to be 200, as a way to reconcile this difficult challenge between building really, really fast and building tools that scale. That's an important challenge for us for sure.
C
That's such a good point. And you made it a minute ago; I hadn't really considered it. We might still be in the sort of Yahoo, AltaVista era of foundational model companies. Many of them are very successful, you know about them, and as you were saying, they make a lot of revenue. But are they fundamentally profitable endeavors yet? Probably not.
B
I think we are. Even when you think about how companies are building with AI, to me, an AI company using an API sounds very unintuitive. It doesn't sound like the optimal way to build AI; it feels more like a transitional time, where the technology is still a bit too hard for all companies to build AI themselves. But I would be surprised if it didn't happen. It's almost like the early days of software, where you had to use, I don't remember what they were at the time, but something like a Squarespace, a no-code platform, to build a website.
A
Dreamweaver and Microsoft FrontPage.
B
Yeah, yeah. Before technology companies could learn, before software engineers could learn to write code themselves. We might be at the same point in AI, where companies are using APIs because they haven't yet built the capabilities, the trust, the ability to do AI themselves. At some point they will. They know their customers, they know their constraints, they know the value that they're providing. At some point in history, all tech companies will be AI companies, and that means all these companies are going to build their own models, optimize their own models, fine-tune their own models for their own use cases, for their own constraints, for their own domains.
A
I think this is pretty contrarian too. Coming into this conversation, I would have fallen in the opposite camp: there are going to be five to eight players, maybe even consolidating more from there, that need to spend 10 to 100 billion dollars every couple of years, and no one else has that ability to spend or to attract that sort of research talent, so we all consume their APIs. And you're proposing the very opposite future.
B
Yeah, I mean, I'm a bit biased obviously by the usage that we see.
C
Well, you're a lot closer to it than we are.
B
As I was saying, there's a new model, dataset or app built on Hugging Face every 10 seconds. I can't believe that these new models are created just for the sake of new models. I think what we're seeing is that you need new models because they're optimized for a specific domain, for a specific latency, for specific hardware, for a specific use case, and so they're smaller, more efficient, and cheaper to run. Ultimately, I believe in a world where there are almost as many models as there are code repositories today. And actually, if you think about models, they're somehow similar to code repositories, right? A model is like a tech stack. So I can't imagine that only a few players are going to build the tech stacks, and everyone else is just going to ping them through APIs to use their tech stacks. I envision a bit of a different world.
A
Yeah, it makes sense. And implicit in your comment is that 99.9-something percent of models are inexpensive to train and run inference on; they're small and purpose-built. It's nice that this thing happened in the last three years where these God models seem to be able to do everything better than all the specialized models that people spent 10 years building before. But that's a blip in time, and we're going to shift back to specialized, cheap models handling a lot of the labor as everyone gets better at the state of the art.
B
Yeah, or something in between, right? It's always a gradient. I think some companies, some contexts, some use cases will require very large generalist models. When you're doing a ChatGPT, yes, of course you need a big generalist model, because your users are asking everything. But when you're building a banking customer support chatbot, you don't really need it to tell you the meaning of life, right? So you can save some of the parameters to make sure that your chatbot is smaller, has been trained more on the data that is relevant to you, costs you less, and replies faster. So of course it also depends a lot on the use cases that you plan to use AI for.
C
I'm curious: if you're listening to this and thinking about starting an AI company, maybe you have a use case or vertical knowledge that you want to go after, what are the ingredients and skill sets that you need on your team? If you buy what you're saying, that, hey, you could use APIs, but really, ultimately, you want to build your own model, what do you need to build your own model, and build a great one?
B
So for me, the main difference between the software paradigm and the AI paradigm is that AI is much more science driven than software. It's a bit of a paradox, because in software we sometimes call people computer scientists, right? But the reality is that they're not really scientists in the true sense of it, right?
A
Such a misnomer.
C
They're engineers. Yeah.
A
This always bothered me studying computer science in college. Like all of the other sciences are things that occur in our natural world, biology, chemistry, physics. And computer science is like, no, you're learning how a thing that is man made works and how to operate it.
B
Yeah. So to me that's the main difference between the software paradigm and the AI paradigm. So when it comes to founding teams and capabilities, I think having more science backgrounds is actually a must. Having one co-founder who is a scientist, I think, is a big, big plus. If you look at most of the successful AI companies, they actually have a science co-founder. We do at Hugging Face. OpenAI has one too, of course, with Ilya. That's one big thing.
C
How would you describe the difference in mindset and skillset between a traditional software startup and the engineering skill set you need for that versus the scientist skillset and the research skillset.
B
The timing is very different, in the way you look at how fast to build something, ship something. When I was working at software startups, right, we had the cult of shipping really fast. This might not be as true for AI. I think you want to ship as fast as you can, but realistically, training and optimizing a model is at best a matter of months, not a matter of days. So you probably want to look differently at how you're shipping, how fast you're shipping, how you're iterating on things. The skills are quite different too. I think an AI scientist has the potential to be more skilled at math, pure math, than an engineer, thinking more in terms of: how can I make foundational or meaningful progress compared to the state of the art? You're looking at bigger scales of improvement. In the software paradigm, you can almost think: okay, if I make my product 5% better than others, it's going to be enough, because I'm going to make it 5% better now, and then in two weeks 5% more, and in two weeks 5% more. And at some point you'll have enough differential in terms of value add to get and retain users. For science, it's almost like you don't create any value for a while: you work on something for six months, and then after six months you have something ten times better than what exists, right? In a way, that's what OpenAI did, right? They worked for six years barely releasing anything, or anything successful, but at some point they were able to release something that was probably 10 times better than others. So that's a different way of looking at it too.
A
I'd push back on that. I think that's a little bit of revisionist history. I'm sure you were watching OpenAI very closely. It felt like they were releasing all sorts of stuff. None of it had any commercial value and all of it felt super researchy. But that thing where they trained Universe on Grand Theft Auto, and GPT and GPT-2: they weren't known in the mainstream, but it was pretty remarkable watching that. I think them going all in on the transformer and deciding, hey, we need to fundamentally change the set of things that we're working on... I think that company has worked incredibly fast, shipped pretty fast, and now they're shipping faster than ever because they're actually in this arms race. I definitely don't think of them as a go-away, think-and-build-for-10-years, then finally release something company.
B
They did release a lot of things, but compared to their size and their scale, knowing that they started with a $1 billion investment, maybe they were releasing one thing every three months, or one thing every six months. So relative to their size, their scale, and the amount of money that they raised, I think they were shipping and releasing way fewer things than the typical software company would have with their budget. But I agree with you that it was iterative, I guess.
C
To your point too, if you have a large model, you're not going to do continuous deployment, because you've got to retrain the model, if nothing else, right?
B
Yeah, it's just a different approach. The best advice I give to people is to trash their Lean Startup book when they're starting an AI company, because these kinds of things have been so ingrained into our minds, into our way of building as software entrepreneurs, that it's really easy to fall into the trap of doing it without even realizing we do it, instead of completely changing the paradigm, changing the operating system of the startup builder, which in my opinion leads to much better results.
A
Well, Clem, this brings us to a topic that I've been wanting to ask you about, which I think will be our last major topic for today. In the discussion of which approach will win in the marketplace, open source versus closed source AI, there's a pretty compelling argument that, as more real-time training data is required, people's interactions with an application will become incredibly valuable for fine-tuning or training the next version of the model. The argument is that closed-source AI will win because they're going to get all of that directly from users, since they own the model and the application and have tightly integrated everything. Versus in the open-source world: great, you publish something, then a bunch of people fork it and build their own applications, and the real-time interaction data from the application doesn't make its way back upstream to make the model smarter. How do you think about that?
B
Well, a lot of people are thinking and talking about moats and economies of scale for AI. I think all of that is an open question at this point. Nobody really knows how to create a moat or how to generate economies of scale for AI. My intuition is that they're not going to be so different from the software paradigm, and that you're going to find the same kinds of moats, maybe applied differently. You're going to have the cost economies of scale, right? Similar maybe to a cloud provider or a hardware provider, who can use larger scale to reduce prices. And I think you're going to have the social moats, the network effects, right? That's more the game that we play in, where with collaborative usage your platform becomes more and more useful the more users you have, and so it makes it difficult for anyone to compete with you. That's why GitHub has never really been challenged, or why social networks are arguably very hard to compete with. Maybe these moats will be more intense than in the software paradigm; maybe the cost moat of compute will be more extreme. But it's an open question, because if you think about the current winners, some of them didn't have so much of these advantages from the get-go. If you look at OpenAI, they didn't really have more access to data than most companies, right? They ended up scraping the web and getting data that everyone else could get. If you look at Hugging Face, I don't think going in we had any specific advantage, except being as community driven as we were, which enabled us to develop the social network effect. It's still an open question. I would be careful of people and companies overplaying and over-hyping one sort of moat compared to others.
And even if you think about it ethically, and about the kind of world that we need, I hope that we're not going to have just a few companies winning. It would be a shame, it would be quite sad, if we ended up with just five companies winning in AI. I think it would be dangerous, right? Imagine if only a few companies were able to do software; we would be in a very different world than we are today. I hope many companies win. I think the technology is impactful enough that there can be almost more AI companies winning than software companies in the past. That would be very exciting to me.
A
And you make a very credible argument that it's going to empower more people than ever to build products. And so it stands to reason that there should be more companies or at least more attempts to start companies that can serve a particular customer need in this generation than any previous generation before.
B
AI is the opportunity of the century to shake things up, break the monopolies and break off the established positions and do something a bit new.
C
I'm curious: to get there, do you think that we need just a lot more people getting trained in how to be AI builders and AI scientists, or do we need the tools and infrastructure to get a lot easier to use, or both?
B
Both. But I think it's much more important that we get many more AI builders than we have today. If you look at Hugging Face, as I was saying, we have 5 million AI builders, right? And we can assume most AI builders are using Hugging Face one way or another, so you can estimate that there are around 5 million AI builders in the world today. There are probably around 50 million software engineers, or software builders, depending on how you set the definition. I think GitHub has over a hundred million users; a lot of them obviously are not software engineers, but probably half of them are. So we're still in the early innings, right? It wouldn't be surprising if in a few years you had more AI builders than software builders. So maybe in a few years you're going to have 50 million, 100 million AI builders, even more, because the beauty of AI is that it's a bit less constrained than software in the way people can contribute to it. To be a software builder, you have to learn a programming language and write lines of code, which is a pretty high barrier to entry. For AI, you can be considered an AI builder if you contribute expertise, or if you contribute data to a model that improved the model. Maybe we're going to have 10 times more AI builders than software builders, which would also be good for the world, because it would mean that more people could contribute, could understand, and could shape the technology to be more aligned with what they want. I think sometimes in San Francisco, in Silicon Valley, or in tech in general, we forget that a very small number of people are shaping products for a much bigger number of people. Whereas if you include more people in the building process, you can build not only better products, but more inclusive products, maybe products that can solve more social issues than we've been solving. And so that's quite an exciting future for sure.
A
Well, Clem, I can't imagine a better place to leave it. Where should listeners go to learn more about you or Hugging Face or get involved?
B
Huggingface.co, and actually huggingface.com now. We just got the dot com a few days ago.
A
Hey, congratulations.
B
Yes, it's a good example that you shouldn't sweat the small things early on, right? Our name, Hugging Face, is obviously very unusual for the kind of things we do. As our domain name, for like seven years we kept huggingface.co, and it didn't create too many problems for us. I'm on Twitter, now X, and on LinkedIn; I share a lot there, so you can follow me or ask me questions, and I'm happy to answer.
A
Awesome. Well, thank you so much. And listeners, we'll see you next time.
C
We'll see you next time.
ACQ2 by Acquired - Episode Summary
Title: Building the Open Source AI Revolution
Hosts: Ben Gilbert and David Rosenthal
Guest: Clem Delangue, CEO of Hugging Face
Release Date: October 14, 2024
In this episode of ACQ2, hosts Ben Gilbert and David Rosenthal engage in an enlightening conversation with Clem Delangue, the CEO of Hugging Face. The discussion centers on the pivotal role Hugging Face plays in the burgeoning open-source AI landscape, aiming to demystify open-source AI for listeners regardless of their prior knowledge.
Notable Quote:
Ben Gilbert [00:27]: "You should walk out with a pretty clear understanding of open-source AI. The more closed ecosystem, what is the difference between the two? What are the trade-offs?"
Clem Delangue elucidates Hugging Face's position as the premier platform for AI builders, equating AI builders to the new generation of software engineers. He highlights the platform's extensive user base—over 5 million AI builders—and its comprehensive repository of more than 3 million shared models, datasets, and applications.
Notable Quotes:
Clem Delangue [01:05]: "Hugging Face has been lucky to become the number one platform for AI builders... We have over 5 million AI builders using the platform every day."
Clem Delangue [04:13]: "It's a new paradigm. So AI is quite different than traditional software."
Originally founded in 2016 as a chatbot company inspired by conversational AI and virtual pets like Tamagotchis, Hugging Face underwent a significant pivot following the release of Google’s Transformer models in 2017. This transition was catalyzed by Hugging Face’s initiative to port the BERT model from TensorFlow to PyTorch, which garnered substantial community interest and laid the foundation for their current platform-centric approach.
Notable Quotes:
Clem Delangue [07:26]: "We started with conversational AI... We remember our Tamagotchi, which were this kind of like fun virtual pets."
Clem Delangue [12:34]: "Thomas told us, like, oh, there's this new Transformer model that came out from Google. It's amazing, but it sucks because it's in TensorFlow."
A significant portion of the conversation delves into the ongoing debate between open-source and closed-source AI models. Clem advocates for open-source AI, emphasizing its role in democratizing technology, enhancing transparency, and fostering safer AI development. He counters the argument that closed-source models have inherent advantages in data accumulation and model refinement, positing that the open-source community's collaborative nature can rival proprietary ecosystems.
Notable Quotes:
Clem Delangue [17:49]: "Openness and open source AI, open science is really kind of like the tides that lift all boats."
Clem Delangue [26:52]: "Open source is kind of like the foundation to all AI... It has become less open than it used to be."
The discussion explores Hugging Face's unique business model, which diverges from typical AI startups: the company is profitable, having spent less than half of the $500 million it raised over seven years. Clem explains that Hugging Face leverages partnerships with cloud providers and offers high-margin enterprise solutions, such as their Enterprise Hub. This approach ensures sustainability without succumbing to the capital-intensive demands often associated with AI development.
Notable Quotes:
Clem Delangue [38:38]: "We raised a bit more than $500 million so far over the course of seven years. We actually spent less than half of that and we're lucky enough to be profitable."
Clem Delangue [40:35]: "We try to find sustainable ways to do that, either by partnering with the cloud providers... providing paid features that are basically like 100% margins."
Clem offers a contrarian yet optimistic view of the future, suggesting that the AI startup ecosystem will diversify beyond a few dominant players. He envisions a landscape where specialized, purpose-built models proliferate, empowering countless new companies to innovate without the need for exorbitant capital investments. This perspective challenges the conventional belief that only a handful of organizations can lead in foundational AI development.
Notable Quotes:
Clem Delangue [36:58]: "Most of the investment now is going towards foundational LLMs, but it's just one modality. What about foundational models for video... audio, for image?"
Clem Delangue [51:22]: "AI is the opportunity of the century to shake things up, break the monopolies and break off the established positions and do something a bit new."
Despite its success, Hugging Face grapples with the "curse of platforms"—a lack of visibility compared to consumer-facing tech giants. Additionally, the rapid evolution of AI technology poses a constant challenge in balancing platform stability with the need for innovation. Clem emphasizes the importance of maintaining a lean team to navigate these challenges efficiently.
Notable Quotes:
Clem Delangue [46:49]: "Like, we have to almost take a step back from the communities that we're empowering. That's kind of like a little bit the curse of the platforms."
Clem Delangue [48:09]: "The underlying technology foundations are still evolving really fast... balancing between building mature, stable platforms and innovating fast enough."
Clem concludes with a compelling vision for an inclusive AI future, where millions of AI builders contribute to a rich ecosystem of models and applications. He underscores the potential for open-source AI to foster a more equitable technological landscape, enabling diverse voices to shape AI's trajectory and ensuring that its benefits are widely distributed.
Notable Quote:
Clem Delangue [65:34]: "We can have almost more AI companies winning than software companies in the past... That's quite an exciting future for sure."
Clem encourages listeners to engage with Hugging Face through its newly acquired huggingface.com domain (the platform previously lived at huggingface.co) and invites the community to follow him on social media for updates and further involvement.
Notable Quote:
Clem Delangue [67:57]: "We just got the .com a few days ago... follow me there or ask me questions there and happy to answer."
Summary:
This episode provides an in-depth exploration of Hugging Face’s transformative journey in the AI industry. From its origins as a chatbot platform to becoming the leading open-source AI repository, Hugging Face exemplifies the power of community-driven innovation. Clem Delangue articulates a vision where open-source AI democratizes technology, fosters safe and transparent development, and empowers a diverse array of startups to thrive. Despite challenges related to visibility and the fast-paced evolution of AI, Hugging Face remains committed to sustainability and inclusivity, positioning itself as a cornerstone in the open-source AI revolution.