
A
Hey, Ted, great to have you here at Infra AI.
B
Great to be here, Keith. Thanks for having me.
A
The COO of Inflection AI. How are things going at Inflection?
B
You know, it's been an amazing year. Obviously, people know about the transaction a year ago where Microsoft licensed our technology and made it the core of copilot.microsoft.com, with Mustafa, of course, leading the AI for consumers at Microsoft now. But little known to most people is that we continued to chug along. In fact, actually, we had a terrific working relationship with Microsoft during the whole transition, continued to develop our technology, but we made a huge pivot because we're no longer focused on consumers, we're focused on the enterprise.
A
Yeah. So you didn't just take a holiday after you signed the Microsoft deal?
B
No, actually, Reid Hoffman came in and said, okay, now we have to actually redouble our efforts.
A
Nice. We're increasing. Okay. But talk to me about seamless infrastructure. That's an area that's getting a lot of attention right now.
B
Yeah, well, I think when we started out thinking about this in May of last year, it was because we were actually having trouble getting a hold of Nvidia chips like a lot of people were. And so we started looking around for other suppliers and said, obviously we built, like most AI companies, all of our technology on top of Nvidia. And we then said, well, what is it going to take for us to be able to run it on other chip platforms? And we realized, of course, CUDA is a big lock-in. So we leaned into PyTorch and we moved everything in our entire platform to be able to run on a heterogeneous architecture, to be able to say, we can deploy on Intel, we can deploy on AMD, we can deploy, of course, on Nvidia. And then the next question was, well, we run a huge infrastructure. We serve millions of users a day with our consumer app Pi still today, and it's growing with our business users, and we wanted to be able to run it across infrastructure. So that's where the seamless infrastructure idea started. Let's be able to deploy multiple clouds, multiple chip architectures, multiple models, and run it all seamlessly.
A
And at the same performance levels? Or do we see some decline in performance when you run it across multiple layers?
B
Well, you've got latency in some kinds of operations. So if you're doing pre-training, for example, you really want to be able to run it all in one cluster. But for inference, the load is distributed in a way that we actually don't see any performance issues at all.
A
Okay, and in terms of the scaling, what are some companies doing well, and what are some companies doing that is just suboptimal, let's say?
B
Yeah, I don't think very many companies have experience in running clusters at the scale that the top companies run. So I think one of the challenges that we've had in working with new cloud providers is getting them up to speed on what it takes to run thousands of GPUs simultaneously. And I think this is going to be a problem that companies start encountering. Right? Very few companies are now at a scale where they're worrying about this. In fact, I was with a large bank, and I won't mention the name, in New York a couple weeks ago, and they said, Ted, we're GPU rich. Like, oh, really? For you guys, what does GPU rich mean? And they're like, we have a 48 node cluster. I'm like, that's nice. Yeah, we're kind of running at the hundreds of nodes, and that means thousands of GPUs at a time. In fact, I just signed another 1024 GPU license a month ago and blew them away. They're just not even thinking about it at that scale.
A
Where are we seeing the demand in terms of purchasing now, in terms of trying to scale the business?
B
Yeah, well, I mean, different levels here. Right? So I think what I see is a bell curve of distribution of companies. And believe it or not, I talk to companies, I'll stick with financial services. I talked to a CIO a couple weeks ago who said, you know what, Ted? I don't think generative AI is going to have any impact on our business for years. Okay, well, I'll be back when you get fired. And obviously then there's JPMorgan Chase spending billions of dollars on it. Right, right. And so the need curve is very different right now. Number one, I think companies are starting to really lean into what I'll call sovereign AI. We need to own it ourselves. And we need to own it ourselves for a bunch of reasons. Number one is we want control. Number two is we're worried about security. Number three is we're worried about cost. And then actually, the unlock that happens once they start actually building and owning their own AI is they start recognizing that there is a value in customization, customization to their industry, customization to their specific business, even customization down at the individual process or function level. And that actually is a huge accelerator in value. So companies are still sitting there saying I'm going to invest based on ROI. But they're starting to see the benefits of scale AI that is customized and personalized for their business.
A
And do you help them in terms of developing their models, their strategies in terms of getting fully leveraged?
B
Yeah, absolutely. So our business consists of really four parts. So the first part is we are a model building company and we built amazing models. Our largest model is an 840 billion parameter mixture of experts model. So we're not at the 2 trillion parameter model and we don't expect to be. That's not our objective because what we think is it's actually about a compound architecture of multiple models. And so the second part of our business really is about how do we actually manage a whole infrastructure of multiple models and multiple tools to be able to execute the architectures that our customers need. And one of the things that I think is a benefit for us, being focused on the enterprise, is if I look at the challenge that OpenAI has, they have to solve every problem for every user all over the world all at once, right? I go in and I solve a particular company's problems, right? So we can be very narrowly focused to create that composition that is appropriate and that customization that's appropriate for that industry, that company and those use cases. So that's the third piece that we do: we then help them build out what is the specific infrastructure, what models do they need, what tuning do they need, even what pre-training might they want in order to enable their use cases. And then we are building applications on top of that as well that are general purpose. So one we just introduced is called Insights. Basic point is, how do you make it easy to put a conversational interface on top of structured data so that your employee can ask a natural language question and have it go and correctly execute a plan to retrieve the data, analyze the data, generate a report and give it back to you in 30 seconds.
A
Yeah. What's the use case for that? It could be many, I would imagine.
B
Well, that's the point. We're not going after saying we're specialists in one particular market or industry or use case. What we're trying to do is build reusable technology. But the first use case we built, for one of our customers, is in supply chain. So imagine, just imagine for a moment this crazy idea. Somebody decides that there's 134% tariffs on products coming from China. And so what you want to be able to... I know, crazy, right? And so what you need to know really quickly is what's the impact on my cost base? Right? So you want an employee to be able to say, you know, what happens when there's 134% tariffs on all my products sourced from China? That's actually a complex business analyst question. You have to go back in, look at how many products are sourced from China, look at what the total dollar volume was, and calculate 134% of that. Imagine being able to do that in an instant because you have a conversational interface on top of your supply chain data.
A
I love it. I need it, too.
B
Okay, well, talk to me after.
A
Should I go to Microsoft, or do I go straight to Inflection?
B
Well, it depends on how complicated and private your data is. So I think the two things that we hear most from customers is, number one, they worry about security of their data. They don't really want to move all their data into a compute environment which they don't have control over. But the other thing which is really interesting is the network cost of moving data starts actually being one of the issues in being able to run these kinds of compute exercises. So they want the compute to be right there wherever the data is being stored.
A
Ted, we got to wrap it up, but I also don't want to keep you from your goal for Reid Hoffman and achieving your sales objectives for the next quarter. So thanks for coming.
B
My pleasure, Keith.
A
Wonderful to have you here at Infra AI, and good luck. It looks like you're going to be on a wild ride for a while.
B
It will be a wild ride. Thank you.
A
Thank you.
Episode Overview
In this episode of Liftoff with Keith Newman, former journalist turned Silicon Valley dealmaker and entrepreneur Keith Newman sits down with Ted Shelton, the Chief Operating Officer of Inflection AI. Released on June 2, 2025, the conversation delves into Inflection AI’s strategic pivot from consumer-focused AI solutions to enterprise-level applications, their approach to seamless infrastructure, and the broader implications of scaling AI beyond traditional GPU dependencies.
Ted Shelton opens the discussion by reflecting on the past year’s significant milestones for Inflection AI. A key highlight was Microsoft’s licensing of Inflection’s technology, which became central to Microsoft's Copilot (00:09).
Notable Quote:
“We continued to develop our technology, but we made a huge pivot because we're no longer focused on consumers, we're focused on the enterprise.” — Ted Shelton (00:13)
Shelton emphasizes that rather than resting on their successes, Inflection AI intensified their efforts toward enterprise applications, supported by strategic guidance from Reid Hoffman. This shift underscores a commitment to addressing the complex needs of businesses leveraging AI.
A significant portion of the conversation centers on Inflection AI’s initiative to create a seamless infrastructure that transcends reliance on Nvidia GPUs. Shelton explains the challenges encountered due to limited access to Nvidia chips, prompting a strategic exploration of alternative suppliers and chip architectures (00:57).
Notable Quote:
“We moved everything in our entire platform to be able to run on a heterogeneous architecture, to be able to say, we can deploy on Intel, we can deploy on AMD, we can deploy, of course, on Nvidia.” — Ted Shelton (01:10)
By leveraging frameworks like PyTorch, Inflection AI transitioned towards a heterogeneous computing environment. This adaptability allows deployment across various cloud providers and chip architectures without significant performance degradation during inference operations, although some latency is noted during pre-training phases (02:01).
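As a rough illustration of what backend-agnostic PyTorch code can look like, the sketch below picks whichever accelerator the installed PyTorch build exposes and falls back to the CPU. It is not Inflection AI's code, and the `pick_device` helper is a hypothetical name. It relies on two facts about PyTorch: ROCm builds for AMD GPUs report through the `torch.cuda` API, and recent releases expose Intel GPUs as `torch.xpu`. The import is guarded so the snippet also runs where PyTorch is not installed.

```python
try:
    import torch
except ImportError:  # keep the sketch runnable without PyTorch
    torch = None


def pick_device() -> str:
    """Return the name of the first available backend: 'cuda', 'xpu' or 'cpu'."""
    if torch is None:
        return "cpu"
    # NVIDIA (CUDA) and AMD (ROCm) builds both answer through torch.cuda.
    if torch.cuda.is_available():
        return "cuda"
    # Intel GPUs appear as torch.xpu in recent PyTorch releases.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


device = pick_device()

if torch is not None:
    # The same model code runs unchanged on whichever device was picked.
    layer = torch.nn.Linear(4, 2).to(device)
    batch = torch.randn(3, 4, device=device)
    output_shape = tuple(layer(batch).shape)
else:
    output_shape = None

print(device, output_shape)
```

Keeping the device choice in one place is what lets the rest of the model code stay identical across vendors; anything written directly against CUDA kernels would not port this way.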
Shelton addresses the scalability challenges companies face when managing extensive GPU clusters. Contrasting the top AI companies with most enterprises and newer cloud providers, he illustrates how many organizations lack the infrastructure experience necessary to operate at scale.
Notable Quote:
“Very few companies are now at a scale where they're worrying about this. In fact, I was with a large bank... they said, 'Ted, we're GPU rich.'... they're just not even thinking about it at that scale.” — Ted Shelton (02:31)
This segment underscores the necessity for enterprises to develop robust GPU infrastructures, a realm where Inflection AI positions itself as a knowledgeable partner.
Delving into the broader AI landscape, Shelton introduces the concept of "sovereign AI," where companies prioritize owning and controlling their AI systems. This approach is driven by concerns over security, cost, and the desire for customization tailored to specific business needs.
Notable Quote:
“We need to own it ourselves. We need to own ourselves for a bunch of reasons... customization to their industry, customization to their specific business… is a huge accelerator in value.” — Ted Shelton (04:56)
Inflection AI assists enterprises in developing bespoke AI models and strategies, ensuring that AI implementations are finely tuned to deliver maximum return on investment. This focus on customization allows businesses to leverage AI in ways that are uniquely aligned with their operational objectives.
Shelton outlines the four parts of Inflection AI’s business: model building, infrastructure management, customized solutions, and application development.
Model Building: Creation of robust models, including an 840 billion parameter mixture of experts model.
Infrastructure Management: Orchestrating the deployment of multiple models and tools to meet diverse customer requirements.
Customized Solutions: Tailoring AI architectures to specific industries and use cases, enabling precise problem-solving.
Application Development: Introduction of applications like "Insights," which facilitates conversational interfaces over structured data, allowing users to perform complex data analyses through natural language queries (05:03).
Notable Quote:
“If I look at the challenge that OpenAI has, they have to solve every problem for every user all over the world all at once... we can be very narrowly focused to create that composition that is appropriate and that customization that's appropriate for that industry, that company and those use cases.” — Ted Shelton (05:03)
A vivid example provided by Shelton illustrates the practical utility of Inflection AI’s solutions in supply chain management. He describes a scenario where an abrupt tariff increase necessitates rapid analysis of its impact on costs, a process that their "Insights" application can streamline effectively.
Notable Quote:
“Imagine being able to do that in an instant because you have a conversational interface on top of your supply chain data.” — Ted Shelton (07:37)
This example showcases the real-world applicability of Inflection AI’s technology, enabling businesses to respond swiftly to unforeseen challenges through intuitive AI-driven tools.
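The analysis Shelton describes reduces to a filter, a sum, and a multiplication. The sketch below only illustrates that arithmetic on an in-memory SQLite table; the `purchases` table, its columns, the `tariff_impact` helper, and the dollar figures are all invented for the example and say nothing about how Insights is implemented. It treats the tariff cost as 134% of the dollar volume sourced from China, which is how Shelton phrases it.

```python
import sqlite3

# Hypothetical supply chain data: (product, country of origin, annual spend in USD).
ROWS = [
    ("widget", "China", 100_000.0),
    ("gadget", "China", 50_000.0),
    ("gizmo", "Mexico", 80_000.0),
]

TARIFF_RATE = 1.34  # the 134% tariff from the example


def tariff_impact(conn, country, rate):
    """Return (product count, dollar volume, added tariff cost) for one country."""
    count, volume = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(annual_spend_usd), 0) "
        "FROM purchases WHERE origin = ?",
        (country,),
    ).fetchone()
    return count, volume, volume * rate


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (product TEXT, origin TEXT, annual_spend_usd REAL)"
)
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", ROWS)

count, volume, added_cost = tariff_impact(conn, "China", TARIFF_RATE)
print(f"{count} products, ${volume:,.0f} sourced, ${added_cost:,.0f} in added tariff cost")
```

The hard part in a conversational tool is turning the natural language question into this query plan correctly; the computation itself is small once the plan is right.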
Addressing potential customer concerns, Shelton discusses the importance of data security and the cost of moving data across the network. Customers want sensitive data to stay in an environment they control, and they want compute to run wherever the data is stored, which mitigates security risks and reduces data transfer costs.
Notable Quote:
“They worry about security of their data... they want the compute to be right there wherever the data is being stored.” — Ted Shelton (07:44)
The episode concludes with Keith Newman and Ted Shelton acknowledging the dynamic and challenging journey ahead for Inflection AI. Shelton remains optimistic about the company’s trajectory, emphasizing the transformative potential of their enterprise-focused AI solutions.
Notable Quote:
“It will be a wild ride.” — Ted Shelton (08:29)
Strategic Pivot: Inflection AI’s shift from consumer to enterprise AI underscores a focus on delivering tailored, scalable solutions for businesses.
Seamless Infrastructure: Embracing heterogeneous architectures allows Inflection AI to deploy across multiple clouds and chip platforms, adding flexibility without a reported performance penalty for inference.
Sovereign AI: Ownership and customization of AI systems are pivotal for enterprises seeking security, cost-efficiency, and personalized applications.
Comprehensive Services: Inflection AI offers end-to-end solutions, from model building to infrastructure management and application development.
Practical Impact: Real-world use cases, such as optimizing supply chain responses, demonstrate the tangible benefits of Inflection AI’s technology.
Data Security: Ensuring data remains within the control of the enterprise while optimizing compute locations is crucial for adoption.
This episode provides a comprehensive look into how Inflection AI is navigating the complexities of enterprise AI, offering insights into their innovative approaches to scaling beyond traditional GPU dependencies and catering to the nuanced needs of businesses in the evolving AI landscape.