Why AI is Broken: RK Anand Exposes the Hidden Costs and Challenges - Liftoff with Keith: Conversations with Super Founders and Growth Experts sharing useful insights

Summary

Liftoff with Keith Newman: Episode Summary
Title: Why AI is Broken: RK Anand Exposes the Hidden Costs and Challenges
Release Date: July 2, 2025
Guest: RK Anand, Co-Founder and Chief Product Officer of Recogni

Introduction

In this enlightening episode of Liftoff with Keith Newman, former journalist and Silicon Valley dealmaker Keith Newman engages in a deep conversation with RK Anand, the co-founder and Chief Product Officer of Recogni. The discussion delves into the intricate economics of AI, particularly focusing on the challenges and hidden costs that accompany the evolving landscape of artificial intelligence.

Tokenized AI Economics and Variable Compute

The conversation kicks off with an exploration of the traditional token-based economy in AI consumption. RK Anand explains how tokens have served as a currency equivalent, simplifying the billing process based on input and output tokens. However, with the advent of more complex AI models, this system faces significant challenges.

RK Anand [00:33]: “...there has to be a reassessment in the industry on how tokens are a currency and maybe some change in how you think about how you can use AI, how you can charge for it, and how you can be profitable with AI.”

As AI models become more sophisticated, particularly with reasoning capabilities, the compute and energy consumed per token stop being fixed and become variable, which complicates per-token pricing.
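The per-million-token billing described above can be sketched in a few lines. This is only an illustration: the prices, the token counts, and the choice to bill hidden reasoning tokens at the output rate are assumptions made for the example, not figures from the episode or from any provider.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # Hypothetical prices in US dollars per million tokens.
    input_per_million: float
    output_per_million: float


def request_cost(pricing, input_tokens, output_tokens, reasoning_tokens=0):
    """Cost of one request in dollars.

    Assumption: reasoning ("thinking") tokens are billed like output tokens.
    """
    billable_output = output_tokens + reasoning_tokens
    total = (input_tokens * pricing.input_per_million
             + billable_output * pricing.output_per_million)
    return total / 1_000_000


pricing = Pricing(input_per_million=1.0, output_per_million=4.0)

# Plain language model: token counts, and so cost, are easy to predict.
plain = request_cost(pricing, input_tokens=1_000, output_tokens=500)

# Reasoning model: same prompt and same visible answer, but a hypothetical
# 9,500 extra reasoning tokens that vary from request to request.
reasoning = request_cost(pricing, input_tokens=1_000, output_tokens=500,
                         reasoning_tokens=9_500)

print(f"plain: ${plain:.4f}  reasoning: ${reasoning:.4f}")
```

The second request costs many times the first even though the user sees the same prompt and the same length of answer, which is the variability RK describes.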

Impact of Chain of Thought and Agentic Models on Compute Costs

RK delves deeper into the technical advancements of AI, highlighting the transition from simple language models to chain of thought and agentic models. These models consume far more compute, which RK describes as an almost exponential increase.

RK Anand [02:03]: “...when you went from a language model to a chain of thought model, costs could go up by anywhere from 10x to 100x.”

The introduction of agentic models, which involve multiple interacting models to perform tasks like customer service or travel planning, further amplifies compute usage. This multiplicative effect poses a substantial economic burden on AI service providers.
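As a rough worked example, the multipliers RK mentions can be compounded: 10x to 100x for chain of thought, then roughly another 10x for agentic use. The function name and the idea of simply multiplying the two ranges are assumptions made for illustration; real workloads will not scale this cleanly.

```python
def compounded_range(cot_low=10, cot_high=100, agent_factor=10):
    """Compound the chain-of-thought range with the agentic multiplier.

    Defaults are the rough figures quoted in the episode; multiplying them
    together is a simplification for illustration.
    """
    return cot_low * agent_factor, cot_high * agent_factor


low, high = compounded_range()
print(f"Agentic workloads: roughly {low}x to {high}x the baseline compute")
```

Under these assumptions, an agentic request could consume on the order of 100x to 1000x the compute of a single plain language-model call.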

Challenges in AI Infrastructure and Profitability

The rising costs associated with advanced AI models bring forth questions about the sustainability and profitability of AI-driven businesses. RK Anand emphasizes the industry's struggle to monetize inference effectively without compromising profitability.

RK Anand [03:05]: “...nobody's yet figured out how to... people are making revenue in inference, but are they making profits is the question.”

The variability in compute consumption necessitates a reevaluation of pricing strategies and business models to ensure that AI applications remain economically viable.

Recogni's Role in AI Infrastructure

Addressing these challenges, RK introduces Recogni's mission to revolutionize AI infrastructure. As an AI infrastructure and inference company, Recogni focuses on developing highly efficient chips and systems to lower the cost of AI computation.

RK Anand [05:05]: “...we are trying to change the economics of inference by building technology that has far much more higher efficiency in terms of compute costs and compute power consumption.”

By enhancing compute efficiency, Recogni aims to make AI services more affordable and sustainable, benefiting both model providers and application developers.

Energy Constraints and Infrastructure Solutions

A significant portion of the discussion revolves around the energy constraints facing AI infrastructure, particularly in the US and EU. RK Anand outlines the challenges of energy acquisition and the limitations of existing power transmission infrastructure.

RK Anand [06:30]: “...we are constrained in energy power. And that is a major, major issue in the US, in the EU and in many, many countries.”

RK argues that, because new power stations take a decade or more to approve and US transmission infrastructure is aging, the industry will have to build data centers with captive power next to them, most likely based on gas turbines. That capacity then has to be used as efficiently as possible.

RK Anand [07:30]: “...you actually have to have AI hardware and infrastructure systems that deliver more tokens per rack, consume much lower power per rack, and do it much less expensively.”

By addressing both compute efficiency and energy consumption, Recogni seeks to sustain the momentum of AI development and its broader application across various industries.
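The "more tokens per rack, lower power per rack" argument can be made concrete with a small calculation. The 100 MW site size comes from RK's example in the conversation; the rack power draws and per-rack throughput figures below are hypothetical, and the sketch ignores cooling and other facility overhead.

```python
def site_throughput(site_power_mw, rack_power_kw, tokens_per_sec_per_rack):
    """Return (rack count, total tokens per second) for a power-limited site.

    Assumption: all site power goes to racks (no cooling or facility overhead).
    """
    racks = int(site_power_mw * 1000 // rack_power_kw)
    return racks, racks * tokens_per_sec_per_rack


# Hypothetical baseline hardware: 100 kW per rack, 1M tokens/s per rack.
baseline = site_throughput(100, 100, 1_000_000)

# Hypothetical more efficient hardware: half the power, double the throughput.
efficient = site_throughput(100, 50, 2_000_000)

print("baseline :", baseline)
print("efficient:", efficient)
```

With the power budget fixed, halving rack power and doubling per-rack throughput in this example yields four times the tokens from the same site, which is why both quantities matter at once.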

Conclusion

The episode wraps up with RK Anand emphasizing Recogni's commitment to building technology that supports the expansive and sustainable growth of AI. By tackling the economic and infrastructural challenges, Recogni aims to ensure that AI remains a driving force for productivity and innovation.

RK Anand [08:19]: “...the use of AI and its broader ubiquitous use, the virtuous cycle, the flywheel, will get slowed down, and that's not what we want.”

Keith Newman and RK Anand part ways with a shared vision of advancing AI infrastructure to foster continued technological progress and societal benefits.

This episode offers a comprehensive look into the economic and infrastructural hurdles facing the AI industry today. RK Anand's insights shed light on the necessary innovations required to sustain AI's growth and its integration into everyday life.

Transcript

[00:06]
A
So great to welcome RK Anand, the co-founder and CPO of Recogni, to the group here.
[00:14]
B
Well, it's a pleasure meeting you, Keith. Thank you for the opportunity.
[00:17]
A
Yeah. And you just got off stage. And how did that go?
[00:19]
B
I think it went well.
[00:21]
A
You're being modest. So we want to talk a little bit about tokenized AI economics and the vulnerabilities and challenges in that area. How are you, how are you seeing the whole reshaping in that discussion?
[00:34]
B
Till about maybe even early this year, the currency for AI in terms of how people get charged and how AI gets consumed was tokens. Think of it as a currency equivalent to a US dollar. They would say, okay, for so many tokens, input tokens, and so many output tokens, it costs you so many cents or dollars per million tokens. Well, that was fine when the models were just simple language models where you could predict the amount of compute taken infrastructure. Compute taken was fixed. So you could say, if I gave a token in here, it consumes so much compute, so much energy. And then this is the output token that came so you could measure it. But something changed in the latter half of last year, which is that now you have these reasoning models. And these reasoning models actually spend a lot of time in their own brains. Imagine almost like us thinking about a problem before giving an answer. So they consume a lot more compute, and the compute went from something that's fixed to something that became variable. And so now when you have variable consumption of compute, how do you charge somebody? And so that becomes a challenge. And so there has to be a reassessment in the industry on how tokens are a currency and maybe some change in how you think about how you can use AI, how you can charge for it, and how you can be profitable with AI.
[01:59]
A
So with the agentic models, we're seeing that play out, right?
[02:04]
B
We are. So if you think about it, when you went from a language model to a chain of thought model, costs could go up by anywhere from 10x to 100x. Now on top of that, with agentic models, you overlay, say, some number of language LLM models or chain of thought models, and you say, okay, I'm going to do a customer service application agent, or I'm going to do a travel agent that's going to plan itinerary for travel, which means booking hotels, booking travel, picking cities, and all of those things. Now you have many models that are interacting with each other, and depending on what input you gave, you could have variable use of these models before the agent completes its work and says, hey Keith, here's an answer. So by the time that happens now, you've added another maybe 10x of variability, increased consumption of compute. So we are in almost an exponential increase in compute with chain of thought and then agentic models. And so we have to solve the economics of that very carefully.
[03:02]
A
How's that going?
[03:06]
B
I don't know yet. Right. I think we are in a very rapidly evolving world. Right. Every week some new models appear. Every week people are building new applications and new agents on top of these models with different new protocols coming from companies like Anthropic and others. But as they build them, nobody's yet figured out how to... People are making revenue in inference, but are they making profits is the question. Profitability eventually matters in all of the businesses we do. So if you think about whether it's chain of thought models or the explosion that might happen in the next 12, 18 months on agents, one has to think about can you deliver them, serve them economically?
[03:47]
A
So where does Recogni come in in terms of helping you balance out this change, this constant flux in terms of pricing or in terms of usage models?
[03:58]
B
Yeah. So we are an AI infrastructure and inference company. We're trying to build chips and systems for inference, and all of consumption of AI is inference. So if you don't fundamentally change, as an infrastructure company, the cost of serving tokens or whatever the currency might be of AI and make it less expensive from a capital acquisition point of view, and make it less expensive from, let's say, OPEX point of view, energy consumption point of view, the upper layers of either the model providers or eventually the software providers who are building applications that layer on top of models might not have economical businesses. They might not be able to sustain those businesses profitably. So fundamentally what we are trying to do at Recogni is to change the economics of inference by building technology that has far much more higher efficiency in terms of compute costs and compute power consumption. And you solve those fundamentally, then all the upper layers start benefiting from it.
[05:05]
A
Do you have an example that comes to mind that you can share?
[05:09]
B
An example of chain of thought model or of the technology we are building?
[05:15]
A
No, chain of thought model.
[05:16]
B
So imagine you wanted to. Let's just take simple examples we do as human beings.
[05:22]
A
Yes.
[05:23]
B
So if I asked you the capital of the United States, it won't take you even a nanosecond to say Washington D.C. But if I said, hey Keith, can you multiply 25 times 75 times 42, you're going to tell me, hey, RK, hold on a second. Maybe I'll have to think through it and then maybe I have to take a notebook and maybe I have to run some cycles in my brain before I spit out an answer. So you spend time validating it before you give what you think is a truthful answer on that multiplication problem. You didn't think for a second whether Washington D.C. was the truth or was incorrect. You knew the answer by fact. So that's the difference between a language model and a chain of thought model. So imagine the amount of. Let's assume we were able to quantify the energy that our brains used. If you quantified the amount of time and energy you consumed in your brain before you gave an answer to the multiplication problem, you see that it's a scale challenge. You consume a lot more time, you consume a lot more energy. That analogy applies to compute in AI too.
[06:21]
A
Perfect. Last question. Urgent infrastructure pivot. What are you looking for and seeing as a necessary step in terms of infrastructure for AI?
[06:31]
B
Yeah, I think there are two parts to it. In the US we are constrained in energy power. And that is a major, major issue in the US, in the EU and in many, many countries. So as we build these power stations up, imagine trying to get a new nuclear power station. By the time you get it through the regulatory process, it's probably a decade at least. At least. So what we're going to do is at least in the short term build. And then our transmission infrastructure in the US is also aging and old, so you can't deliver that much power. So what we have to do is to start building data centers with captive power next to it. And most likely that captive power will be gas turbine based. So once you start getting that capacity of data centers with captive power adjacent to them, then you want to use them most efficiently. You want to be able to. If I deliver, if I put together a data center that is so many hundred thousand square feet, maybe 100 megawatts of power, don't you want to actually have inference systems to deliver AI so that AI becomes ubiquitous in our life across all industries and all aspects of how we function? To do that, you actually have to have AI hardware and infrastructure systems that deliver more tokens per rack, consume much lower power per rack, and do it much less expensively from a capital acquisition point of view. If you don't solve for all of those simultaneously, then the momentum on AI will start to get stymied, and then the use of AI and its broader ubiquitous use, the virtuous cycle, the flywheel, will get slowed down, and that's not what we want.
[08:09]
A
No.
[08:10]
B
So we are trying to build technology that will enable that.
[08:12]
A
Do not let that happen.
[08:14]
B
Thank you. I will not try. I will try not to.
[08:17]
A
I'd like to be in the position of keeping it going because I think.
[08:20]
B
We aspire to build, you know, we are technology builders and we aspire to build technology that serves the world for better causes. So we will continue to strive to build the best technology so that the momentum on AI, the momentum on infrastructure build out in the US and other parts of the world is not held back, and then the use of AI gets broader, helps us gain productivity and improves lives all over. So that's our goal.
[08:47]
A
Yeah, that's a great goal and exciting times ahead for Recogni.
[08:50]
B
Thank you, Keith. I really appreciate it. Thank you, RK. Okay, take care.
[08:53]
A
Take care.
[08:53]
B
Cheers.