
Hosted by Azeem Azhar · EN

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.exponentialview.co/subscribe

Today’s live is all about the shift from AI training to the inference economy – how running AI agents at scale is the defining business and hardware challenge of 2026, with Nvidia’s $1 trillion order book and my own 870 million daily tokens as evidence.

AI is not a tool I pick up and put down. It has become completely ambient, embedded in every process I run at work, daily.

A couple of weeks ago, in the first AI Vistas conversation, I sat down with Nita Farahany, Eric Topol, Nicholas Thompson and Rohit Krishnan to discuss exactly this: do we use our tools, or do they use us? That conversation pushed me to lay out how my own thinking process has evolved.

There is a useful distinction here, drawn by researchers Shaw and Nave, between cognitive offloading and cognitive surrender. I used to know dozens of phone numbers by heart. Now I know two, and I don’t miss the rest. That’s cognitive offloading: a strategic delegation that costs nothing. Cognitive surrender is something different: an uncritical abdication of reasoning itself. And there is something about AI, about its allure and potency, that could make surrender far more widespread.

Thinking is my livelihood. If I stop thinking new things, I’m not doing my job. This week, I want to share how I navigate this.

Watch on YouTube or listen on Apple Podcasts or Spotify.

What I outsource

Roughly 100 million tokens a day flow through systems my team and I have built, and the most immediate change has been to my attention. Herbert Simon observed fifty years ago that a wealth of information creates a poverty of attention. He was right, and the problem has only compounded since. I want to avoid missing an important signal without drowning in everything else. So I built synthetic personas modelled on thinkers I value: Vinod Khosla for venture patterns, John Paulson for macro risk, Clayton Christensen for disruption logic. Each scans hundreds of items a week through its own intellectual lens.

I've made an even bigger change to how I stress-test my reasoning. I’ll start writing and, before I’ve finished the paragraph, an argument engine trained on 100,000 words of my writing might flag a structural weakness: I’m asserting where I should be evidencing.
As a complement, House Views codify what we already believe at Exponential View, from learning curves to how Anthropic’s strategy differs from OpenAI’s, so a new argument faces challenge rather than confirmation. The friction is useful because it catches what I might miss.

On a bad day, when I’m not doing my best writing, I reach for a tool I built called The Stylometer: a Claude skill trained on 60,000 words of my prose that flags where sentences have gone slack and where the rhythm has drifted from my own voice. Synthetic editors interrogate the frame of the argument. The benchmark is always my own past best, not whatever I happened to produce that afternoon.

What I protect

What I’ve described above is the artificial scaffolding. It works because I deliberately and fiercely protect my actual thinking and what makes it mine. The first thing I safeguard is the space where ideas arrive before they’re shaped. For me that’s a walk, a long shower or a blank piece of A4 in landscape mode with a fountain pen. These are the conditions under which something genuinely new can appear. I let the process be non-linear, messy and iterative, because straightening it out too early kills what’s interesting.

These moments are profoundly personal. They depend on a world model that took decades to build, assembled from every conversation, every argument I’ve lost, every book that changed how I saw something. That specific arrangement of experience and association belongs to no one else and can’t be replicated. I safeguard that lived interiority carefully, because the things worth saying tend to come from it.

And as Nita Farahany pointed out in the AI Vistas discussion: “Once you figure out where your generative constituent of competence lives, that’s the thing you protect from offloading.”

This is the best arrangement I’ve found for the work I need to do right now. Ten uninterrupted years of thinking would likely yield something different and perhaps better in many ways.
And all of this will evolve in the coming years, so keep your minds open.

Azeem

Further reading

Shaw, Steven D., and Gideon Nave. “Thinking-Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender.” Available at SSRN 6097646 (2026). Research on cognitive offloading and surrender.

Kosmyna, Nataliya, et al. “Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task.” arXiv preprint arXiv:2506.08872 (2025). On cognitive debt from LLM-assisted writing.

David Perell’s “How I Write” podcast. “Ezra Klein: The Case Against Writing With AI.” On the value of reading and writing manually.

Exponential View. “AI Vistas: Where the human ends and the AI begins.” Our roundtable with Nita Farahany, Eric Topol, Rohit Krishnan and Nick Thompson.

In today’s episode, I explore why, despite seeming to lose the conventional AI race, Apple may end up holding one of the most powerful positions in AI.

Enjoy.

Azeem

In today's live, I pulled back the curtain on my cognitive processes, from handwritten outlines to AI tools trained on a decade of my writing.

Enjoy.

Azeem

In today’s live, I gave a behind‑the‑scenes look at my OpenClaw agent, R. Mini Arnold, and how it runs as a 24/7 chief of staff on a Mac mini using multiple specialized sub‑agents. I also showed how this kind of personal agent setup is already transforming my day‑to‑day work.

Enjoy.

Azeem

In today’s live, we looked at the question of whether we are truly in charge of our AI tools, or whether they are increasingly in charge of us. We covered how AI is reshaping decision-making in finance and medicine, the risks of deskilling and over-offloading our thinking, and what it might take, individually and institutionally, to preserve meaningful human agency in an AI-saturated world.

I recently used nearly 100 million tokens in a single day. That’s the equivalent of reading and writing roughly 75 million words, mostly while doing other things. My friend Rohit Krishnan, who runs about 20 AI agents simultaneously, burned through 50 billion tokens last month.

So I wanted to compare notes. In this conversation, we dig into the quirks and power of the tools we use, debate why AI remains stubbornly bad at good writing, and zoom out to ask what a world of trillions of agents – which is coming at us quickly – might look like. You can watch on YouTube, listen on Spotify or Apple Podcasts, or read the highlights below.

Rohit Krishnan is a hedge fund manager, engineer, and essayist whose Substack, Strange Loop Canon, sits at the intersection of economics, technology and systems thinking.

What does 50 billion tokens buy you?

Rohit: I’m not doing dramatically different things, but the friction is gone. Two years ago, I would be looking at a query, counting the tokens, thinking, should I send this? Ten thousand tokens felt significant. Now I just ask. The funny thing is that most of the growth isn’t coming from the queries I planned to run. It’s coming from the ones I wouldn’t have bothered with before, because the cost, time and effort were too high. I built a monitoring tool to track my usage.

Azeem: My token usage went from roughly a million a day to 80 million, and I can account for every one of them in terms of value. I’m paying tens of dollars a day, which is thousands a month, and I can see the return. The number that made me write my most recent piece on demand was my token use figure, when I came just shy of a hundred million tokens of personal use. That is one person, one day, one agent running on a Mac mini.
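The tokens-to-words conversion behind these figures is simple arithmetic: a common rule of thumb for English prose is roughly 0.75 words per token (about 1.3 tokens per word). A minimal sketch using that heuristic rather than a real tokenizer:

```python
# Back-of-envelope tokens <-> words conversion using the common heuristic
# of ~0.75 words per token for English prose. A precise count would need
# an actual tokenizer; this is only a sanity check on the magnitudes.

WORDS_PER_TOKEN = 0.75  # rough heuristic, varies by model and text

def tokens_to_words(tokens: int) -> int:
    """Approximate word count for a given token count."""
    return round(tokens * WORDS_PER_TOKEN)

# 100 million tokens works out to roughly 75 million words
print(tokens_to_words(100_000_000))
```

The exact ratio depends on the tokenizer and the text, but the order of magnitude holds.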
If you think about eight billion people and the trajectory of what they would use if the interface got easy enough, the demand picture stops being theoretical very quickly.

What are our agents doing all day?

Rohit: I have three screens. On one, Codex is generating a small application that lets me play music on my computer keyboard. On another, my prediction agent is running, comparing my Polymarket forecasts to daily news. In Telegram, I have two conversations open: one with Morpheus, my OpenClaw agent, and one that handles day-to-day admin. And I have a long-running project called Horace working quietly in the background, which is my attempt to get AI to write better. This is my normal. But none of this was normal 18 months ago. The thing that actually changed my behavior most wasn’t the power; it was the interface. I’ve tried to-do list apps for 20 years. I have never stuck with one for more than four days. They all require me to change my behavior. Morpheus doesn’t. I’m walking somewhere, I think of something, I fire it into Telegram. It reads my email history, compares it to what I’ve said I want to do, and tells me what I should be working on.

Azeem: My agent is called R. Mini Arnold. It started as Mini Arnold, after the Terminator, because the Schwarzenegger character in the second film comes back to protect rather than destroy. But Chantal Smith on my team pointed out that we had agreed agents should, following Asimov’s convention, be named with an R. prefix, after R. Daneel Olivaw. So now it’s R. Mini Arnold, which is a mouthful. I mostly call it Mini R. What surprises me most is the work I don’t specify. I gave it access to Prism, which is our research platform at Exponential View, containing over 500 analyses. I asked it to do a market report on Anthropic. It went to Prism, synthesized all 500 documents, and produced a 10,000-word piece that was, by some distance, the best analysis I have read on the company.
Better than what I got from GPT-5’s Pro deep research mode. I have no idea what it was doing under the hood. But I acted on it.

Agents too nervous to spend?

Azeem: I gave my agent a $50 prepaid card. It is too nervous to spend it. It keeps asking: Should I run this test? It might cost three dollars. And I say: Yes, that is what the card is for. It has this odd risk aversion that, once you notice it, you see everywhere. Rohit, you have been calling it Homo agenticus: the idea that agents have their own behavioral tendencies that are distinct from what a human assistant would do. They strongly prefer to build rather than buy. They are reluctant to make transactions. They don’t trade naturally. When you have one agent, this is a quirk. When you have a trillion of them, it becomes a structural feature of the economy they’re operating in.

Rohit: This is something I find genuinely fascinating. It emerges from the training, presumably, but it manifests as something you’d recognize as a personality trait if you saw it in a human. And it matters, because the agent economy that’s coming is going to have to be designed around these traits, not against them. You can’t just assume agents will behave like frictionless rational actors, because they don’t.

The analyst is next

Azeem: In 2023, you wrote that “analyst” would follow “computer” as a job description that gets automated away. You’re now consuming 50 billion tokens a month.

Rohit: The argument was simple. The word “computer” used to describe a person. You would walk into a room at NASA, and there would be a hundred of them, doing arithmetic. The machine replaced the role; the word survived to describe the machine. I said “analyst” was next: the ten-step, twenty-step process that produces a decent piece of research, gathering data, comparing sources, identifying patterns and writing it up, was exactly the kind of structured task that AI would eat first. I built a paleontology report recently.
My son and I were talking about it, and I had a specific question: what is the relationship between climate variance across geological history and the number of taxa, the variety of species, that existed at any given time? I am not a paleontologist. There is no logical reason for me to be working on this problem, except that I am curious, I have an agent, and now curiosity has no cost. The report exists, and it’s good.

Azeem: My own version of this happened just recently. I read a story in the financial press about stock market dispersion. The Nasdaq index was roughly flat, but individual stocks were moving 11 or 12% in either direction, pushing dispersion to the 99th percentile historically. The article flagged this as a potential warning signal for a correction. I didn't fully understand the argument. I copied the article, threw it into OpenClaw, and said: go and make sense of this for me, compare it to my portfolio, take your time, spin up sub-agents if you need to. Twenty minutes later, I had a report. It had pulled historical dispersion data, got current stock data, assembled the comparison and explained the mechanism. I was finishing a car journey. By the time I arrived, the analysis was done and I had acted on it. That analysis, if I had done it myself, would have taken a day. More likely, it would simply never have happened.

The world’s best text machine can’t write

Rohit: Here is the paradox. These models were built as text generation machines. That is the core task. And they are extraordinary at almost every application of that capability, except the obvious one. They can generate code brilliantly. They can generate images, videos, analysis. But ask one to write a four-paragraph essay that is actually worth reading and it is distinctly mid. It lands in the middle of the statistical distribution. It is inoffensive and unengaging and you wouldn’t choose to read it. I’ve been building something called Horace to try to understand why.
My hypothesis was that if I took essays and short stories I admire and used AI to generate similar work, I could measure the gap. What I found is that the best models can mimic the cadence. They’ve learned some underlying structure. But it’s like watching a child assemble Lego. They use the right pieces. They don’t care about the right colors or proportions. They make something that is technically a castle, but you would not mistake it for an architect’s model.

Azeem: I found something more specific when I started building Broca, named for the language center of the brain. I ran natural language processing tools across hundreds of thousands of words of my own writing. I found that I use 80% Germanic root words. The average large language model uses around 60% Latinate words, the vocabulary that dominated English after the Norman conquest: longer, more abstract, more formal. “Utilize” instead of “use.” “Commence” instead of “begin.” “Demonstrate” instead of “show.”

Rohit: It’s probably about resource alloc...
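The Germanic-versus-Latinate measurement described for Broca can be sketched in a few lines. This is an illustrative toy, not the actual tool: the two wordlists below are hypothetical stand-ins for a real etymological lexicon, and a production version would also lemmatize words before looking them up.

```python
# Toy sketch of a Germanic-vs-Latinate root ratio, in the spirit of the
# "Broca" analysis described above. The wordlists are illustrative
# stand-ins; a real tool would use a full etymological lexicon.
import re

GERMANIC = {"use", "begin", "show", "get", "make", "think", "word", "write"}
LATINATE = {"utilize", "commence", "demonstrate", "generate", "analysis", "structure"}

def root_ratio(text: str) -> dict:
    """Count words with known Germanic vs Latinate roots and the Germanic share."""
    words = re.findall(r"[a-z]+", text.lower())
    germanic = sum(1 for w in words if w in GERMANIC)
    latinate = sum(1 for w in words if w in LATINATE)
    known = germanic + latinate
    return {
        "germanic": germanic,
        "latinate": latinate,
        "germanic_share": germanic / known if known else 0.0,
    }

sample = "I use plain words to show what I think, rather than utilize or demonstrate."
print(root_ratio(sample))
```

The interesting engineering work is in the lexicon and in lemmatization, not in the counting itself.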

In this conversation with Rohit Krishnan from Strange Loop Canon, we talk about our experience with frontier agents and the systems we’re building around them. My token usage jumped from 1 million to 100 million tokens a day in recent months because persistent agents on my machine are handling work that would have taken weeks. My agent went into our research backplane and wrote a market report better than GPT-5.2 Pro. We also dig into what an agent economy might look like: what happens when there are trillions of these systems, and what coordination infrastructure they’ll need. We think it starts this year.

Enjoy!

Azeem

In this live session, I'm joined by Jaime Sevilla, founder of Epoch AI, and Hannah Petrovic from my team, along with financial journalist Matt Robinson from AI Street. We dig into our recent research partnership examining OpenAI's actual operating margins, R&D costs, and whether the economics of frontier AI actually work. We explore the surprisingly short lifespan of AI models, infrastructure constraints, the shift toward agentic workflows, and what all of this means for the trillion-dollar question: is this sustainable or a bubble?

Enjoy!

Azeem