
Podcast Summary: Y Combinator Startup Podcast

Episode: "The CEO Must Be the Chief AI Officer"

Date: June 10, 2026
Guest: Pedro Franceschi (Co-founder and CEO of Brex)
Hosts: Y Combinator Partners (likely Jared Friedman and others)

Episode Overview

This episode dives deep into the integration of AI at the very core of modern companies, advocating that every CEO should act as the “Chief AI Officer.” Pedro Franceschi, CEO of Brex, shares Brex’s pioneering journey into agentic AI systems, practical insights for founders, transformative security models, and a bold vision for building startups where AI is not an add-on—but the foundation. The conversation is rich with tactics, real-world analogies, and practical quotes, all geared toward founders aiming to restructure company DNA around the new capabilities of AI.

Key Discussion Points & Insights

1. AI as the Foundation, Not a Feature

Pedro argues the CEO must actively guide, own, and experience the company's AI edge—far beyond just delegating to technical teams.
Founders should relentlessly ask: "Why can't you solve this problem with AI, and what are the bounds?"
- Quote: “You wake up, whatever problem you have in your life, why can't you solve it with AI and just like start there? I think the CEO needs to be the Chief AI officer. Like, it's not an engineering team thing.” (Pedro, 00:00, 39:20)
This mindset shift redefines a company’s very identity and the role of every exec.

2. Agentic AI: From Coding Harnesses to “Freeing the Claw”

The “Foxconn factory” metaphor: Most companies treat LLMs “preciously,” overconstraining their context and abilities, like squeezing factory efficiency at the cost of creativity.
- “You have to literally put the agent inside a Foxconn factory… waking up at 6am, if you don’t, you’re going to get electroshocked.” (Host, 02:08)
True power is unlocked by “freeing the claw”: Letting AI agents access tools, act more autonomously, with properly engineered boundaries.
- “Every single good AI product you've used is an agent loop with tools. That's it.” (Pedro, 02:45)
Brex’s internal agents reflect this philosophy, drawing on Pedro’s hands-on use of OpenClaw: robust, multi-modal, tool-using agents that function as virtual employees.
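
The "agent loop with tools" idea can be sketched in a few lines. This is a minimal illustration, not Brex's or OpenClaw's implementation: `scripted_model` is a hypothetical stand-in for a real LLM call, and the `CALL`/`FINAL` text protocol and the `TOOLS` registry are assumptions made only so the example runs on its own.

```python
from typing import Callable

# Hypothetical tool registry: plain functions the agent is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


def scripted_model(history: list[str]) -> str:
    """Stand-in for an LLM call (assumption, not a real API).

    Returns either 'CALL <tool> <arg>' or 'FINAL <answer>'.
    """
    last = history[-1]
    if last.startswith("TASK"):
        return "CALL add 2+3"
    if last.startswith("RESULT"):
        return "FINAL " + last.removeprefix("RESULT ")
    return "FINAL unknown"


def run_agent(task: str, model=scripted_model, max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it names, feed the result back."""
    history = [f"TASK {task}"]
    for _ in range(max_steps):
        action = model(history)
        if action.startswith("FINAL "):
            return action.removeprefix("FINAL ")
        _, tool_name, arg = action.split(" ", 2)
        tool = TOOLS.get(tool_name)
        result = tool(arg) if tool else f"error: unknown tool {tool_name}"
        history.append(f"RESULT {result}")
    # The step limit is the only hard boundary in this sketch.
    return "stopped: step limit reached"


print(run_agent("what is 2+3?"))
```

The harness stays small on purpose: the model chooses the next action, and the surrounding code only executes tools and enforces a step limit.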

3. Security & Network Layer Innovation (“Crabtrap” and Policy Engines)

Security is the critical constraint for AI agents in real-world enterprises.
- Early iterations: read-only agents using OAuth to access Pedro’s email, Slack, etc.
- To enable writing/updating by agents, Brex innovated at the network layer (not just in tool wrangling): every agent’s outbound HTTP request is proxied, audited, and subject to policy—using LLMs as real-time judges.
- Product highlight: Brex’s open-sourced “Crabtrap” tool analyzes, logs, and governs all agent network activity.
  - “You analyze, you HTTP proxy the entire network boundary of an agent… you can use an LLM as a judge.” (Pedro, 07:15)
This architecture lets Brex push the boundaries safely—automating recruiting, onboarding, and more.
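
The decision step described above (auto-approve what a recorded policy covers, send the rest to an LLM judge, log everything) might look roughly like the sketch below. It is not Crabtrap's code or API: the `Request` type, `ALLOW_RULES`, `stub_judge`, and the example hostnames are all hypothetical, and the judge is a plain function standing in for a model call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    method: str
    host: str
    path: str


# Hypothetical policy: (method, host, path prefix) triples that were
# observed in recorded traffic and judged safe to auto-approve.
ALLOW_RULES = [
    ("GET", "api.example.com", "/candidates"),
    ("POST", "api.example.com", "/notes"),
]

# Every decision is appended here so the traffic stays auditable.
AUDIT_LOG: list[tuple[Request, str, str]] = []


def stub_judge(req: Request, policy_text: str) -> bool:
    """Placeholder for an LLM-as-judge call; this stub only permits reads."""
    return req.method == "GET"


def decide(req: Request, judge=stub_judge,
           policy_text: str = "Recruiting agent: read and annotate candidates"):
    """Return (verdict, decided_by) for one outbound request."""
    for method, host, prefix in ALLOW_RULES:
        if req.method == method and req.host == host and req.path.startswith(prefix):
            outcome = ("allow", "rule")
            break
    else:
        # No static rule matched, so fall back to the judge.
        outcome = ("allow" if judge(req, policy_text) else "deny", "judge")
    AUDIT_LOG.append((req, *outcome))
    return outcome


print(decide(Request("GET", "api.example.com", "/candidates/42")))
print(decide(Request("DELETE", "api.example.com", "/candidates/42")))
```

Putting the static rules first keeps the common path cheap; only requests outside the recorded policy pay for a judge call.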

4. The Three Tiers of AI Adoption in Companies

Token Maxers: Elite engineers who maximize LLM usage.
Average Engineers: Use LLMs, but less intensely—a “tenth of the productivity.”
The Rest of the Organization: Uses AI in “Google search mode”—just chatbots or assistants.
True value comes in building “agent harnesses” for every department, enabling nontechnical staff to use AI agents as virtual team members.
- “What people really want is… a virtual employee almost that has this on Slack, it has an email, I can invite it to a meeting, can join a meeting, take notes…” (Pedro, 12:36)

5. Token Usage, Cost and Productivity Management

Most companies are far too conservative with LLM token usage. The “AI pill test”: do you default to AI first for every new problem?
Costs will drop, but usage and value will grow faster.
Brex built tools (e.g., “magbuy”) to allocate and analyze token spend productively.
- “The cost part is one. But even if you take the cost part aside, the first symptom is a lot more people should be complaining about the max plan limits.” (Pedro, 15:22)
The future: AI inference (token usage) will be the largest expense in the company.

6. Startup Building in 2026: Minimal Surfaces & “Napkin Ideas”

Avoid bloat: Great startups compress the surface area of their products—early Stripe, Brex, DoorDash all started as basically APIs or simple forms.
Risk with AI: The power to automate everything easily can lead to founder distraction and lack of discipline. Compress problems, keep MVPs small.
- “Intelligence is compression. When someone comes to pitch me an idea… great ideas fit in a napkin. What’s your napkin?” (Pedro, 19:55)
Pivot and idea exploration: AI makes it easy to parallelize many ideas, but core insight still comes from signals the models cannot see—i.e., true customer conversations.

7. Limits of AI: Out-of-Distribution, The Curse of Unknowing

LLMs only know what’s in their training data—"blind spots" are critical and subtle; founders’ jobs are to uncover them by talking to customers.
Meta-insights, not magic: “LLMs are trained on a very specific corpus... you have no sense of how much training data the model has seen for the exact thing that you're asking it.”
- “If every time you asked an LLM a question, it gave you the sampling frequency in its dataset… it would change what you trust.” (Pedro, 24:05)
Models will always be biased by who built them ("AI Capex" example, 28:10).

8. Rebuilding Company Fabric: Rethink Before Layering on AI

Don’t just “bolt on” agents—redesign workflows from scratch.
- Example: Instead of just automating KYC, Brex reimagined customer onboarding so agents can risk-qualify leads before they even become customers. That changes the funnel, not just the efficiency.
The company must “refound” itself, realigning around new AI-native approaches.

9. CEO as Chief AI Officer

The CEO must personally internalize the limits, capabilities, and boundary conditions of AI, because only the CEO can drive deep, cross-organization realignment.
- Quote: "The CEO needs to be the Chief AI officer... you have to understand the bounds of a technology better than anyone." (Pedro, 39:20)
Top-down, this enables massive process reimagining and accelerates “breaking glass,” overturning inertia within the org.

10. Company as a Federation of Domain-Specific Agents (“Virtual Exec Team”)

Instead of one monolithic company model, build virtual employees/domains: agents for customer success, agents managing roadmaps, operations, etc.
- “Domain specificity matters… how do I build an agent or virtual employee that is exceptional at understanding everything about this customer?” (Pedro, 43:57)
True value comes when these building blocks displace real human workflows and are measured by actual usage and time saved.

11. Continuous Improvement: Evals and Dream Cycles

Every human-agent interaction is turned into an eval/test case, with automation closing the feedback loop for learning and self-improvement.
- “The goal… is to make the whole thing a self-learning system.” (Pedro, 47:31)
The “dream cycle” ensures agents can nightly sift feedback, look for patterns, and update themselves.
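
One way to picture "every interaction becomes an eval" is the sketch below. It is an assumption-heavy illustration rather than Brex's system: `EvalCase`, `case_from_interaction`, and `pass_rate` are hypothetical names, and exact string matching stands in for whatever grading a production setup would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EvalCase:
    prompt: str
    expected: str


def case_from_interaction(prompt: str, agent_answer: str,
                          human_correction: Optional[str] = None) -> EvalCase:
    """Turn one logged interaction into a test case.

    If the human corrected the agent, the correction is the expected
    answer; otherwise the accepted agent answer is.
    """
    expected = human_correction if human_correction is not None else agent_answer
    return EvalCase(prompt, expected)


def pass_rate(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases where the agent reproduces the expected answer."""
    if not cases:
        return 0.0
    passed = sum(1 for c in cases if agent(c.prompt) == c.expected)
    return passed / len(cases)


# Two logged interactions: one accepted as-is, one corrected by a human.
cases = [
    case_from_interaction("2+2", "4"),
    case_from_interaction("capital of France", "Lyon", human_correction="Paris"),
]

# A toy agent that still makes the old mistake.
old_agent = lambda p: {"2+2": "4", "capital of France": "Lyon"}.get(p, "")
print(pass_rate(old_agent, cases))
```

A nightly job in the spirit of the "dream cycle" could rerun `pass_rate` after each change to prompts or skills and flag regressions.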

12. Human Bottlenecks, Future UIs, and Personalization

Context management—organizing the info the model sees—is the key bottleneck now.
Voice memos, cross-modal UIs, and personal retrieval engines (e.g., “gbrain” ingesting 60GB of Google Takeout data) will define next-gen “personal agent” productivity.
Creativity and serendipitous connections are supported by features like “Lateral Synaptic Drift”—forcing agents to combine ideas in new, orthogonal ways.

13. Final Advice to Founders

Embrace the “electricity analogy”: it's still just the beginning!
Play, measure, and push boundaries:
- Put a sticky note on your computer: "Why can't this be solved with AI?"
- Build personal projects to push the limits and develop deep “texture and feel” for what’s possible.
Spend your unique time on what only you can do: deciding what is worth solving, and creating new signals the models can’t access.
- “At a limit. I think the question is, how do you spend your time on the things that only you can do as a founder?” (Pedro, 52:40)

Timestamps & Memorable Moments

| Timestamp | Topic / Quote |
|-----------|---------------|
| 00:00 | “You wake up, whatever problem you have in your life, why can't you solve it with AI... I think the CEO needs to be...” (Pedro) |
| 02:08 | “You have to literally put the agent inside a Foxconn factory… if you don’t, you’re going to get electroshocked.” (Host) |
| 07:15 | Security innovation: “You analyze, you HTTP proxy the entire network boundary of an agent… you can use an LLM as a judge.” (Pedro) |
| 12:36 | “What people really want is… a virtual employee almost... that has an email, I can invite it to a meeting...” (Pedro) |
| 19:55 | “Intelligence is compression…great ideas fit in a napkin. What’s your napkin?” (Pedro) |
| 24:05 | “If every time you asked an LLM a question, it gave you the sampling frequency…” (Pedro) |
| 28:10 | Model bias example: “...the people that are building the models fucking only think about AI Capex.” (Pedro) |
| 39:20 | “The CEO needs to be the Chief AI officer... you have to understand the bounds of a technology better than anyone.” (Pedro) |
| 47:31 | “The goal… is to make the whole thing a self-learning system.” (Pedro) |
| 51:38 | Parting advice: “Put a sticky note on your computer: why can’t you solve it with AI and start there…” (Pedro) |

Notable Quotes

“You have to sort of refound the very concept of what the company self identity is.” (Pedro, 00:15, 39:58)
“LLMs are not magic... You have no sense of how much training data the model has seen for the exact thing that you're asking it.” (Pedro, 24:32)
“You’re working for the LLM to some point. In a bigger company, you’re in a turnaround to put the LLM as almost the founder and the CEO; you’re architecting the entire company around that idea.” (Pedro, 52:50)
"Intelligence is compression." (Pedro, 19:55)
“The company builds antibodies against any sort of disturbance to the social cohesion… making escalations faster is key.” (Pedro, 43:11)

Conclusion & Advice for Founders

Pedro Franceschi’s core message is radical, practical, and urgent: the next generation of iconic startups—and perhaps the survival of existing ones—depends on AI as the organizational substrate, not a bolt-on. Only founders willing to plunge hands-on into the limits, risks, and architecture of AI will be able to redesign their companies for the coming era.

Takeaway:
Start with the question “Why can’t I solve this with AI?” for every problem. Relentlessly push the boundaries, compress your projects, and spend your own founder energy where only humans, not LLMs, can—the strategic choices and the founding wisdom.

For more, listen to the full episode or follow up with YC's upcoming Startup School.


Transcript

[00:00]
Pedro Franceschi
You wake up whatever problem you have in your life, why can't you solve it with AI and just like start there? I think the CEO needs to be the Chief AI officer. Like, it's not an engineering team thing. It's not like a product team thing. It's like you have to understand the bounds of the technology better than anyone. I think a good proxy for how to spend your time is what are things that only you can do, the models cannot do. You have to sort of refound the very concept of what the company self identity is.
[00:32]
Host 1 (possibly Jared Friedman)
Welcome back to another episode of the Light Cone. Today we're joined by Pedro Franceschi, co-founder and CEO of Brex. Pedro started Brex in the YC Winter 17 batch and built it into one of the most important fintech companies of the last decade. He's here today because Brex has gone deeper on AI than almost any enterprise company we know. And Pedro's own AI setup is so compelling that when he came to YC for lunch, it sent our entire team down a rabbit hole of building on their own. So, Pedro, welcome to the Light Cone.
[01:07]
Pedro Franceschi
Thanks for having me. Excited to be here.
[01:08]
Host 2
Thanks for changing our lives. Yeah.
[01:10]
Pedro Franceschi
Oh, God, that lunch. I'm like, I think the model company should be sponsoring me for the token consumption increase we supposedly generated on that lunch. That was the precursor of GBrain, I guess.
[01:23]
Host 1 (possibly Jared Friedman)
I was still working on GStack. I was still a 2013 Web 2.0 engineer who time traveled instantly to the AI tools of January 2026. I was, you know, probably half a million lines of Rails code in, and I could create GStack because of that, to, like, help me make a software factory.
[01:42]
Pedro Franceschi
Yeah.
[01:43]
Host 1 (possibly Jared Friedman)
And then after I met you, I realized everything is about freeing the claw.
[01:48]
Pedro Franceschi
Free the claw.
[01:48]
Host 2
I knew you were going to say that.
[01:49]
Pedro Franceschi
Yeah.
[01:50]
Host 1 (possibly Jared Friedman)
And then.
[01:50]
Pedro Franceschi
And give it tokens. Yeah.
[01:52]
Host 1 (possibly Jared Friedman)
Well, no, I mean, let it rip. The craziest thing was realizing, like, what I had gotten wrong, and that I think actually most people in software are still getting it wrong, is, you know, they've been treating the LLM like this very precious thing that's very expensive.
[02:08]
Pedro Franceschi
Yeah.
[02:08]
Host 1 (possibly Jared Friedman)
And so as a result, you have to literally put the agent inside a Foxconn factory. And it's like, like, can you imagine? Like, I mean, that's what the half a million lines of Rails code was for me. It's like, no, no, I need to control what the LLM sees because it's about really, really, like, I only want the context, like, from here. And let me write, like, all the if statements to make sure. Like, you know, like a Foxconn engineer, you're waking up at 6am and, you know, if you don't, you're going to get electroshocked. I mean, like, this terrible thing that you do to agents.
[02:42]
Pedro Franceschi
Yeah.
[02:42]
Host 1 (possibly Jared Friedman)
And they want to be like at the Esalen Institute. And that's what OpenClaw is.
[02:46]
Pedro Franceschi
Exactly. Exactly. And it's funny because I feel like every single good AI product you've used is an agent loop with tools. That's it. You try to sort of over engineer the harness and then do certain things, but at the end of the day it's skills, tools and a model. Like there's not really much else.
[03:03]
Host 1 (possibly Jared Friedman)
Maybe we start earlier because one of the things we'd love to kind of get down as a part of lore is like, how did you get so AI pilled and like all the way to the edge?
[03:14]
Pedro Franceschi
Well, I'll tell you my encounter with LLMs. I remember in the pandemic, someone gave me API access to GPT-3 and I was playing with it and I was like, okay, this is really cool, there's something here that could be special. But it was the kind of thing that was like, yeah, it feels like a research project, the kind of thing that Google used to release and you play with it for 10 minutes and you stop. ChatGPT came out and I think everybody was sort of interested in it. Where I think it got interesting was when you started to see reasoning models and of course tools. But I think everything else was sort of a blip until December. And the way I describe it to my team is like, electricity was invented in December. And I think electricity was Opus 4.5. And sure, Opus models and OpenAI models got better and better since then, but to me that was the tip of the spear where you could say, yes, coding harnesses actually work. And Claude Code existed for probably a year before, but it wasn't that valuable yet. And I remember during the holiday break I was playing with it and it was pretty shocking, probably similar reaction that everybody here had. And I think the question becomes, if you think about, you're sort of standing, looking at 200 years of history and then you imagine you are, we're now in May, you're sort of five or six months after electricity was invented and most people are still playing with candles and questioning what can you do with candles and fire and who needs light? Yeah, exactly. What about these lanterns and what can you do with it? And the steam engine is like, I don't know, maybe like 20 years away still, but electricity already exists. That to me was the sort of the fundamental light behind it. And I would say, I think since then OpenClaw was kind of an interesting sort of next step. Which is I think when we realized that the reality is good AI products are agentic loops with tools.
And we started doing this in our own product at Brex. But then on the personal side I started spending a lot of time understanding, okay, what is at the frontier of using OpenClaw. And I think the insight was just, yeah, like markdowns can take you really far. Just like configuring and automating a lot of the things in your life. It's kind of funny, I remember I had this experience of like buying a movie ticket entirely in OpenClaw using like a Brex card. It was provisioned through an API, and then I showed it to my team and they were like, oh, but like you can go online and like book it in 10 seconds. And I'm like, that's not the point you're missing. You're completely missing the point. But anyway, and then I went obviously very deep in this rabbit hole and started spending a lot of time thinking how to change the fabric of the company and the way we build the products.
[05:58]
Host 2
And tell us about your personal OpenClaw journey, because before you came for lunch, I had it, like, installed, but I was like way too scared to do anything with it. We were all scared.
[06:06]
Pedro Franceschi
Yeah. Don't get me wrong, we deal with financial services data. We spend a lot of time figuring out how to be mindful of security and protection. And yet I think people are a little bit more risk averse than the technology probably requires them to be given where the technology is. And when we started using OpenClaw personally, I started doing it on a lot of my own personal setup. Basically what I did for the V1 was, I want to give it read access to everything and just create OAuth tokens to my email, to Slack and to everything, to just literally not write. And I was kind of shocked how far it got me. And then the next question that we spent time on at Brex was, okay, how do we actually get it to write into our systems? And everybody, including our security team, was, well, we cannot do that for all the reasons that we know. And then basically where I spent, I don't know, probably four weeks of my time was, okay, let's solve the hardest problem, which is security. And we ended up realizing that the only way to actually do something about it was to do something in the network layer. And if you treat the agent like, you know, the agent has its own wills, desires, and, you know, they go to the Esalen Institute for agents instead of Foxconn's factory, they will try to do things at a network boundary that might not be the right ones. And we decided to actually just focus on that. So a lot of folks were, you know, and we saw Nvidia and others on Nemo Claw, let's build these like open shell forks that have controls over, you know, tools the model calls. And the reality is, yeah, you can do all of that, but you can also just make an HTTP request wrong. So we focused on that layer and then we built this thing called Crabtrap, which we open sourced probably about two months ago, which is actually the way we use to secure agents at Brex in production. And the basic premise is you analyze, you HTTP proxy the entire network boundary of an agent.
And the idea is when a request goes through that becomes auditable. And you basically can use another agent to analyze the traffic and create a policy to let traffic go through or not. And surprisingly or unsurprisingly, because these models are trained on hundreds of billions of web documents, HTTP traffic is actually, I would say, probably the way the models reason more so than anything else, because they just literally learn on the web. So the ability of the model to watch like a thousand requests and make sense of what's happening was way higher than we anticipated. So we actually built that, put that in production at Brex. And after you record the traffic of an agent operating for a day, you can build a pretty good policy that sets things that should be automatically approved, and for things that the agent isn't really sure, you can just use an LLM as a judge. And the LLM determines, is this request something that should be approved or not based on the policy for what that agent should be doing? So, for example, we have a recruiting agent at Brex called Jim. We have a policy for Jim, and all the traffic goes to that same policy. And 98% of requests go through automatically. 2% use an LLM. So we sort of got that problem solved to a degree that we got comfortable experimenting much more aggressively and sort of freeing the claw on the enterprise, which is really hard inside Capital One. So I would say if we found a way to experiment with these things, and granted we don't do the most aggressive things with this stuff yet, we don't use it on customer data to the degree that we want one day to do and there's boundaries to how we do it. I don't see any reason why a YC company shouldn't be at the bleeding edge of this stuff.
[09:38]
Host 2
Yeah, I mean, I think your intuition around the proxy at the network level ended up being quite prescient. I think a lot of the stuff that I'm seeing kind of around the OpenClaw ecosystem at the moment at least, or just the agent ecosystem, is essentially doing that. We're seeing that with credential brokering; Agent Vault is doing a lot of that. I think you had mentioned the first version of Crabtrap included Credentials Vault. Why did you decide not to include that?
[10:02]
Pedro Franceschi
I think it was just, let's just do one thing really well and, you know, at the end of the day I think there's going to be a lot of solutions that do that. You could do credential brokering in other tools already, but the LLM as a judge was for us the determining capability to say, do you trust this in production or not? And our security team at Brex, very rigorous and very good at what they do, for a long time were, well, you know, not really. So getting them to a yes, we actually believe this is enough, was a big unlock for us. And look, I always say this, like, we're not in the business of building HTTP proxies. We are in the business of being at the bleeding edge of what you can do with AI. And to get to the bleeding edge required us to build this proxy. That's why we did it. Hopefully someone's going to build a YC company. Hopefully they're just going to build a better version and we're just going to go use it. But at the end of the day, that's the journey that took us to just sort of being at a bleeding edge in a way.
[10:52]
Host 2
And how much was you pushing this forward and how much resistance did you get internally and just how did you get AI pilled?
[11:01]
Pedro Franceschi
I think there was a lot of excitement about it. But the way I describe AI adoption inside most companies is I think there's like sort of three tiers. There's tier number one, which is your token maxers, like your engineers that are pushing a bunch of code and typically living inside coding harnesses. And those are sort of well known. We know who those are. Then you have the sort of average engineer that is building a few things, but not a token maxer to the same degree and probably, I don't know, a tenth of the productivity. And then you have the entire rest of the company. And the entire rest of the company typically is interacting with AI in what I call Google search mode, which is a chatbot with a few MCPs or a G Suite equivalent. You have a few tools from Google, but at the end of the day it's really just a search. And I think our thesis was if you think about the value that AI creates for like a token maxer, for example, a lot of the value comes from the harness. And the thesis was how do you actually build an equivalent harness for other teams that are non technical? And our whole sort of thinking behind it was like that's a lot of what OpenClaw created, which is this ability that you can self bootstrap a lot of the capabilities of the agent by the way you edit your skills and markdowns and sort of set up the environment around the agent. And how far can we get this ability for the agent to self bootstrap a capability without anyone actually going and coding it by hand? So the analogy we use internally for I would say the sort of company wide adoption of AI is we don't believe in the yes, give people a few MCPs and let them go. Because I think what people really want is in my opinion is really a way of saying, okay, this is actually a virtual employee almost that has this on Slack, it has an email, I can actually invite it to a meeting, can join a meeting, take notes and you're trying to replicate that as much as possible.
So how do you build the infrastructure to support that kind of use case? And I think the harnesses will look a little different and probably more like OpenClaw than a coding model.
[13:05]
Host 1 (possibly Jared Friedman)
Jared and I just did this this week for the first time where we installed Aquavoice and then you open Telegram with the claw or actually we have it in Slack now. And then basically it was like me and Jared and like three engineers and someone from the events team and we're trying to put together how do we put together 60 dinners with 20 people each of attendees from startup school. Nice. With 21 partners and visiting partners at YC.
[13:34]
Pedro Franceschi
Sounds like a great problem.
[13:35]
Host 1 (possibly Jared Friedman)
And then we just basically started talking about it and then I picked that up and I pressed enter and then, you know, our claw just started doing it. None of us opened Claude Code. Like it just sort of built a bunch of markdown. It did the analysis.
[13:50]
Pedro Franceschi
And yeah, people forget that Claude Code isn't magic. It's just literally a harness around the same models you can use in an API. Right. So I think that's the unlock of. And by the way, there's a few things that Claude Code is doing that I think are really cool.
[14:01]
Host 1 (possibly Jared Friedman)
Oh, they're amazing.
[14:02]
Pedro Franceschi
And yet it's just a harness.
[14:05]
Host 1 (possibly Jared Friedman)
And Claw can use Claude Code.
[14:07]
Pedro Franceschi
Exactly.
[14:08]
Host 1 (possibly Jared Friedman)
And Codex. Right. It really prefers to use Codex these days.
[14:11]
Pedro Franceschi
Exactly.
[14:12]
Host 3
Everything, actually.
[14:13]
Pedro Franceschi
I don't know why exactly, but ACP helps on that.
[14:17]
Host 1 (possibly Jared Friedman)
Yeah, ACP is good.
[14:18]
Host 3
Why do you think the adoption of token maxing hasn't really taken off? The thing that we've found very curious working with a lot of startups early on is that a lot of founders are very shy about burning tokens. I think you really get to experience this when you really go all the way.
[14:36]
Pedro Franceschi
Gary mentioned this point, which is tokens are expensive. And I think there are, you know, I'm in a fortunate position to be able to spend on tokens. But I would say. I keep trying to picture myself. Imagine if I was like 14 or 12 when I started coding for real and I had the technology we have now. I would be token maxing in the cheapest way possible. And there are people doing that. You know, you look at the Chinese models, for example, like they're. They're pretty decent.
[15:00]
Host 1 (possibly Jared Friedman)
There's a huge hobbyist community where they, you know, build a gaming rig, but then they try to build like local LLM.
[15:08]
Pedro Franceschi
Yeah.
[15:08]
Host 1 (possibly Jared Friedman)
And then that actually is like, totally reasonable way to do it.
[15:11]
Pedro Franceschi
100%. 100%. I have a friend that has the exact same setup. He has his like little GPU farm in his house. And first time I went there, I was like, wow, heating is on here.
[15:21]
Host 1 (possibly Jared Friedman)
It's really warm. It's really hot in here.
[15:23]
Pedro Franceschi
And he's like, no, no, it's my GPUs. And I was like, great power efficiency all the way through. It's funny because at Brex and we should talk about managing token costs and spend management for tokens, which is a topic we're spending a bunch of cycles on now. I think the cost part is one. But even if you take the cost part aside, the first symptom is a lot more people should be complaining about the max plan limits. And you see, what's the percentage of Twitter that probably complains about it? Like 0.1%. So I think people are probably still early. To me, there's this like the AI pill test, in my opinion, is whatever problem shows up in your life, do you default to AI first or not? It's like, of course, mechanically you can do it, but there's a point that it becomes like second nature. And then your whole brain gets rewired and you cannot think in a different way. And there's the whole topic about AI dependency, human machine interaction. There's all these things that we can talk about and put in the corner. It still sort of surprises me how many people you go talk to about a problem. And I'm like, it's so cheap to intimately understand the bounds of this problem. Now why haven't you done that yet and come in with a much more digested view on the problem? And I think the second thing is I think if you have the luxury of building a company now, the fabric of the company from day one can be built in such a different way that I think if I were to start a company today, I would say, okay, the premise is why can't it be just me? And then you start from there and your token consumption is probably going to be a lot higher than if you said, well, I'm going to have three people or five people or seven people. And I think the fundamental constraint isn't as much, in my opinion, like, oh, like AI as a cost savings or I'm going to be more efficient. 
I think the unlock is like the fabric of the company just looks very different when the boundaries become type systems, interfaces, agents talking to each other versus people. I think people still didn't fully grasp by okay, what does it mean to build code with new agents and the new technologies we have? I think that's well understood. But how to live in a world where intelligence is on a tap and your default answer is, let me actually solve this problem with AI first, even if you feel suboptimal and then from there saying, okay, how do I actually make it optimal? Because I think for the majority of problems there is a way to solve it with AI that is probably better and your job is to figure that out even if it's going to take you more time because that will compound.
[18:04]
Host 2
YC Startup School is back. We're hand selecting the most promising builders in the world and flying them out to San Francisco for July 25th and 26th to discuss the cutting edge of tech and startups. Apply now for your spot. When you started Brex, I mean, like, it's well known, like, your MVP had no web UI, right? Which is like all terminal, like super scrappy.
[18:27]
Pedro Franceschi
Yeah, today would have because yeah, that's what I'm curious. No one needs HTML and CSS anymore.
[18:31]
Host 2
Like was it actually still the right approach? Just have a really simple MVP and test that anyone works, or would you have like a way more fully featured?
[18:38]
Pedro Franceschi
So I have this controversial view, which maybe you all will disagree with, which is like, I actually think if I look into a pattern of companies that succeed, I think there's a really interesting pattern which is minimal surface area. And the problem is with AI, I think you see like look at Stripe, for example. Stripe, early days was like literally an API. Brex in the early days, no UI, just like literally a terminal. You look at Airbnb, it's like the website was a form and the form was just like literally where you inputted what you needed and then someone somehow went there and figured out how to actually make the booking happen. Like DoorDash in the early days. Similar, right? Like it was just like literally. So the surface area was so small with the customer and so much of the intelligence and the bandwidth of the founders were spent nailing this one single interaction pattern. And I think the risk with AI is that the agency behind choice goes away. So you have this lack of discipline on what matters to solve. And I think people tend to believe that I can just experiment a lot of things. And that's absolutely true. But that doesn't preclude you from actually choosing what matters. I always tell people, I think if you can't minimize your surface area and solve the problem with a very clear set of boundaries, you haven't found the right problem to solve. And you can of course find how to compress the problem into a smaller surface area using AI. And that's really valuable. But I don't think you should use it as an excuse to not do that. Which I think is, well, I can just build so many other things. But you know, I always tell this to people, like, intelligence is compression. So when someone comes to pitch me an idea in the company, I'm like, it has to fit in a napkin. Like great ideas fit in a napkin. What's your, what's your napkin?
And then someone comes with this and I'm like, I don't know where you buy napkins, but the ones in my house are not the size.
[20:29]
Host 2
How about the step before it, then? Actually, a lot of the pivot advice I give founders during the batch comes from you talking about how you found the Brex idea. And the approximate view I remember is that you thought about it as like two-week cycles, and you're either in exploration or exploitation mode, and you're trying a bunch of things, but then you want to hone down. Would you still use that pattern now, or would you?
[20:50]
Pedro Franceschi
100%. I think one of the hardest things of building a company is talking to customers, and not just having the conversation, but how to extract the sort of unspoken signal from these conversations. And I think, to take the "can AI solve this" lens: whatever problem shows up in your life, can AI go solve that? And you think about building a successful company, why can't you prompt your way into that? And the reason is very simple: it's because there's signal that the models weren't trained on. And the signal is, when you go talk to a person and they tell you about a problem they have, they're not going to give you the answer shaped into a prompt that you can put into an LLM, so that the LLM goes and outputs the product that's going to win and be a billion-dollar company. They're going to tell you a very sort of local-optimum answer based on their worldviews and their constraints and the way they see things. And I think a lot of the job now is to have the wisdom to choose what you want. Because before, the wisdom was not just to choose; it was to choose and know how to execute it. The execution is out, right? The execution is gone, and the model is going to do that better. The wisdom to choose is still, I think, the missing bottleneck. And to me that all comes from which signals are not in the models.
[22:02]
Host 2
So say like pre AI, you had personal bandwidth to explore like three ideas in parallel. You're saying like now in AI world you'd still do three in parallel, or would you like 30 and let the models try?
[22:14]
Pedro Franceschi
And the way I would probably approach it is, let's pick a broader universe of things to do, sort of an early initial exploration. But to me the lens is: okay, why can't AI solve it? And which signal is not in the model? And I think the signal is typically the customer. And then when you go talk to the customer, I think I wouldn't parallelize that. Probably I would be like, okay, let's try to get in the headspace of this person. And we did a lot of exploration with synthetic customers and building customer world models and things like that. And those are really valuable once you know a lot about the customer. But when you don't know enough yet, I think there is this very basic thing. Even at Brex, for example, one of the hardest things for us as a company was that we initially sold to founders. We're founders. We knew about ourselves, we knew about our problems. And then as the company got bigger, we were selling to finance teams. And finance teams are different. So building that mental model of what the value system is, of course, you can eventually make the model represent that and have that worldview. But there's an intangible that I think is where a lot of the alpha still comes from. And I think a good proxy for how to spend your time is: what are the things that only you can do? And even in a company of one, what are the things that only you can do, that the models cannot do? And that to me is one of them.
[23:31]
Host 3
I think that's so on point. I think a lot of founders like you that successfully navigated a pivot have this loop. Basically, there's this book, Others in Mind, from psychology, that has to do with how people that have very good emotional connection with people are able to simulate what the other person is thinking, the whole theory of mind.
[23:51]
Pedro Franceschi
Exactly, exactly.
[23:52]
Host 3
And I think the founders that get that and have the empathy to figure out what the customer is not verbalizing are the ones that, I think Gary says this, make the implicit explicit, 100%, of what all those desires are.
[24:06]
Pedro Franceschi
100%.
[24:07]
Host 3
And they're very subtle signs a lot of time, because they're murmurs as founders go through them and figure out the insights like, oh, is this really a thing? But how do you know when to poke for it?
[24:16]
Pedro Franceschi
Exactly.
[24:16]
Host 3
And the problem with relying on models right now, and I'm still very optimistic that there's still a lot of job for founders, oh, definitely, is you don't even know what the right incantation or set of prompts is to ask the model, because you don't even know what to ask.
[24:30]
Pedro Franceschi
Exactly.
[24:31]
Host 3
There's like another meta layer.
[24:32]
Pedro Franceschi
Yes. It's the whole Elon thing of which question is the universe the answer for? Kind of. And of course, these are generalities, right? But I think what I've seen is you have to remember that LLMs are not magic. LLMs are trained on a very specific corpus of information, optimizing for a very specific set of benchmarks and outcomes. And I think the biggest pitfall of LLMs is you have no sense of how much training data the model has seen for the exact thing that you're asking it. So imagine if every time you asked an LLM a question, it told you, like, "Yeah, the sampling frequency of this in my dataset was, I don't know, X, and on this other answer it was 0.00001x." Your trust would be very different, right? The distribution's so different.
[25:21]
Host 1 (possibly Jared Friedman)
Oh, I would pay for that. That's a great startup idea.
[25:24]
Pedro Franceschi
Exactly. Someone should do that.
[25:25]
Host 1 (possibly Jared Friedman)
We need to do a model that does that.
[25:27]
Pedro Franceschi
Yeah, 100% I would pay for it.
[25:29]
Host 1 (possibly Jared Friedman)
Yeah, well, because it's fascinating because then anything that's out of distribution, you just go and fill that in. I mean, actually, as an applications engineer on top of the LLMs, that's actually a huge blind spot.
[25:42]
Pedro Franceschi
And that's what Mercor and a lot of the other data companies are doing. A lot of the job for them is to say, well, what are the blind spots for LLMs? And it's funny, I think a lot of the data labeling companies right now are trying to understand the pitfalls in the models. But the problem is, in order to do that, you have to be an expert to know what the gaps are in the answers. And the problem as a founder, when you're looking for an idea, is you know nothing about it. So there is a curse of knowledge and a curse of not even knowing what the bounds of knowledge are, which I think can make you believe that you understand something that neither you nor the model actually understands.
[26:17]
Host 1 (possibly Jared Friedman)
Can I confess something weird? After creating gbrain, I do use AI in a different way, where now that I have a retrieval system that is actually usable, if I have a problem or question about anything... Like for instance, I was trying to work on a really, really, like, the last humanized prompt, and you know, a lot of that stuff probably isn't in distribution yet. Like there's a whole Wikipedia article about, you know, characteristics of AI writing. But now I can just go tell it, like, go spend a day on deep research. Literally every single paper, article, read everything, put it into my git repo, and then I'll be able to retrieve it and summarize it into something that actually is usable. And so that's sort of like filling in, 100%, the things that are out of distribution. I can sort of pack it with whatever context, and you can do that with anything. Like if you're interested in, you know, running a restaurant, you could go and read, like, the top 500 books about what it's like to run a restaurant, and you would have the compendium of all information about it.
[27:27]
Pedro Franceschi
Yeah, and for example, one of the things that we do at Brex now is building this customer world model, which is a similar idea, where we're trying to get every single touch point that the customer has with us. Literally, how many times they click a button on the dashboard, all the way to what they tell someone on an email or what they say on the phone or on a call, and ingest that and consolidate the data: okay, what should this customer need next from us? What should this customer be thinking about? What are the issues that they will face but haven't faced? And again, it's just a distribution problem.
[28:01]
Host 1 (possibly Jared Friedman)
This is actually an answer to one of the questions, which is like, will there be jobs? Or whatever. It's like, as long as there are limits on RAM, actually, there will be. So I don't know. I mean, it's kind of an interesting one, right?
[28:16]
Pedro Franceschi
I think so.
[28:16]
Host 1 (possibly Jared Friedman)
Literally, you can't have a model that has enough parameters that could, like, have everything that you could possibly need in distribution. Like, there aren't enough atoms in the universe. Right. It's like a modeling problem.
[28:27]
Pedro Franceschi
I think we forget about the world models on which the models are trained. There is something in how the designers of the models influence the way the model actually behaves in the end. So one of the things that we spend a lot of time thinking about is how to make LLMs work for people that look very different from us. How to make LLMs work for the average finance person in the US, when you're asking for an answer and the model defaults to AI CapEx as a default category, for example. That's a really funny example. I was playing with AI for accounting categorization, and the model was just writing prose, and the first example it gave was AI CapEx. And I'm like, oh, why is AI CapEx the first example it comes up with? Because the people that are building the models fucking only think about AI CapEx, right? So there are things like that that I think are kind of interesting to think about. The mental models of the models, I think, are out of the box more biased than we may give them credit for.
[29:28]
Host 1 (possibly Jared Friedman)
I mean, speaking of AI Capex, like earlier you're saying, you know, we're. We're so early still. I don't know. The funniest thing about AI to me is how often I find myself thinking crypto maxims. Yes, this is the worst the models will ever be.
[29:43]
Pedro Franceschi
Yes.
[29:44]
Host 1 (possibly Jared Friedman)
My favorite now is telling people who hate AI coding like have fun coding at 1x speed.
[29:49]
Pedro Franceschi
Exactly, exactly. I was telling a friend about, you know, how to be long inference. That's basically the thesis that there's going to be a lot more inference than people think, and people are already expecting a lot of inference. If you just look at public markets and the semi supply chain, all that.
[30:05]
Host 1 (possibly Jared Friedman)
People are saying like 10,000x.
[30:07]
Pedro Franceschi
Yeah, but the underwriting, which is kind of funny, is, I think there's this one image: 2,500 dots, each dot is 3.2 million people on the planet, and basically 84% of the world has never used AI. 16% have used a free chatbot at least once. Then 0.3%, which is I guess six or seven squares, pay 20 bucks a month for AI, and one box out of the 2,500 actually uses agents in whatever capacity. So that's the argument to be long inference. And I think it's just starting out. And a funny thing on this is I think it will be the biggest expense in a company, easily. Right? And yes, there's a lot of margin in tokens right now, but people always want to be at the bleeding edge. And even if token costs decrease by 10x, they're going to have 10x more usage. So it will still be a large cost. And we're spending a lot of time thinking about how to help companies actually manage token spend. At Brex, we ended up building our internal version of this, we call it magbuy, where the idea is that every dollar of token spend in the company you can effectively attribute to a product we have, to customers, to an internal tool that we use to serve them, or to an internal employee, and understand model usage, et cetera. And we're now figuring out how to build analytics on what we are trying to do with the tokens, to start to get a sense of ROI. But anyway, it's a fascinating topic where I think the work so far is very early compared to what it will be one day.
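A minimal sketch of the attribution idea Pedro describes, where every dollar of token spend rolls up to a product, a customer, an internal tool, or an employee. This is not Brex's internal tool; the `TokenUsage` record, the owner categories as strings, the model names, and the per-million-token prices are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """One metered model call (hypothetical schema)."""
    owner_type: str   # "product", "customer", "internal_tool", or "employee"
    owner_id: str
    model: str
    input_tokens: int
    output_tokens: int


# Made-up prices in dollars per million tokens: (input, output).
PRICES = {"model-a": (3.0, 15.0), "model-b": (0.5, 2.0)}


def cost_usd(usage: TokenUsage) -> float:
    """Dollar cost of a single call under the assumed price table."""
    price_in, price_out = PRICES[usage.model]
    return (usage.input_tokens * price_in
            + usage.output_tokens * price_out) / 1_000_000


def attribute_spend(events: list[TokenUsage]) -> dict[tuple[str, str], float]:
    """Sum spend per (owner_type, owner_id) so every dollar has an owner."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for usage in events:
        totals[(usage.owner_type, usage.owner_id)] += cost_usd(usage)
    return dict(totals)


events = [
    TokenUsage("product", "expense-agent", "model-a", 1_000_000, 200_000),
    TokenUsage("employee", "alice", "model-b", 2_000_000, 1_000_000),
    TokenUsage("product", "expense-agent", "model-b", 1_000_000, 0),
]
spend = attribute_spend(events)
for owner, dollars in sorted(spend.items()):
    print(owner, round(dollars, 2))
```

Getting from attributed spend to ROI would need a second table of outcomes per owner (hours saved, revenue, tickets closed), which is the part Pedro says is still being figured out.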
[31:41]
Host 4
Can you share any of the data that you've gotten from Brex about just what token spend is like in the economy?
[31:46]
Pedro Franceschi
It's increasing. No, look, I think two things are surprising. One is, to your point earlier on how do we look at token maxing, I do think there's such a thing as how much the cost boundaries you create internally dictate token consumption, obviously. But to me what's most fascinating is when you look into the sort of 10-mile radius we're in now, and maybe you include New York: tons of token consumption. And you could probably argue, and we see it in the data, also faster revenue growth. What's really interesting is the gap between anyone in these two 10-mile radii and everything else. And this is not small companies. You look into very large companies with very large budgets that could be token maxing, and the economic thing for them to do would be to token max. And they spend like, I don't know, 10,000 a month. And you're like, you should probably be spending 10 times more, or 20 times more, or 100 times more. That's still surprising. And I think the reason is, again, sort of similar to the point at the beginning. We did this exercise two and a half years ago where I sat down with a lot of the engineering and product leaders in the company and we had this question, which is: if we started Brex again in 2024 (the answer would be even more different now), what would we do differently? And it turns out, everything. And we started going down this route. And it's kind of maddening, because you're like, okay, we have this completely old way of even thinking about the fabric of the company and the way we build the product and the way we build our processes internally. The first best answer is, yes, we wish we had started now. Second best answer is, let's go do something about it and change the way we do things, right? And I think a lot of our approach in terms of adopting AI has also been: how do you pause and say, okay, there is a discontinuity not just in how we solve the problem, but in what the definition of the problem actually even is.
And sort of take a step back and rethink it. And there's millions of examples of that. But one example which is kind of funny is we're redesigning our KYC process. Whenever we onboard a customer, we have to do all these checks to KYC the customer. And KYC historically is something where you can automate like 80% of it, and 20% is manual. And of course, the original impulse for anyone is, let's build an agent that does it. Yes, we can go do that. But what we decided to do is actually say, let's redesign the entire process end to end. And then what we redesigned is the entire onboarding process. And when you redesign the entire onboarding process, what you realize is there's a very important thing that happens in the beginning of the funnel, which is deal qualification. Like, is this customer even remotely qualified to be a Brex customer? But when you have KYC for free, you can KYC a lead versus a customer. So you start to have risk orientation higher up in your funnel, and that changes who you even target, because you know who's going to qualify. And the same thing's true for credit to some degree. So now the bounds of the problem have changed. And I think a lot of companies, including a lot of our competitors, have this approach of saying, oh, I have this entire old process, let me go and latch AI on top of it, or latch AI on top of our product. And I think the biggest discontinuities, in a positive way, that we've had were when we said, hey, let's keep this old way here, put it in a corner, and ask how we would design it if we started the company today from scratch, and then just doing that. It takes a little bit of founder energy to do that, but I think it's the only thing we've seen working to really sort of inflect.
[35:21]
Host 3
I think that reminds me a lot of, this is sort of way back, I don't know if you ever tried to compile Arch distributions of Linux. The culture within power users of Arch Linux versus Ubuntu is very different.
[35:35]
Pedro Franceschi
Very different.
[35:36]
Host 3
I think the Ubuntu people kind of feel more like people that try ChatGPT: stuff kind of just works out of the box. There's some stuff that you can get up and running. There's still not a lot of people that use Linux, by the way, which I think feels like where AI is. But with Arch you're like super hardcore. And I think that's what OpenClaw and Hermes feel like, because you have to really customize it to your own use case, maintain your skills, have all the markdowns, and if you get it working, you can build something awesome. One of the most impressive things I've seen people build with Arch is actually, I don't know if you know, Valve's Steam Deck: the operating system that runs on it, that makes it feel like a Nintendo Switch, is actually built on top of Arch.
[36:19]
Pedro Franceschi
Oh, interesting.
[36:19]
Host 3
They customize all the drivers, over-the-air updates, it works with all consoles, it works with all sorts of hardware out of the box. But they super duper customized it. And I think this is kind of what's happening. If you get your OpenClaw to work really well for you, you could kind of build your own custom Nintendo Switch for whatever you need to do.
[36:40]
Pedro Franceschi
Yeah, I always have this thing that I tell people, which is funny, which is: think about your time two years ago. I feel like you're working a lot more now than two years ago, right? And probably same for everybody here. So then the argument is, what about the productivity? Where's the productivity? And I was talking to the CFO of a very large public company this week and she was telling me, we see all this token consumption and we're trying to measure product velocity and we're seeing more lines of code pushed. So yes, maybe that's the way to measure the ROI, but is it really there? Because people are spending so much on tokens. And I think this analysis, yes, of course, having a sense of ROI on tokens is important, but I think it misses the point that you're standing in the timeline of history and it's six months after electricity was invented. Imagine someone saying in, I don't know, the 1800s, like, oh, my electricity bill is so high now, gosh, let's use a little less. Let's push the steam engine to come like maybe 20 years later, because of the cost savings. Yes, of course, don't bankrupt your company on tokens.
[37:52]
Host 4
It's actually a perfect analogy because, I don't know if you know this, but when electricity was first invented, it didn't work very well and the ROI was actually bad. And so if, shortly after the invention of electricity, some accountants had done this analysis, they would have been like, this electricity thing is never going to be a thing. The ROI sucks.
[38:08]
Pedro Franceschi
Why did people stick to it? It wasn't the cost savings, it was just because people were curious about it. And I think the reason I was up yesterday until 2am playing with slash workflows and Opus 4.8 and all that is that I'd be doing the exact same thing if I wasn't making any money. Because you just see the possibilities and you see what you can do with the technology, and that just drives people to behave differently. And I think that to me is the ultimate litmus test, and it's a good separator. And sure, tokens are expensive now, but they're going to be, I think, over the fullness of time, probably almost free if I project, I don't know, 100 years down the line, compared to what electricity was. Now we don't think of electricity costs in our day-to-day, unless you're in a data center. But I think there's something similar for sure.
[38:57]
Host 4
We talked to a lot of founders of later-stage companies who wish that their companies could be as AI-pilled as possible. And you run this big company now with all of these employees, and that's only the Brex side. There's also the Capital One side. I'm curious what you've done to bring the rest of the company along with you on this journey, and if you have advice for other people.
[39:20]
Pedro Franceschi
There's a lot to do. I think the CEO needs to be the Chief AI officer. Like, it's not an engineering team thing. It's not a product team thing. It's like, you have to understand the bounds of the technology better than anyone. I would argue that unless you really experience the limits of the technology every day, I think it's really hard to even understand what it can possibly do.
[39:43]
Host 1 (possibly Jared Friedman)
Oh, you know why? It's because nobody can say no to the CEO except the board. And the board won't be in the weeds per se.
[39:49]
Pedro Franceschi
That is 100% true. When you go think about, you know, the whole example of KYC that we were talking about, the KYC team would never think of using the KYC technology to score a lead. The only way you can think about the organization of the system itself is if you have the context of the whole. And to me, the single most important question that any CEO needs to answer is: forget about the competitive landscape. Imagine you could take the state of the technology today and transport it to the moment you started your company. The opportunity is still the same, but the possibilities for how to build a company are totally different. How would you do it? And then diff this versus what you have. And then first suffer in silence for a little bit, because you will. I mean, I do every day. But then the second thing is, okay, what do you do about it, and how would you do it if you were starting from scratch? You'll be the one figuring out, okay, how do we design our onboarding process, or how do we design our growth engine and our customer acquisition, and the way we talk to users and the way we synthesize the data, and all of that would be redesigned from scratch. So I think it's almost like you have to sort of refound the very concept of what the company's self-identity is and the way the functions and people's sense of success get structured. AI is an umbrella that I think has like three things, the way we talk about it internally. There's product AI, the product we actually ship to customers. There's operational AI, which is things that directly affect our ability to serve customers at scale. Like, think of customer success, risk, onboarding operations, et cetera. And then there's corporate AI, which is how people work internally. The three agendas matter, and they matter in different ways depending on the timing of the company. And I think people will sometimes sort of pigeonhole themselves in one of the three.
But in reality, I think you have to take a step back and be like the same thing we were talking about earlier. Why can't you solve everything with AI at a limit? That's the question. And then sort of start from there and sort of problem solve around that question. It's a turnaround almost. I think you have to assume that if you're a big, large company that's not an AI native, you're doing a turnaround to some degree.
[41:55]
Host 1 (possibly Jared Friedman)
I guess we've been making fun of Foxconn factories for some time, but on the other hand, if you look at them, they're like this paragon of very extreme efficiency. But they also are designed to be that, to create one thing perfectly back to back to back. And so you have to build a factory like that.
[42:11]
Pedro Franceschi
And most companies are designed that way, right? I think processes are designed not to change. There is a certain amount of broken glass required. The question is how. I think it's 10x easier for the CEO to break glass than an executive, and 10x easier for an executive than an employee. So a lot of times someone comes to me and says, I'm trying to do this with AI, but someone is saying no, because we haven't tested this in this use case or in that thing. And I'm like, okay, what are you trying to do? Do you understand the risks? Do you understand the guardrails? Yes? Okay. It takes me literally 10 seconds to solve that problem. And it would take someone else 10 hours to go into the meetings and escalate and understand, okay, can we? Or maybe never. And I think the conclusion is probably never, because most people would say, you know what? I'm just going to build this product in the old way, because why wouldn't we? It just works.
[43:02]
Host 1 (possibly Jared Friedman)
We know it's, well, that guy's going to hate me. And then I have to look at that person in the lunch line every day. And it's like, I want people to be happy and like me, so I'm just not going to do that.
[43:12]
Pedro Franceschi
And what I tell people is, I think the escalation paths need to be desensitized in the system, because the company builds antibodies: any sort of disturbance to the social cohesion of the company typically gets rejected by the antibodies. And I think it's about making escalations faster and being like, hey, we're going to go try this thing, I understand the risks, let's take this risk. Because the biggest risk is not taking it. It's just literally missing the opportunity to rethink a problem from what you would do if you started the company today.
[43:45]
Host 2
On the corporate AI sort of leg of that stool, specifically: do you buy into the Jack Dorsey view of every company essentially trying to build its own little company AGI?
[43:58]
Pedro Franceschi
I do, but maybe in a slightly different way. I do think domain specificity matters. So I don't believe in the "oh, I'm going to have a single company model that has every piece of data in a single place with no judgment or lens into anything." The way I think about it is more the virtual employee analogy, so to speak, which is: how do I build an agent or virtual employee that is exceptional at understanding everything that matters about this customer? That is a well-defined problem with clear boundaries and clear APIs for who depends on the data and who interacts with the data, and it is self-contained. Then there's another agent that can be, okay, given all the customers that we have and the problems they have, how do I manage my product roadmap? That can be a separate agent, but one that builds on top of this customer world model.
[44:46]
Host 2
Like a virtual exec team, basically.
[44:49]
Pedro Franceschi
Exactly. Functional and domain knowledge still matter, right? These things are not going to go away. And the way knowledge is structured, I think, is still true, right? That doesn't necessarily change that much. And you should separate the agent and the systems that are actually emitting code from the system that is talking to customers and the system that is reasoning about the conversations with customers and translating them into a product roadmap. These are three separate things. We're kind of like the Tesla for AI. We're like, I don't believe in anything that doesn't have real usage. So it's like, "Yeah, I built this great model," and I'm like, okay, how many people are using it? Is it actually displacing the need to hire a person inside a company? Is it actually displacing the need to, you know, spend literally hours? Like, how many hours is this thing saving? And I think a lot of times people say, well, you know, it's a cool model. And I'm like, yeah, but that's not going to cut it, right? Once you have that orientation, take the customer world model. For example, our client sales team now runs on our customer world model. So I know it works. I'm actually having lunch with a customer tomorrow and I don't know the state of that account as well as I probably should. The customer world model answered the question for me, and I now have a report including things that the team didn't know about, that came through support tickets, like an executive that was traveling had an issue at an airport with their car. Total information awareness, right? That is a well-defined problem that is working. I can trust this building block as part of my company model as a whole. And you can have evals on it. We should talk about evals. There's a bunch of learnings on this and on how to build evals into the fabric of the company. But anyway, I think it's more like you have to decompose the problem a little bit.
[46:26]
Host 1 (possibly Jared Friedman)
Yeah. My favorite thing about evals is just running cross-model evals against each other.
[46:31]
Pedro Franceschi
So one of the things that we're doing that is related, and I think it's really fun, is: how do you have every single human interaction in the company become an evaluation when you have any agents? So for example, we have the onboarding agent doing something, and then you have a team that actually goes in and looks at KYC exceptions that the model can't figure out, and you make that a breaking change: okay, this manual interaction will become an eval case. We have an expense agent in Brex. Whenever someone has a conversation with the agent that flags an issue or a bug or something where it feels like the conversation didn't go smoothly, that creates a bug. That bug triggers an agent that's going to go and modify the code base and the prompts and everything to make that eval pass. And if that doesn't work, then an engineer is going to go in and figure out how to make that pass. Because the goal at the end, I think, is to make the whole thing a self-learning system, right? And I think a lot of what I see with companies is they spend a lot of time getting an agent working but never think about how to make the agent improve every day. And I think that's always the biggest unlock.
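The loop Pedro describes (a human correction becomes an eval case, an automated fixer tries to make it pass, and an engineer is the fallback) could be sketched roughly as below. This is an illustration under assumptions, not Brex's system: `EvalCase`, `EvalSuite`, `triage`, and the `try_auto_fix` callback are hypothetical names, agents are modeled as plain functions from prompt to answer, and "passing" is simplified to an exact string match.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Agent = Callable[[str], str]  # simplification: an agent maps a prompt to an answer


@dataclass
class EvalCase:
    prompt: str
    expected: str
    source: str  # where the human correction came from, e.g. "kyc_review"


@dataclass
class EvalSuite:
    cases: list[EvalCase] = field(default_factory=list)

    def record_human_correction(self, prompt: str, agent_answer: str,
                                human_answer: str, source: str) -> Optional[EvalCase]:
        """Turn a manual intervention into an eval case, but only if the
        human actually changed the agent's answer."""
        if agent_answer.strip() == human_answer.strip():
            return None
        case = EvalCase(prompt, human_answer.strip(), source)
        self.cases.append(case)
        return case

    def failures(self, agent: Agent) -> list[EvalCase]:
        """Cases the given agent currently gets wrong."""
        return [c for c in self.cases if agent(c.prompt).strip() != c.expected]


def triage(suite: EvalSuite, agent: Agent,
           try_auto_fix: Callable[[Agent, list[EvalCase]], Agent]) -> tuple[Agent, str]:
    """Run the suite; on failure let an automated fixer try first, and
    escalate to an engineer only if the fixed agent still fails."""
    failing = suite.failures(agent)
    if not failing:
        return agent, "pass"
    candidate = try_auto_fix(agent, failing)
    if not suite.failures(candidate):
        return candidate, "auto_fixed"
    return agent, "needs_engineer"


# Usage with toy stand-ins for the agent and the fixer.
suite = EvalSuite()
naive_agent: Agent = lambda prompt: "approve"
suite.record_human_correction("KYC: missing tax id", "approve", "escalate", "kyc_review")


def toy_fixer(agent: Agent, failing: list[EvalCase]) -> Agent:
    overrides = {c.prompt: c.expected for c in failing}
    return lambda prompt: overrides.get(prompt, agent(prompt))


fixed_agent, status = triage(suite, naive_agent, toy_fixer)
print(status)
```

In a real system the fixer would edit prompts or code rather than memorize answers, and the pass check would likely be a graded rubric or a model-based judge instead of string equality.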
[47:44]
Host 1 (possibly Jared Friedman)
You need a dream cycle.
[47:45]
Pedro Franceschi
You need a dream cycle.
[47:46]
Host 1 (possibly Jared Friedman)
Sees everything every night.
[47:48]
Pedro Franceschi
Exactly.
[47:49]
Host 1 (possibly Jared Friedman)
And then it's like, oh, what's going on there? I need to put this over here. What actually happened? Is there a pattern? How do I cause this?
[47:55]
Pedro Franceschi
So how to bake the dream cycle into the products and into the agents and into the things you ship?
[48:01]
Host 1 (possibly Jared Friedman)
My favorite thing right now is I'm building like three or four agents for my friends.
[48:05]
Pedro Franceschi
Oh, interesting.
[48:06]
Host 1 (possibly Jared Friedman)
And some of it is like, this is user research for me for gbrain. Because it's like, I have one, it's working really well. I have 350,000 markdown pages in there now. What a crazy... Like, I thought it was this wild, you know, pie-in-the-sky thing, and it's like, it's going to happen in our lifetimes, you know.
[48:23]
Pedro Franceschi
I remember when Neuralink came out and I used to think about it, I was like, I don't get it. Like, yeah, of course I get it conceptually, but why is it a thing? And then now I use AI and
[48:34]
Host 1 (possibly Jared Friedman)
you're like, yeah, yeah, makes sense, makes sense. Yeah.
[48:37]
Pedro Franceschi
I'm the bottleneck.
[48:38]
Host 1 (possibly Jared Friedman)
Yeah, yeah.
[48:39]
Host 3
Typing is so slow. I don't know if you use a lot of dictation.
[48:41]
Pedro Franceschi
I use it a lot. My most used developer UI right now is, like, voice memos to OpenClaw.
[48:48]
Host 2
I've said this before. Like it was maybe accidental, but I actually just really love the fact that like Telegram works so well with audio because it's forced me to just put more stuff, like make the agent more intelligent so that you can do more stuff via voice memos. Because you have to sort of fight the natural instinct as like a traditional developer where you're like, oh, like I can't quite do this or it doesn't do this, so I need to go like, build more client or more UI or like more functionality for it.
[49:13]
Pedro Franceschi
Foxconn.
[49:14]
Host 2
Yeah, exactly.
[49:16]
Host 1 (possibly Jared Friedman)
Just let it do what it wants to do, give it some context and it'll just think about, you know. Oh, like actually, what about this?
[49:22]
Pedro Franceschi
I think a lot of the work to your point is the how do you organize the context for the model? And you can use the model to help, but that is the bottleneck for most things.
[49:34]
Host 1 (possibly Jared Friedman)
Once you have the context in there, you can actually do some pretty crazy stuff. Like my favorite new feature I mentioned,
[49:39]
Pedro Franceschi
I saw, gbrain LSD.
[49:40]
Host 1 (possibly Jared Friedman)
Yeah, LSD.
[49:41]
Host 3
Yeah.
[49:43]
Host 1 (possibly Jared Friedman)
Lateral synaptic drift.
[49:45]
Pedro Franceschi
So you just bump the temperature on the search.
[49:47]
Host 1 (possibly Jared Friedman)
It's not just that. So you have the vectors.
[49:49]
Pedro Franceschi
Right.
[49:49]
Host 1 (possibly Jared Friedman)
And so, you know, if you think about what conventional ideas are, most people give you, like, oh, well, this idea with this idea. And it's kind of like in this cone. LSD mode actually says you cannot combine concepts that are within this cone. They actually must be orthogonal, or, like, seemingly random. And then it'll try, you know, randomly hundreds of these combinations, and then it'll rank-order them into the ones that are actually the most coherent.
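The "outside the cone" idea described here can be sketched as follows. This is a hedged illustration, not the actual gbrain feature: the function names, the `max_similarity` threshold, and the toy two-dimensional embeddings are all assumptions, and `coherence` is a hypothetical scoring callback (in practice it would presumably be a model judging each combination).

```python
import itertools
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[str, str]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def distant_pairs(embeddings: Dict[str, Vector], max_similarity: float = 0.2) -> List[Pair]:
    """Keep only concept pairs that are close to orthogonal (outside the cone)."""
    return [
        (a, b)
        for a, b in itertools.combinations(sorted(embeddings), 2)
        if abs(cosine(embeddings[a], embeddings[b])) <= max_similarity
    ]


def lsd_candidates(
    embeddings: Dict[str, Vector],
    coherence: Callable[[str, str], float],
    samples: int = 100,
    top_k: int = 5,
    max_similarity: float = 0.2,
    seed: int = 0,
) -> List[Pair]:
    """Randomly sample distant pairs, then rank them by the coherence score."""
    pairs = distant_pairs(embeddings, max_similarity)
    rng = random.Random(seed)
    sampled = rng.sample(pairs, min(samples, len(pairs)))
    return sorted(sampled, key=lambda p: coherence(*p), reverse=True)[:top_k]


# Toy embeddings (made up): two finance-like concepts, two kitchen/garden ones.
emb = {
    "payments": (1.0, 0.0),
    "fintech": (0.95, 0.05),
    "gardening": (0.0, 1.0),
    "sourdough": (0.1, 0.95),
}

# Placeholder scorer: longer names score higher. A real scorer would judge
# whether the combined idea actually makes sense.
top = lsd_candidates(emb, coherence=lambda a, b: len(a) + len(b), top_k=2)
```

The design choice worth noting is the two-stage shape: the similarity filter forces novelty by rejecting nearby concepts, and the ranking step then recovers quality from the random sample.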
[50:17]
Pedro Franceschi
Yeah.
[50:18]
Host 1 (possibly Jared Friedman)
And then if you do like a hundred of them. Actually, like, the top five tend to be banger tweets.
[50:22]
Host 2
You know what's crazy is, like, I didn't tell Alfred expressly to be dry. I actually had, like, ChatGPT generate the soul file, just like, based on everything you know about me, all the interactions here, generate a soul.md for my OpenClaw agent. And it was so unerringly accurate about kind of what I would want from an agent. And I was like, oh, damn, these models know a lot about us.
[50:47]
Host 1 (possibly Jared Friedman)
My OpenClaw got really interesting once I just ingested my 60 gig Google Takeout. I mean, you have to write a bunch of haiku code to, like, only get the emails that are actually real. But, you know, it extracted like 4,000 emails out of 50 gigs that actually matter. But those are, like, actually a lot of your thinking and, you know, the consequential moments of your life. So, Pedro, thank you so much for being with us. I mean, you're by far one of the most AI pilled, farthest out on the edge, but also very practical CEOs who is, you know, playing with this stuff and actually building it yourself. What would you say to people watching who are founders, who want to be founders? You know, I think that you are sort of the model for the way people should start companies and run them with AI as your Esalen buddy.
[51:38]
Pedro Franceschi
I really can't stop thinking about the electricity analogy, which is: you're standing on a 200-year timeline of human history. There's a point in time where electricity was invented, and it sucked in the beginning. You're six months after that point. What do you do differently, knowing everything that will be true about electricity, knowing about data centers and how they consume electricity, and even AI? Right. Well, you do a lot of things differently, I think. So I think that's one: just marveling at the possibility of the exact moment in time we're in now. I think the second is: have a Post-it on your computer that says, you wake up, whatever problem you have in your life, why can't you solve it with AI, and just start there. 80% of the time, yeah, you can use a chatbot. But for the 20% where you can't, figure out why, and go build something that solves that problem, less because of the immediate usefulness that solving that thing at scale will have, and more because it gives you a texture and a feel for the possibilities of the technology, which are really hard to get if you're not playing with it every day. And maybe the third thing is, I think, just measure your token consumption and how much you're pushing the limits of the company, starting with the premise of, okay, why can't it just be one person? Why can't it just be me that builds the whole thing? And you're probably going to hit a wall of what models can and cannot do. But at the limit, I think the question is, how do you spend your time on the things that only you can do as a founder? And these things to me are, number one, which problems are worth solving, sort of the choice thing we talked about. And the second thing is, okay, given these choices, what are the things an LLM still cannot do, so that I have to go in and do those things myself? But almost, to some degree, you're working for the LLM.
And if you're in a bigger company, you're in a turnaround to put the LLM as almost the founder and the CEO, and you're almost architecting the entire company around that idea. But I think early on, so much of it is choosing what matters, talking to customers, injecting the signal that models don't have, and just rebuilding it the way you would do it in 2026 with electricity being six months old.
[53:51]
Host 1 (possibly Jared Friedman)
Thanks, Pedro. This is awesome.
[53:52]
Pedro Franceschi
Yeah, thanks for having me. Appreciate it.