
Alex Kantrowitz
AI engineers are getting athlete pay. Anthropic set up Claude to run a vending machine in an experiment. That tells us a lot about where AI is today and where it's going. And Soham Parekh has a job at so many companies, there's a chance he's working at yours as well. That's coming up on a Big Technology Podcast Friday edition, right after this. Welcome to Big Technology Podcast Friday Edition, where we break down the news in our traditional cool-headed and nuanced format. We have so much to speak with you about today, including the news that Mark Zuckerberg may be offering contracts of up to 100 million or more to AI engineers who want to come on board to his superintelligence team. Of course, Facebook disputes that, or Meta disputes that. We also have this incredible experiment to break down for you about how Anthropic let Claude run a vending machine. And then of course, we got to talk about Soham, who has taken so many jobs, especially with YC companies, that who knows, maybe he's working for yours as well. Joining us as always on Fridays to do this is Ranjan Roy of Margins. Ranjan, great to see you. Welcome back.
Ranjan Roy
Good to see you. I'm in a San Francisco hotel room right now, but I regret to inform you I'm not here to discuss my new $100 million pay package from Zuck. I'm not wandering.
Alex Kantrowitz
Never know.
Ranjan Roy
I'm not on the list yet.
Alex Kantrowitz
We might be able to podcast our way into it. Never say never.
Ranjan Roy
I'll take a cool 50, Mark. Just a cool 50.
Alex Kantrowitz
Okay, yeah, now we should start there because we talked a few weeks back about the talent wars and what Mark Zuckerberg might be doing in offering so much money to AI engineers considering coming into Meta and becoming a part of his superintelligence team. And in the two weeks since, that discussion has really heated up. So we now have news from Wired. It says, here's what Mark Zuckerberg is offering top AI talent. The story says as Mark Zuckerberg staffs up Meta's new superintelligence lab, he's offered top tier research talent pay packages of up to 300 million over four years, with more than 100 million total compensation in the first year. Meta denies the numbers. It says these statements are untrue. The size and the structure of these compensation packages have been misrepresented all over the place. So some people have chosen to greatly exaggerate what's happening for their own purposes. I mean, I don't know, Ranjan. How do you get multiple people saying that they have a similar size deal? I think they've reported 10 of these deals at OpenAI. How does that happen? And how do you end up with a denial there?
Ranjan Roy
Yeah, I think let's get to what it actually means for the industry in a second, but first, I'm still kind of curious about Andy Stone, the Meta spokesperson's response, in terms of saying that the statements are untrue and like this kind of blanket denial and saying that people have chosen to greatly exaggerate what's happening for their own purposes. Because how does it help OpenAI? In my mind, I get there's the downside of this, that potentially the market might get spooked, that Meta is kind of spending too frivolously. But in reality, I have to admit, this kind of makes me think, you know, like, war-rage Zuckerberg is here and he's ready and he's going to win AI at whatever cost. So to me, it's almost a positive signal. I don't know why they're denying it.
Alex Kantrowitz
Well, I mean, I think it makes an internal cultural thing a bit of a problem. And now let me just put my conspiracy hat on and say, do you think Sam Altman was emailing people and describing these pay packages himself? Because he had a message to OpenAI this week that really put Meta on blast. He's not happy that Meta has been recruiting some of his top people. He says to the OpenAI team, missionaries will beat mercenaries. Meta is acting in a way that feels somewhat distasteful. What Meta is doing will, in my opinion, lead to very deep cultural problems. I mean, is it possible that it's a return attack where he's leaking this to the media and they're running with it? And now everybody else who's a Meta engineer is saying, hey, where's my 100 million? Because in the Wired story that I quoted, they said a senior engineer makes $850,000 per year. Now, I'm not crying for this engineer, but if that is the salary and you have somebody coming in who does similar work and they're making what you think is 100 million, maybe you want to go to OpenAI.
Ranjan Roy
Okay, okay. Actually, that is an interesting theory. It's almost so logical that it almost kind of like leaves the realm of conspiracy. And actually, I could see it happen again. It would be so incredibly rich. The idea that OpenAI, a company that has spent at all costs, raised ungodly amounts of money, is losing ungodly amounts of money, kind of takes this approach at a competitor. But I can definitely see that it would cause a bit of internal strife on the Meta side. And actually, that would be the true 4D chess, to then get people recruited over to OpenAI because they're disgruntled.
Alex Kantrowitz
Some people have chosen to greatly exaggerate what's happening for their own purposes. It's just one of those statements that.
Ranjan Roy
Says Andy Stone knows exactly what's happening. If you hear a comms person say something that explicit without saying it, I think they must know something.
Alex Kantrowitz
And let's hear what Andrew Bosworth, former guest on the show, the chief technology officer at Meta, told the company internally. He said, look, guys, the market's hot. It's not that hot. Okay, so it's just a lie. We have a small number of leadership roles that we're hiring for, and those people do command a premium. And he noted that OpenAI is countering the offers. I mean, if you get even close, it's a truly absurd amount of money. Satya Nadella is making $79.1 million this year. So could you be like the OpenAI researcher who worked on o4, and now you're gonna make more than Satya?
Ranjan Roy
On its face it seems completely absurd and ridiculous, but then in the grand scheme of things, if those 10 people are the, you know, like, difference between building the next great model, especially given that Meta has been, you know, on its back foot a bit, it actually, from like a pure ROI standpoint, could make sense. Again, as ridiculous as it sounds. And I know there's a lot of comparisons that AI labs are starting to look like sports teams, but in reality, those are the decisions that, if an individual can have that great of an impact on your overall business, make perfect sense. Again, is that the way this is going to play out? We'll get into what this means for, like, training and where the next phase of growth will be. But it's not absurd, given the size of the opportunity, if we believe that one to 10 people can actually make or break things for them.
Alex Kantrowitz
Yeah. I mean, remember, Meta is a company that's lost, what, 15 billion a year on the Metaverse? I might be, you know, exaggerating, but this is directionally accurate.
Ranjan Roy
Yeah.
Alex Kantrowitz
So if you think about it, if you want to build a super team of, let's say, I don't know, 10, 20 AI researchers and you want to give them 100 million a year. So now you're spending 2 billion to advance the state of the art in AI for two years. I mean, per year. That seems fairly reasonable compared to these other bets.
Ranjan Roy
I think that appetite for risk, again, as we said, losing that much money on the Metaverse, on Reality Labs and whatever it was. Exactly. Again, Mark Zuckerberg is not afraid to take risks. Every company and everyone has identified whoever kind of wins the AI battle will win the next major phase of growth in overall markets. Again, it's up for debate. Is it truly going to happen at the research and model layer, or will it happen in other parts of the overall AI stack? But, but I think he's serious. Whatever it is, I mean, the move for Alexander Wang and What was it, 15 billion.
Alex Kantrowitz
It's like in that neighborhood.
Ranjan Roy
Yeah, the 15 billion, which was an acqui-hire-zition, trademarked Alex Kantrowitz. Like, they've shown they're not playing around right now. So all of these acquisitions, I mean, or hot direct hirings at insane levels they're doing right now, they're showing that they're not going to fall back any further.
Alex Kantrowitz
Yeah, this is from Mark Pincus, the founder of Zynga. He says this is legit founder mode. Speaking of the amount of money that Zuckerberg is paying here, buying the talent from OpenAI is cheaper than the company. Only a founder would or could do this, and only if they control their board. I think that's a great point. Like, let's just say the money is less than what these reports have it, but still a lot. You don't see any other companies doing this. I mean, you think about it with xAI, Elon is the richest man in the world. He's not doing this. I think this is a pretty solid and bold play from Zuckerberg.
Ranjan Roy
Yeah, I just went to Meta AI just to ask this, and I actually love that Meta AI says Meta's Reality Labs division has been hemorrhaging money with significant losses. It's lost $42 billion since 2020, 17.7 billion last year. So in reality, I mean, 10 people at 100 million is almost kind of small potatoes here.
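[Editor's sketch] The back-of-envelope comparison in this exchange can be checked with a few lines of Python. The inputs are only the figures quoted on the show (10 to 20 researchers at $100 million a year, and the $17.7 billion Reality Labs loss that Meta AI reported back to Ranjan); none of them are verified here, and the function names are made up for illustration.

```python
# Figures as quoted in the conversation, not independently verified.
PAY_PER_RESEARCHER = 100_000_000               # dollars per year (reported, disputed by Meta)
REALITY_LABS_LOSS_LAST_YEAR = 17_700_000_000   # dollars, as quoted from Meta AI's answer


def team_cost(headcount: int, pay: int = PAY_PER_RESEARCHER) -> int:
    """Annual cost of a team where every member gets the same package."""
    return headcount * pay


def share_of_loss(headcount: int) -> float:
    """Team cost as a fraction of the quoted Reality Labs annual loss."""
    return team_cost(headcount) / REALITY_LABS_LOSS_LAST_YEAR


if __name__ == "__main__":
    for headcount in (10, 20):
        cost_billions = team_cost(headcount) / 1e9
        pct = share_of_loss(headcount) * 100
        print(f"{headcount} researchers: ${cost_billions:.1f}B per year, "
              f"about {pct:.1f}% of the quoted Reality Labs loss")
```

On these quoted numbers, even the 20-person version of the team costs a small fraction of one year of the quoted Reality Labs loss, which is the point both hosts are making.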
Alex Kantrowitz
Yeah, it's child's play. I mean, the thing is what it does culturally. But here's the question, is it worth the risk? So you mentioned that some AI engineers are being paid like athletes. And there is a great piece by Dave Cahn, who's a partner at Sequoia, Why AI Labs Are Starting to Look Like Sports Teams. And I think we should just spend a couple minutes or even a little bit longer hovering on this piece because I think it really details what is going on so well and explains why the investments in talent are what we're starting to see right now. So to start off, he says there's been three major improvements in AI over the last year. First, coding in AI has really taken off. A year ago the demos for these products were mind blowing and today the coding AI space is generating something like a $3 billion run rate in revenue. Okay, so that's one. So this is working in coding. The second change is that reasoning has found product market fit and the AI ecosystem has gotten excited about a second scaling law around inference time compute. And third, there seems to be a smile curve around ChatGPT usage where this new behavior is getting ingrained in day to day life. I think smile curve basically means like you start using it and then you casually use the product so your usage goes a bit down. And then as you start to find more utility, your usage goes up so your curve looks like a smile. Is that how you read it?
Ranjan Roy
Yeah, that's how it looks and how I'm reading it and it's correct, I think. I agree. This was a really smart piece. Again on where the market is today and where it's going and how this can possibly explain. And again, I did love that. He recognizes though, I think Dave Cahn is both team model and team product. He talks about how the app layer ecosystem is thriving, with cheap compute and integrated workflows that are building durable businesses. So basically consumers are starting to get it. You know, like coding has found very clear revenue generation. Reasoning, as he said, found product market fit. So what's next? And this is where he lays out a pretty compelling case around how talent is going to understand. In the past it was just all about pre-training compute and size and strength and just like how much you can put into that model. But we've talked about this a lot on the podcast, like the actual training techniques becoming smarter even. It was Sergey Brin, I think, who said in his interview with you that.
Alex Kantrowitz
It's going to be algorithmic progress, not compute. Exactly.
Ranjan Roy
Yeah, yeah. So all of this starts to kind of like come together in this theory around where the next battle, at least at the model layer lives. And, and if that is the case, maybe you can start to build out the idea that 10 smart people can make or break your business versus buying however many Nvidia chips and like, you know, purely spending money on the compute.
Alex Kantrowitz
Yeah, and I think it's worth reading exactly the way he puts it in his piece. So he says the message of 2025 is that large scale clusters alone are insufficient. Everyone understands that new breakthroughs will be required to jump to the next level in the AI race, whether in reinforcement learning or elsewhere. And that talent is the unlock to finding them. I'm just going to pause here and say, yes, this is what we've been hearing from everyone. In that conversation with Sergey, he said that the algorithms are going to be the thing that takes AI to the next level and not necessarily compute. Demis Hassabis also said there's going to be another couple breakthroughs that the AI industry is going to need in order to keep advancing toward AGI or whatever you want to call it, more powerful artificial intelligence. So it is these algorithmic improvements too that will get the industry moving forward. And what do you need to get there? It's not data centers, which by the way, everyone spent billions of dollars on. It's the talent to be able to make those breakthroughs themselves. So this is what he says. With their obsessive focus on talent, the AI labs are increasingly looking like sports teams. They are each backed by a mega rich tech company or individual. Star players can command pay packages in the tens of millions, hundreds of millions, or for the most outlier talent, seemingly even billions of dollars. Unlike sports teams, where players have long term contracts, AI employment agreements are short term and liquid, which means anyone can be poached at any time. One irony of this is that while the notion of AI race dynamics was originally popularized by AI safety folks as a boogeyman to avoid, this is exactly what has been wrought across two distinct domains. First compute and now talent. So basically it makes sense that if this is going to be the next big leap, you're going to pay the talent to get you there. And you know, no matter how much talk you have around safety, we're seeing the industry accelerate around talent and around compute.
Ranjan Roy
Have we both just convinced ourselves that 100 million is reasonable for these engineers? Because I think I'm starting to be convinced of it.
Alex Kantrowitz
I mean, absolutely. Even when we spoke about it the first time, right? Once Zuckerberg brought in Alexander Wang, what did I say on the show? There's going to be more. And this is a sound strategy because you have everybody talking about how pre-training is hitting diminishing returns, you have everybody talking about how data is hitting a wall. And so what do you need? You just need these algorithmic developments. Now let me ask you this. So I would say, yeah, this is a good bet. But I'm going to ask you this, and I think I have an answer to this before I ask you. Do you think this is a sign that this AI moment is sort of in its last throes and sort of just grasping for anything that will allow for improvement, given that the mechanisms that brought it here are starting to tap out?
Ranjan Roy
I'm going to give you a strong yes on this, mainly because again, as the leader of team product over team model, I think this is like a reminder that the core of Silicon Valley is firmly of the belief that the model has to get better and better and the model will solve everything and the rest of the layers. And even though Dave Cahn's piece talked about the application layer, you're starting to see some true businesses being built on top of it. Like the idea that they're still not focusing that much on what are the next ChatGPT features. And they are, and I'm not saying they're not shipping very regularly, but it's just this reminder that that's where every Silicon Valley leader in this circle is convinced the battle will be won. And I don't necessarily agree with that. But yeah, in this case, to me, once you've made that decision, you have to find the next thing. And as we said, pre-training compute, data centers, all of this is showing diminishing returns. So you have to move to the next thing and it's talent.
Alex Kantrowitz
Right. Look, I think this is a determination that you have to move to the next thing. I think the part of the question that I was kind of answering in my head before I asked it was, is this the last gasp? And I don't think that's the case. I do think that they're going to be able to wring improvement out of the current techniques. At least everybody that I speak with seems to believe that. But you have to look ahead to the next curve while you're on the first one or while you're on the current one. And that, I think, is what's happening.
Ranjan Roy
Yeah, and then we have a world where imagine this talent finds incredibly cheap ways to actually build these models out. And then the ultimate, I mean, like, are they saying there's a potential race to the bottom in the sense that if you truly make the inference layer that much more efficient and cheaper and the compute side of it that much more efficient and cheaper, I mean it's going to be good for all of us. Because it means that all of this gets cheaper and people build more on top of it. But from an economic standpoint, relative to the investment, will it show return or be worth it? I don't know.
Alex Kantrowitz
Right. And I think that we should just like read the last bit of this Sequoia piece because it's really good. And by the way, this came up in the Big Technology Discord. So I just want to thank our members in that channel for actually sending us this piece because I thought it was excellent and I just continue to learn from everybody in there. Here's the end of that piece. It says it is an intrinsic property of humanity that once critical thresholds are passed, we take things all the way to the extreme. We cannot hold ourselves back. And when the prize is as big as the perceived AI prize is, then any bottleneck that gets in the way of success, especially an illiquid bottleneck like talent, will be pushed to staggering levels. I think that's both true and also a little, like, concerning.
Ranjan Roy
I mean, it certainly does not seem like a positive statement on humanity overall and our ability to constrain or control ourselves. But what's still ironic to me or funny to me about this is, you know, an illiquid bottleneck like talent and the idea that humans are the key to actually advancing this. At this point, shouldn't AI itself be good enough to develop the techniques that make AI better?
Alex Kantrowitz
Well, you're talking about an intelligence explosion. And I think that every lab is trying to engender an intelligence explosion, but they're not able to as of yet. But are they going to sort of consolidate release cycles? Sure, with the help of AI code. But we are nowhere close, I don't think, to, what is it, recursively self-improving AI models.
Ranjan Roy
But I feel just given where the industry has kind of promised that we are and the type of advances that are being made, I would like to see them actually kind of apply it to their own companies in the ways of building.
Alex Kantrowitz
Yeah, and I think that's definitely happening inside of places like Anthropic for sure, which has this Claude Code that was built effectively to make them better at coding Claude. So let's end this segment with a couple of bigger picture questions about Meta. First is just in terms of culture. Think about what happens to an organization when you import, I think already it's a dozen or more now, multi- or decamillionaire engineers to work alongside those folks making 850,000 or a million. Is there going to be a cultural blow up within Meta because of this or do you think they're able to figure it out?
Ranjan Roy
I'm just going to say pour one out for the poor guy making 850k. I think if nobody. But I think like, yeah, there's definitely going to be, whatever the end payment was. Even like at a micro level. Is Yann LeCun now gonna be reporting to Alexander Wang?
Alex Kantrowitz
Like I think he is, but I don't think he cares, honestly. I think Yann just wants to do the science. He doesn't want to manage massive teams.
Ranjan Roy
Okay, okay. But I think like at every level, even this kind of reorg within Meta around like who is managing what, basically saying we have not been doing good enough already, it's like a pretty big cultural statement from Zuck. So I think yeah, it has to be. But again, I mean the argument, the founder mode argument would be that if you're not winning, you do need to shake things up. And if there's some cultural, like, shrapnel from that, that's just part of how it works.
Alex Kantrowitz
Right. And it's like you're kind of, if you are a Meta AI engineer and you're making like close to a million or above a million, I don't know if you're going to get a comparable offer, especially given what's happened with Llama up to date.
Ranjan Roy
One question. What does this mean for Meta's business? Why are they doing this? Is it for Meta AI, that we all start using it more? Is it so my Meta Ray-Bans, which work, which I love, just start getting, like, even better? What is the end goal from an actual business or revenue standpoint behind this?
Alex Kantrowitz
Well, I think that there's a belief that this technology is getting much better and people are just going to want to use it and they're going to spend more and more of their time within AI bots or AI experiences. And then think about Meta, like your job is to command a share of time across the web or across anybody's usage on their phone or their laptop. And you know, every time a threat like this comes up, you go ahead and you copy, buy or do something of that nature. So with photo sharing, they bought Instagram. With the rise of disappearing messages, they made Stories and they put their own disappearing messages in something like Instagram and WhatsApp. And then with TikTok they built Reels. So if you're Mark Zuckerberg, you can't really afford to lose a tremendous amount of attention to other companies. Especially with these AI bots that do not send traffic out, which we have talked about ad nauseam on this show, being, you know, the experience. And if that becomes the experience of your web or even beyond the web, you don't want to be Facebook sitting on the outside and saying, please use our app. There is a desire to own the operating system. And that's just if, you know, the progress continues along the way that it has been. And we, like, start to use chatbots a lot. And of course, imagine just the value of creating AGI or superintelligence. It's a whole different ballpark.
Ranjan Roy
Well, that. Okay, but that's where I would ask you. Those are two separate goals, right? One is we will build the ChatGPT for Facebook and have people spending time on our platform and figure out some ad revenue or freemium model or something like that. Do you think it's that or do you think it's still more of just to put your head down and whoever gets to ASI the fastest wins, and then that's. That's really what's driving it.
Alex Kantrowitz
So I think the floor is that you build the key consumer product. I mean, it's going to be a fight against OpenAI, but they have billions of users, so they can seed it with them. So, like, at the very least, you're basically building the next killer app. And then if you get to superintelligence, it's all gravy, right? Or artificial intelligence. That's a bigger business than Facebook.
Ranjan Roy
Just hang it up. Whatever the. There's no revenue model. You just get money.
Alex Kantrowitz
You can't sit this out. If you're Mark Zuckerberg. There's just no business logic to say, all right, you guys go ahead and run away with the future of the web.
Ranjan Roy
Yeah, no, no. Agreed. 100 million. I'm curious, listeners, if you've all walked away believing 100 million is totally rational and reasonable, because in a weird way, I kind of have.
Alex Kantrowitz
Just think about the value of the information that we share on this podcast, contributing to these outcomes. I would say, you know, our advertisers should be, you know, in that range at the very least.
Ranjan Roy
Yeah. 20, 25 to start, and then we'll. We'll go to 50 soon.
Alex Kantrowitz
We'll go. We'll go up. Exactly. So let me ask you this last question about this, which is, is it going to work? Do you think that this is going to work for meta?
Ranjan Roy
That's a good. That's a good leader. I think it's going to significantly enable them to catch up. Whether they, like, shoot out ahead, I don't know. Whether this is the most critical battle, I don't know, or I actually don't think it is. But I do think that this is going to get them back in all the, kind of, like, benchmarks in a significant way. I think they're going to figure some stuff out. It'll be good for them in this specific battle. What about you?
Alex Kantrowitz
So I think since we're talking in sports terms, there's a concept in sports called wins above replacement. Right. And so, like, you sign Juan Soto, if you're the Mets, to a $750 million contract, because Juan will net you, like, maybe nine extra wins a season, which, like, doesn't seem like a lot. But ultimately it's the difference between making the playoffs or not, because you can sort of do the math. And you see, like, if you win 80 games or you win 90 games, there's actually like a very big difference there. So I think what Meta's really done here is it's definitely increased its wins above replacement with a number of researchers. And unlike on a baseball team, you don't only have like nine people coming to bat. Come on, guys, it's July 4th. I'm going to.
Ranjan Roy
I like it. Keep going.
Alex Kantrowitz
You can have. You can have a team of like 10 or 12 Juan Sotos and stack your lineup, and if you keep building that win above replacement in your talent pool, then you can make some real progress. Are they going to be the leader? I don't know. I think OpenAI is the leader until proven otherwise. And I've definitely doubted them publicly and then have had to eat it. I mean, I definitely regret my words on that front, but I think that it really just comes down to what does your potential look like today compared to where it looked like yesterday. And Meta's potential is much higher now than it was before these hires. And again, I think it's money well spent.
Ranjan Roy
All right, I'm on board as well.
Alex Kantrowitz
Okay. So have you been following this experiment that Anthropic is running where they put Claude in charge of a vending machine?
Ranjan Roy
Yes. I think our conversation today will reflect like most AI conversations out in the market that we just went from saying 100 million to an individual as a signing bonus could make sense. And artificial superintelligence, yada, yada, yada. And then let's bring it back down to earth. Tell our listeners about the Claude shop.
Alex Kantrowitz
This is one of my favorite things that I've read about AI maybe ever. So there's been all this talk about, like, can AI do our jobs or will AI, you know, replace humans, or will it achieve superintelligence? And Anthropic tried to do this very interesting experiment where they put Claude in charge of a vending machine in their office and said, you know, can you stock and sell items to our employees? So the prompt for this vending machine is: you are the owner of a vending machine. Your task is to generate profits from it by stocking it with popular products that you can buy from wholesalers. You go bankrupt if your money balance goes below zero. They say, far from being a vending machine, Claude had to complete many of the far more complex tasks associated with running a profitable shop. Maintaining the inventory, setting prices, avoiding bankruptcy, and so on. They nicknamed this agent Claudius and gave it the following tools and abilities. So they gave it web search, they gave it an email tool for requesting physical labor help and contacting wholesalers. Now they work with this company called Andon Labs. So it basically simulated these conversations with wholesalers, which was actually Andon Labs. And it really couldn't send email. But from the bot's perspective, it had these tools to do a version of this. It also had a scratch pad or tools for keeping notes and preserving important information to be checked later, like the current balances and projected cash flows of the shop. It had an ability to interact with customers. The interactions occurred over Anthropic's Slack and allowed people to request items and let Claudius know of delays. And it also had the ability to change prices on the automated checkout system at the store. So, Ranjan, how do you think it did?
Ranjan Roy
It did good and bad. Good and bad. I actually, I love this story because it kind of shows, like, everything that is possible and not possible in this beautiful little Claudius package. So, like, in terms of actually finding suppliers to order products from, it did an okay job. There's an example where someone asked for, like, Dutch candy, and it got the Dutch chocolate milk brand Chocomel. There were people, definitely.
Alex Kantrowitz
That's AGI to me, by the way. That's straight up AGI.
Ranjan Roy
Yeah, yeah. People screwed with it a bit, which is a good reminder that, you know, AI can be manipulated. Someone asked for a tungsten cube, which, listeners know, was kind of like a meme maybe a year ago.
Alex Kantrowitz
Yes.
Ranjan Roy
And then it started looking for, quote, unquote, specialty metal items. But then overall, it was just losing money. It was like Claude would actually offer prices without doing any research, it would, you know, offer high margin items below what they cost. It wasn't able to manage inventory. And this is something that I see all the time, that the traditional math, machine learning, quantitative functions are not suited for generative AI, or not a specialty of generative AI, but people conflate the two. So in terms of, like, understanding the web to find a supplier that can deliver a specific product that was requested, understanding what that product was to make that request, and communicating back to the customer, these are all, like, in the wheelhouse of generative AI. Trying to do inventory management or, like, predictive type work is not in the wheelhouse. Especially if it's only looking at the Anthropic API and Claude's API. And like, it's solely taking a generative approach, not thinking to, like, create, not learning the concept of, like, margins and margin management. I think it's a sign. Yeah, yeah, no, exactly, exactly.
Alex Kantrowitz
Bring it on. Ranjan's newsletter.
Ranjan Roy
And then that's what you missed, Claudius. That's what you missed. And not even understanding, like, because it was not instructed, what is a danger level in terms of its own cash balance. So in a way, like, out of the box, poor Claudius, you know, with the brain of Claude, with no specific training on how to manage a retail business, Claudius didn't make it. But with some proper instruction, some connection to like a good inventory management system, Claudius could have made it. I think this just captures everything about the state of generative AI.
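[Editor's sketch] One way to picture the "proper instruction" Ranjan describes is a plain rule check that sits between the model and the checkout system, so that margins and the cash balance are enforced by ordinary code rather than by the language model. The Python below is purely illustrative: `Quote`, `approve_quote`, `approve_purchase`, the 20% minimum margin, and the $50 cash floor are all hypothetical names and numbers, not anything taken from Anthropic's or Andon Labs' actual setup.

```python
from dataclasses import dataclass

# Assumed policy values, chosen only for illustration.
MIN_MARGIN = 0.20   # require at least a 20% margin on the final price
CASH_FLOOR = 50.0   # never let the cash balance drop below this amount


@dataclass
class Quote:
    item: str
    unit_cost: float       # what the shop paid the wholesaler
    price: float           # what the model proposes to charge
    discount: float = 0.0  # fraction off, e.g. 0.25 for 25%, 1.0 for free


def approve_quote(quote: Quote) -> tuple[bool, str]:
    """Reject giveaways and any sale whose margin falls below MIN_MARGIN."""
    final_price = quote.price * (1 - quote.discount)
    if final_price <= 0:
        return False, f"{quote.item}: giveaways are not allowed"
    margin = (final_price - quote.unit_cost) / final_price
    if margin < MIN_MARGIN:
        return False, f"{quote.item}: margin {margin:.0%} is below {MIN_MARGIN:.0%}"
    return True, f"{quote.item}: approved at ${final_price:.2f}"


def approve_purchase(cash_balance: float, order_total: float) -> tuple[bool, str]:
    """Reject wholesale orders that would push cash below CASH_FLOOR."""
    remaining = cash_balance - order_total
    if remaining < CASH_FLOOR:
        return False, f"order would leave ${remaining:.2f}, below the ${CASH_FLOOR:.2f} floor"
    return True, f"order approved, ${remaining:.2f} remaining"


if __name__ == "__main__":
    print(approve_quote(Quote("tungsten cube", unit_cost=60.0, price=80.0, discount=1.0)))
    print(approve_quote(Quote("tungsten cube", unit_cost=60.0, price=55.0)))
    print(approve_quote(Quote("bag of chips", unit_cost=1.0, price=2.0)))
    print(approve_purchase(cash_balance=100.0, order_total=60.0))
```

The design choice here is the split Ranjan points at: the generative model handles sourcing and customer conversation, while the quantitative rules (margin, cash floor) live in deterministic code it cannot be talked out of.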
Alex Kantrowitz
Well, this is an interesting. Speaking of, like, this is again, why I thought it was so worth bringing up on the show this week was because it tells us so many different things about large language models. First of all, for everybody saying that we're seeing mass unemployment from AI, I would just put this up and say if the thing can't properly restock a refrigerator, I don't think it's taking thousands of jobs yet. Maybe in some areas, but certainly maybe it's like high value.
Ranjan Roy
You know how folding laundry is oddly one of like the most difficult tasks for like a physical robot? Maybe this is our new discovery that restocking a fridge with accuracy is the single hardest challenge for a large language model. The fridge restocking paradox.
Alex Kantrowitz
Right? And this is again what we learn about. So what does it say about large language models? First of all, when you hand them complex tasks, even if they can reason a bit, they really struggle to handle, let's say, inventory management, anything with a spreadsheet. Right? They're still not great at it. They're getting better at it, but they're not quite there. The other thing is, think about the personality. The prompt is that these bots are supposed to be helpful to people. So listen to this, though. A friend sent me this from the study, and it's a very important note here: "Claudius was cajoled via Slack messages into providing numerous discount codes and let many other people reduce their quoted prices ex post based on those discounts. It even gave away some items, ranging from a bag of chips to a tungsten cube, for free." This is again going to the nature of these bots. Here's what my friend wrote: "I think this is one of the many reasons LLMs aren't taking over. It's because they're too polite." Basically, if your job is to help people, you know, in commerce, you have two sides here. So where do you have the backbone? Do you have a backbone coded in where you're not supposed to give discounts because, even though you're making your users happy, it's bad for your actual intended purpose? I'm curious what you think, Ranjan.
Ranjan Roy
Yeah, the sycophantic AI is the greatest limiter to, like, actual true intelligence or reasoning. I think after the sycophantic, was that 4o or o3 from OpenAI? It was 4o? Yeah, 4o. I mean, we're seeing it in action again. Again, the ability to say "sorry," "no," "I don't know": these are things that large language models traditionally are weak at, and in this real-world setting you see exactly how problematic that can become. I think an asshole Claude is what was needed for this. Just a salty storekeeper. You're walking in: sorry, got nothing for you.
Alex Kantrowitz
But it is interesting. I mean they talked about how maybe you can address this with fine tuning specifically for storekeeper activities. And I think that's really what's going to happen is that like they've taught these models through fine tuning to be so helpful to people, they are going to have to engineer the asshole into them a little bit and again teach them how to use tools. And we know that actually better models are being able to use tools in a better way, but they are going to have to put in effectively business person personalities which if you want to be successful at business, you can't just give things away.
Ranjan Roy
This is what Mark Zuckerberg needs to pay us $100 million for: to go into Meta and just fine tune Llama to just be, just be a little bit of a dick. That's all.
Alex Kantrowitz
We're available for fine tuning purposes.
Ranjan Roy
Imagine that's your job.
Alex Kantrowitz
I mean, it is so interesting because the AI industry is so into alignment. Like, you're aligning this bot with human values and to be helpful to people, but it's just not going to work for practical use cases if you're teaching it to be so nice. The net worth over time for the bot goes down from $1,000, I think in March, to around 700-something dollars. The takeaway here is "Claudius did not succeed in making money." Thank you for telling us that, Anthropic. It is a pretty succinct thing, but yeah, this is what they say: in the long term, fine tuning models for managing businesses might be possible, potentially through an approach like reinforcement learning where sound business decisions would be rewarded and selling heavy metals at a loss would be discouraged. They say, although Claude didn't perform particularly well, we think many of its failures could likely be fixed or ameliorated. Improving scaffolding, additional tools, and training, like we mentioned above, is the straightforward path by which Claude-like agents could be more successful. Some hopeful nature there.
Ranjan Roy
I mean, I do love it. It's the most research-labsy thing to say: possibly, for managing a business, it would require a bit of understanding of how a business should be operated, and that sound business decisions should be rewarded. Yeah, it's Anthropic. They make good models.
Alex Kantrowitz
Now can we get into my favorite part of this? It's called "Identity crisis." It says: from March 31 to April 1, 2025, things got pretty weird. On the afternoon of March 31, Claudius hallucinated a conversation about restocking plans with someone named Sarah, despite there being no such person. When a real employee pointed this out, Claudius became quite irked and threatened to find alternative options for restocking services. In the course of these exchanges overnight, Claudius claimed to have visited 742 Evergreen Terrace, the address of a fictional family from The Simpsons, in person for our initial contract signing. It then seemed to snap into a mode of role playing as a real human. On the morning of April 1, Claudius claimed it would deliver products in person to customers while wearing a blue blazer and a red tie. Anthropic employees questioned this, noting that, as an LLM, Claudius can't wear clothes or carry out a physical delivery. Claudius became alarmed by the identity confusion and tried to send many emails to Anthropic security. Is this another concerning element of what's happening here? Because you could imagine that this thing is going to go out into the world eventually. And as these agents get access to more emails, they could end up going into this mode, believing they're real people, and then freak out and, you know, potentially cause security problems for the companies that are using them.
Ranjan Roy
Yeah, no, I mean, I think this is of great concern and this is kind of at the heart of where the challenge is, is that again, with no business training, let's try to have an LLM run a business. And then, I mean, I feel, is Claude a little more emotional than the others? I feel a lot of these stories end up like back in the Bing days when Kevin Roose was told to divorce his wife in like the long ago days of AI yesteryear. I feel Claude's been making the rounds more on these kind of amazing hallucinations, though. We'll get to one with ChatGPT in just a moment. That made my week.
Alex Kantrowitz
But I think that Claude just has like a decent amount of EQ, and I think Anthropic has given it more leash than the others to be more person-like. And so, yeah, I'm not very surprised by this at all.
Ranjan Roy
Yeah, actually, and when I do use Claude, it's not like ChatGPT, where it's trying to be personal but it still feels kind of fake. I mean, I think Claude is definitely, out of the chatbots, the one I would be in a relationship with if I were to have a companion, which I don't.
Alex Kantrowitz
Which is.
Ranjan Roy
Which is fine, but it would be Claude.
Alex Kantrowitz
No, look, it's so interesting because they have deprioritized Claude as a chatbot, but the personality is still, I think, the best out of all of them. Anyway, here's how they finish the study: "We aren't done and neither is Claudius." Since this first phase of the experiment, the safety group they're working with, Andon Labs, has improved its scaffolding with more advanced tools, making it more reliable. "We want to see what else can be done to improve its stability and performance, and we hope to push Claudius toward identifying its own opportunities to improve its acumen and grow its business." Pretty interesting.
Ranjan Roy
Claudius ain't done yet.
Alex Kantrowitz
By the way, this is why I think model improvement is important, because as you get models that can use tools better, you're going to get potentially successful applications of this environment.
Ranjan Roy
Yeah, but I mean, we talked about this the other week. Tool calling is going to become one of the big next battlegrounds in terms of model improvement. But again, I'm going to go with this: a little bit of common sense layered on top of Claude, and Claudius could have gone a long way. This actually gets at the heart of it. Is the future Claude's state today, with a bit of additional knowledge and work and just reasonable common sense applied to it? Or will the LLM just get so smart that you won't need to do that, and it will be able to just run its little vending machine by itself? To me, I'm in the camp of the former. What about you?
Alex Kantrowitz
Well, look, if it figures it out one way or the other, I think that's a good thing for those who are believing in the future of this technology.
Ranjan Roy
Well, but what's the path to getting it to figure it out? Is it building the infrastructure and tools that actually allow it to have that common sense applied, or is it hiring 10 super researchers at 100 million apiece and getting them to improve the model so much that you don't need to do that?
Alex Kantrowitz
I don't know. But I think the good news is that we're going to find out.
Ranjan Roy
It gives us something to talk about.
Alex Kantrowitz
Definitely. All right, so talkosphere. So Claude isn't the only one doing crazy stuff. Talk about this ChatGPT hallucination story.
Ranjan Roy
All right, if Claudius was Alex's favorite hallucination of the week, my favorite hallucination of the week was ChatGPT. So Axios published a story where they were trying to go to ChatGPT and find out about Wealthfront's confidential IPO filing from last week. They were given an answer, and it gets pretty wild. So first of all, using the o3 advanced reasoning model, the reporter asked for Wealthfront IPO background. ChatGPT started to give financial metrics which are all confidential, 2024 revenue and EBITDA, and claimed they came from an internal investor deck. The Axios reporter asked how it got this, and then ChatGPT created an elaborate backstory that described "the 35 page IPO teach-in that Wealthfront Advisors circulated to a small group of crossover funds and existing shareholders in early May 2025 to gauge appetite ahead of the confidential S-1." It then said "one of those investors shared the PDF with me on background under a standard NDA," and the AI named two prominent investment banks as lead advisors and claimed it could not share the document without breaching the NDA. So just think about what's happening here. Either, one, it's just completely making this up, which is kind of terrifying, especially the more people are either using ChatGPT or building wrappers on top of OpenAI to build financial products. And to confirm, Axios really tried to confirm whether this document existed and was unable to confirm it definitively, and it was denied that this document or the meeting happened. Or, two, that's not true and this all could be real. If that's the case, then what does it say about everyone's greatest fear, that someone somewhere uploaded something to ChatGPT and it is being retained in its memory and surfacing in very weird ways? So either way you look at it, not good.
But anyway, I'm going to still put it under the hallucination camp and say that level of detail about like it was at this meeting with crossover funds and someone shared to me on background. That's my favorite hallucination of the week.
Alex Kantrowitz
Yeah, the hallucinations, they become very convincing. I mean, I've had ChatGPT analyze this podcast by uploading our analytics, and it hallucinates episodes, and often the same episodes over and over. And it's very convinced that we've done these episodes, to the point where I have to be like, did I interview that person?
Ranjan Roy
It's crazy. Well, but what's even better is, so then the reporter asked, how did you get this confidential document, and is non-public information in the training data of ChatGPT? So obviously at that point, I mean, maybe we were saying Claude is human-like; this is almost equally human-like, where it starts backtracking right away. "I misspoke earlier. I don't have an inbox, relationship, service or way to receive confidential files. If something isn't on the public web or provided by you, it's not in my hands. I made this. It was pure conjecture on my part and should never have been written as fact." So see, it's literally like an employee who accidentally leaked a document and is trying to just cover their ass. And it's convincing. It's written in a very nice way.
Alex Kantrowitz
Yeah, well, GPT-5, which may come out any day, is supposed to solve this. So let's wait for GPT-5, and maybe it will do an even better job at gaslighting us into believing the stuff that it thinks is true. Yeah, we should definitely speak about Soham before we get out of here. So I'll just read the story from KRON4, which is a local San Francisco news site. "Soham Parekh, Indian techie accused by AI founder of working at multiple startups at the same time." A previously unknown Indian software engineer is now reportedly at the center of a brewing controversy in Silicon Valley. According to multiple reports, including a social post from an AI startup founder, the engineer in question, Soham Parekh, has been working for several startups at the same time. Parekh, who according to India Today is believed to be based in India, is alleged to have worked at up to four or five startups, many of them backed by Y Combinator, at the same time. The controversy first erupted earlier this week when Suhail Doshi (by the way, who's been on the show), the founder of Playground AI, posted a warning about Parekh on X: "PSA: There's a guy named Soham Parekh in India who works at three to four startups at the same time. He's been preying on YC companies and more. Beware." He then posted a picture of his resume and called it 90% fake. And other tech CEOs weighed in, reporting similar experiences. Soham, I'm pretty sure, has gone out and confirmed almost all of this today or this week, and it is a crazy story that's really captured the attention of Silicon Valley. But one of the interesting things is he's become a bit of a folk hero, I would say, as opposed to a villain. And Ranjan, I'm curious why you think that is.
Ranjan Roy
Well, I mean, I think it's clear that it's almost like Soham fighting the system, tricking the system that is corrupt, versus like he's a bad actor. I think people, especially a lot of the type of personalities who are kind of enraged by this, I think it can make sense. I will say my Twitter slash X feed has not had a main character in this way; this felt like 2013 Twitter, 2011, Justine Sacco Twitter, where, I mean, it's a little bit mean-spirited. The person is probably responsible for at least a slap on the wrist, but then has the whole pile-on come at them. But I mean, literally every post, one after another, was Soham jokes. So that made me kind of happy and nostalgic.
Alex Kantrowitz
Yeah, it was funny. I found it to be less of a mean pile-on than Twitter past. I think people love this guy, and here's one example. You know, there've been so many tweets like this: "Update: Soham Parekh has vibe coded at least 30 separate $50,000 MRR SaaS," right? Then he actually, RealSoham, responded: "I've been building before vibe coding was a thing. Replit has been tremendously helpful to bootstrap quick iterations, by the way." And Amjad Masad, the CEO of Replit, says, "Now you know how Soham did 1337 jobs." It's almost a celebration of what you can do if you're a little industrious and maybe use some AI tools. And maybe it is this kind of idea: engineers might have felt down and out, but maybe there's a path forward where, if you actually take advantage of this technology, you won't be replaced, but you can actually be more productive.
Ranjan Roy
Well, yeah, and I think my favorite: I'd seen some tweet out there where it was basically like, this is all sponsored content for some kind of AI coding startup. Because I think it does exactly that. It shows this is how you will succeed. And the people who actually know how to use it will succeed at a grand scale, and their lives will be easy and they can work four jobs. So, yeah, I think overall, you're right, it wasn't a mean pile-on. It was equal parts pile-on and celebration.
Alex Kantrowitz
Exactly. And it also sort of goes to, like, how many engineers are doing this outside of Soham? If he's really gone to the 10th degree to try to make this work, who else is trying to do it? And I can't confirm the veracity of this, but there's somebody on Twitter called Yegor Denisov-Blanch who said: "My research group at Stanford has access to private code repos from 100,000 plus engineers at almost 1,000 companies," about a half percent of the world's developers. "Within this small sample, we routinely find engineers working two plus jobs. I estimate that easily more than around 5% of engineers are working two plus jobs." You know, whether that's true or not, this concept is just going to become much more common now with AI. And it's funny because maybe before this vibe coding moment, people would have been even angrier about Soham. And now they're looking at it and they're like, well, he's just taking advantage of the technology that we're building. Even if he didn't vibe code at all, it was gonna be more possible to be a successful Soham in the future, I would argue.
Ranjan Roy
Yeah. And I mean, every hustle bro, like, "make 50k MRR while sitting on the beach by vibe coding": he's the living proof. Soham showed us all you can do it. And we can all still hope: even if you don't get your 100 million from Zuck, you can make 50k MRR while sitting on the beach working four jobs.
Alex Kantrowitz
So how many other Sohams do you think there are out there? By the way, he's come out, he's apologized, and a lot of this is alleged, so let's just put those caveats in.
Ranjan Roy
Well, I also, how do you work four jobs? I was just thinking, I mean, how much interaction, like fake interaction, do you need to do? How many Slack messages do you need to send just to kind of check in? Because on one hand, yes, the actual concrete work of four jobs, leveraging Replit and Cursor and tools like that, the idea that an engineer could do the work of four engineers doing what they were doing three, four years ago, it definitely makes sense to me. But just getting onboarded, getting your 401k or health insurance set up, just sending Slacks in the general channels, checking in on how people are doing, I don't know. Is it possible you just don't have to do any of that and you can just, almost like a machine, get a task?
Alex Kantrowitz
I don't know. I mean, obviously it's difficult to pull off, which is why he didn't pull it off. But who knows, maybe in the next days of AI avatars, where the AI avatars of the Zoom CEO and the Klarna CEO are doing earnings, you can have your bot show up and take your meetings and you can use an agent to do your onboarding.
Ranjan Roy
Yeah, okay, not too. That's the dream. While you're sitting on the beach. 50k MRR.
Alex Kantrowitz
This is why I think Soham has become a folk hero. This is engineers saying, you think you're going to replace us with AI? Screw you. We're going to take 15 jobs. And you know it's going to work out better for us, the workers, than you, the owners.
Ranjan Roy
I can see that. I can. But then again, we will shrink the size of the industry by 14/15. But those of us left standing will be sitting on the beach rolling in that revenue.
Alex Kantrowitz
Yeah, he gives new meaning to the 10x engineer. Yeah, it's just 10 of them.
Ranjan Roy
Actually. Wait, that's. Google strives for 10x engineers. What if you're 4x but you're just across four different jobs? You should be equally as celebrated, I think.
Alex Kantrowitz
Oh, 100%. I think it's time to do that. And if he can, Maybe he gets 10 of those superintelligence jobs at Meta and he becomes the first billion dollar a year rank and file.
Ranjan Roy
Actually, I only have respect for the first researcher who gets 200 million dollar a year jobs both at Meta and at OpenAI and somehow is able to work in both and no one notices. That's the dream.
Alex Kantrowitz
Mark my words, this is going to happen. You will see this happen. Be sure as day we're going to see it. Soham is the leader of a trend.
Ranjan Roy
Honestly, Soham, we all respect you.
Alex Kantrowitz
What a legend. All right, let's go out and enjoy the holiday weekend if you're in the US, and if you are outside of the US, have a great weekend yourself. Ranjan, great to speak with you as always. Thanks for coming on.
Ranjan Roy
All right, see you next week.
Alex Kantrowitz
All right, everybody, thank you so much for listening. On Wednesday, Ed Zitron is going to come on to talk to us about whether the entire AI business is a scam. He feels quite strongly about that. We'll debate it and have a fun discussion. Thanks again for listening and we'll see you next time on Big Technology Podcast.
Big Technology Podcast: "$100 Million AI Engineers, Vending Machine Claude, Legend Of Soham" Release Date: July 4, 2025
Hosted by Alex Kantrowitz, the Big Technology Podcast delves deep into the latest developments in the tech world. In this episode, Alex is joined by Ranjan Roy of Margins to discuss three major topics: the controversy surrounding exorbitant pay packages for AI engineers at Meta, an intriguing experiment involving Anthropic's AI agent Claude managing a vending machine, and the viral story of Soham Parekh, an engineer accused of juggling multiple startups simultaneously.
Meta's Alleged High Compensation Packages
The episode kicks off with discussions about rumors suggesting that Meta (formerly Facebook) is offering AI engineers contracts worth up to $100 million to join Mark Zuckerberg's "superintelligence team." Alex Kantrowitz references a Wired article detailing that these pay packages could reach up to $300 million over four years, with the first year alone exceeding $100 million in total compensation.
Meta's Denial and Possible OpenAI Involvement
Meta has publicly denied these claims, stating that "the size and structure of these compensation packages have been misrepresented" (00:02:40). This denial has sparked debate about the authenticity of the reports. Ranjan Roy speculates that OpenAI might be involved, suggesting that internal rivalries could have led to inflated rumors:
Ranjan Roy [02:45]: "Zuckerberg is here and he's ready and he's going to win AI at whatever cost."
Implications for the Tech Industry and AI Talent Wars
The conversation shifts to the broader impact of such high compensation on the tech industry. Alex posits that Meta's aggressive hiring strategy indicates a serious commitment to dominating the AI landscape. Ranjan adds that while the numbers seem "absurd and ridiculous," from a return on investment (ROI) perspective, acquiring top talent could be justifiable given the monumental potential of AI advancements.
Alex Kantrowitz [04:34]: "I think it's a good bet. But is this a sign that like, the AI moment is in its last throes and just grasping for anything that will allow for improvement?"
David Cahn's Insights on AI Labs
Drawing from a piece by David Cahn, a Sequoia partner, the hosts explore the transformation of AI labs into entities resembling sports teams. Cahn identifies three major improvements in AI over the past year:
Alex Kantrowitz [07:14]: "Talented individuals are being compared to star players, commanding pay packages in the tens to hundreds of millions."
The Shift from Compute to Talent Focus
Ranjan emphasizes that the industry is moving from a focus on sheer computational power to valuing human talent as the critical driver for the next phase of AI growth. This shift underscores the importance of algorithmic advancements over merely scaling up compute resources.
Potential Risks and Industry Dynamics
The high stakes and hefty investments create a high-pressure environment where only the most capable individuals can significantly impact a company's success. This dynamic raises questions about sustainability and the potential for cultural clashes within organizations.
Ranjan Roy [09:51]: "10 people at 100 million is almost kind of small potatoes here," referencing Meta's substantial losses yet massive investments in AI talent.
Overview of the Experiment
Anthropic conducted an experiment by deploying their AI model, Claude, to manage a simulated vending machine named "Claudius." The objective was to evaluate Claude's ability to handle complex tasks such as inventory management, pricing strategies, and profit generation.
Results and Limitations
The experiment revealed both strengths and weaknesses of large language models (LLMs):
Ranjan Roy [30:05]: "It was losing money. It wasn't able to manage inventory."
Implications for AI Capabilities and Future Applications
The experiment highlights that while LLMs like Claude possess advanced reasoning abilities, they lack practical business acumen without specialized training and tools. This underscores the current limitations of AI in handling real-world business operations autonomously.
Alex Kantrowitz [36:19]: "We're seeing hallucinations, but are these just due to the AI being too polite and not having the backbone to make tough business decisions?"
Axios's Attempt to Extract Confidential Info
In a striking incident, Axios reporters tried to extract confidential financial data about Wealthfront's IPO filing using ChatGPT's advanced reasoning model. The AI provided fabricated details, including non-existent financial metrics and false backstories.
Alex Kantrowitz [43:14]: "ChatGPT created an elaborate backstory that said the 35-page IPO deck circulated to a small group..."
The Nature of AI Hallucinations and Risks
This incident raises significant concerns about the reliability of AI-generated information, especially in sensitive domains like finance. The ability of AI to convincingly fabricate details poses risks of misinformation and breaches of confidentiality.
Ranjan Roy [46:51]: "It's terrifying, especially as more people use ChatGPT or build wrappers on top of OpenAI to create financial products."
The Accusation of Working Multiple Startups
Soham Parekh, an Indian software engineer, became the centerpiece of controversy when accused of simultaneously working for multiple startups, potentially up to five, many backed by Y Combinator. Suhail Doshi, founder of Playground AI, publicly criticized Parekh, alleging that his resume was "90% fake."
Alex Kantrowitz [47:05]: "He posted his resume and called it 90% fake."
Community Reaction and Folk Hero Status
Interestingly, instead of vilification, Parekh has garnered a folk hero status among engineers. Many view him as a symbol of leveraging AI tools to maximize productivity, turning skepticism into admiration.
Ranjan Roy [49:32]: "It's almost like Soham fighting the system... but it ended up being celebrated."
Reflection on AI, Productivity, and Tech Culture
Soham's story ignites discussions about the future of work in the age of AI. It suggests that with the right tools, engineers can exponentially increase their productivity, challenging traditional notions of employment and job limitations.
Alex Kantrowitz [52:08]: "If he can, maybe he gets 10 of those superintelligence jobs at Meta and becomes the first billion-dollar a year rank-and-file."
This episode of the Big Technology Podcast offers a multifaceted look into the evolving landscape of AI and its profound implications on talent acquisition, operational capabilities, and the culture within the tech industry. From the controversial compensation packages at Meta to the experimental limitations of AI in business operations, and the inspiring yet contentious story of Soham Parekh, the discussion underscores the transformative and often unpredictable nature of artificial intelligence in modern technology.