Transcript
A (0:00)
Anthropic is cracking down and tightening their usage limits for Claude Code. And this is interesting for a number of reasons. This is a problem we actually had over the weekend, and a little bit last week. But I think it's interesting because it's a problem that basically every AI company is going to face at some point. And you as a user of AI tools are going to run into this issue, whether you use Claude or Google or OpenAI or anyone else. And it's all kind of rooted in the not-very-transparent way that some of these companies talk about their usage limits. We're going to get into all of that, what you can do, and what these AI companies are doing on the podcast today. But before we get into it, I wanted to mention: if you want to try the top 40 different AI models, including everything from Claude and everything from OpenAI, check out my own startup, AI Box AI. We have a platform which is in beta right now where you essentially get access to the top 40 models. You get Anthropic, Cohere, DeepSeek, Google, Meta, Microsoft, Nvidia, OpenAI, Qwen, xAI, a bunch of image models, and text-to-speech and speech-to-text models. It's all $20 a month. You can try all of them, and you can use them on the same thread, which personally I find way more useful, because I'll be talking to ChatGPT to brainstorm something analytical, then I'll switch to Claude and just pick up the conversation there to give it a better tone, and then I'll pull in something like Ideogram to generate images of what I'm working on, all inside the same thread. It's super, super nice. So in any case, if you want to try it out, it is AI Box AI. There's a link in the description, and I'd love to hear what you think about the platform. All right, let's get into what Anthropic is doing here. 
So the first thing people were upset about, and also I'll explain why you can't be too upset. The main thing is this plan they have called the Max plan. It's $200 a month; all these platforms have $200-a-month plans, right? So big whoop. It's for developers, specifically for people using Claude Code. This is actually something we use at AI Box, and our CTO sent me this article before I even saw it, because he was like, ah, dang it, maybe the party's over. Basically, before we got on the Max plan, we were spending a hundred dollars every two days on Claude Code credits. We were just like, that's just how much it costs, right? And you can imagine a hundred dollars every two days gets very expensive. But we were actually thrilled, because we were getting so much done, front end and back end. Claude Code pretty much ties into your whole code base, and it's a text interface where you say, hey, go to all of our sales pages and update the footer to be X, Y, and Z. Or you could say, hey, go create a new feature on this tool we have that's able to automatically upload and generate images that do X, Y, Z. You could do really impressive things. It's not, quote unquote, vibe coding, where all these guys are over on Lovable and other apps making front-end interfaces; it was doing some serious stuff. Now, was it perfect? No. Sometimes it would just break everything, and you'd have to revert the change, go figure out why, and change how you asked the question. But we got very good at it and came up with some really good steps for how to ask Claude Code questions; very elaborate stuff. Essentially, we were able to get a ton done for way less than what we would be paying engineers. 
The problem was it still was pretty expensive, because we were getting so much done. And there's this Max plan, which is essentially $200 a month. Once we realized it existed, because I'd seen people on Twitter talking about it just launching, we tried it out. And it was kind of complicated, because you actually can't get it just by clicking on it; you have to contact them, send them a message, and get them to add it to your account, all this sort of annoying stuff, but whatever. Basically, we went from paying $100 every two days to $200 a month, which seems incredible. And we are not the only ones, because there's someone who commented on a story over at TechCrunch who is a super heavy user of Claude Code. They wanted to remain anonymous because they didn't want people to know who they were. But they said this Max plan lets them make about a thousand dollars' worth of calls every single day. He was measuring by API pricing: looking at how much output he was getting and what it would have cost through the API. So he's getting $30,000 a month worth of output, or $20,000 if he's not working weekends, for $200 a month. It's insane, right? That's way too much, way too generous. I mean, we love it, but it's probably too generous for the company. So basically, this guy said he wasn't surprised the usage limits were becoming more restrictive. But what he said, what everyone said, is: just be transparent. He said the lack of communication just causes people to lose confidence in them. Although we're all also kind of crying and whining while saving thousands and thousands of dollars, so take it with a grain of salt. 
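To put that anonymous user's numbers in perspective, here's a quick back-of-envelope sketch of the arbitrage. The $1,000-a-day figure is the user's own API-pricing estimate; the 30-day versus 20-working-day split is just an assumption to cover the "not working weekends" case:

```python
# Back-of-envelope math on the Claude Code Max plan arbitrage.
api_value_per_day = 1_000     # dollars of API-equivalent output per day (user's estimate)
plan_cost_per_month = 200     # Max plan price in dollars

value_all_days = api_value_per_day * 30   # using it every day of the month
value_weekdays = api_value_per_day * 20   # weekdays only, ~20 working days

ratio_all = value_all_days / plan_cost_per_month       # 150x the plan price
ratio_weekdays = value_weekdays / plan_cost_per_month  # 100x the plan price

print(f"${value_weekdays:,}-${value_all_days:,}/month of output "
      f"for a ${plan_cost_per_month} plan ({ratio_weekdays:.0f}x-{ratio_all:.0f}x)")
```

Even on the conservative weekdays-only estimate, the plan is underpriced by a factor of a hundred relative to API rates, which is why a clampdown was predictable.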
So basically, what happened, and what could the implication be for other AI companies? There's kind of this arbitrage where you're getting very cheap code output for 200 bucks a month. They started restricting it, but they didn't tell people they were restricting it. People would just get an error message that said "Claude Code limit reached," and then you were given a time, typically within a matter of hours, when your limit was going to reset. But there was basically nowhere online you could see any sort of announcement about a change in limits. A lot of people just thought their subscription had been downgraded or that their usage was being inaccurately tracked. One person was complaining over on GitHub, where you can go and see people's comments and complaints about what's going on. That user said: your tracking of usage limits has changed and is no longer accurate; there's no way in 30 minutes of a few requests I've hit the 900 messages. Right? So some people like us are going crazy on it, and some people are using it less, but it's still useful to them. What was really interesting is that when someone reached out to Anthropic about it, they basically said, yes, we know we have this problem and there's something going on. But they didn't really elaborate further. And it's kind of hard for them to elaborate, right? Because if they said, hey, look, we had some sort of outage, or our GPUs went down, or our servers went down, then people would be like, okay, well, we want a refund, or some sort of discount, or whatever. So they're being very not transparent about the whole thing, other than saying, yeah, we see an issue, we're trying to fix it. 
What's kind of funny to me is that the people I heard talking about this issue basically said, okay, we're going to go try different platforms. One person in particular said their whole project got killed by this usage limit; they couldn't work on it at all. I'm assuming they're doing some crazy stuff with front end and back end, and when we do that, we use an insane amount of compute. Anyway, this user said it has been impossible to move their project forward with all these limits. They said, quote, it just stopped the ability to make progress. I tried Gemini and Kimi, but there's really nothing else that's competitive with the capability set of Claude Code right now. And this is true. The amazing thing about Google Gemini is it's got a million-token context window, so you can put a huge file in there. But guess what? Some code bases are way bigger than that. And besides, you don't want to copy and paste your entire code base into Gemini to ask it a little question. Claude Code is amazing because it's built right into your terminal, so in VS Code or whatever, you can be working right where your code actually lives, and it's editing everything right there. You're not copying and pasting it off-site somewhere else, like Elon Musk recommended doing with Grok. And look, I welcome Grok or Gemini to build tools like Claude Code. But right now Claude Code is just the best, because it plugs into what you currently use, it does the front end and the back end, and it does all the design as well; it looks at your designs. Anyway, it's amazing. 
I'm sure you've heard me talk about it too much, but it is really impressive, and like this person was saying, there are no other options. So it's kind of interesting: we had this really big unlock in coding capabilities thanks to it, but there aren't a lot of alternatives, and I would love for there to be some. They just don't exist right now with what Claude's able to do. So what does this mean, and why were they able to do this with basically no one able to get mad at them? What's interesting is that if you go to their website and look at the terms for Claude's Max plan, how much usage you get out of it, it doesn't say you get a certain number of tokens, or a certain amount of compute, or basically anything concrete. They have one plan below Max, called expanded usage, at $100 a month. So there's regular Claude Pro at 20 bucks a month, expanded usage at $100 a month, and Max at $200. So what do you get? They don't really explain. All they say is that expanded usage is 5 times more usage than Pro, and that Max is 20 times more usage than Pro. So theoretically, if they throttle everyone on the Pro tier, you're still getting 20 times the usage of Pro; everyone just gets throttled equally. So there's really nothing people can do to complain or get compensated, because they never said you got a certain amount, just a certain multiple. And if they drop how many tokens the Pro plan gets, everyone else drops 5x or 20x along with it. In any case, I think that's kind of a sneaky thing they did, and I think other AI companies are thinking along the same lines, where they don't really say, oh, you can generate five videos an hour. 
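The mechanics of that relative-multiplier pricing can be sketched in a few lines. This is a hypothetical illustration, not Anthropic's actual implementation, and the message counts are made up; the only thing taken from the source is that higher tiers are defined as 5x and 20x of whatever Pro gets:

```python
# Hypothetical sketch: limits defined as multiples of the Pro tier,
# never as absolute token or message counts.
TIER_MULTIPLIERS = {"pro": 1, "expanded": 5, "max": 20}

def tier_limits(pro_limit: int) -> dict:
    """Each tier's limit is a fixed multiple of whatever Pro currently gets."""
    return {tier: mult * pro_limit for tier, mult in TIER_MULTIPLIERS.items()}

# Before throttling: suppose Pro gets 45 messages per window (made-up number).
print(tier_limits(45))   # {'pro': 45, 'expanded': 225, 'max': 900}

# After throttling Pro down to 30, every tier shrinks proportionally,
# yet the Max plan still truthfully delivers "20x more usage than Pro".
print(tier_limits(30))   # {'pro': 30, 'expanded': 150, 'max': 600}
```

The design choice to sell multiples rather than absolute quotas is what lets the advertised terms stay technically true while every customer's real allowance drops.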
They just say something like, it's five times as many videos as our free tier, and the free tier can be whatever they decide to throttle it at. They're really just not transparent about it. So I think this is kind of interesting. If you go to their website, it still says their network has had a hundred percent uptime for the week, so they're not calling it an outage. It's probably just a lot of usage, and they had to dial back what they were allowing people to use. As of now, I think it's mostly settled down, but over the weekend it was a really big deal; a lot of people were complaining and getting this error, and I think it's something that will come back as usage spikes in the future. And I don't think Anthropic is the only company that will do this; I think we're going to see this from basically every company. We even see it when OpenAI makes a big new product announcement and everyone's trying to test it out. I remember Sam Altman, I think it was with memory, saying, oh yeah, everyone gets memory tomorrow, and then the next day it was: actually, too many people wanted to try memory, so it's going to take about three weeks to roll it out to everyone while we ramp up our capacity to support it. These things get very popular, and it's really hard to supply all of the demand sometimes. But it's going to be interesting. At the end of the day, it'd be great if these companies were transparent about it: hey, we're seeing high volumes of usage, we've throttled everyone a little bit right now, we hope to have this sorted out soon, spin up more servers, etc. That's not typically the way they want to communicate, though, because they're worried people will complain and try to get some sort of compensation. So in any case, fascinating times. 
If you learned anything in the podcast today, the number one thing you could do to say thank you is to leave a rating and review wherever you get your podcasts. It helps the show out a ton. And if you want to try the top 40 AI models, go check out AI Box AI, my startup. I've been thrilled by how many of you have tried it and given me amazing feedback. Anyone who sent me a message, found a bug, or sent a feature request: we're working on all of them right now, and we're going to have a really exciting update soon with some incredible new features. So go check it out. It's $20 a month at AI Box AI; the link's in the description, and I will catch you guys all next time.
