I was recently watching a video by Mo Gawdat. It was a keynote he was giving. He's a former Google X executive, and he was saying that AI is no longer just writing code, it's actually correcting human math. He gives this really incredible example where he says that basically for the last 56 years he's been using the same matrix multiplication method for code. And this is something that he said is very standard; people have agreed on it for a very long time. And he said that recently he was talking to AI and telling it basically to improve itself. And he said when he told AI to improve itself, the AI realized that their matrix multiplication method was flawed. And so instead of trying to go and optimize the software that they had for AI, which is what he assumed it would do, it invented a completely new way of doing math to essentially optimize itself. And he said that that new invention resulted in a 26% performance boost and the removal of hundreds of millions of dollars in cost and energy use for Google. So, this massive uptick in optimization.
This is a fascinating concept. When I first saw that, I was really fascinated by the fact that AI is definitely getting better at math. But beyond just getting better at solving math the way that we might solve it, it's creating new ways to solve math and coming up with completely new methods when it thinks that our methods are flawed. So today on the podcast I want to get into AI and math and where it is today, because there's also a whole bunch of really interesting news about math problems that have been solved recently. And I think it's easy to talk about AI hallucinations and how, you know, AI can't do XYZ. I honestly think that, beyond the hype, what I'm actually seeing in my day-to-day use of AI is that it is getting startlingly good and it's improving very quickly.
And I think a lot of that isn't necessarily the model getting better, but the tooling we're adding. Anyways, we're going to get into all of it on the podcast. Before we do, I wanted to say: go check out the latest updates I've done to AIBox.ai, which allow you to build any AI tool you want without knowing how to code. You just prompt it to build something and it will link together all of the AI models, put in the prompts and build something cool. Most recently I saw someone created a Bible story graphic novel generator. That was a really cool tool that I'm sure my children will love. But there are so many different options. If you want to go check it out, there's a link in the description to AIBox.ai. You can go try to build something and check out a whole bunch of things that other creators are building.
All right, let's get into the state of AI and math today. I wanted to start this off by saying that AI models right now are starting to crack a whole bunch of high-level math problems. I was recently on X and I saw a tweet from Bartosz Naskręcki where he said GPT-5 Pro solved, in just 15 minutes and without any Internet searches, the presentation problem known as Yu Tsumura's 554th problem. He said this is the first model to solve this task completely. He expects more of these kinds of results. The model showed that it had a really strong grasp of elementary abstract algebra reasoning. So these models are getting better and better at solving problems, but they're also doing really well in math competitions and other areas. Deedy recently posted on X and said AI just achieved a perfect score on the hardest math competition in the world. The Putnam has 12 problems and they are each worth 10 points. The highest score last year was 90. The median was zero. Axiom's AI prover in Lean scored 120 out of 120 and just shared all of the solutions. Huge milestone in AI.
I saw another really interesting story where, over the weekend, Neel Somani, who is a software engineer (he's a former quant and now he has a startup), was testing how well OpenAI's newest AI model could handle really difficult math problems. And he said he saw something that really surprised him. Essentially he pasted in a really long unsolved math problem. So there are these lists online, by the way. There's one Hungarian mathematician who's got like a thousand problems posted online that have never been solved. And basically people take them and they put them into AI models to see if the AI model can solve them. And it's like, oh my gosh, AGI is here, the AI model could solve it. Recently we had one last year that Google's AI model solved. And so anyways, it's always an exciting thing when they get solved.
So he pasted an unsolved math problem into ChatGPT. He let it run for 15 minutes. He came back and it had a solution for him. But you never know, right? Maybe this is just hallucinated. So he goes and checks the solution and it turns out that it was actually right. There are these online verification tools. One of them in particular is from Harmonic, and it's basically designed to make sure that the logical arguments of solving a math problem are sound. And apparently once he pasted in ChatGPT's response, everything checked out. It said it was accurate. This is a quote from him. He said, I wanted to get a sense of where AI systems can actually solve open math problems and when they still get stuck. So I think he was really surprised by just how it had actually solved this problem. So problems that were previously out of reach, he says he thinks are now solvable.
Okay, but I want to talk to you about how it solved this problem, its line of reasoning. Because what's cool with ChatGPT and with reasoning is you can go and look through its chain of thought.
And so obviously he was formerly a quant, he's a huge math nerd, so he can go and understand its chain of thought. And it was fascinating. So this is how ChatGPT got to solving this very complex, previously unsolved math problem. Basically what it did is it pulled out a bunch of well-known ideas from mathematicians. So first it does research, it gets all of these, and it tries to connect them in a logical way. In this particular case, it went and found some information about Legendre's formula, Bertrand's postulate and the Star of David theorem. And then, what absolutely blew my mind, it went and found an old MathOverflow post from 2013 where a Harvard mathematician named Noam Elkies had a really interesting solution to a related problem. Right. So not the same problem at all, but a related problem. It goes and finds that, and then instead of just copying how that Harvard mathematician solved that problem, it took a completely different approach and ended up producing a much more complete answer to the question, one connected to some works of Paul Erdős, who is one of the most influential mathematicians of the 20th century.
So I think for anyone that is skeptical about machine intelligence, this is amazing, because this isn't just one isolated example. I think a lot of people are seeing that AI tools are already being used by a lot of different researchers. They're helping with a lot of different things, searching through academic papers and checking complex arguments. But since GPT-5.2 came out, which is what Somani was using, he says that he has seen a huge shift in its reasoning compared to earlier versions. A lot of this is because of tooling. When you give tools to these models, different formulas and algorithms and calculators, they're better.
But its reasoning was so good. It was going and doing research, finding old posts, finding solutions to similar problems, adapting them to this problem, and writing more complete versions of it, which was incredible. So I think we're going to start seeing way more breakthroughs where AI is being used to do this. Now, it's kind of interesting because AI isn't just out there in a vacuum. It's not just running around solving all the math problems of the world. You have to point it in a direction. It needs a person to point it in a direction, for now, anyways. And so it's amazing, as we just pick what directions to point it in, how it's able to solve so many things.
Something that was interesting to me is that Somani was looking specifically at Erdős problems. It's just this famous list of a thousand unanswered math questions, like I was telling you about, which has been online for years. But what's interesting is the range of the problems. So there are some simple puzzles and there are some extremely difficult challenges. And anyways, this basically makes it a really popular benchmark for testing human and machine problem solving. And I think since Christmas, 15 problems on the Erdős list have moved from open to solved. So open means here's a problem no one has solved, and they've moved from that to being officially solved. In 11 of the cases, the published solution explicitly mentioned AI tools as part of the process. So whether that was an AI model 100% solving the problem or, most likely in a lot of these cases, a human using AI tools to help them, 11 out of 15 of those Erdős problems that got solved since Christmas were using AI tools. So in my mind there's just no doubt that this is really pushing the field forward in really novel, interesting new ways. That said, not everyone is claiming that AI can now replace mathematicians.
Terence Tao is one of the world's most respected mathematicians. He's basically tracked the progress carefully, and he says that in a whole bunch of different cases, AI systems produced meaningfully new ideas on their own. But in other areas, they helped by finding relevant past research that humans could build on. Right. Like in the case we were talking about earlier, it was going and finding some work that a Harvard mathematician had done, and it was adapting it to its problem. But to be honest, that's still amazing because it did come up with something new, and it did adapt it in a new way. So obviously it's drawing on something. I think a lot of mathematicians say this: look, it's still using human research. Well, of course it's using human research. Where do you think its training data came from? What was it fed? How can it do this? It's using human knowledge to be able to do this. But it is coming up with new, novel things, which is interesting because, at that rate, eventually it could just come up with stuff without human knowledge, theoretically. Right? That's the interesting thing.
Fully independent mathematicians are still a long way off, according to Tao, but he does say that those tools are already making a huge difference. There's a recent post he made where he suggested that AI might be especially good at tackling less famous, overlooked problems. A lot of those questions are not unsolvable; they just never get enough attention from human experts. AI systems can work really methodically. They can search through thousands of possibilities. And I guess, if I'm being a hundred percent honest, it's also because they don't get bored.
And because of this, Tao basically argued that a lot of those systems might be able to solve problems that humans have not solved on their own. Not because humans are incapable, but because, you know, we're humans. And solving really long, complex math problems maybe is boring for some people.
Okay, so another reason that I think progress is accelerating a lot is a growing focus on making math arguments easier to check. Traditionally, proofs are written in natural language, which can hide a lot of the small mistakes or some of the unclear steps. So there's a bunch of new software tools that allow researchers to translate those arguments into a really precise format that can be automatically verified. This process is slow and tedious by hand. But if you're using an AI system, which of course is getting increasingly better at helping with this, it makes it way easier to confirm results and then also to build on them. Right? Because the faster you can actually check that the AI model, or even a person, solved a problem correctly, the faster you can say, okay, here's how we solve this type of problem. Now we'll use this as a building block for future problems in this direction, and there are more problems that you can solve.
There's an interesting comment made by the founder of Harmonic, which makes this tool that checks the math problems. His name's Tudor Achim. And something that he said that's really interesting is that the most important signal is not how many problems actually get solved, but who's willing to use the tools. Here's a quote from him. He said, what matters is that serious math and computer science professors are actually using them. These are people whose careers depend on being careful and credible. So when they say they rely on AI tools, that says a lot. And honestly, I think it really does. I think the real-world implications are going to go a lot further than just math. Right.
Because the same abilities that help an AI reason through, for example, an abstract problem can actually be applied in fields like engineering, economics, medicine and science. Right. Progress in a lot of those fields is often slow because problems are really complex and they're hard to verify. So as the AI systems get better at actually exploring ideas and checking work and connecting past knowledge, they are going to dramatically speed up research and innovation. So for me, the advancements being made in AI and math aren't just exciting for math, but for so many other areas that are going to benefit. And the reasoning capability of these models improving means that so many different areas are going to improve, including software engineering, which personally is getting me really excited this week.
All right, guys, thank you so much for tuning into the podcast. If this was interesting, it would mean the world to me if you could leave a rating and review on the show. Basically, it just helps other amazing people like yourself find the show. It helps me rank in the algorithm on Apple Podcasts. So it's the number one way you could say thank you. As always, make sure you go check out AIBox.ai if you want to build tools without knowing how to code, and if you want to try over 40 of the top AI models in one place. 20 bucks a month saves you a ton of money. The link is in the description. All right, have a great rest of the day, guys.
Episode: AI: Humanity's Last Math Tool?
Date: January 15, 2026
In this episode, the host explores the rapidly evolving capabilities of AI in mathematics—not just as a tool for solving problems the traditional way, but as an entity now inventing new mathematical methods. The discussion focuses on how AI is revolutionizing the field by solving longstanding problems, enhancing human research, and potentially becoming a generational force multiplier for all technical innovation.
The episode opens with an anecdote about Mo Gawdat (former Google X executive), who highlights a striking breakthrough:
"AI is... creating new ways to solve math and coming up with completely new methods when it thinks that our methods are flawed." (02:00)
Implication: AI is not just "getting better at math," but discovering fundamentally new methods, changing the paradigms mathematicians have long accepted.
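To make "a new way of doing matrix multiplication" concrete, here is a minimal Python sketch. The episode does not name the method Gawdat was referring to; the 56-year figure is consistent with Strassen's 1969 algorithm, so that is used here purely as an assumption. It shows the kind of trick involved: multiplying two 2x2 matrices with 7 scalar multiplications instead of the usual 8. The function names are illustrative only.

```python
def naive_2x2(a, b):
    """Schoolbook 2x2 product: 8 scalar multiplications."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def strassen_2x2(a, b):
    """Strassen's 2x2 product: 7 scalar multiplications, more additions."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # The seven products (each line is exactly one multiplication).
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products into the four entries of the result.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(strassen_2x2(a, b))  # [[19, 22], [43, 50]]
print(strassen_2x2(a, b) == naive_2x2(a, b))  # True
```

Saving one multiplication looks minor, but applied recursively to matrix blocks it lowers the asymptotic cost from n^3 to roughly n^2.81, which is why finding schemes with fewer multiplications matters at the scale of AI workloads.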
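The host discusses recent news, including a solved abstract algebra problem (Yu Tsumura's 554th problem) and a post on X about the Putnam competition:
"AI just achieved a perfect score on the hardest math competition in the world." (06:00)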
Beyond competitions, AI models are now routinely tackling unsolved problems.
Neel Somani, a software engineer and former quant, pasted a long-standing unsolved problem into ChatGPT. After 15 minutes, the model returned a solution, which he checked with Harmonic's verification tool; everything checked out (08:50).
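The AI's chain-of-thought process involved gathering well-known results (Legendre's formula, Bertrand's postulate and the Star of David theorem), finding a 2013 MathOverflow post in which Harvard mathematician Noam Elkies solved a related problem, and then taking a different approach that produced a more complete answer to a question connected to the work of Paul Erdős.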
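For readers unfamiliar with the three classical results the episode says the model drew on (Legendre's formula, Bertrand's postulate and the Star of David theorem), the Python sketch below states each one computationally. It only illustrates what the results say; how the model combined them is not described in the episode, and the function names are made up for this example.

```python
from math import comb, gcd


def legendre_exponent(n, p):
    """Legendre's formula: the exponent of prime p in n! is
    floor(n/p) + floor(n/p^2) + floor(n/p^3) + ..."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total


def is_prime(m):
    """Trial division; fine for small illustrative inputs."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def bertrand_prime(n):
    """Bertrand's postulate: for n > 1 there is a prime p with n < p < 2n.
    Returns the smallest such prime, or None if the range is empty."""
    for p in range(n + 1, 2 * n):
        if is_prime(p):
            return p
    return None


def star_of_david_holds(n, k):
    """Star of David theorem: the two triangles of binomial coefficients
    around C(n, k) in Pascal's triangle have the same gcd."""
    left = gcd(gcd(comb(n - 1, k - 1), comb(n, k + 1)), comb(n + 1, k))
    right = gcd(gcd(comb(n - 1, k), comb(n, k - 1)), comb(n + 1, k + 1))
    return left == right


print(legendre_exponent(10, 2))   # 8, since 10! = 2^8 * 3^4 * 5^2 * 7
print(bertrand_prime(10))         # 11
print(star_of_david_holds(9, 3))  # True
```

Checks like these are the sort of thing the "tooling" mentioned in the episode can hand off to a calculator or script, leaving the model to do the connecting argument.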
The host emphasizes the importance of "tooling"—the additional functions like calculators and algorithmic helpers now available to AI models (12:10).
AI isn't just copying from human sources; per the host, it is searching out relevant prior work, synthesizing across sources, and adapting what it finds into new results.
"Its reasoning was so good, like it was going and doing research, finding old posts, finding, you know, solutions to similar problems, adapting them to this problem, and writing more complete versions of it, which was incredible." (13:20)
The current generation of AIs isn't independently seeking out problems—it’s still human experts who “point it in a direction.” (15:40)
The Erdős Problem List: Since Christmas, 15 previously open problems have been solved; 11 of those solutions credited AI tools, either fully solving or supporting human researchers (16:30).
"In my mind this is just no doubt that this is really pushing the field forward and in really novel, interesting new ways." (17:10)
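Terence Tao (renowned mathematician) observes that in some cases AI systems produced meaningfully new ideas on their own, while in others they helped by surfacing relevant past research; he also suggests AI may be especially suited to less famous, overlooked problems. The host responds to the objection that AI relies on human work:
"Of course it's using human research. Like, where do you think its training data came from?... But it is coming up with new, novel things." (20:20)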
"What matters is that serious math and computer science professors are actually using them. These are people whose careers depend on being careful and credible. So when they say they rely on AI tools, that says a lot." (24:30)
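The episode describes proofs being translated from natural language into "a really precise format that can be automatically verified." As a small illustration of what such a format looks like, here is a sketch in Lean 4, the proof language the episode mentions in connection with the Putnam result. The episode does not say which format Harmonic's tool uses, so treat the choice of Lean here as an assumption; the theorem names are made up for the example.

```lean
-- A statement and its proof. Lean's kernel accepts the file only if
-- the proof really establishes the statement; nothing is taken on trust.
theorem two_plus_two : 2 + 2 = 4 := rfl

-- A general statement about all natural numbers, proved by citing
-- a lemma from Lean's core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If either proof were wrong, for instance claiming `2 + 2 = 5`, the checker would reject it. That is the property that lets a verified result be reused as a building block without someone re-reading every step.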
The host projects these advancements will extend to fields like engineering, medicine, science, and economics, where complexity and verification bottlenecks slow discovery (25:20).
"As the AI systems get better at actually exploring ideas and checking work and connecting past knowledge, they are going to dramatically speed up research and innovation... The reasoning capability of these models improving means that so many different areas are going to improve." (26:00)
On AI's creative leap:
"Instead [of optimizing existing software] it invented a completely new way of doing math... a 26% performance boost and the removal of hundreds of millions of dollars in cost and energy use for Google." (01:50, summarizing Mo Gawdat's example)
On validation and trust:
"What matters is that serious math and computer science professors are actually using them... When they say they rely on AI tools, that says a lot." (Tudor Achim, founder of Harmonic, 24:30)
On changing research culture:
"Fully independent mathematicians are still a long way off... But those tools are already making a huge difference."
— Summarizing Terence Tao (21:00)
This episode makes a compelling case that AI is not simply automating human mathematical work, but is actively broadening and deepening the boundaries of mathematical discovery. With AI inventing new mathematical methods, solving renowned open problems, and enabling researchers to verify and build on each other's work more efficiently than ever, the nature of mathematical research—and by extension, scientific progress—may be on the cusp of dramatic transformation.