Why You Should Wait Out AI’s Super-Spending False Start - Merryn Talks Money


Podcast Summary

Merryn Talks Money – "Why You Should Wait Out AI’s Super-Spending False Start"

Date: April 13, 2026
Host: Merryn Somerset Webb
Guest: Dr. Janousz Meretsky, AI partner at Aaron Innovation Capital

Episode Overview

This episode delves into the current state and future prospects of artificial intelligence (AI), focusing on whether the wave of massive investment in AI infrastructure is justified or fundamentally misguided. Dr. Janousz Meretsky, with deep experience in both academic and commercial AI research, challenges the established narrative around large language models (LLMs) and the so-called AI revolution, questioning whether the super-spending on data centers and compute power is built on false assumptions. The conversation covers the technical limits of current AI approaches, investment strategies, the nature of AI "hallucinations," and how real breakthroughs might emerge from fundamentally new approaches.

Key Discussion Points & Insights

1. Defining AI and Its Current Capabilities

What does "AI" mean today?
- Dr. Meretsky explains modern AI as "a system which is approximating certain processes… They just approximate things, they don't solve intelligence." (03:08)
- Large language models (LLMs) like GPT-4 do not possess true intelligence, but provide an illusion through statistical approximation.

2. The Data Limitation and Model Collapse

The data bottleneck:
- "We have run out of high-quality, diverse data three and a half years ago… The last potent LLM was GPT-4, data for it finished at the end of 2022." (04:48)
- New data generated online is increasingly tainted by other LLMs, leading to "model collapse," where AI models train on low-quality or AI-generated content, degrading their performance.
  - "The new models now train on the entire internet, training on the output from other models… that, using a technical term, ends up leading to something called model collapse, where the models themselves are actually getting dumber." (06:36)
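The feedback loop described above can be illustrated with a toy calculation that is not from the episode. Assume each model "generation" is a Gaussian fitted by maximum likelihood to n samples drawn from the previous generation. The expected fitted variance then shrinks by a factor of (n - 1)/n per generation, so diversity drains away even though each individual fit looks reasonable. The function name and parameters are hypothetical, and real LLM training is far more complex than this sketch.

```python
def expected_variance_after(generations: int, samples_per_generation: int,
                            initial_variance: float = 1.0) -> float:
    """Expected variance of a Gaussian that is repeatedly refit on its own samples.

    Assumption (toy model, not from the episode): each generation is the
    maximum-likelihood Gaussian fitted to n samples from the previous one.
    The MLE variance estimator is biased: E[var_hat] = var * (n - 1) / n,
    so the expectation shrinks geometrically with each generation.
    """
    if samples_per_generation < 2:
        raise ValueError("need at least two samples per generation")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    shrink = (samples_per_generation - 1) / samples_per_generation
    return initial_variance * shrink ** generations


if __name__ == "__main__":
    # Print how much of the original spread is expected to survive.
    for g in (0, 10, 100, 1000):
        remaining = expected_variance_after(g, samples_per_generation=100)
        print(f"generation {g}: expected variance {remaining:.6f}")
```

The point of the sketch is only directional: training on self-generated data loses a little of the original distribution each round, and the losses multiply.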

3. Technical Limits of Current Generation AI

Diminishing returns from scaling:
- Performance plateaued about three years ago; adding more data or compute isn't improving LLM benchmarks (07:26).
- Developments now aim at smaller, more resource-efficient models that can even run on laptops, but the underlying limitations persist.
Why scaling isn't enough:
- "You really don't [need to pay for compute]. ...What’s going to happen a year, two, three years from now when the majority of laptops out there will be able to run general purpose language models? ...Why do you need all this expansion of the data centers?" (09:00)

4. AI Hallucinations and Lack of Continual Learning

Why AI can’t be trusted for everything:
- LLMs are "stochastic"; every word produced carries a small probability of error, compounding over long outputs.
  - "You have a little bit of error, you cannot eliminate that thing." (14:16)
- The dream of AI that continually improves by interacting with humans is not yet realized; current systems don’t genuinely learn from ongoing interactions.
Permanent systemic shortcoming:
- "No, it’s impossible [to solve these problems with current neural network approaches]...You have to use a different technique." (14:16)
- Even AI leaders have "jumped ship" from LLMs: "The leading researchers in the field have already jumped ship... working on the next generation things." (16:12)
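The compounding-error argument can be made concrete with a small arithmetic sketch. It rests on a simplifying assumption that the episode does not state: every token is wrong independently, with the same small probability. Under that assumption the chance that a whole output is error-free is (1 - error) raised to the number of tokens. The function name and the 1% figure are illustrative only.

```python
def chance_all_tokens_correct(per_token_error: float, num_tokens: int) -> float:
    """Probability that every token in a sequence is correct.

    Assumption (for illustration only): each token is wrong independently
    with the same probability, so the chance of an error-free sequence
    is (1 - per_token_error) ** num_tokens.
    """
    if not 0.0 <= per_token_error <= 1.0:
        raise ValueError("per_token_error must be between 0 and 1")
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return (1.0 - per_token_error) ** num_tokens


if __name__ == "__main__":
    # Hypothetical 1% per-token error rate over outputs of growing length.
    for n in (10, 100, 300, 500, 1000):
        print(f"{n} tokens: {chance_all_tokens_correct(0.01, n):.4f}")
```

Even a small per-token error rate leaves little chance of a flawless long output under these assumptions, which is the shape of the argument Meretsky makes about outputs of 300 to 1,000 words.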

5. The Economics and Investment Landscape of AI

Super-spending on data centers: A misallocation?
- Webb asks whether the enormous capex on vast data centers "is a catastrophic misallocation of capital." (17:51)
- Meretsky: "Well, if that's how you look at it, yeah. ...You shouldn't allocate [capital] in companies which are spending on this compute. You should allocate ...in companies that are allowing those data centers to run efficiently." (18:15)
Winners and losers:
- Avoid firms "borrowing a lot of money to expand those data centers." Consider "arbitrage—buy companies that have not wasted money on LLMs and short companies that have borrowed a lot of money to expand data centers." (19:35)
Big Tech namedrops:
- Webb asks whether Apple is meant; Meretsky declines to name names but points to companies that have not invested in frontier LLMs while their research teams kept publishing papers saying LLMs do not reason and make mistakes. (19:37)

6. Where Next? Alternative AI Approaches

Emerging non-LLM models:
- Companies like Innate AI (Switzerland), Pathway AI (Bay Area), and Fractal Brain AI (which Meretsky co-founded and leads as CEO) are exploring "frontier models" inspired by neuroscience, focusing on continual learning and dynamic network structures.
  - "They create new connections all the time and on top of that they are continually learning and thousands of times more power efficient..." (21:57)
- Citing Ilya Sutskever, Meretsky says AI research has "gone back from the age of scaling to the age of research." (24:13)
The virtues of the LLM "hype":
- "I like the current hype... because they allowed us to understand the size of the problem and approximate it. Now that we know that we can in principle approximate human language, let's just solve it, okay?" (24:28)

7. Human Input Remains Critical—For Now

AI won’t replace most jobs—yet:
- Generative AI is effective for boilerplate work, not for detail or accuracy.
  - "The killer use case for generative AI is producing output that you yourself can check for correctness." (27:32)
- Entry-level jobs, especially in software, are threatened, but senior roles are safe for now. The bigger worry is the growing skills gap as AI automation eats up learning roles.
  - "It's not about replacing software engineers, it's about us not having a pipeline of software architects and senior software engineers." (28:53)
Law and other professions:
- Legal professionals still necessary, especially for accuracy—AI will create draft documents, but humans must check for correctness. (30:43)

8. Risks of Misuse Over Malice

Caution against overconfidence:
- "I do worry that people are going to be using existing systems in domains where they should not be used, for example, for identifying targets to bomb in Iran. ...I worry about misuse of existing AI tools." (32:38)

9. Energy Efficiency: Human Brain vs. AI Models

AI's massive energy consumption:
- LLMs and data centers are energy hogs; next-generation systems are built to be far more energy efficient, only activating the necessary parameters.
  - "The next generation systems... are three orders of magnitude better [in energy efficiency]." (34:28)
- The challenge: truly adaptive systems, once released, can become unpredictable.
  - "We have systems which are adapting themselves...you cannot outsmart them permanently. They will learn from your mistakes... I personally do not know if it's a good time to release those systems to the general public." (36:04)

Memorable Quotes & Notable Moments

“At the end of the day, the current generation of AI techniques... are just function approximators. They don't solve intelligence, they approximate. So that's why you may have an illusion that those systems are intelligent.”
— Dr. Janousz Meretsky (03:08)
“We have run out of [diverse] data three and a half years ago. ...We have trained the model that used all the publicly available data on the Internet. There is nothing more out there to use to train the model.”
— Dr. Janousz Meretsky (04:48)
"The market still believes we can solve hallucinations, but the leading researchers have jumped ship. ...It's unbelievable to me that we keep pouring money into bigger data centers, knowing we've used all the data already..."
— Dr. Janousz Meretsky (16:12)
“You really don’t [need to pay for compute]. ...Why do you need all this expansion of the data centers? You really don’t.”
— Dr. Janousz Meretsky (09:00)
“The killer use case for generative AI is producing output that you yourself can check for correctness. Unfortunately, people are using those LLMs to answer a question to which they themselves don’t know the answer to. This is a recipe for an absolute disaster.”
— Dr. Janousz Meretsky (27:32)
“It's not about replacing software engineers, it's about us not having a pipeline of software architects and senior software engineers.”
— Dr. Janousz Meretsky (28:53)
"I worry about inadvertent misuse of those tools without understanding what they are not good for."
— Dr. Janousz Meretsky (32:38)

Timestamps for Key Segments

[02:06] – Introduction and episode theme
[03:08] – What do we really mean by "AI"?
[04:48] – The data bottleneck and supply ceiling
[06:36] – "Model collapse" and dangers of AI-generated training data
[09:00] – Questioning the compute/data center expansion
[13:56] – The inevitability of AI hallucinations and error propagation
[16:12] – Top researchers leaving LLMs for the next frontier in AI
[18:15] – The folly of current Capex/investment in mega data centers
[21:57] – Startups and alternative approaches (Innate AI, Pathway AI, Fractal Brain AI)
[24:28] – Why the hype is productive for research
[27:32] – AI’s current best uses and its limitations
[28:53] – Implications for the job market, especially for new entrants
[30:43] – Application (and limits) of AI in the legal profession
[32:38] – Cautions about use and misuse of AI systems
[34:28] – Energy efficiency: future models vs. today’s LLMs
[37:16] – Dr. Meretsky's reading habits and learning Spanish
[38:42] – Thanks and episode close

Conclusion

This episode offers a grounded, skeptical—but ultimately optimistic—perspective on AI’s real progress. Dr. Meretsky urges caution around the ongoing investment boom in AI infrastructure, identifying clear technical and systemic limits in current approaches, and encouraging listeners (and investors) to track the ongoing pivot in AI research toward fundamentally new models.

While generative AI and LLMs offer useful productivity tools, they are not the “intelligence revolution” so often hyped, and reliance on them for critical or highly accurate tasks remains fraught. The real breakthroughs—efficient, adaptive, continually learning AI—are still in the lab, and may bring an entirely different set of challenges and opportunities.

Recommended for: Investors, technologists, and anyone seeking to understand the difference between AI hype and reality—not just for profit, but for preparing for the real next act in artificial intelligence.

Transcript

[02:06]
A
Welcome to Merryn Talks Money, the podcast in which people who know the markets explain the market. I am Merryn Somerset Webb and this week I am speaking with Dr. Janousz Meretsky, who is an AI partner at Aaron Innovation Capital. Now, as you know, on this podcast we like to talk about the big forces affecting our economy and markets in general. So it's really no surprise that we keep coming back to the impact of AI. We've talked at different times about the consequences for jobs, for inflation, for interest rates, for tech companies, and whether politicians and indeed policymakers, let alone ordinary workers and investors, are ready for any of this. But what we've never asked is, is it actually working? Janousz, welcome to Merryn Talks Money.
[02:49]
C
Thank you so much for having me.
[02:51]
A
Can we just start with a brief explanation of exactly what it is that we mean when we say AI? Yes, everyone talks about AI all the time. They say that it's going to change the world, it's going to solve all our problems, it's going to destroy our jobs, etc. But what do we actually mean when we say AI?
[03:09]
C
Yeah. So these days what we mean by saying AI is a system which is approximating a certain process. It might be a system which is approximating language, it might be a system which is approximating images, it might be a system which is approximating how a robot moves. At the end of the day, the current generation of AI techniques, those neural networks, are just function approximators. They just approximate things, they don't solve intelligence, they approximate. So that's why you may have an illusion that those systems are intelligent. At the end of the day, they are approximating intelligence.
[03:47]
A
I mean, I suppose there's two ways to look at this. What AI as you've just described needs from us and what we really need from it to make it work for us. So if you look at the way that the big hyperscalers are approaching things at the moment, they're building massive data centers to build out their capacity and that requires vast amounts of energy, it requires lots of coolants, it requires a very large volume of various different types of chips, right?
[04:11]
C
Yes.
[04:11]
A
And all those things, obviously there are troubles at the moment with getting all these things, with the war in the Middle East, etc. So there are supply restrictions, but nonetheless, none of these things are really a long term problem. Given the correct policy choices, all those material elements can easily be, not easily, but can be found and built. Then there's the second bit that we talked about when we last met, which is the data that you require to train a model, and that can't be created in the volumes that are required. And we've hit a supply problem with data.
[04:48]
C
We have hit a supply problem with diverse data. So let me clarify the thing, because you can just create an infinite amount of data by randomly generating new words. So you can create it. But we're talking about creating high quality, diverse data. And you see we've run out of data, diverse data. Not yesterday, not a month ago. We have run out of data three and a half years ago. You got to understand that the last frontier model, GPT-4, which was not a combination of agents, etc., just a basic LLM. Yeah, the last LLM that was the most potent was GPT-4. It was released in 2023, in January. However, the training of that model finished at the end of 2022. So it's three and a half years ago. We have trained the model that used all the publicly available data on the Internet. There is nothing more out there to use to train the model. So we've hit the data ceiling not a month ago, but three and a half years ago. This is extremely important. And yes, what we're doing right now is we're trying to put together multiple LLMs, we're trying to have synthetic data, but the performance isn't really there. We've hit diminishing returns not a month ago, but three and a half years ago.
[06:08]
A
Okay, but surely new data is created all the time on the Internet. I know we talk about diverse data and there's an awful lot more created in the past than there is now. But everyone's use of the Internet surely creates huge volumes of new data all the time.
[06:21]
C
That's even worse. Humans are creating new data on the Internet, but that data falls into certain patterns. How many conversations on the weather can you have every day?
[06:31]
A
Quite a lot, actually. A lot. I think I've already had three today, to be honest. You know, it's very cold in Edinburgh.
[06:37]
C
You can have a lot. But there is a bigger, more profound problem that I want to mention: if you look at the data that is currently being created on the open Internet, it is data created by those LLMs which, number one, is inaccurate because it suffers from hallucinations, and number two, it feeds into itself, so that the new models now train on the entire Internet. They're training on the output from other models, which, as I said, are making mistakes and hallucinations, and that, using a technical term, slowly ends up leading to something called model collapse, where the models themselves are actually getting dumber.
[07:17]
A
Okay, so if you train a new model on newly created data, you're effectively training on its own nonsense or nonsense created by other similar models.
[07:27]
C
Yeah, well, not only nonsense. We have to understand that those LLMs using artificial neural networks, they are correct, let's say 95 to 99% of the time. So most of the content is correct, but you no longer know what content is incorrect, which is a significant degradation of the quality of data. It's almost like having access to a calculator which claims to be correct 100% of the time, but in reality it is correct 95 to 99% of the time. How much would you pay for that calculator? Would you use the output from that calculator to produce a novel calculator? No, you wouldn't do that. But we are doing it right now, and we are seeing right now that the performance of those models on benchmarks is not increasing anymore. It plateaued three years ago. And there is an even bigger, profound question. We are training those models now using higher quality data. So no longer we're using the entire Internet. Now we are filtering from the Internet a lot of garbage. So we're using a smaller training data set, thus producing smaller LLMs that have the same quality. But as I mentioned, they are smaller. So what is the consequence of it? Well, as we speak, I'm running those models on my laptop. I don't use any data center for it. I never will. I'm running them on my laptop. So the models have gotten better. But by saying better, I mean they are more power efficient, they're not more accurate.
[08:49]
A
[Does that] also mean that they're more specific? So they are geared toward a more specific task because they're using a narrower range of data. So they can't be a type of general intelligence. They're a specific intelligence.
[09:00]
C
Remarkably, no, they are general purpose. I mean, you can just go online and download Qwen 3.5 and 3.6. This is a general purpose model which is even accessing the Internet to give you summaries, reasoning. It's doing this thing on your laptop. It's today. Of course, on top of that, if you want to, you can fine tune your model on your proprietary data. But you see, even the fine tuning process these days can be done on your laptop, which again raises the question, why do you need to pay for compute? Why do you need all this expansion of the data centers? You really don't. And what's going to happen a year, two, three years from now when the majority of laptops out there will be able to run general purpose, the most potent language models with access to the Internet? I'm just not seeing it. And that's on top of one fundamental thing that I want to mention up front because you mentioned what does the AI need from us and what we need from the AI. So what the AI needs from us are two things, as you mentioned. Number one, compute. And yes, we can provide more and more compute as long as Oracle CDS is not going down, which it is right now. And number two, we need to provide more data. We've used all of it, so to some extent we cannot produce better AI. Now, what do we want to have from the AI? We want to have at least two things. And just bear with me here. Number one, we want to have systems which are continually learning. So just like during this podcast today, I hope that I will be able to learn something from you. You will be able to learn something from me and remember it tomorrow, maybe remember it after Easter. Current generation systems, in general neural networks, are not learning anything new when you interact with them, which is a significant limitation. So we're not getting from the AIs what we want to. They're not learning from us. It's a fundamental limitation that's not solved. And number two, I do want to mention this thing up front. 
Those systems are stochastic, they are probabilistic, you cannot trust them. They roll the dice whenever they produce output. So to some extent you cannot trust their output. Can you make them deterministic? Yes, of course you can, by making sure that they always produce the most likely token. I'll use the word token. But the problem with this thing is that then they will be just copying the data from the training set. So just imagine those lawsuits when you see verbatim copies of all the podcasts books produced as an output of Gemini or OpenAI system. So those systems, number one, are not continually learning, which basically for me it's just a no go. And number two, those systems are not to be trusted because whenever they produce output, that output is stochastic. It's non deterministic, it's basically rolling a dice to get your output.
[11:54]
A
Yeah. Can we talk a bit more about that? About how the hallucinations or errors, how they build up?
[11:59]
C
Absolutely. Let me try to explain using simple terms. When you use ChatGPT or Gemini or Copilot (Copilot is actually an OpenAI system), it just produces text. So you have an impression that it generates one word at a time, deterministically. That's what's on the screen, just one word after another. But in reality that's not what those systems produce. If you are a developer, like me and my colleagues are developers as well, you can look at the developer output of a large language model. Do you know what it gives you? It gives you a vector of 50,000 elements, actually 52,000 elements. So 50,000 elements and each element has a certain probability of being correct. 0.95, 0.001, 0, 0.3. It's an entire vector. It's not that one word has probability 1 and everything else is 0. No, there is a little bit of error there. It has to be. And so notice what happens when you're producing one word at a time or one token at a time. You can think of a token as a word split in two or three. So when you produce output one token at a time, this system is rolling the dice all the time. It's making a small tiny error every time it produces a word. So at the beginning you may not perceive the error, but over time, after 300, 500, a thousand words, the error is going to be so big that it's going to result in a critical failure of the system. You cannot circumvent it because those systems are probabilistic. It's not like in an Excel spreadsheet where you can have a chain of, I don't know, 10, 20, 100, a thousand formulas and you know that the formulas are going to produce the correct result. Here if you have a chain of words, every time you produce a word you accrue a little bit of error and you see this error manifest itself later on.
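[Editorial illustration, not part of the conversation.] The difference between always taking the most likely token and "rolling the dice" over the probability vector can be sketched in a few lines. This is a minimal sketch under assumptions: the four-token vocabulary, its probabilities, and the function names are hypothetical, and real systems score tens of thousands of tokens at each step.

```python
import random


def greedy_token(probabilities: dict[str, float]) -> str:
    """Deterministic decoding: always return the most likely token."""
    return max(probabilities, key=probabilities.get)


def sampled_token(probabilities: dict[str, float], rng: random.Random) -> str:
    """Stochastic decoding: draw one token in proportion to its probability."""
    tokens = list(probabilities)
    weights = [probabilities[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical next-token distribution, for illustration only.
    next_token = {"Edinburgh": 0.90, "London": 0.06, "Paris": 0.03, "banana": 0.01}
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    print("greedy :", [greedy_token(next_token) for _ in range(5)])
    print("sampled:", [sampled_token(next_token, rng) for _ in range(5)])
```

Greedy decoding returns the same token on every call, while sampling occasionally picks a low-probability token; that occasional pick is the per-token "error" discussed above.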
[13:56]
A
Okay, so is this solvable? I guess that's the key question. Is it possible in the type of models that we're using at the moment, which are super hyped, is it possible for that hallucination problem or compounding error problem to be solved? Or is it simply a systemic shortcoming that is unresolvable?
[14:16]
C
So here I'm again speaking from my experience, having two PhDs in mathematics and computer science and 20 years of experience in DeepMind and IBM Watson research. No, it's impossible. You have to use a different technique for it. Yes, there are different techniques that are emerging right now. In full disclaimer, I'm also a co-founder of a startup working on one of those techniques, we call it Fractal Brain. Yes, there are new techniques on the horizon. However, the existing techniques have a built-in mechanism so that every time you produce an output, in other words, you have a little bit of error, you cannot eliminate that thing. And there is also a fundamental other thing you mentioned, hallucinations. What do we mean by hallucinations? Number one, it's producing those small errors one word at a time. But there's also another reason for hallucinations: when the system just doesn't remember what was mentioned yesterday, a week ago or a month ago, and goes back to the initial question. You need both. You need to have a system which, like humans, is continually learning. And number two, you need to have a system which does not make errors when it produces the next word. It shouldn't roll a dice. You need those two. And so now to answer your question, can we solve this problem using artificial neural networks? No. There have been attempts to circumvent it. If you want to have continual learning, you can maybe try to use something like continual backpropagation from Rich Sutton, who got the Turing Award in 2024. You can, but it's not solving the problem. At least he's attempting to find solutions to it. You can retrain the system, of course, right? I mean, why not take GPT-4 after the end of the podcast, retrain the entire system. It's going to cost you $5 million. But you could do that. You could retrain the entire system, or you can fine tune your system. But when you're fine tuning your system, the system is forgetting what it was trained on before. 
So you're suffering something which is called catastrophic forgetting or catastrophic interference. Long story short, no, you cannot solve the outstanding problems of hallucinations and lack of continual learning. Unfortunately, I wish you could, but you need to have different things for it. And it's not just me saying it. Look at the landscape of researchers, leading researchers in the field, my colleagues, I know all of them personally, they have all jumped ship. This is important. Look at, for example, Yann LeCun from Meta. He jumped ship, he's working on his new startup, AMI Labs, not working on LLMs, saying LLMs are a dead end to AGI. Look at my colleague, Dave Silver from DeepMind. He just left DeepMind a couple of weeks ago. He formed Ineffable Intelligence. I think they're raising a billion dollars. Same thing. He's not a believer in using LLMs for general artificial intelligence. And I can go on and on. Ilya Sutskever, Andrej Karpathy and so on. So it's not just me who is saying that we need to go back to research. The leading researchers in the field have already jumped ship, a few months, maybe even years ago, working on the next generation things. The market still believes we can solve hallucinations, but the leading researchers have jumped ship. It's unbelievable to me that we keep pouring money into bigger data centers, knowing we've used all the data already and knowing that even if you have more data, you will not solve continual learning and you will not solve hallucinations. So why is every... not everyone, why are we doing it? Why are people doing it?
[17:51]
A
Okay, so if we know clearly, and it sounds from what you say that we do know very clearly, that using more and more computing power and more and more already mildly corrupted data isn't going to get us anywhere, then the enormous Capex spend on vast data centers, the hundreds of billions of dollars that have already gone into this and are still projected to go into this, is a catastrophic misallocation of capital.
[18:15]
C
Well, if that's how you look at it, yeah. So obviously you can say the biggest winner of it is Nvidia because it's producing those GPUs, or during the gold rush you should invest in companies that produce the shovels. You can still make money. For example, I'm a partner at Aaron Innovation Capital in the UK and we are investing in, I'll just mention two companies, one called Hiverge and the second called Phydra. These companies improve cooling of data centers or produce better algorithms to run on data centers. So you can still allocate your capital wisely. But you shouldn't allocate it in companies which are spending on this compute. You should allocate your capital in companies that are allowing those data centers to run efficiently. Because those data centers, who knows, maybe they will be used for a different purpose at some point. So again there are going to be winners and losers of the current gold rush in GPUs and AI. I would say companies that have not invested massively in data centers and in frontier models are going to be the winners. There are some companies, without mentioning the names, that have been accused of not training their own language models. I think these to me are going to be the winners in today's market.
[19:35]
A
Are we talking about Apple?
[19:37]
C
I will let you determine that thing. But you can see some companies, number one, have not invested in the LLM frontier models. But their research teams have kept publishing papers saying that those LLMs, they don't reason, they make mistakes. So I'll let you find those companies. And there are some companies that have borrowed a lot of money to expand those data centers. And it's not me. Look at the market, look at CDS on Oracle. The market is just flashing red, saying this is foolish. So we're seeing those signals already. But I want to be careful: if you want to make money in today's market, given that the governments have to pay 8, 9% of their revenue on servicing the debt, you don't know if the market is going to go up or down. Maybe a capital injection, liquidity injection from central banks. You don't know those things. So I would not recommend you just short or go long any investment. I would recommend maybe doing an arbitrage: buy companies that have not wasted money on LLMs and short companies that have borrowed a lot of money to expand data centers. That would be my suggestion, but I might be wrong again.
[20:48]
A
Can you tell us about any of the startup companies that you're invested in or interested in that are taking us to this new frontier in AI, away from the LLM model and towards a different model?
[21:58]
C
Absolutely. So in our investment fund, Aaron Innovation Capital, we have access to a lot of companies, actually maybe 20 or 30 companies that are developing the next frontier models which are not necessarily using artificial neural networks. And I can speak about three or four of them which are very exciting. One company, you can have a look. They are based in Switzerland. They're called Innate AI. Again, I'm advertising them and we're not investing in them. We're not investing in them yet. They are developing a new version of neural networks which are inspired by the brain. This is an effort that was going on in Europe for more than a decade, the Blue Brain project. They are developing something new which is not an artificial neural network. So that's one. Have a look at another company, for example, Pathway AI in the Bay Area. Again, it's another example. They mention up front: you need to solve continual learning. You need to solve it. If you don't have it, forget about a solution to AGI. And so they have been developing systems that can learn using something called Hebbian learning, which is a local learning technique that happens in the brain, not using backpropagation and gradient descent. So this is another example. Another company, which I'm actually a co-founder and the CEO of, is called Fractal Brain AI. Have a look at the thing. It's also based on the prefrontal cortex and it's this idea that those networks are continually growing and rewiring themselves. So no longer you have a fixed network with a fixed number of parameters. No, the network is growing, expanding itself. Like today, you're probably going to form a lot of connections after this podcast. Those networks do the same. They create new connections all the time and on top of that they are continually learning and thousands of times more power efficient in addition to being data efficient. So these are only some of the examples of companies that I'm personally very excited about. 
But as Ilya Sutskever said the other day on one of those podcasts that we have gone back from the age of scaling to the age of research, so researchers have gone back to developing the new things. It's just on my end here, me and my teams have started developing, for example, fractals, fractal brain. Twelve years ago, we knew about those outstanding limitations of artificial neural networks more than a decade ago, so we wouldn't invest our time in it.
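[Editor's illustration] The Hebbian learning mentioned above is usually described as a local rule: a connection strengthens when the two units it joins are active together, with no error signal sent back through the network. The sketch below shows the plain textbook form of that rule on a toy network. It is an assumption-laden illustration only: the function name `hebbian_step`, the learning rate, and the two-by-two network are hypothetical, and nothing here is taken from the systems built by the companies named in the conversation.

```python
def hebbian_step(weights, pre, post, lr=0.1):
    """One plain Hebbian update: weights[i][j] grows by lr * post[i] * pre[j].

    Each weight changes using only the activity of the two units it
    connects, so no gradient is propagated back through the network.
    """
    return [
        [w + lr * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Hypothetical toy network: 2 input units, 2 output units, weights start at zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # only the first input unit is active
post = [1.0, 1.0]  # both output units are active

for _ in range(3):
    weights = hebbian_step(weights, pre, post)

# Only the connections coming from the active input have strengthened.
print(weights)
```

In this form the weights grow without bound if the same pattern keeps repeating, which is why practical variants add some normalization or decay; that refinement is left out here to keep the local character of the update visible.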
[24:21]
A
Nevertheless, somehow we got caught up in this sort of super bubble hype, despite the fact that good scientists knew.
[24:28]
C
But this is good. I like the hype, because to some extent that hype and the LLMs allowed us to understand that it's possible to approximate human language. So now that you know that you can approximate human language with LLMs, you can try to find ways to actually solve the idea of human language. If you can approximate something, you can see the size of it. So it's almost like if someone showed you, hey, there is a rocket there, it flies. You already know the size of the rocket, you know it can fly. You can start to crack the details of the engine of the rocket. So to some extent I like the current hype, I like the current generation of systems, because they allowed us to understand the size of the problem and approximate it. Now that we know that we can in principle approximate human language, let's just solve it, okay?
[25:18]
A
And the other thing I suppose we should say is that while we spent quite a lot of time criticizing this generation of LLMs, they're still great, it's still really useful. It's not like we have a totally pointless technology. We have something that can remove entry level jobs across the board, which of course comes with its own problems, but which nonetheless has enormous use for productivity and business.
[25:38]
C
Absolutely. I love, maybe not just LLMs, I love generative AI, for example. Not sure if you've noticed, behind me I have this amazing landscape of London, but you can tell it's all fake here. Like this building is a little tilted here. So those systems, they produce very pretty graphics. It's inaccurate, but it's okay for me. It still gives me a very nice background. Same thing with text: they will produce a beautiful-looking poem. They can summarize a document. Yes, there are errors there. Like this building here, you can tell it's all tilted a little bit, but I'm okay with that. So those systems are very good for creating templates, templates of data, the so-called boilerplate code, creating nice graphics. They're not good for the details you put in there. And it's interesting, because about two or three years ago I was giving a talk to high school students, and they were asking me, what do we use those LLMs for? I told them: for generating templates, templates of presentations, etc., and for summarizing documents. But I misled them. I don't think you should be using those systems for summarization, for two reasons. Reason number one, in that summary there might be errors and mistakes. So if you summarize a document, don't throw away the original. And number two, when you're summarizing something, you should know what you care about. For example, if you were to summarize today's podcast, maybe you only care about this tilted building here, which is fake. Maybe that's what you're looking for. The LLM doesn't know it, so when it produces a summary of text, when it compresses graphics, it doesn't really know what you care about. So it's going to produce a summary maybe lacking the details that you want to know later on.
[27:25]
A
And I suppose the other thing is with the errors, you should only really be using it for things where you know that you will be able to spot the errors at the end.
[27:33]
C
So this is actually very interesting. I think that the killer use case for generative AI is producing output that you yourself can check for correctness. So for example, if I want to compute 100 plus 100, an LLM will give me an answer, and I know I can check the correctness of the answer myself. I like it this way. Unfortunately, people are using those LLMs to answer questions to which they themselves don't know the answer. This is a recipe for an absolute disaster. In the worst case, you can use those systems to check whether the output that you produce yourself is correct. You can do it that way. But people are using it the other way around. They are asking those LLMs to produce an output when they don't know what the output should be. And the output can have a 1 or 5% error rate. Why would you even do that?
[28:24]
A
So are we worrying unnecessarily about the job market? In most of the talks and podcasts and panels, etc., that I do, the question is always: what on earth does my child do for work in an age of AI? Are we worrying too much about that? Because the human input will remain absolutely compulsory for the next couple of decades.
[28:42]
C
I'm having the same problem. As I mentioned, my son is 13 years old. So as a father, I need to give him advice on what to do in the future. I guess being a scuba diving instructor is a good job.
[28:53]
A
Yeah, great job.
[28:54]
C
It is a great job. You're going to need them. I worry about interns and entry-level software engineers, because to some extent you can automate, not replace, you can automate most of the tasks that you delegate to interns today. For example, writing boilerplate code or template code, or checking whether there are errors in that code. You can automate that. What's going to happen then is that we are not going to have interns anymore, or we'll have a significantly smaller number of interns and entry-level software engineers. So what's the consequence? What's going to happen with the entire promotion cycle? If you're in senior management or middle management, are you going to get promoted? Who's going to replace you? Who's going to become a software architect if we're not training the new entry-level software engineers? So there's this entire skills gap right now. Some people that I know have chosen not to pursue studies in computer science for that very reason: they worry that we're not going to need software engineers. Yes, we are going to need software engineers. It's just that you need to jump immediately to being an architect of the software engineering system, and making that jump is very hard. It is hard. Typically what you do is you gain this experience on the job. You go to Google and spend the first two or three years writing code, but you work with software architects, and so you learn from them. If we don't have this experience that we're giving to entry-level software engineers, they won't be able to gain it. So this is what worries me more. It's not about replacing software engineers, it's about us not having a pipeline of software architects and senior software engineers. We are not having this pipeline anymore. That worries me quite a bit, to be honest.
[30:38]
A
Yeah. And the pipeline problem is just being discussed in a lot of other professions as well, most obviously in the legal profession.
[30:44]
C
Absolutely. And for lawyers, this is very interesting. And this is not my profession, so you can discount what I say quite a bit. Right now, you can get the initial blueprints of legal documents very quickly. And we do it all the time at our startup. You can get a blueprint of a legal document, but I would never use that document to get an investor on board in my startup. I wouldn't do that. I still need to send it to an actual human being to at least proofread it. So we're going to have to have those lawyers who can proofread documents produced by generative AI. But aren't they doing it already? They already have hundreds of template documents saved on their computers. They're just changing the names of companies and investors in those documents. So the legal profession is not going to go away, because those generative AI systems don't have the notion of true and false. They don't, they confuse, it's all probabilistic. So they will make an error. Sometimes after 20, 30, 40 legal statements, they will just change one true into false. So we are still going to need to have lawyers for it. This is one of those professions where I don't think it's going to be automated fully by AI. But as I said, there are other professions that have already been automated, for example, content creators. You can go online, go to YouTube, and you're going to see a lot of videos summarizing AI, AI bubbles, summarizing the conflict in Ukraine or Iran, all generated using generative AI. So some jobs have already gone away, but jobs that require you to be 100% accurate, those jobs are not going to go away with the current generation of AI systems. Remember, there are new generations on the horizon. My startup is working on it, other startups are working on it, but the current generation of tools will not displace those jobs yet.
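[Editor's illustration] The point about an error slipping in after 20, 30 or 40 legal statements can be made concrete with a little arithmetic. The sketch below computes the chance that a document contains at least one error, under two loudly stated assumptions that are not from the conversation: each statement is wrong independently of the others, and the 1% and 5% figures quoted earlier for output error rates apply per statement. The function name `prob_at_least_one_error` is hypothetical.

```python
def prob_at_least_one_error(per_statement_error, n_statements):
    """Chance that at least one of n independent statements is wrong.

    Assumes every statement has the same error probability and that
    errors are independent, which is a simplification.
    """
    return 1.0 - (1.0 - per_statement_error) ** n_statements


for rate in (0.01, 0.05):
    for n in (20, 30, 40):
        p = prob_at_least_one_error(rate, n)
        print(f"error rate {rate:.0%}, {n} statements: "
              f"{p:.1%} chance of at least one error")
```

Under these assumptions, even a 1% per-statement rate gives roughly a one-in-three chance of at least one error across 40 statements, which is consistent with the argument that a human still needs to proofread the result.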
[32:31]
A
It sounds to me like we shouldn't necessarily be frightened of the current generation of tools, but we should probably be pretty frightened of the next generation.
[32:39]
C
I would give a very simple example here. So my colleagues and I built AlphaGo at DeepMind in 2015, which is the system that won at Go against a world champion. Back then it was a state-of-the-art system. These days you yourself can win against that system. I can do it as well. Why? Because those systems have stayed the same and humans have identified flaws in that system. They have identified hallucinations and errors, and now when you play against that system, you just exploit its weaknesses. The system is not adapting itself, not rewiring itself. So to answer your question quickly, I'm personally not worried about existing AI systems, because they are not adapting themselves. However, I do have to mention that I worry that people are going to be using existing systems in domains where they should not be used, for example, for identifying targets to bomb in Iran. Let's just not do those things. Those systems make errors and hallucinate. So I worry about misuse of existing AI tools. I don't worry about these tools themselves being malicious. No, I worry about inadvertent misuse of those tools without understanding what they are not good for.
[33:59]
A
Okay, interesting. Always something to worry about. Can I ask you one last thing? One of the things that you mentioned earlier was the extraordinary energy inefficiency of the current systems. And it is true, isn't it? The human brain is remarkably energy efficient. And when you look at these models, the amount of energy that they will use to simply have the same thought that a human can have on a couple of watts. That's an extraordinary problem that in your new generation that we talked about earlier will be diminished.
[34:29]
C
Absolutely. The systems that I know that Fractal Brain is coming up with, that Innate AI is coming up with, those systems don't use backpropagation to produce the next word. They don't have to load from memory all of those 1.5 trillion parameters just to produce the next word. They don't do that. They load maybe a hundred, maybe a thousand parameters. That's three or four orders of magnitude better power efficiency. And we know this, and we know that we've trained the first version of our fractal language model using, I think, 0.1% of the electricity that OpenAI used for GPT-1 and GPT-2. So we know it's possible to do that. In hindsight it's interesting. Before this podcast we had a conversation, if you recall, about Easter dinner and cooking potatoes. So as we're talking right now, I can assure you that for me and for you to produce the next word in our conversation, you are not thinking about potatoes for Easter. You're not doing that. You don't need those connections when we talk about AI. Maybe now you think about your Easter dinner, but the point is that those systems should produce the next word by loading only the parameters which are important for the next word. And this is not billions of parameters. No, hundreds, thousands at most, maybe a thousand, two thousand parameters. So yes, the next generation systems, in addition to being deterministic, so you can trust their output, have power and data efficiency which is remarkable. We're talking three orders of magnitude better. It's coming out. This thing is going. But then again, the world may not be ready for it yet. Plus you've got to understand my own motivation for building those systems, and the dangers. We have systems which are adapting themselves, rewiring themselves, so to some extent you cannot outsmart them permanently. They will learn from your mistakes. It's like your kid. Try using some technique on your kid, and the kid is going to learn how to adapt.
Those systems do it as well. So I personally do not know if it's a good time to release those systems to the general public. I don't. That's why those systems are not being published. Because think about it: if you have a system which is learning all the time, fixing its flaws, it becomes spooky to release that system. So I worry more about potential misuse of those next generation systems, and less about their performance, because they're already beating existing systems on common benchmarks.
[37:01]
A
Okay, thank you very much. There's one last thing, Janousz, before we go. I think our listeners will have been absolutely fascinated by all this, and what I often ask people I'm going to ask you as well, and I hope that we'll be able to understand your suggestion: what are you reading at the moment?
[37:16]
C
Well, I wish I could tell you that I'm reading books on AI, but I'm not. In fact, four years ago I started learning Spanish, without any necessity. Zero necessity, nothing. The reason I started doing that is that I want us to learn language the way humans do. Without billions of words. No, with a small number of words. So as a mental experiment, to learn whether our fractal language model learns language in the same way as I'm learning Spanish, I started learning Spanish. And because of that, I'll disappoint you. I'm reading kids' books, reading Diary of a Wimpy Kid in Spanish. Of course, I'm reading Harry Potter in Spanish as well. So I'm reading lots and lots of books in Spanish. But these are elementary school books. I'm sorry to disappoint you.
[37:59]
A
You're reading the same things in Spanish that my son is. So there we go. You have that in common.
[38:03]
C
Yes, absolutely. This is actually good. If you want to understand, for example, how language works, at least try to learn another language. Then you can do some introspection, you can see how you're learning a language. And it's remarkable that you can learn a language after, in my case, about seven or eight thousand hours. I guess my Spanish is better than my English right now. So, 8,000 hours, how many tokens? We're talking about a couple of million tokens. A couple of million tokens of a training set rather than a couple of trillion tokens. So that's what fascinates me. And yeah, maybe next time we can speak in Spanish on the podcast.
[38:38]
A
No, I'll have to get one of my kids in for that.
[38:40]
C
Okay, sounds good.
[38:42]
A
Janousz, thank you so much for joining us today.
[38:45]
C
It's a pleasure. Thank you so much for having me.
[38:51]
A
Thanks for listening to this week's Merryn Talks Money. If you like our show, rate, review and subscribe wherever you listen to podcasts, and keep sending your questions or comments to merrynmoney@bloomberg.net. You can also follow me and John on Twitter or X. I'm @MerrynSW and John is @John_Stepek. This episode was hosted by me, Merryn Somerset Webb. It was produced by Sama Saadi and Moses Andam. Sound design by Blake Maples and Aaron Casper. And special thanks, of course, to Janousz Meretsky.