Insights from OpenAI's AMA: The Next Breakthrough in AI - The Last Invention is AI

Summary

Podcast Summary: Joe Rogan Experience for AI – "Insights from OpenAI's AMA: The Next Breakthrough in AI"

Release Date: November 17, 2024
Host: Joe Rogan Experience for AI

The "Joe Rogan Experience for AI" episode titled "Insights from OpenAI's AMA: The Next Breakthrough in AI" walks through the recent Reddit AMA (Ask Me Anything) held by OpenAI executives, including CEO Sam Altman and Chief Product Officer (CPO) Kevin Weil. The host reads out selected questions and answers and comments on them, covering API costs, upcoming AI models, regulatory challenges, and what OpenAI expects to be the next breakthrough in artificial intelligence.

1. API Cost Reduction for Advanced Voice

One of the primary concerns among developers is the high cost of OpenAI's Advanced Voice API, which limits its accessibility for creating diverse AI-driven applications like virtual life coaches or AI mechanics.

Kevin Weil, CPO of OpenAI, addressed this issue at [04:15]:

"We've been reducing the cost of our API for 2 years now. I think GPT-4o mini is like 2% the cost of the original GPT-3. Expect this to continue with Voice and others."

Continued price cuts make advanced AI tools more viable for developers and widen the range of applications that are practical to build and deploy.

2. Navigating EU Regulations

The episode touched upon the challenges OpenAI faces with European Union (EU) regulations, which can delay the rollout of new features and products.

Sam Altman, CEO of OpenAI, commented at [10:30]:

"We'll follow EU policy. Obviously all of us hope for increasingly sensible EU policy. A strong Europe is important for the world."

Altman emphasized the importance of adhering to EU policies, highlighting the balance between regulatory compliance and technological advancement.

3. Bold Predictions for 2025

Listeners probed OpenAI's long-term vision, seeking ambitious forecasts for the AI landscape.

Sam Altman stated at [15:45]:

"Saturate all the benchmarks."

The host reads this as a goal of having OpenAI's models top every benchmark on which AI models are compared. He notes that this is not always the case today, since competitors such as Anthropic or new image models sometimes pull ahead.

4. Inference Costs and Computational Efficiency

The discussion moved to the efficiency of AI model operations, specifically regarding inference costs and the implementation of multi-layered reasoning processes.

OpenAI's VP of Engineering answered at [20:10]:

"We expect inference costs to keep going down. If you see the trend over the last year, it's coming down like 10x."

Reducing inference costs is pivotal for implementing complex reasoning chains, making sophisticated AI functionalities more accessible and cost-effective.
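
To make the arithmetic concrete, here is a minimal Python sketch of how a 10x-per-year price decline would change the cost of one long reasoning chain. Only the 10x annual factor comes from the quote above; the starting price ($10 per million tokens), the chain length (20,000 tokens), and the function names are illustrative assumptions, not figures or APIs from the AMA.

```python
def cost_per_query(reasoning_tokens: int, price_per_million_tokens: float) -> float:
    """Dollar cost of one query that generates `reasoning_tokens` tokens."""
    return reasoning_tokens * price_per_million_tokens / 1_000_000


def projected_price(start_price: float, annual_factor: float, years: float) -> float:
    """Price after `years` if it is divided by `annual_factor` every year."""
    return start_price / (annual_factor ** years)


START_PRICE = 10.0     # assumed: dollars per million tokens today (hypothetical)
ANNUAL_DROP = 10.0     # the "like 10x" yearly decline mentioned in the AMA
CHAIN_TOKENS = 20_000  # assumed: tokens in one long reasoning chain (hypothetical)

for year in range(3):
    price = projected_price(START_PRICE, ANNUAL_DROP, year)
    cost = cost_per_query(CHAIN_TOKENS, price)
    print(f"year {year}: ${price:.2f} per million tokens, ${cost:.4f} per chain")
```

Under these assumptions, a fixed budget buys ten times as many reasoning tokens each year, which is why falling inference prices matter so much for chain-of-thought workloads.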

5. Next Breakthrough: AI Agents

One question asked what the next breakthrough in the GPT line of products will be and when to expect it.

Sam Altman answered at [25:00]:

"We will have better and better models. But I think the thing that will feel like the next giant breakthrough will be agents."

This development marks a transition from foundational model improvements to creating AI systems capable of autonomous action, potentially revolutionizing various industries.

6. Advice for Aspiring AI Contributors

OpenAI executives provided guidance for individuals eager to contribute to the AI revolution.

Kevin Weil advised at [30:20]:

"My vote: start using it every day. Use it to teach you things and to learn whatever you want to learn: coding, writing, product design, anything. If you can learn faster than others, then you can do anything."

This recommendation emphasizes the importance of hands-on experience and continuous learning in leveraging AI technologies effectively.

7. Support for Image Input in o1 Models

One user asked why OpenAI's o1 model does not support image input.

Kevin Weil responded at [35:05]:

"We focused on getting it out to the world first versus waiting to make it full-featured. Image input is coming in o1, and in general, the o-series of models will be getting things like multimodality, tool use, et cetera, in the coming months."

This phased approach ensures that foundational capabilities are established before integrating more complex multimodal functionalities.

8. Scaling Large Language Models (LLMs)

The AMA addressed strategies for scaling LLMs, balancing model size with inference speed.

Kevin Weil clarified at [40:40]:

"It's not either or; it's both. Better base models plus more Strawberry scaling/inference-time compute."

OpenAI aims to achieve a harmonious balance between model sophistication and operational efficiency, adhering to established scaling laws while optimizing performance.

9. Enhancing ChatGPT's Memory Capacity

Users expressed concerns about ChatGPT's limited memory retention for individual accounts.

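Kevin Weil first asked at [45:15] whether the question was about context length:

"Do you mean longer context windows? If so, yes."

The user clarified that they meant per-account persistent memory, which fills up and forces them to delete old memories to make room for new ones. According to the host, that follow-up did not appear to receive a response.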

Improving memory capabilities is critical for enhancing personalized user interactions and maintaining context over extended conversations.

10. Release Plans for GPT-5 and Equivalent Models

Anticipation surrounds the release of GPT-5 and its feature set.

Sam Altman addressed release timelines at [50:00]:

"We have some very good releases coming later this year. Nothing that we're going to call GPT-5 yet, though."

This response indicates progressive enhancements to existing models without a significant enough leap to warrant a new version designation immediately.

11. Ilya's Vision and Contributions

The role of Ilya Sutskever, OpenAI's co-founder and former chief scientist, came up when a user asked, "What did Ilya see?"

Sam Altman said at [55:30]:

"The transcendent future. Ilya is an incredible visionary and sees the future more clearly than almost anyone else. His early ideas, excitement, and vision were critical to so much of what we have done. For example, he was one of the key initial explorers and champions for some of the ideas that eventually became o1."

The host adds that the ideas behind o1 amount to chain-of-thought reasoning, which he credits with helping OpenAI get ahead of some of its competitors.

12. Advancements in Text-Image Models

Finally, inquiries about the next generation of text-image models were discussed.

Sam Altman responded at [60:45]:

"The next updates will be worth the wait, but we don't have a release plan yet."

No further details or dates were given for the successor to DALL-E 3.

Conclusion

The episode provided a comprehensive overview of OpenAI's strategic directions, addressing both immediate concerns and long-term aspirations. Key takeaways include sustained efforts to reduce API costs, navigating regulatory landscapes, ambitious goals for AI model benchmarks, and the imminent development of autonomous AI agents. OpenAI remains committed to enhancing model capabilities while ensuring accessibility and compliance, positioning itself at the forefront of AI innovation.

For listeners eager to stay updated on AI advancements and leverage these technologies for business growth, the podcast recommends engaging with the AI Hustle School community for exclusive insights and resources.

Notable Quotes:

Kevin Weil, CPO of OpenAI, [04:15]:

"We've been reducing the cost of our API for 2 years now. I think GPT-4o mini is like 2% the cost of the original GPT-3. Expect this to continue with Voice and others."

Sam Altman, CEO of OpenAI, [15:45]:

"Saturate all the benchmarks."

Sam Altman, CEO of OpenAI, [25:00]:

"We will have better and better models. But I think the thing that will feel like the next giant breakthrough will be agents."

Kevin Weil, CPO of OpenAI, [40:40]:

"It's not either or; it's both. Better base models plus more Strawberry scaling/inference-time compute."

This structured summary encapsulates the key discussions from OpenAI's AMA, providing valuable insights into the company's current initiatives and future plans in the AI domain.


Transcript

[00:00]
Host
Sam Altman, the CEO of OpenAI, and a bunch of other top OpenAI executives just held an AMA, an Ask Me Anything, over on Reddit, where people were asking them questions and they were giving all of the responses. I want to show you a bunch of the answers, because they gave details on projects like updates on the timeline for Sora, their video model; DALL-E, their new image model; and what's coming with GPT-5. While they didn't give exact dates for anything, they gave you timelines, they gave you ideas, they told you when they didn't have timelines, and there were tons of new, interesting things with price changes. I'm going to be breaking down all of the most interesting responses that I saw there in the podcast today, so let's jump along for the ride.

And the one thing I wanted to say before we get started: if you are interested in making money with AI tools and maybe helping grow or scale your current business, I would love to have you as a member of the AI Hustle School community. Every single week I record an exclusive piece of content with my co-host Jamie where we break down different AI tools, different ways that we're making money, things we can't share publicly when it comes to products and software and workflows, and kind of exactly what we're doing behind the scenes. It's all in this school community with over 200 members. Some of them have come from companies they've started which are a hundred million plus, and others are just getting started. So you get a really wide range of perspectives and some great feedback on your projects and what you're doing. We'd love to have you as a member of the AI Hustle School community. It is currently $19 a month. We're going to bump the price up eventually, but if you get in now you can lock in that price and it will never be raised on you.
So I'd love to have you as a member of the AI Hustle School community, hear about what you're working on, and share some of the behind-the-scenes stuff I don't share anywhere else. You can check this out at the link in the description.

So let's get back to what's going on with this new AMA. I actually got a great recap of this over on X from Kimismus, a.k.a. Chubby. So shout out to Kimismus for some of these screenshots and recaps of the interesting stuff.

The first thing I want to break out is that somebody asked them, "Are you planning to reduce the API cost of Advanced Voice?" Now this is something that a lot of people are asking about because it's a little bit expensive, and what you really want to see is the cost of Advanced Voice coming down. The more that these things come down in price, the more viable it is for developers to take them and build really exciting new features and tools with them. So yeah, the API cost is about developers' ability to add this into their software. And this is exciting and a big question, right? Because you can imagine, for Advanced Voice, you're making AI life coaches, you're making fitness coaches, you're making someone you could talk with who could help you fix your car, like an AI mechanic, right? There are so many interesting use cases, but if it's too expensive, it's not viable, and so the cost will come down.

So the CPO over at OpenAI, who is Kevin Weil, said, "We've been reducing the cost of our API for 2 years now. I think GPT-4o mini is like 2% the cost of the original GPT-3. Expect this to continue with Voice and others." This is a really big bit of alpha and update, which is that these things are going to continue to get cheaper and cheaper. They're going to make them more efficient, which essentially makes them cheaper, and the hardware that runs them is going to get more powerful.
So the older models will get better, and we'll probably be paying essentially a very similar cost for the most premium, best models and all the new features. But the older stuff is just rapidly getting cheaper, and a lot of the older stuff, a.k.a. what we're using today, is already being used in tons of different use cases. So if you don't need the most advanced thing to do a specific use case, the cost is going to come down a lot. That's exciting.

Someone said, "Any plan to negotiate with the EU so EU users will get stuff faster, not dumbed down?" This is a problem. I mean, we're even looking at Apple Intelligence not getting rolled out to the EU, and a bunch of features getting blocked there or at least delayed, just because of all the regulation. Sam Altman said, "We'll follow EU policy. Obviously all of us hope for increasingly sensible EU policy. A strong Europe is important for the world." So I kind of like this approach, right? It's really easy to take a pot shot at the EU and say it's a terrible place. I was there all summer. I mean, it's a nice place, I like it. It's a bummer that some of their policies slow down a lot of this AI stuff. They perhaps overregulated. That's a whole argument for another time. I like that Sam Altman doesn't try to take a dig at them. He's just saying a strong Europe is important for the world. He's saying, come on guys, we can make this happen, but it's really up to Europe, and they're going to follow their policies.

Someone asked Sam Altman, the CEO of OpenAI, for a bold prediction for 2025, and he said, "Saturate all the benchmarks," meaning all the places where they benchmark different AI models to see which one's the best. He wants all the OpenAI tools to be at the top of the benchmark for everything. And this isn't always the case. They do pretty well and they're usually at the top, especially when they come out with a new release.
They're usually at the top for their models, but it's not always the case. Sometimes Anthropic comes out and they get ahead, or new AI image models come out and they get ahead. So I think he really wants the best of everything to come from OpenAI. Bold prediction, we'll see, right? And "prediction" meaning he hopes that their products in the pipe and what's coming soon will do that. It's not necessarily exactly like that today.

Someone asked, "How fast does OpenAI see inference costs reducing in order to enable chain of thought or multi-layered thought trees? From a business logic perspective, we'd like to execute reasoning chains as fast and as cheaply as possible." So really what they're talking about here is that o1-preview is essentially chain of thought. You ask it a question and it runs it through like 20 different questions to make sure the response is the best possible. But it takes more time to do that. So the OpenAI VP of Engineering said, "We expect inference costs to keep going down. If you see the trend over the last year, it's coming down like 10x." Okay, so this is exciting. It seems like things are going to get cheaper faster. That's fantastic.

An absolutely wild question was asked, which is, "What's the next breakthrough in the GPT line of products and what's the expected timeline?" And Sam Altman, CEO of OpenAI, said, "We will have better and better models. But I think the thing that will feel like the next giant breakthrough will be agents." This is fascinating to me. The next thing that they're really putting a focus on, after having some better reasoning from their AI models, which they've currently achieved, is agents. That's the next step that OpenAI is really trying to achieve. So this is going to be amazing. And I think what's interesting is he's saying "better and better models," meaning they kind of have all the foundational models in the works.
They want image, video, audio, text. And so at this point, those things will get incremental improvements. I don't expect insane jumps, or maybe big jumps, but there is going to be a bell curve, a cutoff. These things are getting pretty smart. They're going to be as smart as a human. And if that's the data that they're trained on, we might hit a wall there, unless we can figure out some clever ways to get them smarter than humans. But in any case, getting them to be able to autonomously do things, I think, is the next big step. So I'm super excited to see where that goes.

Someone asked for advice for ambitious youngsters that want to contribute to the revolution of AI, and the CPO of OpenAI, Kevin Weil, said, "My vote: start using it every day. Use it to teach you things and to learn whatever you want to learn: coding, writing, product design, anything. If you can learn faster than others, then you can do anything." Eh, that's just general good advice, but nothing too exciting as far as what we're going to get.

Someone asked, "Why does o1 not support image input?" And again, Kevin said, "We focused on getting it out to the world first versus waiting to make it full-featured. Image input is coming in o1, and in general, the o-series of models will be getting things like multimodality, tool use, et cetera, in the coming months." So this is great. I mean, this is kind of what we expect, right? I think everyone would rather get the model first and then have all the features come out later than have to wait an extra three or four months to get all the features out. So yeah, this is going to be cool.

Someone asked, "When will we get more information about GPT-4o image and 3D model generation?" To which he replied, "Soon," and had a screenshot of giving ChatGPT some HTML and saying "render this."
And it was able to render what that HTML would look like in a web browser, meaning they'll have essentially code visualization and some really cool stuff. And by the way, that was OpenAI's SVP of Research, who is Mark Chen.

Someone asked, "Is Sora being delayed due to the amount of compute time required for inference or due to safety?" Kevin, the CPO of OpenAI, said, "Need to perfect the model, need to get safety/impersonation/other things right, and need to scale compute." Someone replied, "So basically you're waiting for the 15x speed boost on inference with the B200s." Oof. This is not one I love. But yeah, they need to get the compute, and they have so many different projects right now. It's not like it's a company just working on Sora. Because OpenAI has their fingers in so many pies and there's essentially such a compute bottleneck, it really slows down a lot of their other projects, which is a bummer, because you can imagine if there were four companies the size of OpenAI, and one was working on image, one on audio, one on video, and one on text, we'd have way faster advancements in all of them. But because they only have so much compute in one company, they have to prioritize which projects to work on. And it looks right now like the Sora model is essentially getting sort of shafted until they can figure out a couple of other problems and get more compute.

Someone said, "When will the full o1 release?" Kevin said, "Soon," which is obviously not very specific. So people kind of roasted him and said, "A date or it didn't happen." But at least we're making progress towards this, which is good.

Someone said, "How will o1 influence scaling LLMs? Will you continue scaling LLMs as per scaling laws, or will inference compute time scaling mean smaller models with faster and longer inference will be the main focus?"
Kevin said, "It's not either or, it's both. Better base models plus more Strawberry scaling/inference-time compute." Okay.

Someone said, "What's one thing you wish ChatGPT could do but can't yet?" and data SIF, OpenAI VP of Engineering, said, "I'd love for it to understand my personal information better and take action on my behalf." I know it's kind of interesting, especially when we start getting agents.

Someone said, "Will ChatGPT eventually be able to perform tasks on its own? Message you first?" Kevin, OpenAI's CPO, said, "IMHO this is going to be a big theme in 2025," and someone said, "We heard that about 2024," with the crying emoji. So at the end of the day, everyone wants to be so optimistic about OpenAI, and they have such incredible tech. But sometimes when you feel like things are advancing so fast, you can see what the next step would be. And so it's like, well, why haven't they just taken that next step? And at the end of the day, the reason is that they have constraints of safety or compute or priorities or money, whatever. And so it just sucks to feel like we want the technology, we know how to do the technology, and we're just bottlenecked by something other than the knowledge. That kind of sucks. But hopefully this is going to be a big theme in 2025. Nothing super specific on that, though.

Someone said, "Is the plan to continue to release o-series models from now on, improving on the regular models, GPT-3, 4, 4o, 5, both, or a combination of those?" Kevin, OpenAI's CPO, said, "Both. And at some point I expect they'll converge." This is what I really think is happening here. GPT-4 came out super soon after GPT-3, but both of those had been in the works for a long time before we'd even gotten the ChatGPT launch. And so it felt like a massive jump in capabilities when GPT-4 came out.
And you can even remember Elon Musk and a whole bunch of other people signing a letter saying, like, the government needs to ban, or everyone needs to ban, making any models better than GPT-4 for the time being. And then of course everyone is trying to compete with OpenAI. They just felt so much further ahead. It feels like they wanted to come up with something that would be GPT-5, which ended up actually just being o1. But it wasn't really as big a difference as from three to four. So they didn't call it GPT-5, they just called it o1 or whatever. And eventually, when they have another thing that's a huge step in how much better it is, they're going to call it five. But they're making updates, they're training new models, and they're putting in work. It's just hard to wow us, I think, like they did in the past. And I've heard Sam Altman make some comments about things that they plan on being able to do that they would consider GPT-5, and it's going to be impressive, but it's also ambitious and needs a lot of compute.

So someone said, "Will we see Advanced Voice loosen restrictions around musical capabilities, like singing, at some point? Is there a timeline for this?" Kevin said, "Working on it. I want to hear chatbots sing too." That's cool. Again, it's like these capabilities exist. It's just, you know, for copyright reasons. Now what's interesting is you have companies like Sora that are already doing this. And what's interesting to me about that is the fact that essentially one of the edges these companies have is a willingness to kind of break some regulatory, maybe copyright, maybe other things. To be fair, OpenAI did at the beginning, when they sucked up the whole Internet and trained off of it, and people got mad about that, including the New York Times, and sued them.
So if you're willing to take some lawsuits, you can be the first, like Sora, to really come up with a solid video model. And once they do that, they're able to take a really big lead, which, to be fair, OpenAI is going to do eventually. But if they can get a big lead, they get ahead. And that's exactly what a company like ElevenLabs was able to do. They got a big lead on voice, and OpenAI is doing voice now. But at this point, ElevenLabs is really well known for voice. They're doing a really good job. They have a lot of APIs and developer tools that are built into a lot of things. So you can kind of get this edge, you can get a bit of a jump-start, and try to keep a bit of a moat.

Someone said, "What is the best use case for ChatGPT you've seen in the wild so far, and what, if any, area do you think it and future versions in the next couple of years could be particularly good for?" Sam Altman said, "There are a lot of great ones, but the stories of people figuring out the cause of a debilitating disease and then getting fully cured are really awesome to hear. Also, the ability to be a really good software engineer feels deeply underappreciated, even still. More generally, the ability to help scientists discover new knowledge even faster will be so great." I agree, all of those are fascinating and really useful. So I'm excited about pretty much all of those.

Someone said, "Do you have any plans to increase the memory ChatGPT can store?" Kevin, the CPO, said, "Do you mean longer context windows? If so, yes." They said, "No, I mean the amount of memory ChatGPT stores for a single account. The memory capacity keeps getting full and I'm forced to select which memories I would like to delete to make space for new memories to be saved. Persistent memory." And someone plus-oned that. So that's kind of interesting, and I don't think we got a response to it, but yeah, it definitely is an issue, and other people I've talked to and consulted have run into that same issue.

Someone asked, "Release date of ChatGPT-5 or its equivalent? What are its features?" Sam Altman said, "We have some very good releases coming later this year. Nothing that we're going to call GPT-5 yet, though." Okay, so hold our horses there.

Someone said, "Seriously though, what did Ilya see?" And Sam Altman said, "The transcendent future. Ilya is an incredible visionary and sees the future more clearly than almost anyone else. His early ideas, excitement, and vision were critical to so much of what we have done. For example, he was one of the key initial explorers and champions for some of the ideas that eventually became o1. The field is very lucky to have him." So those ideas and what became o1 was essentially chain of thought. So I guess he came up with a lot of the ideas that essentially did the chain of thought, which really, to be fair, gave OpenAI a big boost and helped them get ahead of some of their competitors. And someone else asked about him leaving, which I don't think was responded to.

Someone said, "When will you guys give us a new text-image model? Like, DALL-E 3 is kind of outdated." Sam Altman said, "The next updates will be worth the wait, but we don't have a release plan yet." Oof, that sucks. It always sucks to hear that it's going to be worth the wait, because we kind of all want it now, but it is what it is.

So much has happened. This is absolutely fascinating. I'm going to keep you up to date on any other news coming out of OpenAI. I don't know if these are leaks, but I feel like this AMA gave a really deep insight into the timelines on some of their core features and some of their biggest products, when we can expect those and when we're going to actually be able to use them.
So, absolutely fascinating and exciting. And if you're interested in using any of these tools to make money online, again, I would love for you to join the AI Hustle School community. The link's in the description, and I hope that you all have an incredible rest of your day.