Big Technology Podcast: The Next Gen AI Models – Reliable, Consistent, Trustworthy with Aidan Gomez
Hosted by Alex Kantrowitz | Release Date: October 30, 2024
In this episode of the Big Technology Podcast, host Alex Kantrowitz talks with Aidan Gomez, CEO of Cohere and co-author of the seminal paper “Attention Is All You Need,” which introduced the transformer architecture, a cornerstone of modern generative AI. Together, they explore the current landscape of AI, dispel prevalent myths, discuss the return on investment (ROI) in AI technologies, and forecast the future trajectory of artificial intelligence.
Investment and ROI in AI
Alex Kantrowitz kicks off the discussion by addressing the substantial investments flowing into AI, citing OpenAI’s recent $6.6 billion raise contrasted with reports of a $5 billion annual loss. He raises concerns about the sustainability of such investments given the escalating costs associated with data, compute, and energy required for training advanced models.
Aidan Gomez responds thoughtfully, emphasizing the long-term value of AI technologies. At 02:06, he states:
“I think those numbers are actually small relative to the long-term value that the technology will deliver. It is now time to prove that.”
Gomez highlights the transition from the proof-of-concept phase to actual production and widespread adoption, particularly in enterprise settings. He notes that Cohere is witnessing significant growth as businesses integrate AI into their operations, promising substantial ROI in the near future.
Dispelling Myths About AI
The conversation delves into the myth that next-generation AI models will be "godlike." Gomez dismisses the notion of Artificial General Intelligence (AGI) achieving superhuman capabilities imminently. At 04:25, he clarifies:
“I think the idea that we're building AGI or something, that's just going to solve all our problems for us. I think we need to set that aside.”
Rather than a sudden leap to AGI, Gomez expects steady, incremental gains in AI reliability, trustworthiness, and competency.
Scaling AI Models: Compute, Data, and Beyond
Alex probes the scaling of AI models, pointing to the rapid growth in GPU usage: Meta’s Llama 3 required 16,000 GPUs, while Elon Musk’s Project Memphis is deploying 100,000. He asks Gomez about the implications of such scaling.
Aidan Gomez acknowledges that more compute leads to better models but warns against excessive scaling without practical deployment benefits. At 08:32, he explains:
“If you're building a massive model, it's not actually useful for the world if it's too big to be consumed, if it's too expensive to actually deploy.”
Gomez advocates for balancing model size with usability, ensuring that advancements translate into tangible, accessible tools for users.
Artificial General Intelligence: Goals and Realities
The discussion shifts to AGI, with Alex noting that the term is interpreted in different ways. He distinguishes AGI, understood as intelligence comparable to a human’s, from superintelligence. Gomez aligns with a practical definition of AGI as intelligence matching human capabilities in various tasks. At 10:39, he shares:
“With that definition of AGI, I think it's both achievable and a fairly reasonable target. So we can measure how good humans are in any particular task and create technology that can perform that well.”
Gomez contends that AI will augment rather than replace human workers, enhancing productivity and creating new opportunities rather than causing mass unemployment.
Enhancing Work through AI Assistants
Exploring the role of AI assistants, Alex references OpenAI’s o1 reasoning model and its potential to evolve into a more trustworthy tool. Gomez emphasizes that the ability to reason and self-correct is what makes models more trustworthy. At 15:16, he states:
“It [reasoning] is a crucial piece in improving not only the accuracy or robustness or usefulness of the model, but also the trust in the model.”
He envisions AI evolving into a collaborative partner that assists with daily tasks, thereby transforming the nature of work.
Addressing AI Risks and Safety Concerns
Alex raises concerns about AI potentially causing harm through autonomous decision-making. Gomez reassures listeners by emphasizing human oversight and safeguards. At 16:35, he asserts:
“We have to plug them in and we have the opportunity to implement safeguards. So to make sure that before these models are put in any very high stakes situations, there's oversight that a human has to approve high stake actions.”
Gomez dismisses doomsday scenarios portrayed in media, advocating for thoughtful and controlled deployment of AI technologies.
Emergent Behaviors and AI Capabilities
The conversation turns to emergent behaviors in AI models, with Alex asking whether Large Language Models (LLMs) can exhibit behaviors beyond their training data. Gomez clarifies that while AI can interpolate between known skills, it does not spontaneously develop unconstrained abilities. At 19:42, he remarks:
“I've never seen a model behave in a totally unexplainable way. They're really good interpolators.”
He dismisses the notion of an intelligence explosion, suggesting that AI’s capacity for self-improvement plateaus rather than compounding indefinitely.
The Role of Synthetic Data in AI Development
Gomez discusses the limitations and applications of synthetic data in training AI models. At 22:08, he explains:
“Synthetic data probably doesn't get us out of that issue. Actually, I don't know if synthetic data outside of easily verifiable domains like math, it's hard to use synthetic data to drive outcomes.”
While synthetic data aids in areas like mathematics and coding, human expertise remains crucial for complex, nuanced fields.
AI in Enterprise: Real-World ROI Examples
Shifting focus to ROI, Gomez highlights Cohere’s integration with enterprise clients like Oracle. At 29:09, he shares:
“We're powering over 50 different applications within those software tools. And so it's actually starting to get into the hands of employees and drive efficiencies.”
Examples include automating job description creation, supply chain management, legal contract review, and healthcare data analysis. These implementations demonstrate significant time and cost savings, underpinning the economic value of AI in enterprise settings.
AI’s Impact on the Workforce
Addressing concerns about AI replacing jobs, Gomez emphasizes augmentation over replacement. At 39:56, he states:
“It's very assistive actually. So it's less about replacement, it's more about augmentation. What everyone's building are tools to augment their workforce to make them more productive.”
He envisions AI handling mundane tasks, freeing humans to engage in more fulfilling and intellectually stimulating work.
Partnerships and Integration with Cloud Providers
Discussing the role of cloud providers, Gomez notes Cohere’s collaboration with major platforms like Amazon Web Services and Microsoft Azure, while also supporting on-premises deployments for regulated industries. At 40:32, he mentions:
“Cohere has had a long-time focus on on-prem as well, because for a lot of regulated industries like finance and healthcare, a lot of that data doesn't actually go on the cloud.”
This dual approach ensures versatility and compliance across diverse enterprise needs.
Future Outlook: The Next 2-5 Years in AI
Looking ahead, Gomez predicts a continuous evolution of AI assistants becoming more capable and integrated into daily workflows. At 42:06, he forecasts:
“In the next two years, I think we're going to start to see really compelling assistants. It won't just be little convenience functions or small features. It'll look a lot like a partner that you do work with.”
Over the next five years, these assistants will evolve into trusted collaborators, deeply embedded within various systems and processes.
Personal Insights from Aidan Gomez
In a reflective segment, Gomez shares his astonishment at the rapid adoption and impact of the transformer architecture. At 27:26, he says:
“Even if I step away from being one of the authors of the paper, the impact and what the architecture has been able to do for the field has been a huge shock, a colossal shock.”
He credits his co-authors and acknowledges Google’s pivotal role in integrating transformer technology across its platforms.
Conclusion
The episode concludes with Gomez reiterating the immense potential of AI to transform industries by automating repetitive tasks and enhancing productivity. He underscores the importance of thoughtful implementation and the collaborative future of human-AI partnerships.
Alex Kantrowitz wraps up the conversation by highlighting the understated yet profound impact of AI in enterprise settings, challenging the perception that AI’s value lies solely in consumer-facing applications.
This episode of the Big Technology Podcast offers a comprehensive examination of the current state and future prospects of AI, grounded in the expertise of one of the field’s pioneering figures. Listeners gain a nuanced understanding of AI’s economic viability, practical applications, and the realistic scope of its capabilities, steering clear of both unfounded fears and exaggerated hype.
