Podcast Summary: “Datumo Emerges as Scale AI’s Fiercest Competitor”
Podcast: The Joe Rogan Experience of AI
Host: The Joe Rogan Experience of AI
Date: August 17, 2025
Overview:
In this episode, the host takes an in-depth look at Datumo, a South Korean AI startup challenging the dominance of Scale AI. The discussion covers Datumo's recent $15.5 million funding round, its innovative approach to data labeling and AI evaluation, its expanding presence in the global market, and the shifting power dynamics caused by Scale AI’s recent deal with Meta. The episode maintains a lively, conversational tone reminiscent of the original Joe Rogan Experience, blending analysis, industry anecdotes, and personal commentary.
Main Discussion Points & Insights
1. Datumo’s Origins and Business Model
- Base of Operations: Seoul, South Korea, with a growing presence in Silicon Valley.
- Original Focus: Started as a data labeling company tackling the time-consuming process of prepping training data for AI models.
- Innovation: CEO David Kim developed a reward-based app that crowdsources data labeling—“if you have free time, you can sit there and label data in your spare time and you get paid for it.” (06:01)
- Rapid Traction: Even before the product was fully built, the company secured tens of thousands of dollars in pre-contract sales during customer discovery.
“In their first year they actually passed a million dollars in revenue. So a ton of people wanted this well labeled data.” (08:15)
2. Customer Base, Revenue, and Expansion
- Major Clients: Samsung, LG Electronics, Hyundai, Naver, SK Telecom—predominantly large Korean corporations.
- Growth: Over 300 clients, $6 million in revenue last year, and now 150+ employees.
- Future Plans: Expanding beyond Korea, targeting Japan and the US markets, with an office and hires in Silicon Valley as of March.
3. Service Evolution: From Labeling to Model Evaluation
- Market Need: As customers requested more than just labeling, Datumo began to help companies benchmark AI model safety and performance.
- Notable Internal Discovery:
“They wanted us to score AI model outputs to compare them to other models. That’s when we realized we were already doing model evaluation without even knowing it.” (Michael Huang, Co-founder, 11:24)
- Product Launch: Introduction of a no-code evaluation tool, “Datumo Eval”, designed for policy, trust, safety, and compliance teams lacking deep technical backgrounds.
4. Strategic Funding and Notable Investors
- Total Raised: Now $28 million, with Salesforce Ventures as the latest lead investor.
- Funding Story: The connection with Salesforce came from hosting a fireside chat with DeepLearning.AI’s Andrew Ng in Korea. A LinkedIn post about the event later caught the attention of Salesforce Ventures, making this an organic, networking-driven fundraising success.
“Hosting some big famous person and then posting it on LinkedIn is like the best way for a startup to raise money.” (19:15)
5. Licensed Datasets and Differentiation
- Beyond Custom Labeling: Datumo also licenses its own unique datasets, including large collections of data crawled from published books, described as “rich structured human reasoning” datasets—useful for training reasoning capabilities into LLMs.
“Apparently reading books is a good way for [models] to, like, reason through, learn how to reason through problems, which I thought was absolutely fascinating.” (15:57)
6. Industry Context: Scale AI, Meta, and Competitive Shifts
- Meta’s $14.3 Billion Deal: Meta’s recent investment/acquisition deal with Scale AI shook the industry. The host draws a direct parallel to Microsoft's deal with OpenAI, noting the similarly convoluted corporate structures.
- Backlash: Some major clients, including OpenAI, have stopped using Scale AI's services since the Meta deal, citing conflict-of-interest concerns.
- Market Opportunity: Host expresses strong optimism about Datumo’s prospects in this rapidly expanding and turbulent space.
7. Use of Recent Funding
- R&D Focus: Accelerate automation of AI evaluation and scale global go-to-market strategies.
- Scaling Global Presence: Noted push to expand outside Korea, especially in the US and Japan.
Notable Quotes & Memorable Moments
- On Industry Readiness for AI Safety:
“Most companies from surveys… say that they’re not prepared to use AI in a safe and responsible way.” (03:55)
- On the Crowdsourced Data Labeling Approach:
“It’s reward-based… anyone can get on there… in your spare time and you get paid for it. Super, super funny, but obviously a huge business, needs a lot of humans.” (06:10)
- On Startup Hustle and Validation:
“Before the app was fully built… they actually had tens of thousands of dollars in pre-contract sales during their customer discovery phase.” (07:32)
- On Ecosystem Expansion:
“Over the last couple years their clients also started asking them for other things other than just data labeling… helping companies benchmark models.” (10:45)
- On Data Quality from Books:
“Reading books is a good way for [models] to reason through problems… which I thought was absolutely fascinating.” (15:57)
- On Unconventional Fundraising:
“Hosting some big famous person and then posting it on LinkedIn is like the best way for a startup to raise money.” (19:15)
Timestamps for Key Segments
| Segment                                          | Timestamp   |
|--------------------------------------------------|-------------|
| Introducing Datumo and Scale AI context          | 00:00–03:45 |
| Industry survey on AI safety/preparedness        | 03:45–05:00 |
| Datumo’s crowdsourced data labeling model        | 05:00–08:30 |
| Customer validation and early success            | 08:30–09:20 |
| Core clients and move into benchmarking          | 09:20–12:00 |
| Expansion, revenue, and service evolution        | 12:00–14:15 |
| Differentiating with licensed datasets (books)   | 14:15–16:00 |
| Industry drama: Meta’s Scale AI deal & fallout   | 16:00–18:15 |
| Product focus: Evaluation tools for non-devs     | 18:15–19:00 |
| Unusual Salesforce Ventures investment story     | 19:00–20:20 |
| Growth plans: R&D and global expansion           | 20:20–22:00 |
Conclusion
The host concludes that Datumo stands as a prime challenger to Scale AI, leveraging Korea’s tech ecosystem, a clever crowdsourcing model, and a focus on both data and safety benchmarking. With high-profile clients, savvy networking for funding, and a push for global expansion, Datumo exemplifies how rapidly the AI infrastructure landscape is evolving and how opportunities are being seized in the wake of industry upheavals.
