Podcast Summary: Software Engineering Daily – Small AI Models with Yoeven Khemlani
Release Date: July 24, 2025
1. Introduction
In this episode of Software Engineering Daily, host Gregor Van sits down with Yoeven Khemlani, the founder of Jigsaw Stack. They discuss using small AI models for backend applications and the advantages of specialized, lightweight models over large language models (LLMs).
2. Yoeven's Background
Yoeven Khemlani shared his journey into entrepreneurship and technology. Starting as a game developer, he transitioned through various industries, including banking and corporate sectors, before founding his first startup, StayRWayback, a hotel aggregation service in Southeast Asia. After selling his initial venture, Yoeven became passionate about creating tools for developers, leading to the inception of Jigsaw Stack.
Yoeven Khemlani [01:33]: "I love building and I love building for people, right? It's like, how can I make money from the things that I build."
3. The Genesis of Jigsaw Stack
Jigsaw Stack was born out of Yoeven's desire to automate backend tasks using AI. Observing that existing LLMs like GPT-3 and GPT-4 excelled in front-end, human-in-the-loop applications but fell short in generating structured, actionable data, Yoeven envisioned a platform that could handle backend processes without manual intervention.
Yoeven Khemlani [01:33]: "Can we bring this technology to backend applications where there's no humans in the loop? It just works and takes away processes that used to require a lot of manual intervention."
4. The Concept of Small AI Models
Jigsaw Stack differentiates itself by utilizing small, specialized AI models instead of relying on massive LLMs. This approach focuses on creating models that are efficient, cost-effective, and highly accurate for specific tasks.
Yoeven Khemlani [04:02]: "Can we take the same and bring it to a 70B model? Because then we reduce our cost and increase efficiency."
Yoeven explains that while larger models are powerful, they are often overkill for many backend tasks. By fine-tuning smaller models (around 70B parameters), Jigsaw Stack achieves high accuracy (97-98%) while maintaining deployability and affordability.
5. Key Applications and Models
Jigsaw Stack offers a suite of small models tailored to various data extraction and transformation tasks:
- Web Scraping: Automates the extraction of structured data from websites without the need for writing complex Puppeteer or Playwright code.
Yoeven Khemlani [04:09]: "Jigsaw Stack is a suite of small models to automate your backend task."
- Optical Character Recognition (OCR): Enhances traditional OCR by integrating vision-based LLMs to provide more accurate text extraction with bounding boxes.
- Speech-to-Text: Utilizes optimized versions of Whisper 3 to deliver some of the fastest speech-to-text conversions in the market.
- Translation: Specializes in translating structured data, outperforming traditional services like Google Translate by using models trained specifically for translation tasks.
- Translating Text in Images: Beyond basic OCR, Jigsaw Stack is developing capabilities to translate text within images while preserving the original style and layout using diffusion models.
Yoeven Khemlani [08:03]: "Is there a way that we can take an existing image, understand the text on that image, then translate it and diffuse it back with the same style?"
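One way to read "translating structured data" is that only the human-readable string values change while keys, numbers, and nesting stay intact. The sketch below illustrates that idea under stated assumptions: translate_structure and the dictionary-backed fake_translate are hypothetical names invented for this example, not Jigsaw Stack's API, and a real system would call a translation model where the lookup table is.

```python
from typing import Any, Callable


def translate_structure(data: Any, translate: Callable[[str], str]) -> Any:
    """Recursively translate string values; leave keys and non-strings untouched."""
    if isinstance(data, str):
        return translate(data)
    if isinstance(data, dict):
        # Keys are treated as identifiers, so only the values are translated.
        return {key: translate_structure(value, translate) for key, value in data.items()}
    if isinstance(data, list):
        return [translate_structure(item, translate) for item in data]
    # Numbers, booleans, and None pass through unchanged.
    return data


# Hypothetical stand-in for a translation model: a tiny English-to-Spanish lookup.
_LOOKUP = {"Hello": "Hola", "Room": "Habitación"}


def fake_translate(text: str) -> str:
    return _LOOKUP.get(text, text)


record = {"title": "Hello", "price": 120, "tags": ["Room", "wifi"]}
translated = translate_structure(record, fake_translate)
```

Passing the translator in as a function keeps the traversal logic independent of whichever model does the actual translation.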
6. The Prompt Engine
One of the standout features of Jigsaw Stack is the Prompt Engine, designed to streamline prompt management, model routing, and the application of prompt techniques. A small routing model decides at runtime which models suit a given prompt; the engine then runs the input across those models and aggregates the results to improve accuracy and reliability.
Yoeven Khemlani [09:33]: "We took different data sets in different industries and trained a really small model that can make decisions on which model to pick at runtime based on your input of your prompt."
Gregor Van shared his personal experience using the Prompt Engine for scheduling across multiple time zones, noting its superior performance compared to individual LLMs in maintaining constraints and delivering accurate, structured responses.
Gregor Van [12:47]: "Prompt Engine came back pretty reliably covering that."
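The route-then-aggregate idea described in this section can be sketched in a few lines. This is an illustration only, not the Prompt Engine's implementation: the episode describes a small trained model making the routing decision, whereas this sketch substitutes a naive keyword-overlap score, and the Candidate class, the tags, and the majority-vote aggregation are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass
class Candidate:
    name: str
    tags: FrozenSet[str]       # topics this model is assumed to handle well
    run: Callable[[str], str]  # stand-in for a real model call


def route(prompt: str, candidates: List[Candidate], k: int = 3) -> List[Candidate]:
    """Toy router: rank candidates by keyword overlap with the prompt, keep the top k."""
    words = set(prompt.lower().split())
    ranked = sorted(candidates, key=lambda c: len(c.tags & words), reverse=True)
    return ranked[:k]


def aggregate(answers: List[str]) -> str:
    """Majority vote; ties go to the answer that appeared first."""
    return Counter(answers).most_common(1)[0][0]


def prompt_engine(prompt: str, candidates: List[Candidate], k: int = 3) -> str:
    chosen = route(prompt, candidates, k)
    return aggregate([c.run(prompt) for c in chosen])


# Hypothetical models that return canned answers.
models = [
    Candidate("a", frozenset({"schedule", "time"}), lambda p: "10:00 UTC"),
    Candidate("b", frozenset({"time"}), lambda p: "10:00 UTC"),
    Candidate("c", frozenset({"code"}), lambda p: "11:00 UTC"),
    Candidate("d", frozenset(), lambda p: "unknown"),
]
answer = prompt_engine("schedule a meeting time", models)
```

Majority voting only works when answers are directly comparable strings; free-form outputs would need a different aggregation step.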
7. Speed and Infrastructure Advantages
Jigsaw Stack emphasizes speed and cost-efficiency by deploying small models that require fewer resources than their larger counterparts. This reduces operational costs and makes the models easier to deploy, so enterprises can run them within their existing infrastructure without managing large GPU fleets.
Yoeven Khemlani [18:05]: "We started with the idea of GPU poor... focus on training the model to be deployable and cheap to run anywhere."
8. Developer Experience and APIs
Jigsaw Stack prioritizes developer experience by offering intuitive APIs that resemble familiar platforms like Stripe and Supabase. Developers can install the libraries with a package manager (e.g., npm or pip) and benefit from well-typed interfaces that reduce how often they need to consult the documentation.
Yoeven Khemlani [20:57]: "From day zero, every field needs to be a named field, everything needs to be descripted, everything needs to be typed."
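To make the "named, described, typed" principle concrete, here is a minimal sketch using Python dataclasses. The TranslateRequest class and its fields are hypothetical and do not reflect Jigsaw Stack's actual SDK; the point is only that each field carries a name, a type, and a description that tooling can read.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class TranslateRequest:
    """Hypothetical request shape: every field is named, typed, and described."""

    text: str = field(metadata={"description": "Text to translate."})
    target_language: str = field(
        metadata={"description": "Language code of the output, e.g. 'es'."}
    )
    source_language: Optional[str] = field(
        default=None,
        metadata={"description": "Language code of the input; detected when omitted."},
    )


def describe(cls) -> Dict[str, str]:
    """Collect the per-field descriptions, as an editor or doc generator might."""
    return {f.name: f.metadata["description"] for f in fields(cls)}


request = TranslateRequest(text="Hello", target_language="es")
descriptions = describe(TranslateRequest)
```

Because the descriptions live next to the types, an editor can surface them inline, which is one way a library can reduce trips to external documentation.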
9. Pricing Strategy
Initially, Jigsaw Stack charged a flat price per API call. However, as the product offerings expanded, Yoeven identified the limitations of this approach, especially for services like speech-to-text where the amount of work behind a single call can vary significantly. In response, Jigsaw Stack is transitioning to a token-based pricing model, aligning more closely with industry standards and offering greater flexibility for developers.
Yoeven Khemlani [23:16]: "We're shifting to a token-based pricing where we're estimating it to be around $1.40 per million tokens."
Additionally, Jigsaw Stack will introduce a free tier, providing developers with a generous number of free tokens monthly to encourage adoption and experimentation.
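As a back-of-the-envelope illustration of token-based pricing, the sketch below applies the roughly $1.40 per million tokens estimate quoted above. The size of the free tier is not stated in the episode, so free_tokens is a hypothetical parameter that defaults to zero, and the function name is invented for this example.

```python
from decimal import Decimal

# Estimate quoted in the episode; the final price may differ.
PRICE_PER_MILLION_USD = Decimal("1.40")


def estimate_monthly_cost(tokens_used: int, free_tokens: int = 0) -> Decimal:
    """Estimated monthly cost in USD, rounded to cents.

    free_tokens is a hypothetical monthly allowance; its real size is not given.
    """
    if tokens_used < 0 or free_tokens < 0:
        raise ValueError("token counts must be non-negative")
    billable = max(tokens_used - free_tokens, 0)
    cost = Decimal(billable) * PRICE_PER_MILLION_USD / Decimal(1_000_000)
    return cost.quantize(Decimal("0.01"))


cost_without_free_tier = estimate_monthly_cost(5_000_000)
cost_with_free_tier = estimate_monthly_cost(5_000_000, free_tokens=1_000_000)
```

Decimal is used instead of float so that currency amounts round predictably to cents.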
10. Community and Developer Usage
Jigsaw Stack primarily targets startups and indie developers who value high-quality developer tools. The company fosters a close-knit community by actively engaging with users, addressing bugs promptly, and iterating based on real-time feedback. Hackathons and direct interactions with developers play a crucial role in shaping the product and ensuring it meets the needs of its user base.
Yoeven Khemlani [28:20]: "The feedback loop is really good, like in real time when you do stuff like that from the startup community."
Yoeven notes that while startups hold high expectations, especially in the U.S., they tend to be forgiving of documentation issues and care most about critical functionality like uptime and reliability.
11. Future Roadmap and Product Direction
Looking ahead, Jigsaw Stack plans to deepen its focus on two primary areas:
- Data Extraction: Continuing to enhance models for OCR, segmentation, object detection, and other extraction tasks.
- Data Transformation: Expanding capabilities in translation and other data manipulation processes.
Instead of broadening the product range, Yoeven emphasizes improving the quality and efficiency of existing models, enhancing developer experience, and optimizing infrastructure for better performance.
Yoeven Khemlani [37:07]: "We're going super deep into some of the detection space and the embedding space."
12. Funding and Team Growth
Yoeven says Jigsaw Stack is raising $1.5 million in funding, which will support the growth of a lean team focused on product excellence. He plans to expand to a five-member team, maintaining agility while advancing the platform's capabilities.
Yoeven Khemlani [35:23]: "We're raising one and a half million with the goal to grow the team to like a five-man team."
The company is actively hiring, particularly seeking a founding full-stack AI engineer who showcases passion through side projects, reflecting Jigsaw Stack's commitment to building a stellar team.
Yoeven Khemlani [37:54]: "We only hire star players. We have three criteria... Do you have a side project?"
13. Conclusion
Yoeven Khemlani's insights into Jigsaw Stack reveal a focused and innovative approach to utilizing small AI models for backend automation. By prioritizing speed, cost-efficiency, and developer-friendly experiences, Jigsaw Stack is poised to offer robust alternatives to large LLMs, catering especially to startups and developers seeking reliable, scalable AI solutions.
Yoeven Khemlani [39:40]: "Solo founders just have to build their team better and that's the only challenge. It's not a pain point for me."
As the company transitions out of its beta phase and continues to enhance its product offerings, listeners can expect Jigsaw Stack to play a significant role in the evolving landscape of developer tools and backend automation.
