Practical AI Podcast: “Controlling AI Models from the Inside”
Date: January 20, 2026
Hosts: Daniel Whitenack & Chris Benson (Practical AI LLC)
Guest: Ali Khatri, Founder of Rynx
Episode Overview
This episode tackles the critical and timely topic of AI safety—especially how to control AI models “from the inside” to ensure their outputs remain secure, ethical, and compliant in a variety of real-world scenarios. Ali Khatri, founder of Rynx, joins Daniel and Chris to discuss a new approach to runtime AI model safety: moving beyond traditional input/output “guardrails” to instrumenting and monitoring the internal workings of AI models, making nuanced, context-aware intervention possible in ways current solutions can’t match.
Key Discussion Points & Insights
1. Defining AI Safety: Security FOR AI vs. Security BY AI
- Ali Khatri introduces the distinction:
- Security by AI: Using AI to solve security issues (e.g., fraud, abuse detection).
- Security for AI: Making AI models themselves secure, preventing them from causing harm or being abused.
“As models have entered the tech stack, they bring in a bunch of security challenges ... Models today will literally tell—like these generative models—will generate any vile content known to man.” — Ali Khatri [03:59]
- Types of Risks:
- General categories (pornography, hate speech, etc.) and context-specific risks (e.g., financial fraud in banking, tailored responses in code generation).
2. Current Landscape: How Safety Is Handled Now
- Guardrails:
- Input and output filtering are industry standard: a model's prompts and responses are checked for disallowed content (a minimal sketch of this pattern follows this section).
- Ali compares this to “ID checks at the gate”: fine for some threats, but ineffective once bad behavior is already happening inside the model.
“Today's solutions analyze what's going into the model ... and analyze what's coming out ... But by then, the damage has already been done.” — Ali Khatri [07:23]
- Economics of Safety:
- Running guard models at input/output is computationally expensive, especially for large models or on edge devices.
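To make the guardrail pattern concrete, here is a minimal sketch; `classify_unsafe()` and the placeholder phrases are purely illustrative stand-ins for a real guard model, not any actual moderation API.

```python
# Minimal sketch of the input/output guardrail pattern discussed above.
# classify_unsafe() is a hypothetical stand-in for a real guard model,
# not an actual moderation API.
from typing import Callable

def classify_unsafe(text: str) -> bool:
    """Hypothetical guard check: True if the text violates policy."""
    banned_phrases = {"forbidden topic"}  # stand-in for a learned classifier
    return any(phrase in text.lower() for phrase in banned_phrases)

def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    # Input filtering: block the prompt before the model ever sees it.
    if classify_unsafe(prompt):
        return "[blocked by input guardrail]"
    response = generate(prompt)  # the underlying LLM call
    # Output filtering: the response is inspected only after generation,
    # which is the "damage already done" limitation noted above.
    if classify_unsafe(response):
        return "[blocked by output guardrail]"
    return response

# Example with a dummy model in place of a real LLM call:
print(guarded_generate("Tell me about the forbidden topic", lambda p: "..."))
print(guarded_generate("What's the capital of France?", lambda p: "Paris."))
```

Note that both checks run outside the model, which is exactly the limitation Ali highlights: nothing observes what happens in between.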
3. The Core Problem: AI Models as Black Boxes
- Jailbreaks & Adversarial ML:
- Attackers exploit lack of transparency by crafting prompts that evade guardrails.
- The inability to see or understand what happens inside the model means new attacks are always possible.
- Limitations of External Defenses:
“If your neighbor … decides to pull out a golf club and starts violently assaulting you … will the good guys be able to protect you just by checking IDs at the gate? ... You're past that point.” — Ali Khatri [07:19]
4. A New Paradigm: Mechanistic Interpretability & Model Instrumentation
- Interpretability:
- Explainability: Why a model made a decision (e.g., was this loan denied due to X?).
- Mechanistic Interpretability: How internal states generate outputs—gives insight into the “wiring” of the model.
“What interpretability tries to do is understand what's happening inside. Now, you could control it, or you could use that to make a risk quantification … You have a whole new class of data points.” — Ali Khatri [19:59]
- Ali’s Approach:
- Instrument the internal “subspaces” of a model at runtime: identify activations that correspond to unsafe behaviors and intervene before a full response is generated (see the sketch at the end of this section).
- Works without modifying or retraining the underlying model.
- Analogy:
“Think of this as cameras at every gate or every path. So you know that, OK, well, this is what's happening in this hallway, and we've got to stop it.” — Ali Khatri [17:44]
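To make the internal-instrumentation idea concrete, here is a minimal PyTorch sketch that attaches a probe to a layer's activations and halts generation when they project strongly onto an "unsafe direction." The toy model, the direction vector, and the threshold are all illustrative assumptions; the episode does not describe Rynx's actual implementation.

```python
# Hypothetical sketch of runtime activation monitoring with a forward hook.
# The "unsafe direction" would in practice be learned from labeled activations;
# here it is random, and the model is a toy stand-in for a transformer block.
import torch
import torch.nn as nn

torch.manual_seed(0)
hidden_dim = 64

model = nn.Sequential(            # stand-in for one transformer block
    nn.Linear(hidden_dim, hidden_dim),
    nn.GELU(),
    nn.Linear(hidden_dim, hidden_dim),
)

unsafe_direction = torch.randn(hidden_dim)
unsafe_direction /= unsafe_direction.norm()
THRESHOLD = 2.0                   # arbitrary illustrative cutoff

class UnsafeActivation(Exception):
    """Raised to halt generation before a full response is produced."""

def monitor(module, inputs, output):
    # Project the hidden state onto the unsafe direction; intervene if high.
    score = (output @ unsafe_direction).abs().max().item()
    if score > THRESHOLD:
        raise UnsafeActivation(f"risk score {score:.2f} exceeds threshold")

# Attach the probe to an internal layer without modifying its weights.
model[2].register_forward_hook(monitor)

try:
    hidden_state = torch.randn(1, hidden_dim)
    model(hidden_state)
    print("generation step allowed")
except UnsafeActivation as err:
    print("generation halted:", err)
```

The key property mirrored here is the one Ali emphasizes: the probe observes internal states as they are computed, so intervention can happen mid-generation rather than after the response exists.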
5. Implementation & Advantages
- How Rynx Instruments AI Models:
- Sits atop any open-source or proprietary model (e.g., Llama, Mistral), adding a “safety module.”
- Doesn’t require rebuilding or fine-tuning.
- Approach is model-agnostic and highly customizable.
- Efficiency Breakthrough:
- Example: securing an 8B-parameter Llama model with Rynx requires only about 20M additional parameters, versus 160B if traditional guard models are applied to both prompt and response filtering (a rough parameter count follows this section).
- Enables “comprehensive safety” even on edge devices, where performance overhead matters most.
“What we've built is a research breakthrough … we have succeeded in bringing it down to 20M [parameters]. So we're essentially a rounding error today.” — Ali Khatri [26:56]
- Performance & Accuracy:
- Matches and often beats external guard models—faster, cheaper, more reliable due to inside visibility.
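As a sanity check on the scale of the 20M figure, a back-of-envelope count shows that small per-layer probes can indeed stay in the tens of millions of parameters. The probe shape assumed below (a two-layer MLP per transformer layer) is purely illustrative; the episode does not describe Rynx's architecture.

```python
# Back-of-envelope parameter count for per-layer activation probes on an
# 8B-parameter Llama-class model. The probe architecture (a two-layer MLP
# per transformer layer) is an illustrative assumption, not Rynx's design.
hidden_dim = 4096     # Llama-8B hidden size
n_layers = 32         # Llama-8B transformer layers
probe_width = 150     # assumed hidden width of each per-layer probe

per_layer = hidden_dim * probe_width + probe_width  # hidden state -> probe
per_layer += probe_width + 1                        # probe -> risk score
total = per_layer * n_layers

print(f"~{total / 1e6:.1f}M probe parameters")  # ~19.7M, in line with the ~20M quoted
```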
6. Defense in Depth & Custom Policies
- Ali advocates for layered security (akin to human society’s national, border, state, and local law enforcement).
“Defense in depth … You need different levels ... to make sure that security on the whole is ensured.” — Ali Khatri [36:55]
- Customization:
- Rynx’s framework allows for domain-specific safety: companies can define and detect the behaviors that matter uniquely to them (a sketch of layering such custom policies follows this section).
“The model needs of most companies are similar-ish, but the safety needs are dramatically different.” — Ali Khatri [39:34]
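Putting the layers together, here is a minimal sketch of how external guardrails, an internal activation probe, and company-specific policies might be composed; every function signature here is a hypothetical placeholder, not a Rynx API.

```python
# Sketch of "defense in depth": compose input/output guardrails, an internal
# activation probe, and domain-specific policies. All callables are
# hypothetical placeholders, not a real vendor API.
from typing import Callable, List

def layered_generate(
    prompt: str,
    generate_with_probe: Callable[[str], str],     # model call; raises if the internal probe fires
    input_guard: Callable[[str], bool],            # True if the prompt violates policy
    output_guard: Callable[[str], bool],           # True if the response violates policy
    domain_policies: List[Callable[[str], bool]],  # custom checks, e.g. PII or fraud patterns
) -> str:
    if input_guard(prompt):
        return "[blocked at the input guardrail]"
    try:
        response = generate_with_probe(prompt)     # may halt mid-generation
    except RuntimeError as err:
        return f"[halted by internal activation probe: {err}]"
    if output_guard(response) or any(policy(response) for policy in domain_policies):
        return "[blocked by output guardrail or domain policy]"
    return response
```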
7. Looking to the Future
- Vision:
- Ali aims to establish “model-native safety” as a standard and enabler for more sensitive applications (e.g., healthcare, PII compliance).
- Make AI model safety robust, customizable, and universal—removing barriers to AI adoption in regulated and high-stakes industries.
“My aspiration is to … build model-native safety. … That is the guiding vision behind Rynx.” — Ali Khatri [41:19]
Notable Quotes & Memorable Moments
- On the limitations of current guardrails:
“All these models are able to do is check IDs at the gate … That is the protection you’re getting.” — Ali Khatri [29:18]
- On the importance of interpretability:
“When you analyze how the data flows inside of this black box, you're able to control it and stop it at the source.” — Ali Khatri [17:18]
- On efficiency gains:
“We've succeeded in bringing it down to 20 million [parameters] … we can sort of deliver safety like comprehensive safety.” — Ali Khatri [27:07]
- On hybrid approaches:
“Yes, you have guardrails which look at prompts and responses. They are valuable. … Sometimes you want to combine, mix and match.” — Ali Khatri [36:55]
Timestamps of Key Segments
- 02:14 — Ali Khatri on his background and early anti-abuse work
- 03:59 — Security for AI vs. security by AI; defining real-world risks
- 07:19 — “ID check at the gate” analogy & where current guardrails fail
- 11:47 — Terminology: guardrails, static checks, interpretability, mechanistic interpretability
- 16:44 — Expanding interpretability for runtime safety
- 23:20 — How Rynx’s approach avoids the need for retraining or modification
- 26:56 — Efficiency breakthrough: internal model monitoring vs. expensive guard models
- 32:03 — Discussing reliability and accuracy of internal vs. external model safety approaches
- 36:55 — Defense in depth: hybrid safety strategies
- 39:34 — Customization for domain-specific safety needs
- 41:19 — Ali’s vision for industry-standard, model-native safety
Final Thoughts
This episode provides a compelling, forward-thinking look at the next generation of AI safety: moving from black-box, guardrail-centric solutions to nuanced, efficient instrumentation of the models themselves. Ali Khatri and Rynx represent a significant leap in how AI can be made both powerful and safe, not just for technology's sake, but for the real-world contexts and stakes in which it is increasingly deployed.
For developers, business leaders, and AI stakeholders who want models that are not just functional but also reliably safe—even in dynamic or custom environments—this episode is essential listening.
For more resources or to contact the guest, visit Rynx at wrynx.com (W-R-Y-N-X).
