Season 7, Episode 3: Understanding sovereign AI (with Alfred Sukher) - Mobile Dev Memo Podcast

Summary

Mobile Dev Memo Podcast — S7E3: Understanding Sovereign AI (with Alfred Sukher)
Release Date: January 20, 2026
Host: Eric Soufert | Guest: Alfred Sukher, Founder of Novo Nuggets

Episode Overview

This episode explores the emerging concept of "sovereign AI"—AI systems that are privately owned, operated, and controlled by individuals or organizations, rather than relying on third-party, cloud-based large language models (LLMs) like ChatGPT or Gemini. Host Eric Soufert and guest Alfred Sukher, founder of Novo Nuggets, discuss the implications of AI sovereignty for data privacy, regulatory compliance, competitive advantage, and costs, across sectors such as finance, healthcare, legal, proprietary business operations, and even personal or consumer use.

Key Discussion Points & Insights

1. Guest Introduction & Data Ownership as a Competitive Advantage

Alfred Sukher's Experience: Decade-long career at the intersection of tech, data, and business—KPMG, Google, and consultancy roles.
Core Belief: "[Data] is the real competitive advantage in our age…You can't rent it, you can't actually share it with anybody else." (Alfred, 02:21)

2. Why Sovereign AI? — Practical Risks of Outsourcing LLMs

Example: Investment banking analyst fired after uploading a sensitive deck to ChatGPT. (Eric, 03:47)
Parallel in Healthcare: Doctors fired for sharing patient data with ChatGPT—regulations strictly prohibit this. (Alfred, 05:44)
Eric’s Realization: Outsourcing prompts to third-party LLMs exposes you to privacy, security, and regulatory risk.

3. Defining Sovereign AI

"Sovereign AI literally means that the entire intelligence stack is yours. From model to data to embeddings, vector databases, inferencing logs, all lives inside your walls, not on somebody else's server." (Alfred, 05:55)
"It’s the difference between owning a brain and renting one that reports just to you." (Alfred, 06:24)
Sovereignty is about custody of data and models—not necessarily about training models from scratch, but about who controls and houses them. (Eric & Alfred, 07:35–08:06)

4. Fine-Tuning, Privacy, and Regulatory Risk

Fine-tuning location matters: When done on third-party infrastructure (e.g., GCP, Vertex AI), there are rights, access, and data residency concerns.
- "Anybody that you share your data with is going to look at it especially for fine tuning a model..." (Alfred, 10:02)
Regulations: HIPAA, GDPR, DMA treat AI as a processor—moving data outside regulated environments can violate compliance. (Alfred, 12:21)
- "This is non-negotiable. The regulations are very clear. You can't share patient data or any private data with any other processor." (Alfred, 12:28)

5. Privacy Vulnerabilities in Web-Scale LLMs (13:43–16:18)

Data residue: "Even if they say no logging...there are some part[s] of this data [that are] going to [be] stored as system metadata." (Alfred, 14:39)
Model contamination: If your unique prompts/techniques are repeatedly used, they can shape or “contaminate” the model. "You're going to contaminate their model with your data... exposing your techniques." (Alfred, 15:14)
Cross-tenant exposure: Your data may pass through and persist at telemetry points, loaders, and inference routers shared with others.

6. LLM Poisoning and the Singular Response Problem (16:18–21:55)

Reddit as Data Poisoning Attack: Reddit users organized to poison Google Gemini’s output for commercial ends. (Eric, 16:18–17:47)
SERP vs. LLM Output: "The difference is I get a range of results when I do a search. If I do an LLM query...I'm expecting the all encompassing response." (Eric, 18:51)
Risk: With LLMs, users might trust a singular response without realizing it could be gamed.

7. Implementing Sovereign AI — Spectrum of Approaches

Full Ownership: Hosting model and data on-premises. "Once you have the hardware, you host your own model and you have your own data sources... you are 100% ironclad." (Alfred, 23:02)
SME/Startup Path: Even small teams (e.g., 5 people) can build in-house clusters at reasonable cost, especially over time. (Alfred, 23:54)
Cost and Economics: Upfront cost is higher, but long-term costs swing in favor of sovereign AI because there are no per-token/API fees; in-house cost is largely fixed once the hardware is bought, while API costs keep growing with usage. (Eric & Alfred, 25:38–28:13)
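
The economics point above can be made concrete with a rough break-even calculation. The sketch below is illustrative only: the function name, the cost model (a one-time hardware purchase plus a flat monthly running cost, compared against a flat per-token API price), and every number in it are assumptions, not figures from the episode.

```python
import math


def breakeven_months(hardware_cost, monthly_onprem_cost,
                     monthly_tokens, api_price_per_million_tokens):
    """Return the first month in which cumulative API spend would exceed
    cumulative on-prem spend, or None if the API stays cheaper.

    All inputs are hypothetical; this is a simplified linear cost model.
    """
    # API spend grows in proportion to token usage.
    monthly_api_cost = monthly_tokens / 1_000_000 * api_price_per_million_tokens
    # What running in-house saves each month once the hardware is paid for.
    monthly_saving = monthly_api_cost - monthly_onprem_cost
    if monthly_saving <= 0:
        return None  # usage is too low for the hardware to ever pay for itself
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical small team: 20,000 of hardware, 300 per month for power and
# upkeep, 200 million tokens per month at 10 per million tokens via an API.
months = breakeven_months(20_000, 300, 200_000_000, 10.0)
print(months)
```

With these made-up inputs the hardware pays for itself within about a year; at a much lower token volume the function returns None, which matches the caveat that the economics only flip once usage is high enough.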

8. Sovereign AI as Core Business Infrastructure

Competitive Advantage: "Knowledge is competitive advantage...protecting this way, how you do it is actually the biggest asset that you have." (Alfred, 24:41)
LLM as Infrastructure, Not SaaS: Should not be treated like disposable tools—it's the “logic layer” of your business operations. (Eric, 28:13)
Consistency: On-prem models guarantee output consistency; cloud LLMs may have undetectable changes or disruptions. (Eric & Alfred, 28:13–29:14)

9. Security & Technical Weak Points

Load balancers, telemetry points: Even if vendors claim "no logging," data can pass through and stick at these points, beyond your control. (Eric & Alfred, 31:49–32:38)

10. Common Use Cases for Sovereign AI

Regulated Data: Healthcare, finance, government, energy—where data is legally protected.
Proprietary Processes: Internal SOPs, underwriting logic, manufacturing—core operational secrets.
Knowledge Retention: Building institution-specific AI knowledge bases for succession or training. (Alfred, 32:57–35:06)

11. Unexpected Vertical: Agriculture

An Amazon reforestation company in South America wanted to keep its agronomic knowledge and forestry methods private. This surprised Alfred and highlighted that competitive secrets exist in unexpected fields. (Alfred, 35:19)

12. Geographical & Regulatory Context

Europe: More regulation, but interest is "pretty universal...Europeans are two steps ahead." (Alfred, 36:43)
Societal Awareness: Most users aren't vigilant; industry leaders including Sam Altman warn against oversharing with public LLMs.

13. Personal/Self-Hosted/Consumer Sovereign AI

Consumer Use: On-device LLMs are already viable (Alfred self-hosts at home), facilitating private "second brain" experiences. (Alfred, 38:33)
Vision: "AI will be as personal as your mobile phone should be..." (Alfred, 39:34)
Practicality: Physical AI appliances for personal use, never connected to the internet.

14. Sovereign AI & AI-Facilitated Cyber Threats

Coordinated Espionage: The first organized AI espionage campaign through Anthropic LLMs emerged in late 2025, demonstrating real-world weaponizability. (Alfred, 41:03)
Importance: "If I want my personal, my private one, I don't want it to be weaponizable."

Notable Quotes & Memorable Moments

On Data Ownership (02:21 / 24:41):
"You can't rent it, you can't actually share it with anybody else. No matter what field you are in... it's your data."
On Sovereign AI (05:55):
"Sovereign AI literally means that the entire intelligence stack is yours...all lives inside your walls, not on somebody else's server."
Cloud LLM Risk (10:02):
"Anybody that you share your data with is going to look at it especially for fine tuning a model..."
Regulations Non-Negotiable (12:28):
"This is non-negotiable. The regulations are very clear. You can't share patient data or any private data with any other processor."
On Model Contamination (15:14):
"You're going to contaminate their model with your data...the model is going to pick up on this trend...that's exposing your techniques."
On Consumer Trust (21:03):
"If I'm given this again, singular response, I'm just gonna...A lot of people just view that as definitive. Right?"
On Hardware vs. API Economics (27:30):
"At some point the economics flip to supporting in housing some of this hardware...you're paying a lot of money to use the API and a token..."
On Personal AI Future (39:34):
"In the future AI will be as personal as your mobile phone should be like that because everybody uses their mobile phone differently..."
On LLM Weaponization (41:03):
"There was an orchestrated cyber attack espionage campaign through Anthropic ... the first organized AI espionage crime in that space."

Timestamps for Key Segments

[02:21] Alfred on data as the unique, ownable competitive advantage
[03:47] Eric's investment banking anecdote—real-world prompting risk
[05:55] Alfred's definition of sovereign AI
[12:28] Alfred on regulatory non-negotiables (HIPAA, GDPR, DMA)
[14:39] Alfred explains inherent web LLM privacy vulnerabilities
[16:18] Eric on Reddit users poisoning Gemini’s output—LLM poisoning
[21:03] Eric on consumer trust in LLMs vs. search engines
[23:02] Alfred on full-stack, ironclad approaches for startups
[27:30] Host & guest discuss economics of sovereign AI vs cloud API
[32:57] Alfred outlines the three most common sovereign AI use cases
[35:19] Surprising use case: Amazon reforestation in agriculture
[38:33] Personal device AI and future of consumer sovereign AI
[41:03] Alfred recounts the first orchestrated LLM espionage attack

Final Thoughts

Sovereign AI represents a paradigm shift—one that’s about owning, not renting, the intelligence that fuels modern business and personal productivity. While security, regulatory compliance, and privacy are at its core, the long-term economic advantages and the ability to tailor AI to the “soul” of an organization or individual are just as compelling. As both enterprise and consumer AI users become more aware of these dynamics, demand for solutions like those discussed by Alfred Sukher will only intensify.

Connect with Alfred Sukher and Novo Nuggets via novonaughts.com or LinkedIn for further discussion or collaboration.


Transcript

[00:03]
Sponsor/Ad Host
Mobile game developers no longer need to pay up to 30% in major app store fees. With Xsolla Web Shop, you can create a direct storefront, cut fees down to as low as 5% and keep players engaged with bundles, rewards and analytics. Start today at xsolla.com, that's x s o l l a.com, or use the link in the episode show notes.
[00:30]
Alfred Sukher
The problem is that distinction needs to be drawn between the competence of the economists and the correctness of their analysis.
[00:41]
Eric Soufert
Welcome to the Mobile Dev Memo podcast. This is your host, Eric Soufert and I'm joined today by Alfred Sukher. Alfred, welcome to the podcast.
[00:47]
Alfred Sukher
Hi Eric, thank you for having me. It's amazing starting the new year with this amazing opportunity to talk about a lot of, I think impactful subject of our lives and how AI is going to touch it in the next five years.
[00:59]
Eric Soufert
Yeah, I'm very excited, I'm very excited to talk to you about sovereign AI, about privacy issues related to, you know, LLM usage and chatbot usage, why companies should be thinking about that and how companies can, can deal with it. We said we were introduced by Luca who was on the podcast maybe two months ago from, from Pymt Labs. We had talked about his recent paper and, and he introduced us kind of championing you as, as a, as an interesting guest to have. And this is certainly a topic that I've been spending a lot of time thinking about, so I'm really excited to talk about it. Before we do that, please introduce yourself to the audience.
[01:33]
Alfred Sukher
Well, I'm Alfred. I spent the past decade of my life working at the intersection of technology, data and business. So I was a consultant in the financial sector at KPMG. I worked at Google with big tech. I was a freelance consultant as well, doing innovation projects. And I realized how data is the real competitive advantage in our age. And this is the real differentiator between you and everybody else in the field, no matter what field you are working in. And you mentioned Luca. I just want to shout out Luca, thank you very much for this opportunity as well.
[02:06]
Eric Soufert
Well, I think, you know, feel free to introduce your company and what you're doing too. I know I told you not to be too commercial, but you know, you could be a little commercial, but I think that's, that helps to set the stage for the conversation. So maybe talk about what you're doing now and what your company's approaching because that's kind of the topic of the, of the discussion.
[02:21]
Alfred Sukher
This is a core belief: data is the competitive advantage that everybody should own. You can't rent it, you can't actually share it with anybody else. No matter what field you are in, no matter what business you are running, the way you do things, the way you operate, your process makes you different from everybody else. So maybe it's marketing, maybe it's product development, maybe it is product manufacturing, but it's your data. So what I was thinking, working with so many different industries, is how can we protect this data as much as possible going into the AI era, where computing is different, where we need inferencing and we can't actually not depend on AI for the next decade? So we started Novo Nuggets. Novo Nuggets is literally to give back control to everybody that wants to use a technology that is going to touch every aspect of their life. On the business side, protecting the data inside the business, and on the personal side, protecting your private life. You might want to talk with your LLM as a friend or you want to share your tax data. And this is the problem that we want to solve. We want to enable people to use this technology without thinking, okay, my data might be logged somewhere, or this is going to surface on an LLM when the LLM just picks up on the pattern because I prompted it in the same way over and over again, and that's going to be used for training and somebody else can get this information.
[03:48]
Eric Soufert
So just to share an anecdote, because I think this is kind of what catalyzed my interest in this topic. So I heard from a friend of mine who works in investment banking, they had to fire one of their analysts because, or, you know, I don't know if they had to, I guess, but they did fire one of their analysts because he just fed in a deck that he was working on to whatever he was, you know, using, ChatGPT or whatever, to ask for feedback, and essentially just shared it. You know, if you think about what you're doing when you do that, I mean, you're sharing sensitive, highly sensitive, potentially tradable information with a third party. You know, this is not your internal, this is not a confidential resource here. This is a third party. This is a company that you've just shared this deck with and asked for feedback or whatever. And it got me thinking, wow, okay, an analyst, you know, they're 22 years old or 23 years old or whatever. Maybe they're just, you know, being sloppy or just being lazy or whatever. But that's a serious issue. That's a serious problem. And I don't know, I mean, I imagine the big banks are thinking about it, and certainly, you know, your company's thinking about it, and maybe your company works with a lot of big banks, but that's only going to become more and more of an acute risk. Right? Just this idea of, well, it's my program, it's my software, I pay for it, I pay a monthly subscription. But you don't know what they're doing behind the scenes. You don't know what they're doing with that and where they're storing it and how they're retaining it. I mean, that's a serious problem, a serious vulnerability in a range of domains, not just finance, where you might be working on an IPO, but in the legal domain, in the medical domain. 
And so like, I just think these kind of sovereign AI solutions, they're only going to become more and more important over time. But, but maybe just a good place to start there is to talk to me about what sovereign AI means in practice. Like so maybe even anchor to that example if it's helpful. But like what is sovereign AI?
[05:44]
Alfred Sukher
Well, you touched on a lot of points. I think we'll talk about them later. On healthcare, just going back to your point: in healthcare, I know a lot of doctors that got fired immediately on the spot because they shared their patient data with ChatGPT, and you can't let that happen. There's a lot of regulations, there's a lot of compliance issues there. They were fired on the spot. So back to your question, what is sovereign AI? Sovereign AI literally means that the entire intelligence stack is yours. From model to data to embeddings, vector databases, inferencing logs, all lives inside your walls, not on somebody else's server. It's the difference between owning a brain and renting one that reports just to you. So owning the whole brain is what is going to give you the competitive advantage in the future. This is a new technology. This is not like a SaaS subscription that you just paid for in the past. It's literally exchanging ideas and exchanging critical information with a system that needs this information and tokens to generate tokens back. And all of this is going to be inferenced on somebody else's server where, even when they say through an API, we are not going to retain this information, we are not going to train our models on this information, there are load balancers, there are telemetry points, there are a lot of things that are going to be in between your data and their servers, and your data has to go through all of these gates to get processed and back. And that's a risk, a risk that you can actually mitigate by owning the whole system. You don't even need big scale GPUs or hardware. There are enough models today that are production level, and that's, in my opinion, a sovereign AI system, where you can literally own the whole stack and destroy the whole stack at one point, even with the data in it.
[07:36]
Eric Soufert
And just to be clear here, we're not necessarily talking about training a model from scratch, right? I mean I guess this, there's a spectrum, right? So on one end of the spectrum it's just your server, you're hosting some, you know, open weights model there up to. Yeah, we did train it completely with our own data and it's also hosted on our server. But the reality is it's more just like about who's receiving the data that is used as the prompt, right? Is that, is that kind of the core distinction here between sovereign and non sovereign?
[08:07]
Alfred Sukher
The core distinction has nothing to do with what model you are using, to be honest. It's custody of the model. It's where you have the model living: is it inside your hardware and your servers or is it somebody else's? So do you own the hardware? Do you own the models? Do you own the data that's feeding the model and do you own the end user interface? This is the full stack. When you own all of this, you can say I own my AI, I can use it even without paying a monthly subscription. And that's the point. If you want to fine tune a model, that's actually adding to your competitive advantage. Because once you use it, you can retain this data with your logs, and then at some point you'll just label the data, tell the AI, this is the data that was good, this is the data that was bad, and please just continue learning how I see the work should go, on and on. So this is going to create a second brain for your organization or business that's going to think, act and react exactly the way you want it, right?
[09:06]
Eric Soufert
And so, but even with, I guess, fine tuning, I mean, you know, depending on how you do it, people get confused. So fine tuning would be like a performance improvement, right? There's no inherent privacy benefit from doing that. And, you know, I suppose if you fine tune a model, you have to host it somewhere and it becomes an endpoint that probably only you have access to. Right. So that's a little bit different than just feeding stuff into a chatbot in the web interface. Right. But still, I mean, where did it get fine tuned? And not to be conspiratorial about it, but if you're using, you know, Vertex AI or whatever to fine tune a model, I don't know what happens when I send the data there. I mean, I'm uploading it to my Google Cloud, my GCP instance or whatever. But that's not necessarily total, you know, to borrow your term, sovereignty around the data. I don't know where it goes. Like, I think no one's accessing it, but I can't be sure of that. And okay, well, Google is probably more trustworthy than some fly-by-night company. But what if a fly-by-night company is offering to do it?
[10:03]
Alfred Sukher
Absolutely. Where you fine tune a model plays a huge role in the sovereignty of your data. Anybody that you share your data with is going to look at it, especially for fine tuning a model, because we need to label it, we need to see what data works, what doesn't, and so on and so forth. So the best situation of all is just to build your own stack, have enough of a GPU server on your end. I know it's a big upfront cost, yet it pays off with time because you are not going to pay for token generation, and then you will have your stack to train your own model, actually within your four walls. And GPUs are available. There is a lot of computing power available today. There is going to be more computing power available in the near future. I think hardware manufacturers are picking up on the trend and I think everybody should be able to fine tune their model on their own hardware stack at home or at their own business.
[10:57]
Eric Soufert
Well, I think maybe just to revisit my point, I want to make it clear that if I'm fine tuning a model on GCP using Vertex AI, I don't think anybody's snooping on that data. I don't think that to be the case. But that's GCP, right? So now what if I use some other provider that's a smaller company than Google? But the real issue here is, what are the regulatory limitations around doing that? Now what if I say, well, I'm fine tuning this disease identification classification model and I just uploaded 2000 sets of patient data to GCP? Now again, I don't think Google's snooping on that data, but does that violate some sort of medical data records regulation? Like, I just think that, you know, you might say, well, it's our GCP instance, of course it's protected, it's safe. But I have no idea if that violates some sort of, you know, medical data laws or whatever. And I think that's part of the issue here too. It's the, okay, well, what's the practical threat? That could be severe or maybe it's trivial or negligible. But what's also the regulatory threat? Like, what are the ways I have to handle this data? What are the ways that I'm mandated to handle this data? And that also is a risk. Right, and so it's just something that you can mitigate by just bringing this in house?
[12:21]
Alfred Sukher
Absolutely. The regulatory liability is a big one in this area, to be honest. Like HIPAA, GDPR, DMA, most of them consider AI systems as a processor. So if the data flows outside this jurisdiction, you're exposed. So you are not upholding these regulations if you're just fine tuning through Vertex AI or any other server. This is non-negotiable. The regulations are very clear. You can't share patient data or any private data with any other processor. Yet processing is needed to fine tune a model on your own data. There's no other way actually to do it with sovereignty and uphold these regulations other than having the whole stack and fine tuning the models in house.
[13:09]
Sponsor/Ad Host
You know those channels your colleagues keep bragging about? The ones getting all the credit? Yeah. They might be doing squat. Attribution makes every channel look like a hero.
[13:18]
Eric Soufert
Even when it's a zero.
[13:19]
Sponsor/Ad Host
Incremental tells you who's actually doing the work. It's like a lie detector for your marketing budget. Start using Incremental today. Get your demo at Incremental.com, that's I n c r m n t a l.com. Mention that you came through the Mobile Dev Memo podcast for a special 15% discount for the first six months.
[13:43]
Eric Soufert
What are the privacy vulnerabilities inherent in web scale LLMs? Because I do know that, like, you know, you'll read on LinkedIn or whatever, on Twitter, people, like, they uploaded their health history to ChatGPT, or, you know, I saw somebody uploaded their diary to ChatGPT to sort of be diagnosed from a psychological perspective. And I think people do that kind of without thinking twice about it. But what are the privacy vulnerabilities inherent in doing that? And not necessarily with respect to any sort of hostile intent by these companies or, you know, mendacious intent by these companies, but ones that just naturally occur. Right, because you're submitting data, you don't know how that data is used. If it's used to train, well, then that's ingested into the system at some point, and maybe there's some sort of prompt attack that could be used to surface it. Just talk to me about that. What are the privacy vulnerabilities inherent in these kind of large scale web based chatbots?
[14:39]
Alfred Sukher
Well, it's a very interesting question. I see it in three points, to be honest. One of them is data residue. Even if they say no logging, we are not going to train our model on your data, some part of this data is going to be stored as system metadata. It's used for debugging and to detect your real intention with the model. That's actually a standard process. So it's naturally occurring that a bunch of your data is going to reside in their servers. The second one is the model contamination risk. So you are going to contaminate their model with your data. Whenever you use the same process again and again, the same prompt again and again, and you expect a certain response from the model, the model is going to pick up on this trend. And if anybody else uses the same prompt that you are using, it's going to give them the same answer that you get. And that's actually exposing your techniques or exposing a step in your process. So data will reside with them no matter what they do. Even if they say no logging, okay, no logging, but some metadata will always stay. You're going to contaminate the data. I think we need to go back on this. And the third point is the cross-tenant exposure. Because you are using a shared processor and shared server, your data is going to go through a lot of telemetry points, data loaders, inference routers and limiting pipelines. And these pipelines are shared with everybody else. I'm not going to say they are going to share the data, but that means your data is going to be stored on those telemetry points no matter what you do.
[16:19]
Eric Soufert
Yeah, and I think kind of as an extension of that, there was an interesting article I read, I don't know, a day or two ago about how people on Reddit, so Reddit is a data source for Gemini, right? I mean, that was the partnership that Google and Reddit struck, for, you know, data licensing, and people on Reddit figured out that they could band together and poison Gemini, essentially just saying things that are wrong or whatever, with some sort of commercial intent. Right. So if I'm promoting some business or some approach to doing something, and Gemini uses that as the sort of canonical way to do that thing, then if you get enough people that are coordinated in doing that, it's essentially an attack, right? It's a data attack. Call it poisoning the LLM, but not in a kind of fun prank sort of context. Remember, Google AI Overviews was saying you could glue cheese to pizza. Right. Because it picked that up from Reddit. But actually, if they're coordinating to promote some business or promote some approach to doing something that benefits them when that's the result that's shown to users, I mean, that's a vulnerability. Right. And because it's that shared resource, how do you protect against that? Right, because if you're just kind of looking at the preponderance of responses to some question, then you just view that as the weighted right answer.
[17:47]
Alfred Sukher
Web-scale, or like web-based, LLMs are made for reach, not confidentiality, and the data source plays a big role. Every LLM follows this rule: knowledge in, intelligence out. So once you choose what knowledge goes into your LLM, you can use the intelligence coming out. So you mentioned this example of Reddit users poisoning the data of Gemini. And you don't want any LLM that is going to be poisoned with anybody's data source if you're going to use it on critical business processes or for your own, let's say, medical exam. And how do you protect that? I mean, we started Novo Nuggets to give people the opportunity to own the whole system, to know what knowledge is going into their LLM, and to have a consistent LLM that is going to be usable, intelligent and trained on their domain, not anybody else's. Nobody can contaminate the data if it lives within your own four walls.
[18:51]
Eric Soufert
I would say, yeah, I mean, to be clear, I think people often draw the parallel between chatbots and search. Right? And so it's certainly possible to poison SERP, right? Or game SERP. I mean, that's been an issue forever. I mean, that's what SEO is. But the difference is I get a range of results when I do a search, right. If I do an LLM query, oftentimes I'm expecting just the singular response to my question. It's meant to be the all encompassing response. Right. And so that's the real issue here. You're not presented with a range of things where you could then go and kind of qualitatively evaluate the source. Right. Like if I knew that this LLM response was sourced from Reddit user Santa Claus27 with two posts to their name, well, I don't know if I would trust that. But if I do a search and that's actually what comes up, well, then I'm more equipped to dismiss it. I think that's part of the issue here. It's not necessarily the mechanics of it, but it's the expectation, the expectation that, well, this has actually gone through and done some sort of curation on my behalf. And so it's given me the best possible answer, singularly. Right. But I don't actually know whether that was gamed or not. And it could have been gamed because, again, this is like a shared resource.
[20:11]
Alfred Sukher
Absolutely. I would say it's how the LLM works versus how search and query works. So search and query is just indexing the Internet. It's just telling you on what page of the Internet this exists. But an LLM is predicting the next word. And if it sees a lot of examples saying that after a certain word always comes "red", it will always answer with "red". So it's knowledge in and intelligence out. Once the knowledge is repeatedly saying that after a given word "red" should be present, the LLM will always be answering with "red". And that will be correct in its eyes, put it like this. So you need to treat it like a prediction machine for the next word. The more specific knowledge you put in, the better intelligence you can get out of it. So search and querying, in my eyes, is a different approach to extracting data from the Internet.
[21:04]
Eric Soufert
Right. And then, but again, it's also the consumer expectation, right? I mean, I think you know how LLMs work. And I feel like I have a pretty strong understanding of how LLMs work, but not everyone does. And so if I'm given this, again, singular response, a lot of people just view that as definitive. Right. Versus the SERP, which is a set of options. And I can kind of use some sort of, you know, reasoning and intuition around which I should trust and which I shouldn't.
[21:31]
Alfred Sukher
Absolutely. You can raise the trust in an LLM when you know what data it has been trained on and what knowledge it has access to. And you can trust it a bit more if it tells you every time, okay, I have this knowledge from this source, and so on and so forth. Exactly like search. Like you mentioned, search is giving you the sources directly in front of you so you can pick and choose what fits your need the best.
[21:55]
Eric Soufert
So I think when you talk about in-housing, like, you know, a rig to do model training, I think people probably get a little scared by that. Right? That could be expensive. Talk to me about the range of options here. So if someone wanted to approach data sovereignty, let's say it's a startup, but the startup is like a legal startup or something, which, I mean, I had this case exactly. So I got pitched by a company that was doing a legal copilot kind of thing, which, I mean, there's a lot of those that exist already. But that was my first question. I'm like, well, you're going to have sensitive legal documents, you can't just send them to Gemini or whatever via API. And they're like, oh, well, you know, we're going to build out our own infrastructure. And I was like, well, okay, you're kind of glossing over a pretty big undertaking here. But talk to me about the range of ways you can implement this. Like, what's the, I don't know, the lightest touch, you know, appropriate for a startup that's handling sensitive data, to the sort of most bulletproof, ironclad, certainly-compliant-with-every-possible-regulation approach? What does that spectrum look like? And then maybe could you cost it out, if you can.
[23:02]
Alfred Sukher
Oh, costing it out is a very tricky question. I'll go there. I would say, first of all, once you have the hardware, you host your own model and you have your own data sources for that model, you are 100% ironclad. You can use it offline. Like, I have my server here and I use it offline, and I can have AI talking to me and I have my processes running, even totally offline. And once you go offline, you know you're ironclad. Of course there's always a chance, but it's close to zero, that anybody is going to hack your data sources or change the model or feed the model some information that is not correct or not fit for my use case. So you're protected once you have the full stack, first of all. Second of all, about sizing: it's about the throughput. So if we have a startup of, let's say, five people, five people are using the LLM on an average basis, like daily, and we have two or three agents running, one for customer services, one for sales, let's say, and one for HR and legal. You can relatively cheaply, I would say, build a very good cluster that will handle all the load and have it all in-house, and that will literally save you money in the long run because you're not going to pay for the API anymore. So there's no token generation cost, there are no data privacy issues. And with that, if you look at it long term, the on-prem prices are much better than being literally dependent on an API that's going to run your HR processes, let's say, or your legal process, which you can't actually... let's talk about a business process that has no privacy issues, and even that is a dependency. So it is a risk to depend on anybody else who's going to rent you an employee or intelligence or run a process for you on somebody else's servers. Knowledge is competitive advantage. How you do your things is your own way to win in that certain area or industry that you're working in.
And protecting this way, how you do it is actually the biggest asset that you have. So turning LLM from a tool to infrastructure is literally the word here. How much you pay for this infrastructure is going always to be amortized with time. So it's going to be an asset, not always something you have to pay as rent to access to.
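
The amortization argument can be made concrete with a back-of-the-envelope break-even calculation. This is a minimal sketch, not a costing of any real setup: every number below (hardware price, running cost, token volume, API price) is a made-up placeholder, and the function names are hypothetical.

```python
import math

def monthly_api_cost(tokens_per_month, usd_per_million_tokens):
    """API spend for one month at a flat per-token price."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def break_even_months(hardware_usd, monthly_running_usd, monthly_api_usd):
    """Months until the up-front hardware cost is recovered by avoided API spend.

    Returns None when the API is no more expensive per month than running
    the hardware (power, hosting, maintenance), i.e. there is no break-even.
    """
    monthly_saving = monthly_api_usd - monthly_running_usd
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_usd / monthly_saving)

# Placeholder inputs, chosen only to make the arithmetic easy to follow.
api_usd = monthly_api_cost(tokens_per_month=200_000_000, usd_per_million_tokens=10)
months = break_even_months(hardware_usd=30_000, monthly_running_usd=500,
                           monthly_api_usd=api_usd)

print(api_usd)  # 2000.0
print(months)   # 20
```

With low token volume the function returns None, which matches the point made later in the conversation that hosted APIs can still be the cheaper choice for small teams and prototypes.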
[25:38]
Eric Soufert
Yeah, I think that's an important consideration even beyond the privacy question, because, I mean, you look at a lot of these, you know, people kind of dismissively call AI-focused companies "wrappers", right? So that's sort of a dismissive term, even though a lot of times there's some real tactics built on top of that. But nonetheless, if you're paying per token and you're losing money on every query, well, as your company grows, you can't make that up on volume, right? That's the old joke from the web 1.0 days: we're losing money on every user, but we'll make it up on volume. And actually there's a certain scale where the economics just become much more attractive when you are in-housing this. And again, it's not even from a regulatory standpoint or a data privacy standpoint. It's just from a usage standpoint and a volume standpoint. And you'll hear people kind of throw around RAG as just this easy approach to utilizing these third-party models for these business use cases. But okay, RAG just increases the token throughput. And people say, well, okay, maybe I don't even need RAG then, because the context window is essentially infinite. Okay, so you're just going to send all this plain text data? Then you're just talking about expanding the throughput even more. I mean, that's going to get really expensive. So the ease of use of some of these APIs is actually really problematic from a unit economics standpoint, again, even ignoring any of the vulnerabilities with privacy. At some point the economics flip to supporting in-housing some of this hardware and hosting a third-party open-source model yourself, because you're just paying a lot of money to use the API and a token...
[27:30]
Alfred Sukher
Absolutely. Generic, like, web-based LLMs win with small teams, non-sensitive workflows and early prototyping. So once you see it works on the prototype side, you can just in-house it. APIs scale linearly with usage. On-prem scales once: after the hardware is owned, marginal inference cost is going to be added on top, but that's it. And then you can scale infinitely, literally, with your own infrastructure. And that's the shift in everybody's mind that is actually happening today. Companies are overspending on web-based LLMs because they are treating them as SaaS, where it should literally be treated as infrastructure. This is something that you need to own, just like your own laptop, your own screen and the things that you use daily.
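
The token-throughput point raised above (RAG adds context to every call, and sending the whole corpus adds far more) can be sketched as per-query arithmetic under per-token API pricing. All token counts and prices here are hypothetical placeholders, not any provider's real rates.

```python
def cost_per_query(prompt_tokens, context_tokens, output_tokens,
                   usd_per_m_input, usd_per_m_output):
    """Cost of one API call: input (prompt + attached context) plus output."""
    input_cost = (prompt_tokens + context_tokens) / 1_000_000 * usd_per_m_input
    output_cost = output_tokens / 1_000_000 * usd_per_m_output
    return input_cost + output_cost

def monthly_cost(per_query_usd, queries_per_month):
    """Per-token billing scales linearly with the number of queries."""
    return per_query_usd * queries_per_month

# Hypothetical context sizes attached to each prompt.
strategies = {
    "prompt only": 0,          # no extra context
    "RAG": 4_000,              # a few retrieved passages
    "full context": 500_000,   # the whole document set, every time
}

costs = {
    name: cost_per_query(prompt_tokens=200, context_tokens=context,
                         output_tokens=500,
                         usd_per_m_input=3.0, usd_per_m_output=15.0)
    for name, context in strategies.items()
}

for name, usd in costs.items():
    print(f"{name}: ${usd:.4f} per query, "
          f"${monthly_cost(usd, 100_000):,.2f} per 100k queries")
```

Doubling the query volume doubles every figure, which is the "scales linearly with usage" behavior; owned hardware instead has a fixed cost plus a comparatively small marginal cost per query.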
[28:13]
Eric Soufert
Well, particularly when it's core to the business. You could view it as, well, it's just AWS. I don't want to own my own servers. What's the point of that? But it's actually not, because it's the logic layer of your business. So it's more fundamental than just the web server, where you could kind of pick and choose. I mean, you're talking about the actual model that empowers your business. So it's much more downstream than just, well, it's a commodity, I could just pick a server. No, it's much more integral to the operations of your business than that. It's genuinely empowering your business. And so it's just much more critical from that perspective. And to your point, you then need to think about things like consistency, right? Even putting aside, again, the regulatory stuff and the privacy stuff, just consistency is important, and you get access to the specific models, right? So if there's a big model update, you don't necessarily have to transition to it. But again, the consistency of that, if it's a shared resource, is not guaranteed.
[29:14]
Alfred Sukher
Of course it's not guaranteed. You don't know when the prices are going to go up. On the other side, consistency is a big problem. Where do you want to load your data? You mentioned RAG. RAG is a very powerful technique to turn any knowledge into intelligence. And where do you want to feed all this information? Let's talk about the SOPs of any organization. This is the core identity of the organization, how they do their work. And if you want to share this with any generic model, the data will be there, the model is trained on it, and others can actually extract this knowledge. It is going to be extracted with somebody else's prompt at some point, if they really scratch the surface enough to get to the information that you fed in with your RAG. So it's not old-school computing, it's the new way of computing, literally. Inferencing, token generation, needs a lot of context, and this context is your business. You don't want to share your business with anybody else. And this is your asset. This is the biggest asset of any business. So keeping it in-house is the best, or the only, way I see to minimize this risk and stay competitively advantaged for the future while retaining the information, retaining the data. When you are using this kind of technology, you're retaining the data that you generate while using it and then re-feeding this data to the machine, to the model, to tell it, okay, see, this is how we talked for the past year. I would like you to think and act in this particular manner way more. And by that you will just take a turn: let the generic LLMs be the generic LLMs for everybody else, and then have your own LLM that will actually think like, and be, the business with time. And this is creating an even bigger competitive advantage for you in the future.
[31:49]
Eric Soufert
Did you read that paper from, I want to say it was Thinking Machines, or it was a blog post, and they talked about why chatbot output is not deterministic even when you set temperature to zero? And it was because of, like, the load balancers. It was just a very sort of, you know, prosaic hardware issue. Nothing to do with the model itself. It's like the load balancers had, you know, different levels of throughput or whatever, and so that impacted the inference. I said, well, that's utterly outside of your control. I mean, there's nothing you can do to defend against that.
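
One low-level way that serving conditions can change output even with greedy (temperature zero) decoding is that floating-point addition is not associative: if the same numbers are summed in a different order, the result can differ, and a different result can change which token scores highest. The sketch below shows only that arithmetic effect in plain Python. The magnitudes are deliberately exaggerated, the token scores are invented, and it is not a reproduction of any provider's serving stack or of the post mentioned above.

```python
def sequential_sum(values):
    """Add floats strictly left to right, as a simple accumulation loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

def greedy_pick(scores):
    """Temperature-zero decoding: always take the highest-scoring token."""
    return max(scores, key=scores.get)

# The same three contributions to the score of the token "red",
# accumulated in two different orders (hypothetical values).
order_a = [1e16, 1.0, -1e16]   # the 1.0 is lost when added to 1e16
order_b = [1e16, -1e16, 1.0]   # the large terms cancel first, so 1.0 survives

red_a = sequential_sum(order_a)
red_b = sequential_sum(order_b)
print(red_a, red_b)  # 0.0 1.0

# "blue" has a fixed score; only the summation order for "red" differs.
print(greedy_pick({"blue": 0.5, "red": red_a}))  # blue
print(greedy_pick({"blue": 0.5, "red": red_b}))  # red
```

In real inference the differences are in the last few bits rather than this large, so a flip needs two candidate tokens to be nearly tied, but the user of a shared endpoint has no control over the order in which the sums are carried out.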
[32:22]
Alfred Sukher
Literally nothing you can do. And the telemetry point, the load balancers, everything else that your data will go through there. Even if they say, no logging, your data is not going to train our model, it's still going to go through their load balancers. And that's a weak point for your data. You don't want your data to go through there.
[32:38]
Eric Soufert
Talk to me about the use cases that you see most frequently for this maybe where they perceive the utilization of these kind of third party tools via API as being particularly risky or just presenting a vulnerability. What are the most common use cases that you're seeing?
[32:57]
Alfred Sukher
Well, we're seeing literally two very distinctive industries or categories, I would say. First, regulated data: everything like healthcare, legal, finance, government, energy, anything with regulations on the data. We are seeing a lot of interest there. A lot of industries want to retain their data; they don't want to share it with anybody else. The ones that are very vigilant about it are the ones that are working in regulated data industries. The second category is proprietary processes, I would say: internal SOPs, manufacturing steps, underwriting logic, risk models. They are the crown jewels of any business, and no business wants to share that with any other server, anybody else. This is how your business is running, and you don't want this recipe to be shared with any other, let's say, database that's going to keep it. So these are the first two categories of businesses, I would say. And then the third category of use cases that we are seeing is knowledge retention. We are working with a lot of organizations, businesses that want to build their own knowledge source that is an LLM trained on their SOPs, trained on their way of doing things, that will answer exactly the way any senior employee at that organization, dedicated to or focused on one area in that business, would. Let's say a senior manager in HR. We will need some model that's going to be trained on HR data and their usage data. At some point we will have an HR model that literally knows the rules, knows how this organization would answer any given question regarding HR. And these are the three use cases that we are seeing: any regulated data, any processes that are proprietary, and then the knowledge retention use case, where everybody will want, or is wanting now, to build a knowledge base on an LLM. An LLM-based knowledge base, if I can say that.
[35:07]
Eric Soufert
So have there been any use cases that surprised you? I mean, I'm talking about maybe specific verticals. Like, healthcare seems like an obvious one. You know, legal-related work seems like an obvious one. Anything that surprised you, that you wouldn't have expected?
[35:20]
Alfred Sukher
We are having a pilot with a company in South America that is literally preserving the Amazon. So, we're going to say it's agriculture. And they don't want their data to be shared with anybody. They want to create their own LLM that is trained on their own farming and forestry techniques. And that was a surprise for me. Even they don't want to share their data, so how about legal and healthcare?
[35:47]
Eric Soufert
Oh, interesting. Yeah. They don't want to reveal which parcels of land they're about to acquire, I guess.
[35:52]
Alfred Sukher
I think so. Or, like, the model, or how we worked on it, is about which seeds are going in. It's literally agriculture: what seed, how do they maintain it, how do they turn large portions of land from literally brown to completely green in a very short period of time? So this information, this knowledge, they didn't want to share that. They don't want this knowledge to be shared with anybody else because that is their competitive advantage: regreening or reforesting the Amazon.
[36:26]
Eric Soufert
Do you see a particular geographic skew? I mean, I imagine for a lot of companies, when they think about European operations, this is very relevant to them because there's just more of a regulatory requirement in the EU with the GDPR. Or is it pretty universal?
[36:44]
Alfred Sukher
I think it's pretty universal. Everybody will want there. But how I started: I started talking to ChatGPT, it was ChatGPT 2 maybe or 3, as a friend, and at some point I realized that I'm talking to this ChatGPT with my data way more than I should. So I realized we need something that's more private. So I think it's more universal. I think the Europeans are two steps ahead with these regulations that are going to protect your personal data, or are trying to protect your personal data, with GDPR and everything around it. And I think every country and regulator should pick up fast on this trend, because end users are not very vigilant about it and they are sharing way more data than they should. And this is literally, Sam Altman said this. Not only me, but Sam Altman said this in a podcast, saying stop sharing personal data with ChatGPT. It's a risk for everybody, personally, individual or business.
[37:40]
Eric Soufert
Could you foresee a consumer use case? Like, maybe not obviously an on-prem, self-hosted chatbot, but I could imagine a company just processing everything on device, right? So you imagine, hey, look, you know, this is going to be a lower-fidelity model, right? Because it's got to run on your iPhone hardware. But you know that it's not going off to our server. It's just by its very nature more private. I mean, you hear a lot about on-device processing in the privacy context. And I'm just wondering when we get there. I mean, I think we probably will get there with chatbots. But what's the kind of future for that? And again, not an enterprise use case but just a consumer use case, where it's like the DuckDuckGo of chatbots saying, we run entirely on your iPhone. Nothing leaves the iPhone, it's all on device. What do you think about that?
[38:34]
Alfred Sukher
I think it's possible today. That's what I use at home, to be honest. This is my own device. It's my little on-prem server on which I can run LLMs up to 400 billion parameters, which I don't need, by the way, it's too big. But I think it's possible today with the hardware that we have. The question is what use cases you want it to be used for personally. So we are developing something like a second brain, where you can literally upload all of your information and then you will have an LLM that is literally trained on you, that is living in your own server, so you can talk to it about anything that you can think about and it will give you an honest opinion and consultation. And once you don't want anybody to see this, you can literally throw it in the bathtub and water will take care of all the information that you shared with it. And it's not a big box, it's an AI appliance, literally, to put in your own network, and then you'll have access to your own AI and you can use it even without an Internet connection. And once you have that, you'll have a companion. And I think in the future AI will be as personal as your mobile phone. It should be like that, because everybody uses their mobile phone differently. Even though it's the same iPhone, everybody has different, let's say, apps on there or ways to use their phone. It's very personal, and AI is not going to be anything different. It's going to be literally very personal. It knows you, knows your specific ways of handling life, your emails in the morning or in the evening. This is something very personal. But then AI will handle your emails and needs to know how you prefer to go on with your day, and this information, for me personally, even if there are no regulations about it, I don't want to share with any big tech. I want this information to be mine. Just like my phone knows exactly when I wake up in the morning, but not Google, right?
[40:33]
Eric Soufert
One thing I'm curious about is what kind of coordinated threats you have seen begin to emerge here. I'm talking about, you know, cyber crime rings. We talked briefly about the Reddit case, but have you seen, I don't know, organizations out of North Korea or China begin to start attacking these shared LLM chatbots in any specific way? Has that become a perceivable issue at this point?
[41:04]
Alfred Sukher
I think I read that on Anthropic's page. I think it was the 13th of November 2025, so it's about a month and a half ago. Yeah, there was an orchestrated cyber espionage campaign through Anthropic, and that was the first, let's say, organized AI espionage crime in that space. And it's on their website. I think you can read it. It's very, very interesting how these LLMs can be weaponized on that big a scale. So if I want my personal, my private one, I don't want it to be weaponizable. I don't want anybody else to have access to it, and I want my data secure.
[41:49]
Eric Soufert
Alfred, this was fascinating. How can people learn more about you and learn more about what you're doing?
[41:53]
Alfred Sukher
Novanaughts.com. We are trying to make sovereign AI available for everybody. Yep, reach out on LinkedIn anytime. I'm very happy to have any interesting conversation around how we can make AI as personal as your phone.
[42:07]
Eric Soufert
Perfect. Thank you very much for your time today.
[42:08]
Alfred Sukher
Thank you very much, Eric.