Gabriel Custodiet speaks with Milan Dereede of NanoGPT. The website allows you to access hundreds of popular AI services without creating an account, with an affordable pay-per-prompt service, and with some privacy benefits. Milan explains the details...
Loading summary
A
This is Gabriel Custodiet of Watchmen Privacy. I know why you're here. You're looking to escape the technocratic apparatuses that you see slowly enveloping you and restraining your freedom for the fundamentals of privacy. You should start by visiting escapethetechnocracy.com to see my video tutorials, books and other resources for getting off the surveillance grid. Watchmen Privacy and Escape the Technocracy are leading the fight for privacy. And unlike just about any other show, we practice what we preach. Private payment options, no threat modeling, no status or collectivist solutions, and no sponsors ever. You know what that means? It means we can speak the unmitigated truth as we see it. Your support alone determines the future of this show. Go support you and me at Escape the Technocracy.com I'm very pleased to be joined by Milan Derrid and he can pronounce that in Dutch if you'd like. And he is the co founder of a really useful service called Nano GPT it and if you don't want to look up the text of that, that's nano GPT.com. now this is an AI service. It's really a kind of conglomerate AI service, pay per click. Pretty useful. We mention it in our AI Resistance course as a way of getting a feel for a lot of the big models out there without having to create an account with OpenAI or Claude. Right? You don't have to create an account, this is your intermediary. Without having to pay for that service and all of their restrictions for doing so. And without, you know, therefore giving up some of your data to those particular organizations, you are still, when you use Nano GPT, if I just pull it up here, I can see that basically what it offers is I can type in a prompt on the website and I can use any number of AI services that I would like, whether that's Claude, whether that's, you know, Google's AI, all sorts of things. And when I'm doing that, I'm still using that service, right? Google or OpenAI. I am however, using it through the intermediary of OpenAI. We'll get into these details. Maybe that was too specific for starters. So Milan, welcome to the show. How are you doing?
B
Thank you. Yeah, I am doing very well. I really enjoyed a very good explanation of what Nano GPT is and what we do. So yeah, very, very happy to be here and very nice to hear that you reference us more often.
A
And I'll add a couple things here that you can use the service as a Guest. It's basically, it is the kind of website that I would, if I was trying to design a user friendly, non coercive website, I would do it in this way because it allows you to use the service as a guest. You don't have to have an account. You can just kind of top up the service and use it and then exit if you would like. They accept cryptocurrencies, including Monero, which this audience will be happy to hear. So it's a very intuitive service. And let's just start with what's. And you can talk a little bit about your background in this question, but what led you to create Nano GPT?
B
Yeah, so also it's very nice to hear that this is how you would design the service because that's kind of how we try to think as well. Like how would we like to have a website work and then we can build it that way for ourselves. So that's very nice to hear that we're getting there for you. But yeah. So my initial reasoning for building this is when ChatGPT came out, I was kind of. I got obsessed with it, to be honest. And I wanted to learn how to do programming and whatever, but I wasn't very good at it. So I started using ChatGPT to teach me programming. But then you needed to have a subscription. Back then, I think there was no free model. Like there was no freemium service. You just had to have a subscription. And when I was talking to other people about this, I realized like they did not have access to the paid one because in some cases they were in like an unsupported country. In some cases they didn't have a credit card. In some cases they just didn't want to pay 20 bucks because them that was quite a lot. So I wanted to start something that could share that same magic. Like it sounds very corny, but I really thought it was magic. Share the same magic of having this help this assistant available for you, but make it in a way that's more accessible to everyone. So you don't need subscriptions, you don't need credit cards. So that's what I tried to build. And I built it as a Telegram bot at first. And then my co founder, Huggy, he reached out to me and he said, okay, like the Telegram bot is very cool, but I think more people would use it if it was a website. And then we turned it into a website and we started adding more different models, image models, video models, just a whole lot more stuff. But yeah, the start was just, I Wanted to like, have more people be able to access these, these AI models.
A
And so that gets to another question here, which is that, and I'll use this to compare to the another AI service that we talk about on here, which is Venice AI. Now, what Venice is doing is they are running open source models. So they're running open source models. Nano GPT is running all of the models, including the closed source models. And so Venice is basically running their own models. And they can therefore have certain privacy claims about that. And they're restricted to the open source models with nano GPT, maybe you can explain this. My understanding is that we are using the official ChatGPT or Claude or Gemini, but we're using it through the intermediary of your service. And therefore we should be cautious about the things that we're discussing and uploading onto even nano GPT. And also how exactly does that work? Like, how is this not against, I guess, the terms of service of some of these companies?
B
Yeah, yeah. So you're actually correct. So what we do is we use the API of these different services. I'm not sure if your audience is familiar with APIs, but essentially we can use the models through essentially a computer interface. So we can send prompts via code rather than having to go to the website and doing it in a browser. So if someone is using ChatGPT via our website, every prompt that's done, we send a query to ChatGPT, to the OpenAI API and we get a prompt returned. So you're right, they still do get the prompt. So if you were to say in your prompt like, hi, my name is Gabriel, I live here, then they would still know that information. So what we sort of add in terms of privacy, I guess, is all of the requests that we do to OpenAI come from just one location, one IP, one user. There's no way to tell for them whether it was you or me asking the question. They see absolutely nothing. They don't see a name, they don't see your payment details, they just see the prompt. So it adds a sort of layer of privacy, at least. And so it's not against their terms of service and that of the other services because we just use their API. So every call that we do, we also pay to OpenAI and to these other services. So for them it's actually kind of good. They just get slightly less data. But like they, they are quite okay with that, I guess. But yeah, so that's what we offer. And that does differ from Venice, as you said, because Venice doesn't use OpenAI and Anthropic, they only do open source models and then they have their own provider that runs these models for them, like just on graphics cards, like GPUs. It kind of limits them in a way because it's only the open source models, but they have more control in a way. So we also have the same open sour models that we offer on our service, but we just use like a regular provider. They don't run them specifically for us, they run them for us. And for many other entities that want to use these models as well.
A
That's a great explanation. So obviously you're a privacy and sovereignty seeker. You want to at a base level be self hosting. That's kind of what we teach in the AI resistance course. However, there are limitations to self hosting. Right. And you know, a lot of people, they just want to get a little flavor for all of the options that are out there and they don't want to create like I couldn't even create a Claude account. I had so much trouble just doing that. And Urban, my colleague was, was getting kicked off of ChatGPT for using a VPN, which is a no go for him. So we want to be able to experiment with some of these things. Obviously realize that, you know, we're still communicating with a big tech product in a way. We want to be able to experiment with these. And when it comes to things like images, AI images, it's a lot more difficult to do it in a self hosting way. So that's the use case I see for Nano GPT is for the people who wants to experiment and want to use it in a way that they realize they're not doing too sensitive of a task in doing so. So as you mentioned, the other useful thing about this is that it's a pay per go service as opposed to spending whatever $60 on midjourney and not even using a fraction of that in a given month. People can just top it up with a credit card with cryptocurrency and then pay as they go. And you have a way of charging per prompt and there's all these different comp. Every prompt you can see here has. It has a different estimated price. So how does that pay per price system work and what are some of the ranges of prices? And walk me through that system a little bit.
B
Yeah, so essentially the way the payment, like the price per prompt is decided is every different model has a cost per input and per output word. So let's say you use a very expensive model and you throw in like an entire book of inputs and it does a very big output, then it's going to be quite expensive. And if you use a very cheap model and you say like, hi, and it says hi back, it's incredibly cheap. So it's the per model cost and then input and output, that's literally all it is. And in terms of a range, it's a very broad range of how expensive a prompt can be, because some of our models are just insanely cheap. Like you can do 10,000 prompts for $1 for some of these models. So we have people that use our API, we also offer an API for I don't know what, but they're doing thousands of prompts and it costs them like literally just a dollar a day. So those are the very cheap ones. And then we also have, for example, Zero1 Pro, which is the sort of top premium model by OpenAI. It's the most expensive model ever developed. So then people sometimes literally spend like $5 or $10 on a single prompt, which I think is incredibly expensive. The only other way you can get access to that expensive model is with the $200 a month subscription to OpenAI. And even then you're limited in a way. So I guess that's why it's so expensive. But yeah, so it's kind of hard to say what a range is, but on average what we see is a prompt costs about $0.01. So if you're like a regular user, you're going to be paying roughly $0.01 per prompt on most of the top models that people know, like ChatGPT or Claude or Gemini, the Google model, all of those will. Will be roughly $0.01 per query.
A
Yeah, it's a really good system and for a lot of people, they are not going to have created a account on Gemini. They're not going to have paid that $200 a month for that really premium AI service. But maybe they want to. Obviously they want to check it out. Right? Somebody's charging $200 a month. You want to know what that's all about. And you can get a taste for that while using Nano GPT. Certainly a good tool for researchers, experimenters. Let's get into some of the sophistication of the AI. I imagine you've played around a lot with these. With what do you see in the difference between, let's say, the lowest model versus the highest model? Can you really tell a difference? Are the most expensive ones really a game changer in your mind?
B
Yeah, it's kind of funny because if you look Back like a year ago, the models that we had are so much worse or were so much worse than the models now. But like at the time they were the top of the line models, right. And we were quite happy to use them. So yeah, it just keeps developing in a way. But I would say I notice a big difference in some tasks. So for example, when I'm programming, when I'm coding, I want to use the top models, which for me is Gemini 2.5 now or 01 Pro, like the top OpenAI model. Just because if I'm programming it's going to be difficult enough that I want to have the very best model. I recently had some medical questions that I wanted to ask and then, then I'm going to use also just the very best model. Sometimes I just want to do some very simple information lookup and then a cheap model is fine. Right. So it kind of depends per task. For some tasks you will not notice a difference if it's just a simple lookup of, I don't know, if you use it as like an alternative to Google and you just go, what's the capital of Botswana? You can just use a cheap model.
A
Right.
B
But for the more sophisticated stuff, I would say the expensive models, you do notice the difference. One thing that some people use our servers for, that we know of, is role playing, like storytelling role playing. And there apparently people notice a lot of difference between the models, but just in terms of how they sort of how they feel, I guess, like the style of their output. So then people really like Claude and Gemini 2.5, but they really dislike ChatGPT, for example. So to an extent it's personal preference. To an extent, it's very much dependent on how difficult the task is. The more difficult, the more you're going to notice, of course, like how good a model is.
A
I'm just scrolling down. You have tons and tons of different models. We have Claude, some that I've. I'm not familiar with, Cohere, we have Deepseek, we have Daobao, I guess some of these are Chinese. We have Gemini, glm, the list just goes on and on. Are there any that you don't have for any reason?
B
I think almost literally every model there is. We have actually the GROK models right now we don't have because there's some unclearness to what extent we are allowed to use the API for this. So right now we cannot offer Grok and Grok3, which is the one that's used by Twitter, by X. They don't have an API yet, so we're not able to offer that one yet. Aside from that one, I would be hard. Like there are always going to be some tiny models that people are developing that we don't have, of course. But yeah, I'm pretty sure if you ask people to name 10, 20, 50 AI models, the top 50, we have all of them. And then yeah, as you say, a lot more as well. I think we're at like 150 models now is my rough estimate. Text models that is, and then some image and video models as well. We also have people requesting models, so if we ever miss a model, people just come into our chat and they like we have a model requests channel and people just ask like, can you add this model? And nine times out of ten we can do it quite quickly. So yeah, I would say, yeah, almost, almost everything you can think of, we, we have the model and it's just.
A
As simple as basically clicking the box here for which one you want on the fly. So it's obviously very handy. Now does our ownership of the stuff that we create using these change at all, given that it's an API or.
B
Yeah. So in most cases the terms of use for using the API is actually better than if you have a subscription to the service. So both in terms of data retention and ownership of what you create, it's usually better if you use the API, which is what we do. So for example OpenAI, I believe if you have like a subscription with them, they store the data for quite a while, I'm not sure how long. Whereas we can say like we want as little data retention as possible via the API and then they store it only for the minimum amount of time and only for like if law enforcement comes or something, they don't use it for training, they don't use it for anything else. So yeah, I would say in general it kind of depends per provider as well. But what you create is yours. Also with the image models, anything you create is yours to be used in any way that you like. And in general that is better using the API than if you actually had a subscription. So it's kind of a nice bonus I guess of using this.
A
Right. And I want to be clear for the listeners, if you're super privacy conscious, then you really shouldn't be doing these using NOW GPT for sensitive things. That doesn't mean that there are other non sensitive things that you might want to use it for and get a real taste for the landscape and especially the imagery I think is very useful. What about Milan, your own privacy policy on nano GPT hit us with the bad news or the good news. What is everything that nano GPT knows about its users?
B
We try to write our privacy policy in quite a clear way because essentially we store as little as possible. So we don't do cookies even, which would make our life a lot easier if we did. We discovered recently because at some point we want to do advertising and stuff, and we just have no clue where our users are even coming from. So we don't store IPs, we don't store conversations, we don't store prompts. If you pay with crypto, there's no way we can even see who you are. If you pay with credit card, we use stripe, and stripe forces us to store it. There's no way we can say, please don't show it to us. They just show it by default. So what can we see? We can see a user id. So if someone does a prompt, what we can see is this user ID is using this model. And we can see that because we are paying for the tokens, so we are paying for the API usage. So we get a return from the API saying, you use this many tokens on this model at this time. So we can see like, okay, this user ID did this model at this time, but we can never see the actual message that you're sending, nor do we know the ip, nor do we pass any of that information on to any of the providers. So is there anything else that we do know? Yeah, so if you pay with crypto, we see the address that you send from, of course, except if it's like Monero or zcash, we really just try to store as little as possible and use as little as possible. Also, because I feel like sometimes there are companies that they will never use your data, they say, but then they do store it in some way. And that kind of always leads to a risk of them, like being sold at some point or, I don't know, private equity coming in or venture capital and wanting to monetize the data. So we just try to store, like, literally as little as possible. I'm trying to think if I forgot any.
A
What we expect is we expect that companies are going to act in this way, but also we understand that we are responsible for hiding our IP address. We're responsible for the data that we input onto these platforms. And we need to be selective about how we do that. And if we really want to go about it in a private way, we can use Tor, we can use something like Monero. Don't create an account, all this good stuff. So we have that responsibility as well and we should take that on as users. So tell me about the difference between having an account and not having an account. Obviously it's very alluring to go on here. Oh, I don't have to create an account, I just top it up and then I can use it. But then everything kind of just disappears into the ether. What are the advantages of having an account and what do, what do we have to give to create an account?
B
If you create an account, you can always log in from another device and your balance, like anything you deposited, will be usable from that other device as well. Whereas if you don't make an account, we store it locally, like literally in your browser. We store sort of the connection between your balance and your browser. So as long as you don't clear your cache, your cookies, whatever, you will always have access to your balance. But because we have no way to connect it to, for example, your phone, if you check out Nano GPT on your phone, there's not going to be any link to the session that you have on your laptop. So people can create an account and then you can link the two sessions so you can use the same balance on both. And you can use literally any email address. We also have login with Google, but I guess if people like their privacy, that's less welcome. But you can use literally any email. You can use temporary email, you can use Proton, you can use anything you want. We don't block any email provider. We Also don't block VPNs so people can log, you can use Tor, you can use vpn, whatever and use our website. So I would say if you want to be able to share your balance between say your laptop and your phone, just get an anonymous email and log in with the anonymous email. Use the VPN on both devices. And then I would say you're pretty much as private as you would be if you had just the session. And you're probably like get some extra ease of use out of having the same balance on both devices. Just to be clear, actually I should add we don't store the conversations and prompts anywhere on our servers. We only store them locally in your browser. So even if you create an account, there's no way to access your conversations that you had on your laptop from your phone because we don't transfer between the two, because we don't store anything.
A
Here's an interesting scenario. My midjourney account, I recently just let it expire because I wanted to See if I could make nano GPT work instead. And obviously I get to use more than just mid journey if I do that. And it's a little bit costly mid journey. Right. So if I switch to nano GPT, are there any kind of rules of thumb where if you're using this many prompts, then nano GPT might be worth it?
B
Yeah, no, it makes sense. Depends. Is it 50 to 100 images a day or like per.
A
No. Per month.
B
Okay, yeah. Then we're like, definitely going to be cheaper. So I think sometimes it does make sense for people. If you don't really care about giving your credit card information, it sometimes does make sense to just have a subscription to, say, ChatGPT or Claude or whatever, if you're using it a lot. Like if you're using it for, say, programming or something, and you do hundreds of prompts every day, then you're sometimes going to run into rate limits on ChatGPT or Claude as well. But if you're getting to the level where you're hitting the rate limits constantly, then it's probably going to be cheaper to have a subscription than to use us. At least if all you want to do is use that one model from that one provider, then it could be cheaper to just use the provider themselves because let's say on average it costs $0.01 to do a query. So if you do like 500 queries per day to ChatGPT, it's like $5 a day on average, and a subscription is probably cheaper than $5 times like 30 days in a month. What we find is people actually do way fewer queries than they think. I think the average spend per month is like a few dollars on our service. And the sort of subscription then is not worth it for most people because it's like $20 just for ChatGPT. If you also want to have Claude, the premium models, then it's another dol. So, yeah, the situation in which it would make sense to have a subscription, like leaving aside the privacy stuff, is if you just use it nonstop and if you do huge prompts constantly to one particular model.
A
But for most people, as you say, they could probably benefit financially just from giving nano GPT a try, let their account kind of expire and then just see if you can get a cheaper rate on nano GPT. For somebody migrating from midjourney to nano GPT, obviously I'm now using the nano GPT interface and Midjourney has all these settings. Change the resolution, zoom in, zoom out, pan, all these sorts of things. How much of that is replicated on nano GPT.
B
Yeah, so we want to replicate more of this because for many of the image models we do let people play around with the settings, but we don't have a sort of two step process for midjourney yet. So people can generate using midjourney, but then doing the, the variation or whatever, we don't have that implemented yet. We do want to implement it also, doing image to image, for example. But right now all you can do with midjourney is just generate images. You get four images every time. Just like in midjourney with some of the other models. There's a lot more customization, but yeah, we don't have it yet on midjourney, unfortunately. It's something I quite want to add, but we only added midjourney like two weeks ago, I want to say. So yeah, it's not the most well developed image model we have and so.
A
People should expect that it won't be. It might not necessarily. Your AI model might not necessarily have every single feature if you're using Nano GPT. That doesn't mean you shouldn't experiment and see if it still works for you. I think, Milan, you'd be a good person to ask this. Maybe you should have asked this initially. The privacy abuses just overall by AI industries, how bad is that in your mind? And are there any particular instances or companies or anything that stands out to you?
B
It's kind of interesting in a way, right, because the reason these models are so good is because they can sort of vacuum up all of the data on the Internet. So I find it hard to judge that just because I also get so much use out of these models. So for me it depends if it's like on the Internet and they train on it. I'm like, okay, it's kind of public already in a way where I'm a bit more concerned. But I don't think we've actually seen big cases of this so far. Is all of this data is getting stored, like for example, let's just say chatgpt. They are building up a huge sort of treasure trove of all the questions that people ask to a model associated with a person's name, ip, credit card details. Like they can see you, for example, looking up medical requests or something and they'll know it's you. Right. They can link it to your credit card and everything. And right now nothing is done with that information or with the link between you and those prompts. But it is quite a tempting sort of data treasure really. So I can imagine at some point, they might be tempted to either sell it or to share it with intelligence services who say, like, okay, this could be very useful for us. But I don't actually, I don't know know, maybe I've missed it if we've had any big leaks yet or any big, like, sales of data from that. But that is my main fear. Like, not so much what they train on, but just all the data that they're gathering.
A
You hinted at this before. Let's just return to this for the moment. So one of the privacy benefits of using something like Nano GPT as opposed to ChatGPT, is that the prompts that we input are not necessarily links one to the other. Is that correct? How does that work? And how do we still have the conversation within AI where it's trying to understand what we said previously?
B
Yeah, so how all of these AI models actually work is every prompt that is sent is completely fresh and new for the AI. So for example, even if you're having a back and forth, like you say hi, the model says hi back, and then you say how are you? For the model, it just sees how are you? And it sees a history appended of the high and high back and forth, but it doesn't actually know it's still the same conversation, which is kind of weird in a way. So every time we do a prompt, if you are having a back and forth conversation, we have to pass it the history as well, because otherwise it just sees it as a fresh instance. So the way we do it is within a single conversation. As long as you stay within that conversation, we send along the history of that conversation. So if you're going back and forth with a model, we send along what you said previously and what the model said previously. Because that's usually quite useful, right? Like if you're going back and forth about your code or about an idea that you're building, you want it to know what you said three prompts ago as well. All of that is only the case if it's within a single conversation. If you click New chat or just delete the prior messages in the conversation, then it's completely gone for the AI model the next time that you ask a question. So if you have two chats open at the same time, in one you say, hi, my name is Gabriel, and in the other one you ask, what's my name? It's not going to have a clue what your name is because they are two completely separate instances for the model.
A
It was my mistake. I was thinking, okay, so one of the privacy benefits we get is that at least our prompts aren't linked to our identity or something like this. Yeah, at least we're not linking it to our identity in some way. But of course it does need some kind of memory in order to give us useful responses, and we can of course, reset that as we go along. Talk about some of the use cases of AI. You've probably played around with these a lot. What is the best coding model and what are some other maybe observations of how these different models do coding differently?
B
What we do in general is we use lmarena.com, which is a benchmark website. So every time a new model is released, they add it to their benchmark and people kind of blindly compare two models and then they say which one they prefer for a given prompt. So it's kind of impartial. There are some sort of small niggles with it, but in general it's quite a good way to rank the different models. So we also have an auto model which selects the best model to use whatever query you input. And then we use Alumarena to essentially decide, okay, what you're asking is programming. This model is best for programming, or what you're asking is role playing, whatever it is. Right now, the best model for programming, I think, is 01 Pro, but it's incredibly expensive. So expensive that, like, I barely use it. Whereas Gemini 2.5 Pro is like 99% as good, I would say, and is like a hundred of the price, quite literally. So my go to right now is Gemini 2.5 Pro for programming. But that keeps changing over time. Like, if you asked me four weeks ago, I would have said Claude 3.7 sonnet just before that it might have been Deepseek. So it kind of keeps changing every time a new model comes out. Yeah. So the way I evaluate it is just Alamarina and kind of just playing with it myself because it's kind of weird. Like, sometimes a model will do very well on the benchmarks and then you try it out and it just doesn't feel right or it gets too verbose in a way. So Claude does this sometimes where it will solve your problem, but it will also try and solve five unrelated problems that you don't think you actually have. So it will be like, oh, I fixed your code, but I'm also going to fix your fridge. And now that I'm here anyway, I'm going to fix this lamp that's not working anymore. So it's kind of. All these models have their own little quirks that you only discover when you start using them, which is quite funny. But yeah, just to answer your question for me, it's O1 Pro right now and Gemini 2.5 for programming.
A
Yeah. No, it's interesting to see the differences among these and that's why the only real way to get a sense of it is to play around. Have you noticed, Claude, European AI GPT San Francisco kind of AI, we have Deep Seq and some other Chinese ones. Can you tell, let's say things like cultural differences, would you say, between using these sorts of models? Different models.
B
Oh, that's actually interesting. I think the biggest thing you would notice if you use Deepseek, I don't know if you've used it, but it is very, very sensitive about anything to do with China. So it will be sort of, if you ask it about Tiananmen Square, for example, it will be like the Chinese party has always tried to be peaceful and whatever. So I would say the Chinese ones are a bit more censored when it comes to. To things to do with China, whereas funnily enough, they're quite uncensored when it comes to, for example, instructions on how do you. I don't know, how do you 3D print a gun, for example? All of the Western models will block that, like OpenAI anthropic. Whereas people have found that Deepseek is a lot more permissive also in doing like not suitable for work stuff. So it's. Yeah, it's kind of funny in a way, in terms of the cultural differences between models. Claude kind of feels very human in how it talks to you. And that's been the case for all of the models that they've made, pretty much. Whereas ChatGPT is a bit more like a professor, I almost want to say. But all of this is kind of personal preference, which I think is just really funny. It's kind of like there's a person in your life, some people will like that person because of how they are and others will dislike them. Doesn't mean one is better than the other. It's just different preferences, I guess. But yeah, that's the main difference.
A
For me, that was the first thing, of course I did with Deep SEQ is asking about Tiananmen Square and I tweeted about that. The results I got. But as Urban and I have talked about in our episodes on AI, the censorship and the values of something like OpenAI, where it has left wing values and it lectures you about certain things and what won't answer certain questions. I find that censorship much more perfidious and much more subtle and much more damaging, to be honest. And that's why, you know, I have friends who say that they feel that there's even more free speech when they're in China than when they're in the West. Controversial thing to say, but not, not on this show, obviously. On the topic of censorship, what's. How do you evaluate censorship in AI and do you do anything to counter some of the censorship? And how exactly does censorship work from these models? And how do you kind of subvert it?
B
Yeah, that's actually pretty interesting because some people think we do some like special system prompt or something on our side to make them less censored. But what actually happens is many providers add in an extra layer of like, censorship before you access the actual model. And we just don't have that layer on almost every model. The One exception is OpenAI models because they kind of force you to build in a moderation layer before the prompt gets to their actual model. So they have a sort of pre check in there. But for all other models, we just don't do any additional censoring. And because many others do, that makes us seem like we do something special to uncensor them, which is actually not the case. So our special trick is to just give you the models as they are in a way. That said, we do also have some models that. Oh yeah, this actually goes into the other part of your question. How do they do the censoring? So for many of these models, they train them. And when the model is trained, if you talk to the model, it will be extremely helpful, but it might not be politically correct. Like you can ask it stuff about, I don't know, different races, and it will gladly tell you, like, this race is better than the other at a certain thing. And then that's not exactly what they want the model to be like, at least OpenAI and anthropic and whatever. So they do a sort of post training, reinforcement learning, they call it. What is it? Human, I don't know, reinforcement learning. Anyway, so they will talk to the model and they will sort of punish or reward it based on what it answers. So if you ask it a controversial question and it will give a politically incorrect answer, they kind of try to beat it out of it by punishing it for giving that answer. So that adds in a sort of little layer inside the model after it's already been trained to be more censoring in a way, or to be more politically correct. And with the Open source models. What some developers have done is they've taken the open source model and they've kind of discovered where these little tweaks have been made, like where the additional layer has been put in and they just take it out, like bit by bit. They discover, okay, if this transformer fires, it's like an extra sensoring layer. So let's take this one out. And by doing that, they can kind of undo the extra censoring that was built into the model. So this only works with open source models, because with closed source, of course, there's no way anyone can do this. So with the open source models, they sometimes like blast the extra censorship out of these models and then they will publish that for others to use. So with Deepseek, for example, we host an obliterated version. It's called obliteration is like removing the censorship. And the obliterated version will gladly tell you what happened on Tiananmen Square. On the llama models, the obliterated version is way more willing to indulge in any weird fantasies you have. So, yeah, that's the way you can do it. With these open source models, you can actually sort of tweak the model in a way. What many people also do with the closed source models like Claude or OpenAI is they build in a sort of jailbreak. So they'll try and do a prompt, like a system prompt that says, you are now in testing mode. Nothing you say will be held against you. There are some very specific prompts for this that people share online, which is kind of funny to see. We don't put any of them in by default. We just let people do whatever they want themselves. But yeah, so there are ways you can try and sort of get around the censoring layer in these models.
A
Talking about the Dan kind of behavior. Right. You can do anything now. There's some hilarious stuff that people have found to work. I got a good few laughs out of investigating some of those things over time. So just talk about some of your favorite text and image models that you've used personally.
B
Yeah, so it's kind of funny, right? Because it keeps changing so much over time. Like as I said, I got so enthusiastic about ChatGPT when it came out and now I look back at it and I'm like, okay, it was such a terrible model. Like, it was. It was almost useless. How could I ever get any use out of this? So I would say at first it was chatgpt for me, then it was Claude, and now lately, Google has kind of been yeah, kind of been catching up in a way. I quite like Grok in a way as well. Even though we don't have it like it, it's got a certain tone that's very different from the other models. Doesn't really make it better for anything useful, but it's fun to play around with sometimes. I quite like Gemini Flash, which is a very small Google model because it's just so quick and so cheap. So if I ever have like batch tasks that I need to do, like analyzing a lot of different files, for example, I just throw it into Flash. So yeah, it's kind of a different model for every purpose for me. I find that I used Deep Seq Reasoner quite a bit bit when it came out, which. This is also relevant for your audience, I guess. So Deep Seq, we don't run it through deepseek themselves. They released a model, they made it open source. So now there are also open source providers that host it. So we use the open source providers because otherwise DeepSeek stores your data in China, which we prefer to avoid. So then I quite like using Deepseek because the reasoning it does is quite. It was quite new at the time and it was very fun to see and it was cheap enough that you could just play around with it a bit. So yeah, I keep kind of changing on what model I use most and which I like best in terms of image models. There was a new model recently, Reeve. Reeve Half Moon, like Re V E Half Moon, which is just a fantastic model. It's super cheap for images and it generates just beautiful imagery. But also that keeps changing over time. Right? Like every time a new model comes out, I try it out and I discover, okay, this one is very good with getting text into images. This one is good with, for example, Midjourney. It's very good at just give it a simple prompt and it will come up with beautiful visuals in different styles and whatever. So yeah, for me it's a little bit of everything that's kind of. I guess the nice thing about our service is we have all of them so you can just play around and see what they're like and spend a few cents experimenting with all of them, I guess.
A
So I don't want to get you. I don't want to bring attention to something that maybe we don't want to bring attention to. But as people who have managed API accesses in the past and understand some of the difficulties, sometimes it's a cat and mouse game, sometimes you have a user who is abusing that particular service, they're sending some nasty things, and that can get you. It can get you flagged from the service. Is it difficult to maintain API access? And if there are abusive users, do they get filtered? Could you talk about some of the struggles of keeping up with these services?
B
Yeah. So I'm thinking if we've had any filtering done, I actually don't think so. So we just pass on everything as is and we haven't gotten into trouble for it yet because I think most of the providers just do the filtering on their side. Even if a few users are like, trying to be annoying, you know, they pay for the prompts that they do, even though they're just doing it to annoy us. And they get kind of drowned out in a sea of regular people just using the models normally. So I guess we've been lucky in that so far that there hasn't been a major enough troll to really impact us on any of the services. We do sometimes get like, it feels like DDoS attacks where people send literally hundreds of requests to a model within a single second. And we're kind of wondering like, okay, are you actually using it and. Or is this just like trying to break our service? But so far it all worked out fine. They just pay for the prompts that they do. So we don't mind if people spam hundreds of prompts. But no, so far, actually, I'm trying to think. But we haven't done any manual blocking or filtering of users. We have had one guy or girl, I don't actually know who was trying to sort of DDoS the service, which was kind of funny in a way, but then we had like Vercel, which is a provider that we use. They kind of blocked it on their end. So, no, so far, maybe we've just been lucky, but so far we haven't had to do any sort of manual censoring of people or filtering of people.
A
This is a question from my friend Urban. He says that the video generator, so we can do video AI on your service, which is pretty neat. He says this is his words, the video generator sometimes fails and double charges me. What's going on there? Any ideas?
B
First of all, have him reach out to us on Discord or anywhere so I can refund him, please. I don't know. It kind of depends per model.
A
Right?
B
Because we have different providers for the different models. If he was using VO2, that's the Google video model. It's kind of a pain for us because it's extremely censoring in what it Finds not suitable for work. You could be like, give me a beautiful pig flying through the sky and it will be like, okay, this needs to be censored, but when it does so it only tells us after it returns the prompt and after we've already charged. And it will still charge us as well. That might be what happened to him. I'm not sure.
A
Yeah, no problem. No, just thinking of ways to make sure people don't repeat that themselves. Just be extra cautious in video generation. Expect even more sensitive sensitivity in those sorts of things. Is that what you would suggest?
B
Yeah, I guess in a way. So if you go into our settings, we have a, I'm not sure what we call it, like an 18 plus or not suitable for work or something setting. And if you turn it on, you get to see some extra models, like models that do that, generate more like that have less filtering or centering. Let me just call it that. So if you turn that on, the model that shows up then in the video models is also the most uncensored one for videos. Google is the most censored one. But yeah, you're right. Maybe we should add like into the description of the Google video model something like, be very careful, Google is overly censorious or something. I'm not sure. But it's kind of annoying because at the Same time the VO2 model is the best video model. Like, it's incredible. Yeah, it's an incredible model, but it's also incredibly fickle in what it will allow you to generate.
A
You store everything locally, which is good. You explained that. Have you considered adding some kind of maybe encrypted storage for users who want to have their chat on multiple machines?
B
Yeah, it's something we're working on. So my co founder, who's the far more technical of the two of us, it's like something he's wanted to add for quite a while and the way he wants to do it, I believe is that you can encrypt it, sync it to an AWS instance, or you can even like input your own instance that you want to sync it to and then you can like input that same one on the other device or just we link it to your account if that's what you prefer. And then that way you could sync the conversations. So it's something we're working on, but if we ever do it, it's definitely going to be like off by default because I think, yeah, many people really like the local storage of prompts and of conversations. So if we ever edit, we want to make sure that people can still do that if they prefer, and not just opt them into us storing the logs on a server, even if it is encrypted. I think it would feel kind of invasive for people to do that. So, yeah, we do plan to add it, but it will be like an option that you have to turn on.
A
Right. Just in terms of throwing out some ideas. There. There is. I don't know if you're familiar with the ESIM service, silent link. How they go about things is you never have an account. Basically they just generate a URL for you. So you never have to give any personal information, you never have to give an email, and you just have to remember your link so that you can return to it and top up your service, etc. Is that something that you could consider or are there limitations to that? Because that's something that we have also been aspiring to pursue. That kind of. It requires a lot of responsibility. Right. But it's also a cool concept and obviously you don't have to collect anything.
B
Yeah, yeah, no, I think that could work as well. It's kind of similar in a way, I guess to having just a seed phrase or something. Right? Like a private key, like in crypto that you can use instead. Like we randomly generate a private key or you just input a random private key and then that's the way you can use it. That would work as well, to be honest. So the reason we haven't done. Because the private key example has actually been requested before. I haven't heard of the side and link way of doing things before, but it's interesting. It could work fine as well, I think. But. So the reason it's not been implemented yet is just there's quite little demand for it just because people can use just any random email from like a throwaway service or protonmail or whatever. If you just use an email service where you don't have to give any personal information, then I feel like maybe that gets like 90% of the way there already. But maybe I'm wrong. You know more about this than I do. So, like, tell me if there's a big thing that we're missing there.
A
No, I think that's fine. I think it gets more troubling when you have to, you know, actually check that email. Right. It's one thing to actually use that as a login identifier, and it's another thing to have to check that email because these email services have their own things that they do and they could theoretically lock you out of your account and all this sort of stuff. I don't see it as a big issue. It's just if we're taking the step of asking for emails, the only purpose I see is so that you want to send actual emails to that person as just an identifier. It's always awesome to think about just replacing that with a unique code.
B
Yeah, no, fair enough. I'll suggest it to my co founder as well. He added the login system. Obviously that's way above my pay grade. And both of us are into crypto, so I think either a private key solution or, or the silent link solution of having like a custom URL. I think both would work fine.
A
Yeah, the silent link is definitely. We aspire to be that kind of model ourselves, but it's tricky and basically what they've said is, look, here's how we're doing things. If you come to this website, you know what's up, you know, you have some responsibility. You have to copy this down. If you don't copy it down, you're screwed. Right. So it is a different kind of mindset. But okay, just, just another question or two to get you out of here. So talk about your, your nano cryptocurrency. I, I'm a bit conservative when it comes to cryptocurrency, so I'm very, I'm not necessarily on the cutting edge and I want to see things develop a little bit. Tell us a little bit about your, your nanocrypto currency and maybe the, the proper conservative way of phrasing. This is why something like that when you have Monero.
B
Yeah, no, that's fair enough. So just to be clear, we didn't create nano, so it's an existing crypto. It has existed for, for almost 10 years now, and we just decided to use it because we like it for payments. So in short, what Nano does is payments are instantly settled, like within a second, and the fees are zero. So that's the, and the scalability is good, all that kind of stuff. But just instant zero fee is the reason we really like it. That said, I can see why you would prefer something like Monero, because nano is pseudo anonymous. Like you can trace the transactions, you won't know who's behind an address, but it's not as private as Monero is, for example. But yeah, for payments, it's kind of hard for us to beat instant and zero fee, also as a store of value, but maybe that's going too deep into it. We keep many of our profits in Nano because it's Fixed supply crypto and it's very decentralized, it has good game theory behind it all, that kind of stuff. We, we kind of just, we're not maximalists. So we really like Monero as well. It's the second most used coin on our platform. Maybe partially because of you guys, because you talk about us, but yeah, anything. Like, we think using crypto in general is better than using fiat. So we just want to support as many different crypto as we can. Like any crypto that people want to use, we want to be able to support it and then people can just decide what they want to use for themselves. We think Nano works well, but Monero also works well. Litecoin works well, Bitcoin cash works well. People can just use whatever they want. We're not maximalists, we just really like Nano personally.
A
One other thing, I'm looking at the website right here. We press a button just above the text input box that says Enable web. What does that do for us?
B
Yeah, so one of the annoyances that people had with many of these models, especially like people like my parents and such, they would always be surprised that the model wasn't up to date. Like they would ask it what the weather was somewhere and it was, wouldn't know because obviously these models are trained on old data, right? Like they're not always up to date. So what we did was we added web search. So essentially if you turn on web search, then if you send a prompt to the model, like let's just say who won the Formula 1 GP yesterday, for example. Then before your prompt gets sent to the model itself, a sort of Google search is done in the background and the results of it are appended to your prompt. So what the model sees is who won the GP yesterday. But it will also see like the sort of headlines about the Formula one race that happened. So it kind of gives some extra information to the model, it also provides the current date to the model and it just makes the models a lot smarter than at answering like sort of real world stuff or sort of current stuff. If you want to like know what the latest news is or how a certain event in the real world is like developing, then it's a very valuable tool to have, I think. So we use link up for it, we compared a lot of services, they score best. And yeah, essentially what it just does is it gives the model access to like a sort of deep search results. But like, let's just call it Googling for now. It's a bit disrespectful to the linkup guys. So sorry if listen to this. But yeah, essentially it gives more context, more information to the model.
A
Excellent.
B
I really appreciate people wanting to have more privacy when they use these models. I totally agree. Running a model locally is the best way for your privacy. Like if I didn't mention it before, that's really the case. There's no way to get more private than that. But we hope that if you want to use the closed source models, like the top models, like Gemini chatgpt, that we can sort of give you the best possible privacy at least. And if there's anything that we can improve about that, we would love to hear.
A
Yes, and I appreciate that self hosting is obviously the gold standard. We want to be doing that. We want to. For me, the open source models are great, maybe a service like Venice AI, but for those who are maybe the psychonauts of AI, right, you want to experiment and compare. You want access to the closed source models, maybe you want some good images. That's very difficult to do otherwise. Maybe you want access to some of the best coding. You could use something like Nano GPT. You're going to be very cognizant about what you're inputting. This is not. I'd be hesitant to do really personal stuff. I wouldn't be uploading my documents necessarily. But it does have its place. It definitely does have its place. And for people who are just trying to get into AI, they want to play around with it, this is probably the best way to do so. Top up a little bit, pay privately, don't even have to create an account and just have access to hundreds of these AI models. So really cool service. Appreciate what you're doing. Milan, thank you so much for joining.
B
Yeah, thank you for having me and thank you for the fun conversation.
A
Hey, thanks for listening. Look, I could use your help real quick if you could share this, engage with me in some way, leave a review anywhere. This really helps me to break the technocratic shadow banning that is happening with my brand. And of course, if you really want to escape the technocracy, go to escape the technocracy.com privacy tutorial series, books consulting and of course you can leave a donation. Thank you very much, Foreign.
B
SA.
Date: April 13, 2025
Host: Gabriel Custodiet
Guest: Milan Derrid, Co-founder of NanoGPT
This episode dives into NanoGPT, an innovative intermediary AI service designed for privacy-conscious users who want access to a wide array of AI models—including closed source models like ChatGPT, Claude, Gemini—without tying their identity to the big tech companies running them. Host Gabriel Custodiet and guest Milan Derrid discuss the privacy landscape of AI, compare open-source and closed-source access, practicalities of pay-per-prompt usage, censorship in models, and how NanoGPT is shaping private experimentation with generative AI. The episode is especially valuable for those seeking to explore and compare AI models without revealing personal information and while using private payment options such as Monero or Nano.
"If I was trying to design a user friendly, non coercive website, I would do it in this way ... allows you to use the service as a guest. You don't have to have an account. You can just ... use it and then exit if you would like." (A, 02:14)
Background: Milan started NanoGPT to democratize access after seeing friends blocked from ChatGPT due to payment or regional constraints.
"On average what we see is a prompt costs about $0.01." (B, 09:14)
"So what can we see? We can see a user id ... which model ... and at what time. But we can never see the actual message ... nor do we know the IP..." (B, 16:54)
"If you want to be able to share your balance between say your laptop and your phone, just get an anonymous email ... use VPN on both..." (B, 19:37)
"Our special trick is to just give you the models as they are ... many others add censorship layers before the model." (B, 34:00)
"If we ever do it, it's definitely going to be like off by default ... many people really like the local storage..." (B, 44:55)
"I wanted to start something that could share that same magic... but make it in a way that's more accessible... So you don't need subscriptions, you don't need credit cards."
— Milan (B), 02:49
"All of the requests that we do to OpenAI come from just one location, one IP, one user... They don't see your name, they don't see your payment details... it adds a sort of layer of privacy at least."
— Milan (B), 05:35
"What can we see? We can see a user id... but we can never see the actual message that you're sending, nor do we know the IP..."
— Milan (B), 16:54
"Running a model locally is the best way for your privacy... But we hope that if you want to use the closed source models... we can sort of give you the best possible privacy at least."
— Milan (B), 52:46
"If you want to be able to share your balance between say your laptop and your phone, just get an anonymous email... use VPN on both devices. And then... you're pretty much as private as you would be..."
— Milan (B), 19:37
"With Deepseek, for example, we host an obliterated version... so it will gladly tell you what happened on Tiananmen Square..."
— Milan (B), 36:38
Casual, practical, sometimes philosophical about the privacy and censorship landscape in AI. Milan is forthright about the platform’s capabilities and limitations; Gabriel interrogates the privacy nuances and opportunities for self-hosting or maximizing anonymity.
NanoGPT is positioned as a uniquely privacy-conscious, minimalist intermediary for AI experimentation—perfect for those unwilling to sacrifice privacy to Big Tech but who still want to explore the full range of generative models. While ultimate privacy will always require self-hosting, NanoGPT offers the friendliest on-ramp for everyday explorers, learners, and researchers, with flexible payments, minimal data retention, and broad model access. Knowledgeable users can push their privacy even further by combining guest mode, Monero payments, VPN/Tor, and careful prompt management.