
Alessio
Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel, and I'm joined by my co-host Zwicks, founder of Smol AI. Hey.
Zwicks
And today we're so excited to be finally in the studio with Evan Conrad from SF Compute. Welcome.
Evan Conrad
Hello. How goes it? How are we doing?
Zwicks
I've been fortunate enough to be your friend since before you were famous, and we've hung out at various social things. So it's really cool to see SF Compute coming into its own and becoming a significant presence, at least in the San Francisco community, which of course is in the name. So you couldn't help but be.
Evan Conrad
Indeed, indeed. I think we have a long way to go, but yeah, thanks.
Zwicks
Of course. One way I was thinking about kicking off this conversation: we will likely release this right after the CoreWeave IPO. I was doing some research on you, and you did a talk at The Curve. I think I may have been viewer number 70. It was a great talk; more people should go see it: Evan Conrad at The Curve. But we have like three orders of magnitude more people, and I just wanted to highlight: what is your analysis of what CoreWeave did that went so right for them?
Evan Conrad
Sell locked-in long-term contracts and don't really do much short term at all. I think a lot of people had this assumption that GPUs would work a lot like CPUs. The standard business model of any sort of CPU cloud is you buy commodity hardware, then you layer on services that are mostly software, and that gives you high margins. Pretty much all your value comes from those services, not really the underlying compute in any capacity. And because it's commodity hardware and it's not actually that expensive, most of that can be sort of on-demand compute. And while you do want locked-in contracts from folks, that's mostly just to de-risk your situation. It helps you plan revenue, because you don't know if people are going to scale up or down. But fundamentally people are buying hourly, and that's how your business is structured. And you're going to make 50% margins or higher. This doesn't really work in GPUs. And the reason it doesn't work is that you end up with super price-sensitive customers. And that isn't necessarily just because it's way more expensive, though that's totally the case. In a CPU cloud you might have, let's say, a million dollars of hardware. In GPUs, you have a billion dollars of hardware. And so your customers are buying at much higher volumes than you'd otherwise expect. And it's also smaller customers who are buying at higher volume relative to what they're spending in general. But in GPUs in particular, your customer cares about the scaling law behind it. So if you take Gusto, for example, or Rippling, or an HR service like this: when they're buying from AWS or GCP, they're buying CPUs and they're running web servers. For those web servers, they buy up to the capacity that they need. They buy enough CPUs and then they don't buy any more. Like, they don't buy any more at all.
Zwicks
Yeah, you have a chart that goes like this and then flat.
Evan Conrad
Correct. And it's like completely flat. It's not even an incremental tiny amount. It's not like you could just turn on some more nodes and then suddenly they would make incrementally more money. Gusto isn't going to make, you know, 5% more money. They're going to make zero, literally zero, money from every incremental GPU or CPU after a certain point. This is not the case for anyone who is training models, and it's not the case for anyone who's doing test-time inference, or inference that scales at test time. Because your scaling laws mean that you may have some diminishing returns, but there are always returns. Adding GPUs always means your model does actually get bigger, and that actually does translate into revenue for you. And then for test-time inference, you can just run the inference longer and get better performance. Or maybe you can run more customers faster and then charge for that. It actually does translate into revenue. Every incremental GPU translates to revenue. And what that means from the customer's perspective is you've got a flat budget and you're trying to max the number of GPUs you have for that budget. And that's very distinctly different from how a Gusto or Rippling might think, where they think, oh, we need this amount of CPUs; how do we reduce the amount of money we're spending to get the same amount of CPUs? What that translates to is customers who are spending in really high volume, but also customers who are super price sensitive, who don't give a shit. Can I swear on this? Can I? Sure. Who don't give a shit at all about your software, because a 10% difference in a billion dollars of hardware is like $100 million of value for you. So if you have a 10% margin increase because you have great software on your billion dollars of hardware, customers are so price sensitive that they will immediately switch off if they can. Because why wouldn't you?
You would just take that $100 million. You'd spend $50 million on hiring a software engineering team to replicate anything that you possibly did. So that means that the best way to make money in GPUs was to do basically exactly what CoreWeave did, which is go out and sign only long-term contracts, pretty much ignore the bottom end of the market completely, and then maximize your long-term contracts with customers who don't have credit risk and who won't sue you, or are unlikely to sue you, for frivolous reasons. And then because they don't have credit risk and they won't sue you for frivolous reasons, you can go back to your lender and say, look, this is a really low-risk situation for us. You should give me the prime interest rate. You should give me the lowest cost of capital you possibly can. And when you do that, you just make tons of money. The problem that I think lots of people are going to talk about with CoreWeave is that it doesn't really look like a cloud provider financially. It also doesn't really look like a software company financially.
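Editor's sketch of the switching arithmetic described in this turn. The function name and its parameters are illustrative assumptions, and the figures are the round numbers from the conversation ($1 billion of hardware, a 10% software premium, $50 million to rebuild the software in-house), not real pricing data.

```python
def switching_gain(hardware_spend: float, margin_premium: float, rebuild_cost: float) -> float:
    """Net savings for a customer who leaves a provider charging a software premium.

    hardware_spend: total compute spend in dollars
    margin_premium: extra margin charged for software, as a fraction of spend
    rebuild_cost:   one-off cost of replicating that software in-house
    """
    premium_paid = hardware_spend * margin_premium
    return premium_paid - rebuild_cost


# Round numbers from the conversation (hypothetical, not real contracts).
gain = switching_gain(1_000_000_000, 0.10, 50_000_000)
print(f"Net gain from switching: ${gain:,.0f}")
```

Under these assumptions the gain stays positive whenever the premium on the hardware spend exceeds the rebuild cost, which is the sense in which large buyers are described as price sensitive.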
Zwicks
It's a bank.
Evan Conrad
It's a bank. It's a real estate company. And it's very hard to not be that. The problem is that people have tricked themselves into thinking that CoreWeave is a bad business. I don't think CoreWeave is explicitly a bad business. There are kind of two versions of the CoreWeave take at the moment. There's: oh my God, CoreWeave is amazing. CoreWeave is this great new cloud provider, competitive with the hyperscalers. And to some extent this is true from a structural perspective; they are indeed a real sort of thing up against the cloud providers in this particular category. And the other take is: oh my gosh, CoreWeave is this horrible business, and so on, blah, blah, blah. And I think it's just a matter of perspective. If you think CoreWeave's business is supposed to look like the traditional cloud providers, you're going to be really upset to learn that GPUs don't look like that at all. And in fact, for the hyperscalers it doesn't look like this either. My intuition is that the hyperscalers are probably going to lose a lot of money, and they know they're going to lose a lot of money, on reselling Nvidia GPUs.
Zwicks
By hyperscalers, I take it you mean Microsoft, AWS.
Evan Conrad
Google. Correct. Yeah, Microsoft, AWS, and Google.
Zwicks
Does Google resell? I mean, Google has TPUs, but...
Evan Conrad
Google has TPUs, but I think you can also get H100s on it, and so on. But they have like two ways they can make money. One is by selling to small customers who aren't actually buying in any serious volume. They're testing around, they're playing around, and if they get big they're immediately going to do one of two things. They're going to ask you for a discount, because they're not going to pay the crazy sort of margin that you have locked into your business. Because for CPUs you need that. They're not going to pay your massive per-hour price, and so you want them to sign a long-term contract. And so that's your other way to make money: you can basically do exactly what CoreWeave does, which is have them pay as much as possible upfront and lock in the contract for a long time. Or you can have small customers. But the problem is that for a hyperscaler, GPUs sold at low margins are a worse business than what you are currently doing with your CPUs, because you could have spent the same money on those GPUs and trained a model on top of them, and then turned that into a product and had high margins from your product. Or you could have taken that same money and competed with Nvidia and cut into their margin instead. But simply reselling Nvidia GPUs doesn't work like your CPU business, where you're able to capture high margins from big customers and then they never leave you, because those customers aren't actually price sensitive and so they won't switch off if your prices are a little higher.
Zwicks
You actually had a really nice chart, again in that talk, of this two-by-two of where you want to be. And you also had some hot takes on who's making money and who isn't. So sure, CoreWeave locked up long-term contracts. Got that.
Evan Conrad
Yes.
Zwicks
Maybe share your mental framework. Just verbally describe it, because we're trying to help the audio listeners as well.
Evan Conrad
Sure.
Zwicks
People can look up the chart if they want to.
Evan Conrad
Sure.
Zwicks
Okay, so this is a graph of interest rates. On the y-axis it's the probability you're able to sell your GPUs, from 0 to 1, and on the x-axis it's how much they'll depreciate in cost, from 0 to 1. And then you had iso-cost curves, or iso-interest-rate curves. They're kind of shaped in a sort of concave fashion. The lowest interest rates enable the most aggressive form of this cost curve. And the higher interest rates go, the more you have to push out to the top.
Evan Conrad
Right.
Zwicks
And then you had some analysis of where every player sits in this, including CoreWeave, but also Together and Modal and all these other guys. I thought that was super insightful. So I just wanted you to elaborate.
Evan Conrad
Basically it's a graph of risk: the genres of places where you can be and the risk associated with each. The optimal thing for you to do, if you can, is to lock in long-term contracts that are paid all upfront, or in a situation in which you trust the other party to pay you over time. So if you're, you know, selling to Microsoft or something, or OpenAI, which are...
Zwicks
Together 77% of the revenue of CoreWeave.
Evan Conrad
Yeah, so if you're doing that, that's a great business to be in, because the interest rate that you can pitch for is really low. No one thinks Microsoft is going to default, and maybe OpenAI will default, but the backing by Microsoft kind of helps you. And generally it looks like OpenAI is winning, so it's just a much better case than if you're selling to the pre-seed startup that just raised $30 million or something pre-revenue. It's way easier to make the case that OpenAI is not going to default than the pre-seed startup. And so the optimal place to be is selling to the maximally low-risk customer for as long as possible. Then you never have to worry about depreciation and you make lots of money. The less good place to be is selling long-term contracts to people who might default on you. And then if you're not bringing the payment into the present, so you're not saying, hey, you have to pay us all upfront, then you're in this more risky territory.
Zwicks
So that's the top left of the chart?
Evan Conrad
Uh, if I have the chart right.
Zwicks
Large contracts paid over time.
Evan Conrad
Yeah, large contracts paid over time is like top left. So it's more risky, but you could still probably get away with it. And then the other opportunity is that you could sell short-term contracts for really high prices. Lots of people tried that too, because this is actually closer to the original business model that people thought would work in cloud providers. It works for CPUs, but it doesn't really work for GPUs. And I don't think people were trying this because they were thinking about the risk associated with it. I think a lot of people just come from a software background and have not really thought about COGS or margins or inventory risk or the things that you have to worry about in the physical world. I think they were just copy-pasting the same business model onto GPUs. Also, I remember fundraising a few years ago, and I know what other people in a very similar business to us were saying versus what we were saying. We know that our pitch was way worse at the time, because in the beginning of SF Compute we looked very similar to pretty much every other GPU cloud, not on purpose, but sort of accidentally. And I know that the correct pitch to give to an investor was: we will look like a traditional CPU cloud with high margins and we'll sell to everyone. And that is a bad business model, because your customers are price sensitive. So what happens if you sell at high prices, which is the price you would need to sell at in order to de-risk your loss on the depreciation curve? Specifically what I mean by that is, let's say you're selling at $5 an hour and you're paying $1.50 an hour for the GPU under the hood. It's a little bit different than that, but, you know, nice numbers. $5 an hour, $1.50 an hour. Great, excellent. Well, you're charging a really high price per GPU hour because over time the price will go down and you'll get competed out.
And what you need is to make sure that you never go under your underlying costs, or if you do, that you've made so much money in the first part of it that the later end doesn't matter, because over the whole structure of the deal you've made money. The problem is that you think you're going to be able to retain your customers with software, and actually what happens is your customers are super price sensitive and push you down and push you down and push you down, and they don't care about your software at all. And then the other problem is that you have really big players like the hyperscalers who are looking to win the market, and they have way more money than you and they can push down on margin much better than you can. They don't necessarily do that all the time; I think they actually keep quite a high margin. But if they needed to, they could totally just wreck your margin at any point and push you down. Which meant that that quadrant over there, where you're charging a high price just to make up for the risk, completely got destroyed. It did not work at all for many places because of the price sensitivity, because people could just shove you down. Instead, that pushed everybody up to the top right-hand corner, which is selling short-term contracts for low prices paid over time. That's the worst financial place to be in, because it has the highest interest rate, which means that your costs go up at the same time as your incoming cash goes down, and that squeezes your margins. The nice thing for CoreWeave is that most of their business is over on the other sides of those quadrants, the ones that survived.
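Editor's sketch of the depreciation-curve reasoning in this turn: sell high early so that, over the whole deal, margin stays positive even after the price falls below cost. The $5.00 and $1.50 per GPU-hour figures are the "nice numbers" from the conversation. The monthly price-decline rates, the 36-month horizon, and the 730-hours-per-month constant are assumptions chosen only for illustration.

```python
HOURS_PER_MONTH = 730  # approximate hours in a month (assumption)


def cumulative_margin(start_price, cost, monthly_decline, months):
    """Running total of per-GPU margin when the hourly sale price decays each month.

    start_price:     initial sale price, dollars per GPU-hour
    cost:            underlying cost, dollars per GPU-hour (held constant here)
    monthly_decline: fraction by which the sale price drops each month
    """
    totals = []
    running = 0.0
    price = start_price
    for _ in range(months):
        running += (price - cost) * HOURS_PER_MONTH
        totals.append(running)
        price *= 1.0 - monthly_decline
    return totals


# Hypothetical scenarios: a gentle price decline versus customers and
# competitors pushing the price down quickly.
slow = cumulative_margin(5.00, 1.50, 0.03, 36)
fast = cumulative_margin(5.00, 1.50, 0.10, 36)

print(f"slow decline, 36-month margin per GPU: ${slow[-1]:,.0f}")
print(f"fast decline, 36-month margin per GPU: ${fast[-1]:,.0f}")
print("fast decline: cumulative margin peaks in month", fast.index(max(fast)) + 1)
```

In the fast-decline case the running total peaks and then shrinks once the sale price drops under the hourly cost, which is the squeeze described above.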
Zwicks
The only remaining question I have on CoreWeave, and I promise I'll get to SF Compute, and I promise this is relevant to SF Compute in general, because the framework is important to understanding the company: why didn't Nvidia or Microsoft, both of which have more money than CoreWeave, do CoreWeave?
Evan Conrad
Why didn't they do CoreWeave?
Zwicks
Why have this middleman when both Nvidia and Microsoft have more money than God? They could have done an internal CoreWeave, which is effectively a self-funding vehicle, like a financial instrument. Why does there have to be a third party?
Evan Conrad
Your question is, why didn't Microsoft or Nvidia, either one of them, just do CoreWeave? Why didn't they just set up their own cloud provider? I don't know, so correct me if I'm wrong. Lots of people will have different opinions here, or, I mean, not opinions; they'll have actual facts that differ from my facts. Those aren't opinions, those are actual differences of reality. But I think it's that Nvidia doesn't want to compete with their customers. They make a large amount of money by selling to existing clouds. If they launched their own CoreWeave, it would make it much harder for them to sell to the hyperscalers, and so they have a complex relationship there. So, not great for them. Second, at least for a while I think they were dealing with antitrust concerns, or fears that if they own too many layers of the stack, that could be a problem for them. I don't know if that's actually true, but that's where my mind would go. Mostly, I guess, it's the first one: they would be competing directly with their primary customers.
Zwicks
Then Microsoft could have done it.
Evan Conrad
Right?
Zwicks
That's the other question.
Evan Conrad
Yeah. So Microsoft didn't do it. And my guess is that Nvidia doesn't want Microsoft to do it and so they would limit the capacity because from Nvidia's perspective, both they don't want to necessarily launch their own cloud provider because it's competing with their customers, but also they don't want only one customer or only a few customers. It's really bad for Nvidia if you have customer concentration. And Microsoft and Google and Amazon, like Oracle too, buy up your entire supply and then you have four or five customers or so who pretty much get to set prices.
Zwicks
Monopsony.
Evan Conrad
Yeah, a monopsony. And so the optimal thing for you is a diverse set of customers who all are willing to pay at whatever price because if you don't, somebody else will. And so it's really optimal for Nvidia to have lots of other customers who are all competing against each other. Great.
Zwicks
Just wanted to establish that it's unintuitive for people who've never thought about it and you think about it all day long.
Evan Conrad
Yeah.
Zwicks
The last thing I'll call out from the talk, which is kind of cool, and then I promise we'll get to SF Compute: why will DigitalOcean and Together lose money on their clusters?
Evan Conrad
Why will DigitalOcean and Together lose money on their clusters? I'm going to start by clarifying that all of these businesses are excellent and fantastic. Together and DigitalOcean and Lambda, I think, are wonderful businesses that build excellent products. But my general intuition is that if you try to couple the software and the hardware together, you're going to lose money. If you go out and buy a long-term contract from someone and then layer your own services on top, or you buy the hardware yourself and spin it up and take on a bunch of debt, you're going to run into the same problem that everybody else did, the same problem we did, the same problem the hyperscalers are running into, which is that you cannot add software and make high margins like a traditional cloud provider can. You can pitch that to investors and it will totally make sense. And it's the correct play in CPUs, but there isn't software you could make to make this occur. If you are spending a billion dollars on hardware, you need to make a billion dollars of software. There isn't a billion dollars of software that you can realistically make. And if you do, you're going to look like SAP. And that's not a knock on SAP; SAP makes a fuck ton of money, right? There just aren't that many pieces of software where you can realistically sell a billion dollars of it. And you're probably not going to sell it to price-sensitive customers who are spending their entire budget already on compute. They don't have any more money to give you. It's a very hard proposition. And so many parties have been trying to do this, to buy their own compute, because that's what a traditional cloud does. It doesn't really work for them. You know that meme where there's the grim reaper and he's knocking on the door, and then he keeps knocking on the next door?
We have just seen door after door after door as the grim reaper comes by and the economic realities of the compute market come knocking. And so the thing we encourage folks to do is, if you are thinking about buying a big GPU cluster and you're going to layer software on top, don't. There are so many dead bodies in the wake there. We would recommend not doing that. And at SF Compute, our entire business is structured to help you not do that. It's to help disintegrate these. The GPU clouds are fantastic real estate businesses. If you treat them like real estate businesses, you will make a lot of money. The cloud services you want to make on top of that, the software you want to make on top of that, you can do fantastically if you don't own the underlying hardware. If you mix these businesses together, you get shot in the head. But if you split them, and that's what the market does, it helps you split them, it allows you to layer on services but just buy from the market, you can make lots of money. So companies like Modal, who don't own the underlying compute: lots of money, fantastic product. And companies like CoreWeave, who are functionally really, really good real estate businesses: lots of money, fantastic product. But if you combine them, you die. That's the economic reality of compute.
Zwicks
I think it also splits into training versus inference, which are different kinds of workloads.
Evan Conrad
Yeah.
Zwicks
And then.
Evan Conrad
Yeah.
Zwicks
One comment about the price sensitivity thing. Before we leave this topic, I want to credit Martin Casado for coining or naming this thing, which is, like you said, that you don't have room for a 10% margin on GPUs for software. And Martin actually played it out further. He's the first one I ever saw apply this to large enough runs. So let's say GPT-4 and o1 both had a total training cost of like $500 million as the rough estimate. When you get to the $5 billion runs, when you get to the $50 billion runs, it actually makes sense for OpenAI to build its own chips, to get into chip design, which is so funny. Like, I would make an ASIC for this run.
Evan Conrad
Yeah, maybe. I think a caveat there that is not super well thought about is that it only works if you're really confident. It only works if you really know which chip you're going to build. If you don't, then it's a little harder. So in my head, it makes more sense for inference, where you've already established it, but for training there's so much experimenting.
Zwicks
Generality.
Evan Conrad
Yeah, yeah, the generality is much more.
Zwicks
Useful in some sense. You know, Google is like six generations into the TPUs. Yeah. Okay, cool. Maybe we should go into SF Compute now.
Evan Conrad
Sure. Yeah.
Alessio
Yeah. So you kind of talked about the different providers. Why did you decide to go with this approach? And maybe talk a bit about how the market dynamics have evolved since you started the company.
Evan Conrad
So originally we were not doing this at all. We were definitely forced into this to some extent. SF Compute started because we wanted to go train models for music and audio in general. We were going to do a sort of generic audio model at some points, and a music model at other points. It was an early company; we hadn't really specced down to a particular thing. The first thing that you do when you start any AI lab is you go out and you buy a big cluster. The thing we had seen everybody else do was go out and raise a really big round, and then they would get stuck. Because if you raise the amount of money that you need to train a model initially, like the $50 million pre-seed, pre-revenue, your valuation is so high or you get diluted so much that you can't raise the next round. And that's a very big ask to make. And also, I don't know, I feel like we just felt like we couldn't do it. We probably could have, in retrospect. But one, we didn't really feel like we could do it. Two, it felt like if we did, we would have been stuck later on. So we didn't want to raise the big round. Instead we thought, surely by now we would be able to just go out to any provider and buy the way a traditional CPU cloud would let you: just buy on demand, or buy a month, and so on. And this worked for small incremental things, and I think that's what we were basing it off of. We just assumed we could go to Lambda or something and buy thousands of, at the time, A100s. And this was just not at all the case. So we started doing all the sales calls with people and we said, okay, well, can we just get month to month? Can we get one month of compute? Everyone told us at the time: no, you need a year-long contract or longer, or you're out of luck. Sorry.
And at the time we were just pissed off. Like, why won't anybody sell us a month at a time? Nowadays we totally understand why; it's the same economic reason. If they had sold us month to month and we had canceled, they would have had massive risk on that. And so the optimal thing to do was to just completely abandon this section of the market. We didn't like that. So our plan was that we were going to buy a year-long contract anyway. We would use a month and then sublease the other 11 months. We were locked in for a year, but we only had to pay month by month. And so we did this, but then immediately we said, oh shit, now we are a cloud provider, not a model-training company, not an AI lab, because every 30 days we owed about $500,000 and we had about $500,000 in the bank. So that meant that every single month, if we did not sell out our cluster, we would just go bankrupt. So that's what we did for the first year of the company. And when you're in that position, you try to think, how in the world do you get out of it? What that transitioned to is: okay, well, we tend to be pretty good at selling this cluster every month, because we haven't died yet. So what we should do is basically be a broker for other people, more like GPU real estate, or like a GPU realtor. And so we did that for a while. We would go to somebody who was trying to sell a year-long contract, and we'd find one person who wanted six months and somebody else who wanted six months or something, and we'd combine all these people together to make the deal happen. And we'd organize these one-off bespoke deals that basically ended up with us taking a bunch of customers, signing with a vendor, taking some cut, and then operating the cluster for people, typically with bare metal.
And so we were doing this, but it was definitely an oh shit, oh shit, oh shit, how do we get out of our current situation, and less of a strategic plan of any sort. But while we were doing this, since the beginning of the company, we had been thinking about how to buy GPU clusters and how to sell them effectively, because we'd seen every part of it. And what we ended up with was a book of everybody who was trying to buy and everyone who was trying to sell, because we were these GPU brokers. And so that turned into what is today SF Compute, which is a compute market. We think we are functionally the most liquid GPU market of any kind. Honestly, I think we're the only thing that actually is a real market, where there are bids and asks and there's a trading engine that combines everything, and so on. I think we're the only place where you can do things that a market should be able to do. Like, you can go on SF Compute today and get thousands of H100s for an hour if you want. And that's because there is a price for thousands of GPUs for an hour. That is not a thing you can reasonably do on kind of any other cloud provider, because nobody should realistically sell you thousands of GPUs for an hour. They should sell it to you for a year or so. But one of the nice things about a market is that you can buy the year on SF Compute, and then if you need to sell back, you can sell back as well. And that opens up all these little pockets of liquidity for somebody who's just trying to buy for a little bit of time, some burst capacity. People don't normally buy for an hour; that's not actually a realistic thing. But it's the range. Somebody who was like us, who needed to buy for a month, can actually buy for a month. They can place the order and there is actually a price for that, and it typically comes from somebody else who's selling back.
Somebody who bought a longer-term contract for some period of time, their code doesn't work, and now they need to sell off a little bit.
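Editor's sketch of what "bids and asks and a trading engine" can mean in the simplest case. This is a generic price-priority matcher, not SF Compute's actual engine, which the conversation does not describe. The `Order` class, the `match_orders` function, the rule that trades execute at the ask price, and all trader names and prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    trader: str
    price: float  # dollars per GPU-hour
    gpus: int     # quantity for one fixed time block (assumption)


def match_orders(bids, asks):
    """Match the highest bids against the lowest asks while prices cross.

    Returns a list of (buyer, seller, gpus, price) tuples. Trades execute
    at the ask price, which is a simplifying assumption.
    """
    # Work on copies so the caller's orders are not mutated.
    bids = sorted((Order(b.trader, b.price, b.gpus) for b in bids), key=lambda o: -o.price)
    asks = sorted((Order(a.trader, a.price, a.gpus) for a in asks), key=lambda o: o.price)
    trades = []
    i = j = 0
    while i < len(bids) and j < len(asks) and bids[i].price >= asks[j].price:
        qty = min(bids[i].gpus, asks[j].gpus)
        trades.append((bids[i].trader, asks[j].trader, qty, asks[j].price))
        bids[i].gpus -= qty
        asks[j].gpus -= qty
        if bids[i].gpus == 0:
            i += 1
        if asks[j].gpus == 0:
            j += 1
    return trades


# Hypothetical book: a reseller selling back unused capacity, a cloud
# listing idle GPUs at a higher price, and two buyers.
asks = [Order("reseller", 1.80, 64), Order("cloud", 2.50, 256)]
bids = [Order("lab", 2.00, 128), Order("hobbyist", 1.50, 8)]
trades = match_orders(bids, asks)
print(trades)
```

In this toy book only the sold-back capacity trades, because it is the only ask priced at or below a bid, which mirrors the point that short-term supply typically comes from somebody selling back.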
Alessio
What are the utilization rates at which a market like this works? What do you see as the usual GPU utilization rate? And at what point does the market get saturated?
Evan Conrad
Assuming there are no hardware problems or software problems, the utilization rate is near 100%, because the price dips until the utilization is 100%. So the price actually has to dip quite a lot in order for the utilization not to be 100%. That's not always the case, because you just have logistical problems. Like, you get a cluster and parts of the InfiniBand fabric are broken, and there's some issue with some switch somewhere, so you have to take some portion of the cluster offline, or stuff like this. There are just underlying physical realities of the clusters. But nominally we have better utilization than basically anybody. That's utilization of the cluster, though; that doesn't necessarily translate into... I mean, I actually do think we have much better overall money made for our underlying vendors than kind of anybody else. We work with the other GPU clouds, and the basic pitch to them is: one, we're still your broker, so we can find you the long-term contracts at the prices that you want. But meanwhile your cluster is idle, and for that we can increase your utilization and get you more money, because we can sell that idle cluster for you. And then the moment we find the bigger, longer-term customer and they come on, you can kick off those people and go to the other ones. You get the mix of selling your cluster at whatever price you can get on the market, and then selling your cluster at the big price that you want for a long-term contract, which is your ideal business model. And then the benefit of the whole thing being on the market is you can pitch your customer that they can cancel their long-term contract, which is not a thing that you can reasonably do if you are just the GPU cloud.
You can never cancel your contract because that introduces so much risk that you would otherwise like not get your cheap cost capital or whatever. But if you're selling it through the market or you're selling it with us, then you can say, hey look, you can cancel for a fee. And that fee is the difference between the price of the market and then the price that they paid at, which means that they canceled. And you have the ability to offer that flexibility, but you don't have to take the risk of it. The money's already there and like you got paid, but it's just being sold to somebody else.
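As a rough illustration of the cancellation fee described above, here is a minimal sketch. The function name, its parameters, and the dollar figures are hypothetical, and the rule that the fee is never negative is an assumption, not something stated in the conversation.

```python
def cancellation_fee(contract_price, market_price, gpus, hours_remaining):
    """Fee for cancelling a reserved contract early (illustrative only).

    Assumption: the fee is the per-GPU-hour gap between the contract price
    and the current market price, times the unused GPU-hours. If the market
    price is at or above the contract price, no fee is charged.
    """
    shortfall = max(0.0, contract_price - market_price)
    return shortfall * gpus * hours_remaining


# Hypothetical numbers: a $2.50/GPU-hour contract cancelled while the
# market clears at $1.75/GPU-hour, with 64 GPUs and 1,000 hours left.
fee = cancellation_fee(contract_price=2.50, market_price=1.75,
                       gpus=64, hours_remaining=1000)
print(f"cancellation fee: ${fee:,.2f}")
```

The seller still receives the contract price in total: part from whoever buys the freed capacity on the market, and the rest from the fee.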
Zwicks
One of our top pieces from last year was talking about the H100 glut from all the long term contracts that were not being fully utilized and being put onto the market. You have on here dollar per hour contracts, and it goes up to two, actually. I think you were involved. You were obliquely quoted in that article. I remember because this was hidden. We hid your name, but then you were like, it's us. Could you talk about the supply and demand of H100s? Was that just a normal cycle? Was that like a super cycle because of all the VC funding that went in in 2023? What was that like? GPU prices have come down.
Evan Conrad
Yeah, GPU prices have come down.
Zwicks
Some part of that is normal depreciation cycle. Some part of that is just there were a lot of startups that bought GPUs and never used them and now they're lending it out and therefore you exist.
Evan Conrad
There's a lot of various theories as to why this happened. I dislike all of them because they're often said with really high confidence, and I think the market's just much more complicated than that. And so everything I'm going to say is very hedged. But there was a series of places where a bunch of the orders were placed, and people were pitching to their customers and their investors and just the broader market that they would arrive on time. And that is not how the world works. And because there was such a really quick build out of things, you would end up with bottlenecks in the supply chain somewhere that have nothing to do necessarily with the chip. It's like the InfiniBand cables or whatever, or you need a bunch of generators, or you don't have data center space. There's always some bottleneck somewhere else. And so a lot of the clusters didn't come online within the period of time. But then all the bottlenecks got sorted out and they all came online at the same time. So I think you saw a shortage because supply chain is hard, and then you saw an increase, or like a glut, because the supply chain eventually figured itself out.
Zwicks
And specifically people over ordered in order to get the allocations that they wanted. Then they got the allocations and then they went under. Yeah, whatever, right. There was just a lot of shenanigans.
Evan Conrad
A caveat of this is every time you say somebody like over ordered, there is this assumption that the problem was like the demand went down. And I don't think that's the case at all. And so I want to clarify that. It definitely seems like there's more demand for GPUs than there ever was. It's just that there was also more supply. So at the moment, I think there is still functionally a glut. But the difference that I think is happening is mostly the test time inference stuff that you just need way more chips for that than you did before. And so whenever you make a statement about the current market, people sort of take your words and then they assume that you're making a statement about the future market. And so if you say there's a glut now, people will continue to think there's a glut. But I think what is happening at the moment, my general prediction is that by the winter we will be back towards shortage. But then also this very much depends on the rollout of future chips. And that comes with its own. I think I'm trying to give you a good. Here's Evan's forecast.
Zwicks
Okay.
Evan Conrad
But I don't know if my forecast is very.
Zwicks
You don't have to. Nobody's going to hold you to it. But I think people want to know what's true and what's not. And there's a lot of vague speculations from people who are not that close to the market, actually. And you are?
Evan Conrad
I think I'm close to the market, but also a vague speculator. I think there are a lot of really highly confident speculators, and I am indeed a vague speculator. I think I have more information than a lot of other people, and this makes me more vague of a speculator because I feel less certain or less confident than I think a lot of other people do. The thing I do feel reasonably confident about saying is that test time inference is probably going to quite significantly expand the amount of compute that is used for inference. So a caveat of this is that pretty much all the inference demand is in a few companies. A good example is that lots of bio and pharma was using H100s for training, sort of bio models of sorts. And they would come along and they would buy thousands of H100s for training and then just not a lot of stuff for inference, not relative to an OpenAI or Anthropic or something, because they don't have a consumer product. Their inference event, if they can do it right, there's really only one inference event that matters. Obviously I think they're going to run in batch and they're not going to literally just run one inference event, but the one that produces the drug is the important one. Right. And I'm dumb and I don't know anything about biology, so I could be completely wrong here. But my understanding is that's kind of the gist.
Zwicks
I can check that for you.
Evan Conrad
You can check that for me. Check that for me. But my understanding is the one that produces the sequence that is the drug that cures cancer or whatever, that's the important deal. But a lot of models look like this, where they're sort of more enterprise-y use cases. So prior to something that looks like test time inference, you got lots and lots of demand for training and then it pretty much entirely fell off for inference. And I think we looked at OpenRouter, for example. The entirety of OpenRouter that was not Anthropic or Gemini or OpenAI or something was like 10 H100 nodes or something like that. It's just not that many GPUs, actually, to service that entire demand. But that's a really sizable portion of the open source market. The actual amount of compute needed for it was not that much. But if you imagine what an OpenAI needs for GPT-4, it's tremendously big. But that's because it's a consumer product that has almost all the inference demand.
Zwicks
Yeah, that's an estimate we've had: roughly, open source AI compared to closed AI is like 5%.
Evan Conrad
Yeah, it's like super small.
Zwicks
Super small.
Evan Conrad
Super small, super small. But test time inference changes that quite significantly. Um, so I will expect that to increase our overall demand. But my question on whether or not that actually affects your compute price is entirely based on how quickly do we roll out the next chips.
Zwicks
Like the way that you burst is different for test time.
Alessio
Any thoughts on the third part of the market, which is the more peer to peer, distributed one? Some are crypto enabled, like Hyperbolic, Prime Intellect and all of that. Where do those fit? Do you see a lot of people wanting to participate in a peer to peer market, or, just because of the capital requirements, at the end of the day it doesn't really matter?
Evan Conrad
I'm like wildly skeptical of these, to be frank.
Zwicks
The dream is like SETI@home.
Evan Conrad
Right?
Zwicks
I've got this 5090. Nobody has a 5090. A 4090 sitting at home. I can rent it out.
Evan Conrad
Yeah, I just don't really think this is going to ever be more efficient than a fully interconnected cluster with Infiniband or whatever sort of next spec might be. I could be completely wrong, but speed of light is really hard to beat and regardless of whatever you're using, you just can't get around that physical limitation. And so you could imagine a decentralized market that still has a lot of places where there's colocation, but then you would get something that looks like SF compute. And so that's what we do. That's why we take. Our general take is on SF compute you're not buying from random people, you're buying from the other GPU clouds functionally, you're buying from data centers that are the same genre of people that you would work with already. And you can specify, oh, I want all these nodes to be co located. And I don't think you're really going to get around that. And I think I buy crypto for the purposes of transferring money. The financial system is quite painful and so on. I can understand the uses of it to sort of incentivize an initial market or try to get around the cold start problem. We've been able to get around the cold start problem just fine. So it didn't actually need that at all. What I do think is totally possible is you could launch a token and then you could subsidize the compute prices for a bit. But maybe that will help you.
Zwicks
I think that's what Nous is doing.
Evan Conrad
Yeah. I think there's lots of people who are trying to do things like this, but at some point that runs out.
Zwicks
So I generally agree. I think the only threat to that model is very fine grained mixture of experts, where the algorithms can shift to adapt to hardware realities. The hardware reality is like, okay, it's annoying to do large colocated clusters, then we'll just redesign attention or whatever in our architecture to distribute it more. There was a little bit of buzz around block attention last year that Strong Compute made a big push on. But I think in a world where we have 200 experts in an MoE model, it starts to be a little bit better.
Evan Conrad
Like, I don't disagree with this. I can imagine the world in which you've redesigned it to be more parallelizable, like across space. But without that, your hardware limitation is your speed of light limitation, and that's a very hard one to get around.
Alessio
Any customers or stories that you want to shout out? Like maybe things that wouldn't have been economically viable otherwise. I know there's some sensitivity on that.
Evan Conrad
My favorites are grad students, or folks who are trying to do things that would normally otherwise require the scale of a big lab. And the grad students are like the worst possible customer for the traditional GPU clouds because they will immediately churn if you sell them a thing, because they're going to graduate and that project isn't going to continue to spend lots of money. Sometimes it does, but not if you're working with the university or you're working with a lab of some sort. But a lot of times it's just the ability for us to offer big burst capacity, which I think is lovely and wonderful. And it's one of my favorite things to do because all those folks look like we did. And I have a special place in my heart for young hackers and young grad students and researchers who are trying to do the same genre of thing that we are doing. For the same reason, I have a special place in my heart for the startups, the people who are just actively trying to compete on the same scale but can't afford it time wise, can't afford it spike wise.
Zwicks
Yeah, I liked your example of like, I have a grant of 100k and it's expiring. I got to spend it on that. That's really beautiful and I hope interesting. Has there been interesting work coming out of that? Anything you want to mention?
Evan Conrad
Yeah. So from a startup perspective, like Standard Intelligence and Phind, P-H-I-N-D.
Zwicks
We've had them on the pod. Yeah, yeah, Michael's great.
Evan Conrad
And then from a grad student perspective, we worked a lot with the Schmidt Futures grantees of various sorts. My fear is if I talk about their research I will be completely wrong to a sort of almost insulting degree because I am very dumb.
Zwicks
But yeah, I think one thing that's maybe also relevant, startups and GPUs wise, is there was a brief moment where it kind of made sense that VCs provided GPU clusters. And obviously you worked at AI Grant, which set up Andromeda, which is supposedly a $100 million cluster.
Evan Conrad
Yeah, I can explain why that's the case or why anybody would think that would be smart because I remember before any of that happened, we were asking for it to happen. And the general reason is credit risk.
Zwicks
Again, it's a bank. I have lower risk than you do the credit transformation. I take your risk onto my balance sheet.
Evan Conrad
Correct. Exactly. For a while, if you wanted to go set up a GPU cluster, you had to be the one that actually bought the hardware and racked it and stacked it, like colocated it somewhere with someone. Functionally it was on your balance sheet, which means you had to get a loan. And you cannot get a loan for like $50 million as a startup. Like, not really. You can get venture debt and stuff, but it's very, very difficult to get a loan of any serious size for that. But it's not that difficult to get a loan for $50 million if you already have a fund, or you already have like a billion dollars of assets somewhere, or you personally can do a personal guarantee for it or something like this. If you have a lot of money, it is way easier for you to get a loan than if you don't have a lot of money. And so the hack of a VC or some capital partner offering compute for equity is always some arbitrage on the credit risk.
Zwicks
That's amazing.
Evan Conrad
Yeah, that's a hack. You should do that. I don't think people should do it right now. I think the market has like. I think it made sense at the time and it was helpful and useful for the people who did it at the time. But I think it was a one time arbitrage because now there are lots of other sources that can do it. And also I think it made sense when no one else was doing it and you were the only person who was doing it. But now it's an arbitrage that gets competed down. So I don't know, it's super effective. I wouldn't totally recommend it. It's great that Andromeda did it, but the marginal increase of somebody else doing it is not super helpful.
Zwicks
I don't think that many people have followed in their footsteps. I think maybe Andreessen did it.
Evan Conrad
Yeah. I think just because pretty much all the value flows to Andromeda, I think the.
Zwicks
That cannot be true. How many companies are in AI grant? Like 50.
Evan Conrad
My understanding of Andromeda is it works with all the NFDG companies, or like several of the NFDG companies. But I might be wrong about that. Again, something, something, Nat, don't kill me. I could be completely wrong. But, you know, I think Andromeda was an excellent idea to do at the time in which it occurred.
Zwicks
His timing is impeccable.
Evan Conrad
Timing. Yeah. Nat and Daniel are like. I mean there's lots of people who are like, yeah, Seer. Like S E E R. Oh, seers. Like seers of the valley. For years and years before any of the like chatgpt moment or anything, they had fully understood what was going to happen. Like way, way before. Like AI Grant is like, like five years old, six years old or something like that. Seven years old. When I, when it like first launched or something.
Zwicks
You start the nonprofit version.
Evan Conrad
Yeah, the nonprofit version was like, like happening for a while. I think it's going on for quite a bit of time. And then like Nat and Daniel are like the early investors in a lot of the sort of early AI labs of various sorts. They've been doing this for a bit.
Alessio
I was looking at your pricing yesterday. We were kind of talking about it before, and there's this weird thing where one week is more expensive than both one day and one month. What are some of the market pricing dynamics? To somebody that is not in the business, this looks really weird. But I'm curious if you have an explanation for it, if that looks normal to you.
Evan Conrad
Yeah. So the simple answer is preemptible pricing is cheaper than non-preemptible pricing. And the same economic principle is the reason why that's the case right now. That's not entirely true on SF Compute. SF Compute doesn't really have the concept of preemptible. Instead what it has is very short reservations. So you go to a traditional cloud provider and you can say, hey, I want a reserved contract for a year. We will let you do a reserved contract for one hour, which is part of SFC. But what you can do is you can just buy every single hour continuously, and you're reserving just for that hour. And then the next hour, you reserve just for that next hour. And this is obviously built in. This is an automation that you can use. But what you're seeing when you see the cheap price is you're seeing somebody who's buying the next hour, but maybe not necessarily buying the hour after that. So if the price goes up too much, they might not get that next hour. And the underlying part of where that's coming from in the market is, you can imagine day old milk, or milk that's about to be old, might drop its price until it's expired, because nobody wants to buy the milk that's in the past. Or maybe you can't legally sell it. Compute is the same way. You can't sell a block of compute that is in the past. And so what you should do in the market, and what people do do, is they take a block of compute and then they drop it and drop it and drop it toward a floor price right before it's about to expire, and they keep dropping it until it clears. And so anything that is idle drops until some point. So if you go on the website and you set that chart to a week from now, what you'll see is much more normal looking curves. But if you say, oh, I want to start right now, that immediate instant, here's the compute that I want right now, that is functionally the preemptible price.
It's where most people are getting the best compute prices from. The caveat of that is you can do really fun stuff on SFC if you want, because it's not actually preemptible. It's reserved, but only reserved for an hour. Which means that the optimal way to use SF Compute is to just buy at the market price but set a limit price that is much higher. So you can set a limit price of like $4 and say, oh, if the market ever happens to spike up to $4, then don't buy. I don't want to buy at that price for that hour. But otherwise just buy at the cheapest price. And if you're comfortable with the volatility of it, you're actually going to get really good prices, like close to a dollar an hour or so, sometimes down to like 80 cents or whatever.
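The expiring-milk pricing described above can be sketched as a seller lowering the ask on an unsold block as its start time approaches. This is only an illustration: the linear schedule, the function name, and all parameters are assumptions, not SF Compute's actual pricing logic.

```python
def asking_price(start_price, floor_price, hours_until_start, decay_window_hours):
    """Ask for an unsold block of compute as its start time approaches.

    Assumption: the seller holds start_price until the block is within
    decay_window_hours of starting, then lowers the ask linearly so that it
    reaches floor_price at the moment the block would begin (and expire).
    """
    if hours_until_start >= decay_window_hours:
        return start_price
    if hours_until_start <= 0:
        return floor_price
    fraction_left = hours_until_start / decay_window_hours
    return floor_price + (start_price - floor_price) * fraction_left


# Hypothetical block listed at $2.00/GPU-hour with a $1.00 floor and a
# 24-hour decay window.
for hours in (48, 24, 12, 6, 0):
    print(hours, asking_price(2.00, 1.00, hours, 24))
```

A buyer who starts "right now" sees the bottom of this curve, which is why the immediate price behaves like a preemptible price.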
Zwicks
You said four though.
Evan Conrad
Yeah, so that's the thing. So four is your max price. Four is where you basically want to pull the plug and say don't do it, because the actual average price, the preemptible price, doesn't actually look like that. So what you're doing when you're saying four is: always, always, always give me this compute. Continue to buy every hour, don't preempt me, don't kick me off. I want this compute, and just buy at the preemptible price, but never kick me off. The only time in which you get kicked off is if there is a big price spike. And, you know, let's say one day out of the year there's a $4 an hour price because of some weird fluke or something. If at all other periods of time you're actually getting a much lower price, then it makes sense. Your average cost that you're actually paying is way better. And your trade off here is you don't literally know what price you're going to get. So it's volatile. But historically, everyone who's done this has gotten wildly better average prices. And this is one of the clever things you can do with the market. If you're willing to make those trade offs, you can get a lot of really good prices. You can also do a bunch of other things. You can only buy at night, for example, because the price goes down at night. And so you can say, oh, I want to only buy if the price is lower than 90 cents. And so if you have some long running job, you can make it only run at 90 cents, then you recover back and so on.
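The limit-price strategy above can be sketched as a small simulation. The hourly prices below are made up for illustration, and `buy_hourly` is a hypothetical helper, not part of any SF Compute API.

```python
def buy_hourly(market_prices, limit_price):
    """Simulate buying one hour at a time with a price cap.

    Each hour the buyer pays the market price if it is at or below
    limit_price and otherwise skips that hour. Returns the number of hours
    bought and the average price paid (None if nothing was bought).
    """
    paid = [price for price in market_prices if price <= limit_price]
    if not paid:
        return 0, None
    return len(paid), sum(paid) / len(paid)


# Hypothetical $/GPU-hour prices, including one spike above $4.
prices = [1.0, 0.75, 1.25, 4.5, 1.0]

# A high limit: stay on almost always, only sitting out the spike.
print(buy_hourly(prices, limit_price=4.00))

# A low limit: only run in the cheap hours, as with a long running job
# that pauses and recovers.
print(buy_hourly(prices, limit_price=0.90))
```

The design trade-off is the one described in the conversation: a high limit buys continuity at an unknown but usually low average price, while a low limit buys a price ceiling at the cost of interruptions.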
Zwicks
So what you can kind of create is like a spot instance, which is what the CPU world has. But you've created a system where you can kind of manufacture the exact profile that you want.
Evan Conrad
Exactly.
Zwicks
That is not just whatever the hyperscalers offer you. Which is usually just one thing.
Evan Conrad
Correct. SF Compute is like the power tool.
Zwicks
The underlying primitives of hourly compute is there.
Evan Conrad
Correct?
Zwicks
Yeah, it's pretty interesting. I've often asked OpenAI, and all these guys, Claude as well: they do batch APIs. So it's half off of whatever your thing is, and the only contract is that it will return in 24 hours.
Evan Conrad
Sure.
Alessio
Right.
Zwicks
And I was like 24 hours is good, but sometimes I want one hour, I want four hours, I want something. And so based off of SF Compute's system, you can actually kind of create that kind of guarantee. That would be like not 24, but within eight hours. Within four hours, like the work half of a workday, I can return your results to you. And if your latency requirements are like that low, actually that's fine.
Evan Conrad
You can carve that out. You can financially engineer that on SFC.
Zwicks
Yeah, I mean, I think to me that unlocks a lot of agent use cases that I want, which is like, yeah, work in the background, but I don't want you to take a day. Take a couple hours or something. This touches a lot of my background because I used to be a derivatives trader. And this is a forward market, a futures forward market, whatever you call it.
Evan Conrad
Not a future very explicitly, not yet a futures.
Zwicks
Yes, we can talk about that one, but I don't know if you have any other points to talk about. So you recognize that you are a marketplace and you've hired. I met Alex Epstein at your launch event and you're building out the financialization of GPUs. Part of that's legal. Part of that is like listing on an exchange. Maybe you're the exchange. I don't know how that works, but just talk to me about that. From the legal, the standardization. Where is this all headed? Is this a full listed on the Chicago Mercantile Exchange or whatever?
Evan Conrad
What we're trying to do is create an underlying spot market that gives you an index price that you can use. And then with that index price you can create a cash settled future. And with a cash settled future you can go back to the data centers and you can say, lock in your price now and de-risk your entire position, which lets you get cheaper cost of capital and so on. And that, we think, will improve the entire industry, because the marginal cost of compute is the risk, as shown by that graph and basically every part of this conversation. It's risk that causes the price to be all sorts of funky. And we think a future is the correct solution to this. So that's the eventual goal. Right now you have to make the underlying spot market in order to make this occur. And then to make the spot market work, you actually have to solve a lot of technology problems. You really cannot make a spot market work if you don't run the clusters, if you don't have control over them, if you don't know how to audit them. Because these are supercomputers, not soybeans. They have to work in a way that, like, it's just a lot simpler to deliver a soybean than it is to deliver it.
Zwicks
I know, talk to the soybean guys. Sure, you know.
Evan Conrad
Yeah, but you have to have a delivery mechanism, your delivery mechanism. Like somebody somewhere has to actually get the compute at some point and it actually has to work and it is really complicated. And so that is the other part of our business that we go and we build a bare metal infrastructure stack that goes. And then also we do auditing of all the clusters. You sort of de risk the technical perspective and that allows you to eventually de risk the financial perspective. And that is kind of the pitch of SF Compute.
Zwicks
Yeah. I'll double click on the auditing of the clusters. This is something I've had conversations with Yi Tay on. He started Reka. And I think he had a blog post which kind of shone the light a little bit on how unreliable some clusters are versus others.
Evan Conrad
Correct. Yeah.
Zwicks
And sometimes you kind of have to season them and age them a little bit to find the bad cards.
Evan Conrad
You have to burn them in. Yep.
Zwicks
So what do you do to audit them?
Evan Conrad
There's like a burn-in process, a suite of tests, and then active checking and passive checking. The burn-in process is where you typically run Linpack. Linpack is this thing, like a bunch of linear algebra equations, that you're stress testing with.
Zwicks
Do you use a proprietary thing that you wrote?
Evan Conrad
No, no. Linpack is like the most common form of burn in. If you just type in burn in. Typically when people say burn in, they literally just mean Linpack. There's like an Nvidia reference version of this.
Zwicks
Again, Nvidia could run this before they ship, but now the customers have to do it. It's annoying.
Evan Conrad
You're not just checking the GPU itself, you're checking the whole component, all the hardware. It's an integration test, yeah. So what you're doing when you're running Linpack, or burn-in in general, is you're stress testing the GPUs for some period of time, 48 hours, for example, maybe seven days or so on, and you're just trying to kill all the dead GPUs or any components in the system that are broken. And we've had experiences where we ran Linpack on a cluster and it, you know, sort of comes offline when you run Linpack. This is a pretty good sign that maybe there is a problem with this cluster. And so Linpack is the most common sort of standard test. But then beyond that, we have a series of performance tests that replicate a much more realistic environment that we run as well. If Linpack works at all, then you run the next set of tests. And then while the GPUs are in operation, you're also going through and doing active tests and passive tests. Passive tests are things that are running in the background while some other workload is running. And active tests run during idle periods: you're running some sort of check that would otherwise interrupt something. The active tests will take something offline, basically, or a passive check might mark it to get taken offline later, and so on. And then the thing that we are working on, that we have working partially but not entirely, is automated refunds. Basically, it is the case that the hardware breaks so much and there's only so much that we can do. And it affects pretty much the entire industry.
So a pretty common thing that I think happens to kind of everybody in the space is a customer comes online, they experience your cluster and your cluster has the same problem that like any cluster has, or it's, I mean, a different problem every time, but they experience one of the problems of HPC and then their experience is bad and you have to like negotiate a refund or some other thing like this.
Zwicks
It's always case by case. And like, yeah, a lot of people just eat the cost.
Evan Conrad
Correct. So one of the nice things about a market that we can do as we get bigger and have been doing as we get bigger is we can immediately give you something else and then also we can automatically refund you and you're still going to experience it. Like the hardware problems aren't going away until the underlying vendors fix things. But honestly, I don't think that's likely because you're always pushing the limits of hpc. This is the case of trying to build a supercomputer. But that's one of the nice things that we can do is we can switch you out for somebody else somewhere and then automatically refund to you or prorate or whatever the correct move is.
Zwicks
Yeah. One of the things that Yi Tay said in this conversation with me was, you know a provider is good when they guarantee automatic refunds.
Evan Conrad
Yep.
Zwicks
Which doesn't happen.
Evan Conrad
But yeah, that's in our contract with all the underlying cloud providers.
Zwicks
You built it in already?
Evan Conrad
Yeah. So we have a quite strict SLA that we pass on to you. The reason why I'm like hedging on this is because we have some amount of active checks, we have some amount of passive checks. There are always new genres of bullshit. And the new genres of bullshit might cause a customer to have a bad experience. And the active or passive checks didn't catch it. And so then it's a manual process after that. Then we have like a literal thing in our website that you can just say, hey, some hardware problem, please tell us. And then we will go and resolve it for you.
Zwicks
I mean, cards don't change generation to generation. What is a new genre of bullshit?
Evan Conrad
If every component piece in the cluster has maybe like a one in 100 chance of failing, or maybe a one in a thousand chance of failing, or maybe a 1 in 10,000 chance of.
Zwicks
Failing, you discover them.
Evan Conrad
You discover them. So there's ones that maybe nobody saw, maybe you didn't see, or that maybe only matter for this one cluster with this motherboard in this particular data center or something. There's new interactions that otherwise don't happen. Most problems are really common and you can adapt to them. Like, a GPU falling off the bus is one of the most common things that can happen.
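To make the per-component failure odds above concrete, here is a small sketch of how rare failures add up across a cluster. It assumes failures are independent and identically likely, which real hardware will not strictly satisfy, and the component counts are hypothetical.

```python
def p_any_failure(p_component, n_components):
    """Probability that at least one of n components fails.

    Assumption: each component fails independently with probability
    p_component, so P(at least one) = 1 - (1 - p) ** n.
    """
    return 1.0 - (1.0 - p_component) ** n_components


# Hypothetical cluster with 1,000 parts, at the three failure rates
# mentioned in the conversation.
for p in (1 / 100, 1 / 1_000, 1 / 10_000):
    print(f"p = {p:.4%}: {p_any_failure(p, 1_000):.1%} chance of some failure")
```

Even a one in a thousand part, repeated a thousand times, gives roughly a 63% chance that something in the cluster is broken, which is why new failure modes keep showing up at scale.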
Zwicks
So it's not SF Compute's job to go fix those things.
Evan Conrad
No, it totally is, to some extent. So we operate the cluster. Unlike a reseller, which is what we were doing before, in almost all cases we have BMC access. So on your laptop there's the button in the top right hand corner that you can hold down to re-image the machine. There's a similar thing in servers that is like this other box that kind of plugs in, and it basically lets you reset the machine from outside. It's a remote hands sort of thing. So we ask for this and we get this from a lot of our vendors, which means we have quite a lot of ability to solve problems for customers in a way that you might not actually get from a reseller. Oftentimes we are the ones debugging your cluster. For most customers that we work with, we have a Slack channel. Our entire engineering team gets put in the Slack channel. If there is a problem at 2am, we are the ones who are debugging your problem at 2am. That's not always the case, because we don't physically run the hardware cluster or the data center itself. But most problems are solvable through this.
Zwicks
So that's the auditing side. The other side is I think, of a standardization or whatever you call it beyond auditing. The other part of the work is kind of standardizing the commodity contract.
Evan Conrad
Yeah, so there's two ways that we do that. One is that you set a "this or better" list. So you set a spec list. A common variability is the amount of storage on the cluster, and so you'll say, oh, you're going to get X or better. And there's some guaranteed minimum, and sometimes you might get more. And then we're working on a persistent storage layer that might sort of abstract a lot of this away, but mostly it's that. And then there's a whitelist of motherboards and various genres of things. But the other part is we run the clusters from bare metal up, and so we make this thing that's a UEFI shim. And if you're not familiar with what UEFI is, UEFI is the modern firmware version of BIOS. Modern meaning it's been around for, like, forever. But, you know, BIOS is really old. It's this whole IBM thing. And you can write code that exists at the UEFI layer. And again, when you hear UEFI, you should think BIOS. And it does the same sort of thing as a PXE boot, but in environments in which PXE boot doesn't necessarily always work for us. So it basically sits at your BIOS, downloads an image, boots into an image that's custom for the user, and then on top of that image we can throw Kubernetes on it, we can throw VMs on it, or whatever you want. And at some point we'll probably do more stuff with that, but that's functionally what we can do. The nice thing, though, is that because you control from that layer, you can easily image the entire cluster. You can make it all the same. You can run your performance tests all automated. So much nicer than what we used to do.
Zwicks
Yeah, I mean, that is very important work. I think for me as a trader, I need standard contracts, and so there basically needs to be the SAFE of a GPU.
Evan Conrad
Yes. What we functionally do is we have a market under the hood that is focused on the buyer and the seller, and it's optimized for them. And then beyond that, for a trader, you can standardize around a certain segment of it and you can trade on that contract. That's the goal that we're trying to get to. But you start by making something that works really well for buyers and really well for sellers.
Zwicks
For those who are not familiar with derivatives markets, I can go ahead and say this: the point of being cash settled, which is something that you mentioned and which I think people might miss, is that you don't have to take physical delivery of the GPUs, so it's a pure financial instrument. That actually does mean that, almost for certain, there will be more volume on SFC's marketplace than GPUs that actually change hands.
Evan Conrad
To be super clear, we are not a derivatives market. We may in the future work to create a cash-settled future. We are not currently a derivatives market. We are an online spot market.
Zwicks
I just think like people, normies get really upset when, when they're like, then they learn things like oh, like derivatives on mortgages are like 12 times larger than the mortgages themselves.
Evan Conrad
Yes. Yeah, no, A common thing that people have talked to us about or like a fear or concern I think people have is like oh, you're financializing compute and this will like cause various problems of sorts.
Zwicks
Subprime crisis.
Evan Conrad
Yeah. So first, I think part of this is just because crypto caused a lot of people to think about finance in a very degen way, for lack of the right word. And before that, the 2008, 2009 crisis caused people to think about it also in a sort of degen-y way. This is very much not our mindset. The reason to create a derivative at all, or the reason to create a future at all, is a risk reduction thing. That's what futures do. The reason why a farmer wants a future is because they have no idea what the weather is going to do, they have small margins, and if things go wrong they really, really want to have a locked-in price. That way they can continue to exist for the next year. Data centers are the same way. The way that they solve it today is you go out and you sign long-term contracts with your customers. What that does for you is it means your business is de-risked. You don't have to worry about the revenue for the next year, but that means that the customer now has to worry about what they're going to do with all this compute, and whether they use it optimally, and so on. That just pushes everything onto the startups, who then in turn push it onto VCs. And so what the VCs are forced to do in order to invest in AI is they have to go and write big giant valuations, pre-revenue, at ridiculous multiples. So what you've done by not having a future is you've inflated the venture capital market. And that is a bubble that's totally going to pop at some point. A lot of the companies are not going to work and the valuations are not going to work. And what's going to happen is a lot of these funds aren't going to return back to their LPs, and that affects the broader market. The way that you solve that, the way that you add security to the entire economic system in this chain, is you add a future. That's how we did it in lots of other markets.
It doesn't have to be this, like, oh my gosh, we're going to speculate on GPU prices and, like, whatever. No, the whole point of SF Compute is to reduce the risk, reduce the technical risk, reduce the financial risk. Let's just chill out a little bit. There's so much other random shit. It's supercomputers, there's AGI, whatever. No, let's just chill the fuck out.
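[Editor's sketch] The risk-reduction argument above can be made concrete with a little arithmetic. This is a generic illustration of how a short cash-settled future locks in a seller's revenue, not a description of any product SF Compute offers (Evan states they are not currently a derivatives market). The function name, prices, and volumes are all hypothetical.

```python
def hedged_revenue(spot_price: float, locked_price: float, gpu_hours: float) -> float:
    """Revenue for a seller who sells compute at spot and is short a cash-settled future.

    The seller sells gpu_hours at whatever the spot price turns out to be, and
    separately holds a short future struck at locked_price. Cash settlement pays
    (locked_price - spot_price) per GPU-hour: positive when spot falls below the
    locked price, negative when spot rises above it.
    """
    spot_revenue = spot_price * gpu_hours
    futures_payoff = (locked_price - spot_price) * gpu_hours
    return spot_revenue + futures_payoff


# Hypothetical numbers: 1,000 GPU-hours hedged at $2.00 per GPU-hour.
for spot in (1.0, 2.0, 3.0):
    total = hedged_revenue(spot_price=spot, locked_price=2.0, gpu_hours=1000)
    print(f"spot=${spot:.2f} -> hedged revenue=${total:.2f}")
```

Whatever the spot price does, the two terms sum to `locked_price * gpu_hours`, which is the farmer's "locked-in price" from the passage above. The seller gives up the upside in exchange for not carrying the downside, without forcing a customer into a long-term contract.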
Zwicks
I mean, also, like, Dan is raising at a $30 billion valuation for Ilya.
Evan Conrad
If everybody else in all of AI is pushing the hype and the extreme. Everything we've been trying to do is go the other way. Whole website is just like a fucking single page. The entire brand is just like, what if we were calm in nature? And then everything that we do as the product is just calm. What if we. What if we were the opposite force of the big hype extreme thing? What if we just, like, chilled things out? And part of that was because we in the beginning were at the whim of the hypey nature. Like, our entire origin is every 30 days. If we don't sell out, we're going to go crazy and just completely bankrupt the company. And so everybody in the company is just like, what if we just chilled out? What if we stopped for a bit?
Zwicks
This is the first time I've ever heard derivatives are the way to chill out.
Alessio
Yes.
Evan Conrad
No. Futures are the way to chill out. Futures are the way to chill out the entire industry. And we wouldn't be doing this if it wasn't that case.
Alessio
I like that you have a very nice brand with a, you know, website.
Zwicks
We have to ask about the website.
Evan Conrad
Yeah.
Alessio
What was the inspiration behind it? Why did you not go the black neon, more cool thing and go the more nature?
Evan Conrad
I don't think I really am a black neon sort of person. I say that wearing black pants, and I thought I was wearing a black shirt, but apparently I'm not. So the actual thing was, a lot of companies do this thing where you go to their website and it's like a magical experience. Everything is extreme and amazing and incredible, and then you go to the product and it's some SaaS app or something, and it's not actually that exciting. That expectation of being really, really good, and then the drop-off of not being really, really good, was something that from a product perspective I never wanted to happen. Especially because in the beginning our product was really bad. And so I don't want to set the expectation that it's going to be an amazing experience. I want to set the expectation that it's going to be a good price for short-term bursts. So what we did instead is we set the thing to be really low. You set your expectations really low, and then you get a supercomputer for millions of dollars cheaper than you would have otherwise gotten your supercomputer. And so you have the opposite expectation: you have really low expectations that are met or exceeded. I think that's the correct way to do things. But also I think we were just so sick of hype and excitement that we really wanted to not do that.
Zwicks
It's weird. Like by, by being anti hype, you have created hype. Like I would say, like the vibes are immaculate. You know, like you just, you go to like at the bait, the cow trade, you just put up like a banner that's just, just says SF Compute. True.
Evan Conrad
That banner was created about five minutes before we had to actually put something up. Like before the deadline was there.
Zwicks
It opens up Microsoft Word and you did some serif. What is the font?
Evan Conrad
Exactly?
Zwicks
I don't know.
Evan Conrad
Yeah, that was indeed the. Yeah. I think every time we tried to do. The only caveat to this, the only time we ever violate this rule, is when we're pitching San Francisco. I think San Francisco is amazing. So sometimes you will see these advertisements for the city. Yeah, the city. So there's a part of San Francisco Compute's brand which is these beautiful images of SF or various SF things. And I am the complete opposite about this. I am such a San Francisco promoter that anytime we talk about the city, I want to show the city through the eyes that we have, which is mostly just a gorgeous, beautiful area with nature. A lot of people think about San Francisco and they think about the tech industry, or they think. Yeah. Or the Tenderloin or something, grind culture or something. And no, I think about the fog and just the gorgeous view over the bridge and the fact that there is this massive amount of optimism in the city, and the backdrop of that optimism is the most beautiful countryside in all of the world. And so anytime we talk about SF, you will see it. Or we have a billboard somewhere that's just "local friendly supercomputer," whatever, and then the backdrop is beautiful and amazing. And that's because to some extent we're pitching the city and the people here. And I think the people in the city here are actually really amazing. And so you get to earn the brand because the expectations are met. Whereas on our own product I typically want it to be better, and so I set the brand a lot lower. The expectations are higher and you still meet them, but you set them a little lower.
Zwicks
I know. Are you the designer? I know you have an artistic side.
Evan Conrad
So I was in the beginning. So I'm like a figurative artist. So I draw people. But we've worked with a design firm. Airfoil was really excellent with us. And then nowadays though, John Pham had.
Zwicks
A design from Vercel.
Evan Conrad
Yep. John is unbelievably amazing. I think the amount of care and craft and attention to detail that he puts into just everything is so cool. If you go on our buy page right now, you go to sfcompute.com/buy, there is an Easter egg there that you should find. I almost don't want to spoil it, but you should go find that Easter egg. If you just hover the mouse around the thing in the top right-hand corner, you'll find it.
Zwicks
Tweet at Evan if you find it.
Evan Conrad
And then the other person is Ethan Anderson, our COO, who has this RISD design background, and so he used to be sort of industrial designery. I'm probably going to say that wrong. He's probably not an actual industrial designer, but design background, same. So I think between me and John and Ethan, I think we're the source of the vibes.
Zwicks
I had to ask. Yeah, okay, so we're going to zoom out a little bit. One of the last things I wanted to ask you was actually like, I remember, I think the first time that you was in like kind of celo and you were working on your email.
Evan Conrad
Oh yeah, yeah.
Zwicks
And I have a favorite pet topic of mine. We were here with Dharmesh yesterday talking about someone build an agent that reads my emails.
Evan Conrad
Yeah.
Zwicks
And you did. And I think I actually paid for the first one. You, you were so excited in the early GPT three days. I was like. You were like, I'm building the most expensive startup ever.
Evan Conrad
Yeah, it's so expensive.
Zwicks
Anyway, so the point being, what I'm trying to get to is you are a very smart guy. You built email, you didn't like it. You pivoted away. I've seen other. Like, every year there's someone who is like, I will crack email and I'll.
Evan Conrad
And then.
Zwicks
And then they give up.
Evan Conrad
Yeah.
Zwicks
What is so hard about email?
Evan Conrad
I didn't pivot away because the product or the idea was bad. I pivoted away because I was super burnt out. I did a startup for, like, four years and the first thing didn't work out.
Zwicks
This is Room Service.
Evan Conrad
Yeah, this is Room Service. So my startup before this originally started as Quirk, which was like a mental health app. But then Quirk had the same problem that basically every mental health app has, which is that your retention goes to zero if it works in any capacity. And so I switched and said, okay, well, I will do something that's closer to my actual background, which is a distributed systems company called Room Service. Room Service went for about nine months and then sort of had the same problem that I think every other competitor of Room Service has, which is mostly people building in house. And so then I went back to our investors at the time, which was Nat and Daniel. And specifically Daniel told me that I should go stare at the ocean and I will find something else to do, and just throw shit at the wall. And then I think it was Gustav at YC, maybe. It was probably actually Dalton Caldwell. Dalton Caldwell just said, don't die. You can just keep doing things and don't die. And so I think I just got it in my head that you should keep trying things and not die. And I really, really, really did not want to die and didn't really know what to do. And so I just threw out like 40 products with the assumption that if you just keep trying things, you won't die. This is actually not the most ideal thing to do. You actually should totally just pick a thing and go with it. But my brain wasn't set on, oh, I should do this particular thing. It was set on not dying. And so I just kept going for a very long time, for like four years. And by the end of it, I think I was just super burnt out. And I was going to do the email thing with one co-founder and then they quit. And then I was going to do an email thing with another co-founder, and then they fell in love and decided to go get married.
Zwicks
And, you know, okay, so it wasn't that email is intractable. I'm just trying to figure out, like, look, is there something bad? Like, is this the graveyard of ideas? Right? Everyone wants to do email and then nobody does because Something I think it's.
Evan Conrad
Just hard to make an email client. I think it's hard to make an email client that is. It's a competitive space in which there are lots of things. I do think that the better version of that is something that looks closer to what Intercom is doing. And Intercom obviously existed beforehand. So with any product, you can ask: should you be doing it, or should somebody else in the industry who already has the existing customer set do it? And I think Intercom has pretty much very successfully done that. They already had the position to do it. What do you actually need the AI to write your emails for? Most people don't need this. But who does need this is support. The use case is pretty much there, and the people who are best able to execute on this are totally Intercom. So props to Eoghan. I think that was completely the correct move.
Alessio
Call to action.
Zwicks
Yes. You're hiring?
Alessio
Yeah.
Evan Conrad
Oh yeah, we are. We are hiring for two roles as of this recording. I don't know, maybe this will change and we'll be hiring for different roles. So go to the website or whatever. But the first role is for traditional systems engineering. This is low-level systems or low-level Linux-y people. Rust. Yeah. So all Rust. Almost all of our code base is in Rust. But we're not necessarily just looking for Rust engineers. We're specifically looking for Linux-y people. The pitch is you get to work on supercomputers, you get to work at one of the few places in supercomputers that I think has a pretty good business model and is a working thing. And people generally seem to think that our vibe at SF Compute is very nice. We have just an unbelievably excellent team, I think, nowadays. Our CTO is Eric Park. He's the co-founder of Voltage Park, which is one of the other GPU clouds. And he is quite possibly the sweetest man I've ever met. He is extremely chill and also just extremely earnest and kind. And the rest of the team kind of feels that energy very strongly. And then the other role that we're hiring for is financial systems engineering, which, it's not really systems engineering, and we should really find a better name for this role. It's basically a fintech engineer, in that we have the same problems as traditional fintech does. We have a ledger, we have reporting requirements and all that stuff. This role is responsible for the "not lose all the money" goal. We've got a whole bunch of money flowing through us. There is a bunch of stuff that you need to do in order to not lose all that money. And then the actual outcome of that work, besides not losing all the money, which is very important, is that you end up with better prices for the vendors and better prices for the buyers.
And this means that your grad student, who is making the cancer cure or whatever and needs to be able to buy a hundred K of compute to scale up really big, actually can do so. I think this is part of the reason to work at SFC: the things you do actually matter in a way that doesn't necessarily always happen at all companies. Functionally, we run supercomputers, not soybeans or, I don't know. It's a very cool place to work because the outcomes of what you do have real-deal impact in a way that you don't always get when you're doing SaaS.
Zwicks
Excellent pitch. I bet you've done that a lot. But it's nice to hear for the first time. I was going to say, have you looked into TigerBeetle, the dual-entry accounting database?
Evan Conrad
We have.
Zwicks
That seems to be the thing. If you want to make systems that don't lose money.
Evan Conrad
Yes. Systems that don't lose money. There are lots of other things you have to do. Like you have to make things in a format that your accountants can read and then get audited and so on. It's not purely just the. Yeah, it's not purely just the tech.
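[Editor's sketch] For readers unfamiliar with the idea behind a ledger that does not "lose all the money," here is a minimal double-entry sketch. It is an assumption-laden toy: the `Ledger` class, the account names, and the amounts are hypothetical, and it is neither TigerBeetle's API nor SF Compute's system. As Evan notes, the real work also includes accountant-readable formats and audits, which this does not attempt.

```python
from collections import defaultdict
from decimal import Decimal


class Ledger:
    """Toy double-entry ledger: every transfer is two equal and opposite postings."""

    def __init__(self) -> None:
        self.balances = defaultdict(Decimal)  # account name -> balance, starts at 0
        self.entries = []                     # append-only history of transfers

    def transfer(self, from_account: str, to_account: str, amount: str) -> None:
        value = Decimal(amount)  # Decimal avoids binary floating-point rounding
        if value <= 0:
            raise ValueError("transfer amount must be positive")
        # The two postings below always net to zero, so money is never
        # created or destroyed by a transfer.
        self.balances[from_account] -= value
        self.balances[to_account] += value
        self.entries.append((from_account, to_account, value))

    def is_balanced(self) -> bool:
        """Invariant check: all account balances must sum to exactly zero."""
        return sum(self.balances.values()) == 0


# Hypothetical flow: a buyer pays into escrow, which pays the vendor and a fee.
ledger = Ledger()
ledger.transfer("buyer", "escrow", "100.00")
ledger.transfer("escrow", "vendor", "95.00")
ledger.transfer("escrow", "fees", "5.00")
```

After these three transfers the escrow account is back at zero and `ledger.is_balanced()` holds. The design point is that the invariant is checkable at any time, so a bug that drops or duplicates money shows up as an imbalance.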
Zwicks
Cool. Awesome. Thank you so much.
Evan Conrad
Of course. Thank you so much for having me.
Date: April 11, 2025
Hosts: Alessio (CTO at Decibel) & Zwicks (Founder of Small AI)
Guest: Evan Conrad (Founder of SF Compute)
This episode delves into the shifting economics, business models, and technical realities of the GPU infrastructure market, focusing on how SF Compute offers a new paradigm for the acquisition and utilization of high-end compute. The discussion ranges from the state of the "GPU bubble," the rise of compute as a commodity, and the challenges for traditional cloud providers to SF Compute's path and pivots, liquidity and market design, and the future financialization (futures, exchanges) of compute. Listeners get a transparent, sometimes technical account of how the market for AI hardware is evolving, and why the right business model matters for builders, startups, hyperscalers, and research.
On traditional software margins in GPU:
“There isn’t a billion dollars of software that you can realistically make... and if you do, you’re going to look like SAP.”
– Evan, 16:31
On risk and margin:
“If you combine [hardware ownership and software services], and that's what the market does, you get shot in the head. But if you split them... you can make lots of money.”
– Evan, 18:00
On the goal of futures in compute:
“The reason to create a derivative at all... is a risk reduction thing. That's what futures do... The whole point of SF Compute is to reduce the risk, reduce the technical risk, reduce the financial risk. Let's just chill out a little bit.”
– Evan, 58:13
On the anti-hype vibe:
“By being anti-hype, you have created hype... the vibes are immaculate.”
– Zwicks, 62:46
This episode is a deep dive into what it really takes—not just technically but economically and culturally—to turn AI compute into a liquid, accessible, and commoditized market. From the rise and fall of previous GPU cloud models to the emergence of spot and future markets and the war stories of pivots, SF Compute is at the center of a new paradigm. The episode is a must-listen for anyone building or investing in AI infra, or just interested in how financialization trends are about to transform the foundation of machine learning itself.
Further reading, show notes, and Easter egg hints at:
latent.space