Building PostgreSQL for the Future with Heikki Linnakangas - Software Engineering Daily


Podcast Summary: Building PostgreSQL for the Future with Heikki Linnakangas

Episode: Building PostgreSQL for the Future with Heikki Linnakangas
Host: Kevin Ball
Release Date: May 20, 2025
Podcast: Software Engineering Daily

1. Introduction

The episode kicks off with an overview of PostgreSQL's reputation as a robust, extensible, and SQL-compliant open-source database. Heikki Linnakangas, a prominent PostgreSQL developer and co-founder of Neon, joins host Kevin Ball to delve deep into PostgreSQL's enduring popularity, its extensibility, the role of extensions like PGvector, and the innovative serverless platform Neon.

2. PostgreSQL's Popularity and Longevity

Heikki Linnakangas reflects on PostgreSQL's journey, noting its transformation from a lesser-known database in the early 2000s to the default choice for many developers today.

"It's become the default. Which is funny, it feels strange to me because it used to not be that way, but nowadays people just take it for granted that people are using postgres."
[02:17]

He attributes PostgreSQL's longevity to its reputation for stability, a good feature set, and a disciplined release management process that has delivered predictable annual releases for many years. The large ecosystem of tools and the ease of finding answers online (for almost any error message there are blog posts explaining the fix) reinforce its position: a snowball effect in which popularity itself becomes an advantage.

"It has a big ecosystem of all kinds of tools... a lot of people are like, there's a big ecosystem and if you have a problem with Postgres, you can Google that easily."
[04:25]

3. Extensibility in PostgreSQL: Core vs. Extensions

A standout feature of PostgreSQL is its extensibility. Heikki traces this back to Postgres's origins as a university project some 30 years ago, where extensibility, especially of data types, was already a core idea.

"Postgres has a very flexible type system. You can create your own data type with its own functions, and not just functions, but also operators."
[04:55]

Heikki emphasizes that PostgreSQL's ability to handle custom data types, operators, and indexing systems enables a wide range of applications, from geographic information systems with PostGIS to AI-driven applications using PGvector.

"Postgres goes a lot deeper in that. Like even all of the built in data types are created using the same primitives... you can build your own indexing system for that."
[04:55]
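The color data type Heikki mentions in the episode can be approximated in a few lines. The following is only a Python analogy, not PostgreSQL's extension API: the `Color` class, the brightness-based ordering, and the `nearest` helper are all hypothetical names chosen for illustration. It shows the two things a custom type author gets to decide, namely what "less than" means and what "distance" means.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Color:
    """Hypothetical RGB color type with a user-defined ordering and distance."""
    r: int
    g: int
    b: int

    def brightness(self) -> float:
        # Assumed ordering key: weighted perceived brightness (Rec. 601 weights).
        return 0.299 * self.r + 0.587 * self.g + 0.114 * self.b

    def __lt__(self, other: "Color") -> bool:
        # The "operator" for sorting: darker colors sort first.
        return self.brightness() < other.brightness()

    def distance(self, other: "Color") -> float:
        # The "distance function": Euclidean distance in RGB space.
        return math.dist((self.r, self.g, self.b), (other.r, other.g, other.b))


def nearest(target: Color, palette: list[Color]) -> Color:
    """Linear scan standing in for what a distance-aware index would speed up."""
    return min(palette, key=target.distance)


YELLOW = Color(255, 255, 0)
BLUE = Color(0, 0, 255)
RED = Color(255, 0, 0)

print(YELLOW > BLUE)                      # True under the brightness ordering
print(sorted([YELLOW, BLUE, RED]))        # darkest first
print(nearest(Color(250, 250, 10), [YELLOW, BLUE, RED]))
```

In PostgreSQL itself the same decisions are expressed through operators and operator classes, which is what lets an index use them.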

The distinction between what resides in the core database and what exists as an extension is crucial. Heikki advocates for keeping the core lean, allowing extensions to innovate independently without burdening the main PostgreSQL codebase.

"Extensions have a lot of advantages. Like you can have your own release schedule, you get to decide your own releases, you get to decide your own other external dependencies."
[08:13]

4. PGvector and AI Applications

The conversation shifts to PGvector, a popular PostgreSQL extension tailored for handling vector data essential in AI applications.

"PGvector uses the same algorithm as many other vector databases, HNSW and IVFFlat."
[11:14]

Heikki discusses what makes vector workloads different from traditional data types: the vectors themselves are large, building vector indexes is slow and CPU-intensive compared with B-tree indexes, and there are no good disk-based algorithms, so the working set effectively has to fit in memory. Much current research therefore focuses on lossy compression, in some cases down to one bit per dimension.

"What I have done, when I started to look at PGvector, I started to look at it from the point of view of I know indexes... it's very different from traditional data types."
[13:15]

A notable point is the approximate nature of vector searches, which contrasts with PostgreSQL's foundational goal of exactness in query results.

"Whenever you do a search on a vector index, it's always like the thing it does... they are always approximate."
[13:28]

This introduces complexities when integrating approximate results within traditional SQL queries, posing challenges in maintaining SQL's guarantee of exact results.
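A toy model makes the problem concrete. The sketch below is not pgvector's implementation; it is a minimal inverted-list search in the spirit of IVFFlat, with made-up rows, made-up centroids, and a hypothetical `probes` parameter. Because the approximate search only looks at the list nearest to the query, it misses a true neighbor stored in another list, and a filter applied afterwards returns a different answer than the exact query would.

```python
import math

# Hypothetical table rows: (id, vector, category).
ROWS = [
    (1, (1.0, 0.0), "a"),
    (2, (2.0, 0.0), "b"),
    (3, (4.9, 0.0), "a"),
    (4, (5.2, 0.0), "a"),
    (5, (9.0, 0.0), "b"),
    (6, (0.0, 3.0), "b"),
]
# Assumed, hand-picked cluster centers; a real index would learn these.
CENTROIDS = [(0.0, 0.0), (10.0, 0.0)]


def exact_top_k(query, k):
    """Exact search: rank every row by true distance."""
    return sorted(ROWS, key=lambda row: math.dist(row[1], query))[:k]


def build_lists():
    """Assign each row to the list of its nearest centroid."""
    lists = {i: [] for i in range(len(CENTROIDS))}
    for row in ROWS:
        nearest = min(range(len(CENTROIDS)),
                      key=lambda i: math.dist(CENTROIDS[i], row[1]))
        lists[nearest].append(row)
    return lists


def approx_top_k(query, k, lists, probes=1):
    """Approximate search: only scan the `probes` lists closest to the query."""
    order = sorted(range(len(CENTROIDS)),
                   key=lambda i: math.dist(CENTROIDS[i], query))
    candidates = [row for i in order[:probes] for row in lists[i]]
    return sorted(candidates, key=lambda row: math.dist(row[1], query))[:k]


def ids_in_category(rows, category):
    """The 'rest of the query': a filter applied after the vector search."""
    return [row[0] for row in rows if row[2] == category]


QUERY = (5.1, 0.0)
LISTS = build_lists()
print(ids_in_category(exact_top_k(QUERY, 2), "a"))          # both true neighbors
print(ids_in_category(approx_top_k(QUERY, 2, LISTS), "a"))  # one neighbor missed
```

Scanning more lists (`probes=2` here) recovers the exact answer at the cost of more work, which is the usual recall-versus-speed trade-off in approximate search.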

5. Neon: Building a Serverless PostgreSQL Platform

Heikki introduces Neon, a serverless platform for PostgreSQL designed to separate compute and storage layers, inspired by Amazon Aurora's architecture but fully open-source.

"The core is the separation of compute and storage. So the idea is that there is a separate storage system which keeps all of the history."
[17:07]

Key features of Neon include:

Separation of Compute and Storage: Allows independent scaling and efficient resource utilization.
Point-in-Time Recovery: Enabled by maintaining a comprehensive history of the write-ahead log (WAL), replacing traditional backup methods.
Serverless Architecture: Achieved by decoupling persistent storage from compute, enabling rapid provisioning and scalability.

Heikki elaborates on how Neon uses a unique storage layer that processes PostgreSQL's WAL to reconstruct any version of a data page, facilitating features like read replicas and point-in-time queries without duplicating data.

"We can just tell it to please show me all the data at this older point in time... the storage layer can do that."
[22:22]
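The idea of rebuilding a page "as of" an older point in time from the write-ahead log can be sketched as follows. This is a deliberately simplified model, not Neon's storage format: the `WalRecord` fields, the `PageHistoryStore` class, and the key/value page contents are hypothetical. The store keeps every record and answers a read by replaying the records for one page up to a requested log sequence number (LSN).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, simplified WAL record: set one key on one page."""
    lsn: int       # log sequence number, i.e. position in the log
    page_id: int
    key: str
    value: str


class PageHistoryStore:
    """Keeps the full WAL history and reconstructs pages on demand."""

    def __init__(self):
        self._records = []  # appended in increasing LSN order

    def append(self, record: WalRecord) -> None:
        if self._records and record.lsn <= self._records[-1].lsn:
            raise ValueError("WAL records must arrive in increasing LSN order")
        self._records.append(record)

    def get_page(self, page_id: int, at_lsn: int) -> dict:
        """Replay all records for `page_id` with lsn <= at_lsn."""
        page = {}
        for record in self._records:
            if record.lsn > at_lsn:
                break
            if record.page_id == page_id:
                page[record.key] = record.value
        return page


store = PageHistoryStore()
store.append(WalRecord(lsn=10, page_id=1, key="name", value="alice"))
store.append(WalRecord(lsn=20, page_id=1, key="name", value="bob"))
store.append(WalRecord(lsn=30, page_id=2, key="name", value="carol"))

print(store.get_page(1, at_lsn=15))  # the page as it looked before LSN 20
print(store.get_page(1, at_lsn=25))  # the page after the update at LSN 20
```

A read replica or a point-in-time query in this model is just another reader asking for pages at its own LSN, so no second copy of the data is needed. A production system would also need periodic page images so that replay does not start from the beginning of the log; that is omitted here.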

6. Future of PostgreSQL and Neon

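Heikki shares insights into where core PostgreSQL is heading: asynchronous I/O is expected in version 18, and multi-threading is a longer-term direction that he says the community now agrees is desirable. He stresses, however, that the project has no central roadmap.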

"There's no central roadmap for postgres. So it always depends on what other stuff that people submit patches for."
[09:55]

He envisions PostgreSQL evolving to better accommodate serverless environments, addressing challenges like connection management and memory allocation. The transition to multi-threading is seen as pivotal for enhancing scalability and flexibility.

"I would love to see much more flexibility in Postgres so that it's easier to run in this kind of serverless environment."
[35:18]

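Regarding Neon, Heikki describes how cold start times fell from about five seconds to roughly one second (about 700 milliseconds as measured internally), largely by keeping a pool of pre-warmed virtual machines available. At that point user complaints stopped, though the team still plans to reduce the latency further.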

"We have plans to bring it down further and further. But it's not really a problem anymore."
[20:00]
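The pre-warmed pool technique can be illustrated with a small sketch. It is not Neon's code: the `WarmPool` class, the `boot_vm` callback, and the VM names are hypothetical. The point is only that a request served from the pool skips the slow boot step, while an empty pool falls back to a cold start.

```python
import itertools
from collections import deque


class WarmPool:
    """Keeps `target_size` already-booted VMs ready to hand out."""

    def __init__(self, target_size, boot_vm):
        self._target_size = target_size
        self._boot_vm = boot_vm      # slow operation in a real system
        self._idle = deque()
        self.refill()

    def refill(self):
        """Boot VMs in the background path until the pool is full again."""
        while len(self._idle) < self._target_size:
            self._idle.append(self._boot_vm())

    def acquire(self):
        """Return (vm, was_warm). A warm hit avoids the boot delay."""
        if self._idle:
            return self._idle.popleft(), True
        return self._boot_vm(), False  # pool exhausted: cold start


counter = itertools.count(1)
pool = WarmPool(target_size=2, boot_vm=lambda: f"vm-{next(counter)}")

print(pool.acquire())  # served from the pool
print(pool.acquire())  # served from the pool
print(pool.acquire())  # pool empty, booted on demand
pool.refill()          # top the pool back up for later requests
```

Sizing the pool is the design trade-off: a larger pool absorbs bursts of new connections but keeps more idle machines running.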

7. Open Source Community and Contributions

The strength of PostgreSQL lies in its vibrant open-source community. Heikki encourages developers to contribute, emphasizing that contributions often stem from solving personal challenges or innovating new features.

"A lot of people historically have gotten active with Postgres because they have an itch to scratch."
[40:15]

For those interested in contributing, he suggests starting with writing extensions, which provides a practical avenue to engage with PostgreSQL's extensible architecture.

"Writing your own extension is a good way to get started. We talk about all the extensible type systems and stuff."
[40:42]

Neon's storage layer is also open-source, allowing the community to explore, modify, and deploy it independently, fostering broader innovation.

"The Neon storage layer is all open source... If somebody wanted to take the Neon storage system and run it somewhere else, they could just go and do that."
[31:38]

8. Closing Remarks

In wrapping up, Heikki reiterates his optimism for PostgreSQL's future, likening its enduring presence to Linux's dominance in the open-source ecosystem. He underscores the collaborative spirit of PostgreSQL's community, advocating for ongoing contributions to drive the platform forward.

"Postgres is still alive and kicking. Like there's not... it has some staying power in that, like just being the lowest common denominator between all of the forks and all of the different approaches."
[38:02]

Host Kevin Ball expresses enthusiasm for Neon's capabilities, particularly its streamlined approach to managing data persistence and point-in-time queries, highlighting its potential to revolutionize how developers interact with PostgreSQL.

Key Takeaways:

PostgreSQL's Strengths: Stability, extensibility, and a strong community underpin its lasting popularity.
Extensibility: PostgreSQL's core vs. extensions model fosters innovation while keeping the main codebase lean.
Neon: A pioneering serverless PostgreSQL platform that decouples compute and storage, enabling scalable and efficient database management.
Future Directions: Embracing multi-threading and enhancing serverless capabilities are pivotal for PostgreSQL's evolution.
Community Engagement: Active contributions and the open-source ethos are crucial for PostgreSQL's continued success.

Notable Quotes:

"It's become the default. Which is funny, it feels strange to me because it used to not be that way, but nowadays people just take it for granted that people are using postgres." - Heikki Linnakangas [02:17]
"Postgres has a very flexible type system. You can create your own data type with its own functions, and not just functions, but also operators." - Heikki Linnakangas [04:55]
"Whenever you do a search on a vector index, it's always like the thing it does... they are always approximate." - Heikki Linnakangas [13:28]
"I would love to see much more flexibility in Postgres so that it's easier to run in this kind of serverless environment." - Heikki Linnakangas [35:18]
"Postgres is still alive and kicking... it has some staying power in that, like just being the lowest common denominator between all of the forks and all of the different approaches." - Heikki Linnakangas [38:02]

This episode offers a comprehensive exploration of PostgreSQL's enduring legacy, its flexible architecture, and the innovative strides being made through platforms like Neon. For developers and database enthusiasts, it underscores the importance of community-driven development and the endless possibilities that arise from PostgreSQL's extensible framework.


Transcript

[00:01]
Narrator
PostgreSQL is an open source database known for its robustness, extensibility and compliance with SQL standards. Its ability to handle complex queries and maintain high data integrity has made it a top choice for both startups and large enterprises. Heikki Linnakangas is a leading developer for the PostgreSQL project and he's a co-founder at Neon, which provides a serverless platform for spinning up Postgres databases. In this episode he joins Kevin Ball to talk about why PostgreSQL has become so popular, why he founded Neon, Postgres core versus extensions, the PGvector similarity search for AI applications, and much more. Kevin Ball, or K. Ball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow K. Ball on Twitter or LinkedIn or visit his website, kball.llc.
[01:18]
Kevin Ball
Heikki, welcome to the show.
[01:20]
Heikki Linnakangas
Thank you Kevin.
[01:21]
Kevin Ball
I'm excited to get to talk to you. So let's start a little bit about you. So can you introduce yourself and your background and what got you to the point where we're talking today.
[01:31]
Heikki Linnakangas
So hi, my name is Heikki Linnakangas. I've been working on Postgres for the last 20 years for different companies. Currently I'm a co-founder of Neon, which is a cloud provider for Postgres. Before that I worked for Greenplum, an open source analytics platform built on Postgres, and many other things. But currently I'm working on Neon, on serverless Postgres and cloud.
[01:54]
Kevin Ball
Yeah, I have been very heavily on the Postgres train since probably the mid-2010s. I sort of grew up on MySQL but then like at some point they seemed to really fall behind and postgres kept going. And honestly like these days I feel like there's all these new specialized vector DBs and specialized DBs and I'm like why not just Postgres? It's a high bar.
[02:17]
Heikki Linnakangas
Yeah, I think that's a bit of a meme nowadays. Like just use Postgres. It's kind of funny because I started around 2003, I want to say, and back then it was not a given that people would use Postgres. Like MySQL was way more popular and you kind of had to explain to people first of all what is Postgres? And then next you have to explain why would you use Postgres rather than MySQL or something else? Like proprietary databases were much bigger back then as well. But over the years something changed and you don't have to explain that anymore. It's become the default. Which is funny, it feels strange to me because it used to not be that way, but nowadays people just take it for granted that people are using Postgres.
[02:56]
Kevin Ball
What do you think gave Postgres such longevity, and why has it become the default?
[03:03]
Heikki Linnakangas
That's a great question. I've thought about that myself. To be honest, I don't know, but I can speculate. I think Postgres has done really well: it has a reputation of being very stable and it has a good feature set, but it's been around for a long time and it's stable and predictable. And I think a lot of the credit for that goes to the packagers and the whole release management team, that we've been doing regular yearly releases, annual releases, for a long, long time now. And it's like very predictable, the schedule and what's new, all the upgrade process, all of that. It keeps improving, but it's a very stable and predictable process. At the same time, proprietary databases, well, proprietary software in general has become less popular. People are moving to open source increasingly, and in the open source ecosystem, MySQL has had its own problems with the versions and like there's been confusion in that world. And then other databases never really became that popular for some reason. But then there's of course technical reasons, like Postgres has a big ecosystem of all kinds of tools and a lot of things that kind of come with the fact that it has become popular and the default. Like a lot of people are like, there's a big ecosystem and if you have a problem with Postgres, you can Google that easily. You will find 10 blog posts explaining the same error messages that you are seeing and how do you fix that. So it has reached kind of the point where there's a snowball effect going on, that because it is popular, it has advantages simply because it is popular.
[04:26]
Kevin Ball
Yeah, that momentum effect is there. One of the things that stands out to me actually thinking about it is also extensibility. I feel like Postgres has been able to not just grow in the core, and it has grown in the core. When document stuff started to get big, we said, okay, we've got JSONB, let's go. But the custom types. And I'm kind of curious actually for folks who have never played around with that. Maybe we can talk a little bit about that. Extensibility. What does the extensibility story for Postgres look like?
[04:55]
Heikki Linnakangas
That has actually been, like if you go all the way back to the university project 30 years ago when Postgres was born, that was at the core of Postgres already back then, the idea of extensibility, and especially with the data types. Postgres has a very flexible type system. You can create your own data type with its own functions, and not just functions, but also operators, and how do you integrate into the different index types? And you get to define your own definition of ordering for your data type. And what does it mean to sort it or what does it mean to, you know, you can create your own hash functions. Many other databases have a few basic data types of strings and integers, for example, and everything is either a string or an integer with some sugar on top. But Postgres goes a lot deeper in that. Like even all of the built-in data types are created using the same primitives of operators, operator families, operator classes, and how do other operators work together. And you get to define all of that yourself. I remember I wrote for a presentation once a data type for indexing colors, for example, or to work with different colors. It was a toy. But it's really interesting to see that you can create a data type for colors, for example, and then you get to define your own operators, like what does it mean for yellow to be greater than blue, for example, or whatever operators make sense. And then you get to build your own indexing system for that. You can build, similar to geographic indexes, you can build indexes on which colors are closer to each other and you get to define the distance function and all of that. So yeah, the type system is really very flexible. A lot of the credit goes all the way back to the university times. That was the whole idea of Postgres, even before any of the other stuff, WAL logging, all that, was even created yet. But then we've had these extensions. PostGIS has been really important for that. 
That's been around for a very, very long time. And Postgres has been kind of the default for geographic applications, even longer than it has been for other data for other applications because of PostGIS and the fact that PostGIS was built as an extension and it has a different license, it's GPL licensed, so it was never going to become part of the core because of the license issue, if nothing else. But that has kind of forced a nice dynamic between the communities too. Like we are all friends and we talk to each other. So whenever there has been a need for a new kind of an indexing or some kind of support for indexing geographical types, we've kind of designed the core features for what does it mean to have defined these things? And then on the PostGIS side, they've implemented the implementations of data for the geographic types. But that has played really well. Like another very popular extension nowadays recently, thanks to all the AI stuff, is PGvector. So that's become really popular in the last couple of years suddenly. But like the fact that, you know, why is it possible to create a PGvector extension? Like it all plugs into the same extension points that we have had for PostGIS and other data types and it just plugged in really nicely there.
[07:44]
Kevin Ball
That makes sense. And it is interesting how having that external body be such a big part of Postgres's early growth forced you into that.
[07:54]
Heikki Linnakangas
Yeah, I think that's really, really helpful. And the fact that it was a different license, there was never really any question of should it be part of the core. It was not meant to be. But that has kind of driven the way we think about extensions. And in general it's a good thing to have extensions. And we don't want to have all the different data types in Core. They have a more happy life outside the core as an extension.
[08:14]
Kevin Ball
How do you think about what should live in Core and what should not?
[08:18]
Heikki Linnakangas
That's a great question. People have different opinions. My opinion is that if there's a reason it needs to be part of Core, then it can be part of Core. But for most new things, it's better if you can have it as an extension. Extensions have a lot of advantages. Like you can have your own release schedule, you get to decide your own releases, you get to decide your own other external dependencies. Something like PostGIS, for example, depends on a bunch of other libraries, and we would rather not have core Postgres depend on those libraries anyway. But similarly, if you're building a new data type nowadays, you get to decide which other libraries you use, for example. You don't need to ask for permission from anyone, you just do it. And I think that's really powerful. That allows you to move much faster.
[09:01]
Kevin Ball
Absolutely.
[09:02]
Heikki Linnakangas
There are reasons. Like people do suggest or want to include various stuff in Core and we have to have discussions like why should this be in Core or not? I think that dynamic is shifting a little bit. Like one of the reasons why people want to have extensions in Core or move these things into Core is that they want to have the same level of support; they want it to be an integral part of Postgres for whatever reasons. But as a developer I'm always pushing back on that because that basically sounds like someone is dumping a lot of work on me, for me to maintain all that stuff. And I don't want that. I want to have less things to maintain.
[09:35]
Kevin Ball
Yeah, for sure. So, I mean, I think this balance is always tricky, right. And you've got, as you say, if you're outside of Core, you can move much faster, you can explore things much quicker, you can use a lot more third party dependencies. What types of innovations are still in progress for Core? Where do you think the core of postgres is developing?
[09:56]
Heikki Linnakangas
Well, there's no central roadmap. We know what's coming in version 18, which is released in autumn, September probably: there's going to be asynchronous I/O. That's the big feature I've been keeping an eye on. I haven't done much work on it except some reviewing, but Andres Freund and others from Microsoft have been leading that effort. But that's going to bring big improvements to the I/O characteristics of Postgres. One thing that I've been working on a little bit is multi-threading. I kind of raised the flag on that, like let's get multi-threaded, and that has hit social media and stuff. I haven't done much work on actually making that happen. But I think we are in a place where there's agreement that that's where we want to go, and that's kind of the first step, establishing that it's desirable. That's something we want to happen. And I'm hoping to spend a lot more time on that for the next release actually. But yeah, there's no central roadmap for Postgres. So it always depends on what other stuff that people submit patches for. What are the things that people and companies who are contributing decide to work on.
[10:55]
Kevin Ball
Yeah, that makes sense. Well, and let's talk. You mentioned briefly PGvector and I know that whole space of vector databases and storing embeddings and all of these different semantic search layers and things like that is very much the hot topic today. And I think you worked quite a bit on PGvector, is that correct?
[11:15]
Heikki Linnakangas
Yeah, I have contributed a little bit to that.
[11:17]
Kevin Ball
So what is in PGvector? What does it take to run it and how does it stack up against say for example, a dedicated vector database?
[11:25]
Heikki Linnakangas
I mean, it uses the same algorithm as many other vector databases, HNSW and IVFFlat; those are the two algorithms that PGvector implements. And that's pretty much the state of the art. Like there are some other algorithms with slightly different trade-offs. There are other implementations of the algorithms that can be faster or slower, but it's roughly the same algorithm that everyone is using. So I think PGvector is stacking up okay. It could be a little bit faster, it could use less memory, there's always improvements to be made, but it's doing okay. But the thing with all of these algorithms, in my experience, coming from a traditional database world and kind of, I haven't done much with AI, but what I have done, when I started to look at PGvector, I started to look at it from the point of view of I know indexes, I know how PostGIS works, I know all the other index types. Let's look at this new thing. How different can it be? And it turns out that yeah, vector data is slightly different. Vectors are large, first of all, and building these indexes is very slow compared to traditional B-tree indexes or other indexes. So that was a bit of a shocker, just how expensive and CPU intensive these workloads are. Another thing that is striking is that there are no good disk algorithms for vectors. So all of the algorithms pretty much depend on you having to fit your workload in memory, which puts kind of a cap, an upper limit, on how much data you can deal with. And the big difference between many of these algorithms' implementations is actually in how much you can compress it, like how much lossy compression can you do. Some of them, you can compress the vectors all the way down to one bit per dimension and it works surprisingly well. And there's like all of the new research that is happening is happening on how do you just make the data smaller so that it becomes faster to process and you can fit more of that in RAM. 
So that's very different from traditional data types that I'm used to dealing with.
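The one-bit-per-dimension compression mentioned above can be sketched in a few lines. This is a generic illustration of sign-based binary quantization with Hamming distance, not the code of any particular vector database; the function names and the sample vectors are made up.

```python
def quantize(vector):
    """Keep one bit per dimension (1 if the component is positive), packed into an int."""
    bits = 0
    for component in vector:
        bits = (bits << 1) | (1 if component > 0 else 0)
    return bits


def hamming(a, b):
    """Number of dimensions in which two quantized vectors disagree."""
    return bin(a ^ b).count("1")


# Hypothetical 4-dimensional embeddings.
VECTORS = {
    "a": (0.8, -0.1, 0.5, -0.6),
    "b": (-0.7, 0.3, -0.2, 0.9),
    "c": (0.6, 0.2, 0.1, -0.4),
}
CODES = {name: quantize(vec) for name, vec in VECTORS.items()}


def rank_by_hamming(query):
    """Order stored vectors by Hamming distance to the quantized query."""
    q = quantize(query)
    return sorted(CODES, key=lambda name: hamming(CODES[name], q))


print(rank_by_hamming((0.9, -0.2, 0.4, -0.7)))
```

Each vector shrinks from four floats to four bits here, which is why far more of the data fits in RAM; the price is that magnitudes are discarded, so the ranking is only an approximation of the true distance ordering.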
[13:15]
Kevin Ball
Yeah, that makes sense. And then are there any ways that PGvector is different from some other custom type? Or does it essentially end up looking to Postgres the same as PostGIS or something else?
[13:29]
Heikki Linnakangas
It looks exactly the same. So that's the interesting thing, and this is something we haven't solved, but something that is interesting with the vector algorithms is that they're all approximate. Whenever you're doing a search on a vector index, it's always like the thing it does, the thing that all of these algorithms do, is approximate nearest neighbor search. And whenever you hear something approximate, that's kind of a red flag to a SQL developer. Like databases are supposed to be very exact. And if you do a select star from a table, you don't expect to get approximate results. You expect to get exactly the same data set that you inserted. But that's not what these algorithms do. They are always approximate. They lose not that much information, but they do lose some information. And whenever you then do the search and you pick the top 10 results, for example, it's not deterministic which results you get. And that's fine for simple cases, but that actually throws off some of the rest of the system. Like if you're using PGvector as part of a larger SQL query, for example, you would do a vector search and then you would filter that or whatnot. Then it gets complicated because suddenly the kind of lossiness of the query propagates all the way to the other stuff. Like if you fetch the top 10 results and then you get different results and then you filter them and then you sort them, you get different results than you would otherwise, for example. It confuses the rest of the query and it can even cause errors when the ordering doesn't match what the planner expects. Stuff like that. There are corner cases, but it can happen. So that's something that we still haven't figured out. Like how do you represent that kind of approximate query results from these operations at the SQL level? I'm not aware of anyone who has a good solution to that. I would be all ears. 
And maybe this is something that needs to go all the way to the SQL standard or something where you would kind of define the semantics of what does it mean to get the approximate results.
[15:19]
Kevin Ball
Yeah, that's fascinating because I think in a lot of cases you might have that as a separate data store and so you don't have to worry about how is that combining then with the SQL place? It's up to the application developer. They decide.
[15:30]
Heikki Linnakangas
Yeah, exactly. Yeah, you have to figure it out yourself.
[15:33]
Kevin Ball
Yeah, so that kind of leads to an interesting question. So you know this is a type of bug that can occur. How do you deal with that inside of Postgres, do you sort of look for it and raise an error? Do you just return incorrect results? Like what does that look like?
[15:49]
Heikki Linnakangas
Yeah, it depends. It depends on how exactly it happens. It can lead to incorrect query results. Well, they're kind of all incorrect. Like this is approximate search.
[15:58]
Kevin Ball
Yeah. What does correct mean in this world? Right, right.
[16:00]
Heikki Linnakangas
What does correct mean? Like there's different levels of correctness and you know, how do you measure that? None of that is well defined. There are ways you can work around it. Like in the SQL syntax, you can kind of work with it the same way you would with an external vector database and do one query for the approximate part and then do the rest of the query using the results of that. You can put a barrier with the WITH clause or something to kind of force the planner to make the plan a certain way. So that's one workaround. But if you don't do that, then yeah, those errors can propagate and it depends on what the rest of the plan looks like. If it's a merge join and you'd get a different ordering for the results, for example, then there's actually checks, like sanity checks in Postgres, where if the result set that was supposed to be ordered is not, you will get an error. But in other cases you might just silently get incorrect query results. Again, for some definition of incorrect.
[16:51]
Kevin Ball
Yeah, it is a fascinating world we're in right now. So looking at this and looking at sort of where postgres is going and how it's being used actually brings me a little bit towards Neon. So can you actually share what was the motivation behind starting Neon?
[17:08]
Heikki Linnakangas
So the way we started, we looked at the architecture of Amazon Aurora and we decided that we want to build something similar but make it open source. That was kind of the starting point. Now, along the way, of course we've come up with new ideas and other stuff, but that's still kind of at the core of what we do. And the core is the separation of compute and storage. So the idea is that there is a separate storage system which keeps all of the history. Like that's the interesting part about Neon and that's different from Aurora. At least they don't expose it. But the way the Neon storage works is that it takes the write-ahead log from regular Postgres and it keeps all of the history and it kind of allows you to do point-in-time recovery, and it replaces your regular backups and WAL archive that you would normally use with a Postgres setup. It kind of integrates and replaces all that with the Neon storage, which keeps all of the history, and that allows you to do stuff like point-in-time query or launch multiple read replicas against the same storage without having to make multiple copies of your data, things like that. And that allows us to scale the storage layer separately, like independently. And then there's the compute side. And the compute for us means basically Postgres. And Postgres connects to the specialized Neon storage instead of the local disk. So this idea of separating the compute and storage, that was at the heart of when we founded Neon and what we started to work on, and it's still at the heart of everything we do, all of the features we have. Branching is a popular one; a lot of our paying customers are using Neon because of the branching functionality. Well, that's possible thanks to the storage, because the storage allows us to do that branching. We are serverless. Again, that's the big reason why we can be serverless. It's that we can launch Postgres very quickly. And the reason we can do that is because we have separated the storage. 
So it all kind of ties back to the storage engine; that makes all of the other stuff we do possible.
[18:58]
Kevin Ball
Yeah. This concept of a serverless database is fascinating because as the world has gone towards, okay, serverless and stateless services and all these things, like the most heavy inertial piece that's hardest to do that for has always been where your data lives. Because data is persistent. Data has to be persistent in order to be there. And if I understand correctly, you're saying, okay, great, but let's sort of cut the tightest barrier possible around that thing that has to be persistent and make everything else serverless.
[19:27]
Heikki Linnakangas
Exactly. So, yeah, that's exactly what we do. So we made even Postgres serverless by pushing down the thing that needs to have the state, which is the storage, kind of push that further down the stack and separate that as well from the compute side of Postgres. Yeah, as you said, the data has inertia and it doesn't move easily. It's not serverless. You can't just suddenly come up with data from thin air. You have to actually store it somewhere.
[19:53]
Kevin Ball
So how fast can you spin up Postgres if your data is separated off in Neon? Like, what is that response time on a serverless Postgres?
[20:01]
Heikki Linnakangas
So we measure it internally and it's about 700 milliseconds at the moment. I think there's a little bit more that the user perceives, because of the handshake to the client and so forth. That's a little bit, but we try to keep it under one second. Another part of how we make this work is that we run the Postgres instances in a Kubernetes cluster, and we launch a VM, a separate virtual machine, for every Postgres instance. When we started, the delay of that was about five seconds to launch a new Kubernetes pod and connect your connection to that. And we kept hearing from users like, yeah, Neon is awesome, but man, the cold start time, that's too long. Like, that's killing us. So we kept hearing about that, and we started to work on it based on the feedback we got. And we have plans to bring it down further and further. The thing that made a big difference is pre-creating these VMs. So we have a pool of pre-warmed VMs available at all times, and we got it down to about one second, and we stopped hearing these complaints. So that was kind of the tipping point where people stopped complaining, and that was really interesting to see. We still have plans to bring it down even further, but it's not really a problem anymore. It seems that people are happy with the roughly one-second delay when you connect for the first time.
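Editor's note: the pre-warmed pool described above can be sketched in a few lines. This is a minimal illustration, not Neon's implementation; the class name `WarmPool`, the `boot` callable, and the `refill` method are all hypothetical names chosen for the example.

```python
from collections import deque


class WarmPool:
    """Keep a few already-booted instances so a new connection skips the cold start."""

    def __init__(self, size, boot):
        self.size = size
        self.boot = boot  # hypothetical callable that performs the slow cold start
        self.ready = deque(boot() for _ in range(size))
        self.cold_starts = 0

    def acquire(self):
        """Hand out a pre-warmed instance if one exists, otherwise boot one now."""
        if self.ready:
            return self.ready.popleft()  # fast path: no boot delay
        self.cold_starts += 1            # slow path: the pool was empty
        return self.boot()

    def refill(self):
        """Top the pool back up; in a real service this would run in the background."""
        while len(self.ready) < self.size:
            self.ready.append(self.boot())
```

The design choice is a trade of idle capacity for latency: the boot cost is still paid, but ahead of time rather than while a client is waiting on its first connection.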
[21:14]
Kevin Ball
Got it. So that's low enough that you can essentially spin this down if you don't have requests coming in, but it'll stay hot so long as requests keep coming. It's not quite the same thing in some ways as a serverless function, whose lifetime really is only the request.
[21:30]
Heikki Linnakangas
Well, a serverless function has the same thing. The function lives somewhere, the code is somewhere, and there is a delay the first time you call a serverless function as well. And then it gets loaded and so forth. It's not as high; typically that's even lower. But we also have plans to bring it down even further. But the pain point seems to be somewhere between one and five seconds; that's where people stopped complaining.
[21:53]
Kevin Ball
That makes sense. So I'd love to dive into what you're doing in the storage layer, because I was researching this a little bit ahead of time and found it completely fascinating. So can we maybe just start with the high-level architecture, which I think you laid out when you were starting Neon, or at least I saw it in the blog post. But like, what does this minimum viable encapsulation of your heavy inertial data look like, and how does it enable these things like point-in-time queries and all of that sort of thing?
[22:22]
Heikki Linnakangas
The abstraction we have is that we store all of the write-ahead log that Postgres produces. It's the same write-ahead log that people normally use with Postgres for point-in-time recovery, backups, WAL archive, replication, all of those features. So all of this is based on the same write-ahead log. And the basic idea is that the storage takes the write-ahead log stream from Postgres and processes it, and it reshuffles the data into a different format. Based on the write-ahead log, it can reconstruct any version of a page. All of this works at the page level, at the block level; it works with the eight-kilobyte blocks that Postgres always uses. But whenever Postgres needs to read a page, instead of reading it locally from disk, it sends a request to the storage: get page number 123 at this point in time. And this time dimension is what makes all of these other features possible. So the storage layer can reconstruct any version of any page. If the page has been modified 100 times, it actually keeps all of the hundred versions of the page available, and it can return any version of those pages depending on what the request is. And you can kind of see that's the thing that enables us to do things like having a read replica that's lagging a little bit behind the primary. It will just request all of the pages at a slightly delayed point in time. Or it replaces point-in-time recovery, because you can just launch a new compute node, a new Postgres instance, and you can just tell it to please show me all the data at this older point in time. And the storage layer can do that. Now there's a lot of complexity and a lot of smart engineering that goes into the storage to make it possible to do that. Actually keeping every version of every page is very expensive, obviously. So there's a lot of smarts in how it stores the data.
And it doesn't literally keep all of the versions, but it keeps the write-ahead log and some images of the pages at specific points in time, so that it can quickly reconstruct any page based on the write-ahead log and the images it has.
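Editor's note: the "page at a point in time" idea can be sketched as the most recent stored image plus a replay of later log records. This is a simplified illustration under stated assumptions, not Neon's storage format: points in time are represented as integer log positions (LSNs, the name Postgres uses for WAL positions), a log record is modeled as a plain byte overwrite at an offset, and `PageHistory` and its methods are hypothetical names.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class PageHistory:
    """History of one page: full images plus log deltas, both keyed by LSN."""
    images: list = field(default_factory=list)  # sorted (lsn, bytes) snapshots
    deltas: list = field(default_factory=list)  # (lsn, offset, data), appended in LSN order

    def append_delta(self, lsn, offset, data):
        """Record a change to the page at the given log position."""
        self.deltas.append((lsn, offset, data))

    def materialize_image(self, lsn):
        """Store a full image at `lsn` so later reads replay fewer deltas."""
        self.images.append((lsn, self.get_page(lsn)))
        self.images.sort(key=lambda item: item[0])

    def get_page(self, lsn, page_size=8192):
        """Return the page as it looked at `lsn`."""
        # Start from the most recent image at or before the requested LSN,
        # or from an all-zero page if no image is old enough.
        idx = bisect_right([image_lsn for image_lsn, _ in self.images], lsn)
        base_lsn, base = self.images[idx - 1] if idx else (0, bytes(page_size))
        page = bytearray(base)
        # Replay only the deltas in the interval (base_lsn, lsn].
        for delta_lsn, offset, data in self.deltas:
            if base_lsn < delta_lsn <= lsn:
                page[offset:offset + len(data)] = data
        return bytes(page)
```

A lagging read replica or a point-in-time query is then the same call with an older `lsn`; storing more images makes reads cheaper at the cost of more space.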
[24:33]
Kevin Ball
This sounds to me like essentially an event sourced model, if I'm understanding correctly.
[24:40]
Heikki Linnakangas
Yeah, that's one way to think about it. Yes.
[24:42]
Kevin Ball
Where the events are the write-ahead log: it's saying, okay, this changed in this way, this changed in this way. And then it's projecting out, here's the state of the data at some point in time, and you save those images essentially. And so any point in time becomes the most recent image before it, plus a set of deltas. Right?
[25:02]
Heikki Linnakangas
Yeah, that's one way to think about it. In an event sourcing system, it's up to you to define what you do with the events and how you collapse the events into the final image of whatever you have. In this case, it's the pages and the page images. So Postgres never needs to know about any of this stuff that's happening behind the scenes. Postgres can just request a page and it will get a page. And the storage does all the magic to reconstruct that.
[25:26]
Kevin Ball
Yeah, Postgres can stay blissfully unaware of the events underneath the surface. And it's just, what is my state? That's really interesting. So you highlighted at a point before the importance of having data in memory for certain index types or other different things. What does the memory hierarchy of this system look like? Those servers that are responding to page requests, how much are they keeping in memory? Where are they falling back to? What does this look like?
[25:52]
Heikki Linnakangas
Starting from the top, like from the fastest tier? Well, I guess there's even the CPU caches and registers and so forth. But the way I think about it is that at the top there's the Postgres shared buffer cache, which is the same buffer cache that Postgres always has. We actually configure that buffer cache to be very small. I think we set it to 128 megabytes, regardless of the size of your database and regardless of the instance size. And the reason for that is flexibility. That allows us to easily scale up and down. Postgres doesn't allow you to change the size of that, so we set it to the minimum that we can get away with. Then there's the next level of cache, which is something we call the local file cache. That's a Neon-specific thing that we built that just uses a local file as a kind of second-level cache. So if the page is already in that cache, then you don't need to go and request it from the storage. We can return it locally. And that makes it possible for us to have such a small shared buffer cache, because it allows us to overflow that. Then the next step is that whenever there's a cache miss from that, you have to actually go to the storage, to what we call the page server, which is the service that does all the reconstruction of the pages we talked about. So now you have to actually do a network request. You have to go over the network, make that request, and then bring it back. There is caching involved within the page server as well, for various things, but that's not very significant to the latency. We pay a lot of attention to the latency of all of this because that easily adds up. But as soon as you have to go over the network, the other latencies don't matter so much. So we spend a lot of time making sure that there is caching that happens in Postgres and that we do prefetching properly. That's a very effective way of hiding the latency.
Then the stack goes even deeper than that. The page servers are not the final source of truth either. The page servers are actually just a cache of what we store in object storage, like Amazon S3 or Azure Blob Storage. So the object storage is what ensures that we don't lose data in the long run. If one of the page servers goes down, we just launch a new one and it will download the files it needs from the object storage, on demand or beforehand. So there is a pretty deep hierarchy of caching involved.
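Editor's note: the hierarchy described above (shared buffers, local file cache, page server, object storage) behaves like a read-through cache chain. The sketch below is a deliberate simplification and not Neon's code: each tier is modeled as a plain dict, there is no eviction or prefetching, and `TieredPageReader` is a hypothetical name. In the real system the page server, not the compute node, is what talks to object storage.

```python
class TieredPageReader:
    """Look a page up tier by tier, filling the faster tiers on the way back."""

    def __init__(self, tiers):
        # tiers: list of (name, dict-like store), ordered fastest to slowest.
        self.tiers = tiers
        self.hits = {name: 0 for name, _ in tiers}

    def read(self, page_no):
        for depth, (name, store) in enumerate(self.tiers):
            if page_no in store:
                self.hits[name] += 1
                page = store[page_no]
                # Populate every faster tier so the next read stops earlier.
                for _, faster in self.tiers[:depth]:
                    faster[page_no] = page
                return page
        raise KeyError(page_no)
```

Counting hits per tier makes the latency argument visible: only the first read of a cold page travels all the way down, and later reads are served from the top.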
[28:08]
Kevin Ball
Yeah, absolutely. So that's fascinating. You said you pay very close attention to the different latencies involved. So what does that look like? One network hop. And then if you have to go all the way down, say you're accessing something that's completely cold coming out of the object storage. Like how much does that add to your query time?
[28:24]
Heikki Linnakangas
Yeah, the first time you have to do a request and download this data from object storage, that's hundreds of milliseconds, or up to a second or two seconds even. So that's very slow. But we do try to keep things cached on the page server side so you don't see that latency. You can also hide that a little bit. We do offload stuff and remove it from the hot storage for databases that haven't been accessed for a long time. I don't remember what exactly that time is, but if you haven't used your database for weeks, then you're probably okay with a few seconds of latency on the first hit, the first time you actually access it again. And those downloads tend to happen in big batches. So it's not like you have to pay the one-second latency for every page. It's only really for the first few pages, and then it gets cached again.
[29:07]
Kevin Ball
Nice. You talked a little bit about some of the benefits that this gets you, but let's maybe dive a little bit more. So you said you don't need your whole backup service. Is that because it's all going to object storage anyway, which is already dealing with all of that or how does that work?
[29:21]
Heikki Linnakangas
Yeah, that's correct. So all the data ultimately goes to object storage and that's what ensures the durability. There are actually two pieces that take care of the durability. For the recent stuff, the recent modifications, we have a service called the Safekeepers that makes sure that we don't lose the recent transactions, because you can't stream data to the object storage. So that's the service. We keep three copies of the recent WAL, the recent write-ahead log, and there's a quorum algorithm there based on Paxos, which makes sure that when you commit the transaction and we respond to the client that, okay, the transaction is committed, we don't lose the recent transactions. But that's only for the recent stuff. Pretty quickly the data gets processed by the page servers and uploaded to files on object storage. And that's ultimately what ensures that we don't lose the data.
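Editor's note: the rule "acknowledge the commit once a majority of the three copies have it" can be illustrated as below. This shows only the majority-acknowledgment step, not Paxos itself (no leader election, terms, or log reconciliation), and the `Safekeeper` class and `commit` function here are hypothetical stand-ins, not Neon's API.

```python
class Safekeeper:
    """Toy replica that stores log records while it is reachable."""

    def __init__(self, up=True):
        self.up = up
        self.log = []

    def append(self, record):
        """Persist the record and acknowledge it, or fail if the node is down."""
        if not self.up:
            return False
        self.log.append(record)
        return True


def commit(record, safekeepers):
    """Report success only if a majority of safekeepers acknowledged the record."""
    acks = sum(1 for sk in safekeepers if sk.append(record))
    majority = len(safekeepers) // 2 + 1
    return acks >= majority
```

With three copies the majority is two, so a commit can still be acknowledged while one node is down, but not while two are.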
[30:11]
Kevin Ball
That's awesome. So to Postgres, all of this just looks like a file system. It doesn't have to worry about it.
[30:17]
Heikki Linnakangas
That's right. We have to modify Postgres a little bit to make this work, because there was no extension point for this, unfortunately. It's a very small patch. It's a tiny patch to hook in very close to the functions where Postgres normally reads a page from disk or writes a page to disk. One interesting fact about all this is that it's all based on the write-ahead log, and we reconstruct the pages from the write-ahead log. So whenever Postgres writes out a page from the buffer cache, we actually just throw it away. We don't need it, because we can always reconstruct the data from the write-ahead log, just like you would when you're restoring from backup. So in a way we are continuously, all the time, restoring the data from the write-ahead log.
[31:01]
Kevin Ball
It's interesting to me because I remember when I first looked into how Postgres does replication and all these different things, I saw this write-ahead log and I was like, oh, under the covers it is like this event-sourced model, but it's just continuously creating the one projection, or the one true image of it. You're essentially saying, okay, well, yeah, let's take advantage of that. Let's hook into it. Let's put it over here. We can ignore Postgres's attempt to keep a safe image. We've got that handled.
[31:27]
Heikki Linnakangas
Yeah. The write-ahead log is the data. That's the important thing.
[31:30]
Kevin Ball
Yeah, absolutely. You said you did this wanting to do something like Aurora that is all open source. So is the Neon storage layer open source as well?
[31:39]
Heikki Linnakangas
Yes, the Neon storage layer is all open source.
[31:41]
Kevin Ball
Wow. So what else goes into Neon that is making this work? Right?
[31:48]
Heikki Linnakangas
Well, all of the storage and all the things that we talked about so far is roughly half of what the company does. Roughly half of the engineering effort goes into all that. The other half is all the other stuff, like the website, the control panel, billing, all kinds of stuff. And user-facing APIs, dashboards, managing the whole cluster. I mentioned the Kubernetes cluster. There's a lot of stuff to make all that work. Oh, and autoscaling, like scaling these compute VMs up and down, moving them around, just managing the cluster, managing the whole service, and all of the serverless aspects.
[32:25]
Kevin Ball
Got it. So if you were to say again, about 50% is business stuff, essentially what it takes to keep the business going, and 50% is open source, Postgres and Neon. If somebody wanted to take the Neon storage system and run it somewhere else, they could just go and do that.
[32:41]
Heikki Linnakangas
Yeah, and I would love that. People try that every now and then, and I encourage them. It can be a little bit tricky because the open source thing is not a product we sell. We built this thing so we can run our service, so we don't do tagged releases and things like that. So it can be a little bit hard for other people. But there are always people doing that. And we have gotten some contributions to Docker images for people to run that on their own. And I welcome that. I love it when people are doing that.
[33:10]
Kevin Ball
Yeah, well, and I think it speaks, in a slightly different area, to the openness of the Postgres community. Right? That you can do this and have it be open and connect into Postgres without needing to do too much. You have a patch, but it's, I assume, also open source.
[33:28]
Heikki Linnakangas
Yes, for sure.
[33:29]
Kevin Ball
That's super cool. So I guess then the question I would come to from this is, where are things going? What is the evolution of Postgres and cloud databases looking like in your mind?
[33:44]
Heikki Linnakangas
Yeah. Working for Neon or not, the things I want to work on for Postgres are for upstream Postgres. You know, people might have different opinions, but the thing that strikes me is that I would love to see much more flexibility in Postgres so that it's easier to run in this kind of serverless environment. Not just Neon, but for anyone who wants to run it in the cloud and, you know, scale it up and down. We have had a lot of friction with connection management. For example, if you want to have thousands and thousands of connections, you need to have a connection pool. And, well, there are a lot of connection poolers out there in the ecosystem, and they have slightly different trade-offs, and people have found the workarounds, and there are, you know, thousands of blog posts on how to do all that. But at the core, having to deal with all of that is a bit painful. Some other databases do a lot better, to be honest, with just having lots of connections open and dealing with that problem internally. For Postgres, it's a bit messy. It comes down to things like memory management. If you have a lot of connections, they're competing for memory; every connection has its own caches for queries and stuff like that. So it adds up. So that whole story is a bit awkward, and I would love to somehow address that. And that's one of the reasons I wanted to start working on the multithreading. Just switching to threads won't fix any of those other problems, but it makes it possible to start having more shared caches. It makes it possible to resize certain memory areas more easily. I think in the future, five years down the line maybe, we will start to reap the benefits of that, and then we can have more flexibility.
[35:19]
Kevin Ball
Yeah, I think that's always one of the challenges with a really long-lived project: you get these big architectural choices that were made, in Postgres's case, 20 years ago, and you get to a point where they're limiting you. But you've got millions of users, you can't just rewrite it. You've got to very gradually shift things forward. So maybe walk us through in your head what that looks like. For example, multithreading: what does it take to make such a huge architectural change that's going to set you up in five years to be able to reap those benefits?
[35:52]
Heikki Linnakangas
Well, there's a long list of to-do items on a wiki for what we need to do. There are some core components that need to be refactored. The first step is to refactor things so that it's easier to use either threads or processes, because this is not going to happen within one release. We're not going to just switch over. So we will need to have a plan where we can comfortably live with threads or processes for at least a few releases. I would love to keep that transition period as short as possible, but realistically it's going to take years. One aspect of that is, again, the whole ecosystem we have; there are tons of extensions out there. Even if we do all the changes we need in core, the whole ecosystem will have to be dragged along, and we need to make it as easy as possible for them. I actually think the ecosystem, like extensions, might move faster, because it might be a lot easier to write some extensions in a multithreaded environment to begin with. So I think we will pretty quickly start to see extensions that only work when you're using threads, as soon as we get that feature out there. But yeah, it will take years, and there's a lot of refactoring that needs to happen, and then we'll need to define the user-visible side. You know, how do you choose, how do you configure this thing? Hopefully we won't invalidate much of the conventional wisdom of how you set up Postgres. The goal is that it should have roughly the same trade-offs, at least in the beginning. But then once we get to the point where we can start to remove the stuff that requires processes and go all the way in, that's when we can start to really reap the benefits, and we can start to rely on having threads and the shared address space.
[37:25]
Kevin Ball
Databases feel like an area where there's been a lot of noise recently about new approaches and new different things. We alluded a little bit to vector databases, and, you know, we had the whole boom of key-value stores, and then, oh, now we can do distributed and still maintain, you know, SQL guarantees and ACID guarantees and things like that. What do you think is going to keep Postgres as the main choice for developers? Or what are the risks that it's not addressing, such that something else might be able to come in and take that crown?
[38:02]
Heikki Linnakangas
I mean, nothing is forever. But Postgres has been around for a long time and I think it will stick around for a long time still. There are a lot of new projects that are choosing to use the Postgres syntax. There are a lot of projects that are choosing to use the wire protocol. There are a lot of projects that are, you know, choosing to use bits and pieces of Postgres, even if they are completely new implementations of a completely new system. So I think there's some staying power in that, just being the lowest common denominator between all of the forks and all of the different approaches. That's one way. But Postgres is still alive and kicking. There aren't really any serious competitors, I find. There are a lot of competitors for niches, but there's nothing that is taking over that I see. It's a similar story with Linux, for example. It's very dominant. Given that it is open, there's no particular need for anyone to compete directly with it. You can just join the project. You know, why compete when you can join the project? And Postgres is similarly very open. It's not dominated by any single commercial vendor. There's a true open source ecosystem around it. So if someone wants to do something cool, some kind of new approach, they can just use Postgres for that. And that's how it will keep evolving with the times.
[39:12]
Kevin Ball
So we've covered a lot of ground. We've talked a lot about Postgres, we've talked a lot about Neon. Is there anything we haven't covered yet that you want to make sure that we talk about before we wrap up?
[39:23]
Heikki Linnakangas
Talking about the future of Postgres and the whole open source ecosystem, that really depends on what people come up with. So if someone is out there thinking that, hey, I have this new cool algorithm or something, please submit it to the Postgres community. The review process can be long and tedious, but there are a lot of people paying attention. And some things get done very quickly, depending on what it is.
[39:44]
Kevin Ball
I guess that is one thing that might be worth diving into. I feel like in the last few years the vast majority of growth in the software engineering world has been really at the application layer. Lots and lots of new developers jumping into applications. Getting involved with a database project feels very intimidating. How would you recommend people approach that? And why should they be looking? I mean, I think that's. That's the other thing, right? We've got these new sexy AI development tools or doing apps out for days. Why should somebody get involved with postgres?
[40:16]
Heikki Linnakangas
I think someone needs to be motivated and have their own reasons. A lot of people historically have gotten active with Postgres because they have an itch to scratch. Like maybe they run into a bug. Maybe they have a missing feature and they want to fix that. But a lot of people, including myself, actually have started by just wanting to work on databases for whatever strange reason. And then Postgres is a good one to get started with. So, yeah, I wouldn't spend too much time thinking, like, why would someone contribute?
[40:43]
Kevin Ball
I think if you feel drawn, do it. If you don't...
[40:45]
Heikki Linnakangas
Yeah, don't. Exactly. As for advice on how to get started, I don't know. I feel that I've been around the community for such a long time that however I got started is probably obsolete by now. I know there are a bunch of good books. There are a lot of resources out there. I would suggest people just Google for it. Maybe writing your own extension is a good way to get started. We talked about the extensible type system and stuff. That's a good place to get started and play with.
[41:10]
Kevin Ball
Yeah, absolutely. Yeah. The extension ecosystem, that is a great way, especially because you can get in. You're writing your own indexing code, right? You're like, understanding what does it take to index? How does this stuff have to work under the covers? What do I need to do? Like, I feel like that's a good way in.
[41:25]
Heikki Linnakangas
Yeah, for sure.
[41:26]
Kevin Ball
Awesome. Well, this has been super fun. Thank you for joining me today and yeah, good luck. I'm really excited to. I actually have not tried Neon yet. I haven't. When I started, it only takes five minutes. I know. Well, that's the thing: I have dealt with so many backup things and this and that. As I said, databases have so much inertia. The idea of, oh, I could just spin it up and get my point in time, and I don't have to build a custom time system to keep track of what things were like? That sounds amazing.
[41:56]
Heikki Linnakangas
Yeah, for sure. Thanks for having me.
[41:57]
Kevin Ball
Cheers.