
CodeSandbox was founded in 2017 and provides cloud based development environments along with other features. It’s quickly become one of the most prominent cloud development platforms. Ives van Hoorne is a Co-Founder at CodeSandbox.
He joins the show to talk about the platform. This episode is hosted by Josh Goldberg, an independent full-time open source developer. Josh works on projects in the TypeScript ecosystem, most notably typescript-eslint, the tooling that enables ESLint and Prettier to run on TypeScript code. Josh is also the author of the O'Reilly Learning TypeScript book, a Microsoft MVP for developer technologies, and a live code streamer on Twitch. Find Josh on Bluesky, Mastodon, Twitter, Twitch, YouTube, and .com as JoshuaKGoldberg.
Josh Goldberg
Ives, welcome to Software Engineering Daily. How's it going?
Ives van Hoorne
It's going great. Thanks for having me.
Josh Goldberg
Well, thanks for coming on. We're really excited to have you. I've personally used your product quite a lot, from job interview sandboxes to demos. Could you give us a brief introduction to what CodeSandbox is?
Ives van Hoorne
Yeah, so CodeSandbox, simply said, is an online development environment. You can start a new web development project on CodeSandbox and work completely in your browser. I tend to compare it with Google Docs and Microsoft Word: if you're writing a document in Microsoft Word, you write something on your computer, but if you want to share it, that becomes harder. And so people have been using Google Docs more, where they can just share a link to a website where they write together in one environment. And we wanted to build the same thing with CodeSandbox.
Josh Goldberg
Fantastic. Before we dive into that and the future of code sharing, I want to dial back a little bit and talk about you as a developer, as a person. How did you first get into programming?
Ives van Hoorne
So that's a long, long time ago. I initially started with coding not because I wanted to learn coding, but because I had a need. I think I was about 10 or 11 years old, I'm not sure, but a friend of mine and I had a secret language, and we tended to write secret letters to each other in class. So we would write secret characters and then the other would have to translate them. And that was a lot of fun, but I wanted to go faster. And that is when I started looking into whether it would be possible to create a program. And that was my first program, a Visual Basic program. It was really just two text boxes, where if you put something in the left text box, it would translate it and put the translation in the right text box. And it would also work the other way around. And we had a kind of Dutch Facebook at the time, and we would send public messages to each other in this secret language. That was my first interaction with coding. It was quite challenging. And to be honest, after that, I didn't code for a long time. Only when I started to look into gaming and mods did I start coding again. But that was my first interaction.
Josh Goldberg
There's some humor in the fact that your first form of coding was some kind of input on the left and some kind of output on the right, and you're still doing that decades later.
Ives van Hoorne
Yeah, that is. Ultimately, it's the same thing, kind of.
Josh Goldberg
Yeah.
Ives van Hoorne
You have something on the left and it transforms it and puts it on the right. I guess you could say the same thing about CodeSandbox, which is very funny. We have a lot of fancy things in CodeSandbox, but the core functionality, and the most important functionality, remains that you have a code editor on the left and a preview on the right that shows what the code is doing. That's the core functionality.
Josh Goldberg
How did that core functionality come to be, or how did you come to create the product in the first place?
Ives van Hoorne
So I stopped coding for a while, but later on, when I was, I think, 17, I had done a lot of graphic design for a company, and I started to realize that I liked graphic design, but I didn't like doing graphic design for other people, because they wanted me to design things that I didn't agree with. Things like, oh, can you make this fire yellow on this purple background? And I was 17 and a bit naive, and I thought, well, graphic design is not for me because it's so subjective. So I started to move to coding, thinking, oh, it's kind of like solving puzzles, and there's only one solution to a puzzle, so there's no discussion and no subjectiveness in coding. I was wrong in hindsight, but I still enjoy it. I started to learn web development, and initially I created a portfolio website. And later on I read in a local newspaper of my little village in the Netherlands that there was a very cool new startup that was growing extremely fast. And I was thinking, I want to join this startup, because there was not much happening in this village. So after high school, I asked if I could join them for vacation work, so that I would work for them over the holidays. And initially they ignored me. The recruiter was a bit confused that an 18-year-old would just ask to work there as a developer. But later on, after I called them a couple of times, they called me back and said, yeah, you can work here, but you have to work for at least a year, so would it be possible for you to take a gap year? And so I started working there, and everything was in Ruby on Rails. And around that time a new technology called React became more and more popular, and I was intrigued by React. I started building more little pages in React and I was thinking, wow, our Ruby on Rails front end feels a bit antiquated compared to how fast a single-page application in React is.
And I guess again I was a bit naive, and I started to convert more and more Ruby on Rails pages into React. It's funny, because when I did that conversion, at some point we wanted to test it and we put it live. And then the marketing department came to us because they were distraught. Suddenly 50% of their analytics were gone, because I didn't think of all those other business things, like implementing analytics. Anyway, I'm going on a bit of a tangent, but at some point I realized that when I was working with my coworkers on React code, it was very hard to share work with each other. At some point I was on vacation and I got questions from my coworkers about a piece of code in React Router, and they just sent me snippets on Slack, and I had to decipher from my phone what was going on, what was going wrong, and what the error was. And I didn't have a JavaScript interpreter in my own head. So that's when I started thinking that it would be very cool if they could just send me running code. And at the time Figma became more popular, and Google Docs became the default for a lot of people. And I started to think, would it be possible to have a code environment in the browser? At the time I was just writing down all my ideas in case I ever wanted to start a startup in the future. So I wrote this idea down and didn't do much with it. I started going to university after my gap year, and at the university I initially had a lot of fun drinking beer, but that got boring after a while. We started to get lectures about object-oriented programming in Java, and I was already fully in the React world by then. I was like, why do I have to go through all this again? So that's when I started to work on a side project. I just looked at my list of ideas, picked the top one, the latest one, which was the online web editor, and started to make the first design in Sketch. And then my friend Bas joined and we started working on it more and more and more.
And then on April 2, not April 1, we released the first version of CodeSandbox. That is how it initially got started.
Josh Goldberg
This was April 2, 2017?
Ives van Hoorne
Yeah, that's right.
Josh Goldberg
So that's been over seven years. What was the tech platform that you used in React Land at first back in 2017?
Ives van Hoorne
Oof. Very different. Everything was in JavaScript, not even in TypeScript. We did use Flow types to type things, with comments and everything. And that was very, very interesting too. The base application was built with Create React App. The backend was written in Elixir using the Phoenix framework, because Elixir was this cool new thing that looked a lot like Ruby and Ruby on Rails, but was fully functional. So that was an interesting learning project. And it turns out that it scales really, really well. And then for the database we used Postgres, and for some caches we used Redis. So ultimately the stack was a Create React App front end, an Elixir backend, a Postgres database, and Redis as a second, in-memory database. And I deployed all of this to a VPS on Vultr, a $20-per-month VPS with 2 gigabytes of RAM. And it was quite surprising how well that scaled. It scaled for a year, I would say. At some point we got to 500,000 monthly users and it was still running on this $20 VPS on Vultr. After that we did move to Kubernetes, and we moved the deployment to GCP, Google Cloud Platform, because the idea was that we could also do easier migrations and easier scaling that way. But it was intriguing to think that a solution as simple as this worked so well. I think the simplicity ultimately helped with scaling it. I remember when building the first version of CodeSandbox that I was very worried about saving files. If someone creates a sandbox and another person presses fork, for example, that other person gets their own version of the sandbox. How would we scale all of that with all those files? Should we use something like S3? Should we use something like Dropbox? There were a lot of questions about how to store those files. And ultimately, after about a month of very, very advanced thinking about whether we should use Git and all those kinds of things, I decided in that moment to store everything in Postgres, so every file would just be a row in a Postgres table.
And when you press fork, we would just do a bunch of selects and a bunch of inserts in Postgres to copy all the files. No deduplication or anything like that. And the funny thing is, we still use that system today. It never hit limits. I think we now have over 80 million sandboxes and over 500 million files stored in a Postgres database. And this query still runs within about 20 milliseconds, and a sandbox can be loaded within 100 milliseconds, just doing a bunch of selects. The only thing that we don't store in the database are binary files. That was a bit too far. So those are stored in a Google Cloud bucket, and then we store a link to that file inside the database. But it was a reminder to me that often the simple solution works the best, either because it's just so simple that there are fewer race conditions and fewer things that can go wrong, or because it's much easier to understand how it works. So, yeah, that was the initial stack.
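To make the "one row per file, fork by select and insert" idea concrete, here is a minimal TypeScript sketch. It is not CodeSandbox's actual schema or code: an in-memory array stands in for the Postgres table, and the names `FileRow`, `fileTable`, `selectFiles`, and `forkSandbox` are all hypothetical.

```typescript
// Hypothetical row shape: one row per file, as described in the interview.
interface FileRow {
  id: number;
  sandboxId: string;
  path: string;
  code: string;
}

// In-memory stand-in for the Postgres table that holds every file.
const fileTable: FileRow[] = [];
let nextId = 1;

function insertFile(sandboxId: string, path: string, code: string): FileRow {
  const row: FileRow = { id: nextId++, sandboxId, path, code };
  fileTable.push(row);
  return row;
}

// "A bunch of selects": every row that belongs to one sandbox.
function selectFiles(sandboxId: string): FileRow[] {
  return fileTable.filter((row) => row.sandboxId === sandboxId);
}

// Fork = select the source rows, then insert a full copy under a new id.
// No deduplication: each fork owns its own rows.
function forkSandbox(sourceId: string, forkId: string): FileRow[] {
  return selectFiles(sourceId).map((row) => insertFile(forkId, row.path, row.code));
}

insertFile("original", "/index.js", "console.log('hi')");
insertFile("original", "/package.json", "{}");

const forked = forkSandbox("original", "fork-1");
// Editing the fork leaves the original rows untouched.
forked[0].code = "console.log('changed')";
```

Because every fork is a plain copy, there is no shared state between sandboxes to reason about, which is the simplicity argument made above.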
Josh Goldberg
Before we dive into more on how that works, I want to take a moment to emphasize that point: you have half a billion file entries in your Postgres database, and you're still able to load the core part of your site, which involves potentially many file queries, in a tenth of a second. That's an incredible scaling and performance feat, no?
Ives van Hoorne
Yeah. But I would attribute it all to Postgres. Postgres has exceeded my expectations time and time again. If you have a good index on multiple columns, then the performance is incredible. And the scaling is also incredible. I would choose Postgres for any database right now. The only exception would be for things like storing a tremendous amount of data that is inherently tied to timing, so time series data. Then I would look at something like ClickHouse. But for everything else, Postgres is an incredible solution.
Josh Goldberg
We'll see if we can get that quote on their homepage. So let's continue that journey. At the very bottom, or back, of CodeSandbox, you have files stored in a Postgres database and also links to binary large objects stored in Google Cloud. What is on top of that? How are those retrieved? Or what's the system around them?
Ives van Hoorne
Yeah, so I think the easiest way to describe how everything is retrieved is to look at the oldest version of CodeSandbox, the initial version. Otherwise we need to think about all the permissions, billing, and other stuff that now comes in between. In the simplest, first version of CodeSandbox, whenever you would retrieve a sandbox, you would make an API call to our Elixir server. The Elixir server would go through a couple of checks. It would get your user from the cookie, check whether you have access to that sandbox, and then run a query on the database. And that query is huge. It has like 20 joins. It's one of the only queries that is handwritten, not generated by an ORM. That query looks at our modules table, which is where all the files are stored. It looks at our directories table, which is how the files are linked to the different directories. And it looks at our sandboxes table. So from the sandboxes table it gets the sandbox, from the directories table it gets all the directories that are related to the sandbox, and from the modules table it gets all the files that have the same sandbox ID. Then, based on all that information, it generates a JSON blob that contains all the files of the sandbox and returns that. And you could argue that that's unscalable, in the sense that it wouldn't work for huge projects. And that is completely right. CodeSandbox was specifically built for creating prototypes and small projects. That was the initial use case. And so there was a limit of 500 files for a single sandbox. And if there's a limit of 500 files, then it's fine to return the whole contents of the sandbox. Now, at this point with CodeSandbox, we also have a second type of project, which we call dev boxes. So we have sandboxes for prototyping and dev boxes for development.
And for dev boxes, we have a more, I would say, sophisticated way of retrieving files. You can lazily retrieve files so that you don't have to download the whole sandbox just to see what's going on. That is, in a nutshell, how it works. Then we have Redis in between for simple, small things like caches, but also for tracking page views, so that if you access the sandbox twice in the same hour, it's seen as one page view instead of two. But yeah, that's how we retrieve sandboxes.
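The original retrieval path (sandbox row, its directories, its modules, one JSON blob back) can be sketched roughly as follows. This is an illustration under assumptions, not the real query: the interfaces and field names (`parentId`, `directoryId`, and so on) are hypothetical, and plain arrays stand in for the three tables.

```typescript
// Hypothetical shapes for the three tables mentioned above.
interface SandboxRow { id: string; title: string }
interface DirectoryRow { id: string; sandboxId: string; parentId: string | null; name: string }
interface ModuleRow { id: string; sandboxId: string; directoryId: string | null; name: string; code: string }

interface SandboxPayload {
  id: string;
  title: string;
  files: { [path: string]: string };
}

// Walk up the parent chain to turn a directory row into a path like "/src/ui".
function directoryPath(dir: DirectoryRow, all: DirectoryRow[]): string {
  const parent = all.filter((d) => d.id === dir.parentId)[0];
  return (parent ? directoryPath(parent, all) : "") + "/" + dir.name;
}

// Combine the rows for one sandbox into a single JSON-serializable blob.
function buildPayload(
  sandbox: SandboxRow,
  directories: DirectoryRow[],
  modules: ModuleRow[]
): SandboxPayload {
  const ownDirs = directories.filter((d) => d.sandboxId === sandbox.id);
  const files: { [path: string]: string } = {};
  modules
    .filter((m) => m.sandboxId === sandbox.id)
    .forEach((m) => {
      const dir = ownDirs.filter((d) => d.id === m.directoryId)[0];
      const prefix = dir ? directoryPath(dir, ownDirs) : "";
      files[prefix + "/" + m.name] = m.code;
    });
  return { id: sandbox.id, title: sandbox.title, files };
}

const payload = buildPayload(
  { id: "s1", title: "Demo" },
  [{ id: "d1", sandboxId: "s1", parentId: null, name: "src" }],
  [
    { id: "m1", sandboxId: "s1", directoryId: "d1", name: "index.js", code: "render()" },
    { id: "m2", sandboxId: "s1", directoryId: null, name: "package.json", code: "{}" },
    { id: "m3", sandboxId: "other", directoryId: null, name: "ignored.js", code: "" },
  ]
);
const payloadJson = JSON.stringify(payload);
```

Returning everything in one blob is only reasonable because of the 500-file cap described above; the lazy retrieval used for dev boxes avoids that constraint.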
Josh Goldberg
So that dev boxes concept wasn't there in the first versions of CodeSandbox. When did that get added in?
Ives van Hoorne
Dev boxes are now, I would say, approximately two and a half years old, maybe even three years old. When CodeSandbox launched, it was initially not very popular. I put it on Twitter. I had like 60 followers on Twitter. Most of my followers were high school friends, so I got three likes. We started to become much more proactive with talking about CodeSandbox. I started to write blog posts about how it worked, and started to talk directly to people when they created an account, just to get feedback, all with one idea: at some point, Paul Graham said that it's better to have 100 fans than 100,000 people that like you. And with that idea, we tried to get to 100 fans by doing a lot of unscalable things. And CodeSandbox started growing and growing and growing, and people started to use it for things that we didn't imagine in the first place. It was initially built for this specific case that I had at work, where you wanted to ask a question and you wanted to have a live example for this question. But people started to use it for other things. Things like job interviews, bug reports, documentation, and also workshops where people learn how to code. And people started to use it to build new projects. They started to work on a new website, a portfolio website, for example. Or they started to work on a new blog. Some people even started a new startup. For example, there is one that I've always found funny. Well, not funny, it's more that I'm proud of it. There is this whiteboarding tool called Excalidraw. I use it a lot. And the interesting thing is that the initial version of Excalidraw was built on CodeSandbox, when it was called Excalibur, and it was sandboxes shared over Twitter. So people wanted to build real things with CodeSandbox. They wanted to build, like, their portfolio website, things that they ultimately want to deploy.
And while CodeSandbox worked really well for the things that were small, like job interviews or examples, it didn't really work for the big projects because of our 500-file limit. And that's when we started to look into whether we could create the same experience that we have with CodeSandbox for the smaller projects, but for big projects. So you should still have the capability to share a link with someone, and they can see the running code. They can see how everything works, and they can press fork to get their own version. And that is what dev boxes have become. It was kind of a rewrite of CodeSandbox, because the core system, the file system, changed underneath it. Normally sandboxes all run in the browser, because they run small projects, but dev boxes run on the server. So we built a version of CodeSandbox that was really meant for full development.
Josh Goldberg
So how is it different, just thinking in the database context, how is it different that dev boxes retrieve stuff versus the original layout?
Ives van Hoorne
So in the case of dev boxes, we run a VM to run that project, a virtual machine, so essentially a small server that runs the project itself. And inside that server, we run a process that can read from the file system. So now when you open a dev box, you don't connect to our API server to get the files. Instead you connect via a WebSocket connection to a little server that runs inside that VM. And then the editor can ask things. It can ask, for example, can you give me the contents of the file under the path /project/hello.txt, and then it will return it. So the whole API server was still there, but it was only there for validation and authentication. Ultimately, the connection for getting the files and understanding what's going on within the project would be made by directly connecting to the server, to the VM itself.
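The request and response flow described here might look something like the sketch below. The message format is entirely an assumption (the interview does not describe the wire protocol), the `fs/readFile` method name is made up, and an in-memory object stands in for the VM's file system so the handler can run without a real WebSocket.

```typescript
// Hypothetical message shapes; the real protocol is not described in the interview.
interface ReadFileRequest { id: number; method: "fs/readFile"; path: string }
type ReadFileResponse =
  | { id: number; ok: true; content: string }
  | { id: number; ok: false; error: string };

// Stand-in for the file system inside the VM.
const vmFiles: { [path: string]: string } = {
  "/project/hello.txt": "Hello from the VM",
};

// What the small in-VM server would do for each incoming WebSocket message:
// parse the request, read the file, and send back a response with the same id.
function handleMessage(raw: string): string {
  const request = JSON.parse(raw) as ReadFileRequest;
  const exists = Object.prototype.hasOwnProperty.call(vmFiles, request.path);
  const response: ReadFileResponse = exists
    ? { id: request.id, ok: true, content: vmFiles[request.path] }
    : { id: request.id, ok: false, error: "File not found: " + request.path };
  return JSON.stringify(response);
}

// The editor asks for one file at a time instead of downloading the whole project.
const reply = JSON.parse(
  handleMessage(JSON.stringify({ id: 1, method: "fs/readFile", path: "/project/hello.txt" }))
);
const missing = JSON.parse(
  handleMessage(JSON.stringify({ id: 2, method: "fs/readFile", path: "/project/nope.txt" }))
);
```

Echoing the request `id` back lets the editor match responses to requests when several reads are in flight over the same connection.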
Josh Goldberg
So that's not as scalable as the original sandboxes, where everything runs on the client and it's just a file and API lookup.
Ives van Hoorne
It's not as scalable. Starting a VM for every user for every project is a real challenge. And for at least the last four years, I've been really, really deep into learning how to build efficient infrastructure. We went through multiple iterations of finding ways to run VMs efficiently. Initially, we tried Kubernetes, we tried Docker containers, but we felt like that was too slow. And in 2021, I found a project called Firecracker. It was created by the AWS team at Amazon, because they use it to run AWS Lambda and AWS Fargate. And the really interesting thing about Firecracker is that it's a VM that can run code, but at any point in time you can say, pause this VM, and it will literally just halt. It will not do any execution anymore. And then you can say, write your memory to disk now. And then later on, a day later, for example, you can say, create a new VM exactly from this memory that you wrote to disk, and it will continue exactly where it left off. It could be in the middle of an operation where it's calculating a Fibonacci sequence, for example, and it will just continue. It doesn't matter. That was so interesting. I would say it's very similar to closing your laptop and opening it a day later: it will also just continue. Even if you have a Next.js server running and it's in the middle of a compilation, you can close your laptop, and a day later you can open it and it will continue exactly where it left off. But the interesting thing about this is that one of the things people felt with our initial version of CodeSandbox with the server was that it was slow. Because when you would open a project which hadn't been opened in a long time, you would have to wait for the Create React App server to start. Maybe you would need to run npm install. It could take a long time before you could actually see a preview.
And also when you press fork, we would have to create a copy of that file system, and we would have to do that same process again. And with this approach, we solved both of those problems at the same time. Because whenever someone would go away from a VM, we would just pause it and save the memory to disk. And then when someone later on, like two days later, would open that VM, we would be able to resume the VM exactly from where it was paused, and it would resume in about one second. So that was problem one solved. And that also helped a ton with scaling, because we now have a rule that if someone hasn't looked at a VM for five minutes, we already hibernate it. People won't notice if we hibernate it, because the resuming is so seamless. And the second thing that we do now is, when someone presses fork, we also create a snapshot of the original VM and we use that snapshot to resume the new VM that was created. So when you press fork, we create kind of an exact copy, because it will continue exactly where the last VM left off. And later on, we did optimizations where VMs even share memory. So if you have a VM that started from a snapshot and someone presses fork, then that new VM will share the memory of the old VM. So if two VMs each use two gigabytes of memory, it could be that the total memory usage is 2 gigabytes, because they refer to the same shared memory. And those little tricks made it possible to scale VMs. It is the most challenging thing I've worked on. It's much more challenging than sandboxes, because with sandboxes we would run everything in the browser, so we would not have to run servers to run the user's code. The only thing that we had to provide were the files, and all the execution was on the user's side.
But in this case, we had to create a fast service that can run code but is also secure, because people shouldn't be able to break out of that environment. We're literally offering remote code execution as a service with CodeSandbox. That's such a hard problem.
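Two of the ideas above, the five-minute hibernation rule and forks that share memory until one of them writes, can be illustrated with a toy model. This is only a sketch of the copy-on-write concept: it does not use or imitate Firecracker's API, and `Vm`, `Page`, `distinctPages`, and `shouldHibernate` are hypothetical names.

```typescript
// Five-minute idle rule from the interview.
const IDLE_LIMIT_MS = 5 * 60 * 1000;

function shouldHibernate(lastActivityMs: number, nowMs: number): boolean {
  return nowMs - lastActivityMs >= IDLE_LIMIT_MS;
}

// A page is an immutable chunk of memory; a VM only holds references to pages.
interface Page { readonly data: string }

class Vm {
  private pages: Page[];

  constructor(pages: Page[]) {
    this.pages = pages;
  }

  static boot(contents: string[]): Vm {
    return new Vm(contents.map((data) => ({ data })));
  }

  read(index: number): string {
    return this.pages[index].data;
  }

  // Copy-on-write: a write swaps in a fresh page for this VM only.
  write(index: number, data: string): void {
    this.pages[index] = { data };
  }

  // Fork copies the list of references, not the pages themselves.
  fork(): Vm {
    return new Vm(this.pages.slice());
  }

  pageRefs(): Page[] {
    return this.pages.slice();
  }
}

// Count physical pages across VMs; shared pages are counted once.
function distinctPages(vms: Vm[]): number {
  const seen: Page[] = [];
  vms.forEach((vm) =>
    vm.pageRefs().forEach((page) => {
      if (seen.indexOf(page) === -1) seen.push(page);
    })
  );
  return seen.length;
}

const base = Vm.boot(["kernel", "node", "app"]);
const child = base.fork();
const pagesAfterFork = distinctPages([base, child]); // both VMs share all pages
child.write(2, "app-edited");
const pagesAfterWrite = distinctPages([base, child]); // one extra private page
```

In this model two VMs cost little more than one until they diverge, which is the effect described with the two-gigabyte example.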
Josh Goldberg
When I was at Codecademy, we had issues with people doing incredible amounts of compute, and we had to have all these hacks and cool gotchas around, say, crypto and Bitcoin mining. But you have not only that; you also intentionally give people the ability to call out to the network on the server. So how on earth do you make your boxes secure?
Ives van Hoorne
Yeah, that is challenging. The boxes themselves are very secure. Ultimately, we use the same techniques that AWS Lambda and AWS Fargate use. So every VM has its own Unix user. We use a jailer to make sure that everything is in its own environment. But people can still abuse it. For example, someone could run a crypto miner on the server. Someone could create an account on CodeSandbox and create 20 VMs, and that will go fast. We can spin up those 20 VMs very quickly, which makes it very easy for them, and they can start mining crypto. And crypto miners are the most frustrating people. They are very creative, and they have a lot of time on their hands. The thing that we do right now is we have a detection heuristic that runs every minute inside a VM to detect if a VM is running a crypto miner. And I have to say, right now it's a lot of if statements and else statements based on existing crypto miners and their behavior that we've seen before. And we're now experimenting with training a little neural network that can automatically detect crypto mining behavior based on network calls and process behavior. But it's a cat and mouse game. The other problem that we've had a lot was phishing. People were using CodeSandbox a lot for phishing. They were using it to create fake bank sign-in pages. They were using it to create fake Microsoft login pages. And we were fighting that for a long time. And we also saw disruption in CodeSandbox as a service because of it. For example, Google could block the whole csb.app, which is our preview domain. They could just block the whole domain because of an automated check that said, oh, there were two phishing sites on this domain, so that whole domain should be blocked. And then we had a downtime of like three hours because of that. And then we had to fall back to other domains.
Or there is still an ongoing issue where some ISPs in Turkey have blocked CodeSandbox, the preview domain, just because they saw a couple of phishing pages. Initially we also applied AI to this, actually: we created screenshots of all these public sandboxes and tried to determine whether each one was a phishing sandbox or not. And then we would show warnings to the user or we would proactively block these sandboxes. Nowadays we have sort of given up the fight, in the sense that whenever someone visits a CodeSandbox preview for the first time, and it's a standalone preview, so they open it not from within the editor but from, for example, an email, then we will initially show a big interstitial saying, watch out, this is a CodeSandbox preview. It's not a bank login page; it's used for development purposes. And then they have to press a button saying I understand. And then they get to the preview. And it does affect the experience of CodeSandbox. When you share a preview of a real thing with someone, they still have to go through this interstitial before they can actually access the page. But after deploying this, the amount of phishing pages on CodeSandbox has gone down tremendously. Probably the phishers are looking for new pastures; they're looking for places that don't have something like this. We do still sometimes get emails from these services that detect phishing pages, but we even handle those automatically now. We scan all our emails, and if one is from a sender that we know is a phishing detector, we automatically ban the sandbox when they put a link in there.
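The interstitial rule described above boils down to two conditions: the preview is opened standalone, and the visitor has not yet acknowledged the warning. A minimal sketch, with hypothetical field names and no claim about how CodeSandbox actually tracks acknowledgement:

```typescript
// Hypothetical description of one preview visit.
interface PreviewVisit {
  embeddedInEditor: boolean; // preview shown inside the editor, not standalone
  hasAcknowledged: boolean;  // visitor already pressed "I understand"
}

// Show the warning only for standalone previews the visitor has not accepted yet.
function shouldShowInterstitial(visit: PreviewVisit): boolean {
  return !visit.embeddedInEditor && !visit.hasAcknowledged;
}

const fromEmailLink = shouldShowInterstitial({ embeddedInEditor: false, hasAcknowledged: false });
const insideEditor = shouldShowInterstitial({ embeddedInEditor: true, hasAcknowledged: false });
const returningVisitor = shouldShowInterstitial({ embeddedInEditor: false, hasAcknowledged: true });
```

The rule needs no classification of page content at all, which is why it also covers the ambiguous cases that screenshot-based detection struggled with.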
Josh Goldberg
Yeah, that's tough. There's no right answer here, right? Like, what if I'm, say, a boot camp, and I'm teaching my students how to write a full page in whatever front-end language, and the example app is a login page, one that happens to look like, say, Google's login? How do you know that that's legit versus, you know, the equivalent scammer?
Ives van Hoorne
Yeah, those were the hardest cases. That's why we preferred to show a warning instead of proactively banning sandboxes. At this point it's much easier, because now we just show this interstitial for everyone. And if you trust the person who has sent you that page, then you can open it. And if you get this page from, I don't know, a random text message, and then you get this interstitial saying don't trust any sign-in form on this page, then that covers a lot of cases.
Josh Goldberg
This is yet another example where you've tried the complicated approach, say the AI scanning to detect scams, and then the simple, scalable, cheap one actually is pretty effective.
Ives van Hoorne
Yeah, I didn't reflect on that, but you're right. This simple approach, it solved all of it. And it is also the simplest approach. No fancy AI or detection heuristics.
Josh Goldberg
Before we move on to the "client code runs the entire CodeSandbox" portion of the dev stack, I have one last question on the back end. At Codecademy, I always wanted to build something where, if we did detect someone was crypto mining, we would let them, and then steal their crypto coins and give them something fake in response, just to really stick it to them. Ever tried something like that that you're willing to talk about?
Ives van Hoorne
I have fantasized about this as well. Yeah, I did think about it, never did it. There was a case where people did not just use it for crypto, but were also, and still are sometimes, using it to watch advertisements and get money from it. It's very interesting. They start a browser inside a VM, and then they have a VNC connection to it so you can see the browser, and they do that on like 20 VMs all at once. They go to different pages where they watch advertisements, and then they get money back for watching those advertisements. It's very interesting. And I once found one of those VMs, and the VM was just public, so I could open it. And I saw the VNC window open in the preview. So I just started doing a lot of things inside that VNC. I opened Notepad and I sent a message, and at some point I could see them looking at that sandbox, getting very confused, and closing all the windows because they felt like they were caught. That was funny. Yeah.
Josh Goldberg
The back end team for us, I hear they used to do things to mess with people to make it not worth their while to abuse the platform. It's a great way to stop people.
Ives van Hoorne
Similarly, when we detected crypto miners, for a while we just throttled their sandboxes, their dev boxes, so they would feel like they were mining crypto and everything would still work, everything would still run, but they would only be running at 5% of the speed that the full VM could run. So that's another way to confuse them, I guess.
Josh Goldberg
I love it. Okay, let's move closer to the front end. So let's stick with the original sandboxes for now. Your Postgres database is continuing to scale wonderfully. You've served up the file contents to the user. What happens in the user's browser now?
Ives van Hoorne
Yeah, when we started CodeSandbox, we had a very low budget, so using a server to run the code was out of the question. We were students. We had a budget of like $100 a month. That was the max that we could have. And that was all from student loans in any case. So I started to look into whether I could run webpack in the browser. Webpack was by far the most popular bundler at the time. And initially I got webpack to run, but the bundle size of webpack itself was huge. It was like 8 megabytes, which, even more so for 2017, was huge. So then I started to look more into how these different pieces of code are executed. And I started to try to execute the code myself, to build a very simple version of webpack that would run in the browser to execute the code. And in essence, it all revolves around eval, the JavaScript function eval, where you can give it a string and it will evaluate that code. And everyone says that you should never use eval, but that's the core functionality of CodeSandbox. So that's what we kind of built the startup around. And what happens is we receive all the code. And the first thing we do is we parse all the code. We try to understand which files are used and which files import which other files. This results in a dependency graph, as they call it. With this graph, we transpile all the code. So, for example, your source code could be TypeScript, or it could be JavaScript that is not suitable to run in the browser yet. So we would transpile that code to code that can run in the browser. Then later on, once everything is transpiled, we would execute that code by using eval. And these pieces of code could import other files. And they would do that using a require function call, the CommonJS way of running code. And what we would do is, whenever we would eval the code, we would wrap everything in a function where we provide require as a function.
So we would overwrite the require function with our own function, and whenever the code would call require, we would resolve the requested file and run eval on that code, creating this kind of infinite loop if you have infinite imports. And that is ultimately how it works. The logic itself is not extremely complicated. I have also given a talk, and I think I've done a blog post, about how this works, and I also have a sandbox that has a mini bundler implemented. That is, in a nutshell, how it works. It became more advanced because there are also big challenges. For example, how are you going to support node modules, like installing dependencies alone? Everyone makes a joke that node_modules is bigger than the universe. So how do you make that efficient? That is a challenge in and of itself. But the core functionality is essentially this loop of creating the dependency graph, transpiling all the files in that dependency graph, and then calling eval with the require override on the code.
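The loop described here (build the dependency graph, transpile, then evaluate with an overridden require) can be sketched in a few lines. This is a minimal illustration and not CodeSandbox's actual bundler. It assumes files are keyed by the exact string passed to require, uses a regex instead of a real parser, leaves transpilation as a placeholder, and uses `new Function`, a close relative of eval, to run each module. All function and type names are made up for the example.

```typescript
// Files are CommonJS source strings keyed by the exact specifier used in require().
type Files = Record<string, string>;
type LoadedModule = { exports: unknown };

// Step 1: walk require() calls from the entry file to build the dependency graph.
// A regex scan stands in for a real parser here.
function buildDependencyGraph(files: Files, entry: string): Record<string, string[]> {
  const graph: Record<string, string[]> = {};
  const visit = (path: string): void => {
    if (graph[path] !== undefined) return; // already visited (also stops cycles)
    const source = files[path];
    if (source === undefined) throw new Error(`Cannot find module '${path}'`);
    const deps: string[] = [];
    const pattern = /require\(\s*["']([^"']+)["']\s*\)/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) deps.push(match[1]);
    graph[path] = deps;
    deps.forEach(visit);
  };
  visit(entry);
  return graph;
}

// Step 2: placeholder transpile; a real bundler would run Babel or the TypeScript compiler.
function transpile(source: string): string {
  return source;
}

// Step 3: run a module by wrapping its code in a function that receives our own require.
function evaluate(files: Files, path: string, cache: Record<string, LoadedModule> = {}): unknown {
  if (cache[path] !== undefined) return cache[path].exports;
  const source = files[path];
  if (source === undefined) throw new Error(`Cannot find module '${path}'`);
  const loaded: LoadedModule = { exports: {} };
  cache[path] = loaded; // register first so circular requires terminate
  const localRequire = (dep: string): unknown => evaluate(files, dep, cache);
  // new Function evaluates a string of code, like eval, but with named parameters.
  const run = new Function("require", "module", "exports", transpile(source));
  run(localRequire, loaded, loaded.exports);
  return loaded.exports;
}
```

Calling `evaluate(files, "/index.js")` runs the entry file, and every require inside it recurses into `evaluate` for the requested file, which is the loop described above.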
Josh Goldberg
So code sandboxes are known for having beautiful visuals, right? How does that interact with the DOM then? Or the HTML page?
Eves Van Hoorn
Okay, so with CodeSandbox, if you are looking at the editor page, you're actually looking at two applications. You're looking at the editor itself; that's the Create React App application. But the preview on the right is a completely different application that is rendered inside an iframe, and that iframe refers to a different entry point, so to say. That entry point is the bundler. So when the editor downloads all the code from our Postgres API server, the editor sends that code to the preview iframe. It would call postMessage to send all the files to it. The reason that we have to run everything in an iframe is twofold. One is that it isolates the user application completely from our editor, so they cannot mess with our editor. The other reason is security, because we are ultimately running user code, so they shouldn't be able to access our cookies or our local storage, those kinds of things. So the editor would call postMessage on the iframe and send the user code to that preview. Inside the preview, we would have the bundler sitting idle and listening for any message that comes in. When it received a message, it would do this loop of executing the code, and that would ultimately fill up the preview. Whenever the user changed code, we would just send the whole bundle, everything, from the editor to the preview iframe again. Then the bundler in the preview iframe would make a diff. It would compare the previous version of the code with the new version, and for every file that changed, it would re-evaluate that file and its parents. It would just run eval on those files.
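The diffing step on the preview side can be sketched as two small functions: one that compares the previous and new file snapshots, and one that walks the dependency graph backwards to collect every parent of a changed file. This is an assumed shape for illustration only; the function names and the graph representation (file path mapped to the files it imports) are hypothetical, not CodeSandbox's code.

```typescript
type Files = Record<string, string>;

// Paths whose contents were added, changed, or removed between two snapshots.
function changedFiles(previous: Files, next: Files): string[] {
  const changed: string[] = [];
  for (const path of Object.keys(next)) {
    if (previous[path] !== next[path]) changed.push(path);
  }
  for (const path of Object.keys(previous)) {
    if (!(path in next)) changed.push(path);
  }
  return changed;
}

// graph maps each file to the files it imports. Starting from the changed files,
// walk the edges backwards so every importing parent is re-evaluated as well.
function filesToReevaluate(changed: string[], graph: Record<string, string[]>): string[] {
  const dirty: string[] = [];
  const queue = changed.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (dirty.indexOf(current) !== -1) continue; // already scheduled
    dirty.push(current);
    for (const parent of Object.keys(graph)) {
      if (graph[parent].indexOf(current) !== -1) queue.push(parent);
    }
  }
  return dirty;
}

// In the browser, the editor side would deliver a new snapshot with something like
//   iframe.contentWindow.postMessage({ type: "compile", files }, previewOrigin);
// where the message shape and previewOrigin are assumptions for this sketch.
```

Sending the whole snapshot and diffing on the receiving side keeps the editor simple: it never needs to know which files the bundler has already evaluated.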
Josh Goldberg
So you're using, let me get this straight, eval, iframes, and window.postMessage.
Eves Van Hoorn
Yeah, yeah, that's the core functionality. Yeah. It sounds counterintuitive, right?
Josh Goldberg
Yes. But then on the inside you're also building up a full dependency graph and doing dynamic reloads of the graph based on impacted changes from the editor. So you have this incredible mixture of pre 2017 era tech with very lovely computer science concepts for efficient redeploys.
Eves Van Hoorn
Yeah, that's the interesting part, because you're essentially building two things. You're building a, I would not say hardcore, but a pure developer tool like webpack. But you're also building a UI around it, a developer experience around it. And that opens up some unique opportunities because you control the whole stack. A simple thing that we did, for example, is whenever we would get an error inside the bundler that it could not resolve a dependency. Let's say you import Lodash, but you have not installed Lodash yet. We would create a specialized error message in the bundler saying, oh, we could not resolve lodash, with a suggestion that you could click which says install lodash. When you click that button, the bundler would postMessage back to the editor saying we need to install lodash. Then the editor would show the UI of installing Lodash, install Lodash, and call re-evaluate on the bundler. That is just one of those examples where, because you control the whole experience, you can create a pretty nice experience. Another one was someone in our Discord who was frustrated because they were using CodeSandbox for a job interview and they hadn't capitalized their React components. The React component was not working. I think it was back in the day when React components had to be capitalized to work, and because of that they failed their interview. So as a response I created a small detection heuristic for that too, where if we detect an error about a custom component that's not a default HTML element, an error that React would throw, then we would show a suggestion like, have you capitalized your component? And we would have a button, capitalize component. When you clicked it, we would update the code to capitalize the components, to catch those kinds of things.
And that was the most exciting to me because it's a mix of like the core developer tooling, building a bundler, but there is also developer experience and user experience that you can refine based on what comes back from that bundler.
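A heuristic like the two described here amounts to matching an error message and turning it into a clickable suggestion. The sketch below is a guess at the general shape, not the real implementation: the `Suggestion` type, the action names, and the exact error strings being matched are assumptions made for the example.

```typescript
// Hypothetical suggestion payloads the bundler could post back to the editor.
type Suggestion =
  | { action: "install-dependency"; title: string; dependency: string }
  | { action: "capitalize-component"; title: string; from: string; to: string };

function suggestFix(errorMessage: string): Suggestion | null {
  // Case 1: an import could not be resolved, so offer to install it.
  const missing = /Cannot find module '([^']+)'/.exec(errorMessage);
  if (missing !== null) {
    return { action: "install-dependency", title: `Install ${missing[1]}`, dependency: missing[1] };
  }
  // Case 2: a lowercase tag that is not a known HTML element, so offer to capitalize it.
  const unknownTag = /The tag <([a-z][A-Za-z0-9]*)> is unrecognized/.exec(errorMessage);
  if (unknownTag !== null) {
    const from = unknownTag[1];
    const to = from.charAt(0).toUpperCase() + from.slice(1);
    return { action: "capitalize-component", title: `Capitalize <${from}> to <${to}>?`, from, to };
  }
  return null; // no suggestion for this error
}
```

Because the editor and the bundler are built by the same team, the editor can act on the returned `action` directly, for example by adding the dependency and asking the bundler to re-evaluate.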
Josh Goldberg
Did the person ever respond that they got a job afterwards or did that help them?
Eves Van Hoorn
No correspondence after. I didn't implement any analytics to know how often that button was pressed. Maybe no one has seen it afterwards, but who knows.
Josh Goldberg
Just one clarification point on the technical side. When you say install, what is being installed in the client?
Eves Van Hoorn
Yeah, installing dependencies. It has changed over time. The biggest challenge there was installing npm dependencies, because npm dependencies tend to be pretty big. Sometimes an npm dependency could be 2 megabytes. That's not a big problem in and of itself, but that dependency could also say, I have 20 other dependencies, and those dependencies could also be 2 megabytes. Suddenly you have 80 megabytes of dependencies you have to download. So we added support for npm dependencies by creating a separate service, an AWS Lambda service. We would tell that Lambda service, I want Lodash, for example, and the Lambda would install Lodash and then look at which files have a high probability of being required to run the dependency. It would look at the entry point. For example, Lodash would say, oh, my entry point is in source index. We would then look at that file and at all the imports in that file, and we would again go through the same process of creating a dependency graph. We would make sure to only include the files that were required to run the main dependency, and we would send that back as JSON to the bundler itself. If the user then required a file that was not included in this bundle, we would manually download that single file from a service called unpkg, which hosts the files for npm dependencies. So when I say installing a dependency, the only thing the editor does is add, for example, Lodash to the list of dependencies. Ultimately the bundler is still the one that calls a service to download the files required for that dependency.
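The fallback described at the end of this answer, serving a file from the pre-built bundle when it is there and fetching the single file from unpkg when it is not, could look roughly like this. The `DependencyBundle` shape is a hypothetical stand-in for the JSON the packager returns; the unpkg URL follows its public `package@version/file` pattern.

```typescript
// Hypothetical shape of the packager's JSON response: file path -> source code.
type DependencyBundle = { name: string; version: string; files: Record<string, string> };

type Lookup = { kind: "bundled"; code: string } | { kind: "fetch"; url: string };

// Decide where a required file comes from: the bundle, or a one-off download.
function lookupFile(bundle: DependencyBundle, filePath: string): Lookup {
  const code = bundle.files[filePath];
  if (code !== undefined) return { kind: "bundled", code };
  const trimmed = filePath.replace(/^\/+/, ""); // unpkg paths have no leading slash here
  return { kind: "fetch", url: `https://unpkg.com/${bundle.name}@${bundle.version}/${trimmed}` };
}
```

Returning the URL instead of fetching inside the function keeps the decision synchronous; the caller can then download the file and add it to the bundle so the next lookup hits the fast path.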
Josh Goldberg
And this is all cached in Redis?
Eves Van Hoorn
This is cached in S3. Yeah, it's something that came later. And it's very interesting how it works, because working with dependencies is itself a very interesting challenge. What if, for example, you install dependency A and it requires Lodash version 2, and then you also install dependency B and it requires Lodash version 3? How can you make that work? Or if you install dependency A and it requires any Lodash version in the 2.x range, and then you install dependency B and it requires specifically Lodash version 2.5, how are you going to make sure that both get version 2.5, to be most efficient in the download? So there are some very interesting challenges with building up this dynamic npm install service. And so we don't just cache singular dependencies, we also cache combinations of dependencies. For example, for React and React DOM, we cache that combined, merged bundle, but we also cache React, React DOM, Lodash, and React Icons, for example, as a combined bundle, so that we don't have to recompute these combinations every time. So yeah, that bucket is huge. It's a couple of terabytes, I think.
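Caching combinations only works if the same set of dependencies always maps to the same cache entry, whatever order the user listed them in. One simple way to get that, shown here as an assumption and not as the key format CodeSandbox actually uses, is to sort the names before joining them. The in-memory object stands in for the S3 bucket.

```typescript
type Dependencies = Record<string, string>; // package name -> resolved version

// Deterministic key: the same set of dependencies yields the same string in any order.
function combinationKey(deps: Dependencies): string {
  return Object.keys(deps)
    .sort()
    .map((name) => `${name}@${deps[name]}`)
    .join("+");
}

// Get-or-compute: build the merged bundle only the first time a combination is seen.
function getCombinedBundle(
  cache: Record<string, string>,
  deps: Dependencies,
  build: (deps: Dependencies) => string
): string {
  const key = combinationKey(deps);
  if (cache[key] === undefined) cache[key] = build(deps);
  return cache[key];
}
```

With this scheme a second user asking for the same React and React DOM versions gets the stored bundle without any rebuild.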
Josh Goldberg
So there's a benefit then to users using common combinations of packages. If I'm using React and React DOM the same way everyone else is, that's already pre-cached before I make that installation request.
Eves Van Hoorn
Yeah, for sure. I think React and React DOM is the most common combination. And this whole S3 bucket is cached; Cloudflare is in front of it, so we also don't have a lot of requests to S3. Dependency installation is very interesting: if you want to run a sandbox with a dependency combination that has been used by another user before, then installation will be within a second. It would just be a matter of downloading the file from the S3 bucket. It would already be pre-bundled. Sometimes we even pre-transpile the source files to make them faster to run, so that the bundler doesn't have to transpile them anymore. We would also embed the dependency graph in the dependencies, to know which file imports which file, so the bundler would save work there too. The bundler wouldn't have to parse all those files to run the code. And that is the interesting thing. You could see it as a super fast npm, because you are most likely using a dependency combination that has been used by someone else before, and in that case you would just have to download that file from an S3 bucket.
Josh Goldberg
Okay, so we've covered most of this stack. We're nearing the end. We've got the database and the correlated S3 and Redis caches around it. When I make an installation request, you've got a whole bunch of heuristics and strategies there, and that all gets sent to the client in some kind of transpiled, bundled form. And then you have the two apps on the client, the editor and the preview or viewer. Whenever the editor causes a file to be changed, we postMessage over to the iframe to eval the code as necessary. I think the last bit I want to touch on is the editor itself, because that on its own is an entirely separate, complex, and interesting application: a code editor that works in the browser. How does that work?
Eves Van Hoorn
I would argue that's the most complicated part of that specific stack. The first version of the editor was very simple. It was all React components. The File Explorer was React components, and the code editor itself was CodeMirror, which is a library created by a Dutch person. Later on we would move our code editor to Monaco. Monaco is a little piece of VS Code. VS Code has an incredible code base, one of the best code bases I have ever seen. It has really good composition, it has really good APIs. And you can see that it's so good because they were able to extract the core editor from VS Code, make it a library, and keep it in sync with VS Code relatively easily. So Monaco is essentially the VS Code editor, but exposed as a library. It doesn't have all the fancy things from VS Code like extension support or a File Explorer. It just has the editing part, but it works, and it works really well. So we moved to Monaco because it had a more familiar experience for everyone using VS Code. When CodeSandbox started, Atom was the most popular editor, but later on VS Code became the most popular editor. But there was always a case where people were wondering if they could do the same thing as in VS Code. And a code editor is an incredibly complicated piece of technology. You have things like the command palette, you have key bindings, you have themes, because developers want themes. You have settings for all the little things like line height, font size, the font itself, how the editor should behave when you press Alt and move your cursor around, how multi-cursor should work. There were a ton of settings. And then you have extensions, which alter the editor behavior or add new editor behavior. So a code editor is one of the most ambitious UIs to ever work on. And it's an incredible thing to work on. You learn a ton.
Oh, and performance. Just making it performant, so that when you click on a file it can open that file within 10 milliseconds, is incredibly hard. Initially we were doing everything with Monaco. Monaco was the core editor, but we had our own UI built around it, and because of that we were able to create a lot of custom experiences. But later on, I became more and more interested in VS Code. In 2018, during my Facebook internship, I was looking at a way to make VS Code run in the browser. VS Code didn't run in the browser back then. Initially, VS Code was built to run in the browser, but it was not performant enough, so they made VS Code an Electron application. It used web technologies to render the UI of VS Code, but it did not run in the browser because it used things like the Node file system, a lot of Node APIs. So I was looking at whether we could use VS Code in CodeSandbox, because that would save a lot of work. It would create familiarity for developers, like their own custom key bindings, those kinds of things. We could start to allow extensions. It would be faster, because ultimately VS Code is much faster than React, mostly because VS Code is imperative instead of declarative. It would have a ton of benefits. Making VS Code run in the browser ultimately came down to emulating Node in the browser, because Node is ultimately JavaScript with native Node modules, for example file system, net, HTTP, OS. If you create a browser-equivalent version of all of these Node modules, then essentially you're reimplementing Node and you're tricking VS Code into thinking that it's running on your local computer while it's running in the browser. That's initially what I did to make VS Code run in the browser. Later on the VS Code team actually ported VS Code back to the browser, and now with CodeSandbox we're running the browser version of VS Code.
We're still emulating some node components to run extensions sometimes in the browser. But if there is for example a vm, then we use VS Code Server so that you have the native VS code experience. But we do have a little add on that we put on VS code that allows us to render React components inside the VS code UI and that allows us to create custom experiences still using React but within the context of VSCode.
Josh Goldberg
So how does that work with type information? Let's say I want to hover my mouse over Lodash, import it from Lodash and see what are the methods available under it.
Eves Van Hoorn
Yeah, that's a very interesting question. If we are looking at the Dev Boxes version, it's quite simple, because VS Code Server gets the files from the file system. In fact, the TypeScript extension runs on the server itself and reads the files. That's kind of a solved problem. But for sandboxes that run completely in the browser, we have a bit of an interesting solution. We run VS Code extensions in the browser. Those VS Code extensions expect to run in a Node environment. When they do require fs, they expect to have the fs module available. They want to be able to run readFile, readdir, writeFile, all those kinds of things. So the first thing that we did was implement the file system, but to run in the browser. So we can run VS Code extensions in the browser. They will think that they are in a Node environment, but they're running in the browser. And the second thing is, whenever you install a dependency like Lodash, we do essentially the same thing that we did for the bundler. We have a service that calls npm install for Lodash, and then it looks at all the type files that are installed and creates, again, a JSON bundle that only includes the type files, so only the .d.ts files or the .ts files. Then in the browser we would download this bundle and write all of those files into our in-memory file system, as we call it, this fake file system that we've created in the browser. And when the TypeScript extension runs, it would again think that it's in a Node environment, and it would read files from node_modules. Those files are all coming from this in-browser file system that was populated from the dependency packager.
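The idea of a fake file system that is populated from a JSON bundle of type files can be illustrated with a small in-memory class. This is a toy sketch under stated assumptions: the method names mirror a few of Node's synchronous fs functions but accept only strings, the class name and the bundle shape (absolute path mapped to file contents) are hypothetical, and the real emulation layer covers far more of the fs API.

```typescript
// A toy in-memory stand-in for a small part of Node's fs module.
class InMemoryFs {
  private files: Record<string, string> = {};

  writeFileSync(path: string, contents: string): void {
    this.files[path] = contents;
  }

  readFileSync(path: string): string {
    const contents = this.files[path];
    if (contents === undefined) {
      throw new Error(`ENOENT: no such file or directory, open '${path}'`);
    }
    return contents;
  }

  // Names of the direct children of dir, derived from the stored file paths.
  readdirSync(dir: string): string[] {
    const prefix = dir.replace(/\/+$/, "") + "/";
    const names: string[] = [];
    for (const path of Object.keys(this.files)) {
      if (path.indexOf(prefix) !== 0) continue;
      const name = path.slice(prefix.length).split("/")[0];
      if (names.indexOf(name) === -1) names.push(name);
    }
    return names.sort();
  }
}

// Write every file from a downloaded type bundle into the in-memory file system.
function installTypes(fs: InMemoryFs, bundle: Record<string, string>): void {
  for (const path of Object.keys(bundle)) fs.writeFileSync(path, bundle[path]);
}
```

An extension that is handed this object in place of the real fs module would read `/node_modules/lodash/index.d.ts` as if it were on disk, which is the trick described above.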
Josh Goldberg
Is that a different service, just to clarify, from the Runtime JS file service?
Eves Van Hoorn
Yeah, but it shares a lot of code. I essentially copied the code from the dependency packager and then changed some things to make it more fitting for type fetching. I have to say, that's how it worked for the first six years. Recently I've made a change so that we actually download the tar file directly from npm. Browsers have become better and better, and at this point, unpacking a gzipped tar file is really fast in the browser. So for a lot of dependencies, we don't go to our own servers anymore. We literally download the tar file directly from the npm registry, unzip it in memory, and then save it. It's a bit less efficient, but we had to do it because half of our bandwidth usage was just from this type fetcher, and Cloudflare told us that we had to go on a more expensive plan. And I was thinking, well, what if we just download directly from the npm registry? Then we don't pay for the bandwidth of those bundles. And the funny thing is, because of that, we significantly reduced our Cloudflare bill. But Cloudflare backs the npm registry, so Cloudflare itself doesn't notice a difference in bandwidth. In fact, the bandwidth has increased, because tar files aren't these perfect bundles of typings. But at least we don't get billed for that bandwidth.
Josh Goldberg
Microsoft owns npm. If they want to reduce that, they can always add a feature to NPM to help you out.
Eves Van Hoorn
Yeah, that's true. It was an interesting thing. Unpacking a gzipped tar file definitely would have been super inefficient back in 2017 or 2018, but nowadays you have, I think, a browser feature called DecompressionStream, which makes it super fast because ultimately it uses the native decompression of the computer.
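Once the gzip layer has been removed (in a browser, by piping the response through `new DecompressionStream("gzip")`), what remains is a plain tar archive: a sequence of 512-byte headers, each followed by the file's bytes padded to a multiple of 512. The sketch below reads that layout. It is a simplified illustration, not the code CodeSandbox ships: it handles regular files only and ignores the ustar prefix field and pax extended headers that long paths use.

```typescript
// Parse an uncompressed tar archive into a map of file name -> file bytes.
function parseTar(bytes: Uint8Array): Record<string, Uint8Array> {
  const decoder = new TextDecoder();
  // Header fields are NUL-terminated (or NUL-padded) ASCII strings.
  const readString = (start: number, length: number): string => {
    const field = bytes.subarray(start, start + length);
    const end = field.indexOf(0);
    return decoder.decode(end === -1 ? field : field.subarray(0, end));
  };

  const files: Record<string, Uint8Array> = {};
  let offset = 0;
  while (offset + 512 <= bytes.length) {
    const name = readString(offset, 100); // bytes 0-99: file name
    if (name === "") break; // an all-zero block marks the end of the archive
    const size = parseInt(readString(offset + 124, 12).trim(), 8) || 0; // octal size
    const typeflag = readString(offset + 156, 1); // "0" or empty means regular file
    const dataStart = offset + 512;
    if (typeflag === "0" || typeflag === "") {
      files[name] = bytes.subarray(dataStart, dataStart + size);
    }
    // File data is padded up to the next 512-byte boundary.
    offset = dataStart + Math.ceil(size / 512) * 512;
  }
  return files;
}
```

For an npm tarball, the resulting names start with `package/`, so writing the type files into an in-memory file system is a matter of filtering for `.d.ts` entries and rewriting that prefix.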
Josh Goldberg
It is really beautiful how the browsers have made things like this possible. And more and more over time, more and more of your stack is kind of becoming built in.
Eves Van Hoorn
True. I would even say that if I would rebuild codesandbox today I would probably not build the bundler as it is today because right now everything is based on common JS with this require override. I think if I would build it today, I would use the ES module capabilities of the browser and then perhaps use a service worker that acts similar to a Vite dev server. It would. Whenever you try to download a file, a JavaScript file, our service worker would maybe transpile that file and give it back. But then the execution and bundling is done by the browser. It's not done by us anymore. We would not even be using Eval. Yeah, it would essentially it would be fully esm. It would be like a vite. It would literally be an in browser version of Vite. If I would rewrite everything, I would do it that way because it's much more native to the browser and the world is moving to ESM in any case eventually, right? Eventually it'll take a while.
Josh Goldberg
Do you think you'll ever have time to try out, say in a few years, the fully in-browser dream bundler and transpiler for CodeSandbox?
Eves Van Hoorn
Yeah, if no one else has done it. That's also the big question because I think that's a big opportunity and I can understand if people would explore it right now. The bundler as it works today works really well. I'm a bit worried that if I would rewrite it completely using this, that we would introduce bugs. It would still be an interesting experiment. Yeah, it is something that I would want to explore. I had it in my mind for a while, but I'm not sure when I will explore it.
Josh Goldberg
What's another long term exploration that you would love to get to if and only if you had time?
Eves Van Hoorn
What I find very interesting about this whole firecracker stack that we now have for development environments is I think the fact that we can clone a VM within a second, that we can resume a VM within a second. That's incredibly powerful technology that is not just applicable to development environments, but also CI CD systems, even to deployments. Yeah, I think there's a whole plethora of different use cases for that technology. I would love to generalize this infrastructure so that people can use it for other purposes that go beyond the cloud development environments. Yeah, I think it's incredibly powerful. It could be used for so much more. So that's something that I would find very interesting. Like making it so that the infrastructure that we have today, we can open source it, that we can make it generic enough that people can use it for their own use cases.
Josh Goldberg
A lot of companies, a lot of projects have CI CD times where you have whatever, let's say a half dozen tasks. Each of Those tasks spends 20 seconds in NPM or PNPM or yarn install, and then 3 seconds in the task itself. It's very frustrating. It's very wasteful.
Eves Van Hoorn
Yeah. And this technology, I was thinking you could, for example, one thing that you could do is you could do all the preparation that needs to be done for cicd and then create a snapshot. And whenever you need to do a CI CD run, it will continue exactly from that snapshot. It will just then pull the latest code and then run the tests. But also for parallelization, like if you, for example, want to run a test suite across 12 different workers, what you could do is you could create a snapshot, you could do all the preparation work on one VM and then after that you clone this VM 12 times. And all these 12 VMs, they run a different part of the test suite in parallel. And one of the things that we even support or we have never deployed it, but it's something, it's an experiment that I built, is that you can do cloning of VMs over the network. Even so you have a VM running on one machine and then another machine starts a clone of that VM and they use a network to sync the memory between the VMs. That's also a capability. Then you could really think like if you have like a CICD cluster that is like, I don't know, 20 servers big, you could do the preparation on one machine and then clone that machine to all the other 20 machines. And they run all 1/20 of the test stack using the result of the preparation that this single server has done.
Josh Goldberg
And it's actually worth it, both time and compute to do it that way.
Eves Van Hoorn
Yeah, I'd say so. It's just an idea. It's like you probably will find things that don't work in practice, but that always happens when you try things. I also wonder, for example, there's now a ton of research being done in how to efficiently train AI models using hundreds of servers at the same time. Because ultimately that's the biggest challenge. Like, how can you scale so much, so much compute and data transfer in a data center? I think those challenges that they have, they are not the same, but they are in a similar vein, which is an interesting space, at least.
Josh Goldberg
How soon will I be running my CI jobs on codesandbox then? Given all your caching.
Eves Van Hoorn
I think not soon. Well, it would be good to explore different use cases and work on the different capabilities for the stack itself. In fact, you could already run your CI/CD today. It's just that the platform has not been built for it. But we can expose an API, or you can even create a Dev Box that runs CI/CD. Let's say next year. Then next year you can ask me, can I run CI/CD? And then I will say, yes, we have this API. And then you can give it a Docker image and it will run your code.
Josh Goldberg
We'll follow up in a year. Same with the AI models.
Eves Van Hoorn
Great.
Josh Goldberg
This has already run longer than our typical SED interviews. Because you've got so much fascinating tech to talk about. I want to end on a less technical, perhaps still fascinating note on your personal points. Can you tell us a little bit about volleyball and why it's such a fantastic game to play?
Eves Van Hoorn
Yeah, I mentioned this before: in my free time, I play a lot of volleyball. Volleyball is something that I started doing when I was, I don't know, nine years old or something, and I did it a lot during high school. I did it an extreme amount. I think I trained six times a week. I got a bit burnt out on it during high school. I was, I guess, a bit too competitive. So I stopped for a while and didn't play much volleyball. But I recently picked it up again, about a year ago. And it's so incredibly good to play volleyball, because during the training or during the match I completely forget everything surrounding it. I could be worrying about something, but when I play volleyball, I don't worry about anything. I just worry about the ball. So it's very good from a psychological vantage point. But I'm also much fitter now, and the day after I feel much more energized because of the sport. So, yeah, I'm a big fan of volleyball. It's kind of a reminder for me that it's important to not just do coding or meetings the whole day. Ultimately, it's also important to stay healthy and exercise.
Josh Goldberg
I think a lot of people maybe did high school volleyball just hanging around in gym, or maybe have done some casual beach volleyball with friends. What is different about your areas and brands of volleyball, where we're not just standing around awkwardly waiting for the ball to come near us?
Eves Van Hoorn
Oh, so you mean what makes it fast?
Josh Goldberg
What makes it a good workout? You know, what are you doing?
Eves Van Hoorn
When I play volleyball, I'm mostly found on the ground. There's a lot of diving going on. If you want to get to a ball and it's just out of reach, you jump. Recently I also started to track my heart rate during volleyball to see how intense it is. I've now done it two times, and my heart rate goes up to 195 when I play volleyball. Especially at the net, you jump a ton. For example, if I play mid, whenever the ball goes to the setter, you have to jump and fake an attack. You always jump when the setter has the ball, because that way you confuse the opponent's blockers about where the ball will be attacked. So in just one rally, you can jump two or three times, and then you have like 40 rallies, and that's times four, 160 times. Yeah, you jump like 350 times or something in a single match. It's very intense in that sense. It's very explosive. There's no long running, but it requires a lot of short bursts of energy.
Josh Goldberg
That sounds very intense. It's also very psychological. You're not turning your brain off. You're still thinking.
Eves Van Hoorn
Yeah, it's true. There's a lot of prediction going on. You can confuse your opponents, they can confuse you. But what I like the most is the competitive aspect of it, like winning matches, or losing matches too. That makes it a lot of fun compared to, for example, running. I also do a lot of running nowadays, but for me, when it comes to sports, I find something mostly exciting when it's a competition. That keeps me in it, I guess.
Josh Goldberg
Great. Well, this has been an absolutely phenomenal interview. Ives, thank you so much for talking us through the start of CodeSandbox: the back end, the database, the caching layers, the front end. This is a lot of stuff. If folks want to find out more about you and/or CodeSandbox on the Internet, where would you direct them?
Eves Van Hoorn
My Twitter is where I'm most active. I would say that's CompuIves, C-O-M-P-U-I-V-E-S. That's my Twitter. And CodeSandbox can be found on codesandbox.io or codesandbox.com or codesandbox.org. That's a small tidbit. We went with codesandbox.io because the domain was only $30 and codesandbox.com was $3,000. But then later on, a lawyer who learned how to code using CodeSandbox gifted the codesandbox.com domain to us. That's how we ultimately got the codesandbox.com domain.
Josh Goldberg
Well, all that being said, this has been Josh Goldberg and Ives van Hoorne with Software Engineering Daily. Thanks all. Cheers.
Eves Van Hoorn
Thanks.
Podcast: Software Engineering Daily
Host: Josh Goldberg
Guest: Eves Van Hoorn, Co-Founder of CodeSandbox
Release Date: December 4, 2024
In this engaging episode, Josh Goldberg interviews Eves Van Hoorn, the co-founder of CodeSandbox, a prominent cloud-based development platform. They delve deep into the inception, evolution, and technical intricacies of CodeSandbox, providing listeners with valuable insights into building scalable and efficient development tools.
Eves Van Hoorn shares the foundational story of CodeSandbox, highlighting his early interest in programming driven by a desire to create a tool for translating secret messages with a friend. This initial foray into coding, using Visual Basic, laid the groundwork for his future endeavors.
Eves Van Hoorn (00:02:00):
"I initially started with coding not because I wanted to learn coding, but because I had a need... that was my first interaction with coding."
Transitioning from graphic design to coding during his gap year, Eves was motivated by the challenge and problem-solving nature of programming. His experience with Ruby on Rails and the emergence of React inspired him to create a more efficient code-sharing platform.
Eves Van Hoorn (00:04:00):
"I started moving to coding, thinking it's like solving puzzles, and there's only one solution... That's when I started thinking that it would be very cool if they could just send me running code."
The first version of CodeSandbox was launched on April 2, 2017, initially serving a niche use case of sharing runnable code snippets for job interviews and bug reports.
Eves elaborates on the backend stack, emphasizing the robustness and scalability of Postgres combined with Elixir and the Phoenix framework.
Eves Van Hoorn (00:08:28):
"The stack was Create React app, frontend, Elixir backend, Postgres database, Redis... we deployed all of this to a VPS on DigitalOcean for about $20 per month."
Remarkably, this setup handled up to 500,000 monthly users without significant performance degradation. The decision to store files directly in Postgres, treating each file as a row in a table, proved both simple and effective.
Ives van Hoorne (00:10:00):
"We store everything in Postgres so every file would just be a row in a Postgres table. When you press fork, we copy all the files with selects and inserts in Postgres... and it never hit limits."
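The fork-by-copy idea in this quote can be sketched in a few lines. This is a minimal illustration, not CodeSandbox's actual schema or code: it uses Python's built-in sqlite3 as a stand-in for Postgres, and the `files` table, its columns, and the `fork_sandbox` helper are all hypothetical names chosen for the example.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per file, grouped by sandbox_id.
    conn.execute(
        """
        CREATE TABLE files (
            id INTEGER PRIMARY KEY,
            sandbox_id TEXT NOT NULL,
            path TEXT NOT NULL,
            content TEXT NOT NULL
        )
        """
    )


def fork_sandbox(conn, source_id, fork_id):
    """Copy every file row of source_id into a new sandbox fork_id."""
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            """
            INSERT INTO files (sandbox_id, path, content)
            SELECT ?, path, content FROM files WHERE sandbox_id = ?
            """,
            (fork_id, source_id),
        )
    return cursor.rowcount


# sqlite3 in-memory database stands in for Postgres here (assumption).
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany(
    "INSERT INTO files (sandbox_id, path, content) VALUES (?, ?, ?)",
    [
        ("original", "/index.js", "console.log('hi')"),
        ("original", "/package.json", "{}"),
    ],
)

copied = fork_sandbox(conn, "original", "fork-1")
forked_paths = [
    row[0]
    for row in conn.execute(
        "SELECT path FROM files WHERE sandbox_id = ? ORDER BY path", ("fork-1",)
    )
]
```

The whole fork is a single `INSERT ... SELECT`, so the copy happens inside the database without file contents passing through the application.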
The innovative approach of storing sandbox files in Postgres facilitated seamless forking and sharing, maintaining high query performance even with over 500 million files.
Ives van Hoorne (00:12:13):
"Postgres has exceeded my expectations time and time again. If you have a good index on multiple columns, then the performance is incredible."
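To make the point about multi-column indexes concrete, here is a small sketch. It again uses sqlite3 as a stand-in for Postgres, and the table, the index name `idx_files_sandbox_path`, and the queries are assumptions for illustration. The query planner's output shows whether a lookup can use the composite index or has to scan the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    "id INTEGER PRIMARY KEY, sandbox_id TEXT, path TEXT, content TEXT)"
)
# Composite index covering the two columns a file lookup filters on.
conn.execute("CREATE INDEX idx_files_sandbox_path ON files (sandbox_id, path)")


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


indexed_plan = query_plan(
    conn,
    "SELECT content FROM files WHERE sandbox_id = ? AND path = ?",
    ("abc", "/index.js"),
)
unindexed_plan = query_plan(
    conn,
    "SELECT path FROM files WHERE content = ?",
    ("hello",),
)
print(indexed_plan)
print(unindexed_plan)
```

The first query filters on the indexed columns and can be answered with an index search. The second filters on `content`, which is not indexed, so the planner falls back to scanning. In Postgres the equivalent check is `EXPLAIN` on the same query.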
On the frontend, CodeSandbox is built with React and later moved to the Monaco Editor, the editor component that powers Visual Studio Code, to improve the editing experience.
Ives van Hoorne (00:35:11):
"Monaco is essentially the VS Code editor, but exposed as a library. It doesn't have all the fancy things from VSCode like extension support, but it works really well."
This shift allowed for a more familiar and efficient editing environment, aligning with developers' expectations from popular coding editors.
As CodeSandbox grew, the introduction of Dev Boxes addressed the limitations of the initial sandbox model, which was primarily suited for small projects.
Ives van Hoorne (00:15:41):
"Dev Boxes became a more sophisticated way to retrieve files, running projects on the server instead of the client, facilitating full development environments."
Dev Boxes run on virtual machines (VMs) to handle larger projects, letting users clone and manage environments efficiently. Adopting Firecracker, the open-source microVM technology developed by AWS, significantly improved the scalability and performance of these VMs.
Ives van Hoorne (00:18:34):
"Firecracker allows us to pause and resume VMs seamlessly, enabling efficient scaling and quick cloning for user projects."
Running user code in a cloud environment introduces significant security concerns. Ives discusses the measures taken to secure Dev Boxes against abuse such as crypto mining and phishing.
Ives van Hoorne (00:23:44):
"We use the same techniques that AWS Lambda uses. Every VM has its own Unix user and we use a jailer to ensure isolation."
To combat crypto mining and phishing, CodeSandbox employs detection heuristics and interstitial warnings, effectively reducing abuse without severely impacting user experience.
Ives van Hoorne (00:28:30):
"We show an interstitial warning to users accessing standalone previews, informing them of potential phishing attempts and ensuring legitimate use."
Managing npm dependencies efficiently is crucial for a smooth developer experience. Ives explains how CodeSandbox handles dependency installation and caching to optimize performance.
Ives van Hoorne (00:40:03):
"We created a separate service using AWS Lambda to install dependencies and only include necessary files, reducing bandwidth and improving load times."
By caching common dependency combinations in S3 and leveraging Cloudflare, CodeSandbox ensures rapid installation and loading of frequently used packages, enhancing overall responsiveness.
Ives van Hoorne (00:43:15):
"If you're using common combinations like React and React DOM, the installation is near-instant because it's pre-bundled and cached."
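One way to picture caching by dependency combination is a lookup keyed on a hash of the sorted dependency list. The sketch below is an assumption about how such a cache could work, not a description of CodeSandbox's service: `combination_key`, `BundleCache`, and `fake_install` are hypothetical names, and an in-memory dict stands in for S3.

```python
import hashlib
import json


def combination_key(dependencies):
    """Derive a stable cache key from a {name: version} mapping."""
    # Sorting makes the key independent of the order in package.json.
    canonical = json.dumps(sorted(dependencies.items()), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class BundleCache:
    """In-memory stand-in for an object store such as S3 (assumption)."""

    def __init__(self, install):
        self._install = install  # slow path: actually install and bundle
        self._store = {}
        self.installs = 0

    def get(self, dependencies):
        key = combination_key(dependencies)
        if key not in self._store:
            self.installs += 1
            self._store[key] = self._install(dependencies)
        return self._store[key]


def fake_install(dependencies):
    # Placeholder for the real install step; returns a fake bundle label.
    parts = (f"{name}@{version}" for name, version in sorted(dependencies.items()))
    return "bundle:" + ",".join(parts)


cache = BundleCache(fake_install)
first = cache.get({"react": "18.2.0", "react-dom": "18.2.0"})
second = cache.get({"react-dom": "18.2.0", "react": "18.2.0"})
```

The second request lists the same packages in a different order, yet it maps to the same key and is served from the cache without another install.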
Ives shares his vision for using Firecracker's VM capabilities beyond development environments, for example in CI/CD systems and large-scale deployments. Cloning VMs efficiently could remove much of the setup cost from continuous integration runs.
Ives van Hoorne (00:55:33):
"We could use snapshots of VMs for CI/CD runs, allowing parallelization and significantly reducing setup times for test suites."
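A rough back-of-the-envelope model shows why snapshots matter for parallel CI. Everything in this sketch is an illustrative assumption: the test durations, the 120-second cold setup, and the 2-second snapshot restore are made-up numbers, and `wall_time` is a hypothetical helper that assigns tests to workers greedily.

```python
def wall_time(test_seconds, workers, startup_seconds):
    """Estimate the wall-clock time of a sharded test run.

    Tests are assigned longest-first to the currently lightest worker,
    and every worker pays startup_seconds before running its shard.
    """
    shards = [0.0] * workers
    for duration in sorted(test_seconds, reverse=True):
        lightest = shards.index(min(shards))
        shards[lightest] += duration
    return startup_seconds + max(shards)


# Hypothetical numbers, for illustration only.
tests = [60, 45, 30, 30, 15]  # seconds per test file
cold = wall_time(tests, workers=3, startup_seconds=120)  # full setup per worker
warm = wall_time(tests, workers=3, startup_seconds=2)    # restore from snapshot
print(cold, warm)
```

In this model, adding workers shortens the test portion but leaves the per-worker startup cost untouched, so restoring from a prepared snapshot is what makes wide parallelization pay off.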
Additionally, he contemplates modernizing the bundler to utilize native ES modules and service workers, aligning with the evolving landscape of browser technologies.
Ives van Hoorne (00:53:50):
"If I were to rebuild CodeSandbox today, I would leverage ES module capabilities and service workers to handle bundling directly in the browser."
Outside of work, Ives balances intense technical work with volleyball. He describes the sport as both a physical workout and a mental reset, and stresses the importance of staying healthy alongside a demanding career.
Ives van Hoorne (00:59:54):
"Playing volleyball helps me forget everything else and focus entirely on the game. It's a great way to stay fit and clear my mind."
His dedication to the sport underscores the value of integrating physical activities to sustain long-term productivity and well-being.
For more information about CodeSandbox and to explore its features, visit codesandbox.io or follow Ives van Hoorne on Twitter.