
Kevin Ball
Grand Theft Auto III is a 2001 open-world action-adventure game developed by Rockstar Games, and it had a profound impact on both gaming and popular culture. Its success cemented video games as a dominant form of entertainment and storytelling and paved the way for future blockbuster franchises. The game was also a technological milestone that redefined what was possible in open-world game design. It was one of the first fully 3D open-world games to offer seamless exploration, blending mission-based gameplay with a living, breathing city. The game was originally released on PlayStation 2 and PC, but never had an official Sega Dreamcast version. However, the homebrew community set out to port the game to the Dreamcast and recently released the port to much acclaim. Falco Girgis and Steph Cornelio Smitsas Poitidis are developers on the GTA 3 Dreamcast port. They join the podcast to talk about the Dreamcast hardware and the heroic task of porting GTA 3 to the console. Kevin Ball, or K. Ball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow K. Ball on Twitter or LinkedIn, or visit his website, Kball LLC.
Host
Hey guys, welcome to the show.
Falco Girgis
Hey, thanks for having us.
Steph Cornelio Smitsas Poitidis
Thanks for having us.
Host
Yeah, super excited to dig into this. Let's maybe start with each of you telling us a little bit about your backgrounds and how you got into this GTA 3 and Dreamcast world.
Steph Cornelio Smitsas Poitidis
So I'm traditionally a Dreamcast emulator developer. That means I have implemented the Dreamcast in software. That started as a childhood project. Actually, this involvement has been ongoing for 20 years, with on and off periods. After I kind of withdrew from that scene, because I decided that other people are now at the forefront and I don't want to do the emulation side so much, I was like, okay, what can I do to make something run on the Dreamcast this time? You know, play the other role. And that's how this happened for me.
Host
Awesome. What about you Falco?
Falco Girgis
All right, so I'll start off by saying I actually taught myself C and C++ at age 14 because I specifically wanted to make Dreamcast games when I was a kid. This was before there was an iPhone market. This was before Xbox Live Arcade. There was no way for an indie developer to target a console, ever, except for this Dreamcast thing, which had just recently been discontinued. And I found this scene of developers like Steph and so many others who were doing really cool stuff in this Dreamcast community. I was 14 and I made a forum post: hey man, how can I make Dreamcast games? You know, do I use Perl or whatever? And they were like, God, no, use C and C++. So I used to walk to the library to get books on it. Fast forward way too many years, and now I help maintain the KallistiOS SDK, the same SDK I used as a kid to do Dreamcast development. And as for Steph, I knew about him and what he did and his emulator work, and every time he was in Discord I knew, wow, this guy is really, really impressive and knows pretty much everything ever about the Dreamcast. And then he started working on a GTA 3 port. I'm just watching from the background and everyone's doubting this guy. And I knew that if anyone could ever pull this off, it would be him. And then I saw some other developers, who were just, man, A-tier rock star Dreamcast people, coming in, and I just knew, oh my God, I have to be a part of this. Even if it fails, what a privilege to be a part of something like that.
Host
Nice. So let's maybe talk a little bit about GTA 3 and what makes it such an undertaking to port it to Dreamcast.
Steph Cornelio Smitsas Poitidis
So yeah, there are various aspects. First of all, we are starting from the PC version, because that's the reverse-engineered one. So the requirements are substantially higher, especially on memory. The models are also a little bit more complex, but it's mostly memory. Audio memory: the sound effects don't fit. Texture memory: the textures don't fit. System memory: the game loads too many models into it.
Falco Girgis
You want to tell them exactly how many megabytes we're working with?
Steph Cornelio Smitsas Poitidis
We have 16 megabytes of main memory for everything, and then 8 megabytes for video. But that's also used for the transformed models. So at the end of it we have like two, two and a half megabytes free for textures and a little bit less than 2 megabytes for audio. Those are very small specs, even compared to PC standards of the time.
Host
Yeah, I think for those of us who are in the modern world now, that's mind-bogglingly small. But I'm kind of curious: if you look at what this originally shipped on, how much of a downsize are we looking at here?
Steph Cornelio Smitsas Poitidis
The PS2 version was the most compact version. It shipped for 32 megabytes of RAM and similar audio and video RAM sizes. The Dreamcast and the PS2 are not directly comparable, but they are similar, except that the PS2 has twice the RAM that we have. So we have to compress things further.
Falco Girgis
Yeah, PS2 had 32 megs of system RAM and 4 megabytes of video memory. But that gets complicated because on the Dreamcast you have to store other things in video memory. I'm sure that will be something Steph talks about, where it's not just, "Oh, you have eight megs of video memory, PS2 had four, so why is that a problem?" But yeah, I'm sure we'll get into that.
Host
Okay, so memory is a big domain. What other aspects of Dreamcast were different enough that it involved a substantial porting effort?
Falco Girgis
I'll say one thing that I am working on. Oh man, I'm still working on it, because I didn't want to screw up the alpha release by pushing it without enough testing. If you mess this up, it's really bad for the stability of the game. It's physics engine stuff. We can at least see in our code base where they were using a vector unit on the PlayStation 2, a coprocessor that's a little bit like a DSP in how you'd program for it. And they were accelerating a lot of the collision checks, stuff like that. We don't have that on the Dreamcast, no vector coprocessor. But we do have some pretty cool instructions, like a very fast sine/cosine approximation, an inner product of two 4D vectors in one instruction, things like that, with which you can start accelerating some of that math. The only problem is, like I was saying, those are approximations. And when the approximations are too approximate in something like a physics engine, things go very poorly. But luckily this actually works. It seems to be a very stable physics engine, unless I do something just really stupid, like forget to copy a sign because of some inverse square root. There's a really funny approximation trick. A floating point division on the Dreamcast, Steph, how many cycles?
Steph Cornelio Smitsas Poitidis
I think it's 20 something.
Falco Girgis
I have to ask him because I don't know how many. It's quicker to use... are you familiar with the Quake 3 inverse square root approximation?
Host
Why don't we spell it out a little bit? Because not everybody will be.
Falco Girgis
Yeah, great. And Steph, correct me if I'm wrong, I'm not exactly 100% sure, but the folklore is that there's this really crazy integer bit-shifting approximation for an inverse square root. I don't even understand how someone could figure this out. It's very fast, whereas taking a real inverse square root, or even a square root, is extremely slow. So they figured that out. And that's very fundamental for doing things like normalizing a vector, which you can do by multiplying by the inverse square root. And the cool thing is, on our FPU on the Dreamcast, that's one instruction: FSRRA. What does it stand for, Steph?
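For readers who have not seen the trick Falco refers to, here is a minimal C++ sketch of the widely circulated Quake III-style approximation. It is not code from the port; the names `fast_rsqrt` and `normalize_approx` are made up for illustration, and on the Dreamcast the hardware instruction would take the place of the bit manipulation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bit-level approximation of 1/sqrt(x), in the style popularized by Quake III.
// Only meaningful for positive, finite inputs.
inline float fast_rsqrt(float x) {
    assert(x > 0.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // view the float's bit pattern as an integer
    bits = 0x5f3759dfu - (bits >> 1);      // the well-known "magic constant" initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);      // back to a float
    return y * (1.5f - 0.5f * x * y * y);  // one Newton-Raphson step to refine the guess
}

struct Vec3 { float x, y, z; };

// Normalizing a vector: multiply each component by 1/length instead of dividing.
inline Vec3 normalize_approx(Vec3 v) {
    const float inv_len = fast_rsqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x * inv_len, v.y * inv_len, v.z * inv_len };
}
```

The result is only approximate (within a fraction of a percent after the single refinement step), which is the kind of error Falco describes having to watch for in physics code.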
Steph Cornelio Smitsas Poitidis
I actually have no idea.
Falco Girgis
Something reciprocal square root. Anyway, a division is painfully slow for us. So if the sign does not matter, it's faster to multiply a number by one over the square root of the divisor times itself, if that makes any sense.
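The arithmetic behind that remark is that 1/sqrt(b*b) equals 1/|b|, so a division can be replaced by a multiply plus a reciprocal square root, at the cost of losing the sign of b. A rough sketch, assuming a helper `rsqrt_standin` that stands in for the hardware instruction (here it is exact, not approximate, and all names are hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Stand-in for a fast hardware reciprocal square root (assumption: exact here,
// whereas the real instruction would return an approximation).
inline float rsqrt_standin(float x) {
    assert(x > 0.0f);
    return 1.0f / std::sqrt(x);
}

// a / |b| without a divide in the hot path: squaring b discards its sign.
inline float div_by_magnitude(float a, float b) {
    return a * rsqrt_standin(b * b);
}

// When the sign does matter, it has to be restored by hand.
inline float div_with_sign(float a, float b) {
    const float q = div_by_magnitude(a, b);
    return (b < 0.0f) ? -q : q;
}
```

Forgetting the sign restoration in `div_with_sign` is the sort of mistake mentioned earlier: the magnitude is right, but the direction is silently wrong.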
Host
That's kind of hilarious, right?
Falco Girgis
So I'm sitting there in the physics like, hey, maybe I can, you know, get rid of some of these divisions. It needs to be soaked a lot more, but so far it looks like, as long as you don't actually need the sign, because you're losing it with the squaring and the square root, it's pretty good.
Host
So I think the level of optimization that we're talking about here is almost, it's kind of hilarious. But it's endemic to building for old hardware, right? You're taking something that could not do this and squeezing every extra erg out of it.
Falco Girgis
Yes. Just wait till Steph talks about what he did for the T&L, the transform and lighting. I think it's one of the most glorious things I have ever seen. If ever I wanted to see code that was like, yeah, this is why I'm on this team, it's what he did there. It was just really cool stuff.
Host
Well, so let's maybe start at a bit of a high level. When you're doing these types of transformations, like where are you starting? Do you have original source code or do you have just a binary and you're intercepting there? Like, what does this process look like to get to? Oh, I have a bit of a game to run, but it's stuttering here, so I need to intervene. Like, what is the flow of development?
Steph Cornelio Smitsas Poitidis
We had the full reverse-engineered source code for the game from the re3 project. And we also had the fully reverse-engineered code of the game engine that they used, from the librw project. Without these two projects, this would have been a totally different scope. The Dreamcast port is a small addition to the previous work. After finding those projects and getting everything to compile together, the rest of it was pretty straightforward to get to the menus. And then we needed a new renderer for the game engine. That's one of the other major chunks that we had to write, because the game engine, of course, only supports OpenGL and Direct3D, and it has some bits for Xbox and PS2, but only some bits. We had to bring in support for all of the custom texture formats that the Dreamcast graphics card supports, implement the pipeline tools to convert and repack everything into the Dreamcast's optimal formats, then actually go ahead and write the transform and lighting pipelines and submit the vertexes to the graphics card. That was the major part of the work.
Falco Girgis
You know what's really cool is that this driver-level graphics work he's talking about is actually separate from the GTA 3 code base, or the reverse-engineered re3 GTA 3 code base. It's actually part of something called RenderWare, which was a very famous middleware that was used in that era. So we actually made our own RenderWare backend for the Dreamcast. That's kind of cool.
Host
Yeah, actually, let's talk about that. So you're building these pieces that are specifically to adapt it to the Dreamcast. Are those reusable for other games?
Steph Cornelio Smitsas Poitidis
In theory, yes. We don't have any other projects. I mean, there is one other project using the same engine, and it's the reverse engineering of, is it Vice City? What's the name? Yeah, the Miami-themed one, Vice City. So that's another project that uses the same rendering library with some changes, but similar enough. And there are a ton of other games that use this library. Especially back in the Dreamcast age, it was a very dominant library for PS2, Xbox and GameCube. So there is hope that more games will be reverse engineered and then use it. And ideally we would also contribute back upstream, so independent developers could use librw to develop their own applications, their own games.
Falco Girgis
That was one of my motivations for wanting to be part of this team too is like, okay, we have this team of alpha Dreamcast developers doing this like crazy lighting and we're pushing a lot of polygons and we've implemented this middleware layer and you know, can we help democratize like the upper echelons of the Dreamcast hardware for the rest of the community with some of this work that we've done for GTA 3? And I think the answer is yes. You know, I think this could benefit the Dreamcast indie scene, like, as a whole.
Host
So let's maybe dive in a little bit to these different pieces and tell me if I'm wrong here, but I think what I'm hearing is one, there's like big chunks that were missing for Dreamcast that you're able to sort of implement as a package, and those you can plug in, you're architecting in a particular way. And then there's other places where you're like, okay, this is maybe part of the game code or doing something, but it's too slow or it's taking too much memory.
Falco Girgis
We are doing both: implementing it cleanly on a driver boundary, and then, on the other hand, going in with the sledgehammer on the game code too. Yes, yes, yes.
Host
So let's maybe talk about, for those driver pieces, what does the architecture look like? What were some of the interesting technical challenges to get this to work? And maybe for someone like you, Steph, who's deeply familiar with what the Dreamcast is, it's just obvious. But for those of us who aren't, what does that whole piece look like? And then we can come back and look at those places where you're taking the sledgehammer or the scalpel to tweak things.
Steph Cornelio Smitsas Poitidis
Well, for the Dreamcast, for librw, it actually follows a typical pattern, similar to what, let's say, a modern game engine like Unity would have. It has a scene graph of transformations that have child objects. I mean, they call them atomics, they don't call them objects. At the lower level, you have to provide basic initialization functions, like querying the driver for the screen size and things like that. It also has three rendering paths: one for immediate mode 2D graphics, one for immediate mode 3D graphics, and one for the atomic rendering. And you slowly have to go through and implement those parts. To get into slightly more detail, an atomic is made up of a mesh. A mesh has an index buffer, which is the indexes of the geometry, and a vertex buffer. So you look up the index, then you fetch the vertex, then you transform the vertex and you send it to the graphics card. That's the basic flow. And you do this for every mesh, for every atomic that the game engine asks you to. The game engine itself is pretty straightforward, to be honest. You register the renderer as a plugin to the game engine and it will call your callbacks. So as long as you register all of the correct plugins, you will get all of the callbacks, and if you implement them correctly, you will get your scene rendered correctly. For the Dreamcast, there are some major problems. The Dreamcast doesn't render all of the types of geometry in the order you submit them. You have to sort the geometry into opaque, transparent and alpha tested. And this creates some problems with the ordering of geometry, because the game expects things to be rendered in the order it sends them. And then we have to split them into lists.
Falco Girgis
And that's universal for any port to the Dreamcast. That's fundamental to our architecture. We sort our geometry into lists that are opaque, translucent, and there's a third list, punch-through.
Host
Yeah, got it. Okay, so where does that. You said most of the game engines aren't going to expect that. That's a Dreamcast limitation. So at what layer does that transformation happen? And is it transparent to the game engine or do you have to go in and then start doing those interjections later?
Falco Girgis
Well, this is where Steph at first was doing something cool, where we were deferring everything. Right? If the Dreamcast needs it in a certain order, then you have to capture the pipeline state of what you're trying to draw, and the vertices, so that you can defer it until it needs to be drawn. And in our first pass, Steph had some stateful C++11 lambdas that were just capturing the whole pipeline state. It wasn't really what you would want for high performance code, but it was like, man, that is a really useful use of a stateful lambda right there. And then he placed them into standard vectors so he could iterate over them later when he needed to.
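A minimal sketch of that first-pass idea follows: draw calls are recorded as capturing lambdas, bucketed by list, and replayed later. This is an illustration under assumptions, not the port's code; `DeferredRenderer`, `ListType`, and the replay order shown are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// The three list kinds mentioned in the conversation; the replay order here is illustrative.
enum class ListType : std::size_t { Opaque = 0, Translucent = 1, PunchThrough = 2, Count = 3 };

class DeferredRenderer {
public:
    // Record a draw; the callable captures whatever pipeline state it needs by value.
    void defer(ListType list, std::function<void()> draw) {
        assert(list != ListType::Count);
        lists_[static_cast<std::size_t>(list)].push_back(std::move(draw));
    }

    // Replay every recorded draw, one list at a time, then reset for the next frame.
    void flush() {
        for (auto& list : lists_) {
            for (auto& draw : list) draw();
            list.clear();
        }
    }

private:
    std::array<std::vector<std::function<void()>>,
               static_cast<std::size_t>(ListType::Count)> lists_;
};
```

The game can keep issuing draws in its own order; the renderer decides at flush time which list goes to the hardware first. The cost is that every recorded callable is copied or moved into the vector, which can add up on a small CPU.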
Steph Cornelio Smitsas Poitidis
It's also mostly transparent to the game engine. We did have to modify the game a little bit, like in the places where it expects certain depth ordering. Like for example, the fog, it rendered it at a predefined depth and it expected this to be the minimum depth, but did this in the wrong order of things. So we had to help it there. But for the most part it works out of the box. The reordering.
Host
Okay, cool. And so to make sure I understand it, you implement kind of a middleware that's capturing these things as they're coming in. Then you sort them in the way that the Dreamcast expects and send them on their way. Do you have a fixed size of buffer, or is it per frame, or how does that work?
Steph Cornelio Smitsas Poitidis
It's just a vector right now.
Falco Girgis
It's dynamic at first. He is doing a clear, but he's never doing a shrink_to_fit on the vector.
Host
Okay.
Falco Girgis
So once it has kind of stabilized, it's basically like a static buffer.
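The behavior being described is standard `std::vector` usage: `clear()` resets the size but keeps the allocated storage, so after a few frames the buffer stops reallocating. A small sketch, with `FrameBuffer` as a made-up name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-frame scratch buffer that grows to its high-water mark and then stays there.
struct FrameBuffer {
    std::vector<int> items;

    void begin_frame() {
        const std::size_t before = items.capacity();
        items.clear();                       // size becomes 0, storage is retained
        assert(items.capacity() == before);  // nothing is released without shrink_to_fit()
    }
};
```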
Host
Nice. Okay, cool. Any other big pieces at the driver layer?
Steph Cornelio Smitsas Poitidis
Well, on the driver layer, like Falco mentioned, we had this capturing lambda stuff and it was wonderful from a language perspective.
Falco Girgis
Yeah, right. It was the ultimate modern C++. We got a lot of "man, why do you need that?" When I joined the team, I was the guy who went into the makefile and was like, whoops, C++23. And it caused so many problems because our host compilers couldn't support that for the tool side, you know. So I had to back it down to C++20, like, okay. And people asked, what good is that going to do you for this project? And that was it, right there. You know, it was pretty nice.
Steph Cornelio Smitsas Poitidis
The thing is that I unfortunately had to move away from that to avoid copies, because we spent a lot of time copying the lambdas around, and due to several intricacies of lambdas, you can't have zero copies when you construct them into a vector. So now we use a concept of contexts. For every atomic, we create an atomic context and we attach some other contexts, like a material effects context if it has reflections, things like that. And then we capture only the context number, which is enough for us to recover the information. From a functionality perspective, that's how the driver works. The other part is how we actually submit the geometry to the graphics card. That is interesting because the normal way is that you de-index the buffer: you fetch the vertex, then you transform the vertex, and then you send it to the graphics card. Then there is a smarter way. Usually you have more indexes than vertexes; that's why you have indexed buffers. So the smarter way is to transform all of your vertexes first and then de-index them, and then you don't have to transform them twice if they appear more than once. And then we took this approach where we sliced our buffers into chunks of 128 vertexes, which was tricky because we had to regenerate the topology of the mesh a little bit, to cut it into chunks in a way that doesn't generate too many more vertexes, too many duplicates. It's a trade-off everywhere. And then we handle this at 128 vertexes, which actually can fit in the cache. So we freeze half of the cache with some tricks. Then we transform into the cache, and then, instead of doing a memory copy for the index, we directly send the cache to the graphics card. Like, send the cache line.
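The "transform first, then de-index" step can be sketched in plain C++. This shows only the ordering idea, not the cache-resident 128-vertex chunking or the hardware submission; `Vec3`, `Mat3x4`, and `transform_then_deindex` are illustrative names rather than librw types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct Mat3x4 { float m[3][4]; };  // rotation/scale in columns 0..2, translation in column 3

inline Vec3 transform(const Mat3x4& M, const Vec3& v) {
    return {
        M.m[0][0] * v.x + M.m[0][1] * v.y + M.m[0][2] * v.z + M.m[0][3],
        M.m[1][0] * v.x + M.m[1][1] * v.y + M.m[1][2] * v.z + M.m[1][3],
        M.m[2][0] * v.x + M.m[2][1] * v.y + M.m[2][2] * v.z + M.m[2][3],
    };
}

// Transform each unique vertex exactly once, then expand through the index buffer.
// A vertex referenced by several indexes is copied, never re-transformed.
inline std::vector<Vec3> transform_then_deindex(const Mat3x4& M,
                                                const std::vector<Vec3>& verts,
                                                const std::vector<std::uint16_t>& indices) {
    std::vector<Vec3> transformed;
    transformed.reserve(verts.size());
    for (const Vec3& v : verts) transformed.push_back(transform(M, v));

    std::vector<Vec3> out;
    out.reserve(indices.size());
    for (std::uint16_t i : indices) {
        assert(i < transformed.size());
        out.push_back(transformed[i]);
    }
    return out;
}
```

With chunks small enough to stay in cache, the intermediate `transformed` array is what the port keeps cache-resident, according to the description above.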
Falco Girgis
Yeah. I think that was the most mind-blowing thing. We have a mode where he split the cache in half, right? And then we can manually control what goes into the second half of the cache. So he pre-allocates this over where the GPU expects the vertices to go, he does all the T&L there, and then, to send it to the GPU: cache flush. It blew my mind.
Host
Yeah, you just blew my mind. So if I'm understanding you're essentially splitting the cache into two logical sections and using one of those to essentially prepare all of your data which you can then just flush straight to the gpu.
Steph Cornelio Smitsas Poitidis
Yeah, yeah. In the order that you want.
Falco Girgis
Yes. Isn't that crazy? I've been doing Dreamcast stuff for a long time and that was pretty crazy to me.
Host
Yeah. This is getting down to a level most of us never touch in terms of control over your hardware.
Falco Girgis
Right.
Host
So one of the things you said a little bit ago, Falco, made me want to kind of dig in. What does the kind of build chain tooling environment look like? Building for this?
Falco Girgis
Okay. That's where I come from: I work on what's called KallistiOS. The Dreamcast community is kind of unique in that no one uses the commercial stuff in our community because, not to brag, but our stuff is pretty good. Our open source stuff has been around since 2001. It was around back when Sega was still doing Dreamcast stuff. So it's had a lot of time to mature. There's a lot of stuff in there, like supporting IPv6 and supporting different hardware mods for the Dreamcast. Like, my Dreamcast actually has 32 megs of RAM, so I don't know about this GTA 3 out-of-memory stuff. I'm just kidding. But yeah, we support a lot of hardware modifications that have come around over all these years, and one thing that we're pretty passionate about is keeping our toolchain up to date. The SuperH SH-4 is our architecture, and it's still maintained in GCC, believe it or not, because it's used in Casio calculators and it used to be used in set-top boxes and routers. So it never really went away from the GCC toolchain. So, like, GCC 14.2.0 came out the other day. I think it's 2.1 now. We had a midnight launch, like, oh hey, you can already use some preview C++26 features on the Dreamcast. You want to do that? So yeah, we nerd out pretty hard about our toolchain stuff and our toolchain support. And the crazy thing is we're not the only ones either. The Nintendo 64 community is pretty much on the tip, and so is the PlayStation Portable community, and some guys I know on the Saturn scene. I don't know what it is. When we're working with retro hardware, we like to not be retro with the software.
Host
That's awesome. Well, and I think Steph, you mentioned you started out kind of on the emulation side. So are you able to run all of your test suite, whatever you're doing this on an emulated piece of hardware or how is this.
Falco Girgis
Tell them about all of it.
Steph Cornelio Smitsas Poitidis
There are several layers of inception in this. But yes, I maintain a fork of my old emulator. It's no longer the best emulator for Dreamcast, but it's the one whose code base I'm very confident in. So I maintain that just for this project. Every feature we use, I added to the emulator. Like these cache tricks: I added them to the emulator. And we have also added a lot of validation. Every time we have found a mistake in how we handle the hardware, I try to add a check to the emulator so that the next time we make the same mistake, it will warn us about it.
Falco Girgis
And it was extremely useful. There were times we were stuck, and it's not so easy debugging on the Dreamcast. We have a lot of people who are like, oh, I printf-debug exclusively. We have a GDB stub, but it's not the best. It's lacking a lot of features. So Steph's inception of his virtual Dreamcast really saved our butts many times.
Steph Cornelio Smitsas Poitidis
It helps. And we can also cross-compile for PC. I mean, the code base originally worked on PC, so we spliced things together: take some parts of the emulator and splice them with the rest of GTA. So we can run in this hybrid mode where the CPU code is actually running native, but the graphics are running through the emulation. And this means we can use tools like Valgrind or AddressSanitizer. A couple of bugs have been found this way. For example, we had a memory leak of a matrix. There's no way we would have found where it came from without AddressSanitizer. So there's this tooling that has been developed for the project around the Dreamcast emulation that I'm familiar with. And this has really helped us.
Falco Girgis
Yes, I think his emulation background has helped in every way. You know, I started from the indie side; I haven't seen this big picture. He has had to emulate all of the AAA titles, so he knows all their tricks. It's been fantastic just learning from him from that perspective.
Host
Well, and one of the things that stands out to me here is like hardware quirks, right? Things where it doesn't actually work as advertised or like I don't think I would have ever thought about going into the abstraction layer of the cache and like splitting that out and taking separate control of different pieces. Right. So I guess that would be an interesting thread to go down is like, what unexpected hardware quirks are there within the Dreamcast and how do you end up having to work around those? How does debugging those work in terms of the emulator, things like that?
Steph Cornelio Smitsas Poitidis
Well, hardware quirks, to be honest, most of them are handled at the KallistiOS level. There are some known hardware bugs, and they are handled in the initialization layer, so everything kind of works out of the box. The one we ran into is this: apart from the cache, the Dreamcast has these two buffers called store queues, where you can collect some data and then write it all together as 32 bytes into memory, the graphics card, wherever you want to write 32 bytes.
Falco Girgis
It's like an ultra fast 32-byte memcpy or memset, if you use it like that. They alternate: you write into one, and while that one's flushing, you write into the other. It's a really cool thing.
Steph Cornelio Smitsas Poitidis
But you cannot read from it.
Falco Girgis
So I learned.
Host
It sounds like something you learned through pain perhaps.
Steph Cornelio Smitsas Poitidis
And you can also only write 32 or 64 bits to them, so 4 or 8 bytes. You cannot write 2 bytes or 1 byte, for example. And these are quirks we check for in the emulator. We have asserts for them: if you do this, it says, hey, you're doing something that you shouldn't be doing. Another example is null pointers. A null pointer is usually a zero pointer, and on the Dreamcast that is a pointer to the BIOS. So when you have a null pointer read, you're reading some data from the BIOS, and nobody minds you doing that. So we also have warnings for both reads and writes of that kind in the emulator, just to catch this edge case. We could actually do this in KOS, we could install a null pointer page, but we don't.
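The kind of emulator-side validation described here can be sketched as a small access checker. This is a hypothetical illustration, not the emulator's code: the store-queue address window follows the SH-4 memory map as commonly documented, and the size of the guarded "null page" is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Access { Read, Write };
enum class Verdict { Ok, ReadFromStoreQueue, BadStoreQueueWriteSize, NullPageAccess };

// Assumed address ranges, for illustration only.
constexpr std::uint32_t kStoreQueueBase = 0xE0000000u;
constexpr std::uint32_t kStoreQueueEnd  = 0xE4000000u;  // exclusive
constexpr std::uint32_t kNullGuardBytes = 0x1000u;      // hypothetical guard window at address 0

// Classify a guest memory access the way a validating emulator might.
inline Verdict check_access(std::uint32_t addr, std::size_t size, Access kind) {
    assert(size > 0);
    if (addr < kNullGuardBytes) return Verdict::NullPageAccess;  // likely a null pointer bug
    if (addr >= kStoreQueueBase && addr < kStoreQueueEnd) {
        if (kind == Access::Read) return Verdict::ReadFromStoreQueue;
        if (size != 4 && size != 8) return Verdict::BadStoreQueueWriteSize;
    }
    return Verdict::Ok;
}
```

An emulator would call something like `check_access` on each guest load and store and log or assert on anything other than `Verdict::Ok`, which turns silent misuse on real hardware into an immediate diagnostic.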
Host
All right, so I think we've talked a lot about the systems where you had to do wholesale replacement. Let's maybe now get back to where you're having to go in and bring out the sledgehammer or the scalpel and make optimizations, either in original game code, or, as we heard a lot about, transforming the formats of your textures and other things. So what are the things you're doing now that are no longer cleanly divided into, okay, here's the subsystem, but where you're still having to dive in to get this thing to actually run on the Dreamcast?
Steph Cornelio Smitsas Poitidis
Well, for the texture conversion tools and the repack process, I can give some detail. For most of the other optimization, Falco has worked more on the actual game code than me, to be honest. But for the repack process, we copy-pasted the game loaders and made a tool that loads the textures, and then we're using some community tools. I don't remember who wrote pvrtex, or what's...
Falco Girgis
I think that's TapamN, but I'm not positive.
Steph Cornelio Smitsas Poitidis
Yes, TapamN is an awesome developer for the Dreamcast. They also tried this cache submission trick that we're using, years before we did. I saw that in some forum post, but ours is the first use in a real application. The same thing goes for the audio files: we process them into a custom format. Well, for the audio files and the image formats, I actually used ChatGPT to write the unpacking tools and the repacking tools. And what do you know, it worked on the first try. But for the audio conversion itself, like compressing the audio, we use some tools that come with KallistiOS. We just modify them to handle the peculiarities of our format and to downsample the audio if we have to. Things like that. At that level it was mostly plumbing work to get different tools to work together, and then a makefile where you can parallelize that work and make it a nice experience for the developers.
Host
Got it. And is that then essentially build once ahead of time, it's packaged, you ship it and it's done. Or are there any dynamic aspects to it that need to be handled?
Steph Cornelio Smitsas Poitidis
In the makefile you can build the CD image directly from it. So if you ask it to build the CD image, it will pick the repacking tools, it will build the game itself, it will build everything it needs, it will link it all together. Then it will run the repack tools, then it will take the output and it will make a CD image for you.
Host
Nice.
Steph Cornelio Smitsas Poitidis
Falco can talk more about the game code itself, I think.
Falco Girgis
Okay, well, the first thing I did before I even jumped into that stuff was. I don't know if you know or if people know the Dreamcast memory card. The visual memory unit.
Host
No.
Falco Girgis
So in the controller you see how there's a screen.
Host
I can see it, but our listeners will not be able to see it. So let's describe it.
Falco Girgis
So anyway, we have a very interesting memory card that looks like a Game Boy and it even has a screen on it. So when you put this thing in the controller, a game can actually drive it as an external screen. And born partially of laziness and like I don't want to mess up the UI and then partially out of. I think this would be cool. The first thing I did is I started displaying debug information like performance information on this little visual memory card. So I didn't have to pollute the UI or have to figure out how it worked or anything like that. So that's one of the things that's even ongoing for me is I have a background thread that I spawn. It's like a C standard thread that wakes up every, I believe it's 200 milliseconds, checks in on the system. How much memory do we have left? Video memory and sound memory displays. It updates. That little screen goes back to sleep. I've been doing that for a bunch of other stats so that going forward we can we know where to focus. Micro optimizing things. I would say in terms of. One of the most interesting things that I had to work on was for this physics stuff, the collision, the math for that was implemented. One of the most important things you can do is you transform a vector by a matrix and that can be for the vertices, for the collision meshes, for the renderable meshes, for anything like that. That's one of the most fundamental operations in the engine that you can do. So this was implemented through a C overloaded operator that took a matrix that was on a matrix and it took a vector. And I will say this was not inline. One of the problems we had was the way the matrices worked on the Dreamcast was you have a background bank that you load and then you can swap your active bank register so that it's undisturbed. So if this overloaded operator was having to reload the matrix to multiply it by one vertex and then it gets Called again. It reloads the matrix again, multiplies it by one vertex. 
Usually you have a loop of, it could be like 50 verts being multiplied by one matrix. So I had to basically break this out, or C-ify this C++ pattern. It wasn't to my liking, I want to note that; I like all the nice C++ stuff. But I had to make it basically take an array of vertices, an input, an output, and the count, so that I could load the matrix one time and then swap banks, go through and multiply every vertex, and then swap back. And yeah, that was pretty wasteful before it was optimized more for the Dreamcast. But that was more of a precision thing, I would say, than a sledgehammer thing. Because, yeah, touching that code is pretty sensitive.
Host
And so to make sure that I'm understanding, essentially there's like conceptually a cache. You called it a register, but like a matrix register. Yep. Kind of a little cache layer that loads the matrix. And previously what would happen is every time you would go into this code, it would spill your cache and you'd have to reload it. And so what you said is, hey, let's pull together all the operations we're going to do on this matrix so we can load the cache once, run through them. Not have to keep spilling memory in and out of there.
Falco Girgis
Yes, yes. Not have to keep loading and unloading that matrix into that cache area that you're talking about.
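The load-once pattern discussed above can be sketched roughly as follows. This is a minimal illustration, not the port's code: `MatrixBank`, `transformOne`, and `transformBatch` are hypothetical names, and the "bank" is an ordinary struct that only counts reloads, standing in for the hardware matrix registers.

```cpp
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };

// Row-major 3x4 affine transform: rotation/scale in columns 0-2, translation in column 3.
struct Mat3x4 { float m[3][4]; };

// Hypothetical stand-in for the hardware matrix bank; `loads` counts reloads.
struct MatrixBank {
    Mat3x4 active{};
    int loads = 0;

    void load(const Mat3x4& mat) { active = mat; ++loads; }

    Vec3 transform(const Vec3& v) const {
        const float (&m)[3][4] = active.m;
        return Vec3{
            m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3],
            m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3],
            m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3]};
    }
};

// Operator-style call: the matrix is reloaded for every single vertex.
Vec3 transformOne(MatrixBank& bank, const Mat3x4& mat, const Vec3& v) {
    bank.load(mat);
    return bank.transform(v);
}

// Batched call: load the matrix once, then run through the whole array.
void transformBatch(MatrixBank& bank, const Mat3x4& mat,
                    const Vec3* in, Vec3* out, std::size_t count) {
    assert(count == 0 || (in != nullptr && out != nullptr));
    bank.load(mat);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = bank.transform(in[i]);
    }
}
```

Transforming 50 vertices through `transformOne` costs 50 matrix loads in this model, while `transformBatch` costs one; the arithmetic per vertex is identical.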
Host
Yeah, that makes a ton of sense. How did you find that?
Falco Girgis
Well, to be honest with you, that's just something I've. I've worked on this math abstraction before for our OpenGL driver, and I'm kind of used to like the load once, multiply while you're in it, and then unload or load the next one pattern. And when I saw this like pretty little loop of invoking an overloaded operator, I was like, I don't. Because that's what I wish I could do. Right. I'm like, man, if that. If they're doing this efficiently, I need to know what the heck is going on here, because I can't. So, yeah, that was kind of a red flag.
Host
You're like, man, I want to be able to do this. How are they doing it to make it work? Oh, it's not working.
Falco Girgis
Yeah, exactly, exactly.
Host
Got it. Okay, that's cool. That's an interesting example. And so in that case, I guess you're starting with source code, so you just go and rework the source code. So you're saying instead of this, we're going to Just inline this into this function. Go.
Falco Girgis
Yes, exactly.
Host
Got it.
Falco Girgis
They wouldn't want this upstream if this were still a maintained thing. We would have to make some prettier abstractions or something. But you know, we did what we had to do in that code.
Host
Well, and that's actually one of the places I was wondering: to what extent can you abstract these things? Because I did another interview recently with folks who were more in the game emulation space, and they were starting from a binary rather than from source code. So they had to use different tricks, but they did a lot of, okay, let's swap in; we have this function call, we're going to replace this function and go down a different path. When you're working in these, like we talked about a hack around division, right? We're not going to do division. We're going to multiply by the inverse square root times itself, so that we can just do a multiply and an inverse square root, because those are fast. Are you able to do that at an abstraction layer where it just kind of applies across all the physics code, or are you going in and finding all the places where they're doing a divide and having to update those?
Falco Girgis
You want to go first, Steph? What would you say? I was going to say there's trade-offs, and especially so with that divide. Actually, now in the codebase I have a C++ template that takes a non-type template parameter for whether it should do an actual divide or a BS divide. Right. So if I break something, I can, oh God, change the default on that one to true, and I can go down the other path. And that should not be extra overhead, right? Because it should be just a template that's expanding into a couple of instructions. I say should; I have not verified the code down both paths, so maybe there's an extra move or something. I gotta look. But that's an example of when templates can do it. What would you say, Steph? I'm sure there's more runtime examples with stuff we couldn't really get away from, or that we did.
Steph Cornelio Smitsas Poitidis
Well, for me, it's also very manual, this work. Usually I go in and manually change things, put them behind a define or a template, something like that. It really depends on whether you care about keeping the code maintainable or not. We don't really care if the code is a little bit dirtier, since it's not going to be worked on further. So then we can just use the simplest approach that works.
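A compile-time switch of the kind described here might look like the sketch below. It is an assumption-laden illustration rather than the project's code: `divide` and `fastRsqrt` are hypothetical names, and `fastRsqrt` is written with `std::sqrt` as a placeholder for a fast hardware reciprocal square root. The approximate path relies on the identity 1/d = rsqrt(d) * rsqrt(d) for d > 0 and restores the sign for negative denominators.

```cpp
#include <cassert>
#include <cmath>

// Placeholder for a fast reciprocal square root (assumed to be cheap on the target).
inline float fastRsqrt(float x) {
    assert(x >= 0.0f);
    return 1.0f / std::sqrt(x);
}

// ExactDivide is a non-type template parameter, so the choice is made at
// compile time and the unused branch can be dropped by the compiler.
template <bool ExactDivide = false>
inline float divide(float num, float den) {
    if (ExactDivide) {
        return num / den;  // real division
    }
    // 1/|den| computed as rsqrt(|den|) squared; den is assumed to be non-zero.
    const float r = fastRsqrt(std::fabs(den));
    return num * std::copysign(r * r, den);
}
```

Calling `divide<true>(a, b)` at one call site, or flipping the default to `true`, switches back to real division when a suspected precision bug needs to be ruled out.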
Host
And then how are you sort of finding these opportunities? You mentioned a little bit about you've got kind of the DIY profiling tool coming onto the memory card, which I love, and in some cases are just like your pattern matching based on things that you've been doing before. But are you doing systematic profiling? Is it gameplay that's driving? Oh, it's getting sticky here. Like, how are you finding the places that need optimization?
Steph Cornelio Smitsas Poitidis
We have a profiler that was passed down from another developer to me, and I modified it, and now I'm passing it down to the next developer down the line. It looked similar to gprof when I got it; when SWAT gave it to me, it was compatible with gprof, so you could use the standard gprof tools to analyze the traces. After the modifications I did, it's no longer compatible with gprof. But I nicely asked ChatGPT for a tool that can analyze the reports and it kindly made me one, as it does sometimes. So we have that tool, and it's able to take in a full disassembly of your executable, like from objdump, and then annotate the lines where the profiler gets hits. So on hot functions you can even see individual assembly opcodes and how long they take, things like that. That helps when your cost is concentrated in one function. But for the cases, like Falco said, when you have little cuts spread over the code, the profiler doesn't really help. We only sample a thousand times per second; at 60 frames per second, that's, I don't know, about 20 times per frame. That's not really enough context to understand what's happening in a frame, only statistically. So you try to stay still and you hope that the statistical profile will help you.
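The sampling budget mentioned here is easy to work out. The sketch below uses the 1,000 Hz sample rate and 60 frames per second from the conversation; the function names are hypothetical, and the 1% figure in the note is only an example.

```cpp
#include <cassert>

// Average number of profiler samples that land inside a single frame.
inline double samplesPerFrame(double sampleRateHz, double framesPerSecond) {
    assert(framesPerSecond > 0.0);
    return sampleRateHz / framesPerSecond;
}

// Expected samples per frame for code that takes `shareOfFrame` (0..1) of the frame time.
inline double expectedHits(double samplesInFrame, double shareOfFrame) {
    assert(shareOfFrame >= 0.0 && shareOfFrame <= 1.0);
    return samplesInFrame * shareOfFrame;
}
```

`samplesPerFrame(1000.0, 60.0)` comes to roughly 16.7, so a piece of code that takes 1% of a frame is expected to be hit well under once per frame. That is why many small costs spread across the code only show up statistically, over many frames of a steady scene.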
Host
So when you're running the profile, are you running that in your emulated environment or like, how much overhead does it take to run the profile?
Falco Girgis
Oh, I knew that question was coming. Well, for the VMU one, I will say I kept that thing really lean, except I have a change I've been sitting on, because there's a shortcoming in the C11 threading API where you cannot set the stack size. You can't say, I want to make a thread with this large of a stack. So we're blowing a few kilobytes that we should not be blowing every time we make standard threads, and the VMU thing is sitting on a standard thread. Other than that, the thing wakes up so infrequently, it's like 0.02% CPU overhead. I keep an eye on that one because I don't want people to say, oh, your tool's costing me FPS or something. But what Steph was doing, what would you say for yours?
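A low-overhead monitor thread along these lines could be sketched as below. This is not the port's implementation: the conversation describes a C11 standard thread, while this sketch swaps in `std::thread` to stay in C++, and `StatsMonitor` and its `sample` callback are hypothetical names. Taking the quoted numbers at face value, 0.02% of a 200 ms period is about 40 microseconds of work per wake-up.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Wakes up once per period, runs the sampling callback, and goes back to sleep.
class StatsMonitor {
public:
    StatsMonitor(std::chrono::milliseconds period, std::function<void()> sample)
        : period_(period), sample_(std::move(sample)) {
        assert(sample_ != nullptr);
        worker_ = std::thread([this] { run(); });
    }

    ~StatsMonitor() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wake_.notify_one();  // cut the current sleep short
        worker_.join();
    }

    StatsMonitor(const StatsMonitor&) = delete;
    StatsMonitor& operator=(const StatsMonitor&) = delete;

private:
    void run() {
        for (;;) {
            sample_();  // e.g. read free memory and redraw a small status display
            std::unique_lock<std::mutex> lock(mutex_);
            // Sleep for one period, or return early once shutdown is requested.
            if (wake_.wait_for(lock, period_, [this] { return stop_; })) {
                return;
            }
        }
    }

    std::chrono::milliseconds period_;
    std::function<void()> sample_;
    std::mutex mutex_;
    std::condition_variable wake_;
    bool stop_ = false;
    std::thread worker_;
};
```

Sleeping on a condition variable instead of a plain sleep lets the destructor stop the thread immediately rather than waiting out the rest of the period.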
Steph Cornelio Smitsas Poitidis
I think the Profiler cuts around 5% of the performance.
Falco Girgis
Oh, that's nice.
Steph Cornelio Smitsas Poitidis
And it's not terrible considering everything.
Falco Girgis
Considering what it's doing, yeah.
Steph Cornelio Smitsas Poitidis
It can store a trace for like 10 seconds right now in memory before it writes it to a file. And for this performance testing, it only makes sense to do it on the real hardware, because on the emulator the timing is all kinds of skewed, and you don't really have things like cache simulation, so you don't know about cache misses. The Dreamcast has a very weak cache and a very weak memory system, so it's not optimal if you don't model this. I need to mention that there's also the performance counters that are part of KOS, which can give you hints like how many instructions ran and how much time you were waiting for memory. You can use them on any scope or function, and then you can really micro-benchmark functions and optimize them by hand.
Host
Awesome. Well, so let's maybe step back a little bit at this point and give an overview of where things are. So you shipped a first working version? I think I saw somebody running through it. I don't have Dreamcast hardware, though I might have to try your emulator, Steph, and run through it. But what would you say the status of the project is, and what are you guys excited about and working on now?
Steph Cornelio Smitsas Poitidis
Well, the status is that we thought the alpha was fully playable, but there were three bugs in it. It turns out all three of those are fixed, so now the game is, for real, fully playable. There are a lot of minor fixes and minor improvements going in; there's also Falco's physics optimizations, and there's things like anti-aliasing that we might be able to get in, and making sure, for example, that things are correctly fogged. The game was mostly there. The next big thing: now we are running out of memory after some time, and it doesn't seem to be a memory leak, because if you stay in the same region it doesn't happen, even though the game is dynamic. It only happens when you're actively playing for an hour or so. It seems to be memory fragmentation. So there are allocations all over the address space, and then you have a small allocation in the middle of a free region, and you can't make a big allocation because you have this small allocation in the middle. So that's the next big hurdle on my side. That would be the goal that would make this a beta, because it would be fully playable, with the fog fixed, some new features, some better performance, and you can play for like five hours without it crashing. That would be a nice goal.
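The fragmentation problem described here can be shown with a toy model. The sketch below is purely illustrative and assumes nothing about the game's allocator: `ToyHeap` is a hypothetical heap of fixed-size blocks where a single small allocation in the middle of free space blocks a large request, even though the total free memory would be enough.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap made of equal-sized blocks; true means the block is allocated.
struct ToyHeap {
    std::vector<bool> used;

    explicit ToyHeap(std::size_t blocks) : used(blocks, false) {}

    void setUsed(std::size_t index, bool value) {
        assert(index < used.size());
        used[index] = value;
    }

    // Total number of free blocks, wherever they are.
    std::size_t totalFree() const {
        return static_cast<std::size_t>(std::count(used.begin(), used.end(), false));
    }

    // Length of the longest run of adjacent free blocks.
    std::size_t largestFreeRun() const {
        std::size_t best = 0;
        std::size_t run = 0;
        for (bool blockUsed : used) {
            run = blockUsed ? 0 : run + 1;
            best = std::max(best, run);
        }
        return best;
    }

    // A contiguous request only fits if one free run is long enough.
    bool canAllocate(std::size_t blocks) const { return largestFreeRun() >= blocks; }
};
```

With 16 blocks and only block 8 in use, 15 blocks are free but the longest run is 8, so a 12-block request fails until that one small allocation is released or moved.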
Host
This is awesome. So, all right, you've accomplished the. I think some people were saying unaccomplishable port. Where do you want to take this next? Are you thinking, okay, we shipped GTA 3, we're onto a new game. Are you thinking expanding within the ecosystem? You also mentioned taking stuff back to the community. Where are you going post beta?
Falco Girgis
I'm going wherever they're going. I had so much fun and it was such a pleasure to be around. As long as they'll have me around, I'll follow them.
Steph Cornelio Smitsas Poitidis
I guess there's a few different paths we can take, but it really depends on where the community also wants to go. Like Falco says, there are people that are trying to make some mods for the Dreamcast, either to make it more Dreamcast friendly or to bring in.
Falco Girgis
Better models, better textures, to make it more Dreamcast unfriendly.
Steph Cornelio Smitsas Poitidis
Yes.
Falco Girgis
We have people loading, like, Xbox models and doing crazy stuff. It's really cool stuff, though.
Steph Cornelio Smitsas Poitidis
It works. So that's one direction. On this GTA path, I mean, for a final release, this would have to be fully localized. There's some minor things in the menus, some strings are missing, things like that. And of course, it has to be fully playable, fully stable, no visual glitches. I guess then the next challenge would be Vice City.
Falco Girgis
Oh, you said that out loud.
Host
All right, so. And no committing here, but, like, timeline wise, are you thinking, you know, Vice city is a 2026 thing or, like, what does that look like?
Steph Cornelio Smitsas Poitidis
I have no idea.
Falco Girgis
Yeah. To be honest with you, I don't know. I wouldn't know if it's like starting over from another GTA. Surely it's not, you know, but I don't have enough experience doing GTAs to know, with as much of a head start as we have on the engine and stuff, how much further there is to go. If that makes any sense.
Steph Cornelio Smitsas Poitidis
One of the original developers claimed that Vice City is essentially the same game, just kind of a different story. So that is hopeful. And I guess it's one of those projects where, once you are in a couple of weeks, you know how possible it is, because most things should be in place. So if it doesn't have higher memory requirements, for example, which it might, things should be fine.
Falco Girgis
Right?
Host
So it's one of those where, once you get in, you might be shipping within a week or a few weeks, or you might be looking at a year.
Steph Cornelio Smitsas Poitidis
Right, exactly, exactly.
Host
Nice. One other thing that you mentioned that I want to dive in on: you mentioned community. How would people who listen to this and are like, man, that's cool, get involved? I love, Falco, your origin story here, right? You're like, I just want to hack on this cool gaming platform, how do I do that? How would you recommend people start exploring and getting involved in this space today?
Falco Girgis
I would say start off, we have a wiki, the Dreamcast wiki. And this is like the ultimate community-driven mind dump of everything going on. If you Google Dreamcast development, I think it should be number one on Google. It's the ultimate getting started, and this is how everyone starts off. It tells you what you need software-wise and on what platform. Of course we let you do Windows, Mac, Linux; of course we're going to support your platform. But anyway, it tells you how to set up the SDKs, and from there, pretty much inevitably, there's a link to our Discord server and Simulant. We also have one for DCA3, GTA 3, and people wind up in our Discord server, and it's a lot of fun, and that's where all the coding really happens. But we're also on GitHub: KallistiOS. Yeah, you can't miss us if you look for that and Dreamcast on GitHub. Steph, you want to talk more about our site for GTA 3?
Steph Cornelio Smitsas Poitidis
Yeah, for GTA 3 specifically we have some basic instructions, but you have to do the repack yourself. So you need to buy the game, either have a copy of it or buy a new copy. We do support the version of the game that Rockstar is selling online, so it's kind of easy. You can download that and either install it on your Windows PC or install it with Wine. I have tried both. It works and from there on you have to run some commands, download the Dreamcast SDK, run some commands and it will bake everything for you.
Falco Girgis
And that is on dca3.net by the way.
Steph Cornelio Smitsas Poitidis
Yes, dca3.net and the site is also browsable on the Dreamcast.
Falco Girgis
Yeah, that's the best part.
Steph Cornelio Smitsas Poitidis
We made sure of that.
Falco Girgis
The guys who made that site. Yeah. Made sure that it can be viewed with the Dreamcast web browser. So it's pretty cool. Yeah.
Steph Cornelio Smitsas Poitidis
And one more thing to note. For Windows, there is this DreamSDK installer that installs everything for you, and it gives you a ready working environment that you can tinker in. On Linux and macOS, people assume that, you know, you can follow a tutorial.
Host
Well. Awesome. This has been super fun, guys. We're getting close to the end of the time here. Is there anything that we haven't talked about that you would like to leave folks with?
Falco Girgis
Yeah. Steph, I want to put you on the spot here and ask you something I've never asked you. When this started out, this was running on an emulator only. This emulator had double the RAM of a regular Dreamcast. When you started out, did you actually think that you would be able to make it to run on a stock Dreamcast or were you like, you know, let's see where it takes us. And maybe. And you know what I mean, how did that happen?
Steph Cornelio Smitsas Poitidis
I was, let's see where it takes us, because I had no idea how the code worked and if we would be able to thin it down. But to be honest, it was so effortless to get it initially to render something. It only took a couple of days. So I was very hopeful this would be too.
Falco Girgis
Right.
Steph Cornelio Smitsas Poitidis
Because if it takes two months to show something, then you're like, maybe this is not so easy.
Falco Girgis
Exactly. Yeah. Yeah. Interesting. Yeah. I always wondered. I always wondered if you knew starting out that you would make it onto a stock Dreamcast or not.
Steph Cornelio Smitsas Poitidis
I mean, it would have been fun, even if it was only for a modded Dreamcast.
Falco Girgis
Right. Right. Couldn't quite make. Would still be great, but. Yeah, not quite the same.
Host
Well, thank you, gentlemen. Super fun. And we will catch you another time. I will definitely be checking out, getting this installed on my Mac, at least.
Falco Girgis
Thanks.
Steph Cornelio Smitsas Poitidis
Thank you.
Software Engineering Daily: Grand Theft Auto III on the Dreamcast with Falco Girgis and Stef Kornilios Mitsis Poiitidis
Release Date: May 8, 2025
In this episode of Software Engineering Daily, the host delves into the ambitious project of porting the iconic Grand Theft Auto III (GTA III) to the Sega Dreamcast console. Joining the discussion are Falco Girgis and Stef Kornilios Mitsis Poiitidis, two of the developers behind the port. The conversation covers the technical challenges, the solutions the team found, and the passion driving this unique porting project.
Stef Kornilios Mitsis Poiitidis introduced his journey into Dreamcast development:
"[...] I have implemented the Dreamcast in software that started as a childhood project. Actually, it's been ongoing for 20 years [...]" (01:53)
Transitioning from emulator development to porting, Stef emphasizes his shift in focus to contribute directly to making GTA III run on the Dreamcast hardware.
Falco Girgis shared his early fascination with Dreamcast game development:
"I actually taught myself C and C++ at age 14 because I specifically wanted to make Dreamcast games when I was a kid." (02:28)
Falco's long-standing commitment is evident as he currently maintains the KallistiOS SDK, a cornerstone for Dreamcast development.
Kevin Ball sets the stage by highlighting GTA III's monumental impact on gaming and open-world design:
"Grand Theft Auto III is a 2001 open world action-adventure game developed by Rockstar Games, and it had a profound impact on both gaming and popular culture." (00:00)
The game's success not only solidified video games as a dominant entertainment medium but also pushed technological boundaries, making the porting project both a tribute and a technical challenge.
Porting GTA III to the Dreamcast was no small feat, primarily due to stark hardware limitations:
Memory Constraints: The Dreamcast operates with 16 MB of main memory and 8 MB of video memory, significantly less than the original PC and PlayStation 2 versions. Stef elaborates:
"We have 16 megabytes of main memory for everything and then 8 megabytes for video. [...] That's like very small specs compared to even then PC standards." (04:36)
Rendering Engine Adaptation: The original game engine supported platforms like OpenGL and Direct3D, necessitating a new renderer tailored for the Dreamcast's hardware specifications.
Physics Engine Optimization: Falco discusses the complexities of adapting the physics engine to a console that has no vector co-processor:
"We don't have that on the Dreamcast, no vector co-processor. But we do have some pretty cool instructions that like a very fast sine cosine approximation..." (07:15)
The team employed several ingenious techniques to overcome hardware limitations:
Custom Renderer Development: Stef and Falco built a new renderer compatible with Dreamcast's unique graphics pipeline, handling opaque, translucent, and alpha-tested geometry separately to fit the console's requirements.
Cache Management Enhancements: By splitting the Dreamcast's cache and manually controlling data flow, they optimized vertex processing, a breakthrough that Falco finds particularly impressive:
"We have a mode where he split the cache in half. [...] It was pretty mind-blowing." (19:38)
Physics Engine Adjustments: To maintain a stable physics system, Falco utilized the Dreamcast's fast sine and cosine approximation instructions, ensuring efficient vector calculations without compromising gameplay.
Toolchain and Emulation Support: Stef maintained a customized Dreamcast emulator, enabling efficient debugging and integration with modern development tools like Valgrind and Address Sanitizer:
"We can use tools like Valgrind or Address Sanitizer. [...] This has really helped us." (23:01)
As of the episode's release, the GTA III Dreamcast port has transitioned from alpha to a fully playable state, with only minor bugs remaining. The team's immediate focus includes addressing memory fragmentation issues to ensure extended playability without crashes.
Looking ahead, the developers contemplate porting Vice City, another popular title in the GTA series, leveraging the groundwork laid by the GTA III project.
"It's very straightforward to get to the menus. And then it was just we needed a new renderer for the game engine because it doesn't support." (10:41)
Both Falco and Stef emphasize the importance of community collaboration. They invite enthusiasts to contribute via their Dreamcast Wiki, Discord server, and GitHub repositories.
Falco encourages newcomers:
"Start off, we have a wiki, Dreamcast wiki. [...] And it's the ultimate getting started." (43:58)
Stef provides resources for those interested in replicating the port:
"For GTA 3 specifically we have some basic instructions, but you have to do the repack yourself. [...] And from there, you have to run some commands, download the Dreamcast SDK, run some commands and it will bake everything for you." (45:28)
Stef on Memory Constraints:
"We have 16 megabytes of main memory for everything and then 8 megabytes for video. [...] That's like very small specs compared to even then PC standards." (04:36)
Falco on Physics Engine Optimization:
"We don't have that on the Dreamcast, no vector co-processor. But we do have some pretty cool instructions..." (07:15)
Stef on Renderer Development:
"The game engine, of course, it only supports OpenGL Direct 3D and it has some bits for Xbox and PS2, but only some bits. And we had to bring in support for all of the custom texture formats that the graphics card support for the Dreamcast..." (10:41)
Falco on Cache Management:
"You have to capture the pipeline state... and then he placed them into standard vectors and then he could iterate over them later when he needed to." (16:17)
Stef on Profiler Tools:
"We have a profiler that is passed down from some other developer to me and I modified it..." (35:20)
The GTA III Dreamcast port project stands as a testament to the ingenuity and dedication of its developers. By working within tight hardware limits and building on community-driven tools, Falco Girgis, Stef Kornilios Mitsis Poiitidis, and their collaborators have brought a classic game to beloved retro hardware. Their work not only preserves gaming history but also inspires future endeavors in niche game development communities.
For enthusiasts eager to join or support the project, resources and collaboration opportunities are readily available through their official platforms.
Interested in diving deeper or contributing to the project? Visit dca3.net, join the Discord community, or explore the repositories on GitHub under the KallistiOS organization.