Dave Buettner
You're listening to the CyberWire network, powered by N2K. Hello everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems, and protecting ourselves in our rapidly evolving cyberspace. Thanks for joining us.
Carlos Zanki
So in this case the detection was triggered by a machine learning model, and we have a review procedure for those detections to see which are true positives and to determine what type of malware we had in this case.
Dave Buettner
That's Carlos Zanki, reverse engineer at ReversingLabs. The research we're discussing today is titled "Malicious PyPI Crypto Pay Package Implants Infostealer Code."
Unknown
So you get that indication from the automation and what motivated your team to dig deeper into the package?
Carlos Zanki
Well, in this case we had a package that we had previously seen to have clean versions. Usually, when we encounter malware in public repositories, it is in new packages: somebody publishes a new package that is malicious immediately, in the first or second version. In this case we had a package that was, let's say, maintained for some time, for several months, since I believe the beginning of September or something like that. And usually when you see a package that has several months of maintenance, the detections are not always true positives. More often they are false positives: packages that use malicious-looking behaviors in a legitimate way. But in this case we took a detailed look and spotted a well-known obfuscation pattern of Base64 encoding and zlib compression. When we encounter something like that, even though you sometimes find it used for legitimate purposes, most often it is malware.
Unknown
Well, what was the process for deobfuscating the code, and what sort of things did you uncover once that obfuscation was stripped away?
Carlos Zanki
Well, it's Python, it's a scripting language, so it's fast to write your own code to decode something like this. It's basically several rounds of Base64 decoding, reversing the string, and doing zlib decompression. So you do what the attacker did, just in reverse order. It's not hard to perform the deobfuscation if you have a little coding experience. And what we observed is that we had malicious code aiming to steal sensitive information related to crypto trading. So the goal was financial gain. Most often, lately, we see threat actors trying to steal cryptocurrencies and secrets related to cryptocurrency trading to quickly get to financial gain.
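[Editor's note] The layered decoding described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from the research: the order of steps inside each layer (compress, Base64-encode, reverse) and the function names are assumptions made for the example. The decoded payload is only returned as bytes and is never executed.

```python
import base64
import zlib


def obfuscate_once(data: bytes) -> bytes:
    """One hypothetical layer: compress, Base64-encode, then reverse the bytes."""
    return base64.b64encode(zlib.compress(data))[::-1]


def deobfuscate_once(blob: bytes) -> bytes:
    """Undo one layer by applying the inverse steps in the opposite order."""
    return zlib.decompress(base64.b64decode(blob[::-1]))


def deobfuscate(blob: bytes, max_rounds: int = 100) -> tuple[bytes, int]:
    """Peel layers until one fails to decode; return the payload and layer count."""
    rounds = 0
    while rounds < max_rounds:
        try:
            blob = deobfuscate_once(blob)
        except (ValueError, zlib.error):
            # binascii.Error is a ValueError subclass: the data is no longer
            # a valid layer, so treat what we have as the inner payload.
            break
        rounds += 1
    return blob, rounds
```

Stopping at the first decoding failure is a convenience for the sketch. An analyst working on a real sample would inspect each intermediate result rather than trust an automatic stop condition.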
Unknown
I mean, looking at some of the threat actor's tactics, how was the aiocpa campaign different from some of the more common supply chain attacks, things like typosquatting or impersonation? Were there some differentiators here?
Carlos Zanki
Yeah, well, as you said, typosquatting is the most common way. You take a popular package and typosquat it, or try to impersonate it by adding a suffix to the name, let's say "legacy" or, I don't know, something related to Bitcoin, Ethereum or anything like that, to make it look like a fork of a legitimate project, and hope that somebody will use it. Those are the two most common ways of infecting targets. But in this case we had a developer creating his own crypto trading tool, likely forked from some other legitimate, previously deployed tool, and waiting for some time, several months in this case, to build up a user base and then publish a malicious version to them. What was also interesting is that a day or two after the initial version of this package was published to PyPI, the developer behind it tried to take over another, already existing PyPI package named pay. So they were likely trying to get a more sound name, because it's probably more likely that someone will use a package named pay than aiocpa, or however you spell it.
Unknown
Well, how was the malicious code injected into the PyPI package without being added to the GitHub repository?
Carlos Zanki
Well, that's not really hard to do. In most cases developers don't set up automated publishing from GitHub, and if you control your publishing process, you can separate it however you wish. You have PyPI access tokens, so you can publish source code to GitHub, then take that source code, add some malicious code to it, package it in the PyPI package format and publish that version to PyPI. It's not something we have never seen before. Often there were crypto miners in npm packages that didn't have the crypto mining functionality in the GitHub repository source code, because the attacker expects that you will look at the source code there and won't find anything. And threat actors know that it's harder to analyze compiled packages, in this case PyPI packages, which are in a binary format, than to look at source code on GitHub, which is designed to make code review as comfortable as possible.
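[Editor's note] One way to check for the gap described above is to compare what was actually published with what sits in the repository. The sketch below assumes both have already been unpacked into local directories (for example, an extracted package archive and a repository checkout at the matching tag). The function names and the choice to look only at `.py` files are assumptions for illustration, not a description of any particular tool.

```python
import hashlib
from pathlib import Path


def hash_python_files(root: Path) -> dict[str, str]:
    """Map each .py file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.py"))
    }


def compare_trees(published: Path, repository: Path) -> dict[str, list[str]]:
    """Report .py files that differ between, or exist on only one side of, two trees."""
    pub = hash_python_files(published)
    repo = hash_python_files(repository)
    return {
        "changed": sorted(f for f in pub.keys() & repo.keys() if pub[f] != repo[f]),
        "only_in_published": sorted(pub.keys() - repo.keys()),
        "only_in_repository": sorted(repo.keys() - pub.keys()),
    }
```

A non-empty "changed" or "only_in_published" list is not proof of malice, since build steps can legitimately generate or rewrite files, but it points a reviewer at the files worth reading first.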
Unknown
Can you help us understand how your use of differential analysis between those package versions helped reveal the malicious behavior?
Carlos Zanki
Yeah, well, behavior indicators are our proprietary technology, and they are based on observing a set of low-level behaviors and creating threat hunting rules based on them. Basically, you run a PyPI package through a static analysis engine and it extracts behavior indicators, that's what we call them. They're basically textual descriptions of what the code does. So if some part of the code uses one of the HTTP libraries, it'll say, okay, this code is capable of performing, I don't know, HTTP requests. In this case we had behavior indicators for decoding data with the Base64 algorithm and for importing the zlib module, which is used for zlib compression and decompression. And we also had unusually long strings present in the source code. Those three indicators on their own aren't necessarily suspicious. But when you put them together, and on top of that you have expression execution in the code, that's something you want to look at. And here differential behavior analysis is the feature of our tool that lets you give two package versions to the engine, and it performs the comparison at the file level. It takes equally named files at equal paths from both packages. So like in GitHub, where you have a version diff on the source code, in this case you have a version diff on the extracted behaviors. Basically, it compares the same files from two different versions of the package and tells you what behaviors have changed in those files. You don't have to understand source code to get a sense of what's happening, of what has been introduced in the new version. Somebody perhaps wouldn't know what the requests library does. This way you have an easy textual explanation that tells you, okay, this is used to perform HTTP requests. So basically you compare two versions of a package, and it gives you a way to see what behaviors have been introduced in the new version.
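[Editor's note] The idea of diffing behaviors instead of source lines can be illustrated with a toy version. The engine discussed here is proprietary, so the following is only a rough sketch under stated assumptions: the indicator descriptions, the 500-character threshold, and the function names are all invented for the example, and the analysis is limited to what Python's standard `ast` module can see. The source is parsed, never run.

```python
import ast

# Hypothetical indicator descriptions; a real engine would have far more rules.
IMPORT_INDICATORS = {
    "base64": "encodes or decodes data with Base64",
    "zlib": "imports the zlib compression module",
    "requests": "can perform HTTP requests",
}
EXEC_INDICATOR = "executes code built at runtime (exec/eval)"
LONG_STRING_INDICATOR = "contains an unusually long string"
LONG_STRING_THRESHOLD = 500  # arbitrary cut-off chosen for this sketch


def extract_indicators(source: str) -> set[str]:
    """Statically derive coarse behavior descriptions from Python source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            description = IMPORT_INDICATORS.get(module.split(".")[0])
            if description:
                found.add(description)
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in {"exec", "eval"}):
            found.add(EXEC_INDICATOR)
        if (isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes))
                and len(node.value) > LONG_STRING_THRESHOLD):
            found.add(LONG_STRING_INDICATOR)
    return found


def diff_behaviors(old: dict[str, str], new: dict[str, str]) -> dict[str, set[str]]:
    """For each file path in the new version, list indicators absent from the old one."""
    introduced = {}
    for path, source in new.items():
        before = extract_indicators(old[path]) if path in old else set()
        added = extract_indicators(source) - before
        if added:
            introduced[path] = added
    return introduced
```

Here `old` and `new` map file paths to source text for two versions of a package. The result lists, per file, only the behaviors that newly appeared, which mirrors the point made above that no single indicator is alarming but a cluster arriving in one release deserves a look.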
Unknown
Well, looking at the potential impact here, what was the harm that this package could have caused had you all not discovered it?
Carlos Zanki
Well basically it could result in financial loss for the users who installed this package and used it in their projects. So basically it's stealing of cryptocurrencies.
Unknown
And what sort of insights does this research give everyone on some of the risks of relying on open source repositories in the software supply chain?
Carlos Zanki
Well, the risks are growing year over year. It's not just the amount of malware present there, but also the level of sophistication that we see each new year. A year ago the main attack vector was typosquatting and, let's say, the malicious actors didn't put a lot of effort into developing sophisticated malicious packages. They just hoped that somebody would hop onto their package and get it installed. As time passes, the attackers get more sophisticated. We have seen two really sophisticated attacks in the last two weeks targeting cryptocurrency and artificial intelligence/ML tools: the Solana web3.js and Ultralytics packages, on npm and PyPI respectively. In those cases the build environments were compromised, and those types of attacks are even harder to detect. And since they have a big user base, like, I don't know, the Ultralytics package has, let's say, 60 million downloads, I believe, that's a big user base to quickly deploy your malware to, and that's a pretty high risk. And in this case, even smaller projects that quickly grow a user community are also a good vector to infect a large number of users with low effort and almost no cost. Basically you don't even need to host your own infrastructure. You can use an open source package repository for distribution, and you can use GitHub for data exfiltration, or Dropbox, or any tool that can be used to upload or download files.
Unknown
Well, what was the process of reporting this package to PyPI, and how did they respond?
Carlos Zanki
Well, the PyPI community and the people behind it invest a lot of time in improving the security of the entire ecosystem. And I believe somewhere in January, February or March of this year, they introduced a reporting feature in their package repository. Basically you have a button where you can report malware and point to the line of code where you found it. And they put that package into quarantine until they determine whether the package is truly malicious or it was a false positive report. And their response was quick. I believe within just a few hours the package was quarantined, and a few days later it was removed. And we had really good communication with the security team behind that product.
Unknown
That's good to hear.
Carlos Zanki
Yes.
Unknown
What are your recommendations then? I mean, based on the research here, what should people do to better protect themselves?
Carlos Zanki
Well, as I said, there are various package repositories, and not all of them put the same effort into improving the security of the environment they operate in. So on most repositories it's up to you to make sure that something you're installing is not malicious. You should double-check everything you have in your code base. Basically, security vet everything you plan to use from open source package repositories, because even trustworthy packages with millions of downloads can quickly become malicious if somebody compromises either the deployment account or the deployment environment. So you can't base trust on good reputation. It's not enough anymore. Too many compromises have already happened. You need to perform security vetting on your side. If you don't have enough resources or security knowledge, you should use dedicated tools to perform that vetting. ReversingLabs has Spectra Assure Community, which is a free repository where you can check whether there is anything known to be malicious about a package. If you use hundreds or thousands of projects, you likely want to use a commercial tool, or the commercial version of a tool, to double-check that your dependencies are clear of malicious content and also that you are not introducing some type of vulnerability into your software project. Because let's get real, projects are rarely built from one or two libraries. Usually there are dozens or hundreds of open source libraries in the dependency tree, and it's hard for a development organization to security vet everything there is in your open source luggage.
Dave Buettner
Our thanks to Carlo Zanki from ReversingLabs for joining us. The research is titled "Malicious PyPI Crypto Pay Package Implants Infostealer Code." We'll have a link in the show notes. We'd love to know what you think of this podcast. Your feedback ensures we deliver the insights that keep you a step ahead in the rapidly changing world of cybersecurity. If you like our show, please share a rating and review in your favorite podcast app. Please also fill out the survey in the show notes or send an email to cyberwire@n2k.com. We're privileged that N2K CyberWire is part of the daily routine of the most influential leaders and operators in the public and private sector, from the Fortune 500 to many of the world's preeminent intelligence and law enforcement agencies. N2K makes it easy for companies to optimize your biggest investment, your people. We make you smarter about your teams while making your teams smarter. Learn how at n2k.com. This episode was produced by Liz Stokes. We're mixed by Elliott Peltzman and Tré Hester. Our executive producer is Jennifer Eiben. Our executive editor is Brandon Karpf. Simone Petrella is our president, Peter Kilpe is our publisher, and I'm Dave Bittner. Thanks for listening. We'll see you back here next time.
Release Date: January 4, 2025
Host/Author: N2K Networks
Guest: Carlos Zanki, Reverse Engineer at ReversingLabs
Research Topic: Malicious PyPI Crypto Pay Package Implants Infostealer Code
In the January 4, 2025 episode of CyberWire Daily, hosted by Dave Bittner, listeners are introduced to a discussion of a recent cybersecurity threat involving a malicious package in the Python Package Index (PyPI). The episode, titled "Crypto client or cyber trap? [Research Saturday]," features an in-depth conversation with Carlos Zanki from ReversingLabs, who explains how such threats are detected and mitigated within open-source software repositories.
Carlos Zanki presents his research, "Malicious PyPI Crypto Pay Package Implants Infostealer Code," highlighting a sophisticated attack vector targeting cryptocurrency trading applications. The discussion begins with the detection process:
[01:16] Carlos Zanki: "So in this case the detection was triggered by machine learning model and we have a review procedure of those detections to see which are true positives and determine what type of malware in this case we had."
Zanki explains that their machine learning models flagged certain PyPI packages, prompting a detailed review to ascertain the legitimacy of these detections.
Zanki outlines the methodology used to uncover the malicious code embedded within seemingly legitimate packages. He emphasizes the role of machine learning in initial detection followed by rigorous manual review:
[02:02] Carlos Zanki: "We took a detailed look and spotted well-observed obfuscation pattern of base64 encoding and zlib compression. So when we encounter something like that, even though sometimes you find something using it in legitimate purposes, most often it is malware."
The team identified obfuscation techniques such as base64 encoding and zlib compression, common methods used by attackers to conceal malicious payloads within code.
The conversation progresses to the deobfuscation process, where Zanki describes the steps taken to reveal the hidden malware:
[03:31] Carlos Zanki: "It's several rounds of base64 decoding and reversing string and doing zlib decompression. So you do what the attacker did just in reverse order."
Upon deobfuscation, the malicious code was found to be designed to steal sensitive information related to cryptocurrency trading, indicating a clear intent for financial gain:
[04:06] Carlos Zanki: "The goal was financial gain. Most often we see in latest time threat actors try to steal cryptocurrencies and secrets related to cryptocurrency trading to quickly get to financial gain."
Zanki differentiates this attack from more common supply chain attacks such as typosquatting or impersonation. Instead of mimicking popular packages, the attacker created a legitimate-looking crypto trading tool and gradually built a user base before introducing malicious versions:
[04:44] Carlos Zanki: "In this case we had developer creating his own crypto trading tool, likely forking from some other legitimate previously deployed tool and waiting for some time, several months in this case to build up a user base and then publish malicious version to them."
This strategy underscores a shift towards more sophisticated and stealthy approaches in targeting software supply chains.
A critical aspect of the attack was the injection of malicious code directly into the PyPI package without altering the corresponding GitHub repository. Zanki explains how attackers exploit the separation between source code repositories and package distribution platforms:
[06:23] Carlos Zanki: "If you control your publishing process, you can separate it however you wish. You have PyPI access tokens, so you can publish source code to GitHub, then take that source code, add some malicious code to it, package it in the PyPI package format and publish that version to PyPI."
This method allows attackers to maintain a clean appearance on public repositories while distributing malicious versions through official channels.
To detect such sophisticated attacks, Zanki discusses the use of differential behavior analysis between different versions of a package. This technique involves comparing the behaviors extracted from each version to identify suspicious changes:
[07:50] Carlos Zanki: "You compare two versions of a package, and it gives you a way to see what behaviors have been introduced in the new version."
This approach enables the identification of anomalies without requiring in-depth code analysis, making it an efficient tool for threat detection.
Zanki warns about the severe consequences if such malicious packages go undetected:
[13:39] Carlos Zanki: "It could result in financial loss for the users who installed this package and used it in their projects. So basically it's stealing of cryptocurrencies."
The stealthy nature of these implants means that significant financial losses and data breaches could occur before detection.
The discussion broadens to the inherent risks of using open-source repositories in the software supply chain. Zanki notes the evolving sophistication of malware:
[14:06] Carlos Zanki: "The risks are growing year over year. It's not just the amount of malware present there, but also the level of sophistication that we see each new year."
He emphasizes that even reputable packages with millions of downloads can become vectors for malicious activities if compromised.
Zanki shares insights into the responsible disclosure process they followed by reporting the malicious package to PyPI. He praises the swift and effective response from the PyPI team:
[16:29] Carlos Zanki: "They put that package into quarantine until they determine whether the package is truly malicious or it was a false positive report. And their response was quick. I believe within just a few hours the package was quarantined, and a few days later it was removed."
This collaboration between researchers and repository maintainers is crucial in mitigating threats swiftly.
Concluding the discussion, Zanki offers several recommendations for developers and organizations to safeguard against such threats:
[17:39] Carlos Zanki: "You should double-check everything you have in your code base. Basically, security vet everything you plan to use from open source package repositories."
He advocates for comprehensive security vetting of all dependencies, utilizing dedicated tools, and adopting a proactive approach to monitor and verify the integrity of open-source packages.
Dave Bittner wraps up the episode by thanking Zanki for joining the show and pointing listeners to the link to the research in the show notes.
Detection Trigger:
Carlos Zanki [01:16]: "The detection was triggered by machine learning model and we have a review procedure of those detections to see which are true positives and determine what type of malware in this case we had."
Obfuscation Patterns:
Carlos Zanki [02:02]: "We spotted well observed obfuscation pattern of base64 encoding and zlib compression."
Deobfuscation Process:
Carlos Zanki [03:31]: "You do what the attacker did just in reverse order."
Attack Intent:
Carlos Zanki [04:06]: "The goal was financial gain."
Sophisticated Attack Vector:
Carlos Zanki [04:44]: "Creating his own crypto trading tool... build up a user base and then publish malicious version to them."
Reporting to PyPI:
Carlos Zanki [16:29]: "Their response was quick. I believe within just a few hours the package was quarantined, and a few days later it was removed."
Security Recommendations:
Carlos Zanki [17:39]: "You should double check everything you have in your code base."
This episode of CyberWire Daily provides a comprehensive examination of the vulnerabilities within open-source package repositories and the sophisticated methods attackers employ to infiltrate and exploit them. Through Carlos Zanki's expert analysis, listeners gain valuable insights into the detection, analysis, and prevention of such threats, underscoring the importance of vigilant security practices in the software development lifecycle.