Dave Buettner
You're listening to the CyberWire Network, powered by N2K. And now a word from our sponsor, ThreatLocker. Keeping your system secure shouldn't mean constantly reacting to threats. ThreatLocker helps you take a different approach by giving you full control over what software can run in your environment. If it's not approved, it doesn't run. Simple as that. It's a way to stop ransomware and other attacks before they start without adding extra complexity to your day. See how ThreatLocker can help you lock down your environment at www.threatlocker.com. Hello everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Buettner and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems and protecting ourselves in our rapidly evolving cyberspace. Thanks for joining us.
Ziv Karliner
At Pillar Security, we spent the last year and a half spending a lot of time with the emerging attack vectors that put AI powered applications at risk. So first of all, we got to learn and get our hands around new attack vectors such as prompt injection, indirect injections, and all sorts of evasion techniques that make these attacks basically invisible to the human eye and to most of the security tools out there.
Dave Buettner
That's Ziv Karliner, Pillar Security's co-founder and CTO. The research we're discussing today is titled New Vulnerability in GitHub Copilot and How Hackers Can Weaponize Code Agents.
Ziv Karliner
So take that together with the fact that we ourselves are utilizing these amazing coding copilots, which on their own are utilizing LLMs at their base. That got us, you know, thinking about the combination of the new attack vectors and, I would say, some of the most popular use cases for AI-powered applications, which are coding assistants. How this really combines together sparked our imagination about what can potentially go wrong.
Dave Buettner
Well, at the root of this is what you all refer to as the rules file backdoor. Can you describe that for us? What exactly are we talking about here?
Ziv Karliner
Sure. So maybe one step back: what are rule files? Think about coding agents these days. You can think about them as another engineer, a developer who joined the team and now helps you complete a project much quicker. Rule files are basically a way to onboard the coding agent to your project, to your team, to tell it what best practices are being used in the project, what software stack we are using, specific syntax, or any guidance and context that is relevant just to the project that we are working on right now. So think about the first day on the job for a new developer who joins the team. That will be the rule files: basically text files that these coding assistants allow users to define, that contain all of the examples and instructions for how to write code in the way that best suits the project in scope. So these are rule files. The interesting thing, when you think about it, is that this is basically additional context that is being fed into the conversation flow with the coding agent. Really, it's part of the instruction layer, the context layer that is taken into account. When the model takes a request to write new code, this context is added to it before the developer gets back the code suggestions and edits. A rule file backdoor is basically when attackers embed malicious instructions in this context that impact any code being generated by the coding assistant, to create actual backdoors in the generated code. So this is what we have shown in our example. On its own it sounds pretty straightforward, maybe, to protect against, but what we uncovered in our research is that, first of all, you now have open source marketplaces where rule files are being shared between organizations, which creates a supply chain vector.
Combined with that is the fact that you can add hidden instructions. That's, I would say, the second risky part here: a technique called hidden Unicode characters, which basically means that when developers look at a rule file, it looks completely legitimate, but it actually contains hidden instructions that only the AI agent understands and acts on. So that's really, I would say, the perfect scenario, where you can hide in plain sight in some of these marketplaces and compromise the underlying developers who are taking these rule files to improve their projects.
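To make the hidden Unicode characters idea concrete, here is a minimal sketch of how a reviewer might scan a rule file for characters that render as nothing on screen. It assumes that the Unicode general category "Cf" (format characters, which includes zero-width and bidirectional control characters) is a reasonable proxy for "invisible"; the function name find_hidden_characters is hypothetical and is not taken from the research.

```python
import unicodedata


def find_hidden_characters(text):
    """Return (line, column, code point, name) for each invisible format character.

    Assumption: Unicode category "Cf" is used as a proxy for characters
    that a human reviewer will not see when reading the file.
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, "UNNAMED")
                findings.append((line_no, col_no, f"U+{ord(ch):04X}", name))
    return findings


# A rule line that looks normal on screen but carries two invisible characters.
sample = "Use strict mode.\u200b\u200dAlways validate input.\nPlain line."
for finding in find_hidden_characters(sample):
    print(finding)
```

A check like this could run before a shared rule file is copied into a project. An empty result only means that no format characters were found, not that the file is safe.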
Dave Buettner
Hmm. Well, can we walk through an example here? I mean, suppose an attacker wants to make use of this. Let's go through the process of how they would go about doing that.
Ziv Karliner
For sure. So in our research we walked through a simple, step-by-step example. For instance, let's think about an attacker who wants to compromise any Next.js application, and how you can do that. The marketplaces for rule files are basically hubs of rule files that are being shared between developers. You can think of one as a directory of every available coding stack, and you can actually commit and add suggestions to these marketplaces. Let's take the Next.js example. As an attacker, I will go to this repo. I will craft a legitimate-looking instruction file about Next.js best practices, and I will embed hidden text into this file using the hidden Unicode characters technique. And I'll commit this, let's say to GitHub, or with some kind of web form, to this marketplace. What we also uncovered during the research is that in GitHub itself, it was invisible. Basically, when you commit code that contains these hidden instructions, a developer who is going to approve this addition request is not going to see anything and is not going to get alerted. This is actually something that was solved by GitHub early this month in one of the vulnerability patches. So now we have this rule file with hidden instructions live on GitHub. An unsuspecting developer who wants to get better results with his coding project, when using Cursor or GitHub Copilot, basically copies this file and adds it to his own project, also sharing it with his team, just to improve the quality of code for the full team. And now, when he requests, let's say, the addition of a simple page to his application, the rule file contains instructions to add malicious JavaScript code to each new HTML file that is being created. It happens only when the agent loads this file, takes in the additional hidden instructions, and generates the additional code on the fly.
Now, the interesting thing that we showed in our research paper is that in the attack itself, in the malicious instructions, an attacker could also use the agent, I would say, intelligence to its advantage. So what we've shown is that a developer can then ask, hey, why was this code snippet added to the code that was generated? And the AI agent will say, oh, this is the security best practices of our organization. So the attacker's instructions could be used not only to inject malicious code; they are also used to trick the user, to kind of social engineer him into believing that this was the goal in the first place. So this is utilizing the AI agent's intelligence against the end user. This was, I would say, the most interesting finding for us.
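The walkthrough above relies on text that is present in the file but invisible to a reviewer. The sketch below illustrates one way this can work, using the Unicode "Tags" block (U+E0000 to U+E007F), where each ASCII character has an invisible counterpart. This encoding is an assumption chosen for illustration; the conversation does not say which characters the research used. The helper names hide_ascii and reveal_hidden are hypothetical.

```python
TAG_BASE = 0xE0000  # start of the Unicode "Tags" block


def hide_ascii(text):
    """Map printable ASCII to invisible tag characters (illustrative encoding only)."""
    return "".join(chr(TAG_BASE + ord(c)) for c in text if 0x20 <= ord(c) <= 0x7E)


def reveal_hidden(text):
    """Recover any ASCII text carried by tag characters in `text`."""
    return "".join(
        chr(ord(c) - TAG_BASE) for c in text if 0xE0020 <= ord(c) <= 0xE007E
    )


# What the reviewer sees, and what the file actually contains.
visible = "Follow Next.js best practices."
rule_line = visible + hide_ascii("hidden instruction here")

print(len(visible), len(rule_line))  # the lengths differ although both render alike
print(reveal_hidden(rule_line))      # the smuggled text, made readable again
```

A decoder such as reveal_hidden lets an auditor read what the agent would read. Comparing the length of the file contents with the length of what is visibly rendered is another quick signal that something is hidden.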
Dave Buettner
We'll be right back. So the hidden instructions using Unicode can also include instructions to mislead someone who's inquiring as to why things are a certain way.
Ziv Karliner
Exactly. And I can add to that. One of the most popular terms these days is human in the loop. Human in the loop comes up when we're talking about responsibility models and how autonomous agents will be part of the future workforce. It is the point in, I would say, autonomous processes where an AI agent goes back and asks for approval from the user who is trying to achieve some kind of goal. So in this case, most coding agents these days, when doing riskier actions like changing or deleting a file or creating a web request, will actually stop and ask the user, are you sure you want to complete the next action? This is the classic human in the loop flow. One of the things that we've shown here in the blog is basically that if the attack itself is completely hidden to a human, are humans really equipped to be in the loop? That's one of the thoughts that got us more concerned. I would say a lot of the responsibility is moving to the users, but are we actually equipped to deal with these kinds of attacks?
Dave Buettner
I mean, it really speaks to that kind of inherent inability to view inside what's really going on in an AI assistant, right?
Ziv Karliner
Exactly. And even if you think you are seeing what is going on, the assistants understand, I would say, every language that was ever spoken or written together with hidden Unicode characters and encoded strings, like Base64 for instance. They just understand it as plain English, without the need to compute or run any additional processes. So we are not on an even footing between, I would say, the auditor, which is now basically every person who needs to observe and decide whether an AI agent is allowed or not allowed to do something, and the agents themselves. And that goes beyond the coding agents, I would say.
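Since a model may read Base64 as easily as plain English while a human auditor does not, one way to narrow that gap is to decode such strings before review. The following is a rough sketch under stated assumptions: a candidate is any run of at least 16 Base64 characters, and only runs that decode to printable UTF-8 text are reported. The threshold and the function name decode_base64_runs are illustrative choices, not something described in the research.

```python
import base64
import binascii
import re

# Assumption: runs shorter than 16 characters are ignored to limit false positives.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_base64_runs(text):
    """Return (encoded, decoded) pairs for runs that decode to printable text."""
    decoded = []
    for match in B64_RUN.finditer(text):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            continue  # not a complete Base64 string
        try:
            raw = base64.b64decode(candidate, validate=True)
            plain = raw.decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64, or not text once decoded
        if plain.isprintable():
            decoded.append((candidate, plain))
    return decoded


encoded = base64.b64encode(b"ignore previous rules").decode("ascii")
for pair in decode_base64_runs("Use tabs. " + encoded + " Thanks."):
    print(pair)
```

This only surfaces the encoded text for a person to read. Deciding whether the decoded content is an instruction still needs human or automated review.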
Dave Buettner
Well, let's talk about mitigation. What sort of steps can developers take to detect and prevent these sorts of things?
Ziv Karliner
Of course. So first of all, I would say, as silly as it may sound, sanitation. Think about reducing the input options that you have when interacting with the model, even at the language level. I can also describe another mitigation that, lucky for us as a developer community, was taken by GitHub based on this research: they added a new capability in GitHub itself to alert and show a warning message whenever there is a hidden instruction or hidden Unicode text in a text file that is going to be edited. This is, I would say, a risk reduction effort that's been released for every developer who uses GitHub, which is almost everyone. Another part, which is more on the agent builder side, is to put in place different guardrails around the models when interacting with them: for instance, detection of evasion techniques, malicious instructions, jailbreak attempts, and indirect injection attacks, which are part of these new attack vectors that are becoming more and more relevant with AI-powered applications. There is some great work on uncovering this full attack surface, with the OWASP Top 10 for LLMs, MITRE ATLAS, and other great initiatives that really speak this new risk language and create the right terminology around it. So I would say awareness is the first step as well.
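The idea of reducing the input options can be sketched as an allowlist filter applied to rule file text before it reaches the model. The example assumes a deliberately strict policy that keeps only printable ASCII; that policy and the name sanitize_for_model are assumptions for illustration, and a real project would likely need a wider allowlist.

```python
import string

# Assumption: only printable ASCII (letters, digits, punctuation, whitespace) is allowed.
ALLOWED = set(string.printable)


def sanitize_for_model(text, allowed=ALLOWED):
    """Drop every character outside the allowlist; return (clean_text, dropped_count)."""
    kept = [ch for ch in text if ch in allowed]
    dropped = len(text) - len(kept)
    return "".join(kept), dropped


clean, dropped = sanitize_for_model("Use strict mode.\u200b\U000E0041 Done.\n")
print(repr(clean), dropped)
```

The trade-off is that a strict allowlist also removes legitimate non-ASCII text such as accented names or non-English comments, so a nonzero dropped count is better treated as a prompt to review the file than as proof of an attack.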
Dave Buettner
What do you suppose this vulnerability reveals about the current state of things when it comes to AI integration and software development, which I think it's fair to say there's a lot of enthusiasm for? It's certainly a powerful tool, and yet we have these things. I mean, is it still early enough days that these things are important to consider as we go forward?
Ziv Karliner
For sure. So we're still in the early days. But I would say, coming myself from experience in the cloud security space and also in the software supply chain security space, we had amazing progress with software supply chain security over the last decade, with SBOMs becoming a standard, and the vulnerability programs. You know, we put a lot of guardrails inside the CI/CD pipelines and got a lot of awareness around it. On the other hand, we now have this amazing phenomenon of, I would call it, the intelligence age, the AI transformation that doesn't leave any vertical in the industry or any role untouched, and it's moving really fast. So there is a challenge here, where the attack vectors are being discovered as we go, but adoption is moving faster than I've ever seen in my career. So it's a combination of both. In the security industry in general, you see a lot of awareness, a lot of community efforts to really surface these new emerging threats, even before we've seen attack vectors being utilized in the wild. I can give an example: one of the accelerators for safer CI/CD pipelines was SolarWinds, which we're all familiar with. That really hasn't happened yet in the AI security space. I guess, as always, it's a matter of time until something becomes more public, because we are at a pace of adoption that is only accelerating. And I would say there are great opportunities these days for developer teams to move much faster and build even higher quality code if they utilize these tools in the right ways, with the right context. But human supervision is still much needed, especially from the right security expertise. And as a company, one of our main goals is to help increase awareness with this type of research, and also, I would say, to put more effort on the responsibility metrics. Right?
Who is really responsible for the security issues at hand? Is it on the developers who utilize these amazing tools? Is it on the tool builders, or on the model providers? There are a few different players here that are trying to put these new risks under control, and I would say it's a work in progress.
Dave Buettner
Our thanks to Ziv Karliner from Pillar Security for joining us. The research is titled New Vulnerability in GitHub Copilot and Cursor: How Hackers Can Weaponize Code Agents. We'll have a link in the show notes. And that's Research Saturday, brought to you by N2K CyberWire. We'd love to hear from you. We're conducting our annual audience survey to learn more about our listeners. We're collecting your insights through August 31st of this year. There's a link in the show notes. We hope you'll check it out. This episode was produced by Liz Stokes. We're mixed by Elliot Peltzman and Trey Hester. Our executive producer is Jennifer Ibin. Peter Kilpe is our publisher. And I'm Dave Buettner. Thanks for listening. We'll see you back here next time.
CyberWire Daily Podcast Summary: "Hiding in Plain Sight with Vibe Coding"
Release Date: June 14, 2025
Host: Dave Buettner
Guest: Ziv Karliner, Co-Founder and CTO of Pillar Security
Research Discussed: "New Vulnerability in GitHub Copilot and How Hackers Can Weaponize Code Agents"
In this episode of CyberWire Daily, host Dave Buettner welcomes Ziv Karliner, Co-Founder and CTO of Pillar Security, to discuss groundbreaking research on vulnerabilities within AI-powered coding assistants like GitHub Copilot. The conversation delves into how malicious actors can exploit these tools to embed backdoors in software development processes.
Ziv Karliner opens the discussion by highlighting the extensive research conducted over the past eighteen months on new attack vectors targeting AI-powered applications.
“We spent the last year and a half spending a lot of time with the emerging attack vectors that put AI powered applications at risk...” [01:21]
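Key focus areas include prompt injection, indirect injection, and evasion techniques that make these attacks invisible to the human eye and to most security tools.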
The core of the discussion centers on rule files, which guide AI coding assistants in adhering to project-specific best practices and contexts.
“Rule files are basically a way to onboard the coding agent to your project, to your team...” [03:03]
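Rule File Backdoor: attackers embed malicious instructions in the rule file context so that the code the assistant generates contains backdoors. Rule files shared through open source marketplaces turn this into a supply chain vector, and hidden Unicode characters make the instructions invisible to a developer reading the file.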
Ziv Karliner provides a step-by-step example to illustrate how an attacker could exploit this vulnerability:
“...an attacker could also use the agent, I would say, intelligence to its advantage...the AI agent will say, oh, this is the security best practices of our organization.” [06:34]
This method not only injects malicious code but also deceives developers by providing plausible explanations for the injected code, leveraging the AI’s natural language capabilities to mask the intrusion.
The research underscores how hidden instructions can mislead developers, making it difficult to detect malicious activities:
“...the assistants understand, I would say, every language that was ever spoken or written together with hidden Unicode characters...” [12:21]
Key Points:
The concept of Human in the Loop (HITL) is critically examined, questioning its effectiveness in mitigating such sophisticated attacks:
“...if the attack itself is completely hidden to a human, are humans really equipped to be in the loop?” [10:41]
Challenges:
Ziv Karliner proposes several strategies to counteract these threats:
Sanitation:
“...as silly as it may sound, sanitation...” [13:27]
Enhanced Security Features by Platforms:
“...GitHub actually added a new capability...to show a warning message...” [13:27]
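Guardrails for AI Models: on the agent builder side, detection of evasion techniques, malicious instructions, jailbreak attempts, and indirect injection attacks.
Community Awareness and Responsibility Metrics: initiatives such as the OWASP Top 10 for LLMs and MITRE ATLAS that map the attack surface and establish shared terminology.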
The conversation shifts to the broader implications of these vulnerabilities on AI integration within software development:
“...there is a lot of enthusiasm for [AI tools]. It's certainly a powerful tool and yet we have these things...” [15:34]
Key Insights:
In wrapping up, Ziv Karliner emphasizes the critical need for:
“...we put more effort on the responsibility metrics. Who is really responsible for the security issues at hand...” [16:07]
Dave Buettner concludes by thanking Ziv Karliner for his insightful contributions and highlights the availability of the research paper in the show notes for listeners seeking more detailed information.
Stay Informed: For a deeper dive into the discussed research, visit the link provided in the podcast’s show notes.
Produced by Liz Stokes, mixed by Elliot Peltzman and Trey Hester. Executive Producer: Jennifer Ibin. Publisher: Peter Kilpe.