
Learn about how a US Naval Research Laboratory research team successfully conducted the first reinforcement learning control of a free-flyer in space.
A
You're listening to the N2K space network.
C
I would wager most of you listening have heard of Pavlov and his dogs. Does that name ring a bell?
D
Ha ha.
C
It is a famous experiment of conditioning training or reinforced learning. But why am I talking about conditioned responses on a space show? Well, what if a similar technique could apply to robotics in space?
A
Yeah.
C
Wild, right? Want to know more? Let's dive. This is T minus Deep Space. I'm Maria Varmazas. A US Naval Research Laboratory research team successfully conducted the first reinforcement learning control of a free flyer in space this past May. I spoke with NRL space roboticist Dr. Samantha Chapin and NRL computer research scientist Dr. Kenneth Stewart about this demonstration and the future of autonomous robotics in space.
E
Hey, my name's Sam Chapin. I'm a space roboticist here at the US Naval Research Laboratory. And I feel like I have the coolest job here because I get paid to play with robots, and we actually got to send a robot to space and test something for the first time. So I'm really excited to talk about that. But my background is all about space robotics. I did undergrad research into grad school, getting my PhD focusing on how we can do in-space assembly and servicing. So basically, how can we make robots autonomously assemble large structures in space and fix things in space? Basically help astronauts have all the beautiful things we want in space. And now I feel really lucky to be here at NRL and getting to do really cool experimentation with our awesome research and development group, where we get to really dig in and try what is really the latest in industry. Figuring out what are the coolest concepts and actually applying them to the real problems we're trying to solve.
A
That's so cool.
C
Ken.
D
I'm Ken Stewart and I'm a computer scientist here at NRL, and I try to make robots smarter using AI and machine learning. Before I joined here, I would have never thought that I would ever be doing space robotics. When I was looking for jobs at the end of my PhD, I would have never thought space robots would be the thing I would be making smarter, and actually getting a robot to do something cool in space. Really cool achievement.
A
It is. And first, I want to say congratulations to you both and to the team, because I want you both to tell me more about this actual amazing achievement. But since I know what it is, first, congrats. It is really, really, really neat. So I'm just really thrilled that I get to speak to you both about what you all have achieved. So, yeah, I won't keep the audience on tenterhooks. Tell me first, what did you do? What was the experiment?
E
Yeah, so the experiment that our APIARY project focused on was how could we try using this thing called reinforcement learning to control a free flyer in space for the first time? So traditionally, robotics in space is done the safest way possible. You know, it's very expensive to send things into space. There's limited access. Not everyone gets to test in space, because it's expensive. And so a lot of times for robotics, they use the thing that they've done before, that they know is going to work. So humans will teleoperate and basically control them or send slow commands. But we want to change that to be doing it autonomously. So the International Space Station has some different test facilities, and one of them that NASA runs is this Astrobee robot. So it's this really cute droid free flyer, kind of like, you know, something like R2-D2, but it flies around, and we were able to change how it moved. So instead of having the normal controller, we could try our cutting-edge algorithms and test them out and see if they were going to work.
And on the day, we only had about five minutes to test. So we're really happy. You know, it worked for the first time, and this is the first time we think anyone has done it. So we feel so fortunate that we got to do that test. You know, with our really small team, it's basically Ken and I and then our coworkers Roxanna and Glenn. So, you know, our scrappy team of four was able to pull this thing off in three months and do something no one's ever done. So we're really happy.
A
That's awesome. Yeah, a scrappy team of four doing that is quite amazing. So, yeah, again, congrats. Tell me a bit more about what it was that you did. If you can give any specifics about what the goals were, what you achieved. Sam or Ken, if anyone wants to field that one.
D
So basically the goal was just to demonstrate autonomy in space on some level. So we were looking at reinforcement learning at the time, and reinforcement learning is basically getting a robot to learn from its environment with rewards. If you have a pet, like a dog, right? If you want to train your dog to sit, if they sit, then you give them food as a reward. So we're kind of doing that with robots. If the Astrobee robot did what we wanted it to do, which in this case was move to a correct position and orientation, then we would give it a reward, which, we can't give a robot food, at least not yet.
A
But I was like, what's the reward for a robot?
E
Everyone asks that. It's like the most number one question.
D
If you're taking a test and you get an A, you feel really good. If you get like, you know, a failing grade, you feel bad. So we're doing that with the robot.
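Editor's illustration: the "grade" Ken describes is a number computed from how close the robot is to where it should be. The sketch below is a minimal, hypothetical reward for reaching a target position and orientation. The function name `pose_reward`, the quaternion representation, and the `angle_weight` value are assumptions made for this example and are not taken from the NRL team's code.

```python
import math

def pose_reward(pos, target_pos, quat, target_quat, angle_weight=0.5):
    """Hypothetical reward: 0 at the target pose, more negative farther away."""
    # Straight-line distance between current and target position (meters)
    dist = math.dist(pos, target_pos)
    # Angle between two unit quaternions: 2 * acos(|q1 . q2|), in radians
    dot = abs(sum(a * b for a, b in zip(quat, target_quat)))
    angle = 2.0 * math.acos(min(1.0, dot))
    # Penalize both errors; angle_weight is an assumed tuning knob
    return -(dist + angle_weight * angle)

# At the target pose the "grade" is the best possible one.
at_goal = pose_reward((0, 0, 0), (0, 0, 0), (1, 0, 0, 0), (1, 0, 0, 0))
# One meter away with the right orientation scores lower.
one_meter_off = pose_reward((1, 0, 0), (0, 0, 0), (1, 0, 0, 0), (1, 0, 0, 0))
```

During training, a learning algorithm would adjust the robot's policy so that rewards like this one go up over many simulated attempts.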
A
Oh, so a good job, A thumbs up to the robot, like, you did great job, robot.
E
Positive reinforcement.
A
That's like, that's great. It's like a little sticker chart for a robot.
D
It works.
A
I mean.
E
Yeah, yeah.
A
So, I mean, so yeah, go ahead, Ken. I think you're still in the middle of telling me about this.
D
Yeah, so that's, you know, how we trained it with reinforcement learning. And I mean, we only had a very short amount of time, so we only got to getting the Astrobee to move to the correct position and orientation we wanted. If you send a command to it, it goes to the correct place, basically. But it did it all autonomously, figured it out on its own.
A
And that's huge. And from my recollection of many ISS experiments, often it is only a short window that people have. So being able to achieve that in a short window is quite amazing. I think for a lot of us who dream of the Star Trek future, of what we want to see in space one day, this is such a cool step towards that. So it's like, oh, my gosh. So, Ken, you started telling me a bit about reinforcement learning control because I saw that phrase in the press release and I was like, okay, what is that? So you did explain that really nicely. I'm wondering, and again, this question could be for both of you, either of you, where else could we see that being used in space robotics or maybe even robotics in general? I'm just very curious. I mean, I've never heard that phrase before and I'm not super familiar with robotics. So that might be why. Oh, yeah, yeah. Tell me more about that.
E
Yeah, so it's really cutting edge how people are using it now. So reinforcement learning is not a new technique, but the way we're able to do it now is on such a larger scale. So basically, we've been able to use simulators to highly parallelize how we're testing. So instead of it taking people months to train a reinforcement learning algorithm to be effective, now we can do that in minutes. We can start a training in our simulation. It'll be running over hundreds and thousands of robots. They're doing all the same tasks slightly differently. So if I'm trying to pick up a pen, I can change the mass of the pen, I can change the friction of the pen, I can change where the center of mass is. And so that would allow me, if I was training a robot arm, to vary these parameters and make it so that the simulation is as varied and odd as normal life is. So historically in robotics, one of the pitfalls is that you can simulate a lot, but then the gap between simulation and real testing is where things break. And so with reinforcement learning, now that we're simulating on these really large scales and doing it very quickly, that allows you to iterate and create policies for these reinforcement learning algorithms that actually are able to be deployed in the real world. So we tested this on our robot in space, the Astrobee. But previously, both Ken and I had been testing on different platforms. So I tested on robotic arms, Ken was doing some quadruped work. But we were able to take that expertise on different robots and then apply it to this new robot. And that's why we were able to execute it so quickly, because we already had the techniques and our group at NRL had been working up this expertise. And then we got a new robot. NASA was amazing to let us use their open source code and to be able to find the easiest way to integrate, and then we were able to test it out really quickly.
But we think that's also really cool, because if there's another type of robot that you're interested in, a different application, a different environment, we're able to show that we can successfully model environments so that the robots can work when they need to work. Because that's super hard for something like space. Since it's so hard to get testing time there, you want to really have that high confidence that you simulated something and it's going to work the first time.
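Editor's illustration: varying the pen's mass, friction, and center of mass across many simulated copies is commonly called domain randomization. The sketch below is a minimal version of that idea using only the Python standard library. The names `ObjectParams`, `randomize`, and `make_batch`, the nominal pen values, and the 20% spread are all assumptions chosen for the example. A real setup would pass each sampled set of parameters to a physics simulator, which is not shown here.

```python
import random
from dataclasses import dataclass

@dataclass
class ObjectParams:
    mass_kg: float
    friction: float
    com_offset_m: float  # center-of-mass shift along one axis

def randomize(nominal, spread, rng):
    """Scale each nominal value by a random factor in [1 - spread, 1 + spread]."""
    def jitter(x):
        return x * rng.uniform(1.0 - spread, 1.0 + spread)
    return ObjectParams(
        jitter(nominal.mass_kg),
        jitter(nominal.friction),
        jitter(nominal.com_offset_m),
    )

def make_batch(nominal, n_envs, spread=0.2, seed=0):
    """One independently randomized parameter set per simulated environment."""
    rng = random.Random(seed)  # seeded so a training run can be reproduced
    return [randomize(nominal, spread, rng) for _ in range(n_envs)]

# Hypothetical nominal values for the pen in Sam's example
pen = ObjectParams(mass_kg=0.02, friction=0.6, com_offset_m=0.01)
batch = make_batch(pen, n_envs=1000)
```

Each of the 1,000 entries in `batch` would describe a slightly different pen, so a policy trained across all of them cannot rely on any single exact mass or friction value.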
A
Yeah, it makes me think about, I mean, ideally in simulated environments, it's sort of like a closed set, so to speak. But of course, in space we can't simulate everything. So adaptation is sort of the name of the game. Right. And I would imagine this really lends itself well to that. Ken, is there anything you want to add? I want to make sure that I give you an opportunity, if there's anything you want to add.
D
Oh, sure. Thanks. I guess I've talked a lot about using a simulator and simulation environment. I mean, what's really neat about all these machine learning and AI technologies, including the simulator, is that they're really based upon the success of the video game industry. The simulators run on GPUs, graphics processing units. They're made for video games and now they're used for AI. And these simulators are basically game engines, except now they're adapted for scientific work, with very highly realistic physics, right? Whereas people are trying to make their video games much more realistic, now we can adapt that and make our simulation environments much more realistic for training robots for all kinds of things, which is cool and exciting.
A
It really is.
E
It really is.
A
This is just the beginning of things. So where do you see this going? Specifically within space, but maybe even beyond that?
E
Yeah, I mean, so my dream is, I love Star Trek, I love Star Wars. How do we get the most autonomous robots that can do everything? So right now in space it's expensive to get things up there. Astronaut time is very precious. And it's also dangerous for astronauts to do certain things. Say you want to service a telescope that's in a really far away orbit, that's remote. It takes a long time to get there. Can we have robots do those kinds of tasks? So I kind of view it as, can we give robots the tools, with autonomous methods, to be able to handle all of the types of uncertainty you're going to get in space? So I dream of really, really large space structures like space telescopes or the next great space station. And how could we have robots do the assembly? Because right now in space, the International Space Station was assembled with, I guess, a team of robots and humans, but it was again kind of teleoperated, humans at the end of a Canadarm attaching things together. But I dream that we could send out a team of robots and then they could execute on whatever tasks we want and make sure that we're keeping up our in-space ecosystem and things are thriving for a really long time. But I also think these same types of desires that we have, for being able to have robots complete really complex tasks that are highly variable, changing throughout the time that they're completing them, apply to any other type of application you want, either on the surface or in water. So I love space, but I think it can apply basically anywhere you want robots to do complex tasks that we don't need humans to do. We can have the robots do it.
D
Yeah. And I think in addition to complex tasks, the real big push right now in autonomy for machine learning and robotics is trying to get robots to do many varied tasks, or to generalize to a greater number of capabilities. Because historically, a lot of machine learning and robotics systems are trained on data to be very specifically good at one thing or one type of task. And so it's really exciting trying to figure out how you can get a robot, if it's in space, to do some task in orbit and then do something different on the ground, for example, and be able to plan and figure out and adapt based on conditions. Or if it's on land, having a robot be able to do surgery or medical things on the fly, and figure out, you know, how to treat someone, for example, because there are so many variables and everyone's different.
E
And industry right now is super focused on things like kitchen tasks. So you'll see lots of videos of robots, you know, picking up a cup, moving it somewhere else. And we're trying to figure out how we take that wealth of work that people are doing in industry for very specific tasks, which are obviously important to our daily lives too, but don't exactly solve the same type of problems that we care about here at the Naval Research Laboratory. So we're very much asking, how do we make sure we're including whatever is the current state of the art, but finding the ways to adapt it to the use cases that we have, which have a lot less data for them. So, you know, there are not as many people trying to solve the exact same problems we are. So how do we adapt things that have a ton of data, where universities and industry are on the same team working on solving those kinds of household problems, and how do we apply those same types of techniques to these very specific problems?
C
We'll be right back.
A
And I'm wondering again, as somebody who's really not familiar with robotics, aside from whatever I hear in pop culture and maybe a few headlines, I think it might also be important to delineate what's possible versus what realistically is not super possible within the next five, 10 years, what is more sort of a long-term goal? Because I think sometimes there's a hype cycle. At least I know in the space industry where people go, I don't know how realistic that is, sooner rather than later. So what's realistic within the next five to 10 years versus way longer, especially within the realm of space? I'm thinking ISAM type stuff.
D
Yeah.
E
So I think the funny thing is, for space specifically, it's not that we couldn't do a lot of this stuff right now. It's that people don't want to take that extra risk of trying it a new way. And so I think it's more of a culture of not wanting to take risks, of doing stuff the same way we have done it. And that's what's slowing the progress. In addition, it's kind of funny, you see robotics on the ground, you see all the amazing stuff we can do. And then most people don't realize that actually the things you have at your disposal in space are a lot different. So there's usually not as much processing, or compute. And that's again because of risk aversion, so flying hardware that's already been flown. So actually, as an example, on Mars, we have the rovers, and their processing is actually much simpler. The highest processing on Mars right now is from the quadcopter that they flew, Ingenuity. And that's because it wasn't on the critical path. It was kind of an extra test. And they flew a Snapdragon processor. And so they ended up using that for some other tests because it was there. And so that's what we're trying to do: how do we get these smaller tests that aren't as big of a deal? People aren't worried. It's an extra test. Like with the Astrobee, it was a scientific platform. They had it there for people to test. And so that's why they were fine with us testing this algorithm that someone else might not be okay with us testing, because they don't want it to end whatever their major testing cycle is. So we're trying to find opportunities where we buy down the risk by saying, hey, we've tested this smaller component. It's not everything. Our Astrobee test had a very specific goal of moving from A to B. And that's just one aspect of space robotics. But how can we show people autonomy is not scary? We can show that it's going to work time and time again.
We can simulate it, we can test it on the ground, on our granite tables, and then we can show it working in space. So we're really trying to prove to people that our algorithms are robust and can work when they really need them to.
C
That's Fascinating. Anything to add to that, Ken?
D
Yeah, I mean, like Sam was saying, for space specifically, partly it's an economic problem. But we've been talking to people at NASA, for example, and it seems like people are more open, especially I think with the AI boom, to putting more compute in space, such as GPUs and things like that. So I think because people are much more accepting of it and because these algorithms are being so rapidly developed, I could see space having a lot more autonomy. Maybe some basic autonomous ISAM demonstrations of a robot arm doing some very basic assembly. Or maybe we see robots being able to dock with each other or to a station autonomously without someone having to control it manually. Or having observation satellites that can autonomously take data, or take data at specific times rather than always recording data. Things like that.
E
Yeah, I hope that, like, in the next five to 10 years, it does shift from not just using autonomy when you must, like you simply have to, and more of a shift to using it instead of having humans more in the loop as we can, you know, prove that it's, you know, something you can rely on.
A
That is a fascinating distinction, honestly. And anecdotally, this is just a comment, not a question, but I know I've had a lot of conversations with people about edge computing in space, and as that advances, I wonder also how that will interact with what you all are doing. So that's just more of a comment. It's going to be very interesting to watch that happen. I'm curious about the research that you all are doing and how it supports civil space as well as defense operations and what you all think about that.
D
Yeah, our research topic is focused on robotics and machine learning and trying to give robots more autonomy, make them smarter for different applications that people are interested in. I mean, a lot of people in our lab look at ISAM tasks, and I know Sam is especially excited about ISAM, trying to figure out how we can get robots to assemble things. But we're also part of the Navy, so we also do things like naval ship maintenance tasks, which is one thing that we've been looking at. Or, I know a lot of people have been talking about drones, so there are people starting to get into that space as well. And what's really cool about machine learning algorithms is that the algorithms themselves are fairly agnostic to the task if you have the data and problem set up correctly. So as we advance these methods, we can probably find lots of different spaces, including space, to apply them to.
E
Yeah, I guess I just focus on ISAM. Sam and ISAM. But yeah, so for in-space assembly and servicing, I think the cool thing is that once we get some more of these capabilities proven out, we have industry testing really cool things, government testing really cool things, and we show this catalog of abilities that we have, that we can do robotic things, that we can do things like refueling or adding on a new payload or switching out a payload, then we can show that this is just how you can do space operations. So that we're not sending up a satellite, having one thing fail after 15 years, and then having it be completely jettisoned into a graveyard orbit. Instead, being able to just fix whatever that is, or find a new way to use it and repurpose it, and make the space ecosystem last longer and have us be able to do even more awesome science and really cool things. So, yeah, I hope that we get to do even more things with robots in space.
A
Amen to that. Anything that you would like to add that you would like the audience to know? Any reflections about this mission, or if there's anything I missed that we should cover? By all means, yeah.
D
So obviously this was a really exciting opportunity to be able to actually test in space in a microgravity environment on the ISS. We're hoping to test in "space space," outside of the space station, actually in orbit around Earth. That's what we're looking at next, applying these types of things to that domain. Because we're the DoD, we're also hoping to apply robotics to problems like search and rescue and battlefield medicine. Like I mentioned, Navy ship maintenance. You know, there are lots of exciting applications, I think, for robots, especially in this exciting time where people are now super interested because of the success the industry has shown with, you know, things like ChatGPT. So we're just hoping to keep making more and more cool things.
E
Yeah, I just feel so lucky that we get to work with such a cool team here at NRL. Our group is just full of these awesome roboticists coming from different backgrounds and different viewpoints. And so I think it's really fortunate that we're able to apply our expertise to these really fun problems. And we're always open to new fun problems. So this was a specific example of us having access to a cool robot on the ISS and getting to see if we could control it in an interesting way. And we keep trying to find new ways that we can push what is the state of the art so that we can push the final frontier and, you know, make robots be able to do even more cool things. And yeah, so I feel really lucky we got to do this, and hopefully we'll have even more cool stuff to talk to you about in the future.
C
That's T minus Deep Space, brought to you by N2K CyberWire. We would love to know what you think of our podcast. Your feedback ensures we deliver the insights that keep you a step ahead in the rapidly changing space industry. If you like our show, please share a rating and review in your podcast app, or you can send us an email to space@n2k.com. We're proud that N2K CyberWire is part of the daily routine of the most influential leaders and operators in the public and private sector. From the Fortune 500 to many of the world's preeminent intelligence and law enforcement agencies, N2K helps space and cybersecurity professionals grow, learn and stay informed. As the nexus for discovery and connection, we bring you the people, the technology and the ideas shaping the future of secure innovation. Learn how at n2k.com. N2K's senior producer is Alice Carruth. Our producer is Liz Stokes. We are mixed by Elliot Peltzman and Trey Hester, with original music by Elliot Peltzman. Our executive producer is Jennifer Eiben. Peter Kilpe is our publisher, and I am your host, Maria Varmazas. Thank you for listening. We'll see you next time.
Date: September 27, 2025
Host: Maria Varmazas (N2K Networks)
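Featured Guests: Dr. Samantha (Sam) Chapin, NRL space roboticist, and Dr. Kenneth (Ken) Stewart, NRL computer research scientist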
This episode of T-Minus Deep Space focuses on groundbreaking advancements in autonomous robotics in space, highlighting a recent US Naval Research Laboratory (NRL) experiment where reinforcement learning was used to control a free-flying robot on the International Space Station (ISS). Host Maria Varmazas interviews Dr. Sam Chapin and Dr. Ken Stewart about their team’s achievement, the technical and cultural challenges, and the promising future of AI-driven space robotics.
| Timestamp | Speaker | Quote / Moment |
|-----------|---------|----------------|
| 03:54 | Sam | “We were able to change how [Astrobee] moved...try our cutting edge algorithms.” |
| 04:49 | Sam | “Our scrappy team of four were able to pull this thing off in three months and do something no one's ever done.” |
| 05:25 | Ken | “If the Astrobee robot did what you wanted it to…we would give it a reward…which, we can't give a robot food, at least not yet.” |
| 07:43 | Sam | “Now we can do that [train RL] in minutes...we’re simulating on these really large scales.” |
| 10:07 | Ken | “Simulators are basically game engines...now adapted for scientific work.” |
| 11:01 | Sam | “I dream that we could send out a team of robots and...make sure we're keeping up our in-space ecosystem.” |
| 12:33 | Ken | “The real big push right now...is to generalize to a greater number of capabilities.” |
| 17:46 | Sam | “It’s not that we couldn’t do a lot of this stuff right now. It’s that people don’t want to take that extra risk of trying it a new way.” |
| 20:34 | Sam | “In the next five to ten years, I hope it does shift from not just using autonomy when you must…but more to using it instead of having humans more in the loop.” |
| 23:17 | Sam | “Being able to just fix...or find a new way to use [satellites] and repurpose it and...have us do even more awesome science.” |
The mood is enthusiastic, candid, and full of technical optimism. Both Sam and Ken repeatedly credit teamwork, express excitement about the future, and emphasize the transformative potential of AI-driven autonomy in space and beyond.
Final Reflection (Sam, 24:14):
“I just feel so lucky that we get to work with such a cool team here at NRL...This was a specific example of us having access to a cool robot on the ISS... and we keep trying to find new ways that we can push what is the state of the art so that we can push the final frontier and, you know, make robots be able to do even more cool things.”
For listeners interested in the intersection of AI, robotics, and deep space operations, this episode offers a front-row seat to the future—and a candid behind-the-scenes look at what it takes to get there.