A
Welcome, everyone. This is episode 141 of the AWS Bites podcast, and today we are going to dive deep on some pretty transformative new features in Step Functions. We've always been big fans of Step Functions, and we actually covered them back in episode 54. Recently, AWS released big new features, so we wanted to share our experience of using them with you. We're talking mainly about JSONata support, but we'll also touch on the new variables feature. I'm Eoin, I'm here with Luciano, so let's get started. AWS Bites is brought to you in association with fourTheorem. If you need a friendly partner to support you and work with you to de-risk any AWS migration or development project, check them out at fourtheorem.com.
B
So maybe it makes sense to start by giving a quick recap of what Step Functions are, right? What do you think?
A
Yeah, yeah, let's go for it.
B
Okay, I'll try my best. So, essentially, Step Functions allow you to avoid writing code in order to define some kind of workflow or state machine. So if you have multiple independent steps and you want to orchestrate them together using steps, conditions, parallelization, and loops, Step Functions is effectively the service you want to use in AWS. And just to give you some examples, it can be, I don't know, an ecommerce order flow, something that you might want to model with Step Functions, or maybe an ETL or some kind of other data transformation process. One thing that we have done, and we actually talked about it back in episode 103, is automating the transcription for this very podcast. We have a step function that uses some AI components to basically extract transcriptions and help us create all the things that we need to publish the episodes and also update them on our website. So if you're curious about that, go and check out that particular episode, where we share all the details. And some of this work is even open source, so you can check out the code if you're curious.
So there are lots of benefits when it comes to Step Functions. The main thing is that you don't have to use a programming language. It's effectively a declarative approach, and therefore you don't even need to worry about which operating system you need to install things on. Effectively, AWS manages all of that for you, so you worry less about security issues, library dependencies, upgrades, all this kind of stuff. So it's good to use a managed service for this kind of thing.
And there are a few features of Step Functions that are really cool and that I really like. One is, for instance, that if a step fails, because maybe you are trying to model something that might occasionally fail, it's very easy to configure retries. And then if something fails and it breaks your entire step function, you can easily inspect that step function and replace specific steps. You also have archives, redrives, and in general robust error handling. Similarly, observability is, I think, really good. For instance, when something goes wrong, you can easily see exactly what happened throughout all the steps. AWS retains all the inputs and outputs for each step, so you can pinpoint exactly what kind of error happened, fix it, and then maybe retry from there. And the other cool thing is that Step Functions can integrate directly with almost every other AWS service. There are specific optimized integrations with certain services, but whenever such an integration doesn't exist, you can rely on the SDK and effectively model an API call directly in Step Functions.
So how do you define step functions? This is probably the main question you might have. There is a specific declarative syntax called Amazon States Language, or ASL. It can be written in either JSON or YAML, and it effectively allows you to define all the different components of your state machine and also to reference things like ECS tasks, HTTP APIs, and much more. You can also use CDK or other infrastructure-as-code tools if you don't want to write plain JSON or YAML, and those make the process a little bit easier. But there is actually a very good IDE called Workflow Studio, which you can access in the AWS Management Console. And recently AWS also launched a VS Code extension that effectively supports this UI directly in VS Code.
So I think so far we talked about the benefits of Step Functions, but it's fair to say that there are a few drawbacks. Local development isn't great; this is probably the main thing. So the feedback loop can sometimes be a little bit annoying. There are things like LocalStack that you can try to use, and maybe they solve some of these problems, but in general what I've seen is that people just deploy to AWS and test directly there, so there might be a bit of latency between making changes and testing them. The syntax that is supported is, of course, not as good as a fully featured programming language. So say you are trying to model something very complex.
Maybe you find that the syntax itself becomes a little bit limiting, and often you need to do something custom and you end up creating an AWS Lambda step just to do something very specific, maybe a particular transformation. So those are things that can get in the way and could be a little bit easier. And there has been another annoying limit, which is the state: effectively all the data that you carry around between steps, which every step can read from and write to. That was limited to 256 kilobytes, and I think it still is, but there are ways to work around it now. And this is where we're going to start to talk about JSONata and variables. One last thing I'm going to mention is that those two features are available in both Standard and Express Step Functions. Standard is effectively for long-running workflows, where you pay based on the number of transitions. Express is kind of a short-lived version of Step Functions, where your step function can only run for a maximum of five minutes and you pay by execution time. Generally these are a little bit faster and cheaper. So depending on the type of workflow, you might want to use either Standard or Express. So I'll pass it over to you now, Eoin, because I've been talking a lot, and you can tell us everything about JSONata.
A
This is pretty new to me. I think this is the first time I've really used JSONata. Maybe I'd heard of it before, but it's not something that's widely used. I think it's supported a lot by IBM, and they use it a lot in their products. It is a JSON query language which was inspired by XPath from the world of XML, and it allows you to create sophisticated queries to transform and extract data from JSON. You might be familiar with JSONPath support, which has been used in AWS in a few different places, including Step Functions. JSONata is a much, much more fully featured syntax. It supports string manipulation, numerical operations, things like date and time conversion, regular expressions, which we know we all love, comparison operators, and conditional logic, as well as array and object manipulation. You can even do sorting, grouping, and aggregation. You can define functions and closures in it, and you've also got things like filter, map, and reduce. So pretty much anything you can imagine doing in order to transform a blob of JSON A into a blob of JSON B, JSONata has support for. Now, traditionally Step Functions supported the JSONPath mechanism, and it also had some intrinsic functions; they added some of those for things like formatting strings and whatnot. But realistically, JSONPath only allows you a tiny subset of what JSONata can now do. So the amount of data transformation you can do is massively increased, and there are a load of benefits that come with that. So I guess you might ask, how would you use JSONata in Step Functions, and how does it differ from the traditional approach, if you like? Well, previously JSONPath was your only option. Now every step function can have a top-level query language specified, which can be JSONata or JSONPath, and you can also customize it on a state-by-state level.
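As a rough illustration of the top-level and per-state query language setting, the sketch below builds a small Amazon States Language definition as a Python dictionary and prints it as JSON. The state names and input fields are hypothetical, and `$states.input` is the JSONata way of referring to a state's input. It is a minimal sketch of the shape of the definition, not a tested workflow.

```python
import json

# Hypothetical two-state workflow: state names and field values are made up.
definition = {
    "Comment": "JSONPath by default, JSONata for one state",
    "QueryLanguage": "JSONPath",  # top-level default for every state
    "StartAt": "LegacyState",
    "States": {
        "LegacyState": {
            "Type": "Pass",
            # JSONPath style: a key ending in ".$" marks its value as a path.
            "Parameters": {"customerName.$": "$.customer.name"},
            "Next": "JsonataState",
        },
        "JsonataState": {
            "Type": "Pass",
            "QueryLanguage": "JSONata",  # per-state override
            # JSONata style: expressions are strings wrapped in {% ... %}.
            "Output": {"greeting": "{% 'Hello, ' & $states.input.customerName %}"},
            "End": True,
        },
    },
}


def states_using(definition: dict, language: str) -> list[str]:
    """Return the names of states whose effective query language matches."""
    default = definition.get("QueryLanguage", "JSONPath")
    return sorted(
        name
        for name, state in definition["States"].items()
        if state.get("QueryLanguage", default) == language
    )


print(json.dumps(definition, indent=2))
print(states_using(definition, "JSONata"))
```

Keeping the default on JSONPath and opting single states in to JSONata is one way to adopt it gradually in an existing state machine.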
And that's pretty interesting if you already have a large code base with lots of step functions and you just want to start dipping your toes in, or maybe just apply it where it's really, really valuable. So you can do that. You can just specify the query language for one state as being JSONata, and instead of using all of those quite frustrating and difficult-to-understand properties from before, like OutputPath, ResultPath, InputPath, ResultSelector, all that stuff, you just specify a JSONata expression for the Output field, which encapsulates everything you're going to output from that state, like in a Pass state, or you can specify JSONata for an Arguments property as well. So if you imagine you're invoking a Lambda function and you want to pass some parameters to it, you now do that with Arguments, which supports JSONata, and the state's input can be referenced using a special variable, $states.input. Once you know that, you'll want to start writing some expressions in JSONata, and you can look up the documentation, which is pretty good. There are also some online tools. One is the more or less official JSONata Exerciser. But we also know that Stedi has an online JSONata playground, which has nice autocomplete support in it. There aren't a lot of other tools like VS Code plugins yet, so the ecosystem is not as rich as for some other things, but I've found those tools pretty useful, and they give you everything you need. Now, at the time of recording, JSONata support is not yet fully developed in the AWS CDK constructs, so if you need JSONata in CDK Step Functions right now, you have to use a custom state and provide the raw Amazon States Language, and I've found that to be pretty okay.
Actually it's fine, because once you're writing JSONata you're in a string within a property anyway, so I don't know if CDK is really going to provide anything significant beyond that, but it is something to be aware of. There is an issue in the AWS CDK repo to track that. Overall, the experience with JSONata states is, I think, simpler, easier to read, and easier to understand, since you don't have to deal with all that InputPath, OutputPath, ResultPath stuff that all interacts together. You also don't have that dollar syntax for property names that you might keep forgetting anymore either. And overall I just think it's a much more developer-friendly experience and way more powerful. There is a good guide on moving from JSONPath to JSONata, and Eric Johnson has an overview video on that page as well. We'll link it in the show notes. And yeah, that's JSONata support. I've found it pretty useful so far.
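To put the Arguments and Output fields next to the older JSONPath fields, here is a minimal sketch of one Lambda invocation written both ways, again as Python dictionaries that serialize to Amazon States Language. The function ARN and the shape of the order data are hypothetical, and the two states are only roughly comparable in what they output.

```python
import json

# Hypothetical function ARN, used only for illustration.
FUNCTION_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:price-order"

# JSONPath style: several fields cooperate to shape the input and the output.
jsonpath_task = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
        "FunctionName": FUNCTION_ARN,
        "Payload": {"orderId.$": "$.order.id"},
    },
    "ResultSelector": {"total.$": "$.Payload.total"},
    "ResultPath": "$.pricing",
    "End": True,
}

# JSONata style: Arguments shapes the call, Output shapes the state's result.
jsonata_task = {
    "Type": "Task",
    "QueryLanguage": "JSONata",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Arguments": {
        "FunctionName": FUNCTION_ARN,
        "Payload": {
            "orderId": "{% $states.input.order.id %}",
            "itemCount": "{% $count($states.input.order.items) %}",
        },
    },
    "Output": {
        "orderId": "{% $states.input.order.id %}",
        "total": "{% $states.result.Payload.total %}",
    },
    "End": True,
}

# Fields that belong to the JSONPath way of shaping data.
JSONPATH_ONLY_FIELDS = {"InputPath", "Parameters", "ResultSelector", "ResultPath", "OutputPath"}


def jsonpath_fields_used(state: dict) -> list[str]:
    """List the JSONPath-era fields that a state definition still relies on."""
    return sorted(JSONPATH_ONLY_FIELDS.intersection(state))


print(json.dumps(jsonata_task, indent=2))
print(jsonpath_fields_used(jsonpath_task))
print(jsonpath_fields_used(jsonata_task))
```

A helper like `jsonpath_fields_used` could be handy during a migration to spot states that still mix in the old fields, though that is an idea for illustration rather than anything mentioned in the episode.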
B
That's pretty cool. What about this new variables feature then?
A
Variables? Yeah. So you had this 256 KB limit for state data, and basically state was passed all the way down from the top and fell out the bottom of your step function. There was no such thing as global state, and the 256 KB limit could be annoying. We definitely had cases before where you were pushing data to S3 or DynamoDB and then trying to pull it out and using Lambda functions to extract a subset that was under 256 KB. Now there's a new feature called variables, and that allows you to declare variables in one state and then reference them in any subsequent state without having to propagate them all the way down the state chain. The variables are just named, and the total size of these is up to 10 megabytes. So already you can do far, far more than you could previously. Now, you assign those variables using either JSONata or JSONPath. I think you're going to see them being used with JSONata a lot more, to be honest. But when you combine this feature with JSONata, you have a really big step forward in the capability of Step Functions. We have been using it in real projects, and we were able to remove a lot of Lambda functions that only did data transformation, and reduce the complexity and even the cost of step functions overall, if you consider the cost of invoking Lambda functions and waiting for them. Maybe some examples of the things we were able to do. One, which I found really satisfying actually, was on a project where we were using Step Functions for API integrations. We had a third-party API that was authorized with a bearer token. You're able to set up an EventBridge connection, which is what Step Functions uses for HTTP invocations, so we can now call APIs over HTTPS from Step Functions using the bearer token from Secrets Manager. And now we can process large API responses. We can process them with JSONata to filter and transform the data in a really powerful way, and then fan out and call more specific actions in, like, a Map step.
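A workflow along these lines might look roughly like the sketch below: an HTTP task calls a third-party API through an EventBridge connection, retries on failure, stores the filtered response in a variable with Assign, and a Map state fans out over that variable. The endpoint, connection ARN, response shape, and state names are all hypothetical; this is an illustration under those assumptions, not the project's actual definition.

```python
import json

# All names, URLs and ARNs below are hypothetical placeholders.
CONNECTION_ARN = (
    "arn:aws:events:eu-west-1:123456789012:"
    "connection/example-api/11111111-2222-3333-4444-555555555555"
)

definition = {
    "QueryLanguage": "JSONata",
    "StartAt": "FetchOrders",
    "States": {
        "FetchOrders": {
            "Type": "Task",
            "Resource": "arn:aws:states:::http:invoke",
            "Arguments": {
                "ApiEndpoint": "https://api.example.com/orders",
                "Method": "GET",
                # The EventBridge connection holds the credentials for the API.
                "Authentication": {"ConnectionArn": CONNECTION_ARN},
            },
            # Retry failures with exponential backoff.
            "Retry": [
                {
                    "ErrorEquals": ["States.ALL"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            # Keep only open orders in a variable. The outer [ ] keeps the
            # value an array even when a single order matches.
            "Assign": {
                "openOrders": "{% [$states.result.ResponseBody.orders[status = 'open']] %}"
            },
            # Pass only a small summary down the chain as state output.
            "Output": {
                "openOrderCount": "{% $count($states.result.ResponseBody.orders[status = 'open']) %}"
            },
            "Next": "ProcessEachOrder",
        },
        "ProcessEachOrder": {
            "Type": "Map",
            # The variable is referenced by name, without being part of the state input.
            "Items": "{% $openOrders %}",
            "ItemProcessor": {
                "ProcessorConfig": {"Mode": "INLINE"},
                "StartAt": "HandleOrder",
                "States": {
                    "HandleOrder": {
                        "Type": "Pass",
                        "Output": {"orderId": "{% $states.input.id %}"},
                        "End": True,
                    }
                },
            },
            "End": True,
        },
    },
}


def assigned_variables(states: dict) -> set[str]:
    """Collect variable names declared through Assign, including nested Map states."""
    names: set[str] = set()
    for state in states.values():
        names.update(state.get("Assign", {}))
        names |= assigned_variables(state.get("ItemProcessor", {}).get("States", {}))
    return names


print(json.dumps(definition, indent=2))
print(assigned_variables(definition["States"]))
```

The filtering that would otherwise need a Lambda function lives in the `Assign` expression, and only a small count travels in the state output.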
And that was an existing code base, so we were able to just remove a whole lot of Lambda functions, which has great benefit. You know, you reduce your deployment time, you reduce your maintenance, you reduce the runtimes you need to keep track of, and it just simplifies everything and makes your whole workflow very easy to understand by reading it. The other thing we use JSONata a lot for is just simple Pass states, where you've got the output of one state and you want to do some processing or transformation on it. We just do all that with JSONata in a simple Pass state.
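A Pass state used purely for transformation could be sketched as follows. The input shape (an order with priced items) is made up for illustration, and `expected_output` is a plain-Python mirror of what the JSONata expressions are intended to compute for a non-empty item list; Step Functions never runs it, it is only there to make the intent easy to check.

```python
import json

# Hypothetical Pass state; the field names in the input are made up.
reshape_order = {
    "Type": "Pass",
    "QueryLanguage": "JSONata",
    "Output": {
        "orderId": "{% $states.input.order.id %}",
        "itemNames": "{% [$states.input.order.items.name] %}",
        "total": "{% $sum($states.input.order.items.(price * quantity)) %}",
    },
    "End": True,
}


def expected_output(state_input: dict) -> dict:
    """Plain-Python mirror of what the expressions above are intended to compute."""
    order = state_input["order"]
    return {
        "orderId": order["id"],
        "itemNames": [item["name"] for item in order["items"]],
        "total": sum(item["price"] * item["quantity"] for item in order["items"]),
    }


sample_input = {
    "order": {
        "id": "o-1",
        "items": [
            {"name": "pen", "price": 2, "quantity": 3},
            {"name": "book", "price": 10, "quantity": 1},
        ],
    }
}

print(json.dumps(reshape_order, indent=2))
print(expected_output(sample_input))
```

Pasting the expressions and a sample input into an online JSONata playground is a quick way to confirm they behave as intended before deploying.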
B
That's awesome. And I'm sure that everyone is wondering about cost. Are these two amazing new features coming with some extra cost or not?
A
Surprisingly, no. The billing model for Step Functions is still the same, in that for Standard mode step functions you're paying per transition. By the way, you can reduce transitions now with JSONata, so you might actually get a price reduction. And Express step functions are priced based on runtime, the time they take to execute, up to five minutes. So no, there's no additional cost for using variables or JSONata. I would definitely recommend people give them a try and let us know what you think. Have you found any drawbacks? For us, there's a bit of a learning curve for sure, but it's not significant. And I think it's not like VTL, the Velocity language you might have come across in AppSync and various other things like API integrations in the past. JSONata has been a much more pleasant experience, at least for me. So let us know what you think. I think that's everything we wanted to share. Share your use cases, let us know. We're always looking to compare notes and learn how you use AWS. So thanks for listening, and we'll see you in the next episode.
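The point about transitions can be made concrete with a little arithmetic. In the sketch below, the per-transition price, the transition counts, and the number of executions are all assumptions chosen for illustration; check the current AWS pricing page for real figures. It also ignores any free tier and the Lambda costs saved by removing functions.

```python
from decimal import Decimal


def standard_workflow_cost(
    transitions_per_execution: int,
    executions: int,
    price_per_transition: Decimal,
) -> Decimal:
    """Cost of a Standard workflow billed purely per state transition."""
    return transitions_per_execution * executions * price_per_transition


# Assumed price per state transition; not taken from the episode.
ASSUMED_PRICE = Decimal("0.000025")

# Hypothetical workflow: 12 transitions before, 8 after folding
# transformation-only steps into JSONata expressions.
before = standard_workflow_cost(12, 1_000_000, ASSUMED_PRICE)
after = standard_workflow_cost(8, 1_000_000, ASSUMED_PRICE)

print(f"before: {before}, after: {after}, saved: {before - after}")
```

`Decimal` is used instead of floats so the multiplication stays exact for small per-unit prices.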
Date: March 21, 2025
Hosts: Eoin Shanaghy (“Owen”) and Luciano Mammino
In this episode, Eoin and Luciano dive into two transformative new features for AWS Step Functions: JSONata support and the introduction of variables. They discuss how these additions are making workflows more powerful and developer-friendly, breaking down what these features are, how to use them, and the real-world impact on serverless architectures. The conversation provides practical examples, highlights benefits and limitations, and offers opinions on the overall developer experience.
(00:52–06:20)
Quote:
“Step Functions allow you to avoid writing code in order to define some kind of workflow or state machine... it can be, I don't know, an ecommerce order flow... or maybe an ETL or some kind of other data transformation process.”
(B, 00:57)
(06:36–11:24)
Quote:
“It is a JSON query language which was inspired by XPath from the world of XML and it allows you to create sophisticated queries to transform and extract data from JSON.”
(A, 06:38)
Replaces InputPath, OutputPath, etc., with clear JSONata expressions for Output and Arguments.
State input is referenced with $states.input.
Quote:
“You just specify a JSONata expression for either the output field… or you can specify JSONata for an arguments property as well.”
(A, 08:27)
Quote:
“Overall, the experience with JSONata… is a simpler, easier to read, easier to understand… a much more developer friendly experience and way more powerful.”
(A, 10:55)
(11:26–14:07)
Quote:
“Now there's a new feature called variables and that allows you to declare variables in one state and then reference them in any subsequent state without having to propagate them all the way down the state chain.”
(A, 11:33)
Quote:
“We were able to just remove a whole lot of lambda functions, which has great benefit. You know, you reduce your deployment time, you reduce your maintenance, you reduce the runtimes you need to keep track of, and it just simplifies everything.”
(A, 13:23)
(14:08–15:22)
Quote:
“No, there's no additional cost for using variables or JSONata. I would definitely recommend people give them a try and let us know what you think.”
(A, 14:24)
(15:22-end)
“The main thing is that you don't have to use programming language... It's effectively a declarative approach.”
(B, 01:32)
“You also don't have that dollar syntax for property names that you might keep forgetting anymore either... just makes your whole workflow very easy to understand by reading it.”
(A, 10:51 / 13:44)
Summary:
This episode offers a comprehensive look at two new features in AWS Step Functions—JSONata integration and variables. Both make complex serverless workflows easier to manage, more expressive, and cost-effective, with direct impact on real-world applications. The hosts share best practices, practical examples, migration tips, and set realistic expectations for anyone looking to upgrade their Step Functions usage.