AWS Bites – Episode 151: EC2 ❤️ Lambda – Lambda Managed Instances

Released January 16, 2026 | Hosts: Eoin Shanaghy & Luciano Mammino

Episode Overview

In this episode, Eoin and Luciano dive into Lambda Managed Instances (Lambda MI), a newly announced execution model for AWS Lambda that brings managed EC2-style capacity to one of the hallmark serverless services. The hosts dissect what changes and what stays the same, explore hands-on use cases, compare Lambda MI to default Lambda execution, and highlight where Lambda MI fits into the evolving AWS compute landscape.

Key Discussion Points & Insights

1. Background: What Is Lambda Managed Instances?

[00:00–04:30]

Lambda MI is inspired by ECS Managed Instances, but applied to Lambda: you still run your code on EC2—this time managed and provisioned by AWS, relieving you of AMI selection, OS patching, and ongoing maintenance.
Instead of classic on-demand Lambda scaling and cold starts, Lambda MI pools EC2 resources ahead of time and handles multiple requests concurrently in each environment.
Notable quote:

"Why in the world would you want to bring EC2 instances, aka servers, into one of the most serverless compute services out there?"
— Luciano [01:14]
Spoilers teased:
- "No more cold starts, kinda..."
- Single Lambda environments now support concurrency.

2. Default Lambda vs Lambda MI Execution Model

[02:39–07:51]

Default Lambda: Each invocation runs in its own ephemeral environment; scales horizontally with incoming events; “cold starts” occur when new environments are launched.
Lambda Managed Instances: Your code runs in containerized environments on always-on EC2s within your account; these can serve multiple concurrent invocations.
AWS still manages instance provisioning, patching, and scaling, but the scaling, provisioning, and cost model all differ from standard Lambda.
Longer-lived environments (up to 14 days) mean memory leaks and state management bugs are more visible.
Notable quote:

"One execution environment can now handle multiple concurrent invocations, unlike default Lambda’s single invocation per environment model. This is a big change in how Lambda operates and something that people have been asking about for a long time."
— Eoin [04:44]

3. Scaling Behavior: Proactive vs Reactive

[07:51–14:00]

Default Lambda: Scales reactively—new environments created as needed for each event.
Lambda MI: Scales proactively/asynchronously, watching CPU and concurrency. May cause throttling on sudden high-traffic spikes until more capacity is ready.
Analogy: The “restaurant” analogy compares EC2 instances to tables, execution environments to staff, and max concurrency to the number of guests each staff member can handle.
Two ways to scale:
- Add more environments on the same instance ("hiring more staff").
- Add more instances ("adding more tables"/capacity).
Scaling can feel less “automatic”—requires advance planning of capacity.
Notable quote:

"If your traffic doubles very quickly... maybe it's where you start to see throttles. But in general, if you have predictable traffic and your capacity is enough, you are not going to see cold starts or throttling—so that gives you a little bit more of a predictable and always available environment."
— Luciano [10:45]
Technical deep dive:
- Router, Scaler, and Lambda Agent components underpin MI’s dynamic resource management.
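
The two scaling dimensions above can be put into rough numbers. The following is a minimal sketch of the restaurant analogy, not an AWS API: the `Fleet` type and both function names are hypothetical, and it assumes capacity is simply the product of instances, environments per instance, and max concurrency per environment.

```typescript
// Hypothetical model of the restaurant analogy; none of these names come from AWS.
interface Fleet {
  instances: number;                    // "tables"
  environmentsPerInstance: number;      // "staff" per table
  maxConcurrencyPerEnvironment: number; // guests each staff member can handle
}

// Upper bound on the invocations the fleet can serve at the same time.
function concurrentCapacity(fleet: Fleet): number {
  return (
    fleet.instances *
    fleet.environmentsPerInstance *
    fleet.maxConcurrencyPerEnvironment
  );
}

// In-flight requests beyond current capacity are the ones at risk of
// throttling until the asynchronous scaler has added more capacity.
function throttledRequests(fleet: Fleet, inFlight: number): number {
  return Math.max(0, inFlight - concurrentCapacity(fleet));
}

const exampleFleet: Fleet = {
  instances: 2,
  environmentsPerInstance: 3,
  maxConcurrencyPerEnvironment: 10,
};
console.log(concurrentCapacity(exampleFleet));     // 60
console.log(throttledRequests(exampleFleet, 100)); // 40
```

Either "hiring more staff" or "adding more tables" raises the product, which is why a sudden doubling of traffic can outrun capacity that was sized for the previous load.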

4. Tuning & Configuration Options

[14:00–17:01]

Function-level specs:
- vCPU & memory sizing (min: 2GB/1 vCPU; max: 32GB).
- Max concurrency per environment (up to 64 requests per vCPU).
Capacity provider-level settings:
- Target utilization for headroom vs. efficiency.
- Specific instance type inclusion/exclusion (fewer selection controls than ECS MI).
- Manual vs automatic scaling based on CPU thresholds.
Ability to change execution environment min/max via API for dynamic scaling.
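
As a quick sanity check on the function-level limits listed above, here is a small sketch that validates a sizing choice before deployment. The `FunctionSizing` type and `validateSizing` function are hypothetical and are not part of any AWS SDK; the limits are the ones quoted in this summary, and any additional rules AWS enforces (such as memory-to-vCPU ratios) are not modeled.

```typescript
// Hypothetical shape; the real configuration fields may be named differently.
interface FunctionSizing {
  memoryGb: number;
  vcpus: number;
  maxConcurrency: number; // per execution environment
}

// Limits as stated in the episode summary.
const MIN_MEMORY_GB = 2;
const MAX_MEMORY_GB = 32;
const MIN_VCPUS = 1;
const MAX_CONCURRENCY_PER_VCPU = 64;

// Returns a list of human-readable problems; an empty list means the sizing
// is within the limits above.
function validateSizing(sizing: FunctionSizing): string[] {
  const problems: string[] = [];
  if (sizing.memoryGb < MIN_MEMORY_GB || sizing.memoryGb > MAX_MEMORY_GB) {
    problems.push(`memory must be between ${MIN_MEMORY_GB} and ${MAX_MEMORY_GB} GB`);
  }
  if (sizing.vcpus < MIN_VCPUS) {
    problems.push(`at least ${MIN_VCPUS} vCPU is required`);
  }
  const concurrencyCap = sizing.vcpus * MAX_CONCURRENCY_PER_VCPU;
  if (sizing.maxConcurrency < 1 || sizing.maxConcurrency > concurrencyCap) {
    problems.push(`max concurrency must be between 1 and ${concurrencyCap}`);
  }
  return problems;
}
```

For example, `validateSizing({ memoryGb: 1, vcpus: 1, maxConcurrency: 100 })` reports two problems: memory is below the 2GB floor, and 100 exceeds the 64-per-vCPU concurrency cap.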

5. Hands-On Example: Video Processing Pipeline

[17:01–22:02]

The hosts developed a simulated video-processing API, motivated by real-world workloads.
Three core components:
1. REST API (CRUD for video entries, trigger processing)
2. Simulated processor Lambda (could be swapped for real work such as FFmpeg or an AI workload)
3. Step Function orchestrator (to scale processor Lambdas up from zero on demand)
Demonstrated two operation types:
- Persistent capacity for APIs (always-available pool)
- Batch/on-demand (scaling from zero for heavy, event-driven work)
All examples written in TypeScript using CDK, Node.js 24 on ARM64, DynamoDB.
Notable quote:

"It is a little bit of a hack to effectively get that scale to zero, which of course only makes sense if you control the event that triggers, in this case the processing."
— Luciano [19:19]
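
The scale-from-zero flow described above can be sketched as plain orchestration logic. This is only an illustration of the idea, not the hosts' Step Functions definition: the `CapacityControl` interface and its three methods are hypothetical stand-ins for whatever API calls adjust the execution environment min/max, and the polling loop stands in for a Wait state.

```typescript
// Hypothetical abstraction over the capacity API; these method names are
// assumptions made for illustration only.
interface CapacityControl {
  scaleUp(): Promise<void>;      // raise min/max environments above zero
  isReady(): Promise<boolean>;   // true once environments can take work
  scaleToZero(): Promise<void>;  // drop capacity back to zero
}

// Scale up, wait for capacity, run the work, and always scale back down,
// even if the work fails.
async function processWithScaleToZero<T>(
  capacity: CapacityControl,
  work: () => Promise<T>,
  maxPolls = 10,
): Promise<T> {
  await capacity.scaleUp();
  try {
    for (let poll = 0; !(await capacity.isReady()); poll++) {
      if (poll >= maxPolls) {
        throw new Error("capacity was not ready in time");
      }
      // A real state machine would pause here before polling again.
    }
    return await work();
  } finally {
    await capacity.scaleToZero();
  }
}
```

As the quote notes, this only works when you control the triggering event, because something has to run the scale-up step before the first invocation arrives.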

6. Limitations, Pitfalls & Gotchas

[22:02–29:17]

Instance selection: Fewer “abstraction” options than ECS MI; you must specify allowed/excluded types.
Region support (as of Jan '26): Limited to 5 regions globally.
Runtime support: Latest versions only—no older or deprecated runtime support.
VPC required: All Lambda MI workloads must run in a VPC (networking/egress concerns apply).
Deployment lag: Initial deployments can take minutes, as instances must spin up and warm environments.
Minimal resource size: No tiny Lambdas; smallest allowed is 2GB/1 vCPU.
Manual scaling quirks: Creating a manual capacity provider can still spin up baseline instances, even unused.
Scale-to-zero is complex: Not as seamless as with containers; function deactivation occurs if min=0.
Concurrency headaches: Must account for thread safety, shared global state, filesystem collisions, log interleaving.
AWS Guidance: Extensive docs available, including per-language pitfalls and solutions.
Notable quote:

"Because now you get concurrent execution, that comes with a few more headaches from a developer perspective... if you're using Node.js, for instance, and you have a global variable, you might end up with inconsistent state. This is something that might lead to very serious bugs. So just be aware of that."
— Luciano [26:03]
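
The global-variable pitfall from the quote can be reproduced in a few lines. This is a simulation, not Lambda MI itself: it assumes overlapping invocations share module scope, uses `Promise.all` to overlap two calls, and all names (`GreetEvent`, `unsafeHandler`, `safeHandler`, `simulateIo`) are made up for the example.

```typescript
type GreetEvent = { userId: string };

// Yields control, standing in for awaited I/O such as a database call.
const simulateIo = (): Promise<void> => Promise.resolve();

// UNSAFE: module-level state is shared by every in-flight invocation.
let currentUser = "";
async function unsafeHandler(event: GreetEvent): Promise<string> {
  currentUser = event.userId;
  await simulateIo(); // another invocation can overwrite currentUser here
  return `hello ${currentUser}`;
}

// SAFE: state stays in the invocation's own scope.
async function safeHandler(event: GreetEvent): Promise<string> {
  const user = event.userId;
  await simulateIo();
  return `hello ${user}`;
}

// Runs two overlapping invocations through each handler.
async function demo(): Promise<{ unsafe: string[]; safe: string[] }> {
  const events: GreetEvent[] = [{ userId: "alice" }, { userId: "bob" }];
  const unsafe = await Promise.all(events.map((e) => unsafeHandler(e)));
  const safe = await Promise.all(events.map((e) => safeHandler(e)));
  return { unsafe, safe };
}

demo().then(({ unsafe, safe }) => {
  console.log(unsafe); // [ 'hello bob', 'hello bob' ]
  console.log(safe);   // [ 'hello alice', 'hello bob' ]
});
```

With one invocation per environment, the unsafe version happens to work, which is why this class of bug tends to surface only after moving to a multi-concurrency model.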

7. Pricing Model

[29:17–31:46]

Billing now includes THREE components:
1. Request charge: Same as classic Lambda ($0.20 per million, typically negligible)
2. EC2 instance charge: Billed at normal on-demand rates (can use Savings Plans/Reserved Instances, but NOT Spot)
3. Management fee: A 15% premium on the on-demand EC2 price (applies in addition to compute charge; not discountable).
Potential for significant cost savings on steady, high-volume, high-concurrency workloads (and you can leverage EC2 discounts), but must be sized and measured carefully.
Multi-concurrency and big, long-running workloads may tilt cost advantage toward Lambda MI.
Notable quote:

"The new thing is, just like with ECS MI, now you’ve got kind of a management fee—a managed instance tax if you like... with Lambda, it’s 15% premium calculated on the EC2 on-demand price of the instance."
— Eoin [30:10]
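
To make the three billing components concrete, here is a rough estimator. It is a sketch under the assumptions stated in this summary: the 15% fee is computed on the undiscounted on-demand price, and any Savings Plan or Reserved Instance discount applies only to the compute charge. The `CostInputs` type, the `estimateCost` function, and the hourly price used in the usage note are hypothetical.

```typescript
interface CostInputs {
  onDemandHourlyUsd: number; // EC2 on-demand price of the instance type
  instanceHours: number;     // total instance-hours in the period
  requests: number;          // number of invocations in the period
  computeDiscount?: number;  // e.g. 0.3 for a 30% discount (assumed value)
}

interface CostBreakdown {
  computeUsd: number;
  managementFeeUsd: number;
  requestsUsd: number;
  totalUsd: number;
}

const REQUEST_PRICE_PER_MILLION_USD = 0.2; // same as classic Lambda
const MANAGEMENT_FEE_RATE = 0.15;          // premium on the on-demand price

function estimateCost(inputs: CostInputs): CostBreakdown {
  const onDemandUsd = inputs.onDemandHourlyUsd * inputs.instanceHours;
  // Discounts reduce the compute charge only.
  const computeUsd = onDemandUsd * (1 - (inputs.computeDiscount ?? 0));
  // The fee is based on the undiscounted on-demand price.
  const managementFeeUsd = onDemandUsd * MANAGEMENT_FEE_RATE;
  const requestsUsd = (inputs.requests / 1_000_000) * REQUEST_PRICE_PER_MILLION_USD;
  return {
    computeUsd,
    managementFeeUsd,
    requestsUsd,
    totalUsd: computeUsd + managementFeeUsd + requestsUsd,
  };
}
```

With a made-up price of $0.10 per hour, 730 instance-hours, and 100 million requests, the estimate is 73 + 10.95 + 20 = 103.95 USD. Applying a 30% compute discount lowers only the first term, giving 51.10 + 10.95 + 20 = 82.05 USD.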

8. Conclusions & Summary Guidance

[31:46–End]

Lambda MI is not Lambda v2 or a replacement: it's another tool in the kit.
Retains the Lambda developer experience; you just gain (and must manage) compute control, capacity settings, and concurrency.
Best for:
- High-throughput APIs, steady and predictable loads, CPU-heavy/batch/long-running work.
Not optimal when you need:
- Ultra-low-latency scale-up (as spike throttling can occur), very small Lambdas, or automatic scale-to-zero for all cases.
New options ("more choices, more complexity"); careful evaluation needed per workload.
Request for AWS: Would love to see on-the-fly fallback from MI to default Lambda during scaling, but this is not currently possible.
Open-source example code and further documentation are provided in their GitHub repo.
Notable quote:

"This is a new tool and it might be a very good tool for the right workloads. It's not necessarily something we should consider as an upgrade to Lambda..."
— Luciano [34:47]

Memorable Quotes & Moments

"Why in the world would you want to bring EC2 instances, aka servers, into one of the most serverless compute services out there?"
— Luciano [01:14]

"One execution environment can now handle multiple concurrent invocations... something that people have been asking about for a long time."
— Eoin [04:44]

"If your traffic doubles very quickly... maybe it's where you start to see throttles. But in general, if you have predictable traffic and your capacity is enough, you are not going to see cold starts or throttling..."
— Luciano [10:45]

"It is a little bit of a hack to effectively get that scale to zero..."
— Luciano [19:19]

"Because now you get concurrent execution, that comes with a few more headaches... you might end up with inconsistent state. This is something that might lead to very serious bugs."
— Luciano [26:03]

"A management fee—a managed instance tax if you like... with Lambda it’s 15% premium calculated on the EC2 on-demand price of the instance."
— Eoin [30:10]

"This is a new tool... It's not necessarily something we should consider as an upgrade to Lambda."
— Luciano [34:47]

Important Timestamps

00:00 — Introduction, context & episode roadmap
02:39 — Default vs. Lambda MI: what’s new, what changes
07:51 — Deep dive: scaling behaviors and restaurant analogy
14:00 — Function and capacity provider configuration options
17:01 — Example: Building a video processing pipeline on Lambda MI
22:02 — Limitations and pitfalls: instance selection, regions, runtimes, VPC, deployment, scaling
29:17 — Lambda MI pricing and cost comparisons
31:46 — Conclusions, usage guidance, feature requests to AWS
34:47 — Final reflections and closing remarks

Summary Table

| Feature/Aspect | Default Lambda | Lambda Managed Instances |
|---------------------------|-----------------------|---------------------------|
| Cold starts | Yes | Mostly eliminated |
| Concurrency | 1 per environment | Multi-concurrent/env |
| Scaling | On-demand/reactive | Proactive/asynchronous |
| Runtime size | 128MB min | 2GB/1vCPU min |
| Pricing | Pay-per-use | EC2 price + 15% fee |
| Spot/Discounts | N/A | EC2 Savings Plans, but not Spot instances |
| Deployment speed | Fast | Slower, minutes possible |
| Scale-to-zero | Automatic | Complicated, not simple |
| Networking | VPC optional | VPC required |
| Developer headaches | Fewer (less state) | More (manage concurrency, state, tmp) |

Closing Notes

Lambda MI is a niche but powerful addition, ideal for high-load, predictable, or high-concurrency workloads, or those with cost optimization via EC2 discounts.
It adds complexity and some resource limitations—choose with care.
Example code and further exploration are available via their GitHub (link in show notes).
AWS Bites is open to community feedback: what use cases fit Lambda MI best for you?

Sponsor mention: fourTheorem – AWS consulting & architecture specialists.
