Embedded Memories: The Next Generation - Asianometry

Summary

Episode Overview

Podcast: Asianometry
Host: Jon Y
Episode Title: Embedded Memories: The Next Generation
Date: June 21, 2026

In this episode, Jon Y delves into the evolving world of embedded memories in the semiconductor industry, examining the historical dominance of SRAM and eFlash, the rise and fall of eDRAM, and the promise and problems of next-generation technologies such as STT MRAM and ReRAM. The episode provides a rich mix of historical context, technical explanations, current market implications, and a look at what’s coming next as chip designers seek new solutions to persistent scaling and performance challenges.

Key Discussion Points & Insights

1. Why Embedded Memories Matter

As chips have grown faster and more complex, memory performance and proximity to logic circuits have become bottlenecks (00:02).
The need for faster, more efficient data access has pushed the industry to embed memories directly into chips, reducing latency and power consumption.

2. Historical Landscape of Semiconductor Memory Tech

In the 1980s, SRAM, DRAM, and Flash EEPROM dominated as standalone memory (00:40).
Technological divergence in the 1990s saw differing process nodes for DRAM, flash, and logic (01:05).
SRAM could be embedded due to using only transistors (no capacitors), leading to its use as on-chip cache and making it the largest embedded memory market (02:00).

3. SRAM: Ubiquitous but Facing Limits

Density issues: “SRAMs can use only transistors… but it's a thicc boy. The most widely used SRAM cell design uses six transistors.” (03:15)
Fabs have optimized SRAM so heavily that its density has become a headline metric: “When they brag about their process nodes, they use SRAM density to do so. One of the few hard numbers that TSMC has publicly announced about their N2 process node is how it can stuff more SRAM onto the die.” (04:20)
In the 2000s, some high-performance CPUs devoted as much as 70% of their die area to SRAM (04:45).

4. Embedded DRAM (eDRAM): Denser, Lower Power, but Fading

Can achieve 5-6x the density of SRAM and uses about a third of the power, even with periodic refresh (05:00).
Drawbacks: process complexity, four to six extra masks, yield risk and higher cost, and volatility (05:50).
“The eDRAM market was once quite considerable, used for items like the Xbox 360. However, its momentum has sort of petered out in recent years with fewer industry products being made with it.” (06:12)
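The density and power figures above lend themselves to a quick back-of-the-envelope check. The sketch below is only illustrative: the function name `edram_equivalent` and the 32 MB / 3 W SRAM baseline are hypothetical, and applying the one-third power figure at equal capacity is an assumption, since the episode does not say exactly how that comparison was made.

```python
def edram_equivalent(sram_mb, sram_watts, density_gain=(5, 6), power_fraction=1 / 3):
    """Scale an SRAM budget by the episode's rough eDRAM figures.

    density_gain: the 5-6x density range quoted in the episode.
    power_fraction: the "a third of the power" figure, applied here at
    equal capacity (an assumption made for this sketch).
    """
    low, high = density_gain
    return {
        "capacity_mb": (sram_mb * low, sram_mb * high),
        "power_watts": sram_watts * power_fraction,
    }


# Hypothetical baseline: 32 MB of SRAM drawing 3 W in a given die area.
result = edram_equivalent(sram_mb=32, sram_watts=3.0)
print(result["capacity_mb"])  # eDRAM capacity range (MB) in the same area
print(result["power_watts"])  # estimated eDRAM power (W) for the baseline capacity
```

The point of the exercise is the ratio rather than the absolute numbers, which depend entirely on the assumed baseline.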

5. Embedded Flash (eFlash): Workhorse for MCUs, but Hitting a Wall

Type: Always NOR (not NAND), so decent random access but lower density (07:00).
Voltage mismatch: “Programming and erasing eFlash’s NOR-style networks require high voltages… something like 9 to 18 volts… standard logic transistors run at about 1 volt.” (08:00)
Scaling limits: “28nm is the end of the road scaling wise for eFlash.” (09:10)
- Shrinking means fewer electrons in the floating gate, making leakage and degradation more likely.
Market: Dominates in microcontrollers (MCUs), especially automotive, but faces pressures as nodes advance (10:00).
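The scaling point about electrons in the floating gate can be made concrete with a little arithmetic. In the transcript, Jon Y says a scaled cell's floating gate holds about 100 electrons for a threshold of 1 volt. The sketch below assumes the threshold shift is linear in the number of stored electrons, and the 0.2 V read margin is a hypothetical value chosen for illustration; neither assumption comes from the episode.

```python
ELECTRONS_STORED = 100    # approximate figure quoted in the episode
THRESHOLD_SHIFT_V = 1.0   # threshold those electrons correspond to, per the episode


def shift_per_electron(electrons=ELECTRONS_STORED, shift_v=THRESHOLD_SHIFT_V):
    """Volts of threshold shift per stored electron (linear assumption)."""
    return shift_v / electrons


def electrons_to_lose(margin_v, electrons=ELECTRONS_STORED, shift_v=THRESHOLD_SHIFT_V):
    """Number of leaked electrons that would use up a given read margin."""
    return margin_v / shift_per_electron(electrons, shift_v)


print(shift_per_electron())    # volts of threshold shift per electron
print(electrons_to_lose(0.2))  # electrons that consume a hypothetical 0.2 V margin
```

Under these assumptions each electron is worth roughly 10 mV, so losing only a few tens of electrons through a thin tunnel oxide is enough to eat a sizable margin, which is the degradation risk the episode describes.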

6. Next Generation Embedded Memories

Jon Y focuses on two main contenders, highlighting their mechanisms, pros, and cons:

A. STT MRAM (Spin Transfer Torque Magnetoresistive RAM)

How it works: Encodes data by manipulating resistance in 'magnetic tunnel junctions' (MTJs), made of ferromagnets and thin oxide layers (12:00).
Advantages:
- Non-volatile (retains data without power)
- Area savings; “At the 5nm node, we get like a 43% reduction as compared to SRAM.” (17:00)
- Endurance and speed: Writes in <10 ns (DRAM-like); much faster than flash
Manufacturing challenges: “Fabbing the 15 to 20 various metal and dielectric stacks that make up the MTJ is challenging.” (18:30)
Scaling issues: As MTJs and nodes shrink, delivering needed write current and ensuring stability is tough (20:00).
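The read mechanism described in the episode (parallel layers give low resistance, antiparallel layers give high resistance, and the cell is compared against a midpoint reference) can be sketched as a toy model. The class name `MTJ`, the resistance values, and the bit convention are all hypothetical; real cells are sensed with analog circuitry and have device-specific resistances.

```python
from dataclasses import dataclass


@dataclass
class MTJ:
    """Toy magnetic tunnel junction with two resistance states."""
    r_parallel: float = 2000.0        # ohms, hypothetical low-resistance state
    r_antiparallel: float = 5000.0    # ohms, hypothetical high-resistance state
    free_layer_parallel: bool = True  # True: free layer aligned with the pinned layer

    def resistance(self) -> float:
        return self.r_parallel if self.free_layer_parallel else self.r_antiparallel

    def write(self, bit: int) -> None:
        # Assumed convention: 0 = parallel (low R), 1 = antiparallel (high R).
        self.free_layer_parallel = (bit == 0)

    def read(self) -> int:
        # Compare the cell against a reference midway between the two states.
        reference = (self.r_parallel + self.r_antiparallel) / 2
        return 1 if self.resistance() > reference else 0

    def tmr(self) -> float:
        # Tunnel magnetoresistance ratio: (R_AP - R_P) / R_P.
        return (self.r_antiparallel - self.r_parallel) / self.r_parallel


cell = MTJ()
cell.write(1)
print(cell.read())  # bit read back after writing the antiparallel state
cell.write(0)
print(cell.read())  # bit read back after writing the parallel state
print(cell.tmr())   # ratio that sets how far apart the two states are
```

The wider the gap between the two resistance states, the easier it is to place the reference level, which is why manufacturing defects that blur the states make the cell harder to read.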

B. ReRAM (Resistive RAM)

How it works: Uses a metal oxide layer where “a small conductive filament… bridges the two electrodes,” switching resistance (22:00).
Advantages:
- Non-volatile, fast, low energy
- Fewer additional process steps than other non-volatile options
- Scalable to advanced nodes (23:20)
Challenges:
- Variability in forming/resetting filaments; endurance concerns
- Commercial deployment has been patchy; “feels like something that pops its head up every so often, attempting to ride the latest significant trend.” (25:00)
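- TSMC offers it as an option for customers, though it has published far fewer papers on ReRAM than on MRAM; both TSMC and Samsung are positioning the two technologies as potential eFlash successors.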
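The filament mechanism and its variability problem can be illustrated with a small simulation. This is a rough sketch, not a device model: the class name `ReRAMCell`, the resistance ranges, the read threshold, and the use of a uniform distribution for cycle-to-cycle spread are all assumptions made for illustration.

```python
import random

LRS_RANGE = (5e3, 2e4)   # ohms, hypothetical spread when the filament is formed
HRS_RANGE = (5e5, 5e6)   # ohms, hypothetical spread when the filament is broken
READ_THRESHOLD = 1e5     # ohms, hypothetical sense threshold between the states


class ReRAMCell:
    """Toy filament-based ReRAM cell with cycle-to-cycle variability."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.resistance = self._rng.uniform(*HRS_RANGE)  # start with no filament

    def set_filament(self):
        # Forming the filament bridges the electrodes: low resistance.
        self.resistance = self._rng.uniform(*LRS_RANGE)

    def reset_filament(self):
        # Breaking the filament restores the insulating oxide: high resistance.
        self.resistance = self._rng.uniform(*HRS_RANGE)

    def read(self):
        # Assumed convention: low resistance reads as 1, high resistance as 0.
        return 1 if self.resistance < READ_THRESHOLD else 0


cell = ReRAMCell(seed=42)
lrs_samples = []
for _ in range(5):
    cell.set_filament()
    lrs_samples.append(cell.resistance)
    cell.reset_filament()

print([round(r) for r in lrs_samples])  # same set operation, different resistance each cycle
```

In this toy model the two ranges never overlap, so every read succeeds. The concern raised in the episode is what happens when the real distributions are wide enough, or drift enough over many cycles, that they start to overlap.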

C. Other Candidates

Brief mentions of phase change RAM and ferroelectrics (27:30).
Jon Y focuses on STT MRAM and ReRAM as the prime contenders for eFlash replacement.

7. Industry Conservatism, Market Dynamics, and the AI Future

Despite new tech, “eFlash remains quite resilient... a tried and true solution in a space where reliability matters more than raw performance.” (28:20)
Most MCUs still use older (65nm+) process nodes.
AI’s need for fast, local memory may be the catalyst for next-gen adoption. Embedded memory could help sidestep the von Neumann bottleneck (29:00).
ReRAM and neuromorphic computing are discussed as possibilities for AI-specific use cases (30:00).
The end-user market's readiness is an open question: “In both cases the technology seems to have gotten ahead of the use case.” (30:40)

Notable Quotes & Memorable Moments

On SRAM’s role and limitations:
“SRAMs can use only transistors… but it's a thicc boy. The most widely used SRAM cell design uses six transistors.” (Jon Y, 03:15)
On embedded DRAM's decline:
“The eDRAM market was once quite considerable, used for items like the Xbox 360. However, its momentum has sort of petered out in recent years.” (Jon Y, 06:12)
On eFlash scaling:
“28nm is the end of the road scaling wise for eFlash. Scaling down eFlash means making smaller transistors and packing them closer together. This becomes a serious issue at 28nm, the last planar transistor node.” (Jon Y, 09:10)
On MRAM manufacturing hurdles:
“Fabbing the 15 to 20 various metal and dielectric stacks that make up the MTJ is challenging.” (Jon Y, 18:30)
On ReRAM’s commercial fortunes:
“Broadly speaking, the technology has commercial potential. But it also feels like something that pops its head up every so often, attempting to ride the latest significant trend.” (Jon Y, 25:00)
On the future of embedded memories and industry conservatism:
“The semiconductor industry is pretty conservative and they're not apt to try new weird stuff until they have to.” (Jon Y, 28:40)
On AI as a driver for next-gen memory adoption:
“The current major hope for these embedded non volatile memories technologies is AI. Since embedded memories are so close to the logic, there is some potential to evade the von Neumann bottleneck...” (Jon Y, 29:00)

Timestamps for Important Segments

00:02 — Introduction to embedded memory and why it matters
01:05 — Historical context: Divergence of DRAM, Flash, and logic nodes
02:00 — Embedded SRAM takes off in CPUs
05:00 — Embedded DRAM: Pros, cons, and market trends
07:00 — Embedded Flash/NOR: Technical overview and usage
09:10 — Scaling issues for eFlash at 28nm
12:00 — Introduction to MRAM and memory cell mechanics
17:00 — Advantages of STT MRAM over SRAM
18:30 — MRAM production/manufacturing hurdles
22:00 — How ReRAM works and its elegant filament switching concept
25:00 — Commercial potential and spotty adoption of ReRAM
28:20 — Industry conservatism, MCUs and eFlash resilience
29:00 — AI as a catalyst for next-generation embedded memories
30:40 — Technology readiness vs. use-case maturity

Structure & Flow

The episode masterfully intertwines engineering fundamentals, historical shifts, and market realities, always explained in Jon Y’s conversational and lightly irreverent style. Listeners are taken from a big-picture look at why embedded memories matter, deep into the weeds of how and why SRAM, eDRAM, and eFlash have occupied (and are ceding) their positions, and then out again to explore the market dynamics shaping the Next Big Thing in memory for chips.

Summary Takeaways

SRAM is space-hungry and near process scaling limits.
eDRAM once looked promising but has faded due to cost/complexity.
eFlash (NOR) has powered the MCU segment, especially in harsh environments, but 28nm is its practical limit.
Next-gen nonvolatile candidates—STT MRAM and ReRAM—bring speed, density and non-volatility, but their challenges (manufacturing, variability, endurance) and a conservative market mean broad adoption will take time.
AI and advanced edge computing may be the tipping point that drives adoption of these advanced embedded memories.

This episode is both a technical deep dive and a market primer for anyone interested in the forces shaping next-generation semiconductor memories.


Transcript

[00:02]
A
As chips speed up and get more capable, they must also fetch more data and get it faster. Most of the time that means going off chip to some external memory module. It slows things down and uses energy. One alternative is to embed some memory right alongside the logic circuits on the chip. Embedded memories. For years, two types of embedded memories dominated. But things are changing. In today's video, we take a look at those plus some of the next generation memories coming down the pike.
Forty years ago in the 1980s, there were three big categories of discrete standalone memory: the SRAMs, DRAMs and Flash EEPROMs. But as time and technology demands progressed, these three changed, like high school friends after graduation. In the 1990s, the process nodes used to make DRAM and flash memories greatly diverged from each other, as well as from the nodes used for making logic chips. DRAMs transitioned from using flat planar capacitors to vertical ones. And today the dominant DRAM nodes use these tall and skinny capacitors stacked on top of or below their access transistors. And as for the EEPROMs, they evolved into the flash memories NOR and NAND, with planar NAND evolving yet again into the lasagna-like 3D NAND. Such vertically stacked NANDs are some of the most scalable in the semiconductor world. I did a video about it a while ago.
Now, as a standalone memory, SRAM hasn't had the same success as its two friends. Unlike DRAMs and EEPROMs, however, SRAMs can use only transistors to store bits, which lets us make it alongside the rest of the chip on an embedded basis without needing any additional masks. In the late 1980s, CPU makers started embedding SRAMs onto their chips as cache to store important data. It remains very significant and the single largest embedded memory market. However, in recent years, SRAMs have found themselves on the ropes. For one thing, it's a thicc boy. The most widely used SRAM cell design uses six transistors.
That is a lot compared to DRAM, which famously is just one transistor and one capacitor. And that's a problem because transistors aren't getting any smaller nowadays. Fabs have optimized SRAM to such an extent that when they brag about their process nodes, they use SRAM density to do so. One of the few hard numbers that TSMC has publicly announced about their N2 process node is how it can stuff more SRAM onto the die. With CPUs and other systems on chips getting more advanced, you get situations where a surprisingly significant portion of certain chips is just embedded SRAM memory. Back in the 2000s, some high performance CPUs had as much as 70% of their whole dies being just SRAM.
So if SRAM is reaching its density limits, why not embed something that can be far denser? That is why some have turned to embedded DRAM or eDRAM. It is the same one transistor, one capacitor structure, just embedded onto the die. With that skinnier setup, we can stuff five to six times more eDRAM than SRAM onto the same space. eDRAM also uses significantly less power than SRAM. Even if you still have to periodically refresh them, like with commodity DRAMs, you use just a third of the power of SRAMs, not to mention the power saved from not going off chip. There are also integration benefits. Since we are less likely to get bad connections, bent pins or other mechanical failure points, etc., eDRAMs tend to be more reliable. Data transfers to and from memory have better latency.
But what are the downsides? Memory and logic process nodes are nowadays very different. So producing eDRAM adds maybe four to six masks to the fabrication process, which exposes your chip to yield risk and higher costs. The eDRAM market was once quite considerable, used for items like the Xbox 360. However, its momentum has sort of petered out in recent years with fewer industry products being made with it. However, there seems to be plenty of compute and memory research done in academia with it.
Also, like SRAM, eDRAM is volatile. Once the power goes out, everything is forgotten. Ideally, we want something non-volatile, something that can hold its data when the power goes off. So over time, vendors have embedded flash memories onto the chip: embedded flash or eFlash.
eFlash is a NOR-type memory. With NOR, we string together many special memory cells: planar transistors equipped with a floating gate. Electrons are compelled into that floating gate through an oxide, raising the transistor's threshold voltage. NOR arranges these cells in such a way that we can access them one at a time: random access, at the cost of less density. Its younger cousin NAND, on the other hand, networks its cells together in strings of 16 to 128, with each cell's source connected to its neighbor's drain. It lets us pack cells very closely together, but also means no random access. We can only manipulate data in blocks or pages. That is why we aren't getting embedded NAND anytime soon.
Programming and erasing eFlash's NOR-style networks require high voltages, something like 9 to 18 volts, to compel the electrons to go in and out of the floating gate. That's a problem because standard logic transistors run at about 1 volt. To protect these neighboring logic transistors from getting fried, you need deep isolation trenches or some kind of hardening. NAND requires even higher voltages than NOR to program and erase its long strings of cells, which is too high for the die anyway.
So eFlash being NOR means it cannot achieve the same density as NAND has. It also does not write as fast as SRAM or DRAM. Its cells also suffer the same endurance issues as discrete flash memories, breaking down over repeated write cycles. And same as eDRAM, there is a fabrication cost. It requires an additional six to eight mask steps, which can be more than eDRAM. You take on more yield risk.
Today, eFlash is most often used to store program code and data for these small but vital chips called microcontrollers or MCUs. These are basically computers on a single chip, though nowhere near as powerful as an Intel or AMD CPU. eFlash fit the bill for these systems because it boots fast, is power efficient and rewritable, and can survive the rough conditions that cars or other industrial devices often experience. The eFlash automotive MCU market is often cited as the second largest overall embedded memory market after SRAM, though you can also find them in edge AI and data center applications.
eFlash's most serious issue, however, is scaling. Largely speaking, 28nm is the end of the road scaling wise for eFlash. Scaling down eFlash means making smaller transistors and packing them closer together. This becomes a serious issue at 28nm, the last planar transistor node. There you have all the standard problems of shrink: loss of control over the gate, short channel effects and so on. Par for the course, and why the logic fabs switched to 3D FinFETs. But then there are the flash memory related issues too. The flash memory cells are now so physically small that their floating gates contain about 100 or so electrons for a threshold of 1 volt. It takes fewer electrons leaking to cause significant degradation, and with the tunnel oxide layers so thin now, that is way more likely.
These scaling problems are why the NAND makers switched to 3D NAND. You loosen the floating gates' technical requirements by resetting their sizes from 28nm to 40nm but then stack them vertically to achieve massive storage numbers. 3D NAND is made in a parallel manner and very cool, in some cases literally, but not a valid technical pathway for eFlash. Nor is it economically feasible for the chip designer to add what can be up to 10 additional masks to produce a wholly different transistor type onto the chip.
Without a valid successor to eFlash, OEMs of MCUs and such products might move back to discrete memories, perhaps using advanced packaging to put them together. So fabs and startups have suggested potential successors: the next generation memories. And there are a lot of next generation embedded memories out there. Let me cut it down to a few with serious backing by major foundries like TSMC and Samsung.
First up are the magnetoresistive RAMs or MRAMs. DRAMs and flash memories encode the bit by storing a charge. The MRAM, on the other hand, does it by manipulating electrical resistance levels. An MRAM cell is made up of two parts: an access transistor and the magnetic tunnel junction or MTJ. The latter is where the magic happens. The MTJ is a tiny sandwich of ferromagnets and non-magnetic layers. The simplest MTJ has two ferromagnets and a very thin insulator layer in between them. The top ferromagnet is referred to as the free layer. The ferromagnet on the bottom is called the reference or fixed or pinned layer. Both ferromagnets are usually made from an iron alloy like cobalt iron boron, and depending on the variant you may have several pinned layers. As for the very thin, maybe 1-2nm thick, insulator layer, that is most often made of magnesium oxide.
The MTJ works by having an external current orient the magnetization of the MTJ's free layer as compared to its reference or pinned layers. When the magnetic moments of both ferromagnet layers are parallel, then the whole MTJ will have low electrical resistance, so we can easily run a current through it. And when the magnetic moments of the two ferromagnets are not parallel to each other, or antiparallel, then the electrical resistance gets significantly higher. So in a way it's like twisting a faucet open or closed. We can map the two high or low resistance states to a bit. How is that done?
With MRAMs, we send small currents into the MTJ to try and discern its resistance state, comparing it against the middle point reference level to determine the final value.
The first MRAMs were introduced in the 1980s and are today known as conventional or field-switched MRAMs. These older memories used magnetic fields to write to the MTJ. That is how we set the free layer. This magnetic field was created by running a current through a wire, making this flip an indirect effect. Basically that was how we wrote data to the old ferrite core memories. For this mechanism to work, the magnetic field has to be strong enough to flip the MTJ's magnetic state. The problem was that as the MTJ got smaller, it got easier for that bit to accidentally flip due to thermal noise. Hard disk drives suffer this problem too. They have something called the superparamagnetic limit, where the grains in a bit get so small that thermal energy can flip them anyway. The response by engineers has been to raise the flip limit. But the downside of doing that is that we need a more powerful magnetic field to switch it when we actually need to do so. When the MTJ is small, controlling that field is harder to do.
In the 1990s, the field-switched MRAM was replaced by a new variant known as spin transfer torque MRAM, or STT MRAM. Instead of using a magnetic field, we send a special current through the MTJ. Such electrons flood into the free layer and flip it directly.
STT MRAM is a very promising memory technology. To start, it's non-volatile, so the data stays even after the power goes off. It does not need to be continually refreshed. The area savings are also very significant. It is basically just the MTJ and an access transistor, a very DRAM-ish setup. At the 5nm node, we get like a 43% reduction as compared to SRAM. Despite being a non-volatile memory, you can write to it in less than 10 ns, which is DRAM-like speeds and far faster than flash memory's 20 to 100 microseconds.
And unlike flash memory, the cells have very good endurance.
The biggest challenge involves things on the manufacturing side. The technology is technically CMOS compatible, but fabbing the 15 to 20 various metal and dielectric stacks that make up the MTJ is challenging. In particular, the insulating oxide barrier between the free and pinned layers needs to be about 1-2nm wide. Common issues often happen during the etch or post-etch process, where oxygen impinges into the insulator layer to create an effect called bird's beaking. There can also be issues with the bottom electrode contact, which connects the MTJ to the metal lines. Roughness in that contact can make the MTJ's layers rough too, which causes the free layer to magnetically sync with the reference layer, thus making it harder to discern the actual resistance level of the MTJ. Beyond that, there are scaling issues at the 3 to 5 nanometer class nodes. The STT MRAM gets so small that we need a decently strong current to properly write to it and avoid thermally induced bit flips. But advanced node transistors are so small and delicate that such a current cannot be easily delivered.
The other major eFlash replacement is resistive RAM or ReRAM or RRAM. I'm going to say ReRAM. ReRAM is another non-volatile memory that stores a bit using either a high or low resistive state. Yes, so a lot like MRAM. However, the way in which they go about doing that is very different and kind of fun. There are a variety of ReRAM cells, but the most commonly used one is the filament-based ReRAM. It too is a sandwich of a metal oxide insulator layer between two metal electrodes. The oxide might be of elements like hafnium, tantalum or titanium, but research into more exotic things like 2D materials is ongoing. The electrodes can be titanium, platinum, or something else.
The ReRAM cell switches between high and low resistive states by creating (set) or destroying a small conductive filament, maybe as small as 10x10 nanometers, bridging the two electrodes. That little filament is essentially a wire through the naturally insulating dielectric. We set or reset the filament by applying a voltage or current signal to the electrodes. Very elegant, very simple concept. The concept reminds me of another memory called phase change RAM, which uses heat to switch a chalcogenide glass between an amorphous or crystalline phase. In this case, ReRAM does not require the phase change.
Producing embedded ReRAM requires fewer mask steps than other embedded non-volatile memories. It can scale down to advanced nodes fairly well. It uses less energy, is non-volatile, and reads and writes very quickly. On the other hand, there are some variability and endurance issues. The set and reset processes appear to be inherently random, leading to inconsistent behaviors. Not what you want in a semiconductor technology. And I do wonder how many times we can make and break the filaments before it starts to exhibit weird behaviors.
Broadly speaking, the technology has commercial potential. But it also feels like something that pops its head up every so often, attempting to ride the latest significant trend. Between 2005 and 2015, ReRAM gained serious traction as a potential successor to 2D NAND, until the rise of 3D NAND closed the door on that. There were proposals to do stacked ReRAM, vertical and horizontal, but those failed to compete. A few companies offer this technology today. There is one called Weebit Nano from Israel that shows up a lot in the literature. They've been around for over 10 years, licensing ReRAM IP to customers for end user products. TSMC offers it as an option for customers too. They have published a few papers on the technology, though far fewer than what they have on MRAM. It seems like TSMC and Samsung are positioning both technologies as potential successors to eFlash, especially in the automotive MCU space.
There are others, like the aforementioned phase change RAM and ferroelectrics, that might have a shot, but I think STT MRAM and ReRAM are the leaders.
Despite both TSMC and Samsung positioning STT MRAM or ReRAM, eFlash remains quite resilient. Why? eFlash is a tried and true solution in a space where reliability matters more than raw performance. Most MCUs are still made using trailing edge nodes like 65nm, though this is starting to change. And as we laid out throughout this video, the next generation contenders are not exactly free of trade-offs. The semiconductor industry is pretty conservative and they're not apt to try new weird stuff until they have to.
The current major hope for these embedded non-volatile memories technologies is AI. Since embedded memories are so close to the logic, there is some potential to evade the von Neumann bottleneck, and that seems to be the case with STT MRAM. Another option is for doing AI inference on the device at low power and fast latency, maybe even using neuromorphic principles to do so. This is more for ReRAM. In both cases the technology seems to have gotten ahead of the use case, though we shall see if the end user markets can get on board.
Alright everyone, that's it for tonight. Thanks for watching. Subscribe to the channel, sign up for the Patreon and I'll see you guys next time.