Transcript
A (0:02)
There has been a lot of talk recently about building AI data centers. Massive data centers. Recall that famous one posted by Zuck, superimposed over Manhattan? The thought of these monstrosities sucking up water and power for the sake of ChatGPT tokens gets some people very worried. What if we could put all of that in a shoebox? That's the pitch from a young deep tech startup called Snowcap Compute, which I met with during my recent visit to the United States. They've been working on something that promises minuscule heat dissipation and speeds in the hundreds of gigahertz, and it depends on superconductors. It sounds like sci-fi, but the technology has a heritage stretching back to IBM in the 1960s. I must admit that this one is rather heavy, but I can't stop thinking about it. In this video: RSFQ, RQL, and Snowcap's superconducting computer. The most famous superconducting computer project by far was IBM's, which lasted for over a decade. I discussed this project in a prior video. It relied on a device called the Josephson junction. These are basically sandwiches: two layers of superconducting material, a lead-based alloy in IBM's case, with a layer of insulating material in between. These work kind of like transistors. Transistors use voltage levels to represent a binary 1 or 0. You apply a high enough voltage to the gate and the transistor produces a high voltage level that signifies a 1. If the applied voltage falls below the threshold voltage, then you get a low voltage level signifying a 0. For this reason we call the transistor a voltage-controlled device. Josephson junctions can sort of mimic this behavior. Normally a current in the junction is impeded by the insulating material. There's resistance, meaning a non-zero voltage. This can be mapped to a 1. But when the junction goes below its critical thresholds in temperature, magnetic field, and current density, the junction enters a superconducting phase.
Then that current can quantum mechanically tunnel through the junction's insulating layer. It flows without any resistance, meaning zero voltage. This can be mapped to a zero. By raising the current density again beyond the critical threshold, superconductivity collapses and the junction is back in its resistive state. Since it's the current doing this, we call the junction a current-controlled device. IBM's efforts tried to get the Josephson junction to mimic the voltage level behavior of a transistor. They called it Josephson latching logic, after the junction's tendency to stay on, or latch, once it enters a resistive phase. IBM's project wasn't vaporware, but it suffered three fundamental issues that crippled its chances of beating CMOS. First, IBM struggled with manufacturing. The junction's insulating layers can only be a few nanometers thick at best, and IBM struggled to consistently achieve that with the technology available at the time. Moreover, because IBM made the junctions from lead-based superconducting materials, they gradually degraded and lost their electrical properties as they repeatedly cycled between temperatures of 4.2 and 300 Kelvin. This made their switching inconsistent. IBM worked hard and made strides, but they never reached the desired stability before the project ended. A second issue was that when the junction was in resistive mode, it dissipated heat, which had to be removed. Worse yet, the energy cost of removing that heat was significant. It takes far more energy to remove heat at 4.2 Kelvin than at 100 or 300 Kelvin. Cooling systems, whether liquid helium baths or the special fridges called cryocoolers, are inefficient at that temperature. It can take up to 300 watts of wall power to remove a single watt of heat. True, the switches dissipate a thousandth of the heat energy that CMOS transistors do. But with tens of thousands of junctions, all that heat starts to add up.
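To see how that 300-to-1 cooling penalty interacts with the junctions' efficiency edge, here's a quick back-of-the-envelope sketch in Python using the two figures above (a thousandth of the switching energy, 300 watts of wall power per watt removed at 4.2 Kelvin). The CMOS energy value is an arbitrary placeholder; only the ratios matter.

```python
# Back-of-the-envelope: does a 1000x more efficient switch survive
# a 300x cooling penalty? (Ratios from the video; E_CMOS is an
# arbitrary placeholder unit.)

E_CMOS = 1.0            # switching energy of a CMOS transistor (arbitrary units)
E_JJ = E_CMOS / 1000    # Josephson junction dissipates ~1/1000th of that
COOLING_OVERHEAD = 300  # watts of wall power per watt removed at 4.2 K

# Every joule dissipated at 4.2 K costs ~300 joules at the wall.
E_JJ_WALL = E_JJ * COOLING_OVERHEAD

print(E_JJ_WALL / E_CMOS)  # 0.3 -> still a ~3x net win per switch
```

So even after paying the refrigerator, each switch still comes out ahead on paper. The catch, as the next section shows, is that this isn't the only cost.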
Finally, the Josephson junction's switching speeds were fundamentally limited. Why? Well, this one is a bit complicated, so bear with me. It has to do with the latching behavior we just talked about. Imagine a junction in superconducting mode, meaning that a current flows through it without voltage. The current is held at a value just below the critical threshold. Now we raise the current just a little bit higher to breach the critical threshold. Superconductivity collapses and a voltage develops. As you'd expect, this happens very fast, in only a few picoseconds. But per the latching behavior I just mentioned, the junction remains resistive even if you bring the current back down to where it was before, just below the critical threshold. In other words, the junction stays on, like a room's light switch after you hit it. The only way to return to a superconducting state is to bring the current down to a very low value: a reset. In essence, you can't run calculations with latched junctions. So IBM used radio frequency, or RF, signals to do this reset. RF signals are high frequency alternating currents, and IBM used them to globally reset latched junctions across the whole chip. But then IBM ran into a serious problem in the form of a phenomenon called punch-through. The best way to explain this superconducting silliness is with a pair of swinging doors, like those at the entrance of the saloons in those Wild West Hollywood movies. Imagine a pair of those swinging saloon doors. They're closed right now (a 1), and you want to open them (a 0). You slowly push them open and they open with a satisfying click. But imagine opening them faster and faster, until you are pushing them with so much force that the momentum causes them to punch through the closed position and swing out the other direction. In other words, an overshoot. That is broadly what the punch-through effect is.
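The latch-and-reset behavior driving all of this can be sketched as a toy state machine in Python. The threshold numbers here are arbitrary placeholders, not real device parameters; the point is the hysteresis: once resistive, the junction stays resistive until the current drops to nearly zero.

```python
# Toy model of a latching Josephson junction (illustrative only;
# thresholds are arbitrary, not real device parameters).
class LatchingJunction:
    I_CRITICAL = 1.0   # above this, superconductivity collapses
    I_RESET = 0.1      # must drop below this to reset the latch

    def __init__(self):
        self.resistive = False  # start in the superconducting state

    def drive(self, current):
        """Apply a bias current and return the logical output."""
        if current > self.I_CRITICAL:
            self.resistive = True    # collapse: a voltage appears
        elif current < self.I_RESET:
            self.resistive = False   # reset: back to superconducting
        # anywhere in between: state is unchanged -> the latch
        return 1 if self.resistive else 0

jj = LatchingJunction()
print(jj.drive(0.9))   # 0: below critical, still superconducting
print(jj.drive(1.1))   # 1: breached the threshold, now resistive
print(jj.drive(0.9))   # 1: current lowered again, but the junction latched
print(jj.drive(0.05))  # 0: only a near-zero reset current unlatches it
```

That last line is the reset step, and it's exactly that reset that IBM's RF scheme had to perform globally, which is where punch-through bites.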
When the resetting RF signal frequency rises to about 1 GHz, the junctions will not reset properly and may overshoot into the opposite state. The next calculation fails and the program crashes. These issues undermined IBM's claims of superconductor computing being faster and more energy efficient than CMOS, which back then was getting faster every year thanks to Moore's law. IBM finally ended its Josephson technology project in 1983. It was a high profile setback, but work continued. The Japanese began their own collaborative research effort into the field as part of a broader exploration of next generation computer technology. Most significantly for this video, the Japanese introduced a new class of Josephson junctions made with niobium for the superconductor and aluminium oxide for the insulator. These were far more manufacturable than IBM's lead-based stuff. Unfortunately, the Japanese suffered the same latching issues as IBM did, and it eventually ended their efforts too. But it is worth acknowledging that they did ship an actual computer. In 1991 they unveiled the ETL-JC1, a computer equipped with 22,000 Josephson junctions. It was the first such computer capable of running programs from RAM. The next major development came out of the Soviet Union. IBM's core flaw was trying to replicate the transistor's voltage level behavior. Turns out that a superconducting computer cannot out-CMOS CMOS. Something entirely different was needed. When a Josephson junction switches from a superconducting to a resistive state, it emits a single traveling voltage pulse called a single flux quantum, or SFQ. This pulse is very short, just about 1 to 2 picoseconds, but it is consistent. Seems useful. Work on turning this quirky phenomenon into a logic scheme dates back to 1973 at Bell Labs, with work referred to as flux shuttles. The Japanese tried something similar, calling their efforts phase mode logic.
Then in the mid-1980s, three scientists at Moscow State University and the Institute of Radioengineering and Electronics, Likharev, Mukhanov and Semenov, apologies for messing up the pronunciations, defined a convention for turning the pulses into binary compute. Their basic convention is this: the arrival of an SFQ pulse at some device terminal during some period of time means a binary one, and the absence of such a pulse means zero. Devices like flip flops, gates, and more can be built using superconducting loops with Josephson junctions and inductors in them. Roughly speaking, SFQ pulses can be stored inside those superconducting loops, which helps a device remember its state. As for how the SFQ pulses travel and propagate, there are superconducting microstrip lines, which the pulses can traverse at nearly the speed of light. In 1985 and 1986, Likharev and his cohorts at MSU worked to implement this logic, building a family of devices and working their way up to a computer device. They called their logic RSFQ. The R originally meant resistive, for resistive single flux quantum, because the first implementation connected the logic gates together with resistors. The first samples were fabricated by the summer of 1986, and to the team's surprise, they worked on the first try. Better yet, RSFQ demonstrated clock frequencies of up to 30 GHz, a remarkable improvement over IBM's work. However, those resistors caused performance issues, so the MSU team replaced them with either inductors or additional Josephson junctions. It worked, but made the old acronym irrelevant, so they changed it to mean rapid single flux quantum. In 1991, the Moscow State University team moved to the United States, perhaps due to the collapse of the Soviet Union, and joined a New York based startup called HYPRES, founded by a Libyan American named Sadeg Mustafa Faris.
HYPRES tried to commercialize some of the IBM project's original technology, like a Josephson junction based sampling oscilloscope marketed to NASA for inspecting tiles on the Space Shuttle. But after the Moscow team joined, they sought to bring RSFQ based integrated circuits to the market. At the same time, they worked on refining the manufacturing processes for the superconductor chips, which can be pretty challenging. Efforts at HYPRES, Nagoya University and other places continued to develop RSFQ circuits. US government funding helped too. In 1997 there was the Hybrid Technology Multi-Threaded, or HTMT, project, which ambitiously sought to break the petaflop barrier using a combo of semiconductor, superconductor and optical technologies. By the 2000s, circuits had grown to tens of thousands of junctions. High speed devices like a 144 GHz flip flop occasionally made the news. One paper discussed a static digital divider component that ran at a staggering 770 GHz. The big problem, however, was power consumption. If you recall, Josephson junctions emit SFQ pulses when they tip over the critical threshold. We want this to happen when a junction is hit with another small SFQ pulse. So to prime them for this, we run what is called a bias current through them, held just under the critical threshold. Imagine a tub of water filled to the very brim but not overflowing. The pulse is that extra little splash of water that causes the tub to tip over. Distributing this bias current in RSFQ meant having bias resistors strung out on a separate voltage rail parallel to the junctions themselves. And as it turns out, these resistors dissipate power even when just sitting around doing nothing. This static dissipation, as it is called, turns out to be 10 times greater than that of the active circuits themselves. One other issue is that the bias current scales with the number of junctions.
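That linear scaling is easy to see with a quick sketch. The numbers here are invented placeholders, not real RSFQ device parameters; the point is simply that the static resistor dissipation grows with every junction you add, and it burns even when the chip is idle.

```python
# Static dissipation of RSFQ bias resistors: each junction needs a
# bias resistor that burns power constantly. (Placeholder numbers,
# not real device parameters; only the linear scaling matters.)
I_BIAS = 100e-6   # bias current per junction: 100 microamps (assumed)
V_RAIL = 2.6e-3   # bias rail voltage: 2.6 millivolts (assumed)

def static_power(n_junctions):
    """Power burned in the bias network, in watts, even at idle."""
    return n_junctions * I_BIAS * V_RAIL

print(static_power(10_000))     # watts at 4 K, before cooling overhead
print(static_power(1_000_000))  # 100x the junctions, 100x the burn
```

Remember that every one of those watts is dissipated at liquid helium temperatures, so multiply by the cooling overhead from earlier to get the wall-power bill.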
So a circuit with even just 10,000 junctions, not a lot, can end up dissipating a lot of power and generating a lot of heat. Imagine having millions or billions. By 2000 the HTMT project had made progress, but major technical gaps remained in producing a petaflop computer. So the project wound down, and quantum computing took the stage as the alt-computing technology du jour. Okay, now that we are caught up to Snowcap, we can start landing this plane. Over the years, several extensions to RSFQ logic have been proposed. Notables include energy-efficient single flux quantum technology (ERSFQ), efficient SFQ (eSFQ), and low-voltage RSFQ (LV-RSFQ). But the one that seems most promising, and what Snowcap is working on, is reciprocal quantum logic, or RQL. RQL was proposed by a team at the defense contractor Northrop Grumman, led by Quentin Herr and Anna Herr. Northrop had participated in the original HTMT project, and in the late 2000s they proposed a twist on the original RSFQ concept. Per a 2011 interview with IEEE Spectrum, the Baltimore based team began this journey by tearing out the network of resistors delivering bias current to the junctions. This created knock-on effects which they tried to fix, until suddenly they had something new. The key difference is that where RSFQ counts a single SFQ pulse as a 1, RQL counts a pair of pulses in a cycle: first a positive voltage pulse that is stored and routed as the one, then a second negative voltage pulse that resets the gate after it does its logic operation. Another big thing is that RQL relies on AC power instead of DC power to distribute the bias current. This lets you replace the resistors with transformers, the passive component, not the toy or AI algorithm. AC current flowing through a transformer does not dissipate energy as heat. That current can then be terminated in a resistor at room temperature, rather than a resistor at superconducting temps.
Terminating the bias current at room temperature radically reduces static power dissipation, making for a much more power efficient system. RQL also uses the AC power signal as a clock. This removes RSFQ's prior need for a clock signal distribution network, but does make circuit design a bit trickier. Northrop Grumman is still working on the technology 10 or so years later, but other than the odd paper or patent, they haven't published much. Evidently they can't give it the sort of youthful energy that a startup can. The folks at Snowcap have brought on the aforementioned Anna and Quentin Herr as their CSO and CTO. They evidently believe that the technology can produce the next superconducting AI data center. One major step the Snowcap team has taken, and it took them two years of effort at imec, was making the manufacturing CMOS friendly. This involved changing the materials inside the junctions so that fabs can produce them. People have been making Josephson junctions out of niobium since the Japanese did it in the 1980s. But pure niobium can diffuse into and contaminate silicon and oxides. No sane fab will let you introduce niobium into their clean room. So the team at imec ended up replacing the pure niobium with niobium titanium nitride, which is less reactive and more heat resistant. As for the junction's insulating layer, aluminium oxide was replaced with amorphous silicon, which is more fab friendly and apparently also allowed the barriers to get slightly thicker. imec also helped engineer out the need for a bulky transformer component near every circuit element by creating a resonant circuit of inductors and capacitors to distribute AC power to all the junctions. Capacitors are far easier for a fab to produce than transformers and their bulky coils. The result is a streamlined, more scalable Josephson junction system that does not have to be made in a superconductor specific fab. A CMOS compatible fab might be able to do it.
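For a rough sense of how a resonant network like that gets sized, here's a quick sketch of the standard LC resonance formula. The component values and the multi-gigahertz target are invented for illustration; they are not Snowcap's or imec's actual design parameters.

```python
# Sizing a toy LC resonator for AC bias distribution.
# f = 1 / (2 * pi * sqrt(L * C)); values below are invented
# placeholders, not the actual Snowcap/imec design.
import math

L = 100e-12  # inductance: 100 picohenries (assumed)
C = 5e-12    # capacitance: 5 picofarads (assumed)

f_resonant = 1 / (2 * math.pi * math.sqrt(L * C))
print(f"{f_resonant / 1e9:.1f} GHz")  # ~7.1 GHz with these values
```

The appeal is that inductors and capacitors are ordinary planar structures a fab can pattern, whereas a transformer's coupled coils are bulky and awkward to integrate next to every gate.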
One of the major issues that must be figured out before RQL can be practically used for any sort of AI is memory. Quantum computers struggle to implement memory too, but those guys can sidestep it by focusing on HPC. Snowcap wants to do AI, and that means having a lot of memory. There is no hiding that this is an obstacle, and I give them credit for not dancing around it. Snowcap and imec did produce a good Josephson junction based SRAM, but SRAM lacks DRAM's density. Fabricated with a 14nm node, it gives you only hundreds of gigabytes, which is short of what you need for leading edge LLMs. There is also consideration of cryo-DRAM, which is exactly as cool as it sounds: DRAM that works when you freeze it. DRAM can technically perform at those temperatures, albeit with somewhat compromised bandwidth. The bigger issue is a business one: memory makers won't make any cryo-DRAM until the DRAM standards consortium JEDEC gets on board first, and that will take time. Snowcap's imec write-up did hint at a possible compromise solution: a wired glass bridge between the superconducting part of the computer, which runs at 4 Kelvin, and a part that is just chilled-down normal silicon DRAM, running at 77 Kelvin. This might work better than cryo-DRAM, since those memory chips will likely dissipate heat. But the Snowcap people have pointed out that the SRAM will probably be enough to demonstrate feasibility. Several more issues remain outstanding. A big one is that a downside of using multi-phase AC power is that you need to keep those phases aligned across the whole chip. Even the slightest picosecond mismatch can cause problems. Another major outstanding question is scale. RQL requires very large components, and they're hard to shrink. Distributing the AC power still requires high frequency power splitters to divide up the AC signal for all the junctions, and they chonk.
For example, the various splitters for an 8-bit adder took up 2.5 times more space than the adder itself. Not to mention that the Josephson junctions themselves are difficult to shrink beyond the micron size range. As they get physically smaller, noise issues become more critical and patterning is harder to consistently achieve. Can it all scale down to nanometers? That remains outstanding. Maybe it doesn't matter. Maybe we just stack boards together in densities that silicon CMOS can only dream of. The imec write-up noted that 20 exaflops of compute can be put into a small shoebox, 20 x 20 x 12 cm. Over the past 50 years, millions of man hours have gone into trying to keep computers from overheating. The sheer amount of energy we want to use to cool these data centers is daunting. What if something comes along that changes the curve, like how CMOS itself once did? In the 1970s and 1980s, the semiconductor industry used less efficient bipolar transistors, until CMOS's power savings became impossible to ignore. It's not hard to believe that CMOS, too, can be replaced in the future. Now, I don't know if superconductors are that thing, nor am I sure that Snowcap's technology can actually shrink entire data centers into shoeboxes. And even if it did, silicon CMOS will still fulfill much of the world's compute needs. But I reckon it can make a dent in the world's energy needs in this part of the industry, if it works. In the end, the proof is in the pudding. They have to pull it off. I hope to keep an eye on it as it progresses along to a demo. All right, everyone, that's it for tonight. Thanks for watching. Subscribe to the channel, sign up for the Patreon, and I'll see you guys next time.
