Transcript
A (0:02)
Nvidia's Blackwell Ultra has 208 billion transistors across two dies. Those dies are made in a titanic fab, using an intricate process with hundreds of steps dealing in the dozens of nanometers. So, dumb question: how do we know it all worked? How can we be sure that the thing being made actually does its job and isn't the other guy? This is why the multi-billion-dollar automated test equipment, or ATE, industry exists, an underrated part of the semiconductor ecosystem and the subject of today's video. This video is brought to you by the Asianometry Patreon.

Going as far back as the first semiconductor transistors, we've needed tests to qualify them. Back in the 1950s, producers ran 10 to 20 tests to ensure that a transistor met the required specs. These tests were delightfully crude. Harry Sello, who learned the trade at Fairchild Semiconductor, recalled running one such test in an oral history for the Computer History Museum: All we had to do at that time was to learn how to use a cathode ray oscilloscope made by Tektronix. You took two needles and you touched them down to the device. If the scope on the tester made certain patterns, you knew you had a good device. If it didn't make certain patterns, you knew you had a bad device.

That's just one test. Manufacturers generally use a whole battery of them, administered at key points in the production process. Determining which specific tests to run and when to run them makes up what we call a test strategy, a key part of the industry. In the mid-1950s, these tests were manually performed by rows of human operators, mostly girls. This was neither efficient nor did it scale with transistor production, and it soon became a problem for the semiconductor pioneer Texas Instruments.

In 1954, TI helped produce the first commercially successful transistor radio, the Regency TR-1. The production design used six germanium transistors. Two of them had to be 2N185 transistors of similar quality, matched up in pairs for good audio. Finding a matching 2N185 pair, however, was challenging. Semiconductor manufacturing in the 1950s was crude and subject to failure. One TI employee jokingly said of the time, "we could hardly make one alike." The TR-1 sold 100,000 units. Demand was high and growing. Were they really going to manually test all those transistors?

In 1958, Ed Millis and a small team at Texas Instruments put together the first prototype of what would be called the CAT, or Centralized Automatic Tester. A person first manually mounts the transistors onto the CAT, which then automatically cycles them through about 18 tests. Each test result is recorded as a simple pass/fail, or go/no-go. The transistors are then sorted into 10 bins. Is there a Harry Potter joke I can be making here? Anyway, those in the same bin are considered similar enough to be paired in a radio, and those who didn't get sorted are failures, like me.

Millis recalled that the thing looked like a piece of junk, but by the end of 1958 this junk kitty was testing a cool 2,000 transistors an hour. Just in case you were wondering, it would take this poor CAT some 4,570 years to test all 80 billion transistors on an Nvidia H200 one by one. Texas Instruments said in an industry magazine article that it planned to improve and eventually sell its CAT tool. TI later produced an improved version called the SuperCAT, which could run some 6,000 to 9,000 transistors per hour through a gamut of 40-plus tests.
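As an aside, the CAT's bin-and-match scheme is easy to sketch in code. This is a minimal, hypothetical illustration, not TI's actual procedure: the measured gain parameter, spec limits, and bin widths are all made up.

```python
# Minimal sketch of go/no-go binning and pair matching, in the spirit of
# the CAT. The measured parameter and all limits here are hypothetical.

def grade(gain: float) -> int | None:
    """Return a bin index 0-9 for a passing transistor, None for a reject."""
    if not (20.0 <= gain <= 120.0):            # go/no-go: out of spec fails
        return None
    return min(9, int((gain - 20.0) / 10.0))   # 10 bins, 10 gain-units wide

def bin_and_pair(gains: list[float]) -> list[tuple[float, float]]:
    bins: dict[int, list[float]] = {}
    for g in gains:
        b = grade(g)
        if b is not None:
            bins.setdefault(b, []).append(g)
    pairs = []
    for members in bins.values():     # devices in the same bin are "similar
        while len(members) >= 2:      # enough" to be matched in one radio
            pairs.append((members.pop(), members.pop()))
    return pairs

print(bin_and_pair([31.2, 34.8, 55.0, 57.1, 140.0, 88.3]))
# -> two matched pairs; 140.0 fails go/no-go, 88.3 waits for a bin-mate
```

Devices that fail spec outright are rejects; passing devices get matched with a bin-mate, which is roughly how the TR-1 got its paired 2N185s.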
Other semiconductor companies started selling their in-house tools too. Signetics sold a product called the Model 1420. Fairchild sold testers as well. But the industry really took off with the first full merchant test system companies, led by a pioneer in Boston.

It's the late 1950s, and Nick DeWolf was trying to figure out what to do next. After graduating from MIT, DeWolf worked at General Electric putting together televisions. He recalled assembling them with no idea whether they worked until they were tested. He got bored of that and quit. He then joined a semiconductor company called Transitron in Boston as their chief engineer. Most of his work involved engineering and producing millions of discrete transistors. DeWolf joined despite knowing next to nothing about semiconductors. But that's okay, because as I said, in the 1950s nobody knew anything. In his oral history for the Computer History Museum, DeWolf said: Our key product was the germanium diode. Nothing controllable, sort of a hunk of semiconductor with a needle that touched it. We didn't understand the principles. It just happened to work. And then we wound up later on building the gold-bonded diode, which had a junction that was a little more controllable. But essentially the challenge was how to push a needle into a piece of silicon and make a connection.

Back then there were no fabs, no tin-spitting lithography machines with robots picking up and dropping off wafers. It was just a warehouse with rows of girls sticking needles into rocks. This lack of sophistication made test very important. DeWolf said: It was very obvious to any of us in that world that the testing was very much at the heart of the economics of the whole plant, because the yield rates were often so low. You put an object into a testing system that decided whether it was worth 20 cents or $5. Wow. Therefore, that test needed to be reliable. And the tests were intense, the equivalent of a Navy SEALs course for diodes. They ran voltages and currents through the diodes, put them into furnaces and humidity chambers, and just beat the dickens out of them electrically.

DeWolf eventually left Transitron, in part because he did not feel appreciated by the company's owners. He considered starting several businesses before settling on semiconductor testing. He recruited his former MIT classmate and friend Alex d'Arbeloff. Alex handled the marketing and administration, and DeWolf the engineering. They famously met in a class because both their last names started with the letter D. The two didn't know each other that well at the start, but DeWolf just had a feeling that they would vibe. And they did. They wrote out a business plan, raised VC money from one of DEC's investors too, and founded Teradyne in 1960.

Teradyne's first product was a diode tester, born of DeWolf's time making diodes at Transitron. Roughly speaking, a diode is a one-way valve for electricity. When you send a current through it in the forward direction, you want to see a characteristic voltage drop. When you try to send a current through the diode in the opposite direction, you want to see it blocked, save a little bit of leakage. To test this, Transitron had an instrument called a forward-inverse two-meter rig. As the name implies, it had two analog meters. The operator ran the diode through it and reviewed the measurements to see whether they were within spec. This two-meter rig was an old-school device made for a laboratory.
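For concreteness, here's a minimal sketch of what that forward/reverse check amounts to, reduced to a single pass/fail answer, which, as we're about to see, is exactly the reduction DeWolf landed on. The spec limits are hypothetical, loosely in the range of a small germanium diode, not Transitron's actual numbers.

```python
# Minimal sketch of a go/no-go diode test: forward drop within spec,
# reverse leakage below a ceiling. All limits here are hypothetical.

def diode_go_no_go(v_forward: float, i_reverse_leak: float) -> bool:
    """Return True (go) if the diode passes both checks, False (no-go)."""
    FORWARD_DROP_RANGE = (0.2, 0.4)   # volts, at rated forward current
    MAX_REVERSE_LEAK = 50e-6          # amps, at rated reverse voltage
    lo, hi = FORWARD_DROP_RANGE
    return lo <= v_forward <= hi and i_reverse_leak <= MAX_REVERSE_LEAK

print(diode_go_no_go(0.31, 12e-6))   # True: within spec -> "go" bin
print(diode_go_no_go(0.31, 900e-6))  # False: leaky in reverse -> "no-go"
```

The design insight lives in the return type: the operator never interprets two meter readings, they just see go or no-go.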
The rig was very temperamental, requiring a technician to constantly hover over it, making tweaks and adjustments to keep it working. DeWolf realized this was not practical. He had the genius idea to remove the meters and have the instrument output a simple binary result: go or no-go. What DeWolf saw, and few others did, was that semiconductors were changing from laboratory curiosities into commodities made in factories at high volume. Test systems had to adjust. They couldn't be delicate instruments for laboratories anymore; they had to be sturdy industrial tools for factories. In the lab, accuracy is paramount. In the factory, productivity and uptime matter far more. So DeWolf designed the tool to be smaller, fanless, and reliable. He also put thought into how the tool might make its human operators more productive. An example of such thoughtfulness was the tester's diode clip. It had magnets, so an operator could just sort of throw the diode at the clip and it would automatically latch on. Then, after the test, they could flip it into the good or bad bin with a flick.

Teradyne's diode tester was a breakthrough conceptually, but it didn't sell that well. What actually kept the company afloat in its early days was a tester for a very special client. In the 1950s, there was a company called Allen-Bradley, which made factory automation equipment. Nick DeWolf recalled that they were an old-fashioned, stodgy, solid company: very conservative and tough to please. They owned a factory that produced carbon resistors. These are a common discrete component used to limit current flow in circuits. They're simple, basically just graphite wrapped in a ceramic core, but Allen-Bradley made millions of them each day. The factory had these batches of carbon soup, mud essentially, which got poured into molding machines. The factory's issue was that the different mud batches varied in quality, and they had no way of detecting a bad batch until after the resistors were made. So Teradyne made a simple go/no-go resistor tester that made it possible for Allen-Bradley to test and monitor batches as they were made. They sold dozens of these testers to Allen-Bradley, and the money they made tided the company over in its early days.

In the mid-1960s came the rise of the integrated circuit. And to quote your average LinkedIn thought-leadership post, integrated circuits changed everything. ICs also changed semiconductor testing. With discrete transistors, you could reach in and measure them, because they had wires going in and out. But ICs were so integrated that they did not grant you that physical access. So Teradyne had to go back to the drawing board and develop a tester for something that wasn't yet on the market. In 1963, they locked a few people in a room and designed the first computer-controlled IC tester, the Model J259. DeWolf liked to give his machines prime numbers. In this case, 259 isn't prime, it's divisible by 7 and 37, but a brochure misprint meant the number stuck.

How did it work? The customer starts by loading a test program into the system's computer via paper tape. The program contains test vectors, or patterns: a set of electrical stimuli that the tester pumps into the IC test subject, referred to as the device under test, through its pins. The device under test responds with a set of output signals, which the J259 collects and compares against what is expected. The J259 is often called the first true-blue automatic tester.
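That stimulus-capture-compare loop is still the conceptual core of ATE today. Here's a minimal sketch of it, with a function simulating a 2-input AND gate standing in for the physical device under test, and made-up test vectors:

```python
# Minimal sketch of the test-vector loop: drive stimuli into the device
# under test (DUT), capture its outputs, compare against expected values.
# The "DUT" here is just a function simulating a 2-input AND gate.

def dut_and_gate(a: int, b: int) -> int:
    return a & b  # stand-in for the physical device's response

# Each vector: (stimulus pins, expected response). Made up for illustration.
test_vectors = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]

def run_test(dut, vectors) -> bool:
    for stimulus, expected in vectors:
        response = dut(*stimulus)      # apply stimulus through the pins
        if response != expected:       # compare captured vs. expected
            print(f"FAIL at {stimulus}: got {response}, wanted {expected}")
            return False               # no-go: bin the part as a reject
    return True                        # go: every vector matched

print("PASS" if run_test(dut_and_gate, test_vectors) else "FAIL")
```

A real tester does this electrically, through the device's pins, at megahertz-to-gigahertz rates, but the pass/fail logic is the same.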
It was a monster hit that cemented Teradyne as the top merchant semiconductor test company. Powered by DEC's famous PDP-8 minicomputer, the tool and others like it made Teradyne DEC's single largest customer for a while.

In those days, Teradyne's top competition was the in-house test divisions of the vertically integrated semiconductor houses. IBM, for instance, had a massive one; in the 1970s, the giant had more engineers working on test equipment than Teradyne did. IBM, however, used those tools only for its own production. Several integrated houses did sell their internal test tools to customers. Fairchild, for instance, was a serious competitor with its own computer-controlled IC testers, the Series 4000 and 5000. Texas Instruments had its Automated Test System, or ATS 960, a testing rig powered by TI's own digital computer, which was also called the 960. And throughout the 1970s, various entrepreneurs launched their own startups. A few of the more serious challengers were Macrodata, nothing to do with the TV show Severance, and LTX. The latter was founded by a raft of former Teradyne senior staff who immediately started stealing customers from their former employer. DeWolf considered them a very serious threat. But ultimately, the most serious competition emerged in the 1980s from Japan.

In 1954, a company named Takeda Riken Industry was founded as a manufacturer of electronic measuring instruments. For the first 15 or so years of its existence, the company produced electrometers and other electronic counters. Then in the 1970s, it started producing memory IC testers like the T3 1031. In 1976, Takeda Riken collaborated with Japan's NTT labs to test its next-generation memory chips, and this led to their breakthrough tester, the Takeda Riken Advantest T3380. Advantest stood for "advanced test technology," a name that came up during a late-night drinking session in the United States. Launched in 1979, the T3380 turned heads for its remarkable speed: 100 MHz. This meant the tester could apply and sample digital signals at 100 million cycles per second. Everyone else was at about 30 to 40 megahertz. The American ATE companies had considered going that high but thought it impractical in real-world usage, considering the serious engineering problems that would have to be overcome. They turned out to be right: models with lower cycle rates were cheaper and sold better. But the T3380 nevertheless put Takeda Riken on the map.

Later, in 1985, they changed their name to Advantest. They wanted to make the company more Western-friendly, and too many Westerners thought Takeda Riken was "take da Riken." They were also close to the Japanese computing giant Fujitsu, which owned about 20% of their stock, a relationship that began with a financial bailout some years prior and continued for engineering knowledge-sharing reasons.

Driven by Japan's growing memory industry, Japanese tester market share grew from 30% in 1978 to 45% in 1985. Advantest and others like Ando made some inroads into the United States, their memory testers earning plaudits for their reliability. Teradyne fought back by improving product quality, sending employees to Japan to learn things like total quality management. They also participated in SEMATECH, the American government-sponsored cooperative. And they set up a sales office in Japan, which forced Advantest and Ando to defend their own turf, so to speak.
Teradyne tools also introduced innovative concepts like parallel testing, where one system tests multiple memory chips at the same time. This improved testing throughput, lowering the tool's total cost of ownership. These efforts largely succeeded, and by the end of the decade Teradyne and Advantest had cemented themselves as the industry's dominant players. Other significant but niche players included Hewlett-Packard, which built up a good digital logic business by servicing Motorola for the PowerPC; SPEA, which hails from Italy and is the big European player, especially for automotive and power electronics chip testers; the aforementioned LTX; as well as Credence and Schlumberger. The latter's test business used to be Fairchild Semiconductor's, before the oil equipment giant bought Fairchild.

Rising PC volumes helped push CPUs ahead of their analog and mixed-signal peers to become the face of the semiconductor industry. Driven by Moore's Law, these chips' exploding transistor counts granted them more capabilities, but also presented massive challenges for the test equipment companies. In the past, companies tested their chips by putting them through normal operating conditions. It's like testing all the water pipes in a building by going to every faucet in every bathroom and turning them on. They called this functional testing, as in: does it function as expected? This approach is sensible, and it lets the designer easily produce the test patterns by reusing work from the design process.

But Moore's Law made functional tests impractical. The number of test patterns grew exponentially until it became impossible to fully test even very simple circuits. For example, a simple 32-bit adder circuit, with just two operand inputs plus a carry-in, has 65 input bits and would need 2^65 test patterns to test exhaustively. A 1 GHz tester system would take over 1,000 years to get through them all. And that is just one circuit. Imagine the vast complexity inside even a 30-year-old CPU from the 1990s. So functional tests end up missing many non-trivial faults. In other words, the fault coverage is too low.

Even if the fault coverage were good, there were economic constraints. Testing a chip ensures quality and helps improve yield, which can in theory lead to more profits for the manufacturer. But this is an indirect effect. From the perspective of most companies, testing is a cost center: money spent without directly generating revenue. So an Intel CPU is free to grow at Moore's Law rates, but the cost of testing it definitely cannot, which puts the test equipment makers in a tough spot.

This motivated a change in testing philosophy, away from functional testing and toward structural, fault-model-based testing. Instead of asking whether the chip processes all inputs properly, we ask whether the chip's hardware contains physical defects. Designers write tests targeting specific faults: a gate permanently stuck at logic value 1, a gate that cannot switch fast enough, which would cause a timing issue, or lines bridged together that should not be. The method remains similar to before, however. We shift bits into the IC device under test. We then apply a clock cycle to launch an event inside the device and capture the response. We then shift those bits out, often while shifting the next set of bits in. This whole sequence is called a scan test pattern. The design needs some additional test logic inserted into it by the designer to do this, which is why you might also hear this called design-for-test, or DFT, structure.
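Here's a minimal sketch of that shift/capture/shift-out flow, with a toy three-flop scan chain, a toy combinational block, and a single modeled stuck-at-1 fault. Everything here is made up for illustration:

```python
# Minimal sketch of scan testing: shift a pattern into the scan chain,
# pulse the clock to capture the combinational logic's response, then
# shift the result out and compare. Toy circuit and fault, for illustration.

def logic(inputs):
    """Toy combinational block: 3 scan flops feed it, 3 flops capture it."""
    a, b, c = inputs
    return [a ^ b, b & c, a | c]

def logic_stuck_at_1(inputs):
    """Same block with a defect: the middle output net is stuck at 1."""
    out = logic(inputs)
    out[1] = 1            # the modeled stuck-at-1 fault
    return out

def scan_test(circuit, pattern):
    chain = list(pattern)      # shift in: the chain now holds the stimulus
    chain = circuit(chain)     # capture: one functional clock cycle
    return chain               # shift out: response leaves via scan-out

pattern = [1, 1, 0]            # chosen so the fault-free middle net is 0
good = scan_test(logic, pattern)
bad = scan_test(logic_stuck_at_1, pattern)
print(good, bad, "-> defect detected" if good != bad else "-> fault escaped")
```

This is the point of the fault model: the pattern is chosen so that a fault-free device and a defective one capture different bits, so comparing the shifted-out values flags the defect without exercising every possible input.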
Fault-model-based testing scales well and is still used today, sometimes alongside evolved versions of functional tests and other methods. The PC market slowed in the mid-1990s, but that demand was replaced by new markets like automotive and telecom. The telecom market was particularly hot. Companies were building optical fiber networks across the United States, and such networks needed analog and mixed-signal chips for switching equipment and the like.

The decade also saw the rise of the outsourced semiconductor assembly and test companies, or OSATs, like Taiwan's ASE Group and SPIL. This trend had several drivers. First, it was concurrent with the rise of TSMC and the pure-play foundries. It just makes sense: outsourcing front-end wafer fabrication doesn't get you very far if you can't outsource the back end too. So as one grew, so did the other. However, outsourced testing differs from wafer fabrication. In the latter, the chip designer must adhere to the process node's design rules. But in test, the customer dictates the test strategy to the OSAT, which can include which ATE tool the OSAT must go out and get. With these automated test equipment tools getting ever more expensive, costing millions of dollars in many cases, the OSATs can aggregate test demand across multiple customers and keep those expensive tools fully utilized, while also circulating best practices. Many of the OSATs started off by buying the in-house facilities of the IDMs, which helped those companies become more asset-light, and scaled from there.

Advantest and Teradyne rode these dual waves to great benefit. In 1997, Advantest was the fourth largest semiconductor equipment company by revenue, with $1.6 billion. Teradyne ranked 10th with $841 million. As late as 2000, news reports described how the OSATs were struggling to get enough tester equipment from suppliers. One analyst, Ron Leckie, said: Test is running pretty close to capacity in most cases, but we do not have enough test capacity for some products, especially communications devices. Test subcontractors are expanding and buying automatic test equipment like crazy, but the problem is that lead times for equipment have gone up.

Revenue surged in 2000, with 53% growth for Advantest that year, and then abruptly turned over the year after, when the telecom bubble burst. Plummeting demand for new chips and fab capacity smashed the test companies as badly as they have ever been hit. ATE sales crashed 70% for Advantest and 65% for Teradyne. Order backlogs collapsed even harder. The semiconductor industry has seen many a boom and bust, but the fiber and telecom bust triggered a particularly deep reckoning. It was the end of a golden age for test. Intel and others had realized that test costs, having grown some 25 times by 2001, had spiraled out of control, and they started turning the screws on cost. Teradyne ended up outsourcing tool production, closing their factories in the United States, and focusing on design, sales, and maintenance. So far as I know, maybe 85% of their tools are made by subcontractors. There was also a big wave of consolidation among the second-tier ATE players. LTX, Credence, and Schlumberger's test business merged together over the 2000s to form LTX-Credence, which then, after a few more mergers, is now Xcerra.
The test industry eventually rebounded from the post-bubble doldrums thanks to the mobile boom. Mobile phones have complex systems-on-chip that combine digital and analog/mixed-signal blocks, tested on machines like Teradyne's UltraFLEX, a mixed-signal tool used for final test, meaning after the packaging is done. It is apparently programmed with Microsoft VBA, which sounds dodgy. To save on cost and stay flexible, Advantest and Teradyne also consolidated their various products into modular system platforms. Today, test systems can be configured with compatible instrument cards. Such cards provide different electrical functions, like waveforms, digital interfaces, or power, all to achieve a particular test strategy. It lets Teradyne do things like develop a best-selling tool like the J750, an affordable tester which, with the right cards, can test microcontrollers, wireless devices, and even image sensors. Teradyne has shipped over 6,000 of these puppies.

Today, AI accelerators are the industry's dominant chips, and these guys present mighty test challenges. Most significantly, they are chimeras, adopting advanced packaging techniques that package together multiple processor, I/O, and memory dies. If any part of that sandwich is bad, then the whole thing fails. So we need to test each chiplet individually first. And since these things operate within larger, deeply interconnected AI systems, we need to evaluate performance within that context too. Advanced packaging is also exploding chip transistor counts to crazy levels: 50, 80, even 200 billion. Testing all those transistors and connections for faults means shifting immense numbers of bits in and out, maybe terabytes of data for each GPU. There are also thermal concerns. These guys throw off huge amounts of heat, and testing might push them, or the ATE tool, to the limit, requiring yet more specialized engineering. Advantest has written a bit about AI's potential to improve these flows, where usage data can train models to raise yields and flag failure modes. It's not too clear to me how much of this has been implemented.

I want to thank several people within the test community for inspiring and speaking with me for this video. Special thanks to Rich, Randy, Bill, and Ed.

Like other companies in the semiconductor supply chain, the test companies have done well from the AI boom. Advantest's revenues have surged, and they estimate that the AI tester market will grow 30% year over year to some $10 billion. The company now has a market cap of over $113 billion. As of this writing, they are the 10th most valuable company on the Tokyo Stock Exchange. Prior to ChatGPT, they were worth less than $9 billion. We can rely on our semiconductors and systems because they pass their tests. With leading-edge AI chips at 200 billion transistors and counting, the work these guys have to do is incredible.

Alright everyone, that's it for tonight. Thanks for watching. Subscribe to the channel, sign up for the Patreon, and I'll see you guys next time.
