
Google released Gemma 4 12B, a multimodal model that runs locally on 16GB devices. TSMC's CEO warned chip supply won't meet demand for years. Ramp raised $750M at $44B, and Anthropic says 80%+ of its merged code is now Claude-authored.
A
Where is Daredevil? I'm right here. Don't miss the return of Marvel Television's Daredevil: Born Again. So what's next? I feel liberated. We're gonna take this city back. Overmedicated. In an all-new season, now streaming only on Disney+. They're hunting us. It's time we started hunting them. I can work with them.
B
This should be tons of fun.
A
Marvel Television's Daredevil: Born Again, now streaming only on Disney+.
B
Welcome to the Tech Brew Ride Home for Thursday, June 4th, 2026. I'm Brian McCullough. Today: Google released Gemma 4 12B, a multimodal model that runs locally on 16-gigabyte devices. TSMC's CEO warned chip supply won't meet demand for years. Ramp raised $750 million at a $44 billion valuation, and Anthropic says more than 80% of its code is now Claude-authored. Here's what you missed today in the world of tech.
Today's episode is brought to you by Doppel. Social engineering attacks don't bother to knock. They slip right into your inbox, phone or websites instead, pretending to be a harmless internal email or a normal text message until it's too late. Doppel sees right through this disguise. Their AI-native platform trains your team to recognize threats like deepfakes, bad links and impersonation attempts before they can actually cause any damage. Doppel strengthens team resilience by giving employees the tools and defenses they need to protect themselves from increasingly sophisticated social engineering threats. It's kind of like having a security team that has eyes everywhere, and their digital risk protection takes it one step further by keeping an eye on every channel to connect patterns and shut them down fast. Invest in social engineering defense. Learn more at doppel.com. That's d-o-p-p-e-l dot com.
Google has released Gemma 4 12B, an 11.95-billion-parameter unified-encoder open multimodal AI model that can run locally on devices with as little as 16 gigabytes of VRAM or unified memory. Quoting VentureBeat: While many AI open source model providers are pursuing larger and more powerful models, Google is still giving attention to the smaller, more local side of the market. Today the tech giant released Gemma 4 12B, an 11.95-billion-parameter open-weights model with a permissive Apache 2.0 license, optimized to execute locally on a standard enterprise laptop using just 16 gigabytes of VRAM or unified memory.
That means those enterprise users looking to keep working with AI while on a flight without Wi-Fi, or trying to keep it offline for security reasons, can now do so far more easily and at far less cost. Free to download and operate, Gemma 4 12B's most notable breakthrough is an encoder-free unified architecture, which allows raw audio waveforms and visual patches to flow directly into the core LLM backbone without the latency or memory overhead of secondary processing modules. Available immediately for download on Hugging Face and Kaggle and for use on Google's AI Edge Gallery (more on that in a second), Gemma 4 12B packs a 256,000-token context window, native agentic tool use capabilities, and an explicit step-by-step reasoning mode into a highly optimized footprint that bridges the gap between mobile edge models and heavy data center infrastructure.
Gemma 4 12B is highly relevant to enterprise architecture due to its novel unified structure. Traditional multimodal systems typically utilize discrete, separate encoders to translate audio waveforms and visual data into representations that the core language model can process. This conventional approach inherently increases both inference latency and total memory consumption. Gemma 4 12B radically alters this pipeline by functioning entirely without these secondary encoders. Instead, visual patches and raw audio waveforms are projected directly into the core language model's embedding space through lightweight linear layers. The vision encoder is replaced by a 35-million-parameter module utilizing a single matrix multiplication, while the audio encoder is eliminated entirely. For enterprise engineering teams, this unified architecture delivers distinct operational advantages: lower latency for multimodal tasks and reduced VRAM requirements, down to the 16 gigabytes typical for your average user. And despite its compact size, Gemma 4 12B achieves benchmarks nearing Google's larger 26B mixture-of-experts model.
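To make the "single matrix multiplication" idea concrete, here is a minimal toy sketch in plain Python. It is not Gemma's code: the image size, patch size, embedding width and random weights are all made-up assumptions, and the helper names `patchify` and `project` are hypothetical. It only illustrates the general technique described above, where flattened image patches are mapped into an embedding space by one linear layer instead of a separate vision encoder.

```python
import random

def patchify(image, patch):
    """Split an H x W grid of pixel values into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches

def project(patches, weight):
    """One matrix multiplication: (num_patches x patch_dim) @ (patch_dim x embed_dim)."""
    embed_dim = len(weight[0])
    return [[sum(p[i] * weight[i][j] for i in range(len(p)))
             for j in range(embed_dim)]
            for p in patches]

random.seed(0)
PATCH, EMBED_DIM = 4, 8   # toy sizes, assumed purely for illustration
image = [[random.random() for _ in range(8)] for _ in range(8)]  # fake 8x8 grayscale image
weight = [[random.gauss(0, 0.02) for _ in range(EMBED_DIM)]
          for _ in range(PATCH * PATCH)]  # stand-in for learned projection weights

tokens = project(patchify(image, PATCH), weight)
print(len(tokens), len(tokens[0]))  # 4 patch "tokens", each an 8-dimensional vector
```

In a real model the resulting vectors would be fed to the language model alongside text token embeddings; how Gemma 4 12B handles positions, audio, or normalization is not described in the episode and is not modeled here.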
Beyond static benchmarks, the model supports a massive 256k-token context window. This is critical for enterprises needing to process lengthy financial reports, extensive code repositories or hour-long meeting transcripts. Furthermore, Gemma 4 12B includes a native thinking mode to map out step-by-step reasoning before generating a response. It also features out-of-the-box support for native function calling and system prompts, which are essential requirements for building highly capable autonomous software agents.
So as mentioned, to run this, Google also released a macOS version of AI Edge Gallery, which lets users run open models on their devices, and AI Edge Eloquent, an on-device voice dictation app. Quoting 9to5Mac: The majority of users who rely on LLMs for everyday tasks tend to use ChatGPT, Claude or Gemini, which are cloud-based models running on OpenAI, Anthropic and Google servers. Another way to interact with LLMs is through local models. These are usually much smaller and less capable than the trillion-parameter models that run in the cloud, but they also come with several advantages. For one, being less capable than cloud-based models does not mean they are bad. Also, they do not require an active Internet connection, since they run on the computer's own processing power. Additionally, the better the computer, the faster the responses and the larger the models it can handle. And finally, because everything runs locally, these models are more private too, since conversation data does not need to leave the device. There are a few ways to install local models on a Mac, and we covered this here when OpenAI released its own open models. But in a nutshell, you need to install platforms such as Ollama and LM Studio and then install a model that runs smoothly on your Mac's hardware.
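As a rough way to judge whether a model "runs smoothly on your hardware," the weight memory can be estimated from the parameter count and the number of bits stored per weight. The sketch below uses the 11.95 billion parameters and 16 gigabytes quoted in the episode; the 2 GB headroom for context cache and runtime overhead is an arbitrary assumption, and the episode does not say which precision Google ships, so the bit widths are illustrative only.

```python
PARAMS = 11.95e9   # parameter count quoted in the episode
DEVICE_GB = 16     # memory budget quoted in the episode

def weight_gb(params, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

def fits(params, bits_per_weight, device_gb, headroom_gb=2.0):
    """True if weights plus an assumed headroom fit in device memory.

    headroom_gb is a guess covering context cache and runtime overhead;
    real requirements vary with context length and software stack.
    """
    return weight_gb(params, bits_per_weight) + headroom_gb <= device_gb

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_gb(PARAMS, bits):.1f} GB, "
          f"fits in {DEVICE_GB} GB: {fits(PARAMS, bits, DEVICE_GB)}")
```

Under these assumptions, 16-bit weights alone come to roughly 23.9 GB and would not fit in 16 GB, while 8-bit (about 11.95 GB) and 4-bit (about 6 GB) weights would.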
One thing to note right from the get-go is that, contrary to Ollama and LM Studio, which allow users to install any AI compatible with their hardware, Google AI Edge Gallery for Mac currently only offers access to five of Google's own models, where "it" stands for Instruct, meaning they can be tuned to follow user instructions rather than simply complete text. Alongside Gemma 4 12B and the release of Google AI Edge Gallery for macOS, Google also launched the Google AI Edge Eloquent app for Mac today, after bringing the app to iOS a few months ago. Google AI Edge Eloquent is a free dictation app that captures what users say and transcribes it while polishing the text, removing disfluencies and making light edits for clarity and flow. Processing is done on device rather than in the cloud. The app also lets users choose between different writing styles and add custom words such as names, jargon and other terms they use often. That helps avoid the kind of frequent miscorrections that dictation apps can otherwise make with specific words and phrases. End quote.
According to Public First, a mere 26% of Americans support increased data center construction, the lowest share among 15 large countries. Quoting the FT: Just 26% of Americans supported increased construction of data centers, while roughly 30% of Britons, Germans and French backed such projects. Support was highest in Nigeria, where 74% believed in building more infrastructure, and in India at 65%. Seb Wride, head of opinion research at Public First, said, our research shows America, the home of Silicon Valley and the majority of the biggest tech companies, has the population least in favor of the very infrastructure needed to support that sector. The findings come as the US AI industry and Donald Trump's administration are increasingly concerned about mounting local opposition to data centers, which is thwarting the sector's growth.
Dozens of projects collectively worth at least $156 billion have been blocked or stalled since 2025, according to Data Center Watch, a research project run by AI security company 10a Labs, precisely as the sector is trying to secure more processing power to serve newer AI models. Researchers at Goldman Sachs last month predicted delays and cancellations mean only half of planned US data center capacity would be completed or on schedule in the next couple of years. Multiple recent polls have shown support for data centers collapsing in the US amid rising fears over job losses and the spread of harmful AI content. A Gallup survey conducted last month found 70% of Americans oppose such construction in their community. Leading progressive politicians, including Vermont Senator Bernie Sanders, have seized on such sentiment to demand a moratorium on data center developments. Trump's allies, including Steve Bannon, have also called for constraints on the rollout of the technology. End quote.
AI is uncharted territory, and many leaders are trying to navigate through without a guide to help them. That's why Morning Brew created The Intelligence Shift, a new podcast with PwC. It's all about how AI is fundamentally changing different industries. Host Dan Priest sits down with people who work with AI on a daily basis. Together, they discuss real stories, real strategies and real takeaways for leaders. Get guidance from industry experts. Listen to The Intelligence Shift wherever you get your podcasts.
There's a lot to navigate in the economy right now, and small business owners like you are really feeling the effects. You can't control interest rates or tariffs, but there is one thing you can control, and that's how efficient your business is. Gusto can help. Automating payroll and HR with Gusto is one of the fastest ways to cut friction and focus on what actually moves the needle. Gusto is online payroll and benefits software built for small businesses.
It's all-in-one, remote-friendly and incredibly easy to use, so you can pay, hire, onboard and support your team from anywhere. Unlimited payroll runs for one monthly price. Try Gusto today at gusto.com/brew and get three months free when you run your first payroll. That's three months of free payroll at gusto.com/brew.
A
Study and play come together on a Windows 11 PC, and for a limited time, college students get the best of both worlds. Get the Unreal College Deal: everything you need to study and play with select Windows 11 PCs. Eligible students get a year of Microsoft 365 Premium and a year of Xbox Game Pass Ultimate with a custom color Xbox Wireless Controller. Learn more at windows.com/studentoffer. While supplies last. Ends June 30th. Terms at aka.ms/collegepc.
B
Also from the Scary File: Sam Altman and Dario Amodei are among the signatories on a public letter urging improved tracking of so-called synthetic DNA that could be used in AI-developed bioweapons. Quoting Wired: Organized by the nonpartisan Institute for Progress and the right-leaning Foundation for American Innovation, the letter acknowledges that given the pace of AI development, quote, there is a real possibility that the knowledge barriers which have historically prevented bad actors from obtaining biological weapons will meaningfully erode, end quote. Scientist Arthur Kornberg was the first to successfully synthesize DNA in the 1950s. Now the process is automated, with dozens of companies around the world using commercial synthesizers to print and sell custom genetic sequences that are used for scientific research, drug development and diagnostics. Many providers sell only to qualified researchers, biotech companies and educational institutions. But not all of them vet customers or the gene sequences they order. In 2017, Canadian researchers raised alarms when they used $100,000 worth of mail-order DNA to reconstitute the extinct horsepox virus. Critics said the same methodology could be used to construct smallpox, a closely related and deadly virus. Gene synthesis has only gotten cheaper since then. Combined with advances in AI, it's now feasible to design dangerous new toxins and pathogens using large language models, although some biology training would likely still be needed to make a functional virus from scratch. While bioterror attacks have been rare, they have the potential to cause mass casualties, public panic and economic loss. A major concern is that an AI-designed pathogen could intentionally or unintentionally spark a global pandemic. AI tools enable a user to very quickly identify where to turn to order sequences that will not be subject to screening, says David Relman, a microbiologist and biosecurity expert at Stanford University who signed the letter.
If prompted appropriately, they can also tell you how to change the nature of your order so that even those that are screening may be much less able to detect what it is you are trying to make, end quote. The signers include other scientists, national security experts and executives from gene synthesis companies Twist Bioscience and Ansa Biotechnologies. These firms are members of the International Gene Synthesis Consortium, which formed in 2009 to implement voluntary screening practices. Many companies already use software to screen orders for sequences of concern that can contribute to an organism's toxicity or ability to cause disease. End quote.
TSMC CEO C.C. Wei says the company won't be able to fulfill the demand led by US customers even as more capacity comes online in the US over the next few years. Quoting Bloomberg: Still, Taiwan's largest company, which makes the majority of the world's advanced semiconductors for AI and other electronics, will refrain from initiating the sort of abrupt price hikes that shook up the memory chip sector, Wei added. TSMC's intent is to ensure a stable business, he said. At an annual shareholders meeting in Taiwan, Wei reiterated a forecast for sales growth of more than 30% for this year, an outlook TSMC raised just weeks ago. Asia's largest company is an essential player in the global AI industry, making cutting-edge semiconductors for the likes of Nvidia and AMD. TSMC has been expanding its footprint beyond its home island to add capacity. Yet even that isn't enough to satisfy all needs. With major hyperscalers set to spend $725 billion on AI this year alone, it will be a long time before we can meet customer demand, Wei said. TSMC is racing to expand at a time customers from Nvidia to Broadcom are vying for access to its cutting-edge facilities. As part of the Taiwan trade pact, the Asian company envisions building at least four more U.S. chipmaking plants on top of six already planned.
That entails $165 billion in investments, requiring roughly an additional $100 billion of capital, Bloomberg News has reported. Two plots of land TSMC has acquired in Arizona should be enough to satisfy its needs for a decade, Wei added. In April, the company raised its full-year sales guidance and said its own capital spending should trend toward the upper end of an existing forecast range of as much as $56 billion.
Corporate spending management platform Ramp raised $750 million at a $44 billion valuation, led by Iconiq, Singapore's GIC and the OTPP, taking its total funding to $3 billion. What's interesting is how their business has apparently rebounded. Quoting Bloomberg: The New York-based company's valuation, which includes the money raised, has nearly tripled across multiple rounds over the past year, reflecting significant growth in its business. Ramp's annual revenue run rate is now more than $1.5 billion, according to a person familiar with the matter. That financial metric, a projection of full-year revenue based on a shorter period, was $1 billion back in September. Founded in 2019, Ramp initially focused on helping startups simplify expense reporting. Now it offers a wider mix of services for businesses, including payments processing and AI-powered fraud detection. Ramp's closest competitor, Brex, sold to Capital One Financial Corp. earlier this year for $5.15 billion. Ramp intends to eventually go public rather than seek a buyer, according to co-founder and Chief Executive Officer Eric Glyman. Roy Luo, a general partner at Iconiq, said the startup is big enough to be a public company whenever it chooses, stressing its fast growth. The firm currently works with 70,000 businesses, up from 50,000 at the beginning of the year. Much of that increase has come from young startups looking for help navigating unprecedented spending on AI models and managing their payments, Glyman said.
I don't think we've ever particularly been slouches, but we've seen this incredible reacceleration of growth, Glyman said. As more work is done through models, the total addressable market for payments as opposed to payroll is growing very significantly. Like its customers, Ramp is also adapting its business to AI, including by creating an artificial intelligence research lab, developing internal AI tools for employees and encouraging sales, customer support and engineering teams to embrace the technology. In March, it released software that lets AI agents carry out any actions on Ramp that a human could, including paying with corporate cards. The future is going to be agents buying on behalf of agents and agents on the other side receiving the payments, said Geoff Charles, Ramp's chief product officer. Ramp currently has about 1,700 employees globally and plans to continue hiring across its business, including recruiting salespeople and developers as well as forward-deployed engineers to assist companies with AI adoption. End quote.
Finally today, the Anthropic Institute argues that it's no longer theoretical: AI is already, today, accelerating the development of AI, and this would point us toward recursive self-improvement, which we've discussed in the past, a system that can autonomously design its own successors. They stress that this hasn't happened yet and it isn't inevitable. Maybe. But they say it could arrive faster than institutions are ready for. The evidence comes from two things they've already noticed. First, the length of tasks AI can reliably complete is now doubling roughly every four months, and coding or research benchmarks like SWE-bench or CORE-Bench have gone from single digits to almost majority scores in two years. Internally, more than 80% of code merged at Anthropic as of May 2026 is Claude-authored, up from low single digits before Claude Code. The typical engineer now merges eight times as much code per day as in 2024.
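For a sense of what "doubling roughly every four months" implies, the compounding can be worked out directly. This is a back-of-the-envelope sketch that assumes the four-month doubling time holds steadily, which the source itself does not claim is guaranteed; the function names are hypothetical.

```python
import math

def growth_factor(months, doubling_months=4.0):
    """How many times longer tasks get after `months`, given a fixed doubling time."""
    return 2 ** (months / doubling_months)

def months_to_reach(factor, doubling_months=4.0):
    """Months needed for task length to grow by `factor` under the same assumption."""
    return doubling_months * math.log2(factor)

print(growth_factor(12))               # 8.0 -> three doublings per year
print(round(months_to_reach(100), 1))  # months for a 100x increase
```

Three doublings a year compound to an 8x increase in task length over twelve months, and a 100x increase would take a little over two years if the trend held.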
Claude's code quality is roughly at human parity and improving. And on an optimization task, Claude has gone from around a 3x to around a 52x speedup in under a year. The persistent human advantage is still research taste: choosing which problems matter and which results to trust. In the blog post I'm linking to, they see three scenarios: the trend stalls out, but the current capabilities diffuse widely; or labs see compounding efficiency gains while humans still set the direction, deemed by them to be the most likely outcome; or full recursive self-improvement begins soon, with humans shifting to oversight roles only. Anthropic argues that the world should preserve the option to slow down or pause frontier development, but notes this requires verifiable multi-lab, multi-country coordination, which is probably impossible because training runs are easy to conceal. Nothing more for you today. Talk to you tomorrow.
C
The WIRED newsroom is known for award-winning reporting on how technology shapes our world. On WIRED's Uncanny Valley, we take that curiosity even further. Each week, journalists from WIRED break down the biggest stories in tech while speaking directly with the people building, challenging and reshaping the future. Is the AI boom sustainable? How do you protect your privacy in an age of constant surveillance? Uncanny Valley tackles the questions driving today's tech debates and lighting up your group chats. Listen to new episodes every Thursday, wherever you get your podcasts.
Tech Brew Ride Home – "Small And Open Source Still Has A Horse In This Race"
Date: June 4, 2026
Host: Brian McCullough
Podcast: Tech Brew Ride Home (Morning Brew)
This episode delves into the ongoing relevance of smaller, open-source, and locally-run AI models amidst the rapid scaling of cloud-based, data center-heavy AI systems. Brian McCullough highlights Google’s new Gemma 4 12B multimodal open model, updates on TSMC’s semiconductor supply crunch, shifting public sentiment on data centers, bioweapon risks posed by synthetic DNA and AI, Ramp’s massive new funding, and Anthropic’s insights into AI accelerating its own development.
“Being less capable than cloud-based models does not mean they are bad. Also, they do not require an active Internet connection since they run on the computer's own processing power … because everything runs locally, these models are more private too.” – Brian McCullough, quoting 9to5Mac (06:10)
“Our research shows America, the home of Silicon Valley and the majority of the biggest tech companies, has the population least in favor of the very infrastructure needed to support that sector.” – Seb Wride, Public First (08:35)
“AI tools enable a user to very quickly identify where to turn to order sequences that will not be subject to screening... they can also tell you how to change the nature of your order so that even those that are screening may be much less able to detect what it is you are trying to make.” – David Relman, Stanford (12:18)
“With major hyperscalers set to spend $725 billion on AI this year alone, it will be a long time before we can meet customer demand.” – CC Wei, TSMC (14:46)
“The persistent human advantage is still research taste, choosing which problems matter and which results to trust.” – Brian McCullough, summarizing Anthropic (18:35)
This episode articulates why smaller and open-source AI models retain major strategic importance as cloud-based giants dominate headlines—highlighting privacy, accessibility, and operational independence. It additionally touches on crucial risks and infrastructural challenges facing tech, from semiconductor and data center bottlenecks to emerging AI-driven security threats and the accelerating feedback loop of AI development itself.