Microsoft has revealed its latest weapon in the battle for artificial intelligence supremacy: the Maia 200 chip, a powerhouse accelerator built to handle AI tasks without relying on Nvidia’s expensive hardware. The new chip marks a serious step forward in Microsoft’s plan to control its own destiny in the AI race while keeping costs under control.
Microsoft’s Maia 200 chip represents a bold move to break free from Nvidia’s grip on AI hardware while slashing infrastructure costs.
Built using TSMC’s cutting-edge 3nm process, the Maia 200 packs over 140 billion transistors into a single chip. It delivers more than 10 petaFLOPS in FP4 precision and over 5 petaFLOPS in FP8 precision, which are specialized number formats that make AI calculations faster and more efficient. The chip comes with 216GB of ultra-fast HBM3e memory moving data at 7 TB/s, plus 272MB of on-chip SRAM that keeps information close at hand.
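Those headline figures can be put in perspective with a standard back-of-envelope calculation. The sketch below uses only the numbers quoted above; the arithmetic-intensity framing is a common analysis technique, not anything Microsoft has published about the chip's internals.

```python
# Back-of-envelope math from the published Maia 200 specs.
# All figures come from the article; the framing is a standard
# roofline-style estimate, not Microsoft's own methodology.

FP4_FLOPS = 10e15   # >10 petaFLOPS at FP4 precision
HBM_BW = 7e12       # 7 TB/s HBM3e bandwidth, in bytes per second
HBM_CAP = 216e9     # 216 GB of HBM3e capacity

# FLOPs the chip can perform per byte fetched from memory. A workload
# must reuse each byte at least this many times to be compute-bound
# rather than memory-bound.
balance_point = FP4_FLOPS / HBM_BW
print(f"compute/bandwidth balance: {balance_point:.0f} FLOPs per byte")

# Time to stream the entire 216 GB of HBM once at full bandwidth --
# a rough lower bound for one pass over a model that fills memory.
full_sweep_s = HBM_CAP / HBM_BW
print(f"full-memory sweep: {full_sweep_s * 1e3:.1f} ms")
```

The high balance point is typical of modern inference accelerators: the fast on-chip SRAM exists precisely to keep data close and avoid paying the HBM round trip on every access.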
The performance numbers tell an impressive story. Microsoft claims the Maia 200 delivers three times the FP4 performance of Amazon’s third-generation Trainium and exceeds Google’s seventh-generation TPU in FP8 tasks. More importantly for Microsoft’s bottom line, it offers 30% better performance per dollar than the previous Maia 100, making it the most cost-effective inference accelerator the company has deployed.
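The 30% figure can be restated as a cost saving. The snippet below just works through that arithmetic; the baseline value of 1.0 is arbitrary, since only the ratio comes from the article.

```python
# The article's claim: 30% better performance per dollar than Maia 100.
# The absolute baseline is arbitrary; only the 1.30x ratio is sourced.
maia_100_perf_per_dollar = 1.0
maia_200_perf_per_dollar = maia_100_perf_per_dollar * 1.30

# Equivalently, delivering the same inference throughput costs less:
cost_reduction = 1 - 1 / 1.30
print(f"{cost_reduction:.1%} lower cost per unit of throughput")
```

In other words, a 30% perf-per-dollar gain translates to roughly 23% lower spend for the same inference workload, which is the metric that matters at datacenter scale.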
Microsoft designed the Maia 200 specifically for AI inference, the process where trained models actually answer questions and generate text. The chip handles massive language models like those powering Microsoft Copilot and the company’s Superintelligence efforts, with room to spare for future models that will be even larger. The company is rolling out the chip first in its US Central datacenter near Des Moines, Iowa, with additional deployment planned for Phoenix.
The chip integrates directly into Microsoft Azure’s cloud infrastructure and can scale to clusters of 6,144 accelerators connected over Ethernet. An integrated network interface provides 2.8 TB/s of bidirectional bandwidth, ensuring data flows smoothly across thousands of chips working together. Microsoft has also launched a preview SDK with PyTorch and Triton support to attract developers and startups to the platform.
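At that cluster scale, the per-chip network figure sets a floor on how fast chips can synchronize. The sketch below applies the textbook ring all-reduce traffic formula to the published numbers; the tensor size and the even bidirectional split are assumptions for illustration, and nothing here reflects Microsoft's actual network design.

```python
# Scale-out math for a full 6,144-accelerator Ethernet cluster using
# the published per-chip NIC figure. A ring all-reduce moves roughly
# 2 * (N - 1) / N * S bytes per chip for S bytes of data -- the
# textbook formula, not Microsoft's network design.

N = 6_144               # accelerators per cluster (from the article)
NIC_BW = 2.8e12 / 2     # 2.8 TB/s bidirectional, assumed ~1.4 TB/s each way

tensor_bytes = 100e9    # hypothetical 100 GB of model state to synchronize
per_chip_traffic = 2 * (N - 1) / N * tensor_bytes
t = per_chip_traffic / NIC_BW
print(f"ring all-reduce lower bound: {t * 1e3:.0f} ms")
```

The takeaway is that keeping thousands of chips busy depends as much on the Ethernet fabric as on each chip's raw FLOPS, which is why the NIC is integrated rather than bolted on.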




