OpenAI is doubling down on hardware control, pairing Nvidia GPUs for model training with Broadcom-designed custom chips for inference, according to The Wall Street Journal.
The goal? Cut costs, improve latency, and secure chip supply — the three pillars of scaling AGI infrastructure sustainably.
The roadmap targets 10 gigawatts of new compute by 2030, on top of the roughly 2 GW running today and another 16 GW already committed through earlier deals. These systems rely on high-bandwidth memory (HBM) and dense optical networking to keep massive data centers running efficiently.
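Adding up the figures above (a quick sanity-check sketch; only these three numbers come from the post, the labels are mine):

```python
# Tally of the compute figures cited above; the buckets are
# the post's own numbers, nothing else is assumed.
current_gw = 2      # capacity running today
committed_gw = 16   # already committed through earlier deals
roadmap_gw = 10     # new target by 2030

print(f"Total footprint if everything lands: {current_gw + committed_gw + roadmap_gw} GW")  # -> 28 GW
```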
Here’s the breakdown 👇
- 🧠 Training stays on Nvidia’s flexible accelerators (Nvidia still holds ~70% of the training market).
- ⚡ Inference shifts to custom silicon optimized for known model workloads — crucial since serving LLMs dominates cost at global scale.
- 💾 Chips integrate large on-package memory to prevent compute stalls, plus mixture-of-experts routing that activates only the parts of the model each token needs, cutting power use dramatically (sketched after this list).
- 🔗 Broadcom’s switching and optical interconnects ensure that thousands of accelerators act as a single, ultra-fast machine.
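On the expert-routing bullet: the standard technique is mixture-of-experts gating, where a small router scores every expert and only the top few actually run per token. A minimal NumPy sketch; the sizes and top-k value are made-up illustration numbers, and OpenAI’s actual routing design isn’t public:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up illustration sizes; real models use thousands of dims and many experts.
NUM_EXPERTS, HIDDEN, TOP_K = 8, 16, 2

experts = rng.normal(size=(NUM_EXPERTS, HIDDEN, HIDDEN))  # one FFN matrix per expert
router = rng.normal(size=(HIDDEN, NUM_EXPERTS))           # tiny gating network

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of NUM_EXPERTS experts."""
    logits = x @ router                # score every expert for this token
    top = np.argsort(logits)[-TOP_K:]  # keep the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()           # softmax over just the chosen k
    # Only TOP_K matmuls execute; the other experts sit idle,
    # which is where the power savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=HIDDEN)
print(moe_forward(token).shape)  # (16,)
```

Compute scales with the k experts actually run, not the total expert count, so capacity grows without a matching jump in per-token power.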
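And on the interconnect bullet: one classic way thousands of devices act as a single machine is a ring all-reduce, where each device only ever exchanges data with its neighbors. A toy pure-Python simulation (real fabrics move tensor chunks over switched optical links, which is where Broadcom’s gear comes in):

```python
def ring_allreduce(values: list[float]) -> list[float]:
    """After n-1 steps, every rank holds the sum of all ranks' values."""
    n = len(values)
    totals = list(values)   # running sum held at each rank
    passing = list(values)  # contribution currently being forwarded
    for _ in range(n - 1):
        # All ranks simultaneously send passing[i] to rank (i + 1) % n.
        passing = [passing[(i - 1) % n] for i in range(n)]
        totals = [t + p for t, p in zip(totals, passing)]
    return totals

# Four "accelerators", each starting with a local gradient value.
print(ring_allreduce([1.0, 2.0, 3.0, 4.0]))  # [10.0, 10.0, 10.0, 10.0]
```

Each step moves a fixed amount of data per link no matter how big the ring gets, but every step is a network round-trip, which is why low-latency switching matters at this scale.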
Even OpenAI’s new Pulse product, which uses AI agents to compile a personalized daily briefing for each user, currently sits behind the $200/month Pro tier, a sign of how inference costs shape product pricing.
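To see why serving cost leaks into pricing, a toy unit-economics sketch. Every number below is an invented placeholder, not an OpenAI figure:

```python
# Hypothetical unit economics for a daily agentic briefing.
# All numbers are invented placeholders, not OpenAI figures.
tokens_per_briefing = 500_000       # assumed: agents read many pages per day
dollars_per_million_tokens = 10.0   # assumed blended serving cost
days_per_month = 30

monthly_cost = tokens_per_briefing * days_per_month * dollars_per_million_tokens / 1_000_000
print(f"Serving cost per user: ${monthly_cost:.2f}/month")  # -> $150.00/month
```

With placeholder numbers like these, a heavy agentic workload can eat most of a $200 subscription, exactly the cost pressure custom inference silicon is meant to relieve.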
💡 Bottom line: OpenAI is evolving from a model company into a vertically integrated AI hardware player, with custom inference silicon defining the next phase of scale.
📄 Source: WSJ

