Elon Musk's xAI Colossus supercomputer is on track to reach 1 million GPUs

(Credit: NVIDIA)

The first phase of xAI's supercomputer is proceeding as planned. According to Tom's Hardware, Elon Musk's massive AI training system in Memphis, Tennessee, has switched over to the city's main power grid. The connection gives xAI access to 150 MW to power the facility, which runs 200,000 NVIDIA GPUs. The company plans to scale up to 1 million GPUs to compete with rivals such as Oracle Cloud Infrastructure (OCI), which has at least 131,072 GPUs.

One of the most striking aspects of xAI's ambitious project is the speed of its development. The company entered the AI scene in the summer of 2023, when Musk announced xAI on X and wrote that its purpose was to “understand reality.” A year later, construction of the company's AI supercomputer, Colossus, was underway. Colossus started with 100,000 NVIDIA H100 Hopper AI accelerators, an impressive figure in itself, and that number doubled to 200,000 in February 2025.

According to xAI, Colossus was built in 122 days, a remarkably short construction period. The company then reached 200,000 GPUs in 92 days. Along the way it faced a major challenge: powering the large number of GPUs in its cluster. At launch, the grid supplied Colossus with only 7 MW. To provide electricity while awaiting a better connection to the Memphis grid, the company relied on natural gas generators, which, as Tom's Hardware notes, have sparked complaints from some residents.

Now things are different. Memphis supplies 150 MW of grid power, which lets Colossus reduce the number of generators it uses. On top of that, Colossus reportedly has 150 MW of Tesla batteries to provide power when needed. This is good news for the first phase, but the second phase will require more power. Another substation could help Colossus reach 300 MW as early as this fall. The supercomputer's power demand will continue to rise as more GPUs are added to the existing setup.

xAI uses Colossus to train large language models (LLMs), including its own Grok, an AI tool available on Musk's X social network. The company recently released Grok 3 Beta, which lets users see both the answer it provides and the reasoning used to reach it.
