In the early months of 2024, a major hyperscaler revealed it had quietly doubled its artificial intelligence (AI) cluster power budget to more than 300 MW, enough to power an entire city. The announcement stunned many industry observers, but for those building and managing the infrastructure that powers AI, the escalation came as no surprise. It simply reinforced a truth long understood by those at the heart of data center operations: AI's hunger for compute power is boundless, and energy is its currency.
Today’s data centers, the workhorses of our digital economy, face a paradox. While they are asked to fuel the rapid advancement of generative AI, they must do so under increasingly tight energy budgets, with sustainability mandates and power constraints threatening to slow innovation. The question isn’t whether AI will transform industries—it already is—but whether our infrastructure can keep up without buckling under the weight of AI’s energy appetite.
Why Traditional Architectures Fall Short for AI Workloads
AI workloads differ dramatically from traditional analytics and database processing. While conventional workloads tend to be transactional and latency-sensitive, AI training and inference are throughput-intensive, requiring massive parallelism. Deep learning models rely on graphics processing units (GPUs) and other accelerators that chew through terabytes of data to identify patterns, which in turn demands immense memory and input/output (I/O) bandwidth.
However, the architecture of most existing data centers is rooted in legacy designs optimized for CPU-centric, general-purpose computing. These traditional infrastructures are ill-equipped for the data movement and memory demands of AI. As AI models become larger and more complex, performance is increasingly throttled not by processor speed, but by the system’s ability to feed data fast enough—a phenomenon known as the memory wall.
As Gholami et al. describe in "AI and Memory Wall" (arXiv:2403.14123), peak processor performance (FLOPS, or floating-point operations per second) has scaled at 3.0× every two years, outpacing growth in memory bandwidth (1.6×) and interconnect bandwidth (1.4×). This bottleneck leaves expensive compute resources underutilized, driving up costs and energy waste. It is like trying to irrigate a massive farm with a single watering can: most of the land goes unserved, and the effort becomes inefficient and wasteful.
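To make the compounding effect concrete, the short Python sketch below simply extrapolates those published per-two-year growth rates; the rates come from Gholami et al., while the straight-line extrapolation is our own simplification for illustration.

```python
# Extrapolating the "memory wall" using the per-two-year scaling factors
# reported by Gholami et al. (arXiv:2403.14123). The straight-line
# extrapolation is a simplification for illustration.
FLOPS_GROWTH = 3.0   # peak processor FLOPS, per two years
MEM_BW_GROWTH = 1.6  # memory bandwidth, per two years
LINK_BW_GROWTH = 1.4 # interconnect bandwidth, per two years

def compute_gaps(years: float) -> tuple[float, float]:
    """How far compute outpaces memory and interconnect after `years`."""
    periods = years / 2.0
    flops = FLOPS_GROWTH ** periods
    return flops / MEM_BW_GROWTH ** periods, flops / LINK_BW_GROWTH ** periods

for years in (2, 6, 10):
    mem_gap, link_gap = compute_gaps(years)
    print(f"After {years:2d} years: compute outpaces memory {mem_gap:.1f}x, "
          f"interconnect {link_gap:.1f}x")
```

Within a decade the imbalance grows by more than an order of magnitude, which is why faster processors alone cannot close the gap.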
Breaking the Bottleneck: The Promise of CXL and Advanced ECC
Compute Express Link (CXL) is emerging as a key solution to the memory wall. By enabling low-latency, coherent communication between CPUs, GPUs, and memory, CXL allows systems to pool and flexibly allocate memory resources. This eliminates the need for overprovisioned local memory in each server and reduces idle capacity.
CXL-attached memory modules with enhanced ECC (Error Correction Code) capabilities bring another level of efficiency. These modules offer higher effective capacity without compromising reliability or requiring exotic DIMM technologies, effectively reducing system cost per gigabyte. By enabling larger and shared memory pools, AI workloads can be run more efficiently, reducing the number of GPU nodes needed and the total energy consumed.
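A rough capacity model shows why pooling pays off. In the Python sketch below, the server count and per-server memory demands are hypothetical; the point is that local DRAM must be sized for each server's worst case, while a CXL-attached pool only needs to cover the aggregate demand.

```python
# Hypothetical sizing comparison: per-server local DRAM vs. a CXL memory pool.
import random

random.seed(42)
N_SERVERS = 100
# Simulated per-server demand in GB: a common baseline with occasional spikes.
demands = [random.choice([128, 128, 128, 256, 512]) for _ in range(N_SERVERS)]

# Local DRAM: each server is provisioned for the worst case it might hit.
local_total_gb = max(demands) * N_SERVERS

# CXL pool: provision for the aggregate demand plus a 15% safety margin.
pooled_total_gb = int(sum(demands) * 1.15)

print(f"Local provisioning : {local_total_gb:,} GB")
print(f"Pooled provisioning: {pooled_total_gb:,} GB")
print(f"DRAM reduction     : {100 * (1 - pooled_total_gb / local_total_gb):.0f}%")
```

Less provisioned DRAM means fewer DIMMs drawing power and refreshing idle capacity, savings that compound with the cost-per-gigabyte benefit of enhanced-ECC modules.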
Smarter Storage: Cutting Latency and Power at the Source
In AI pipelines, data storage often becomes a second bottleneck—especially when large models must frequently read and write vast datasets. Fortunately, innovations in SSD technology are stepping up to meet this challenge.
One of the most promising developments is the integration of hardware-based write reduction and transparent data compression directly into the SSD controller. These hardware state machines process data inline as it is written or read, providing a vastly more performant, efficient, and scalable means of compressing data than traditional CPU-based software compression. This has two critical benefits: it accelerates data transfer and reduces the total power required per operation.
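The Python sketch below uses software compression (zlib) purely to illustrate the data-reduction arithmetic; the drives described above do this inline in dedicated hardware, and the sample page here is synthetic, so real ratios will vary with the data.

```python
# Software stand-in for inline SSD compression. Production controllers do
# this in hardware state machines; zlib here only illustrates the math.
import zlib

# A synthetic 4 KiB "database page": structured, repetitive content
# compresses well. Encrypted or already-compressed data would not.
page = (b"user_id,event,timestamp\n" + b"1001,click,1700000000\n" * 200)[:4096]

compressed = zlib.compress(page, level=6)
ratio = len(page) / len(compressed)

print(f"Logical write : {len(page):,} bytes")
print(f"Physical write: {len(compressed):,} bytes")
print(f"Reduction     : {ratio:.1f}x")  # mixed real-world data compresses far
                                        # less, often in the 2x-4x range
```

Every byte not written is a byte that consumes no NAND program energy, adds no write amplification, and occupies no flash that must later be garbage-collected.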
By compressing data at the point of storage, these intelligent SSDs reduce the burden on upstream components: fewer processor cycles are consumed, and tasks complete faster. The result is a virtuous cycle of energy savings and performance gains at the component, system, and data center levels.
For example, in high-performance training clusters, even small reductions in per-SSD power draw can add up to significant energy savings when scaled across tens of thousands of drives. Less heat generated means lower cooling requirements—a significant consideration given that cooling can account for up to 40% of data center energy use, according to the IEA.
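A back-of-the-envelope calculation makes the point. All of the figures below (drive count, per-drive savings) are assumptions for illustration, and the cooling multiplier is a simplification of the IEA's upper-bound share.

```python
# Hypothetical fleet-scale arithmetic: small per-drive savings compound.
N_DRIVES = 50_000            # drives in a large training estate (assumed)
WATTS_SAVED_PER_DRIVE = 2.0  # per-drive savings from offloading (assumed)
COOLING_FACTOR = 0.40        # cooling share of energy use (IEA upper bound)
HOURS_PER_YEAR = 8_760

it_savings_kwh = N_DRIVES * WATTS_SAVED_PER_DRIVE * HOURS_PER_YEAR / 1_000
# Simplification: treat avoided heat as proportionally avoided cooling load.
total_savings_kwh = it_savings_kwh * (1 + COOLING_FACTOR)

print(f"IT-load savings  : {it_savings_kwh:,.0f} kWh/year")
print(f"Including cooling: {total_savings_kwh:,.0f} kWh/year")
```

Even under these modest assumptions, the fleet saves more than a gigawatt-hour per year.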
Security Meets Efficiency: Caliptra and CXL
As compute and memory resources become more distributed and interconnected via CXL, the need for robust security increases. Enter Caliptra, an open-source silicon root-of-trust (RoT) initiative developed jointly by industry leaders to standardize secure boot and attestation for CXL and other next-generation system interconnects.
Caliptra ensures that all connected components, whether memory or accelerator modules, can be authenticated and verified before being allowed access to system resources. This reduces the risk of supply chain attacks and strengthens the overall security posture—an often overlooked but energy-relevant issue. Secure systems are resilient systems, and breaches not only risk data loss but often require full-system scans, data recovery operations, and costly downtime—all of which burn energy and hurt sustainability metrics.
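At its core, attestation means measuring a component's firmware and comparing the measurement against known-good values before the component is admitted to the system. The Python sketch below illustrates that idea generically; it is not Caliptra's actual interface or protocol, and the component names and firmware images are invented.

```python
# Generic measured-boot/attestation sketch. Illustrative only: this is NOT
# Caliptra's API, just the underlying pattern of a hardware root of trust.
import hashlib

# Hypothetical known-good firmware digests (in practice, vendor-signed).
REFERENCE_MEASUREMENTS = {
    "cxl-memory-module": hashlib.sha384(b"memory-fw-v1.2").hexdigest(),
    "gpu-accelerator":   hashlib.sha384(b"accel-fw-v3.0").hexdigest(),
}

def attest(component: str, firmware_image: bytes) -> bool:
    """Measure the firmware and compare it against the reference value."""
    measured = hashlib.sha384(firmware_image).hexdigest()
    return REFERENCE_MEASUREMENTS.get(component) == measured

# An authentic image passes; a tampered image is denied system access.
print(attest("cxl-memory-module", b"memory-fw-v1.2"))   # True  -> admit
print(attest("cxl-memory-module", b"memory-fw-v1.2!"))  # False -> quarantine
```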
Scaling Sustainably: A New Architecture for AI
The data center of tomorrow must be built on a foundation of energy-aware architecture. This means:
• Pooling memory across servers via CXL to reduce redundancy and improve utilization.
• Using advanced SSDs with built-in compression to lower both compute overhead and energy use.
• Deploying domain-specific processing that matches the task to the right engine—whether it’s a GPU, a CPU, or a specialized compression core inside an SSD.
• Embedding security at the hardware level to prevent breaches and reduce costly system remediation.
These are not speculative ideas. They are proven strategies already being implemented by forward-thinking organizations, including contributors to the Caliptra initiative and adopters of memory-enhancing ECC solutions.
Final Thoughts: Powering the Future of AI
Balancing AI’s potential with the realities of power-hungry infrastructure requires a new way of thinking—one where every component in the data pipeline, from memory to storage to security, contributes to performance and sustainability.
By investing in scalable, efficient, and secure technologies such as CXL-enabled memory, compression-capable SSDs, and hardware-level security like Caliptra, we can ensure that our infrastructure evolves not only to keep up with AI—but to do so responsibly.
—JB Baker is vice president of Products at ScaleFlux, a data infrastructure and technology company. Baker is a technology business leader with a 20-plus-year track record of driving top- and bottom-line growth through new products for enterprise and data center storage.