There’s a looming global computing capacity crunch that cannot be sustainably addressed the way we’re doing things right now.
Simply put, between artificial intelligence (AI) models growing exponentially and an ongoing global digital transformation, data centres are running out of space. Their vacancy rates are hitting record lows and prices are rising in response to demand, which is cause for much unease among tech leaders.
If this trend continues, at some point, we will reach a juncture where we can no longer accomplish all the things that technology theoretically allows us to do, because our capacity to process data will be constrained.
Perhaps the biggest worry is that AI’s transformative potential, which we’re only just beginning to tap into, will be throttled by purely physical constraints. This will hinder new discoveries and the development of more advanced machine learning (ML) models, which is bad news for all, except AI apocalypse alarmists.
Is there any way to avoid the computing capacity crisis? Since massively scaling back our computational demands isn’t really an option, the only alternative is to significantly boost capacity, which boils down to two available courses of action: build more data centres and develop better digital infrastructure.
But that’s easier said than done — here’s why.
Why more data centres isn’t the answer
Until now, increasing demand for computing capacity has been met, in part, by building more data centres, with conservative estimates putting the growth in real estate taken up by data centres at ~40% per year. You can expect that figure to remain fairly steady, as supply issues, power challenges, and construction delays are severely limiting capacity expansion.
In other words, today, demand cannot simply be met by ramping up data centre construction.
Nor should that be something we aspire to. Each of these football-field-sized warehouses gobbles up gargantuan amounts of energy and water, placing severe strain on the environment, both locally and globally. A single data centre can consume as much electricity and water as 50,000 homes and the cloud’s carbon footprint already exceeds that of the aviation industry.
Credit where credit is due: data centres have come a long way in minimising their environmental impact. This is in large part thanks to a fierce sustainability race, which has propelled innovation, particularly as it relates to cooling and energy efficiency. Nowadays, you’ll find data centres in underground mines, in the sea, and in other places that offer natural cooling, such as fjords with cold water flows, all to reduce energy and water consumption.
The trouble is, this isn’t scalable globally, nor is boiling our seas a viable path forward. Erecting more data centres — no matter how efficient — will continue to wreak havoc on local ecosystems and impede national and international sustainability efforts. All while still failing to meet the demand for compute resources.
Still, two chips are better than one, unless…
Think inside the box
… unless that single chip operates at twice the speed. To avoid the capacity crunch, all hopes rest on improving the digital infrastructure, namely, the chips, the switches, the wires, and other components that can improve data speeds and bandwidth while consuming less energy.
Let me reiterate — the evolution of AI depends on finding ways to transfer more data, without using more energy.
Essentially, this means two things. First, the development of more powerful and AI-centric chips. Second, the enhancement of data transfer speeds.
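To make the "more data, without more energy" constraint concrete, here is a minimal arithmetic sketch. It relies on one identity: the average power of a link equals its data rate multiplied by the energy spent per bit. The link speed and energy figures are illustrative assumptions, not measurements from any real system.

```python
# Illustrative arithmetic only: all figures below are assumptions.

def link_power_watts(data_rate_bps: float, energy_per_bit_joules: float) -> float:
    """Average link power: bits per second times joules per bit."""
    return data_rate_bps * energy_per_bit_joules


def required_energy_per_bit(power_budget_watts: float, data_rate_bps: float) -> float:
    """Energy per bit that keeps a link within a fixed power budget."""
    return power_budget_watts / data_rate_bps


baseline_rate = 400e9      # hypothetical 400 Gb/s link
baseline_energy = 5e-12    # hypothetical 5 pJ per bit
budget = link_power_watts(baseline_rate, baseline_energy)

# Double the data rate while holding the power budget constant.
doubled_rate = 2 * baseline_rate
needed = required_energy_per_bit(budget, doubled_rate)

print(f"Power budget: {budget:.2f} W")
print(f"Energy per bit needed at double the rate: {needed * 1e12:.2f} pJ")
```

With these assumed figures, the budget works out to 2 W, and doubling the data rate within that budget requires halving the energy per bit to 2.5 pJ.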
1. Designing custom chips for AI
Existing digital infrastructure isn’t particularly well suited for the efficient development of ML models. General-purpose central processing units (CPUs), which continue to be the primary computing components in data centres, struggle with AI-specific tasks due to their lack of specialisation and computational efficiency.
When it comes to AI, graphics processing units (GPUs) fare much better thanks to better processing power, higher energy efficiency, and parallelism. That’s why everyone’s snatching them up, which has led to a chip shortage.
Yet GPUs inevitably hit the same brick wall. They’re not inherently optimised for AI tasks, leading to energy waste and suboptimal performance in handling the increasingly intricate and data-intensive demands of modern AI applications.
That’s why companies such as IBM are designing chips tailored to AI’s computational demands that promise to squeeze out the most performance while minimising energy consumption and space.
2. Improving data transfer capacity
No modern AI model operates on a single chip. Instead, to get the most out of available resources, you assemble multiple chips into clusters. These clusters often form a part of larger networks, each designed for specific tasks.
Accordingly, the interconnect, or the system facilitating communication between chips, clusters, and networks, becomes a critical component. Unless it can keep up with the speed of the rest of the system, it risks being a bottleneck that hinders performance.
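The bottleneck effect can be illustrated with a simplified model. The sketch below assumes that computation and data transfer overlap perfectly, so each step takes as long as the slower of the two. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def step_time(compute_s: float, data_bytes: float, link_bytes_per_s: float) -> float:
    """Time for one step, assuming compute and transfer overlap perfectly.

    Under that assumption, the slower of the two sets the pace.
    """
    transfer_s = data_bytes / link_bytes_per_s
    return max(compute_s, transfer_s)


# Hypothetical workload: 10 ms of compute and 1 GB exchanged per step.
compute_s = 0.010
data_bytes = 1e9
link = 50e9  # assumed 50 GB/s interconnect

baseline = step_time(compute_s, data_bytes, link)
faster_chip = step_time(compute_s / 2, data_bytes, link)   # chip twice as fast
faster_link = step_time(compute_s, data_bytes, 2 * link)   # link twice as fast

print(f"baseline:    {baseline * 1e3:.1f} ms")
print(f"faster chip: {faster_chip * 1e3:.1f} ms")
print(f"faster link: {faster_link * 1e3:.1f} ms")
```

In this toy setup, the transfer takes 20 ms against 10 ms of compute, so doubling chip speed leaves the step time unchanged, while doubling link speed halves it. Real systems overlap compute and communication only partially, so the actual effect would differ.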
The challenges for data transfer devices mirror those for chips: they must operate at high speeds, consume minimal energy, and occupy as little physical space as possible. With traditional electrical interconnects fast reaching their limits in terms of bandwidth and energy efficiency, all eyes are on optical computing — and silicon photonics, in particular.
Unlike electrical systems, optical systems use light to transmit information, providing key advantages in the areas that matter — photonic signals can travel at the speed of light and carry a higher density of data. Plus, optical systems consume less power and photonic components can be much smaller than their electrical counterparts, allowing for more compact chip designs.
The operative words here are “can be.”
The growing pains of cutting-edge tech
Optical computing, while extremely fast and energy-efficient, currently faces challenges in miniaturisation, compatibility, and cost.
Optical switches and other components can be bulkier and more complex than their electronic counterparts, leading to challenges in achieving the same level of miniaturisation. As of now, we are yet to find materials that are both effective as an optical medium and scalable for high-density computing applications.
Adoption would also be an uphill battle. Data centres are generally optimised for electronic, not photonic, processing, and integrating optical components with existing electronic architectures poses a major challenge.
Plus, just like any cutting-edge technology, optical computing has yet to prove itself in the field. There is a critical lack of research into the long-term reliability of optical components, particularly under the high-load, high-stress conditions typical of data centre environments.
And to top it all off — the specialised materials required in optical components are expensive, making widespread adoption potentially cost-prohibitive, especially for smaller data centres or those with tight budget constraints.
So, are we moving fast enough to avoid the crunch?
Probably not. And definitely not fast enough to stop building data centres in the short term.
If it’s any consolation, scientists and engineers are very aware of the problem and are working hard to find solutions that won’t destroy the planet. They are constantly pushing the boundaries and making significant advances in data centre optimisation, chip design, and all facets of optical computing.
My team alone has broken three world records in symbol rate for data centre interconnects using an intensity modulation and direct detection approach.
But there are serious challenges, and it’s essential to address them head-on for modern technologies to realise their full potential.
Professor Oskars Ozoliņš received his Dr.sc.ing. degree in optical communications from Riga Technical University (Latvia) in 2013 and received a habilitation degree in physics with a specialization in optical communication from KTH Royal Institute of Technology in 2021. He is the author of around 270 international journal publications, conference contributions, invited talks/tutorials/keynotes/