Matryoshka Dolls and Data Centers


Over the past year, there has been much debate on the energy efficiency of large data centers and how best to measure that efficiency. Power Usage Effectiveness (PUE) has emerged as the de facto standard, where PUE is defined as the ratio of power entering the facility to power drawn by the computing equipment. Intuitively, the ideal PUE is unity (i.e., all incoming power is used by the computing equipment, with no overhead for cooling or energy distribution). For details on PUE and data center efficiency, I encourage you to read the documents from The Green Grid, which has promulgated and evangelized these measurement standards. (Truth in advertising: Microsoft is a leading member of The Green Grid.)
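To make the definition concrete, here is a minimal sketch of the PUE ratio in Python. The function names and the sample power readings are illustrative assumptions, not figures from The Green Grid or from any real facility.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: power entering the facility divided by
    power drawn by the computing equipment (both in the same units)."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("facility power cannot be less than IT equipment power")
    return total_facility_kw / it_equipment_kw


def overhead_fraction(pue_value: float) -> float:
    """Fraction of incoming power that never reaches the computing equipment
    (cooling, power distribution losses, and so on)."""
    return 1.0 - 1.0 / pue_value


# Hypothetical readings: 2,000 kW enters the building, 1,000 kW reaches the servers.
legacy = pue(2000.0, 1000.0)
print(legacy, overhead_fraction(legacy))
```

With these assumed readings the PUE is 2.0, which means half of the incoming power is overhead. A PUE of exactly 1.0 would correspond to an overhead fraction of zero.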

Legacy and Modern

Many legacy data centers, those built more than a few years ago, have PUEs in excess of two, or even three. This is largely due to inefficient computer room air-conditioning (CRAC) units, lack of hot and cold aisles, energy losses from multiple (unnecessary) voltage conversions, and aging or inappropriate building designs. If your “data center” is the nearest available space to your laboratory (a retrofitted janitor’s closet) and is cooled by two box fans from Walmart, odds are you are not a paragon of PUE virtue, even if your aggregate computing power is small.

Today, state-of-the-art data centers have PUEs below 1.5, and there are new designs that could approach a PUE of one by reducing UPS support where appropriate, operating at substantially higher temperatures, and exploiting ambient cooling. Many people do not realize that computing hardware is much more resilient to high temperature than history and practice would suggest. It need not be chilled to temperatures suitable for polar bears.

Last year, Microsoft’s Christian Belady illustrated this point by operating a group of servers in a tent for over six months, with zero failures. Although not a statistically valid sample, the experiment did illustrate why ASHRAE and hardware vendors have broadened the temperature and relative humidity envelope of acceptable operation.

More generally, efficient data center design and analysis are much like Matryoshka dolls, because the PUE one obtains depends on where and what one measures. Ideally, the bounding box for measurement should be the entire data center ecosystem, from power distribution and cooling for servers to data center networks to operations and support. Equally important, a PUE near one does not mean the computing system itself is efficient and well-matched to the offered load, or that the computations are either necessary or useful. Choose wisely and measure thoughtfully and carefully.
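The nesting effect can be sketched with a few lines of Python. The measurement points and every reading below are hypothetical, chosen only to show how the same facility yields different PUE values depending on where the "computing equipment" boundary is drawn.

```python
# Hypothetical power readings (kW) at nested measurement points.
# All names and numbers are illustrative assumptions, not measured data.
readings_kw = {
    "utility_feed": 1500.0,       # everything entering the site
    "ups_output": 1100.0,         # power leaving the UPS toward the IT load
    "rack_input": 1050.0,         # after power distribution losses
    "server_components": 950.0,   # excluding server fans and power supply losses
}


def pue_at(boundary: str, readings: dict) -> float:
    """PUE when the 'computing equipment' power is taken at the given boundary."""
    return readings["utility_feed"] / readings[boundary]


for boundary in ("ups_output", "rack_input", "server_components"):
    print(f"{boundary}: PUE = {pue_at(boundary, readings_kw):.2f}")
```

Under these assumptions, moving the boundary inward (closer to the processors) raises the reported PUE, because losses that were previously counted as useful computing power become overhead. Comparing PUE figures is meaningful only when the measurement boundaries match.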

Appearance and Utility

Finally, I would be remiss if I did not opine on the most obvious visual difference between cloud data centers and high-performance computing (HPC) facilities. The former are designed for function, not appearance. They are usually nondescript facilities optimized for efficient hardware operation at large scale, not for human accessibility or comfort. Indeed, container-based data centers look more like a warehouse and distribution center with parking and utility connections than Hollywood’s idea of a computing center. Conversely, HPC facilities are usually showpieces with signs, elegant packaging, and lighted spaces suitable for tours by visiting dignitaries.

At large scale, efficient trumps pretty. It’s all about what one measures.

