When Every Watt Has to Earn Its Keep, Airflow Stops Being Free

Jun 17

Keep It Cool — Practical insights on data center airflow and cooling performance.

Last week at the 7x24 Spring Conference, Peter Panfil gave a keynote called "Scale at Speed." It was a sharp talk, focused on the future of the AI data center and the next iterations coming down the line: liquid-cooled buildings, high-density GPU deployments, 800 V DC pushed deep into the room, GPU load swings in fractions of a second, and 6-megawatt designs built and tested before they ever ship to site. It was a look at the part of the industry most operators know is coming, even if it isn't the room they walk into on Monday morning.

Then he made the comment that stuck with me, so much so that our team got into a real debate about it at our Monday morning stand-up (always fun!).

Panfil said he believes that tokens per dollar per watt will replace PUE as the metric that matters.

For a hyperscale data center, that makes sense. The product is AI compute, measured in tokens, and the thing limiting how much they can produce is power. So the business case depends on how much compute they get out of every watt. But most data center operators don’t run that type of facility. They are running already-built facilities, often air-cooled, often full, and often asked to support new loads inside a building designed years before anyone was planning around AI racks. They are not choosing the next GPU roadmap or the next power architecture. They are trying to keep the room stable, make good use of the cooling they already have, and explain to leadership what can be done without turning every issue into a capital project.

I think that’s where Panfil’s point becomes useful outside the hyperscale environment.

If the new efficiency conversation is about whether a watt produced useful work, then an air-cooled room has its own version of the same problem. The question is whether the cooling watt reached the rack, passed through the equipment, and carried heat away, or whether it moved around the room without doing much of anything useful.

PUE gets quiet at the rack

PUE is useful because it gives teams a simple way to compare total facility energy against IT equipment energy, and for years that helped push the industry toward better power and cooling discipline. The problem is what PUE averages away. Cooling, fans, pumps, power conversion, and losses all get lumped onto the facility side of the ratio. That keeps the number clean, but it also means a lot of waste disappears into one bucket. A room can post a respectable PUE while cold air slips past the racks and hot exhaust loops back into the intakes.
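To make that averaging concrete, here is a minimal sketch in Python. The two rooms, their meter readings, and the `bypass_fraction` field are hypothetical numbers chosen for illustration, not measurements from any real site. PUE is total facility power divided by IT power, so two rooms with the same meter readings get the same score no matter how much supply air actually passes through the servers.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_kw


# Hypothetical rooms: identical meter readings, very different airflow.
# bypass_fraction is an illustrative share of supply air that never
# passes through a server. PUE has no input for it.
rooms = {
    "tight room": {"facility_kw": 750.0, "it_kw": 500.0, "bypass_fraction": 0.10},
    "leaky room": {"facility_kw": 750.0, "it_kw": 500.0, "bypass_fraction": 0.40},
}

scores = {name: pue(r["facility_kw"], r["it_kw"]) for name, r in rooms.items()}

for name, score in scores.items():
    print(f"{name}: PUE {score:.2f}, bypass {rooms[name]['bypass_fraction']:.0%}")
```

Both rooms report the same PUE because the ratio itself carries no term for where the air went.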

That is the part operators know from experience. The room average looks calm, the BMS shows acceptable conditions, and the rack face tells a different story. PUE will not tell you that the top of a cabinet is warmer than the bottom. It will not show you that supply air is slipping through open rack space instead of through servers. It will not flag a hot spot that exists because air is not getting to the load, even when the cooling plant still has capacity. These are airflow problems, and in an air-cooled room, airflow problems are where watts can go to waste without looking dramatic on a report.

Cost of airflow waste

Airflow waste has always cost money. That part is not new. What's new is who's paying attention.

It used to be the operator's problem to manage quietly. Now it shows up at the hyperscale level, in the same conversation as GPUs and power contracts, because racks pull more power than they used to and there isn't enough power to go around. So when the cooling isn't reaching the equipment, that gap is obvious and expensive.

Tokens per watt asks the question PUE skipped. Did the watt do work? A watt spent moving air that never touched a server is a watt that produced nothing, and that standard doesn't stop at the AI factory. It starts there and works its way down to every room where power and cooling margin matter.

What this looks like at the rack

A rack doesn't feel the room average. It feels the air arriving at its intakes, at the bottom, middle, and top of the cabinet, under whatever load is running right then. In older rooms that grew one rack at a time, that local picture can swing a lot more than the room-level number suggests.

Rack-level readings turn a vague worry into something specific you can hand to leadership. Three checks do most of the work.

Delta-T across the rack. Delta-T is just the temperature difference between the air going into the gear and the air coming out the back. When that gap is smaller than it should be, the air is finding an easy path around the equipment instead of through it. The cooling runs fine. The rack isn't getting the benefit.
Inlet temperature, bottom to top. Read the intake air at the bottom of the cabinet, then the top. A warm top over a cool bottom usually means delivery is uneven, or hot exhaust is curling back over the top of the rack and getting pulled into the intakes. The room average will hide this every time.
A hot spot while the room still has cooling headroom. If the plant has capacity to spare and a rack is still running hot, you don't have a cooling shortage. The air just isn't getting there.

That last one is the whole game. It separates a capacity problem from a delivery problem, and the difference is money. A capacity problem ends in a purchase order. A delivery problem usually ends with blanking panels, a tile moved, a containment gap closed, or a setpoint walked back. Cheap fixes, no capital request.
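The three checks, and the split between a capacity problem and a delivery problem, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a diagnostic tool: the `RackReading` layout, the choice to average the three inlet readings for delta-T, the flag names, and all three thresholds are placeholders to be replaced with your own equipment specifications and site standards.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RackReading:
    """One rack's temperatures in degrees Celsius (hypothetical layout)."""
    inlet_bottom_c: float
    inlet_middle_c: float
    inlet_top_c: float
    exhaust_c: float


def delta_t(r: RackReading) -> float:
    """Exhaust minus mean inlet. Averaging the three inlets is an assumption."""
    mean_inlet = (r.inlet_bottom_c + r.inlet_middle_c + r.inlet_top_c) / 3
    return r.exhaust_c - mean_inlet


def inlet_spread(r: RackReading) -> float:
    """Top inlet minus bottom inlet; a warm top over a cool bottom is positive."""
    return r.inlet_top_c - r.inlet_bottom_c


def triage(
    r: RackReading,
    plant_has_headroom: bool,
    min_delta_t_c: float = 8.0,   # placeholder threshold, not a standard
    max_spread_c: float = 3.0,    # placeholder threshold, not a standard
    max_inlet_c: float = 27.0,    # placeholder; use your own inlet limit
) -> List[str]:
    """Return a list of flags for the three rack-level checks."""
    flags = []
    if delta_t(r) < min_delta_t_c:
        flags.append("low_delta_t")  # air may be bypassing the equipment
    if inlet_spread(r) > max_spread_c:
        flags.append("uneven_inlet")  # uneven delivery or recirculation
    hottest_inlet = max(r.inlet_bottom_c, r.inlet_middle_c, r.inlet_top_c)
    if hottest_inlet > max_inlet_c:
        # Hot rack with spare plant capacity points at delivery, not capacity.
        flags.append(
            "likely_delivery_problem" if plant_has_headroom
            else "possible_capacity_problem"
        )
    return flags


# Hypothetical reading: cool bottom, warm top, narrow delta-T.
reading = RackReading(inlet_bottom_c=21.0, inlet_middle_c=24.0,
                      inlet_top_c=30.0, exhaust_c=31.0)
print(triage(reading, plant_has_headroom=True))
```

With the sample reading, all three checks fire, and because the plant still has headroom the hot inlet is flagged as a likely delivery problem rather than a capacity one.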

The shift is already moving

While the “tokens per watt” discussion usually refers to AI infrastructure, I think the discipline behind it belongs in any room where power is tight and cooling margin matters. You don't need a new number on the wall to use it. You need to see whether the cooling you already pay for is reaching the rack. Before the next efficiency conversation turns into a debate about major upgrades, get a rack-level picture of the room you have now. The fixes may be smaller than you expect, and the case is easier to defend when it starts with measured conditions instead of a guess. If you want a faster read on where you stand, start with the Cooling Risk Quiz.

About the Author

Aheli Purkayastha is Chief Product Officer at Purkay Labs, where she works on the tools, reports, and processes operators use to make sense of what their cooling is doing at the rack. Her focus is turning rack-level temperature and airflow data into something a team can act on quickly and defend to leadership. She believes cooling decisions should start with measured conditions, not room averages or old design assumptions.

About Purkay Labs

Purkay Labs helps data center operators see what their cooling is doing at the rack. Our portable thermal assessments collect rack-level temperature and airflow data, then turn it into heat maps and a practical report that shows where cooling is reaching the equipment, where it is missing the load, and what to check next. It is a fast way to benchmark the room as it is today, especially before a layout change, capacity discussion, or leadership review.
