Case Study: Overheating Issues in a Hyperscale Data Center PSU Room
In early 2025, Purkay Labs was brought in to investigate a recurring temperature problem inside a newly commissioned large-scale data center. Although the facility was state-of-the-art and purpose-built for high-density compute loads, the operators were seeing unexpected temperature alarms in one of the power rooms. The cause wasn’t immediately obvious, and that's where targeted airflow analysis came into play.
The Facility
The customer had recently completed construction of a hyperscale data center designed to support large-scale compute clusters, including liquid-cooled GPU servers optimized for AI and other high-performance workloads. As tenant pods came online and additional server loads were energized, minor environmental issues began to surface in secondary spaces outside the primary data halls.
In this facility, while liquid cooling managed the server loads, the power infrastructure—including transformer rooms, switchgear, and PSU rooms—relied on more traditional air-based cooling systems to maintain safe equipment operating conditions.
The Problem: Temperature Alarms with No Clear Cause
As one tenant expanded operations and energized more of their GPU racks, the operations team began receiving temperature alarms in one of the Power Supply Unit (PSU) rooms. These alarms were triggered by the building management system (BMS), but by the time staff responded, temperatures had already normalized. The intermittent nature of the alarms made it difficult to diagnose whether this was simply a nuisance or an early sign of a more significant airflow imbalance.
After several cycles of alarms without a clear root cause, the facility engaged Purkay Labs to provide an independent environmental assessment.
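Short-lived excursions like these are easy to miss on site but straightforward to pick out of a logged temperature series. The following is a minimal sketch, assuming readings taken at a fixed interval and a single alarm threshold. The function name, the 60-second default, and the threshold used in any call are hypothetical; the case study does not state the BMS alarm setpoint.

```python
def exceedance_episodes(readings, threshold, interval_s=60):
    """Find runs of consecutive readings strictly above a threshold.

    Returns a list of (start_index, duration_seconds) tuples, one per run.
    `interval_s` is the assumed fixed spacing between readings.
    """
    episodes = []
    start = None  # index where the current run began, if any
    for i, value in enumerate(readings):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended on the previous reading.
            episodes.append((start, (i - start) * interval_s))
            start = None
    if start is not None:
        # The series ended while still above the threshold.
        episodes.append((start, (len(readings) - start) * interval_s))
    return episodes
```

Episodes that last only a few minutes would be consistent with alarms that clear before anyone reaches the room.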
Purkay Labs Deployment
Example Layout.
Purkay Labs began with a full walkthrough of the PSU room layout and existing monitoring points. While the facility’s BMS included a limited number of fixed sensors, there wasn’t enough spatial or vertical coverage to fully characterize airflow behavior near critical equipment.
To capture higher-resolution data, Purkay Labs deployed its portable Audit-Buddy monitoring system:
Transformer zones: 2 stands placed at either end
Switchgear rows: 4 stands
UPS units: 16 stands (covering 4 UPS units with 4 stands each)
PDUs: 4 stands total
Each stand was equipped with sensors positioned at multiple elevations (6", 25", 50") to capture vertical thermal stratification. The Audit-Buddy system was configured for a 24-hour LongScan, collecting temperature and humidity readings every 60 seconds to build a comprehensive profile across full operational cycles.
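As a quick check on the scale of this deployment, the stand counts and sampling settings above can be tallied in a few lines of Python. This is a minimal sketch, not part of the Audit-Buddy software: it assumes one sensor per elevation on every stand and that every sensor logs at each 60-second interval, and the names `STANDS_PER_ZONE` and `scan_summary` are hypothetical.

```python
# Stand counts per zone, as listed in the deployment description.
STANDS_PER_ZONE = {
    "transformer": 2,
    "switchgear": 4,
    "ups": 16,  # 4 UPS units with 4 stands each
    "pdu": 4,
}
SENSOR_HEIGHTS_IN = (6, 25, 50)  # sensor elevations per stand, in inches
SCAN_HOURS = 24
SAMPLE_INTERVAL_S = 60


def scan_summary(stands_per_zone=STANDS_PER_ZONE,
                 heights=SENSOR_HEIGHTS_IN,
                 hours=SCAN_HOURS,
                 interval_s=SAMPLE_INTERVAL_S):
    """Tally stands, measurement points, and samples for one scan.

    Assumes one sensor per elevation per stand (an assumption, not a
    documented specification of the monitoring hardware).
    """
    total_stands = sum(stands_per_zone.values())
    measurement_points = total_stands * len(heights)
    samples_per_point = hours * 3600 // interval_s
    return {
        "total_stands": total_stands,
        "measurement_points": measurement_points,
        "samples_per_point": samples_per_point,
        "total_samples": measurement_points * samples_per_point,
    }
```

Under those assumptions the deployment works out to 26 stands and 78 measurement points, each logging 1,440 samples over the 24-hour scan.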
What the Data Revealed
Example of a Purkay Labs Map
Analysis of the 24-hour scan surfaced several key findings:
Transformer and switchgear zones remained stable with no significant temperature fluctuations.
The UPS zone, specifically the area around UPS No. 3, showed elevated temperatures that aligned with the BMS alarm incidents.
PDUs exhibited minor increases, but not to a degree that contributed meaningfully to the overall issue.
Time-stamped data and vertical temperature maps confirmed that thermal buildup was localized around UPS No. 3 and coincided with elevated electrical load activity as recorded by the facility’s BMS.
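One simple way to express this kind of vertical comparison is a top-minus-bottom temperature difference per stand. The sketch below is illustrative only: the stand names and readings are made-up placeholder values, not data from this assessment, and `stratification_by_stand` and `hottest_stand` are hypothetical helpers.

```python
def stratification_by_stand(readings):
    """Return {stand: temperature at highest sensor minus lowest sensor}.

    `readings` maps stand name -> {sensor height in inches: temperature}.
    """
    deltas = {}
    for stand, by_height in readings.items():
        low, high = min(by_height), max(by_height)
        deltas[stand] = by_height[high] - by_height[low]
    return deltas


def hottest_stand(readings):
    """Return the stand whose warmest sensor reads highest."""
    return max(readings, key=lambda stand: max(readings[stand].values()))


# Placeholder snapshot (synthetic values for illustration only).
EXAMPLE_SNAPSHOT = {
    "ups3_stand_a": {6: 24.0, 25: 27.5, 50: 31.0},
    "ups2_stand_a": {6: 23.5, 25: 24.0, 50: 25.0},
}
```

A stand with both a large vertical difference and the highest peak reading is a natural first place to look for trapped exhaust heat.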
The data made clear that the alarms were not random, but were tied to predictable patterns of load-dependent heat accumulation.
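A load-dependent pattern of this kind can be tested by correlating each sensor's temperature series with the electrical load over the same timestamps. The following is a rough sketch under stated assumptions: both series are already aligned sample for sample, the data shown is synthetic, and the sensor names and the `rank_sensors_by_load_coupling` helper are hypothetical rather than part of any Purkay Labs or BMS tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(var_x * var_y)


def rank_sensors_by_load_coupling(temps_by_sensor, load):
    """Sort sensors by how strongly their temperature tracks the load."""
    scores = {name: pearson(series, load)
              for name, series in temps_by_sensor.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic, illustrative series (not measurements from this facility).
LOAD_PCT = [40, 45, 60, 80, 85, 60, 45, 40]
TEMPS = {
    "ups3_50in": [24.0, 24.5, 26.0, 29.0, 30.0, 27.0, 25.0, 24.0],
    "transformer_a_50in": [23.0, 23.1, 22.9, 23.0, 23.1, 23.0, 22.9, 23.0],
}
```

Calling `rank_sensors_by_load_coupling(TEMPS, LOAD_PCT)` places the load-coupled sensor first. Correlation alone does not prove cause, so a ranking like this is best read alongside the spatial maps.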
The Root Cause: Localized Airflow Restriction
With targeted data in hand, the airflow issue became apparent. UPS No. 3 had been installed near a solid wall, with no overhead supply vent directly above it. As loads increased, heat exhausted by the UPS could not dissipate effectively and instead pooled in the space between UPS No. 2 and No. 3. Without sufficient vertical or lateral airflow pathways, localized temperatures climbed gradually under load until they intermittently crossed the BMS alarm threshold.
Resolution
The facility team quickly implemented a short-term solution by deploying a portable cooling unit to redirect airflow and pull heat away from the problem zone.
A long-term correction followed: minor ductwork modifications introduced targeted overhead airflow directly above UPS No. 3, improving heat removal efficiency during peak load conditions and restoring thermal stability.
The Takeaway
Even in highly engineered environments built for dense compute and liquid cooling, secondary spaces like PSU rooms can present subtle airflow challenges that standard facility monitoring may not fully capture. Airflow behavior is highly sensitive to equipment placement, load profiles, and seemingly minor layout constraints.
Purkay Labs’ Audit-Buddy system provided high-resolution, equipment-level thermal data that allowed the facility to isolate the problem quickly, validate the root cause, and implement targeted corrective actions, all while avoiding unnecessary downtime or costly trial-and-error adjustments.
About Purkay Labs
Purkay Labs helps data center operators and facility managers diagnose airflow and thermal issues that impact equipment reliability. Our portable AUDIT-BUDDY System delivers detailed, rack-level environmental data to identify hot spots, validate airflow performance, and support informed cooling decisions. Whether troubleshooting isolated problems or conducting full-site audits, Purkay Labs provides actionable insight that helps operators protect their infrastructure and optimize performance.