Fixing Hot Spots During COVID-19

Overview

In the midst of the COVID-19 pandemic, a Fortune 50 Online Retailer faced a critical challenge: increased server loads and limited on-site staff threatened server uptime. To address the issue without relying on third-party vendors, the Retailer used its AUDIT-BUDDY™ systems to diagnose and resolve the hot spots within 24 hours.

 

Introduction

During COVID-19, a major Online Retailer had to significantly increase server loads while scaling back on-site staffing and preventative maintenance. In April 2020, the Building Management System (BMS) registered a series of alarm readings in a particularly dense area of the Data Center. However, there weren’t enough sensors in place to narrow down the location of the problem. The Retailer needed to find the hot spots without third-party assistance and without taking up an already-stretched staff’s time. The solution was to use Purkay Labs’ AUDIT-BUDDY™ system to locate the hot spots, diagnose their cause, and prevent future issues.

The Project

Prior to the pandemic, the Retailer’s Data Center was approaching its cooling capacity, and the increased IT load exacerbated existing airflow issues. The Retailer needed to find a way to minimize or eliminate the hot spots to avoid future complications. The Retailer’s Systems Engineer used the AUDIT-BUDDY™ in two ways.

About the Facility:

  • Customer: Fortune 50 Online Retailer

  • Facility: Tier 4 Facility in Dallas, TX

  • Facility Details: 20,000 ft² raised floor with 350 cabinets

Figure 1: QuickScan + Thermal Contour Map

First, the Systems Engineer used the QuickScan mode to take 20-second scans at each cabinet across the area where the BMS alerts occurred, and generated a Thermal Contour Map to show the temperature profile across the aisle (see Figure 1: QuickScan + Thermal Contour Map). The Contour Map showed that the top of Cabinet 202 was significantly warmer than the rest of the aisle. A quick inspection of the physical cabinet showed that the blanking panels were correctly installed and that a fully opened perforated tile sat directly in front of the cabinet. The cooling systems were running at maximum capacity, and there was no way to add more cooling to the room. There was no obvious way to fix the hot spot.


To find the root cause of the hot spot, the Systems Engineer then used AUDIT-BUDDY™’s delta-T scan mode to measure the change in temperature across the cabinet. He placed one AUDIT-BUDDY™ system at the front of the cabinet and one at the back to measure the cabinet delta-T at 6”, 36” and 72” for 24 hours (see Figure 2: Delta-T Mode). He then collected the Computer Room Air Conditioner (CRAC) supply and return temperatures and entered them into Purkay Labs’ Air Performance calculator, which compares the CRAC Supply, CRAC Return, Server Inlet and Server Outlet temperatures to determine how much airflow is reaching the server and how much is being lost.
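As a rough illustration of what a cabinet delta-T reading represents, the sketch below subtracts the front (inlet) temperature from the back (exhaust) temperature at each of the three scan heights. The function name, the data layout, the assumed unit (degrees F) and the sample values are all hypothetical; they are not taken from the AUDIT-BUDDY™ software or from the Retailer’s data.

```python
# Hypothetical sketch: cabinet delta-T at each sensor height.
# Heights match the scan described above; everything else is assumed.
HEIGHTS_IN = (6, 36, 72)  # sensor heights in inches


def cabinet_delta_t(front, back):
    """Return {height: back - front} for each scan height.

    front and back map a height in inches to a temperature reading
    taken at the front (inlet) and back (exhaust) of the cabinet.
    """
    return {h: round(back[h] - front[h], 1) for h in HEIGHTS_IN}


# Made-up sample readings in degrees F (assumed unit, not real data).
front = {6: 68.0, 36: 71.5, 72: 80.0}
back = {6: 88.0, 36: 92.0, 72: 95.0}

delta = cabinet_delta_t(front, back)
for height, dt in delta.items():
    print(f'{height}": delta-T = {dt}')
```

In this made-up sample the top of the cabinet has the warmest inlet and the smallest delta-T, which is the kind of pattern that would prompt a closer look at recirculation near the top of a cabinet.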

 
Figure 2: Delta-T Mode

How does the Air Performance / Delta-T Calculator work?

Figure 3: Ideal Airflow

Figure 4: Realistic Airflow

In an ideal (raised floor) scenario, there would be a closed-loop cooling pattern, where all CRAC-supplied cold air goes to the cabinet and all exhaust air goes back to the CRAC unit (see Figure 3: Ideal Airflow). In reality, some cold air escapes through gaps in the floor and returns to the CRAC without passing through a server (bypass airflow), and some hot exhaust air returns to the server inlet (recirculation airflow), resulting in hot spots or overcooled areas (see Figure 4: Realistic Airflow).

By looking at four temperature values (CRAC supply, server inlet, server exhaust and CRAC return), you can diagnose how effectively your cold air is being used.
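One simple way to turn those four temperatures into numbers is a two-stream mixing model. It assumes the CRAC return air is a blend of server exhaust and bypassed supply air, and the server inlet air is a blend of supply air and recirculated exhaust. The sketch below follows that assumption; it is not confirmed to be the formula used by Purkay Labs’ calculator, and the function names and sample temperatures are hypothetical.

```python
# Minimal sketch of a two-stream mixing model (an assumption, not the
# vendor's documented method). Temperatures must share one unit.

def bypass_fraction(crac_supply, server_exhaust, crac_return):
    """Share of CRAC return air that is supply air which skipped the servers.

    Model: crac_return = b * crac_supply + (1 - b) * server_exhaust
    """
    span = server_exhaust - crac_supply
    if span <= 0:
        raise ValueError("server exhaust must be warmer than CRAC supply")
    return (server_exhaust - crac_return) / span


def recirculation_fraction(crac_supply, server_inlet, server_exhaust):
    """Share of server inlet air that is recirculated exhaust.

    Model: server_inlet = r * server_exhaust + (1 - r) * crac_supply
    """
    span = server_exhaust - crac_supply
    if span <= 0:
        raise ValueError("server exhaust must be warmer than CRAC supply")
    return (server_inlet - crac_supply) / span


# Made-up temperatures in degrees F, chosen only for illustration.
supply, inlet, exhaust, ret = 60.0, 72.0, 90.0, 75.0
bypass = bypass_fraction(supply, exhaust, ret)
recirc = recirculation_fraction(supply, inlet, exhaust)
print(f"bypass: {bypass:.0%}, recirculation: {recirc:.0%}")
```

In the ideal closed loop, the return temperature equals the exhaust temperature and the inlet temperature equals the supply temperature, so both fractions come out to zero.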

Some Rules of Thumb

  1. The closer the CRAC ΔT (return minus supply) and the Cabinet ΔT (exhaust minus inlet) are to each other, the more air is flowing correctly.

  2. If your CRAC return is cooler than your server exhaust, you may have bypass airflow.

  3. If your server inlet is warmer than your CRAC supply, you may have recirculation airflow.
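The three rules of thumb can be written as a small check. This is only a sketch: the function name, the message strings and the 1-degree tolerance used to ignore sensor noise are assumptions made for illustration, and the sample temperatures are made up.

```python
# Hypothetical sketch of the three rules of thumb as a simple check.

def rules_of_thumb(crac_supply, server_inlet, server_exhaust, crac_return,
                   tolerance=1.0):
    """Return a list of findings for one set of four temperatures.

    tolerance is an assumed allowance (same unit as the readings)
    so that tiny differences are not reported as airflow problems.
    """
    crac_dt = crac_return - crac_supply        # CRAC delta-T
    cabinet_dt = server_exhaust - server_inlet  # cabinet delta-T
    findings = []

    # Rule 1: matching delta-Ts suggest air is flowing as intended.
    if abs(crac_dt - cabinet_dt) <= tolerance:
        findings.append("delta-Ts match")

    # Rule 2: return cooler than exhaust suggests bypass airflow.
    if crac_return < server_exhaust - tolerance:
        findings.append("possible bypass airflow")

    # Rule 3: inlet warmer than supply suggests recirculation airflow.
    if server_inlet > crac_supply + tolerance:
        findings.append("possible recirculation airflow")

    return findings


# Made-up readings in degrees F: supply, inlet, exhaust, return.
findings = rules_of_thumb(60.0, 72.0, 90.0, 75.0)
print(findings)
```

A check like this only flags which problem may be present; it does not say how much air is affected, which is why the rules are best treated as a first screen.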

Purkay Labs automates these airflow calculations within the WIFI-MATE Air Performance Calculator.

The Results

Two hours into the Delta-T scan, the Air Performance calculator showed 50% bypass airflow and 40% recirculation airflow at the top of Cabinet 202 (see Figure 5: Example Air Performance Screen - Before). Since blanking panels were already installed in the cabinet, the Systems Engineer elected to add a temporary curtain to separate the hot and cold aisles. Six hours later, the Air Performance calculator showed that the inlet temperature, bypass and recirculation had all gone down (see Figure 6: Example Air Performance Screen - After). More cold air was performing its designated task (i.e. cooling the cabinets) instead of being wasted. The Air Performance calculator screens from the AUDIT-BUDDY™ illustrate the before and after behavior.

Figure 5: Example Air Performance Screen - Before

Figure 6: Example Air Performance Screen - After

Conclusion

In the middle of COVID-19, the Retailer needed to eliminate hot spots that threatened server uptime at a time when demand was at an all-time high. The safety protocols in place meant the Retailer needed on-site tools that took as little time as possible to operate. Though the BMS initially alerted the Retailer to the problem, it did not provide the specific information needed to diagnose it. AUDIT-BUDDY™’s built-in scan modes allowed the Retailer to both locate the hot spot and diagnose the cause. Without the Air Performance Calculator, the Systems Engineer would have had to spend extra time guessing at a solution. AUDIT-BUDDY™’s data provided a quick and easy way to make informed decisions, and ultimately prevented the hot spots from reaching a critical level.

Disclaimer: To protect Client Confidentiality, Purkay Labs has altered the data.

 

About Purkay Labs

We believe that Data Center Operators deserve quick, reliable and independent data about their white space environment, without the burden of complicated permanent monitoring systems. We create simple, standalone, and cost-effective portable environmental monitoring systems so you can get data wherever and whenever you need it. Our flagship product, the AUDIT-BUDDY system, is the first multi-height portable environmental monitor that provides data to help you manage your airflow, reduce Scope 2 emissions, improve energy efficiency and optimize cooling.

You can follow us @purkaylabs or visit our website: www.purkaylabs.com

 