Abstract

Data centers (DCs) have traditionally relied on air cooling for IT equipment, but as graphics processing units (GPUs) evolve, they demand more power and more sophisticated cooling. Direct liquid cooling (DLC) has emerged as a promising, more efficient alternative. We evaluated DLC against traditional air cooling on a Microsoft G50 GPU server performing artificial intelligence/machine learning (AI/ML) tasks. Compared with air cooling, DLC improved GPU performance: it increased efficiency by 2.7% in Gflops/s, cut power usage by 12%, reduced execution times by up to 6.22%, and lowered chip temperatures by 20 °C. We also develop an overall performance metric that considers the data center, hardware, and chip levels, and conclude that DLC is highly beneficial for AI workloads, increasing energy savings and balancing performance with power requirements.
