How Efficient Are Global Data Centers? A Complete SAS Analytics Study

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS

1. INTRODUCTION

Modern data centers are the backbone of cloud computing, AI, banking, healthcare, and government systems.
They consume massive power, require efficient cooling, and must ensure zero downtime.

This project simulates global data center operational data and demonstrates how a SAS programmer analyzes efficiency, sustainability, risk, and anomalies using:

· Base SAS

· PROC SQL

· Statistical procedures

· Macros

· Date intelligence

· Data quality & fraud detection logic

2. BUSINESS CONTEXT

Why this analysis matters:

· High Power Usage Effectiveness (PUE) → Increased operational cost

· High Downtime → SLA penalties & revenue loss

· Low Sustainability Score → Regulatory & ESG risks

· Abnormal metrics → Possible data manipulation or reporting fraud

Business questions answered:

· Which data centers are inefficient?

· Which locations show abnormal behavior?

· Are sustainability scores aligned with operational metrics?

· Can we auto-classify risk and utilization?

3. TABLE OF CONTENTS

1. Data Creation

2. Standardization & Formatting

3. Derived Metrics

4. Macro-Driven Classification

5. Statistical Analysis

6. Fraud Detection Logic

7. Correlation Analysis

8. Visualization

9. Advanced Data Handling (Append, Transpose)

10. Cleanup & Optimization

11. Conclusion

4. DATASET CREATION (RAW DATA)

data datacenter_raw;

input Center_ID Center_Name $ Country:$12. Power_Usage_PUE Server_Count Cooling_Efficiency

Downtime_Hours Sustainability_Score Install_Date : date9.;

format Install_Date date9.;

datalines;

1 alpha_dc usa 1.4 12000 82 5 78 15JAN2015

2 beta_dc india 1.8 9500 70 18 65 22MAR2016

3 gamma_dc germany 1.3 14000 88 2 85 10JUN2014

4 delta_dc uk 1.6 11000 75 10 72 01DEC2017

5 epsilon_dc japan 1.2 16000 92 1 90 09SEP2013

6 zeta_dc india 2.1 8000 60 30 50 12FEB2018

7 eta_dc usa 1.5 10000 78 7 74 19AUG2016

8 theta_dc canada 1.4 10500 80 6 77 25NOV2015

9 iota_dc australia 1.7 9800 73 12 70 11APR2017

10 kappa_dc singapore 1.3 15000 90 3 88 06JUL2014

11 lambda_dc france 1.6 10800 76 9 73 18OCT2016

12 mu_dc india 2.3 7000 55 40 45 29JAN2019

13 nu_dc netherlands 1.2 15500 93 2 91 03MAY2013

14 xi_dc brazil 1.9 9000 68 22 60 14DEC2018

15 omicron_dc uae 1.5 11500 79 8 75 21JUN2016

;

run;

proc print data=datacenter_raw;

run;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date
1	1	alpha_dc	usa	1.4	12000	82	5	78	15JAN2015
2	2	beta_dc	india	1.8	9500	70	18	65	22MAR2016
3	3	gamma_dc	germany	1.3	14000	88	2	85	10JUN2014
4	4	delta_dc	uk	1.6	11000	75	10	72	01DEC2017
5	5	epsilon_	japan	1.2	16000	92	1	90	09SEP2013
6	6	zeta_dc	india	2.1	8000	60	30	50	12FEB2018
7	7	eta_dc	usa	1.5	10000	78	7	74	19AUG2016
8	8	theta_dc	canada	1.4	10500	80	6	77	25NOV2015
9	9	iota_dc	australia	1.7	9800	73	12	70	11APR2017
10	10	kappa_dc	singapore	1.3	15000	90	3	88	06JUL2014
11	11	lambda_d	france	1.6	10800	76	9	73	18OCT2016
12	12	mu_dc	india	2.3	7000	55	40	45	29JAN2019
13	13	nu_dc	netherlands	1.2	15500	93	2	91	03MAY2013
14	14	xi_dc	brazil	1.9	9000	68	22	60	14DEC2018
15	15	omicron_	uae	1.5	11500	79	8	75	21JUN2016

· DATA step gives full control over variable types

· Realistic multi-country, multi-year data

· Includes both numeric & character variables

· Dates allow INTCK / INTNX analysis

5. CHARACTER STANDARDIZATION & CLEANUP

data datacenter_clean;

set datacenter_raw;

Center_Name = propcase(strip(Center_Name));

Country = upcase(strip(Country));

Center_Code = catx('_', Country, Center_ID);

run;

proc print data=datacenter_clean;

run;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date	Center_Code
1	1	Alpha_dc	USA	1.4	12000	82	5	78	15JAN2015	USA_1
2	2	Beta_dc	INDIA	1.8	9500	70	18	65	22MAR2016	INDIA_2
3	3	Gamma_dc	GERMANY	1.3	14000	88	2	85	10JUN2014	GERMANY_3
4	4	Delta_dc	UK	1.6	11000	75	10	72	01DEC2017	UK_4
5	5	Epsilon_	JAPAN	1.2	16000	92	1	90	09SEP2013	JAPAN_5
6	6	Zeta_dc	INDIA	2.1	8000	60	30	50	12FEB2018	INDIA_6
7	7	Eta_dc	USA	1.5	10000	78	7	74	19AUG2016	USA_7
8	8	Theta_dc	CANADA	1.4	10500	80	6	77	25NOV2015	CANADA_8
9	9	Iota_dc	AUSTRALIA	1.7	9800	73	12	70	11APR2017	AUSTRALIA_9
10	10	Kappa_dc	SINGAPORE	1.3	15000	90	3	88	06JUL2014	SINGAPORE_10
11	11	Lambda_d	FRANCE	1.6	10800	76	9	73	18OCT2016	FRANCE_11
12	12	Mu_dc	INDIA	2.3	7000	55	40	45	29JAN2019	INDIA_12
13	13	Nu_dc	NETHERLANDS	1.2	15500	93	2	91	03MAY2013	NETHERLANDS_13
14	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14
15	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15

Function	Purpose
STRIP	Removes leading/trailing spaces
PROPCASE	Standard naming
UPCASE	Country consistency
CATX	Safe concatenation with delimiter

6. DATE DERIVATIONS (MDY, INTCK, INTNX)

data datacenter_dates;

set datacenter_clean;

Current_Date = today();

Years_Operational = intck('year', Install_Date, Current_Date);

Next_Maintenance = intnx('month', Install_Date, 60, 'same');

format Current_Date Next_Maintenance date9.;

run;

proc print data=datacenter_dates;

run;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date	Center_Code	Current_Date	Years_Operational	Next_Maintenance
1	1	Alpha_dc	USA	1.4	12000	82	5	78	15JAN2015	USA_1	15FEB2026	11	15JAN2020
2	2	Beta_dc	INDIA	1.8	9500	70	18	65	22MAR2016	INDIA_2	15FEB2026	10	22MAR2021
3	3	Gamma_dc	GERMANY	1.3	14000	88	2	85	10JUN2014	GERMANY_3	15FEB2026	12	10JUN2019
4	4	Delta_dc	UK	1.6	11000	75	10	72	01DEC2017	UK_4	15FEB2026	9	01DEC2022
5	5	Epsilon_	JAPAN	1.2	16000	92	1	90	09SEP2013	JAPAN_5	15FEB2026	13	09SEP2018
6	6	Zeta_dc	INDIA	2.1	8000	60	30	50	12FEB2018	INDIA_6	15FEB2026	8	12FEB2023
7	7	Eta_dc	USA	1.5	10000	78	7	74	19AUG2016	USA_7	15FEB2026	10	19AUG2021
8	8	Theta_dc	CANADA	1.4	10500	80	6	77	25NOV2015	CANADA_8	15FEB2026	11	25NOV2020
9	9	Iota_dc	AUSTRALIA	1.7	9800	73	12	70	11APR2017	AUSTRALIA_9	15FEB2026	9	11APR2022
10	10	Kappa_dc	SINGAPORE	1.3	15000	90	3	88	06JUL2014	SINGAPORE_10	15FEB2026	12	06JUL2019
11	11	Lambda_d	FRANCE	1.6	10800	76	9	73	18OCT2016	FRANCE_11	15FEB2026	10	18OCT2021
12	12	Mu_dc	INDIA	2.3	7000	55	40	45	29JAN2019	INDIA_12	15FEB2026	7	29JAN2024
13	13	Nu_dc	NETHERLANDS	1.2	15500	93	2	91	03MAY2013	NETHERLANDS_13	15FEB2026	13	03MAY2018
14	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14	15FEB2026	8	14DEC2023
15	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15	15FEB2026	10	21JUN2021

· Operational age impacts maintenance cost

· Predictive scheduling

· Time-based risk modelling

7. UTILIZATION & RISK CLASSIFICATION (MACRO)

%macro classify;

data datacenter_classified;

set datacenter_dates;

length Utilization $12.;

if Server_Count >= 14000 then Utilization = 'HIGH';

else if Server_Count >= 10000 then Utilization = 'MEDIUM';

else Utilization = 'LOW';

length Risk_Flag $12.;

if Downtime_Hours > 25 or Power_Usage_PUE > 2 then Risk_Flag = 'HIGH_RISK';

else Risk_Flag = 'NORMAL';

run;

proc print data=datacenter_classified;

run;

%mend;

%classify;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date	Center_Code	Current_Date	Years_Operational	Next_Maintenance	Utilization	Risk_Flag
1	1	Alpha_dc	USA	1.4	12000	82	5	78	15JAN2015	USA_1	15FEB2026	11	15JAN2020	MEDIUM	NORMAL
2	2	Beta_dc	INDIA	1.8	9500	70	18	65	22MAR2016	INDIA_2	15FEB2026	10	22MAR2021	LOW	NORMAL
3	3	Gamma_dc	GERMANY	1.3	14000	88	2	85	10JUN2014	GERMANY_3	15FEB2026	12	10JUN2019	HIGH	NORMAL
4	4	Delta_dc	UK	1.6	11000	75	10	72	01DEC2017	UK_4	15FEB2026	9	01DEC2022	MEDIUM	NORMAL
5	5	Epsilon_	JAPAN	1.2	16000	92	1	90	09SEP2013	JAPAN_5	15FEB2026	13	09SEP2018	HIGH	NORMAL
6	6	Zeta_dc	INDIA	2.1	8000	60	30	50	12FEB2018	INDIA_6	15FEB2026	8	12FEB2023	LOW	HIGH_RISK
7	7	Eta_dc	USA	1.5	10000	78	7	74	19AUG2016	USA_7	15FEB2026	10	19AUG2021	MEDIUM	NORMAL
8	8	Theta_dc	CANADA	1.4	10500	80	6	77	25NOV2015	CANADA_8	15FEB2026	11	25NOV2020	MEDIUM	NORMAL
9	9	Iota_dc	AUSTRALIA	1.7	9800	73	12	70	11APR2017	AUSTRALIA_9	15FEB2026	9	11APR2022	LOW	NORMAL
10	10	Kappa_dc	SINGAPORE	1.3	15000	90	3	88	06JUL2014	SINGAPORE_10	15FEB2026	12	06JUL2019	HIGH	NORMAL
11	11	Lambda_d	FRANCE	1.6	10800	76	9	73	18OCT2016	FRANCE_11	15FEB2026	10	18OCT2021	MEDIUM	NORMAL
12	12	Mu_dc	INDIA	2.3	7000	55	40	45	29JAN2019	INDIA_12	15FEB2026	7	29JAN2024	LOW	HIGH_RISK
13	13	Nu_dc	NETHERLANDS	1.2	15500	93	2	91	03MAY2013	NETHERLANDS_13	15FEB2026	13	03MAY2018	HIGH	NORMAL
14	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14	15FEB2026	8	14DEC2023	LOW	NORMAL
15	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15	15FEB2026	10	21JUN2021	MEDIUM	NORMAL

Why macros?

Reusable logic
Standard enterprise practice
Interview favorite topic

8. FRAUD / ANOMALY DETECTION LOGIC

data fraud_check;

set datacenter_classified;

length Fraud_Flag $12.;

if Sustainability_Score > 85 and Cooling_Efficiency < 70

then Fraud_Flag = 'SUSPECT';

else Fraud_Flag = 'CLEAR';

run;

proc print data=fraud_check;

run;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date	Center_Code	Current_Date	Years_Operational	Next_Maintenance	Utilization	Risk_Flag	Fraud_Flag
1	1	Alpha_dc	USA	1.4	12000	82	5	78	15JAN2015	USA_1	15FEB2026	11	15JAN2020	MEDIUM	NORMAL	CLEAR
2	2	Beta_dc	INDIA	1.8	9500	70	18	65	22MAR2016	INDIA_2	15FEB2026	10	22MAR2021	LOW	NORMAL	CLEAR
3	3	Gamma_dc	GERMANY	1.3	14000	88	2	85	10JUN2014	GERMANY_3	15FEB2026	12	10JUN2019	HIGH	NORMAL	CLEAR
4	4	Delta_dc	UK	1.6	11000	75	10	72	01DEC2017	UK_4	15FEB2026	9	01DEC2022	MEDIUM	NORMAL	CLEAR
5	5	Epsilon_	JAPAN	1.2	16000	92	1	90	09SEP2013	JAPAN_5	15FEB2026	13	09SEP2018	HIGH	NORMAL	CLEAR
6	6	Zeta_dc	INDIA	2.1	8000	60	30	50	12FEB2018	INDIA_6	15FEB2026	8	12FEB2023	LOW	HIGH_RISK	CLEAR
7	7	Eta_dc	USA	1.5	10000	78	7	74	19AUG2016	USA_7	15FEB2026	10	19AUG2021	MEDIUM	NORMAL	CLEAR
8	8	Theta_dc	CANADA	1.4	10500	80	6	77	25NOV2015	CANADA_8	15FEB2026	11	25NOV2020	MEDIUM	NORMAL	CLEAR
9	9	Iota_dc	AUSTRALIA	1.7	9800	73	12	70	11APR2017	AUSTRALIA_9	15FEB2026	9	11APR2022	LOW	NORMAL	CLEAR
10	10	Kappa_dc	SINGAPORE	1.3	15000	90	3	88	06JUL2014	SINGAPORE_10	15FEB2026	12	06JUL2019	HIGH	NORMAL	CLEAR
11	11	Lambda_d	FRANCE	1.6	10800	76	9	73	18OCT2016	FRANCE_11	15FEB2026	10	18OCT2021	MEDIUM	NORMAL	CLEAR
12	12	Mu_dc	INDIA	2.3	7000	55	40	45	29JAN2019	INDIA_12	15FEB2026	7	29JAN2024	LOW	HIGH_RISK	CLEAR
13	13	Nu_dc	NETHERLANDS	1.2	15500	93	2	91	03MAY2013	NETHERLANDS_13	15FEB2026	13	03MAY2018	HIGH	NORMAL	CLEAR
14	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14	15FEB2026	8	14DEC2023	LOW	NORMAL	CLEAR
15	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15	15FEB2026	10	21JUN2021	MEDIUM	NORMAL	CLEAR

High sustainability with poor cooling → possible manipulated ESG reporting

9. STATISTICAL ANALYSIS

PROC MEANS

proc means data=fraud_check mean min max;

var Power_Usage_PUE Downtime_Hours Sustainability_Score;

run;

OUTPUT:

The MEANS Procedure


Variable	Mean	Minimum	Maximum
Power_Usage_PUE Downtime_Hours Sustainability_Score	1.5866667 11.6666667 72.8666667	1.2000000 1.0000000 45.0000000	2.3000000 40.0000000 91.0000000

PROC UNIVARIATE

proc univariate data=fraud_check;

var Power_Usage_PUE;

run;

OUTPUT:

The UNIVARIATE Procedure

Variable: Power_Usage_PUE


Moments
N	15	Sum Weights	15
Mean	1.58666667	Sum Observations	23.8
Std Deviation	0.32484429	Variance	0.10552381
Skewness	0.87308189	Kurtosis	0.18800433
Uncorrected SS	39.24	Corrected SS	1.47733333
Coeff Variation	20.4733793	Std Error Mean	0.08387443


Basic Statistical Measures
Location		Variability
Mean	1.586667	Std Deviation	0.32484
Median	1.500000	Variance	0.10552
Mode	1.200000	Range	1.10000
		Interquartile Range	0.50000

Note: The mode displayed is the smallest of 5 modes with a count of 2.


Tests for Location: Mu0=0
Test	Statistic		p Value
Student's t	t	18.91717	Pr > \|t\|	<.0001
Sign	M	7.5	Pr >= \|M\|	<.0001
Signed Rank	S	60	Pr >= \|S\|	<.0001


Quantiles (Definition 5)
Level	Quantile
100% Max	2.3
99%	2.3
95%	2.3
90%	2.1
75% Q3	1.8
50% Median	1.5
25% Q1	1.3
10%	1.2
5%	1.2
1%	1.2
0% Min	1.2


Extreme Observations
Lowest		Highest
Value	Obs	Value	Obs
1.2	13	1.7	9
1.2	5	1.8	2
1.3	10	1.9	14
1.3	3	2.1	6
1.4	8	2.3	12

10. FREQUENCY ANALYSIS

proc freq data=fraud_check;

tables Country Utilization Risk_Flag Fraud_Flag;

run;

OUTPUT:

The FREQ Procedure


Country	Frequency	Percent	Cumulative Frequency	Cumulative Percent
AUSTRALIA	1	6.67	1	6.67
BRAZIL	1	6.67	2	13.33
CANADA	1	6.67	3	20.00
FRANCE	1	6.67	4	26.67
GERMANY	1	6.67	5	33.33
INDIA	3	20.00	8	53.33
JAPAN	1	6.67	9	60.00
NETHERLANDS	1	6.67	10	66.67
SINGAPORE	1	6.67	11	73.33
UAE	1	6.67	12	80.00
UK	1	6.67	13	86.67
USA	2	13.33	15	100.00


Utilization	Frequency	Percent	Cumulative Frequency	Cumulative Percent
HIGH	4	26.67	4	26.67
LOW	5	33.33	9	60.00
MEDIUM	6	40.00	15	100.00


Risk_Flag	Frequency	Percent	Cumulative Frequency	Cumulative Percent
HIGH_RISK	2	13.33	2	13.33
NORMAL	13	86.67	15	100.00


Fraud_Flag	Frequency	Percent	Cumulative Frequency	Cumulative Percent
CLEAR	15	100.00	15	100.00

11. CORRELATION ANALYSIS

proc corr data=fraud_check;

var Power_Usage_PUE Downtime_Hours Cooling_Efficiency Sustainability_Score;

run;

OUTPUT:

The CORR Procedure


4 Variables:	Power_Usage_PUE Downtime_Hours Cooling_Efficiency Sustainability_Score


Simple Statistics
Variable	N	Mean	Std Dev	Sum	Minimum	Maximum
Power_Usage_PUE	15	1.58667	0.32484	23.80000	1.20000	2.30000
Downtime_Hours	15	11.66667	11.28632	175.00000	1.00000	40.00000
Cooling_Efficiency	15	77.26667	11.12569	1159	55.00000	93.00000
Sustainability_Score	15	72.86667	13.58501	1093	45.00000	91.00000


Pearson Correlation Coefficients, N = 15 Prob > \|r\| under H0: Rho=0
	Power_Usage_PUE	Downtime_Hours	Cooling_Efficiency	Sustainability_Score
Power_Usage_PUE	1.00000	0.98062 <.0001	-0.98516 <.0001	-0.98939 <.0001
Downtime_Hours	0.98062 <.0001	1.00000	-0.94637 <.0001	-0.96372 <.0001
Cooling_Efficiency	-0.98516 <.0001	-0.94637 <.0001	1.00000	0.99553 <.0001
Sustainability_Score	-0.98939 <.0001	-0.96372 <.0001	0.99553 <.0001	1.00000

12. VISUALIZATION – PROC SGPLOT

proc sgplot data=fraud_check;

scatter x=Power_Usage_PUE y=Sustainability_Score;

run;

OUTPUT:

13. APPEND,TRANSPOSE

CREATE A NEW DATASET

data fraud_check_new;

set fraud_check;

if Center_ID in (14,15);

run;

proc print data=fraud_check_new;

run;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date	Center_Code	Current_Date	Years_Operational	Next_Maintenance	Utilization	Risk_Flag	Fraud_Flag
1	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14	15FEB2026	8	14DEC2023	LOW	NORMAL	CLEAR
2	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15	15FEB2026	10	21JUN2021	MEDIUM	NORMAL	CLEAR

· Simulates incremental data load

· Common in production environments

APPEND

proc append base=fraud_check

data=fraud_check_new force;

run;

proc print data=fraud_check;

run;

OUTPUT:


Obs	Center_ID	Center_Name	Country	Power_Usage_PUE	Server_Count	Cooling_Efficiency	Downtime_Hours	Sustainability_Score	Install_Date	Center_Code	Current_Date	Years_Operational	Next_Maintenance	Utilization	Risk_Flag	Fraud_Flag
1	1	Alpha_dc	USA	1.4	12000	82	5	78	15JAN2015	USA_1	15FEB2026	11	15JAN2020	MEDIUM	NORMAL	CLEAR
2	2	Beta_dc	INDIA	1.8	9500	70	18	65	22MAR2016	INDIA_2	15FEB2026	10	22MAR2021	LOW	NORMAL	CLEAR
3	3	Gamma_dc	GERMANY	1.3	14000	88	2	85	10JUN2014	GERMANY_3	15FEB2026	12	10JUN2019	HIGH	NORMAL	CLEAR
4	4	Delta_dc	UK	1.6	11000	75	10	72	01DEC2017	UK_4	15FEB2026	9	01DEC2022	MEDIUM	NORMAL	CLEAR
5	5	Epsilon_	JAPAN	1.2	16000	92	1	90	09SEP2013	JAPAN_5	15FEB2026	13	09SEP2018	HIGH	NORMAL	CLEAR
6	6	Zeta_dc	INDIA	2.1	8000	60	30	50	12FEB2018	INDIA_6	15FEB2026	8	12FEB2023	LOW	HIGH_RISK	CLEAR
7	7	Eta_dc	USA	1.5	10000	78	7	74	19AUG2016	USA_7	15FEB2026	10	19AUG2021	MEDIUM	NORMAL	CLEAR
8	8	Theta_dc	CANADA	1.4	10500	80	6	77	25NOV2015	CANADA_8	15FEB2026	11	25NOV2020	MEDIUM	NORMAL	CLEAR
9	9	Iota_dc	AUSTRALIA	1.7	9800	73	12	70	11APR2017	AUSTRALIA_9	15FEB2026	9	11APR2022	LOW	NORMAL	CLEAR
10	10	Kappa_dc	SINGAPORE	1.3	15000	90	3	88	06JUL2014	SINGAPORE_10	15FEB2026	12	06JUL2019	HIGH	NORMAL	CLEAR
11	11	Lambda_d	FRANCE	1.6	10800	76	9	73	18OCT2016	FRANCE_11	15FEB2026	10	18OCT2021	MEDIUM	NORMAL	CLEAR
12	12	Mu_dc	INDIA	2.3	7000	55	40	45	29JAN2019	INDIA_12	15FEB2026	7	29JAN2024	LOW	HIGH_RISK	CLEAR
13	13	Nu_dc	NETHERLANDS	1.2	15500	93	2	91	03MAY2013	NETHERLANDS_13	15FEB2026	13	03MAY2018	HIGH	NORMAL	CLEAR
14	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14	15FEB2026	8	14DEC2023	LOW	NORMAL	CLEAR
15	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15	15FEB2026	10	21JUN2021	MEDIUM	NORMAL	CLEAR
16	14	Xi_dc	BRAZIL	1.9	9000	68	22	60	14DEC2018	BRAZIL_14	15FEB2026	8	14DEC2023	LOW	NORMAL	CLEAR
17	15	Omicron_	UAE	1.5	11500	79	8	75	21JUN2016	UAE_15	15FEB2026	10	21JUN2021	MEDIUM	NORMAL	CLEAR

· fraud_check → historical master table

· fraud_check_new → new records

· Rows get physically added

· Structure mismatch handled by FORCE

TRANSPOSE

proc transpose data=fraud_check out=dc_transposed;

var Power_Usage_PUE Sustainability_Score;

run;

proc print data=dc_transposed;

run;

OUTPUT:


Obs	_NAME_	COL1	COL2	COL3	COL4	COL5	COL6	COL7	COL8	COL9	COL10	COL11	COL12	COL13	COL14	COL15	COL16	COL17
1	Power_Usage_PUE	1.4	1.8	1.3	1.6	1.2	2.1	1.5	1.4	1.7	1.3	1.6	2.3	1.2	1.9	1.5	1.9	1.5
2	Sustainability_Score	78.0	65.0	85.0	72.0	90.0	50.0	74.0	77.0	70.0	88.0	73.0	45.0	91.0	60.0	75.0	60.0	75.0

14. CLEANUP – PROC DATASETS DELETE

proc datasets library=work nolist;

delete datacenter_raw;

quit;

LOG:

 NOTE: Deleting WORK.DATACENTER_RAW (memtype=DATA).

15. Conclusion

This Data Center Analytics project demonstrates how SAS can be effectively used to transform raw operational data into meaningful business insights. By integrating DATA steps, PROC SQL, statistical procedures, macros, and advanced date functions, the analysis evaluates power efficiency, server utilization, cooling performance, downtime, and sustainability metrics across global data centers. The structured use of character and numeric functions ensures data standardization, while correlation and visualization techniques help uncover relationships between efficiency, downtime, and sustainability. Overall, the project reflects a realistic enterprise workflow where data quality, performance measurement, and operational monitoring are equally important.

From a business and interview perspective, the project highlights how SAS supports decision-making beyond basic reporting. Utilization and risk classification macros enable automation and scalability, while anomaly and fraud-detection logic addresses governance and ESG reliability concerns. Proper use of PROC APPEND, MERGE, SET, and TRANSPOSE mirrors real production environments with incremental data loads and evolving structures. In summary, this analysis showcases SAS as a powerful tool for operational intelligence, risk control, and sustainable data center management in modern organizations.

INTERVIEW QUESTIONS FOR YOU

1. 1.What is PROC MEANS used for in SAS?
2. 2.What is the difference between SET and MERGE statements?

3. 3.Why are macros used in SAS?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.

Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent DATA CENTERS data.

Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.

This project is suitable for:
·  Students learning SAS
·  Data analysts building portfolios
·  Professionals preparing for SAS interviews
·  Bloggers writing about analytics and smart cities

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Follow Us On :

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--->Follow our blog for more SAS-based analytics projects and industry data models.

---> Support Us By Following Our Blog..

To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:

1.Which Country Truly Dominates the Olympics? – A Complete SAS Medal Efficiency Analytics Project
2.Which Airports Are Really the Busiest? – An End-to-End SAS Airport Traffic Analytics Project
3.Can Data Predict Election Outcomes? – A Complete SAS Voting Analytics Project
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About Us | Contact | Privacy Policy

Search This Blog

SAS Learning Hub

401.How Efficient Are Global Data Centers? A Complete SAS Analytics Study

How Efficient Are Global Data Centers? A Complete SAS Analytics Study

HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS

1. INTRODUCTION

2. BUSINESS CONTEXT

Why this analysis matters:

Business questions answered:

3. TABLE OF CONTENTS

INTERVIEW QUESTIONS FOR YOU

Follow Us On :

To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:

About Us | Contact | Privacy Policy

Comments

Post a Comment

Popular posts from this blog

397.If a satellite has excellent signal strength but very high latency, can it still deliver good quality communication? Why or why not?A Sas Study

383.Which Pharma Distributors Are High-Risk? Can SAS Detect Delays, Temperature Violations, and Fraud?

379.CAN SAS DELIVER END-TO-END TLF (TABLES, LISTINGS, AND FIGURES) FOR INSURANCE CLAIMS ANALYSIS?

401.How Efficient Are Global Data Centers? A Complete SAS Analytics Study

How Efficient Are Global Data Centers? A Complete SAS Analytics Study

HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS

1. INTRODUCTION

2. BUSINESS CONTEXT

Why this analysis matters:

Business questions answered:

3. TABLE OF CONTENTS

INTERVIEW QUESTIONS FOR YOU

Follow Us On :

To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:

About Us | Contact | Privacy Policy

Comments

Post a Comment

Popular posts from this blog

397.If a satellite has excellent signal strength but very high latency, can it still deliver good quality communication? Why or why not?A Sas Study

383.Which Pharma Distributors Are High-Risk? Can SAS Detect Delays, Temperature Violations, and Fraud?

379.CAN SAS DELIVER END-TO-END TLF (TABLES, LISTINGS, AND FIGURES) FOR INSURANCE CLAIMS ANALYSIS?

HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS