326.How Fast Is Our Planet Warming? A Data-Driven Climate Change Study in SAS
How Fast Is Our Planet Warming? A Data-Driven Climate Change Study in SAS
options nocenter;
1) Create raw dataset with date formats
data climate_raw;
informat Observation_Date date9.;
format Observation_Date date9.;
input Country :$25. Observation_Date :date9. CO2_Emissions :8.2 /* metric tons per capita */
Average_Temperature :6.2 /*in °C */ Sea_Level_Rise :8.1 /* mm since baseline year */
Forest_Cover :6.2 /*in percent */ Pollution_Index :6.1;
datalines;
Sweden 01JAN2024 4.90 7.1 9.5 69.0 22.0
India 01JAN2024 1.90 26.8 18.2 24.0 65.5
USA 01JAN2024 14.50 12.3 22.4 33.9 40.2
Bangladesh 01JAN2024 0.50 25.1 35.6 17.0 78.9
Australia 01JAN2024 16.80 21.5 14.0 16.5 36.0
Brazil 01JAN2024 2.10 25.0 10.1 59.4 30.0
Germany 01JAN2024 8.40 10.5 8.8 31.9 28.5
China 01JAN2024 7.60 14.2 16.7 22.3 55.0
Maldives 01JAN2024 2.00 28.3 70.2 32.0 62.0
Ethiopia 01JAN2024 0.30 19.8 12.0 45.0 34.0
Japan 01JAN2024 9.20 13.6 19.1 67.0 29.8
UK 01JAN2024 5.90 10.8 11.4 12.8 35.3
;
run;
proc print data=climate_raw;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index |
|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 |
| 2 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 |
| 3 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 |
| 4 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 |
| 5 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 |
| 6 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 |
| 7 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 |
| 8 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 |
| 10 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 |
| 11 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 |
| 12 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 |
2) Add a baseline year and compute 'Years_Since_Baseline' using INTCK
data climate_raw2;
set climate_raw;
/* choose baseline date (e.g., 1990-01-01) for climate comparisons */
Baseline_Date = '01JAN1990'd;
format Baseline_Date date9.;
Years_Since_Baseline = intck('year', Baseline_Date, Observation_Date); /* integer years */
/* Also create a quarterly-aligned date using INTNX (e.g., shift to quarter start) */
Obs_QuarterStart = intnx('quarter', Observation_Date, 0, 'b'); /* beginning of quarter */
format Obs_QuarterStart date9.;
run;
proc print data=climate_raw2;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 |
| 2 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 |
| 3 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 |
| 4 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 |
| 5 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 |
| 6 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 |
| 7 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 |
| 8 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 |
| 10 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 |
| 11 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 |
| 12 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 |
3) Create a small country_info dataset to demonstrate MERGE (e.g., Region)
data country_info;
length Country $25 Region $15 Income_Group $10;
infile datalines dlm='|';
input Country :$25. Region :$15. Income_Group :$10.;
datalines;
Sweden|Europe|High
India|Asia|Lower-Middle
USA|North America|High
Bangladesh|Asia|Low
Australia|Oceania|High
Brazil|South America|Upper-Middle
Germany|Europe|High
China|Asia|Upper-Middle
Maldives|Asia|Upper-Middle
Ethiopia|Africa|Low
Japan|Asia|High
UK|Europe|High
;
run;
proc print data=country_info;
run;
OUTPUT:
| Obs | Country | Region | Income_Group |
|---|---|---|---|
| 1 | Sweden | Europe | High |
| 2 | India | Asia | Lower-Midd |
| 3 | USA | North America | High |
| 4 | Bangladesh | Asia | Low |
| 5 | Australia | Oceania | High |
| 6 | Brazil | South America | Upper-Midd |
| 7 | Germany | Europe | High |
| 8 | China | Asia | Upper-Midd |
| 9 | Maldives | Asia | Upper-Midd |
| 10 | Ethiopia | Africa | Low |
| 11 | Japan | Asia | High |
| 12 | UK | Europe | High |
4) Sort datasets for MERGE by Country
proc sort data=climate_raw2;
by Country;
run;
proc print data=climate_raw2;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 |
| 2 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 |
| 3 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 |
| 4 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 |
| 5 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 |
| 6 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 |
| 7 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 |
| 8 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 |
| 10 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 |
| 11 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 |
| 12 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 |
proc sort data=country_info;
by Country;
run;
proc print data=country_info;
run;
OUTPUT:
| Obs | Country | Region | Income_Group |
|---|---|---|---|
| 1 | Australia | Oceania | High |
| 2 | Bangladesh | Asia | Low |
| 3 | Brazil | South America | Upper-Midd |
| 4 | China | Asia | Upper-Midd |
| 5 | Ethiopia | Africa | Low |
| 6 | Germany | Europe | High |
| 7 | India | Asia | Lower-Midd |
| 8 | Japan | Asia | High |
| 9 | Maldives | Asia | Upper-Midd |
| 10 | Sweden | Europe | High |
| 11 | UK | Europe | High |
| 12 | USA | North America | High |
5) Merge climate indicators with country_info
data climate_merged;
merge climate_raw2 (in=a) country_info (in=b);
by Country;
if a; /* keep only climate records */
run;
proc print data=climate_merged;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart | Region | Income_Group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 | Oceania | High |
| 2 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 | Asia | Low |
| 3 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 | South America | Upper-Midd |
| 4 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd |
| 5 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 | Africa | Low |
| 6 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 | Europe | High |
| 7 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 | Asia | Lower-Midd |
| 8 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 | Asia | High |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd |
| 10 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 | Europe | High |
| 11 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 | Europe | High |
| 12 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 | North America | High |
6) Compute a composite 'Risk_Score' using z-scores (standardization)
6a) Get means and std devs into a dataset via PROC MEANS
proc means data=climate_merged noprint;
var CO2_Emissions Average_Temperature Sea_Level_Rise Forest_Cover Pollution_Index;
output out=stats mean=mean_CO2 mean_Temp mean_SL mean_Forest mean_Poll
std=std_CO2 std_Temp std_SL std_Forest std_Poll;
run;
proc print data=stats;
run;
OUTPUT:
| Obs | _TYPE_ | _FREQ_ | mean_CO2 | mean_Temp | mean_SL | mean_Forest | mean_Poll | std_CO2 | std_Temp | std_SL | std_Forest | std_Poll |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 12 | 6.175 | 17.9167 | 20.6667 | 35.9 | 43.1 | 5.38569 | 7.33322 | 17.2849 | 19.8330 | 17.8161 |
6b) Load stats into macro variables for later use
data _null_;
set stats;
call symputx('mean_CO2', mean_CO2);
call symputx('std_CO2', std_CO2);
call symputx('mean_Temp', mean_Temp);
call symputx('std_Temp', std_Temp);
call symputx('mean_SL', mean_SL);
call symputx('std_SL', std_SL);
call symputx('mean_Forest', mean_Forest);
call symputx('std_Forest', std_Forest);
call symputx('mean_Poll', mean_Poll);
call symputx('std_Poll', std_Poll);
run;
LOG:
6c) Create z-scores and composite risk (weights can be customized)
data climate_scores;
set climate_merged;
/* avoid division by zero */
z_CO2 = (CO2_Emissions - &mean_CO2.) / ( &std_CO2. + 1e-8 );
z_Temp = (Average_Temperature - &mean_Temp.) / ( &std_Temp. + 1e-8 );
z_SL = (Sea_Level_Rise - &mean_SL.) / ( &std_SL. + 1e-8 );
/* lower Forest_Cover => higher risk, so invert */
z_Forest = -1 * (Forest_Cover - &mean_Forest.) / ( &std_Forest. + 1e-8 );
z_Poll = (Pollution_Index - &mean_Poll.) / ( &std_Poll. + 1e-8 );
/* Weighted composite score (weights sum to 1) */
Risk_Score = 0.30*z_CO2 + 0.25*z_Temp + 0.20*z_SL + 0.15*z_Forest + 0.10*z_Poll;
/* scale Risk_Score to 0-100 for readability */
minRisk = .; maxRisk = .;
run;
proc print data=climate_scores;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart | Region | Income_Group | z_CO2 | z_Temp | z_SL | z_Forest | z_Poll | Risk_Score | minRisk | maxRisk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 | Oceania | High | 1.97282 | 0.48864 | -0.38569 | 0.97817 | -0.39852 | 0.74374 | . | . |
| 2 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 | Asia | Low | -1.05372 | 0.97956 | 0.86396 | 0.95296 | 2.00941 | 0.44545 | . | . |
| 3 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 | South America | Upper-Midd | -0.75663 | 0.96592 | -0.61133 | -1.18489 | -0.73529 | -0.35904 | . | . |
| 4 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | 0.26459 | -0.50683 | -0.22949 | 0.68572 | 0.66793 | 0.07642 | . | . |
| 5 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 | Africa | Low | -1.09085 | 0.25682 | -0.50140 | -0.45883 | -0.51077 | -0.48323 | . | . |
| 6 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | 0.41313 | -1.01138 | -0.68654 | 0.20168 | -0.81948 | -0.31791 | . | . |
| 7 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 | Asia | Lower-Midd | -0.79377 | 1.21138 | -0.14271 | 0.60001 | 1.25729 | 0.25190 | . | . |
| 8 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 | Asia | High | 0.56167 | -0.58865 | -0.09064 | -1.56809 | -0.74651 | -0.30665 | . | . |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | -0.77520 | 1.41593 | 2.86571 | 0.19664 | 1.06084 | 0.83014 | . | . |
| 10 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.23674 | -1.47502 | -0.64604 | -1.66893 | -1.18432 | -0.93776 | . | . |
| 11 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.05106 | -0.97047 | -0.53611 | 1.16472 | -0.43781 | -0.23423 | . | . |
| 12 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 | North America | High | 1.54576 | -0.76592 | 0.10028 | 0.10084 | -0.16277 | 0.29115 | . | . |
7) Determine percentile ranks using PROC RANK and then categorize risk using a macro
proc rank data=climate_scores out=climate_rank ties=mean percent;
var Risk_Score;
ranks Risk_Pct; /* 0-100 scale percentiles (0..99) */
run;
proc print data=climate_rank;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart | Region | Income_Group | z_CO2 | z_Temp | z_SL | z_Forest | z_Poll | Risk_Score | minRisk | maxRisk | Risk_Pct |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 | Oceania | High | 1.97282 | 0.48864 | -0.38569 | 0.97817 | -0.39852 | 0.74374 | . | . | 91.667 |
| 2 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 | Asia | Low | -1.05372 | 0.97956 | 0.86396 | 0.95296 | 2.00941 | 0.44545 | . | . | 83.333 |
| 3 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 | South America | Upper-Midd | -0.75663 | 0.96592 | -0.61133 | -1.18489 | -0.73529 | -0.35904 | . | . | 25.000 |
| 4 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | 0.26459 | -0.50683 | -0.22949 | 0.68572 | 0.66793 | 0.07642 | . | . | 58.333 |
| 5 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 | Africa | Low | -1.09085 | 0.25682 | -0.50140 | -0.45883 | -0.51077 | -0.48323 | . | . | 16.667 |
| 6 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | 0.41313 | -1.01138 | -0.68654 | 0.20168 | -0.81948 | -0.31791 | . | . | 33.333 |
| 7 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 | Asia | Lower-Midd | -0.79377 | 1.21138 | -0.14271 | 0.60001 | 1.25729 | 0.25190 | . | . | 66.667 |
| 8 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 | Asia | High | 0.56167 | -0.58865 | -0.09064 | -1.56809 | -0.74651 | -0.30665 | . | . | 41.667 |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | -0.77520 | 1.41593 | 2.86571 | 0.19664 | 1.06084 | 0.83014 | . | . | 100.000 |
| 10 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.23674 | -1.47502 | -0.64604 | -1.66893 | -1.18432 | -0.93776 | . | . | 8.333 |
| 11 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.05106 | -0.97047 | -0.53611 | 1.16472 | -0.43781 | -0.23423 | . | . | 50.000 |
| 12 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 | North America | High | 1.54576 | -0.76592 | 0.10028 | 0.10084 | -0.16277 | 0.29115 | . | . | 75.000 |
7a) Macro to categorize into Low/Moderate/High/Severe using Risk_Pct
%macro categorize(in=, out=, pctvar=Risk_Pct);
data &out.;
set &in.;
length Risk_Category $10.;
if &pctvar. < 25 then Risk_Category = 'Low';
else if 25 <= &pctvar. < 50 then Risk_Category = 'Moderate';
else if 50 <= &pctvar. < 75 then Risk_Category = 'High';
else Risk_Category = 'Severe';
run;
proc print data=&out.;
run;
%mend categorize;
%categorize(in=climate_rank, out=climate_final, pctvar=Risk_Pct);
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart | Region | Income_Group | z_CO2 | z_Temp | z_SL | z_Forest | z_Poll | Risk_Score | minRisk | maxRisk | Risk_Pct | Risk_Category |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 | Oceania | High | 1.97282 | 0.48864 | -0.38569 | 0.97817 | -0.39852 | 0.74374 | . | . | 91.667 | Severe |
| 2 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 | Asia | Low | -1.05372 | 0.97956 | 0.86396 | 0.95296 | 2.00941 | 0.44545 | . | . | 83.333 | Severe |
| 3 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 | South America | Upper-Midd | -0.75663 | 0.96592 | -0.61133 | -1.18489 | -0.73529 | -0.35904 | . | . | 25.000 | Moderate |
| 4 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | 0.26459 | -0.50683 | -0.22949 | 0.68572 | 0.66793 | 0.07642 | . | . | 58.333 | High |
| 5 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 | Africa | Low | -1.09085 | 0.25682 | -0.50140 | -0.45883 | -0.51077 | -0.48323 | . | . | 16.667 | Low |
| 6 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | 0.41313 | -1.01138 | -0.68654 | 0.20168 | -0.81948 | -0.31791 | . | . | 33.333 | Moderate |
| 7 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 | Asia | Lower-Midd | -0.79377 | 1.21138 | -0.14271 | 0.60001 | 1.25729 | 0.25190 | . | . | 66.667 | High |
| 8 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 | Asia | High | 0.56167 | -0.58865 | -0.09064 | -1.56809 | -0.74651 | -0.30665 | . | . | 41.667 | Moderate |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | -0.77520 | 1.41593 | 2.86571 | 0.19664 | 1.06084 | 0.83014 | . | . | 100.000 | Severe |
| 10 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.23674 | -1.47502 | -0.64604 | -1.66893 | -1.18432 | -0.93776 | . | . | 8.333 | Low |
| 11 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.05106 | -0.97047 | -0.53611 | 1.16472 | -0.43781 | -0.23423 | . | . | 50.000 | High |
| 12 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 | North America | High | 1.54576 | -0.76592 | 0.10028 | 0.10084 | -0.16277 | 0.29115 | . | . | 75.000 | Severe |
8) Append an extra observation (simulating newly collected data) using PROC APPEND
data new_obs;
informat Observation_Date date9.;
format Observation_Date date9.;
Country = 'Fiji';
Observation_Date = '01JAN2024'd;
CO2_Emissions=1.70;
Average_Temperature=26.6;
Sea_Level_Rise=40.2;
Forest_Cover=50.0;
Pollution_Index=45.0;
run;
proc print data=new_obs;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index |
|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Fiji | 1.7 | 26.6 | 40.2 | 50 | 45 |
/* Append to climate_final */
proc append base=climate_final
data=new_obs force;
run;
proc print data=climate_final ;
run;
OUTPUT:
| Obs | Observation_Date | Country | CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Baseline_Date | Years_Since_Baseline | Obs_QuarterStart | Region | Income_Group | z_CO2 | z_Temp | z_SL | z_Forest | z_Poll | Risk_Score | minRisk | maxRisk | Risk_Pct | Risk_Category |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01JAN2024 | Australia | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 01JAN1990 | 34 | 01JAN2024 | Oceania | High | 1.97282 | 0.48864 | -0.38569 | 0.97817 | -0.39852 | 0.74374 | . | . | 91.667 | Severe |
| 2 | 01JAN2024 | Bangladesh | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 01JAN1990 | 34 | 01JAN2024 | Asia | Low | -1.05372 | 0.97956 | 0.86396 | 0.95296 | 2.00941 | 0.44545 | . | . | 83.333 | Severe |
| 3 | 01JAN2024 | Brazil | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | 01JAN1990 | 34 | 01JAN2024 | South America | Upper-Midd | -0.75663 | 0.96592 | -0.61133 | -1.18489 | -0.73529 | -0.35904 | . | . | 25.000 | Moderate |
| 4 | 01JAN2024 | China | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | 0.26459 | -0.50683 | -0.22949 | 0.68572 | 0.66793 | 0.07642 | . | . | 58.333 | High |
| 5 | 01JAN2024 | Ethiopia | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | 01JAN1990 | 34 | 01JAN2024 | Africa | Low | -1.09085 | 0.25682 | -0.50140 | -0.45883 | -0.51077 | -0.48323 | . | . | 16.667 | Low |
| 6 | 01JAN2024 | Germany | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | 0.41313 | -1.01138 | -0.68654 | 0.20168 | -0.81948 | -0.31791 | . | . | 33.333 | Moderate |
| 7 | 01JAN2024 | India | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 01JAN1990 | 34 | 01JAN2024 | Asia | Lower-Midd | -0.79377 | 1.21138 | -0.14271 | 0.60001 | 1.25729 | 0.25190 | . | . | 66.667 | High |
| 8 | 01JAN2024 | Japan | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | 01JAN1990 | 34 | 01JAN2024 | Asia | High | 0.56167 | -0.58865 | -0.09064 | -1.56809 | -0.74651 | -0.30665 | . | . | 41.667 | Moderate |
| 9 | 01JAN2024 | Maldives | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 01JAN1990 | 34 | 01JAN2024 | Asia | Upper-Midd | -0.77520 | 1.41593 | 2.86571 | 0.19664 | 1.06084 | 0.83014 | . | . | 100.000 | Severe |
| 10 | 01JAN2024 | Sweden | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.23674 | -1.47502 | -0.64604 | -1.66893 | -1.18432 | -0.93776 | . | . | 8.333 | Low |
| 11 | 01JAN2024 | UK | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | 01JAN1990 | 34 | 01JAN2024 | Europe | High | -0.05106 | -0.97047 | -0.53611 | 1.16472 | -0.43781 | -0.23423 | . | . | 50.000 | High |
| 12 | 01JAN2024 | USA | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 01JAN1990 | 34 | 01JAN2024 | North America | High | 1.54576 | -0.76592 | 0.10028 | 0.10084 | -0.16277 | 0.29115 | . | . | 75.000 | Severe |
| 13 | 01JAN2024 | Fiji | 1.7 | 26.6 | 40.2 | 50.0 | 45.0 | . | . | . | . | . | . | . | . | . | . | . | . |
9) Use PROC SQL to create a summary table by Region & Income_Group
proc sql;
create table region_summary as
select Region, Income_Group,
count(*) as N,
mean(CO2_Emissions) as Mean_CO2 format=8.2,
mean(Average_Temperature) as Mean_Temp format=8.2,
mean(Sea_Level_Rise) as Mean_SL format=8.2,
mean(Forest_Cover) as Mean_Forest format=8.2,
mean(Pollution_Index) as Mean_Poll format=8.2
from climate_final
group by Region, Income_Group
order by Region;
quit;
proc print data=region_summary;
run;
OUTPUT:
| Obs | Region | Income_Group | N | Mean_CO2 | Mean_Temp | Mean_SL | Mean_Forest | Mean_Poll |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1.70 | 26.60 | 40.20 | 50.00 | 45.00 | ||
| 2 | Africa | Low | 1 | 0.30 | 19.80 | 12.00 | 45.00 | 34.00 |
| 3 | Asia | High | 1 | 9.20 | 13.60 | 19.10 | 67.00 | 29.80 |
| 4 | Asia | Low | 1 | 0.50 | 25.10 | 35.60 | 17.00 | 78.90 |
| 5 | Asia | Lower-Midd | 1 | 1.90 | 26.80 | 18.20 | 24.00 | 65.50 |
| 6 | Asia | Upper-Midd | 2 | 4.80 | 21.25 | 43.45 | 27.15 | 58.50 |
| 7 | Europe | High | 3 | 6.40 | 9.47 | 9.90 | 37.90 | 28.60 |
| 8 | North America | High | 1 | 14.50 | 12.30 | 22.40 | 33.90 | 40.20 |
| 9 | Oceania | High | 1 | 16.80 | 21.50 | 14.00 | 16.50 | 36.00 |
| 10 | South America | Upper-Midd | 1 | 2.10 | 25.00 | 10.10 | 59.40 | 30.00 |
10) Correlation matrix to explore relationships
proc corr data=climate_final nosimple plots=matrix(histogram);
var CO2_Emissions Average_Temperature Sea_Level_Rise Forest_Cover Pollution_Index;
run;
OUTPUT:
The CORR Procedure
| 5 Variables: | CO2_Emissions Average_Temperature Sea_Level_Rise Forest_Cover Pollution_Index |
|---|
| Pearson Correlation Coefficients, N = 13 Prob > |r| under H0: Rho=0 | |||||
|---|---|---|---|---|---|
| CO2_Emissions | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | |
| CO2_Emissions | 1.00000 | -0.47774 0.0987 | -0.31052 0.3018 | -0.18307 0.5494 | -0.37300 0.2094 |
| Average_Temperature | -0.47774 0.0987 | 1.00000 | 0.59813 0.0308 | -0.16870 0.5817 | 0.62744 0.0217 |
| Sea_Level_Rise | -0.31052 0.3018 | 0.59813 0.0308 | 1.00000 | -0.12244 0.6903 | 0.60689 0.0278 |
| Forest_Cover | -0.18307 0.5494 | -0.16870 0.5817 | -0.12244 0.6903 | 1.00000 | -0.57290 0.0407 |
| Pollution_Index | -0.37300 0.2094 | 0.62744 0.0217 | 0.60689 0.0278 | -0.57290 0.0407 | 1.00000 |
11) Regression: model CO2_Emissions as a function of other indicators
proc reg data=climate_final;
model CO2_Emissions = Average_Temperature Sea_Level_Rise Forest_Cover Pollution_Index / vif collin;
output out=reg_out p=Predicted_CO2 r=Residual;
run;
quit;
OUTPUT:
The REG Procedure
Model: MODEL1
Dependent Variable: CO2_Emissions
| Number of Observations Read | 13 |
|---|---|
| Number of Observations Used | 13 |
| Analysis of Variance | |||||
|---|---|---|---|---|---|
| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
| Model | 4 | 141.33682 | 35.33421 | 1.44 | 0.3054 |
| Error | 8 | 196.21087 | 24.52636 | ||
| Corrected Total | 12 | 337.54769 | |||
| Root MSE | 4.95241 | R-Square | 0.4187 |
|---|---|---|---|
| Dependent Mean | 5.83077 | Adj R-Sq | 0.1281 |
| Coeff Var | 84.93578 |
| Parameter Estimates | ||||||
|---|---|---|---|---|---|---|
| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| | Variance Inflation |
| Intercept | 1 | 22.53190 | 7.54545 | 2.99 | 0.0174 | 0 |
| Average_Temperature | 1 | -0.20253 | 0.26974 | -0.75 | 0.4743 | 1.96136 |
| Sea_Level_Rise | 1 | 0.05058 | 0.11457 | 0.44 | 0.6705 | 1.94720 |
| Forest_Cover | 1 | -0.15478 | 0.09790 | -1.58 | 0.1525 | 1.76263 |
| Pollution_Index | 1 | -0.19271 | 0.15156 | -1.27 | 0.2393 | 3.27311 |
| Collinearity Diagnostics | |||||||
|---|---|---|---|---|---|---|---|
| Number | Eigenvalue | Condition Index | Proportion of Variation | ||||
| Intercept | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | |||
| 1 | 4.40533 | 1.00000 | 0.00162 | 0.00321 | 0.00780 | 0.00454 | 0.00184 |
| 2 | 0.37187 | 3.44187 | 0.00506 | 0.00274 | 0.15352 | 0.16926 | 0.00754 |
| 3 | 0.15516 | 5.32847 | 0.02216 | 0.02150 | 0.56155 | 0.12402 | 0.05199 |
| 4 | 0.04984 | 9.40191 | 0.10629 | 0.92534 | 0.08884 | 0.00261 | 0.08517 |
| 5 | 0.01781 | 15.72816 | 0.86488 | 0.04722 | 0.18830 | 0.69957 | 0.85348 |
The REG Procedure
Model: MODEL1
Dependent Variable: CO2_Emissions
12) Visualization: scatter matrix and a scatter + regression line for CO2 vs Temp
proc sgscatter data=climate_final;
matrix CO2_Emissions Average_Temperature Sea_Level_Rise Forest_Cover Pollution_Index
/ diagonal=(histogram);
run;
OUTPUT:
/* Scatter + fitted line */
proc sgplot data=reg_out;
title "CO2 Emissions vs Average Temperature with Regression Line";
scatter x=Average_Temperature y=CO2_Emissions / datalabel=Country;
reg x=Average_Temperature y=CO2_Emissions / CLI CLM;
xaxis label="Average Temperature (°C)";
yaxis label="CO2 Emissions (metric tons per capita)";
run;
title;
OUTPUT:
13) Final dataset view
proc print data=climate_final label noobs;
var Country Region Income_Group Observation_Date CO2_Emissions Average_Temperature
Sea_Level_Rise Forest_Cover Pollution_Index Risk_Score Risk_Pct Risk_Category;
label CO2_Emissions = "CO2 (t per capita)";
run;
OUTPUT:
| Country | Region | Income_Group | Observation_Date | CO2 (t per capita) | Average_Temperature | Sea_Level_Rise | Forest_Cover | Pollution_Index | Risk_Score | Rank for Variable Risk_Score | Risk_Category |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Australia | Oceania | High | 01JAN2024 | 16.8 | 21.5 | 14.0 | 16.5 | 36.0 | 0.74374 | 91.667 | Severe |
| Bangladesh | Asia | Low | 01JAN2024 | 0.5 | 25.1 | 35.6 | 17.0 | 78.9 | 0.44545 | 83.333 | Severe |
| Brazil | South America | Upper-Midd | 01JAN2024 | 2.1 | 25.0 | 10.1 | 59.4 | 30.0 | -0.35904 | 25.000 | Moderate |
| China | Asia | Upper-Midd | 01JAN2024 | 7.6 | 14.2 | 16.7 | 22.3 | 55.0 | 0.07642 | 58.333 | High |
| Ethiopia | Africa | Low | 01JAN2024 | 0.3 | 19.8 | 12.0 | 45.0 | 34.0 | -0.48323 | 16.667 | Low |
| Germany | Europe | High | 01JAN2024 | 8.4 | 10.5 | 8.8 | 31.9 | 28.5 | -0.31791 | 33.333 | Moderate |
| India | Asia | Lower-Midd | 01JAN2024 | 1.9 | 26.8 | 18.2 | 24.0 | 65.5 | 0.25190 | 66.667 | High |
| Japan | Asia | High | 01JAN2024 | 9.2 | 13.6 | 19.1 | 67.0 | 29.8 | -0.30665 | 41.667 | Moderate |
| Maldives | Asia | Upper-Midd | 01JAN2024 | 2.0 | 28.3 | 70.2 | 32.0 | 62.0 | 0.83014 | 100.000 | Severe |
| Sweden | Europe | High | 01JAN2024 | 4.9 | 7.1 | 9.5 | 69.0 | 22.0 | -0.93776 | 8.333 | Low |
| UK | Europe | High | 01JAN2024 | 5.9 | 10.8 | 11.4 | 12.8 | 35.3 | -0.23423 | 50.000 | High |
| USA | North America | High | 01JAN2024 | 14.5 | 12.3 | 22.4 | 33.9 | 40.2 | 0.29115 | 75.000 | Severe |
| Fiji | 01JAN2024 | 1.7 | 26.6 | 40.2 | 50.0 | 45.0 | . | . |
Comments
Post a Comment