137.​ANALYZING GLOBAL RENEWABLE ENERGY CONSUMPTION DATA USING SAS: DATA CREATION, MANAGEMENT, AND ADVANCED ANALYTICAL PROCEDURES FOR INSIGHTFUL ENERGY ANALYSIS

​ANALYZING GLOBAL RENEWABLE ENERGY CONSUMPTION DATA USING SAS: DATA CREATION, MANAGEMENT, AND ADVANCED ANALYTICAL PROCEDURES FOR INSIGHTFUL ENERGY ANALYSIS 


/*Step 1: Creating the Dataset*/

/*We'll design a dataset named RenewableEnergy that encapsulates information about

different countries, their renewable energy consumption, the type of renewable energy,

and the year of record. Here's how you can create this dataset in SAS:*/ 

data RenewableEnergy;

    input Country $ EnergyType $ ConsumptionMW Year;

    datalines;

USA Solar 50000 2020

USA Wind 60000 2020

USA Hydro 80000 2020

Germany Solar 45000 2020

Germany Wind 55000 2020

Germany Hydro 70000 2020

India Solar 30000 2020

India Wind 35000 2020

India Hydro 50000 2020

China Solar 70000 2020

China Wind 80000 2020

China Hydro 90000 2020

Brazil Solar 20000 2020

Brazil Wind 25000 2020

Brazil Hydro 60000 2020

;

run;

proc print;run;


Output:

ObsCountryEnergyTypeConsumptionMWYear
1USASolar500002020
2USAWind600002020
3USAHydro800002020
4GermanySolar450002020
5GermanyWind550002020
6GermanyHydro700002020
7IndiaSolar300002020
8IndiaWind350002020
9IndiaHydro500002020
10ChinaSolar700002020
11ChinaWind800002020
12ChinaHydro900002020
13BrazilSolar200002020
14BrazilWind250002020
15BrazilHydro600002020

/*In this dataset:*/

/*Country: Name of the country.*/

/*EnergyType: Type of renewable energy (e.g., Solar, Wind, Hydro).*/

/*ConsumptionMW: Energy consumption in megawatts.*/

/*Year: The year of the recorded data.*/


/*Step 2: Exploring the Dataset with PROC CONTENTS*/

/*To understand the structure of the RenewableEnergy dataset, we can use the 

PROC CONTENTS procedure:*/


proc contents data=RenewableEnergy;

run;


Output:

Alphabetic List of Variables and Attributes
#VariableTypeLen
3ConsumptionMWNum8
1CountryChar8
2EnergyTypeChar8
4YearNum8


/*This procedure provides detailed information about the dataset, including

variable names, types, and attributes.*/


/*Step 3: Sorting Data with PROC SORT*/

/*Before performing analyses, it's often beneficial to sort the data.

For instance, to sort the dataset by Country and EnergyType:*/


proc sort data=RenewableEnergy;

    by Country EnergyType;

run;

proc print;run;


Output:

ObsCountryEnergyTypeConsumptionMWYear
1BrazilHydro600002020
2BrazilSolar200002020
3BrazilWind250002020
4ChinaHydro900002020
5ChinaSolar700002020
6ChinaWind800002020
7GermanyHydro700002020
8GermanySolar450002020
9GermanyWind550002020
10IndiaHydro500002020
11IndiaSolar300002020
12IndiaWind350002020
13USAHydro800002020
14USASolar500002020
15USAWind600002020

/*Step 4: Summarizing Data with PROC MEANS*/

/*To obtain descriptive statistics on energy consumption:*/


proc means data=RenewableEnergy mean median maxdec=2;

    var ConsumptionMW;

    class Country EnergyType;

run;


Output:

                                                                  The MEANS Procedure

Analysis Variable : ConsumptionMW
CountryEnergyTypeN ObsMeanMedian
BrazilHydro160000.0060000.00
 Solar120000.0020000.00
 Wind125000.0025000.00
ChinaHydro190000.0090000.00
 Solar170000.0070000.00
 Wind180000.0080000.00
GermanyHydro170000.0070000.00
 Solar145000.0045000.00
 Wind155000.0055000.00
IndiaHydro150000.0050000.00
 Solar130000.0030000.00
 Wind135000.0035000.00
USAHydro180000.0080000.00
 Solar150000.0050000.00
 Wind160000.0060000.00


/*This code calculates the mean and median energy consumption for each combination

of Country and EnergyType, displaying results with two decimal places.*/


/*Step 5: Generating Frequency Tables with PROC FREQ*/

/*To analyze the distribution of energy types across countries:*/


proc freq data=RenewableEnergy;

    tables Country*EnergyType / norow nocol nopercent;

run;


Output:

                                                                           The FREQ Procedure

Frequency
Table of Country by EnergyType
CountryEnergyType
HydroSolarWindTotal
Brazil
1
1
1
3
China
1
1
1
3
Germany
1
1
1
3
India
1
1
1
3
USA
1
1
1
3
Total
5
5
5
15

/*This cross-tabulation shows the count of each energy type per country without 

displaying row, column, or percentage values.*/


/*Step 6: Data Transformation with PROC TRANSPOSE*/

/*Suppose we want to restructure the dataset to have energy types as columns

for each country. We can achieve this using PROC TRANSPOSE:*/


proc transpose data=RenewableEnergy out=EnergyTransposed;

    by Country Year;

    id EnergyType;

    var ConsumptionMW;

run;

proc print;run;


Output:
ObsCountryYear_NAME_HydroSolarWind
1Brazil2020ConsumptionMW600002000025000
2China2020ConsumptionMW900007000080000
3Germany2020ConsumptionMW700004500055000
4India2020ConsumptionMW500003000035000
5USA2020ConsumptionMW800005000060000

/*This transposes the EnergyType values into separate columns, with their 

corresponding ConsumptionMW values.*/


/*Step 7: Creating Custom Formats with PROC FORMAT*/

/*To enhance the readability of our reports, we can define custom formats. 

For example, let's categorize the energy consumption levels:*/


proc format;

    value ConsumptionFmt

        low-<40000 = 'Low'

        40000-<70000 = 'Medium'

        70000-high = 'High';

run;

/*Applying this format in a report:*/


proc print data=RenewableEnergy;

    var Country EnergyType ConsumptionMW;

    format ConsumptionMW ConsumptionFmt.;

run;


Output:

ObsCountryEnergyTypeConsumptionMW
1BrazilHydroMedium
2BrazilSolarLow
3BrazilWindLow
4ChinaHydroHigh
5ChinaSolarHigh
6ChinaWindHigh
7GermanyHydroHigh
8GermanySolarMedium
9GermanyWindMedium
10IndiaHydroMedium
11IndiaSolarLow
12IndiaWindLow
13USAHydroHigh
14USASolarMedium
15USAWindMedium

/*This will display the consumption levels as 'Low', 'Medium', or 'High'

based on the defined ranges.*/



/*Step 8: Appending Data with PROC APPEND*/

/*Imagine we have a new dataset, RenewableEnergy2021, containing data for the year 2021. We can append this data to our existing RenewableEnergy dataset:*/


proc append base=RenewableEnergy2021 data=RenewableEnergy;

run;proc print;run;


Output:

ObsCountryEnergyTypeConsumptionMWYear
1BrazilHydro600002020
2BrazilSolar200002020
3BrazilWind250002020
4ChinaHydro900002020
5ChinaSolar700002020
6ChinaWind800002020
7GermanyHydro700002020
8GermanySolar450002020
9GermanyWind550002020
10IndiaHydro500002020
11IndiaSolar300002020
12IndiaWind350002020
13USAHydro800002020
14USASolar500002020
15USAWind600002020


/*This procedure adds the observations from RenewableEnergy2021 to RenewableEnergy.*/


/*Step 9: Merging Datasets with DATA Step*/

/*If we have another dataset containing additional information, 

such as CountryDetails with attributes like GDP and Population,

we can merge it with our RenewableEnergy dataset:*/


data EnergyWithDetails;

    merge RenewableEnergy(in=a) RenewableEnergy2021(in=b);

    by Country;

    if a and b;

run;


Output:

ObsCountryEnergyTypeConsumptionMWYear
1BrazilHydro600002020
2BrazilSolar200002020
3BrazilWind250002020
4ChinaHydro900002020
5ChinaSolar700002020
6ChinaWind800002020
7GermanyHydro700002020
8GermanySolar450002020
9GermanyWind550002020
10IndiaHydro500002020
11IndiaSolar300002020
12IndiaWind350002020
13USAHydro800002020
14USASolar500002020
15USAWind600002020


/*This merges the datasets by Country, including only the records that exist in 

both datasets.*/


/*Step 10: Renaming Variables with PROC DATASETS*/

/*To rename variables for clarity, we can use the PROC DATASETS procedure:*/


proc datasets library=work;

    modify RenewableEnergy;

    rename ConsumptionMW=EnergyConsumption_MW;

run;

quit;


Output Log:

                                                                  Member      File

                #  Name                                        Type        Size  Last Modified


                1  ENERGYTRANSPOSED     DATA      131072  28/03/2025 09:06:09

                2  ENERGYWITHDETAILS    DATA      131072  28/03/2025 09:11:26

                3  FORMATS                         CATALOG    17408  28/03/2025 09:07:01

                4  RENEWABLEENERGY      DATA      131072  28/03/2025 09:07:43

                5  RENEWABLEENERGY2021  DATA      131072  28/03/2025 09:08:48




/*This changes the variable name from ConsumptionMW to EnergyConsumption_MW within the RenewableEnergy dataset.*/



/*Step 11: Deleting Datasets with PROC DATASETS*/

/*To remove datasets that are no longer needed:*/


proc datasets library=work;

    delete RenewableEnergy2021;

run;

quit;


Output:

Directory
LibrefWORK
EngineV9
Physical NameC:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD9692_DESKTOP-QFAA4KV_
FilenameC:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD9692_DESKTOP-QFAA4KV_


#NameMember TypeFile SizeLast Modified
1ENERGYTRANSPOSEDDATA13107228/03/2025 09:06:09
2ENERGYWITHDETAILSDATA13107228/03/2025 09:11:26
3FORMATSCATALOG1740828/03/2025 09:07:01
4RENEWABLEENERGYDATA19660828/03/2025 09:14:20

/*This deletes the RenewableEnergy2021 dataset from the WORK library.*/


/*Step 12: Exporting Data with PROC EXPORT*/

/*To share our dataset with others, we might need to export it to an 

external file format, such as CSV:*/


proc export data=RenewableEnergy

    outfile='/path/to/output/RenewableEnergy.csv'

    dbms=csv

    replace;

run;

/*This exports the RenewableEnergy dataset to a CSV file at the specified location.*/



PRACTICE AND COMMENT YOUR CODE: 

-->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

TO FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

Comments