160.Exploring Renewable Energy Projects with PROC CONTENTS | PROC PRINT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC SQL | PROC DATASETS IN SAS

Exploring Renewable Energy Projects with PROC CONTENTS | PROC PRINT | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC SQL | PROC DATASETS IN SAS


 /*Creating a unique dataset centered around "Global Renewable Energy Projects"*/

1. Dataset Creation: Global Renewable Energy Projects

1.1. Defining the Dataset Structure

/*We'll define a dataset named Renewable_Projects with the following variables:*/

Project_ID: Unique identifier for each project.

Project_Name: Name of the renewable energy project.

Country: Country where the project is located.

Region: Geographical region (e.g., Asia, Europe).

Energy_Type: Type of renewable energy (e.g., Solar, Wind, Hydro).

Capacity_MW: Installed capacity in megawatts.

Investment_USD_Million: Total investment in USD millions.

Start_Year: Year the project commenced operations.

Status: Current status (e.g., Operational, Under Construction).

Operator: Company or entity operating the project.​


1.2. Sample Data Creation

/*We'll create a sample of 10 records for illustration.​*/

data Renewable_Projects;

    input Project_ID $ Project_Name $ Country $ Region $ Energy_Type $ 

          Capacity_MW Investment_USD_Million Start_Year Status $ Operator $;

    datalines;

P001 SolarOne India Asia Solar 100 150 2015 Operational SunEnergy

P002 WindMax Germany Europe Wind 200 300 2012 Operational WindCorp

P003 HydroFlow Brazil SouthAmerica Hydro 500 600 2010 Operational HydroWorks

P004 SunRise USA NorthAmerica Solar 150 200 2018 Operational SolarTech

P005 WindyPlains Australia Oceania Wind 250 350 2016 Operational WindMasters

P006 RiverPower Canada NorthAmerica Hydro 400 500 2011 Operational HydroCanada

P007 DesertSun UAE Asia Solar 120 180 2019 Operational SunDesert

P008 MountainWind China Asia Wind 300 400 2013 Operational WindChina

P009 AquaEnergy Norway Europe Hydro 350 450 2014 Operational AquaNor

P010 SolarVista Kenya Africa Solar 80 100 2020 Operational SolarAfrica

;

run;

proc print;run;

Output:

Obs Project_ID Project_Name Country Region Energy_Type Capacity_MW Investment_USD_Million Start_Year Status Operator
1 P001 SolarOne India Asia Solar 100 150 2015 Operatio SunEnerg
2 P002 WindMax Germany Europe Wind 200 300 2012 Operatio WindCorp
3 P003 HydroFlo Brazil SouthAme Hydro 500 600 2010 Operatio HydroWor
4 P004 SunRise USA NorthAme Solar 150 200 2018 Operatio SolarTec
5 P005 WindyPla Australi Oceania Wind 250 350 2016 Operatio WindMast
6 P006 RiverPow Canada NorthAme Hydro 400 500 2011 Operatio HydroCan
7 P007 DesertSu UAE Asia Solar 120 180 2019 Operatio SunDeser
8 P008 Mountain China Asia Wind 300 400 2013 Operatio WindChin
9 P009 AquaEner Norway Europe Hydro 350 450 2014 Operatio AquaNor
10 P010 SolarVis Kenya Africa Solar 80 100 2020 Operatio SolarAfr


2. Data Exploration and Management

2.1. Viewing the Dataset Structure

/*Use PROC CONTENTS to view the structure and attributes of the dataset.*/

proc contents data=Renewable_Projects;

run;

Output:

                                                                  The CONTENTS Procedure

Data Set Name WORK.RENEWABLE_PROJECTS Observations 10
Member Type DATA Variables 10
Engine V9 Indexes 0
Created 14/09/2015 00:03:27 Observation Length 80
Last Modified 14/09/2015 00:03:27 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    

Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 817
Obs in First Data Page 10
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD860_DESKTOP-QFAA4KV_\renewable_projects.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME

Alphabetic List of Variables and Attributes
# Variable Type Len
6 Capacity_MW Num 8
3 Country Char 8
5 Energy_Type Char 8
7 Investment_USD_Million Num 8
10 Operator Char 8
1 Project_ID Char 8
2 Project_Name Char 8
4 Region Char 8
8 Start_Year Num 8
9 Status Char 8

2.2. Viewing the Data

/*Use PROC PRINT to display the dataset.​*/

proc print data=Renewable_Projects;

run;

Output:

Obs Project_ID Project_Name Country Region Energy_Type Capacity_MW Investment_USD_Million Start_Year Status Operator
1 P001 SolarOne India Asia Solar 100 150 2015 Operatio SunEnerg
2 P002 WindMax Germany Europe Wind 200 300 2012 Operatio WindCorp
3 P003 HydroFlo Brazil SouthAme Hydro 500 600 2010 Operatio HydroWor
4 P004 SunRise USA NorthAme Solar 150 200 2018 Operatio SolarTec
5 P005 WindyPla Australi Oceania Wind 250 350 2016 Operatio WindMast
6 P006 RiverPow Canada NorthAme Hydro 400 500 2011 Operatio HydroCan
7 P007 DesertSu UAE Asia Solar 120 180 2019 Operatio SunDeser
8 P008 Mountain China Asia Wind 300 400 2013 Operatio WindChin
9 P009 AquaEner Norway Europe Hydro 350 450 2014 Operatio AquaNor
10 P010 SolarVis Kenya Africa Solar 80 100 2020 Operatio SolarAfr

3. Data Manipulation

3.1. Filtering Data

/*Filter projects located in Asia.*/

data Asia_Projects;

    set Renewable_Projects;

    where Region = 'Asia';

run;

proc print;run;

Output:

Obs Project_ID Project_Name Country Region Energy_Type Capacity_MW Investment_USD_Million Start_Year Status Operator
1 P001 SolarOne India Asia Solar 100 150 2015 Operatio SunEnerg
2 P007 DesertSu UAE Asia Solar 120 180 2019 Operatio SunDeser
3 P008 Mountain China Asia Wind 300 400 2013 Operatio WindChin

3.2. Creating New Variables

/*Calculate the cost per megawatt for each project.*/

data Renewable_Projects;

    set Renewable_Projects;

    Cost_Per_MW = Investment_USD_Million / Capacity_MW;

run;

proc print;run;

Output:

Obs Project_ID Project_Name Country Region Energy_Type Capacity_MW Investment_USD_Million Start_Year Status Operator Cost_Per_MW
1 P001 SolarOne India Asia Solar 100 150 2015 Operatio SunEnerg 1.50000
2 P002 WindMax Germany Europe Wind 200 300 2012 Operatio WindCorp 1.50000
3 P003 HydroFlo Brazil SouthAme Hydro 500 600 2010 Operatio HydroWor 1.20000
4 P004 SunRise USA NorthAme Solar 150 200 2018 Operatio SolarTec 1.33333
5 P005 WindyPla Australi Oceania Wind 250 350 2016 Operatio WindMast 1.40000
6 P006 RiverPow Canada NorthAme Hydro 400 500 2011 Operatio HydroCan 1.25000
7 P007 DesertSu UAE Asia Solar 120 180 2019 Operatio SunDeser 1.50000
8 P008 Mountain China Asia Wind 300 400 2013 Operatio WindChin 1.33333
9 P009 AquaEner Norway Europe Hydro 350 450 2014 Operatio AquaNor 1.28571
10 P010 SolarVis Kenya Africa Solar 80 100 2020 Operatio SolarAfr 1.25000


4. Data Analysis

4.1. Summary Statistics

/*Use PROC MEANS to compute summary statistics for capacity and investment.*/

proc means data=Renewable_Projects mean median min max;

    var Capacity_MW Investment_USD_Million;

run;

Output:

                                                             The MEANS Procedure

Variable Mean Median Minimum Maximum
Capacity_MW
Investment_USD_Million
245.0000000
323.0000000
225.0000000
325.0000000
80.0000000
100.0000000
500.0000000
600.0000000


4.2. Frequency Analysis

/*Use PROC FREQ to analyze the distribution of energy types.*/

proc freq data=Renewable_Projects;

    tables Energy_Type;

run;

Output:

                                                                 The FREQ Procedure

Energy_Type Frequency Percent Cumulative
Frequency
Cumulative
Percent
Hydro 3 30.00 3 30.00
Solar 4 40.00 7 70.00
Wind 3 30.00 10 100.00


5. Data Visualization

5.1. Bar Chart of Projects by Energy Type

/*Use PROC SGPLOT to create a bar chart showing the number of projects per energy type.*/

proc sgplot data=Renewable_Projects;

    vbar Energy_Type;

    title "Number of Projects by Energy Type";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):

      real time           2.56 seconds

      cpu time            0.53 seconds


NOTE: Listing image output written to SGPlot1.png.

NOTE: There were 10 observations read from the data set WORK.RENEWABLE_PROJECTS.


5.2. Scatter Plot of Investment vs. Capacity

/*Visualize the relationship between investment and capacity.*/

proc sgplot data=Renewable_Projects;

    scatter x=Capacity_MW y=Investment_USD_Million / group=Energy_Type;

    title "Investment vs. Capacity by Energy Type";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):

      real time           0.75 seconds

      cpu time            0.09 seconds


NOTE: Listing image output written to SGPlot3.png.

NOTE: There were 10 observations read from the data set WORK.RENEWABLE_PROJECTS.


6. Advanced Data Operations

6.1. Sorting Data

/*Sort projects by descending capacity.*/

proc sort data=Renewable_Projects out=Sorted_Projects;

    by descending Capacity_MW;

run;

proc print;run;

Output:

Obs Project_ID Project_Name Country Region Energy_Type Capacity_MW Investment_USD_Million Start_Year Status Operator Cost_Per_MW
1 P003 HydroFlo Brazil SouthAme Hydro 500 600 2010 Operatio HydroWor 1.20000
2 P006 RiverPow Canada NorthAme Hydro 400 500 2011 Operatio HydroCan 1.25000
3 P009 AquaEner Norway Europe Hydro 350 450 2014 Operatio AquaNor 1.28571
4 P008 Mountain China Asia Wind 300 400 2013 Operatio WindChin 1.33333
5 P005 WindyPla Australi Oceania Wind 250 350 2016 Operatio WindMast 1.40000
6 P002 WindMax Germany Europe Wind 200 300 2012 Operatio WindCorp 1.50000
7 P004 SunRise USA NorthAme Solar 150 200 2018 Operatio SolarTec 1.33333
8 P007 DesertSu UAE Asia Solar 120 180 2019 Operatio SunDeser 1.50000
9 P001 SolarOne India Asia Solar 100 150 2015 Operatio SunEnerg 1.50000
10 P010 SolarVis Kenya Africa Solar 80 100 2020 Operatio SolarAfr 1.25000


6.2. Aggregating Data

/*Calculate total capacity and investment by region.​*/

proc sql;

    create table Region_Summary as

    select Region, 

           sum(Capacity_MW) as Total_Capacity,

           sum(Investment_USD_Million) as Total_Investment

    from Renewable_Projects

    group by Region;

quit;

proc print;run;

Output:

Obs Region Total_Capacity Total_Investment
1 Africa 80 100
2 Asia 520 730
3 Europe 550 750
4 NorthAme 550 700
5 Oceania 250 350
6 SouthAme 500 600


7. Creating Indexes for Efficient Access

/*Use PROC DATASETS to create an index on the Country variable for faster access.*/

proc datasets library=work;

    modify Renewable_Projects;

    index create Country;

run;

quit;

Log:

index create Country;

NOTE: Simple index Country has been defined.

62   run;

NOTE: MODIFY was successful for WORK.RENEWABLE_PROJECTS.DATA.

63   quit;

NOTE: PROCEDURE DATASETS used (Total process time):

      real time           0.26 seconds

      cpu time            0.03 seconds


Output:

Directory
Libref WORK
Engine V9
Physical Name C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD860_DESKTOP-QFAA4KV_
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD860_DESKTOP-QFAA4KV_

# Name Member Type File Size Last Modified
1 ASIA_PROJECTS DATA 131072 22/04/2025 08:54:30
2 REGION_SUMMARY DATA 131072 22/04/2025 08:58:00
3 RENEWABLE_PROJECTS DATA 131072 22/04/2025 08:54:47
4 SORTED_PROJECTS DATA 131072 22/04/2025 08:57:39

PRACTICE AND COMMENT YOUR CODE: 

-->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

TO FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

Comments