COMPREHENSIVE RAINFALL ANALYSIS ACROSS DIFFERENT PLACES IN INDIA USING PROC PRINT | PROC SORT | PROC MEANS | PROC FREQ | PROC SQL | ADVANCED MACROS IN SAS
/* Creating a dataset named rainfall_data */
1. Dataset Creation
options nocenter;
data rainfall_data;
infile datalines dlm=',' dsd;
input City : $20. State : $20. Month : $10. Rainfall_mm Region : $10.;
datalines;
Hyderabad,Telangana,June,145,South
Chennai,Tamil Nadu,June,110,South
Mumbai,Maharashtra,June,540,West
Delhi,Delhi,June,180,North
Bengaluru,Karnataka,June,130,South
Kolkata,West Bengal,June,380,East
Ahmedabad,Gujarat,June,220,West
Lucknow,Uttar Pradesh,June,160,North
Jaipur,Rajasthan,June,90,West
Patna,Bihar,June,290,East
Bhopal,Madhya Pradesh,June,250,Central
Guwahati,Assam,June,420,Northeast
Imphal,Manipur,June,380,Northeast
Shimla,Himachal Pradesh,June,200,North
Panaji,Goa,June,470,West
Thiruvananthapuram,Kerala,June,600,South
Ranchi,Jharkhand,June,270,East
Pune,Maharashtra,June,490,West
Vijayawada,Andhra Pradesh,June,180,South
Nagpur,Maharashtra,June,300,Central
Agartala,Tripura,June,350,Northeast
Shillong,Meghalaya,June,500,Northeast
Varanasi,Uttar Pradesh,June,210,North
Coimbatore,Tamil Nadu,June,140,South
Raipur,Chhattisgarh,June,310,Central
Itanagar,Arunachal Pradesh,June,460,Northeast
Dehradun,Uttarakhand,June,280,North
Aizawl,Mizoram,June,410,Northeast
;
run;
proc print;run;
Output:
| Obs | City | State | Month | Rainfall_mm | Region |
|---|---|---|---|---|---|
| 1 | Hyderabad | Telangana | June | 145 | South |
| 2 | Chennai | Tamil Nadu | June | 110 | South |
| 3 | Mumbai | Maharashtra | June | 540 | West |
| 4 | Delhi | Delhi | June | 180 | North |
| 5 | Bengaluru | Karnataka | June | 130 | South |
| 6 | Kolkata | West Bengal | June | 380 | East |
| 7 | Ahmedabad | Gujarat | June | 220 | West |
| 8 | Lucknow | Uttar Pradesh | June | 160 | North |
| 9 | Jaipur | Rajasthan | June | 90 | West |
| 10 | Patna | Bihar | June | 290 | East |
| 11 | Bhopal | Madhya Pradesh | June | 250 | Central |
| 12 | Guwahati | Assam | June | 420 | Northeast |
| 13 | Imphal | Manipur | June | 380 | Northeast |
| 14 | Shimla | Himachal Pradesh | June | 200 | North |
| 15 | Panaji | Goa | June | 470 | West |
| 16 | Thiruvananthapuram | Kerala | June | 600 | South |
| 17 | Ranchi | Jharkhand | June | 270 | East |
| 18 | Pune | Maharashtra | June | 490 | West |
| 19 | Vijayawada | Andhra Pradesh | June | 180 | South |
| 20 | Nagpur | Maharashtra | June | 300 | Central |
| 21 | Agartala | Tripura | June | 350 | Northeast |
| 22 | Shillong | Meghalaya | June | 500 | Northeast |
| 23 | Varanasi | Uttar Pradesh | June | 210 | North |
| 24 | Coimbatore | Tamil Nadu | June | 140 | South |
| 25 | Raipur | Chhattisgarh | June | 310 | Central |
| 26 | Itanagar | Arunachal Pradesh | June | 460 | Northeast |
| 27 | Dehradun | Uttarakhand | June | 280 | North |
| 28 | Aizawl | Mizoram | June | 410 | Northeast |
2. PROC PRINT – Display the Dataset
title "Rainfall Dataset Across Different Cities in India";
proc print data=rainfall_data noobs;
run;
Output:
| City | State | Month | Rainfall_mm | Region |
|---|---|---|---|---|
| Hyderabad | Telangana | June | 145 | South |
| Chennai | Tamil Nadu | June | 110 | South |
| Mumbai | Maharashtra | June | 540 | West |
| Delhi | Delhi | June | 180 | North |
| Bengaluru | Karnataka | June | 130 | South |
| Kolkata | West Bengal | June | 380 | East |
| Ahmedabad | Gujarat | June | 220 | West |
| Lucknow | Uttar Pradesh | June | 160 | North |
| Jaipur | Rajasthan | June | 90 | West |
| Patna | Bihar | June | 290 | East |
| Bhopal | Madhya Pradesh | June | 250 | Central |
| Guwahati | Assam | June | 420 | Northeast |
| Imphal | Manipur | June | 380 | Northeast |
| Shimla | Himachal Pradesh | June | 200 | North |
| Panaji | Goa | June | 470 | West |
| Thiruvananthapuram | Kerala | June | 600 | South |
| Ranchi | Jharkhand | June | 270 | East |
| Pune | Maharashtra | June | 490 | West |
| Vijayawada | Andhra Pradesh | June | 180 | South |
| Nagpur | Maharashtra | June | 300 | Central |
| Agartala | Tripura | June | 350 | Northeast |
| Shillong | Meghalaya | June | 500 | Northeast |
| Varanasi | Uttar Pradesh | June | 210 | North |
| Coimbatore | Tamil Nadu | June | 140 | South |
| Raipur | Chhattisgarh | June | 310 | Central |
| Itanagar | Arunachal Pradesh | June | 460 | Northeast |
| Dehradun | Uttarakhand | June | 280 | North |
| Aizawl | Mizoram | June | 410 | Northeast |
3. PROC SORT – Sorting by Rainfall
proc sort data=rainfall_data out=sorted_rainfall;
by descending Rainfall_mm;
run;
title "Top Cities by Rainfall";
proc print data=sorted_rainfall (obs=10);
run;
Output:
| Obs | City | State | Month | Rainfall_mm | Region |
|---|---|---|---|---|---|
| 1 | Thiruvananthapuram | Kerala | June | 600 | South |
| 2 | Mumbai | Maharashtra | June | 540 | West |
| 3 | Shillong | Meghalaya | June | 500 | Northeast |
| 4 | Pune | Maharashtra | June | 490 | West |
| 5 | Panaji | Goa | June | 470 | West |
| 6 | Itanagar | Arunachal Pradesh | June | 460 | Northeast |
| 7 | Guwahati | Assam | June | 420 | Northeast |
| 8 | Aizawl | Mizoram | June | 410 | Northeast |
| 9 | Kolkata | West Bengal | June | 380 | East |
| 10 | Imphal | Manipur | June | 380 | Northeast |
4. PROC MEANS – Rainfall Statistics
title "Rainfall Summary Statistics by Region";
proc means data=rainfall_data mean median min max maxdec=2;
class Region;
var Rainfall_mm;
run;
Output:
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |||||
|---|---|---|---|---|---|
| Region | N Obs | Mean | Median | Minimum | Maximum |
| Central | 3 | 286.67 | 300.00 | 250.00 | 310.00 |
| East | 3 | 313.33 | 290.00 | 270.00 | 380.00 |
| North | 5 | 206.00 | 200.00 | 160.00 | 280.00 |
| Northeast | 6 | 420.00 | 415.00 | 350.00 | 500.00 |
| South | 6 | 217.50 | 142.50 | 110.00 | 600.00 |
| West | 5 | 362.00 | 470.00 | 90.00 | 540.00 |
5. PROC FREQ – Frequency of Cities by Region
title "Number of Cities Per Region";
proc freq data=rainfall_data;
tables Region;
run;
Output:
The FREQ Procedure
| Region | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Central | 3 | 10.71 | 3 | 10.71 |
| East | 3 | 10.71 | 6 | 21.43 |
| North | 5 | 17.86 | 11 | 39.29 |
| Northeast | 6 | 21.43 | 17 | 60.71 |
| South | 6 | 21.43 | 23 | 82.14 |
| West | 5 | 17.86 | 28 | 100.00 |
6. PROC SQL – Advanced Queries
a. Average Rainfall per State
title "Average Rainfall per State";
proc sql;
select State, mean(Rainfall_mm) as Avg_Rainfall format=8.2
from rainfall_data
group by State
order by Avg_Rainfall desc;
quit;
Output:
| State | Avg_Rainfall |
|---|---|
| Kerala | 600.00 |
| Meghalaya | 500.00 |
| Goa | 470.00 |
| Arunachal Pradesh | 460.00 |
| Maharashtra | 443.33 |
| Assam | 420.00 |
| Mizoram | 410.00 |
| Manipur | 380.00 |
| West Bengal | 380.00 |
| Tripura | 350.00 |
| Chhattisgarh | 310.00 |
| Bihar | 290.00 |
| Uttarakhand | 280.00 |
| Jharkhand | 270.00 |
| Madhya Pradesh | 250.00 |
| Gujarat | 220.00 |
| Himachal Pradesh | 200.00 |
| Uttar Pradesh | 185.00 |
| Delhi | 180.00 |
| Andhra Pradesh | 180.00 |
| Telangana | 145.00 |
| Karnataka | 130.00 |
| Tamil Nadu | 125.00 |
| Rajasthan | 90.00 |
b. Cities with Rainfall above 400mm
title "Cities with Rainfall Greater than 400mm";
proc sql;
select City, State, Rainfall_mm
from rainfall_data
where Rainfall_mm > 400
order by Rainfall_mm desc;
quit;
Output:
| City | State | Rainfall_mm |
|---|---|---|
| Thiruvananthapuram | Kerala | 600 |
| Mumbai | Maharashtra | 540 |
| Shillong | Meghalaya | 500 |
| Pune | Maharashtra | 490 |
| Panaji | Goa | 470 |
| Itanagar | Arunachal Pradesh | 460 |
| Guwahati | Assam | 420 |
| Aizawl | Mizoram | 410 |
c. Regional Rainfall Contribution
title "Total Rainfall by Region";
proc sql;
select Region, sum(Rainfall_mm) as Total_Rainfall format=8.0
from rainfall_data
group by Region;
quit;
Output:
| Region | Total_Rainfall |
|---|---|
| Central | 860 |
| East | 940 |
| North | 1030 |
| Northeast | 2520 |
| South | 1305 |
| West | 1810 |
7. MACRO – Automating Regional Analysis
%macro avg_rainfall_by_region;
proc sql noprint;
select distinct Region into :region_list separated by '|'
from rainfall_data;
quit;
data _null_;
length str $500;
str = symget("region_list");
count = countw(str, '|');
call symputx("region_count", count);
run;
%do i = 1 %to ®ion_count;
%let region_name = %scan(®ion_list, &i, |);
title "Rainfall Stats for ®ion_name Region";
proc means data=rainfall_data mean median maxdec=1;
where Region = "®ion_name";
var Rainfall_mm;
run;
%end;
%mend avg_rainfall_by_region;
%avg_rainfall_by_region;
Output:
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |
|---|---|
| Mean | Median |
| 286.7 | 300.0 |
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |
|---|---|
| Mean | Median |
| 313.3 | 290.0 |
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |
|---|---|
| Mean | Median |
| 206.0 | 200.0 |
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |
|---|---|
| Mean | Median |
| 420.0 | 415.0 |
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |
|---|---|
| Mean | Median |
| 217.5 | 142.5 |
The MEANS Procedure
| Analysis Variable : Rainfall_mm | |
|---|---|
| Mean | Median |
| 362.0 | 470.0 |
8. Additional Analytical Scenarios
a. Top 5 Cities in South India by Rainfall
title "Top 5 South Indian Cities by Rainfall";
proc sql;
select City, Rainfall_mm
from rainfall_data
where Region = 'South'
order by Rainfall_mm desc;
quit;
Output:
| City | Rainfall_mm |
|---|---|
| Thiruvananthapuram | 600 |
| Vijayawada | 180 |
| Hyderabad | 145 |
| Coimbatore | 140 |
| Bengaluru | 130 |
| Chennai | 110 |
b. Lowest Rainfall Across India
title "Cities with Least Rainfall";
proc sql outobs=5;
select City, Rainfall_mm
from rainfall_data
order by Rainfall_mm asc;
quit;
Output:
| City | Rainfall_mm |
|---|---|
| Jaipur | 90 |
| Chennai | 110 |
| Bengaluru | 130 |
| Coimbatore | 140 |
| Hyderabad | 145 |
No comments:
Post a Comment