Wednesday, 12 November 2025

307. INDIAN RIVERS DATA ANALYSIS USING DATA STEP | PROC SQL | PROC MEANS | PROC SORT | PROC UNIVARIATE | PROC SGPLOT AND MACRO AUTOMATION

1. CREATING THE DATASET USING THE DATA STEP

options nocenter nodate nonumber;

data work.indian_rivers;
    length River_Name $20 Origin_State $20 End_Location $25;
    infile datalines dsd;
    input River_Name $ Length_KM Origin_State $ End_Location $ No_of_States
          Annual_Flow_BCM;
    datalines;
Ganga,2525,Uttarakhand,Bay_of_Bengal,11,525
Godavari,1465,Maharashtra,Bay_of_Bengal,7,110
Krishna,1400,Maharashtra,Bay_of_Bengal,5,78
Yamuna,1376,Uttarakhand,Ganga_Confluence,6,93
Brahmaputra,2900,Arunachal_Pradesh,Bay_of_Bengal,4,537
Narmada,1312,Madhya_Pradesh,Arabian_Sea,3,96
Tapti,724,Madhya_Pradesh,Arabian_Sea,3,37
Mahanadi,858,Chhattisgarh,Bay_of_Bengal,5,66
Cauvery,805,Karnataka,Bay_of_Bengal,3,21
Sabarmati,371,Rajasthan,Arabian_Sea,2,12
Beas,470,Himachal_Pradesh,Sutlej_River,2,15
Sutlej,1450,Tibet,Indus_River,4,77
;
run;

proc print data=work.indian_rivers;
    title "INDIAN RIVERS DATASET";
run;

OUTPUT:

INDIAN RIVERS DATASET

Obs  River_Name   Origin_State       End_Location      Length_KM  No_of_States  Annual_Flow_BCM
  1  Ganga        Uttarakhand        Bay_of_Bengal          2525            11              525
  2  Godavari     Maharashtra        Bay_of_Bengal          1465             7              110
  3  Krishna      Maharashtra        Bay_of_Bengal          1400             5               78
  4  Yamuna       Uttarakhand        Ganga_Confluence       1376             6               93
  5  Brahmaputra  Arunachal_Pradesh  Bay_of_Bengal          2900             4              537
  6  Narmada      Madhya_Pradesh     Arabian_Sea            1312             3               96
  7  Tapti        Madhya_Pradesh     Arabian_Sea             724             3               37
  8  Mahanadi     Chhattisgarh       Bay_of_Bengal           858             5               66
  9  Cauvery      Karnataka          Bay_of_Bengal           805             3               21
 10  Sabarmati    Rajasthan          Arabian_Sea             371             2               12
 11  Beas         Himachal_Pradesh   Sutlej_River            470             2               15
 12  Sutlej       Tibet              Indus_River            1450             4               77

2. DESCRIPTIVE STATISTICS USING PROC MEANS

proc means data=work.indian_rivers mean min max maxdec=2;
    var Length_KM No_of_States Annual_Flow_BCM;
    title "SUMMARY STATISTICS FOR NUMERIC VARIABLES";
run;

OUTPUT:

SUMMARY STATISTICS FOR NUMERIC VARIABLES

The MEANS Procedure

Variable               Mean     Minimum     Maximum
Length_KM           1304.67      371.00     2900.00
No_of_States           4.58        2.00       11.00
Annual_Flow_BCM      138.92       12.00      537.00
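
The statistics above pool all twelve rivers. As a minimal sketch that is not part of the original analysis, the same summary can be split by termination point by adding a CLASS statement. The choice of End_Location as the grouping variable, the extra N statistic, and the title are assumptions made for illustration.

```sas
/* Sketch: summary statistics per End_Location (grouping variable is an assumption) */
proc means data=work.indian_rivers n mean min max maxdec=2;
    class End_Location;                  /* one block of statistics per end location */
    var Length_KM Annual_Flow_BCM;
    title "SUMMARY STATISTICS BY END LOCATION";
run;
```

Several end locations hold a single river, so their mean, minimum, and maximum coincide; the N column makes that visible.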

3. DATA SORTING USING PROC SORT

proc sort data=work.indian_rivers out=work.rivers_sorted;
    by descending Annual_Flow_BCM Length_KM;
run;

proc print data=work.rivers_sorted;
    title "RIVERS SORTED BY DESCENDING FLOW AND ASCENDING LENGTH";
run;

OUTPUT:

RIVERS SORTED BY DESCENDING FLOW AND ASCENDING LENGTH

Obs  River_Name   Origin_State       End_Location      Length_KM  No_of_States  Annual_Flow_BCM
  1  Brahmaputra  Arunachal_Pradesh  Bay_of_Bengal          2900             4              537
  2  Ganga        Uttarakhand        Bay_of_Bengal          2525            11              525
  3  Godavari     Maharashtra        Bay_of_Bengal          1465             7              110
  4  Narmada      Madhya_Pradesh     Arabian_Sea            1312             3               96
  5  Yamuna       Uttarakhand        Ganga_Confluence       1376             6               93
  6  Krishna      Maharashtra        Bay_of_Bengal          1400             5               78
  7  Sutlej       Tibet              Indus_River            1450             4               77
  8  Mahanadi     Chhattisgarh       Bay_of_Bengal           858             5               66
  9  Tapti        Madhya_Pradesh     Arabian_Sea             724             3               37
 10  Cauvery      Karnataka          Bay_of_Bengal           805             3               21
 11  Beas         Himachal_Pradesh   Sutlej_River            470             2               15
 12  Sabarmati    Rajasthan          Arabian_Sea             371             2               12

4. VISUALIZING RIVER DATA USING PROC SGPLOT

proc sgplot data=work.indian_rivers;
    title "LENGTH vs ANNUAL FLOW OF MAJOR INDIAN RIVERS";
    scatter x=Length_KM y=Annual_Flow_BCM
            / datalabel=River_Name markerattrs=(symbol=circlefilled);
    xaxis label="River Length (KM)";
    yaxis label="Annual Flow (Billion Cubic Meters)";
run;

OUTPUT:

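The SGPlot Procedure

[Scatter plot: River Length (KM) on the x-axis against Annual Flow (Billion Cubic Meters) on the y-axis, each point labeled with its river name]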

5. USING PROC SQL FOR DATA MANIPULATION

proc sql;
    title "RIVERS LONGER THAN 1000 KM";
    select River_Name, Length_KM, Origin_State, Annual_Flow_BCM
    from work.indian_rivers
    where Length_KM > 1000
    order by Annual_Flow_BCM desc;
quit;

OUTPUT:

RIVERS LONGER THAN 1000 KM

River_Name   Length_KM  Origin_State       Annual_Flow_BCM
Brahmaputra       2900  Arunachal_Pradesh              537
Ganga             2525  Uttarakhand                    525
Godavari          1465  Maharashtra                    110
Narmada           1312  Madhya_Pradesh                  96
Yamuna            1376  Uttarakhand                     93
Krishna           1400  Maharashtra                     78
Sutlej            1450  Tibet                           77

6. MACRO AUTOMATION FOR REUSABLE ANALYSIS

/* Macro example: compute the average flow for a selected river */
%macro river_flow(rivername);
    %local avgflow;
    proc sql noprint;
        /* TRIMMED removes the leading blanks that INTO would otherwise store */
        select mean(Annual_Flow_BCM) into :avgflow trimmed
        from work.indian_rivers
        where River_Name="&rivername";
    quit;
    %put Average Flow for &rivername = &avgflow Billion Cubic Meters;
%mend river_flow;


%river_flow(Ganga);

LOG:

Average Flow for Ganga = 525 Billion Cubic Meters

%river_flow(Godavari);

LOG:

Average Flow for Godavari = 110 Billion Cubic Meters

%river_flow(Narmada);

LOG:

Average Flow for Narmada = 96 Billion Cubic Meters
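
Each call above names a river that exists in the dataset. The following is a hedged sketch of a variant that reports when no row matches. The macro name river_flow_checked, the count-based check, and the case-insensitive comparison are illustrative assumptions rather than part of the original post, and the TRIMMED option assumes SAS 9.3 or later.

```sas
/* Sketch: hypothetical variant of river_flow that handles an unknown river name */
%macro river_flow_checked(rivername);
    %local n avgflow;
    proc sql noprint;
        /* count(*) is 0 when no row matches, so it can be tested safely */
        select count(*), mean(Annual_Flow_BCM)
            into :n trimmed, :avgflow trimmed
        from work.indian_rivers
        where upcase(River_Name) = upcase("&rivername");
    quit;
    %if &n = 0 %then
        %put WARNING: No river named &rivername found in work.indian_rivers.;
    %else
        %put Average Flow for &rivername = &avgflow Billion Cubic Meters;
%mend river_flow_checked;

%river_flow_checked(ganga);   /* matches Ganga despite the lowercase spelling */
%river_flow_checked(Indus);   /* not in the dataset, so the warning branch runs */
```

Testing the row count rather than the mean avoids printing a missing value when the WHERE clause selects nothing.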

7. DISTRIBUTION ANALYSIS USING PROC UNIVARIATE

proc univariate data=work.indian_rivers normal;
    var Annual_Flow_BCM;
    title "DISTRIBUTION ANALYSIS OF RIVER FLOW (BCM)";
run;

OUTPUT:

DISTRIBUTION ANALYSIS OF RIVER FLOW (BCM)

The UNIVARIATE Procedure

Variable: Annual_Flow_BCM

Moments
N                          12    Sum Weights                12
Mean               138.916667    Sum Observations         1667
Std Deviation      186.092088    Variance           34630.2652
Skewness            1.9118054    Kurtosis           2.29904603
Uncorrected SS         612507    Corrected SS       380932.917
Coeff Variation    133.959511    Std Error Mean     53.7201585

Basic Statistical Measures
Location               Variability
Mean      138.9167     Std Deviation          186.09209
Median     77.5000     Variance                   34630
Mode         .         Range                  525.00000
                       Interquartile Range     74.00000

Tests for Location: Mu0=0
Test           Statistic         p Value
Student's t    t    2.585932     Pr > |t|     0.0253
Sign           M           6     Pr >= |M|    0.0005
Signed Rank    S          39     Pr >= |S|    0.0005

Tests for Normality
Test                  Statistic            p Value
Shapiro-Wilk          W       0.629754     Pr < W       0.0002
Kolmogorov-Smirnov    D       0.395076     Pr > D      <0.0100
Cramer-von Mises      W-Sq    0.382163     Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    2.042281     Pr > A-Sq   <0.0050

Quantiles (Definition 5)
Level         Quantile
100% Max         537.0
99%              537.0
95%              537.0
90%              525.0
75% Q3           103.0
50% Median        77.5
25% Q1            29.0
10%               15.0
5%                12.0
1%                12.0
0% Min            12.0

Extreme Observations
    Lowest            Highest
Value    Obs      Value    Obs
   12     10         93      4
   15     11         96      6
   21      9        110      2
   37      7        525      1
   66      8        537      5

8. DERIVING NEW VARIABLES (Flow Efficiency)

data work.river_efficiency;
    set work.indian_rivers;
    Flow_Efficiency = Annual_Flow_BCM / Length_KM;
run;

proc print data=work.river_efficiency;
    title "DERIVED VARIABLE: FLOW EFFICIENCY (BCM PER KM)";
run;

OUTPUT:

DERIVED VARIABLE: FLOW EFFICIENCY (BCM PER KM)

Obs  River_Name   Origin_State       End_Location      Length_KM  No_of_States  Annual_Flow_BCM  Flow_Efficiency
  1  Ganga        Uttarakhand        Bay_of_Bengal          2525            11              525          0.20792
  2  Godavari     Maharashtra        Bay_of_Bengal          1465             7              110          0.07509
  3  Krishna      Maharashtra        Bay_of_Bengal          1400             5               78          0.05571
  4  Yamuna       Uttarakhand        Ganga_Confluence       1376             6               93          0.06759
  5  Brahmaputra  Arunachal_Pradesh  Bay_of_Bengal          2900             4              537          0.18517
  6  Narmada      Madhya_Pradesh     Arabian_Sea            1312             3               96          0.07317
  7  Tapti        Madhya_Pradesh     Arabian_Sea             724             3               37          0.05110
  8  Mahanadi     Chhattisgarh       Bay_of_Bengal           858             5               66          0.07692
  9  Cauvery      Karnataka          Bay_of_Bengal           805             3               21          0.02609
 10  Sabarmati    Rajasthan          Arabian_Sea             371             2               12          0.03235
 11  Beas         Himachal_Pradesh   Sutlej_River            470             2               15          0.03191
 12  Sutlej       Tibet              Indus_River            1450             4               77          0.05310
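
The listing above keeps the rivers in input order, which makes the derived variable hard to compare at a glance. A minimal sketch of one way to rank them, assuming work.river_efficiency has been created as shown; the output dataset name work.efficiency_ranked, the 8.4 display format, and the title are illustrative choices, not part of the original post.

```sas
/* Sketch: rank rivers by the derived Flow_Efficiency (dataset name is hypothetical) */
proc sort data=work.river_efficiency out=work.efficiency_ranked;
    by descending Flow_Efficiency;       /* highest BCM per KM first */
run;

proc print data=work.efficiency_ranked noobs;
    var River_Name Length_KM Annual_Flow_BCM Flow_Efficiency;
    format Flow_Efficiency 8.4;
    title "RIVERS RANKED BY FLOW EFFICIENCY (BCM PER KM)";
run;
```

Writing the sorted rows to a separate dataset with OUT= leaves work.river_efficiency in its original order for the later steps.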

9. ADVANCED SQL GROUPING (BY TERMINATION LOCATION)

proc sql;
    title "TOTAL FLOW CONTRIBUTION BY END LOCATION";
    select End_Location,
           count(*) as River_Count,
           sum(Annual_Flow_BCM) as Total_Flow_BCM format=8.2
    from work.indian_rivers
    group by End_Location
    order by Total_Flow_BCM desc;
quit;

OUTPUT:

TOTAL FLOW CONTRIBUTION BY END LOCATION

End_Location      River_Count  Total_Flow_BCM
Bay_of_Bengal               6         1337.00
Arabian_Sea                 3          145.00
Ganga_Confluence            1           93.00
Indus_River                 1           77.00
Sutlej_River                1           15.00

10. COMPARATIVE VISUALIZATION BY END LOCATION

proc sgplot data=work.indian_rivers;
    vbar End_Location / response=Annual_Flow_BCM stat=sum datalabel;
    title "TOTAL ANNUAL FLOW CONTRIBUTION BY TERMINATION BASIN";
    yaxis label="Total Flow (BCM)";
    xaxis label="End Location / Basin";
run;

OUTPUT:

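The SGPlot Procedure

[Bar chart: total Annual_Flow_BCM for each End_Location; x-axis "End Location / Basin", y-axis "Total Flow (BCM)"]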

11. MACRO FOR AUTOMATED GRAPH GENERATION

%macro plot_numeric(var);
    proc sgplot data=work.indian_rivers;
        vbar River_Name / response=&var datalabel;
        title "BAR CHART OF &var FOR INDIAN RIVERS";
        yaxis label="&var";
    run;
%mend plot_numeric;


%plot_numeric(Length_KM);

OUTPUT:

The SGPlot Procedure

[Bar chart: Length_KM for each river, one bar per River_Name]

%plot_numeric(Annual_Flow_BCM);

OUTPUT:

The SGPlot Procedure

[Bar chart: Annual_Flow_BCM for each river, one bar per River_Name]



