273.ADVANCED SAS TECHNIQUES ON INDIAN STATES DATA: PROC PRINT | PROC MEANS | PROC FREQ | PROC FORMAT | PROC SORT | PROC REPORT | PROC SQL | PROC GPLOT | MACROS

1. Sample data on Indian states: population (millions), area (sq km), GDP (billion USD), literacy rate (%)

/* Create a dataset of Indian states */

options nocenter;

data india_states;

    /* Space-delimited records; DSD strips the quotes around State */
    infile datalines dlm=' ' dsd truncover;

    /* Plain w. informats: a w.d informat such as 8.2 divides any value
       typed without a decimal point by 100 */
    input State :$20. Population :8. Area :8. GDP :8. LiteracyRate :8.;

    /* Keep the data lines contiguous: with TRUNCOVER, a blank line is
       read as an observation with all-missing values */
datalines;
"Maharashtra" 112.37 307713 397.70 82.3
"UttarPradesh" 199.81 243286 230.10 67.7
"TamilNadu" 72.14 130058 274.30 80.3
"WestBengal" 91.35 88752 185.00 77.1
"Karnataka" 61.13 191791 220.60 75.4
"Gujarat" 60.44 196024 195.50 78.0
"Rajasthan" 68.62 342239 135.80 67.1
"AndhraPradesh" 49.67 162968 130.00 67.7
"Bihar" 104.10 94163 88.50 61.8
"Kerala" 33.38 38863 117.80 94.0
"Odisha" 41.97 155707 63.70 72.9
"Punjab" 27.74 50362 78.10 75.8
"Haryana" 25.35 44212 87.90 75.6
"Telangana" 35.19 112077 123.60 66.5
"Assam" 31.17 78438 44.80 72.2
"Jharkhand" 38.59 79716 41.20 67.6
"Chhattisgarh" 25.54 135192 38.60 71.0
"HimachalPradesh" 6.86 55673 24.10 83.8
"Uttarakhand" 10.08 53483 32.30 79.6
"Goa" 1.54 3702 11.10 88.7
;

run;

proc print data=india_states; 

run;

Output:

Obs  State            Population    Area    GDP  LiteracyRate
  1  Maharashtra          112.37  307713  397.7          82.3
  2  UttarPradesh         199.81  243286  230.1          67.7
  3  TamilNadu             72.14  130058  274.3          80.3
  4  WestBengal            91.35   88752  185.0          77.1
  5  Karnataka             61.13  191791  220.6          75.4
  6  Gujarat               60.44  196024  195.5          78.0
  7  Rajasthan             68.62  342239  135.8          67.1
  8  AndhraPradesh         49.67  162968  130.0          67.7
  9  Bihar                104.10   94163   88.5          61.8
 10  Kerala                33.38   38863  117.8          94.0
 11  Odisha                41.97  155707   63.7          72.9
 12  Punjab                27.74   50362   78.1          75.8
 13  Haryana               25.35   44212   87.9          75.6
 14  Telangana             35.19  112077  123.6          66.5
 15  Assam                 31.17   78438   44.8          72.2
 16  Jharkhand             38.59   79716   41.2          67.6
 17  Chhattisgarh          25.54  135192   38.6          71.0
 18  HimachalPradesh        6.86   55673   24.1          83.8
 19  Uttarakhand           10.08   53483   32.3          79.6
 20  Goa                    1.54    3702   11.1          88.7


1.1 OBS= option: limit the number of observations printed

proc print data=india_states (obs=15);

    title "Sample data of Indian States - Population, Area, GDP, Literacy";

run;

Output:

Sample data of Indian States - Population, Area, GDP, Literacy

Obs  State            Population    Area    GDP  LiteracyRate
  1  Maharashtra          112.37  307713  397.7          82.3
  2  UttarPradesh         199.81  243286  230.1          67.7
  3  TamilNadu             72.14  130058  274.3          80.3
  4  WestBengal            91.35   88752  185.0          77.1
  5  Karnataka             61.13  191791  220.6          75.4
  6  Gujarat               60.44  196024  195.5          78.0
  7  Rajasthan             68.62  342239  135.8          67.1
  8  AndhraPradesh         49.67  162968  130.0          67.7
  9  Bihar                104.10   94163   88.5          61.8
 10  Kerala                33.38   38863  117.8          94.0
 11  Odisha                41.97  155707   63.7          72.9
 12  Punjab                27.74   50362   78.1          75.8
 13  Haryana               25.35   44212   87.9          75.6
 14  Telangana             35.19  112077  123.6          66.5
 15  Assam                 31.17   78438   44.8          72.2

2. PROC MEANS: Summary statistics for numeric variables 

proc means data=india_states mean median min max maxdec=2;

    var Population Area GDP LiteracyRate;

    title "Descriptive statistics for population, area, GDP, and literacy";

run;

Output:

Descriptive statistics for population, area, GDP, and literacy

The MEANS Procedure

Variable           Mean     Median  Minimum    Maximum
Population        54.85      40.28     1.54     199.81
Area          128220.95  103120.00  3702.00  342239.00
GDP              126.04     103.15    11.10     397.70
LiteracyRate      75.25      75.50    61.80      94.00

3. PROC FREQ: Frequency distribution of states grouped by literacy rate categories 

proc format;

    value literacy_fmt

    low -< 65 = 'Low Literacy'

    65 -< 75 = 'Moderate Literacy'

    75 - high = 'High Literacy';

run;


data states_lit;

    set india_states;

    LiteracyCat = put(LiteracyRate, literacy_fmt.);

run;

proc print data=states_lit (obs=10);

run;

Output:

Obs  State            Population    Area    GDP  LiteracyRate  LiteracyCat
  1  Maharashtra          112.37  307713  397.7          82.3  High Literacy
  2  UttarPradesh         199.81  243286  230.1          67.7  Moderate Literacy
  3  TamilNadu             72.14  130058  274.3          80.3  High Literacy
  4  WestBengal            91.35   88752  185.0          77.1  High Literacy
  5  Karnataka             61.13  191791  220.6          75.4  High Literacy
  6  Gujarat               60.44  196024  195.5          78.0  High Literacy
  7  Rajasthan             68.62  342239  135.8          67.1  Moderate Literacy
  8  AndhraPradesh         49.67  162968  130.0          67.7  Moderate Literacy
  9  Bihar                104.10   94163   88.5          61.8  Low Literacy
 10  Kerala                33.38   38863  117.8          94.0  High Literacy


proc freq data=states_lit;

    tables LiteracyCat / nocum;

    title "Frequency distribution by literacy categories";

run;

Output:

Frequency distribution by literacy categories

The FREQ Procedure

LiteracyCat        Frequency  Percent
High Literacy             11    55.00
Low Literacy               1     5.00
Moderate Literacy          8    40.00
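
The same distribution can also be produced without the intermediate states_lit dataset, because PROC FREQ groups on formatted values. The step below is a minimal sketch, assuming literacy_fmt from the PROC FORMAT step above is still defined in the session; the title text is only illustrative.

```sas
proc freq data=india_states;
    /* The FORMAT statement makes PROC FREQ group LiteracyRate
       by its formatted value instead of by each raw number */
    tables LiteracyRate / nocum;
    format LiteracyRate literacy_fmt.;
    title "Frequency distribution by literacy categories (format applied in PROC FREQ)";
run;
```

The counts should match the table above. Because the grouping variable is still numeric, the rows should follow the underlying values (Low, Moderate, High) rather than the alphabetical order seen with the character variable LiteracyCat.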

4. PROC SORT: Sort states by GDP in descending order

proc sort data=india_states out=india_sorted_gdp;

    by descending GDP;

run;

proc print data=india_sorted_gdp (obs=10);run;

Output:

Obs  State            Population    Area    GDP  LiteracyRate
  1  Maharashtra          112.37  307713  397.7          82.3
  2  TamilNadu             72.14  130058  274.3          80.3
  3  UttarPradesh         199.81  243286  230.1          67.7
  4  Karnataka             61.13  191791  220.6          75.4
  5  Gujarat               60.44  196024  195.5          78.0
  6  WestBengal            91.35   88752  185.0          77.1
  7  Rajasthan             68.62  342239  135.8          67.1
  8  AndhraPradesh         49.67  162968  130.0          67.7
  9  Telangana             35.19  112077  123.6          66.5
 10  Kerala                33.38   38863  117.8          94.0


5. PROC REPORT: Tabulate top 10 states by GDP 

proc report data=india_sorted_gdp (obs=10) nowd headline spacing=1;

    column State Population GDP LiteracyRate;

    define State / display 'State';

    define Population / analysis format=comma10.1 'Population (in millions)';

    define GDP / analysis format=dollar12.1 'GDP (billion USD)';

    define LiteracyRate / analysis format=6.1 'Literacy Rate (%)';

    title "Top 10 Indian States by GDP";

run;

Output:

Top 10 Indian States by GDP

State          Population (in millions)  GDP (billion USD)  Literacy Rate (%)
Maharashtra                       112.4             $397.7               82.3
TamilNadu                          72.1             $274.3               80.3
UttarPradesh                      199.8             $230.1               67.7
Karnataka                          61.1             $220.6               75.4
Gujarat                            60.4             $195.5               78.0
WestBengal                         91.4             $185.0               77.1
Rajasthan                          68.6             $135.8               67.1
AndhraPradesh                      49.7             $130.0               67.7
Telangana                          35.2             $123.6               66.5
Kerala                             33.4             $117.8               94.0

6. PROC SQL: Get average GDP and Population by Literacy Category 

proc sql;

    create table avg_gdp_litcat as

    select 

        case 

          when LiteracyRate < 65 then 'Low Literacy'

          when LiteracyRate >= 65 and LiteracyRate < 75 then 'Moderate Literacy'

          else 'High Literacy'

        end as LiteracyCategory,

        mean(GDP) as Avg_GDP format=dollar12.1,

        mean(Population) as Avg_Population format=comma10.1,

        count(State) as NumberOfStates

    from india_states

    group by calculated LiteracyCategory;

quit;


proc print data=avg_gdp_litcat;

    title "Average GDP and Population by Literacy Category";

run;

Output:

Average GDP and Population by Literacy Category

Obs  LiteracyCategory   Avg_GDP  Avg_Population  NumberOfStates
  1  High Literacy       $147.7            45.7              11
  2  Low Literacy         $88.5           104.1               1
  3  Moderate Literacy   $101.0            61.3               8
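
The CASE expression above repeats the cut-offs already stored in literacy_fmt. As a sketch of one way to keep those thresholds in a single place, the query below applies the format inside PROC SQL with PUT. It assumes literacy_fmt is still defined in the session, and the table name avg_gdp_litcat_fmt is a hypothetical name chosen for this example.

```sas
proc sql;
    create table avg_gdp_litcat_fmt as
    select
        /* Reuse the format instead of restating the thresholds */
        put(LiteracyRate, literacy_fmt.) as LiteracyCategory,
        mean(GDP) as Avg_GDP format=dollar12.1,
        mean(Population) as Avg_Population format=comma10.1,
        count(State) as NumberOfStates
    from india_states
    group by calculated LiteracyCategory;
quit;
```

If the category boundaries change later, only the PROC FORMAT step needs to be edited.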

7. Macro variable example: Define thresholds for GDP 

%let gdp_min=100;

%let gdp_max=300;


8. Macro program to filter states by GDP range dynamically 

%macro filter_states_by_gdp(min=&gdp_min, max=&gdp_max);

    title "States with GDP between &min and &max billion USD";

    proc print data=india_states;

        where GDP between &min and &max;

        var State GDP Population LiteracyRate;

        format GDP dollar12.1 Population comma10.1;

    run;

%mend;


%filter_states_by_gdp();

Output:

States with GDP between 100 and 300 billion USD

Obs  State             GDP  Population  LiteracyRate
  2  UttarPradesh   $230.1       199.8          67.7
  3  TamilNadu      $274.3        72.1          80.3
  4  WestBengal     $185.0        91.4          77.1
  5  Karnataka      $220.6        61.1          75.4
  6  Gujarat        $195.5        60.4          78.0
  7  Rajasthan      $135.8        68.6          67.1
  8  AndhraPradesh  $130.0        49.7          67.7
 10  Kerala         $117.8        33.4          94.0
 14  Telangana      $123.6        35.2          66.5
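
Because min= and max= are keyword parameters, either one can be overridden per call. The calls below are illustrative only; the threshold values are arbitrary and were not part of the original run.

```sas
/* Override both defaults for a single call */
%filter_states_by_gdp(min=50, max=150)

/* Override only min; max falls back to its default */
%filter_states_by_gdp(min=200)

/* Clear the title so it does not carry over into later steps */
title;
```

The closing TITLE statement matters because a title set inside the macro stays in effect for later procedures until it is replaced or cleared.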

9. PROC GPLOT: Plot GDP vs Population with color by Literacy category 

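/* SYMBOL definitions are assigned to the LiteracyCat values in sorted order */

symbol1 v=dot c=blue h=1;      /* High Literacy */

symbol2 v=star c=red h=1;      /* Low Literacy */

symbol3 v=circle c=green h=1;  /* Moderate Literacy */

/* Define the legend that the PLOT statement refers to with LEGEND=LEGEND1 */

legend1 label=('Literacy category');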


proc gplot data=states_lit;

    plot GDP*Population=LiteracyCat / legend=legend1;

    title "Scatter plot of GDP vs Population categorized by Literacy";

run;

quit;

Output:

Plot of GDP by Population identified by LiteracyCat
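
PROC GPLOT requires SAS/GRAPH. Where that is not available, a similar grouped scatter plot can be sketched with PROC SGPLOT. This is an alternative to the step above, not the method used to produce the plot shown; it assumes the states_lit dataset from step 3 exists, and the axis labels are illustrative.

```sas
proc sgplot data=states_lit;
    /* GROUP= gives each literacy category its own marker style */
    scatter x=Population y=GDP / group=LiteracyCat;
    xaxis label="Population (millions)";
    yaxis label="GDP (billion USD)";
    title "Scatter plot of GDP vs Population categorized by Literacy";
run;
```

SGPLOT builds the legend from the GROUP= variable automatically, so no SYMBOL or LEGEND statements are needed.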


10. Macro example to print state info dynamically 

%macro print_state_info(state=);

    %put NOTE: Display info for state &state;

    proc print data=india_states noobs;

        where State = "&state";

        format Population comma10.1 GDP dollar12.1 LiteracyRate 6.1;

        title "Information for state: &state";

    run;

%mend;


%print_state_info(state=Kerala)

Output:

Information for state: Kerala

State   Population   Area     GDP  LiteracyRate
Kerala        33.4  38863  $117.8          94.0

11. PROC SQL example: Create a table of states ordered by population density

11.1 First create a new variable PopulationDensity 

data india_states_density;

    set india_states;

    PopulationDensity = Population * 1e6 / Area; /* persons per sq km */

run;

proc print data=india_states_density(obs=10);

run;

Output:

Obs  State            Population    Area    GDP  LiteracyRate  PopulationDensity
  1  Maharashtra          112.37  307713  397.7          82.3             365.18
  2  UttarPradesh         199.81  243286  230.1          67.7             821.30
  3  TamilNadu             72.14  130058  274.3          80.3             554.68
  4  WestBengal            91.35   88752  185.0          77.1            1029.27
  5  Karnataka             61.13  191791  220.6          75.4             318.73
  6  Gujarat               60.44  196024  195.5          78.0             308.33
  7  Rajasthan             68.62  342239  135.8          67.1             200.50
  8  AndhraPradesh         49.67  162968  130.0          67.7             304.78
  9  Bihar                104.10   94163   88.5          61.8            1105.53
 10  Kerala                33.38   38863  117.8          94.0             858.91


proc sql;

    create table state_density_info as

    select State, Population, Area, GDP, LiteracyRate, PopulationDensity format=comma7.1

    from india_states_density

    order by PopulationDensity desc;

quit;


proc print data=state_density_info (obs=15);

    title "Indian States with Population Density";

run;

Output:

Indian States with Population Density

Obs  State            Population    Area    GDP  LiteracyRate  PopulationDensity
  1  Bihar                104.10   94163   88.5          61.8            1,105.5
  2  WestBengal            91.35   88752  185.0          77.1            1,029.3
  3  Kerala                33.38   38863  117.8          94.0              858.9
  4  UttarPradesh         199.81  243286  230.1          67.7              821.3
  5  Haryana               25.35   44212   87.9          75.6              573.4
  6  TamilNadu             72.14  130058  274.3          80.3              554.7
  7  Punjab                27.74   50362   78.1          75.8              550.8
  8  Jharkhand             38.59   79716   41.2          67.6              484.1
  9  Goa                    1.54    3702   11.1          88.7              416.0
 10  Assam                 31.17   78438   44.8          72.2              397.4
 11  Maharashtra          112.37  307713  397.7          82.3              365.2
 12  Karnataka             61.13  191791  220.6          75.4              318.7
 13  Telangana             35.19  112077  123.6          66.5              314.0
 14  Gujarat               60.44  196024  195.5          78.0              308.3
 15  AndhraPradesh         49.67  162968  130.0          67.7              304.8
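
The DATA step and the PROC SQL step in this section can also be combined, since PROC SQL can compute the density column and sort on it in a single query. The following is a sketch under that assumption; the table name state_density_sql is hypothetical and is not used elsewhere in this post.

```sas
proc sql;
    create table state_density_sql as
    select State, Population, Area, GDP, LiteracyRate,
           /* Population is in millions, so scale to persons per sq km */
           Population * 1e6 / Area as PopulationDensity format=comma7.1
    from india_states
    order by PopulationDensity desc;
quit;
```

This avoids the intermediate india_states_density dataset while producing the same columns and ordering.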

12. Advanced macro to create report with variable threshold for Population Density 

%macro report_high_density(threshold=500);

    title "States with Population Density above &threshold persons per sq km";

    proc print data=state_density_info;

        where PopulationDensity > &threshold;

        var State PopulationDensity Population Area LiteracyRate GDP;

        format Population comma10.1 GDP dollar12.1;

    run;

%mend;


%report_high_density(threshold=600);

Output:

States with Population Density above 600 persons per sq km

Obs  State         PopulationDensity  Population    Area  LiteracyRate     GDP
  1  Bihar                   1,105.5       104.1   94163          61.8   $88.5
  2  WestBengal              1,029.3        91.4   88752          77.1  $185.0
  3  Kerala                    858.9        33.4   38863          94.0  $117.8
  4  UttarPradesh              821.3       199.8  243286          67.7  $230.1

13. PROC FORMAT advanced example: Define GDP categories 

proc format;

    value gdp_cat_fmt

    low -< 50 = 'Low GDP'

    50 -< 150 = 'Medium GDP'

    150 - high = 'High GDP';

run;


data india_states_gdp_cat;

    set india_states;

    GDP_Category = put(GDP, gdp_cat_fmt.);

run;

proc print data=india_states_gdp_cat (obs=10);run;

Output:

Obs  State            Population    Area    GDP  LiteracyRate  GDP_Category
  1  Maharashtra          112.37  307713  397.7          82.3  High GDP
  2  UttarPradesh         199.81  243286  230.1          67.7  High GDP
  3  TamilNadu             72.14  130058  274.3          80.3  High GDP
  4  WestBengal            91.35   88752  185.0          77.1  High GDP
  5  Karnataka             61.13  191791  220.6          75.4  High GDP
  6  Gujarat               60.44  196024  195.5          78.0  High GDP
  7  Rajasthan             68.62  342239  135.8          67.1  Medium GDP
  8  AndhraPradesh         49.67  162968  130.0          67.7  Medium GDP
  9  Bihar                104.10   94163   88.5          61.8  Medium GDP
 10  Kerala                33.38   38863  117.8          94.0  Medium GDP


13.1 PROC FREQ to see distribution 

proc freq data=india_states_gdp_cat;

    tables GDP_Category / nocum;

    title "Distribution of states by GDP category";

run;

Output:

Distribution of states by GDP category

The FREQ Procedure

GDP_Category  Frequency  Percent
High GDP              6    30.00
Low GDP               6    30.00
Medium GDP            8    40.00

14. Macro for generating summary by GDP category 

%macro summary_by_gdp_cat;

    proc means data=india_states_gdp_cat mean median min max maxdec=2;

        class GDP_Category;

        var Population LiteracyRate;

        title "Summary statistics of Population and Literacy Rate by GDP Category";

    run;

%mend;


%summary_by_gdp_cat;

Output:

Summary statistics of Population and Literacy Rate by GDP Category

The MEANS Procedure

GDP_Category  N Obs  Variable       Mean  Median  Minimum  Maximum
High GDP          6  Population    99.54   81.75    60.44   199.81
                     LiteracyRate  76.80   77.55    67.70    82.30
Low GDP           6  Population    18.96   17.81     1.54    38.59
                     LiteracyRate  77.15   75.90    67.60    88.70
Medium GDP        8  Population    48.25   38.58    25.35   104.10
                     LiteracyRate  72.68   70.30    61.80    94.00

15. PROC REPORT for final polished report 

proc report data=india_states_gdp_cat nowd;

    columns State GDP GDP_Category Population LiteracyRate;

    define State / group;

    define GDP / analysis format=dollar12.1;

    define GDP_Category / group;

    define Population / analysis format=comma10.1;

    define LiteracyRate / analysis format=6.1;

    title "Indian States Detailed Report with GDP Categories";

run;

Output:

Indian States Detailed Report with GDP Categories

State               GDP  GDP_Category  Population  LiteracyRate
AndhraPradesh    $130.0  Medium GDP          49.7          67.7
Assam             $44.8  Low GDP             31.2          72.2
Bihar             $88.5  Medium GDP         104.1          61.8
Chhattisgarh      $38.6  Low GDP             25.5          71.0
Goa               $11.1  Low GDP              1.5          88.7
Gujarat          $195.5  High GDP            60.4          78.0
Haryana           $87.9  Medium GDP          25.4          75.6
HimachalPradesh   $24.1  Low GDP              6.9          83.8
Jharkhand         $41.2  Low GDP             38.6          67.6
Karnataka        $220.6  High GDP            61.1          75.4
Kerala           $117.8  Medium GDP          33.4          94.0
Maharashtra      $397.7  High GDP           112.4          82.3
Odisha            $63.7  Medium GDP          42.0          72.9
Punjab            $78.1  Medium GDP          27.7          75.8
Rajasthan        $135.8  Medium GDP          68.6          67.1
TamilNadu        $274.3  High GDP            72.1          80.3
Telangana        $123.6  Medium GDP          35.2          66.5
UttarPradesh     $230.1  High GDP           199.8          67.7
Uttarakhand       $32.3  Low GDP             10.1          79.6
WestBengal       $185.0  High GDP            91.4          77.1



