273.ADVANCED SAS TECHNIQUES ON INDIAN STATES DATA: PROC PRINT | PROC MEANS | PROC FREQ | PROC FORMAT | PROC SORT | PROC REPORT | PROC SQL | PROC GPLOT | MACROS
ADVANCED SAS TECHNIQUES ON INDIAN STATES DATA: PROC PRINT | PROC MEANS | PROC FREQ | PROC FORMAT | PROC SORT | PROC REPORT | PROC SQL | PROC GPLOT | MACROS
/*Creating A Dataset Of Indian states - population (in millions), area (kmsq), GDP (billion USD), literacy rate (%) */
1.Sample data on Indian states - population (in millions), area (kmsq), GDP (billion USD), literacy rate (%)
options nocenter;
data india_states;
infile datalines dlm=' ' dsd truncover;
input State :$20. Population :8.2 Area :8. GDP :8.2 LiteracyRate :8.2;
datalines;
"Maharashtra" 112.37 307713 397.70 82.3
"UttarPradesh" 199.81 243286 230.10 67.7
"TamilNadu" 72.14 130058 274.30 80.3
"WestBengal" 91.35 88752 185.00 77.1
"Karnataka" 61.13 191791 220.60 75.4
"Gujarat" 60.44 196024 195.50 78.0
"Rajasthan" 68.62 342239 135.80 67.1
"AndhraPradesh" 49.67 162968 130.00 67.7
"Bihar" 104.10 94163 88.50 61.8
"Kerala" 33.38 38863 117.80 94.0
"Odisha" 41.97 155707 63.70 72.9
"Punjab" 27.74 50362 78.10 75.8
"Haryana" 25.35 44212 87.90 75.6
"Telangana" 35.19 112077 123.60 66.5
"Assam" 31.17 78438 44.80 72.2
"Jharkhand" 38.59 79716 41.20 67.6
"Chhattisgarh" 25.54 135192 38.60 71.0
"HimachalPradesh" 6.86 55673 24.10 83.8
"Uttarakhand" 10.08 53483 32.30 79.6
"Goa" 1.54 3702 11.10 88.7
;
run;
proc print data=india_states;
run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate |
|---|---|---|---|---|---|
| 1 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 |
| 2 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 |
| 3 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 |
| 4 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 |
| 5 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 |
| 6 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 |
| 7 | Rajasthan | 68.62 | 342239 | 135.8 | 67.1 |
| 8 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 |
| 9 | Bihar | 104.10 | 94163 | 88.5 | 61.8 |
| 10 | Kerala | 33.38 | 38863 | 117.8 | 94.0 |
| 11 | Odisha | 41.97 | 155707 | 63.7 | 72.9 |
| 12 | Punjab | 27.74 | 50362 | 78.1 | 75.8 |
| 13 | Haryana | 25.35 | 44212 | 87.9 | 75.6 |
| 14 | Telangana | 35.19 | 112077 | 123.6 | 66.5 |
| 15 | Assam | 31.17 | 78438 | 44.8 | 72.2 |
| 16 | Jharkhand | 38.59 | 79716 | 41.2 | 67.6 |
| 17 | Chhattisgarh | 25.54 | 135192 | 38.6 | 71.0 |
| 18 | HimachalPradesh | 6.86 | 55673 | 24.1 | 83.8 |
| 19 | Uttarakhand | 10.08 | 53483 | 32.3 | 79.6 |
| 20 | Goa | 1.54 | 3702 | 11.1 | 88.7 |
1.1 OBS : Observations
proc print data=india_states (obs=15);
title "Sample data of Indian States - Population, Area, GDP, Literacy";
run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate |
|---|---|---|---|---|---|
| 1 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 |
| 2 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 |
| 3 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 |
| 4 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 |
| 5 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 |
| 6 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 |
| 7 | Rajasthan | 68.62 | 342239 | 135.8 | 67.1 |
| 8 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 |
| 9 | Bihar | 104.10 | 94163 | 88.5 | 61.8 |
| 10 | Kerala | 33.38 | 38863 | 117.8 | 94.0 |
| 11 | Odisha | 41.97 | 155707 | 63.7 | 72.9 |
| 12 | Punjab | 27.74 | 50362 | 78.1 | 75.8 |
| 13 | Haryana | 25.35 | 44212 | 87.9 | 75.6 |
| 14 | Telangana | 35.19 | 112077 | 123.6 | 66.5 |
| 15 | Assam | 31.17 | 78438 | 44.8 | 72.2 |
2. PROC MEANS: Summary statistics for numeric variables
proc means data=india_states mean median min max maxdec=2;
var Population Area GDP LiteracyRate;
title "Descriptive statistics for population, area, GDP, and literacy";
run;
Output:
The MEANS Procedure
| Variable | Mean | Median | Minimum | Maximum |
|---|---|---|---|---|
Population Area GDP LiteracyRate | 54.85 128220.95 126.04 75.25 | 40.28 103120.00 103.15 75.50 | 1.54 3702.00 11.10 61.80 | 199.81 342239.00 397.70 94.00 |
3. PROC FREQ: Frequency distribution of states grouped by literacy rate categories
proc format;
value literacy_fmt
low -< 65 = 'Low Literacy'
65 -< 75 = 'Moderate Literacy'
75 - high = 'High Literacy';
run;
data states_lit;
set india_states;
LiteracyCat = put(LiteracyRate, literacy_fmt.);
run;
proc print data=states_lit (obs=10);
run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate | LiteracyCat |
|---|---|---|---|---|---|---|
| 1 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 | High Literacy |
| 2 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 | Moderate Literacy |
| 3 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 | High Literacy |
| 4 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 | High Literacy |
| 5 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 | High Literacy |
| 6 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 | High Literacy |
| 7 | Rajasthan | 68.62 | 342239 | 135.8 | 67.1 | Moderate Literacy |
| 8 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 | Moderate Literacy |
| 9 | Bihar | 104.10 | 94163 | 88.5 | 61.8 | Low Literacy |
| 10 | Kerala | 33.38 | 38863 | 117.8 | 94.0 | High Literacy |
proc freq data=states_lit;
tables LiteracyCat / nocum;
title "Frequency distribution by literacy categories";
run;
Output:
The FREQ Procedure
| LiteracyCat | Frequency | Percent |
|---|---|---|
| High Literacy | 11 | 55.00 |
| Low Literacy | 1 | 5.00 |
| Moderate Literacy | 8 | 40.00 |
4. PROC SORT: Sort states by GDP descending
proc sort data=india_states out=india_sorted_gdp;
by descending GDP;
run;
proc print data=india_sorted_gdp (obs=10);run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate |
|---|---|---|---|---|---|
| 1 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 |
| 2 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 |
| 3 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 |
| 4 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 |
| 5 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 |
| 6 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 |
| 7 | Rajasthan | 68.62 | 342239 | 135.8 | 67.1 |
| 8 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 |
| 9 | Telangana | 35.19 | 112077 | 123.6 | 66.5 |
| 10 | Kerala | 33.38 | 38863 | 117.8 | 94.0 |
5. PROC REPORT: Tabulate top 10 states by GDP
proc report data=india_sorted_gdp (obs=10) nowd headline spacing=1;
column State Population GDP LiteracyRate;
define State / display 'State';
define Population / analysis format=comma10.1 'Population (in millions)';
define GDP / analysis format=dollar12.1 'GDP (billion USD)';
define LiteracyRate / analysis format=6.1 'Literacy Rate (%)';
title "Top 10 Indian States by GDP";
run;
Output:
| State | Population (in millions) | GDP (billion USD) | Literacy Rate (%) |
|---|---|---|---|
| Maharashtra | 112.4 | $397.7 | 82.3 |
| TamilNadu | 72.1 | $274.3 | 80.3 |
| UttarPradesh | 199.8 | $230.1 | 67.7 |
| Karnataka | 61.1 | $220.6 | 75.4 |
| Gujarat | 60.4 | $195.5 | 78.0 |
| WestBengal | 91.4 | $185.0 | 77.1 |
| Rajasthan | 68.6 | $135.8 | 67.1 |
| AndhraPradesh | 49.7 | $130.0 | 67.7 |
| Telangana | 35.2 | $123.6 | 66.5 |
| Kerala | 33.4 | $117.8 | 94.0 |
6. PROC SQL: Get average GDP and Population by Literacy Category
proc sql;
create table avg_gdp_litcat as
select
case
when LiteracyRate < 65 then 'Low Literacy'
when LiteracyRate >= 65 and LiteracyRate < 75 then 'Moderate Literacy'
else 'High Literacy'
end as LiteracyCategory,
mean(GDP) as Avg_GDP format=dollar12.1,
mean(Population) as Avg_Population format=comma10.1,
count(State) as NumberOfStates
from india_states
group by calculated LiteracyCategory;
quit;
proc print data=avg_gdp_litcat;
title "Average GDP and Population by Literacy Category";
run;
Output:
| Obs | LiteracyCategory | Avg_GDP | Avg_Population | NumberOfStates |
|---|---|---|---|---|
| 1 | High Literacy | $147.7 | 45.7 | 11 |
| 2 | Low Literacy | $88.5 | 104.1 | 1 |
| 3 | Moderate Literacy | $101.0 | 61.3 | 8 |
7. Macro variable example: Define thresholds for GDP
%let gdp_min=100;
%let gdp_max=300;
8. Macro program to filter states by GDP range dynamically
%macro filter_states_by_gdp(min=&gdp_min, max=&gdp_max);
title "States with GDP between &min and &max billion USD";
proc print data=india_states;
where GDP between &min and &max;
var State GDP Population LiteracyRate;
format GDP dollar12.1 Population comma10.1;
run;
%mend;
%filter_states_by_gdp();
Output:
| Obs | State | GDP | Population | LiteracyRate |
|---|---|---|---|---|
| 2 | UttarPradesh | $230.1 | 199.8 | 67.7 |
| 3 | TamilNadu | $274.3 | 72.1 | 80.3 |
| 4 | WestBengal | $185.0 | 91.4 | 77.1 |
| 5 | Karnataka | $220.6 | 61.1 | 75.4 |
| 6 | Gujarat | $195.5 | 60.4 | 78.0 |
| 7 | Rajasthan | $135.8 | 68.6 | 67.1 |
| 8 | AndhraPradesh | $130.0 | 49.7 | 67.7 |
| 10 | Kerala | $117.8 | 33.4 | 94.0 |
| 14 | Telangana | $123.6 | 35.2 | 66.5 |
9. PROC GPLOT: Plot GDP vs Population with color by Literacy category
symbol1 v=dot c=blue h=1;
symbol2 v=star c=red h=1;
symbol3 v=circle c=green h=1;
proc gplot data=states_lit;
plot GDP*Population=LiteracyCat / legend=legend1;
title "Scatter plot of GDP vs Population categorized by Literacy";
run;
quit;
Output:
10. Macro example to print state info dynamically
%macro print_state_info(state=);
%put NOTE: Display info for state &state;
proc print data=india_states noobs;
where State = "&state";
format Population comma10.1 GDP dollar12.1 LiteracyRate 6.1;
title "Information for state: &state";
run;
%mend;
%print_state_info(state=Kerala)
Output:
| State | Population | Area | GDP | LiteracyRate |
|---|---|---|---|---|
| Kerala | 33.4 | 38863 | $117.8 | 94.0 |
11. PROC SQL advanced join example: Create a table showing states with population density
11.1 First create a new variable PopulationDensity
data india_states_density;
set india_states;
PopulationDensity = Population * 1e6 / Area; /* persons per sq km */
run;
proc print data=india_states_density(obs=10);
run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate | PopulationDensity |
|---|---|---|---|---|---|---|
| 1 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 | 365.18 |
| 2 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 | 821.30 |
| 3 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 | 554.68 |
| 4 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 | 1029.27 |
| 5 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 | 318.73 |
| 6 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 | 308.33 |
| 7 | Rajasthan | 68.62 | 342239 | 135.8 | 67.1 | 200.50 |
| 8 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 | 304.78 |
| 9 | Bihar | 104.10 | 94163 | 88.5 | 61.8 | 1105.53 |
| 10 | Kerala | 33.38 | 38863 | 117.8 | 94.0 | 858.91 |
proc sql;
create table state_density_info as
select State, Population, Area, GDP, LiteracyRate, PopulationDensity format=comma7.1
from india_states_density
order by PopulationDensity desc;
quit;
proc print data=state_density_info (obs=15);
title "Indian States with Population Density";
run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate | PopulationDensity |
|---|---|---|---|---|---|---|
| 1 | Bihar | 104.10 | 94163 | 88.5 | 61.8 | 1,105.5 |
| 2 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 | 1,029.3 |
| 3 | Kerala | 33.38 | 38863 | 117.8 | 94.0 | 858.9 |
| 4 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 | 821.3 |
| 5 | Haryana | 25.35 | 44212 | 87.9 | 75.6 | 573.4 |
| 6 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 | 554.7 |
| 7 | Punjab | 27.74 | 50362 | 78.1 | 75.8 | 550.8 |
| 8 | Jharkhand | 38.59 | 79716 | 41.2 | 67.6 | 484.1 |
| 9 | Goa | 1.54 | 3702 | 11.1 | 88.7 | 416.0 |
| 10 | Assam | 31.17 | 78438 | 44.8 | 72.2 | 397.4 |
| 11 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 | 365.2 |
| 12 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 | 318.7 |
| 13 | Telangana | 35.19 | 112077 | 123.6 | 66.5 | 314.0 |
| 14 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 | 308.3 |
| 15 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 | 304.8 |
12. Advanced macro to create report with variable threshold for Population Density
%macro report_high_density(threshold=500);
title "States with Population Density above &threshold persons per sq km";
proc print data=state_density_info;
where PopulationDensity > &threshold;
var State PopulationDensity Population Area LiteracyRate GDP;
format Population comma10.1 GDP dollar12.1;
run;
%mend;
%report_high_density(threshold=600);
Output:
| Obs | State | PopulationDensity | Population | Area | LiteracyRate | GDP |
|---|---|---|---|---|---|---|
| 1 | Bihar | 1,105.5 | 104.1 | 94163 | 61.8 | $88.5 |
| 2 | WestBengal | 1,029.3 | 91.4 | 88752 | 77.1 | $185.0 |
| 3 | Kerala | 858.9 | 33.4 | 38863 | 94.0 | $117.8 |
| 4 | UttarPradesh | 821.3 | 199.8 | 243286 | 67.7 | $230.1 |
13. PROC FORMAT advanced example: Define GDP categories
proc format;
value gdp_cat_fmt
low -< 50 = 'Low GDP'
50 -< 150 = 'Medium GDP'
150 - high = 'High GDP';
run;
data india_states_gdp_cat;
set india_states;
GDP_Category = put(GDP, gdp_cat_fmt.);
run;
proc print data=india_states_gdp_cat (obs=10);run;
Output:
| Obs | State | Population | Area | GDP | LiteracyRate | GDP_Category |
|---|---|---|---|---|---|---|
| 1 | Maharashtra | 112.37 | 307713 | 397.7 | 82.3 | High GDP |
| 2 | UttarPradesh | 199.81 | 243286 | 230.1 | 67.7 | High GDP |
| 3 | TamilNadu | 72.14 | 130058 | 274.3 | 80.3 | High GDP |
| 4 | WestBengal | 91.35 | 88752 | 185.0 | 77.1 | High GDP |
| 5 | Karnataka | 61.13 | 191791 | 220.6 | 75.4 | High GDP |
| 6 | Gujarat | 60.44 | 196024 | 195.5 | 78.0 | High GDP |
| 7 | Rajasthan | 68.62 | 342239 | 135.8 | 67.1 | Medium GDP |
| 8 | AndhraPradesh | 49.67 | 162968 | 130.0 | 67.7 | Medium GDP |
| 9 | Bihar | 104.10 | 94163 | 88.5 | 61.8 | Medium GDP |
| 10 | Kerala | 33.38 | 38863 | 117.8 | 94.0 | Medium GDP |
13.1 PROC FREQ to see distribution
proc freq data=india_states_gdp_cat;
tables GDP_Category / nocum;
title "Distribution of states by GDP category";
run;
Output:
The FREQ Procedure
| GDP_Category | Frequency | Percent |
|---|---|---|
| High GDP | 6 | 30.00 |
| Low GDP | 6 | 30.00 |
| Medium GDP | 8 | 40.00 |
14. Macro for generating summary by GDP category
%macro summary_by_gdp_cat;
proc means data=india_states_gdp_cat mean median min max maxdec=2;
class GDP_Category;
var Population LiteracyRate;
title "Summary statistics of Population and Literacy Rate by GDP Category";
run;
%mend;
%summary_by_gdp_cat;
Output:
The MEANS Procedure
| GDP_Category | N Obs | Variable | Mean | Median | Minimum | Maximum |
|---|---|---|---|---|---|---|
| High GDP | 6 | Population LiteracyRate | 99.54 76.80 | 81.75 77.55 | 60.44 67.70 | 199.81 82.30 |
| Low GDP | 6 | Population LiteracyRate | 18.96 77.15 | 17.81 75.90 | 1.54 67.60 | 38.59 88.70 |
| Medium GDP | 8 | Population LiteracyRate | 48.25 72.68 | 38.58 70.30 | 25.35 61.80 | 104.10 94.00 |
15. PROC REPORT for final polished report
proc report data=india_states_gdp_cat nowd;
columns State GDP GDP_Category Population LiteracyRate;
define State / group;
define GDP / analysis format=dollar12.1;
define GDP_Category / group;
define Population / analysis format=comma10.1;
define LiteracyRate / analysis format=6.1;
title "Indian States Detailed Report with GDP Categories";
run;
| State | GDP | GDP_Category | Population | LiteracyRate |
|---|---|---|---|---|
| AndhraPradesh | $130.0 | Medium GDP | 49.7 | 67.7 |
| Assam | $44.8 | Low GDP | 31.2 | 72.2 |
| Bihar | $88.5 | Medium GDP | 104.1 | 61.8 |
| Chhattisgarh | $38.6 | Low GDP | 25.5 | 71.0 |
| Goa | $11.1 | Low GDP | 1.5 | 88.7 |
| Gujarat | $195.5 | High GDP | 60.4 | 78.0 |
| Haryana | $87.9 | Medium GDP | 25.4 | 75.6 |
| HimachalPradesh | $24.1 | Low GDP | 6.9 | 83.8 |
| Jharkhand | $41.2 | Low GDP | 38.6 | 67.6 |
| Karnataka | $220.6 | High GDP | 61.1 | 75.4 |
| Kerala | $117.8 | Medium GDP | 33.4 | 94.0 |
| Maharashtra | $397.7 | High GDP | 112.4 | 82.3 |
| Odisha | $63.7 | Medium GDP | 42.0 | 72.9 |
| Punjab | $78.1 | Medium GDP | 27.7 | 75.8 |
| Rajasthan | $135.8 | Medium GDP | 68.6 | 67.1 |
| TamilNadu | $274.3 | High GDP | 72.1 | 80.3 |
| Telangana | $123.6 | Medium GDP | 35.2 | 66.5 |
| UttarPradesh | $230.1 | High GDP | 199.8 | 67.7 |
| Uttarakhand | $32.3 | Low GDP | 10.1 | 79.6 |
| WestBengal | $185.0 | High GDP | 91.4 | 77.1 |
Comments
Post a Comment