246. IN-DEPTH DEMOGRAPHIC AND SOCIOECONOMIC ANALYSIS OF 100+ LIVE INDIAN CITIZENS USING SAS PROCEDURES: PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC SQL | ADVANCED MACROS FOR REAL-WORLD DATA INSIGHTS


/* Creating and analyzing a simulated dataset of 20 people from India */

Step 1: Dataset Creation

options nocenter;

data Indian_Live_People;

    length Name $15 Gender $6 City $15 State $15 Region $12 Occupation $15 MaritalStatus $10 Education $20 Language $10;

    do PersonID = 1 to 20;

        Name = catt('Person', put(PersonID, 3.));

        Age = 18 + int(ranuni(0)*50); /* Ages 18 to 67: ranuni() is always below 1, so int() adds at most 49 */

        Gender = scan('Male Female Other', ceil(ranuni(0)*3));

        City = scan('Hyderabad Chennai Mumbai Delhi Kolkata Bangalore Jaipur Lucknow Patna Bhopal', ceil(ranuni(0)*10));

        State = scan('Telangana TamilNadu Maharashtra Delhi WestBengal Karnataka Rajasthan UP Bihar MP', ceil(ranuni(0)*10));

        Region = scan('South South West North East South West North East Central', ceil(ranuni(0)*10));

        Occupation = scan('Engineer Doctor Farmer Student Artist Driver Teacher Lawyer Accountant Police', ceil(ranuni(0)*10));

        Income = 8000 + int(ranuni(0)*92000); /* Range: 8000 to 99999 */

        MaritalStatus = scan('Single Married Divorced Widowed', ceil(ranuni(0)*4));

        Education = scan('10th 12th BSc MSc BTech MTech MBBS MBA PhD Diploma', ceil(ranuni(0)*10));

        Language = scan('Hindi Telugu Tamil Bengali Kannada Malayalam Marathi Urdu Punjabi Assamese', ceil(ranuni(0)*10));

        output;

    end;

run;

proc print;run;

Output:

Obs  Name       Gender  City       State        Region   Occupation  MaritalStatus  Education  Language   PersonID  Age  Income
  1  Person 1   Male    Patna      Telangana    East     Police      Married        MTech      Bengali           1   28   38265
  2  Person 2   Male    Hyderabad  MP           West     Lawyer      Divorced       10th       Tamil             2   20   72950
  3  Person 3   Other   Bangalore  MP           South    Artist      Married        BTech      Malayalam         3   35   20778
  4  Person 4   Female  Chennai    UP           East     Student     Married        MTech      Malayalam         4   22   77156
  5  Person 5   Other   Patna      Bihar        West     Accountant  Single         12th       Telugu            5   63   46538
  6  Person 6   Female  Mumbai     Rajasthan    South    Artist      Single         MSc        Bengali           6   43   17816
  7  Person 7   Other   Patna      UP           East     Driver      Single         12th       Bengali           7   43   56047
  8  Person 8   Male    Lucknow    Delhi        East     Police      Widowed        BSc        Telugu            8   49   42725
  9  Person 9   Other   Lucknow    Bihar        West     Driver      Married        PhD        Assamese          9   50   21427
 10  Person 10  Male    Lucknow    TamilNadu    South    Teacher     Divorced       Diploma    Punjabi          10   48   57333
 11  Person 11  Male    Delhi      Maharashtra  West     Engineer    Divorced       12th       Kannada          11   44   18315
 12  Person 12  Male    Bhopal     Telangana    Central  Artist      Widowed        MSc        Malayalam        12   61   30434
 13  Person 13  Male    Kolkata    WestBengal   Central  Police      Divorced       MBA        Bengali          13   33   18839
 14  Person 14  Female  Hyderabad  Telangana    Central  Doctor      Divorced       BSc        Telugu           14   56   80211
 15  Person 15  Other   Patna      Delhi        South    Engineer    Widowed        12th       Urdu             15   66   83725
 16  Person 16  Male    Patna      Karnataka    West     Farmer      Divorced       BTech      Kannada          16   53   66285
 17  Person 17  Other   Jaipur     Delhi        East     Farmer      Divorced       MTech      Malayalam        17   67   29173
 18  Person 18  Female  Chennai    Delhi        East     Farmer      Widowed        10th       Telugu           18   65   29256
 19  Person 19  Female  Hyderabad  MP           South    Police      Married        Diploma    Malayalam        19   63   14395
 20  Person 20  Male    Kolkata    Bihar        North    Farmer      Single         MSc        Hindi            20   67   31117
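
Two properties of Step 1 are worth noting. Because ranuni(0) takes its seed from the system clock, rerunning the step will not reproduce the rows listed above. Also, City, State, and Region are drawn independently, which is why Patna can appear with Telangana. The following is a minimal sketch of one way to address both, assuming the three word lists are meant to line up position by position. The seed value 12345, the dataset name Indian_Live_People_Fixed, and the helper variable idx are illustrative choices and not part of the original program.

```sas
data Indian_Live_People_Fixed;
    length City $15 State $15 Region $12;
    call streaminit(12345);   /* assumed seed: any fixed positive integer makes the draws repeatable */
    do PersonID = 1 to 20;
        /* one random position (1 to 10) shared by City, State, and Region */
        idx = ceil(rand('uniform') * 10);
        City   = scan('Hyderabad Chennai Mumbai Delhi Kolkata Bangalore Jaipur Lucknow Patna Bhopal', idx);
        State  = scan('Telangana TamilNadu Maharashtra Delhi WestBengal Karnataka Rajasthan UP Bihar MP', idx);
        Region = scan('South South West North East South West North East Central', idx);
        output;
    end;
    drop idx;                 /* helper variable is not needed in the output */
run;

proc print data=Indian_Live_People_Fixed;
run;
```

This sketch keeps only the geography fields to stay short; the remaining variables from Step 1 could be added inside the same loop.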


Step 2: View the Raw Dataset

proc print data=Indian_Live_People (obs=25);

    title "Sample of Live Persons Dataset in India";

run;

Output:

Sample of Live Persons Dataset in India

Obs  Name       Gender  City       State        Region   Occupation  MaritalStatus  Education  Language   PersonID  Age  Income
  1  Person 1   Male    Patna      Telangana    East     Police      Married        MTech      Bengali           1   28   38265
  2  Person 2   Male    Hyderabad  MP           West     Lawyer      Divorced       10th       Tamil             2   20   72950
  3  Person 3   Other   Bangalore  MP           South    Artist      Married        BTech      Malayalam         3   35   20778
  4  Person 4   Female  Chennai    UP           East     Student     Married        MTech      Malayalam         4   22   77156
  5  Person 5   Other   Patna      Bihar        West     Accountant  Single         12th       Telugu            5   63   46538
  6  Person 6   Female  Mumbai     Rajasthan    South    Artist      Single         MSc        Bengali           6   43   17816
  7  Person 7   Other   Patna      UP           East     Driver      Single         12th       Bengali           7   43   56047
  8  Person 8   Male    Lucknow    Delhi        East     Police      Widowed        BSc        Telugu            8   49   42725
  9  Person 9   Other   Lucknow    Bihar        West     Driver      Married        PhD        Assamese          9   50   21427
 10  Person 10  Male    Lucknow    TamilNadu    South    Teacher     Divorced       Diploma    Punjabi          10   48   57333
 11  Person 11  Male    Delhi      Maharashtra  West     Engineer    Divorced       12th       Kannada          11   44   18315
 12  Person 12  Male    Bhopal     Telangana    Central  Artist      Widowed        MSc        Malayalam        12   61   30434
 13  Person 13  Male    Kolkata    WestBengal   Central  Police      Divorced       MBA        Bengali          13   33   18839
 14  Person 14  Female  Hyderabad  Telangana    Central  Doctor      Divorced       BSc        Telugu           14   56   80211
 15  Person 15  Other   Patna      Delhi        South    Engineer    Widowed        12th       Urdu             15   66   83725
 16  Person 16  Male    Patna      Karnataka    West     Farmer      Divorced       BTech      Kannada          16   53   66285
 17  Person 17  Other   Jaipur     Delhi        East     Farmer      Divorced       MTech      Malayalam        17   67   29173
 18  Person 18  Female  Chennai    Delhi        East     Farmer      Widowed        10th       Telugu           18   65   29256
 19  Person 19  Female  Hyderabad  MP           South    Police      Married        Diploma    Malayalam        19   63   14395
 20  Person 20  Male    Kolkata    Bihar        North    Farmer      Single         MSc        Hindi            20   67   31117

Step 3: Frequency Analysis

proc freq data=Indian_Live_People;

    tables Gender Region Occupation MaritalStatus / nocum nopercent;

    title "Frequency Distribution of Categorical Variables";

run;

Output:

Frequency Distribution of Categorical Variables

The FREQ Procedure

Gender  Frequency
Female          5
Male            9
Other           6

Region   Frequency
Central          3
East             6
North            1
South            5
West             5

Occupation  Frequency
Accountant          1
Artist              3
Doctor              1
Driver              2
Engineer            2
Farmer              4
Lawyer              1
Police              4
Student             1
Teacher             1

MaritalStatus  Frequency
Divorced               7
Married                5
Single                 4
Widowed                4

Step 4: Summary Statistics Using PROC MEANS

proc means data=Indian_Live_People n mean min max std;

    var Age Income;

    title "Summary Statistics for Age and Income";

run;

Output:

Summary Statistics for Age and Income

The MEANS Procedure

Variable   N        Mean     Minimum     Maximum     Std Dev
Age       20  48.8000000  20.0000000  67.0000000  15.1400480
Income    20    42639.25    14395.00    83725.00    23329.52

Step 5: Sorting and Filtering Using PROC SORT

proc sort data=Indian_Live_People out=SortedPeople;

    by descending Income;

run;


proc print data=SortedPeople (obs=10);

    title "Top 10 Earners in the Dataset";

run;

Output:

Top 10 Earners in the Dataset

Obs  Name       Gender  City       State        Region   Occupation  MaritalStatus  Education  Language   PersonID  Age  Income
  1  Person 15  Other   Patna      Delhi        South    Engineer    Widowed        12th       Urdu             15   66   83725
  2  Person 14  Female  Hyderabad  Telangana    Central  Doctor      Divorced       BSc        Telugu           14   56   80211
  3  Person 4   Female  Chennai    UP           East     Student     Married        MTech      Malayalam         4   22   77156
  4  Person 2   Male    Hyderabad  MP           West     Lawyer      Divorced       10th       Tamil             2   20   72950
  5  Person 16  Male    Patna      Karnataka    West     Farmer      Divorced       BTech      Kannada          16   53   66285
  6  Person 10  Male    Lucknow    TamilNadu    South    Teacher     Divorced       Diploma    Punjabi          10   48   57333
  7  Person 7   Other   Patna      UP           East     Driver      Single         12th       Bengali           7   43   56047
  8  Person 5   Other   Patna      Bihar        West     Accountant  Single         12th       Telugu            5   63   46538
  9  Person 8   Male    Lucknow    Delhi        East     Police      Widowed        BSc        Telugu            8   49   42725
 10  Person 1   Male    Patna      Telangana    East     Police      Married        MTech      Bengali           1   28   38265

Step 6: Data Selection Using PROC SQL

proc sql;

    select Name, Age, Income, Occupation, State

    from Indian_Live_People

    where Income > 50000 and Age between 25 and 45

    order by Income desc;

quit;

Output:

Name      Age  Income  Occupation  State
Person 7   43   56047  Driver      UP

Step 7: Macro for Regional Summary

%macro RegionStats(region_name);

    title "Region-wise Summary for &region_name Region";

    proc means data=Indian_Live_People noprint;

        where Region = "&region_name";

        var Age Income;

        output out=RegionSummary_&region_name mean=AvgAge AvgIncome;

    run;

    proc print data=RegionSummary_&region_name;

    run;

%mend RegionStats;


%RegionStats(South)

Output:

Region-wise Summary for South Region

Obs  _TYPE_  _FREQ_  AvgAge  AvgIncome
  1       0       5      51    38809.4

%RegionStats(North)

Output:

Region-wise Summary for North Region

Obs  _TYPE_  _FREQ_  AvgAge  AvgIncome
  1       0       1      67      31117

%RegionStats(West)

Output:

Region-wise Summary for West Region

Obs  _TYPE_  _FREQ_  AvgAge  AvgIncome
  1       0       5      46      45103
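
The macro above is called once per region and writes one dataset per call. As a hedged alternative, not the approach used in this post, a CLASS statement with the NWAY option can produce the same averages for every region in a single PROC MEANS step. The output dataset name RegionSummary_All and the title text are assumed names chosen for illustration.

```sas
proc means data=Indian_Live_People noprint nway;
    class Region;                 /* one summary row per distinct Region value */
    var Age Income;
    /* NWAY keeps only the rows for the full CLASS combination, dropping the overall total */
    output out=RegionSummary_All mean=AvgAge AvgIncome;
run;

proc print data=RegionSummary_All noobs;
    title "Average Age and Income by Region";
run;

title;   /* clear the title so it does not carry over to later steps */
```

The macro remains useful when only one region is needed or when each region should end up in its own dataset.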

Step 8: Group Analysis Using PROC SQL

proc sql;

    select Occupation, Gender, count(*) as Count, avg(Income) as Avg_Income

    from Indian_Live_People

    group by Occupation, Gender

    order by Avg_Income desc;

quit;

Output:

Occupation  Gender  Count  Avg_Income
Engineer    Other       1       83725
Doctor      Female      1       80211
Student     Female      1       77156
Lawyer      Male        1       72950
Teacher     Male        1       57333
Farmer      Male        2       48701
Accountant  Other       1       46538
Driver      Other       2       38737
Police      Male        3    33276.33
Artist      Male        1       30434
Farmer      Female      1       29256
Farmer      Other       1       29173
Artist      Other       1       20778
Engineer    Male        1       18315
Artist      Female      1       17816
Police      Female      1       14395

Step 9: Income Distribution Buckets

data Buckets;

    set Indian_Live_People;

    length IncomeBracket $15;

    if Income < 20000 then IncomeBracket = 'Low';

    else if Income < 50000 then IncomeBracket = 'Middle';

    else if Income < 80000 then IncomeBracket = 'High';

    else IncomeBracket = 'Very High';

run;

proc print;run;

Output:

Obs  Name       Gender  City       State        Region   Occupation  MaritalStatus  Education  Language   PersonID  Age  Income  IncomeBracket
  1  Person 1   Male    Patna      Telangana    East     Police      Married        MTech      Bengali           1   28   38265  Middle
  2  Person 2   Male    Hyderabad  MP           West     Lawyer      Divorced       10th       Tamil             2   20   72950  High
  3  Person 3   Other   Bangalore  MP           South    Artist      Married        BTech      Malayalam         3   35   20778  Middle
  4  Person 4   Female  Chennai    UP           East     Student     Married        MTech      Malayalam         4   22   77156  High
  5  Person 5   Other   Patna      Bihar        West     Accountant  Single         12th       Telugu            5   63   46538  Middle
  6  Person 6   Female  Mumbai     Rajasthan    South    Artist      Single         MSc        Bengali           6   43   17816  Low
  7  Person 7   Other   Patna      UP           East     Driver      Single         12th       Bengali           7   43   56047  High
  8  Person 8   Male    Lucknow    Delhi        East     Police      Widowed        BSc        Telugu            8   49   42725  Middle
  9  Person 9   Other   Lucknow    Bihar        West     Driver      Married        PhD        Assamese          9   50   21427  Middle
 10  Person 10  Male    Lucknow    TamilNadu    South    Teacher     Divorced       Diploma    Punjabi          10   48   57333  High
 11  Person 11  Male    Delhi      Maharashtra  West     Engineer    Divorced       12th       Kannada          11   44   18315  Low
 12  Person 12  Male    Bhopal     Telangana    Central  Artist      Widowed        MSc        Malayalam        12   61   30434  Middle
 13  Person 13  Male    Kolkata    WestBengal   Central  Police      Divorced       MBA        Bengali          13   33   18839  Low
 14  Person 14  Female  Hyderabad  Telangana    Central  Doctor      Divorced       BSc        Telugu           14   56   80211  Very High
 15  Person 15  Other   Patna      Delhi        South    Engineer    Widowed        12th       Urdu             15   66   83725  Very High
 16  Person 16  Male    Patna      Karnataka    West     Farmer      Divorced       BTech      Kannada          16   53   66285  High
 17  Person 17  Other   Jaipur     Delhi        East     Farmer      Divorced       MTech      Malayalam        17   67   29173  Middle
 18  Person 18  Female  Chennai    Delhi        East     Farmer      Widowed        10th       Telugu           18   65   29256  Middle
 19  Person 19  Female  Hyderabad  MP           South    Police      Married        Diploma    Malayalam        19   63   14395  Low
 20  Person 20  Male    Kolkata    Bihar        North    Farmer      Single         MSc        Hindi            20   67   31117  Middle


proc freq data=Buckets;

    tables IncomeBracket;

    title "Income Bracket Distribution";

run;

Output:

Income Bracket Distribution

The FREQ Procedure

IncomeBracket  Frequency  Percent  Cumulative Frequency  Cumulative Percent
High                   5    25.00                     5               25.00
Low                    4    20.00                     9               45.00
Middle                 9    45.00                    18               90.00
Very High              2    10.00                    20              100.00
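
Step 9 builds the brackets with IF/ELSE logic in a new dataset. A different technique, shown here only as a sketch, is to define the same cut-offs once in a user-defined format and let PROC FREQ group on the formatted values, so no extra dataset or character variable is needed. The format name incbrkt and the title text are assumptions made for illustration. The cut-offs mirror the ones in Step 9.

```sas
proc format;
    value incbrkt
        low   -< 20000 = 'Low'         /* below 20000 */
        20000 -< 50000 = 'Middle'      /* 20000 up to, but not including, 50000 */
        50000 -< 80000 = 'High'        /* 50000 up to, but not including, 80000 */
        80000 -  high  = 'Very High';  /* 80000 and above */
run;

proc freq data=Indian_Live_People;
    tables Income;
    format Income incbrkt.;   /* group the raw incomes by their formatted bracket */
    title "Income Bracket Distribution (format-based)";
run;

title;   /* clear the title afterwards */
```

With PROC FREQ's default ordering, the brackets should appear in ascending income order rather than alphabetically, which may be easier to read than the Step 9 table.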

Step 10: Educational Background vs. Income

proc sql;

    select Education, count(*) as People, avg(Income) as AverageIncome

    from Indian_Live_People

    group by Education

    order by AverageIncome desc;

quit;

Output:

Education  People  AverageIncome
BSc             2          61468
12th            4       51156.25
10th            2          51103
MTech           3          48198
BTech           2        43531.5
Diploma         2          35864
MSc             3       26455.67
PhD             1          21427
MBA             1          18839

Step 11: Custom Macro for Filtering

%macro FilterPeople(min_income=30000, min_age=25);

    proc sql;

        select Name, Occupation, Age, Income

        from Indian_Live_People

        where Income >= &min_income and Age >= &min_age

        order by Income desc;

    quit;

%mend;


%FilterPeople(min_income=50000, min_age=30)

Output:

Name       Occupation  Age  Income
Person 15  Engineer     66   83725
Person 14  Doctor       56   80211
Person 16  Farmer       53   66285
Person 10  Teacher      48   57333
Person 7   Driver       43   56047

Step 12: Frequency by City Using PROC FREQ

proc freq data=Indian_Live_People;

    tables City;

    title "City-wise Count of Individuals";

run;

Output:

City-wise Count of Individuals

The FREQ Procedure

City       Frequency  Percent  Cumulative Frequency  Cumulative Percent
Bangalore          1     5.00                     1                5.00
Bhopal             1     5.00                     2               10.00
Chennai            2    10.00                     4               20.00
Delhi              1     5.00                     5               25.00
Hyderabad          3    15.00                     8               40.00
Jaipur             1     5.00                     9               45.00
Kolkata            2    10.00                    11               55.00
Lucknow            3    15.00                    14               70.00
Mumbai             1     5.00                    15               75.00
Patna              5    25.00                    20              100.00



