REAL WORLD DIFFERENT TYPES OF PEOPLE IN INDIA DATASET CREATION AND ANALYSIS USING PROC FORMAT | PROC CONTENTS | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC SUMMARY | PROC UNIVARIATE | PROC TABULATE | PROC REPORT | PROC TRANSPOSE | PROC SQL | PROC RANK | PROC SGPLOT | MACROS IN SAS
/*CREATING THE REAL-WORLD "TYPES OF PEOPLE IN INDIA" DATASET*/
1) FORMATS & LABELS
Purpose: Define user-friendly categories for readability in outputs.
proc format;
value $genderF 'M'='Male' 'F'='Female' 'O'='Other/Non-binary';
value $marF 'S'='Single' 'M'='Married' 'D'='Divorced' 'W'='Widowed';
value $ruF 'Urban'='Urban' 'Rural'='Rural' 'SemiUrban'='Semi-Urban';
value yesnoF 0='No' 1='Yes';
value agebandF low-17='0-17' 18-24='18-24' 25-34='25-34' 35-44='35-44'
45-54='45-54' 55-64='55-64' 65-high='65+';
value incomeF low-100000='<= 1L' 100001-300000='1L-3L' 300001-600000='3L-6L'
600001-1200000='6L-12L' 1200001-high='> 12L';
run;
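Range formats are easy to get subtly wrong at the boundaries, so a quick check in the log can help before the formats are attached to data. The following is only a minimal sketch: it assumes the PROC FORMAT step above has already run in the same session, and the variable names Age_Band and Income_Slab are made up for illustration.

```sas
/* Sketch: spot-check the numeric range formats at their boundaries.   */
/* Assumes agebandF. and incomeF. were created by the step above.      */
data _null_;
  /* Values on either side of each age-band edge */
  do Age = 17, 18, 24, 25, 64, 65;
    Age_Band = put(Age, agebandF.);        /* hypothetical helper variable */
    put Age= Age_Band=;
  end;
  /* Values on either side of the lowest and highest income slabs */
  do Income_INR = 100000, 100001, 1200000, 1200001;
    Income_Slab = put(Income_INR, incomeF.); /* hypothetical helper variable */
    put Income_INR= Income_Slab=;
  end;
run;
```

Because the ranges are written with integer endpoints, a non-integer value such as 17.5 would match none of them and would print unformatted. That does not matter for this dataset, which stores whole numbers, but it is worth keeping in mind if the formats are reused.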
2) CORE DATA CREATION
Purpose: Create the base PEOPLE_INDIA dataset with realistic attributes.
data people_india;
length Person_ID 8 Name $28 Gender $1 Age 8 City $20 State $20
Education_Level $20 Occupation $24 Sector $16
Income_INR 8 Household_Size 8 Marital_Status $1
Language $18 Smartphone_User 8 Internet_Hours_Per_Day 8
Commute_Mode $14 Commute_Minutes 8 Fitness_Mins_Per_Week 8
Has_Health_Insurance 8 Voter_ID_Flag 8 Credit_Score 8
UPI_Transactions_Month 8 Festival_Celebrated $18
Cuisine_Pref $16 Veg_Flag 8 Digital_Literacy_Score 8
Rural_Urban $10 Disability_Flag 8 Travel_Trips_Year 8
Pollution_Concern_Score 8 Blood_Group $3;
infile datalines dsd truncover;
input Person_ID Name :$28. Gender :$1. Age City :$20. State :$20.
Education_Level :$20. Occupation :$24. Sector :$16.
Income_INR Household_Size Marital_Status :$1.
Language :$18. Smartphone_User Internet_Hours_Per_Day
Commute_Mode :$14. Commute_Minutes Fitness_Mins_Per_Week
Has_Health_Insurance Voter_ID_Flag Credit_Score
UPI_Transactions_Month Festival_Celebrated :$18.
Cuisine_Pref :$16. Veg_Flag Digital_Literacy_Score
Rural_Urban :$10. Disability_Flag Travel_Trips_Year
Pollution_Concern_Score Blood_Group :$3.;
format Gender $genderF. Marital_Status $marF. Rural_Urban $ruF.
Smartphone_User yesnoF. Has_Health_Insurance yesnoF.
Voter_ID_Flag yesnoF. Veg_Flag yesnoF. Age agebandF.
Income_INR incomeF.;
label Education_Level = "Highest Education"
Income_INR = "Annual Income (INR)"
Internet_Hours_Per_Day = "Daily Internet Hours"
Commute_Mode = "Primary Commute Mode"
Fitness_Mins_Per_Week = "Weekly Fitness Minutes"
Digital_Literacy_Score = "Digital Literacy (0-100)"
UPI_Transactions_Month = "Monthly UPI Transactions"
Pollution_Concern_Score = "Pollution Concern (1-10)";
datalines;
1,Arjun Mehta,M,27,Mumbai,Maharashtra,Graduate,Software Engineer,IT,900000,3,S,Hindi,1,4,Metro,60,120,1,1,760,35,Diwali,North Indian,0,85,Urban,0,4,7,O+
2,Priya Iyer,F,31,Chennai,Tamil Nadu,Postgraduate,Data Analyst,IT,1100000,4,M,Tamil,1,3,Bike,40,90,1,1,780,28,Pongal,South Indian,1,88,Urban,0,3,6,B+
3,Sameer Khan,M,24,Hyderabad,Telangana,Graduate,Inside Sales,Private,450000,5,S,Urdu,1,5,Bike,35,60,0,1,720,22,Eid,Hyderabadi,0,72,Urban,0,2,5,A+
4,Neha Sharma,F,39,Delhi,Delhi,Postgraduate,Marketing Manager,Private,1400000,3,M,Hindi,1,2,Car,55,75,1,1,805,18,Holi,North Indian,0,83,Urban,0,5,8,AB+
5,Rohan Das,M,45,Kolkata,West Bengal,Graduate,School Teacher,Public,650000,4,M,Bengali,1,2,Bus,50,80,1,1,768,20,Durga Puja,Bengali,1,78,Urban,0,1,7,O-
6,Ananya Roy,F,22,Kolkata,West Bengal,Undergraduate,Student,NA,0,5,S,Bengali,1,6,Metro,30,100,0,0,0,12,Durga Puja,Continental,0,70,Urban,0,1,6,A-
7,Amit Patil,M,34,Pune,Maharashtra,Diploma,Mechanic,Private,380000,6,M,Marathi,1,2,Bike,45,40,0,1,700,15,Ganesh Chaturthi,Maharashtrian,1,68,SemiUrban,0,1,5,O+
8,Sana Parveen,F,28,Patna,Bihar,Graduate,Nurse,Healthcare,480000,5,S,Hindi,1,3,Auto,35,120,1,1,730,20,Eid,North Indian,0,77,SemiUrban,0,1,6,B+
9,Ritesh Verma,M,52,Lucknow,Uttar Pradesh,Graduate,Shop Owner,Informal,550000,6,M,Hindi,1,1,Car,25,30,0,1,690,40,Diwali,North Indian,0,60,Urban,0,1,6,A+
10,Keerthi R,F,26,Visakhapatnam,Andhra Pradesh,Graduate,Graphic Designer,Media,520000,4,S,Telugu,1,5,Bike,30,90,0,0,740,24,Ugadi,South Indian,0,82,Urban,0,3,7,B-
11,Gurpreet Singh,M,33,Amritsar,Punjab,Graduate,Logistics Supervisor,Logistics,600000,5,M,Punjabi,1,2,Car,40,60,0,1,715,18,Gurpurab,Punjabi,0,66,Urban,0,2,6,O+
12,Sonali Kulkarni,F,41,Nagpur,Maharashtra,Postgraduate,HR Lead,Private,1200000,3,M,Marathi,1,2,Car,35,120,1,1,790,26,Diwali,Maharashtrian,0,84,Urban,0,2,8,AB-
13,Faizan Ali,M,29,Jaipur,Rajasthan,Graduate,Hotel Front Office,Hospitality,360000,4,S,Hindi,1,4,Bus,25,45,0,1,705,20,Diwali,Rajasthani,0,64,Urban,0,1,5,A+
14,Kavya Nair,F,35,Kochi,Kerala,Postgraduate,Physiotherapist,Healthcare,900000,3,M,Malayalam,1,2,Car,30,180,1,1,775,32,Onam,South Indian,1,86,Urban,0,2,6,O+
15,Manoj Kumar,M,47,Gurugram,Haryana,Graduate,Project Manager,IT,1800000,4,M,Hindi,1,2,Car,60,60,1,1,820,38,Diwali,North Indian,0,90,Urban,0,4,9,B+
16,Anusha S,F,23,Mysuru,Karnataka,Undergraduate,Student,NA,0,5,S,Kannada,1,6,Bus,20,110,0,0,0,14,Dasara,South Indian,1,74,Urban,0,1,6,A-
17,Deepak Yadav,M,30,Indore,Madhya Pradesh,Graduate,Field Sales,Private,420000,6,M,Hindi,1,5,Bike,50,50,0,1,710,25,Diwali,North Indian,0,69,Urban,0,1,6,O+
18,Trisha Dey,F,27,Silchar,Assam,Graduate,Content Writer,Media,460000,4,S,Assamese,1,4,Auto,20,70,0,1,735,22,Bihu,North East,0,80,SemiUrban,0,2,6,B+
19,Vikram Rao,M,38,Bengaluru,Karnataka,Postgraduate,Data Scientist,IT,2200000,3,M,Kannada,1,3,Metro,50,150,1,1,835,45,Ugadi,South Indian,0,92,Urban,0,3,9,O+
20,Sapna Jain,F,29,Bhopal,Madhya Pradesh,Graduate,Accountant,Private,500000,5,S,Hindi,1,3,Bus,30,60,0,1,725,20,Diwali,North Indian,1,76,SemiUrban,0,1,7,A+
21,Harish Chandra,M,56,Varanasi,Uttar Pradesh,Secondary,Priest,Religious,240000,5,M,Hindi,0,0,Walk,10,15,0,1,650,8,Diwali,North Indian,1,50,Urban,0,0,6,O+
22,Rekha Gupta,F,43,Kanpur,Uttar Pradesh,Graduate,Bank Officer,Public,980000,4,M,Hindi,1,2,Car,30,90,1,1,790,30,Diwali,North Indian,1,88,Urban,0,2,8,AB+
23,Aakash Jain,M,25,Ahmedabad,Gujarat,Graduate,Entrepreneur,Startup,1200000,4,S,Gujarati,1,5,Car,35,70,0,1,780,60,Navratri,Gujarati,0,85,Urban,0,4,8,B+
24,Meera Pillai,F,48,Thiruvananthapuram,Kerala,Postgraduate,School Principal,Public,1250000,3,M,Malayalam,1,1,Car,25,100,1,1,815,20,Onam,South Indian,1,90,Urban,0,2,8,A+
25,Rajeev Ranjan,M,32,Ranchi,Jharkhand,Graduate,Police Sub-Inspector,Public,700000,4,M,Hindi,1,2,Bike,20,60,1,1,740,25,Chhath,North Indian,0,72,Urban,0,1,7,O+
26,Nisha B,F,27,Coimbatore,Tamil Nadu,Graduate,Quality Analyst,Manufacturing,520000,4,S,Tamil,1,3,Bike,25,80,0,1,735,22,Pongal,South Indian,1,82,Urban,0,2,7,B-
27,Arvind Sinha,M,44,Bhubaneswar,Odisha,Graduate,Government Clerk,Public,580000,5,M,Odia,1,2,Bike,15,45,1,1,720,18,Raja Parba,Odia,1,65,Urban,0,1,7,A+
28,Shruti Joshi,F,36,Surat,Gujarat,Postgraduate,Fashion Buyer,Private,1300000,3,M,Gujarati,1,3,Car,35,120,0,1,800,35,Navratri,Gujarati,0,88,Urban,0,3,8,O-
29,Noor Zoya,F,21,Aligarh,Uttar Pradesh,Undergraduate,Student,NA,0,6,S,Urdu,1,6,Bus,30,80,0,0,0,10,Eid,North Indian,0,72,SemiUrban,0,1,6,B+
30,Anil Kumar,M,28,Chandigarh,Chandigarh,Graduate,Civil Engineer,Construction,850000,3,S,Hindi,1,2,Car,30,100,0,1,770,25,Holi,North Indian,0,81,Urban,0,2,8,A-
31,Devika Rao,F,33,Mangalore,Karnataka,Graduate,Product Manager,IT,1600000,3,M,Kannada,1,3,Car,35,100,1,1,825,40,Ugadi,South Indian,0,89,Urban,0,3,8,AB+
32,Shivam Patel,M,26,Vadodara,Gujarat,Graduate,Mechanical Engineer,Manufacturing,780000,4,S,Gujarati,1,2,Bike,25,70,0,1,760,20,Navratri,Gujarati,1,80,Urban,0,2,7,O+
;
run;
proc print;run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Arjun Mehta | Male | 25-34 | Mumbai | Maharashtra | Graduate | Software Engineer | IT | 6L-12L | 3 | Single | Hindi | Yes | 4 | Metro | 60 | 120 | Yes | Yes | 760 | 35 | Diwali | North Indian | No | 85 | Urban | 0 | 4 | 7 | O+ |
| 2 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ |
| 3 | 3 | Sameer Khan | Male | 18-24 | Hyderabad | Telangana | Graduate | Inside Sales | Private | 3L-6L | 5 | Single | Urdu | Yes | 5 | Bike | 35 | 60 | No | Yes | 720 | 22 | Eid | Hyderabadi | No | 72 | Urban | 0 | 2 | 5 | A+ |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ |
| 5 | 5 | Rohan Das | Male | 45-54 | Kolkata | West Bengal | Graduate | School Teacher | Public | 6L-12L | 4 | Married | Bengali | Yes | 2 | Bus | 50 | 80 | Yes | Yes | 768 | 20 | Durga Puja | Bengali | Yes | 78 | Urban | 0 | 1 | 7 | O- |
| 6 | 6 | Ananya Roy | Female | 18-24 | Kolkata | West Bengal | Undergraduate | Student | NA | <= 1L | 5 | Single | Bengali | Yes | 6 | Metro | 30 | 100 | No | No | 0 | 12 | Durga Puja | Continental | No | 70 | Urban | 0 | 1 | 6 | A- |
| 7 | 7 | Amit Patil | Male | 25-34 | Pune | Maharashtra | Diploma | Mechanic | Private | 3L-6L | 6 | Married | Marathi | Yes | 2 | Bike | 45 | 40 | No | Yes | 700 | 15 | Ganesh Chaturthi | Maharashtrian | Yes | 68 | Semi-Urban | 0 | 1 | 5 | O+ |
| 8 | 8 | Sana Parveen | Female | 25-34 | Patna | Bihar | Graduate | Nurse | Healthcare | 3L-6L | 5 | Single | Hindi | Yes | 3 | Auto | 35 | 120 | Yes | Yes | 730 | 20 | Eid | North Indian | No | 77 | Semi-Urban | 0 | 1 | 6 | B+ |
| 9 | 9 | Ritesh Verma | Male | 45-54 | Lucknow | Uttar Pradesh | Graduate | Shop Owner | Informal | 3L-6L | 6 | Married | Hindi | Yes | 1 | Car | 25 | 30 | No | Yes | 690 | 40 | Diwali | North Indian | No | 60 | Urban | 0 | 1 | 6 | A+ |
| 10 | 10 | Keerthi R | Female | 25-34 | Visakhapatnam | Andhra Pradesh | Graduate | Graphic Designer | Media | 3L-6L | 4 | Single | Telugu | Yes | 5 | Bike | 30 | 90 | No | No | 740 | 24 | Ugadi | South Indian | No | 82 | Urban | 0 | 3 | 7 | B- |
| 11 | 11 | Gurpreet Singh | Male | 25-34 | Amritsar | Punjab | Graduate | Logistics Supervisor | Logistics | 3L-6L | 5 | Married | Punjabi | Yes | 2 | Car | 40 | 60 | No | Yes | 715 | 18 | Gurpurab | Punjabi | No | 66 | Urban | 0 | 2 | 6 | O+ |
| 12 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- |
| 13 | 13 | Faizan Ali | Male | 25-34 | Jaipur | Rajasthan | Graduate | Hotel Front Office | Hospitality | 3L-6L | 4 | Single | Hindi | Yes | 4 | Bus | 25 | 45 | No | Yes | 705 | 20 | Diwali | Rajasthani | No | 64 | Urban | 0 | 1 | 5 | A+ |
| 14 | 14 | Kavya Nair | Female | 35-44 | Kochi | Kerala | Postgraduate | Physiotherapist | Healthcare | 6L-12L | 3 | Married | Malayalam | Yes | 2 | Car | 30 | 180 | Yes | Yes | 775 | 32 | Onam | South Indian | Yes | 86 | Urban | 0 | 2 | 6 | O+ |
| 15 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ |
| 16 | 16 | Anusha S | Female | 18-24 | Mysuru | Karnataka | Undergraduate | Student | NA | <= 1L | 5 | Single | Kannada | Yes | 6 | Bus | 20 | 110 | No | No | 0 | 14 | Dasara | South Indian | Yes | 74 | Urban | 0 | 1 | 6 | A- |
| 17 | 17 | Deepak Yadav | Male | 25-34 | Indore | Madhya Pradesh | Graduate | Field Sales | Private | 3L-6L | 6 | Married | Hindi | Yes | 5 | Bike | 50 | 50 | No | Yes | 710 | 25 | Diwali | North Indian | No | 69 | Urban | 0 | 1 | 6 | O+ |
| 18 | 18 | Trisha Dey | Female | 25-34 | Silchar | Assam | Graduate | Content Writer | Media | 3L-6L | 4 | Single | Assamese | Yes | 4 | Auto | 20 | 70 | No | Yes | 735 | 22 | Bihu | North East | No | 80 | Semi-Urban | 0 | 2 | 6 | B+ |
| 19 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ |
| 20 | 20 | Sapna Jain | Female | 25-34 | Bhopal | Madhya Pradesh | Graduate | Accountant | Private | 3L-6L | 5 | Single | Hindi | Yes | 3 | Bus | 30 | 60 | No | Yes | 725 | 20 | Diwali | North Indian | Yes | 76 | Semi-Urban | 0 | 1 | 7 | A+ |
| 21 | 21 | Harish Chandra | Male | 55-64 | Varanasi | Uttar Pradesh | Secondary | Priest | Religious | 1L-3L | 5 | Married | Hindi | No | 0 | Walk | 10 | 15 | No | Yes | 650 | 8 | Diwali | North Indian | Yes | 50 | Urban | 0 | 0 | 6 | O+ |
| 22 | 22 | Rekha Gupta | Female | 35-44 | Kanpur | Uttar Pradesh | Graduate | Bank Officer | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Car | 30 | 90 | Yes | Yes | 790 | 30 | Diwali | North Indian | Yes | 88 | Urban | 0 | 2 | 8 | AB+ |
| 23 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ |
| 24 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ |
| 25 | 25 | Rajeev Ranjan | Male | 25-34 | Ranchi | Jharkhand | Graduate | Police Sub-Inspector | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Bike | 20 | 60 | Yes | Yes | 740 | 25 | Chhath | North Indian | No | 72 | Urban | 0 | 1 | 7 | O+ |
| 26 | 26 | Nisha B | Female | 25-34 | Coimbatore | Tamil Nadu | Graduate | Quality Analyst | Manufacturing | 3L-6L | 4 | Single | Tamil | Yes | 3 | Bike | 25 | 80 | No | Yes | 735 | 22 | Pongal | South Indian | Yes | 82 | Urban | 0 | 2 | 7 | B- |
| 27 | 27 | Arvind Sinha | Male | 35-44 | Bhubaneswar | Odisha | Graduate | Government Clerk | Public | 3L-6L | 5 | Married | Odia | Yes | 2 | Bike | 15 | 45 | Yes | Yes | 720 | 18 | Raja Parba | Odia | Yes | 65 | Urban | 0 | 1 | 7 | A+ |
| 28 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- |
| 29 | 29 | Noor Zoya | Female | 18-24 | Aligarh | Uttar Pradesh | Undergraduate | Student | NA | <= 1L | 6 | Single | Urdu | Yes | 6 | Bus | 30 | 80 | No | No | 0 | 10 | Eid | North Indian | No | 72 | Semi-Urban | 0 | 1 | 6 | B+ |
| 30 | 30 | Anil Kumar | Male | 25-34 | Chandigarh | Chandigarh | Graduate | Civil Engineer | Construction | 6L-12L | 3 | Single | Hindi | Yes | 2 | Car | 30 | 100 | No | Yes | 770 | 25 | Holi | North Indian | No | 81 | Urban | 0 | 2 | 8 | A- |
| 31 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ |
| 32 | 32 | Shivam Patel | Male | 25-34 | Vadodara | Gujarat | Graduate | Mechanical Engineer | Manufacturing | 6L-12L | 4 | Single | Gujarati | Yes | 2 | Bike | 25 | 70 | No | Yes | 760 | 20 | Navratri | Gujarati | Yes | 80 | Urban | 0 | 2 | 7 | O+ |
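Since the DATA step reads 31 comma-separated fields with TRUNCOVER, a row that is one field short would load without an error and simply leave trailing variables missing. One possible way to confirm that nothing was misread is to count non-missing and missing values for every numeric variable. This is a sketch rather than part of the original program, and the title text is an assumption.

```sas
/* Sketch: with no VAR statement, PROC MEANS analyzes all numeric variables. */
/* N should equal 32 and NMISS should equal 0 for each of them.              */
proc means data=people_india n nmiss min max maxdec=0;
  title "Row Counts and Missing Values (sanity check)";
run;
title;
```

The MIN and MAX columns also give a quick range check, for example that the 0/1 flag variables contain nothing outside 0 and 1.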
3) DATA QUALITY & CONTENTS / BASIC OVERVIEW
Purpose: Quick schema and metadata overview for the dataset.
proc contents data=people_india varnum;
title "PEOPLE_INDIA — STRUCTURE & ATTRIBUTES";
run;
title;
Output:
The CONTENTS Procedure
| Data Set Name | WORK.PEOPLE_INDIA | Observations | 32 |
|---|---|---|---|
| Member Type | DATA | Variables | 31 |
| Engine | V9 | Indexes | 0 |
| Created | 09/01/2025 17:54:07 | Observation Length | 352 |
| Last Modified | 09/01/2025 17:54:07 | Deleted Observations | 0 |
| Protection | | Compressed | NO |
| Data Set Type | | Sorted | NO |
| Label | | | |
| Data Representation | SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64 | | |
| Encoding | utf-8 Unicode (UTF-8) | | |
| Engine/Host Dependent Information | |
|---|---|
| Data Set Page Size | 131072 |
| Number of Data Set Pages | 1 |
| First Data Page | 1 |
| Max Obs per Page | 372 |
| Obs in First Data Page | 32 |
| Number of Data Set Repairs | 0 |
| Filename | /saswork/SAS_work638500008593_odaws01-apse1-2.oda.sas.com/SAS_work8DCE00008593_odaws01-apse1-2.oda.sas.com/people_india.sas7bdat |
| Release Created | 9.0401M8 |
| Host Created | Linux |
| Inode Number | 67165383 |
| Access Permission | rw-r--r-- |
| Owner Name | u63247146 |
| File Size | 256KB |
| File Size (bytes) | 262144 |
| Variables in Creation Order | |||||
|---|---|---|---|---|---|
| # | Variable | Type | Len | Format | Label |
| 1 | Person_ID | Num | 8 | ||
| 2 | Name | Char | 28 | ||
| 3 | Gender | Char | 1 | $GENDERF. | |
| 4 | Age | Num | 8 | AGEBANDF. | |
| 5 | City | Char | 20 | ||
| 6 | State | Char | 20 | ||
| 7 | Education_Level | Char | 20 | | Highest Education |
| 8 | Occupation | Char | 24 | ||
| 9 | Sector | Char | 16 | ||
| 10 | Income_INR | Num | 8 | INCOMEF. | Annual Income (INR) |
| 11 | Household_Size | Num | 8 | ||
| 12 | Marital_Status | Char | 1 | $MARF. | |
| 13 | Language | Char | 18 | ||
| 14 | Smartphone_User | Num | 8 | YESNOF. | |
| 15 | Internet_Hours_Per_Day | Num | 8 | | Daily Internet Hours |
| 16 | Commute_Mode | Char | 14 | | Primary Commute Mode |
| 17 | Commute_Minutes | Num | 8 | ||
| 18 | Fitness_Mins_Per_Week | Num | 8 | | Weekly Fitness Minutes |
| 19 | Has_Health_Insurance | Num | 8 | YESNOF. | |
| 20 | Voter_ID_Flag | Num | 8 | YESNOF. | |
| 21 | Credit_Score | Num | 8 | ||
| 22 | UPI_Transactions_Month | Num | 8 | | Monthly UPI Transactions |
| 23 | Festival_Celebrated | Char | 18 | ||
| 24 | Cuisine_Pref | Char | 16 | ||
| 25 | Veg_Flag | Num | 8 | YESNOF. | |
| 26 | Digital_Literacy_Score | Num | 8 | | Digital Literacy (0-100) |
| 27 | Rural_Urban | Char | 10 | $RUF. | |
| 28 | Disability_Flag | Num | 8 | ||
| 29 | Travel_Trips_Year | Num | 8 | ||
| 30 | Pollution_Concern_Score | Num | 8 | | Pollution Concern (1-10) |
| 31 | Blood_Group | Char | 3 | ||
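If the same metadata is needed as a dataset rather than a printed report, the DICTIONARY.COLUMNS table can be queried through PROC SQL. The sketch below assumes the dataset lives in the WORK library, as the CONTENTS output indicates; the output table name people_india_meta is hypothetical.

```sas
/* Sketch: capture variable-level metadata as a dataset.                 */
/* LIBNAME and MEMNAME are stored in upper case in DICTIONARY.COLUMNS.   */
proc sql;
  create table people_india_meta as      /* hypothetical output name */
  select varnum, name, type, length, format, label
  from dictionary.columns
  where libname = 'WORK' and memname = 'PEOPLE_INDIA'
  order by varnum;
quit;
```

A table like this can then be filtered, for instance to list the variables that have no label yet.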
Purpose: Inspect a small sample to eyeball plausibility of values.
proc print data=people_india (obs=10) label noobs;
title "First 10 Rows for Sanity Check";
run;
title;
Output:
| Person_ID | Name | Gender | Age | City | State | Highest Education | Occupation | Sector | Annual Income (INR) | Household_Size | Marital_Status | Language | Smartphone_User | Daily Internet Hours | Primary Commute Mode | Commute_Minutes | Weekly Fitness Minutes | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | Monthly UPI Transactions | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital Literacy (0-100) | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution Concern (1-10) | Blood_Group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Arjun Mehta | Male | 25-34 | Mumbai | Maharashtra | Graduate | Software Engineer | IT | 6L-12L | 3 | Single | Hindi | Yes | 4 | Metro | 60 | 120 | Yes | Yes | 760 | 35 | Diwali | North Indian | No | 85 | Urban | 0 | 4 | 7 | O+ |
| 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ |
| 3 | Sameer Khan | Male | 18-24 | Hyderabad | Telangana | Graduate | Inside Sales | Private | 3L-6L | 5 | Single | Urdu | Yes | 5 | Bike | 35 | 60 | No | Yes | 720 | 22 | Eid | Hyderabadi | No | 72 | Urban | 0 | 2 | 5 | A+ |
| 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ |
| 5 | Rohan Das | Male | 45-54 | Kolkata | West Bengal | Graduate | School Teacher | Public | 6L-12L | 4 | Married | Bengali | Yes | 2 | Bus | 50 | 80 | Yes | Yes | 768 | 20 | Durga Puja | Bengali | Yes | 78 | Urban | 0 | 1 | 7 | O- |
| 6 | Ananya Roy | Female | 18-24 | Kolkata | West Bengal | Undergraduate | Student | NA | <= 1L | 5 | Single | Bengali | Yes | 6 | Metro | 30 | 100 | No | No | 0 | 12 | Durga Puja | Continental | No | 70 | Urban | 0 | 1 | 6 | A- |
| 7 | Amit Patil | Male | 25-34 | Pune | Maharashtra | Diploma | Mechanic | Private | 3L-6L | 6 | Married | Marathi | Yes | 2 | Bike | 45 | 40 | No | Yes | 700 | 15 | Ganesh Chaturthi | Maharashtrian | Yes | 68 | Semi-Urban | 0 | 1 | 5 | O+ |
| 8 | Sana Parveen | Female | 25-34 | Patna | Bihar | Graduate | Nurse | Healthcare | 3L-6L | 5 | Single | Hindi | Yes | 3 | Auto | 35 | 120 | Yes | Yes | 730 | 20 | Eid | North Indian | No | 77 | Semi-Urban | 0 | 1 | 6 | B+ |
| 9 | Ritesh Verma | Male | 45-54 | Lucknow | Uttar Pradesh | Graduate | Shop Owner | Informal | 3L-6L | 6 | Married | Hindi | Yes | 1 | Car | 25 | 30 | No | Yes | 690 | 40 | Diwali | North Indian | No | 60 | Urban | 0 | 1 | 6 | A+ |
| 10 | Keerthi R | Female | 25-34 | Visakhapatnam | Andhra Pradesh | Graduate | Graphic Designer | Media | 3L-6L | 4 | Single | Telugu | Yes | 5 | Bike | 30 | 90 | No | No | 740 | 24 | Ugadi | South Indian | No | 82 | Urban | 0 | 3 | 7 | B- |
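Purpose: Remove duplicate rows by Person_ID (none expected). NODUPKEY drops duplicates silently, so check the log note for the number of observations deleted.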
proc sort data=people_india out=people_india_sorted nodupkey;
by Person_ID;
run;
proc print data=people_india_sorted (obs=10);run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Arjun Mehta | Male | 25-34 | Mumbai | Maharashtra | Graduate | Software Engineer | IT | 6L-12L | 3 | Single | Hindi | Yes | 4 | Metro | 60 | 120 | Yes | Yes | 760 | 35 | Diwali | North Indian | No | 85 | Urban | 0 | 4 | 7 | O+ |
| 2 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ |
| 3 | 3 | Sameer Khan | Male | 18-24 | Hyderabad | Telangana | Graduate | Inside Sales | Private | 3L-6L | 5 | Single | Urdu | Yes | 5 | Bike | 35 | 60 | No | Yes | 720 | 22 | Eid | Hyderabadi | No | 72 | Urban | 0 | 2 | 5 | A+ |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ |
| 5 | 5 | Rohan Das | Male | 45-54 | Kolkata | West Bengal | Graduate | School Teacher | Public | 6L-12L | 4 | Married | Bengali | Yes | 2 | Bus | 50 | 80 | Yes | Yes | 768 | 20 | Durga Puja | Bengali | Yes | 78 | Urban | 0 | 1 | 7 | O- |
| 6 | 6 | Ananya Roy | Female | 18-24 | Kolkata | West Bengal | Undergraduate | Student | NA | <= 1L | 5 | Single | Bengali | Yes | 6 | Metro | 30 | 100 | No | No | 0 | 12 | Durga Puja | Continental | No | 70 | Urban | 0 | 1 | 6 | A- |
| 7 | 7 | Amit Patil | Male | 25-34 | Pune | Maharashtra | Diploma | Mechanic | Private | 3L-6L | 6 | Married | Marathi | Yes | 2 | Bike | 45 | 40 | No | Yes | 700 | 15 | Ganesh Chaturthi | Maharashtrian | Yes | 68 | Semi-Urban | 0 | 1 | 5 | O+ |
| 8 | 8 | Sana Parveen | Female | 25-34 | Patna | Bihar | Graduate | Nurse | Healthcare | 3L-6L | 5 | Single | Hindi | Yes | 3 | Auto | 35 | 120 | Yes | Yes | 730 | 20 | Eid | North Indian | No | 77 | Semi-Urban | 0 | 1 | 6 | B+ |
| 9 | 9 | Ritesh Verma | Male | 45-54 | Lucknow | Uttar Pradesh | Graduate | Shop Owner | Informal | 3L-6L | 6 | Married | Hindi | Yes | 1 | Car | 25 | 30 | No | Yes | 690 | 40 | Diwali | North Indian | No | 60 | Urban | 0 | 1 | 6 | A+ |
| 10 | 10 | Keerthi R | Female | 25-34 | Visakhapatnam | Andhra Pradesh | Graduate | Graphic Designer | Media | 3L-6L | 4 | Single | Telugu | Yes | 5 | Bike | 30 | 90 | No | No | 740 | 24 | Ugadi | South Indian | No | 82 | Urban | 0 | 3 | 7 | B- |
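NODUPKEY on its own only removes rows with a repeated key; it does not show which rows they were. One way to inspect them is the DUPOUT= option of PROC SORT, which writes the discarded rows to a separate dataset. This is a sketch, and the name people_india_dups is an assumption introduced for illustration.

```sas
/* Sketch: keep the de-duplicated data and capture any discarded rows. */
proc sort data=people_india out=people_india_sorted
          nodupkey dupout=people_india_dups;  /* hypothetical dataset name */
  by Person_ID;
run;

/* Count the rows that were dropped; a clean dataset should report 0. */
proc sql;
  select count(*) as N_Duplicate_Rows
  from people_india_dups;
quit;
```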
4) FREQUENCIES, DISTRIBUTIONS & SUMMARIES
Purpose: Frequency tables to see categorical distributions at a glance.
proc freq data=people_india;
tables Gender Marital_Status Rural_Urban Veg_Flag Smartphone_User
Has_Health_Insurance Sector Education_Level Language
Cuisine_Pref Festival_Celebrated / missing;
title "Categorical Distribution Snapshots";
run;
title;
Output:
The FREQ Procedure
| Gender | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Female | 16 | 50.00 | 16 | 50.00 |
| Male | 16 | 50.00 | 32 | 100.00 |
| Marital_Status | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Married | 18 | 56.25 | 18 | 56.25 |
| Single | 14 | 43.75 | 32 | 100.00 |
| Rural_Urban | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Semi-Urban | 5 | 15.63 | 5 | 15.63 |
| Urban | 27 | 84.38 | 32 | 100.00 |
| Veg_Flag | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| No | 20 | 62.50 | 20 | 62.50 |
| Yes | 12 | 37.50 | 32 | 100.00 |
| Smartphone_User | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| No | 1 | 3.13 | 1 | 3.13 |
| Yes | 31 | 96.88 | 32 | 100.00 |
| Has_Health_Insurance | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| No | 18 | 56.25 | 18 | 56.25 |
| Yes | 14 | 43.75 | 32 | 100.00 |
| Sector | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Construction | 1 | 3.13 | 1 | 3.13 |
| Healthcare | 2 | 6.25 | 3 | 9.38 |
| Hospitality | 1 | 3.13 | 4 | 12.50 |
| IT | 5 | 15.63 | 9 | 28.13 |
| Informal | 1 | 3.13 | 10 | 31.25 |
| Logistics | 1 | 3.13 | 11 | 34.38 |
| Manufacturing | 2 | 6.25 | 13 | 40.63 |
| Media | 2 | 6.25 | 15 | 46.88 |
| NA | 3 | 9.38 | 18 | 56.25 |
| Private | 7 | 21.88 | 25 | 78.13 |
| Public | 5 | 15.63 | 30 | 93.75 |
| Religious | 1 | 3.13 | 31 | 96.88 |
| Startup | 1 | 3.13 | 32 | 100.00 |
| Highest Education | ||||
|---|---|---|---|---|
| Education_Level | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
| Diploma | 1 | 3.13 | 1 | 3.13 |
| Graduate | 20 | 62.50 | 21 | 65.63 |
| Postgraduate | 7 | 21.88 | 28 | 87.50 |
| Secondary | 1 | 3.13 | 29 | 90.63 |
| Undergraduate | 3 | 9.38 | 32 | 100.00 |
| Language | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Assamese | 1 | 3.13 | 1 | 3.13 |
| Bengali | 2 | 6.25 | 3 | 9.38 |
| Gujarati | 3 | 9.38 | 6 | 18.75 |
| Hindi | 12 | 37.50 | 18 | 56.25 |
| Kannada | 3 | 9.38 | 21 | 65.63 |
| Malayalam | 2 | 6.25 | 23 | 71.88 |
| Marathi | 2 | 6.25 | 25 | 78.13 |
| Odia | 1 | 3.13 | 26 | 81.25 |
| Punjabi | 1 | 3.13 | 27 | 84.38 |
| Tamil | 2 | 6.25 | 29 | 90.63 |
| Telugu | 1 | 3.13 | 30 | 93.75 |
| Urdu | 2 | 6.25 | 32 | 100.00 |
| Cuisine_Pref | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Bengali | 1 | 3.13 | 1 | 3.13 |
| Continental | 1 | 3.13 | 2 | 6.25 |
| Gujarati | 3 | 9.38 | 5 | 15.63 |
| Hyderabadi | 1 | 3.13 | 6 | 18.75 |
| Maharashtrian | 2 | 6.25 | 8 | 25.00 |
| North East | 1 | 3.13 | 9 | 28.13 |
| North Indian | 12 | 37.50 | 21 | 65.63 |
| Odia | 1 | 3.13 | 22 | 68.75 |
| Punjabi | 1 | 3.13 | 23 | 71.88 |
| Rajasthani | 1 | 3.13 | 24 | 75.00 |
| South Indian | 8 | 25.00 | 32 | 100.00 |
| Festival_Celebrated | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Bihu | 1 | 3.13 | 1 | 3.13 |
| Chhath | 1 | 3.13 | 2 | 6.25 |
| Dasara | 1 | 3.13 | 3 | 9.38 |
| Diwali | 9 | 28.13 | 12 | 37.50 |
| Durga Puja | 2 | 6.25 | 14 | 43.75 |
| Eid | 3 | 9.38 | 17 | 53.13 |
| Ganesh Chaturthi | 1 | 3.13 | 18 | 56.25 |
| Gurpurab | 1 | 3.13 | 19 | 59.38 |
| Holi | 2 | 6.25 | 21 | 65.63 |
| Navratri | 3 | 9.38 | 24 | 75.00 |
| Onam | 2 | 6.25 | 26 | 81.25 |
| Pongal | 2 | 6.25 | 28 | 87.50 |
| Raja Parba | 1 | 3.13 | 29 | 90.63 |
| Ugadi | 3 | 9.38 | 32 | 100.00 |
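The one-way tables above are sorted alphabetically by formatted value. When the goal is to see the largest categories first, or to compare two categorical variables, a variant along the following lines may be useful. It is a sketch under the assumption that the same dataset and formats are in place; the title text is invented.

```sas
/* Sketch: ORDER=FREQ lists the most frequent levels first;           */
/* the two-way table shows insurance coverage within each gender.     */
proc freq data=people_india order=freq;
  tables Sector / nocum;                                  /* ranked one-way table */
  tables Gender*Has_Health_Insurance / nocol nopercent;   /* keep counts and row % */
  title "Sector by Count and Insurance Coverage by Gender";
run;
title;
```

NOCOL and NOPERCENT suppress the column and overall percentages, leaving only the counts and row percentages in each cell.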
Purpose: Descriptive stats for key numeric variables.
proc means data=people_india mean std min p25 median p75 max maxdec=1;
var Age Income_INR Household_Size Internet_Hours_Per_Day
Commute_Minutes Fitness_Mins_Per_Week Credit_Score
UPI_Transactions_Month Digital_Literacy_Score Travel_Trips_Year
Pollution_Concern_Score;
title "Descriptive Statistics — Numerics";
run;
title;
Output:
The MEANS Procedure
| Variable | Label | Mean | Std Dev | Minimum | 25th Pctl | Median | 75th Pctl | Maximum |
|---|---|---|---|---|---|---|---|---|
| Age | | 33.7 | 9.2 | 21.0 | 27.0 | 31.5 | 40.0 | 56.0 |
| Income_INR | Annual Income (INR) | 777187.5 | 519920.8 | 0.0 | 455000.0 | 625000.0 | 1150000.0 | 2200000.0 |
| Household_Size | | 4.2 | 1.0 | 3.0 | 3.0 | 4.0 | 5.0 | 6.0 |
| Internet_Hours_Per_Day | Daily Internet Hours | 3.0 | 1.6 | 0.0 | 2.0 | 3.0 | 4.0 | 6.0 |
| Commute_Minutes | | 33.8 | 12.4 | 10.0 | 25.0 | 30.0 | 40.0 | 60.0 |
| Fitness_Mins_Per_Week | Weekly Fitness Minutes | 82.5 | 35.0 | 15.0 | 60.0 | 80.0 | 100.0 | 180.0 |
| Credit_Score | | 684.0 | 227.7 | 0.0 | 712.5 | 740.0 | 785.0 | 835.0 |
| UPI_Transactions_Month | Monthly UPI Transactions | 25.2 | 11.0 | 8.0 | 19.0 | 22.0 | 31.0 | 60.0 |
| Digital_Literacy_Score | Digital Literacy (0-100) | 77.7 | 10.1 | 50.0 | 71.0 | 80.0 | 85.5 | 92.0 |
| Travel_Trips_Year | | 2.0 | 1.2 | 0.0 | 1.0 | 2.0 | 3.0 | 5.0 |
| Pollution_Concern_Score | Pollution Concern (1-10) | 6.8 | 1.1 | 5.0 | 6.0 | 7.0 | 8.0 | 9.0 |
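The minimum Credit_Score of 0 comes from the three Student rows, where 0 appears to stand in for "no score yet" rather than a real value. If that reading is right (it is an assumption, not something the data states), those rows pull the mean down and inflate the standard deviation. A hedged way to see the effect is to rerun the statistics with the zero rows filtered out:

```sas
/* Sketch: treat Credit_Score = 0 as a placeholder and exclude it.      */
/* The WHERE statement filters rows for this step only; the dataset     */
/* itself is left unchanged.                                            */
proc means data=people_india n mean median min max maxdec=1;
  where Credit_Score > 0;
  var Credit_Score;
  title "Credit Score, Excluding Zero Placeholders";
run;
title;
```

The same consideration applies to Income_INR, where the students are recorded with an income of 0.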
Purpose: Distributional diagnostics for selected metrics.
proc univariate data=people_india noprint;
var Income_INR Credit_Score Digital_Literacy_Score UPI_Transactions_Month;
histogram;
inset n mean std min max / pos=ne;
title "Distribution Diagnostics (Histograms)";
run;
title;
Output:
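The UNIVARIATE Procedure
[Four histograms, one each for Income_INR, Credit_Score, Digital_Literacy_Score, and UPI_Transactions_Month, each with an inset showing N, mean, standard deviation, minimum, and maximum in the top-right corner]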
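Because NOPRINT suppresses the usual moments and quantiles tables, the step above yields only the plots. If selected percentiles are also wanted as data, the OUTPUT statement of PROC UNIVARIATE can write them to a dataset. The sketch below uses Digital_Literacy_Score as an example; the dataset name dl_percentiles and the prefix P_ are assumptions chosen for illustration.

```sas
/* Sketch: save the 10th, 50th and 90th percentiles to a dataset.      */
/* PCTLPRE= sets the prefix, giving variables P_10, P_50 and P_90.     */
proc univariate data=people_india noprint;
  var Digital_Literacy_Score;
  output out=dl_percentiles          /* hypothetical dataset name */
         pctlpts=10 50 90
         pctlpre=P_;
run;

proc print data=dl_percentiles noobs;
run;
```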
5) GROUPED SUMMARIES & TABULATED REPORTING
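Purpose: Summarize by State and Sector for planning headcount/market views. Avg_Income inherits the INCOMEF. format from Income_INR, so the printed means appear as income bands rather than rupee amounts.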
proc summary data=people_india nway;
class State Sector;
var Income_INR Credit_Score UPI_Transactions_Month;
output out=state_sector_summary
n()=N_Records
mean(Income_INR)=Avg_Income
mean(Credit_Score)=Avg_Credit
mean(UPI_Transactions_Month)=Avg_UPI;
run;
title "Summary by State x Sector";
proc print data=state_sector_summary;
run;
title;
Output:
| Obs | State | Sector | _TYPE_ | _FREQ_ | N_Records | Avg_Income | Avg_Credit | Avg_UPI |
|---|---|---|---|---|---|---|---|---|
| 1 | Andhra Pradesh | Media | 3 | 1 | 1 | 3L-6L | 740.0 | 24.0 |
| 2 | Assam | Media | 3 | 1 | 1 | 3L-6L | 735.0 | 22.0 |
| 3 | Bihar | Healthcare | 3 | 1 | 1 | 3L-6L | 730.0 | 20.0 |
| 4 | Chandigarh | Construction | 3 | 1 | 1 | 6L-12L | 770.0 | 25.0 |
| 5 | Delhi | Private | 3 | 1 | 1 | > 12L | 805.0 | 18.0 |
| 6 | Gujarat | Manufacturing | 3 | 1 | 1 | 6L-12L | 760.0 | 20.0 |
| 7 | Gujarat | Private | 3 | 1 | 1 | > 12L | 800.0 | 35.0 |
| 8 | Gujarat | Startup | 3 | 1 | 1 | 6L-12L | 780.0 | 60.0 |
| 9 | Haryana | IT | 3 | 1 | 1 | > 12L | 820.0 | 38.0 |
| 10 | Jharkhand | Public | 3 | 1 | 1 | 6L-12L | 740.0 | 25.0 |
| 11 | Karnataka | IT | 3 | 2 | 2 | > 12L | 830.0 | 42.5 |
| 12 | Karnataka | NA | 3 | 1 | 1 | <= 1L | 0.0 | 14.0 |
| 13 | Kerala | Healthcare | 3 | 1 | 1 | 6L-12L | 775.0 | 32.0 |
| 14 | Kerala | Public | 3 | 1 | 1 | > 12L | 815.0 | 20.0 |
| 15 | Madhya Pradesh | Private | 3 | 2 | 2 | 3L-6L | 717.5 | 22.5 |
| 16 | Maharashtra | IT | 3 | 1 | 1 | 6L-12L | 760.0 | 35.0 |
| 17 | Maharashtra | Private | 3 | 2 | 2 | 6L-12L | 745.0 | 20.5 |
| 18 | Odisha | Public | 3 | 1 | 1 | 3L-6L | 720.0 | 18.0 |
| 19 | Punjab | Logistics | 3 | 1 | 1 | 3L-6L | 715.0 | 18.0 |
| 20 | Rajasthan | Hospitality | 3 | 1 | 1 | 3L-6L | 705.0 | 20.0 |
| 21 | Tamil Nadu | IT | 3 | 1 | 1 | 6L-12L | 780.0 | 28.0 |
| 22 | Tamil Nadu | Manufacturing | 3 | 1 | 1 | 3L-6L | 735.0 | 22.0 |
| 23 | Telangana | Private | 3 | 1 | 1 | 3L-6L | 720.0 | 22.0 |
| 24 | Uttar Pradesh | Informal | 3 | 1 | 1 | 3L-6L | 690.0 | 40.0 |
| 25 | Uttar Pradesh | NA | 3 | 1 | 1 | <= 1L | 0.0 | 10.0 |
| 26 | Uttar Pradesh | Public | 3 | 1 | 1 | 6L-12L | 790.0 | 30.0 |
| 27 | Uttar Pradesh | Religious | 3 | 1 | 1 | 1L-3L | 650.0 | 8.0 |
| 28 | West Bengal | NA | 3 | 1 | 1 | <= 1L | 0.0 | 12.0 |
| 29 | West Bengal | Public | 3 | 1 | 1 | 6L-12L | 768.0 | 20.0 |
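In the listing above, Avg_Income prints as a band such as 3L-6L instead of a number. This is consistent with the output statistic inheriting a format from Income_INR; the assumption here is that incomeF. was attached to Income_INR when PEOPLE_INDIA was created, which this section does not show. A hedged sketch of overriding the display format at print time:

```sas
/* Sketch: show numeric means by overriding the inherited format.
   Assumes state_sector_summary from the PROC SUMMARY step above. */
proc print data=state_sector_summary noobs;
   var State Sector N_Records Avg_Income Avg_Credit Avg_UPI;
   format Avg_Income comma12.      /* a number instead of an income band */
          Avg_Credit Avg_UPI 8.1;
run;
```

The stored values are unchanged; only the way they are displayed differs.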
Purpose: Cross-tabs in a single compact table for leadership review.
proc tabulate data=people_india format=8.1;
class State Gender Rural_Urban;
var Income_INR Digital_Literacy_Score;
table State,
Gender * (n colpctn)
Rural_Urban * (n colpctn)
(Income_INR Digital_Literacy_Score) * mean;
title "Multi-Dimension Snapshot (TABULATE)";
run;
title;
Output:
| State | Female N | Female ColPctN | Male N | Male ColPctN | Semi-Urban N | Semi-Urban ColPctN | Urban N | Urban ColPctN | Annual Income (INR) Mean | Digital Literacy (0-100) Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| Andhra Pradesh | 1 | 6.3 | . | . | . | . | 1 | 3.7 | 520000.0 | 82.0 |
| Assam | 1 | 6.3 | . | . | 1 | 20.0 | . | . | 460000.0 | 80.0 |
| Bihar | 1 | 6.3 | . | . | 1 | 20.0 | . | . | 480000.0 | 77.0 |
| Chandigarh | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 850000.0 | 81.0 |
| Delhi | 1 | 6.3 | . | . | . | . | 1 | 3.7 | 1400000 | 83.0 |
| Gujarat | 1 | 6.3 | 2 | 12.5 | . | . | 3 | 11.1 | 1093333 | 84.3 |
| Haryana | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 1800000 | 90.0 |
| Jharkhand | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 700000.0 | 72.0 |
| Karnataka | 2 | 12.5 | 1 | 6.3 | . | . | 3 | 11.1 | 1266667 | 85.0 |
| Kerala | 2 | 12.5 | . | . | . | . | 2 | 7.4 | 1075000 | 88.0 |
| Madhya Pradesh | 1 | 6.3 | 1 | 6.3 | 1 | 20.0 | 1 | 3.7 | 460000.0 | 72.5 |
| Maharashtra | 1 | 6.3 | 2 | 12.5 | 1 | 20.0 | 2 | 7.4 | 826666.7 | 79.0 |
| Odisha | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 580000.0 | 65.0 |
| Punjab | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 600000.0 | 66.0 |
| Rajasthan | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 360000.0 | 64.0 |
| Tamil Nadu | 2 | 12.5 | . | . | . | . | 2 | 7.4 | 810000.0 | 85.0 |
| Telangana | . | . | 1 | 6.3 | . | . | 1 | 3.7 | 450000.0 | 72.0 |
| Uttar Pradesh | 2 | 12.5 | 2 | 12.5 | 1 | 20.0 | 3 | 11.1 | 442500.0 | 67.5 |
| West Bengal | 1 | 6.3 | 1 | 6.3 | . | . | 2 | 7.4 | 325000.0 | 74.0 |
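ColPctN is the cell count divided by its column total. The Female column sums to 16, so a state with one woman shows 1/16 = 6.25%, displayed as 6.3 by format=8.1; the Urban column sums to 27, so one urban resident shows 1/27, about 3.7%. A small sketch that makes these denominators visible by adding a total row; the label 'All States' is an illustrative choice.

```sas
/* Sketch: add an ALL row so the column totals behind ColPctN are visible. */
proc tabulate data=people_india format=8.1;
   class State Gender Rural_Urban;
   table State all='All States',
         Gender * (n colpctn)
         Rural_Urban * (n colpctn);
run;
```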
Purpose: Polished tabular report with custom columns and labels.
proc report data=people_india nowd;
columns State City Name Gender Age Education_Level Occupation Sector Income_INR Credit_Score Digital_Literacy_Score;
define State / group 'State';
define City / display 'City';
define Name / display 'Name';
define Gender / display 'Gender';
define Age / display 'Age';
define Education_Level / display 'Education';
define Occupation / display 'Occupation';
define Sector / display 'Sector';
define Income_INR / analysis mean format=comma12. 'Avg Income';
define Credit_Score / analysis mean 'Avg Credit';
define Digital_Literacy_Score / analysis mean 'Avg Digital Lit';
title "People of India — City & Career Lens (REPORT)";
run;
title;
Output:
| State | City | Name | Gender | Age | Education | Occupation | Sector | Avg Income | Avg Credit | Avg Digital Lit |
|---|---|---|---|---|---|---|---|---|---|---|
| Andhra Pradesh | Visakhapatnam | Keerthi R | Female | 25-34 | Graduate | Graphic Designer | Media | 520,000 | 740 | 82 |
| Assam | Silchar | Trisha Dey | Female | 25-34 | Graduate | Content Writer | Media | 460,000 | 735 | 80 |
| Bihar | Patna | Sana Parveen | Female | 25-34 | Graduate | Nurse | Healthcare | 480,000 | 730 | 77 |
| Chandigarh | Chandigarh | Anil Kumar | Male | 25-34 | Graduate | Civil Engineer | Construction | 850,000 | 770 | 81 |
| Delhi | Delhi | Neha Sharma | Female | 35-44 | Postgraduate | Marketing Manager | Private | 1,400,000 | 805 | 83 |
| Gujarat | Ahmedabad | Aakash Jain | Male | 25-34 | Graduate | Entrepreneur | Startup | 1,200,000 | 780 | 85 |
| | Surat | Shruti Joshi | Female | 35-44 | Postgraduate | Fashion Buyer | Private | 1,300,000 | 800 | 88 |
| | Vadodara | Shivam Patel | Male | 25-34 | Graduate | Mechanical Engineer | Manufacturing | 780,000 | 760 | 80 |
| Haryana | Gurugram | Manoj Kumar | Male | 45-54 | Graduate | Project Manager | IT | 1,800,000 | 820 | 90 |
| Jharkhand | Ranchi | Rajeev Ranjan | Male | 25-34 | Graduate | Police Sub-Inspector | Public | 700,000 | 740 | 72 |
| Karnataka | Mysuru | Anusha S | Female | 18-24 | Undergraduate | Student | NA | 0 | 0 | 74 |
| | Bengaluru | Vikram Rao | Male | 35-44 | Postgraduate | Data Scientist | IT | 2,200,000 | 835 | 92 |
| | Mangalore | Devika Rao | Female | 25-34 | Graduate | Product Manager | IT | 1,600,000 | 825 | 89 |
| Kerala | Kochi | Kavya Nair | Female | 35-44 | Postgraduate | Physiotherapist | Healthcare | 900,000 | 775 | 86 |
| | Thiruvananthapuram | Meera Pillai | Female | 45-54 | Postgraduate | School Principal | Public | 1,250,000 | 815 | 90 |
| Madhya Pradesh | Indore | Deepak Yadav | Male | 25-34 | Graduate | Field Sales | Private | 420,000 | 710 | 69 |
| | Bhopal | Sapna Jain | Female | 25-34 | Graduate | Accountant | Private | 500,000 | 725 | 76 |
| Maharashtra | Mumbai | Arjun Mehta | Male | 25-34 | Graduate | Software Engineer | IT | 900,000 | 760 | 85 |
| | Pune | Amit Patil | Male | 25-34 | Diploma | Mechanic | Private | 380,000 | 700 | 68 |
| | Nagpur | Sonali Kulkarni | Female | 35-44 | Postgraduate | HR Lead | Private | 1,200,000 | 790 | 84 |
| Odisha | Bhubaneswar | Arvind Sinha | Male | 35-44 | Graduate | Government Clerk | Public | 580,000 | 720 | 65 |
| Punjab | Amritsar | Gurpreet Singh | Male | 25-34 | Graduate | Logistics Supervisor | Logistics | 600,000 | 715 | 66 |
| Rajasthan | Jaipur | Faizan Ali | Male | 25-34 | Graduate | Hotel Front Office | Hospitality | 360,000 | 705 | 64 |
| Tamil Nadu | Chennai | Priya Iyer | Female | 25-34 | Postgraduate | Data Analyst | IT | 1,100,000 | 780 | 88 |
| | Coimbatore | Nisha B | Female | 25-34 | Graduate | Quality Analyst | Manufacturing | 520,000 | 735 | 82 |
| Telangana | Hyderabad | Sameer Khan | Male | 18-24 | Graduate | Inside Sales | Private | 450,000 | 720 | 72 |
| Uttar Pradesh | Lucknow | Ritesh Verma | Male | 45-54 | Graduate | Shop Owner | Informal | 550,000 | 690 | 60 |
| | Varanasi | Harish Chandra | Male | 55-64 | Secondary | Priest | Religious | 240,000 | 650 | 50 |
| | Kanpur | Rekha Gupta | Female | 35-44 | Graduate | Bank Officer | Public | 980,000 | 790 | 88 |
| | Aligarh | Noor Zoya | Female | 18-24 | Undergraduate | Student | NA | 0 | 0 | 72 |
| West Bengal | Kolkata | Rohan Das | Male | 45-54 | Graduate | School Teacher | Public | 650,000 | 768 | 78 |
| | Kolkata | Ananya Roy | Female | 18-24 | Undergraduate | Student | NA | 0 | 0 | 70 |
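Every row above is a single person, so the "Avg" columns simply repeat that person's own values. PROC REPORT collapses rows into groups only when the report contains no DISPLAY or ORDER variables; with City, Name, and the other DISPLAY columns present, State behaves like an ORDER variable. A minimal sketch of a true state-level summary keeps only the GROUP and ANALYSIS columns:

```sas
/* Sketch: one row per State, with means computed across its residents. */
proc report data=people_india nowd;
   columns State Income_INR Credit_Score Digital_Literacy_Score;
   define State                  / group 'State';
   define Income_INR             / analysis mean format=comma12. 'Avg Income';
   define Credit_Score           / analysis mean format=8.1 'Avg Credit';
   define Digital_Literacy_Score / analysis mean format=8.1 'Avg Digital Lit';
run;
```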
6) RANKING & BUCKETS
Purpose: Rank individuals by income to identify top earners and bands.
proc rank data=people_india out=people_ranked groups=4 ties=low;
var Income_INR;
ranks Income_Quartile;
run;
proc print data=people_ranked; run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Arjun Mehta | Male | 25-34 | Mumbai | Maharashtra | Graduate | Software Engineer | IT | 6L-12L | 3 | Single | Hindi | Yes | 4 | Metro | 60 | 120 | Yes | Yes | 760 | 35 | Diwali | North Indian | No | 85 | Urban | 0 | 4 | 7 | O+ | 2 |
| 2 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ | 2 |
| 3 | 3 | Sameer Khan | Male | 18-24 | Hyderabad | Telangana | Graduate | Inside Sales | Private | 3L-6L | 5 | Single | Urdu | Yes | 5 | Bike | 35 | 60 | No | Yes | 720 | 22 | Eid | Hyderabadi | No | 72 | Urban | 0 | 2 | 5 | A+ | 0 |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 |
| 5 | 5 | Rohan Das | Male | 45-54 | Kolkata | West Bengal | Graduate | School Teacher | Public | 6L-12L | 4 | Married | Bengali | Yes | 2 | Bus | 50 | 80 | Yes | Yes | 768 | 20 | Durga Puja | Bengali | Yes | 78 | Urban | 0 | 1 | 7 | O- | 2 |
| 6 | 6 | Ananya Roy | Female | 18-24 | Kolkata | West Bengal | Undergraduate | Student | NA | <= 1L | 5 | Single | Bengali | Yes | 6 | Metro | 30 | 100 | No | No | 0 | 12 | Durga Puja | Continental | No | 70 | Urban | 0 | 1 | 6 | A- | 0 |
| 7 | 7 | Amit Patil | Male | 25-34 | Pune | Maharashtra | Diploma | Mechanic | Private | 3L-6L | 6 | Married | Marathi | Yes | 2 | Bike | 45 | 40 | No | Yes | 700 | 15 | Ganesh Chaturthi | Maharashtrian | Yes | 68 | Semi-Urban | 0 | 1 | 5 | O+ | 0 |
| 8 | 8 | Sana Parveen | Female | 25-34 | Patna | Bihar | Graduate | Nurse | Healthcare | 3L-6L | 5 | Single | Hindi | Yes | 3 | Auto | 35 | 120 | Yes | Yes | 730 | 20 | Eid | North Indian | No | 77 | Semi-Urban | 0 | 1 | 6 | B+ | 1 |
| 9 | 9 | Ritesh Verma | Male | 45-54 | Lucknow | Uttar Pradesh | Graduate | Shop Owner | Informal | 3L-6L | 6 | Married | Hindi | Yes | 1 | Car | 25 | 30 | No | Yes | 690 | 40 | Diwali | North Indian | No | 60 | Urban | 0 | 1 | 6 | A+ | 1 |
| 10 | 10 | Keerthi R | Female | 25-34 | Visakhapatnam | Andhra Pradesh | Graduate | Graphic Designer | Media | 3L-6L | 4 | Single | Telugu | Yes | 5 | Bike | 30 | 90 | No | No | 740 | 24 | Ugadi | South Indian | No | 82 | Urban | 0 | 3 | 7 | B- | 1 |
| 11 | 11 | Gurpreet Singh | Male | 25-34 | Amritsar | Punjab | Graduate | Logistics Supervisor | Logistics | 3L-6L | 5 | Married | Punjabi | Yes | 2 | Car | 40 | 60 | No | Yes | 715 | 18 | Gurpurab | Punjabi | No | 66 | Urban | 0 | 2 | 6 | O+ | 1 |
| 12 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- | 3 |
| 13 | 13 | Faizan Ali | Male | 25-34 | Jaipur | Rajasthan | Graduate | Hotel Front Office | Hospitality | 3L-6L | 4 | Single | Hindi | Yes | 4 | Bus | 25 | 45 | No | Yes | 705 | 20 | Diwali | Rajasthani | No | 64 | Urban | 0 | 1 | 5 | A+ | 0 |
| 14 | 14 | Kavya Nair | Female | 35-44 | Kochi | Kerala | Postgraduate | Physiotherapist | Healthcare | 6L-12L | 3 | Married | Malayalam | Yes | 2 | Car | 30 | 180 | Yes | Yes | 775 | 32 | Onam | South Indian | Yes | 86 | Urban | 0 | 2 | 6 | O+ | 2 |
| 15 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 |
| 16 | 16 | Anusha S | Female | 18-24 | Mysuru | Karnataka | Undergraduate | Student | NA | <= 1L | 5 | Single | Kannada | Yes | 6 | Bus | 20 | 110 | No | No | 0 | 14 | Dasara | South Indian | Yes | 74 | Urban | 0 | 1 | 6 | A- | 0 |
| 17 | 17 | Deepak Yadav | Male | 25-34 | Indore | Madhya Pradesh | Graduate | Field Sales | Private | 3L-6L | 6 | Married | Hindi | Yes | 5 | Bike | 50 | 50 | No | Yes | 710 | 25 | Diwali | North Indian | No | 69 | Urban | 0 | 1 | 6 | O+ | 0 |
| 18 | 18 | Trisha Dey | Female | 25-34 | Silchar | Assam | Graduate | Content Writer | Media | 3L-6L | 4 | Single | Assamese | Yes | 4 | Auto | 20 | 70 | No | Yes | 735 | 22 | Bihu | North East | No | 80 | Semi-Urban | 0 | 2 | 6 | B+ | 1 |
| 19 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ | 3 |
| 20 | 20 | Sapna Jain | Female | 25-34 | Bhopal | Madhya Pradesh | Graduate | Accountant | Private | 3L-6L | 5 | Single | Hindi | Yes | 3 | Bus | 30 | 60 | No | Yes | 725 | 20 | Diwali | North Indian | Yes | 76 | Semi-Urban | 0 | 1 | 7 | A+ | 1 |
| 21 | 21 | Harish Chandra | Male | 55-64 | Varanasi | Uttar Pradesh | Secondary | Priest | Religious | 1L-3L | 5 | Married | Hindi | No | 0 | Walk | 10 | 15 | No | Yes | 650 | 8 | Diwali | North Indian | Yes | 50 | Urban | 0 | 0 | 6 | O+ | 0 |
| 22 | 22 | Rekha Gupta | Female | 35-44 | Kanpur | Uttar Pradesh | Graduate | Bank Officer | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Car | 30 | 90 | Yes | Yes | 790 | 30 | Diwali | North Indian | Yes | 88 | Urban | 0 | 2 | 8 | AB+ | 2 |
| 23 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ | 3 |
| 24 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ | 3 |
| 25 | 25 | Rajeev Ranjan | Male | 25-34 | Ranchi | Jharkhand | Graduate | Police Sub-Inspector | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Bike | 20 | 60 | Yes | Yes | 740 | 25 | Chhath | North Indian | No | 72 | Urban | 0 | 1 | 7 | O+ | 2 |
| 26 | 26 | Nisha B | Female | 25-34 | Coimbatore | Tamil Nadu | Graduate | Quality Analyst | Manufacturing | 3L-6L | 4 | Single | Tamil | Yes | 3 | Bike | 25 | 80 | No | Yes | 735 | 22 | Pongal | South Indian | Yes | 82 | Urban | 0 | 2 | 7 | B- | 1 |
| 27 | 27 | Arvind Sinha | Male | 35-44 | Bhubaneswar | Odisha | Graduate | Government Clerk | Public | 3L-6L | 5 | Married | Odia | Yes | 2 | Bike | 15 | 45 | Yes | Yes | 720 | 18 | Raja Parba | Odia | Yes | 65 | Urban | 0 | 1 | 7 | A+ | 1 |
| 28 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- | 3 |
| 29 | 29 | Noor Zoya | Female | 18-24 | Aligarh | Uttar Pradesh | Undergraduate | Student | NA | <= 1L | 6 | Single | Urdu | Yes | 6 | Bus | 30 | 80 | No | No | 0 | 10 | Eid | North Indian | No | 72 | Semi-Urban | 0 | 1 | 6 | B+ | 0 |
| 30 | 30 | Anil Kumar | Male | 25-34 | Chandigarh | Chandigarh | Graduate | Civil Engineer | Construction | 6L-12L | 3 | Single | Hindi | Yes | 2 | Car | 30 | 100 | No | Yes | 770 | 25 | Holi | North Indian | No | 81 | Urban | 0 | 2 | 8 | A- | 2 |
| 31 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ | 3 |
| 32 | 32 | Shivam Patel | Male | 25-34 | Vadodara | Gujarat | Graduate | Mechanical Engineer | Manufacturing | 6L-12L | 4 | Single | Gujarati | Yes | 2 | Bike | 25 | 70 | No | Yes | 760 | 20 | Navratri | Gujarati | Yes | 80 | Urban | 0 | 2 | 7 | O+ | 2 |
Purpose: Frequency check of ranks for sanity.
proc freq data=people_ranked;
tables Income_Quartile;
title "Income Quartile Distribution";
run;
title;
Output:
The FREQ Procedure
Rank for Variable Income_INR
| Income_Quartile | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| 0 | 8 | 25.00 | 8 | 25.00 |
| 1 | 8 | 25.00 | 16 | 50.00 |
| 2 | 8 | 25.00 | 24 | 75.00 |
| 3 | 8 | 25.00 | 32 | 100.00 |
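With GROUPS=4, PROC RANK numbers the groups 0 to 3, and 0 holds the lowest incomes. A small sketch that labels the groups for reporting; the format name quartF and its label text are hypothetical.

```sas
/* Sketch: readable labels for the 0-3 quartile codes.
   quartF is a hypothetical format name. */
proc format;
   value quartF 0='Q1 (lowest)' 1='Q2' 2='Q3' 3='Q4 (highest)';
run;

proc freq data=people_ranked;
   tables Income_Quartile;
   format Income_Quartile quartF.;
run;
```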
7) SIMPLE DERIVED VARIABLES
Purpose: Add derived KPIs used across reports (e.g., Income per Capita).
data people_enriched;
set people_ranked;
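/* IFN evaluates all of its arguments, so IF/THEN avoids a division-by-zero note when Household_Size is 0 */
if Household_Size > 0 then Income_per_Capita = Income_INR / Household_Size;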
High_Digital = (Digital_Literacy_Score >= 80);
Busy_Internet_User = (Internet_Hours_Per_Day >= 4);
Long_Commute = (Commute_Minutes >= 45);
Active_Lifestyle = (Fitness_Mins_Per_Week >= 90);
format High_Digital Busy_Internet_User Long_Commute Active_Lifestyle yesnoF.;
run;
proc print;run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile | Income_per_Capita | High_Digital | Busy_Internet_User | Long_Commute | Active_Lifestyle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Arjun Mehta | Male | 25-34 | Mumbai | Maharashtra | Graduate | Software Engineer | IT | 6L-12L | 3 | Single | Hindi | Yes | 4 | Metro | 60 | 120 | Yes | Yes | 760 | 35 | Diwali | North Indian | No | 85 | Urban | 0 | 4 | 7 | O+ | 2 | 300000.00 | Yes | Yes | Yes | Yes |
| 2 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ | 2 | 275000.00 | Yes | No | No | Yes |
| 3 | 3 | Sameer Khan | Male | 18-24 | Hyderabad | Telangana | Graduate | Inside Sales | Private | 3L-6L | 5 | Single | Urdu | Yes | 5 | Bike | 35 | 60 | No | Yes | 720 | 22 | Eid | Hyderabadi | No | 72 | Urban | 0 | 2 | 5 | A+ | 0 | 90000.00 | No | Yes | No | No |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 | 466666.67 | Yes | No | Yes | No |
| 5 | 5 | Rohan Das | Male | 45-54 | Kolkata | West Bengal | Graduate | School Teacher | Public | 6L-12L | 4 | Married | Bengali | Yes | 2 | Bus | 50 | 80 | Yes | Yes | 768 | 20 | Durga Puja | Bengali | Yes | 78 | Urban | 0 | 1 | 7 | O- | 2 | 162500.00 | No | No | Yes | No |
| 6 | 6 | Ananya Roy | Female | 18-24 | Kolkata | West Bengal | Undergraduate | Student | NA | <= 1L | 5 | Single | Bengali | Yes | 6 | Metro | 30 | 100 | No | No | 0 | 12 | Durga Puja | Continental | No | 70 | Urban | 0 | 1 | 6 | A- | 0 | 0.00 | No | Yes | No | Yes |
| 7 | 7 | Amit Patil | Male | 25-34 | Pune | Maharashtra | Diploma | Mechanic | Private | 3L-6L | 6 | Married | Marathi | Yes | 2 | Bike | 45 | 40 | No | Yes | 700 | 15 | Ganesh Chaturthi | Maharashtrian | Yes | 68 | Semi-Urban | 0 | 1 | 5 | O+ | 0 | 63333.33 | No | No | Yes | No |
| 8 | 8 | Sana Parveen | Female | 25-34 | Patna | Bihar | Graduate | Nurse | Healthcare | 3L-6L | 5 | Single | Hindi | Yes | 3 | Auto | 35 | 120 | Yes | Yes | 730 | 20 | Eid | North Indian | No | 77 | Semi-Urban | 0 | 1 | 6 | B+ | 1 | 96000.00 | No | No | No | Yes |
| 9 | 9 | Ritesh Verma | Male | 45-54 | Lucknow | Uttar Pradesh | Graduate | Shop Owner | Informal | 3L-6L | 6 | Married | Hindi | Yes | 1 | Car | 25 | 30 | No | Yes | 690 | 40 | Diwali | North Indian | No | 60 | Urban | 0 | 1 | 6 | A+ | 1 | 91666.67 | No | No | No | No |
| 10 | 10 | Keerthi R | Female | 25-34 | Visakhapatnam | Andhra Pradesh | Graduate | Graphic Designer | Media | 3L-6L | 4 | Single | Telugu | Yes | 5 | Bike | 30 | 90 | No | No | 740 | 24 | Ugadi | South Indian | No | 82 | Urban | 0 | 3 | 7 | B- | 1 | 130000.00 | Yes | Yes | No | Yes |
| 11 | 11 | Gurpreet Singh | Male | 25-34 | Amritsar | Punjab | Graduate | Logistics Supervisor | Logistics | 3L-6L | 5 | Married | Punjabi | Yes | 2 | Car | 40 | 60 | No | Yes | 715 | 18 | Gurpurab | Punjabi | No | 66 | Urban | 0 | 2 | 6 | O+ | 1 | 120000.00 | No | No | No | No |
| 12 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- | 3 | 400000.00 | Yes | No | No | Yes |
| 13 | 13 | Faizan Ali | Male | 25-34 | Jaipur | Rajasthan | Graduate | Hotel Front Office | Hospitality | 3L-6L | 4 | Single | Hindi | Yes | 4 | Bus | 25 | 45 | No | Yes | 705 | 20 | Diwali | Rajasthani | No | 64 | Urban | 0 | 1 | 5 | A+ | 0 | 90000.00 | No | Yes | No | No |
| 14 | 14 | Kavya Nair | Female | 35-44 | Kochi | Kerala | Postgraduate | Physiotherapist | Healthcare | 6L-12L | 3 | Married | Malayalam | Yes | 2 | Car | 30 | 180 | Yes | Yes | 775 | 32 | Onam | South Indian | Yes | 86 | Urban | 0 | 2 | 6 | O+ | 2 | 300000.00 | Yes | No | No | Yes |
| 15 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 | 450000.00 | Yes | No | Yes | No |
| 16 | 16 | Anusha S | Female | 18-24 | Mysuru | Karnataka | Undergraduate | Student | NA | <= 1L | 5 | Single | Kannada | Yes | 6 | Bus | 20 | 110 | No | No | 0 | 14 | Dasara | South Indian | Yes | 74 | Urban | 0 | 1 | 6 | A- | 0 | 0.00 | No | Yes | No | Yes |
| 17 | 17 | Deepak Yadav | Male | 25-34 | Indore | Madhya Pradesh | Graduate | Field Sales | Private | 3L-6L | 6 | Married | Hindi | Yes | 5 | Bike | 50 | 50 | No | Yes | 710 | 25 | Diwali | North Indian | No | 69 | Urban | 0 | 1 | 6 | O+ | 0 | 70000.00 | No | Yes | Yes | No |
| 18 | 18 | Trisha Dey | Female | 25-34 | Silchar | Assam | Graduate | Content Writer | Media | 3L-6L | 4 | Single | Assamese | Yes | 4 | Auto | 20 | 70 | No | Yes | 735 | 22 | Bihu | North East | No | 80 | Semi-Urban | 0 | 2 | 6 | B+ | 1 | 115000.00 | Yes | Yes | No | No |
| 19 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ | 3 | 733333.33 | Yes | No | Yes | Yes |
| 20 | 20 | Sapna Jain | Female | 25-34 | Bhopal | Madhya Pradesh | Graduate | Accountant | Private | 3L-6L | 5 | Single | Hindi | Yes | 3 | Bus | 30 | 60 | No | Yes | 725 | 20 | Diwali | North Indian | Yes | 76 | Semi-Urban | 0 | 1 | 7 | A+ | 1 | 100000.00 | No | No | No | No |
| 21 | 21 | Harish Chandra | Male | 55-64 | Varanasi | Uttar Pradesh | Secondary | Priest | Religious | 1L-3L | 5 | Married | Hindi | No | 0 | Walk | 10 | 15 | No | Yes | 650 | 8 | Diwali | North Indian | Yes | 50 | Urban | 0 | 0 | 6 | O+ | 0 | 48000.00 | No | No | No | No |
| 22 | 22 | Rekha Gupta | Female | 35-44 | Kanpur | Uttar Pradesh | Graduate | Bank Officer | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Car | 30 | 90 | Yes | Yes | 790 | 30 | Diwali | North Indian | Yes | 88 | Urban | 0 | 2 | 8 | AB+ | 2 | 245000.00 | Yes | No | No | Yes |
| 23 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ | 3 | 300000.00 | Yes | Yes | No | No |
| 24 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ | 3 | 416666.67 | Yes | No | No | Yes |
| 25 | 25 | Rajeev Ranjan | Male | 25-34 | Ranchi | Jharkhand | Graduate | Police Sub-Inspector | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Bike | 20 | 60 | Yes | Yes | 740 | 25 | Chhath | North Indian | No | 72 | Urban | 0 | 1 | 7 | O+ | 2 | 175000.00 | No | No | No | No |
| 26 | 26 | Nisha B | Female | 25-34 | Coimbatore | Tamil Nadu | Graduate | Quality Analyst | Manufacturing | 3L-6L | 4 | Single | Tamil | Yes | 3 | Bike | 25 | 80 | No | Yes | 735 | 22 | Pongal | South Indian | Yes | 82 | Urban | 0 | 2 | 7 | B- | 1 | 130000.00 | Yes | No | No | No |
| 27 | 27 | Arvind Sinha | Male | 35-44 | Bhubaneswar | Odisha | Graduate | Government Clerk | Public | 3L-6L | 5 | Married | Odia | Yes | 2 | Bike | 15 | 45 | Yes | Yes | 720 | 18 | Raja Parba | Odia | Yes | 65 | Urban | 0 | 1 | 7 | A+ | 1 | 116000.00 | No | No | No | No |
| 28 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- | 3 | 433333.33 | Yes | No | No | Yes |
| 29 | 29 | Noor Zoya | Female | 18-24 | Aligarh | Uttar Pradesh | Undergraduate | Student | NA | <= 1L | 6 | Single | Urdu | Yes | 6 | Bus | 30 | 80 | No | No | 0 | 10 | Eid | North Indian | No | 72 | Semi-Urban | 0 | 1 | 6 | B+ | 0 | 0.00 | No | Yes | No | No |
| 30 | 30 | Anil Kumar | Male | 25-34 | Chandigarh | Chandigarh | Graduate | Civil Engineer | Construction | 6L-12L | 3 | Single | Hindi | Yes | 2 | Car | 30 | 100 | No | Yes | 770 | 25 | Holi | North Indian | No | 81 | Urban | 0 | 2 | 8 | A- | 2 | 283333.33 | Yes | No | No | Yes |
| 31 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ | 3 | 533333.33 | Yes | No | No | Yes |
| 32 | 32 | Shivam Patel | Male | 25-34 | Vadodara | Gujarat | Graduate | Mechanical Engineer | Manufacturing | 6L-12L | 4 | Single | Gujarati | Yes | 2 | Bike | 25 | 70 | No | Yes | 760 | 20 | Navratri | Gujarati | Yes | 80 | Urban | 0 | 2 | 7 | O+ | 2 | 195000.00 | Yes | No | No | No |
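The flags rely on SAS evaluating a comparison to 1 (true) or 0 (false). One caveat: a missing value compares as smaller than any number, so a person with a missing score would be flagged No instead of being left missing. The rows shown here have no missing scores, so the results above are unaffected. A hedged sketch of a missing-safe version of one flag; the dataset name flags_checked is hypothetical.

```sas
/* Sketch: keep the flag missing when the underlying score is missing.
   flags_checked is a hypothetical dataset name. */
data flags_checked;
   set people_enriched;
   if missing(Digital_Literacy_Score) then High_Digital = .;
   else High_Digital = (Digital_Literacy_Score >= 80);
run;
```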
Purpose: Quick stats on derived KPIs.
proc means data=people_enriched mean maxdec=2;
var Income_per_Capita;
class State;
title "Income per Capita by State — Mean";
run;
title;
Output:
The MEANS Procedure
Analysis Variable : Income_per_Capita
| State | N Obs | Mean |
|---|---|---|
| Andhra Pradesh | 1 | 130000.00 |
| Assam | 1 | 115000.00 |
| Bihar | 1 | 96000.00 |
| Chandigarh | 1 | 283333.33 |
| Delhi | 1 | 466666.67 |
| Gujarat | 3 | 309444.44 |
| Haryana | 1 | 450000.00 |
| Jharkhand | 1 | 175000.00 |
| Karnataka | 3 | 422222.22 |
| Kerala | 2 | 358333.33 |
| Madhya Pradesh | 2 | 85000.00 |
| Maharashtra | 3 | 254444.44 |
| Odisha | 1 | 116000.00 |
| Punjab | 1 | 120000.00 |
| Rajasthan | 1 | 90000.00 |
| Tamil Nadu | 2 | 202500.00 |
| Telangana | 1 | 90000.00 |
| Uttar Pradesh | 4 | 96166.67 |
| West Bengal | 2 | 81250.00 |
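This table is the mean of each person's Income_per_Capita, which differs from total income divided by total household members. For Gujarat, the three per-person values (300,000.00, 433,333.33, 195,000.00) average to 309,444.44, whereas 3,280,000 / 11 is roughly 298,181.82. A sketch that computes both side by side; the column names Mean_of_Ratios and Ratio_of_Totals are illustrative.

```sas
/* Sketch: compare the mean of per-person ratios with the ratio of state totals. */
proc sql;
   select State,
          mean(Income_per_Capita) as Mean_of_Ratios format=comma14.2,
          sum(Income_INR) / sum(Household_Size) as Ratio_of_Totals format=comma14.2
   from people_enriched
   group by State;
quit;
```

Which figure is appropriate depends on whether the dashboard describes a typical respondent or the state as a whole.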
8) PROC TRANSPOSE
Purpose: Reshape summary from long to wide for dashboard ingestion.
proc summary data=people_enriched nway;
class State;
var Income_INR Credit_Score Digital_Literacy_Score;
output out=state_means (drop=_type_ _freq_)
mean(Income_INR)=Avg_Income
mean(Credit_Score)=Avg_Credit
mean(Digital_Literacy_Score)=Avg_DigiLit;
run;
proc print;run;
Output:
| Obs | State | Avg_Income | Avg_Credit | Avg_DigiLit |
|---|---|---|---|---|
| 1 | Andhra Pradesh | 3L-6L | 740.000 | 82.0000 |
| 2 | Assam | 3L-6L | 735.000 | 80.0000 |
| 3 | Bihar | 3L-6L | 730.000 | 77.0000 |
| 4 | Chandigarh | 6L-12L | 770.000 | 81.0000 |
| 5 | Delhi | > 12L | 805.000 | 83.0000 |
| 6 | Gujarat | 6L-12L | 780.000 | 84.3333 |
| 7 | Haryana | > 12L | 820.000 | 90.0000 |
| 8 | Jharkhand | 6L-12L | 740.000 | 72.0000 |
| 9 | Karnataka | > 12L | 553.333 | 85.0000 |
| 10 | Kerala | 6L-12L | 795.000 | 88.0000 |
| 11 | Madhya Pradesh | 3L-6L | 717.500 | 72.5000 |
| 12 | Maharashtra | 6L-12L | 750.000 | 79.0000 |
| 13 | Odisha | 3L-6L | 720.000 | 65.0000 |
| 14 | Punjab | 3L-6L | 715.000 | 66.0000 |
| 15 | Rajasthan | 3L-6L | 705.000 | 64.0000 |
| 16 | Tamil Nadu | 6L-12L | 757.500 | 85.0000 |
| 17 | Telangana | 3L-6L | 720.000 | 72.0000 |
| 18 | Uttar Pradesh | 3L-6L | 532.500 | 67.5000 |
| 19 | West Bengal | 3L-6L | 384.000 | 74.0000 |
proc transpose data=state_means out=state_means_t prefix=Mean_;
id State; /* Each state becomes a column */
var Avg_Income; /* Variable you want to transpose */
run;
proc print data=state_means_t;
run;
Output:
| Obs | _NAME_ | _LABEL_ | Mean_Andhra Pradesh | Mean_Assam | Mean_Bihar | Mean_Chandigarh | Mean_Delhi | Mean_Gujarat | Mean_Haryana | Mean_Jharkhand | Mean_Karnataka | Mean_Kerala | Mean_Madhya Pradesh | Mean_Maharashtra | Mean_Odisha | Mean_Punjab | Mean_Rajasthan | Mean_Tamil Nadu | Mean_Telangana | Mean_Uttar Pradesh | Mean_West Bengal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Avg_Income | Annual Income (INR) | 3L-6L | 3L-6L | 3L-6L | 6L-12L | > 12L | 6L-12L | > 12L | 6L-12L | > 12L | 6L-12L | 3L-6L | 6L-12L | 3L-6L | 3L-6L | 3L-6L | 6L-12L | 3L-6L | 3L-6L | 3L-6L |
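Only Avg_Income is transposed above. A sketch, under the assumption that the dashboard wants one row per metric, lists all three means in the VAR statement and uses NAME= to rename the _NAME_ column. The names state_means_wide and Metric are hypothetical.

```sas
/* Sketch: one row per metric, one column per state.
   state_means_wide and Metric are hypothetical names. */
proc transpose data=state_means
               out=state_means_wide
               name=Metric;       /* replaces the default _NAME_ column */
   id State;                      /* each state becomes a column */
   var Avg_Income Avg_Credit Avg_DigiLit;
run;

proc print data=state_means_wide noobs;
run;
```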
9) PROC SQL SHOWCASE
Purpose: Create a city-level aggregate table for BI consumers.
proc sql;
create table city_agg as
select State, City,
count(*) as N,
mean(Age) as Avg_Age format=8.1,
mean(Income_INR) as Avg_Income format=comma12.,
mean(Credit_Score) as Avg_Credit format=8.1,
mean(Digital_Literacy_Score) as Avg_DigiLit format=8.1,
sum(UPI_Transactions_Month) as Total_UPI
from people_enriched
group by State, City
order by State, Avg_Income desc;
quit;
proc print data=city_agg (obs=10);run;
Output:
| Obs | State | City | N | Avg_Age | Avg_Income | Avg_Credit | Avg_DigiLit | Total_UPI |
|---|---|---|---|---|---|---|---|---|
| 1 | Andhra Pradesh | Visakhapatnam | 1 | 26.0 | 520,000 | 740.0 | 82.0 | 24 |
| 2 | Assam | Silchar | 1 | 27.0 | 460,000 | 735.0 | 80.0 | 22 |
| 3 | Bihar | Patna | 1 | 28.0 | 480,000 | 730.0 | 77.0 | 20 |
| 4 | Chandigarh | Chandigarh | 1 | 28.0 | 850,000 | 770.0 | 81.0 | 25 |
| 5 | Delhi | Delhi | 1 | 39.0 | 1,400,000 | 805.0 | 83.0 | 18 |
| 6 | Gujarat | Surat | 1 | 36.0 | 1,300,000 | 800.0 | 88.0 | 35 |
| 7 | Gujarat | Ahmedabad | 1 | 25.0 | 1,200,000 | 780.0 | 85.0 | 60 |
| 8 | Gujarat | Vadodara | 1 | 26.0 | 780,000 | 760.0 | 80.0 | 20 |
| 9 | Haryana | Gurugram | 1 | 47.0 | 1,800,000 | 820.0 | 90.0 | 38 |
| 10 | Jharkhand | Ranchi | 1 | 32.0 | 700,000 | 740.0 | 72.0 | 25 |
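The state-level credit averages shown earlier are pulled down by students whose Credit_Score is recorded as 0: Karnataka averages 553.333 across three people, although its two scored residents have 835 and 825. A minimal sketch, assuming a 0 means "no credit history" and not a real score, excludes those rows from the mean with a CASE expression. The table and column names are illustrative.

```sas
/* Sketch: average credit score with and without the 0 placeholders.
   Assumes Credit_Score = 0 marks "no score"; names are hypothetical. */
proc sql;
   create table state_credit as
   select State,
          count(*) as N,
          mean(Credit_Score) as Avg_Credit_All format=8.1,
          mean(case when Credit_Score > 0 then Credit_Score else . end)
              as Avg_Credit_Scored format=8.1
   from people_enriched
   group by State;
quit;
```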
Purpose: List the top 10 earners with their derived flags by sorting on income (descending) and printing the first 10 rows.
proc sql;
create table top10_earners as
select *
from people_enriched
order by Income_INR desc;
quit;
proc print data=top10_earners (obs=10);run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile | Income_per_Capita | High_Digital | Busy_Internet_User | Long_Commute | Active_Lifestyle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ | 3 | 733333.33 | Yes | No | Yes | Yes |
| 2 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 | 450000.00 | Yes | No | Yes | No |
| 3 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ | 3 | 533333.33 | Yes | No | No | Yes |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 | 466666.67 | Yes | No | Yes | No |
| 5 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- | 3 | 433333.33 | Yes | No | No | Yes |
| 6 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ | 3 | 416666.67 | Yes | No | No | Yes |
| 7 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- | 3 | 400000.00 | Yes | No | No | Yes |
| 8 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ | 3 | 300000.00 | Yes | Yes | No | No |
| 9 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ | 2 | 275000.00 | Yes | No | No | Yes |
| 10 | 22 | Rekha Gupta | Female | 35-44 | Kanpur | Uttar Pradesh | Graduate | Bank Officer | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Car | 30 | 90 | Yes | Yes | 790 | 30 | Diwali | North Indian | Yes | 88 | Urban | 0 | 2 | 8 | AB+ | 2 | 245000.00 | Yes | No | No | Yes |
data top10_earners;
set top10_earners(obs=10);
run;
proc print;run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile | Income_per_Capita | High_Digital | Busy_Internet_User | Long_Commute | Active_Lifestyle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ | 3 | 733333.33 | Yes | No | Yes | Yes |
| 2 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 | 450000.00 | Yes | No | Yes | No |
| 3 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ | 3 | 533333.33 | Yes | No | No | Yes |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 | 466666.67 | Yes | No | Yes | No |
| 5 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- | 3 | 433333.33 | Yes | No | No | Yes |
| 6 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ | 3 | 416666.67 | Yes | No | No | Yes |
| 7 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- | 3 | 400000.00 | Yes | No | No | Yes |
| 8 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ | 3 | 300000.00 | Yes | Yes | No | No |
| 9 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ | 2 | 275000.00 | Yes | No | No | Yes |
| 10 | 22 | Rekha Gupta | Female | 35-44 | Kanpur | Uttar Pradesh | Graduate | Bank Officer | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Car | 30 | 90 | Yes | Yes | 790 | 30 | Diwali | North Indian | Yes | 88 | Urban | 0 | 2 | 8 | AB+ | 2 | 245000.00 | Yes | No | No | Yes |
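The sort-then-trim pattern above can also be collapsed into one step. The following is a minimal sketch, assuming WORK.PEOPLE_ENRICHED exists as built earlier; the table name top10_outobs is hypothetical. The OUTOBS= option of PROC SQL caps the number of rows written, and SAS writes a warning to the log that the statement ended early, which is expected here. Ties at the 10th position are not handled.

```sas
/* Sketch: keep only the 10 highest incomes in a single PROC SQL step. */
/* Assumes WORK.PEOPLE_ENRICHED exists; top10_outobs is a hypothetical name. */
proc sql outobs=10;
  create table top10_outobs as
  select *
  from people_enriched
  order by Income_INR desc;
quit;

proc print data=top10_outobs; run;
```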
proc sql;
create table top10_earners_flags as
select t.*,
case when t.High_Digital=1 and t.Active_Lifestyle=1 then 'Digital+Active'
when t.High_Digital=1 then 'Digital'
when t.Active_Lifestyle=1 then 'Active'
else 'Other' end as Persona length=16
from top10_earners t;
quit;
proc print data=top10_earners_flags (obs=10);run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile | Income_per_Capita | High_Digital | Busy_Internet_User | Long_Commute | Active_Lifestyle | Persona |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ | 3 | 733333.33 | Yes | No | Yes | Yes | Digital+Active |
| 2 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 | 450000.00 | Yes | No | Yes | No | Digital |
| 3 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ | 3 | 533333.33 | Yes | No | No | Yes | Digital+Active |
| 4 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 | 466666.67 | Yes | No | Yes | No | Digital |
| 5 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- | 3 | 433333.33 | Yes | No | No | Yes | Digital+Active |
| 6 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ | 3 | 416666.67 | Yes | No | No | Yes | Digital+Active |
| 7 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- | 3 | 400000.00 | Yes | No | No | Yes | Digital+Active |
| 8 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ | 3 | 300000.00 | Yes | Yes | No | No | Digital |
| 9 | 2 | Priya Iyer | Female | 25-34 | Chennai | Tamil Nadu | Postgraduate | Data Analyst | IT | 6L-12L | 4 | Married | Tamil | Yes | 3 | Bike | 40 | 90 | Yes | Yes | 780 | 28 | Pongal | South Indian | Yes | 88 | Urban | 0 | 3 | 6 | B+ | 2 | 275000.00 | Yes | No | No | Yes | Digital+Active |
| 10 | 22 | Rekha Gupta | Female | 35-44 | Kanpur | Uttar Pradesh | Graduate | Bank Officer | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Car | 30 | 90 | Yes | Yes | 790 | 30 | Diwali | North Indian | Yes | 88 | Urban | 0 | 2 | 8 | AB+ | 2 | 245000.00 | Yes | No | No | Yes | Digital+Active |
Purpose: Join aggregates to individuals to provide “context columns”.
proc sql;
create table people_with_city_context as
select p.*, c.N as City_Count, c.Avg_Income as City_Avg_Income,
c.Avg_DigiLit as City_Avg_DigiLit
from people_enriched as p
left join city_agg as c
on p.State=c.State and p.City=c.City;
quit;
proc print data=people_with_city_context (obs=10);run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile | Income_per_Capita | High_Digital | Busy_Internet_User | Long_Commute | Active_Lifestyle | City_Count | City_Avg_Income | City_Avg_DigiLit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 10 | Keerthi R | Female | 25-34 | Visakhapatnam | Andhra Pradesh | Graduate | Graphic Designer | Media | 3L-6L | 4 | Single | Telugu | Yes | 5 | Bike | 30 | 90 | No | No | 740 | 24 | Ugadi | South Indian | No | 82 | Urban | 0 | 3 | 7 | B- | 1 | 130000.00 | Yes | Yes | No | Yes | 1 | 520,000 | 82.0 |
| 2 | 18 | Trisha Dey | Female | 25-34 | Silchar | Assam | Graduate | Content Writer | Media | 3L-6L | 4 | Single | Assamese | Yes | 4 | Auto | 20 | 70 | No | Yes | 735 | 22 | Bihu | North East | No | 80 | Semi-Urban | 0 | 2 | 6 | B+ | 1 | 115000.00 | Yes | Yes | No | No | 1 | 460,000 | 80.0 |
| 3 | 8 | Sana Parveen | Female | 25-34 | Patna | Bihar | Graduate | Nurse | Healthcare | 3L-6L | 5 | Single | Hindi | Yes | 3 | Auto | 35 | 120 | Yes | Yes | 730 | 20 | Eid | North Indian | No | 77 | Semi-Urban | 0 | 1 | 6 | B+ | 1 | 96000.00 | No | No | No | Yes | 1 | 480,000 | 77.0 |
| 4 | 30 | Anil Kumar | Male | 25-34 | Chandigarh | Chandigarh | Graduate | Civil Engineer | Construction | 6L-12L | 3 | Single | Hindi | Yes | 2 | Car | 30 | 100 | No | Yes | 770 | 25 | Holi | North Indian | No | 81 | Urban | 0 | 2 | 8 | A- | 2 | 283333.33 | Yes | No | No | Yes | 1 | 850,000 | 81.0 |
| 5 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 | 466666.67 | Yes | No | Yes | No | 1 | 1,400,000 | 83.0 |
| 6 | 23 | Aakash Jain | Male | 25-34 | Ahmedabad | Gujarat | Graduate | Entrepreneur | Startup | 6L-12L | 4 | Single | Gujarati | Yes | 5 | Car | 35 | 70 | No | Yes | 780 | 60 | Navratri | Gujarati | No | 85 | Urban | 0 | 4 | 8 | B+ | 3 | 300000.00 | Yes | Yes | No | No | 1 | 1,200,000 | 85.0 |
| 7 | 28 | Shruti Joshi | Female | 35-44 | Surat | Gujarat | Postgraduate | Fashion Buyer | Private | > 12L | 3 | Married | Gujarati | Yes | 3 | Car | 35 | 120 | No | Yes | 800 | 35 | Navratri | Gujarati | No | 88 | Urban | 0 | 3 | 8 | O- | 3 | 433333.33 | Yes | No | No | Yes | 1 | 1,300,000 | 88.0 |
| 8 | 32 | Shivam Patel | Male | 25-34 | Vadodara | Gujarat | Graduate | Mechanical Engineer | Manufacturing | 6L-12L | 4 | Single | Gujarati | Yes | 2 | Bike | 25 | 70 | No | Yes | 760 | 20 | Navratri | Gujarati | Yes | 80 | Urban | 0 | 2 | 7 | O+ | 2 | 195000.00 | Yes | No | No | No | 1 | 780,000 | 80.0 |
| 9 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 | 450000.00 | Yes | No | Yes | No | 1 | 1,800,000 | 90.0 |
| 10 | 25 | Rajeev Ranjan | Male | 25-34 | Ranchi | Jharkhand | Graduate | Police Sub-Inspector | Public | 6L-12L | 4 | Married | Hindi | Yes | 2 | Bike | 20 | 60 | Yes | Yes | 740 | 25 | Chhath | North Indian | No | 72 | Urban | 0 | 1 | 7 | O+ | 2 | 175000.00 | No | No | No | No | 1 | 700,000 | 72.0 |
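One way to use the context columns is to express each person's income relative to the city average. This is a small sketch, assuming people_with_city_context exists as created above; the dataset name people_city_ratio and the variable Income_vs_City are hypothetical. In the rows shown, every city has a single person, so the ratio is 1 for each of them; the column becomes informative only for cities with several people.

```sas
/* Sketch: income as a multiple of the city's average income. */
/* people_city_ratio and Income_vs_City are hypothetical names. */
data people_city_ratio;
  set people_with_city_context;
  /* Guard against a missing or zero city average before dividing */
  if City_Avg_Income > 0 then
    Income_vs_City = Income_INR / City_Avg_Income;
  format Income_vs_City 8.2;
run;

proc print data=people_city_ratio (obs=10);
  var Person_ID Name City Income_INR City_Avg_Income Income_vs_City;
run;
```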
10) GRAPHING (PROC SGPLOT)
Purpose: Visualize relationships for storytelling (income vs digital).
ods graphics on;
proc sgplot data=people_enriched;
reg x=Digital_Literacy_Score y=Income_INR / nomarkers;
scatter x=Digital_Literacy_Score y=Income_INR / group=Sector;
yaxis label="Annual Income (INR)";
xaxis label="Digital Literacy Score";
title "Income vs Digital Literacy by Sector";
run;
ods graphics off;
Output:
Purpose: Show distribution of commute times segmented by Rural/Urban.
ods graphics on;
proc sgplot data=people_enriched;
vbox Commute_Minutes / category=Rural_Urban;
title "Commute Minutes by Rural/Urban Category";
run;
ods graphics off;
Output:
11) REUSABLE MACROS
Purpose: %make_formats — Reapply formats/labels after transformations.
%macro make_formats;
proc datasets lib=work nolist;
modify people_enriched;
format Gender $genderF. Marital_Status $marF. Rural_Urban $ruF.
Smartphone_User yesnoF. Has_Health_Insurance yesnoF.
Voter_ID_Flag yesnoF. Veg_Flag yesnoF.;
quit;
%mend;
Purpose: %summary_by(classvar): grouped KPI table for any class variable, such as State or Sector.
%macro summary_by(classvar);
proc summary data=people_enriched nway;
class &classvar.;
var Income_INR Credit_Score Digital_Literacy_Score UPI_Transactions_Month;
output out=summary_&classvar.
n()=N
mean(Income_INR)=Avg_Income
mean(Credit_Score)=Avg_Credit
mean(Digital_Literacy_Score)=Avg_DigiLit
mean(UPI_Transactions_Month)=Avg_UPI;
title "Auto Summary by &classvar.";
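run;
/* PROC SUMMARY prints nothing by default, so print the output dataset */
proc print data=summary_&classvar.; run;
title;
%mend summary_by;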
Purpose: %add_person — Append one more record (for demonstrations).
%macro add_person(
Person_ID, Name, Gender, Age, City, State, Education_Level, Occupation, Sector,
Income_INR, Household_Size, Marital_Status, Language, Smartphone_User,
Internet_Hours_Per_Day, Commute_Mode, Commute_Minutes, Fitness_Mins_Per_Week,
Has_Health_Insurance, Voter_ID_Flag, Credit_Score, UPI_Transactions_Month,
Festival_Celebrated, Cuisine_Pref, Veg_Flag, Digital_Literacy_Score, Rural_Urban,
Disability_Flag, Travel_Trips_Year, Pollution_Concern_Score, Blood_Group);
data one_more;
length Name $28 City $20 State $20 Education_Level $20 Occupation $24 Sector $16
Marital_Status $1 Language $18 Commute_Mode $14 Festival_Celebrated $18
Cuisine_Pref $16 Rural_Urban $10 Blood_Group $3 Gender $1;
Person_ID=&Person_ID.;
Name="&Name."; Gender="&Gender."; Age=&Age.;
City="&City."; State="&State."; Education_Level="&Education_Level.";
Occupation="&Occupation."; Sector="&Sector."; Income_INR=&Income_INR.;
Household_Size=&Household_Size.; Marital_Status="&Marital_Status.";
Language="&Language."; Smartphone_User=&Smartphone_User.;
Internet_Hours_Per_Day=&Internet_Hours_Per_Day.;
Commute_Mode="&Commute_Mode."; Commute_Minutes=&Commute_Minutes.;
Fitness_Mins_Per_Week=&Fitness_Mins_Per_Week.;
Has_Health_Insurance=&Has_Health_Insurance.; Voter_ID_Flag=&Voter_ID_Flag.;
Credit_Score=&Credit_Score.; UPI_Transactions_Month=&UPI_Transactions_Month.;
Festival_Celebrated="&Festival_Celebrated.";
Cuisine_Pref="&Cuisine_Pref."; Veg_Flag=&Veg_Flag.;
Digital_Literacy_Score=&Digital_Literacy_Score.; Rural_Urban="&Rural_Urban.";
Disability_Flag=&Disability_Flag.; Travel_Trips_Year=&Travel_Trips_Year.;
Pollution_Concern_Score=&Pollution_Concern_Score.; Blood_Group="&Blood_Group.";
format Gender $genderF. Marital_Status $marF. Rural_Urban $ruF.
Smartphone_User yesnoF. Has_Health_Insurance yesnoF.
Voter_ID_Flag yesnoF. Veg_Flag yesnoF. Age agebandF.
Income_INR incomeF.;
run;
proc print;run;
proc append base=people_enriched data=one_more force; run;
%mend;
Purpose: %qc_minmax — Quick QC: check value ranges and flag outliers.
%macro qc_minmax(ds,var,low,high);
data qc_&var.;
set &ds.;
length QC_Flag $40;
if &var. < &low. then QC_Flag="Below Min";
else if &var. > &high. then QC_Flag="Above Max";
else QC_Flag="OK";
run;
proc freq data=qc_&var.;
tables QC_Flag / missing;
title "QC for &var.: Range [&low., &high.]";
run;
title;
%mend;
%summary_by(State);
Output:
| Obs | State | _TYPE_ | _FREQ_ | N | Avg_Income | Avg_Credit | Avg_DigiLit | Avg_UPI |
|---|---|---|---|---|---|---|---|---|
| 1 | Andhra Pradesh | 1 | 1 | 1 | 3L-6L | 740.000 | 82.0000 | 24.0000 |
| 2 | Assam | 1 | 1 | 1 | 3L-6L | 735.000 | 80.0000 | 22.0000 |
| 3 | Bihar | 1 | 1 | 1 | 3L-6L | 730.000 | 77.0000 | 20.0000 |
| 4 | Chandigarh | 1 | 1 | 1 | 6L-12L | 770.000 | 81.0000 | 25.0000 |
| 5 | Delhi | 1 | 1 | 1 | > 12L | 805.000 | 83.0000 | 18.0000 |
| 6 | Gujarat | 1 | 3 | 3 | 6L-12L | 780.000 | 84.3333 | 38.3333 |
| 7 | Haryana | 1 | 1 | 1 | > 12L | 820.000 | 90.0000 | 38.0000 |
| 8 | Jharkhand | 1 | 1 | 1 | 6L-12L | 740.000 | 72.0000 | 25.0000 |
| 9 | Karnataka | 1 | 3 | 3 | > 12L | 553.333 | 85.0000 | 33.0000 |
| 10 | Kerala | 1 | 2 | 2 | 6L-12L | 795.000 | 88.0000 | 26.0000 |
| 11 | Madhya Pradesh | 1 | 2 | 2 | 3L-6L | 717.500 | 72.5000 | 22.5000 |
| 12 | Maharashtra | 1 | 3 | 3 | 6L-12L | 750.000 | 79.0000 | 25.3333 |
| 13 | Odisha | 1 | 1 | 1 | 3L-6L | 720.000 | 65.0000 | 18.0000 |
| 14 | Punjab | 1 | 1 | 1 | 3L-6L | 715.000 | 66.0000 | 18.0000 |
| 15 | Rajasthan | 1 | 1 | 1 | 3L-6L | 705.000 | 64.0000 | 20.0000 |
| 16 | Tamil Nadu | 1 | 2 | 2 | 6L-12L | 757.500 | 85.0000 | 25.0000 |
| 17 | Telangana | 1 | 1 | 1 | 3L-6L | 720.000 | 72.0000 | 22.0000 |
| 18 | Uttar Pradesh | 1 | 4 | 4 | 3L-6L | 532.500 | 67.5000 | 22.0000 |
| 19 | West Bengal | 1 | 2 | 2 | 3L-6L | 384.000 | 74.0000 | 16.0000 |
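Avg_Income appears as a band such as "3L-6L" in the table above because statistics written by PROC SUMMARY inherit the format of the analysis variable, and Income_INR carries incomeF. The sketch below assumes summary_State exists in WORK (the macro writes it through its OUTPUT statement) and overrides the format at print time so the numeric means become visible.

```sas
/* Sketch: show the numeric mean instead of the inherited income band. */
/* Assumes WORK.SUMMARY_STATE was created by %summary_by(State). */
proc print data=summary_State noobs;
  format Avg_Income comma12.;   /* overrides incomeF. for this print only */
run;
```

A FORMAT statement inside PROC PRINT affects only that report; the stored dataset keeps the inherited format.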
%summary_by(Sector);
%make_formats;
Log:
%add_person(99,Rehan Qureshi,M,34,Noida,Uttar Pradesh,Graduate,UX Designer,IT,
1050000,3,S,Hindi,1,4,Metro,50,80,1,1,790,30,Diwali,North Indian,0,88,Urban,0,2,8,B+);
Output:
| Obs | Name | City | State | Education_Level | Occupation | Sector | Marital_Status | Language | Commute_Mode | Festival_Celebrated | Cuisine_Pref | Rural_Urban | Blood_Group | Gender | Person_ID | Age | Income_INR | Household_Size | Smartphone_User | Internet_Hours_Per_Day | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Veg_Flag | Digital_Literacy_Score | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Rehan Qureshi | Noida | Uttar Pradesh | Graduate | UX Designer | IT | Single | Hindi | Metro | Diwali | North Indian | Urban | B+ | Male | 99 | 25-34 | 6L-12L | 3 | Yes | 4 | 50 | 80 | Yes | Yes | 790 | 30 | No | 88 | 0 | 2 | 8 |
12) TARGETED QC CHECKS
Purpose: Use macro QC checker for critical numeric fields.
%qc_minmax(people_enriched, Credit_Score, 300, 900);
Output:
The FREQ Procedure
| QC_Flag | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Below Min | 3 | 8.82 | 3 | 8.82 |
| OK | 31 | 91.18 | 34 | 100.00 |
%qc_minmax(people_enriched, Digital_Literacy_Score, 0, 100);
Output:
The FREQ Procedure
| QC_Flag | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| OK | 34 | 100.00 | 34 | 100.00 |
%qc_minmax(people_enriched, Income_INR, 0, 10000000);
Output:
The FREQ Procedure
| QC_Flag | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| OK | 34 | 100.00 | 34 | 100.00 |
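SAS treats a missing numeric value as smaller than any number, so %qc_minmax reports missing values as "Below Min". A variant that separates the two cases is sketched below; the macro name %qc_minmax_miss and the "Missing" label are hypothetical additions, and the rest mirrors the macro defined in section 11.

```sas
/* Sketch: range check that reports missing values as their own category. */
/* qc_minmax_miss is a hypothetical name; logic otherwise follows %qc_minmax. */
%macro qc_minmax_miss(ds, var, low, high);
  data qc_&var.;
    set &ds.;
    length QC_Flag $40;
    if missing(&var.) then QC_Flag = "Missing";       /* checked first */
    else if &var. < &low. then QC_Flag = "Below Min";
    else if &var. > &high. then QC_Flag = "Above Max";
    else QC_Flag = "OK";
  run;

  proc freq data=qc_&var.;
    tables QC_Flag / missing;
    title "QC for &var.: Range [&low., &high.]";
  run;
  title;
%mend qc_minmax_miss;

%qc_minmax_miss(people_enriched, Credit_Score, 300, 900);
```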
13) AD-HOC BUSINESS QUERIES
Purpose: Identify "digital champion" cities for a pilot (SQL HAVING filter on grouped aggregates).
proc sql;
create table digital_champion_cities as
select State, City,
mean(Digital_Literacy_Score) as Avg_DigiLit format=8.1,
mean(Income_INR) as Avg_Income format=comma12.,
sum(UPI_Transactions_Month) as Total_UPI
from people_enriched
group by State, City
having Avg_DigiLit >= 80 and Total_UPI >= 200
order by Avg_DigiLit desc, Total_UPI desc;
quit;
proc print;run;
Log
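In the city_agg rows printed earlier, each city has one person and Total_UPI is at most 60, so a threshold of 200 is likely to return no rows, in which case PROC PRINT has nothing to display. The sketch below is one way to make that visible: it moves the cut-offs into macro variables (min_digi and min_upi are hypothetical names) and writes the row count to the log using the automatic macro variable SQLOBS.

```sas
/* Sketch: parameterised thresholds plus a row-count message in the log. */
/* min_digi and min_upi are hypothetical macro variable names. */
%let min_digi = 80;
%let min_upi  = 200;

proc sql;
  create table digital_champion_cities as
  select State, City,
         mean(Digital_Literacy_Score) as Avg_DigiLit format=8.1,
         mean(Income_INR)             as Avg_Income  format=comma12.,
         sum(UPI_Transactions_Month)  as Total_UPI
  from people_enriched
  group by State, City
  having calculated Avg_DigiLit >= &min_digi.
     and calculated Total_UPI   >= &min_upi.
  order by Avg_DigiLit desc, Total_UPI desc;
quit;

/* SQLOBS holds the number of rows processed by the last SQL statement */
%put NOTE: &sqlobs. city row(s) met the thresholds.;
```

Lowering min_upi and rerunning shows how sensitive the result is to the threshold.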
Purpose: Extract target segment: high-income, high-digital, insured.
data target_segment;
set people_enriched;
if Income_INR >= 1200000 and High_Digital=1 and Has_Health_Insurance=1;
run;
proc print;run;
Output:
| Obs | Person_ID | Name | Gender | Age | City | State | Education_Level | Occupation | Sector | Income_INR | Household_Size | Marital_Status | Language | Smartphone_User | Internet_Hours_Per_Day | Commute_Mode | Commute_Minutes | Fitness_Mins_Per_Week | Has_Health_Insurance | Voter_ID_Flag | Credit_Score | UPI_Transactions_Month | Festival_Celebrated | Cuisine_Pref | Veg_Flag | Digital_Literacy_Score | Rural_Urban | Disability_Flag | Travel_Trips_Year | Pollution_Concern_Score | Blood_Group | Income_Quartile | Income_per_Capita | High_Digital | Busy_Internet_User | Long_Commute | Active_Lifestyle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | Neha Sharma | Female | 35-44 | Delhi | Delhi | Postgraduate | Marketing Manager | Private | > 12L | 3 | Married | Hindi | Yes | 2 | Car | 55 | 75 | Yes | Yes | 805 | 18 | Holi | North Indian | No | 83 | Urban | 0 | 5 | 8 | AB+ | 3 | 466666.67 | Yes | No | Yes | No |
| 2 | 12 | Sonali Kulkarni | Female | 35-44 | Nagpur | Maharashtra | Postgraduate | HR Lead | Private | 6L-12L | 3 | Married | Marathi | Yes | 2 | Car | 35 | 120 | Yes | Yes | 790 | 26 | Diwali | Maharashtrian | No | 84 | Urban | 0 | 2 | 8 | AB- | 3 | 400000.00 | Yes | No | No | Yes |
| 3 | 15 | Manoj Kumar | Male | 45-54 | Gurugram | Haryana | Graduate | Project Manager | IT | > 12L | 4 | Married | Hindi | Yes | 2 | Car | 60 | 60 | Yes | Yes | 820 | 38 | Diwali | North Indian | No | 90 | Urban | 0 | 4 | 9 | B+ | 3 | 450000.00 | Yes | No | Yes | No |
| 4 | 19 | Vikram Rao | Male | 35-44 | Bengaluru | Karnataka | Postgraduate | Data Scientist | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Metro | 50 | 150 | Yes | Yes | 835 | 45 | Ugadi | South Indian | No | 92 | Urban | 0 | 3 | 9 | O+ | 3 | 733333.33 | Yes | No | Yes | Yes |
| 5 | 24 | Meera Pillai | Female | 45-54 | Thiruvananthapuram | Kerala | Postgraduate | School Principal | Public | > 12L | 3 | Married | Malayalam | Yes | 1 | Car | 25 | 100 | Yes | Yes | 815 | 20 | Onam | South Indian | Yes | 90 | Urban | 0 | 2 | 8 | A+ | 3 | 416666.67 | Yes | No | No | Yes |
| 6 | 31 | Devika Rao | Female | 25-34 | Mangalore | Karnataka | Graduate | Product Manager | IT | > 12L | 3 | Married | Kannada | Yes | 3 | Car | 35 | 100 | Yes | Yes | 825 | 40 | Ugadi | South Indian | No | 89 | Urban | 0 | 3 | 8 | AB+ | 3 | 533333.33 | Yes | No | No | Yes |
Purpose: Side-by-side comparison of segment vs overall means.
proc means data=people_enriched noprint;
var Income_INR Credit_Score Digital_Literacy_Score UPI_Transactions_Month;
output out=overall_means mean=;
run;
proc print;run;
Output:
| Obs | _TYPE_ | _FREQ_ | Income_INR | Credit_Score | Digital_Literacy_Score | UPI_Transactions_Month |
|---|---|---|---|---|---|---|
| 1 | 0 | 34 | 6L-12L | 690.235 | 78.2941 | 25.5 |
proc means data=target_segment noprint;
var Income_INR Credit_Score Digital_Literacy_Score UPI_Transactions_Month;
output out=segment_means mean=;
run;
proc print;run;
Output:
| Obs | _TYPE_ | _FREQ_ | Income_INR | Credit_Score | Digital_Literacy_Score | UPI_Transactions_Month |
|---|---|---|---|---|---|---|
| 1 | 0 | 6 | > 12L | 815 | 88 | 31.1667 |
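Purpose: Merge the two one-row mean tables side by side to feed a simple delta calc. With no BY statement, MERGE pairs rows by position, and the shared _TYPE_ and _FREQ_ columns take their values from segment_means (hence _FREQ_=6 in the output).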
data compare_means;
merge overall_means(rename=(Income_INR=All_Income Credit_Score=All_Credit
Digital_Literacy_Score=All_Digi UPI_Transactions_Month=All_UPI))
segment_means(rename=(Income_INR=Seg_Income Credit_Score=Seg_Credit
Digital_Literacy_Score=Seg_Digi UPI_Transactions_Month=Seg_UPI));
run;
proc print data=compare_means label noobs;
label All_Income="All Income" Seg_Income="Segment Income"
All_Credit="All Credit" Seg_Credit="Segment Credit"
All_Digi="All Digital Lit" Seg_Digi="Segment Digital Lit"
All_UPI="All UPI" Seg_UPI="Segment UPI";
title "Segment vs Overall Averages";
run;
title;
Output:
| _TYPE_ | _FREQ_ | All Income | All Credit | All Digital Lit | All UPI | Segment Income | Segment Credit | Segment Digital Lit | Segment UPI |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 6L-12L | 690.235 | 78.2941 | 25.5 | > 12L | 815 | 88 | 31.1667 |
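The merged row is meant to feed a delta calculation, which could look like the sketch below. It assumes compare_means exists as created above; the dataset name compare_delta and the Delta_ and Pct_ variables are hypothetical. From the printed means, the credit, digital literacy and UPI deltas come to roughly 124.8, 9.7 and 5.7; the income delta cannot be read from the output because both income columns display as bands.

```sas
/* Sketch: segment-minus-overall deltas from the one-row comparison table. */
/* compare_delta and the Delta_/Pct_ variables are hypothetical names. */
data compare_delta;
  set compare_means;
  Delta_Income = Seg_Income - All_Income;
  Delta_Credit = Seg_Credit - All_Credit;
  Delta_Digi   = Seg_Digi   - All_Digi;
  Delta_UPI    = Seg_UPI    - All_UPI;
  /* Relative income uplift in percent, guarded against a zero base */
  if All_Income > 0 then Pct_Income = 100 * Delta_Income / All_Income;
  format Delta_Income comma12. Pct_Income 8.1;
run;

proc print data=compare_delta noobs;
  var Delta_: Pct_Income;   /* prefix list selects all Delta_ variables */
run;
```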