195.INDIAN CUISINES DATA ANALYSIS USING DATA STEP | LENGTH | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC SQL | PROC UNIVARIATE | MACROS | DERIVED VARIABLES | EXPORT IN SAS FOR CULINARY INSIGHTS
INDIAN CUISINES DATA ANALYSIS USING DATA STEP | LENGTH | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC SQL | PROC UNIVARIATE | MACROS | DERIVED VARIABLES | EXPORT IN SAS FOR CULINARY INSIGHTS
/*Creating A unique dataset on different types of Indian cuisines.*/
Step 1: Creating the Dataset – Indian Cuisines
A dataset called Indian_Cuisines using DATALINES in SAS. It includes:
Cuisine_ID: Unique identifier
Cuisine_Name: Name of the cuisine
Region: Region of origin
Spice_Level: Low, Medium, High
Main_Ingredient: Primary ingredient
Veg_NonVeg: Type of dish
Popularity_Score: Numeric rating from 1 to 10
Preparation_Time: Time in minutes
data Indian_Cuisines;
length Cuisine_Name $20 Region $15 Spice_Level $10
Main_Ingredient $15 Veg_NonVeg $10;
input Cuisine_ID Cuisine_Name $ Region $ Spice_Level $
Main_Ingredient $ Veg_NonVeg $ Popularity_Score Preparation_Time;
datalines;
1 Biryani South High Rice NonVeg 9 60
2 ButterChicken North Medium Chicken NonVeg 8 50
3 Dosa South Medium Rice Veg 7 30
4 RoganJosh North High Lamb NonVeg 8 55
5 Pakhala East Low Rice Veg 6 20
6 LittiChokha East Medium Wheat Veg 7 35
7 Idli South Low Rice Veg 8 25
8 PavBhaji West Medium MixedVeg Veg 9 40
9 Poha West Low Rice Veg 7 15
10 CholeBhature North Medium Chickpeas Veg 9 45
11 HyderabadiHaleem South High Meat NonVeg 8 70
12 Sambhar South Medium Lentils Veg 8 35
13 DalTadka North Low Lentils Veg 7 30
14 MacherJhol East Medium Fish NonVeg 6 40
15 Vindaloo West High Pork NonVeg 7 50
16 RajmaChawal North Medium KidneyBeans Veg 8 35
17 FishCurry East High Fish NonVeg 7 45
18 Khandvi West Low Gramflour Veg 6 20
19 AlooParatha North Medium Potato Veg 8 30
20 Khichdi PanIndia Low Lentils Veg 9 25
;
run;
proc print;run;
Output:
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
1 | Biryani | South | High | Rice | NonVeg | 1 | 9 | 60 |
2 | ButterChicken | North | Medium | Chicken | NonVeg | 2 | 8 | 50 |
3 | Dosa | South | Medium | Rice | Veg | 3 | 7 | 30 |
4 | RoganJosh | North | High | Lamb | NonVeg | 4 | 8 | 55 |
5 | Pakhala | East | Low | Rice | Veg | 5 | 6 | 20 |
6 | LittiChokha | East | Medium | Wheat | Veg | 6 | 7 | 35 |
7 | Idli | South | Low | Rice | Veg | 7 | 8 | 25 |
8 | PavBhaji | West | Medium | MixedVeg | Veg | 8 | 9 | 40 |
9 | Poha | West | Low | Rice | Veg | 9 | 7 | 15 |
10 | CholeBhature | North | Medium | Chickpeas | Veg | 10 | 9 | 45 |
11 | HyderabadiHaleem | South | High | Meat | NonVeg | 11 | 8 | 70 |
12 | Sambhar | South | Medium | Lentils | Veg | 12 | 8 | 35 |
13 | DalTadka | North | Low | Lentils | Veg | 13 | 7 | 30 |
14 | MacherJhol | East | Medium | Fish | NonVeg | 14 | 6 | 40 |
15 | Vindaloo | West | High | Pork | NonVeg | 15 | 7 | 50 |
16 | RajmaChawal | North | Medium | KidneyBeans | Veg | 16 | 8 | 35 |
17 | FishCurry | East | High | Fish | NonVeg | 17 | 7 | 45 |
18 | Khandvi | West | Low | Gramflour | Veg | 18 | 6 | 20 |
19 | AlooParatha | North | Medium | Potato | Veg | 19 | 8 | 30 |
20 | Khichdi | PanIndia | Low | Lentils | Veg | 20 | 9 | 25 |
Step 2: Using Basic PROC Steps
a) PROC PRINT
proc print data=Indian_Cuisines;
title "All Indian Cuisines";
run;
Output:
All Indian Cuisines |
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
1 | Biryani | South | High | Rice | NonVeg | 1 | 9 | 60 |
2 | ButterChicken | North | Medium | Chicken | NonVeg | 2 | 8 | 50 |
3 | Dosa | South | Medium | Rice | Veg | 3 | 7 | 30 |
4 | RoganJosh | North | High | Lamb | NonVeg | 4 | 8 | 55 |
5 | Pakhala | East | Low | Rice | Veg | 5 | 6 | 20 |
6 | LittiChokha | East | Medium | Wheat | Veg | 6 | 7 | 35 |
7 | Idli | South | Low | Rice | Veg | 7 | 8 | 25 |
8 | PavBhaji | West | Medium | MixedVeg | Veg | 8 | 9 | 40 |
9 | Poha | West | Low | Rice | Veg | 9 | 7 | 15 |
10 | CholeBhature | North | Medium | Chickpeas | Veg | 10 | 9 | 45 |
11 | HyderabadiHaleem | South | High | Meat | NonVeg | 11 | 8 | 70 |
12 | Sambhar | South | Medium | Lentils | Veg | 12 | 8 | 35 |
13 | DalTadka | North | Low | Lentils | Veg | 13 | 7 | 30 |
14 | MacherJhol | East | Medium | Fish | NonVeg | 14 | 6 | 40 |
15 | Vindaloo | West | High | Pork | NonVeg | 15 | 7 | 50 |
16 | RajmaChawal | North | Medium | KidneyBeans | Veg | 16 | 8 | 35 |
17 | FishCurry | East | High | Fish | NonVeg | 17 | 7 | 45 |
18 | Khandvi | West | Low | Gramflour | Veg | 18 | 6 | 20 |
19 | AlooParatha | North | Medium | Potato | Veg | 19 | 8 | 30 |
20 | Khichdi | PanIndia | Low | Lentils | Veg | 20 | 9 | 25 |
b) PROC SORT – Sort by Region and Spice Level
proc sort data=Indian_Cuisines out=Sorted_Cuisines;
by Region Spice_Level;
run;
proc print;run;
Output:
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
1 | FishCurry | East | High | Fish | NonVeg | 17 | 7 | 45 |
2 | Pakhala | East | Low | Rice | Veg | 5 | 6 | 20 |
3 | LittiChokha | East | Medium | Wheat | Veg | 6 | 7 | 35 |
4 | MacherJhol | East | Medium | Fish | NonVeg | 14 | 6 | 40 |
5 | RoganJosh | North | High | Lamb | NonVeg | 4 | 8 | 55 |
6 | DalTadka | North | Low | Lentils | Veg | 13 | 7 | 30 |
7 | ButterChicken | North | Medium | Chicken | NonVeg | 2 | 8 | 50 |
8 | CholeBhature | North | Medium | Chickpeas | Veg | 10 | 9 | 45 |
9 | RajmaChawal | North | Medium | KidneyBeans | Veg | 16 | 8 | 35 |
10 | AlooParatha | North | Medium | Potato | Veg | 19 | 8 | 30 |
11 | Khichdi | PanIndia | Low | Lentils | Veg | 20 | 9 | 25 |
12 | Biryani | South | High | Rice | NonVeg | 1 | 9 | 60 |
13 | HyderabadiHaleem | South | High | Meat | NonVeg | 11 | 8 | 70 |
14 | Idli | South | Low | Rice | Veg | 7 | 8 | 25 |
15 | Dosa | South | Medium | Rice | Veg | 3 | 7 | 30 |
16 | Sambhar | South | Medium | Lentils | Veg | 12 | 8 | 35 |
17 | Vindaloo | West | High | Pork | NonVeg | 15 | 7 | 50 |
18 | Poha | West | Low | Rice | Veg | 9 | 7 | 15 |
19 | Khandvi | West | Low | Gramflour | Veg | 18 | 6 | 20 |
20 | PavBhaji | West | Medium | MixedVeg | Veg | 8 | 9 | 40 |
c) PROC FREQ – Frequency of Spice Levels
proc freq data=Indian_Cuisines;
tables Spice_Level;
title "Frequency of Spice Levels";
run;
Output:
Frequency of Spice Levels |
Spice_Level | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
High | 5 | 25.00 | 5 | 25.00 |
Low | 6 | 30.00 | 11 | 55.00 |
Medium | 9 | 45.00 | 20 | 100.00 |
d) PROC MEANS – Average Preparation Time by Veg/Non-Veg
proc means data=Indian_Cuisines noprint;
class Veg_NonVeg;
var Preparation_Time;
output out=PrepTimeAvg mean=Avg_Prep_Time;
run;
proc print data=PrepTimeAvg;
title "Average Preparation Time by Veg/Non-Veg";
run;
Output:
Average Preparation Time by Veg/Non-Veg |
Obs | Veg_NonVeg | _TYPE_ | _FREQ_ | Avg_Prep_Time |
---|---|---|---|---|
1 | 0 | 20 | 37.7500 | |
2 | NonVeg | 1 | 7 | 52.8571 |
3 | Veg | 1 | 13 | 29.6154 |
Step 3: Using PROC SQL
a) Find Top 5 Most Popular Dishes
proc sql outobs=5;
select Cuisine_Name, Popularity_Score
from Indian_Cuisines
order by Popularity_Score desc;
quit;
Output:
Cuisine_Name | Popularity_Score |
---|---|
PavBhaji | 9 |
Khichdi | 9 |
Biryani | 9 |
CholeBhature | 9 |
Idli | 8 |
b) Count of Veg and Non-Veg Items
proc sql;
select Veg_NonVeg, count(*) as Count
from Indian_Cuisines
group by Veg_NonVeg;
quit;
Output:
Veg_NonVeg | Count |
---|---|
NonVeg | 7 |
Veg | 13 |
c) Average Popularity Score per Region
proc sql;
select Region, avg(Popularity_Score) as Avg_Score format=4.2
from Indian_Cuisines
group by Region;
quit;
Output:
Region | Avg_Score |
---|---|
East | 6.50 |
North | 8.00 |
PanIndia | 9.00 |
South | 8.00 |
West | 7.25 |
Step 4: Advanced Use with Macros
Task: Create summary reports per Region using a Macro
%macro Region_Report(region);
proc print data=Indian_Cuisines;
where Region = "®ion";
title "Cuisine Report for Region: ®ion";
run;
proc means data=Indian_Cuisines;
where Region = "®ion";
var Popularity_Score Preparation_Time;
title "Average Popularity and Preparation Time for ®ion";
run;
%mend;
%Region_Report(North);
Output:
Cuisine Report for Region: North |
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
2 | ButterChicken | North | Medium | Chicken | NonVeg | 2 | 8 | 50 |
4 | RoganJosh | North | High | Lamb | NonVeg | 4 | 8 | 55 |
10 | CholeBhature | North | Medium | Chickpeas | Veg | 10 | 9 | 45 |
13 | DalTadka | North | Low | Lentils | Veg | 13 | 7 | 30 |
16 | RajmaChawal | North | Medium | KidneyBeans | Veg | 16 | 8 | 35 |
19 | AlooParatha | North | Medium | Potato | Veg | 19 | 8 | 30 |
%Region_Report(South);
Output:
Cuisine Report for Region: South |
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
1 | Biryani | South | High | Rice | NonVeg | 1 | 9 | 60 |
3 | Dosa | South | Medium | Rice | Veg | 3 | 7 | 30 |
7 | Idli | South | Low | Rice | Veg | 7 | 8 | 25 |
11 | HyderabadiHaleem | South | High | Meat | NonVeg | 11 | 8 | 70 |
12 | Sambhar | South | Medium | Lentils | Veg | 12 | 8 | 35 |
%Region_Report(East);
Output:
Cuisine Report for Region: East |
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
5 | Pakhala | East | Low | Rice | Veg | 5 | 6 | 20 |
6 | LittiChokha | East | Medium | Wheat | Veg | 6 | 7 | 35 |
14 | MacherJhol | East | Medium | Fish | NonVeg | 14 | 6 | 40 |
17 | FishCurry | East | High | Fish | NonVeg | 17 | 7 | 45 |
%Region_Report(West);
Output:
Cuisine Report for Region: West |
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time |
---|---|---|---|---|---|---|---|---|
8 | PavBhaji | West | Medium | MixedVeg | Veg | 8 | 9 | 40 |
9 | Poha | West | Low | Rice | Veg | 9 | 7 | 15 |
15 | Vindaloo | West | High | Pork | NonVeg | 15 | 7 | 50 |
18 | Khandvi | West | Low | Gramflour | Veg | 18 | 6 | 20 |
Step 5: Conditional Logic with Macro Parameters
Task: Filter Cuisines Based on Spice Level and Type
%macro Filter_Cuisines(spice, type);
proc sql;
select Cuisine_Name, Region, Main_Ingredient, Preparation_Time
from Indian_Cuisines
where Spice_Level="&spice" and Veg_NonVeg="&type";
quit;
%mend;
%Filter_Cuisines(High, NonVeg);
Output:
Cuisine_Name | Region | Main_Ingredient | Preparation_Time |
---|---|---|---|
Biryani | South | Rice | 60 |
RoganJosh | North | Lamb | 55 |
HyderabadiHaleem | South | Meat | 70 |
Vindaloo | West | Pork | 50 |
FishCurry | East | Fish | 45 |
Step 6: Creating a Flag Variable – Fast Food (≤30 mins)
data Fast_Food;
set Indian_Cuisines;
if Preparation_Time <= 30 then FastFood="Yes";
else FastFood="No";
run;
proc print data=Fast_Food;
title "Cuisines Classified as Fast Food (<=30 mins)";
run;
Output:
Cuisines Classified as Fast Food (<=30 mins) |
Obs | Cuisine_Name | Region | Spice_Level | Main_Ingredient | Veg_NonVeg | Cuisine_ID | Popularity_Score | Preparation_Time | FastFood |
---|---|---|---|---|---|---|---|---|---|
1 | Biryani | South | High | Rice | NonVeg | 1 | 9 | 60 | No |
2 | ButterChicken | North | Medium | Chicken | NonVeg | 2 | 8 | 50 | No |
3 | Dosa | South | Medium | Rice | Veg | 3 | 7 | 30 | Yes |
4 | RoganJosh | North | High | Lamb | NonVeg | 4 | 8 | 55 | No |
5 | Pakhala | East | Low | Rice | Veg | 5 | 6 | 20 | Yes |
6 | LittiChokha | East | Medium | Wheat | Veg | 6 | 7 | 35 | No |
7 | Idli | South | Low | Rice | Veg | 7 | 8 | 25 | Yes |
8 | PavBhaji | West | Medium | MixedVeg | Veg | 8 | 9 | 40 | No |
9 | Poha | West | Low | Rice | Veg | 9 | 7 | 15 | Yes |
10 | CholeBhature | North | Medium | Chickpeas | Veg | 10 | 9 | 45 | No |
11 | HyderabadiHaleem | South | High | Meat | NonVeg | 11 | 8 | 70 | No |
12 | Sambhar | South | Medium | Lentils | Veg | 12 | 8 | 35 | No |
13 | DalTadka | North | Low | Lentils | Veg | 13 | 7 | 30 | Yes |
14 | MacherJhol | East | Medium | Fish | NonVeg | 14 | 6 | 40 | No |
15 | Vindaloo | West | High | Pork | NonVeg | 15 | 7 | 50 | No |
16 | RajmaChawal | North | Medium | KidneyBeans | Veg | 16 | 8 | 35 | No |
17 | FishCurry | East | High | Fish | NonVeg | 17 | 7 | 45 | No |
18 | Khandvi | West | Low | Gramflour | Veg | 18 | 6 | 20 | Yes |
19 | AlooParatha | North | Medium | Potato | Veg | 19 | 8 | 30 | Yes |
20 | Khichdi | PanIndia | Low | Lentils | Veg | 20 | 9 | 25 | Yes |
Step 7: Using PROC UNIVARIATE – Analyze Time Distribution
proc univariate data=Indian_Cuisines;
var Preparation_Time;
histogram;
title "Preparation Time Distribution of Indian Cuisines";
run;
Output:
Preparation Time Distribution of Indian Cuisines |
Moments | |||
---|---|---|---|
N | 20 | Sum Weights | 20 |
Mean | 37.75 | Sum Observations | 755 |
Std Deviation | 14.3705657 | Variance | 206.513158 |
Skewness | 0.50422081 | Kurtosis | -0.1578856 |
Uncorrected SS | 32425 | Corrected SS | 3923.75 |
Coeff Variation | 38.0677236 | Std Error Mean | 3.21335617 |
Basic Statistical Measures | |||
---|---|---|---|
Location | Variability | ||
Mean | 37.75000 | Std Deviation | 14.37057 |
Median | 35.00000 | Variance | 206.51316 |
Mode | 30.00000 | Range | 55.00000 |
Interquartile Range | 20.00000 |
Note: The mode displayed is the smallest of 2 modes with a count of 3. |
Tests for Location: Mu0=0 | ||||
---|---|---|---|---|
Test | Statistic | p Value | ||
Student's t | t | 11.74784 | Pr > |t| | <.0001 |
Sign | M | 10 | Pr >= |M| | <.0001 |
Signed Rank | S | 105 | Pr >= |S| | <.0001 |
Quantiles (Definition 5) | |
---|---|
Level | Quantile |
100% Max | 70.0 |
99% | 70.0 |
95% | 65.0 |
90% | 57.5 |
75% Q3 | 47.5 |
50% Median | 35.0 |
25% Q1 | 27.5 |
10% | 20.0 |
5% | 17.5 |
1% | 15.0 |
0% Min | 15.0 |
Extreme Observations | |||
---|---|---|---|
Lowest | Highest | ||
Value | Obs | Value | Obs |
15 | 9 | 50 | 2 |
20 | 18 | 50 | 15 |
20 | 5 | 55 | 4 |
25 | 20 | 60 | 1 |
25 | 7 | 70 | 11 |
Step 8: Creating a Macro for Top N Dishes by Popularity
%macro Top_Dishes(n);
proc sql outobs=&n;
select Cuisine_Name, Region, Popularity_Score
from Indian_Cuisines
order by Popularity_Score desc;
quit;
%mend;
%Top_Dishes(3);
Output:
Cuisine_Name | Region | Popularity_Score |
---|---|---|
PavBhaji | West | 9 |
Khichdi | PanIndia | 9 |
Biryani | South | 9 |
Step 9: Exporting Final Datasets
proc export data=Fast_Food
outfile="Fast_Food_Cuisines.csv"
dbms=csv
replace;
run;
Comments
Post a Comment