195.INDIAN CUISINES DATA ANALYSIS USING DATA STEP | LENGTH | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC SQL | PROC UNIVARIATE | MACROS | DERIVED VARIABLES | EXPORT IN SAS FOR CULINARY INSIGHTS

INDIAN CUISINES DATA ANALYSIS USING DATA STEP | LENGTH | PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC SQL | PROC UNIVARIATE | MACROS | DERIVED VARIABLES | EXPORT IN SAS FOR CULINARY INSIGHTS

/*Creating A unique dataset on different types of Indian cuisines.*/

Step 1: Creating the Dataset – Indian Cuisines

A dataset called Indian_Cuisines using DATALINES in SAS. It includes:

Cuisine_ID: Unique identifier

Cuisine_Name: Name of the cuisine

Region: Region of origin

Spice_Level: Low, Medium, High

Main_Ingredient: Primary ingredient

Veg_NonVeg: Type of dish

Popularity_Score: Numeric rating from 1 to 10

Preparation_Time: Time in minutes


data Indian_Cuisines;

    length Cuisine_Name $20 Region $15 Spice_Level $10 

           Main_Ingredient $15 Veg_NonVeg $10;

    input Cuisine_ID Cuisine_Name $ Region $ Spice_Level $ 

          Main_Ingredient $ Veg_NonVeg $ Popularity_Score Preparation_Time;

    datalines;

1 Biryani South High Rice NonVeg 9 60

2 ButterChicken North Medium Chicken NonVeg 8 50

3 Dosa South Medium Rice Veg 7 30

4 RoganJosh North High Lamb NonVeg 8 55

5 Pakhala East Low Rice Veg 6 20

6 LittiChokha East Medium Wheat Veg 7 35

7 Idli South Low Rice Veg 8 25

8 PavBhaji West Medium MixedVeg Veg 9 40

9 Poha West Low Rice Veg 7 15

10 CholeBhature North Medium Chickpeas Veg 9 45

11 HyderabadiHaleem South High Meat NonVeg 8 70

12 Sambhar South Medium Lentils Veg 8 35

13 DalTadka North Low Lentils Veg 7 30

14 MacherJhol East Medium Fish NonVeg 6 40

15 Vindaloo West High Pork NonVeg 7 50

16 RajmaChawal North Medium KidneyBeans Veg 8 35

17 FishCurry East High Fish NonVeg 7 45

18 Khandvi West Low Gramflour Veg 6 20

19 AlooParatha North Medium Potato Veg 8 30

20 Khichdi PanIndia Low Lentils Veg 9 25

;

run;

proc print;run;

Output:

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
1 Biryani South High Rice NonVeg 1 9 60
2 ButterChicken North Medium Chicken NonVeg 2 8 50
3 Dosa South Medium Rice Veg 3 7 30
4 RoganJosh North High Lamb NonVeg 4 8 55
5 Pakhala East Low Rice Veg 5 6 20
6 LittiChokha East Medium Wheat Veg 6 7 35
7 Idli South Low Rice Veg 7 8 25
8 PavBhaji West Medium MixedVeg Veg 8 9 40
9 Poha West Low Rice Veg 9 7 15
10 CholeBhature North Medium Chickpeas Veg 10 9 45
11 HyderabadiHaleem South High Meat NonVeg 11 8 70
12 Sambhar South Medium Lentils Veg 12 8 35
13 DalTadka North Low Lentils Veg 13 7 30
14 MacherJhol East Medium Fish NonVeg 14 6 40
15 Vindaloo West High Pork NonVeg 15 7 50
16 RajmaChawal North Medium KidneyBeans Veg 16 8 35
17 FishCurry East High Fish NonVeg 17 7 45
18 Khandvi West Low Gramflour Veg 18 6 20
19 AlooParatha North Medium Potato Veg 19 8 30
20 Khichdi PanIndia Low Lentils Veg 20 9 25

Step 2: Using Basic PROC Steps

a) PROC PRINT

proc print data=Indian_Cuisines;

    title "All Indian Cuisines";

run;

Output:

All Indian Cuisines

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
1 Biryani South High Rice NonVeg 1 9 60
2 ButterChicken North Medium Chicken NonVeg 2 8 50
3 Dosa South Medium Rice Veg 3 7 30
4 RoganJosh North High Lamb NonVeg 4 8 55
5 Pakhala East Low Rice Veg 5 6 20
6 LittiChokha East Medium Wheat Veg 6 7 35
7 Idli South Low Rice Veg 7 8 25
8 PavBhaji West Medium MixedVeg Veg 8 9 40
9 Poha West Low Rice Veg 9 7 15
10 CholeBhature North Medium Chickpeas Veg 10 9 45
11 HyderabadiHaleem South High Meat NonVeg 11 8 70
12 Sambhar South Medium Lentils Veg 12 8 35
13 DalTadka North Low Lentils Veg 13 7 30
14 MacherJhol East Medium Fish NonVeg 14 6 40
15 Vindaloo West High Pork NonVeg 15 7 50
16 RajmaChawal North Medium KidneyBeans Veg 16 8 35
17 FishCurry East High Fish NonVeg 17 7 45
18 Khandvi West Low Gramflour Veg 18 6 20
19 AlooParatha North Medium Potato Veg 19 8 30
20 Khichdi PanIndia Low Lentils Veg 20 9 25

b) PROC SORT – Sort by Region and Spice Level

proc sort data=Indian_Cuisines out=Sorted_Cuisines;

    by Region Spice_Level;

run;

proc print;run;

Output:

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
1 FishCurry East High Fish NonVeg 17 7 45
2 Pakhala East Low Rice Veg 5 6 20
3 LittiChokha East Medium Wheat Veg 6 7 35
4 MacherJhol East Medium Fish NonVeg 14 6 40
5 RoganJosh North High Lamb NonVeg 4 8 55
6 DalTadka North Low Lentils Veg 13 7 30
7 ButterChicken North Medium Chicken NonVeg 2 8 50
8 CholeBhature North Medium Chickpeas Veg 10 9 45
9 RajmaChawal North Medium KidneyBeans Veg 16 8 35
10 AlooParatha North Medium Potato Veg 19 8 30
11 Khichdi PanIndia Low Lentils Veg 20 9 25
12 Biryani South High Rice NonVeg 1 9 60
13 HyderabadiHaleem South High Meat NonVeg 11 8 70
14 Idli South Low Rice Veg 7 8 25
15 Dosa South Medium Rice Veg 3 7 30
16 Sambhar South Medium Lentils Veg 12 8 35
17 Vindaloo West High Pork NonVeg 15 7 50
18 Poha West Low Rice Veg 9 7 15
19 Khandvi West Low Gramflour Veg 18 6 20
20 PavBhaji West Medium MixedVeg Veg 8 9 40

c) PROC FREQ – Frequency of Spice Levels

proc freq data=Indian_Cuisines;

    tables Spice_Level;

    title "Frequency of Spice Levels";

run;

Output:

Frequency of Spice Levels

The FREQ Procedure

Spice_Level Frequency Percent Cumulative
Frequency
Cumulative
Percent
High 5 25.00 5 25.00
Low 6 30.00 11 55.00
Medium 9 45.00 20 100.00

d) PROC MEANS – Average Preparation Time by Veg/Non-Veg

proc means data=Indian_Cuisines noprint;

    class Veg_NonVeg;

    var Preparation_Time;

    output out=PrepTimeAvg mean=Avg_Prep_Time;

run;

proc print data=PrepTimeAvg;

    title "Average Preparation Time by Veg/Non-Veg";

run;

Output:

Average Preparation Time by Veg/Non-Veg

Obs Veg_NonVeg _TYPE_ _FREQ_ Avg_Prep_Time
1   0 20 37.7500
2 NonVeg 1 7 52.8571
3 Veg 1 13 29.6154

Step 3: Using PROC SQL

a) Find Top 5 Most Popular Dishes

proc sql outobs=5;

    select Cuisine_Name, Popularity_Score

    from Indian_Cuisines

    order by Popularity_Score desc;

quit;

Output:

Cuisine_Name Popularity_Score
PavBhaji 9
Khichdi 9
Biryani 9
CholeBhature 9
Idli 8

b) Count of Veg and Non-Veg Items

proc sql;

    select Veg_NonVeg, count(*) as Count

    from Indian_Cuisines

    group by Veg_NonVeg;

quit;

Output:

Veg_NonVeg Count
NonVeg 7
Veg 13

c) Average Popularity Score per Region

proc sql;

    select Region, avg(Popularity_Score) as Avg_Score format=4.2

    from Indian_Cuisines

    group by Region;

quit;

Output:

Region Avg_Score
East 6.50
North 8.00
PanIndia 9.00
South 8.00
West 7.25

Step 4: Advanced Use with Macros

Task: Create summary reports per Region using a Macro

%macro Region_Report(region);

    proc print data=Indian_Cuisines;

        where Region = "&region";

        title "Cuisine Report for Region: &region";

    run;


    proc means data=Indian_Cuisines;

        where Region = "&region";

        var Popularity_Score Preparation_Time;

        title "Average Popularity and Preparation Time for &region";

    run;

%mend;

%Region_Report(North);

Output:

Cuisine Report for Region: North

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
2 ButterChicken North Medium Chicken NonVeg 2 8 50
4 RoganJosh North High Lamb NonVeg 4 8 55
10 CholeBhature North Medium Chickpeas Veg 10 9 45
13 DalTadka North Low Lentils Veg 13 7 30
16 RajmaChawal North Medium KidneyBeans Veg 16 8 35
19 AlooParatha North Medium Potato Veg 19 8 30

Average Popularity and Preparation Time for North

The MEANS Procedure

Variable N Mean Std Dev Minimum Maximum
Popularity_Score
Preparation_Time
6
6
8.0000000
40.8333333
0.6324555
10.6848803
7.0000000
30.0000000
9.0000000
55.0000000

%Region_Report(South);

Output:

Cuisine Report for Region: South

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
1 Biryani South High Rice NonVeg 1 9 60
3 Dosa South Medium Rice Veg 3 7 30
7 Idli South Low Rice Veg 7 8 25
11 HyderabadiHaleem South High Meat NonVeg 11 8 70
12 Sambhar South Medium Lentils Veg 12 8 35

Average Popularity and Preparation Time for South

The MEANS Procedure

Variable N Mean Std Dev Minimum Maximum
Popularity_Score
Preparation_Time
5
5
8.0000000
44.0000000
0.7071068
19.8116128
7.0000000
25.0000000
9.0000000
70.0000000

%Region_Report(East);

Output:

Cuisine Report for Region: East

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
5 Pakhala East Low Rice Veg 5 6 20
6 LittiChokha East Medium Wheat Veg 6 7 35
14 MacherJhol East Medium Fish NonVeg 14 6 40
17 FishCurry East High Fish NonVeg 17 7 45

Average Popularity and Preparation Time for East

The MEANS Procedure

Variable N Mean Std Dev Minimum Maximum
Popularity_Score
Preparation_Time
4
4
6.5000000
35.0000000
0.5773503
10.8012345
6.0000000
20.0000000
7.0000000
45.0000000

%Region_Report(West);

Output:

Cuisine Report for Region: West

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time
8 PavBhaji West Medium MixedVeg Veg 8 9 40
9 Poha West Low Rice Veg 9 7 15
15 Vindaloo West High Pork NonVeg 15 7 50
18 Khandvi West Low Gramflour Veg 18 6 20

Average Popularity and Preparation Time for West

The MEANS Procedure

Variable N Mean Std Dev Minimum Maximum
Popularity_Score
Preparation_Time
4
4
7.2500000
31.2500000
1.2583057
16.5201897
6.0000000
15.0000000
9.0000000
50.0000000

Step 5: Conditional Logic with Macro Parameters

Task: Filter Cuisines Based on Spice Level and Type

%macro Filter_Cuisines(spice, type);

    proc sql;

        select Cuisine_Name, Region, Main_Ingredient, Preparation_Time

        from Indian_Cuisines

        where Spice_Level="&spice" and Veg_NonVeg="&type";

    quit;

%mend;


%Filter_Cuisines(High, NonVeg);

Output:

Cuisine_Name Region Main_Ingredient Preparation_Time
Biryani South Rice 60
RoganJosh North Lamb 55
HyderabadiHaleem South Meat 70
Vindaloo West Pork 50
FishCurry East Fish 45

Step 6: Creating a Flag Variable – Fast Food (≤30 mins)

data Fast_Food;

    set Indian_Cuisines;

    if Preparation_Time <= 30 then FastFood="Yes";

    else FastFood="No";

run;


proc print data=Fast_Food;

    title "Cuisines Classified as Fast Food (<=30 mins)";

run;

Output:

Cuisines Classified as Fast Food (<=30 mins)

Obs Cuisine_Name Region Spice_Level Main_Ingredient Veg_NonVeg Cuisine_ID Popularity_Score Preparation_Time FastFood
1 Biryani South High Rice NonVeg 1 9 60 No
2 ButterChicken North Medium Chicken NonVeg 2 8 50 No
3 Dosa South Medium Rice Veg 3 7 30 Yes
4 RoganJosh North High Lamb NonVeg 4 8 55 No
5 Pakhala East Low Rice Veg 5 6 20 Yes
6 LittiChokha East Medium Wheat Veg 6 7 35 No
7 Idli South Low Rice Veg 7 8 25 Yes
8 PavBhaji West Medium MixedVeg Veg 8 9 40 No
9 Poha West Low Rice Veg 9 7 15 Yes
10 CholeBhature North Medium Chickpeas Veg 10 9 45 No
11 HyderabadiHaleem South High Meat NonVeg 11 8 70 No
12 Sambhar South Medium Lentils Veg 12 8 35 No
13 DalTadka North Low Lentils Veg 13 7 30 Yes
14 MacherJhol East Medium Fish NonVeg 14 6 40 No
15 Vindaloo West High Pork NonVeg 15 7 50 No
16 RajmaChawal North Medium KidneyBeans Veg 16 8 35 No
17 FishCurry East High Fish NonVeg 17 7 45 No
18 Khandvi West Low Gramflour Veg 18 6 20 Yes
19 AlooParatha North Medium Potato Veg 19 8 30 Yes
20 Khichdi PanIndia Low Lentils Veg 20 9 25 Yes

Step 7: Using PROC UNIVARIATE – Analyze Time Distribution

proc univariate data=Indian_Cuisines;

    var Preparation_Time;

    histogram;

    title "Preparation Time Distribution of Indian Cuisines";

run;

Output:

Preparation Time Distribution of Indian Cuisines

The UNIVARIATE Procedure
Variable: Preparation_Time

Moments
N 20 Sum Weights 20
Mean 37.75 Sum Observations 755
Std Deviation 14.3705657 Variance 206.513158
Skewness 0.50422081 Kurtosis -0.1578856
Uncorrected SS 32425 Corrected SS 3923.75
Coeff Variation 38.0677236 Std Error Mean 3.21335617

Basic Statistical Measures
Location Variability
Mean 37.75000 Std Deviation 14.37057
Median 35.00000 Variance 206.51316
Mode 30.00000 Range 55.00000
    Interquartile Range 20.00000

Note: The mode displayed is the smallest of 2 modes with a count of 3.


Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 11.74784 Pr > |t| <.0001
Sign M 10 Pr >= |M| <.0001
Signed Rank S 105 Pr >= |S| <.0001

Quantiles (Definition 5)
Level Quantile
100% Max 70.0
99% 70.0
95% 65.0
90% 57.5
75% Q3 47.5
50% Median 35.0
25% Q1 27.5
10% 20.0
5% 17.5
1% 15.0
0% Min 15.0

Extreme Observations
Lowest Highest
Value Obs Value Obs
15 9 50 2
20 18 50 15
20 5 55 4
25 20 60 1
25 7 70 11

Step 8: Creating a Macro for Top N Dishes by Popularity

%macro Top_Dishes(n);

    proc sql outobs=&n;

        select Cuisine_Name, Region, Popularity_Score

        from Indian_Cuisines

        order by Popularity_Score desc;

    quit;

%mend;


%Top_Dishes(3);

Output:

Cuisine_Name Region Popularity_Score
PavBhaji West 9
Khichdi PanIndia 9
Biryani South 9

Step 9: Exporting Final Datasets

proc export data=Fast_Food

    outfile="Fast_Food_Cuisines.csv"

    dbms=csv

    replace;

run;

Log:
NOTE: "Fast_Food_Cuisines.csv" file was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
      real time           1.21 seconds
      cpu time            0.14 seconds


Try This Blog :  SAS QUESTIONS - 2



PRACTICE AND COMMENT YOUR OUTPUT: 

--->PLEASE FOLLOW OUR BLOG FOR MORE UPDATES.

PLEASE FOLLOW OUR TELEGRAM CHANNEL CLICK HERE

PLEASE FOLLOW OUR FACEBOOK PAGE  CLICK HERE

PLEASE FOLLOW OUR INSTAGRAM PAGE CLICK HERE


Comments