WORLD CUISINES DATASET CREATION AND NUTRIENT ANALYSIS USING PROC SQL | PROC FORMAT | PROC PRINT | PROC MEANS | PROC STDIZE | PROC SGPANEL WITH MACROS AND DATE FORMAT INTEGRATION
OPTIONS NOCENTER;
1. PROC FORMAT – custom formats for spiciness & Veg/NonVeg
proc format;
value spicefmt
1 = 'Very Mild'
2 = 'Mild'
3 = 'Medium'
4 = 'Spicy'
5 = 'Very Spicy';
NOTE:
NOTE: Format SPICEFMT has been output.
value $vegf
'Veg' = 'Vegetarian'
'NonVeg' = 'Non-Vegetarian';
run;
NOTE:
NOTE: Format $VEGF has been output.
%let analysis_date = %sysfunc(today());
NOTE:
2. PROC SQL – create the WORLD_CUISINE dataset
Includes a SAS date variable MENU_DATE with DATE9.
proc sql;
create table work.world_cuisine as
select "Paneer Tikka" as Dish_Name length=30,
"India" as Country length=20,
280 as Calories,
18 as Protein,
16 as Fat,
4 as Spiciness_Level,
"Veg" as Veg_or_NonVeg length=7,
"15JAN2025"d as Menu_Date format=date9.
from sashelp.class(obs=1)
union all
select "Sushi",
"Japan",
200,
16,
5,
2,
"NonVeg",
"10FEB2025"d
from sashelp.class(obs=1)
union all
select "Tacos al Pastor",
"Mexico",
450,
20,
22,
4,
"NonVeg",
"05MAR2025"d
from sashelp.class(obs=1)
union all
select "Margherita Pizza",
"Italy",
320,
14,
12,
2,
"Veg",
"20JAN2025"d
from sashelp.class(obs=1)
union all
select "Falafel Wrap",
"Lebanon",
380,
13,
18,
3,
"Veg",
"28FEB2025"d
from sashelp.class(obs=1)
union all
select "Bibimbap",
"South Korea",
500,
18,
15,
3,
"NonVeg",
"03APR2025"d
from sashelp.class(obs=1)
union all
select "Chicken Biryani",
"India",
550,
24,
20,
5,
"NonVeg",
"12FEB2025"d
from sashelp.class(obs=1)
union all
select "Greek Salad",
"Greece",
220,
8,
14,
1,
"Veg",
"25MAR2025"d
from sashelp.class(obs=1)
union all
select "Pad Thai",
"Thailand",
420,
17,
14,
4,
"NonVeg",
"18APR2025"d
from sashelp.class(obs=1)
union all
select "Shakshuka",
"Israel",
300,
14,
16,
3,
"Veg",
"30JAN2025"d
from sashelp.class(obs=1)
union all
select "Poutine",
"Canada",
650,
18,
42,
1,
"NonVeg",
"07MAR2025"d
from sashelp.class(obs=1)
union all
select "Hummus Platter",
"Middle East",
260,
10,
12,
2,
"Veg",
"14APR2025"d
from sashelp.class(obs=1)
;
quit;
proc print data=work.world_cuisine;
format Spiciness_Level spicefmt.
Veg_or_NonVeg $vegf.
Menu_Date date9.;
title "World Cuisine Dataset as of %sysfunc(putn(&analysis_date.,date9.))";
run;
OUTPUT:
| Obs | Dish_Name | Country | Calories | Protein | Fat | Spiciness_Level | Veg_or_NonVeg | Menu_Date |
|---|---|---|---|---|---|---|---|---|
| 1 | Paneer Tikka | India | 280 | 18 | 16 | Spicy | Vegetarian | 15JAN2025 |
| 2 | Sushi | Japan | 200 | 16 | 5 | Mild | Non-Vegetarian | 10FEB2025 |
| 3 | Tacos al Pastor | Mexico | 450 | 20 | 22 | Spicy | Non-Vegetarian | 05MAR2025 |
| 4 | Margherita Pizza | Italy | 320 | 14 | 12 | Mild | Vegetarian | 20JAN2025 |
| 5 | Falafel Wrap | Lebanon | 380 | 13 | 18 | Medium | Vegetarian | 28FEB2025 |
| 6 | Bibimbap | South Korea | 500 | 18 | 15 | Medium | Non-Vegetarian | 03APR2025 |
| 7 | Chicken Biryani | India | 550 | 24 | 20 | Very Spicy | Non-Vegetarian | 12FEB2025 |
| 8 | Greek Salad | Greece | 220 | 8 | 14 | Very Mild | Vegetarian | 25MAR2025 |
| 9 | Pad Thai | Thailand | 420 | 17 | 14 | Spicy | Non-Vegetarian | 18APR2025 |
| 10 | Shakshuka | Israel | 300 | 14 | 16 | Medium | Vegetarian | 30JAN2025 |
| 11 | Poutine | Canada | 650 | 18 | 42 | Very Mild | Non-Vegetarian | 07MAR2025 |
| 12 | Hummus Platter | Middle East | 260 | 10 | 12 | Mild | Vegetarian | 14APR2025 |
3. Macro – nutrient categorization
Creates protein and fat category variables
%macro nutrient_category(input_ds=, output_ds=);
data &output_ds.;
set &input_ds.;
length Protein_Cat Fat_Cat $20.;
/* Protein-based category */
if Protein < 10 then Protein_Cat = "Low Protein";
else if 10 <= Protein < 20 then Protein_Cat = "Moderate Protein";
else Protein_Cat = "High Protein";
/* Fat-based category */
if Fat < 10 then Fat_Cat = "Low Fat";
else if 10 <= Fat < 20 then Fat_Cat = "Moderate Fat";
else Fat_Cat = "High Fat";
/*Date-based filter flag using DATE statement logic */
if Menu_Date <= &analysis_date. then Available_Flag = 1;
else Available_Flag = 0;
run;
%mend nutrient_category;
%nutrient_category(input_ds=work.world_cuisine, output_ds=work.cuisine_with_cat);
OUTPUT:
| Obs | Dish_Name | Country | Calories | Protein | Fat | Spiciness_Level | Veg_or_NonVeg | Menu_Date | Protein_Cat | Fat_Cat | Available_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Paneer Tikka | India | 280 | 18 | 16 | 4 | Veg | 15JAN2025 | Moderate Protein | Moderate Fat | 1 |
| 2 | Sushi | Japan | 200 | 16 | 5 | 2 | NonVeg | 10FEB2025 | Moderate Protein | Low Fat | 1 |
| 3 | Tacos al Pastor | Mexico | 450 | 20 | 22 | 4 | NonVeg | 05MAR2025 | High Protein | High Fat | 1 |
| 4 | Margherita Pizza | Italy | 320 | 14 | 12 | 2 | Veg | 20JAN2025 | Moderate Protein | Moderate Fat | 1 |
| 5 | Falafel Wrap | Lebanon | 380 | 13 | 18 | 3 | Veg | 28FEB2025 | Moderate Protein | Moderate Fat | 1 |
| 6 | Bibimbap | South Korea | 500 | 18 | 15 | 3 | NonVeg | 03APR2025 | Moderate Protein | Moderate Fat | 1 |
| 7 | Chicken Biryani | India | 550 | 24 | 20 | 5 | NonVeg | 12FEB2025 | High Protein | High Fat | 1 |
| 8 | Greek Salad | Greece | 220 | 8 | 14 | 1 | Veg | 25MAR2025 | Low Protein | Moderate Fat | 1 |
| 9 | Pad Thai | Thailand | 420 | 17 | 14 | 4 | NonVeg | 18APR2025 | Moderate Protein | Moderate Fat | 1 |
| 10 | Shakshuka | Israel | 300 | 14 | 16 | 3 | Veg | 30JAN2025 | Moderate Protein | Moderate Fat | 1 |
| 11 | Poutine | Canada | 650 | 18 | 42 | 1 | NonVeg | 07MAR2025 | Moderate Protein | High Fat | 1 |
| 12 | Hummus Platter | Middle East | 260 | 10 | 12 | 2 | Veg | 14APR2025 | Moderate Protein | Moderate Fat | 1 |
4. PROC MEANS – descriptive statistics
By Veg_or_NonVeg
proc means data=work.cuisine_with_cat n mean std min max;
class Veg_or_NonVeg;
var Calories Protein Fat;
format Veg_or_NonVeg $vegf.;
title "Descriptive Statistics of World Cuisines by Veg/NonVeg";
run;
OUTPUT:
The MEANS Procedure
| Veg_or_NonVeg | N Obs | Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Non-Vegetarian | 6 | Calories Protein Fat | 6 6 6 | 461.6666667 18.8333333 19.6666667 | 151.7124473 2.8577380 12.4365054 | 200.0000000 16.0000000 5.0000000 | 650.0000000 24.0000000 42.0000000 |
| Vegetarian | 6 | Calories Protein Fat | 6 6 6 | 293.3333333 12.8333333 14.6666667 | 54.6504041 3.4880749 2.4221203 | 220.0000000 8.0000000 12.0000000 | 380.0000000 18.0000000 18.0000000 |
5. PROC STDIZE – standardize nutrients
Create standardized Z-scores for Calories, Protein, Fat
proc stdize data=work.cuisine_with_cat
out=work.cuisine_std
method=std;
var Calories Protein Fat;
run;
OUTPUT:
| Obs | Dish_Name | Country | Calories | Protein | Fat | Spiciness_Level | Veg_or_NonVeg | Menu_Date | Protein_Cat | Fat_Cat | Available_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Paneer Tikka | India | -0.69736 | 0.49628 | -0.13061 | 4 | Veg | 15JAN2025 | Moderate Protein | Moderate Fat | 1 |
| 2 | Sushi | Japan | -1.26955 | 0.03818 | -1.36208 | 2 | NonVeg | 10FEB2025 | Moderate Protein | Low Fat | 1 |
| 3 | Tacos al Pastor | Mexico | 0.51855 | 0.95438 | 0.54110 | 4 | NonVeg | 05MAR2025 | High Protein | High Fat | 1 |
| 4 | Margherita Pizza | Italy | -0.41126 | -0.41993 | -0.57842 | 2 | Veg | 20JAN2025 | Moderate Protein | Moderate Fat | 1 |
| 5 | Falafel Wrap | Lebanon | 0.01788 | -0.64898 | 0.09329 | 3 | Veg | 28FEB2025 | Moderate Protein | Moderate Fat | 1 |
| 6 | Bibimbap | South Korea | 0.87617 | 0.49628 | -0.24256 | 3 | NonVeg | 03APR2025 | Moderate Protein | Moderate Fat | 1 |
| 7 | Chicken Biryani | India | 1.23379 | 1.87058 | 0.31720 | 5 | NonVeg | 12FEB2025 | High Protein | High Fat | 1 |
| 8 | Greek Salad | Greece | -1.12650 | -1.79423 | -0.35451 | 1 | Veg | 25MAR2025 | Low Protein | Moderate Fat | 1 |
| 9 | Pad Thai | Thailand | 0.30398 | 0.26723 | -0.35451 | 4 | NonVeg | 18APR2025 | Moderate Protein | Moderate Fat | 1 |
| 10 | Shakshuka | Israel | -0.55431 | -0.41993 | -0.13061 | 3 | Veg | 30JAN2025 | Moderate Protein | Moderate Fat | 1 |
| 11 | Poutine | Canada | 1.94903 | 0.49628 | 2.78014 | 1 | NonVeg | 07MAR2025 | Moderate Protein | High Fat | 1 |
| 12 | Hummus Platter | Middle East | -0.84041 | -1.33613 | -0.57842 | 2 | Veg | 14APR2025 | Moderate Protein | Moderate Fat | 1 |
6. PROC SGPANEL – visualization by Veg_or_NonVeg
proc sgpanel data=work.cuisine_std;
panelby Veg_or_NonVeg / columns=2;
scatter x=Protein y=Calories / group=Country;
rowaxis label="Calories (Standardized)";
colaxis label="Protein (Standardized)";
format Veg_or_NonVeg $vegf.;
title "Calories vs Protein by Veg/NonVeg (Standardized Values)";
run;
No comments:
Post a Comment