292.Which Breakfast Is The Healthiest And Most Popular? — Using Sas Data Analytics
Which Breakfast Is The Healthiest And Most Popular? — Using Sas Data Analytics
/*CREATING A DATASET OF REALWORLD BREAKFAST ITEMS SAS*/
1) Create dataset
options nocenter;
data work.realbreakfast;
length ItemID $6 ItemName $40 Cuisine $20 Type $8;
input ItemID $ ItemName : $40. Cuisine : $20. Type : $8. Calories PriceINR
PrepTimeMin;
datalines;
B001 Idli SouthIndian Veg 120 30 12
B002 Dosa SouthIndian Veg 220 45 15
B003 Poha Maharashtrian Veg 260 35 10
B004 Paratha NorthIndian Veg 350 50 20
B005 EggBhurji Indian NonVeg 200 40 12
B006 BreadJam Continental Veg 180 25 5
B007 Upma SouthIndian Veg 240 30 10
B008 AlooParatha NorthIndian Veg 420 55 22
B009 PuriSabji NorthIndian Veg 380 60 18
B010 Omelette Continental NonVeg 220 35 8
B011 MasalaDosa SouthIndian Veg 260 55 18
B012 OatsPorridge Continental Veg 150 40 7
;
run;
proc print;run;
Output:
| Obs | ItemID | ItemName | Cuisine | Type | Calories | PriceINR | PrepTimeMin |
|---|---|---|---|---|---|---|---|
| 1 | B001 | Idli | SouthIndian | Veg | 120 | 30 | 12 |
| 2 | B002 | Dosa | SouthIndian | Veg | 220 | 45 | 15 |
| 3 | B003 | Poha | Maharashtrian | Veg | 260 | 35 | 10 |
| 4 | B004 | Paratha | NorthIndian | Veg | 350 | 50 | 20 |
| 5 | B005 | EggBhurji | Indian | NonVeg | 200 | 40 | 12 |
| 6 | B006 | BreadJam | Continental | Veg | 180 | 25 | 5 |
| 7 | B007 | Upma | SouthIndian | Veg | 240 | 30 | 10 |
| 8 | B008 | AlooParatha | NorthIndian | Veg | 420 | 55 | 22 |
| 9 | B009 | PuriSabji | NorthIndian | Veg | 380 | 60 | 18 |
| 10 | B010 | Omelette | Continental | NonVeg | 220 | 35 | 8 |
| 11 | B011 | MasalaDosa | SouthIndian | Veg | 260 | 55 | 18 |
| 12 | B012 | OatsPorridge | Continental | Veg | 150 | 40 | 7 |
2) Add formatted price and calorie category
purpose: Create human-friendly categories for calories.
proc format;
value calcat
low - 199 = 'Low'
200 - 299 = 'Moderate'
300 - high = 'High';
run;
Log:
purpose: create formatted price column for display
purpose: attach calorie category
Here we already know that Put Function
Purpose: Converts numeric values to character values.
data work.realbreakfast_fmt;
set work.realbreakfast;
PriceINR_f = put(PriceINR, dollar8.);
CalorieCat = put(Calories, calcat.);
run;
proc print;run;
Output:
| Obs | ItemID | ItemName | Cuisine | Type | Calories | PriceINR | PrepTimeMin | PriceINR_f | CalorieCat |
|---|---|---|---|---|---|---|---|---|---|
| 1 | B001 | Idli | SouthIndian | Veg | 120 | 30 | 12 | $30 | Low |
| 2 | B002 | Dosa | SouthIndian | Veg | 220 | 45 | 15 | $45 | Moderate |
| 3 | B003 | Poha | Maharashtrian | Veg | 260 | 35 | 10 | $35 | Moderate |
| 4 | B004 | Paratha | NorthIndian | Veg | 350 | 50 | 20 | $50 | High |
| 5 | B005 | EggBhurji | Indian | NonVeg | 200 | 40 | 12 | $40 | Moderate |
| 6 | B006 | BreadJam | Continental | Veg | 180 | 25 | 5 | $25 | Low |
| 7 | B007 | Upma | SouthIndian | Veg | 240 | 30 | 10 | $30 | Moderate |
| 8 | B008 | AlooParatha | NorthIndian | Veg | 420 | 55 | 22 | $55 | High |
| 9 | B009 | PuriSabji | NorthIndian | Veg | 380 | 60 | 18 | $60 | High |
| 10 | B010 | Omelette | Continental | NonVeg | 220 | 35 | 8 | $35 | Moderate |
| 11 | B011 | MasalaDosa | SouthIndian | Veg | 260 | 55 | 18 | $55 | Moderate |
| 12 | B012 | OatsPorridge | Continental | Veg | 150 | 40 | 7 | $40 | Low |
3) PROC PRINT: show dataset
purpose: Display the dataset in readable tabular form.
proc print data=work.realbreakfast_fmt(obs=8) noobs label;
title "Realworld Breakfast Items - Raw View";
var ItemID ItemName Cuisine Type Calories CalorieCat PriceINR_f PrepTimeMin;
label PriceINR_f = "Price (INR)"
CalorieCat = "Calorie Category";
run;
Output:
| ItemID | ItemName | Cuisine | Type | Calories | Calorie Category | Price (INR) | PrepTimeMin |
|---|---|---|---|---|---|---|---|
| B001 | Idli | SouthIndian | Veg | 120 | Low | $30 | 12 |
| B002 | Dosa | SouthIndian | Veg | 220 | Moderate | $45 | 15 |
| B003 | Poha | Maharashtrian | Veg | 260 | Moderate | $35 | 10 |
| B004 | Paratha | NorthIndian | Veg | 350 | High | $50 | 20 |
| B005 | EggBhurji | Indian | NonVeg | 200 | Moderate | $40 | 12 |
| B006 | BreadJam | Continental | Veg | 180 | Low | $25 | 5 |
| B007 | Upma | SouthIndian | Veg | 240 | Moderate | $30 | 10 |
| B008 | AlooParatha | NorthIndian | Veg | 420 | High | $55 | 22 |
4) PROC MEANS: numeric summaries
purpose: Compute central tendency and spread for numeric variables.
proc means data=work.realbreakfast n mean std min max maxdec=1;
var Calories PriceINR PrepTimeMin;
title "Numeric Summary for Calories, Price and Prep Time";
run;
Output:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
Calories PriceINR PrepTimeMin | 12 12 12 | 250.0 41.7 13.1 | 91.6 11.3 5.5 | 120.0 25.0 5.0 | 420.0 60.0 22.0 |
5) PROC FREQ: categorical distributions
purpose: Show counts across categorical variables.
proc freq data=work.realbreakfast_fmt;
tables Cuisine Type CalorieCat / nocum nopercent;
title "Frequency counts by Cuisine, Type and Calorie Category";
run;
Output:
The FREQ Procedure
| Cuisine | Frequency |
|---|---|
| Continental | 3 |
| Indian | 1 |
| Maharashtrian | 1 |
| NorthIndian | 3 |
| SouthIndian | 4 |
| Type | Frequency |
|---|---|
| NonVeg | 2 |
| Veg | 10 |
| CalorieCat | Frequency |
|---|---|
| High | 3 |
| Low | 3 |
| Moderate | 6 |
6) PROC SQL: aggregated summaries and joins
purpose: Use SQL to group by cuisine and compute aggregated metrics.
proc sql;
create table work.by_cuisine as
select Cuisine,
count(*) as N,
mean(Calories) format=8.1 as MeanCalories,
mean(PriceINR) format=8.1 as MeanPrice,
mean(PrepTimeMin) format=8.1 as MeanPrep
from work.realbreakfast
group by Cuisine
order by MeanPrice desc;
quit;
proc print;run;
Output:
| Obs | Cuisine | N | MeanCalories | MeanPrice | MeanPrep |
|---|---|---|---|---|---|
| 1 | NorthIndian | 3 | 383.3 | 55.0 | 20.0 |
| 2 | Indian | 1 | 200.0 | 40.0 | 12.0 |
| 3 | SouthIndian | 4 | 210.0 | 40.0 | 13.8 |
| 4 | Maharashtrian | 1 | 260.0 | 35.0 | 10.0 |
| 5 | Continental | 3 | 183.3 | 33.3 | 6.7 |
7) PROC SORT: prepare for reporting
purpose: Sort dataset by descending calories for top-calorie reporting.
proc sort data=work.realbreakfast out=work.realbreakfast_sorted;
by descending Calories;
run;
proc print data=work.realbreakfast_sorted;
run;
Output:
| Obs | ItemID | ItemName | Cuisine | Type | Calories | PriceINR | PrepTimeMin |
|---|---|---|---|---|---|---|---|
| 1 | B008 | AlooParatha | NorthIndian | Veg | 420 | 55 | 22 |
| 2 | B009 | PuriSabji | NorthIndian | Veg | 380 | 60 | 18 |
| 3 | B004 | Paratha | NorthIndian | Veg | 350 | 50 | 20 |
| 4 | B003 | Poha | Maharashtrian | Veg | 260 | 35 | 10 |
| 5 | B011 | MasalaDosa | SouthIndian | Veg | 260 | 55 | 18 |
| 6 | B007 | Upma | SouthIndian | Veg | 240 | 30 | 10 |
| 7 | B002 | Dosa | SouthIndian | Veg | 220 | 45 | 15 |
| 8 | B010 | Omelette | Continental | NonVeg | 220 | 35 | 8 |
| 9 | B005 | EggBhurji | Indian | NonVeg | 200 | 40 | 12 |
| 10 | B006 | BreadJam | Continental | Veg | 180 | 25 | 5 |
| 11 | B012 | OatsPorridge | Continental | Veg | 150 | 40 | 7 |
| 12 | B001 | Idli | SouthIndian | Veg | 120 | 30 | 12 |
8) Macro: summarize by Type (Veg/NonVeg) using PROC MEANS and PROC SQL
purpose: Macro to generate numeric summary and top-calorie list for a given Type.
%macro summary_by_type(type=);
%put NOTE: Starting summary for &type;
proc means data=work.realbreakfast(where=(upcase(Type)=upcase("&type"))) n mean std min max maxdec=1;
var Calories PriceINR PrepTimeMin;
title "Summary for &type Items";
run;
proc sql;
create table work.top_&type as
select ItemID, ItemName, Calories, PriceINR, PrepTimeMin
from work.realbreakfast
where upcase(Type)=upcase("&type")
order by Calories desc;
quit;
%mend;
%summary_by_type(type=Veg);
Log:
Output:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
Calories PriceINR PrepTimeMin | 10 10 10 | 258.0 42.5 13.7 | 99.0 12.3 5.8 | 120.0 25.0 5.0 | 420.0 60.0 22.0 |
%summary_by_type(type=NonVeg);
Log:
Output:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
Calories PriceINR PrepTimeMin | 2 2 2 | 210.0 37.5 10.0 | 14.1 3.5 2.8 | 200.0 35.0 8.0 | 220.0 40.0 12.0 |
9) PROC REPORT: fancy printable report
purpose: Produce a formatted, report-style listing for presentation.
proc report data=work.realbreakfast_fmtnowd headline spacing=1;
column ItemID ItemName Cuisine Type Calories PriceINR_f PrepTimeMin CalorieCat;
define ItemID / group 'ID' width=6;
define ItemName / display 'Breakfast Item' width=30;
define Cuisine / group 'Cuisine';
define Type / group 'Type';
define Calories / analysis mean 'Calories' format=8.;
define PriceINR_f / display 'Price (INR)';
define PrepTimeMin / analysis mean 'Prep Time (min)';
define CalorieCat / display 'Cal Cat';
title "Formatted Breakfast Items Report";
run;
Output:
| ID | Breakfast Item | Cuisine | Type | Calories | Price (INR) | Prep Time (min) | Cal Cat |
|---|---|---|---|---|---|---|---|
| B001 | Idli | SouthIndian | Veg | 120 | $30 | 12 | Low |
| B002 | Dosa | SouthIndian | Veg | 220 | $45 | 15 | Moderate |
| B003 | Poha | Maharashtrian | Veg | 260 | $35 | 10 | Moderate |
| B004 | Paratha | NorthIndian | Veg | 350 | $50 | 20 | High |
| B005 | EggBhurji | Indian | NonVeg | 200 | $40 | 12 | Moderate |
| B006 | BreadJam | Continental | Veg | 180 | $25 | 5 | Low |
| B007 | Upma | SouthIndian | Veg | 240 | $30 | 10 | Moderate |
| B008 | AlooParatha | NorthIndian | Veg | 420 | $55 | 22 | High |
| B009 | PuriSabji | NorthIndian | Veg | 380 | $60 | 18 | High |
| B010 | Omelette | Continental | NonVeg | 220 | $35 | 8 | Moderate |
| B011 | MasalaDosa | SouthIndian | Veg | 260 | $55 | 18 | Moderate |
| B012 | OatsPorridge | Continental | Veg | 150 | $40 | 7 | Low |
10) PROC SQL: select items with calories > 300 (high calorie items)
purpose: Create a table of high-calorie breakfast items.
proc sql;
create table work.high_cal as
select * from work.realbreakfast
where Calories > 300
order by Calories desc;
quit;
proc print;run;
Output:
| Obs | ItemID | ItemName | Cuisine | Type | Calories | PriceINR | PrepTimeMin |
|---|---|---|---|---|---|---|---|
| 1 | B008 | AlooParatha | NorthIndian | Veg | 420 | 55 | 22 |
| 2 | B009 | PuriSabji | NorthIndian | Veg | 380 | 60 | 18 |
| 3 | B004 | Paratha | NorthIndian | Veg | 350 | 50 | 20 |
11) Simple QC checks: duplicate ItemID and missing values
purpose: Basic QC checks to identify duplicates and missing numeric values.
proc sql;
select 'DUPLICATE_ITEMIDS' as CheckType, ItemID, count(*) as CNT
from work.realbreakfast
group by ItemID
having count(*) > 1;
quit;
Log:
NOTE: No rows were selected.
proc means data=work.realbreakfast n nmiss;
var Calories PriceINR PrepTimeMin;
title "Missing-value check for numeric fields";
run;
The MEANS Procedure
| Variable | N | N Miss |
|---|---|---|
Calories PriceINR PrepTimeMin | 12 12 12 | 0 0 0 |
Comments
Post a Comment