299.Which namkeen products sell the most, and what does SAS data creation reveal about customer taste?A Sas Study
Which namkeen products sell the most, and what does SAS data creation reveal about customer taste?A Sas Study
option nocenter;
1: Creating the Base Datasets
/*Macro 1: Data Simulation*/
%macro namkeen_simulate;
/* === NORTH REGION NAMKEEN DATA === */
filename north temp;
data _null_;
file north;
put "101 Haldiram Bhujia 45 200 4.7";
put "102 Bikaji AlooBhujia 40 180 4.5";
put "103 Bikanervala Navratan 55 250 4.6";
put "104 PepsiCo LaysMasala 20 100 4.2";
put "105 Balaji RatlamiSev 35 150 4.4";
put "106 Pratap Gathiya 30 200 4.1";
run;
data work.namkeen_north;
infile north;
input SnackID Brand :$20. Type :$20. Price Weight_g Rating;
run;
proc print data=work.namkeen_north noobs;
title "NAMKEEN PRODUCTS - NORTH REGION";
run;
/* === SOUTH REGION NAMKEEN DATA === */
filename south temp;
data _null_;
file south;
put "107 Aachi Mixture 50 220 4.3";
put "108 GRB KaraBoondi 45 200 4.5";
put "109 MTR Chakli 60 250 4.6";
put "110 Maiyas Nippattu 55 230 4.2";
put "111 Lays ClassicSalt 20 100 4.1";
put "112 Haldiram MoongDal 40 150 4.7";
run;
data work.namkeen_south;
infile south;
input SnackID Brand :$20. Type :$20. Price Weight_g Rating;
run;
proc print data=work.namkeen_south noobs;
title "NAMKEEN PRODUCTS - SOUTH REGION";
run;
%mend namkeen_simulate;
%namkeen_simulate;
OUTPUT:
| SnackID | Brand | Type | Price | Weight_g | Rating |
|---|---|---|---|---|---|
| 101 | Haldiram | Bhujia | 45 | 200 | 4.7 |
| 102 | Bikaji | AlooBhujia | 40 | 180 | 4.5 |
| 103 | Bikanervala | Navratan | 55 | 250 | 4.6 |
| 104 | PepsiCo | LaysMasala | 20 | 100 | 4.2 |
| 105 | Balaji | RatlamiSev | 35 | 150 | 4.4 |
| 106 | Pratap | Gathiya | 30 | 200 | 4.1 |
| SnackID | Brand | Type | Price | Weight_g | Rating |
|---|---|---|---|---|---|
| 107 | Aachi | Mixture | 50 | 220 | 4.3 |
| 108 | GRB | KaraBoondi | 45 | 200 | 4.5 |
| 109 | MTR | Chakli | 60 | 250 | 4.6 |
| 110 | Maiyas | Nippattu | 55 | 230 | 4.2 |
| 111 | Lays | ClassicSalt | 20 | 100 | 4.1 |
| 112 | Haldiram | MoongDal | 40 | 150 | 4.7 |
2: Combining Datasets
/*2.1 Using SET Statement*/
data all_namkeen_set;
set namkeen_north namkeen_south;
run;
proc print data=all_namkeen_set;
title "NAMKEEN DATA COMBINED USING SET STATEMENT";
run;
OUTPUT:
| Obs | SnackID | Brand | Type | Price | Weight_g | Rating |
|---|---|---|---|---|---|---|
| 1 | 101 | Haldiram | Bhujia | 45 | 200 | 4.7 |
| 2 | 102 | Bikaji | AlooBhujia | 40 | 180 | 4.5 |
| 3 | 103 | Bikanervala | Navratan | 55 | 250 | 4.6 |
| 4 | 104 | PepsiCo | LaysMasala | 20 | 100 | 4.2 |
| 5 | 105 | Balaji | RatlamiSev | 35 | 150 | 4.4 |
| 6 | 106 | Pratap | Gathiya | 30 | 200 | 4.1 |
| 7 | 107 | Aachi | Mixture | 50 | 220 | 4.3 |
| 8 | 108 | GRB | KaraBoondi | 45 | 200 | 4.5 |
| 9 | 109 | MTR | Chakli | 60 | 250 | 4.6 |
| 10 | 110 | Maiyas | Nippattu | 55 | 230 | 4.2 |
| 11 | 111 | Lays | ClassicSalt | 20 | 100 | 4.1 |
| 12 | 112 | Haldiram | MoongDal | 40 | 150 | 4.7 |
/*2.2 Using MERGE Statement*/
Before merging, we’ll create two supporting datasets:
namkeen_price → contains price & weight
namkeen_rating → contains rating
data namkeen_price;
set all_namkeen_set(keep=SnackID Brand Type Price Weight_g);
run;
proc print data=namkeen_price;
run;
OUTPUT:
| Obs | SnackID | Brand | Type | Price | Weight_g |
|---|---|---|---|---|---|
| 1 | 101 | Haldiram | Bhujia | 45 | 200 |
| 2 | 102 | Bikaji | AlooBhujia | 40 | 180 |
| 3 | 103 | Bikanervala | Navratan | 55 | 250 |
| 4 | 104 | PepsiCo | LaysMasala | 20 | 100 |
| 5 | 105 | Balaji | RatlamiSev | 35 | 150 |
| 6 | 106 | Pratap | Gathiya | 30 | 200 |
| 7 | 107 | Aachi | Mixture | 50 | 220 |
| 8 | 108 | GRB | KaraBoondi | 45 | 200 |
| 9 | 109 | MTR | Chakli | 60 | 250 |
| 10 | 110 | Maiyas | Nippattu | 55 | 230 |
| 11 | 111 | Lays | ClassicSalt | 20 | 100 |
| 12 | 112 | Haldiram | MoongDal | 40 | 150 |
data namkeen_rating;
set all_namkeen_set(keep=SnackID Rating);
run;
proc print data=namkeen_rating;
run;
OUTPUT:
| Obs | SnackID | Rating |
|---|---|---|
| 1 | 101 | 4.7 |
| 2 | 102 | 4.5 |
| 3 | 103 | 4.6 |
| 4 | 104 | 4.2 |
| 5 | 105 | 4.4 |
| 6 | 106 | 4.1 |
| 7 | 107 | 4.3 |
| 8 | 108 | 4.5 |
| 9 | 109 | 4.6 |
| 10 | 110 | 4.2 |
| 11 | 111 | 4.1 |
| 12 | 112 | 4.7 |
proc sort data=namkeen_price Out=namkeen_price_sorted;
by SnackID;
run;
proc print data=namkeen_price_sorted;
run;
OUTPUT:
| Obs | SnackID | Brand | Type | Price | Weight_g |
|---|---|---|---|---|---|
| 1 | 101 | Haldiram | Bhujia | 45 | 200 |
| 2 | 102 | Bikaji | AlooBhujia | 40 | 180 |
| 3 | 103 | Bikanervala | Navratan | 55 | 250 |
| 4 | 104 | PepsiCo | LaysMasala | 20 | 100 |
| 5 | 105 | Balaji | RatlamiSev | 35 | 150 |
| 6 | 106 | Pratap | Gathiya | 30 | 200 |
| 7 | 107 | Aachi | Mixture | 50 | 220 |
| 8 | 108 | GRB | KaraBoondi | 45 | 200 |
| 9 | 109 | MTR | Chakli | 60 | 250 |
| 10 | 110 | Maiyas | Nippattu | 55 | 230 |
| 11 | 111 | Lays | ClassicSalt | 20 | 100 |
| 12 | 112 | Haldiram | MoongDal | 40 | 150 |
proc sort data=namkeen_rating Out=namkeen_rating_sorted;
by SnackID;
run;
proc print data=namkeen_rating_sorted;
run;
OUTPUT:
| Obs | SnackID | Rating |
|---|---|---|
| 1 | 101 | 4.7 |
| 2 | 102 | 4.5 |
| 3 | 103 | 4.6 |
| 4 | 104 | 4.2 |
| 5 | 105 | 4.4 |
| 6 | 106 | 4.1 |
| 7 | 107 | 4.3 |
| 8 | 108 | 4.5 |
| 9 | 109 | 4.6 |
| 10 | 110 | 4.2 |
| 11 | 111 | 4.1 |
| 12 | 112 | 4.7 |
data merged_namkeen;
merge namkeen_price_sorted(in=a) namkeen_rating_sorted(in=b);
by SnackID;
if a and b;
run;
proc print data=merged_namkeen;
title "NAMKEEN DATA COMBINED USING MERGE STATEMENT";
run;
OUTPUT:
| Obs | SnackID | Brand | Type | Price | Weight_g | Rating |
|---|---|---|---|---|---|---|
| 1 | 101 | Haldiram | Bhujia | 45 | 200 | 4.7 |
| 2 | 102 | Bikaji | AlooBhujia | 40 | 180 | 4.5 |
| 3 | 103 | Bikanervala | Navratan | 55 | 250 | 4.6 |
| 4 | 104 | PepsiCo | LaysMasala | 20 | 100 | 4.2 |
| 5 | 105 | Balaji | RatlamiSev | 35 | 150 | 4.4 |
| 6 | 106 | Pratap | Gathiya | 30 | 200 | 4.1 |
| 7 | 107 | Aachi | Mixture | 50 | 220 | 4.3 |
| 8 | 108 | GRB | KaraBoondi | 45 | 200 | 4.5 |
| 9 | 109 | MTR | Chakli | 60 | 250 | 4.6 |
| 10 | 110 | Maiyas | Nippattu | 55 | 230 | 4.2 |
| 11 | 111 | Lays | ClassicSalt | 20 | 100 | 4.1 |
| 12 | 112 | Haldiram | MoongDal | 40 | 150 | 4.7 |
/*2.3 Using PROC SQL Join*/
proc sql;
create table namkeen_sql as
select a.SnackID, a.Brand, a.Type, a.Price, a.Weight_g, b.Rating
from namkeen_price a
inner join namkeen_rating b
on a.SnackID = b.SnackID;
quit;
proc print data=namkeen_sql;
title "NAMKEEN DATA COMBINED USING PROC SQL JOIN";
run;
OUTPUT:
| Obs | SnackID | Brand | Type | Price | Weight_g | Rating |
|---|---|---|---|---|---|---|
| 1 | 101 | Haldiram | Bhujia | 45 | 200 | 4.7 |
| 2 | 102 | Bikaji | AlooBhujia | 40 | 180 | 4.5 |
| 3 | 103 | Bikanervala | Navratan | 55 | 250 | 4.6 |
| 4 | 104 | PepsiCo | LaysMasala | 20 | 100 | 4.2 |
| 5 | 105 | Balaji | RatlamiSev | 35 | 150 | 4.4 |
| 6 | 106 | Pratap | Gathiya | 30 | 200 | 4.1 |
| 7 | 107 | Aachi | Mixture | 50 | 220 | 4.3 |
| 8 | 108 | GRB | KaraBoondi | 45 | 200 | 4.5 |
| 9 | 109 | MTR | Chakli | 60 | 250 | 4.6 |
| 10 | 110 | Maiyas | Nippattu | 55 | 230 | 4.2 |
| 11 | 111 | Lays | ClassicSalt | 20 | 100 | 4.1 |
| 12 | 112 | Haldiram | MoongDal | 40 | 150 | 4.7 |
3: QC Check – PROC COMPARE
proc compare base=merged_namkeen compare=namkeen_sql briefsummary;
title "QC CHECK: COMPARISON BETWEEN MERGE AND PROC SQL OUTPUTS";
run;
OUTPUT:
The COMPARE Procedure
Comparison of WORK.MERGED_NAMKEEN with WORK.NAMKEEN_SQL
(Method=EXACT)
NOTE: No unequal values were found. All values compared are exactly equal. 4: TLF-style Summary Tables
/*4.1 Listing: PROC REPORT*/
proc report data=all_namkeen_set nowd headline headskip;
columns SnackID Brand Type Price Weight_g Rating;
define SnackID / order "Snack ID";
define Brand / display "Brand Name";
define Type / display "Namkeen Type";
define Price / display "Price (₹)";
define Weight_g / display "Weight (g)";
define Rating / display "Customer Rating";
title "LISTING REPORT OF NAMKEEN PRODUCTS IN INDIA";
run;
OUTPUT:
| Snack ID | Brand Name | Namkeen Type | Price (₹) | Weight (g) | Customer Rating |
|---|---|---|---|---|---|
| 101 | Haldiram | Bhujia | 45 | 200 | 4.7 |
| 102 | Bikaji | AlooBhujia | 40 | 180 | 4.5 |
| 103 | Bikanervala | Navratan | 55 | 250 | 4.6 |
| 104 | PepsiCo | LaysMasala | 20 | 100 | 4.2 |
| 105 | Balaji | RatlamiSev | 35 | 150 | 4.4 |
| 106 | Pratap | Gathiya | 30 | 200 | 4.1 |
| 107 | Aachi | Mixture | 50 | 220 | 4.3 |
| 108 | GRB | KaraBoondi | 45 | 200 | 4.5 |
| 109 | MTR | Chakli | 60 | 250 | 4.6 |
| 110 | Maiyas | Nippattu | 55 | 230 | 4.2 |
| 111 | Lays | ClassicSalt | 20 | 100 | 4.1 |
| 112 | Haldiram | MoongDal | 40 | 150 | 4.7 |
/*4.2 Table: PROC TABULATE Summary*/
proc tabulate data=all_namkeen_set;
class Brand Type;
var Price Rating;
table Brand*Type,
(Price Rating)*(mean min max);
title "SUMMARY TABLE OF PRICE AND RATING BY BRAND AND TYPE";
run;
OUTPUT:
| Price | Rating | ||||||
|---|---|---|---|---|---|---|---|
| Mean | Min | Max | Mean | Min | Max | ||
| Brand | Type | 50.00 | 50.00 | 50.00 | 4.30 | 4.30 | 4.30 |
| Aachi | Mixture | ||||||
| Balaji | RatlamiSev | 35.00 | 35.00 | 35.00 | 4.40 | 4.40 | 4.40 |
| Bikaji | AlooBhujia | 40.00 | 40.00 | 40.00 | 4.50 | 4.50 | 4.50 |
| Bikanervala | Navratan | 55.00 | 55.00 | 55.00 | 4.60 | 4.60 | 4.60 |
| GRB | KaraBoondi | 45.00 | 45.00 | 45.00 | 4.50 | 4.50 | 4.50 |
| Haldiram | Bhujia | 45.00 | 45.00 | 45.00 | 4.70 | 4.70 | 4.70 |
| MoongDal | 40.00 | 40.00 | 40.00 | 4.70 | 4.70 | 4.70 | |
| Lays | ClassicSalt | 20.00 | 20.00 | 20.00 | 4.10 | 4.10 | 4.10 |
| MTR | Chakli | 60.00 | 60.00 | 60.00 | 4.60 | 4.60 | 4.60 |
| Maiyas | Nippattu | 55.00 | 55.00 | 55.00 | 4.20 | 4.20 | 4.20 |
| PepsiCo | LaysMasala | 20.00 | 20.00 | 20.00 | 4.20 | 4.20 | 4.20 |
| Pratap | Gathiya | 30.00 | 30.00 | 30.00 | 4.10 | 4.10 | 4.10 |
5: Automation with Reporting Macro
/*Macro 2: Summary Reporting*/
%macro namkeen_report(ds=);
%put Generating summary report for dataset: &ds;
proc means data=&ds n mean min max;
var Price Weight_g Rating;
class Brand;
title "PROC MEANS SUMMARY FOR &ds";
run;
proc freq data=&ds;
tables Type / nocum nopercent;
title "FREQUENCY OF NAMKEEN TYPES IN &ds";
run;
proc sgplot data=&ds;
vbar Brand / response=Price stat=mean datalabel;
title "AVERAGE PRICE BY BRAND - &ds";
run;
proc sgplot data=&ds;
vbar Type / response=Rating stat=mean datalabel;
title "AVERAGE CUSTOMER RATING BY NAMKEEN TYPE - &ds";
run;
%mend namkeen_report;
%namkeen_report(ds=all_namkeen_set);
OUTPUT:
The MEANS Procedure
| Brand | N Obs | Variable | N | Mean | Minimum | Maximum |
|---|---|---|---|---|---|---|
| Aachi | 1 | Price Weight_g Rating | 1 1 1 | 50.0000000 220.0000000 4.3000000 | 50.0000000 220.0000000 4.3000000 | 50.0000000 220.0000000 4.3000000 |
| Balaji | 1 | Price Weight_g Rating | 1 1 1 | 35.0000000 150.0000000 4.4000000 | 35.0000000 150.0000000 4.4000000 | 35.0000000 150.0000000 4.4000000 |
| Bikaji | 1 | Price Weight_g Rating | 1 1 1 | 40.0000000 180.0000000 4.5000000 | 40.0000000 180.0000000 4.5000000 | 40.0000000 180.0000000 4.5000000 |
| Bikanervala | 1 | Price Weight_g Rating | 1 1 1 | 55.0000000 250.0000000 4.6000000 | 55.0000000 250.0000000 4.6000000 | 55.0000000 250.0000000 4.6000000 |
| GRB | 1 | Price Weight_g Rating | 1 1 1 | 45.0000000 200.0000000 4.5000000 | 45.0000000 200.0000000 4.5000000 | 45.0000000 200.0000000 4.5000000 |
| Haldiram | 2 | Price Weight_g Rating | 2 2 2 | 42.5000000 175.0000000 4.7000000 | 40.0000000 150.0000000 4.7000000 | 45.0000000 200.0000000 4.7000000 |
| Lays | 1 | Price Weight_g Rating | 1 1 1 | 20.0000000 100.0000000 4.1000000 | 20.0000000 100.0000000 4.1000000 | 20.0000000 100.0000000 4.1000000 |
| MTR | 1 | Price Weight_g Rating | 1 1 1 | 60.0000000 250.0000000 4.6000000 | 60.0000000 250.0000000 4.6000000 | 60.0000000 250.0000000 4.6000000 |
| Maiyas | 1 | Price Weight_g Rating | 1 1 1 | 55.0000000 230.0000000 4.2000000 | 55.0000000 230.0000000 4.2000000 | 55.0000000 230.0000000 4.2000000 |
| PepsiCo | 1 | Price Weight_g Rating | 1 1 1 | 20.0000000 100.0000000 4.2000000 | 20.0000000 100.0000000 4.2000000 | 20.0000000 100.0000000 4.2000000 |
| Pratap | 1 | Price Weight_g Rating | 1 1 1 | 30.0000000 200.0000000 4.1000000 | 30.0000000 200.0000000 4.1000000 | 30.0000000 200.0000000 4.1000000 |
The FREQ Procedure
| Type | Frequency |
|---|---|
| AlooBhujia | 1 |
| Bhujia | 1 |
| Chakli | 1 |
| ClassicSalt | 1 |
| Gathiya | 1 |
| KaraBoondi | 1 |
| LaysMasala | 1 |
| Mixture | 1 |
| MoongDal | 1 |
| Navratan | 1 |
| Nippattu | 1 |
| RatlamiSev | 1 |
Comments
Post a Comment