201.DETAILED ANALYSIS OF DIFFERENT TYPES OF OILS USING PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC UNIVARIATE | PROC REPORT | PROC SGPLOT | PROC SQL | PROC TRANSPOSE | MACROS IN SAS
- Get link
- X
- Other Apps
DETAILED ANALYSIS OF DIFFERENT TYPES OF OILS USING PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC UNIVARIATE | PROC REPORT | PROC SGPLOT | PROC SQL | PROC TRANSPOSE | MACROS IN SAS
/*A dataset of different types of oils*/
Step 1: Create the Dataset
data Oils_Dataset;
length Oil_ID 8 Oil_Name $20 Category $15 Origin $15 Usage_Type $12;
input Oil_ID Oil_Name $ Category $ Origin $ Smoke_Point Usage_Type $
Price_Per_Litre Health_Rating;
datalines;
1 Olive_Oil Cooking Italy 190 Cooking 850 9
2 Coconut_Oil Cooking India 175 Cooking 700 7
3 Canola_Oil Cooking Canada 205 Cooking 650 8
4 Almond_Oil Cosmetic USA 220 Skincare 1200 6
5 Argan_Oil Cosmetic Morocco 215 Haircare 1400 7
6 Sunflower_Oil Cooking Ukraine 225 Cooking 600 8
7 Sesame_Oil Cooking China 210 Cooking 780 7
8 Mustard_Oil Cooking India 200 Cooking 500 6
9 Avocado_Oil Cooking Mexico 250 Cooking 1300 9
10 Jojoba_Oil Cosmetic USA 260 Skincare 1500 6
11 Flaxseed_Oil Cooking Canada 190 Cooking 850 7
12 Palm_Oil Cooking Malaysia 230 Cooking 550 5
13 Grapeseed_Oil Cooking France 215 Cooking 700 7
14 Castor_Oil Medicinal India 250 Laxative 400 5
15 Neem_Oil Medicinal India 210 Pesticide 300 4
16 Tea_Tree_Oil Medicinal Australia 190 Antifungal 1300 8
17 Rice_Bran_Oil Cooking Japan 245 Cooking 900 8
18 Wheatgerm_Oil Cosmetic Germany 210 Haircare 1100 7
19 Hempseed_Oil Cooking USA 170 Cooking 1050 7
20 Safflower_Oil Cooking USA 225 Cooking 650 6
;
run;
proc print;run;
Output:
Obs | Oil_ID | Oil_Name | Category | Origin | Usage_Type | Smoke_Point | Price_Per_Litre | Health_Rating |
---|---|---|---|---|---|---|---|---|
1 | 1 | Olive_Oil | Cooking | Italy | Cooking | 190 | 850 | 9 |
2 | 2 | Coconut_Oil | Cooking | India | Cooking | 175 | 700 | 7 |
3 | 3 | Canola_Oil | Cooking | Canada | Cooking | 205 | 650 | 8 |
4 | 4 | Almond_Oil | Cosmetic | USA | Skincare | 220 | 1200 | 6 |
5 | 5 | Argan_Oil | Cosmetic | Morocco | Haircare | 215 | 1400 | 7 |
6 | 6 | Sunflower_Oil | Cooking | Ukraine | Cooking | 225 | 600 | 8 |
7 | 7 | Sesame_Oil | Cooking | China | Cooking | 210 | 780 | 7 |
8 | 8 | Mustard_Oil | Cooking | India | Cooking | 200 | 500 | 6 |
9 | 9 | Avocado_Oil | Cooking | Mexico | Cooking | 250 | 1300 | 9 |
10 | 10 | Jojoba_Oil | Cosmetic | USA | Skincare | 260 | 1500 | 6 |
11 | 11 | Flaxseed_Oil | Cooking | Canada | Cooking | 190 | 850 | 7 |
12 | 12 | Palm_Oil | Cooking | Malaysia | Cooking | 230 | 550 | 5 |
13 | 13 | Grapeseed_Oil | Cooking | France | Cooking | 215 | 700 | 7 |
14 | 14 | Castor_Oil | Medicinal | India | Laxative | 250 | 400 | 5 |
15 | 15 | Neem_Oil | Medicinal | India | Pesticide | 210 | 300 | 4 |
16 | 16 | Tea_Tree_Oil | Medicinal | Australia | Antifungal | 190 | 1300 | 8 |
17 | 17 | Rice_Bran_Oil | Cooking | Japan | Cooking | 245 | 900 | 8 |
18 | 18 | Wheatgerm_Oil | Cosmetic | Germany | Haircare | 210 | 1100 | 7 |
19 | 19 | Hempseed_Oil | Cooking | USA | Cooking | 170 | 1050 | 7 |
20 | 20 | Safflower_Oil | Cooking | USA | Cooking | 225 | 650 | 6 |
Step 2: Display the Dataset — PROC PRINT
proc print data=Oils_Dataset label noobs;
title "Complete Oils Dataset";
run;
Output:
Complete Oils Dataset |
Oil_ID | Oil_Name | Category | Origin | Usage_Type | Smoke_Point | Price_Per_Litre | Health_Rating |
---|---|---|---|---|---|---|---|
1 | Olive_Oil | Cooking | Italy | Cooking | 190 | 850 | 9 |
2 | Coconut_Oil | Cooking | India | Cooking | 175 | 700 | 7 |
3 | Canola_Oil | Cooking | Canada | Cooking | 205 | 650 | 8 |
4 | Almond_Oil | Cosmetic | USA | Skincare | 220 | 1200 | 6 |
5 | Argan_Oil | Cosmetic | Morocco | Haircare | 215 | 1400 | 7 |
6 | Sunflower_Oil | Cooking | Ukraine | Cooking | 225 | 600 | 8 |
7 | Sesame_Oil | Cooking | China | Cooking | 210 | 780 | 7 |
8 | Mustard_Oil | Cooking | India | Cooking | 200 | 500 | 6 |
9 | Avocado_Oil | Cooking | Mexico | Cooking | 250 | 1300 | 9 |
10 | Jojoba_Oil | Cosmetic | USA | Skincare | 260 | 1500 | 6 |
11 | Flaxseed_Oil | Cooking | Canada | Cooking | 190 | 850 | 7 |
12 | Palm_Oil | Cooking | Malaysia | Cooking | 230 | 550 | 5 |
13 | Grapeseed_Oil | Cooking | France | Cooking | 215 | 700 | 7 |
14 | Castor_Oil | Medicinal | India | Laxative | 250 | 400 | 5 |
15 | Neem_Oil | Medicinal | India | Pesticide | 210 | 300 | 4 |
16 | Tea_Tree_Oil | Medicinal | Australia | Antifungal | 190 | 1300 | 8 |
17 | Rice_Bran_Oil | Cooking | Japan | Cooking | 245 | 900 | 8 |
18 | Wheatgerm_Oil | Cosmetic | Germany | Haircare | 210 | 1100 | 7 |
19 | Hempseed_Oil | Cooking | USA | Cooking | 170 | 1050 | 7 |
20 | Safflower_Oil | Cooking | USA | Cooking | 225 | 650 | 6 |
Step 3: Sort by Category & Price — PROC SORT
proc sort data=Oils_Dataset out=Sorted_Oils;
by Category descending Price_Per_Litre;
run;
proc print;run;
Output:
Obs | Oil_ID | Oil_Name | Category | Origin | Usage_Type | Smoke_Point | Price_Per_Litre | Health_Rating |
---|---|---|---|---|---|---|---|---|
1 | 9 | Avocado_Oil | Cooking | Mexico | Cooking | 250 | 1300 | 9 |
2 | 19 | Hempseed_Oil | Cooking | USA | Cooking | 170 | 1050 | 7 |
3 | 17 | Rice_Bran_Oil | Cooking | Japan | Cooking | 245 | 900 | 8 |
4 | 1 | Olive_Oil | Cooking | Italy | Cooking | 190 | 850 | 9 |
5 | 11 | Flaxseed_Oil | Cooking | Canada | Cooking | 190 | 850 | 7 |
6 | 7 | Sesame_Oil | Cooking | China | Cooking | 210 | 780 | 7 |
7 | 2 | Coconut_Oil | Cooking | India | Cooking | 175 | 700 | 7 |
8 | 13 | Grapeseed_Oil | Cooking | France | Cooking | 215 | 700 | 7 |
9 | 3 | Canola_Oil | Cooking | Canada | Cooking | 205 | 650 | 8 |
10 | 20 | Safflower_Oil | Cooking | USA | Cooking | 225 | 650 | 6 |
11 | 6 | Sunflower_Oil | Cooking | Ukraine | Cooking | 225 | 600 | 8 |
12 | 12 | Palm_Oil | Cooking | Malaysia | Cooking | 230 | 550 | 5 |
13 | 8 | Mustard_Oil | Cooking | India | Cooking | 200 | 500 | 6 |
14 | 10 | Jojoba_Oil | Cosmetic | USA | Skincare | 260 | 1500 | 6 |
15 | 5 | Argan_Oil | Cosmetic | Morocco | Haircare | 215 | 1400 | 7 |
16 | 4 | Almond_Oil | Cosmetic | USA | Skincare | 220 | 1200 | 6 |
17 | 18 | Wheatgerm_Oil | Cosmetic | Germany | Haircare | 210 | 1100 | 7 |
18 | 16 | Tea_Tree_Oil | Medicinal | Australia | Antifungal | 190 | 1300 | 8 |
19 | 14 | Castor_Oil | Medicinal | India | Laxative | 250 | 400 | 5 |
20 | 15 | Neem_Oil | Medicinal | India | Pesticide | 210 | 300 | 4 |
Step 4: Frequency Distribution — PROC FREQ
proc freq data=Oils_Dataset;
tables Category Origin Usage_Type / nocum;
title "Frequency of Oil Categories and Origins";
run;
Output:
Frequency of Oil Categories and Origins |
Category | Frequency | Percent |
---|---|---|
Cooking | 13 | 65.00 |
Cosmetic | 4 | 20.00 |
Medicinal | 3 | 15.00 |
Origin | Frequency | Percent |
---|---|---|
Australia | 1 | 5.00 |
Canada | 2 | 10.00 |
China | 1 | 5.00 |
France | 1 | 5.00 |
Germany | 1 | 5.00 |
India | 4 | 20.00 |
Italy | 1 | 5.00 |
Japan | 1 | 5.00 |
Malaysia | 1 | 5.00 |
Mexico | 1 | 5.00 |
Morocco | 1 | 5.00 |
USA | 4 | 20.00 |
Ukraine | 1 | 5.00 |
Usage_Type | Frequency | Percent |
---|---|---|
Antifungal | 1 | 5.00 |
Cooking | 13 | 65.00 |
Haircare | 2 | 10.00 |
Laxative | 1 | 5.00 |
Pesticide | 1 | 5.00 |
Skincare | 2 | 10.00 |
Step 5: Statistical Summary — PROC MEANS
proc means data=Oils_Dataset min max mean median std maxdec=2;
var Price_Per_Litre Health_Rating Smoke_Point;
title "Descriptive Statistics for Numerical Variables";
run;
Output:
Descriptive Statistics for Numerical Variables |
Variable | Minimum | Maximum | Mean | Median | Std Dev | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
Step 6: Distribution & Outliers — PROC UNIVARIATE
proc univariate data=Oils_Dataset;
var Price_Per_Litre;
histogram / normal;
inset mean std min max;
title "Univariate Analysis of Oil Price";
run;
Output:
Step 7:
Univariate Analysis of Oil Price |
Moments | |||
---|---|---|---|
N | 20 | Sum Weights | 20 |
Mean | 864 | Sum Observations | 17280 |
Std Deviation | 345.152903 | Variance | 119130.526 |
Skewness | 0.32520808 | Kurtosis | -0.8880889 |
Uncorrected SS | 17193400 | Corrected SS | 2263480 |
Coeff Variation | 39.9482526 | Std Error Mean | 77.1785353 |
Basic Statistical Measures | |||
---|---|---|---|
Location | Variability | ||
Mean | 864.0000 | Std Deviation | 345.15290 |
Median | 815.0000 | Variance | 119131 |
Mode | 650.0000 | Range | 1200 |
Interquartile Range | 525.00000 |
Note: The mode displayed is the smallest of 4 modes with a count of 2. |
Tests for Location: Mu0=0 | ||||
---|---|---|---|---|
Test | Statistic | p Value | ||
Student's t | t | 11.19482 | Pr > |t| | <.0001 |
Sign | M | 10 | Pr >= |M| | <.0001 |
Signed Rank | S | 105 | Pr >= |S| | <.0001 |
Quantiles (Definition 5) | |
---|---|
Level | Quantile |
100% Max | 1500 |
99% | 1500 |
95% | 1450 |
90% | 1350 |
75% Q3 | 1150 |
50% Median | 815 |
25% Q1 | 625 |
10% | 450 |
5% | 350 |
1% | 300 |
0% Min | 300 |
Extreme Observations | |||
---|---|---|---|
Lowest | Highest | ||
Value | Obs | Value | Obs |
300 | 15 | 1200 | 4 |
400 | 14 | 1300 | 9 |
500 | 8 | 1300 | 16 |
550 | 12 | 1400 | 5 |
600 | 6 | 1500 | 10 |
Univariate Analysis of Oil Price |
Parameters for Normal Distribution | ||
---|---|---|
Parameter | Symbol | Estimate |
Mean | Mu | 864 |
Std Dev | Sigma | 345.1529 |
Goodness-of-Fit Tests for Normal Distribution | ||||
---|---|---|---|---|
Test | Statistic | p Value | ||
Kolmogorov-Smirnov | D | 0.13266059 | Pr > D | >0.150 |
Cramer-von Mises | W-Sq | 0.05699813 | Pr > W-Sq | >0.250 |
Anderson-Darling | A-Sq | 0.32913203 | Pr > A-Sq | >0.250 |
Quantiles for Normal Distribution | ||
---|---|---|
Percent | Quantile | |
Observed | Estimated | |
1.0 | 300.000 | 61.0543 |
5.0 | 350.000 | 296.2740 |
10.0 | 450.000 | 421.6688 |
25.0 | 625.000 | 631.1979 |
50.0 | 815.000 | 864.0000 |
75.0 | 1150.000 | 1096.8021 |
90.0 | 1350.000 | 1306.3312 |
95.0 | 1450.000 | 1431.7260 |
99.0 | 1500.000 | 1666.9457 |
Report with Grouping — PROC REPORT
proc report data=Oils_Dataset nowd;
column Category Origin Price_Per_Litre Health_Rating;
define Category / group;
define Origin / group;
define Price_Per_Litre / analysis mean format=8.2;
define Health_Rating / analysis mean format=8.2;
title "Mean Price and Health Rating by Category and Origin";
run;
Output:
Mean Price and Health Rating by Category and Origin |
Category | Origin | Price_Per_Litre | Health_Rating |
---|---|---|---|
Cooking | Canada | 750.00 | 7.50 |
China | 780.00 | 7.00 | |
France | 700.00 | 7.00 | |
India | 600.00 | 6.50 | |
Italy | 850.00 | 9.00 | |
Japan | 900.00 | 8.00 | |
Malaysia | 550.00 | 5.00 | |
Mexico | 1300.00 | 9.00 | |
USA | 850.00 | 6.50 | |
Ukraine | 600.00 | 8.00 | |
Cosmetic | Germany | 1100.00 | 7.00 |
Morocco | 1400.00 | 7.00 | |
USA | 1350.00 | 6.00 | |
Medicinal | Australia | 1300.00 | 8.00 |
India | 350.00 | 4.50 |
Step 8: Plotting — PROC SGPLOT
proc sgplot data=Oils_Dataset;
vbar Category / response=Price_Per_Litre stat=mean;
title "Average Price per Litre by Oil Category";
run;
Log:
proc sgplot data=Oils_Dataset;
scatter x=Smoke_Point y=Health_Rating / group=Category;
title "Health Rating vs Smoke Point by Category";
run;
Output:
Step 9: SQL Querying — PROC SQL
proc sql;
select Oil_Name, Category, Origin, Price_Per_Litre
from Oils_Dataset
where Price_Per_Litre > 1000;
quit;
Output:
Oil_Name | Category | Origin | Price_Per_Litre |
---|---|---|---|
Almond_Oil | Cosmetic | USA | 1200 |
Argan_Oil | Cosmetic | Morocco | 1400 |
Avocado_Oil | Cooking | Mexico | 1300 |
Jojoba_Oil | Cosmetic | USA | 1500 |
Tea_Tree_Oil | Medicinal | Australia | 1300 |
Wheatgerm_Oil | Cosmetic | Germany | 1100 |
Hempseed_Oil | Cooking | USA | 1050 |
proc sql;
select Category, avg(Health_Rating) as Avg_Health_Rating
from Oils_Dataset
group by Category;
quit;
Output:
Category | Avg_Health_Rating |
---|---|
Cooking | 7.230769 |
Cosmetic | 6.5 |
Medicinal | 5.666667 |
Step 10: Macro Programming
Macro to Generate Summary for Any Variable
%macro summarize(var);
proc means data=Oils_Dataset mean std min max;
var &var;
title "Summary Statistics for &var";
run;
%mend summarize;
%summarize(Price_Per_Litre);
Output:
Summary Statistics for Price_Per_Litre |
Analysis Variable : Price_Per_Litre | |||
---|---|---|---|
Mean | Std Dev | Minimum | Maximum |
864.0000000 | 345.1529028 | 300.0000000 | 1500.00 |
%summarize(Smoke_Point);
Output:
Summary Statistics for Smoke_Point |
Analysis Variable : Smoke_Point | |||
---|---|---|---|
Mean | Std Dev | Minimum | Maximum |
214.2500000 | 24.8296830 | 170.0000000 | 260.0000000 |
Macro with PROC SQL inside
%macro filter_price(price);
proc sql;
select Oil_Name, Price_Per_Litre
from Oils_Dataset
where Price_Per_Litre > &price;
quit;
%mend filter_price;
%filter_price(1000);
Output:
Oil_Name | Price_Per_Litre |
---|---|
Almond_Oil | 1200 |
Argan_Oil | 1400 |
Avocado_Oil | 1300 |
Jojoba_Oil | 1500 |
Tea_Tree_Oil | 1300 |
Wheatgerm_Oil | 1100 |
Hempseed_Oil | 1050 |
Step 11: Transpose — PROC TRANSPOSE
Transpose Health Ratings
proc transpose data=Oils_Dataset out=Transposed_Health(drop=_NAME_);
id Oil_Name;
var Health_Rating;
run;
proc print data=Transposed_Health;
title "Transposed Health Ratings";
run;
Output:
Transposed Health Ratings |
Obs | Olive_Oil | Coconut_Oil | Canola_Oil | Almond_Oil | Argan_Oil | Sunflower_Oil | Sesame_Oil | Mustard_Oil | Avocado_Oil | Jojoba_Oil | Flaxseed_Oil | Palm_Oil | Grapeseed_Oil | Castor_Oil | Neem_Oil | Tea_Tree_Oil | Rice_Bran_Oil | Wheatgerm_Oil | Hempseed_Oil | Safflower_Oil |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 9 | 7 | 8 | 6 | 7 | 8 | 7 | 6 | 9 | 6 | 7 | 5 | 7 | 5 | 4 | 8 | 8 | 7 | 7 | 6 |
Step 12: Custom Format for Rating
proc format;
value healthfmt
1 - 4 = 'Poor'
5 - 6 = 'Moderate'
7 - 8 = 'Good'
9 - 10 = 'Excellent';
run;
proc freq data=Oils_Dataset;
tables Health_Rating;
format Health_Rating healthfmt.;
title "Formatted Health Rating Frequency";
run;
Output:
Formatted Health Rating Frequency |
Health_Rating | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
Poor | 1 | 5.00 | 1 | 5.00 |
Moderate | 6 | 30.00 | 7 | 35.00 |
Good | 11 | 55.00 | 18 | 90.00 |
Excellent | 2 | 10.00 | 20 | 100.00 |
- Get link
- X
- Other Apps
Comments
Post a Comment