198.ANALYSIS OF VEGETABLES DATASET USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SORT | PROC SQL | PROC FORMAT | PROC RANK | SAS MACROS FOR SMART REPORTING AND CATEGORY-WISE INSIGHTS
- Get link
- X
- Other Apps
ANALYSIS OF VEGETABLES DATASET USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SORT | PROC SQL | PROC FORMAT | PROC RANK | SAS MACROS FOR SMART REPORTING AND CATEGORY-WISE INSIGHTS
/*A unique dataset on different types of vegetables*/
Step 1: Create the Dataset (More than 15 Observations)
Creating a dataset called veggies containing various vegetables with Variables like:
Veg_ID: Unique vegetable ID
Veg_Name: Name of the vegetable
Category: Type (Leafy, Root, Cruciferous, Allium, etc.)
Color: Primary color of the vegetable
Price_per_kg: Market price per kilogram (₹)
Shelf_Life: Days vegetable stays fresh
Is_Organic: Yes/No
Calories: Calories per 100 grams
data veggies;
input Veg_ID Veg_Name :$12. Category :$12. Color $ Price_per_kg Shelf_Life Is_Organic $
Calories;
datalines;
1 Spinach Leafy Green 40 5 Yes 23
2 Carrot Root Orange 30 15 No 41
3 Broccoli Cruciferous Green 60 7 Yes 34
4 Garlic Allium White 150 30 No 149
5 Tomato Fruit Red 25 7 No 18
6 Beetroot Root Red 35 10 Yes 43
7 Cabbage Cruciferous Green 28 14 No 25
8 Onion Allium Purple 20 20 No 40
9 Kale Leafy Green 70 5 Yes 35
10 Radish Root White 25 7 No 16
11 BellPepper Fruit Red 80 12 Yes 20
12 Cauliflower Cruciferous White 50 10 No 25
13 Mint Leafy Green 10 3 Yes 44
14 SpringOnion Allium Green 30 5 No 32
15 SweetPotato Root Brown 45 20 Yes 86
16 Coriander Leafy Green 15 4 Yes 23
17 Brinjal Fruit Purple 35 7 No 25
18 Turnip Root White 33 9 Yes 28
;
run;
proc print;run;
Output:
Obs | Veg_ID | Veg_Name | Category | Color | Price_per_kg | Shelf_Life | Is_Organic | Calories |
---|---|---|---|---|---|---|---|---|
1 | 1 | Spinach | Leafy | Green | 40 | 5 | Yes | 23 |
2 | 2 | Carrot | Root | Orange | 30 | 15 | No | 41 |
3 | 3 | Broccoli | Cruciferous | Green | 60 | 7 | Yes | 34 |
4 | 4 | Garlic | Allium | White | 150 | 30 | No | 149 |
5 | 5 | Tomato | Fruit | Red | 25 | 7 | No | 18 |
6 | 6 | Beetroot | Root | Red | 35 | 10 | Yes | 43 |
7 | 7 | Cabbage | Cruciferous | Green | 28 | 14 | No | 25 |
8 | 8 | Onion | Allium | Purple | 20 | 20 | No | 40 |
9 | 9 | Kale | Leafy | Green | 70 | 5 | Yes | 35 |
10 | 10 | Radish | Root | White | 25 | 7 | No | 16 |
11 | 11 | BellPepper | Fruit | Red | 80 | 12 | Yes | 20 |
12 | 12 | Cauliflower | Cruciferous | White | 50 | 10 | No | 25 |
13 | 13 | Mint | Leafy | Green | 10 | 3 | Yes | 44 |
14 | 14 | SpringOnion | Allium | Green | 30 | 5 | No | 32 |
15 | 15 | SweetPotato | Root | Brown | 45 | 20 | Yes | 86 |
16 | 16 | Coriander | Leafy | Green | 15 | 4 | Yes | 23 |
17 | 17 | Brinjal | Fruit | Purple | 35 | 7 | No | 25 |
18 | 18 | Turnip | Root | White | 33 | 9 | Yes | 28 |
Step 2: Use Basic PROC STEPS
2.1 PROC PRINT: Display all data
proc print data=veggies;
title "Complete List of Vegetables";
run;
Output:
Complete List of Vegetables |
Obs | Veg_ID | Veg_Name | Category | Color | Price_per_kg | Shelf_Life | Is_Organic | Calories |
---|---|---|---|---|---|---|---|---|
1 | 1 | Spinach | Leafy | Green | 40 | 5 | Yes | 23 |
2 | 2 | Carrot | Root | Orange | 30 | 15 | No | 41 |
3 | 3 | Broccoli | Cruciferous | Green | 60 | 7 | Yes | 34 |
4 | 4 | Garlic | Allium | White | 150 | 30 | No | 149 |
5 | 5 | Tomato | Fruit | Red | 25 | 7 | No | 18 |
6 | 6 | Beetroot | Root | Red | 35 | 10 | Yes | 43 |
7 | 7 | Cabbage | Cruciferous | Green | 28 | 14 | No | 25 |
8 | 8 | Onion | Allium | Purple | 20 | 20 | No | 40 |
9 | 9 | Kale | Leafy | Green | 70 | 5 | Yes | 35 |
10 | 10 | Radish | Root | White | 25 | 7 | No | 16 |
11 | 11 | BellPepper | Fruit | Red | 80 | 12 | Yes | 20 |
12 | 12 | Cauliflower | Cruciferous | White | 50 | 10 | No | 25 |
13 | 13 | Mint | Leafy | Green | 10 | 3 | Yes | 44 |
14 | 14 | SpringOnion | Allium | Green | 30 | 5 | No | 32 |
15 | 15 | SweetPotato | Root | Brown | 45 | 20 | Yes | 86 |
16 | 16 | Coriander | Leafy | Green | 15 | 4 | Yes | 23 |
17 | 17 | Brinjal | Fruit | Purple | 35 | 7 | No | 25 |
18 | 18 | Turnip | Root | White | 33 | 9 | Yes | 28 |
2.2 PROC MEANS: Descriptive statistics for numeric variables
proc means data=veggies mean min max std;
var Price_per_kg Shelf_Life Calories;
title "Descriptive Statistics for Vegetables";
run;
Output:
Descriptive Statistics for Vegetables |
Variable | Mean | Minimum | Maximum | Std Dev | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
2.3 PROC FREQ: Frequency of Categories and Organic Status
proc freq data=veggies;
tables Category Is_Organic Color / nocum;
title "Category and Organic Classification Frequencies";
run;
Output:
Category and Organic Classification Frequencies |
Category | Frequency | Percent |
---|---|---|
Allium | 3 | 16.67 |
Cruciferous | 3 | 16.67 |
Fruit | 3 | 16.67 |
Leafy | 4 | 22.22 |
Root | 5 | 27.78 |
Is_Organic | Frequency | Percent |
---|---|---|
No | 9 | 50.00 |
Yes | 9 | 50.00 |
Color | Frequency | Percent |
---|---|---|
Brown | 1 | 5.56 |
Green | 7 | 38.89 |
Orange | 1 | 5.56 |
Purple | 2 | 11.11 |
Red | 3 | 16.67 |
White | 4 | 22.22 |
2.4 PROC SORT + PROC MEANS (by category)
proc sort data=veggies out=sorted_veggies;
by Category;
run;
proc means data=sorted_veggies mean maxdec=2;
class Category;
var Price_per_kg Calories;
title "Average Price and Calories by Category";
run;
Output:
Average Price and Calories by Category |
Category | N Obs | Variable | Mean | ||||
---|---|---|---|---|---|---|---|
Allium | 3 |
|
| ||||
Cruciferous | 3 |
|
| ||||
Fruit | 3 |
|
| ||||
Leafy | 4 |
|
| ||||
Root | 5 |
|
|
Step 3: PROC SQL – Powerful Query-Based Manipulation
3.1 Query: List all Organic Vegetables priced above ₹40
proc sql;
select Veg_Name, Category, Price_per_kg, Shelf_Life
from veggies
where Is_Organic='Yes' and Price_per_kg > 40;
quit;
Output:
Veg_Name | Category | Price_per_kg | Shelf_Life |
---|---|---|---|
Broccoli | Cruciferous | 60 | 7 |
Kale | Leafy | 70 | 5 |
BellPepper | Fruit | 80 | 12 |
SweetPotato | Root | 45 | 20 |
3.2 Query: Average calories per category
proc sql;
select Category, avg(Calories) as Avg_Calories
from veggies
group by Category;
quit;
Output:
Category | Avg_Calories |
---|---|
Allium | 73.66667 |
Cruciferous | 28 |
Fruit | 21 |
Leafy | 31.25 |
Root | 42.8 |
3.3 Query: Count of vegetables by color
proc sql;
select Color, count(*) as Veg_Count
from veggies
group by Color;
quit;
Output:
Color | Veg_Count |
---|---|
Brown | 1 |
Green | 7 |
Orange | 1 |
Purple | 2 |
Red | 3 |
White | 4 |
3.4 Query: Highest and Lowest Shelf Life
proc sql outobs=1;
select Veg_Name, Shelf_Life
from veggies
order by Shelf_Life desc;
quit;
Output:
Veg_Name | Shelf_Life |
---|---|
Garlic | 30 |
proc sql outobs=1;
select Veg_Name, Shelf_Life
from veggies
order by Shelf_Life;
quit;
Output:
Veg_Name | Shelf_Life |
---|---|
Mint | 3 |
Step 4: SAS Macros for Automation
%macro veg_summary(cat=);
proc print data=veggies;
where Category="&cat";
title "Vegetables in the &cat Category";
run;
proc means data=veggies mean;
where Category="&cat";
var Price_per_kg Calories Shelf_Life;
title "Summary Statistics for &cat Category";
run;
%mend;
%veg_summary(cat=Leafy)
Output:
Summary Statistics for Leafy Category |
Variable | Mean | ||||||
---|---|---|---|---|---|---|---|
|
|
%veg_summary(cat=Root)
Output:
Summary Statistics for Root Category |
Variable | Mean | ||||||
---|---|---|---|---|---|---|---|
|
|
%veg_summary(cat=Fruit)
Output:
Summary Statistics for Fruit Category |
Variable | Mean | ||||||
---|---|---|---|---|---|---|---|
|
|
Step 5: Additional Analyses
5.1 PROC RANK: Rank vegetables by calorie content
proc rank data=veggies out=ranked_veggies descending;
var Calories;
ranks Calorie_Rank;
run;
proc print data=ranked_veggies;
var Veg_Name Calories Calorie_Rank;
title "Ranking of Vegetables by Calories";
run;
Output:
Ranking of Vegetables by Calories |
Obs | Veg_Name | Calories | Calorie_Rank |
---|---|---|---|
1 | Spinach | 23 | 14.5 |
2 | Carrot | 41 | 5.0 |
3 | Broccoli | 34 | 8.0 |
4 | Garlic | 149 | 1.0 |
5 | Tomato | 18 | 17.0 |
6 | Beetroot | 43 | 4.0 |
7 | Cabbage | 25 | 12.0 |
8 | Onion | 40 | 6.0 |
9 | Kale | 35 | 7.0 |
10 | Radish | 16 | 18.0 |
11 | BellPepper | 20 | 16.0 |
12 | Cauliflower | 25 | 12.0 |
13 | Mint | 44 | 3.0 |
14 | SpringOnion | 32 | 9.0 |
15 | SweetPotato | 86 | 2.0 |
16 | Coriander | 23 | 14.5 |
17 | Brinjal | 25 | 12.0 |
18 | Turnip | 28 | 10.0 |
5.2 PROC FORMAT: Add readable formats for Is_Organic
proc format;
value $orgfmt
'Yes' = 'Organic'
'No' = 'Non-Organic';
run;
proc print data=veggies;
format Is_Organic $orgfmt.;
title "Vegetables with Organic Status Formatted";
run;
Output:
Vegetables with Organic Status Formatted |
Obs | Veg_ID | Veg_Name | Category | Color | Price_per_kg | Shelf_Life | Is_Organic | Calories |
---|---|---|---|---|---|---|---|---|
1 | 1 | Spinach | Leafy | Green | 40 | 5 | Organic | 23 |
2 | 2 | Carrot | Root | Orange | 30 | 15 | Non-Organic | 41 |
3 | 3 | Broccoli | Cruciferous | Green | 60 | 7 | Organic | 34 |
4 | 4 | Garlic | Allium | White | 150 | 30 | Non-Organic | 149 |
5 | 5 | Tomato | Fruit | Red | 25 | 7 | Non-Organic | 18 |
6 | 6 | Beetroot | Root | Red | 35 | 10 | Organic | 43 |
7 | 7 | Cabbage | Cruciferous | Green | 28 | 14 | Non-Organic | 25 |
8 | 8 | Onion | Allium | Purple | 20 | 20 | Non-Organic | 40 |
9 | 9 | Kale | Leafy | Green | 70 | 5 | Organic | 35 |
10 | 10 | Radish | Root | White | 25 | 7 | Non-Organic | 16 |
11 | 11 | BellPepper | Fruit | Red | 80 | 12 | Organic | 20 |
12 | 12 | Cauliflower | Cruciferous | White | 50 | 10 | Non-Organic | 25 |
13 | 13 | Mint | Leafy | Green | 10 | 3 | Organic | 44 |
14 | 14 | SpringOnion | Allium | Green | 30 | 5 | Non-Organic | 32 |
15 | 15 | SweetPotato | Root | Brown | 45 | 20 | Organic | 86 |
16 | 16 | Coriander | Leafy | Green | 15 | 4 | Organic | 23 |
17 | 17 | Brinjal | Fruit | Purple | 35 | 7 | Non-Organic | 25 |
18 | 18 | Turnip | Root | White | 33 | 9 | Organic | 28 |
- Get link
- X
- Other Apps
Comments
Post a Comment