196.ANALYSIS OF INDIAN MANGO VARIETIES USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SORT | PROC REPORT | MACROS IN SAS | DATA MANAGEMENT INSIGHTS | CHARACTER LENGTH HANDLING
ANALYSIS OF INDIAN MANGO VARIETIES USING PROC PRINT | PROC MEANS | PROC FREQ | PROC SQL | PROC SORT | PROC REPORT | MACROS IN SAS | DATA MANAGEMENT INSIGHTS | CHARACTER LENGTH HANDLING
/*A unique dataset project on different types of mangoes*/
Step 1: Create Mangoes Dataset
data mango_varieties;
length MangoName $20 OriginState $15 HarvestSeason $10 SweetnessLevel $10;
input MangoName $ OriginState $ HarvestSeason $ AvgWeight_grams SweetnessLevel $
PricePerKg Rating;
datalines;
Alphonso Maharashtra Summer 300 VeryHigh 300 9.5
Dasheri UttarPradesh Summer 250 High 150 8.8
Langra Bihar Summer 280 High 140 8.6
Banganapalli AndhraPradesh Summer 350 Medium 120 8.2
Kesar Gujarat Summer 260 VeryHigh 250 9.2
Himsagar WestBengal Summer 240 High 180 8.9
Totapuri Karnataka Summer 400 Low 100 7.5
Neelum TamilNadu Summer 220 Medium 130 7.9
Malda Bihar Summer 310 High 160 8.7
Raspuri Karnataka Summer 270 Medium 140 8.3
Chausa UttarPradesh Summer 290 High 155 8.4
Badami Karnataka Summer 300 Medium 135 8.1
Mankurad Goa Summer 320 High 270 9.0
Mallika PanIndia Summer 350 VeryHigh 240 9.1
Sindhura TamilNadu Summer 280 Medium 125 7.8
Suvernarekha AndhraPradesh Summer 260 Medium 110 7.6
Zardalu Bihar Summer 230 High 150 8.0
ImamPasand TamilNadu Summer 330 VeryHigh 280 9.3
Rajapuri Gujarat Summer 370 Low 90 7.2
Amrapali PanIndia Summer 270 Medium 145 8.5
;
run;
proc print;run;
Output:
Obs | MangoName | OriginState | HarvestSeason | SweetnessLevel | AvgWeight_grams | PricePerKg | Rating |
---|---|---|---|---|---|---|---|
1 | Alphonso | Maharashtra | Summer | VeryHigh | 300 | 300 | 9.5 |
2 | Dasheri | UttarPradesh | Summer | High | 250 | 150 | 8.8 |
3 | Langra | Bihar | Summer | High | 280 | 140 | 8.6 |
4 | Banganapalli | AndhraPradesh | Summer | Medium | 350 | 120 | 8.2 |
5 | Kesar | Gujarat | Summer | VeryHigh | 260 | 250 | 9.2 |
6 | Himsagar | WestBengal | Summer | High | 240 | 180 | 8.9 |
7 | Totapuri | Karnataka | Summer | Low | 400 | 100 | 7.5 |
8 | Neelum | TamilNadu | Summer | Medium | 220 | 130 | 7.9 |
9 | Malda | Bihar | Summer | High | 310 | 160 | 8.7 |
10 | Raspuri | Karnataka | Summer | Medium | 270 | 140 | 8.3 |
11 | Chausa | UttarPradesh | Summer | High | 290 | 155 | 8.4 |
12 | Badami | Karnataka | Summer | Medium | 300 | 135 | 8.1 |
13 | Mankurad | Goa | Summer | High | 320 | 270 | 9.0 |
14 | Mallika | PanIndia | Summer | VeryHigh | 350 | 240 | 9.1 |
15 | Sindhura | TamilNadu | Summer | Medium | 280 | 125 | 7.8 |
16 | Suvernarekha | AndhraPradesh | Summer | Medium | 260 | 110 | 7.6 |
17 | Zardalu | Bihar | Summer | High | 230 | 150 | 8.0 |
18 | ImamPasand | TamilNadu | Summer | VeryHigh | 330 | 280 | 9.3 |
19 | Rajapuri | Gujarat | Summer | Low | 370 | 90 | 7.2 |
20 | Amrapali | PanIndia | Summer | Medium | 270 | 145 | 8.5 |
Step 2: View and Explore the Dataset
proc print data=mango_varieties;
title "List of Indian Mango Varieties";
run;
Output:
List of Indian Mango Varieties |
Obs | MangoName | OriginState | HarvestSeason | SweetnessLevel | AvgWeight_grams | PricePerKg | Rating |
---|---|---|---|---|---|---|---|
1 | Alphonso | Maharashtra | Summer | VeryHigh | 300 | 300 | 9.5 |
2 | Dasheri | UttarPradesh | Summer | High | 250 | 150 | 8.8 |
3 | Langra | Bihar | Summer | High | 280 | 140 | 8.6 |
4 | Banganapalli | AndhraPradesh | Summer | Medium | 350 | 120 | 8.2 |
5 | Kesar | Gujarat | Summer | VeryHigh | 260 | 250 | 9.2 |
6 | Himsagar | WestBengal | Summer | High | 240 | 180 | 8.9 |
7 | Totapuri | Karnataka | Summer | Low | 400 | 100 | 7.5 |
8 | Neelum | TamilNadu | Summer | Medium | 220 | 130 | 7.9 |
9 | Malda | Bihar | Summer | High | 310 | 160 | 8.7 |
10 | Raspuri | Karnataka | Summer | Medium | 270 | 140 | 8.3 |
11 | Chausa | UttarPradesh | Summer | High | 290 | 155 | 8.4 |
12 | Badami | Karnataka | Summer | Medium | 300 | 135 | 8.1 |
13 | Mankurad | Goa | Summer | High | 320 | 270 | 9.0 |
14 | Mallika | PanIndia | Summer | VeryHigh | 350 | 240 | 9.1 |
15 | Sindhura | TamilNadu | Summer | Medium | 280 | 125 | 7.8 |
16 | Suvernarekha | AndhraPradesh | Summer | Medium | 260 | 110 | 7.6 |
17 | Zardalu | Bihar | Summer | High | 230 | 150 | 8.0 |
18 | ImamPasand | TamilNadu | Summer | VeryHigh | 330 | 280 | 9.3 |
19 | Rajapuri | Gujarat | Summer | Low | 370 | 90 | 7.2 |
20 | Amrapali | PanIndia | Summer | Medium | 270 | 145 | 8.5 |
proc contents data=mango_varieties;
run;
Output:
Data Set Name | WORK.MANGO_VARIETIES | Observations | 20 |
---|---|---|---|
Member Type | DATA | Variables | 7 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:03:34 | Observation Length | 80 |
Last Modified | 14/09/2015 00:03:34 | Deleted Observations | 0 |
Protection | Compressed | NO | |
Data Set Type | Sorted | NO | |
Label | |||
Data Representation | WINDOWS_64 | ||
Encoding | wlatin1 Western (Windows) |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 817 |
Obs in First Data Page | 20 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD8000_DESKTOP-QFAA4KV_\mango_varieties.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | |||
---|---|---|---|
# | Variable | Type | Len |
5 | AvgWeight_grams | Num | 8 |
3 | HarvestSeason | Char | 10 |
1 | MangoName | Char | 20 |
2 | OriginState | Char | 15 |
6 | PricePerKg | Num | 8 |
7 | Rating | Num | 8 |
4 | SweetnessLevel | Char | 10 |
Step 3: Summary Statistics Using PROC MEANS
proc means data=mango_varieties mean median maxdec=2;
var AvgWeight_grams PricePerKg Rating;
title "Summary Statistics for Weight, Price, and Rating";
run;
Output:
Summary Statistics for Weight, Price, and Rating |
Variable | Mean | Median | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
Step 4: Frequency Analysis Using PROC FREQ
proc freq data=mango_varieties;
tables OriginState SweetnessLevel;
title "Frequency of Mango Origin States and Sweetness Levels";
run;
Output:
Frequency of Mango Origin States and Sweetness Levels |
OriginState | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
AndhraPradesh | 2 | 10.00 | 2 | 10.00 |
Bihar | 3 | 15.00 | 5 | 25.00 |
Goa | 1 | 5.00 | 6 | 30.00 |
Gujarat | 2 | 10.00 | 8 | 40.00 |
Karnataka | 3 | 15.00 | 11 | 55.00 |
Maharashtra | 1 | 5.00 | 12 | 60.00 |
PanIndia | 2 | 10.00 | 14 | 70.00 |
TamilNadu | 3 | 15.00 | 17 | 85.00 |
UttarPradesh | 2 | 10.00 | 19 | 95.00 |
WestBengal | 1 | 5.00 | 20 | 100.00 |
SweetnessLevel | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
High | 7 | 35.00 | 7 | 35.00 |
Low | 2 | 10.00 | 9 | 45.00 |
Medium | 7 | 35.00 | 16 | 80.00 |
VeryHigh | 4 | 20.00 | 20 | 100.00 |
Step 5: Sorting Data by Rating and Price
proc sort data=mango_varieties out=sorted_mangoes;
by descending Rating descending PricePerKg;
run;
proc print data=sorted_mangoes;
title "Mangoes Sorted by Rating and Price";
run;
Output:
Mangoes Sorted by Rating and Price |
Obs | MangoName | OriginState | HarvestSeason | SweetnessLevel | AvgWeight_grams | PricePerKg | Rating |
---|---|---|---|---|---|---|---|
1 | Alphonso | Maharashtra | Summer | VeryHigh | 300 | 300 | 9.5 |
2 | ImamPasand | TamilNadu | Summer | VeryHigh | 330 | 280 | 9.3 |
3 | Kesar | Gujarat | Summer | VeryHigh | 260 | 250 | 9.2 |
4 | Mallika | PanIndia | Summer | VeryHigh | 350 | 240 | 9.1 |
5 | Mankurad | Goa | Summer | High | 320 | 270 | 9.0 |
6 | Himsagar | WestBengal | Summer | High | 240 | 180 | 8.9 |
7 | Dasheri | UttarPradesh | Summer | High | 250 | 150 | 8.8 |
8 | Malda | Bihar | Summer | High | 310 | 160 | 8.7 |
9 | Langra | Bihar | Summer | High | 280 | 140 | 8.6 |
10 | Amrapali | PanIndia | Summer | Medium | 270 | 145 | 8.5 |
11 | Chausa | UttarPradesh | Summer | High | 290 | 155 | 8.4 |
12 | Raspuri | Karnataka | Summer | Medium | 270 | 140 | 8.3 |
13 | Banganapalli | AndhraPradesh | Summer | Medium | 350 | 120 | 8.2 |
14 | Badami | Karnataka | Summer | Medium | 300 | 135 | 8.1 |
15 | Zardalu | Bihar | Summer | High | 230 | 150 | 8.0 |
16 | Neelum | TamilNadu | Summer | Medium | 220 | 130 | 7.9 |
17 | Sindhura | TamilNadu | Summer | Medium | 280 | 125 | 7.8 |
18 | Suvernarekha | AndhraPradesh | Summer | Medium | 260 | 110 | 7.6 |
19 | Totapuri | Karnataka | Summer | Low | 400 | 100 | 7.5 |
20 | Rajapuri | Gujarat | Summer | Low | 370 | 90 | 7.2 |
Step 6: Grouping with PROC SQL
proc sql;
select OriginState,
count(*) as MangoCount,
avg(PricePerKg) as AvgPrice format=8.2,
avg(Rating) as AvgRating format=4.2
from mango_varieties
group by OriginState;
quit;
Output:
OriginState | MangoCount | AvgPrice | AvgRating |
---|---|---|---|
AndhraPradesh | 2 | 115.00 | 7.90 |
Bihar | 3 | 150.00 | 8.43 |
Goa | 1 | 270.00 | 9.00 |
Gujarat | 2 | 170.00 | 8.20 |
Karnataka | 3 | 125.00 | 7.97 |
Maharashtra | 1 | 300.00 | 9.50 |
PanIndia | 2 | 192.50 | 8.80 |
TamilNadu | 3 | 178.33 | 8.33 |
UttarPradesh | 2 | 152.50 | 8.60 |
WestBengal | 1 | 180.00 | 8.90 |
Step 7: Create a Macro for Regional Mango Analysis
%macro region_analysis(state);
proc sql;
select MangoName, PricePerKg, Rating
from mango_varieties
where OriginState="&state"
order by Rating desc;
quit;
%mend;
%region_analysis(Karnataka);
Output:
MangoName | PricePerKg | Rating |
---|---|---|
Raspuri | 140 | 8.3 |
Badami | 135 | 8.1 |
Totapuri | 100 | 7.5 |
%region_analysis(TamilNadu);
Output:
MangoName | PricePerKg | Rating |
---|---|---|
ImamPasand | 280 | 9.3 |
Neelum | 130 | 7.9 |
Sindhura | 125 | 7.8 |
%region_analysis(Bihar);
Output:
MangoName | PricePerKg | Rating |
---|---|---|
Malda | 160 | 8.7 |
Langra | 140 | 8.6 |
Zardalu | 150 | 8 |
Step 8: Custom Report Using PROC REPORT
proc report data=mango_varieties nowd;
column MangoName OriginState AvgWeight_grams PricePerKg Rating;
define MangoName / display "Mango Variety";
define OriginState / group "State";
define AvgWeight_grams / analysis mean "Avg Weight (g)";
define PricePerKg / analysis mean "Price (INR/Kg)";
define Rating / analysis mean "Avg Rating";
title "Custom Report of Mango Varieties by State";
run;
Output:
Custom Report of Mango Varieties by State |
Mango Variety | State | Avg Weight (g) | Price (INR Kg) |
Avg Rating |
---|---|---|---|---|
Banganapalli | AndhraPradesh | 350 | 120 | 8.2 |
Suvernarekha | 260 | 110 | 7.6 | |
Langra | Bihar | 280 | 140 | 8.6 |
Malda | 310 | 160 | 8.7 | |
Zardalu | 230 | 150 | 8 | |
Mankurad | Goa | 320 | 270 | 9 |
Kesar | Gujarat | 260 | 250 | 9.2 |
Rajapuri | 370 | 90 | 7.2 | |
Totapuri | Karnataka | 400 | 100 | 7.5 |
Raspuri | 270 | 140 | 8.3 | |
Badami | 300 | 135 | 8.1 | |
Alphonso | Maharashtra | 300 | 300 | 9.5 |
Mallika | PanIndia | 350 | 240 | 9.1 |
Amrapali | 270 | 145 | 8.5 | |
Neelum | TamilNadu | 220 | 130 | 7.9 |
Sindhura | 280 | 125 | 7.8 | |
ImamPasand | 330 | 280 | 9.3 | |
Dasheri | UttarPradesh | 250 | 150 | 8.8 |
Chausa | 290 | 155 | 8.4 | |
Himsagar | WestBengal | 240 | 180 | 8.9 |
Step 9: Create a New Variable - Price Category
data mango_price_category;
set mango_varieties;
length PriceCategory $10.;
if PricePerKg >= 250 then PriceCategory = "Premium";
else if PricePerKg >= 150 then PriceCategory = "MidRange";
else PriceCategory = "Budget";
run;
proc print;run;
Output:
Obs | MangoName | OriginState | HarvestSeason | SweetnessLevel | AvgWeight_grams | PricePerKg | Rating | PriceCategory |
---|---|---|---|---|---|---|---|---|
1 | Alphonso | Maharashtra | Summer | VeryHigh | 300 | 300 | 9.5 | Premium |
2 | Dasheri | UttarPradesh | Summer | High | 250 | 150 | 8.8 | MidRange |
3 | Langra | Bihar | Summer | High | 280 | 140 | 8.6 | Budget |
4 | Banganapalli | AndhraPradesh | Summer | Medium | 350 | 120 | 8.2 | Budget |
5 | Kesar | Gujarat | Summer | VeryHigh | 260 | 250 | 9.2 | Premium |
6 | Himsagar | WestBengal | Summer | High | 240 | 180 | 8.9 | MidRange |
7 | Totapuri | Karnataka | Summer | Low | 400 | 100 | 7.5 | Budget |
8 | Neelum | TamilNadu | Summer | Medium | 220 | 130 | 7.9 | Budget |
9 | Malda | Bihar | Summer | High | 310 | 160 | 8.7 | MidRange |
10 | Raspuri | Karnataka | Summer | Medium | 270 | 140 | 8.3 | Budget |
11 | Chausa | UttarPradesh | Summer | High | 290 | 155 | 8.4 | MidRange |
12 | Badami | Karnataka | Summer | Medium | 300 | 135 | 8.1 | Budget |
13 | Mankurad | Goa | Summer | High | 320 | 270 | 9.0 | Premium |
14 | Mallika | PanIndia | Summer | VeryHigh | 350 | 240 | 9.1 | MidRange |
15 | Sindhura | TamilNadu | Summer | Medium | 280 | 125 | 7.8 | Budget |
16 | Suvernarekha | AndhraPradesh | Summer | Medium | 260 | 110 | 7.6 | Budget |
17 | Zardalu | Bihar | Summer | High | 230 | 150 | 8.0 | MidRange |
18 | ImamPasand | TamilNadu | Summer | VeryHigh | 330 | 280 | 9.3 | Premium |
19 | Rajapuri | Gujarat | Summer | Low | 370 | 90 | 7.2 | Budget |
20 | Amrapali | PanIndia | Summer | Medium | 270 | 145 | 8.5 | Budget |
proc freq data=mango_price_category;
tables PriceCategory;
title "Frequency of Mangoes by Price Category";
run;
Output:
Frequency of Mangoes by Price Category |
PriceCategory | Frequency | Percent | Cumulative Frequency |
Cumulative Percent |
---|---|---|---|---|
Budget | 10 | 50.00 | 10 | 50.00 |
MidRange | 6 | 30.00 | 16 | 80.00 |
Premium | 4 | 20.00 | 20 | 100.00 |
Step 10: Macro to Compare Any Two Mango Varieties
%macro compare_mango(mango1, mango2);
proc sql;
select MangoName, OriginState, AvgWeight_grams, PricePerKg, Rating
from mango_varieties
where MangoName in ("&mango1", "&mango2");
quit;
%mend;
%compare_mango(Alphonso, Kesar);
Output:
MangoName | OriginState | AvgWeight_grams | PricePerKg | Rating |
---|---|---|---|---|
Alphonso | Maharashtra | 300 | 300 | 9.5 |
Kesar | Gujarat | 260 | 250 | 9.2 |
%compare_mango(Langra, Dasheri);
Output:
MangoName | OriginState | AvgWeight_grams | PricePerKg | Rating |
---|---|---|---|---|
Dasheri | UttarPradesh | 250 | 150 | 8.8 |
Langra | Bihar | 280 | 140 | 8.6 |
Step 11: Filter and Export Selected Data
data high_rated_mangoes;
set mango_varieties;
if Rating >= 9.0;
run;
proc print data=high_rated_mangoes;
title "Mangoes with Rating 9.0 and Above";
run;
Output:
Mangoes with Rating 9.0 and Above |
Obs | MangoName | OriginState | HarvestSeason | SweetnessLevel | AvgWeight_grams | PricePerKg | Rating |
---|---|---|---|---|---|---|---|
1 | Alphonso | Maharashtra | Summer | VeryHigh | 300 | 300 | 9.5 |
2 | Kesar | Gujarat | Summer | VeryHigh | 260 | 250 | 9.2 |
3 | Mankurad | Goa | Summer | High | 320 | 270 | 9.0 |
4 | Mallika | PanIndia | Summer | VeryHigh | 350 | 240 | 9.1 |
5 | ImamPasand | TamilNadu | Summer | VeryHigh | 330 | 280 | 9.3 |
proc export data=high_rated_mangoes
outfile="HighRatedMangoes.csv"
dbms=csv replace;
run;
Comments
Post a Comment