CREATION OF THE DATASET OF FISHES : DATA | PROC MEANS | PROC SQL | PROC FREQ | PROC SORT | PROC REPORT | PROC UNIVARIATE | PROC PRINT | SAS MACROS
/*Creating The Dataset Of Fishes In India*/
1. SAS Dataset Creation
options nocenter;
data indian_fish;
length Fish_Name $30 Type $15 Scientific_Name $40 Region $30;
input Fish_Name $ Type $ Scientific_Name $ Region $ Max_Length Avg_Weight
Market_Price;
datalines;
Rawas Saltwater Polynemus_tetradactylum West_Coast 120 8 400
Rohu Freshwater Labeo_rohita Ganges_Basin 75 5 250
Catla Freshwater Catla_catla East_India 80 7 220
Bangda Saltwater Rastrelliger_kanagurta Western_Coast 35 0.25 200
Surmai Saltwater Scomberomorus_commerson South_India 100 10 500
Pomfret Saltwater Pampus_argenteus All_coasts 60 2 600
Hilsa Freshwater Tenualosa_ilisha East_India 50 1.2 700
Rani Freshwater Pristolepis_marginata Central_India 30 0.5 180
Flower_Prawn Saltwater Penaeus_semisulcatus Kerala 18 0.2 800
Mud_Crab Brackish Scylla_serrata South_India 20 1 900
;
run;
proc print data=indian_fish;
run;
Output:
| Obs | Fish_Name | Type | Scientific_Name | Region | Max_Length | Avg_Weight | Market_Price |
|---|---|---|---|---|---|---|---|
| 1 | Rawas | Saltwater | Polynemus_tetradactylum | West_Coast | 120 | 8.00 | 400 |
| 2 | Rohu | Freshwater | Labeo_rohita | Ganges_Basin | 75 | 5.00 | 250 |
| 3 | Catla | Freshwater | Catla_catla | East_India | 80 | 7.00 | 220 |
| 4 | Bangda | Saltwater | Rastrelliger_kanagurta | Western_Coast | 35 | 0.25 | 200 |
| 5 | Surmai | Saltwater | Scomberomorus_commerson | South_India | 100 | 10.00 | 500 |
| 6 | Pomfret | Saltwater | Pampus_argenteus | All_coasts | 60 | 2.00 | 600 |
| 7 | Hilsa | Freshwater | Tenualosa_ilisha | East_India | 50 | 1.20 | 700 |
| 8 | Rani | Freshwater | Pristolepis_marginata | Central_India | 30 | 0.50 | 180 |
| 9 | Flower_Prawn | Saltwater | Penaeus_semisulcatus | Kerala | 18 | 0.20 | 800 |
| 10 | Mud_Crab | Brackish | Scylla_serrata | South_India | 20 | 1.00 | 900 |
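Because the LENGTH statement sets the character widths before INPUT reads the data, it can be worth confirming that the variables were stored with the intended types and lengths. A minimal sketch, assuming the indian_fish dataset created above is still in the WORK library:

```sas
/* List variables in creation order with their types and lengths */
proc contents data=indian_fish varnum;
run;
```

The VARNUM option only changes the listing order (creation order instead of alphabetical); it does not alter the dataset.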
2. Descriptive Statistics using PROC MEANS
proc means data=indian_fish;
var Max_Length Avg_Weight Market_Price;
run;
Output:
The MEANS Procedure
| Variable | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---|---|---|---|
| Max_Length | 10 | 58.8000000 | 34.6403746 | 18.0000000 | 120.0000000 |
| Avg_Weight | 10 | 3.5150000 | 3.6703050 | 0.2000000 | 10.0000000 |
| Market_Price | 10 | 475.0000000 | 266.0513735 | 180.0000000 | 900.0000000 |
3. Data Filtering using PROC SQL
proc sql;
select Fish_Name, Type, Avg_Weight, Market_Price
from indian_fish
where Market_Price > 500;
quit;
Output:
| Fish_Name | Type | Avg_Weight | Market_Price |
|---|---|---|---|
| Pomfret | Saltwater | 2 | 600 |
| Hilsa | Freshwater | 1.2 | 700 |
| Flower_Prawn | Saltwater | 0.2 | 800 |
| Mud_Crab | Brackish | 1 | 900 |
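The filter above uses a strict inequality, so Surmai (priced at exactly 500) is left out, whereas the Value_Class flag in section 11 uses >= 500. If the intent is to match that flag, a variant along these lines could be used; the ORDER BY clause is an optional addition, not part of the original query:

```sas
proc sql;
  /* Inclusive threshold: also returns rows priced at exactly 500 */
  select Fish_Name, Type, Avg_Weight, Market_Price
  from indian_fish
  where Market_Price >= 500
  order by Market_Price desc;  /* highest price first */
quit;
```

With the data shown in section 1, this returns five rows instead of four.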
4. Frequency Distribution using PROC FREQ
proc freq data=indian_fish;
tables Type;
run;
Output:
The FREQ Procedure
| Type | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Brackish | 1 | 10.00 | 1 | 10.00 |
| Freshwater | 4 | 40.00 | 5 | 50.00 |
| Saltwater | 5 | 50.00 | 10 | 100.00 |
5. Macro for Regional Analysis
%macro region_fish(region);
proc sql;
select Fish_Name, Market_Price
from indian_fish
where Region = "&region";
quit;
%mend region_fish;
%region_fish(East_India);
Output:
| Fish_Name | Market_Price |
|---|---|
| Catla | 220 |
| Hilsa | 700 |
6. Sorting by Market Price
proc sort data=indian_fish out=sorted_fish;
by descending Market_Price;
run;
proc print data=sorted_fish;
run;
Output:
| Obs | Fish_Name | Type | Scientific_Name | Region | Max_Length | Avg_Weight | Market_Price |
|---|---|---|---|---|---|---|---|
| 1 | Mud_Crab | Brackish | Scylla_serrata | South_India | 20 | 1.00 | 900 |
| 2 | Flower_Prawn | Saltwater | Penaeus_semisulcatus | Kerala | 18 | 0.20 | 800 |
| 3 | Hilsa | Freshwater | Tenualosa_ilisha | East_India | 50 | 1.20 | 700 |
| 4 | Pomfret | Saltwater | Pampus_argenteus | All_coasts | 60 | 2.00 | 600 |
| 5 | Surmai | Saltwater | Scomberomorus_commerson | South_India | 100 | 10.00 | 500 |
| 6 | Rawas | Saltwater | Polynemus_tetradactylum | West_Coast | 120 | 8.00 | 400 |
| 7 | Rohu | Freshwater | Labeo_rohita | Ganges_Basin | 75 | 5.00 | 250 |
| 8 | Catla | Freshwater | Catla_catla | East_India | 80 | 7.00 | 220 |
| 9 | Bangda | Saltwater | Rastrelliger_kanagurta | Western_Coast | 35 | 0.25 | 200 |
| 10 | Rani | Freshwater | Pristolepis_marginata | Central_India | 30 | 0.50 | 180 |
7. Generating Report
proc report data=sorted_fish nowd;
columns Fish_Name Region Type Market_Price;
run;
Output:
| Fish_Name | Region | Type | Market_Price |
|---|---|---|---|
| Mud_Crab | South_India | Brackish | 900 |
| Flower_Prawn | Kerala | Saltwater | 800 |
| Hilsa | East_India | Freshwater | 700 |
| Pomfret | All_coasts | Saltwater | 600 |
| Surmai | South_India | Saltwater | 500 |
| Rawas | West_Coast | Saltwater | 400 |
| Rohu | Ganges_Basin | Freshwater | 250 |
| Catla | East_India | Freshwater | 220 |
| Bangda | Western_Coast | Saltwater | 200 |
| Rani | Central_India | Freshwater | 180 |
8. PROC UNIVARIATE for Distribution Study
proc univariate data=indian_fish;
var Avg_Weight;
run;
Output:
The UNIVARIATE Procedure
Variable: Avg_Weight
| Moments | |||
|---|---|---|---|
| N | 10 | Sum Weights | 10 |
| Mean | 3.515 | Sum Observations | 35.15 |
| Std Deviation | 3.67030501 | Variance | 13.4711389 |
| Skewness | 0.78123341 | Kurtosis | -1.0719786 |
| Uncorrected SS | 244.7925 | Corrected SS | 121.24025 |
| Coeff Variation | 104.41835 | Std Error Mean | 1.16065235 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 3.515000 | Std Deviation | 3.67031 |
| Median | 1.600000 | Variance | 13.47114 |
| Mode | . | Range | 9.80000 |
| Interquartile Range | 6.50000 | ||
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 3.028469 | Pr > |t| | 0.0143 |
| Sign | M | 5 | Pr >= |M| | 0.0020 |
| Signed Rank | S | 27.5 | Pr >= |S| | 0.0020 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 10.000 |
| 99% | 10.000 |
| 95% | 10.000 |
| 90% | 9.000 |
| 75% Q3 | 7.000 |
| 50% Median | 1.600 |
| 25% Q1 | 0.500 |
| 10% | 0.225 |
| 5% | 0.200 |
| 1% | 0.200 |
| 0% Min | 0.200 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 0.20 | 9 | 2 | 6 |
| 0.25 | 4 | 5 | 2 |
| 0.50 | 8 | 7 | 3 |
| 1.00 | 10 | 8 | 1 |
| 1.20 | 7 | 10 | 5 |
9. Using PROC PRINT for Quick Data View
proc print data=indian_fish;
where Type = 'Freshwater';
run;
Output:
| Obs | Fish_Name | Type | Scientific_Name | Region | Max_Length | Avg_Weight | Market_Price |
|---|---|---|---|---|---|---|---|
| 2 | Rohu | Freshwater | Labeo_rohita | Ganges_Basin | 75 | 5.0 | 250 |
| 3 | Catla | Freshwater | Catla_catla | East_India | 80 | 7.0 | 220 |
| 7 | Hilsa | Freshwater | Tenualosa_ilisha | East_India | 50 | 1.2 | 700 |
| 8 | Rani | Freshwater | Pristolepis_marginata | Central_India | 30 | 0.5 | 180 |
10. Macro for Automated Summaries Across Regions
%macro summary_by_region;
proc sql noprint;
select distinct Region into :regions separated by ' '
from indian_fish;
quit;
%let num = %sysfunc(countw(&regions));
%do i=1 %to &num;
%let r=%scan(&regions, &i);
proc means data=indian_fish n min max mean;
where Region="&r";
var Max_Length Market_Price;
title "Summary Statistics for &r";
run;
%end;
%mend summary_by_region;
%summary_by_region
Output:
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 1 | 60.0000000 | 60.0000000 | 60.0000000 |
| Market_Price | 1 | 600.0000000 | 600.0000000 | 600.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 1 | 30.0000000 | 30.0000000 | 30.0000000 |
| Market_Price | 1 | 180.0000000 | 180.0000000 | 180.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 2 | 50.0000000 | 80.0000000 | 65.0000000 |
| Market_Price | 2 | 220.0000000 | 700.0000000 | 460.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 1 | 75.0000000 | 75.0000000 | 75.0000000 |
| Market_Price | 1 | 250.0000000 | 250.0000000 | 250.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 1 | 18.0000000 | 18.0000000 | 18.0000000 |
| Market_Price | 1 | 800.0000000 | 800.0000000 | 800.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 2 | 20.0000000 | 100.0000000 | 60.0000000 |
| Market_Price | 2 | 500.0000000 | 900.0000000 | 700.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 1 | 120.0000000 | 120.0000000 | 120.0000000 |
| Market_Price | 1 | 400.0000000 | 400.0000000 | 400.0000000 |
The MEANS Procedure
| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Max_Length | 1 | 35.0000000 | 35.0000000 | 35.0000000 |
| Market_Price | 1 | 200.0000000 | 200.0000000 | 200.0000000 |
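The macro loop runs PROC MEANS once per region, which yields one table per region. If a single combined table is acceptable, the same statistics can probably be obtained without a macro by using a CLASS statement. This is an alternative sketch, not the approach used in the macro above:

```sas
title;  /* clear the title left over from the last macro iteration */

proc means data=indian_fish n min max mean;
  class Region;                  /* one group of rows per region */
  var Max_Length Market_Price;
run;
```

The macro version remains useful when each region needs its own title or a separate output destination.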
11. Creating New Variables with DATA Step
data fish_value;
set indian_fish;
Value_Class = (Market_Price >= 500);
run;
proc print data=fish_value;
run;
Output:
| Obs | Fish_Name | Type | Scientific_Name | Region | Max_Length | Avg_Weight | Market_Price | Value_Class |
|---|---|---|---|---|---|---|---|---|
| 1 | Rawas | Saltwater | Polynemus_tetradactylum | West_Coast | 120 | 8.00 | 400 | 0 |
| 2 | Rohu | Freshwater | Labeo_rohita | Ganges_Basin | 75 | 5.00 | 250 | 0 |
| 3 | Catla | Freshwater | Catla_catla | East_India | 80 | 7.00 | 220 | 0 |
| 4 | Bangda | Saltwater | Rastrelliger_kanagurta | Western_Coast | 35 | 0.25 | 200 | 0 |
| 5 | Surmai | Saltwater | Scomberomorus_commerson | South_India | 100 | 10.00 | 500 | 1 |
| 6 | Pomfret | Saltwater | Pampus_argenteus | All_coasts | 60 | 2.00 | 600 | 1 |
| 7 | Hilsa | Freshwater | Tenualosa_ilisha | East_India | 50 | 1.20 | 700 | 1 |
| 8 | Rani | Freshwater | Pristolepis_marginata | Central_India | 30 | 0.50 | 180 | 0 |
| 9 | Flower_Prawn | Saltwater | Penaeus_semisulcatus | Kerala | 18 | 0.20 | 800 | 1 |
| 10 | Mud_Crab | Brackish | Scylla_serrata | South_India | 20 | 1.00 | 900 | 1 |
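The comparison (Market_Price >= 500) evaluates to 1 when true and 0 when false, which is why Value_Class holds only those two values. To make the flag easier to read in printed output, a user-defined format can be attached. In the sketch below, the format name valfmt and the labels 'Standard' and 'Premium' are illustrative assumptions, not names from the original post:

```sas
proc format;
  /* Hypothetical labels for the 0/1 flag */
  value valfmt 0 = 'Standard'
               1 = 'Premium';
run;

proc print data=fish_value;
  var Fish_Name Market_Price Value_Class;
  format Value_Class valfmt.;   /* display labels; stored values stay 0/1 */
run;
```

The format affects only how the values are displayed, so later steps can still filter on Value_Class = 1.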