158.COMPREHENSIVE ANALYSIS OF BIRD SPECIES OBSERVATIONS IN INDIA USING SAS PROCEDURES: PROC CONTENTS | PROC PRINT | PROC SORT | PROC SQL | PROC MEANS | PROC FREQ | PROC RANK | PROC CORR | PROC SGPLOT | PROC EXPORT
/*Creating a unique dataset "Bird Species Observations in India".*/
1. Dataset Creation: Bird Species Observations in India
Creating a dataset named bird_observations containing the following variables:
Species_Name (Character): Name of the bird species.
Location (Character): Location of the sighting.
Observation_Date (Date): Date of the observation.
Observer_Name (Character): Name of the person who observed.
Count (Numeric): Number of birds observed.
Weather_Condition (Character): Weather during observation.
data bird_observations;
infile datalines dlm=',' dsd;
length Species_Name $50 Location $50 Observer_Name $50 Weather_Condition $20;
input Species_Name $ Location $ Observation_Date :date9. Observer_Name $ Count Weather_Condition $;
format Observation_Date date9.;
datalines;
"Indian Peafowl","Hyderabad",15APR2025,"Arjun Rao",5,"Sunny"
"House Sparrow","Delhi",16APR2025,"Meera Singh",12,"Cloudy"
"Common Myna","Mumbai",14APR2025,"Rahul Verma",8,"Rainy"
"Rose-ringed Parakeet","Chennai",13APR2025,"Lakshmi Nair",15,"Sunny"
"Black Kite","Kolkata",12APR2025,"Anil Kapoor",3,"Windy"
"Indian Peafowl","Jaipur",11APR2025,"Sneha Roy",7,"Sunny"
"House Sparrow","Bangalore",10APR2025,"Vikram Das",10,"Cloudy"
"Common Myna","Pune",09APR2025,"Priya Menon",6,"Rainy"
"Rose-ringed Parakeet","Ahmedabad",08APR2025,"Ravi Shankar",9,"Sunny"
"Black Kite","Lucknow",07APR2025,"Neha Gupta",4,"Windy"
;
run;
proc contents data=bird_observations;
run;
Output:
Data Set Name | WORK.BIRD_OBSERVATIONS | Observations | 10 |
---|---|---|---|
Member Type | DATA | Variables | 6 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:02:35 | Observation Length | 192 |
Last Modified | 14/09/2015 00:02:35 | Deleted Observations | 0 |
Protection | | Compressed | NO |
Data Set Type | | Sorted | NO |
Label | | | |
Data Representation | WINDOWS_64 | | |
Encoding | wlatin1 Western (Windows) | | |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 340 |
Obs in First Data Page | 10 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD7088_DESKTOP-QFAA4KV_\bird_observations.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | ||||
---|---|---|---|---|
# | Variable | Type | Len | Format |
6 | Count | Num | 8 | |
2 | Location | Char | 50 | |
5 | Observation_Date | Num | 8 | DATE9. |
3 | Observer_Name | Char | 50 | |
1 | Species_Name | Char | 50 | |
4 | Weather_Condition | Char | 20 | |
proc print data=bird_observations;
run;
Output:
Obs | Species_Name | Location | Observer_Name | Weather_Condition | Observation_Date | Count |
---|---|---|---|---|---|---|
1 | Indian Peafowl | Hyderabad | Arjun Rao | Sunny | 15APR2025 | 5 |
2 | House Sparrow | Delhi | Meera Singh | Cloudy | 16APR2025 | 12 |
3 | Common Myna | Mumbai | Rahul Verma | Rainy | 14APR2025 | 8 |
4 | Rose-ringed Parakeet | Chennai | Lakshmi Nair | Sunny | 13APR2025 | 15 |
5 | Black Kite | Kolkata | Anil Kapoor | Windy | 12APR2025 | 3 |
6 | Indian Peafowl | Jaipur | Sneha Roy | Sunny | 11APR2025 | 7 |
7 | House Sparrow | Bangalore | Vikram Das | Cloudy | 10APR2025 | 10 |
8 | Common Myna | Pune | Priya Menon | Rainy | 09APR2025 | 6 |
9 | Rose-ringed Parakeet | Ahmedabad | Ravi Shankar | Sunny | 08APR2025 | 9 |
10 | Black Kite | Lucknow | Neha Gupta | Windy | 07APR2025 | 4 |
2. Data Cleaning and Transformation
/*a. Standardizing Species Names*/
/*Ensure that species names are in proper case.*/
data bird_observations_clean;
set bird_observations;
Species_Name = propcase(Species_Name);
run;
proc print data=bird_observations_clean;
run;
Output:
Obs | Species_Name | Location | Observer_Name | Weather_Condition | Observation_Date | Count |
---|---|---|---|---|---|---|
1 | Indian Peafowl | Hyderabad | Arjun Rao | Sunny | 15APR2025 | 5 |
2 | House Sparrow | Delhi | Meera Singh | Cloudy | 16APR2025 | 12 |
3 | Common Myna | Mumbai | Rahul Verma | Rainy | 14APR2025 | 8 |
4 | Rose-Ringed Parakeet | Chennai | Lakshmi Nair | Sunny | 13APR2025 | 15 |
5 | Black Kite | Kolkata | Anil Kapoor | Windy | 12APR2025 | 3 |
6 | Indian Peafowl | Jaipur | Sneha Roy | Sunny | 11APR2025 | 7 |
7 | House Sparrow | Bangalore | Vikram Das | Cloudy | 10APR2025 | 10 |
8 | Common Myna | Pune | Priya Menon | Rainy | 09APR2025 | 6 |
9 | Rose-Ringed Parakeet | Ahmedabad | Ravi Shankar | Sunny | 08APR2025 | 9 |
10 | Black Kite | Lucknow | Neha Gupta | Windy | 07APR2025 | 4 |
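PROPCASE treats the hyphen as a word delimiter by default, which is why "Rose-ringed Parakeet" becomes "Rose-Ringed Parakeet" in the output above. If the original hyphenated spelling should be kept, one option is to pass a blank as the only delimiter. The sketch below illustrates this; the dataset name bird_observations_blank is hypothetical and is not used in the later steps.

```sas
/* Sketch: use a blank as the only PROPCASE delimiter, so the letter   */
/* after a hyphen is left in lowercase ("Rose-ringed Parakeet").       */
/* bird_observations_blank is a hypothetical name for illustration.    */
data bird_observations_blank;
    set bird_observations;
    Species_Name = propcase(Species_Name, ' ');
run;

proc print data=bird_observations_blank;
    var Species_Name;
run;
```

The remaining sections continue to use bird_observations_clean, so their outputs show the "Rose-Ringed" spelling.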
/*b. Creating a New Variable: Month of Observation*/
/*Extract the month from the observation date.*/
data bird_observations_clean;
set bird_observations_clean;
Observation_Month = month(Observation_Date);
run;
proc print data=bird_observations_clean;
run;
Output:
Obs | Species_Name | Location | Observer_Name | Weather_Condition | Observation_Date | Count | Observation_Month |
---|---|---|---|---|---|---|---|
1 | Indian Peafowl | Hyderabad | Arjun Rao | Sunny | 15APR2025 | 5 | 4 |
2 | House Sparrow | Delhi | Meera Singh | Cloudy | 16APR2025 | 12 | 4 |
3 | Common Myna | Mumbai | Rahul Verma | Rainy | 14APR2025 | 8 | 4 |
4 | Rose-Ringed Parakeet | Chennai | Lakshmi Nair | Sunny | 13APR2025 | 15 | 4 |
5 | Black Kite | Kolkata | Anil Kapoor | Windy | 12APR2025 | 3 | 4 |
6 | Indian Peafowl | Jaipur | Sneha Roy | Sunny | 11APR2025 | 7 | 4 |
7 | House Sparrow | Bangalore | Vikram Das | Cloudy | 10APR2025 | 10 | 4 |
8 | Common Myna | Pune | Priya Menon | Rainy | 09APR2025 | 6 | 4 |
9 | Rose-Ringed Parakeet | Ahmedabad | Ravi Shankar | Sunny | 08APR2025 | 9 | 4 |
10 | Black Kite | Lucknow | Neha Gupta | Windy | 07APR2025 | 4 | 4 |
3. Descriptive Statistics
/*a. Total Observations per Species*/
proc sql;
select Species_Name, sum(Count) as Total_Observed
from bird_observations_clean
group by Species_Name;
quit;
Output:
Species_Name | Total_Observed |
---|---|
Black Kite | 7 |
Common Myna | 14 |
House Sparrow | 22 |
Indian Peafowl | 12 |
Rose-Ringed Parakeet | 24 |
/*b. Average Count per Observation*/
proc means data=bird_observations_clean mean;
var Count;
run;
Output:
Analysis Variable : Count |
---|
Mean |
7.9000000 |
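The overall mean hides differences between species. A possible extension, shown here only as a sketch, is to add a CLASS statement so that PROC MEANS reports the statistics separately for each species. The choice of statistics and the MAXDEC= setting are illustrative assumptions.

```sas
/* Sketch: per-species summary of Count.                               */
/* N, MEAN, MIN and MAX are requested; MAXDEC=1 limits the decimals.   */
proc means data=bird_observations_clean n mean min max maxdec=1;
    class Species_Name;
    var Count;
run;
```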
4. Frequency Analysis
/*a. Frequency of Observations by Weather Condition*/
proc freq data=bird_observations_clean;
tables Weather_Condition;
run;
Output:
Weather_Condition | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
Cloudy | 2 | 20.00 | 2 | 20.00 |
Rainy | 2 | 20.00 | 4 | 40.00 |
Sunny | 4 | 40.00 | 8 | 80.00 |
Windy | 2 | 20.00 | 10 | 100.00 |
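The table above counts observation records, not birds. If the number of birds seen under each weather condition is of interest, a WEIGHT statement can be added so that each record contributes its Count value. This is a sketch of that variation, not part of the original analysis.

```sas
/* Sketch: weight each record by Count so the frequencies are          */
/* numbers of birds rather than numbers of observation records.        */
proc freq data=bird_observations_clean;
    tables Weather_Condition;
    weight Count;
run;
```

With the ten rows entered above, the weighted frequencies should be Cloudy 22, Rainy 14, Sunny 36 and Windy 7, which add up to the 79 birds recorded in total.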
/*b. Cross-tabulation: Species and Weather Condition*/
proc freq data=bird_observations_clean;
tables Species_Name*Weather_Condition / norow nocol nopercent;
run;
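Output:
Table of Species_Name by Weather_Condition (cell values are frequencies)
Species_Name | Cloudy | Rainy | Sunny | Windy | Total |
---|---|---|---|---|---|
Black Kite | 0 | 0 | 0 | 2 | 2 |
Common Myna | 0 | 2 | 0 | 0 | 2 |
House Sparrow | 2 | 0 | 0 | 0 | 2 |
Indian Peafowl | 0 | 0 | 2 | 0 | 2 |
Rose-Ringed Parakeet | 0 | 0 | 2 | 0 | 2 |
Total | 2 | 2 | 4 | 2 | 10 |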
5. Sorting and Ranking
/*a. Sorting by Count Descending*/
proc sort data=bird_observations_clean out=sorted_observations;
by descending Count;
run;
proc print data=sorted_observations;
run;
Output:
Obs | Species_Name | Location | Observer_Name | Weather_Condition | Observation_Date | Count | Observation_Month |
---|---|---|---|---|---|---|---|
1 | Rose-Ringed Parakeet | Chennai | Lakshmi Nair | Sunny | 13APR2025 | 15 | 4 |
2 | House Sparrow | Delhi | Meera Singh | Cloudy | 16APR2025 | 12 | 4 |
3 | House Sparrow | Bangalore | Vikram Das | Cloudy | 10APR2025 | 10 | 4 |
4 | Rose-Ringed Parakeet | Ahmedabad | Ravi Shankar | Sunny | 08APR2025 | 9 | 4 |
5 | Common Myna | Mumbai | Rahul Verma | Rainy | 14APR2025 | 8 | 4 |
6 | Indian Peafowl | Jaipur | Sneha Roy | Sunny | 11APR2025 | 7 | 4 |
7 | Common Myna | Pune | Priya Menon | Rainy | 09APR2025 | 6 | 4 |
8 | Indian Peafowl | Hyderabad | Arjun Rao | Sunny | 15APR2025 | 5 | 4 |
9 | Black Kite | Lucknow | Neha Gupta | Windy | 07APR2025 | 4 | 4 |
10 | Black Kite | Kolkata | Anil Kapoor | Windy | 12APR2025 | 3 | 4 |
/*b. Ranking Species by Total Observations*/
proc sql;
create table species_rank as
select Species_Name, sum(Count) as Total_Observed
from bird_observations_clean
group by Species_Name
order by Total_Observed desc;
quit;
proc print data=species_rank;
run;
Output:
Obs | Species_Name | Total_Observed |
---|---|---|
1 | Rose-Ringed Parakeet | 24 |
2 | House Sparrow | 22 |
3 | Common Myna | 14 |
4 | Indian Peafowl | 12 |
5 | Black Kite | 7 |
proc rank data=species_rank out=species_ranked ties=low descending;
var Total_Observed;
ranks Rank;
run;
proc print data=species_ranked;
run;
Output:
Obs | Species_Name | Total_Observed | Rank |
---|---|---|---|
1 | Rose-Ringed Parakeet | 24 | 1 |
2 | House Sparrow | 22 | 2 |
3 | Common Myna | 14 | 3 |
4 | Indian Peafowl | 12 | 4 |
5 | Black Kite | 7 | 5 |
6. Visualization
/*a. Bar Chart: Total Observations per Species*/
proc sgplot data=species_rank;
vbar Species_Name / response=Total_Observed datalabel;
title "Total Observations per Bird Species";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 2.29 seconds
cpu time 0.64 seconds
NOTE: Listing image output written to SGPlot1.png.
NOTE: There were 5 observations read from the data set WORK.SPECIES_RANK.
/*b. Line Chart: Observations Over Dates*/
proc sgplot data=bird_observations_clean;
series x=Observation_Date y=Count / group=Species_Name markers;
title "Bird Observations Over Time";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 0.40 seconds
cpu time 0.06 seconds
NOTE: Listing image output written to SGPlot5.png.
NOTE: There were 10 observations read from the data set WORK.BIRD_OBSERVATIONS_CLEAN.
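A SERIES plot connects points in the order in which they appear in the data, and bird_observations_clean is not stored in date order within each species. One way to make the lines run chronologically is to sort first, as sketched below. The dataset name obs_by_date and the axis labels are assumptions added for illustration.

```sas
/* Sketch: sort by species and date so each series is drawn in         */
/* chronological order. obs_by_date is a hypothetical dataset name.    */
proc sort data=bird_observations_clean out=obs_by_date;
    by Species_Name Observation_Date;
run;

proc sgplot data=obs_by_date;
    series x=Observation_Date y=Count / group=Species_Name markers;
    xaxis label="Observation Date";
    yaxis label="Birds Counted";
    title "Bird Observations Over Time";
run;

/* Clear the title so it does not carry over to later output. */
title;
```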
7. Advanced Analysis
/*a. Correlation Analysis: Count and Observation Month*/
proc corr data=bird_observations_clean;
var Count Observation_Month;
run;
Output:
2 Variables: | Count Observation_Month |
---|
Simple Statistics | ||||||
---|---|---|---|---|---|---|
Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
Count | 10 | 7.90000 | 3.72529 | 79.00000 | 3.00000 | 15.00000 |
Observation_Month | 10 | 4.00000 | 0 | 40.00000 | 4.00000 | 4.00000 |
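Pearson Correlation Coefficients, N = 10; Prob > |r| under H0: Rho=0
Every observation falls in April, so Observation_Month is constant (Std Dev = 0 in the Simple Statistics table). The Pearson correlation between Count and Observation_Month is therefore undefined, and this run yields no usable coefficient.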
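Since Observation_Month does not vary in this sample, a meaningful coefficient needs a time variable that does vary. As a sketch, the day of the month can be extracted and correlated with Count instead. The variable Observation_Day and the dataset bird_observations_day are hypothetical names introduced only for this example.

```sas
/* Sketch: derive a time variable that varies across the ten rows.     */
/* Observation_Day and bird_observations_day are hypothetical names.   */
data bird_observations_day;
    set bird_observations_clean;
    Observation_Day = day(Observation_Date);
run;

proc corr data=bird_observations_day;
    var Count Observation_Day;
run;
```

With only ten observations spread over five species, any coefficient from this run should be read as descriptive rather than as evidence of a trend.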
8. Exporting the Cleaned Dataset
/*Export the cleaned dataset to a CSV file*/
proc export data=bird_observations_clean
outfile="C:\SAS\bird_observations_clean.csv"
dbms=csv
replace;
run;
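To confirm that the export worked, the CSV file can be read back and inspected. The sketch below assumes the file was written to the path used above; the dataset name bird_check is hypothetical.

```sas
/* Sketch: re-import the exported CSV and inspect its structure.       */
/* bird_check is a hypothetical dataset name for this check.           */
proc import datafile="C:\SAS\bird_observations_clean.csv"
    out=bird_check
    dbms=csv
    replace;
    getnames=yes;
run;

proc contents data=bird_check;
run;
```

Variable types and lengths in bird_check are guessed by PROC IMPORT from the file contents, so they may differ from those in bird_observations_clean.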