157.WILDLIFE MIGRATION ANALYSIS USING SAS: A COMPREHENSIVE GUIDE UTILIZING PROC PRINT | PROC CONTENTS | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC TRANSPOSE | PROC CORR | PROC REG | PROC SORT | PROC FORMAT | PROC REPORT | PROC TABULATE | PROC ANOVA | PROC EXPORT
/*Creating a unique dataset centered around wildlife migration patterns*/
/*Wildlife Migration Dataset: Overview*/
Dataset Structure:
Creating a dataset named wildlife_migration with the following variables:
Species: Name of the animal species (e.g., "Elephant", "Wildebeest", "Caribou").
Region: Geographical region of observation (e.g., "Savannah", "Tundra", "Forest").
Season: Season during which observation was made (e.g., "Spring", "Summer", "Autumn", "Winter").
Migration_Distance_km: Distance migrated during the season (in kilometers).
Group_Size: Number of individuals in the migrating group.
Observation_Date: Date of observation.
Observer_ID: Identifier for the observer recording the data.
Step 1: Creating the Dataset
data wildlife_migration;
input Species $ Region $ Season $ Migration_Distance_km Group_Size Observation_Date :date9. Observer_ID $;
format Observation_Date date9.;
datalines;
Elephant Savannah Spring 120 25 15MAR2025 OBS001
Wildebeest Savannah Summer 300 150 20JUN2025 OBS002
Caribou Tundra Autumn 500 200 10SEP2025 OBS003
Elephant Forest Winter 80 20 05DEC2025 OBS004
Wildebeest Savannah Spring 250 130 25MAR2025 OBS005
Caribou Tundra Summer 450 180 15JUL2025 OBS006
;
run;
proc print data=wildlife_migration;
title "Wildlife Migration Data";
run;
Output:
Wildlife Migration Data |
Obs | Species | Region | Season | Migration_Distance_km | Group_Size | Observation_Date | Observer_ID |
---|---|---|---|---|---|---|---|
1 | Elephant | Savannah | Spring | 120 | 25 | 15MAR2025 | OBS001 |
2 | Wildebee | Savannah | Summer | 300 | 150 | 20JUN2025 | OBS002 |
3 | Caribou | Tundra | Autumn | 500 | 200 | 10SEP2025 | OBS003 |
4 | Elephant | Forest | Winter | 80 | 20 | 05DEC2025 | OBS004 |
5 | Wildebee | Savannah | Spring | 250 | 130 | 25MAR2025 | OBS005 |
6 | Caribou | Tundra | Summer | 450 | 180 | 15JUL2025 | OBS006 |
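Note that Wildebeest prints as Wildebee in the listing above. List input with a bare $ creates character variables with the default length of 8 bytes, so longer values are cut off. A minimal sketch of one way to avoid this, assuming 12 characters is enough for every species name (the dataset name wildlife_migration_full and the informat widths are illustrative and not part of the original program):

```sas
/* Sketch: colon-modified informats set the variable length and still   */
/* read each value up to the next blank, so long names are not cut off. */
/* wildlife_migration_full and the widths are illustrative assumptions. */
data wildlife_migration_full;
    input Species :$12. Region :$10. Season :$8.
          Migration_Distance_km Group_Size
          Observation_Date :date9. Observer_ID :$8.;
    format Observation_Date date9.;
    datalines;
Elephant Savannah Spring 120 25 15MAR2025 OBS001
Wildebeest Savannah Summer 300 150 20JUN2025 OBS002
Caribou Tundra Autumn 500 200 10SEP2025 OBS003
;
run;

proc print data=wildlife_migration_full;
    title "Species Names Without Truncation";
run;
```

The rest of this guide keeps the original 8-character variables, so the outputs that follow still show Wildebee.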
proc contents data=wildlife_migration;
title "Dataset Structure";
run;
Output:
Dataset Structure |
Data Set Name | WORK.WILDLIFE_MIGRATION | Observations | 6 |
---|---|---|---|
Member Type | DATA | Variables | 7 |
Engine | V9 | Indexes | 0 |
Created | 14/09/2015 00:10:48 | Observation Length | 56 |
Last Modified | 14/09/2015 00:10:48 | Deleted Observations | 0 |
Protection | | Compressed | NO |
Data Set Type | | Sorted | NO |
Label | | | |
Data Representation | WINDOWS_64 | | |
Encoding | wlatin1 Western (Windows) | | |
Engine/Host Dependent Information | |
---|---|
Data Set Page Size | 65536 |
Number of Data Set Pages | 1 |
First Data Page | 1 |
Max Obs per Page | 1167 |
Obs in First Data Page | 6 |
Number of Data Set Repairs | 0 |
ExtendObsCounter | YES |
Filename | C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD12856_DESKTOP-QFAA4KV_\wildlife_migration.sas7bdat |
Release Created | 9.0401M2 |
Host Created | X64_8HOME |
Alphabetic List of Variables and Attributes | ||||
---|---|---|---|---|
# | Variable | Type | Len | Format |
5 | Group_Size | Num | 8 | |
4 | Migration_Distance_km | Num | 8 | |
6 | Observation_Date | Num | 8 | DATE9. |
7 | Observer_ID | Char | 8 | |
2 | Region | Char | 8 | |
3 | Season | Char | 8 | |
1 | Species | Char | 8 |
Step 2: Descriptive Statistics
proc means data=wildlife_migration mean min max;
var Migration_Distance_km Group_Size;
title "Descriptive Statistics";
run;
Output:
Descriptive Statistics |
Variable | Mean | Minimum | Maximum |
---|---|---|---|
Migration_Distance_km | 283.3333333 | 80.0000000 | 500.0000000 |
Group_Size | 117.5000000 | 20.0000000 | 200.0000000 |
proc freq data=wildlife_migration;
tables Species Region;
title "Frequency of Species and Regions";
run;
Output:
Frequency of Species and Regions |
Species | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
Caribou | 2 | 33.33 | 2 | 33.33 |
Elephant | 2 | 33.33 | 4 | 66.67 |
Wildebee | 2 | 33.33 | 6 | 100.00 |
Region | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
Forest | 1 | 16.67 | 1 | 16.67 |
Savannah | 3 | 50.00 | 4 | 66.67 |
Tundra | 2 | 33.33 | 6 | 100.00 |
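The one-way tables above count species and regions separately. To see which species were observed in which region, a two-way table can be requested with the asterisk operator. A minimal sketch (the NOROW, NOCOL, and NOPERCENT options only suppress percentages so the small table stays readable; the title text is illustrative):

```sas
/* Sketch: cross-tabulate Species by Region, showing counts only */
proc freq data=wildlife_migration;
    tables Species*Region / norow nocol nopercent;
    title "Species by Region Counts";
run;
```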
Step 3: Data Visualization
proc sgplot data=wildlife_migration;
vbar Species / response=Migration_Distance_km stat=mean;
title "Average Migration Distance by Species";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 3.31 seconds
cpu time 0.45 seconds
NOTE: Listing image output written to SGPlot1.png.
NOTE: There were 6 observations read from the data set WORK.WILDLIFE_MIGRATION.
proc sgplot data=wildlife_migration;
vline Season / response=Group_Size stat=mean group=Species;
title "Average Group Size Across Seasons";
run;
Log:
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 0.85 seconds
cpu time 0.06 seconds
NOTE: Listing image output written to SGPlot3.png.
NOTE: There were 6 observations read from the data set WORK.WILDLIFE_MIGRATION.
Step 4: Data Transformation
proc transpose data=wildlife_migration out=transposed_data;
by Species notsorted;
var Migration_Distance_km Group_Size;
title "Transposed Data for Species-wise Analysis";
run;
proc print data=transposed_data;
run;
Output:
Transposed Data for Species-wise Analysis |
Obs | Species | _NAME_ | COL1 |
---|---|---|---|
1 | Elephant | Migration_Distance_km | 120 |
2 | Elephant | Group_Size | 25 |
3 | Wildebee | Migration_Distance_km | 300 |
4 | Wildebee | Group_Size | 150 |
5 | Caribou | Migration_Distance_km | 500 |
6 | Caribou | Group_Size | 200 |
7 | Elephant | Migration_Distance_km | 80 |
8 | Elephant | Group_Size | 20 |
9 | Wildebee | Migration_Distance_km | 250 |
10 | Wildebee | Group_Size | 130 |
11 | Caribou | Migration_Distance_km | 450 |
12 | Caribou | Group_Size | 180 |
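The long layout above repeats each species once per observation. A wide layout with one column per season can be easier to compare. A sketch, assuming each species appears at most once per season (true for the six rows here, and required by the ID statement); migration_sorted, distance_by_season, and the Dist_ prefix are illustrative names:

```sas
/* Sketch: one row per species, one distance column per season.        */
/* Sorting into a new dataset leaves wildlife_migration unchanged.     */
proc sort data=wildlife_migration out=migration_sorted;
    by Species;
run;

/* ID Season names the new columns: Dist_Spring, Dist_Summer, and so on */
proc transpose data=migration_sorted
               out=distance_by_season(drop=_NAME_)
               prefix=Dist_;
    by Species;
    id Season;
    var Migration_Distance_km;
run;

proc print data=distance_by_season;
    title "Migration Distance by Species (One Column per Season)";
run;
```

Seasons in which a species was not observed appear as missing values (.) in the wide table.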
Step 5: Advanced Analysis
proc corr data=wildlife_migration;
var Migration_Distance_km Group_Size;
title "Correlation Analysis";
run;
Output:
Correlation Analysis |
2 Variables: | Migration_Distance_km Group_Size |
---|
Simple Statistics | ||||||
---|---|---|---|---|---|---|
Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
Migration_Distance_km | 6 | 283.33333 | 169.78418 | 1700 | 80.00000 | 500.00000 |
Group_Size | 6 | 117.50000 | 77.44353 | 705.00000 | 20.00000 | 200.00000 |
Pearson Correlation Coefficients, N = 6 Prob > |r| under H0: Rho=0 | ||
---|---|---|
 | Migration_Distance_km | Group_Size |
Migration_Distance_km | 1.00000 | 0.96359 (p = 0.0020) |
Group_Size | 0.96359 (p = 0.0020) | 1.00000 |
proc reg data=wildlife_migration;
model Migration_Distance_km = Group_Size;
title "Regression Analysis: Migration Distance vs. Group Size";
run;
quit;
Output:
Regression Analysis: Migration Distance vs. Group Size |
Number of Observations Read | 6 |
---|---|
Number of Observations Used | 6 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 1 | 133830 | 133830 | 51.96 | 0.0020 |
Error | 4 | 10303 | 2575.87189 | | |
Corrected Total | 5 | 144133 | | | |
Root MSE | 50.75305 | R-Square | 0.9285 |
---|---|---|---|
Dependent Mean | 283.33333 | Adj R-Sq | 0.9106 |
Coeff Var | 17.91284 | | |
Parameter Estimates | |||||
---|---|---|---|---|---|
Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | 1 | 35.10907 | 40.19010 | 0.87 | 0.4317 |
Group_Size | 1 | 2.11255 | 0.29308 | 7.21 | 0.0020 |
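The fitted slope of about 2.11 km per additional animal rests on only six observations, so it helps to look at the points alongside the line. A minimal sketch using the REG statement of PROC SGPLOT, which draws the scatter plot and the least-squares line together (the CLM option adds confidence limits for the mean; the title text is illustrative):

```sas
/* Sketch: scatter plot with fitted regression line and 95% confidence band */
proc sgplot data=wildlife_migration;
    reg x=Group_Size y=Migration_Distance_km / clm;
    title "Migration Distance vs. Group Size with Fitted Line";
run;
```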
Step 6: Data Management
proc sort data=wildlife_migration;
by Species Season;
run;
proc format;
value $season_fmt
'Spring' = 'Spr'
'Summer' = 'Sum'
'Autumn' = 'Aut'
'Winter' = 'Win';
run;
proc print data=wildlife_migration;
format Season $season_fmt.;
title "Formatted Seasons";
run;
Output:
Formatted Seasons |
Obs | Species | Region | Season | Migration_Distance_km | Group_Size | Observation_Date | Observer_ID |
---|---|---|---|---|---|---|---|
1 | Caribou | Tundra | Aut | 500 | 200 | 10SEP2025 | OBS003 |
2 | Caribou | Tundra | Sum | 450 | 180 | 15JUL2025 | OBS006 |
3 | Elephant | Savannah | Spr | 120 | 25 | 15MAR2025 | OBS001 |
4 | Elephant | Forest | Win | 80 | 20 | 05DEC2025 | OBS004 |
5 | Wildebee | Savannah | Spr | 250 | 130 | 25MAR2025 | OBS005 |
6 | Wildebee | Savannah | Sum | 300 | 150 | 20JUN2025 | OBS002 |
Step 7: Generating Reports
proc report data=wildlife_migration nowd;
column Species Region Season Migration_Distance_km Group_Size;
define Species / group;
define Region / group;
define Season / group;
define Migration_Distance_km / analysis mean;
define Group_Size / analysis mean;
title "Wildlife Migration Report";
run;
Output:
Wildlife Migration Report |
Species | Region | Season | Migration_Distance_km | Group_Size |
---|---|---|---|---|
Caribou | Tundra | Autumn | 500 | 200 |
 | | Summer | 450 | 180 |
Elephant | Forest | Winter | 80 | 20 |
 | Savannah | Spring | 120 | 25 |
Wildebee | Savannah | Spring | 250 | 130 |
 | | Summer | 300 | 150 |
Step 8: Creating Custom Macros
%macro analyze_species(species_name);
proc print data=wildlife_migration;
where Species = "&species_name";
title "Data for &species_name";
run;
proc means data=wildlife_migration;
where Species = "&species_name";
var Migration_Distance_km Group_Size;
title "Statistics for &species_name";
run;
%mend analyze_species;
%analyze_species(Elephant);
Output:
Statistics for Elephant |
Variable | N | Mean | Std Dev | Minimum | Maximum |
---|---|---|---|---|---|
Migration_Distance_km | 2 | 100.0000000 | 28.2842712 | 80.0000000 | 120.0000000 |
Group_Size | 2 | 22.5000000 | 3.5355339 | 20.0000000 | 25.0000000 |
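The macro call above handles one species at a time. To run it for every species in the data without typing each name, one common approach is CALL EXECUTE. A sketch, assuming %analyze_species has already been defined as in Step 8; species_list is an illustrative dataset name:

```sas
/* Sketch: build a list of distinct species, then call the macro for each. */
/* species_list is an illustrative name, not part of the original program. */
proc sort data=wildlife_migration(keep=Species) out=species_list nodupkey;
    by Species;
run;

data _null_;
    set species_list;
    /* %nrstr delays macro execution until this DATA step has finished; */
    /* CATS strips trailing blanks from Species before building the call */
    call execute(cats('%nrstr(%analyze_species)(', Species, ')'));
run;
```

Because Species holds at most 8 characters in this dataset, the third call is made with the truncated value Wildebee, which is also what the WHERE clause in the macro needs to match.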
Step 9: Data Cleaning
proc means data=wildlife_migration n nmiss;
var Migration_Distance_km Group_Size;
title "Missing Values Check";
run;
Output:
Missing Values Check |
Variable | N | N Miss |
---|---|---|
Migration_Distance_km | 6 | 0 |
Group_Size | 6 | 0 |
data cleaned_data;
set wildlife_migration;
if nmiss(of _numeric_) = 0;
run;
proc print data=cleaned_data;
run;
Output:
Obs | Species | Region | Season | Migration_Distance_km | Group_Size | Observation_Date | Observer_ID |
---|---|---|---|---|---|---|---|
1 | Caribou | Tundra | Autumn | 500 | 200 | 10SEP2025 | OBS003 |
2 | Caribou | Tundra | Summer | 450 | 180 | 15JUL2025 | OBS006 |
3 | Elephant | Savannah | Spring | 120 | 25 | 15MAR2025 | OBS001 |
4 | Elephant | Forest | Winter | 80 | 20 | 05DEC2025 | OBS004 |
5 | Wildebee | Savannah | Spring | 250 | 130 | 25MAR2025 | OBS005 |
6 | Wildebee | Savannah | Summer | 300 | 150 | 20JUN2025 | OBS002 |
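The filter above checks numeric variables only, so a row with a blank Species or Region would still pass. A minimal sketch of a stricter filter using CMISS, which counts missing values among both numeric and character arguments (cleaned_all is an illustrative dataset name):

```sas
/* Sketch: keep only rows with no missing value in any variable.    */
/* cleaned_all is an illustrative name.                              */
data cleaned_all;
    set wildlife_migration;
    if cmiss(of _all_) = 0;
run;

proc print data=cleaned_all;
    title "Rows With No Missing Values (Numeric or Character)";
run;
```

With the six complete rows used in this guide, both filters keep every observation.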
Step 10: Exporting the Dataset
proc export data=wildlife_migration
outfile='C:\SASData\wildlife_migration.csv'
dbms=csv
replace;
run;
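PROC EXPORT produces no printed output, so one way to confirm the file is to read it back and compare it with the source dataset. A sketch, assuming the export above succeeded and the folder C:\SASData exists; migration_check is an illustrative dataset name:

```sas
/* Sketch: re-import the exported CSV and compare it with the original. */
/* migration_check is an illustrative name.                             */
proc import datafile='C:\SASData\wildlife_migration.csv'
    out=migration_check
    dbms=csv
    replace;
    getnames=yes;
run;

proc compare base=wildlife_migration compare=migration_check;
    title "Exported CSV vs. Original Dataset";
run;
```

PROC COMPARE may report differences in variable attributes such as lengths or formats, because PROC IMPORT infers them from the file; the data values themselves are expected to match.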
Step 11: Creating Summary Tables
proc tabulate data=wildlife_migration;
class Species Season;
var Migration_Distance_km;
table Species, Season*Migration_Distance_km*(mean);
title "Average Migration Distance by Species and Season";
run;
Output:
Average Migration Distance by Species and Season |
 | Season | | | |
---|---|---|---|---|
 | Autumn | Spring | Summer | Winter |
 | Migration_Distance_km | Migration_Distance_km | Migration_Distance_km | Migration_Distance_km |
 | Mean | Mean | Mean | Mean |
Species | | | | |
Caribou | 500.00 | . | 450.00 | . |
Elephant | . | 120.00 | . | 80.00 |
Wildebee | . | 250.00 | 300.00 | . |
Step 12: Advanced Analysis
proc anova data=wildlife_migration;
class Region;
model Migration_Distance_km = Region;
means Region / tukey;
title "ANOVA: Migration Distance Across Regions";
run;
quit;
Output:
ANOVA: Migration Distance Across Regions |
Class Level Information | ||
---|---|---|
Class | Levels | Values |
Region | 3 | Forest Savannah Tundra |
Number of Observations Read | 6 |
---|---|
Number of Observations Used | 6 |
ANOVA: Migration Distance Across Regions |
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Model | 2 | 125616.6667 | 62808.3333 | 10.18 | 0.0460 |
Error | 3 | 18516.6667 | 6172.2222 | ||
Corrected Total | 5 | 144133.3333 |
R-Square | Coeff Var | Root MSE | Migration_Distance_km Mean |
---|---|---|---|
0.871531 | 27.72829 | 78.56349 | 283.3333 |
Source | DF | Anova SS | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Region | 2 | 125616.6667 | 62808.3333 | 10.18 | 0.0460 |
ANOVA: Migration Distance Across Regions |
Note: | This test controls the Type I experimentwise error rate. |
Alpha | 0.05 |
---|---|
Error Degrees of Freedom | 3 |
Error Mean Square | 6172.222 |
Critical Value of Studentized Range | 5.90958 |
Comparisons significant at the 0.05 level are indicated by ***. | ||||
---|---|---|---|---|
Region Comparison | Difference Between Means | Simultaneous 95% Confidence Limits (Lower) | Simultaneous 95% Confidence Limits (Upper) | |
Tundra - Savannah | 251.67 | -48.02 | 551.36 | |
Tundra - Forest | 395.00 | -7.08 | 797.08 | |
Savannah - Tundra | -251.67 | -551.36 | 48.02 | |
Savannah - Forest | 143.33 | -235.75 | 522.41 | |
Forest - Tundra | -395.00 | -797.08 | 7.08 | |
Forest - Savannah | -143.33 | -522.41 | 235.75 |
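None of the pairwise comparisons above is flagged with ***, even though the overall F test has p = 0.0460. Each confidence interval contains zero, which is not surprising with six observations split 1, 3, and 2 across Forest, Savannah, and Tundra. A minimal sketch that prints the group sizes and means behind those comparisons (the MAXDEC=2 option and the title text are illustrative choices):

```sas
/* Sketch: per-region count, mean, and standard deviation of distance */
proc means data=wildlife_migration n mean std maxdec=2;
    class Region;
    var Migration_Distance_km;
    title "Migration Distance by Region";
run;
```

The standard deviation for Forest is missing because that region has a single observation.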