157.WILDLIFE MIGRATION ANALYSIS USING SAS: A COMPREHENSIVE GUIDE UTILIZING PROC PRINT | PROC CONTENTS | PROC MEANS | PROC FREQ | PROC SGPLOT | PROC TRANSPOSE | PROC CORR | PROC REG | PROC SORT | PROC FORMAT | PROC REPORT | PROC TABULATE | PROC ANOVA | PROC EXPORT



/*Creating a unique dataset centered around wildlife migration patterns*/

/*Wildlife Migration Dataset: Overview*/

Dataset Structure:

The dataset, named wildlife_migration, contains the following variables:

Species: Name of the animal species (e.g., "Elephant", "Wildebeest", "Caribou").

Region: Geographical region of observation (e.g., "Savannah", "Tundra", "Forest").

Season: Season during which observation was made (e.g., "Spring", "Summer", "Autumn", "Winter").

Migration_Distance_km: Distance migrated during the season (in kilometers).

Group_Size: Number of individuals in the migrating group.

Observation_Date: Date of observation.

Observer_ID: Identifier for the observer recording the data.


Step 1: Creating the Dataset

data wildlife_migration;

    input Species $ Region $ Season $ Migration_Distance_km Group_Size Observation_Date :date9. Observer_ID $;

    format Observation_Date date9.;

    datalines;

Elephant Savannah Spring 120 25 15MAR2025 OBS001

Wildebeest Savannah Summer 300 150 20JUN2025 OBS002

Caribou Tundra Autumn 500 200 10SEP2025 OBS003

Elephant Forest Winter 80 20 05DEC2025 OBS004

Wildebeest Savannah Spring 250 130 25MAR2025 OBS005

Caribou Tundra Summer 450 180 15JUL2025 OBS006

;

run;

proc print data=wildlife_migration;

    title "Wildlife Migration Data";

run;

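Output (list input with a bare $ gives character variables the default length of 8, so "Wildebeest" is stored and printed as "Wildebee"):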

                                                                  Wildlife Migration Data

Obs Species Region Season Migration_Distance_km Group_Size Observation_Date Observer_ID
1 Elephant Savannah Spring 120 25 15MAR2025 OBS001
2 Wildebee Savannah Summer 300 150 20JUN2025 OBS002
3 Caribou Tundra Autumn 500 200 10SEP2025 OBS003
4 Elephant Forest Winter 80 20 05DEC2025 OBS004
5 Wildebee Savannah Spring 250 130 25MAR2025 OBS005
6 Caribou Tundra Summer 450 180 15JUL2025 OBS006
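
In the listing above, "Wildebeest" appears as "Wildebee" because list input with a bare $ creates character variables of length 8. A minimal sketch of one way to keep the full name is to read the character variables with explicit informats through the colon modifier. The dataset name wildlife_migration_full and the widths used here are assumptions for illustration only; the rest of this guide keeps using the original dataset, so its outputs still show the truncated value.

```sas
/* Sketch: same rows, but character variables get explicit widths. */
/* wildlife_migration_full is a hypothetical name, so the original */
/* dataset used in the rest of the guide is left untouched.        */
data wildlife_migration_full;
    input Species :$12. Region :$12. Season :$8.
          Migration_Distance_km Group_Size
          Observation_Date :date9. Observer_ID :$8.;
    format Observation_Date date9.;
    datalines;
Elephant Savannah Spring 120 25 15MAR2025 OBS001
Wildebeest Savannah Summer 300 150 20JUN2025 OBS002
Caribou Tundra Autumn 500 200 10SEP2025 OBS003
Elephant Forest Winter 80 20 05DEC2025 OBS004
Wildebeest Savannah Spring 250 130 25MAR2025 OBS005
Caribou Tundra Summer 450 180 15JUL2025 OBS006
;
run;

proc print data=wildlife_migration_full;
    title "Wildlife Migration Data (Full Species Names)";
run;
```

The colon modifier keeps list-input behaviour (values are still separated by blanks) while the informat width sets the variable length.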


proc contents data=wildlife_migration;

    title "Dataset Structure";

run;

Output:

                                                                   Dataset Structure

                                                           The CONTENTS Procedure

Data Set Name WORK.WILDLIFE_MIGRATION Observations 6
Member Type DATA Variables 7
Engine V9 Indexes 0
Created 14/09/2015 00:10:48 Observation Length 56
Last Modified 14/09/2015 00:10:48 Deleted Observations 0
Protection   Compressed NO
Data Set Type   Sorted NO
Label      
Data Representation WINDOWS_64    
Encoding wlatin1 Western (Windows)    


Engine/Host Dependent Information
Data Set Page Size 65536
Number of Data Set Pages 1
First Data Page 1
Max Obs per Page 1167
Obs in First Data Page 6
Number of Data Set Repairs 0
ExtendObsCounter YES
Filename C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary Files\_TD12856_DESKTOP-QFAA4KV_\wildlife_migration.sas7bdat
Release Created 9.0401M2
Host Created X64_8HOME


Alphabetic List of Variables and Attributes
# Variable Type Len Format
5 Group_Size Num 8  
4 Migration_Distance_km Num 8  
6 Observation_Date Num 8 DATE9.
7 Observer_ID Char 8  
2 Region Char 8  
3 Season Char 8  
1 Species Char 8  


Step 2: Descriptive Statistics

proc means data=wildlife_migration mean min max;

    var Migration_Distance_km Group_Size;

    title "Descriptive Statistics";

run;

Output:

                                                               Descriptive Statistics
                                                             The MEANS Procedure

Variable Mean Minimum Maximum
Migration_Distance_km 283.3333333 80.0000000 500.0000000
Group_Size 117.5000000 20.0000000 200.0000000
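
The same statistics can also be broken down by group. The sketch below adds a CLASS statement so that PROC MEANS reports the statistics separately for each species; the N statistic and MAXDEC=1 are display choices assumed here, not part of the original step.

```sas
/* Sketch: descriptive statistics per species rather than overall. */
proc means data=wildlife_migration n mean min max maxdec=1;
    class Species;                              /* one block of rows per species */
    var Migration_Distance_km Group_Size;
    title "Descriptive Statistics by Species";
run;
```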


proc freq data=wildlife_migration;

    tables Species Region;

    title "Frequency of Species and Regions";

run;

Output:

                                                      Frequency of Species and Regions

                                                                The FREQ Procedure

Species Frequency Percent Cumulative Frequency Cumulative Percent
Caribou 2 33.33 2 33.33
Elephant 2 33.33 4 66.67
Wildebee 2 33.33 6 100.00


Region Frequency Percent Cumulative Frequency Cumulative Percent
Forest 1 16.67 1 16.67
Savannah 3 50.00 4 66.67
Tundra 2 33.33 6 100.00
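
The one-way tables above count species and regions separately. To see which species were observed in which region, a two-way table can be requested with the asterisk operator. This is a sketch; the NOROW, NOCOL and NOPERCENT options are assumed here only to keep the small table readable.

```sas
/* Sketch: cross-tabulation of Species by Region, counts only. */
proc freq data=wildlife_migration;
    tables Species*Region / norow nocol nopercent;
    title "Species by Region Counts";
run;
```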


Step 3: Data Visualization

proc sgplot data=wildlife_migration;

    vbar Species / response=Migration_Distance_km stat=mean;

    title "Average Migration Distance by Species";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):

      real time           3.31 seconds

      cpu time            0.45 seconds


NOTE: Listing image output written to SGPlot1.png.

NOTE: There were 6 observations read from the data set WORK.WILDLIFE_MIGRATION.


proc sgplot data=wildlife_migration;

    vline Season / response=Group_Size stat=mean group=Species;

    title "Average Group Size Across Seasons";

run;

Log:

NOTE: PROCEDURE SGPLOT used (Total process time):

      real time           0.85 seconds

      cpu time            0.06 seconds


NOTE: Listing image output written to SGPlot3.png.

NOTE: There were 6 observations read from the data set WORK.WILDLIFE_MIGRATION.


Step 4: Data Transformation

proc transpose data=wildlife_migration out=transposed_data;

    by Species notsorted;

    var Migration_Distance_km Group_Size;

    title "Transposed Data for Species-wise Analysis";

run;

proc print data=transposed_data; run;

Output:

                                                Transposed Data for Species-wise Analysis

Obs Species _NAME_ COL1
1 Elephant Migration_Distance_km 120
2 Elephant Group_Size 25
3 Wildebee Migration_Distance_km 300
4 Wildebee Group_Size 150
5 Caribou Migration_Distance_km 500
6 Caribou Group_Size 200
7 Elephant Migration_Distance_km 80
8 Elephant Group_Size 20
9 Wildebee Migration_Distance_km 250
10 Wildebee Group_Size 130
11 Caribou Migration_Distance_km 450
12 Caribou Group_Size 180
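
Because the input is not sorted and NOTSORTED is used, every observation above forms its own BY group, which is why the result has a single COL1 column. A minimal sketch of a wide layout, with one row per species and one column per season, is shown below. The names migration_sorted, distance_wide and the Dist_ prefix are assumptions for illustration. It relies on each species appearing at most once per season, which holds for the six rows in this dataset.

```sas
/* Sketch: sort a copy first so BY Species forms real groups. */
proc sort data=wildlife_migration out=migration_sorted;
    by Species Season;
run;

/* One row per species, one column per season (Dist_Spring, ...). */
proc transpose data=migration_sorted
               out=distance_wide(drop=_NAME_)
               prefix=Dist_;
    by Species;
    id Season;                      /* season values become column names */
    var Migration_Distance_km;
run;

proc print data=distance_wide;
    title "Migration Distance by Species, One Column per Season";
run;
```

Seasons in which a species was not observed appear as missing values in the wide table.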

Step 5: Advanced Analysis

proc corr data=wildlife_migration;

    var Migration_Distance_km Group_Size;

    title "Correlation Analysis";

run;

Output:

                                                                   Correlation Analysis

                                                                  The CORR Procedure

2 Variables: Migration_Distance_km Group_Size


Simple Statistics
Variable N Mean Std Dev Sum Minimum Maximum
Migration_Distance_km 6 283.33333 169.78418 1700 80.00000 500.00000
Group_Size 6 117.50000 77.44353 705.00000 20.00000 200.00000


Pearson Correlation Coefficients, N = 6
Prob > |r| under H0: Rho=0
Variable Migration_Distance_km Group_Size
Migration_Distance_km 1.00000 0.96359 (p = 0.0020)
Group_Size 0.96359 (p = 0.0020) 1.00000
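
With only six observations, a rank-based coefficient is a reasonable cross-check on the Pearson result. The sketch below requests both in one call; NOSIMPLE is assumed here only to suppress the simple statistics table that was already shown.

```sas
/* Sketch: Pearson and Spearman correlations side by side. */
proc corr data=wildlife_migration pearson spearman nosimple;
    var Migration_Distance_km Group_Size;
    title "Pearson and Spearman Correlations";
run;
```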
 


proc reg data=wildlife_migration;

    model Migration_Distance_km = Group_Size;

    title "Regression Analysis: Migration Distance vs. Group Size";

run;

quit; /* PROC REG is interactive; QUIT ends it explicitly */

Output:

                                      Regression Analysis: Migration Distance vs. Group Size

                                                             The REG Procedure
                                                                Model: MODEL1
                                               Dependent Variable: Migration_Distance_km

Number of Observations Read 6
Number of Observations Used 6


Analysis of Variance
Source DF Sum of Squares Mean Square F Value Pr > F
Model 1 133830 133830 51.96 0.0020
Error 4 10303 2575.87189    
Corrected Total 5 144133      


Root MSE 50.75305 R-Square 0.9285
Dependent Mean 283.33333 Adj R-Sq 0.9106
Coeff Var 17.91284    


Parameter Estimates
Variable DF Parameter Estimate Standard Error t Value Pr > |t|
Intercept 1 35.10907 40.19010 0.87 0.4317
Group_Size 1 2.11255 0.29308 7.21 0.0020
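
The parameter estimates above correspond to the fitted line Migration_Distance_km = 35.10907 + 2.11255 * Group_Size. As a sketch, an OUTPUT statement can store the fitted values and residuals for each observation so they can be inspected next to the data. The dataset name reg_out and the variable names Predicted_km and Residual_km are assumptions for illustration.

```sas
/* Sketch: keep predicted values and residuals from the same model. */
proc reg data=wildlife_migration;
    model Migration_Distance_km = Group_Size;
    output out=reg_out p=Predicted_km r=Residual_km;
    title "Regression with Predicted Values and Residuals";
run;
quit;

proc print data=reg_out;
    var Species Group_Size Migration_Distance_km Predicted_km Residual_km;
    title "Observed vs. Predicted Migration Distance";
run;
```

With six observations and four error degrees of freedom, the fit is best read as descriptive of this sample rather than as a general relationship.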


Step 6: Data Management

proc sort data=wildlife_migration;

    by Species Season;

run;


proc format;

    value $season_fmt

        'Spring' = 'Spr'

        'Summer' = 'Sum'

        'Autumn' = 'Aut'

        'Winter' = 'Win';

run;


proc print data=wildlife_migration;

    format Season $season_fmt.;

    title "Formatted Seasons";

run;

Output:

                                                                         Formatted Seasons

Obs Species Region Season Migration_Distance_km Group_Size Observation_Date Observer_ID
1 Caribou Tundra Aut 500 200 10SEP2025 OBS003
2 Caribou Tundra Sum 450 180 15JUL2025 OBS006
3 Elephant Savannah Spr 120 25 15MAR2025 OBS001
4 Elephant Forest Win 80 20 05DEC2025 OBS004
5 Wildebee Savannah Spr 250 130 25MAR2025 OBS005
6 Wildebee Savannah Sum 300 150 20JUN2025 OBS002
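
The FORMAT statement in PROC PRINT changes only how Season is displayed; the stored values are still the full season names. If the abbreviation is needed as data, one option is the PUT function. This sketch assumes the $season_fmt format defined above is still available in the session; wildlife_short and Season_Short are hypothetical names.

```sas
/* Sketch: store the abbreviated season as a new character variable. */
data wildlife_short;
    set wildlife_migration;
    length Season_Short $ 3;
    Season_Short = put(Season, $season_fmt.);   /* 'Spring' becomes 'Spr' */
run;

proc print data=wildlife_short;
    var Species Season Season_Short;
    title "Stored Season Abbreviations";
run;
```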


Step 7: Generating Reports

proc report data=wildlife_migration nowd;

    column Species Region Season Migration_Distance_km Group_Size;

    define Species / group;

    define Region / group;

    define Season / group;

    define Migration_Distance_km / analysis mean;

    define Group_Size / analysis mean;

    title "Wildlife Migration Report";

run;

Output:

                                                                 Wildlife Migration Report

Species Region Season Migration_Distance_km Group_Size
Caribou Tundra Autumn 500 200
    Summer 450 180
Elephant Forest Winter 80 20
  Savannah Spring 120 25
Wildebee Savannah Spring 250 130
    Summer 300 150
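
Because every Species, Region and Season combination in this dataset holds a single observation, the means above are simply the raw values. A sketch that groups by Species alone, so the MEAN statistic actually aggregates, is shown below; the column labels, the 8.1 format and the RBREAK summary row are assumptions added for illustration.

```sas
/* Sketch: one row per species plus an overall summary row. */
proc report data=wildlife_migration nowd;
    column Species Migration_Distance_km Group_Size;
    define Species / group;
    define Migration_Distance_km / analysis mean format=8.1 "Mean Distance (km)";
    define Group_Size / analysis mean format=8.1 "Mean Group Size";
    rbreak after / summarize;        /* overall means across all rows */
    title "Mean Distance and Group Size by Species";
run;
```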

Step 8: Creating Custom Macros

%macro analyze_species(species_name);

proc print data=wildlife_migration;

    where Species = "&species_name";

    title "Data for &species_name";

run;


proc means data=wildlife_migration;

    where Species = "&species_name";

    var Migration_Distance_km Group_Size;

    title "Statistics for &species_name";

run;

%mend analyze_species;


%analyze_species(Elephant);

Output:

                                                             Statistics for Elephant

                                                           The MEANS Procedure

Variable N Mean Std Dev Minimum Maximum
Migration_Distance_km 2 100.0000000 28.2842712 80.0000000 120.0000000
Group_Size 2 22.5000000 3.5355339 20.0000000 25.0000000

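
The macro can be reused for the other species. The sketch below assumes the macro and the dataset from the steps above already exist in the session. Because Species was read with a length of 8, the stored value for wildebeest is "Wildebee", so that is the value the WHERE clause has to match.

```sas
/* Sketch: further calls to the macro defined above. */
%analyze_species(Caribou)
%analyze_species(Wildebee)   /* stored value is truncated to 8 characters */
```

Passing the full name "Wildebeest" would select no observations from the dataset as created in Step 1.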

Step 9: Data Cleaning

proc means data=wildlife_migration n nmiss;

    var Migration_Distance_km Group_Size;

    title "Missing Values Check";

run;

Output:

                                                             Missing Values Check

                                                           The MEANS Procedure

Variable N N Miss
Migration_Distance_km 6 0
Group_Size 6 0

data cleaned_data;

    set wildlife_migration;

    if nmiss(of _numeric_) = 0;

run;

proc print data=cleaned_data; run;

Output:

Obs Species Region Season Migration_Distance_km Group_Size Observation_Date Observer_ID
1 Caribou Tundra Autumn 500 200 10SEP2025 OBS003
2 Caribou Tundra Summer 450 180 15JUL2025 OBS006
3 Elephant Savannah Spring 120 25 15MAR2025 OBS001
4 Elephant Forest Winter 80 20 05DEC2025 OBS004
5 Wildebee Savannah Spring 250 130 25MAR2025 OBS005
6 Wildebee Savannah Summer 300 150 20JUN2025 OBS002
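
The filter above checks numeric variables only, so a row with a blank Species or Region would still pass. As a sketch, the CMISS function counts missing values across both character and numeric arguments; cleaned_all is a hypothetical dataset name.

```sas
/* Sketch: keep rows with no missing value in any variable. */
data cleaned_all;
    set wildlife_migration;
    if cmiss(of _all_) = 0;      /* CMISS accepts character and numeric values */
run;

proc print data=cleaned_all;
    title "Rows with No Missing Values";
run;
```

For this dataset no values are missing, so all six observations are kept either way.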


Step 10: Exporting the Dataset

proc export data=wildlife_migration

    outfile='C:\SASData\wildlife_migration.csv'

    dbms=csv

    replace;

run;


Step 11: Creating Summary Tables

proc tabulate data=wildlife_migration;

    class Species Season;

    var Migration_Distance_km;

    table Species, Season*Migration_Distance_km*(mean);

    title "Average Migration Distance by Species and Season";

run;

Output:

                                                                            Average Migration Distance by Species and Season

Season (cell value: Mean of Migration_Distance_km)
Species Autumn Spring Summer Winter
Caribou 500.00 . 450.00 .
Elephant . 120.00 . 80.00
Wildebee . 250.00 300.00 .
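
The dots in the table mark Species and Season combinations with no observations. A sketch of a tidier version uses the MISSTEXT= option to replace them and a format on the statistic; the dash and the 8.1 format are display choices assumed here.

```sas
/* Sketch: same table with a placeholder for empty cells. */
proc tabulate data=wildlife_migration;
    class Species Season;
    var Migration_Distance_km;
    table Species, Season*Migration_Distance_km*mean*f=8.1 / misstext='-';
    title "Average Migration Distance by Species and Season";
run;
```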


Step 12: Analysis of Variance (ANOVA)

proc anova data=wildlife_migration;

    class Region;

    model Migration_Distance_km = Region;

    means Region / tukey;

    title "ANOVA: Migration Distance Across Regions";

run;

quit; /* PROC ANOVA is interactive; QUIT ends it explicitly */

Output:
                                                      ANOVA: Migration Distance Across Regions

                                                                    The ANOVA Procedure

Class Level Information
Class Levels Values
Region 3 Forest Savannah Tundra

Number of Observations Read 6
Number of Observations Used 6

                                               ANOVA: Migration Distance Across Regions

The ANOVA Procedure
 
Dependent Variable: Migration_Distance_km

Source DF Sum of Squares Mean Square F Value Pr > F
Model 2 125616.6667 62808.3333 10.18 0.0460
Error 3 18516.6667 6172.2222    
Corrected Total 5 144133.3333      

R-Square Coeff Var Root MSE Migration_Distance_km Mean
0.871531 27.72829 78.56349 283.3333

Source DF Anova SS Mean Square F Value Pr > F
Region 2 125616.6667 62808.3333 10.18 0.0460

                                              ANOVA: Migration Distance Across Regions

The ANOVA Procedure
 
Tukey's Studentized Range (HSD) Test for Migration_Distance_km

Note: This test controls the Type I experimentwise error rate.

Alpha 0.05
Error Degrees of Freedom 3
Error Mean Square 6172.222
Critical Value of Studentized Range 5.90958

Comparisons significant at the 0.05 level are indicated by ***.
Region Comparison Difference Between Means Simultaneous 95% Confidence Limits
Tundra - Savannah 251.67 -48.02 551.36  
Tundra - Forest 395.00 -7.08 797.08  
Savannah - Tundra -251.67 -551.36 48.02  
Savannah - Forest 143.33 -235.75 522.41  
Forest - Tundra -395.00 -797.08 7.08  
Forest - Savannah -143.33 -522.41 235.75  
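
None of the pairwise comparisons above is marked with ***, even though the overall F test gives p = 0.0460; with only 3 error degrees of freedom the Tukey intervals are very wide. The regions also have unequal group sizes (Forest 1, Savannah 3, Tundra 2). PROC ANOVA can be used for a one-way layout like this one, but PROC GLM is the more general procedure for unbalanced data. The sketch below simply repeats the same model in PROC GLM as a cross-check.

```sas
/* Sketch: the same one-way model fitted with PROC GLM. */
proc glm data=wildlife_migration;
    class Region;
    model Migration_Distance_km = Region;
    means Region / tukey;
    title "GLM: Migration Distance Across Regions";
run;
quit;
```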


