199.UNVEILING MOSQUITO SPECIES ANALYSIS USING PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC SQL | PROC EXPORT | PROC GCHART | MACROS
- Get link
- X
- Other Apps
UNVEILING MOSQUITO SPECIES ANALYSIS USING PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC SQL | PROC EXPORT | PROC GCHART | MACROS
/*Creating a unique dataset about different types of mosquitoes*/
1. Dataset Creation
data mosquitoes;
length Species $20 Habitat $15 Active_Time $10 Disease $25 Region $15;
input Species $ Habitat $ Active_Time $ Disease $ Region $
Wingspan_mm Lifespan_days;
datalines;
Aedes_aegypti Urban Daytime Dengue_AIDS Tropics 3.5 30
Anopheles_gambiae Rural Nighttime Malaria Africa 4.0 14
Culex_quinquefasciatus Urban Nighttime Filariasis Asia 3.2 20
Aedes_albopictus Urban Daytime Zika Tropics 4.1 28
Anopheles_funestus Rural Nighttime Malaria Africa 4.3 12
Culex_pipiens Urban Nighttime West_Nile NorthAmerica 3.6 18
Psorophora_ciliata Forest Daytime Encephalitis SouthAmerica 5.0 25
Mansonia_uniformis Swamp Nighttime Filariasis Asia 3.9 16
Haemagogus_janthinomys Forest Daytime Yellow_Fever Amazon 4.5 20
Ochlerotatus_triseriatus Urban Daytime La_Crosse_USA USA 3.7 21
Aedes_scapularis Forest Daytime Dengue SouthAmerica 3.8 19
Culex_tarsalis Marsh Nighttime West_Nile USA 4.0 22
Aedes_vexans Urban Daytime Encephalitis Europe 3.6 17
Anopheles_dirus Forest Nighttime Malaria Asia 4.2 13
Culex_nigripalpus Wetlands Nighttime Encephalitis NorthAmerica 4.1 18
Anopheles_stephensi Urban Nighttime Malaria Asia 3.9 15
Aedes_notoscriptus Urban Daytime Ross_River Australia 3.5 27
;
run;
proc print data=mosquitoes label;
title "Mosquito Species Dataset Overview";
run;
Output:
Mosquito Species Dataset Overview |
Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
---|---|---|---|---|---|---|---|
1 | Aedes_aegypti | Urban | Daytime | Dengue_AIDS | Tropics | 3.5 | 30 |
2 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 |
3 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 |
4 | Aedes_albopictus | Urban | Daytime | Zika | Tropics | 4.1 | 28 |
5 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 |
6 | Culex_pipiens | Urban | Nighttime | West_Nile | NorthAmerica | 3.6 | 18 |
7 | Psorophora_ciliata | Forest | Daytime | Encephalitis | SouthAmerica | 5.0 | 25 |
8 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 |
9 | Haemagogus_janthinom | Forest | Daytime | Yellow_Fever | Amazon | 4.5 | 20 |
10 | Ochlerotatus_triseri | Urban | Daytime | La_Crosse_USA | USA | 3.7 | 21 |
11 | Aedes_scapularis | Forest | Daytime | Dengue | SouthAmerica | 3.8 | 19 |
12 | Culex_tarsalis | Marsh | Nighttime | West_Nile | USA | 4.0 | 22 |
13 | Aedes_vexans | Urban | Daytime | Encephalitis | Europe | 3.6 | 17 |
14 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 |
15 | Culex_nigripalpus | Wetlands | Nighttime | Encephalitis | NorthAmerica | 4.1 | 18 |
16 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 |
17 | Aedes_notoscriptus | Urban | Daytime | Ross_River | Australia | 3.5 | 27 |
2. PROC FREQ: Frequency Analysis of Diseases and Habitat
proc freq data=mosquitoes;
tables Disease Habitat / nocum;
title "Frequency Distribution of Disease and Habitat";
run;
Output:
Frequency Distribution of Disease and Habitat |
Disease | Frequency | Percent |
---|---|---|
Dengue | 1 | 5.88 |
Dengue_AIDS | 1 | 5.88 |
Encephalitis | 3 | 17.65 |
Filariasis | 2 | 11.76 |
La_Crosse_USA | 1 | 5.88 |
Malaria | 4 | 23.53 |
Ross_River | 1 | 5.88 |
West_Nile | 2 | 11.76 |
Yellow_Fever | 1 | 5.88 |
Zika | 1 | 5.88 |
Habitat | Frequency | Percent |
---|---|---|
Forest | 4 | 23.53 |
Marsh | 1 | 5.88 |
Rural | 2 | 11.76 |
Swamp | 1 | 5.88 |
Urban | 8 | 47.06 |
Wetlands | 1 | 5.88 |
3. PROC MEANS: Descriptive Statistics for Numerical Variables
proc means data=mosquitoes n mean std min max;
var Wingspan_mm Lifespan_days;
title "Descriptive Statistics for Wingspan and Lifespan";
run;
Output:
Descriptive Statistics for Wingspan and Lifespan |
Variable | N | Mean | Std Dev | Minimum | Maximum | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
4. PROC SORT: Sorting by Region and Lifespan
proc sort data=mosquitoes out=sorted_mosquitoes;
by Region descending Lifespan_days;
run;
proc print data=sorted_mosquitoes;
title "Mosquito Species Sorted by Region and Lifespan";
run;
Output:
Mosquito Species Sorted by Region and Lifespan |
Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
---|---|---|---|---|---|---|---|
1 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 |
2 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 |
3 | Haemagogus_janthinom | Forest | Daytime | Yellow_Fever | Amazon | 4.5 | 20 |
4 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 |
5 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 |
6 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 |
7 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 |
8 | Aedes_notoscriptus | Urban | Daytime | Ross_River | Australia | 3.5 | 27 |
9 | Aedes_vexans | Urban | Daytime | Encephalitis | Europe | 3.6 | 17 |
10 | Culex_pipiens | Urban | Nighttime | West_Nile | NorthAmerica | 3.6 | 18 |
11 | Culex_nigripalpus | Wetlands | Nighttime | Encephalitis | NorthAmerica | 4.1 | 18 |
12 | Psorophora_ciliata | Forest | Daytime | Encephalitis | SouthAmerica | 5.0 | 25 |
13 | Aedes_scapularis | Forest | Daytime | Dengue | SouthAmerica | 3.8 | 19 |
14 | Aedes_aegypti | Urban | Daytime | Dengue_AIDS | Tropics | 3.5 | 30 |
15 | Aedes_albopictus | Urban | Daytime | Zika | Tropics | 4.1 | 28 |
16 | Culex_tarsalis | Marsh | Nighttime | West_Nile | USA | 4.0 | 22 |
17 | Ochlerotatus_triseri | Urban | Daytime | La_Crosse_USA | USA | 3.7 | 21 |
5. PROC REPORT: Structured Summary
proc report data=mosquitoes nowd;
column Species Region Habitat Lifespan_days;
define Species / display;
define Region / group;
define Habitat / group;
define Lifespan_days / analysis mean format=4.1;
title "Average Lifespan of Mosquitoes by Region and Habitat";
run;
Output:
Average Lifespan of Mosquitoes by Region and Habitat |
Species | Region | Habitat | Lifespan_days |
---|---|---|---|
Anopheles_gambiae | Africa | Rural | 14.0 |
Anopheles_funestus | 12.0 | ||
Haemagogus_janthinom | Amazon | Forest | 20.0 |
Anopheles_dirus | Asia | Forest | 13.0 |
Mansonia_uniformis | Swamp | 16.0 | |
Culex_quinquefasciat | Urban | 20.0 | |
Anopheles_stephensi | 15.0 | ||
Aedes_notoscriptus | Australia | Urban | 27.0 |
Aedes_vexans | Europe | Urban | 17.0 |
Culex_pipiens | NorthAmerica | Urban | 18.0 |
Culex_nigripalpus | Wetlands | 18.0 | |
Psorophora_ciliata | SouthAmerica | Forest | 25.0 |
Aedes_scapularis | 19.0 | ||
Aedes_aegypti | Tropics | Urban | 30.0 |
Aedes_albopictus | 28.0 | ||
Culex_tarsalis | USA | Marsh | 22.0 |
Ochlerotatus_triseri | Urban | 21.0 |
6. PROC SQL: Advanced Queries
A. Select Only Daytime Active Species with Lifespan > 20 Days
proc sql;
select Species, Region, Lifespan_days
from mosquitoes
where Active_Time = 'Daytime' and Lifespan_days > 20;
quit;
Output:
Species | Region | Lifespan_days |
---|---|---|
Aedes_aegypti | Tropics | 30 |
Aedes_albopictus | Tropics | 28 |
Psorophora_ciliata | SouthAmerica | 25 |
Ochlerotatus_triseri | USA | 21 |
Aedes_notoscriptus | Australia | 27 |
B. Count of Species by Disease
proc sql;
select Disease, count(*) as Species_Count
from mosquitoes
group by Disease
order by Species_Count desc;
quit;
Output:
Disease | Species_Count |
---|---|
Malaria | 4 |
Encephalitis | 3 |
Filariasis | 2 |
West_Nile | 2 |
Ross_River | 1 |
Zika | 1 |
La_Crosse_USA | 1 |
Dengue | 1 |
Dengue_AIDS | 1 |
Yellow_Fever | 1 |
C. Average Wingspan per Habitat
proc sql;
select Habitat, avg(Wingspan_mm) as Avg_Wingspan format=4.2
from mosquitoes
group by Habitat;
quit;
Output:
Habitat | Avg_Wingspan |
---|---|
Forest | 4.38 |
Marsh | 4.00 |
Rural | 4.15 |
Swamp | 3.90 |
Urban | 3.64 |
Wetlands | 4.10 |
7. MACRO: Dynamic Filtering by Region
%macro filter_by_region(region);
title "Mosquito Species Found in ®ion Region";
proc print data=mosquitoes;
where Region = "®ion";
run;
%mend;
%filter_by_region(Asia);
Output:
Mosquito Species Found in Asia Region |
Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
---|---|---|---|---|---|---|---|
3 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 |
8 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 |
14 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 |
16 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 |
%filter_by_region(Africa);
Output:
Mosquito Species Found in Africa Region |
Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
---|---|---|---|---|---|---|---|
2 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 |
5 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 |
8. MACRO with PROC SQL: Count by Disease for Given Active Time
%macro count_disease_by_activity(time);
proc sql;
select Disease, count(*) as Count
from mosquitoes
where Active_Time = "&time"
group by Disease;
quit;
%mend;
%count_disease_by_activity(Daytime);
Output:
Disease | Count |
---|---|
Dengue | 1 |
Dengue_AIDS | 1 |
Encephalitis | 2 |
La_Crosse_USA | 1 |
Ross_River | 1 |
Yellow_Fever | 1 |
Zika | 1 |
%count_disease_by_activity(Nighttime);
Output:
Disease | Count |
---|---|
Encephalitis | 1 |
Filariasis | 2 |
Malaria | 4 |
West_Nile | 2 |
9. Summary Table with PROC SQL Join (Simulated Lookup Table)
Create Region Continent Mapping Table
data region_map;
length Region $15 Continent $15;
input Region $ Continent $;
datalines;
Tropics South_America
Africa Africa
Asia Asia
NorthAmerica North_America
SouthAmerica South_America
Amazon South_America
USA North_America
Europe Europe
Australia Australia
;
run;
proc print;run;
Output:
Obs | Region | Continent |
---|---|---|
1 | Tropics | South_America |
2 | Africa | Africa |
3 | Asia | Asia |
4 | NorthAmerica | North_America |
5 | SouthAmerica | South_America |
6 | Amazon | South_America |
7 | USA | North_America |
8 | Europe | Europe |
9 | Australia | Australia |
Join with Main Dataset
proc sql;
create table mosquito_with_continent as
select a.*, b.Continent
from mosquitoes a
left join region_map b
on a.Region = b.Region;
quit;
proc print data=mosquito_with_continent;
title "Mosquito Species with Continent Info";
run;
Output:
Mosquito Species with Continent Info |
Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days | Continent |
---|---|---|---|---|---|---|---|---|
1 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 | Africa |
2 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 | Africa |
3 | Haemagogus_janthinom | Forest | Daytime | Yellow_Fever | Amazon | 4.5 | 20 | South_America |
4 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 | Asia |
5 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 | Asia |
6 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 | Asia |
7 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 | Asia |
8 | Aedes_notoscriptus | Urban | Daytime | Ross_River | Australia | 3.5 | 27 | Australia |
9 | Aedes_vexans | Urban | Daytime | Encephalitis | Europe | 3.6 | 17 | Europe |
10 | Culex_nigripalpus | Wetlands | Nighttime | Encephalitis | NorthAmerica | 4.1 | 18 | North_America |
11 | Culex_pipiens | Urban | Nighttime | West_Nile | NorthAmerica | 3.6 | 18 | North_America |
12 | Psorophora_ciliata | Forest | Daytime | Encephalitis | SouthAmerica | 5.0 | 25 | South_America |
13 | Aedes_scapularis | Forest | Daytime | Dengue | SouthAmerica | 3.8 | 19 | South_America |
14 | Aedes_aegypti | Urban | Daytime | Dengue_AIDS | Tropics | 3.5 | 30 | South_America |
15 | Aedes_albopictus | Urban | Daytime | Zika | Tropics | 4.1 | 28 | South_America |
16 | Culex_tarsalis | Marsh | Nighttime | West_Nile | USA | 4.0 | 22 | North_America |
17 | Ochlerotatus_triseri | Urban | Daytime | La_Crosse_USA | USA | 3.7 | 21 | North_America |
10.PROC GCHART (Optional): Simple Bar Chart
proc gchart data=mosquitoes;
vbar Habitat / type=freq;
title "Bar Chart of Mosquito Habitats";
run;
quit;
Log:
NOTE: 13334 bytes written to C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary
Files\_TD13944_DESKTOP-QFAA4KV_\gchart.png.
NOTE: There were 17 observations read from the data set WORK.MOSQUITOES.
NOTE: PROCEDURE GCHART used (Total process time):
real time 2.59 seconds
cpu time 0.56 seconds
11. Advanced Macro: Lifespan Statistics for Any Disease
%macro lifespan_stats(disease);
proc sql;
select mean(Lifespan_days) as Avg_Life format=4.1,
min(Lifespan_days) as Min_Life,
max(Lifespan_days) as Max_Life
from mosquitoes
where Disease = "&disease";
quit;
%mend;
%lifespan_stats(Malaria);
Output:
Avg_Life | Min_Life | Max_Life |
---|---|---|
13.5 | 12 | 15 |
%lifespan_stats(Dengue_AIDS);
Output:
Avg_Life | Min_Life | Max_Life |
---|---|---|
30.0 | 30 | 30 |
12. Export Data (for Reporting)
proc export data=mosquito_with_continent
outfile="mosquito_dataset.csv"
dbms=csv
replace;
run;
- Get link
- X
- Other Apps
Comments
Post a Comment