199.UNVEILING MOSQUITO SPECIES ANALYSIS USING PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC SQL | PROC EXPORT | PROC GCHART | MACROS
UNVEILING MOSQUITO SPECIES ANALYSIS USING PROC PRINT | PROC SORT | PROC FREQ | PROC MEANS | PROC REPORT | PROC SQL | PROC EXPORT | PROC GCHART | MACROS
/*Creating a unique dataset about different types of mosquitoes*/
1. Dataset Creation
data mosquitoes;
length Species $20 Habitat $15 Active_Time $10 Disease $25 Region $15;
input Species $ Habitat $ Active_Time $ Disease $ Region $
Wingspan_mm Lifespan_days;
datalines;
Aedes_aegypti Urban Daytime Dengue_AIDS Tropics 3.5 30
Anopheles_gambiae Rural Nighttime Malaria Africa 4.0 14
Culex_quinquefasciatus Urban Nighttime Filariasis Asia 3.2 20
Aedes_albopictus Urban Daytime Zika Tropics 4.1 28
Anopheles_funestus Rural Nighttime Malaria Africa 4.3 12
Culex_pipiens Urban Nighttime West_Nile NorthAmerica 3.6 18
Psorophora_ciliata Forest Daytime Encephalitis SouthAmerica 5.0 25
Mansonia_uniformis Swamp Nighttime Filariasis Asia 3.9 16
Haemagogus_janthinomys Forest Daytime Yellow_Fever Amazon 4.5 20
Ochlerotatus_triseriatus Urban Daytime La_Crosse_USA USA 3.7 21
Aedes_scapularis Forest Daytime Dengue SouthAmerica 3.8 19
Culex_tarsalis Marsh Nighttime West_Nile USA 4.0 22
Aedes_vexans Urban Daytime Encephalitis Europe 3.6 17
Anopheles_dirus Forest Nighttime Malaria Asia 4.2 13
Culex_nigripalpus Wetlands Nighttime Encephalitis NorthAmerica 4.1 18
Anopheles_stephensi Urban Nighttime Malaria Asia 3.9 15
Aedes_notoscriptus Urban Daytime Ross_River Australia 3.5 27
;
run;
proc print data=mosquitoes label;
title "Mosquito Species Dataset Overview";
run;
Output:
| Mosquito Species Dataset Overview |
| Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
|---|---|---|---|---|---|---|---|
| 1 | Aedes_aegypti | Urban | Daytime | Dengue_AIDS | Tropics | 3.5 | 30 |
| 2 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 |
| 3 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 |
| 4 | Aedes_albopictus | Urban | Daytime | Zika | Tropics | 4.1 | 28 |
| 5 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 |
| 6 | Culex_pipiens | Urban | Nighttime | West_Nile | NorthAmerica | 3.6 | 18 |
| 7 | Psorophora_ciliata | Forest | Daytime | Encephalitis | SouthAmerica | 5.0 | 25 |
| 8 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 |
| 9 | Haemagogus_janthinom | Forest | Daytime | Yellow_Fever | Amazon | 4.5 | 20 |
| 10 | Ochlerotatus_triseri | Urban | Daytime | La_Crosse_USA | USA | 3.7 | 21 |
| 11 | Aedes_scapularis | Forest | Daytime | Dengue | SouthAmerica | 3.8 | 19 |
| 12 | Culex_tarsalis | Marsh | Nighttime | West_Nile | USA | 4.0 | 22 |
| 13 | Aedes_vexans | Urban | Daytime | Encephalitis | Europe | 3.6 | 17 |
| 14 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 |
| 15 | Culex_nigripalpus | Wetlands | Nighttime | Encephalitis | NorthAmerica | 4.1 | 18 |
| 16 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 |
| 17 | Aedes_notoscriptus | Urban | Daytime | Ross_River | Australia | 3.5 | 27 |
2. PROC FREQ: Frequency Analysis of Diseases and Habitat
proc freq data=mosquitoes;
tables Disease Habitat / nocum;
title "Frequency Distribution of Disease and Habitat";
run;
Output:
| Frequency Distribution of Disease and Habitat |
| Disease | Frequency | Percent |
|---|---|---|
| Dengue | 1 | 5.88 |
| Dengue_AIDS | 1 | 5.88 |
| Encephalitis | 3 | 17.65 |
| Filariasis | 2 | 11.76 |
| La_Crosse_USA | 1 | 5.88 |
| Malaria | 4 | 23.53 |
| Ross_River | 1 | 5.88 |
| West_Nile | 2 | 11.76 |
| Yellow_Fever | 1 | 5.88 |
| Zika | 1 | 5.88 |
| Habitat | Frequency | Percent |
|---|---|---|
| Forest | 4 | 23.53 |
| Marsh | 1 | 5.88 |
| Rural | 2 | 11.76 |
| Swamp | 1 | 5.88 |
| Urban | 8 | 47.06 |
| Wetlands | 1 | 5.88 |
3. PROC MEANS: Descriptive Statistics for Numerical Variables
proc means data=mosquitoes n mean std min max;
var Wingspan_mm Lifespan_days;
title "Descriptive Statistics for Wingspan and Lifespan";
run;
Output:
| Descriptive Statistics for Wingspan and Lifespan |
| Variable | N | Mean | Std Dev | Minimum | Maximum | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
4. PROC SORT: Sorting by Region and Lifespan
proc sort data=mosquitoes out=sorted_mosquitoes;
by Region descending Lifespan_days;
run;
proc print data=sorted_mosquitoes;
title "Mosquito Species Sorted by Region and Lifespan";
run;
Output:
| Mosquito Species Sorted by Region and Lifespan |
| Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
|---|---|---|---|---|---|---|---|
| 1 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 |
| 2 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 |
| 3 | Haemagogus_janthinom | Forest | Daytime | Yellow_Fever | Amazon | 4.5 | 20 |
| 4 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 |
| 5 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 |
| 6 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 |
| 7 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 |
| 8 | Aedes_notoscriptus | Urban | Daytime | Ross_River | Australia | 3.5 | 27 |
| 9 | Aedes_vexans | Urban | Daytime | Encephalitis | Europe | 3.6 | 17 |
| 10 | Culex_pipiens | Urban | Nighttime | West_Nile | NorthAmerica | 3.6 | 18 |
| 11 | Culex_nigripalpus | Wetlands | Nighttime | Encephalitis | NorthAmerica | 4.1 | 18 |
| 12 | Psorophora_ciliata | Forest | Daytime | Encephalitis | SouthAmerica | 5.0 | 25 |
| 13 | Aedes_scapularis | Forest | Daytime | Dengue | SouthAmerica | 3.8 | 19 |
| 14 | Aedes_aegypti | Urban | Daytime | Dengue_AIDS | Tropics | 3.5 | 30 |
| 15 | Aedes_albopictus | Urban | Daytime | Zika | Tropics | 4.1 | 28 |
| 16 | Culex_tarsalis | Marsh | Nighttime | West_Nile | USA | 4.0 | 22 |
| 17 | Ochlerotatus_triseri | Urban | Daytime | La_Crosse_USA | USA | 3.7 | 21 |
5. PROC REPORT: Structured Summary
proc report data=mosquitoes nowd;
column Species Region Habitat Lifespan_days;
define Species / display;
define Region / group;
define Habitat / group;
define Lifespan_days / analysis mean format=4.1;
title "Average Lifespan of Mosquitoes by Region and Habitat";
run;
Output:
| Average Lifespan of Mosquitoes by Region and Habitat |
| Species | Region | Habitat | Lifespan_days |
|---|---|---|---|
| Anopheles_gambiae | Africa | Rural | 14.0 |
| Anopheles_funestus | 12.0 | ||
| Haemagogus_janthinom | Amazon | Forest | 20.0 |
| Anopheles_dirus | Asia | Forest | 13.0 |
| Mansonia_uniformis | Swamp | 16.0 | |
| Culex_quinquefasciat | Urban | 20.0 | |
| Anopheles_stephensi | 15.0 | ||
| Aedes_notoscriptus | Australia | Urban | 27.0 |
| Aedes_vexans | Europe | Urban | 17.0 |
| Culex_pipiens | NorthAmerica | Urban | 18.0 |
| Culex_nigripalpus | Wetlands | 18.0 | |
| Psorophora_ciliata | SouthAmerica | Forest | 25.0 |
| Aedes_scapularis | 19.0 | ||
| Aedes_aegypti | Tropics | Urban | 30.0 |
| Aedes_albopictus | 28.0 | ||
| Culex_tarsalis | USA | Marsh | 22.0 |
| Ochlerotatus_triseri | Urban | 21.0 |
6. PROC SQL: Advanced Queries
A. Select Only Daytime Active Species with Lifespan > 20 Days
proc sql;
select Species, Region, Lifespan_days
from mosquitoes
where Active_Time = 'Daytime' and Lifespan_days > 20;
quit;
Output:
| Species | Region | Lifespan_days |
|---|---|---|
| Aedes_aegypti | Tropics | 30 |
| Aedes_albopictus | Tropics | 28 |
| Psorophora_ciliata | SouthAmerica | 25 |
| Ochlerotatus_triseri | USA | 21 |
| Aedes_notoscriptus | Australia | 27 |
B. Count of Species by Disease
proc sql;
select Disease, count(*) as Species_Count
from mosquitoes
group by Disease
order by Species_Count desc;
quit;
Output:
| Disease | Species_Count |
|---|---|
| Malaria | 4 |
| Encephalitis | 3 |
| Filariasis | 2 |
| West_Nile | 2 |
| Ross_River | 1 |
| Zika | 1 |
| La_Crosse_USA | 1 |
| Dengue | 1 |
| Dengue_AIDS | 1 |
| Yellow_Fever | 1 |
C. Average Wingspan per Habitat
proc sql;
select Habitat, avg(Wingspan_mm) as Avg_Wingspan format=4.2
from mosquitoes
group by Habitat;
quit;
Output:
| Habitat | Avg_Wingspan |
|---|---|
| Forest | 4.38 |
| Marsh | 4.00 |
| Rural | 4.15 |
| Swamp | 3.90 |
| Urban | 3.64 |
| Wetlands | 4.10 |
7. MACRO: Dynamic Filtering by Region
%macro filter_by_region(region);
title "Mosquito Species Found in ®ion Region";
proc print data=mosquitoes;
where Region = "®ion";
run;
%mend;
%filter_by_region(Asia);
Output:
| Mosquito Species Found in Asia Region |
| Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
|---|---|---|---|---|---|---|---|
| 3 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 |
| 8 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 |
| 14 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 |
| 16 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 |
%filter_by_region(Africa);
Output:
| Mosquito Species Found in Africa Region |
| Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days |
|---|---|---|---|---|---|---|---|
| 2 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 |
| 5 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 |
8. MACRO with PROC SQL: Count by Disease for Given Active Time
%macro count_disease_by_activity(time);
proc sql;
select Disease, count(*) as Count
from mosquitoes
where Active_Time = "&time"
group by Disease;
quit;
%mend;
%count_disease_by_activity(Daytime);
Output:
| Disease | Count |
|---|---|
| Dengue | 1 |
| Dengue_AIDS | 1 |
| Encephalitis | 2 |
| La_Crosse_USA | 1 |
| Ross_River | 1 |
| Yellow_Fever | 1 |
| Zika | 1 |
%count_disease_by_activity(Nighttime);
Output:
| Disease | Count |
|---|---|
| Encephalitis | 1 |
| Filariasis | 2 |
| Malaria | 4 |
| West_Nile | 2 |
9. Summary Table with PROC SQL Join (Simulated Lookup Table)
Create Region Continent Mapping Table
data region_map;
length Region $15 Continent $15;
input Region $ Continent $;
datalines;
Tropics South_America
Africa Africa
Asia Asia
NorthAmerica North_America
SouthAmerica South_America
Amazon South_America
USA North_America
Europe Europe
Australia Australia
;
run;
proc print;run;
Output:
| Obs | Region | Continent |
|---|---|---|
| 1 | Tropics | South_America |
| 2 | Africa | Africa |
| 3 | Asia | Asia |
| 4 | NorthAmerica | North_America |
| 5 | SouthAmerica | South_America |
| 6 | Amazon | South_America |
| 7 | USA | North_America |
| 8 | Europe | Europe |
| 9 | Australia | Australia |
Join with Main Dataset
proc sql;
create table mosquito_with_continent as
select a.*, b.Continent
from mosquitoes a
left join region_map b
on a.Region = b.Region;
quit;
proc print data=mosquito_with_continent;
title "Mosquito Species with Continent Info";
run;
Output:
| Mosquito Species with Continent Info |
| Obs | Species | Habitat | Active_Time | Disease | Region | Wingspan_mm | Lifespan_days | Continent |
|---|---|---|---|---|---|---|---|---|
| 1 | Anopheles_funestus | Rural | Nighttime | Malaria | Africa | 4.3 | 12 | Africa |
| 2 | Anopheles_gambiae | Rural | Nighttime | Malaria | Africa | 4.0 | 14 | Africa |
| 3 | Haemagogus_janthinom | Forest | Daytime | Yellow_Fever | Amazon | 4.5 | 20 | South_America |
| 4 | Culex_quinquefasciat | Urban | Nighttime | Filariasis | Asia | 3.2 | 20 | Asia |
| 5 | Mansonia_uniformis | Swamp | Nighttime | Filariasis | Asia | 3.9 | 16 | Asia |
| 6 | Anopheles_dirus | Forest | Nighttime | Malaria | Asia | 4.2 | 13 | Asia |
| 7 | Anopheles_stephensi | Urban | Nighttime | Malaria | Asia | 3.9 | 15 | Asia |
| 8 | Aedes_notoscriptus | Urban | Daytime | Ross_River | Australia | 3.5 | 27 | Australia |
| 9 | Aedes_vexans | Urban | Daytime | Encephalitis | Europe | 3.6 | 17 | Europe |
| 10 | Culex_nigripalpus | Wetlands | Nighttime | Encephalitis | NorthAmerica | 4.1 | 18 | North_America |
| 11 | Culex_pipiens | Urban | Nighttime | West_Nile | NorthAmerica | 3.6 | 18 | North_America |
| 12 | Psorophora_ciliata | Forest | Daytime | Encephalitis | SouthAmerica | 5.0 | 25 | South_America |
| 13 | Aedes_scapularis | Forest | Daytime | Dengue | SouthAmerica | 3.8 | 19 | South_America |
| 14 | Aedes_aegypti | Urban | Daytime | Dengue_AIDS | Tropics | 3.5 | 30 | South_America |
| 15 | Aedes_albopictus | Urban | Daytime | Zika | Tropics | 4.1 | 28 | South_America |
| 16 | Culex_tarsalis | Marsh | Nighttime | West_Nile | USA | 4.0 | 22 | North_America |
| 17 | Ochlerotatus_triseri | Urban | Daytime | La_Crosse_USA | USA | 3.7 | 21 | North_America |
10.PROC GCHART (Optional): Simple Bar Chart
proc gchart data=mosquitoes;
vbar Habitat / type=freq;
title "Bar Chart of Mosquito Habitats";
run;
quit;
Log:
NOTE: 13334 bytes written to C:\Users\Lenovo\AppData\Local\Temp\SAS Temporary
Files\_TD13944_DESKTOP-QFAA4KV_\gchart.png.
NOTE: There were 17 observations read from the data set WORK.MOSQUITOES.
NOTE: PROCEDURE GCHART used (Total process time):
real time 2.59 seconds
cpu time 0.56 seconds
11. Advanced Macro: Lifespan Statistics for Any Disease
%macro lifespan_stats(disease);
proc sql;
select mean(Lifespan_days) as Avg_Life format=4.1,
min(Lifespan_days) as Min_Life,
max(Lifespan_days) as Max_Life
from mosquitoes
where Disease = "&disease";
quit;
%mend;
%lifespan_stats(Malaria);
Output:
| Avg_Life | Min_Life | Max_Life |
|---|---|---|
| 13.5 | 12 | 15 |
%lifespan_stats(Dengue_AIDS);
Output:
| Avg_Life | Min_Life | Max_Life |
|---|---|---|
| 30.0 | 30 | 30 |
12. Export Data (for Reporting)
proc export data=mosquito_with_continent
outfile="mosquito_dataset.csv"
dbms=csv
replace;
run;
Comments
Post a Comment