WATER SOURCES DATA ANALYSIS AND ENVIRONMENTAL RISK EVALUATION USING SAS DATA STEP | PROC CONTENTS | PROC PRINT | PROC SQL | PROC MEANS | PROC FREQ | PROC UNIVARIATE | PROC FORMAT | PROC RANK | PROC TRANSPOSE | PROC APPEND | DATA MERGE | PROC SGPLOT | MACROS | DATE FUNCTIONS (INTCK-INTNX-MDY) | SYSTEM OPTIONS
options nocenter;
1.WATER SOURCES DATA CREATION
data water_sources;
length Source_Type $20 Country $20 Usage $15;
format Monitoring_Date date9.;
input Source_Type $ Country $ Availability Usage $ Pollution_Level Monitoring_Date :date9.;
datalines;
River India 75 Agriculture 60 15JAN2022
Lake Canada 85 Drinking 20 10FEB2022
Groundwater India 65 Drinking 45 01MAR2022
Desalination UAE 90 Industrial 15 20APR2022
Rainwater Brazil 70 Agriculture 30 18MAY2022
River USA 80 Industrial 55 05JUN2022
Lake Germany 88 Drinking 18 12JUL2022
Groundwater China 60 Agriculture 70 25AUG2022
Desalination Israel 92 Drinking 10 10SEP2022
River Egypt 68 Agriculture 65 30OCT2022
Lake Japan 90 Drinking 12 15NOV2022
Groundwater Mexico 72 Industrial 50 20DEC2022
Rainwater India 78 Domestic 25 05JAN2023
;
run;
proc print data=water_sources;
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level |
|---|---|---|---|---|---|---|
| 1 | River | India | Agriculture | 15JAN2022 | 75 | 60 |
| 2 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 |
| 3 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 |
| 4 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 |
| 5 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 |
| 6 | River | USA | Industrial | 05JUN2022 | 80 | 55 |
| 7 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 |
| 8 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 |
| 9 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 |
| 10 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 |
| 11 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 |
| 12 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 |
| 13 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 |
2.BASIC DATA VALIDATION – PROC CONTENTS & PRINT
proc contents data=water_sources;
run;
OUTPUT:
The CONTENTS Procedure
| Data Set Name | WORK.WATER_SOURCES | Observations | 13 |
|---|---|---|---|
| Member Type | DATA | Variables | 6 |
| Engine | V9 | Indexes | 0 |
| Created | 01/03/2026 07:29:17 | Observation Length | 80 |
| Last Modified | 01/03/2026 07:29:17 | Deleted Observations | 0 |
| Protection | Compressed | NO | |
| Data Set Type | Sorted | NO | |
| Label | |||
| Data Representation | SOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64 | ||
| Encoding | utf-8 Unicode (UTF-8) |
| Engine/Host Dependent Information | |
|---|---|
| Data Set Page Size | 131072 |
| Number of Data Set Pages | 1 |
| First Data Page | 1 |
| Max Obs per Page | 1635 |
| Obs in First Data Page | 13 |
| Number of Data Set Repairs | 0 |
| Filename | /saswork/SAS_work2CC20001A425_odaws01-apse1-2.oda.sas.com/SAS_work63320001A425_odaws01-apse1-2.oda.sas.com/water_sources.sas7bdat |
| Release Created | 9.0401M8 |
| Host Created | Linux |
| Inode Number | 1217211 |
| Access Permission | rw-r--r-- |
| Owner Name | u63247146 |
| File Size | 256KB |
| File Size (bytes) | 262144 |
| Alphabetic List of Variables and Attributes | ||||
|---|---|---|---|---|
| # | Variable | Type | Len | Format |
| 5 | Availability | Num | 8 | |
| 2 | Country | Char | 20 | |
| 4 | Monitoring_Date | Num | 8 | DATE9. |
| 6 | Pollution_Level | Num | 8 | |
| 1 | Source_Type | Char | 20 | |
| 3 | Usage | Char | 15 | |
proc print data=water_sources(obs=5);
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level |
|---|---|---|---|---|---|---|
| 1 | River | India | Agriculture | 15JAN2022 | 75 | 60 |
| 2 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 |
| 3 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 |
| 4 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 |
| 5 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 |
3.DATE DERIVATIONS – MDY, INTCK, INTNX
data water_dates;
set water_sources;
Year = year(Monitoring_Date);
Quarter_Start = intnx('qtr', Monitoring_Date, 0, 'begin');
Months_Since_Monitoring = intck('month', Monitoring_Date, today());
format Quarter_Start date9.;
run;
proc print data=water_dates;
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring |
|---|---|---|---|---|---|---|---|---|---|
| 1 | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 |
| 2 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 |
| 3 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 |
| 4 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 |
| 5 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 |
| 6 | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 |
| 7 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 |
| 8 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 |
| 9 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 |
| 10 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 |
| 11 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 |
| 12 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 |
| 13 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 |
4.PROC SQL – DATA QUERY & AGGREGATION
proc sql;
create table pollution_summary as
select Country,
Source_Type,
avg(Pollution_Level) as Avg_Pollution format=8.2,
count(*) as Records
from water_dates
group by Country, Source_Type;
quit;
proc print data=pollution_summary;
run;
OUTPUT:
| Obs | Country | Source_Type | Avg_Pollution | Records |
|---|---|---|---|---|
| 1 | Brazil | Rainwater | 30.00 | 1 |
| 2 | Canada | Lake | 20.00 | 1 |
| 3 | China | Groundwater | 70.00 | 1 |
| 4 | Egypt | River | 65.00 | 1 |
| 5 | Germany | Lake | 18.00 | 1 |
| 6 | India | Groundwater | 45.00 | 1 |
| 7 | India | Rainwater | 25.00 | 1 |
| 8 | India | River | 60.00 | 1 |
| 9 | Israel | Desalination | 10.00 | 1 |
| 10 | Japan | Lake | 12.00 | 1 |
| 11 | Mexico | Groundwater | 50.00 | 1 |
| 12 | UAE | Desalination | 15.00 | 1 |
| 13 | USA | River | 55.00 | 1 |
5.PROC MEANS – NUMERIC SUMMARIES
proc means data=water_dates mean min max std;
class Source_Type;
var Availability Pollution_Level;
run;
OUTPUT:
The MEANS Procedure
| Source_Type | N Obs | Variable | Mean | Minimum | Maximum | Std Dev |
|---|---|---|---|---|---|---|
| Desalination | 2 | Availability Pollution_Level | 91.0000000 12.5000000 | 90.0000000 10.0000000 | 92.0000000 15.0000000 | 1.4142136 3.5355339 |
| Groundwater | 3 | Availability Pollution_Level | 65.6666667 55.0000000 | 60.0000000 45.0000000 | 72.0000000 70.0000000 | 6.0277138 13.2287566 |
| Lake | 3 | Availability Pollution_Level | 87.6666667 16.6666667 | 85.0000000 12.0000000 | 90.0000000 20.0000000 | 2.5166115 4.1633320 |
| Rainwater | 2 | Availability Pollution_Level | 74.0000000 27.5000000 | 70.0000000 25.0000000 | 78.0000000 30.0000000 | 5.6568542 3.5355339 |
| River | 3 | Availability Pollution_Level | 74.3333333 60.0000000 | 68.0000000 55.0000000 | 80.0000000 65.0000000 | 6.0277138 5.0000000 |
6.PROC FREQ – CATEGORICAL ANALYSIS
proc freq data=water_dates;
tables Source_Type*Usage / norow nocol nopercent;
run;
OUTPUT:
The FREQ Procedure
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
7.PROC UNIVARIATE – DISTRIBUTION CHECKS
proc univariate data=water_dates;
var Pollution_Level;
histogram Pollution_Level;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Pollution_Level
| Moments | |||
|---|---|---|---|
| N | 13 | Sum Weights | 13 |
| Mean | 36.5384615 | Sum Observations | 475 |
| Std Deviation | 21.6741604 | Variance | 469.769231 |
| Skewness | 0.25733598 | Kurtosis | -1.6447809 |
| Uncorrected SS | 22993 | Corrected SS | 5637.23077 |
| Coeff Variation | 59.3187549 | Std Error Mean | 6.01133052 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 36.53846 | Std Deviation | 21.67416 |
| Median | 30.00000 | Variance | 469.76923 |
| Mode | . | Range | 60.00000 |
| Interquartile Range | 37.00000 | ||
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 6.078265 | Pr > |t| | <.0001 |
| Sign | M | 6.5 | Pr >= |M| | 0.0002 |
| Signed Rank | S | 45.5 | Pr >= |S| | 0.0002 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 70 |
| 99% | 70 |
| 95% | 70 |
| 90% | 65 |
| 75% Q3 | 55 |
| 50% Median | 30 |
| 25% Q1 | 18 |
| 10% | 12 |
| 5% | 10 |
| 1% | 10 |
| 0% Min | 10 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 10 | 9 | 50 | 12 |
| 12 | 11 | 55 | 6 |
| 15 | 4 | 60 | 1 |
| 18 | 7 | 65 | 10 |
| 20 | 2 | 70 | 8 |
The UNIVARIATE Procedure
8.PROC FORMAT – CUSTOM CLASSIFICATION
proc format;
value pollution_fmt
low - 30 = 'Low'
31 - 60 = 'Moderate'
61 - high = 'High';
run;
LOG:
data water_formatted;
set water_dates;
Pollution_Category = put(Pollution_Level, pollution_fmt.);
run;
proc print data=water_formatted;
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate |
| 2 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low |
| 3 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate |
| 4 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low |
| 5 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low |
| 6 | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate |
| 7 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low |
| 8 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High |
| 9 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low |
| 10 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High |
| 11 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low |
| 12 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate |
| 13 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low |
9.PROC RANK – POLLUTION RANKING
proc rank data=water_formatted out=water_ranked descending;
var Pollution_Level;
ranks Pollution_Rank;
run;
proc print data=water_ranked;
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category | Pollution_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 2 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 3 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 4 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 5 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 6 | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 7 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 8 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 9 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 10 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 11 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 12 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 13 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
10.MACRO – AUTOMATED CLASSIFICATION
%macro pollution_flag;
data water_flagged;
Length Risk_Flag $10.;
set water_ranked;
if Pollution_Level >= 60 then Risk_Flag = 'HIGH';
else if Pollution_Level >= 30 then Risk_Flag = 'MEDIUM';
else Risk_Flag = 'LOW';
run;
run;
proc print data=water_flagged;
run;
%mend;
%pollution_flag;
OUTPUT:
| Obs | Risk_Flag | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category | Pollution_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 2 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 3 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 4 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 5 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 6 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 7 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 8 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 9 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 10 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 11 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 12 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 13 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
11.PROC TRANSPOSE – STRUCTURAL TRANSFORMATION
proc transpose data=water_flagged out=water_transposed;
by Country NotSorted;
var Availability Pollution_Level;
run;
proc print data=water_transposed;
run;
OUTPUT:
| Obs | Country | _NAME_ | COL1 |
|---|---|---|---|
| 1 | India | Availability | 75 |
| 2 | India | Pollution_Level | 60 |
| 3 | Canada | Availability | 85 |
| 4 | Canada | Pollution_Level | 20 |
| 5 | India | Availability | 65 |
| 6 | India | Pollution_Level | 45 |
| 7 | UAE | Availability | 90 |
| 8 | UAE | Pollution_Level | 15 |
| 9 | Brazil | Availability | 70 |
| 10 | Brazil | Pollution_Level | 30 |
| 11 | USA | Availability | 80 |
| 12 | USA | Pollution_Level | 55 |
| 13 | Germany | Availability | 88 |
| 14 | Germany | Pollution_Level | 18 |
| 15 | China | Availability | 60 |
| 16 | China | Pollution_Level | 70 |
| 17 | Israel | Availability | 92 |
| 18 | Israel | Pollution_Level | 10 |
| 19 | Egypt | Availability | 68 |
| 20 | Egypt | Pollution_Level | 65 |
| 21 | Japan | Availability | 90 |
| 22 | Japan | Pollution_Level | 12 |
| 23 | Mexico | Availability | 72 |
| 24 | Mexico | Pollution_Level | 50 |
| 25 | India | Availability | 78 |
| 26 | India | Pollution_Level | 25 |
12.PROC APPEND – DATASET COMBINATION
proc append base=water_flagged
data=water_flagged force;
run;
proc print data=water_flagged;
run;
OUTPUT:
| Obs | Risk_Flag | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category | Pollution_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 2 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 3 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 4 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 5 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 6 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 7 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 8 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 9 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 10 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 11 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 12 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 13 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
| 14 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 15 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 16 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 17 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 18 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 19 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 20 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 21 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 22 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 23 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 24 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 25 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 26 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
13.PROC MERGE – DATA INTEGRATION
proc sort data=water_sources out=water_sources ;
by Source_Type Country;
run;
proc print data=water_sources;
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level |
|---|---|---|---|---|---|---|
| 1 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 |
| 2 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 |
| 3 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 |
| 4 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 |
| 5 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 |
| 6 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 |
| 7 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 |
| 8 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 |
| 9 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 |
| 10 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 |
| 11 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 |
| 12 | River | India | Agriculture | 15JAN2022 | 75 | 60 |
| 13 | River | USA | Industrial | 05JUN2022 | 80 | 55 |
proc sort data=water_flagged out=water_flagged ;
by Source_Type Country;
run;
proc print data=water_flagged;
run;
OUTPUT:
| Obs | Risk_Flag | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category | Pollution_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 2 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 3 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 4 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 5 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 6 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 7 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 8 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 9 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 10 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 11 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 12 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 13 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 14 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 15 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 16 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 17 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 18 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 19 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
| 20 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
| 21 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 22 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 23 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 24 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 25 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 26 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
data water_merged;
merge water_sources(in=a)
water_flagged(in=b);
by Source_Type Country;
if a and b;
run;
proc print data=water_merged;
run;
OUTPUT:
| Obs | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Risk_Flag | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category | Pollution_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | LOW | 2022 | 01JUL2022 | 40 | Low | 13 |
| 2 | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | LOW | 2022 | 01JUL2022 | 40 | Low | 13 |
| 3 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | LOW | 2022 | 01APR2022 | 45 | Low | 11 |
| 4 | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | LOW | 2022 | 01APR2022 | 45 | Low | 11 |
| 5 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | HIGH | 2022 | 01JUL2022 | 41 | High | 1 |
| 6 | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | HIGH | 2022 | 01JUL2022 | 41 | High | 1 |
| 7 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | MEDIUM | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 8 | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | MEDIUM | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 9 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | MEDIUM | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 10 | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | MEDIUM | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 11 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | LOW | 2022 | 01JAN2022 | 47 | Low | 9 |
| 12 | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | LOW | 2022 | 01JAN2022 | 47 | Low | 9 |
| 13 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | LOW | 2022 | 01JUL2022 | 42 | Low | 10 |
| 14 | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | LOW | 2022 | 01JUL2022 | 42 | Low | 10 |
| 15 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | LOW | 2022 | 01OCT2022 | 38 | Low | 12 |
| 16 | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | LOW | 2022 | 01OCT2022 | 38 | Low | 12 |
| 17 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | MEDIUM | 2022 | 01APR2022 | 44 | Low | 7 |
| 18 | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | MEDIUM | 2022 | 01APR2022 | 44 | Low | 7 |
| 19 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | LOW | 2023 | 01JAN2023 | 36 | Low | 8 |
| 20 | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | LOW | 2023 | 01JAN2023 | 36 | Low | 8 |
| 21 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | HIGH | 2022 | 01OCT2022 | 39 | High | 2 |
| 22 | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | HIGH | 2022 | 01OCT2022 | 39 | High | 2 |
| 23 | River | India | Agriculture | 15JAN2022 | 75 | 60 | HIGH | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 24 | River | India | Agriculture | 15JAN2022 | 75 | 60 | HIGH | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 25 | River | USA | Industrial | 05JUN2022 | 80 | 55 | MEDIUM | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 26 | River | USA | Industrial | 05JUN2022 | 80 | 55 | MEDIUM | 2022 | 01APR2022 | 43 | Moderate | 4 |
14.PROC SGPLOT – VISUALIZATION
proc sgplot data=water_flagged;
vbar Source_Type / response=Pollution_Level stat=mean;
run;
OUTPUT:
15.SYSTEM OPTIONS & LABELS
options nodate nonumber;
data water_labeled;
set water_flagged;
label Pollution_Level = "Pollution Index Score"
Availability = "Water Availability Percentage";
run;
proc print data=water_labeled;
run;
OUTPUT:
| Obs | Risk_Flag | Source_Type | Country | Usage | Monitoring_Date | Availability | Pollution_Level | Year | Quarter_Start | Months_Since_Monitoring | Pollution_Category | Pollution_Rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 2 | LOW | Desalination | Israel | Drinking | 10SEP2022 | 92 | 10 | 2022 | 01JUL2022 | 40 | Low | 13 |
| 3 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 4 | LOW | Desalination | UAE | Industrial | 20APR2022 | 90 | 15 | 2022 | 01APR2022 | 45 | Low | 11 |
| 5 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 6 | HIGH | Groundwater | China | Agriculture | 25AUG2022 | 60 | 70 | 2022 | 01JUL2022 | 41 | High | 1 |
| 7 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 8 | MEDIUM | Groundwater | India | Drinking | 01MAR2022 | 65 | 45 | 2022 | 01JAN2022 | 46 | Moderate | 6 |
| 9 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 10 | MEDIUM | Groundwater | Mexico | Industrial | 20DEC2022 | 72 | 50 | 2022 | 01OCT2022 | 37 | Moderate | 5 |
| 11 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 12 | LOW | Lake | Canada | Drinking | 10FEB2022 | 85 | 20 | 2022 | 01JAN2022 | 47 | Low | 9 |
| 13 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 14 | LOW | Lake | Germany | Drinking | 12JUL2022 | 88 | 18 | 2022 | 01JUL2022 | 42 | Low | 10 |
| 15 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 16 | LOW | Lake | Japan | Drinking | 15NOV2022 | 90 | 12 | 2022 | 01OCT2022 | 38 | Low | 12 |
| 17 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 18 | MEDIUM | Rainwater | Brazil | Agriculture | 18MAY2022 | 70 | 30 | 2022 | 01APR2022 | 44 | Low | 7 |
| 19 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
| 20 | LOW | Rainwater | India | Domestic | 05JAN2023 | 78 | 25 | 2023 | 01JAN2023 | 36 | Low | 8 |
| 21 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 22 | HIGH | River | Egypt | Agriculture | 30OCT2022 | 68 | 65 | 2022 | 01OCT2022 | 39 | High | 2 |
| 23 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 24 | HIGH | River | India | Agriculture | 15JAN2022 | 75 | 60 | 2022 | 01JAN2022 | 48 | Moderate | 3 |
| 25 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
| 26 | MEDIUM | River | USA | Industrial | 05JUN2022 | 80 | 55 | 2022 | 01APR2022 | 43 | Moderate | 4 |
No comments:
Post a Comment