389.Is Your City’s Water Really Safe?A Complete SAS Monitoring System
Is Your City’s Water Really Safe?A Complete SAS Monitoring System
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | APPEND | PROC SORT | MERGE | PROC DATASETS DELETE | DATA FUNCTIONS
INTRODUCTION
Water treatment plants play a critical role in maintaining public health and environmental safety. Every day, millions of liters of water are processed, treated, filtered, and supplied to households, industries, and hospitals.
In modern organizations, this process is no longer managed only by engineers and technicians. It is heavily supported by data analytics systems that monitor plant performance, contamination levels, operational efficiency, energy consumption, and regulatory compliance.
The goal of the project is to show how a data analyst or SAS programmer can:
· Build a complete dataset from scratch
· Apply business rules and transformations
· Perform statistical analysis
· Detect anomalies or fraud
· Generate meaningful insights for decision-makers
The project is designed in a way that is easy to understand, interview-friendly, and aligned with how real companies use SAS in environmental and infrastructure analytics.
TABLE OF CONTENTS
1. Business Background
2. Dataset Design and Variables
3. Data Creation Using DATA Step
4. Data Cleaning and Standardization
5. Date Intelligence and Time Calculations
6. SQL-Based Business Queries
7. Descriptive Statistics (PROC MEANS)
8. Distribution Analysis (PROC UNIVARIATE)
9. Frequency Analysis (PROC FREQ)
10. Correlation Study (PROC CORR)
11. Visualization Using PROC SGPLOT
12. Utilization Classification Macro
13. Fraud / Anomaly Detection Macro
14. Data Engineering Operations
15. Dataset Recreation and Appending
16. Real-World Business Benefits
17. Conclusion
BUSINESS BACKGROUND
Government agencies, municipal corporations, and private utility companies must continuously track the operational health of water treatment plants.
Some of the common business questions they face are:
· Which plants are producing the highest volume of water?
· Are contamination levels within safe limits?
· Which plants consume excessive energy?
· Are there any plants showing suspicious performance data?
· Are regulatory compliance scores improving or declining?
This project models exactly these questions using SAS.
DATASET DESIGN
The dataset represents daily operational metrics of water treatment plants.
Variable | Business Meaning |
Plant_Name | Name of water plant |
City | City location |
State | State location |
Profit | Financial performance |
Capacity_MLD | Water capacity in million liters |
Contaminant_Level | Pollution index |
Treatment_Efficiency | Effectiveness of purification |
Energy_Usage | Power consumption |
Compliance_Score | Regulatory rating |
Report_Date | Monitoring date |
These variables are commonly found in utility dashboards and compliance systems.
1. Business Context
Water treatment plants are responsible for converting raw water into safe drinking water. Governments and private companies continuously monitor:
- How much water is produced?
- How efficient is treatment?
- Is contamination within limits?
- Are energy costs rising?
- Is any plant showing suspicious or abnormal data?
This project simulates how a Data Analyst / SAS Programmer would build a monitoring system for multiple water plants across India.
This type of analysis is used in:
- Smart Cities
- Pollution Control Boards
- Municipal Corporations
- Environmental Compliance Audits
2. Dataset Creation
data water_plants;
input Plant_Name:$18. City:$12. State:$12. Profit Capacity_MLD Contaminant_Level
Treatment_Efficiency Energy_Usage Compliance_Score Report_Date :date9.;
format Report_Date date9.;
datalines;
KrishnaPlant Hyderabad Telangana 500000 120 3.2 92 450 95 01JAN2025
GodavariUnit Warangal Telangana 420000 100 4.5 89 430 90 05JAN2025
CauveryHub Chennai TamilNadu 610000 150 2.8 94 480 97 10JAN2025
GangaStation Patna Bihar 300000 90 6.2 85 410 82 12JAN2025
YamunaWorks Delhi Delhi 700000 180 3.5 91 520 96 15JAN2025
NarmadaPlant Bhopal MP 380000 95 5.0 87 400 88 18JAN2025
Sabarmati Ahmedabad Gujarat 450000 110 4.1 90 435 92 20JAN2025
MahanadiUnit Cuttack Odisha 290000 85 6.5 83 390 80 22JAN2025
PeriyarWorks Kochi Kerala 340000 100 3.8 93 420 95 25JAN2025
Brahmaputra Assam Assam 310000 88 5.9 86 405 85 28JAN2025
Tungabhadra Bellary Karnataka 360000 105 4.4 89 415 90 01FEB2025
BhimaPlant Pune Maharashtra 520000 140 3.1 94 470 96 05FEB2025
KaveriSouth Trichy TamilNadu 430000 115 4.7 88 440 91 10FEB2025
Hooghly Kolkata WestBengal 480000 130 3.9 92 460 94 15FEB2025
LuniWorks Jodhpur Rajasthan 260000 75 7.2 80 370 78 18FEB2025
ChambalUnit Kota Rajasthan 300000 90 6.1 85 395 84 22FEB2025
TeestaPlant Siliguri WestBengal 310000 85 5.5 87 400 86 25FEB2025
IndusHub Amritsar Punjab 390000 100 4.0 90 420 92 01MAR2025
SutlejUnit Ludhiana Punjab 410000 110 3.6 91 430 94 05MAR2025
PennarWorks Nellore AP 350000 95 4.9 88 415 90 10MAR2025
;
run;
proc print data=water_plants;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | KrishnaPlant | Hyderabad | Telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 |
| 2 | GodavariUnit | Warangal | Telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 |
| 3 | CauveryHub | Chennai | TamilNadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 |
| 4 | GangaStation | Patna | Bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 |
| 5 | YamunaWorks | Delhi | Delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 |
| 6 | NarmadaPlant | Bhopal | MP | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 |
| 7 | Sabarmati | Ahmedabad | Gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 |
| 8 | MahanadiUnit | Cuttack | Odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 |
| 9 | PeriyarWorks | Kochi | Kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 |
| 10 | Brahmaputra | Assam | Assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 |
| 11 | Tungabhadra | Bellary | Karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 |
| 12 | BhimaPlant | Pune | Maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 |
| 13 | KaveriSouth | Trichy | TamilNadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 |
| 14 | Hooghly | Kolkata | WestBengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 |
| 15 | LuniWorks | Jodhpur | Rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 |
| 16 | ChambalUnit | Kota | Rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 |
| 17 | TeestaPlant | Siliguri | WestBengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 |
| 18 | IndusHub | Amritsar | Punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 |
| 19 | SutlejUnit | Ludhiana | Punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 |
| 20 | PennarWorks | Nellore | AP | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 |
The DATA step is used to simulate raw operational data.
· Manual data simulation
· Date formatting
· Mixed numeric and character variables
3. Character Cleaning
data water_clean;
set water_plants;
Plant_Name = propcase(strip(Plant_Name));
City = upcase(strip(City));
State = lowcase(strip(State));
run;
proc print data=water_clean;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 |
| 2 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 |
| 4 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 |
| 5 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 |
| 6 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 |
| 7 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 |
| 8 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 |
| 9 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 |
| 10 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 |
| 11 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 |
| 12 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 |
| 13 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 |
| 14 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 |
| 15 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 |
| 16 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 |
| 17 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 |
| 18 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 |
| 19 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 |
| 20 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 |
Functions like:
· STRIP
· TRIM
· PROPCASE
· UPCASE
· LOWCASE
are used to standardize text values.
· Avoid duplicate records due to spelling differences
· Improve grouping accuracy in reports
· Ensure clean joins and merges
4. Date Calculations
data water_dates;
set water_clean;
Days_From_Start = intck('day','01JAN2025'd, Report_Date);
Next_Audit = intnx('month', Report_Date, 3);
run;
proc print data=water_dates;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 |
| 2 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 |
| 4 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 |
| 5 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 |
| 6 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 |
| 7 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 |
| 8 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 |
| 9 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 |
| 10 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 |
| 11 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 |
| 12 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 |
| 13 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 |
| 14 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 |
| 15 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 |
| 16 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 |
| 17 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 |
| 18 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 |
| 19 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 |
| 20 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 |
Functions used:
· MDY
· INTCK
· INTNX
These allow:
· Time-based comparisons
· Audit scheduling
· Trend analysis
This is critical in real systems where time is the main dimension.
5. PROC SQL – Business Queries
proc sql;
select State,
avg(Treatment_Efficiency) as Avg_Efficiency,
sum(Profit) as Total_Profit
from water_dates
group by State;
quit;
OUTPUT:
| State | Avg_Efficiency | Total_Profit |
|---|---|---|
| ap | 88 | 350000 |
| assam | 86 | 310000 |
| bihar | 85 | 300000 |
| delhi | 91 | 700000 |
| gujarat | 90 | 450000 |
| karnataka | 89 | 360000 |
| kerala | 93 | 340000 |
| maharashtra | 94 | 520000 |
| mp | 87 | 380000 |
| odisha | 83 | 290000 |
| punjab | 90.5 | 800000 |
| rajasthan | 82.5 | 560000 |
| tamilnadu | 91 | 1040000 |
| telangana | 90.5 | 920000 |
| westbengal | 89.5 | 790000 |
Used for:
· Aggregation
· Grouping
· Summarization
· Average efficiency by state
· Total profit by region
This shows how SAS integrates SQL-style thinking into analytics.
6. PROC MEANS
proc means data=water_dates mean min max;
var Capacity_MLD Contaminant_Level Treatment_Efficiency Energy_Usage;
run;
OUTPUT:
The MEANS Procedure
| Variable | Mean | Minimum | Maximum |
|---|---|---|---|
Capacity_MLD Contaminant_Level Treatment_Efficiency Energy_Usage | 108.1500000 4.6450000 88.7000000 427.7500000 | 75.0000000 2.8000000 80.0000000 370.0000000 | 180.0000000 7.2000000 94.0000000 520.0000000 |
Used to compute:
· Mean
· Minimum
· Maximum
This gives a quick health check of the system.
Managers use this to:
· Identify extreme values
· Spot underperforming plants
7. PROC UNIVARIATE
proc univariate data=water_dates;
var Contaminant_Level;
histogram;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Contaminant_Level
| Moments | |||
|---|---|---|---|
| N | 20 | Sum Weights | 20 |
| Mean | 4.645 | Sum Observations | 92.9 |
| Std Deviation | 1.24286384 | Variance | 1.54471053 |
| Skewness | 0.47131104 | Kurtosis | -0.7097626 |
| Uncorrected SS | 460.87 | Corrected SS | 29.3495 |
| Coeff Variation | 26.7570256 | Std Error Mean | 0.2779128 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 4.645000 | Std Deviation | 1.24286 |
| Median | 4.450000 | Variance | 1.54471 |
| Mode | . | Range | 4.40000 |
| Interquartile Range | 2.00000 | ||
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 16.71388 | Pr > |t| | <.0001 |
| Sign | M | 10 | Pr >= |M| | <.0001 |
| Signed Rank | S | 105 | Pr >= |S| | <.0001 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 7.20 |
| 99% | 7.20 |
| 95% | 6.85 |
| 90% | 6.35 |
| 75% Q3 | 5.70 |
| 50% Median | 4.45 |
| 25% Q1 | 3.70 |
| 10% | 3.15 |
| 5% | 2.95 |
| 1% | 2.80 |
| 0% Min | 2.80 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 2.8 | 3 | 5.9 | 10 |
| 3.1 | 12 | 6.1 | 16 |
| 3.2 | 1 | 6.2 | 4 |
| 3.5 | 5 | 6.5 | 8 |
| 3.6 | 19 | 7.2 | 15 |
The UNIVARIATE Procedure
Used for:
· Distribution analysis
· Outlier detection
This helps detect:
· Abnormally high contamination
· Unusual efficiency values
8. PROC CORR
proc corr data=water_dates;
var Energy_Usage Treatment_Efficiency Compliance_Score;
run;
OUTPUT:
The CORR Procedure
| 3 Variables: | Energy_Usage Treatment_Efficiency Compliance_Score |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Energy_Usage | 20 | 427.75000 | 35.11166 | 8555 | 370.00000 | 520.00000 |
| Treatment_Efficiency | 20 | 88.70000 | 3.70064 | 1774 | 80.00000 | 94.00000 |
| Compliance_Score | 20 | 89.75000 | 5.61834 | 1795 | 78.00000 | 97.00000 |
| Pearson Correlation Coefficients, N = 20 Prob > |r| under H0: Rho=0 | |||
|---|---|---|---|
| Energy_Usage | Treatment_Efficiency | Compliance_Score | |
| Energy_Usage | 1.00000 | 0.77832 <.0001 | 0.82275 <.0001 |
| Treatment_Efficiency | 0.77832 <.0001 | 1.00000 | 0.97332 <.0001 |
| Compliance_Score | 0.82275 <.0001 | 0.97332 <.0001 | 1.00000 |
Used to understand relationships such as:
· Does high energy usage reduce efficiency?
· Does contamination impact compliance?
This introduces analytical thinking beyond simple reporting.
9. PROC SGPLOT
proc sgplot data=water_dates;
scatter x=Contaminant_Level y=Treatment_Efficiency;
run;
OUTPUT:
Visualization makes:
· Patterns visible
· Outliers obvious
· Trends easy to explain to management
Graphs are essential for:
· Board presentations
· Government reviews
· Public dashboards
10. Utilization Classification Macro
%macro utilization;
data utilization;
set water_dates;
length utilization $10.;
if Capacity_MLD > 130 then Utilization = "High";
else if 90 <= Capacity_MLD <= 130 then Utilization = "Medium";
else Utilization = "Low";
run;
proc print data=utilization;
run;
%mend;
%utilization;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit | utilization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 | Medium |
| 2 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 | Medium |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 | High |
| 4 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 | Medium |
| 5 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 | High |
| 6 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 | Medium |
| 7 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 | Medium |
| 8 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 | Low |
| 9 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 | Medium |
| 10 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 | Low |
| 11 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 | Medium |
| 12 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 | High |
| 13 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 | Medium |
| 14 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 | Medium |
| 15 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 | Low |
| 16 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 | Medium |
| 17 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 | Low |
| 18 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 | Medium |
| 19 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 | Medium |
| 20 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 | Medium |
This macro classifies plants into:
· High
· Medium
· Low utilization
Business purpose:
· Capacity planning
· Resource allocation
· Infrastructure investment decisions
11. Fraud Detection Macro
%macro fraud_check;
data fraud;
set water_dates;
if Treatment_Efficiency > 95 and Contaminant_Level > 6 then Flag = "Suspicious";
else Flag = "Normal";
run;
proc print data=fraud;
run;
%mend;
%fraud_check;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit | Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 | Normal |
| 2 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 | Normal |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 | Normal |
| 4 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 | Normal |
| 5 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 | Normal |
| 6 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 | Normal |
| 7 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 | Normal |
| 8 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 | Normal |
| 9 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 | Normal |
| 10 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 | Normal |
| 11 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 | Normal |
| 12 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 | Normal |
| 13 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 | Normal |
| 14 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 | Normal |
| 15 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 | Normal |
| 16 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 | Normal |
| 17 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 | Normal |
| 18 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 | Normal |
| 19 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 | Normal |
| 20 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 | Normal |
Flags plants when:
· Efficiency is very high
· But contamination is also high
This represents:
· Data manipulation
· Sensor failure
· Reporting issues
This type of logic is used in:
· Smart cities
· Banking fraud
· Healthcare monitoring
12. PROC FREQ
proc freq data=fraud;
tables Flag;
run;
OUTPUT:
The FREQ Procedure
| Flag | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
|---|---|---|---|---|
| Normal | 20 | 100.00 | 20 | 100.00 |
The project uses:
· SET
· MERGE
· APPEND
· TRANSPOSE
· PROC DATASETS DELETE
These simulate:
· Batch processing
· Historical data storage
· Cleanup of temporary datasets
This is exactly how enterprise SAS jobs are built.
13. TRANSPOSE
proc transpose data=water_dates out=water_t;
var Profit Energy_Usage;
id Plant_Name;
run;
proc print data=water_t;
run;
OUTPUT:
| Obs | _NAME_ | Krishnaplant | Godavariunit | Cauveryhub | Gangastation | Yamunaworks | Narmadaplant | Sabarmati | Mahanadiunit | Periyarworks | Brahmaputra | Tungabhadra | Bhimaplant | Kaverisouth | Hooghly | Luniworks | Chambalunit | Teestaplant | Indushub | Sutlejunit | Pennarworks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Profit | 500000 | 420000 | 610000 | 300000 | 700000 | 380000 | 450000 | 290000 | 340000 | 310000 | 360000 | 520000 | 430000 | 480000 | 260000 | 300000 | 310000 | 390000 | 410000 | 350000 |
| 2 | Energy_Usage | 450 | 430 | 480 | 410 | 520 | 400 | 435 | 390 | 420 | 405 | 415 | 470 | 440 | 460 | 370 | 395 | 400 | 420 | 430 | 415 |
14. APPEND
data water_copy;
set water_dates;
run;
proc print data=water_copy;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 |
| 2 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 |
| 4 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 |
| 5 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 |
| 6 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 |
| 7 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 |
| 8 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 |
| 9 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 |
| 10 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 |
| 11 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 |
| 12 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 |
| 13 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 |
| 14 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 |
| 15 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 |
| 16 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 |
| 17 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 |
| 18 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 |
| 19 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 |
| 20 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 |
proc append base=water_dates
data=water_copy;
run;
proc print data=water_dates;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 |
| 2 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 |
| 4 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 |
| 5 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 |
| 6 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 |
| 7 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 |
| 8 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 |
| 9 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 |
| 10 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 |
| 11 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 |
| 12 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 |
| 13 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 |
| 14 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 |
| 15 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 |
| 16 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 |
| 17 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 |
| 18 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 |
| 19 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 |
| 20 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 |
| 21 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 |
| 22 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 |
| 23 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 |
| 24 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 |
| 25 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 |
| 26 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 |
| 27 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 |
| 28 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 |
| 29 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 |
| 30 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 |
| 31 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 |
| 32 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 |
| 33 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 |
| 34 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 |
| 35 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 |
| 36 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 |
| 37 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 |
| 38 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 |
| 39 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 |
| 40 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 |
15. MERGE
proc sort data=utilization;by Plant_Name;run;
proc print data=utilization;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit | utilization |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 | High |
| 2 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 | Low |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 | High |
| 4 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 | Medium |
| 5 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 | Medium |
| 6 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 | Medium |
| 7 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 | Medium |
| 8 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 | Medium |
| 9 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 | Medium |
| 10 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 | Medium |
| 11 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 | Low |
| 12 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 | Low |
| 13 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 | Medium |
| 14 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 | Medium |
| 15 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 | Medium |
| 16 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 | Medium |
| 17 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 | Medium |
| 18 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 | Low |
| 19 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 | Medium |
| 20 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 | High |
proc sort data=fraud;by Plant_Name;run;
proc print data=fraud;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit | Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 | Normal |
| 2 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 | Normal |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 | Normal |
| 4 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 | Normal |
| 5 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 | Normal |
| 6 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 | Normal |
| 7 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 | Normal |
| 8 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 | Normal |
| 9 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 | Normal |
| 10 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 | Normal |
| 11 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 | Normal |
| 12 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 | Normal |
| 13 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 | Normal |
| 14 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 | Normal |
| 15 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 | Normal |
| 16 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 | Normal |
| 17 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 | Normal |
| 18 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 | Normal |
| 19 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 | Normal |
| 20 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 | Normal |
data merged;
merge utilization
fraud;
by Plant_Name;
run;
proc print data=merged;
run;
OUTPUT:
| Obs | Plant_Name | City | State | Profit | Capacity_MLD | Contaminant_Level | Treatment_Efficiency | Energy_Usage | Compliance_Score | Report_Date | Days_From_Start | Next_Audit | utilization | Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Bhimaplant | PUNE | maharashtra | 520000 | 140 | 3.1 | 94 | 470 | 96 | 05FEB2025 | 35 | 23862 | High | Normal |
| 2 | Brahmaputra | ASSAM | assam | 310000 | 88 | 5.9 | 86 | 405 | 85 | 28JAN2025 | 27 | 23832 | Low | Normal |
| 3 | Cauveryhub | CHENNAI | tamilnadu | 610000 | 150 | 2.8 | 94 | 480 | 97 | 10JAN2025 | 9 | 23832 | High | Normal |
| 4 | Chambalunit | KOTA | rajasthan | 300000 | 90 | 6.1 | 85 | 395 | 84 | 22FEB2025 | 52 | 23862 | Medium | Normal |
| 5 | Gangastation | PATNA | bihar | 300000 | 90 | 6.2 | 85 | 410 | 82 | 12JAN2025 | 11 | 23832 | Medium | Normal |
| 6 | Godavariunit | WARANGAL | telangana | 420000 | 100 | 4.5 | 89 | 430 | 90 | 05JAN2025 | 4 | 23832 | Medium | Normal |
| 7 | Hooghly | KOLKATA | westbengal | 480000 | 130 | 3.9 | 92 | 460 | 94 | 15FEB2025 | 45 | 23862 | Medium | Normal |
| 8 | Indushub | AMRITSAR | punjab | 390000 | 100 | 4.0 | 90 | 420 | 92 | 01MAR2025 | 59 | 23893 | Medium | Normal |
| 9 | Kaverisouth | TRICHY | tamilnadu | 430000 | 115 | 4.7 | 88 | 440 | 91 | 10FEB2025 | 40 | 23862 | Medium | Normal |
| 10 | Krishnaplant | HYDERABAD | telangana | 500000 | 120 | 3.2 | 92 | 450 | 95 | 01JAN2025 | 0 | 23832 | Medium | Normal |
| 11 | Luniworks | JODHPUR | rajasthan | 260000 | 75 | 7.2 | 80 | 370 | 78 | 18FEB2025 | 48 | 23862 | Low | Normal |
| 12 | Mahanadiunit | CUTTACK | odisha | 290000 | 85 | 6.5 | 83 | 390 | 80 | 22JAN2025 | 21 | 23832 | Low | Normal |
| 13 | Narmadaplant | BHOPAL | mp | 380000 | 95 | 5.0 | 87 | 400 | 88 | 18JAN2025 | 17 | 23832 | Medium | Normal |
| 14 | Pennarworks | NELLORE | ap | 350000 | 95 | 4.9 | 88 | 415 | 90 | 10MAR2025 | 68 | 23893 | Medium | Normal |
| 15 | Periyarworks | KOCHI | kerala | 340000 | 100 | 3.8 | 93 | 420 | 95 | 25JAN2025 | 24 | 23832 | Medium | Normal |
| 16 | Sabarmati | AHMEDABAD | gujarat | 450000 | 110 | 4.1 | 90 | 435 | 92 | 20JAN2025 | 19 | 23832 | Medium | Normal |
| 17 | Sutlejunit | LUDHIANA | punjab | 410000 | 110 | 3.6 | 91 | 430 | 94 | 05MAR2025 | 63 | 23893 | Medium | Normal |
| 18 | Teestaplant | SILIGURI | westbengal | 310000 | 85 | 5.5 | 87 | 400 | 86 | 25FEB2025 | 55 | 23862 | Low | Normal |
| 19 | Tungabhadra | BELLARY | karnataka | 360000 | 105 | 4.4 | 89 | 415 | 90 | 01FEB2025 | 31 | 23862 | Medium | Normal |
| 20 | Yamunaworks | DELHI | delhi | 700000 | 180 | 3.5 | 91 | 520 | 96 | 15JAN2025 | 14 | 23832 | High | Normal |
16. PROC DATASETS DELETE
proc datasets library=work;
delete water_t;
quit;
LOG:
This Project Demonstrates
This single project demonstrates:
Skill | Covered | |
Real Business Problem | YES | |
Data Creation | YES | |
SQL Analytics | YES | |
Statistical Profiling | YES | |
Correlation | YES | |
Visualization | YES | |
Macros | YES | |
Fraud Logic | YES | |
Date Intelligence | YES | |
Data Engineering | YES | |
Cleanup | YES | |
Real-Life Mapping
Variable | Real Meaning |
Capacity_MLD | Water output capacity |
Contaminant_Level | Pollution index |
Treatment_Efficiency | Quality of filtration |
Energy_Usage | Cost driver |
Compliance_Score | Government rating |
Fraud Flag | Possible data manipulation |
REAL-WORLD BUSINESS BENEFITS
If implemented in real life, this system would help:
· Improve water quality
· Reduce energy costs
· Detect operational fraud
· Ensure regulatory compliance
· Support smart city initiatives
It directly supports:
· Environmental sustainability
· Public health protection
· Government transparency
Why Companies Love This Kind of Project
Because it shows:
· You understand business, not just syntax
· You know how to build systems
· You think like a consultant
· You can explain things simply
· You can design end-to-end pipelines
CONCLUSION
This SAS project on Water Treatment Plants Analytics shows how data can be transformed into meaningful business insights.
Instead of focusing only on syntax, the project focuses on:
· Real business problems
· Real analytical workflows
· Real enterprise logic
It demonstrates that a SAS programmer is not just someone who writes code, but someone who:
· Understands the domain
· Designs analytical systems
· Supports business decisions
INTERVIEW QUESTIONS FOR YOU
· What is PROC FORMAT ?
· What is PROC REPORT ?
· What is PROC SGPLOT ?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent Water Plant data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
· EV and energy industry professionals
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Follow Us On :
To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:
1.Which Country Truly Dominates the Olympics? – A Complete SAS Medal Efficiency Analytics Project
2.Which Airports Are Really the Busiest? – An End-to-End SAS Airport Traffic Analytics Project
3.Can Data Predict Election Outcomes? – A Complete SAS Voting Analytics Project
Comments
Post a Comment