402.Can SAS Identify the Most Efficient Waste Collection Routes in a City?
Can SAS Identify the Most Efficient Waste Collection Routes in a City?
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | SET | PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS
INTRODUCTION
Urban waste collection is one of the most
critical municipal services. Cities spend millions of rupees every year on fuel, manpower,
vehicles, and maintenance for waste collection.
However, inefficient routing, fuel misuse, time delays, and fraud
(fake fuel claims, inflated tonnage reporting) can lead to huge losses.
This project demonstrates how a SAS Programmer / Data Analyst can:
·
Build a clean
analytical dataset
·
Measure route
efficiency
·
Detect operational
red flags (fraud-like patterns)
·
Analyze fuel
usage, time, distance, and waste
·
Generate statistical
and visual insights
TABLE OF CONTENTS
1.
Business Context
2.
Dataset Design
3.
Raw Data Creation
4.
Date Handling (MDY, INTCK, INTNX)
5.
Data Cleaning & Standardization
6.
Derived Metrics & Efficiency Score
7.
Macro: Utilization Classification
8.
Macro: Fraud Detection Logic
9.
PROC SQL Analytics
10. PROC
FREQ
11. PROC
MEANS
12. PROC
UNIVARIATE
13. PROC
CORR
14. PROC
SGPLOT
15. TRANSPOSE, SET
16. Character
& Numeric Functions
17. PROC
DATASETS DELETE
18. 5 Key Points About This Project
19.
20. Conclusion
1. BUSINESS CONTEXT
Municipal corporations want answers to
questions like:
·
Which routes are efficient or inefficient?
·
Are drivers over-reporting fuel usage?
·
Is waste collected proportional to distance?
·
Which cities perform better?
·
How does time affect fuel and waste volume?
This SAS project answers all of these.
2. DATASET DESIGN
Variables Used
|
Variable |
Description |
|
Route_ID |
Unique route identifier |
|
City |
City name |
|
Route_Date |
Collection date |
|
Distance_km |
Distance covered |
|
Waste_Collected_Tons |
Waste collected |
|
Fuel_Used_Liters |
Fuel used |
|
Time_Taken_Hours |
Time taken |
|
Vehicle_Type |
Truck type |
|
Driver_Name |
Driver |
|
Efficiency_Score |
Calculated % |
|
Utilization_Class |
High/Medium/Low |
|
Fraud_Flag |
Y / N |
3. RAW DATA CREATION (DATA STEP)
data waste_routes_raw;
input Route_ID $ City:$12. Route_Date : date9. Distance_km Waste_Collected_Tons
Fuel_Used_Liters Time_Taken_Hours Vehicle_Type:$14. Driver_Name $;
format Route_Date date9.;
datalines;
R001 Hyderabad 01JAN2026 25 12.5 18 4 Compactor Ravi
R002 Hyderabad 02JAN2026 30 14.0 22 5 Tipper Kumar
R003 Chennai 01JAN2026 20 10.2 15 3 Compactor Arjun
R004 Chennai 03JAN2026 28 16.8 26 6 Tipper Suresh
R005 Mumbai 01JAN2026 35 20.0 30 7 Compactor Ramesh
R006 Mumbai 02JAN2026 40 21.5 36 8 Tipper Mahesh
R007 Pune 01JAN2026 18 8.0 14 3 MiniTruck Ajay
R008 Pune 02JAN2026 22 9.5 16 4 MiniTruck Vijay
R009 Delhi 01JAN2026 45 24.0 40 9 Compactor Deepak
R010 Delhi 03JAN2026 50 26.5 45 10 Tipper Aman
R011 Jaipur 01JAN2026 20 11.0 17 4 MiniTruck Rohit
R012 Jaipur 02JAN2026 24 12.2 18 5 MiniTruck Mohan
R013 Kochi 01JAN2026 15 7.8 10 3 Compactor Sajan
R014 Kochi 02JAN2026 17 8.5 12 3 Compactor Anil
R015 Indore 03JAN2026 26 13.5 21 5 Tipper Sanjay
;
run;
proc print data=waste_routes_raw;
run;
OUTPUT:
| Obs | Route_ID | City | Route_Date | Distance_km | Waste_Collected_Tons | Fuel_Used_Liters | Time_Taken_Hours | Vehicle_Type | Driver_Name |
|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | 01JAN2026 | 25 | 12.5 | 18 | 4 | Compactor | Ravi |
| 2 | R002 | Hyderabad | 02JAN2026 | 30 | 14.0 | 22 | 5 | Tipper | Kumar |
| 3 | R003 | Chennai | 01JAN2026 | 20 | 10.2 | 15 | 3 | Compactor | Arjun |
| 4 | R004 | Chennai | 03JAN2026 | 28 | 16.8 | 26 | 6 | Tipper | Suresh |
| 5 | R005 | Mumbai | 01JAN2026 | 35 | 20.0 | 30 | 7 | Compactor | Ramesh |
| 6 | R006 | Mumbai | 02JAN2026 | 40 | 21.5 | 36 | 8 | Tipper | Mahesh |
| 7 | R007 | Pune | 01JAN2026 | 18 | 8.0 | 14 | 3 | MiniTruck | Ajay |
| 8 | R008 | Pune | 02JAN2026 | 22 | 9.5 | 16 | 4 | MiniTruck | Vijay |
| 9 | R009 | Delhi | 01JAN2026 | 45 | 24.0 | 40 | 9 | Compactor | Deepak |
| 10 | R010 | Delhi | 03JAN2026 | 50 | 26.5 | 45 | 10 | Tipper | Aman |
| 11 | R011 | Jaipur | 01JAN2026 | 20 | 11.0 | 17 | 4 | MiniTruck | Rohit |
| 12 | R012 | Jaipur | 02JAN2026 | 24 | 12.2 | 18 | 5 | MiniTruck | Mohan |
| 13 | R013 | Kochi | 01JAN2026 | 15 | 7.8 | 10 | 3 | Compactor | Sajan |
| 14 | R014 | Kochi | 02JAN2026 | 17 | 8.5 | 12 | 3 | Compactor | Anil |
| 15 | R015 | Indore | 03JAN2026 | 26 | 13.5 | 21 | 5 | Tipper | Sanjay |
·
Same as real municipal data
·
Includes dates, numbers, and character variables
4. DATE FUNCTIONS (MDY, INTCK, INTNX)
data waste_routes_dates;
set waste_routes_raw;
Month_Start = intnx('month', Route_Date, 0, 'b');
Days_From_Start = intck('day', '01JAN2026'd, Route_Date);
format Month_Start date9.;
drop Distance_km Waste_Collected_Tons Fuel_Used_Liters Time_Taken_Hours Vehicle_Type;
run;
proc print data=waste_routes_dates;
run;
OUTPUT:
| Obs | Route_ID | City | Route_Date | Driver_Name | Month_Start | Days_From_Start |
|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | 01JAN2026 | Ravi | 01JAN2026 | 0 |
| 2 | R002 | Hyderabad | 02JAN2026 | Kumar | 01JAN2026 | 1 |
| 3 | R003 | Chennai | 01JAN2026 | Arjun | 01JAN2026 | 0 |
| 4 | R004 | Chennai | 03JAN2026 | Suresh | 01JAN2026 | 2 |
| 5 | R005 | Mumbai | 01JAN2026 | Ramesh | 01JAN2026 | 0 |
| 6 | R006 | Mumbai | 02JAN2026 | Mahesh | 01JAN2026 | 1 |
| 7 | R007 | Pune | 01JAN2026 | Ajay | 01JAN2026 | 0 |
| 8 | R008 | Pune | 02JAN2026 | Vijay | 01JAN2026 | 1 |
| 9 | R009 | Delhi | 01JAN2026 | Deepak | 01JAN2026 | 0 |
| 10 | R010 | Delhi | 03JAN2026 | Aman | 01JAN2026 | 2 |
| 11 | R011 | Jaipur | 01JAN2026 | Rohit | 01JAN2026 | 0 |
| 12 | R012 | Jaipur | 02JAN2026 | Mohan | 01JAN2026 | 1 |
| 13 | R013 | Kochi | 01JAN2026 | Sajan | 01JAN2026 | 0 |
| 14 | R014 | Kochi | 02JAN2026 | Anil | 01JAN2026 | 1 |
| 15 | R015 | Indore | 03JAN2026 | Sanjay | 01JAN2026 | 2 |
Why this is used
·
INTNX:
aligns reporting to month start
·
INTCK:
measures time gaps
·
Used heavily in operational reporting
5. DATA CLEANING & CHARACTER FUNCTIONS
data waste_routes_clean;
set waste_routes_raw;
City = propcase(strip(City));
Driver_Name = upcase(trim(Driver_Name));
Vehicle_Type = lowcase(Vehicle_Type);
Route_Key = catx('_', City, Route_ID);
keep Route_ID City Vehicle_Type Driver_Name;
run;
proc print data=waste_routes_clean;
run;
OUTPUT:
| Obs | Route_ID | City | Vehicle_Type | Driver_Name |
|---|---|---|---|---|
| 1 | R001 | Hyderabad | compactor | RAVI |
| 2 | R002 | Hyderabad | tipper | KUMAR |
| 3 | R003 | Chennai | compactor | ARJUN |
| 4 | R004 | Chennai | tipper | SURESH |
| 5 | R005 | Mumbai | compactor | RAMESH |
| 6 | R006 | Mumbai | tipper | MAHESH |
| 7 | R007 | Pune | minitruck | AJAY |
| 8 | R008 | Pune | minitruck | VIJAY |
| 9 | R009 | Delhi | compactor | DEEPAK |
| 10 | R010 | Delhi | tipper | AMAN |
| 11 | R011 | Jaipur | minitruck | ROHIT |
| 12 | R012 | Jaipur | minitruck | MOHAN |
| 13 | R013 | Kochi | compactor | SAJAN |
| 14 | R014 | Kochi | compactor | ANIL |
| 15 | R015 | Indore | tipper | SANJAY |
Functions explained
·
STRIP / TRIM → remove spaces
·
PROPCASE → city formatting
·
UPCASE / LOWCASE
·
CATX → create composite keys
6. EFFICIENCY SCORE CALCULATION
data waste_routes_metrics;
set waste_routes_raw;
Efficiency_Score = (Waste_Collected_Tons / Fuel_Used_Liters) * 100;
format Efficiency_Score 6.2;
run;
proc print data=waste_routes_metrics;
run;
OUTPUT:
| Obs | Route_ID | City | Route_Date | Distance_km | Waste_Collected_Tons | Fuel_Used_Liters | Time_Taken_Hours | Vehicle_Type | Driver_Name | Efficiency_Score |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | 01JAN2026 | 25 | 12.5 | 18 | 4 | Compactor | Ravi | 69.44 |
| 2 | R002 | Hyderabad | 02JAN2026 | 30 | 14.0 | 22 | 5 | Tipper | Kumar | 63.64 |
| 3 | R003 | Chennai | 01JAN2026 | 20 | 10.2 | 15 | 3 | Compactor | Arjun | 68.00 |
| 4 | R004 | Chennai | 03JAN2026 | 28 | 16.8 | 26 | 6 | Tipper | Suresh | 64.62 |
| 5 | R005 | Mumbai | 01JAN2026 | 35 | 20.0 | 30 | 7 | Compactor | Ramesh | 66.67 |
| 6 | R006 | Mumbai | 02JAN2026 | 40 | 21.5 | 36 | 8 | Tipper | Mahesh | 59.72 |
| 7 | R007 | Pune | 01JAN2026 | 18 | 8.0 | 14 | 3 | MiniTruck | Ajay | 57.14 |
| 8 | R008 | Pune | 02JAN2026 | 22 | 9.5 | 16 | 4 | MiniTruck | Vijay | 59.38 |
| 9 | R009 | Delhi | 01JAN2026 | 45 | 24.0 | 40 | 9 | Compactor | Deepak | 60.00 |
| 10 | R010 | Delhi | 03JAN2026 | 50 | 26.5 | 45 | 10 | Tipper | Aman | 58.89 |
| 11 | R011 | Jaipur | 01JAN2026 | 20 | 11.0 | 17 | 4 | MiniTruck | Rohit | 64.71 |
| 12 | R012 | Jaipur | 02JAN2026 | 24 | 12.2 | 18 | 5 | MiniTruck | Mohan | 67.78 |
| 13 | R013 | Kochi | 01JAN2026 | 15 | 7.8 | 10 | 3 | Compactor | Sajan | 78.00 |
| 14 | R014 | Kochi | 02JAN2026 | 17 | 8.5 | 12 | 3 | Compactor | Anil | 70.83 |
| 15 | R015 | Indore | 03JAN2026 | 26 | 13.5 | 21 | 5 | Tipper | Sanjay | 64.29 |
·
Shows waste per fuel
·
Higher = better operational efficiency
7. MACRO: UTILIZATION CLASSIFICATION
%macro utilization;
data waste_routes_util;
set waste_routes_metrics;
length Utilization_Class $8.;
if Efficiency_Score >= 70 then Utilization_Class = 'HIGH';
else if Efficiency_Score >= 60 then Utilization_Class = 'MEDIUM';
else Utilization_Class = 'LOW';
run;
proc print data=waste_routes_util;
run;
%mend;
%utilization;
OUTPUT:
| Obs | Route_ID | City | Route_Date | Distance_km | Waste_Collected_Tons | Fuel_Used_Liters | Time_Taken_Hours | Vehicle_Type | Driver_Name | Efficiency_Score | Utilization_Class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | 01JAN2026 | 25 | 12.5 | 18 | 4 | Compactor | Ravi | 69.44 | MEDIUM |
| 2 | R002 | Hyderabad | 02JAN2026 | 30 | 14.0 | 22 | 5 | Tipper | Kumar | 63.64 | MEDIUM |
| 3 | R003 | Chennai | 01JAN2026 | 20 | 10.2 | 15 | 3 | Compactor | Arjun | 68.00 | MEDIUM |
| 4 | R004 | Chennai | 03JAN2026 | 28 | 16.8 | 26 | 6 | Tipper | Suresh | 64.62 | MEDIUM |
| 5 | R005 | Mumbai | 01JAN2026 | 35 | 20.0 | 30 | 7 | Compactor | Ramesh | 66.67 | MEDIUM |
| 6 | R006 | Mumbai | 02JAN2026 | 40 | 21.5 | 36 | 8 | Tipper | Mahesh | 59.72 | LOW |
| 7 | R007 | Pune | 01JAN2026 | 18 | 8.0 | 14 | 3 | MiniTruck | Ajay | 57.14 | LOW |
| 8 | R008 | Pune | 02JAN2026 | 22 | 9.5 | 16 | 4 | MiniTruck | Vijay | 59.38 | LOW |
| 9 | R009 | Delhi | 01JAN2026 | 45 | 24.0 | 40 | 9 | Compactor | Deepak | 60.00 | MEDIUM |
| 10 | R010 | Delhi | 03JAN2026 | 50 | 26.5 | 45 | 10 | Tipper | Aman | 58.89 | LOW |
| 11 | R011 | Jaipur | 01JAN2026 | 20 | 11.0 | 17 | 4 | MiniTruck | Rohit | 64.71 | MEDIUM |
| 12 | R012 | Jaipur | 02JAN2026 | 24 | 12.2 | 18 | 5 | MiniTruck | Mohan | 67.78 | MEDIUM |
| 13 | R013 | Kochi | 01JAN2026 | 15 | 7.8 | 10 | 3 | Compactor | Sajan | 78.00 | HIGH |
| 14 | R014 | Kochi | 02JAN2026 | 17 | 8.5 | 12 | 3 | Compactor | Anil | 70.83 | HIGH |
| 15 | R015 | Indore | 03JAN2026 | 26 | 13.5 | 21 | 5 | Tipper | Sanjay | 64.29 | MEDIUM |
Business Value
·
Helps management rank routes
·
Used in dashboards & KPIs
8. MACRO: FRAUD DETECTION LOGIC
%macro fraud_check;
data waste_routes_fraud;
set waste_routes_util;
if Fuel_Used_Liters > (Distance_km * 0.8) then Fraud_Flag = 'Y';
else Fraud_Flag = 'N';
run;
proc print data=waste_routes_fraud;
run;
%mend;
%fraud_check;
OUTPUT:
| Obs | Route_ID | City | Route_Date | Distance_km | Waste_Collected_Tons | Fuel_Used_Liters | Time_Taken_Hours | Vehicle_Type | Driver_Name | Efficiency_Score | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | 01JAN2026 | 25 | 12.5 | 18 | 4 | Compactor | Ravi | 69.44 | MEDIUM | N |
| 2 | R002 | Hyderabad | 02JAN2026 | 30 | 14.0 | 22 | 5 | Tipper | Kumar | 63.64 | MEDIUM | N |
| 3 | R003 | Chennai | 01JAN2026 | 20 | 10.2 | 15 | 3 | Compactor | Arjun | 68.00 | MEDIUM | N |
| 4 | R004 | Chennai | 03JAN2026 | 28 | 16.8 | 26 | 6 | Tipper | Suresh | 64.62 | MEDIUM | Y |
| 5 | R005 | Mumbai | 01JAN2026 | 35 | 20.0 | 30 | 7 | Compactor | Ramesh | 66.67 | MEDIUM | Y |
| 6 | R006 | Mumbai | 02JAN2026 | 40 | 21.5 | 36 | 8 | Tipper | Mahesh | 59.72 | LOW | Y |
| 7 | R007 | Pune | 01JAN2026 | 18 | 8.0 | 14 | 3 | MiniTruck | Ajay | 57.14 | LOW | N |
| 8 | R008 | Pune | 02JAN2026 | 22 | 9.5 | 16 | 4 | MiniTruck | Vijay | 59.38 | LOW | N |
| 9 | R009 | Delhi | 01JAN2026 | 45 | 24.0 | 40 | 9 | Compactor | Deepak | 60.00 | MEDIUM | Y |
| 10 | R010 | Delhi | 03JAN2026 | 50 | 26.5 | 45 | 10 | Tipper | Aman | 58.89 | LOW | Y |
| 11 | R011 | Jaipur | 01JAN2026 | 20 | 11.0 | 17 | 4 | MiniTruck | Rohit | 64.71 | MEDIUM | Y |
| 12 | R012 | Jaipur | 02JAN2026 | 24 | 12.2 | 18 | 5 | MiniTruck | Mohan | 67.78 | MEDIUM | N |
| 13 | R013 | Kochi | 01JAN2026 | 15 | 7.8 | 10 | 3 | Compactor | Sajan | 78.00 | HIGH | N |
| 14 | R014 | Kochi | 02JAN2026 | 17 | 8.5 | 12 | 3 | Compactor | Anil | 70.83 | HIGH | N |
| 15 | R015 | Indore | 03JAN2026 | 26 | 13.5 | 21 | 5 | Tipper | Sanjay | 64.29 | MEDIUM | Y |
·
Fuel should be proportional to distance
·
Flags possible misuse or data issues
9. PROC SQL – ANALYTICS
proc sql;
create table city_summary as
select City,
count(*) as Routes,
avg(Efficiency_Score) as Avg_Efficiency,
sum(Waste_Collected_Tons) as Total_Waste
from waste_routes_fraud
group by City;
quit;
proc print data=city_summary;
run;
OUTPUT:
| Obs | City | Routes | Avg_Efficiency | Total_Waste |
|---|---|---|---|---|
| 1 | Chennai | 2 | 66.3077 | 27.0 |
| 2 | Delhi | 2 | 59.4444 | 50.5 |
| 3 | Hyderabad | 2 | 66.5404 | 26.5 |
| 4 | Indore | 1 | 64.2857 | 13.5 |
| 5 | Jaipur | 2 | 66.2418 | 23.2 |
| 6 | Kochi | 2 | 74.4167 | 16.3 |
| 7 | Mumbai | 2 | 63.1944 | 41.5 |
| 8 | Pune | 2 | 58.2589 | 17.5 |
10. PROC FREQ
proc freq data=waste_routes_fraud;
tables City*Fraud_Flag / nocol nopercent;
run;
OUTPUT:
The FREQ Procedure
|
| ||||||||||||||||||||||||||||||||||||||||||||||||
proc means data=waste_routes_fraud mean min max;
var Distance_km Fuel_Used_Liters Waste_Collected_Tons Efficiency_Score;
run;
OUTPUT:
The MEANS Procedure
| Variable | Mean | Minimum | Maximum |
|---|---|---|---|
Distance_km Fuel_Used_Liters Waste_Collected_Tons Efficiency_Score | 27.6666667 22.6666667 14.4000000 64.8729690 | 15.0000000 10.0000000 7.8000000 57.1428571 | 50.0000000 45.0000000 26.5000000 78.0000000 |
12. PROC UNIVARIATE
proc univariate data=waste_routes_fraud;
var Efficiency_Score;
histogram Efficiency_Score;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Efficiency_Score
| Moments | |||
|---|---|---|---|
| N | 15 | Sum Weights | 15 |
| Mean | 64.872969 | Sum Observations | 973.094535 |
| Std Deviation | 5.53051739 | Variance | 30.5866226 |
| Skewness | 0.74705715 | Kurtosis | 0.76556131 |
| Uncorrected SS | 63555.7444 | Corrected SS | 428.212717 |
| Coeff Variation | 8.52514919 | Std Error Mean | 1.42797345 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 64.87297 | Std Deviation | 5.53052 |
| Median | 64.61538 | Variance | 30.58662 |
| Mode | . | Range | 20.85714 |
| Interquartile Range | 8.27778 | ||
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 45.43009 | Pr > |t| | <.0001 |
| Sign | M | 7.5 | Pr >= |M| | <.0001 |
| Signed Rank | S | 60 | Pr >= |S| | <.0001 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 78.0000 |
| 99% | 78.0000 |
| 95% | 78.0000 |
| 90% | 70.8333 |
| 75% Q3 | 68.0000 |
| 50% Median | 64.6154 |
| 25% Q1 | 59.7222 |
| 10% | 58.8889 |
| 5% | 57.1429 |
| 1% | 57.1429 |
| 0% Min | 57.1429 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 57.1429 | 7 | 67.7778 | 12 |
| 58.8889 | 10 | 68.0000 | 3 |
| 59.3750 | 8 | 69.4444 | 1 |
| 59.7222 | 6 | 70.8333 | 14 |
| 60.0000 | 9 | 78.0000 | 13 |
The UNIVARIATE Procedure
13. PROC CORR
proc corr data=waste_routes_fraud;
var Distance_km Fuel_Used_Liters Waste_Collected_Tons Time_Taken_Hours;
run;
OUTPUT:
The CORR Procedure
| 4 Variables: | Distance_km Fuel_Used_Liters Waste_Collected_Tons Time_Taken_Hours |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Distance_km | 15 | 27.66667 | 10.52661 | 415.00000 | 15.00000 | 50.00000 |
| Fuel_Used_Liters | 15 | 22.66667 | 10.60099 | 340.00000 | 10.00000 | 45.00000 |
| Waste_Collected_Tons | 15 | 14.40000 | 6.02412 | 216.00000 | 7.80000 | 26.50000 |
| Time_Taken_Hours | 15 | 5.26667 | 2.28244 | 79.00000 | 3.00000 | 10.00000 |
| Pearson Correlation Coefficients, N = 15 Prob > |r| under H0: Rho=0 | ||||
|---|---|---|---|---|
| Distance_km | Fuel_Used_Liters | Waste_Collected_Tons | Time_Taken_Hours | |
| Distance_km | 1.00000 | 0.99042 <.0001 | 0.98469 <.0001 | 0.98206 <.0001 |
| Fuel_Used_Liters | 0.99042 <.0001 | 1.00000 | 0.99255 <.0001 | 0.98993 <.0001 |
| Waste_Collected_Tons | 0.98469 <.0001 | 0.99255 <.0001 | 1.00000 | 0.98808 <.0001 |
| Time_Taken_Hours | 0.98206 <.0001 | 0.98993 <.0001 | 0.98808 <.0001 | 1.00000 |
14. PROC SGPLOT
proc sgplot data=waste_routes_fraud;
scatter x=Distance_km y=Fuel_Used_Liters;
reg x=Distance_km y=Fuel_Used_Liters;
run;
OUTPUT:
15. TRANSPOSE,SET
proc transpose data=city_summary out=city_trans;
by City NotSorted;
run;
proc print data=city_trans;
run;
OUTPUT:
| Obs | City | _NAME_ | COL1 |
|---|---|---|---|
| 1 | Chennai | Routes | 2.0000 |
| 2 | Chennai | Avg_Efficiency | 66.3077 |
| 3 | Chennai | Total_Waste | 27.0000 |
| 4 | Delhi | Routes | 2.0000 |
| 5 | Delhi | Avg_Efficiency | 59.4444 |
| 6 | Delhi | Total_Waste | 50.5000 |
| 7 | Hyderabad | Routes | 2.0000 |
| 8 | Hyderabad | Avg_Efficiency | 66.5404 |
| 9 | Hyderabad | Total_Waste | 26.5000 |
| 10 | Indore | Routes | 1.0000 |
| 11 | Indore | Avg_Efficiency | 64.2857 |
| 12 | Indore | Total_Waste | 13.5000 |
| 13 | Jaipur | Routes | 2.0000 |
| 14 | Jaipur | Avg_Efficiency | 66.2418 |
| 15 | Jaipur | Total_Waste | 23.2000 |
| 16 | Kochi | Routes | 2.0000 |
| 17 | Kochi | Avg_Efficiency | 74.4167 |
| 18 | Kochi | Total_Waste | 16.3000 |
| 19 | Mumbai | Routes | 2.0000 |
| 20 | Mumbai | Avg_Efficiency | 63.1944 |
| 21 | Mumbai | Total_Waste | 41.5000 |
| 22 | Pune | Routes | 2.0000 |
| 23 | Pune | Avg_Efficiency | 58.2589 |
| 24 | Pune | Total_Waste | 17.5000 |
data all_routes;
set waste_routes_raw
waste_routes_fraud;
run;
proc print data=all_routes;
run;
OUTPUT:
| Obs | Route_ID | City | Route_Date | Distance_km | Waste_Collected_Tons | Fuel_Used_Liters | Time_Taken_Hours | Vehicle_Type | Driver_Name | Efficiency_Score | Utilization_Class | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | 01JAN2026 | 25 | 12.5 | 18 | 4 | Compactor | Ravi | . | ||
| 2 | R002 | Hyderabad | 02JAN2026 | 30 | 14.0 | 22 | 5 | Tipper | Kumar | . | ||
| 3 | R003 | Chennai | 01JAN2026 | 20 | 10.2 | 15 | 3 | Compactor | Arjun | . | ||
| 4 | R004 | Chennai | 03JAN2026 | 28 | 16.8 | 26 | 6 | Tipper | Suresh | . | ||
| 5 | R005 | Mumbai | 01JAN2026 | 35 | 20.0 | 30 | 7 | Compactor | Ramesh | . | ||
| 6 | R006 | Mumbai | 02JAN2026 | 40 | 21.5 | 36 | 8 | Tipper | Mahesh | . | ||
| 7 | R007 | Pune | 01JAN2026 | 18 | 8.0 | 14 | 3 | MiniTruck | Ajay | . | ||
| 8 | R008 | Pune | 02JAN2026 | 22 | 9.5 | 16 | 4 | MiniTruck | Vijay | . | ||
| 9 | R009 | Delhi | 01JAN2026 | 45 | 24.0 | 40 | 9 | Compactor | Deepak | . | ||
| 10 | R010 | Delhi | 03JAN2026 | 50 | 26.5 | 45 | 10 | Tipper | Aman | . | ||
| 11 | R011 | Jaipur | 01JAN2026 | 20 | 11.0 | 17 | 4 | MiniTruck | Rohit | . | ||
| 12 | R012 | Jaipur | 02JAN2026 | 24 | 12.2 | 18 | 5 | MiniTruck | Mohan | . | ||
| 13 | R013 | Kochi | 01JAN2026 | 15 | 7.8 | 10 | 3 | Compactor | Sajan | . | ||
| 14 | R014 | Kochi | 02JAN2026 | 17 | 8.5 | 12 | 3 | Compactor | Anil | . | ||
| 15 | R015 | Indore | 03JAN2026 | 26 | 13.5 | 21 | 5 | Tipper | Sanjay | . | ||
| 16 | R001 | Hyderabad | 01JAN2026 | 25 | 12.5 | 18 | 4 | Compactor | Ravi | 69.44 | MEDIUM | N |
| 17 | R002 | Hyderabad | 02JAN2026 | 30 | 14.0 | 22 | 5 | Tipper | Kumar | 63.64 | MEDIUM | N |
| 18 | R003 | Chennai | 01JAN2026 | 20 | 10.2 | 15 | 3 | Compactor | Arjun | 68.00 | MEDIUM | N |
| 19 | R004 | Chennai | 03JAN2026 | 28 | 16.8 | 26 | 6 | Tipper | Suresh | 64.62 | MEDIUM | Y |
| 20 | R005 | Mumbai | 01JAN2026 | 35 | 20.0 | 30 | 7 | Compactor | Ramesh | 66.67 | MEDIUM | Y |
| 21 | R006 | Mumbai | 02JAN2026 | 40 | 21.5 | 36 | 8 | Tipper | Mahesh | 59.72 | LOW | Y |
| 22 | R007 | Pune | 01JAN2026 | 18 | 8.0 | 14 | 3 | MiniTruck | Ajay | 57.14 | LOW | N |
| 23 | R008 | Pune | 02JAN2026 | 22 | 9.5 | 16 | 4 | MiniTruck | Vijay | 59.38 | LOW | N |
| 24 | R009 | Delhi | 01JAN2026 | 45 | 24.0 | 40 | 9 | Compactor | Deepak | 60.00 | MEDIUM | Y |
| 25 | R010 | Delhi | 03JAN2026 | 50 | 26.5 | 45 | 10 | Tipper | Aman | 58.89 | LOW | Y |
| 26 | R011 | Jaipur | 01JAN2026 | 20 | 11.0 | 17 | 4 | MiniTruck | Rohit | 64.71 | MEDIUM | Y |
| 27 | R012 | Jaipur | 02JAN2026 | 24 | 12.2 | 18 | 5 | MiniTruck | Mohan | 67.78 | MEDIUM | N |
| 28 | R013 | Kochi | 01JAN2026 | 15 | 7.8 | 10 | 3 | Compactor | Sajan | 78.00 | HIGH | N |
| 29 | R014 | Kochi | 02JAN2026 | 17 | 8.5 | 12 | 3 | Compactor | Anil | 70.83 | HIGH | N |
| 30 | R015 | Indore | 03JAN2026 | 26 | 13.5 | 21 | 5 | Tipper | Sanjay | 64.29 | MEDIUM | Y |
16. PROC DATASETS DELETE
proc datasets library=work;
delete waste_routes_dates waste_routes_clean;
quit;
LOG:
17. 5 Key Points About This Project
1. This
project shows how waste collection route data can be created,
cleaned, and analyzed using SAS in a real-world style.
2. It
explains how to measure route efficiency using distance, fuel
usage, time taken, and waste collected.
3. It
demonstrates the use of macros for utilization classification
and basic fraud detection logic.
4. It
covers important SAS procedures like PROC SQL, PROC MEANS,
PROC FREQ, PROC UNIVARIATE, PROC CORR, and PROC SGPLOT.
5. The
project is written in a simple, step-by-step manner, making it
easy to understand even for beginners.
18. Who Can Use and Read This Project?
· SAS
Beginners who want hands-on practice with real-life data
· SAS
Programmers preparing for interviews
· Data
Analysts working in government or urban planning domains
· Students
learning data analytics or statistics
· Municipal
or operations teams interested in route efficiency analysis
19. Conclusion
This Waste Collection Routes Analytics project explains how data can be used
to improve daily municipal operations in a simple and practical way. Waste
collection is an important activity for every city, and managing it efficiently
helps save fuel, time, and money. Through this project, we created a realistic
dataset and analyzed it using SAS to understand how distance, fuel usage, time
taken, and waste collected are connected.
The project shows how efficiency can be calculated and how routes can be
classified as high, medium, or low utilization. It also introduces a basic
fraud detection idea by checking unusual fuel usage patterns. All steps are
explained clearly, using simple logic and commonly used SAS procedures, so even
beginners can follow along without confusion.
This project is useful not only for learning SAS but also for understanding
how data supports decision-making in real life. It helps identify inefficient
routes, possible data issues, and areas where operations can be improved.
Overall, this project builds strong confidence in using SAS for practical analytics
and prepares learners for interviews, real projects, and professional data
analysis work in an easy and understandable way.
INTERVIEW QUESTIONS FOR YOU
1.What is the difference between PROC SQL and the DATA step?
2.What is the difference between INTCK and INTNX, and how
are they useful in SAS date processing?
3.What is the difference between CALL SYMPUT
and CALL SYMPUTX?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
1.What is the difference between PROC SQL and the DATA step?
2.What is the difference between INTCK and INTNX, and how
are they useful in SAS date processing?
3.What is the difference between CALL SYMPUT
and CALL SYMPUTX?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent WASTE COLLECTION data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment