423.Can We Detect and Correct Intentional Errors in Industrial Equipment Maintenance Data Using Advanced SAS Programming?
Detecting and Correcting Intentional Errors in Industrial Equipment Maintenance Data Using Advanced SAS Programming
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —DATA STEP | LENGTH | INPUT | DATALINES | SET | MERGE | PROC SORT | PROC PRINT | PROC TRANSPOSE | PROC APPEND | PROC DATASETS DELETE | MACRO / %MACRO / %MEND | NUMERIC AND CHARACTER FUNCTIONS
1. Introduction
In modern
industrial environments, maintenance analytics plays a critical role in
ensuring operational efficiency, equipment reliability, and cost optimization.
Manufacturing plants, oil refineries, pharmaceutical facilities, and logistics
warehouses all rely heavily on machinery that must operate continuously with
minimal downtime.
However,
industrial maintenance datasets often contain numerous data quality issues such
as:
·
Incorrect maintenance costs
·
Invalid equipment IDs
·
Missing dates
·
Incorrect percentages
·
Fraudulent maintenance fees
·
Duplicate maintenance records
·
Incorrect downtime calculations
Data
engineers and SAS programmers are responsible for detecting and correcting
these errors using structured data validation techniques.
In this
project, we simulate an Industrial Equipment Maintenance dataset
containing:
·
equipment usage data
·
maintenance records
·
operational failures
·
downtime analytics
·
cost tracking
·
fraud detection indicators
The dataset intentionally includes multiple errors, which will later be
identified and corrected using advanced SAS programming techniques.
2. Table of Contents
- Introduction
- Business Context
- Dataset Structure
- Raw Dataset Creation (with
errors)
- Intentional Error Injection
- Data Cleaning Process
- SAS Macro for Fraud
Detection
- Date Handling Functions
- Dataset Combination
Techniques
- Data Transformation using
PROC TRANSPOSE
- Numeric Function
Applications
- Character Function
Applications
- Utilization Classification
- Dataset Appending and
Merging
- Corrected Dataset Generation
- 20 Key Points About This
Project
- Conclusion
3. Business Context
Industrial
companies track maintenance performance using key variables such as:
|
Variable |
Description |
|
Equipment_ID |
Unique
machine identifier |
|
Usage_Hours |
Total
operating hours |
|
Failure_Count |
Number
of breakdown events |
|
Maintenance_Cost |
Cost of
repair |
|
Downtime_Hours |
Machine
downtime |
|
Condition_Index |
Equipment
health score |
|
Remaining_Life |
Estimated
life remaining |
|
Percentage |
Maintenance
completion % |
|
Fees |
Service
fees |
|
Utilization_Class |
Equipment
utilization category |
|
Maintenance_Date |
Date of
service |
Maintenance
fraud or errors may occur in cases such as:
•
Excessive maintenance cost compared to usage
• Fake downtime reporting
• Repeated maintenance billing
• Missing equipment identifiers
• Incorrect life estimation
Detecting such anomalies is an important task for SAS programmers working in industrial analytics.
1. Raw Dataset Creation (With Intentional Errors)
title "Industrial Equipment Maintenance Dataset With Intentional Errors";
data equipment_raw;
length Equipment_ID $10 Equipment_Type $20 Location $20 Technician $20;
input Equipment_ID $ Equipment_Type $ Usage_Hours
Failure_Count Maintenance_Cost Downtime_Hours
Condition_Index Remaining_Life Percentage
Fees Location $ Technician $ Maintenance_Date :ddmmyy10.;
format Maintenance_Date date9.;
datalines;
EQ001 Pump 1200 2 5000 10 85 24 90 250 Hyderabad Ravi 12-01-2025
EQ002 Motor 900 5 12000 30 60 18 85 500 Chennai Kiran 14-02-2025
EQ003 Compressor 1500 3 8000 15 75 20 88 350 Mumbai Suresh 21-03-2025
EQ004 Pump 2000 7 20000 45 40 10 95 800 Pune Ramesh 11-04-2025
EQ005 Turbine 500 1 3000 5 92 30 101 200 Delhi Ajay 22-05-2025
EQ006 Pump 700 2 -500 8 88 26 85 150 Hyderabad Ravi 15-06-2025
EQ007 Motor 1100 0 2500 0 95 35 80 100 Chennai Kiran 18-07-2025
EQ008 Compressor 1600 6 15000 40 55 15 92 650 Mumbai Suresh 25-08-2025
EQ009 Pump . 3 6000 12 70 20 88 300 Pune Ramesh 19-09-2025
EQ010 Turbine 1300 4 9000 22 65 18 89 400 Delhi Ajay 02-10-2025
EQ011 Pump 1000 2 5500 11 80 23 87 250 Hyderabad Ravi 17-11-2025
EQ012 Motor 1400 3 9500 20 72 19 84 450 Chennai Kiran 09-12-2025
EQ013 Compressor 1700 8 21000 60 35 8 99 900 Mumbai Suresh 01-01-2026
EQ014 Pump 850 2 4500 9 82 24 86 230 Pune Ramesh 14-02-2026
EQ015 Turbine 600 1 3200 6 90 28 83 180 Delhi Ajay 05-03-2026
;
run;
proc print data=equipment_raw;
run;
title;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 |
Intentional Errors Introduced
This dataset deliberately contains multiple
issues:
Error 1 — Invalid Percentage
Percentage = 101
Percentage cannot exceed 100.
Error
2 — Negative Maintenance Cost
Maintenance_Cost = -500
Maintenance costs cannot be negative.
Error
3 — Missing Usage Hours
Usage_Hours = .
This affects utilization calculations.
Error
4 — Abnormally High Maintenance Cost
21000
Possible fraud.
2. Fraud Detection Macro
%macro fraud_detection;
data fraud_flag;
set equipment_raw;
if Maintenance_Cost > 20000 then Fraud_Flag = "YES";
else Fraud_Flag = "NO";
run;
proc print data=fraud_flag;
run;
%mend;
%fraud_detection;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | NO |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | NO |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | NO |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | NO |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | NO |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | NO |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | NO |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | NO |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | NO |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | NO |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | NO |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | NO |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | YES |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | NO |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | NO |
3. Utilization Classification
data utilization_class;
set equipment_raw;
length Utilization_Class $8.;
if Usage_Hours >=1500 then Utilization_Class="HIGH";
else if Usage_Hours >=800 then Utilization_Class="MEDIUM";
else Utilization_Class="LOW";
run;
proc print data=utilization_class;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Utilization_Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | MEDIUM |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | MEDIUM |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | HIGH |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | HIGH |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | LOW |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | LOW |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | MEDIUM |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | HIGH |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | LOW |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | MEDIUM |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | MEDIUM |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | MEDIUM |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | HIGH |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | MEDIUM |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | LOW |
4. Date Functions
data MDY;
set equipment_raw;
Next_Service_Date = mdy(month(Maintenance_Date),day(Maintenance_Date)
,year(Maintenance_Date)+1);
run;
proc print data=MDY;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Next_Service_Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | 24118 |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | 24151 |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | 24186 |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | 24207 |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | 24248 |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | 24272 |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | 24305 |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | 24343 |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | 24368 |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | 24381 |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | 24427 |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | 24449 |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | 24472 |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | 24516 |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | 24535 |
Creates next service
schedule.
data INTCK;
set equipment_raw;
Service_Age = intck('month',Maintenance_Date,today());
run;
proc print data=INTCK;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Service_Age |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | 14 |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | 13 |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | 12 |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | 11 |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | 10 |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | 9 |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | 8 |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | 7 |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | 6 |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | 5 |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | 4 |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | 3 |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | 2 |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | 1 |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | 0 |
Counts months since last service.
data INTNX;
set equipment_raw;
Service_Age = intck('month',Maintenance_Date,today());
run;
proc print data=INTNX;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Service_Age |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | 14 |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | 13 |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | 12 |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | 11 |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | 10 |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | 9 |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | 8 |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | 7 |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | 6 |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | 5 |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | 4 |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | 3 |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | 2 |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | 1 |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | 0 |
Calculates next month service.
5. Character Functions
data Char;
set equipment_raw;
Technician_Name = propcase(Technician);
Location_Upper = upcase(Location);
Location_Lower = lowcase(Location);
Clean_ID = strip(trim(Equipment_ID));
Full_ID = catx("-",Equipment_ID,Equipment_Type);
run;
proc print data=Char;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Technician_Name | Location_Upper | Location_Lower | Clean_ID | Full_ID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | Ravi | HYDERABAD | hyderabad | EQ001 | EQ001-Pump |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | Kiran | CHENNAI | chennai | EQ002 | EQ002-Motor |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | Suresh | MUMBAI | mumbai | EQ003 | EQ003-Compressor |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | Ramesh | PUNE | pune | EQ004 | EQ004-Pump |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | Ajay | DELHI | delhi | EQ005 | EQ005-Turbine |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | Ravi | HYDERABAD | hyderabad | EQ006 | EQ006-Pump |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | Kiran | CHENNAI | chennai | EQ007 | EQ007-Motor |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | Suresh | MUMBAI | mumbai | EQ008 | EQ008-Compressor |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | Ramesh | PUNE | pune | EQ009 | EQ009-Pump |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | Ajay | DELHI | delhi | EQ010 | EQ010-Turbine |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | Ravi | HYDERABAD | hyderabad | EQ011 | EQ011-Pump |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | Kiran | CHENNAI | chennai | EQ012 | EQ012-Motor |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | Suresh | MUMBAI | mumbai | EQ013 | EQ013-Compressor |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | Ramesh | PUNE | pune | EQ014 | EQ014-Pump |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | Ajay | DELHI | delhi | EQ015 | EQ015-Turbine |
6. Numeric Functions
data Num;
set equipment_raw;
Cost_per_Hour = Maintenance_Cost / Usage_Hours;
Rounded_Cost = round(Maintenance_Cost);
Log_Cost = log(Maintenance_Cost);
run;
proc print data=Num;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Cost_per_Hour | Rounded_Cost | Log_Cost |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | 4.1667 | 5000 | 8.51719 |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | 13.3333 | 12000 | 9.39266 |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | 5.3333 | 8000 | 8.98720 |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | 10.0000 | 20000 | 9.90349 |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | 6.0000 | 3000 | 8.00637 |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | -0.7143 | -500 | . |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | 2.2727 | 2500 | 7.82405 |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | 9.3750 | 15000 | 9.61581 |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | . | 6000 | 8.69951 |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | 6.9231 | 9000 | 9.10498 |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | 5.5000 | 5500 | 8.61250 |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | 6.7857 | 9500 | 9.15905 |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | 12.3529 | 21000 | 9.95228 |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | 5.2941 | 4500 | 8.41183 |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | 5.3333 | 3200 | 8.07091 |
7. PROC TRANSPOSE
proc transpose data=equipment_raw out=maintenance_transpose;
var Usage_Hours Maintenance_Cost Downtime_Hours;
run;
proc print data=maintenance_transpose;
run;
OUTPUT:
| Obs | _NAME_ | COL1 | COL2 | COL3 | COL4 | COL5 | COL6 | COL7 | COL8 | COL9 | COL10 | COL11 | COL12 | COL13 | COL14 | COL15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Usage_Hours | 1200 | 900 | 1500 | 2000 | 500 | 700 | 1100 | 1600 | . | 1300 | 1000 | 1400 | 1700 | 850 | 600 |
| 2 | Maintenance_Cost | 5000 | 12000 | 8000 | 20000 | 3000 | -500 | 2500 | 15000 | 6000 | 9000 | 5500 | 9500 | 21000 | 4500 | 3200 |
| 3 | Downtime_Hours | 10 | 30 | 15 | 45 | 5 | 8 | 0 | 40 | 12 | 22 | 11 | 20 | 60 | 9 | 6 |
8. Dataset Append
proc append base=equipment_raw
data=utilization_class force;
run;
proc print data=equipment_raw;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 |
| 16 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 |
| 17 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 |
| 18 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 |
| 19 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 |
| 20 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 |
| 21 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 |
| 22 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 |
| 23 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 |
| 24 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 |
| 25 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 |
| 26 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 |
| 27 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 |
| 28 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 |
| 29 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 |
| 30 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 |
9. Dataset Merge
proc sort data=equipment_raw;by Equipment_ID;run;
proc print data=equipment_raw;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 |
| 2 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 |
| 3 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 |
| 4 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 |
| 5 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 |
| 6 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 |
| 7 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 |
| 8 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 |
| 9 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 |
| 10 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 |
| 11 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 |
| 12 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 |
| 13 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 |
| 14 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 |
| 15 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 |
| 16 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 |
| 17 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 |
| 18 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 |
| 19 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 |
| 20 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 |
| 21 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 |
| 22 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 |
| 23 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 |
| 24 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 |
| 25 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 |
| 26 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 |
| 27 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 |
| 28 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 |
| 29 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 |
| 30 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 |
proc sort data=fraud_flag;by Equipment_ID;run;
proc print data=fraud_flag;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | NO |
| 2 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | NO |
| 3 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | NO |
| 4 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | NO |
| 5 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | NO |
| 6 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | NO |
| 7 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | NO |
| 8 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | NO |
| 9 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | NO |
| 10 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | NO |
| 11 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | NO |
| 12 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | NO |
| 13 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | YES |
| 14 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | NO |
| 15 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | NO |
data final_maintenance;
merge equipment_raw
fraud_flag;
by Equipment_ID;
run;
proc print data=final_maintenance;
run;
OUTPUT:
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Fraud_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | NO |
| 2 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | NO |
| 3 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | NO |
| 4 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | NO |
| 5 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | NO |
| 6 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | NO |
| 7 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | NO |
| 8 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | NO |
| 9 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | NO |
| 10 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 101 | 200 | 22MAY2025 | NO |
| 11 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | NO |
| 12 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | -500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | NO |
| 13 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | NO |
| 14 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | NO |
| 15 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | NO |
| 16 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | NO |
| 17 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | NO |
| 18 | EQ009 | Pump | Pune | Ramesh | . | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | NO |
| 19 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | NO |
| 20 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | NO |
| 21 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | NO |
| 22 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | NO |
| 23 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | NO |
| 24 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | NO |
| 25 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | YES |
| 26 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | YES |
| 27 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | NO |
| 28 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | NO |
| 29 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | NO |
| 30 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | NO |
10. PROC DATASETS Delete
proc datasets library=work;
delete maintenance_transpose;
quit;
LOG:
NOTE: Deleting WORK.MAINTENANCE_TRANSPOSE (memtype=DATA).
11. Corrected Dataset Code (Full Program)
data equipment_clean;
set equipment_raw;
if Percentage>100 then Percentage=100;
if Maintenance_Cost<0 then Maintenance_Cost=abs(Maintenance_Cost);
if Usage_Hours=. then Usage_Hours=1000;
if Maintenance_Cost>20000 then Fraud_Flag="YES";
else Fraud_Flag="NO";
Cost_per_Hour = Maintenance_Cost / Usage_Hours;
Next_Service_Date = intnx('year',Maintenance_Date,1,'same');
format Next_Service_Date date9.;
run;
proc print data=equipment_clean;
run;
| Obs | Equipment_ID | Equipment_Type | Location | Technician | Usage_Hours | Failure_Count | Maintenance_Cost | Downtime_Hours | Condition_Index | Remaining_Life | Percentage | Fees | Maintenance_Date | Fraud_Flag | Cost_per_Hour | Next_Service_Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | NO | 4.1667 | 12JAN2026 |
| 2 | EQ001 | Pump | Hyderabad | Ravi | 1200 | 2 | 5000 | 10 | 85 | 24 | 90 | 250 | 12JAN2025 | NO | 4.1667 | 12JAN2026 |
| 3 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | NO | 13.3333 | 14FEB2026 |
| 4 | EQ002 | Motor | Chennai | Kiran | 900 | 5 | 12000 | 30 | 60 | 18 | 85 | 500 | 14FEB2025 | NO | 13.3333 | 14FEB2026 |
| 5 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | NO | 5.3333 | 21MAR2026 |
| 6 | EQ003 | Compressor | Mumbai | Suresh | 1500 | 3 | 8000 | 15 | 75 | 20 | 88 | 350 | 21MAR2025 | NO | 5.3333 | 21MAR2026 |
| 7 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | NO | 10.0000 | 11APR2026 |
| 8 | EQ004 | Pump | Pune | Ramesh | 2000 | 7 | 20000 | 45 | 40 | 10 | 95 | 800 | 11APR2025 | NO | 10.0000 | 11APR2026 |
| 9 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 100 | 200 | 22MAY2025 | NO | 6.0000 | 22MAY2026 |
| 10 | EQ005 | Turbine | Delhi | Ajay | 500 | 1 | 3000 | 5 | 92 | 30 | 100 | 200 | 22MAY2025 | NO | 6.0000 | 22MAY2026 |
| 11 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | 500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | NO | 0.7143 | 15JUN2026 |
| 12 | EQ006 | Pump | Hyderabad | Ravi | 700 | 2 | 500 | 8 | 88 | 26 | 85 | 150 | 15JUN2025 | NO | 0.7143 | 15JUN2026 |
| 13 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | NO | 2.2727 | 18JUL2026 |
| 14 | EQ007 | Motor | Chennai | Kiran | 1100 | 0 | 2500 | 0 | 95 | 35 | 80 | 100 | 18JUL2025 | NO | 2.2727 | 18JUL2026 |
| 15 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | NO | 9.3750 | 25AUG2026 |
| 16 | EQ008 | Compressor | Mumbai | Suresh | 1600 | 6 | 15000 | 40 | 55 | 15 | 92 | 650 | 25AUG2025 | NO | 9.3750 | 25AUG2026 |
| 17 | EQ009 | Pump | Pune | Ramesh | 1000 | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | NO | 6.0000 | 19SEP2026 |
| 18 | EQ009 | Pump | Pune | Ramesh | 1000 | 3 | 6000 | 12 | 70 | 20 | 88 | 300 | 19SEP2025 | NO | 6.0000 | 19SEP2026 |
| 19 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | NO | 6.9231 | 02OCT2026 |
| 20 | EQ010 | Turbine | Delhi | Ajay | 1300 | 4 | 9000 | 22 | 65 | 18 | 89 | 400 | 02OCT2025 | NO | 6.9231 | 02OCT2026 |
| 21 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | NO | 5.5000 | 17NOV2026 |
| 22 | EQ011 | Pump | Hyderabad | Ravi | 1000 | 2 | 5500 | 11 | 80 | 23 | 87 | 250 | 17NOV2025 | NO | 5.5000 | 17NOV2026 |
| 23 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | NO | 6.7857 | 09DEC2026 |
| 24 | EQ012 | Motor | Chennai | Kiran | 1400 | 3 | 9500 | 20 | 72 | 19 | 84 | 450 | 09DEC2025 | NO | 6.7857 | 09DEC2026 |
| 25 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | YES | 12.3529 | 01JAN2027 |
| 26 | EQ013 | Compressor | Mumbai | Suresh | 1700 | 8 | 21000 | 60 | 35 | 8 | 99 | 900 | 01JAN2026 | YES | 12.3529 | 01JAN2027 |
| 27 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | NO | 5.2941 | 14FEB2027 |
| 28 | EQ014 | Pump | Pune | Ramesh | 850 | 2 | 4500 | 9 | 82 | 24 | 86 | 230 | 14FEB2026 | NO | 5.2941 | 14FEB2027 |
| 29 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | NO | 5.3333 | 05MAR2027 |
| 30 | EQ015 | Turbine | Delhi | Ajay | 600 | 1 | 3200 | 6 | 90 | 28 | 83 | 180 | 05MAR2026 | NO | 5.3333 | 05MAR2027 |
20 Key Points About This Project
1.
The project begins by designing a realistic industrial equipment maintenance dataset.
2.
It includes operational variables such as Usage Hours, Failure Count, and Downtime.
3.
Financial metrics such as Maintenance Cost and Service Fees are incorporated.
4.
Equipment health is represented using Condition Index and Remaining Life.
5.
The dataset intentionally introduces multiple data quality errors.
6.
Examples include negative
maintenance cost and missing usage hours.
7.
Invalid percentages greater than 100% were deliberately inserted.
8.
SAS DATA step
programming was used to build the dataset.
9.
Character cleaning functions such as STRIP and TRIM were applied.
10. Formatting
improvements were performed using PROPCASE,
UPCASE, and LOWCASE.
11. Numeric
functions were used to derive metrics like cost per hour.
12. Date
calculations were performed using MDY,
INTCK, and INTNX.
13. A
fraud detection macro was
implemented to identify abnormal maintenance costs.
14. Equipment
utilization was categorized as LOW,
MEDIUM, or HIGH.
15. PROC TRANSPOSE was used to restructure
maintenance variables.
16. Multiple
datasets were combined using SET, MERGE,
and APPEND.
17. Incorrect
values were corrected using conditional
logic in DATA steps.
18. Temporary
datasets were cleaned using PROC DATASETS
DELETE.
19. The
final dataset supports maintenance
analytics and fraud detection.
20. The
completed project demonstrates advanced
SAS data engineering techniques used in real industrial analytics
environments.
Project Summary
This project demonstrates how SAS programming
can be used to manage and analyze industrial equipment maintenance data while
detecting and correcting data quality issues. A simulated dataset was created
containing variables such as Equipment_ID, Usage_Hours, Failure_Count,
Maintenance_Cost, Downtime_Hours, Condition_Index, Remaining_Life, Percentage,
Fees, and Maintenance_Date. Several intentional errors were introduced into the
dataset, including negative maintenance costs, missing usage hours, invalid
percentages greater than 100, and suspiciously high maintenance expenses. These
errors represent common real-world data issues encountered in industrial
maintenance analytics.
Using SAS DATA step programming, macros, character functions, numeric functions, and date functions such as MDY, INTCK, and INTNX, the dataset was cleaned and transformed. Fraud detection logic was implemented to flag abnormal maintenance costs, and utilization classification was created to categorize equipment usage levels. Additional dataset manipulation techniques such as SET, MERGE, APPEND, and PROC TRANSPOSE were applied. The final cleaned dataset provides reliable maintenance insights for operational decision-making and predictive equipment management.
Conclusion
This
project demonstrates a complete industrial equipment maintenance analytics
workflow using SAS programming. The program begins with the creation of a
raw dataset that intentionally contains multiple data errors, such as missing
values, invalid percentages, negative maintenance costs, and suspiciously high
repair expenses.
Through
the use of SAS DATA steps, macros, dataset manipulation techniques, and
built-in functions, the program systematically detects these issues, flags
fraudulent transactions, and corrects invalid data.
Key SAS
techniques include:
·
Dataset creation using DATA step
·
Fraud detection using macros
·
Data cleaning and validation
·
Date calculations using MDY, INTCK, and
INTNX
·
Character manipulation using STRIP,
TRIM, CATX, PROPCASE
·
Numeric analysis using mathematical
functions
·
Dataset transformation using PROC
TRANSPOSE
·
Dataset merging and appending
·
Dataset management using PROC DATASETS
The
final cleaned dataset provides reliable information for industrial maintenance
decision-making, enabling organizations to reduce downtime, detect fraudulent
maintenance activities, and improve equipment life cycle management.
SAS INTERVIEW QUESTIONS
1. Explain ARRAY processing in SAS?
An ARRAY
in SAS is used to process multiple variables
together using a single loop instead of writing repetitive code.
Example use:
·
Apply calculations to many variables
·
Data cleaning across multiple columns
Example:
array scores
score1-score5;
do i=1 to 5;
scores{i}=scores{i}+10;
end;
Short: ARRAY allows efficient
processing of multiple variables using loops.
2. What is the purpose of DO
UNTIL and DO WHILE?
Both are looping
statements used to repeat a block of code.
|
Statement |
Condition
Check |
|
DO
WHILE |
Condition
checked before execution |
|
DO
UNTIL |
Condition
checked after execution |
Short:
· DO WHILE
runs while condition is true.
·
DO
UNTIL runs until condition becomes true.
3. How do you remove duplicates in SAS?
Duplicates are usually removed using PROC SORT with NODUPKEY.
Example:
proc sort data=dataset nodupkey;
by id;
run;
Short: NODUPKEY removes
observations with duplicate values of the BY variable.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent EQUIPMENT MAINTENANCE data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics
· Clinical SAS Programmer
· Research Data Analyst
· Regulatory Data Validator
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment