Building a Reliable Emergency Services Analytics & Fraud Detection System in SAS Through Systematic Error Identification and Correction

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

HERE WE USED THESE SAS STATEMENTS AND FUNCTIONS FOR THIS PROJECT:

INTRODUCTION

1. Business Context

2. Dataset Design

3. Raw Dataset Creation (With Intentional Errors)

4. Identifying the Errors

5. Corrected Dataset Code (Full Version)

6. Utilization Classification Macro

7. Fraud Detection Logic

8. Descriptive Analysis (PROC MEANS, UNIVARIATE)

9. Correlation Analysis

10. Visualization (PROC SGPLOT)

11. Data Manipulation (APPEND, TRANSPOSE)

12. Why Each Procedure Is Used

13. Business Insights

14. Conclusion

Introduction:

Emergency services require accurate, reliable, and timely data to ensure public safety and operational efficiency. This project builds a structured SAS-based analytics and fraud detection system that simulates real-world emergency operations, introduces intentional data errors, detects inconsistencies, and corrects them using statistical validation, automation, and logical programming techniques.

Emergency services such as ambulance, fire, and police departments operate in high-pressure environments where response time, staff readiness, equipment availability, and incident handling capacity directly impact public safety.

In this project, we will:

· Create a custom emergency services dataset (with more than 12 observations).

· Introduce intentional errors.

· Detect and correct those errors.

Business Context

Emergency departments must answer:

· Are response times within acceptable thresholds?

· Is staff utilization optimized?

· Are equipment failures causing delays?

· Is there any fraudulent over-reporting of incidents?

· Are some cities under-resourced?

We simulate this environment using SAS programming.

1. Raw Dataset Creation (With Intentional Errors)

data emergency_raw;

length Service_Type $20 City $20 Equipment_Status $15;

format Service_Date date9.;

input Service_ID Service_Type $ City $ Response_Time_Minutes

Incidents_Handled Staff_Available Equipment_Status $

Readiness_Score Service_Date : date9.;

datalines;

101 ambulance hyderabad 8 20 15 functional 85 01JAN2026

102 Fire HYDERABAD 15 30 18 functional 90 05JAN2026

103 police mumbai -5 25 10 damaged 75 10JAN2026

104 Ambulance mumbai 12 200 5 functional 88 15JAN2026

105 fire chennai 20 18 12 missing 92 20JAN2026

106 police delhi 7 22 0 functional 80 25JAN2026

107 ambulance delhi 9 19 14 functional . 28JAN2026

108 fire kolkata 30 40 20 functional 95 30JAN2026

109 police kolkata 11 21 9 damaged 78 02FEB2026

110 ambulance chennai 14 16 13 functional 83 05FEB2026

111 fire mumbai 16 23 15 functional 89 08FEB2026

112 police hyderabad 13 17 11 functional 84 10FEB2026

113 ambulance mumbai 18 28 16 functional 91 12FEB2026

114 fire delhi 22 35 17 functional 94 14FEB2026

;

run;

proc print data=emergency_raw;

run;

OUTPUT:


Obs	Service_Type	City	Equipment_Status	Service_Date	Service_ID	Response_Time_Minutes	Incidents_Handled	Staff_Available	Readiness_Score
1	ambulance	hyderabad	functional	01JAN2026	101	8	20	15	85
2	Fire	HYDERABAD	functional	05JAN2026	102	15	30	18	90
3	police	mumbai	damaged	10JAN2026	103	-5	25	10	75
4	Ambulance	mumbai	functional	15JAN2026	104	12	200	5	88
5	fire	chennai	missing	20JAN2026	105	20	18	12	92
6	police	delhi	functional	25JAN2026	106	7	22	0	80
7	ambulance	delhi	functional	28JAN2026	107	9	19	14	.
8	fire	kolkata	functional	30JAN2026	108	30	40	20	95
9	police	kolkata	damaged	02FEB2026	109	11	21	9	78
10	ambulance	chennai	functional	05FEB2026	110	14	16	13	83
11	fire	mumbai	functional	08FEB2026	111	16	23	15	89
12	police	hyderabad	functional	10FEB2026	112	13	17	11	84
13	ambulance	mumbai	functional	12FEB2026	113	18	28	16	91
14	fire	delhi	functional	14FEB2026	114	22	35	17	94

Intentional Errors Introduced

Negative response time (-5).
Incidents_Handled = 200 (outlier).
Staff_Available = 0.
Equipment_Status = "missing".
Readiness_Score = missing.
Mixed case inconsistencies.
Logical inconsistency: High readiness but equipment damaged.

2. Identifying Errors

proc means data=emergency_raw n mean min max;

var Response_Time_Minutes Incidents_Handled Staff_Available Readiness_Score;

run;

OUTPUT:

The MEANS Procedure


Variable	N	Mean	Minimum	Maximum
Response_Time_Minutes Incidents_Handled Staff_Available Readiness_Score	14 14 14 13	13.5714286 36.7142857 12.5000000 86.4615385	-5.0000000 16.0000000 0 75.0000000	30.0000000 200.0000000 20.0000000 95.0000000

proc freq data=emergency_raw;

tables Equipment_Status Service_Type City / missing;

run;

OUTPUT:

The FREQ Procedure


Equipment_Status	Frequency	Percent	Cumulative Frequency	Cumulative Percent
damaged	2	14.29	2	14.29
functional	11	78.57	13	92.86
missing	1	7.14	14	100.00


Service_Type	Frequency	Percent	Cumulative Frequency	Cumulative Percent
Ambulance	1	7.14	1	7.14
Fire	1	7.14	2	14.29
ambulance	4	28.57	6	42.86
fire	4	28.57	10	71.43
police	4	28.57	14	100.00


City	Frequency	Percent	Cumulative Frequency	Cumulative Percent
HYDERABAD	1	7.14	1	7.14
chennai	2	14.29	3	21.43
delhi	3	21.43	6	42.86
hyderabad	2	14.29	8	57.14
kolkata	2	14.29	10	71.43
mumbai	4	28.57	14	100.00

· Identify equipment inconsistencies

· Verify service type standardization

· Detect rare suspicious categories

proc univariate data=emergency_raw;

var Incidents_Handled Response_Time_Minutes;

run;

OUTPUT:

The UNIVARIATE Procedure

Variable: Incidents_Handled


Moments
N	14	Sum Weights	14
Mean	36.7142857	Sum Observations	514
Std Deviation	47.5126357	Variance	2257.45055
Skewness	3.60459	Kurtosis	13.2493791
Uncorrected SS	48218	Corrected SS	29346.8571
Coeff Variation	129.411848	Std Error Mean	12.698286


Basic Statistical Measures
Location		Variability
Mean	36.71429	Std Deviation	47.51264
Median	22.50000	Variance	2257
Mode	.	Range	184.00000
		Interquartile Range	11.00000


Tests for Location: Mu0=0
Test	Statistic		p Value
Student's t	t	2.891279	Pr > \|t\|	0.0126
Sign	M	7	Pr >= \|M\|	0.0001
Signed Rank	S	52.5	Pr >= \|S\|	0.0001


Quantiles (Definition 5)
Level	Quantile
100% Max	200.0
99%	200.0
95%	200.0
90%	40.0
75% Q3	30.0
50% Median	22.5
25% Q1	19.0
10%	17.0
5%	16.0
1%	16.0
0% Min	16.0


Extreme Observations
Lowest		Highest
Value	Obs	Value	Obs
16	10	28	13
17	12	30	2
18	5	35	14
19	7	40	8
20	1	200	4

The UNIVARIATE Procedure

Variable: Response_Time_Minutes


Moments
N	14	Sum Weights	14
Mean	13.5714286	Sum Observations	190
Std Deviation	8.13079886	Variance	66.1098901
Skewness	-0.2631101	Kurtosis	1.82343499
Uncorrected SS	3438	Corrected SS	859.428571
Coeff Variation	59.9111495	Std Error Mean	2.1730474


Basic Statistical Measures
Location		Variability
Mean	13.57143	Std Deviation	8.13080
Median	13.50000	Variance	66.10989
Mode	.	Range	35.00000
		Interquartile Range	9.00000


Tests for Location: Mu0=0
Test	Statistic		p Value
Student's t	t	6.245344	Pr > \|t\|	<.0001
Sign	M	6	Pr >= \|M\|	0.0018
Signed Rank	S	51.5	Pr >= \|S\|	0.0002


Quantiles (Definition 5)
Level	Quantile
100% Max	30.0
99%	30.0
95%	30.0
90%	22.0
75% Q3	18.0
50% Median	13.5
25% Q1	9.0
10%	7.0
5%	-5.0
1%	-5.0
0% Min	-5.0


Extreme Observations
Lowest		Highest
Value	Obs	Value	Obs
-5	3	16	11
7	6	18	13
8	1	20	5
9	7	22	14
11	9	30	8

Helps identify:

· Skewness

· Extreme observations

· Distribution shape

Useful for operational monitoring.

3. Corrected Dataset (Full Clean Code)

data emergency_clean;

set emergency_raw;

Service_Type = propcase(strip(Service_Type));

City = propcase(strip(City));

Equipment_Status = lowcase(strip(Equipment_Status));

if Response_Time_Minutes < 0 then Response_Time_Minutes = .;

if Incidents_Handled > 100 then Incidents_Handled = 100;

if Staff_Available = 0 then Staff_Available = .;

if Equipment_Status = "missing" then Equipment_Status = "damaged";

Readiness_Score = coalesce(Readiness_Score, 80);

Month_Start = intnx('month', Service_Date, 0, 'b');

Month_End = intnx('month', Service_Date, 0, 'e');

Days_From_Start = intck('day', mdy(1,1,2026), Service_Date);

format Month_Start Month_End date9.;

run;

proc print data=emergency_clean;

run;

OUTPUT:


Obs	Service_Type	City	Equipment_Status	Service_Date	Service_ID	Response_Time_Minutes	Incidents_Handled	Staff_Available	Readiness_Score	Month_Start	Month_End	Days_From_Start
1	Ambulance	Hyderabad	functional	01JAN2026	101	8	20	15	85	01JAN2026	31JAN2026	0
2	Fire	Hyderabad	functional	05JAN2026	102	15	30	18	90	01JAN2026	31JAN2026	4
3	Police	Mumbai	damaged	10JAN2026	103	.	25	10	75	01JAN2026	31JAN2026	9
4	Ambulance	Mumbai	functional	15JAN2026	104	12	100	5	88	01JAN2026	31JAN2026	14
5	Fire	Chennai	damaged	20JAN2026	105	20	18	12	92	01JAN2026	31JAN2026	19
6	Police	Delhi	functional	25JAN2026	106	7	22	.	80	01JAN2026	31JAN2026	24
7	Ambulance	Delhi	functional	28JAN2026	107	9	19	14	80	01JAN2026	31JAN2026	27
8	Fire	Kolkata	functional	30JAN2026	108	30	40	20	95	01JAN2026	31JAN2026	29
9	Police	Kolkata	damaged	02FEB2026	109	11	21	9	78	01FEB2026	28FEB2026	32
10	Ambulance	Chennai	functional	05FEB2026	110	14	16	13	83	01FEB2026	28FEB2026	35
11	Fire	Mumbai	functional	08FEB2026	111	16	23	15	89	01FEB2026	28FEB2026	38
12	Police	Hyderabad	functional	10FEB2026	112	13	17	11	84	01FEB2026	28FEB2026	40
13	Ambulance	Mumbai	functional	12FEB2026	113	18	28	16	91	01FEB2026	28FEB2026	42
14	Fire	Delhi	functional	14FEB2026	114	22	35	17	94	01FEB2026	28FEB2026	44

· Conditional logic (IF-THEN)

· Missing value handling

· Value capping

· Logical validation

Functions used:

· STRIP( ) → removes leading/trailing spaces

· PROPCASE( ) → proper case formatting

· LOWCASE( ) / UPCASE( ) → consistency

· COALESCE ( ) used to replace missing readiness scores logically instead of deleting records.

Date functions applied:

· MDY( ) → construct dates

· INTCK( ) → calculate duration

· INTNX( ) → derive month start/end

4. Utilization Classification Macro

%macro utilization;

data emergency_util;

set emergency_clean;

length Utilization_Class $8.;

Utilization_Rate = Incidents_Handled / Staff_Available;

if Utilization_Rate >= 2 then Utilization_Class="High";

else if Utilization_Rate >=1 then Utilization_Class="Medium";

else Utilization_Class="Low";

run;

proc print data=emergency_util;

run;

%mend;

%utilization;

OUTPUT:


Obs	Service_Type	City	Equipment_Status	Service_Date	Service_ID	Response_Time_Minutes	Incidents_Handled	Staff_Available	Readiness_Score	Month_Start	Month_End	Days_From_Start	Utilization_Class	Utilization_Rate
1	Ambulance	Hyderabad	functional	01JAN2026	101	8	20	15	85	01JAN2026	31JAN2026	0	Medium	1.3333
2	Fire	Hyderabad	functional	05JAN2026	102	15	30	18	90	01JAN2026	31JAN2026	4	Medium	1.6667
3	Police	Mumbai	damaged	10JAN2026	103	.	25	10	75	01JAN2026	31JAN2026	9	High	2.5000
4	Ambulance	Mumbai	functional	15JAN2026	104	12	100	5	88	01JAN2026	31JAN2026	14	High	20.0000
5	Fire	Chennai	damaged	20JAN2026	105	20	18	12	92	01JAN2026	31JAN2026	19	Medium	1.5000
6	Police	Delhi	functional	25JAN2026	106	7	22	.	80	01JAN2026	31JAN2026	24	Low	.
7	Ambulance	Delhi	functional	28JAN2026	107	9	19	14	80	01JAN2026	31JAN2026	27	Medium	1.3571
8	Fire	Kolkata	functional	30JAN2026	108	30	40	20	95	01JAN2026	31JAN2026	29	High	2.0000
9	Police	Kolkata	damaged	02FEB2026	109	11	21	9	78	01FEB2026	28FEB2026	32	High	2.3333
10	Ambulance	Chennai	functional	05FEB2026	110	14	16	13	83	01FEB2026	28FEB2026	35	Medium	1.2308
11	Fire	Mumbai	functional	08FEB2026	111	16	23	15	89	01FEB2026	28FEB2026	38	Medium	1.5333
12	Police	Hyderabad	functional	10FEB2026	112	13	17	11	84	01FEB2026	28FEB2026	40	Medium	1.5455
13	Ambulance	Mumbai	functional	12FEB2026	113	18	28	16	91	01FEB2026	28FEB2026	42	Medium	1.7500
14	Fire	Delhi	functional	14FEB2026	114	22	35	17	94	01FEB2026	28FEB2026	44	High	2.0588

5. Fraud Detection Macro

%macro fraud_check;

data emergency_fraud;

set emergency_util;

Fraud_Flag=0;

if Response_Time_Minutes < 3 and Incidents_Handled > 30 then Fraud_Flag=1;

if Equipment_Status="damaged" and Readiness_Score > 90 then Fraud_Flag=1;

if Staff_Available=. and Incidents_Handled>20 then Fraud_Flag=1;

run;

proc print data=emergency_fraud;

run;

%mend;

%fraud_check;

OUTPUT:


Obs	Service_Type	City	Equipment_Status	Service_Date	Service_ID	Response_Time_Minutes	Incidents_Handled	Staff_Available	Readiness_Score	Month_Start	Month_End	Days_From_Start	Utilization_Class	Utilization_Rate	Fraud_Flag
1	Ambulance	Hyderabad	functional	01JAN2026	101	8	20	15	85	01JAN2026	31JAN2026	0	Medium	1.3333	0
2	Fire	Hyderabad	functional	05JAN2026	102	15	30	18	90	01JAN2026	31JAN2026	4	Medium	1.6667	0
3	Police	Mumbai	damaged	10JAN2026	103	.	25	10	75	01JAN2026	31JAN2026	9	High	2.5000	0
4	Ambulance	Mumbai	functional	15JAN2026	104	12	100	5	88	01JAN2026	31JAN2026	14	High	20.0000	0
5	Fire	Chennai	damaged	20JAN2026	105	20	18	12	92	01JAN2026	31JAN2026	19	Medium	1.5000	1
6	Police	Delhi	functional	25JAN2026	106	7	22	.	80	01JAN2026	31JAN2026	24	Low	.	1
7	Ambulance	Delhi	functional	28JAN2026	107	9	19	14	80	01JAN2026	31JAN2026	27	Medium	1.3571	0
8	Fire	Kolkata	functional	30JAN2026	108	30	40	20	95	01JAN2026	31JAN2026	29	High	2.0000	0
9	Police	Kolkata	damaged	02FEB2026	109	11	21	9	78	01FEB2026	28FEB2026	32	High	2.3333	0
10	Ambulance	Chennai	functional	05FEB2026	110	14	16	13	83	01FEB2026	28FEB2026	35	Medium	1.2308	0
11	Fire	Mumbai	functional	08FEB2026	111	16	23	15	89	01FEB2026	28FEB2026	38	Medium	1.5333	0
12	Police	Hyderabad	functional	10FEB2026	112	13	17	11	84	01FEB2026	28FEB2026	40	Medium	1.5455	0
13	Ambulance	Mumbai	functional	12FEB2026	113	18	28	16	91	01FEB2026	28FEB2026	42	Medium	1.7500	0
14	Fire	Delhi	functional	14FEB2026	114	22	35	17	94	01FEB2026	28FEB2026	44	High	2.0588	0

Fraud logic:

· Response_Time < 3 minutes but incidents > 30.

· Equipment damaged but readiness > 90.

· Staff missing but high incidents.

6. Statistical Analysis

PROC MEANS

proc means data=emergency_fraud mean median std min max;

class Service_Type;

var Response_Time_Minutes Incidents_Handled Readiness_Score;

run;

OUTPUT:

The MEANS Procedure


Service_Type	N Obs	Variable	Mean	Median	Std Dev	Minimum	Maximum
Ambulance	5	Response_Time_Minutes Incidents_Handled Readiness_Score	12.2000000 36.6000000 85.4000000	12.0000000 20.0000000 85.0000000	4.0249224 35.7183426 4.2778499	8.0000000 16.0000000 80.0000000	18.0000000 100.0000000 91.0000000
Fire	5	Response_Time_Minutes Incidents_Handled Readiness_Score	20.6000000 29.2000000 92.0000000	20.0000000 30.0000000 92.0000000	5.9833101 8.8713020 2.5495098	15.0000000 18.0000000 89.0000000	30.0000000 40.0000000 95.0000000
Police	4	Response_Time_Minutes Incidents_Handled Readiness_Score	10.3333333 21.2500000 79.2500000	11.0000000 21.5000000 79.0000000	3.0550505 3.3040379 3.7749172	7.0000000 17.0000000 75.0000000	13.0000000 25.0000000 84.0000000

· Mean

· Median

· Standard Deviation

· Min/Max

Used to detect range errors and abnormal values.

PROC CORR

proc corr data=emergency_fraud;

var Response_Time_Minutes Incidents_Handled Staff_Available Readiness_Score;

run;

OUTPUT:

The CORR Procedure


4 Variables:	Response_Time_Minutes Incidents_Handled Staff_Available Readiness_Score


Simple Statistics
Variable	N	Mean	Std Dev	Sum	Minimum	Maximum
Response_Time_Minutes	13	15.00000	6.37704	195.00000	7.00000	30.00000
Incidents_Handled	14	29.57143	21.43941	414.00000	16.00000	100.00000
Staff_Available	13	13.46154	4.07462	175.00000	5.00000	20.00000
Readiness_Score	14	86.00000	6.23884	1204	75.00000	95.00000


Pearson Correlation Coefficients Prob > \|r\| under H0: Rho=0 Number of Observations
	Response_Time_Minutes	Incidents_Handled	Staff_Available	Readiness_Score
Response_Time_Minutes	1.00000 13	0.10443 0.7342 13	0.56223 0.0571 12	0.84777 0.0003 13
Incidents_Handled	0.10443 0.7342 13	1.00000 14	-0.39884 0.1770 13	0.27950 0.3332 14
Staff_Available	0.56223 0.0571 12	-0.39884 0.1770 13	1.00000 13	0.56450 0.0444 13
Readiness_Score	0.84777 0.0003 13	0.27950 0.3332 14	0.56450 0.0444 13	1.00000 14

· Response time

· Incidents handled

· Staff availability

· Readiness score

Supports operational optimization.

7. Visualization

proc sgplot data=emergency_fraud;

scatter x=Response_Time_Minutes y=Incidents_Handled;

run;

OUTPUT:

proc sgplot data=emergency_fraud;

vbar Service_Type / response=Readiness_Score stat=mean;

run;

OUTPUT:

8. PROC SQL Usage

proc sql;

create table city_summary as

select City,

mean(Response_Time_Minutes) as Avg_Response,

sum(Incidents_Handled) as Total_Incidents

from emergency_fraud

group by City;

quit;

proc print data=city_summary;

run;

OUTPUT:


Obs	City	Avg_Response	Total_Incidents
1	Chennai	17.0000	34
2	Delhi	12.6667	76
3	Hyderabad	12.0000	67
4	Kolkata	20.5000	61
5	Mumbai	15.3333	176

City-level summary:

· Average response time

· Total incidents

Efficient for reporting and dashboard preparation.

9. PROC TRANSPOSE

proc transpose data=city_summary out=city_transpose;

by City NotSorted;

run;

proc print data=city_transpose;

run;

OUTPUT:


Obs	City	_NAME_	COL1
1	Chennai	Avg_Response	17.000
2	Chennai	Total_Incidents	34.000
3	Delhi	Avg_Response	12.667
4	Delhi	Total_Incidents	76.000
5	Hyderabad	Avg_Response	12.000
6	Hyderabad	Total_Incidents	67.000
7	Kolkata	Avg_Response	20.500
8	Kolkata	Total_Incidents	61.000
9	Mumbai	Avg_Response	15.333
10	Mumbai	Total_Incidents	176.000

10. PROC APPEND

proc append base=emergency_clean

data=emergency_raw force;

run;

proc print data=emergency_raw;

run;

OUTPUT:


Obs	Service_Type	City	Equipment_Status	Service_Date	Service_ID	Response_Time_Minutes	Incidents_Handled	Staff_Available	Readiness_Score
1	ambulance	hyderabad	functional	01JAN2026	101	8	20	15	85
2	Fire	HYDERABAD	functional	05JAN2026	102	15	30	18	90
3	police	mumbai	damaged	10JAN2026	103	-5	25	10	75
4	Ambulance	mumbai	functional	15JAN2026	104	12	200	5	88
5	fire	chennai	missing	20JAN2026	105	20	18	12	92
6	police	delhi	functional	25JAN2026	106	7	22	0	80
7	ambulance	delhi	functional	28JAN2026	107	9	19	14	.
8	fire	kolkata	functional	30JAN2026	108	30	40	20	95
9	police	kolkata	damaged	02FEB2026	109	11	21	9	78
10	ambulance	chennai	functional	05FEB2026	110	14	16	13	83
11	fire	mumbai	functional	08FEB2026	111	16	23	15	89
12	police	hyderabad	functional	10FEB2026	112	13	17	11	84
13	ambulance	mumbai	functional	12FEB2026	113	18	28	16	91
14	fire	delhi	functional	14FEB2026	114	22	35	17	94

· Vertical concatenation

· Horizontal joining

· Incremental data updates

11. PROC DATASETS DELETE

proc datasets library=work;

delete emergency_raw;

quit;

LOG:

 NOTE: Deleting WORK.EMERGENCY_RAW (memtype=DATA).

12. Why Each Procedure Is Used

DATA STEP

Used for row-level transformation and error correction.

SET

Reads dataset sequentially.

MERGE

Combines datasets by key variable.

PROC SQL

Advanced aggregation and joins.

PROC MEANS

Summary statistics.

PROC UNIVARIATE

Detect skewness and outliers.

PROC FREQ

Categorical distribution.

PROC CORR

Correlation matrix.

PROC SGPLOT

Data visualization.

MACROS

Reusable automation.

13. 20 Key Points About This Project

· Defined a Clear Business Objective – The project focuses on measuring emergency service efficiency and detecting potential fraud or operational inconsistencies.

· Designed a Realistic Emergency Dataset – Included Ambulance, Fire, and Police services operating across multiple cities with measurable performance indicators.

· Selected Meaningful Operational Variables – Used Response Time, Incidents Handled, Staff Available, Equipment Status, Readiness Score, and Service Date.

· Structured Dataset with Proper Attributes – Applied correct variable types, length statements, and date formats to avoid structural errors.

· Inserted Intentional Errors – Introduced negative response times, extreme outliers, missing values, and logical inconsistencies to simulate real-world dirty data.

· Performed Initial Data Profiling – Used statistical summaries to detect abnormal minimum, maximum, and missing values.

· Validated Categorical Variables – Checked service types, equipment status, and city names for inconsistencies.

· Identified Outliers and Distribution Issues – Analyzed spread and unusual values to detect unrealistic operational records.

· Standardized Character Data – Cleaned text using functions like STRIP, PROPCASE, UPCASE, and LOWCASE for consistency.

· Corrected Logical Numeric Errors – Applied conditional rules to fix negative response times and unrealistic incident counts.

· Handled Missing Data Strategically – Used COALESCE and business logic instead of deleting records to preserve data integrity.

· Applied Date Intelligence Functions – Used MDY, INTCK, and INTNX for time-based calculations and monthly aggregation.

· Calculated Workforce Utilization Rate – Derived Incidents ÷ Staff to measure operational load efficiency.

· Classified Utilization Using Macro Logic – Automated categorization into High, Medium, and Low workload levels.

· Designed Fraud Detection Rules – Flagged unrealistic operational combinations such as damaged equipment with high readiness.

· Automated Fraud Flagging – Implemented reusable macro logic to identify suspicious records.

· Performed Correlation Analysis – Examined relationships between staffing, response time, and readiness scores.

· Generated Aggregated Reports – Used SQL summarization to produce city-level and service-level insights.

· Created Visualizations for Management – Developed scatter plots and bar charts to clearly communicate patterns.

· Built an End-to-End Reliable Framework – Integrated data cleaning, validation, fraud detection, automation, statistical analysis, and reporting into a complete SAS-based system.

14. Business Insights

· Mumbai shows highest incident overload.

· Kolkata has longer response time.

· Damaged equipment correlated with low readiness.

· Fraud flags identified suspicious records.

15. Conclusion

This project demonstrates that a reliable Emergency Services Analytics and Fraud Detection System can be successfully built using SAS when structured validation and systematic error correction techniques are applied. By designing a realistic dataset, intentionally inserting data quality issues, and then identifying and correcting those errors, we simulated real-world operational challenges. Statistical procedures such as summary analysis, distribution checks, and correlation evaluation ensured data accuracy and reliability. Character and numeric validation improved consistency, while date functions enhanced time-based intelligence. The use of macros automated utilization classification and fraud detection logic, making the system scalable and reusable. Visualization and aggregation supported management-level decision-making. Most importantly, the project highlighted how data cleaning, anomaly detection, and logical validation are critical in high-risk domains like emergency services. Overall, this end-to-end framework proves that SAS provides powerful tools to build accurate, efficient, and fraud-aware emergency operations analytics systems.

SAS INTERVIEW QUESTIONS

· What is ADSL?

· What is ADAE?

· What is the purpose of TLFs?

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.

Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent EMERGENCY SERVICES data.

Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.

This project is suitable for:
·  Students learning SAS
·  Data analysts building portfolios
·  Professionals preparing for SAS interviews
·  Bloggers writing about analytics
·  Clinical SAS Programmer
·  Research Data Analyst
·  Regulatory Data Validator

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Follow Us On :

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--->Follow our blog for more SAS-based analytics projects and industry data models.

---> Support Us By Following Our Blog..

To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:

1.Can SAS analytics reveal which programming languages are popular but surprisingly difficult to master?
2.Is SDTM DM quality control in SAS the key to avoiding last-minute FDA rejection risks?
3.Can SAS analytics reveal which world tourist places truly attract visitors year after year?

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

About Us | Contact | Privacy Policy

Search This Blog

SAS Learning Hub

409.Can We Build a Reliable Emergency Services Analytics & Fraud Detection System in SAS While Identifying and Fixing Intentional Errors?

Building a Reliable Emergency Services Analytics & Fraud Detection System in SAS Through Systematic Error Identification and Correction

INTRODUCTION

Table of Contents

Introduction:

Business Context