398.Is Your Favorite Restaurant Really Safe? What Do Food Inspection Scores Reveal?A Sas Data Analytics
Is Your Favorite Restaurant Really Safe? What Do Food Inspection Scores Reveal?A Sas Data Analytics
HERE IN THIS PROJECT WE USED THESE SAS STATEMENTS —
DATA STEP | PROC SQL | PROC PRINT | PROC SGPLOT | MACROS | PROC CORR | PROC MEANS | PROC FREQ | PROC UNIVARIATE | PROC SORT | MERGE | PROC TRANSPOSE | PROC DATASETS DELETE | DATA FUNCTIONS
INTRODUCTION
Food
safety inspections play a critical role in public health.
Government bodies inspect restaurants periodically to ensure hygiene, safety,
and regulatory compliance.
Poor food safety leads to:
- Food poisoning outbreaks
- Loss of public trust
- Legal penalties
- Business shutdowns
This SAS
project simulates realistic food safety inspection data and demonstrates
how a SAS Programmer analyzes, validates, classifies, and detects risk or
fraud patterns using:
- DATA step
- PROC SQL
- PROC MEANS
- PROC UNIVARIATE
- PROC FREQ
- PROC CORR
- PROC SGPLOT
- MACROS
- DATE FUNCTIONS (MDY, INTCK, INTNX)
- STRING FUNCTIONS (STRIP, TRIM, CAT, CATX, PROPCASE, UPCASE, LOWCASE)
- DATA MANAGEMENT (SET, MERGE, APPEND, TRANSPOSE)
- PROC DATASETS DELETE
TABLE OF
CONTENTS
- Business Context
- Dataset Design
- Raw Data Creation
- Data Cleaning &
Formatting
- Date Derivations
- Utilization Classification
(Macro)
- Compliance Percentage
Calculation
- Fraud Detection Logic
(Macro)
- PROC SQL Analysis
- PROC FREQ Analysis
- PROC MEANS Analysis
- PROC UNIVARIATE Analysis
- PROC CORR Analysis
- Visualization using PROC
SGPLOT
- Data Restructuring
(TRANSPOSE, MERGE)
- Dataset Maintenance (PROC
DATASETS DELETE)
- Business Insights
- Conclusion
1.1. BUSINESS CONTEXT
Stakeholders:
- Municipal food safety
departments
- Restaurant owners
- Public health analysts
- Policy makers
Business
Questions Answered:
- Which cities have poor
compliance?
- Are inspections happening
frequently enough?
- Which restaurants show
suspicious patterns?
- Is re-inspection delayed for
unsafe restaurants?
- Is hygiene correlated with
inspection score?
2.2.DATASET DESIGN
Core Variables
|
Variable |
Description |
|
Restaurant_ID |
Unique
restaurant identifier |
|
City |
Inspection
city |
|
Inspection_Date |
Date of
inspection |
|
Inspection_Score |
Score
out of 100 |
|
Violations_Count |
Number
of violations |
|
Reinspection_Days |
Days
until next inspection |
|
Hygiene_Level |
High /
Medium / Low |
|
Compliance_Status |
Compliant
/ Non-Compliant |
|
Compliance_Percentage |
Calculated
percentage |
|
Inspector_Name |
Inspector |
|
Risk_Flag |
Fraud/Risk
indicator |
3. RAW DATA CREATION
data food_inspection_raw;
length Restaurant_ID $6 City $15 Hygiene_Level $10 Inspector_Name $20;
input Restaurant_ID $ City $ Inspection_Date : date9. Inspection_Score Violations_Count
Reinspection_Days Hygiene_Level $ Inspector_Name $;
format Inspection_Date date9.;
datalines;
R001 hyderabad 01JAN2025 92 1 30 High ramesh
R002 mumbai 05JAN2025 68 5 10 Low suresh
R003 delhi 07JAN2025 75 3 20 Medium anil
R004 chennai 10JAN2025 88 2 25 High kavita
R005 kolkata 12JAN2025 55 7 7 Low ramesh
R006 pune 15JAN2025 81 2 30 Medium anil
R007 hyderabad 18JAN2025 60 6 14 Low suresh
R008 mumbai 20JAN2025 90 1 30 High kavita
R009 delhi 22JAN2025 72 4 15 Medium ramesh
R010 chennai 25JAN2025 85 2 28 High anil
R011 kolkata 27JAN2025 58 8 5 Low suresh
R012 pune 28JAN2025 79 3 20 Medium kavita
R013 hyderabad 30JAN2025 95 0 45 High ramesh
R014 mumbai 01FEB2025 62 6 12 Low anil
R015 delhi 03FEB2025 70 4 18 Medium kavita
R016 chennai 05FEB2025 82 2 30 High suresh
;
run;
proc print data=food_inspection_raw;
run;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Date | Inspection_Score | Violations_Count | Reinspection_Days |
|---|---|---|---|---|---|---|---|---|
| 1 | R001 | hyderabad | High | ramesh | 01JAN2025 | 92 | 1 | 30 |
| 2 | R002 | mumbai | Low | suresh | 05JAN2025 | 68 | 5 | 10 |
| 3 | R003 | delhi | Medium | anil | 07JAN2025 | 75 | 3 | 20 |
| 4 | R004 | chennai | High | kavita | 10JAN2025 | 88 | 2 | 25 |
| 5 | R005 | kolkata | Low | ramesh | 12JAN2025 | 55 | 7 | 7 |
| 6 | R006 | pune | Medium | anil | 15JAN2025 | 81 | 2 | 30 |
| 7 | R007 | hyderabad | Low | suresh | 18JAN2025 | 60 | 6 | 14 |
| 8 | R008 | mumbai | High | kavita | 20JAN2025 | 90 | 1 | 30 |
| 9 | R009 | delhi | Medium | ramesh | 22JAN2025 | 72 | 4 | 15 |
| 10 | R010 | chennai | High | anil | 25JAN2025 | 85 | 2 | 28 |
| 11 | R011 | kolkata | Low | suresh | 27JAN2025 | 58 | 8 | 5 |
| 12 | R012 | pune | Medium | kavita | 28JAN2025 | 79 | 3 | 20 |
| 13 | R013 | hyderabad | High | ramesh | 30JAN2025 | 95 | 0 | 45 |
| 14 | R014 | mumbai | Low | anil | 01FEB2025 | 62 | 6 | 12 |
| 15 | R015 | delhi | Medium | kavita | 03FEB2025 | 70 | 4 | 18 |
| 16 | R016 | chennai | High | suresh | 05FEB2025 | 82 | 2 | 30 |
·
LENGTH avoids truncation
·
DATE9. ensures proper date handling
·
Realistic inspection patterns
·
Mixed hygiene levels & scores
4. DATA CLEANING & STANDARDIZATION
data food_inspection_clean;
set food_inspection_raw;
City = propcase(strip(City));
Hygiene_Level = upcase(strip(Hygiene_Level));
Inspector_Name = propcase(strip(Inspector_Name));
length Compliance_Status $15;
if Inspection_Score >= 80 then Compliance_Status = "Compliant";
else Compliance_Status = "Non-Compliant";
run;
proc print data=food_inspection_clean;
var Restaurant_ID City Hygiene_Level Inspector_Name Inspection_Score Compliance_Status;
run;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Score | Compliance_Status |
|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | HIGH | Ramesh | 92 | Compliant |
| 2 | R002 | Mumbai | LOW | Suresh | 68 | Non-Compliant |
| 3 | R003 | Delhi | MEDIUM | Anil | 75 | Non-Compliant |
| 4 | R004 | Chennai | HIGH | Kavita | 88 | Compliant |
| 5 | R005 | Kolkata | LOW | Ramesh | 55 | Non-Compliant |
| 6 | R006 | Pune | MEDIUM | Anil | 81 | Compliant |
| 7 | R007 | Hyderabad | LOW | Suresh | 60 | Non-Compliant |
| 8 | R008 | Mumbai | HIGH | Kavita | 90 | Compliant |
| 9 | R009 | Delhi | MEDIUM | Ramesh | 72 | Non-Compliant |
| 10 | R010 | Chennai | HIGH | Anil | 85 | Compliant |
| 11 | R011 | Kolkata | LOW | Suresh | 58 | Non-Compliant |
| 12 | R012 | Pune | MEDIUM | Kavita | 79 | Non-Compliant |
| 13 | R013 | Hyderabad | HIGH | Ramesh | 95 | Compliant |
| 14 | R014 | Mumbai | LOW | Anil | 62 | Non-Compliant |
| 15 | R015 | Delhi | MEDIUM | Kavita | 70 | Non-Compliant |
| 16 | R016 | Chennai | HIGH | Suresh | 82 | Compliant |
·
STRIP/TRIM: removes unwanted spaces
·
PROPCASE: professional city names
·
UPCASE: standard categorical values
5. DATE DERIVATIONS (MDY, INTCK, INTNX)
data food_inspection_dates;
set food_inspection_clean;
Inspection_Month = intnx('month', Inspection_Date, 0, 'b');
Days_Since_Inspection = intck('day', Inspection_Date, today());
Next_Inspection_Date = intnx('day', Inspection_Date, Reinspection_Days);
run;
proc print data=food_inspection_dates;
var Restaurant_ID City Hygiene_Level Inspector_Name Inspection_Date Inspection_Month
Days_Since_Inspection Next_Inspection_Date Reinspection_Days;
run;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Date | Inspection_Month | Days_Since_Inspection | Next_Inspection_Date | Reinspection_Days |
|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | HIGH | Ramesh | 01JAN2025 | 23742 | 406 | 23772 | 30 |
| 2 | R002 | Mumbai | LOW | Suresh | 05JAN2025 | 23742 | 402 | 23756 | 10 |
| 3 | R003 | Delhi | MEDIUM | Anil | 07JAN2025 | 23742 | 400 | 23768 | 20 |
| 4 | R004 | Chennai | HIGH | Kavita | 10JAN2025 | 23742 | 397 | 23776 | 25 |
| 5 | R005 | Kolkata | LOW | Ramesh | 12JAN2025 | 23742 | 395 | 23760 | 7 |
| 6 | R006 | Pune | MEDIUM | Anil | 15JAN2025 | 23742 | 392 | 23786 | 30 |
| 7 | R007 | Hyderabad | LOW | Suresh | 18JAN2025 | 23742 | 389 | 23773 | 14 |
| 8 | R008 | Mumbai | HIGH | Kavita | 20JAN2025 | 23742 | 387 | 23791 | 30 |
| 9 | R009 | Delhi | MEDIUM | Ramesh | 22JAN2025 | 23742 | 385 | 23778 | 15 |
| 10 | R010 | Chennai | HIGH | Anil | 25JAN2025 | 23742 | 382 | 23794 | 28 |
| 11 | R011 | Kolkata | LOW | Suresh | 27JAN2025 | 23742 | 380 | 23773 | 5 |
| 12 | R012 | Pune | MEDIUM | Kavita | 28JAN2025 | 23742 | 379 | 23789 | 20 |
| 13 | R013 | Hyderabad | HIGH | Ramesh | 30JAN2025 | 23742 | 377 | 23816 | 45 |
| 14 | R014 | Mumbai | LOW | Anil | 01FEB2025 | 23773 | 375 | 23785 | 12 |
| 15 | R015 | Delhi | MEDIUM | Kavita | 03FEB2025 | 23773 | 373 | 23793 | 18 |
| 16 | R016 | Chennai | HIGH | Suresh | 05FEB2025 | 23773 | 371 | 23807 | 30 |
·
Identify overdue inspections
·
Schedule future audits
·
Regulatory planning
6. COMPLIANCE PERCENTAGE
data food_inspection_pct;
set food_inspection_dates;
Compliance_Percentage = round((Inspection_Score / 100) * 100, 0.01);
run;
proc print data=food_inspection_pct;
var Restaurant_ID City Hygiene_Level Inspector_Name Inspection_Score Compliance_Percentage;
run;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Score | Compliance_Percentage |
|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | HIGH | Ramesh | 92 | 92 |
| 2 | R002 | Mumbai | LOW | Suresh | 68 | 68 |
| 3 | R003 | Delhi | MEDIUM | Anil | 75 | 75 |
| 4 | R004 | Chennai | HIGH | Kavita | 88 | 88 |
| 5 | R005 | Kolkata | LOW | Ramesh | 55 | 55 |
| 6 | R006 | Pune | MEDIUM | Anil | 81 | 81 |
| 7 | R007 | Hyderabad | LOW | Suresh | 60 | 60 |
| 8 | R008 | Mumbai | HIGH | Kavita | 90 | 90 |
| 9 | R009 | Delhi | MEDIUM | Ramesh | 72 | 72 |
| 10 | R010 | Chennai | HIGH | Anil | 85 | 85 |
| 11 | R011 | Kolkata | LOW | Suresh | 58 | 58 |
| 12 | R012 | Pune | MEDIUM | Kavita | 79 | 79 |
| 13 | R013 | Hyderabad | HIGH | Ramesh | 95 | 95 |
| 14 | R014 | Mumbai | LOW | Anil | 62 | 62 |
| 15 | R015 | Delhi | MEDIUM | Kavita | 70 | 70 |
| 16 | R016 | Chennai | HIGH | Suresh | 82 | 82 |
7. UTILIZATION CLASSIFICATION (MACRO)
%macro classify_utilization;
data food_inspection_util;
set food_inspection_pct;
length Utilization_Level $12.;
if Reinspection_Days <= 10 then Utilization_Level = "High Risk";
else if Reinspection_Days <= 25 then Utilization_Level = "Medium Risk";
else Utilization_Level = "Low Risk";
run;
proc print data=food_inspection_util;
run;
%mend;
%classify_utilization;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Date | Inspection_Score | Violations_Count | Reinspection_Days | Compliance_Status | Inspection_Month | Days_Since_Inspection | Next_Inspection_Date | Compliance_Percentage | Utilization_Level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | HIGH | Ramesh | 01JAN2025 | 92 | 1 | 30 | Compliant | 23742 | 406 | 23772 | 92 | Low Risk |
| 2 | R002 | Mumbai | LOW | Suresh | 05JAN2025 | 68 | 5 | 10 | Non-Compliant | 23742 | 402 | 23756 | 68 | High Risk |
| 3 | R003 | Delhi | MEDIUM | Anil | 07JAN2025 | 75 | 3 | 20 | Non-Compliant | 23742 | 400 | 23768 | 75 | Medium Risk |
| 4 | R004 | Chennai | HIGH | Kavita | 10JAN2025 | 88 | 2 | 25 | Compliant | 23742 | 397 | 23776 | 88 | Medium Risk |
| 5 | R005 | Kolkata | LOW | Ramesh | 12JAN2025 | 55 | 7 | 7 | Non-Compliant | 23742 | 395 | 23760 | 55 | High Risk |
| 6 | R006 | Pune | MEDIUM | Anil | 15JAN2025 | 81 | 2 | 30 | Compliant | 23742 | 392 | 23786 | 81 | Low Risk |
| 7 | R007 | Hyderabad | LOW | Suresh | 18JAN2025 | 60 | 6 | 14 | Non-Compliant | 23742 | 389 | 23773 | 60 | Medium Risk |
| 8 | R008 | Mumbai | HIGH | Kavita | 20JAN2025 | 90 | 1 | 30 | Compliant | 23742 | 387 | 23791 | 90 | Low Risk |
| 9 | R009 | Delhi | MEDIUM | Ramesh | 22JAN2025 | 72 | 4 | 15 | Non-Compliant | 23742 | 385 | 23778 | 72 | Medium Risk |
| 10 | R010 | Chennai | HIGH | Anil | 25JAN2025 | 85 | 2 | 28 | Compliant | 23742 | 382 | 23794 | 85 | Low Risk |
| 11 | R011 | Kolkata | LOW | Suresh | 27JAN2025 | 58 | 8 | 5 | Non-Compliant | 23742 | 380 | 23773 | 58 | High Risk |
| 12 | R012 | Pune | MEDIUM | Kavita | 28JAN2025 | 79 | 3 | 20 | Non-Compliant | 23742 | 379 | 23789 | 79 | Medium Risk |
| 13 | R013 | Hyderabad | HIGH | Ramesh | 30JAN2025 | 95 | 0 | 45 | Compliant | 23742 | 377 | 23816 | 95 | Low Risk |
| 14 | R014 | Mumbai | LOW | Anil | 01FEB2025 | 62 | 6 | 12 | Non-Compliant | 23773 | 375 | 23785 | 62 | Medium Risk |
| 15 | R015 | Delhi | MEDIUM | Kavita | 03FEB2025 | 70 | 4 | 18 | Non-Compliant | 23773 | 373 | 23793 | 70 | Medium Risk |
| 16 | R016 | Chennai | HIGH | Suresh | 05FEB2025 | 82 | 2 | 30 | Compliant | 23773 | 371 | 23807 | 82 | Low Risk |
Macros help automate business logic and ensure consistency across studies.
8. FRAUD / RISK DETECTION LOGIC (MACRO)
%macro fraud_detection;
data food_inspection_fraud;
set food_inspection_util;
length Risk_Flag $12.;
if Inspection_Score > 85 and Violations_Count > 5 then Risk_Flag = "Suspicious";
else if Inspection_Score < 60 and Reinspection_Days > 25 then Risk_Flag = "Negligence";
else Risk_Flag = "Normal";
run;
proc print data=food_inspection_fraud;
var Restaurant_ID City Hygiene_Level Inspector_Name Inspection_Score Violations_Count
Risk_Flag;
run;
%mend;
%fraud_detection;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Score | Violations_Count | Risk_Flag |
|---|---|---|---|---|---|---|---|
| 1 | R001 | Hyderabad | HIGH | Ramesh | 92 | 1 | Normal |
| 2 | R002 | Mumbai | LOW | Suresh | 68 | 5 | Normal |
| 3 | R003 | Delhi | MEDIUM | Anil | 75 | 3 | Normal |
| 4 | R004 | Chennai | HIGH | Kavita | 88 | 2 | Normal |
| 5 | R005 | Kolkata | LOW | Ramesh | 55 | 7 | Normal |
| 6 | R006 | Pune | MEDIUM | Anil | 81 | 2 | Normal |
| 7 | R007 | Hyderabad | LOW | Suresh | 60 | 6 | Normal |
| 8 | R008 | Mumbai | HIGH | Kavita | 90 | 1 | Normal |
| 9 | R009 | Delhi | MEDIUM | Ramesh | 72 | 4 | Normal |
| 10 | R010 | Chennai | HIGH | Anil | 85 | 2 | Normal |
| 11 | R011 | Kolkata | LOW | Suresh | 58 | 8 | Normal |
| 12 | R012 | Pune | MEDIUM | Kavita | 79 | 3 | Normal |
| 13 | R013 | Hyderabad | HIGH | Ramesh | 95 | 0 | Normal |
| 14 | R014 | Mumbai | LOW | Anil | 62 | 6 | Normal |
| 15 | R015 | Delhi | MEDIUM | Kavita | 70 | 4 | Normal |
| 16 | R016 | Chennai | HIGH | Suresh | 82 | 2 | Normal |
·
High score + many violations = possible manipulation
·
Low score + delayed reinspection = regulatory lapse
9. PROC SQL – BUSINESS SUMMARIES
proc sql;
create table city_summary as
select City,
count(*) as Total_Restaurants,
avg(Inspection_Score) as Avg_Score format=6.2,
avg(Violations_Count) as Avg_Violations
from food_inspection_fraud
group by City;
quit;
proc print data=city_summary;
run;
OUTPUT:
| Obs | City | Total_Restaurants | Avg_Score | Avg_Violations |
|---|---|---|---|---|
| 1 | Chennai | 3 | 85.00 | 2.00000 |
| 2 | Delhi | 3 | 72.33 | 3.66667 |
| 3 | Hyderabad | 3 | 82.33 | 2.33333 |
| 4 | Kolkata | 2 | 56.50 | 7.50000 |
| 5 | Mumbai | 3 | 73.33 | 4.00000 |
| 6 | Pune | 2 | 80.00 | 2.50000 |
10. PROC FREQ
proc freq data=food_inspection_fraud;
tables City*Compliance_Status / nocum;
run;
OUTPUT:
The FREQ Procedure
|
| ||||||||||||||||||||||||||||||||||||||||
proc means data=food_inspection_fraud mean min max std;
var Inspection_Score Violations_Count Reinspection_Days;
run;
OUTPUT:
The MEANS Procedure
| Variable | Mean | Minimum | Maximum | Std Dev |
|---|---|---|---|---|
Inspection_Score Violations_Count Reinspection_Days | 75.7500000 3.5000000 21.1875000 | 55.0000000 0 5.0000000 | 95.0000000 8.0000000 45.0000000 | 12.7566976 2.3380904 10.6909229 |
12. PROC UNIVARIATE
proc univariate data=food_inspection_fraud;
var Inspection_Score;
histogram Inspection_Score;
run;
OUTPUT:
The UNIVARIATE Procedure
Variable: Inspection_Score
| Moments | |||
|---|---|---|---|
| N | 16 | Sum Weights | 16 |
| Mean | 75.75 | Sum Observations | 1212 |
| Std Deviation | 12.7566976 | Variance | 162.733333 |
| Skewness | -0.160423 | Kurtosis | -1.1980457 |
| Uncorrected SS | 94250 | Corrected SS | 2441 |
| Coeff Variation | 16.8405249 | Std Error Mean | 3.1891744 |
| Basic Statistical Measures | |||
|---|---|---|---|
| Location | Variability | ||
| Mean | 75.75000 | Std Deviation | 12.75670 |
| Median | 77.00000 | Variance | 162.73333 |
| Mode | . | Range | 40.00000 |
| Interquartile Range | 21.50000 | ||
| Tests for Location: Mu0=0 | ||||
|---|---|---|---|---|
| Test | Statistic | p Value | ||
| Student's t | t | 23.75223 | Pr > |t| | <.0001 |
| Sign | M | 8 | Pr >= |M| | <.0001 |
| Signed Rank | S | 68 | Pr >= |S| | <.0001 |
| Quantiles (Definition 5) | |
|---|---|
| Level | Quantile |
| 100% Max | 95.0 |
| 99% | 95.0 |
| 95% | 95.0 |
| 90% | 92.0 |
| 75% Q3 | 86.5 |
| 50% Median | 77.0 |
| 25% Q1 | 65.0 |
| 10% | 58.0 |
| 5% | 55.0 |
| 1% | 55.0 |
| 0% Min | 55.0 |
| Extreme Observations | |||
|---|---|---|---|
| Lowest | Highest | ||
| Value | Obs | Value | Obs |
| 55 | 5 | 85 | 10 |
| 58 | 11 | 88 | 4 |
| 60 | 7 | 90 | 8 |
| 62 | 14 | 92 | 1 |
| 68 | 2 | 95 | 13 |
The UNIVARIATE Procedure
13. PROC CORR
proc corr data=food_inspection_fraud;
var Inspection_Score Violations_Count Reinspection_Days;
run;
OUTPUT:
The CORR Procedure
| 3 Variables: | Inspection_Score Violations_Count Reinspection_Days |
|---|
| Simple Statistics | ||||||
|---|---|---|---|---|---|---|
| Variable | N | Mean | Std Dev | Sum | Minimum | Maximum |
| Inspection_Score | 16 | 75.75000 | 12.75670 | 1212 | 55.00000 | 95.00000 |
| Violations_Count | 16 | 3.50000 | 2.33809 | 56.00000 | 0 | 8.00000 |
| Reinspection_Days | 16 | 21.18750 | 10.69092 | 339.00000 | 5.00000 | 45.00000 |
| Pearson Correlation Coefficients, N = 16 Prob > |r| under H0: Rho=0 | |||
|---|---|---|---|
| Inspection_Score | Violations_Count | Reinspection_Days | |
| Inspection_Score | 1.00000 | -0.97677 <.0001 | 0.92132 <.0001 |
| Violations_Count | -0.97677 <.0001 | 1.00000 | -0.94280 <.0001 |
| Reinspection_Days | 0.92132 <.0001 | -0.94280 <.0001 | 1.00000 |
14. PROC SGPLOT
proc sgplot data=food_inspection_fraud;
vbox Inspection_Score / category=City;
run;
OUTPUT:
15. TRANSPOSE
proc transpose data=city_summary out=city_transposed;
id City;
var Avg_Score;
run;
proc print data=city_transposed;
run;
OUTPUT:
| Obs | _NAME_ | Chennai | Delhi | Hyderabad | Kolkata | Mumbai | Pune |
|---|---|---|---|---|---|---|---|
| 1 | Avg_Score | 85.00 | 72.33 | 82.33 | 56.50 | 73.33 | 80.00 |
16. SORT & MERGE
proc sort data=food_inspection_fraud;by City;run;
proc print data=food_inspection_fraud;
run;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Date | Inspection_Score | Violations_Count | Reinspection_Days | Compliance_Status | Inspection_Month | Days_Since_Inspection | Next_Inspection_Date | Compliance_Percentage | Utilization_Level | Risk_Flag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R004 | Chennai | HIGH | Kavita | 10JAN2025 | 88 | 2 | 25 | Compliant | 23742 | 397 | 23776 | 88 | Medium Risk | Normal |
| 2 | R010 | Chennai | HIGH | Anil | 25JAN2025 | 85 | 2 | 28 | Compliant | 23742 | 382 | 23794 | 85 | Low Risk | Normal |
| 3 | R016 | Chennai | HIGH | Suresh | 05FEB2025 | 82 | 2 | 30 | Compliant | 23773 | 371 | 23807 | 82 | Low Risk | Normal |
| 4 | R003 | Delhi | MEDIUM | Anil | 07JAN2025 | 75 | 3 | 20 | Non-Compliant | 23742 | 400 | 23768 | 75 | Medium Risk | Normal |
| 5 | R009 | Delhi | MEDIUM | Ramesh | 22JAN2025 | 72 | 4 | 15 | Non-Compliant | 23742 | 385 | 23778 | 72 | Medium Risk | Normal |
| 6 | R015 | Delhi | MEDIUM | Kavita | 03FEB2025 | 70 | 4 | 18 | Non-Compliant | 23773 | 373 | 23793 | 70 | Medium Risk | Normal |
| 7 | R001 | Hyderabad | HIGH | Ramesh | 01JAN2025 | 92 | 1 | 30 | Compliant | 23742 | 406 | 23772 | 92 | Low Risk | Normal |
| 8 | R007 | Hyderabad | LOW | Suresh | 18JAN2025 | 60 | 6 | 14 | Non-Compliant | 23742 | 389 | 23773 | 60 | Medium Risk | Normal |
| 9 | R013 | Hyderabad | HIGH | Ramesh | 30JAN2025 | 95 | 0 | 45 | Compliant | 23742 | 377 | 23816 | 95 | Low Risk | Normal |
| 10 | R005 | Kolkata | LOW | Ramesh | 12JAN2025 | 55 | 7 | 7 | Non-Compliant | 23742 | 395 | 23760 | 55 | High Risk | Normal |
| 11 | R011 | Kolkata | LOW | Suresh | 27JAN2025 | 58 | 8 | 5 | Non-Compliant | 23742 | 380 | 23773 | 58 | High Risk | Normal |
| 12 | R002 | Mumbai | LOW | Suresh | 05JAN2025 | 68 | 5 | 10 | Non-Compliant | 23742 | 402 | 23756 | 68 | High Risk | Normal |
| 13 | R008 | Mumbai | HIGH | Kavita | 20JAN2025 | 90 | 1 | 30 | Compliant | 23742 | 387 | 23791 | 90 | Low Risk | Normal |
| 14 | R014 | Mumbai | LOW | Anil | 01FEB2025 | 62 | 6 | 12 | Non-Compliant | 23773 | 375 | 23785 | 62 | Medium Risk | Normal |
| 15 | R006 | Pune | MEDIUM | Anil | 15JAN2025 | 81 | 2 | 30 | Compliant | 23742 | 392 | 23786 | 81 | Low Risk | Normal |
| 16 | R012 | Pune | MEDIUM | Kavita | 28JAN2025 | 79 | 3 | 20 | Non-Compliant | 23742 | 379 | 23789 | 79 | Medium Risk | Normal |
proc sort data=city_summary;by City;run;
proc print data=city_summary;
run;
OUTPUT:
| Obs | City | Total_Restaurants | Avg_Score | Avg_Violations |
|---|---|---|---|---|
| 1 | Chennai | 3 | 85.00 | 2.00000 |
| 2 | Delhi | 3 | 72.33 | 3.66667 |
| 3 | Hyderabad | 3 | 82.33 | 2.33333 |
| 4 | Kolkata | 2 | 56.50 | 7.50000 |
| 5 | Mumbai | 3 | 73.33 | 4.00000 |
| 6 | Pune | 2 | 80.00 | 2.50000 |
data final_merged;
merge food_inspection_fraud
city_summary;
by City;
run;
proc print data=final_merged;
run;
OUTPUT:
| Obs | Restaurant_ID | City | Hygiene_Level | Inspector_Name | Inspection_Date | Inspection_Score | Violations_Count | Reinspection_Days | Compliance_Status | Inspection_Month | Days_Since_Inspection | Next_Inspection_Date | Compliance_Percentage | Utilization_Level | Risk_Flag | Total_Restaurants | Avg_Score | Avg_Violations |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | R004 | Chennai | HIGH | Kavita | 10JAN2025 | 88 | 2 | 25 | Compliant | 23742 | 397 | 23776 | 88 | Medium Risk | Normal | 3 | 85.00 | 2.00000 |
| 2 | R010 | Chennai | HIGH | Anil | 25JAN2025 | 85 | 2 | 28 | Compliant | 23742 | 382 | 23794 | 85 | Low Risk | Normal | 3 | 85.00 | 2.00000 |
| 3 | R016 | Chennai | HIGH | Suresh | 05FEB2025 | 82 | 2 | 30 | Compliant | 23773 | 371 | 23807 | 82 | Low Risk | Normal | 3 | 85.00 | 2.00000 |
| 4 | R003 | Delhi | MEDIUM | Anil | 07JAN2025 | 75 | 3 | 20 | Non-Compliant | 23742 | 400 | 23768 | 75 | Medium Risk | Normal | 3 | 72.33 | 3.66667 |
| 5 | R009 | Delhi | MEDIUM | Ramesh | 22JAN2025 | 72 | 4 | 15 | Non-Compliant | 23742 | 385 | 23778 | 72 | Medium Risk | Normal | 3 | 72.33 | 3.66667 |
| 6 | R015 | Delhi | MEDIUM | Kavita | 03FEB2025 | 70 | 4 | 18 | Non-Compliant | 23773 | 373 | 23793 | 70 | Medium Risk | Normal | 3 | 72.33 | 3.66667 |
| 7 | R001 | Hyderabad | HIGH | Ramesh | 01JAN2025 | 92 | 1 | 30 | Compliant | 23742 | 406 | 23772 | 92 | Low Risk | Normal | 3 | 82.33 | 2.33333 |
| 8 | R007 | Hyderabad | LOW | Suresh | 18JAN2025 | 60 | 6 | 14 | Non-Compliant | 23742 | 389 | 23773 | 60 | Medium Risk | Normal | 3 | 82.33 | 2.33333 |
| 9 | R013 | Hyderabad | HIGH | Ramesh | 30JAN2025 | 95 | 0 | 45 | Compliant | 23742 | 377 | 23816 | 95 | Low Risk | Normal | 3 | 82.33 | 2.33333 |
| 10 | R005 | Kolkata | LOW | Ramesh | 12JAN2025 | 55 | 7 | 7 | Non-Compliant | 23742 | 395 | 23760 | 55 | High Risk | Normal | 2 | 56.50 | 7.50000 |
| 11 | R011 | Kolkata | LOW | Suresh | 27JAN2025 | 58 | 8 | 5 | Non-Compliant | 23742 | 380 | 23773 | 58 | High Risk | Normal | 2 | 56.50 | 7.50000 |
| 12 | R002 | Mumbai | LOW | Suresh | 05JAN2025 | 68 | 5 | 10 | Non-Compliant | 23742 | 402 | 23756 | 68 | High Risk | Normal | 3 | 73.33 | 4.00000 |
| 13 | R008 | Mumbai | HIGH | Kavita | 20JAN2025 | 90 | 1 | 30 | Compliant | 23742 | 387 | 23791 | 90 | Low Risk | Normal | 3 | 73.33 | 4.00000 |
| 14 | R014 | Mumbai | LOW | Anil | 01FEB2025 | 62 | 6 | 12 | Non-Compliant | 23773 | 375 | 23785 | 62 | Medium Risk | Normal | 3 | 73.33 | 4.00000 |
| 15 | R006 | Pune | MEDIUM | Anil | 15JAN2025 | 81 | 2 | 30 | Compliant | 23742 | 392 | 23786 | 81 | Low Risk | Normal | 2 | 80.00 | 2.50000 |
| 16 | R012 | Pune | MEDIUM | Kavita | 28JAN2025 | 79 | 3 | 20 | Non-Compliant | 23742 | 379 | 23789 | 79 | Medium Risk | Normal | 2 | 80.00 | 2.50000 |
17. PROC DATASETS DELETE
proc datasets library=work nolist;
delete food_inspection_raw food_inspection_clean;
quit;
LOG:
18. BUSINESS INSIGHTS
·
Demonstrates end-to-end data lifecycle
·
Shows regulatory
analytics
·
Uses advanced
SAS techniques
·
Practical fraud detection logic
· Clean, auditable, scalable code
19. Conclusion
This Food Safety Inspections SAS project clearly
demonstrates how real-world inspection data can be created, cleaned, analyzed,
and interpreted using SAS in a practical and business-oriented way. By
combining DATA step programming, PROC SQL, statistical procedures, macros, and
date and character functions, the project shows how food safety authorities can
monitor restaurant hygiene, identify high-risk establishments, and ensure
timely re-inspections.
The
analysis helps in understanding relationships between inspection scores,
violations, and reinspection timelines, while the fraud and risk detection
logic highlights suspicious or negligent inspection patterns. Overall, this
project reflects how SAS supports public health decision-making by turning raw
inspection data into meaningful insights that improve food safety, regulatory
compliance, and consumer protection in a simple and effective manner.
INTERVIEW QUESTIONS FOR YOU
·
What is MDY?
·
What is ROUND function?
·
What is the difference between MEAN and
SUM?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
·
What is MDY?
·
What is ROUND function?
·
What is the difference between MEAN and
SUM?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
About the Author:
SAS Learning Hub is a data analytics and SAS programming platform focused on clinical, financial, and real-world data analysis. The content is created by professionals with academic training in Pharmaceutics and hands-on experience in Base SAS, PROC SQL, Macros, SDTM, and ADaM, providing practical and industry-relevant SAS learning resources.
Disclaimer:
The datasets and analysis in this article are created for educational and demonstration purposes only. They do not represent FOOD SAFETY data.
Our Mission:
This blog provides industry-focused SAS programming tutorials and analytics projects covering finance, healthcare, and technology.
This project is suitable for:
· Students learning SAS
· Data analysts building portfolios
· Professionals preparing for SAS interviews
· Bloggers writing about analytics and smart cities
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Follow Us On :
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--->Follow our blog for more SAS-based analytics projects and industry data models.
---> Support Us By Following Our Blog..
To deepen your understanding of SAS analytics, please refer to our other data science and industry-focused projects listed below:
1.Can SAS Identify Top Gaming Players Using Performance Data and Advanced Analytics?
2.Can SAS Data Analysis Accurately Measure a Sports Player’s
True Performance Over Time?
3.Can SAS Analytics Reveal Differences Between Various Types of Neem Trees?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1.Can SAS Identify Top Gaming Players Using Performance Data and Advanced Analytics?
2.Can SAS Data Analysis Accurately Measure a Sports Player’s True Performance Over Time?
3.Can SAS Analytics Reveal Differences Between Various Types of Neem Trees?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Comments
Post a Comment